Monday, 22 July 2013

Testing - Automation Tool WATIR

Watir stands for "Web Application Testing in Ruby". It is a Ruby library that allows you to automate web applications. By default Watir supports IE, but with supplementary libraries you can automate applications in Firefox, Chrome and Safari as well.

Sometimes I need to automate a task in a web browser. There are many tools on the market to choose from, but so far Watir is my preferred one. These are my main reasons for choosing Watir:

  1. It's free! It's an open source tool, so there are no costs to use this tool.
  2. It supports multiple browsers and platforms.
  3. It uses Ruby, my favorite scripting language. Ruby is concise and a joy to read and write.
  4. The Ruby knowledge gained when using Watir carries over to my Ruby and Ruby on Rails projects, and vice versa.
  5. It's lightweight. My computer doesn't suffer when creating or running automated tests.
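
To give a feel for what this looks like in practice, here is a minimal sketch of a Watir script. It assumes the watir-webdriver gem (one of the supplementary libraries mentioned above) and a local chromedriver are installed; the page URL and element locators are purely illustrative.

    require 'watir-webdriver'

    # Open a Chrome browser; assumes chromedriver is on the PATH.
    # Use :firefox or :ie instead if you prefer.
    browser = Watir::Browser.new :chrome

    # A made-up search flow, just to show the flavour of the API.
    browser.goto 'http://www.example.com/search'
    browser.text_field(name: 'q').set 'watir'
    browser.button(type: 'submit').click

    # A very simple check on the resulting page.
    puts browser.title
    puts 'PASS' if browser.text.include?('watir')

    browser.close

Because it is plain Ruby, a script like this can be wrapped in a test framework such as RSpec or minitest, or simply run on its own from the command line.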

Testing - Decision: "To Automate or Not to Automate?"

To automate or not to automate? This is a question everyone should ask themselves when composing a pack of regression test cases for test automation. Here is a shortlist of criteria that help decide whether a test case is eligible for test automation or not:

Application stability (interface)
Changing application interfaces (e.g. GUI) can impact the automated test ware. This may force the update of automation interfaces (object repository, scripts, ...).

Requirement stability
Changing functionality can impact the automated test ware. Composition of tests may need to be altered/updated.

Good quality manual tests
A poor manual test will remain a poor test, whether executed manually or automated.

Frequently run tests/functionality
Higher ROI when automating tests/functionality which require many runs.

Reusability
It may be possible to reuse certain script libraries for different applications.

Limitation of user errors
Automating repetitive test procedures decreases the risk of user errors due to loss of concentration.

Complexity of application/tests
Complex applications may require extensive training of test executors.

Testing - Test Estimation Techniques


There are five methods (among countless others!) which I have used in my career for test effort estimation.

They are,

1. Delphi Technique
2. Work Breakdown Structure
3. Test Point Analysis
4. Function Point Analysis
5. Use-case analysis methodology

The first one, and the most popular:

The Delphi technique is a widely used and accepted method for gathering data from respondents within their domain of expertise. 

The technique is designed as a group communication process which aims to achieve a convergence of opinion on a specific real-world issue.  The Delphi process has been used in various fields of study such as program planning, needs assessment, policy determination, and resource utilization to develop a full range of alternatives, explore or expose underlying assumptions, as well as correlate judgments on a topic spanning a wide range of disciplines. 
The Delphi technique is well suited as a method for consensus-building by using a series of questionnaires delivered using multiple iterations to collect data from a panel of selected subjects.  Subject selection, time frames for conducting and completing a study, the possibility of low response rates, and unintentionally guiding feedback from the respondent group are areas which should be considered when designing and implementing a Delphi study.
 
THE DELPHI PROCESS
Theoretically, the Delphi process can be continuously iterated until consensus is determined to have been achieved.  However, Cyphert and Gant (1971), Brooks (1979), Ludwig (1994, 1997), and Custer, Scarcella, and Stewart (1999) point out that three iterations are often sufficient to collect the needed information and to reach a consensus in most cases.  The following discussion, however, provides guidelines for up to four iterations in order to assist those who decide to use the Delphi process as a data collection technique when it is determined that additional iterations beyond three are needed or valuable.
Round 1: In the first round, the Delphi process traditionally begins with an open-ended questionnaire. The open-ended questionnaire serves as the cornerstone of soliciting specific information about a content area from the Delphi subjects (Custer, Scarcella, & Stewart, 1999). After receiving subjects’ responses, investigators need to convert the collected information into a well-structured questionnaire.  This questionnaire is used as the survey instrument for the second round of data collection.  It should be noted that it is both an acceptable and a common modification of the Delphi process format to use a structured questionnaire in Round 1 that is based upon an extensive review of the literature.  Kerlinger (1973) noted that the use of a modified Delphi process is appropriate if basic information concerning the target issue is available and usable.

Round 2: In the second round, each Delphi participant receives a second questionnaire and is asked to review the items summarized by the investigators based on the information provided in the first round.  Accordingly, Delphi panelists may be required to rate or “rank-order items to establish preliminary priorities among items.  As a result of round two, areas of disagreement and agreement are identified” (Ludwig, 1994, p. 54-55). 
Round 3: In the third round, each Delphi panelist receives a questionnaire that includes the items and ratings summarized by the investigators in the previous round and is asked to revise his or her judgments or “to specify the reasons for remaining outside the consensus” (Pfeiffer, 1968, p. 152). This round gives Delphi panelists an opportunity to make further clarifications of both the information and their judgments of the relative importance of the items.
Round 4: In the fourth and often final round, the list of remaining items, their ratings, minority opinions, and items achieving consensus are distributed to the panelists. This round provides a final opportunity for participants to revise their judgments.
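
In the test-estimation setting, the panelists' inputs are usually effort figures rather than open-ended opinions. The toy Ruby sketch below (my own illustration, not part of the Delphi literature) shows one way a facilitator might check whether a round of estimates has converged enough to stop iterating; the estimates and the 15% tolerance are made-up values.

    # Returns true when the panel's estimates are close enough to call consensus.
    def consensus_reached?(estimates, tolerance = 0.15)
      sorted = estimates.sort
      median = sorted[sorted.size / 2]
      spread = sorted.last - sorted.first
      spread <= median * tolerance   # e.g. total spread within 15% of the median
    end

    round1 = [30, 45, 60, 80, 40]    # person-days from five hypothetical panelists
    round2 = [44, 45, 45, 46, 47]    # revised after seeing the round-1 summary

    puts consensus_reached?(round1)  # => false, another round is needed
    puts consensus_reached?(round2)  # => true, stop and report the median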


Delphi Technique vs Focus Groups

Delphi Technique
The Delphi Technique begins with the development of a set of open-ended questions on a specific issue. These questions are then distributed to various ‘experts’. The responses to these questions are summarised and a second set of questions that seek to clarify areas of agreement and disagreement is formulated and distributed to the same group of ‘experts’. 

Advantages of the Delphi Technique.
The Delphi Technique:
  • Is conducted in writing and does not require face-to-face meetings:
    - responses can be made at the convenience of the participant;
    - individuals from diverse backgrounds or from remote locations can work together on the same problems;
    - it is relatively free of social pressure, personality influence, and individual dominance, and is therefore conducive to independent thinking and the gradual formulation of reliable judgments or forecasts;
    - it helps generate consensus or identify divergence of opinions among groups hostile to each other;
  • Helps keep attention directly on the issue;
  • Allows a number of experts to be called upon to provide a broad range of views on which to base the analysis ("two heads are better than one"):
    - allows sharing of information and reasoning among participants;
    - iteration enables participants to review, re-evaluate and revise all their previous statements in light of comments made by their peers;
  • Is inexpensive.

Disadvantages of Delphi Technique:
  • Information comes from a selected group of people and may not be representative;
  • Tendency to eliminate extreme positions and force a middle-of-the-road consensus;
  • More time-consuming than group process methods;
  • Requires skill in written communication;
  • Requires adequate time and participant commitment.

Focus Groups
Focus groups are a form of group interview that capitalises on communication between research participants in order to generate data. Although group interviews are often used simply as a quick and convenient way to collect data from several people simultaneously, focus groups explicitly use group interaction as part of the method.

Advantages of Focus Groups.
  • Useful for exploring people's knowledge and experiences; can be used to examine not only what people think but how they think and why they think that way.
  • Particularly sensitive to cultural variables, which is why the method is so often used in cross-cultural research and work with ethnic minorities.
  • Offers some potential sampling advantages.
  • Does not discriminate against people who cannot read or write.
  • Can encourage participation from those who are reluctant to be interviewed on their own (such as those intimidated by the formality and isolation of a one-to-one interview).
  • Can encourage contributions from people who feel they have nothing to say or who are deemed "unresponsive patients" (but engage in the discussion generated by other group members).

Disadvantages of Focus Groups.
  • Social pressure, individual domination, halo effects and social desirability bias.
  • The presence of other research participants also compromises confidentiality.
  • Such group dynamics raise ethical issues (especially when the work is with "captive" populations) and may limit the usefulness of the data for certain purposes.




Function Point Analysis (FPA) is more or less based on the functions involved in the application under test. We break the application down in terms of External Inputs, External Outputs, External Inquiries, Internal Logical Files and External Interface Files, and try to estimate the Unadjusted Function Point (UFP) count.

We also take into consideration the 14 General System Characteristics and use them to derive the Adjusted Function Point (AFP) value. Using the AFP value, we can derive the effort and the cost of the testing required.
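
As a rough illustration of the arithmetic, here is a small Ruby sketch of the Unadjusted and Adjusted Function Point calculation. The component counts and GSC ratings are made-up sample values; the weights are the commonly used IFPUG average weights, and the adjustment formula is the standard AFP = UFP * (0.65 + 0.01 * sum of the GSC ratings).

    # Made-up component counts for the application under test.
    counts  = { ei: 10, eo: 7, eq: 5, ilf: 4, eif: 2 }
    # Commonly used IFPUG "average" weights for each component type.
    weights = { ei: 4,  eo: 5, eq: 4, ilf: 10, eif: 7 }

    ufp = counts.keys.inject(0) { |sum, type| sum + counts[type] * weights[type] }

    # 14 General System Characteristics, each rated 0..5 (sample ratings).
    gsc = [3, 2, 4, 3, 1, 0, 2, 3, 3, 2, 1, 2, 4, 3]
    vaf = 0.65 + 0.01 * gsc.inject(:+)   # Value Adjustment Factor

    afp = ufp * vaf                      # Adjusted Function Points
    puts "UFP = #{ufp}, AFP = #{afp.round(1)}"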

Use-case analysis methodology

We generally decide the estimation technique based on the application under test. If the application is quite complex and involves a lot of internal algorithms, we prefer to use WBS, as it keeps the estimate free of that internal complexity.

But in case we wish to go as per Functional flow and components involved in testing, we prefer Use-case analysis.

The two primary elements in test estimation are time and resources.
There are many questions you need to answer in order to estimate a test effort.


1) What is the value of the application to your customers and your company? The greater the perceived value, the more stringent the exit criteria will be. The tighter the reins on the exit criteria, the more test iterations it will take to complete the testing process and release the product.

2) What modules or functionalities will be tested and how many testers are available to test them? Of course, as the functionalities increase and/or the number of testers decreases, the more time it will take to thoroughly test the application.

3) What is the complexity of each of these modules or functionalities? As the complexity increases, more time and effort will be required to understand the application, create test plans, create test cases, execute test cases, regress test cases and retest defects.

4) How many test iterations (test runs) will be required to complete the test project? This is also related to complexity. As an application becomes more complex, it will typically require more test iterations to reach the company's exit criteria (the number of open defects by severity and priority that a company can live with).

5) How much time will be required by developers to produce fixes for new builds between test runs? Complexity is also a factor here. As an application becomes more complex, there are often more dependencies between modules and functionalities. This often requires coordination between developers and, consequently, takes more time. This is important because your estimation must also include the amount of time testers are waiting for the next build between test runs.

6) What is the average number of defects that you anticipate will be found during each test run? You may have already guessed that complexity is a factor here too. The more complex an application, the greater the number of defects that will reach the test team when the application is released to them. In addition, the more complex the application, the more likely it is that severe and high-priority defects will be found in later stages of the test process.

7) How reliable are the cross-functional groups that provide deliverables to the test team in delivering them with high quality and on a timely basis? For example, if the business requirements miss important functionalities, then more time will be required to recover from this oversight. Oftentimes your history with these groups will help you decide how to handle these risk factors in your estimation.

8) Don't forget that an 8-hour day is not entirely devoted to core responsibilities. Testers go to meetings, read and respond to emails, and do other activities that consume time throughout the day. This needs to be factored into your estimation as well.

ONE MORE TIP:

I like to facilitate a business requirements review with all my testers. Reviewing the requirements with them gives me a good idea of the complexity of the application. At the end of this meeting I assign functionalities to each tester. I give them several days to pore over their requirements again and get a good feel for their areas, and then I ask them how much time they believe they will need to:
1) Produce test plans
2) Produce test cases
3) Map requirements to test cases.

Typically I will take these numbers and apply a multiplier to account for their optimism as well as the overall risks. There is more to estimating than I have described here, but I believe it gives you a good foundation to begin with.
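
For what it's worth, the padding step boils down to a couple of lines of Ruby. The per-tester hours and the 1.3 multiplier below are illustrative assumptions, not recommended values.

    # One tester's raw estimates, in hours (made-up figures).
    estimates = {
      'produce test plans'        => 24,
      'produce test cases'        => 56,
      'map requirements to cases' => 16
    }

    raw_total    = estimates.values.inject(:+)
    multiplier   = 1.3                 # padding for optimism and overall risk
    padded_total = raw_total * multiplier

    puts "Raw estimate:    #{raw_total} hours"
    puts "Padded estimate: #{padded_total.round} hours"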









Use Case Analysis – Test Estimation method:

Introduction
Deriving a reliable estimate of the size and effort an application needs is possible by examining the actors and scenarios of a use case. Use Case Points is a project estimation method that employs a project’s use cases to produce an accurate estimate of a project’s size and effort.

Use Case Points
Use case modelling is an accepted and widespread technique to capture the business processes and requirements of a software application. Since they provide the functional scope of the application, analyzing their contents provides valuable insight into the effort and size needed to design and implement the application. In general, applications with large, complicated use cases take more effort to design and implement than small applications with less complicated use cases. Moreover, the time to complete the application is affected by:

1.      The number of steps to complete the use case
2.      The number and complexity of the actors
3.      The technical requirements of the use case such as concurrency, security and performance
4.      Various environmental factors such as the development team's experience and knowledge
Use Case Points (UCP) is an estimation method that provides the ability to estimate an application's size and effort from its use cases. UCP analyzes the use case actors, scenarios and various technical and environmental factors and abstracts them into an equation.

The equation is composed of four variables:

1.      Technical Complexity Factor (TCF)
2.      Environment Complexity Factor (ECF)
3.      Unadjusted Use Case Points (UUCP)
4.      Productivity Factor (PF)

Each variable is defined and computed separately, using perceived values and various constants. The complete equation is:

UCP = TCF * ECF * UUCP * PF


The necessary steps to generate the estimate based on the UCP method are:

  • Determine and compute the Technical Factors.
  • Determine and compute the Environmental Factors.
  • Compute the Unadjusted Use Case Points.
  • Determine the Productivity Factor.
  • Compute the product of the variables.

Technical Complexity Factors

Thirteen standard technical factors exist to estimate the impact on productivity that various technical issues have on an application. Each factor is weighted according to its relative impact. A weight of 0 indicates the factor is irrelevant and the value 5 means that the factor has the most impact.
Figure 1: Technical Factors.
For each project, the technical factors are evaluated by the development team and assigned a value from 0 to 5 according to their perceived complexity. Multithreaded applications, for example, require more skill and time than single-threaded applications, as do applications intended for reuse. A perceived complexity of 0 means the technical factor is irrelevant for this project; 3 is average; 5 means it has a strong influence.
Each factor's weight is multiplied by its perceived complexity to produce its calculated factor. The calculated factors are summed to produce the Total Factor.


So, using sample perceived complexity values, the Technical Total Factor might be computed as follows:


Figure 2: Calculating the Technical Total Factor.
In Figure 2, the Total Factor is 47, derived by summing all the calculated factors. To produce the final TCF, two constants are combined with the Total Factor. The complete formula to compute the TCF is as follows:

TCF = 0.6 + (0.01 * Total Factor). For the Total Factor of 47 in Figure 2, the TCF = 0.6 + (0.01 * 47) = 1.07.
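
As a quick sanity check, the TCF arithmetic can be written out in a few lines of Ruby. The thirteen weights below are the commonly cited UCP technical-factor weights; the perceived-complexity ratings are sample values of my own and do not reproduce the Total Factor of 47 from the figure.

    # Weights for T1..T13 (commonly cited UCP values) and sample 0..5 ratings.
    weights = [2, 1, 1, 1, 1, 0.5, 0.5, 2, 1, 1, 1, 1, 1]
    ratings = [3, 5, 3, 3, 3, 4,   2,   3, 3, 3, 3, 2, 4]

    total_factor = weights.zip(ratings).map { |w, r| w * r }.inject(:+)
    tcf = 0.6 + 0.01 * total_factor

    puts "Total Factor = #{total_factor}, TCF = #{tcf}"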

Environmental Complexity Factors
Environmental Complexity estimates the impact on productivity that various environmental factors have on an application. Each environmental factor is evaluated and weighted according to its perceived impact and assigned a value between 0 and 5. A rating of 0 means the environmental factor is irrelevant for this project; 3 is average; 5 means it has a strong influence.

Figure 3: Example Environmental Factors.

Each factor's weight is multiplied by its perceived impact to produce its calculated factor. The calculated factors are summed to produce the Total Factor.

Using sample values for perceived impact, the Environmental Total Factor might be computed as:
Figure 4: Calculating the Environmental Total Factor.

In Figure 4, the Total Factor is 26, derived by summing all the calculated factors. To produce the final ECF, two constants are combined with the Total Factor. The complete formula to compute the ECF is as follows:

ECF = 1.4 + (-0.03 * Total Factor). For the Total Factor of 26 in Figure 4, the ECF = 1.4 - (0.03 * 26) = 0.62.
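
The ECF computation mirrors the TCF one; only the constants change. Applying the formula to the sample Total Factor of 26:

    ecf = 1.4 + (-0.03 * 26)
    puts ecf   # => 0.62 (floating-point output may show 0.6200000000000001)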



Unadjusted Use Case Points (UUCP)
Unadjusted Use Case Points are computed based on two components:

  • The Unadjusted Use Case Weight (UUCW), based on the total number of activities (or steps) contained in all the use case scenarios.
  • The Unadjusted Actor Weight (UAW), based on the combined complexity of all the use case actors.
UUCW
Individual use cases are categorized as Simple, Average or Complex, and weighted depending on the number of steps they contain - including alternative flows.


For example, a use case is classified as Complex when it involves a complex user interface or processing and touches 3 or more database entities, has over seven steps, or its implementation involves more than 10 classes.

The UUCW is computed by counting the number of use cases in each category, multiplying each category of use case with its weight and adding the products.

Figure 6: Computing UUCW.
UAW
In a similar manner, the Actors are classified as Simple, Average or Complex based on their interactions.

Figure 7: Actor Classifications.

The UAW is calculated by counting the number of actors in each category, multiplying each total by its specified weighting factor, and then adding the products.

Figure 8: Computing UAW.

Finally, the UUCP is computed by adding the UUCW and the UAW. For the sample data used in the figures, the UUCP = 220 + 44 = 264.
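
In code, the UUCW/UAW bookkeeping is just a pair of weighted sums. The category weights below (5/10/15 for use cases, 1/2/3 for actors) are the usual UCP weights; the counts are hypothetical and were simply chosen so that the totals match the sample figures (UUCW = 220, UAW = 44, UUCP = 264).

    use_case_counts  = { simple: 8, average: 12, complex: 4 }   # hypothetical counts
    use_case_weights = { simple: 5, average: 10, complex: 15 }

    actor_counts  = { simple: 6, average: 7, complex: 8 }       # hypothetical counts
    actor_weights = { simple: 1, average: 2, complex: 3 }

    uucw = use_case_counts.keys.inject(0) { |sum, k| sum + use_case_counts[k] * use_case_weights[k] }
    uaw  = actor_counts.keys.inject(0)    { |sum, k| sum + actor_counts[k] * actor_weights[k] }

    uucp = uucw + uaw
    puts "UUCW = #{uucw}, UAW = #{uaw}, UUCP = #{uucp}"   # => UUCW = 220, UAW = 44, UUCP = 264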

Productivity Factor
The Productivity Factor (PF) is a ratio of the number of man hours per use case point based on past projects. If no historical data has been collected, a figure between 15 and 30 is suggested by industry experts. A typical value is 20.

Final Calculation
The Use Case Points value is determined by multiplying all the variables:

UCP = TCF * ECF * UUCP * PF

For the sample values used in this article:
UCP = 1.07 * 0.62 * 264 * 20 = 3502.752 or 3503 hours.
Dividing the UCP by 40 hours (one man-week of work) gives roughly 88 man-weeks. Therefore, for the sample values in this article, it would take one developer 88 weeks (or about 22 months) to complete the application.
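
Here is the same final arithmetic as a tiny Ruby script, using the sample values from this article:

    tcf  = 1.07
    ecf  = 0.62
    uucp = 264
    pf   = 20      # man-hours per use case point

    ucp_hours = tcf * ecf * uucp * pf
    man_weeks = ucp_hours / 40.0

    puts ucp_hours.round(3)   # => 3502.752
    puts man_weeks.round      # => 88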

Caveats
The Use Case Points estimate tends to be high when compared to estimates from human experts. This might be a good thing, since many software projects are late, but the estimate may still be too high. In order to produce accurate results, the variables in the equation need to be adjusted and tweaked, especially in the beginning.

The number of steps in a scenario affects the estimate. A large number of steps in a use case scenario will bias the result towards complexity and increase the Use Case Points. A small number of steps will bias it towards simplicity. Sometimes, groups of steps can be reduced to a fewer number without sacrificing the business process. Strive for a uniform level of detail but don’t force a use case to conform to the estimation method.
Included and extending use cases add complexity. Count these, together with their base use case, as a single use case.
The number of actors in a use case also affects the estimate. If possible, generalize the actors into a single super actor. This reduces the complexity without affecting the use case. On the other hand, don’t force a generalization where none exists.
The values for the Technical and Environmental Factors need to be adjusted over time as actual data is obtained. The more projects that employ Use Case Points for their estimates, the more accurate the assigned factor values will become.
Compare the Use Case Point estimate with a human expert's estimate. Where there is disagreement, err on the side of the human expert and adjust the Use Case Point factors accordingly.
The Productivity Factor can only be obtained over time. Track the time spent designing and implementing the use cases and adjust the Productivity Factor accordingly.
Conclusion
Use Case Points have the potential to produce reliable results because the estimates are produced from the actual business processes (the use cases) of a software application. Additionally, in many traditional estimation methods, influential technical and environmental factors are not given adequate consideration. Use Case Points includes and abstracts these subjective factors into an equation. When tweaked over time, Use Case Points can provide estimates that are very reliable.

Testing - FAQ - Continuously updated Blog




The V-model was developed in 1992 to regulate the software development process within the German federal administration. It describes the activities and results that have to be produced during software development. It is a graphical representation of the system development lifecycle. It summarizes the main steps to be taken in conjunction with the corresponding deliverables within computerized system validation framework.
The specification stream mainly consists of:
  • User Requirement Specifications
  • Functional Specifications
  • Design Specifications
The testing stream generally consists of:
  • Installation Qualification (IQ)
  • Operational Qualification (OQ)
  • Performance Qualification (PQ)
The development stream can consist (depending on the system type and the development scope) of customization, configuration or coding.









What is 'Software Quality Assurance'?

Software QA involves the entire software development PROCESS - monitoring and improving the process, making sure that any agreed-upon standards and procedures are followed, and ensuring that problems are found and dealt with. It is oriented to 'prevention'.

What are types of testing?
  • Black box testing - not based on any knowledge of internal designs or code. Tests are based on requirements and functionality.
  • White box testing - based on knowledge of the internal logic of an application's code. Tests are based on coverage of code statements, branches, paths, conditions.
  • Unit testing - the most 'micro' scale of testing; to test particular functions or code modules. Typically done by the programmer and not by testers, as it requires detailed knowledge of the internal program design and code. Not always easily done unless the application has a well-designed architecture with tight code; may require developing test driver modules or test harnesses.
  • Incremental integration testing - continuous testing of an application as new functionality is added; requires that various aspects of an application's functionality be independent enough to work separately before all parts of the program are completed, or that test drivers be developed as needed; done by programmers or by testers.
  • Integration testing - testing of combined parts of an application to determine if they function together correctly. The 'parts' can be code modules, individual applications, client and server applications on a network, etc. This type of testing is especially relevant to client/server and distributed systems.
  • Functional testing - black-box type testing geared to functional requirements of an application; this type of testing should be done by testers. This doesn't mean that the programmers shouldn't check that their code works before releasing it (which of course applies to any stage of testing.)
  • System testing - black-box type testing that is based on overall requirements specifications; covers all combined parts of a system.
  • End-to-end testing - similar to system testing; the 'macro' end of the test scale; involves testing of a complete application environment in a situation that mimics real-world use, such as interacting with a database, using network communications, or interacting with other hardware, applications, or systems if appropriate.
  • Sanity testing or smoke testing - typically an initial testing effort to determine if a new software version is performing well enough to accept it for a major testing effort. For example, if the new software is crashing systems every 5 minutes, bogging down systems to a crawl, or corrupting databases, the software may not be in a 'sane' enough condition to warrant further testing in its current state.
  • Regression testing - re-testing after fixes or modifications of the software or its environment. It can be difficult to determine how much re-testing is needed, especially near the end of the development cycle. Automated testing tools can be especially useful for this type of testing.
  • Acceptance testing - final testing based on specifications of the end-user or customer, or based on use by end-users/customers over some limited period of time.
  • Load testing - testing an application under heavy loads, such as testing of a web site under a range of loads to determine at what point the system's response time degrades or fails.
  • Stress testing - term often used interchangeably with 'load' and 'performance' testing. Also used to describe such tests as system functional testing while under unusually heavy loads, heavy repetition of certain actions or inputs, input of large numerical values, large complex queries to a database system, etc.
  • Performance testing - term often used interchangeably with 'stress' and 'load' testing. Ideally 'performance' testing (and any other 'type' of testing) is defined in requirements documentation or QA or Test Plans.
  • Usability testing - testing for 'user-friendliness'. Clearly this is subjective, and will depend on the targeted end-user or customer. User interviews, surveys, video recording of user sessions, and other techniques can be used. Programmers and testers are usually not appropriate as usability testers.
  • Install/uninstall testing - testing of full, partial, or upgrade install/uninstall processes.
  • Recovery testing - testing how well a system recovers from crashes, hardware failures, or other catastrophic problems.
  • Failover testing - typically used interchangeably with 'recovery testing'.
  • Security testing - testing how well the system protects against unauthorized internal or external access, willful damage, etc; may require sophisticated testing techniques.
  • Compatibility testing - testing how well software performs in a particular hardware/software/operating system/network/etc. environment.
  • Exploratory testing - often taken to mean a creative, informal software test that is not based on formal test plans or test cases; testers may be learning the software as they test it.
  • Ad-hoc testing - similar to exploratory testing, but often taken to mean that the testers have significant understanding of the software before testing it.
  • Context-driven testing - testing driven by an understanding of the environment, culture, and intended use of software. For example, the testing approach for life-critical medical equipment software would be completely different than that for a low-cost computer game.
  • User acceptance testing - determining if software is satisfactory to an end-user or customer.
  • Comparison testing - comparing software weaknesses and strengths to competing products.
  • Alpha testing - testing of an application when development is nearing completion; minor design changes may still be made as a result of such testing. Typically done by end-users or others, not by programmers or testers.
  • Beta testing - testing when development and testing are essentially completed and final bugs and problems need to be found before final release. Typically done by end-users or others, not by programmers or testers.
  • Mutation testing - a method for determining if a set of test data or test cases is useful, by deliberately introducing various code changes ('bugs') and retesting with the original test data/cases to determine if the 'bugs' are detected. Proper implementation requires large computational resources.


CMM = 'Capability Maturity Model',
Now called the CMMI ('Capability Maturity Model Integration'), developed by the SEI. It's a model of 5 levels of process 'maturity' that determine effectiveness in delivering quality software. It is geared to large organizations such as large U.S. Defence Department contractors. However, many of the QA processes involved are appropriate to any organization, and if reasonably applied can be helpful. Organizations can receive CMMI ratings by undergoing assessments by qualified auditors.

Level 1 - characterized by chaos, periodic panics, and heroic efforts required by individuals to successfully complete projects. Few if any processes in place; successes may not be repeatable.

Level 2 - software project tracking, requirements management, realistic planning, and configuration management processes are in place; successful practices can be repeated.

Level 3 - standard software development and maintenance processes are integrated throughout an organization; a Software Engineering Process Group is in place to oversee software processes, and training programs are used to ensure understanding and compliance.

Level 4 - metrics are used to track productivity, processes, and products. Project performance is predictable, and quality is consistently high.

Level 5 - the focus is on continuous process improvement. The impact of new processes and technologies can be predicted and effectively implemented when required.