Why is testing necessary
This section explains why testing is necessary and looks closely at the cost and consequences of errors in computer software.
1. Definitions: Errors, Faults and Failures
An error is a human action that produces an incorrect result. A fault is a manifestation of an error in software; faults are also known colloquially as defects or bugs. A fault, if encountered, may cause a failure, which is a deviation of the software from its expected delivery or service.
We can illustrate these points with the true story of the Mercury spacecraft. The computer program aboard the spacecraft contained the following statement written in the FORTRAN programming language.
DO 100 i = 1.10
The programmer's intention was to execute the succeeding statements up to line 100 ten times, creating a loop in which the integer variable I was used as the loop counter, starting at 1 and ending at 10.
Unfortunately, because a full stop was typed instead of a comma, what this statement actually does is assign the decimal value 1.1 to a variable (FORTRAN ignores the blanks, so the line is read as an assignment rather than a DO loop), and it does that once only. The remaining code is therefore executed once and not 10 times within a loop. As a result the spacecraft went off course and the mission was aborted at considerable cost!
The correct syntax for what the programmer intended is
DO 100 i = 1,10
Exercise
What do you think was the error, fault and failure in this example?
The error is __________
The fault is ___________
The failure is __________
2. Reliability
Reliability is the probability that software will not cause the failure of a system for a specified time under specified conditions. Measures of reliability include MTBF (mean time between failures) and MTTF (mean time to failure), as well as service level agreements and other mechanisms.
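As a rough sketch only (the failure log and repair times below are invented for illustration and are not taken from any standard), such measures can be computed from recorded failure data, for example in Python:

# Hedged sketch: simple reliability measures from an invented failure log.
failure_times_hours = [120, 310, 450, 700, 980]   # cumulative operating hours at each failure
repair_times_hours = [2, 4, 3, 5, 2]              # hours to restore service after each failure

def mean_time_between_failures(failure_times):
    # Average operating time between consecutive failures.
    gaps = [later - earlier for earlier, later in zip(failure_times, failure_times[1:])]
    return sum(gaps) / len(gaps)

def mean_time_to_repair(repair_times):
    return sum(repair_times) / len(repair_times)

print("MTBF (hours):", mean_time_between_failures(failure_times_hours))
print("MTTR (hours):", mean_time_to_repair(repair_times_hours))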
1.4.3 Errors and how they occur
Why do we make errors that cause faults in computer software leading to potential failure of our systems? Well, firstly we are all prone to making simple human errors. This is an unavoidable fact of life. However, this is compounded by the fact that we all operate under real world pressures such as tight deadlines, budget restrictions, conflicting priorities and so on.
1.4.4 Cost of errors
The cost of an error can vary from nothing at all to large amounts of money and even loss of life. The aborted Mercury mission was obviously very costly but surely this is just an isolated example. Or is it? There are hundreds of stories about failures of computer systems that have been attributed to errors in the software. A few examples are shown below:
A nuclear reactor was shut down because a single line of code was coded as X = Y instead of X=ABS (Y) i.e. the absolute value of Y irrespective of whether Y was positive or negative.
Blue Cross of Wisconsin installed a new $200m claims processing system - it sent out $60 million in unwarranted and duplicate payments. Also, when a data entry clerk typed 'none' in the town field the system sent hundreds of checks to the non-existent town of 'NONE' in Wisconsin.
In May 1992, Pepsi ran a promotion in the Philippines. It told customers they could win a million pesos (approx. $40,000) if they bought a bottle of Pepsi and found number 349 stamped on the underside of the bottle cap. Unfortunately, due to a software error, 800,000 bottle caps were produced with the winning number 349 instead of just one, equivalent to $42 billion in prize money. It cost the company dearly, as some people pursued their claims through the courts and Pepsi paid out millions of dollars in compensation.
Another story was printed in the New York Times on 18th February 1994. Chemical Bank managed to allow $15 million to be withdrawn incorrectly from 100,000 accounts - a single line error in the program caused every ATM on their network to process the transaction twice.
1.4.5 What happened on October 26th & 27th 1992?
The London Ambulance Service (LAS) covers an area of just over 600 square miles and is the largest ambulance service in the world. It covers a resident population of some 6.8 million, but its daytime population is larger, especially in central London. Since 1974 South West Thames Regional Health Authority has managed it.
The LAS carries over 5000 patients every day and receives between 2000 and 2500 calls daily (including between 1300 and 1600 emergency calls i.e. 999 calls). In terms of resources the LAS has some 2700 staff, 750 ambulances and a small number of various other vehicles including 1 helicopter. LAS make almost 2 million patient journeys each year. In 1992/1993 its budgeted income was £69.7 million.
On the 26th and 27th October 1992 the new system failed, ambulances failed to turn up and people lost their lives. Although no Coroner's Court ever apportioned blame for any deaths directly to the computer system's failure, it was by any standards a major disaster and made the main evening news bulletins on several occasions.
1.4.6 London Ambulance Service
In summary the problems were:
Computer Aided Dispatch - 1
The system relied on near perfect information to propose optimum resource to be allocated to an emergency. However, there were imperfections in the information and changes in operational procedures made it difficult for staff to correct the system.
This was not a problem when it went live at 7 am on 26th October 1992 as the system load was light; however, as the number of emergency calls increased throughout the day it became increasingly difficult for staff to correct errors; this led to:
• Poor, duplicated and delayed allocations.
• Build-up of exception messages and awaiting attention list.
• Slow-down of system as messages and lists built up.
• Increased number of callbacks and hence delays in telephone answering.
The cost of these errors was ultimately that ambulances didn't turn up and people lost their lives, although the official enquiry report did not attribute any fatalities to the system problems. The costs in terms of loss of confidence in the computer system, industrial relations and so on were probably also high.
1.4.7 Exhaustive testing: why not test everything?
It is now widely accepted that you cannot test everything. Exhausted testers you will find, but exhaustive testing you will not. Complete testing is neither theoretically nor practically possible. Consider a 10-character string: it has around 2^80 possible input streams and corresponding outputs. If you executed one test per microsecond it would take approx. 4 times the age of the Universe to test this completely. For a survey of the methodology and limitations of formal proofs of program correctness see [Manna 78].
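The arithmetic behind that claim can be sketched as follows, assuming roughly 8 bits per character (so 2^80 input streams) and an age of the Universe of about 10 billion years; both figures are assumptions for illustration only.

# Sketch of the exhaustive-testing arithmetic for a 10-character (80-bit) input.
input_streams = 2 ** 80                          # possible input streams
seconds = input_streams / 1_000_000              # one test per microsecond
years = seconds / (60 * 60 * 24 * 365)
age_of_universe_years = 1e10                     # rough assumed figure for illustration
print(f"{years:.1e} years, roughly {years / age_of_universe_years:.1f} x the age of the Universe")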
1.4.8 Testing and risk
How much testing would you be willing to perform if the risk of failure were negligible? Alternatively, how much testing would you be willing to perform if a single defect could cost you your life's savings, or, even more significantly, your life? [Hetzel 88].
The amount of testing performed depends on the risks involved. Risk must be used as the basis for allocating the test time that is available and for selecting what to test and where to place emphasis. A priority must be assigned to each test.
Test Managers and Project Managers come up with different prioritization schemes, but the basic principle is that you must focus the testing effort on those areas of the system that are likely to have the most defects. Another key principle is that you must execute the most important tests first. Otherwise, if you run out of time, which is likely, you will not have exercised the tests that give you the best payback in terms of faults found.
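A minimal sketch of one such prioritization scheme, assuming a simple risk score of likelihood multiplied by impact (the test areas and scores below are invented for illustration):

# Hedged sketch: ordering tests by a simple risk score so the most important run first.
tests = [
    {"name": "payment processing", "likelihood": 4, "impact": 5},
    {"name": "report layout",      "likelihood": 2, "impact": 1},
    {"name": "login and security", "likelihood": 3, "impact": 5},
]

for test in tests:
    test["risk"] = test["likelihood"] * test["impact"]

for test in sorted(tests, key=lambda t: t["risk"], reverse=True):
    print(test["name"], "risk =", test["risk"])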
1.4.9 Testing and quality
Testing identifies faults whose removal increases the software quality by increasing the software's potential reliability. Testing is also a means of measuring software quality: we measure how closely we have achieved quality by testing the relevant factors such as correctness, reliability, usability, maintainability, reusability, testability, etc.
1.4.10 Testing and legal, contractual, regulatory or mandatory requirements
Other factors that may determine the testing performed include legal, contractual, regulatory or mandatory requirements, normally defined in industry-specific standards or based on agreed best practice (or, more realistically, non-negligent practice).
1.4.11 How much testing is enough?
It is difficult to determine how much testing is enough. Testing is always a matter of judging risks against cost of extra testing effort. Planning test effort thoroughly before you begin, and setting completion criteria will go some way towards ensuring the right amount of testing is attempted. Assigning priorities to tests will ensure that the most important tests have been done should you run out of time.
General Testing Principles:
Principle 1 – Testing shows presence of defects
Testing can show that defects are present, but cannot prove that there are no defects. Testing
reduces the probability of undiscovered defects remaining in the software but, even if no defects are
found, it is not a proof of correctness.
Principle 2 – Exhaustive testing is impossible
Testing everything (all combinations of inputs and preconditions) is not feasible except for trivial
cases. Instead of exhaustive testing, risk analysis and priorities should be used to focus testing
efforts.
Principle 3 – Early testing
Testing activities should start as early as possible in the software or system development life cycle,
and should be focused on defined objectives.
Principle 4 – Defect clustering
A small number of modules contain most of the defects discovered during pre-release testing, or
are responsible for the most operational failures.
Principle 5 – Pesticide paradox
If the same tests are repeated over and over again, eventually the same set of test cases will no
longer find any new defects. To overcome this “pesticide paradox”, the test cases need to be
regularly reviewed and revised, and new and different tests need to be written to exercise different
parts of the software or system to potentially find more defects.
Principle 6 – Testing is context dependent
Testing is done differently in different contexts. For example, safety-critical software is tested
differently from an e-commerce site.
Principle 7 – Absence-of-errors fallacy
Finding and fixing defects does not help if the system built is unusable and does not fulfill the users’
needs and expectations.
1.5 Fundamental Test Processes
Introduction
Testing must be planned. This is one of Bill Hetzel's 6 testing principles [Hetzel 88 p25] and he says we are all agreed on this one. However, he points out that the problem is that most of us do not discipline ourselves to act upon it. Good testing requires thinking out an overall approach, designing tests and establishing expected results for each of the test cases we choose.
You have seen already that we cannot test everything; we must make a selection, and the planning and care we expend on that selection accounts for much of the difference between good and poor testers.
The quality and effectiveness of software testing are primarily determined by the quality of the test processes used [Kit 95]. This is one of Ed Kit's 6 essentials of software testing. Test groups that operate within organizations that have an immature development process will feel more pain than those that do not. However, the test group should strive to improve its own internal testing processes. This section of the course shows a fundamental test process, based on the BS7925-2 standard for software component testing.
The fundamental test process comprises planning, specification, execution, recording and checking for completion. You will find organizations that have slightly different names for each stage of the process and you may find some processes that have just 4 stages, where 4 & 5 are combined, for example. However, you will find that all good test processes adhere to this fundamental structure.
1.5.2 Test process stages
See BS7925-2 for diagram of test process. Test planning involves producing a document that describes an overall approach and test objectives noting any assumptions you have made and stating any exceptions to your overall test strategy for your project. Test planning can be applied at all levels. Completion or exit criteria must be specified so that you know when testing (at any stage) is complete. Often a coverage target is set and used as test completion criteria.
Test specification (sometimes referred to as test design) involves designing test conditions and test cases using recognized test techniques identified at the planning stage. Here it is usual to produce a separate document or documents that fully describe the tests that you will carry out. It is important to determine the expected results prior to test execution.
Test execution involves actually running the specified test on a computer system either manually or by using an automated test tool.
Test recording involves keeping good records of the test activities that you have carried out. The versions of the software you have tested and the test specifications are recorded, along with the actual outcomes of each test.
Checking for test completion involves looking at the previously specified test completion criteria to see if they have been met. If not, some tests may need to be rerun and in some instances it may be appropriate to design some new test cases to meet a particular coverage target.
Note that BS7925-2 does not specify the format of any test documentation. However, The IEEE standard, known as 829, specifies in detail a standard for software test documentation.
BS7925-2 shows a diagram of a suggested hierarchy of test documentation.
HOME WORK
Exercise
Putting aside management problems, read through the test documentation examples in BS7925-2 and answer the following questions:
What test techniques does the component test strategy stipulate?
What percentage of decision coverage is required?
What should be done if errors are found?
The project component test plan is useful because the approach outlined allows:
a) Strict adherence to the component test strategy
b) More faults to be identified in the LOG components
c) A basic working system to be established as early as possible
d) Isolation of the components within the test strategy
The component test plan must consist of a single document? TRUE/FALSE
The component test plan must specify test completion criteria? TRUE/FALSE
Why does the component test plan specify 100% DC whereas the strategy required 90%?
Which test case deals with non-numeric input?
List the expected outcome and the test condition
Why does the CTP have additional/altered test cases?
What action has been taken as a result of the test report?
1.5.3 Successful tests detect faults
As the objective of a test should be to detect faults, a successful test is one that does detect a fault. This is counter-intuitive, because faults delay progress; a successful test is one that may cause delay. But the successful test reveals a fault which, if found later, may be many times more costly to correct, so in the long run it is a good thing.
1.5.4 Meaning of completion or exit criteria
Completion or exit criteria are used to determine when testing (at any stage) is complete. These criteria may be defined in terms of cost, time, faults found or coverage criteria.
1.5.5 Coverage criteria
Coverage criteria are defined in terms of items that are exercised by test suites, such as branches, user requirements, most frequently used transactions, etc.
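As a simple sketch, a coverage figure is just the proportion of the chosen items exercised by the test suite; the branch identifiers below are hypothetical.

# Hedged sketch: branch coverage as a completion criterion.
all_branches = {"B1", "B2", "B3", "B4", "B5"}    # hypothetical branches in the component
covered_branches = {"B1", "B2", "B4"}            # branches exercised by the test suite

coverage = 100 * len(covered_branches & all_branches) / len(all_branches)
print(f"Branch coverage: {coverage:.0f}%")       # compare against the agreed target, e.g. 90%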
The psychology of testing:
1.6.1 Purpose
The purpose of this section is to explore differences in perspective between tester and developer (buyer & builder) and explain some of the difficulties management and staff face when working together developing and testing computer software.
1.6.2 Different mindsets
We have already discussed that one of the primary purposes of testing is to find faults in software, i.e. it can be perceived as a destructive process. The development process, on the other hand, is a naturally creative one, and experience shows that staff who work in development have different mindsets to those of testers.
We would never argue that one group is intellectually superior to another, merely that they view systems development from another perspective. A developer is looking to build new and exciting software based on user's requirements and really wants it to work (first time if possible). He or she will work long hours and is usually highly motivated and very determined to do a good job.
A tester, however, is concerned that the user really does get a system that does what they want, is reliable and doesn't do things it shouldn't. He or she will also work long hours looking for faults in software but will often find the job frustrating as their destructive talents take their toll on the poor developers. At this point, there is often much friction between developer and tester: the developer wants to finish the system but the tester wants all faults in the software fixed before their work is done.
In summary:
Developers:
Are perceived as very creative - they write code without which there would be no system!
Are often highly valued within an organization.
Are sent on relevant industry training courses to gain recognized qualifications.
Are rarely good communicators (sorry guys)!
Can often specialize in just one or two skills (e.g. VB, C++, JAVA, SQL).
Testers:
Are perceived as destructive - only happy when they are finding faults!
Are often not valued within the organization.
Usually do not have any industry recognized qualifications, until now
Usually require good communication skills, tact & diplomacy.
Normally need to be multi-talented (technical, testing, team skills).
1.6.3 Communication between developer and tester
It is vitally important that the tester can explain and report a fault to the developer in a professional manner to ensure the fault gets fixed. The tester must not antagonize the developer. Tact and diplomacy are essential, even if you've been up all night trying to test the wretched software.
1.6.4 How not to approach
Tester: "Hey Fred. Here's a fault report AR123. Look at this code. Who wrote this? Was it you? Why, you couldn't program your way out of a paper bag. We really want this fixed by 5 o'clock or else."
We were unable to print Fred's reply because of the language! Needless to say Fred did not fix the fault as requested.
Exercise
Your trainer will split you into small test teams. One of you will be the test team leader. You have found several faults in a program and the team leader must report these to the developer (your trainer). The background is that your team has tested this program twice before and there are still quite a lot of serious faults in the code. There are also several spelling mistakes and wrong colors on the screen layout. The test team is getting a bit fed up. However, you have to be as nice as possible to the developer.
1.6.6 Why can't we test our own work?
This seems to be a human problem in general not specifically related to software development. We find it difficult to spot errors in our own work products. Some of the reasons for this are:
We make assumptions
We are emotionally attached to the product (it's our baby and there's nothing wrong with it).
We are so familiar with the product we cannot easily see the obvious faults.
We're humans.
We see exactly what we want to see.
We have a vested interest in passing the product as ok and not finding faults.
Generally it is thought that objective independent testing is more effective. There are several levels of independence as follows:
Test cases are designed by the person(s) writing the software.
Test cases are designed by another person(s).
Test cases are designed by a person(s) from a different section.
Test cases are designed by a person(s) from a different organization.
Test cases are not chosen by a person.
The discussion of independent test groups and outsourcing is left to another section.
2 Testing throughout the project lifecycle:
2.3 Software Development Model
There are many models used to describe the sequence of activities that make up a Systems Development Life Cycle (SDLC). The term SDLC is used to describe the activities of both development and maintenance work. Three models are worth mentioning.
• Sequential (the traditional waterfall model).
• Incremental (the function by function incremental model).
• Spiral (the incremental, iterative, evolutionary, RAD, prototype model).
The three models would all benefit from earlier attention to the testing activity that has to be done at some time during the SDLC.
Any reasonable model for the SDLC must allow for change, and the spiral approach allows for this with its emphasis on a slowly changing (evolving) design. We have to assume change is inevitable and will have to design for change.
Fact 1: Business is always changing.
Fact 2: Finding a fault causes change.
Result: The ease of fixing a fault defines the ease of responding to change.
Corollary: If we want systems that can be modified and hence maintained, the earlier we start testing and try the change process, the earlier we will find out how easy it is going to be to maintain the system.
1. Sequential Model
The sequential model often fails to bring satisfactory results because of the late attention to the testing activity. When earlier phases in the development cycle slip, it is the testing phase that gets squeezed. This can lead to a limited amount of testing being carried out with the associated production 'teething' problems.
2. Plan for waterfall model development of a system
The overall project plan for development of a system might be as shown below:
Activities:
Business study
Requirements analysis
User level design
Technical design
Program specification
Creation of code
Unit testing
Integration testing
System testing
Acceptance testing
Implementation
This is a typical Specify, Design and Build project plan.
All testing and quality control points come late in the project and are only done if there is time.
When testing is done so late in the project it can reveal costly errors.
The project plan has testing done late because people think that only physical deliverables, such as code, can be tested. Clearly there has to be a better way.
The challenge is to devise a better way of developing systems. There is a need to introduce quality control points earlier in the SDLC.
3. Sequential model plus testing gives 'V' diagram
The V diagram is another way of looking at sequential development, but this time from the viewpoint of the testing activities that need to be completed later in the SDLC.
The 'V' diagram in this simple form has been around for a long time and is especially useful as it easily demonstrates how testing work done early in the SDLC is used as input to assurance work later in development.
The V model of the SDLC offers considerable benefits over others as it emphasizes the building of test data and test scenarios during development and not as an afterthought. The V model also allows for the establishment of versions, incremental development and regression testing.
Management needs to rename activities, referred to variously as systems testing or acceptance testing. There has always been a phase of development traditionally thought of as the testing phase. This is an historical perception. Testing is not a phase but rather an activity that must be carried out all through development, giving rise to the principle of Total Quality.
In the past, system testing was the only type of testing carried out. Testing was checking that programs, when linked together, met the systems specification. Whether design itself was correct was another matter. The concept of "testing" design before programs were coded was given only the most perfunctory attention. This was for two reasons:
1. By the time physical design had been done the system was very difficult to alter; modifications caused by design reviews were therefore very unwelcome.
2. The design was documented in terms of physical file layouts and program specifications, neither of which the user could comprehend. The question of whether the physical design was correct could be reviewed, but the more important question - "Did the system do what the user wanted?" - was largely neglected.
4. 'V' with tests as recognized deliverables
Looking at the diagram above it is clear that the activity of Business Analysis has, as a deliverable, the Specification of Requirements from which the Acceptance Test Plan is constructed. To have created, for example, the System Architecture without an Integration Test Specification is to do only half the job!
5 Revised plan for system development
• Business Study
• Requirements analysis
• User level design
• Technical design
• Program specification
• Unit test planning
• Creation of code
• Unit testing
• Integration test planning
• Integration testing
• System testing
• Acceptance test planning
• Creation of detailed test material
• User system testing
• Acceptance testing
• deployment/go live
• Business benefit analysis
The overall project plan for development of a system might be as shown above. Note the new early quality control points.
This plan shows that the creation and running of actual tests are separated. The creation of test material (acceptance test plans, user system test scripts, technical system tests such as integration, link, recovery restart, etc., and unit test data) is done as the relevant design is done. The potential for automation is very good and the use of tools to capture the test cases, scripts, etc. will play a big part in making the running tests efficient. The early creation of test material will make the process of developing a system effective. The emphasis must be on first being effective then being efficient.
6. Rapid Application Development
The spiral, Rapid Application Development (RAD) model has the benefit of the evolutionary approach. This is an incremental process of build a little then test a little, which has the benefit of attempting to produce a usable but limited version early.
The RAD approach relies upon the quality of the RAD team.
The management issues to address are:
Have I got knowledgeable user input in my team?
Have I got experienced designers and developers in my team?
Am I leaving a good audit trail to support future Maintenance?
Test Levels:
COMPONENT TESTING
Component testing is described fully in BS7925-2, and you should be aware that component testing is also known as unit testing, module testing or program testing. The definition from BS7925-2 is simply the testing of individual software components.
Traditionally, the programmer carries out component testing. This has proved to be less effective than if someone else designs and runs the tests for the component.
"Buddy" testing, where two developers test each other's work is more independent and often more effective. However, the component test strategy should describe what level of independence is applicable to a particular component.
Usually white box (structural) testing techniques are used to design test cases for component tests but some black box tests can be effective as well.
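As an illustration only (the triangle-classification component and its tests below are hypothetical and not taken from BS7925-2), a simple component test in Python might look like this:

import unittest

def classify_triangle(a, b, c):
    # Hypothetical component under test: classify a triangle by its side lengths.
    if a + b <= c or b + c <= a or a + c <= b:
        return "not a triangle"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

class ClassifyTriangleComponentTest(unittest.TestCase):
    def test_equilateral(self):
        self.assertEqual(classify_triangle(3, 3, 3), "equilateral")

    def test_invalid_sides(self):
        self.assertEqual(classify_triangle(1, 2, 3), "not a triangle")

if __name__ == "__main__":
    unittest.main()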
We have already covered a generic test process in this course. The component test process is shown in the following diagram:
INTEGRATION TESTING
Integration is the process of combining components into larger assemblies. From the standard BS7925-2, integration testing is defined as "testing performed to expose faults in the interfaces and in the interaction between integrated components". However, in this section we look at two interpretations of integration testing, known as integration testing in the large and integration testing in the small.
By Integration Testing in the Large we mean testing the integration of the new system or software package with other (complete) systems. This would include the identification of, and risk associated with, all interfaces to these other systems. Also included is testing of any interfaces to external organizations (e.g. EDI - electronic data interchange, Internet) but not testing of the processing or operation of those external systems.
We use Integration Testing in the Small in the more traditional sense of integration testing, where components are assembled into sub-systems and sub-systems are linked together to form complete systems. Integration strategies may be incremental or non-incremental and include:
Big-bang
Top-down
Bottom-up
Sandwich
The testing approach is directly related to the integration strategy chosen.
Stubs & Drivers
A stub is a skeletal or special purpose implementation of a software module, used to develop or test a component that calls or is otherwise dependent on it. A test driver is a program or test tool used to execute software against a test case suite.
They are mainly substitutes for software components that are not yet available when integrating the entire system.
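A minimal sketch of a stub and a driver in Python (the component and interface names are invented; the real interfaces would come from your design documentation):

# Hedged sketch: a stub stands in for a missing component, a driver exercises the component under test.

def exchange_rate_stub(currency):
    # Stub: skeletal replacement for a rate service that is not yet available.
    return 1.0                      # canned answer, enough to let the caller be tested

def convert(amount, currency, rate_lookup=exchange_rate_stub):
    # Component under test: converts an amount using a rate lookup it depends on.
    return amount * rate_lookup(currency)

def test_driver():
    # Driver: executes the component against a small suite of test cases.
    cases = [((100, "EUR"), 100.0), ((0, "USD"), 0.0)]
    for (amount, currency), expected in cases:
        actual = convert(amount, currency)
        print(amount, currency, "->", actual, "PASS" if actual == expected else "FAIL")

test_driver()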
SYSTEM TESTING
System testing is defined as the process of testing an integrated system to verify that it meets specified requirements.
You will come across two very different types of system testing: functional system testing and non-functional system testing. In plain English, functional system testing focuses on testing the system based on what it is supposed to do. Non-functional system testing looks at those aspects that are important yet not directly related to what functions the system performs. For example, if the functional requirement is to issue an airline ticket, the non-functional requirement might be to issue it within 30 seconds.
A functional requirement is "a requirement that specifies a function that a system or system component must perform". Requirements-based testing means that the user requirements specification and the system requirements specification (as used for contracts) are used to derive test cases. Business process-based testing uses tests based on expected user profiles (e.g. scenarios, use cases).
Non-functional requirements cover the following areas:
1. Load
2. Performance
3. Stress
4. Security
5. Usability
6. Storage
7. Volume
8. Installability
9. Documentation
10. Recovery
Non-functional requirements are just as important as functional requirements.
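For the airline-ticket example above, a non-functional check might simply time the operation against the 30-second requirement; the issue_ticket function below is a hypothetical placeholder.

import time

def issue_ticket():
    # Hypothetical placeholder for the real ticket-issuing operation.
    time.sleep(0.1)

start = time.perf_counter()
issue_ticket()
elapsed = time.perf_counter() - start

# Non-functional requirement: the ticket must be issued within 30 seconds.
assert elapsed <= 30, f"Ticket issue took {elapsed:.1f}s, exceeding the 30s requirement"
print(f"Ticket issued in {elapsed:.2f}s")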
ACCEPTANCE TESTING
The definition of acceptance testing in BS7925 states that "acceptance testing is formal testing conducted to enable a user, customer, or other authorized entity to determine whether to accept a system or component". Acceptance testing may be the only form of testing conducted by and visible to a customer when applied to a software package. The most common usage of the term relates to user acceptance testing (UAT), but you should be aware that there are several other uses of acceptance testing, which we briefly describe here.
User acceptance testing - the final stage of validation. Customer should perform or be closely involved in this. Customers may choose to do any test they wish, normally based on their usual business processes. A common approach is to set up a model office where systems are tested in an environment as close to field use as is achievable.
Contract acceptance testing - a demonstration of the acceptance criteria, which would have been defined in the contract, being met.
Alpha & beta testing - In alpha and beta tests, when the software seems stable, people who represent your market use the product in the same way (s) that they would if they bought the finished version and give you their comments. Alpha tests are performed at the developer's site, while beta tests are performed at the user's sites.
High level test planning
You should be aware that many people use the term 'test plan' to describe a document detailing individual tests for a component of a system. We are introducing the concept of high level test plans to show that there are a lot more activities involved in effective testing than just writing test cases.
The IEEE standard for test documentation (IEEE/ANSI Std 829-1983), affectionately known as 829, defines a master validation test plan as follows:
The purpose of the master test plan is to prescribe the scope, approach, resources and schedule of testing activities. A master test plan should include the following:
1. Test Plan Identifier
2. References
3. Introduction
4. Test Items
5. Software Risk Issues
6. Features to be tested
7. Features not to be tested
8. Approach
9. Item Pass/Fail Criteria
10. Suspension Criteria
11. Resumption Requirements
12. Test Deliverables
13. Remaining Testing Tasks
14. Environmental Needs
15. Staffing and Training Needs
16. Responsibilities
17. Schedule
18. Planning Risks and Contingencies
19. Approvals
20. Glossary
Test Types
Functional Testing: Testing the application against business requirements. Functional testing is done using the functional specifications provided by the client or by using the design specifications like use cases provided by the design team.
Functional Testing covers:
Unit Testing
Smoke testing / Sanity testing
Integration Testing (Top-down, Bottom-up Testing)
Interface & Usability Testing
System Testing
Regression Testing
Pre User Acceptance Testing (Alpha & Beta)
User Acceptance Testing
White Box & Black Box Testing
Globalization & Localization Testing
Non-Functional Testing: Testing the application against the client's performance and other non-functional requirements. Non-functional testing is done based on the requirements and test scenarios defined by the client.
Non-Functional Testing covers:
Load and Performance Testing
Ergonomics Testing
Stress & Volume Testing
Compatibility & Migration Testing
Data Conversion Testing
Security / Penetration Testing
Operational Readiness Testing
Installation Testing
Security Testing (Application Security, Network Security, System Security)
STRUCTURAL TESTING
Structural testing is a type of white box testing that involves testing the software architecture or internal code structure. It is based on an analysis of the internal workings and structure of a piece of software. Techniques involved include code coverage, decision testing and branch testing.
RE-TESTING AND REGRESSION TESTING
We find and report a fault, which is duly fixed by the developer and included in the latest release which we now have available for testing. What should we do now?
Examples of regression tests not carried out include:
The day the phones stopped.
LAS failure on 4th November (perhaps)
Ariane 5 failure.
Whenever a fault is detected and fixed then the software should be re-tested to ensure that the original fault has been successfully removed. You should also consider testing for similar and related faults. This is made easier if your tests are designed to be repeatable, whether they are manual or automated.
Regression testing attempts to verify that modifications have not caused unintended adverse side effects in the unchanged software (regression faults) and that the modified system still meets requirements. It is performed whenever the software, or its environment, is changed.
Most companies will build up a regression test suite or regression test pack over time and will add new tests, delete unwanted tests and maintain tests as the system evolves. When a major software modification is made then the entire regression pack is likely to be run (albeit with some modification). For minor planned changes or emergency fixes then during the test planning phase the test manager must be selective and identify how many of the regression tests should be attempted. In order to react quickly to an emergency fix the test manager may create a subset of the regression test pack for immediate execution in such situations.
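One way to sketch that selection is to tag each regression test with the areas it covers and pick only those touching the changed area (the test names, tags and impact analysis below are invented):

# Hedged sketch: selecting a subset of the regression pack for an emergency fix.
regression_pack = [
    {"test": "end-to-end order flow", "areas": {"orders", "payments"}},
    {"test": "monthly statement run", "areas": {"reports"}},
    {"test": "refund handling",       "areas": {"payments"}},
]

changed_areas = {"payments"}          # assumed result of impact analysis for the emergency fix

subset = [t["test"] for t in regression_pack if t["areas"] & changed_areas]
print("Regression subset to run:", subset)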
Regression tests are often good candidates for automation provided you have designed and developed automated scripts properly (see automation section).
In order to have an effective regression test suite, good configuration management of your test assets is desirable if not essential. You must have version control of your test documentation (test plans, scripts etc.) as well as your test data and baseline databases. An inventory of your test environment (hardware configuration, operating system version etc.) is also necessary.
MAINTENANCE TESTING
Maintenance testing is not specifically defined in BS7925, but it is all about testing changes to the software after it has gone into production.
There are several problems with maintenance testing. If you are testing old code the original specifications and design documentation may be poor or in some cases non-existent. When defining the scope of the maintenance testing it has to be judged in relation to the changed code; how much has changed, what areas does it really affect etc. This kind of impact analysis is difficult and so there is a higher risk when making changes - it is difficult to decide how much regression testing to do.
3. Static Technique
Overview
Static testing techniques are used to find errors before the software is actually executed and therefore contrast with dynamic testing techniques, which are applied to a working system. The earlier we catch an error, the cheaper it usually is to correct. This module looks at a variety of different static testing techniques. Some are applied to documentation (e.g. walkthroughs, reviews and Inspections) and some are used to analyze the physical code (e.g. compilers, data flow analyzers). This is a huge subject and we can only hope to give an introduction in this module. You will be expected to appreciate the difference between the various review techniques and you will need to be aware of how and when static analysis tools are used.
Static testing techniques include code reviews, inspections and walkthroughs.
From the black box testing point of view, static testing involves reviewing requirements and specifications. This is done with an eye toward completeness or appropriateness for the task at hand.
From the white box testing point of view, static testing involves reviewing code, functions and structure.
Static testing can be automated. A static testing test suite consists of programs to be analyzed by an interpreter or a compiler that asserts the program's syntactic validity.
The people involved in static testing are application developers, testers, and business analysts.
Review Process
What is a review?
A review is a fundamental technique that must be used throughout the development lifecycle. Basically a review is any of a variety of activities involving evaluation of technical matter by a group of people working together. The objective of any review is to obtain reliable information, usually as to status and/or work quality.
Some history of reviews
During any project, management requires a means of assessing and measuring progress. The so-called progress review evolved as a means of achieving this. However, the results of those early reviews proved to be bitter experiences for many project managers. Just how long can a project remain at 90% complete? They found that they could not measure 'true' progress until they had a means of gauging the quality of the work performed. Thus the concept of the technical review emerged, to examine the quality of the work and provide input to the progress reviews.
What can be reviewed?
There are many different types of reviews throughout the development life cycle. Virtually any work produced during development can be (and is) reviewed. This includes requirements documents, designs, database specifications, data models, code, test plans, test scripts, test documentation and so on.
What has this got to do with testing?
The old fashioned view that reviews and testing are totally different things stems from the fact that testing used to be tacked onto the end of the development lifecycle. However as we all now view testing as a continuous activity that must be started as early as possible you can begin to appreciate the benefits of reviews. Reviews are the only testing technique that is available to us in the early stages of testing. At early stages in the development lifecycle we obviously cannot use dynamic testing techniques since the software is simply not ready for testing.
Reviews share similarities to test activities in that they must be planned (what are we testing), what are the criteria for success (expected results) and who will do the work (responsibilities). The next section examines the different types of review techniques in more detail.
TYPES OF REVIEW
Walkthroughs, informal reviews, technical reviews and Inspections are fundamental techniques that must be used throughout the development process. All have their strengths and weaknesses and their place in the project development cycle. All four techniques have some ground rules in common, as follows:
A structured approach must be followed for the review process.
Be sure to know what is being reviewed - each component must have a unique identifier.
Changes must be configuration controlled.
Reviewers must prepare.
Reviewers must concentrate on their own specialization.
Be sure to review the product, not the person.
There must be:
Total group responsibility.
Correct size of review group.
Correct allocation of time.
Correct application of standards.
Checklists must be used.
Reports must be produced.
Quality must be specified in terms of:
Adherence to standards.
Reliability required.
Robustness required.
Modularity.
Accuracy.
Ability to handle errors.
TYPE 1, 2 AND 3 REVIEW PROCESSES
The review process is both the most effective and the most universal test method, and management needs to make sure that the review process is working as effectively as possible. A useful model for the manager is the 1, 2, 3 model.
The 1, 2, 3 model is derived from the work of the first working party of the British Computer Society Specialist Interest Group in Software Testing and the book that came from this work on testing in software development.
Type 1 testing is the process of making sure that the product (document, program, screen design, clerical procedure or Functional Specification) is built to standards and contains those features that we would expect from the name of that product. It is a test to make sure that the product conforms to standards and is internally consistent, accurate and unambiguous.
Type 2 testing is the process of testing to see if the product conforms to the requirements as specified by the output of the preceding project stage and, ultimately, the Specification of Requirements for the whole project. Type 2 testing is backward looking and checks that the product is consistent with the preceding documentation (including information on change).
Type 3 testing is forward looking and is about the specification of the certification process and the tests that are to be done on the delivered product. It asks the question: can we build the deliverables (test material, training material, next stage analysis documentation)?
Make reviews incremental
Whilst use of the 1, 2, 3 model will improve review technique, the reviewing task can be made easier by having incremental reviews throughout product construction.
This will enable reviewers to have a more in-depth understanding of the product that they are reviewing and to start constructing type 3 material.
General review procedure for documents
The test team will need a general procedure for reviewing documents, as this will probably form a large part of the team's work.
1. Establish the standards and format for the document.
2. Check contents list.
3. Check appendix list.
4. Follow up references outside documents.
5. Cross check references inside document.
6. Check for correct and required outputs.
7. Review each screen layout against appropriate standard, data dictionary, processing rules and files/data base access.
8. Review each report layout against appropriate standard, data dictionary, processing rules and files/data base access.
9. Review comments and reports on work and reviews done prior to this review.
10. Documents reviewed will range from whole reports, such as the Specification of Requirements, to pages from output that the system is to produce. All documents will need careful scrutiny.
Be sensitive to the voice of the concerned but not necessarily assertive tester in the team; this person may well have observed a fault that all the others have missed. It should not be the case that the person with the loudest voice or strongest personality is allowed to dominate.
INSPECTIONS
The Inspection technique was developed further by Caroline L. Jones and Robert Mays at IBM [Jones, 1985], who created a number of useful enhancements:
The kickoff meeting, for training, goal setting, and setting a strategy for the current inspection cycle;
The causal analysis meeting;
The action database;
The action team.
Reviews and walk-through
Reviews and walkthroughs are typically peer group discussion activities - without much focus on fault identification and correction, as we have seen. They are usually without the statistical quality improvement, which is an essential part of Inspection. Walkthroughs are generally a training process, and focus on learning about a single document. Reviews focus more on consensus and buy-in to a particular document.
It may be wasteful to do walkthroughs or consensus reviews unless a document has successfully exited from Inspection. Otherwise you may be wasting people's time by giving them documents of unknown quality, which probably contain far too many opportunities for misunderstanding, learning the wrong thing and agreement about the wrong things.
Inspection is not an alternative to walkthroughs for training, or to reviews for consensus. In some cases it is a pre-requisite. The different processes have different purposes. You cannot expect to remove faults effectively with walkthroughs, reviews or distribution of documents for comment. However, in other cases it may be wasteful to Inspect documents which have not yet 'settled down' technically. Spending time searching for and removing faults in large chunks which are later discarded is not a good idea. In this case it may be better to aim for approximate consensus documents. The educational walkthrough could occur either before or after Inspection.
Comparison of Inspection and testing
Inspection and testing both aim at evaluating and improving the quality of the software engineering product before it reaches the customers. The purpose of both is to find and then fix errors, faults and other potential problems.
Inspection and testing can be applied early in software development, although Inspection can be applied earlier than test. Both Inspection and test, applied early, can identify faults, which can then be fixed when it is still much cheaper to do so.
Inspection and testing can be done well or badly. If they are done badly, they will not be effective at finding faults, and this causes problems at later stages, test execution, and operational use.
We need to learn from both Inspection and test experiences. Inspection and testing should both ideally (but all too rarely in practice) produce product-fault metrics and process-improvement metrics, which can be used to evaluate the software development process. Data should be kept on faults found in Inspection, faults found in testing, and faults that escaped both Inspection and test and were only discovered in the field. This data would reflect frequency, document location, severity, cost of finding, and cost of fixing.
There is a trade-off between fixing and preventing. The metrics should be used to fine-tune the balance between the investment in the fault detection and fault prevention techniques used. The cost of Inspection, test design, and test running should be compared with the cost of fixing the faults at the time they were found, in order to arrive at the most cost-effective software development process.
Differences between Inspection and testing
Inspection can be used long before executable code is available to run tests. Inspection can be applied much earlier than dynamic testing, but can also be applied earlier than test design activities. Tests can only be defined when a requirements or design specification has been written, since that specification is the source for knowing the expected result of a test execution.
The one key thing that testing does and Inspection does not, is to evaluate the software while it is actually performing its function in its intended (or simulated) environment. Inspection can only examine static documents and models; testing can evaluate the product working.
Inspection, particularly the process improvement aspect, is concerned with preventing software engineers from inserting any form of fault into what they write. The information gained from faults found in running tests could be used in the same way, but this is rare in practice.
Benefits of Inspection
Opinion is divided over whether Inspection is a worthwhile element of any product 'development' process. Critics argue that it is costly; it demands too much 'upfront' time and is unnecessarily bureaucratic. Supporters claim that the eventual savings and benefits outweigh the costs and the short-term investment is crucial for long-term savings.
Development productivity is improved.
Fagan, in his original article, reported a 23% increase in 'coding productivity alone' using Inspection [Fagan, 1976, IBM Systems Journal, p 187]. He later reported further gains with the introduction of moderator training, design and code change control, and test fault tracking.
Development timescale is reduced.
Considering only the development timescales, typical net savings for project development are 35% to 50%.
Cost and time taken for testing is reduced.
Inspection reduces the number of faults still in place when testing starts because they have been removed at an earlier stage. Testing therefore runs more smoothly, there is less debugging and rework and the testing phase is shorter. At most sites Inspection eliminates 50% to 90% of the faults in the development process before test execution starts.
Lifetime costs are reduced and software reliability increased.
Inspection can be expected to reduce total system maintenance costs due to failure reduction and improvement in document intelligibility, therefore providing a more competitive product.
Management benefits.
Through Inspection, managers can expect access to relevant facts and figures about their software engineering environment, meaning they will be able to identify problems earlier and understand the payoff for dealing with these problems.
Deadline benefits.
Although it cannot guarantee that an unreasonable deadline will be met, through quality and cost metrics Inspection can give early warning of impending problems, helping to avoid the temptation of inadequate correction nearer the deadline.
Costs of Inspection
The cost of running an Inspection is approximately 10% - 15% of the development budget. This percentage is about the same as other walkthrough and review methods. However, Inspection finds far more faults for the time spent and the upstream costs can be justified by the benefits of early detection and the lower maintenance costs that result.
As mentioned earlier, the costs of Inspection include additional 'up front' time in the development process and increased time spent by authors writing documents they know will be Inspected. Implementing and running Inspections will involve long-term costs in new areas. An organization will find that time and money go on:
Inspection leader training.
Management training.
Management of the Inspection leaders.
Metric analysis.
Experimentation with new techniques to try to improve Inspection results.
Planning, checking and meeting activity: the entire Inspection process itself.
Quality improvement: the work of the process improvement teams.
Inspection steps
The Inspection process is initiated with a request for Inspection by the author or owner of a document.
The Inspection leader checks the document against entry criteria, reducing the probability of wasting resources on a deliverable destined to fail.
The Inspection objectives and tactics are planned. Practical details are decided upon and the leader develops a master plan for the team.
A kickoff meeting is held to ensure that the checkers are aware of their individual roles and the ultimate targets of the Inspection process.
Checkers work independently on the document using source documents, rules, procedures and checklists. Potential faults are identified and recorded.
A logging meeting is convened during which potential faults and issues requiring explanation, identified by individual checkers, are logged. The checkers now work as a team aiming to discover further faults. Finally, suggestions for methods of improving the process itself are logged.
An editor (usually the author) is given the log of issues to resolve. Faults are now classified as such and a request for permission to make the correction and improvements to the document is made by the document's owner. Footnotes might be added to avoid misinterpretation. The editor may also make further process improvement suggestions.
The leader ensures that the editor has taken action to correct all known faults, although the leader need not check the actual corrections.
The exit process is performed by the Inspection leader who uses application generic and specific exit criteria.
The Inspection process is closed and the deliverable made available with an estimate of the remaining faults in a 'warning label'.
Static Analysis by tools:
Static Code Analysis tools:
Static code analysis is the analysis of software that is performed without actually executing the code. In most cases the analysis is performed on some version of the source code and in the other cases some form of the object code. The term is usually applied to the analysis performed by an automated tool, with human analysis being called program understanding or program comprehension.
The sophistication of the analysis performed by tools varies from those that only consider the behavior of individual statements and declarations, to those that include the complete source code of a program in their analysis. Uses of the information obtained from the analysis vary from highlighting possible coding errors (e.g., the lint tool) to formal methods that mathematically prove properties about a given program (e.g., its behavior matches that of its specification).
Note, however, that finding all possible run-time errors, or more generally any kind of violation of a specification on the final result of a program, is undecidable: there is no mechanical method that can always answer truthfully whether a given program may or may not exhibit runtime errors.
Some of the implementation techniques of formal static analysis include:
Model checking considers systems that have finite state or may be reduced to finite state by abstraction
Data-flow analysis is a lattice-based technique for gathering information about the possible set of values
Abstract interpretation models the effect that every statement has on the state of an abstract machine (i.e., it 'executes' the software based on the mathematical properties of each statement and declaration).
Typical defects discovered by static analysis tools include:
o referencing a variable with an undefined value;
o inconsistent interface between modules and components;
o variables that are never used;
o unreachable (dead) code;
o programming standards violations;
o security vulnerabilities;
o syntax violations of code and software models.
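The fragment below is deliberately faulty and invented purely to illustrate the kinds of defect listed above; it only defines a function and is not meant to be called.

# Deliberately faulty sketch: defects a static analysis tool would typically report.
def total_price(items):
    unused_discount = 0.1            # defect: variable that is never used
    for item in items:
        total = total + item         # defect: 'total' referenced before it has a defined value
    return total
    print("done")                    # defect: unreachable (dead) code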
Various tools are available based on the programming language.
Some examples are:
CodeWizard is a unique coding standards enforcement tool that uses patented Source Code Analysis technology (patent #5,860,011) to help developers prevent errors and standardize C++ code automatically. CodeWizard spontaneously enforces C++ coding standards, saving hours of labor-intensive analysis.
Jtest is an automatic static analysis and unit testing tool for Java development. With the click of a button, Jtest automatically enforces over 300 industry-respected coding standards, allowing organizations to prevent the most common and damaging errors. Jtest reduces time spent chasing and fixing bugs by automatically generating unit test cases that test code construction and functionality, and performs regression testing to ensure that no new errors have been introduced into code.
METRIC is the software metrics system for the fully integrated TestWorks/Advisor suite of static source code analyzers and measurement tools. METRIC works as a stand-alone product or as part of the TestWorks/Advisor tool suite to quantitatively determine source code quality. After processing a source code file, METRIC automatically computes various software measurements. These metrics include the Halstead Software Science metrics, which measure data complexity in routines; the Cyclomatic Complexity metrics, which measure logic complexity in routines; and size metrics, such as number of lines, comments and executable statements.
4. Categories of test design techniques
Broadly speaking there are two main categories, static and dynamic. However, dynamic techniques are subdivided into two more categories behavioural (black box) and structural (white box). Behavioural techniques can be further subdivided into functional and non-functional techniques.
Specification Based or Black Box Testing:
Specification based testing is often referred to as 'black box' test techniques and the common parlance is that we are 'doing black box testing'. This technique helps design test cases based on the functionality of the component or system under test, without necessarily having to understand the underlying detail of the software design. Consider the functionality of the system to determine test inputs and expected results.
BLACK BOX TECHNIQUES
The following list of black box techniques is from BS 7925-2. On this course we will describe and give an example of only those ones highlighted in bold:
Equivalence Partitioning
Boundary Value Analysis
State Transition Testing
Decision Table Testing
Use Case Testing
Equivalence partitioning (EP) is a test case design technique that is based on the premise that the inputs and outputs of a component can be partitioned into classes that, according to the component's specification, will be treated similarly by the component. Thus the result of testing a single value from an equivalence partition is considered representative of the complete partition.
Not everyone will necessarily pick the same equivalence classes; there is some subjectivity involved. But the basic assumption you are making is that any one value from the equivalence class is as good as any other when we come to design the test.
This technique can dramatically reduce the number of tests that you may have for a particular software component.
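As a minimal sketch, assume a hypothetical age field whose specification says values from 18 to 65 inclusive are valid. Equivalence partitioning gives three classes, and one representative value is chosen from each (written here in Python purely for illustration):

def accept_age(age):
    # Hypothetical component: the valid partition is 18..65 inclusive
    return 18 <= age <= 65

# One representative test value per partition
partition_tests = [
    (10, False),   # invalid partition: below 18
    (40, True),    # valid partition: 18..65
    (70, False),   # invalid partition: above 65
]
for value, expected in partition_tests:
    assert accept_age(value) == expected

Three tests stand in for the whole input range, which is exactly the reduction the technique promises.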
Boundary Value Analysis is based on the following premise. Firstly, the inputs and outputs of a component can be partitioned into classes that, according to the component's specification, will be treated similarly by the component and, secondly, that developers are prone to making errors in their treatment of the boundaries of these classes. Thus test cases are generated to exercise these boundaries.
Partitioning of test data ranges is explained in the equivalence partitioning test case design technique. It is important to consider both valid and invalid partitions when designing test cases.
The boundaries are the values on and around the beginning and end of a partition. If possible test cases should be created to generate inputs or outputs that will fall on and to either side of each boundary.
Where a boundary value falls within the invalid partition the test case is designed to ensure the software component handles the value in a controlled manner. Boundary value analysis can be used throughout the testing cycle and is equally applicable at all testing phases.
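Continuing the hypothetical 18-to-65 age field, a minimal sketch of boundary value analysis tests the values on and immediately either side of each boundary:

def accept_age(age):
    return 18 <= age <= 65

boundary_tests = [
    (17, False), (18, True), (19, True),   # around the lower boundary
    (64, True),  (65, True), (66, False),  # around the upper boundary
]
for value, expected in boundary_tests:
    assert accept_age(value) == expected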
After determining the necessary test cases with equivalence partitioning and subsequent boundary value analysis, it is necessary to define the combinations of the test cases when there are multiple inputs to a software component.
Decision tables, like if-then-else and switch-case statements, associate conditions with actions to perform. But, unlike the control structures found in traditional programming languages, decision tables can associate many independent conditions with several actions in an elegant way
Decision tables can be used when combinations of conditions are given. In a decision table the conditions are known as causes, and each numbered combination of conditions is known as a business rule.
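A minimal sketch, using hypothetical discount rules, shows how each column (business rule) of a decision table becomes one test case:

# Decision table (hypothetical rules):
#   Rule            R1     R2     R3     R4
#   member?         Y      Y      N      N
#   order > 100?    Y      N      Y      N
#   discount        15%    10%    5%     0%

def discount(is_member, order_total):
    if is_member and order_total > 100:
        return 0.15
    if is_member:
        return 0.10
    if order_total > 100:
        return 0.05
    return 0.0

# One test per rule (column)
rules = [
    (True,  150.0, 0.15),
    (True,   50.0, 0.10),
    (False, 150.0, 0.05),
    (False,  50.0, 0.0),
]
for is_member, total, expected in rules:
    assert discount(is_member, total) == expected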
State Transition testing is used in a state-machine or event-driven type of development scenario, where the program logic has multiple state transitions based on various events that occur.
A typical example would be an order fulfilment system where the order states are Received - In progress - Completed, plus various other states depending on the actions taken and the result of each action.
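A minimal sketch of such a state machine (the states and events are illustrative assumptions, not a real system) shows how state transition tests cover each valid transition and at least one invalid one:

VALID_TRANSITIONS = {
    ("Received",    "start_processing"): "In progress",
    ("In progress", "complete"):         "Completed",
    ("Received",    "cancel"):           "Cancelled",
    ("In progress", "cancel"):           "Cancelled",
}

def next_state(state, event):
    # Return the next state, or raise if the transition is invalid
    try:
        return VALID_TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"invalid transition: {event} in state {state}")

# Cover every valid transition
assert next_state("Received", "start_processing") == "In progress"
assert next_state("In progress", "complete") == "Completed"
assert next_state("Received", "cancel") == "Cancelled"
assert next_state("In progress", "cancel") == "Cancelled"
# And at least one invalid transition
try:
    next_state("Completed", "cancel")   # nothing left to cancel
except ValueError:
    pass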
Use cases describe the system from the user's point of view.
Use cases describe the interaction between a primary actor (the initiator of the interaction) and the system itself, represented as a sequence of simple steps. Actors are something or someone which exists outside the system under study, and that take part in a sequence of activities in a dialogue with the system to achieve some goal. Actors may be end users, other systems, or hardware devices. Each use case is a complete series of events, described from the point of view of the actor.
Use cases may be described at the abstract level (business use case, sometimes called essential use case), or at the system level (system use case). The difference between these is the scope.
A business use case is described in technology-free terminology which treats the business process as a black box and describes the business process that is used by its business actors (people or systems external to the business) to achieve their goals.
A system use case is normally described at the system functionality level (for example, create voucher) and specifies the function or the service that the system provides for the user. A system use case will describe what the actor achieves interacting with the system.
Use case testing will cover the use case specification, which includes the preconditions, the basic course of action, alternative courses of action and exception scenarios, the expected outcome in each case, and the postconditions.
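As a minimal sketch, a hypothetical 'withdraw cash' use case test could be outlined as follows, covering each of those parts (the steps and values are illustrative only):

use_case_test = {
    "name": "Withdraw cash",                      # hypothetical system use case
    "precondition": "Account exists with balance 100",
    "basic_course": [
        ("insert card and enter correct PIN", "menu displayed"),
        ("request withdrawal of 40", "40 dispensed, balance 60"),
    ],
    "alternative_course": [
        ("enter wrong PIN once, then correct PIN", "menu displayed"),
    ],
    "exception": [
        ("request withdrawal of 500", "rejected: insufficient funds"),
    ],
    "postcondition": "Balance reflects only the successful withdrawal",
}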
Structure Based or White Box testing:
Structural test techniques are sometimes called white box test techniques, hence the term 'white box testing'. Glass box testing is a less widely used term for structural test case design. Structural test techniques help design test cases based on the internal structure and design of the component or system under test. Look at the code, database spec, data model, etc., to determine test inputs and expected results.
White Box Techniques
The following list of white box techniques is from BS 7925-2. On this course we will describe and give an example of only those ones highlighted in bold.
Statement Testing
Branch Decision Testing
Data Flow Testing
Branch Condition Testing
Branch Condition Combination Testing
Modified Condition Decision Testing
LCSAJ Testing
Random Testing
Statement testing is a structural technique based on decomposition of a software component into constituent statements. Statements are identified as executable or non-executable. For each test case, you must specify the inputs to the component, identification of the statement(s) to be executed and the expected outcome of the test case.
Example of program with 4 statements:
X = INPUT;
Y = 4;
IF X > Y THEN
    Message = "Limit exceeded"
ELSE
    Message = "No problems reported"
END IF
Z = 0
Branch testing requires a model of the source code, which identifies decisions and decision outcomes. A decision is an executable statement, which may transfer control to another statement depending upon the logic of the decision statement. Typical decisions are found in loops and selections. Each possible transfer of control is a decision outcome. For each test case you must specify the inputs to the component, identification of the decision outcomes to be executed by the test case and the expected outcome of the test case.
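A minimal sketch, rewriting the small pseudocode program above in Python, shows that two test cases are enough to exercise both outcomes of its single decision:

def check_limit(x):
    y = 4
    if x > y:            # decision with two outcomes
        message = "Limit exceeded"
    else:
        message = "No problems reported"
    return message

# Test case 1 exercises the TRUE outcome, test case 2 the FALSE outcome
assert check_limit(10) == "Limit exceeded"
assert check_limit(2) == "No problems reported"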
Comparison of white box techniques from "Software Testing in the Real World"
Each column in this figure represents a distinct method of white-box testing, and each row (1-4) defines a different test characteristic. For a given method (column), "Y" in a given row means that the test characteristic is required for the method. "N" signifies no requirement. "Implicit" means the test characteristic is achieved implicitly by other requirements of the method. (© 1993, 1994 Software Development Technologies) Reproduced with permission.
LCSAJ Testing
LCSAJ stands for linear code sequence and jump. An LCSAJ is a sequence of code that begins either at the start of the program or at a point to which control can jump, and ends at a point from which control jumps elsewhere.
Random Testing is usually carried out when there is not enough time to conduct formal test planning and execution and the software to be released is not mission critical, life dependent software. To conduct this type of testing usually requires experience of the software programs or functions and key areas where defects might occur.
Experience Based Testing:
Experience-based testing is where tests are derived from the tester's knowledge and intuition and their experience with similar applications and technologies. These techniques can be useful in identifying special tests not easily captured by formal techniques, especially when applied after more formal approaches. However, this technique may yield widely varying degrees of effectiveness, depending on the testers' experience.
A commonly used experience-based technique is error guessing. Generally testers anticipate defects based on experience. A structured approach to the error guessing technique is to enumerate a list of possible errors and to design tests that attack these errors. This systematic approach is called fault attack. These defect and failure lists can be built based on experience, available defect and failure data, and from common knowledge about why software fails.
Exploratory testing is concurrent test design, test execution, test logging and learning, based on a test charter containing test objectives, and carried out within time-boxes. It is an approach that is most useful where there are few or inadequate specifications and severe time pressure, or in order to augment or complement other, more formal testing. It can serve as a check on the test process, to help ensure that the most serious defects are found.
Choosing Test Techniques:
Choosing between static and dynamic mainly depends on the stage of the test process. They both are required during the testing cycle of a project.
However choosing between other dynamic techniques usually depends on the type application, usability, architecture etc.
For example:
If the application is a public-facing web site, then non-functional (behavioural) testing to check the behaviour under load would be a good technique.
If a user form validates certain numeric inputs, then applying boundary value analysis would be an appropriate technique.
5. Test Management
Overview:
This module covers the overall management of the test effort for a particular project and attempts to answer several key questions such as:
How many testers do we need?
How shall the testers be organized?
What's the reporting structure and who is in charge?
How will we estimate the amount of testing effort required for this project?
How do we keep versions of our test material in line with the development deliverables?
How do we ensure the test effort remains on track?
How do we know that we have finished testing?
What is the process for logging and tracking incidents?
Test estimation
The effort required to perform the activities specified in the high-level test plan must be calculated in advance. You must remember to allocate time for designing and writing the test scripts as well as estimating the test execution time. If you are going to use test automation, there will be a steep learning curve for new people and you must allow for this as well. If your tests are going to run on multiple test environments, add in extra time here too. Finally, you should never expect to complete all of the testing in one cycle, as there will be faults to fix and tests will have to be re-run. Decide on how many test cycles you will require and try to estimate the amount of re-work (fault fixing and re-testing time).
Test monitoring
Many test efforts fail despite wonderful plans. One of the reasons might be that the test team was so engrossed in detailed testing effort (working long hours, finding many faults) that they did not have time to monitor progress. This however is vitally important if the project is to remain on track. (e.g. use a weekly status report).
Configuration management (CM)
We all appreciate the need for testing and assuring quality in our development systems. But how many of us appreciate that Configuration Management is a precursor to these goals?
Configuration Management provides us with the balance to ensure that:
Systems are complete.
Systems are predictable in content.
Testing required is identified.
Change can be ring-fenced as complete.
An audit trail exists.
We've always practiced CM. Such activities as aggregating and releasing software to the production environment may, in the past, have been uncontrolled - but we did it.
Many famous organizations have found the need to develop a standard for CM that they have since taken into the marketplace.
Configuration management encompasses much more than simply keeping version control of your software and test assets, although that is a very good start. Configuration management is crucial to successful testing, especially regression testing, because in order to make tests repeatable you must be able to recreate exactly the software and hardware environment that was used in the first instance.
Typical symptoms of poor CM might include:
Unable to match source and object code.
Unable to identify which version of a compiler generated the object code.
Unable to identify the source code changes made in a particular version of the software.
Simultaneous changes are made to the same source code by multiple developers (and changes are lost).
ISO (International Standards Organization) definition of CM:
Configuration management (CM) provides a method to identify, build, move, control and recover any baseline in any part of the life cycle, and ensures that it is secure and free from external corruption.
Configuration identification requires that all configuration items (CI) and their versions in test system are known.
Configuration control is maintenance of CI’s in a library and maintenance of records on how CI’s change over time.
Status accounting is the function of recording and tracking problem reports, change requests, etc.
Configuration auditing is the function to check on the contents of libraries, etc. for standards compliance, for instance.
CM can be very complicated in environments where mixed hardware and software platforms are being used, but sophisticated cross platform CM tools are increasingly available.
Simple CM life cycle process:
CM contains a large number of components. Each component has its own process and contributes to the overall process.
Let's take a look at a simple process that raises a change, manages it through the life cycle and finally executes an implementation to the production environment. Here we can see how the CM life cycle operates by equating actions with aspects of CM.
As a precursor activity we 'Evaluate Change'. All changes are evaluated before they enter the CM Life Cycle:
1. Raise Change Packet.
Uses Change Management functions to identify and register a change. Uses Change Control functions to determine that the action is valid and authorized.
2. Add Configurable Item to Change Packet.
Select and assign configurable items to the Change Packet. Execute Impact Analysis to determine the items that also require some action as a result of the change and the order in which the actions take place.
3. Check-In.
Apply a version to the CI and place it back under CM control.
4. Create Executable.
Build an executable for every contained CI in the order indicated.
5. Sign-Off.
Uses Change Control to verify that the person signing off testing as complete for the environment in which the change is contained is authorized to do so, and that the action is valid.
6. Check-OK to Propagate.
Uses Change Control and Co-Requisite checks to verify that the request to move a Change Packet through the life cycle is valid:
a) All precursor tasks completed successfully.
b) Next life cycle environment fit to receive.
c) Subsequent change in current environment has not invalidated Change Packet.
7. Propagation
Populate the next environment by releasing the Change Packet and then distributing it over a wider area.
Note: You might notice that this is composed of a number of singular functions and series executed as a cycle. Of particular note is that 'Create Executable' is a singular function. This is because we should only ever build once if at all possible. This, primarily, saves time and computer resources. However, re-building an element in a new environment may negate testing carried out in the preceding one and can lead to a lengthy problem investigation phase.
What does CM control?
CM should control every element that is a part of a system or application.
Nominally, CM:
1. Configurable Items:
Maintains registration and position of all of our CIs. These may be grouped into logically complete change packets as part of a development or maintenance exercise.
2. Defines Development Life Cycle:
It is composed of a series of transition points, each having its own Entry/Exit criteria and which maps to a specific test execution stage.
3. Movement:
Controls a Change Packet's movement and progresses it through the Life Cycle.
4. Environment:
Testing takes place in a physical environment configured specifically for the stage of testing that the life cycle transition point and stage reflect.
How is it controlled?
CM is like every other project in that it requires a plan. It is particularly important that CM has a plan of what it is to provide as it forms a framework for life cycle management in which to work consistently and securely.
The CM plan covers what needs to be done, not by when, and defines three major areas:
The Processes will define or include:
Raising a change
Adding elements
Booking elements in and out
Exit/Entry criteria
Life cycle definition
Life cycle progression
Impact analysis,
Ring-fencing change, release aggregation
Change controls.
Naming standards
Generic processes for build and other activities
The Roles & responsibilities covering who and what can be done:
Configuration Manager & Librarian, Project Manager, Operations Personnel, Users, Developers and others as necessary
Records, providing necessary audit trail will include:
What is managed, the status of the life cycle position and change status.
Who did what, where, when and under what authority. Also the success factor for activities.
Only once the CM plan and the processes that support it are defined can we consider automation.
What does CM look like?
CM has several hubs and functions that will make or break it. Hubs of system are defined as areas where information and source code are stored. Typically major hubs are central inventory and central repository. Surrounding those are four major tool sets that allow us to work on the data:
Version Management
Allows us access to any version or revision of a stored element.
Configuration Control
Allows us to group elements into manageable sets.
Change Control & Management
This is the global name given to the processes that govern change through the application development life cycle and the stages it passes through, from an idea through to implementation. It may include:
Change Control Panel or Board to assess and evaluate change;
Controls: Governing who can do what, when and under what circumstances.
Management: Carrying out an action or movement through the life cycle once the controls have been satisfied
Build & Release
Controls how our elements are built and the manner in which our change is propagated through the life cycle.
This view is about as close to a generic, global view of CM as you can get. It won't match all tools 100% as it covers all aspects of CM - and very few of the tools (although they might claim to) can do this.
Incident Management
An incident is any significant, unplanned event that occurs during testing that requires subsequent investigation and/or correction. Incidents are raised when expected and actual test results differ.
What is an incident?
You may now be thinking that incidents are simply another name for faults, but this is not the case. We cannot determine at the time an incident has occurred whether there is really a fault in the software, whether the environment was perhaps set up incorrectly or whether in fact the test script was incorrect. Therefore we log the incident and move on to the next test activity.
Incidents and the test process
An incident occurs whenever an error, query or problem arises during the test process. There must be procedures in place to ensure accurate capture of all incidents. Incident recording begins as soon as testing is introduced into the system's development life cycle. The first incidents raised will therefore be against documentation; as the project proceeds, incidents will be raised against database designs, and eventually against the program code of the system under test.
Incident logging
Incidents should be logged when someone other than the author of the product under test performs the testing. When describing an incident, diplomacy is required to avoid unnecessary conflicts between the different teams involved in the testing process (e.g. developers and testers). Typically, the information logged on an incident will include the following (a minimal sketch of such a record follows the list):
. Name of tester(s), date/time of incident, Software under test ID
. Expected and actual results
. Any error messages
. Test environment
. Summary description
. Detailed description including anything deemed relevant to reproducing/fixing potential fault (and continuing with work)
. Scope
. Test case reference
. Severity (e.g. showstopper, unacceptable, survivable, trivial)
. Priority (e.g. fix immediately, fix by release date, fix in next release)
. Classification. Status (e.g. opened, fixed, inspected, retested, closed)
. Resolution code (what was done to fix fault)
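To make the fields concrete, here is a minimal sketch of an incident record in Python (the field names and example values are illustrative assumptions, not a mandated format):

from dataclasses import dataclass
from datetime import datetime

@dataclass
class Incident:
    tester: str
    raised_at: datetime
    software_id: str
    test_case_ref: str
    summary: str
    detailed_description: str
    expected_result: str
    actual_result: str
    severity: int        # e.g. 1 = showstopper .. 4 = cosmetic
    priority: int        # e.g. 1 = fix immediately .. 4 = no plan to fix
    status: str = "opened"
    resolution_code: str = ""

incident = Incident(
    tester="A. Tester",
    raised_at=datetime(2024, 1, 15, 9, 30),
    software_id="ORDERS v1.2",
    test_case_ref="TC-042",
    summary="Total not recalculated after item removed",
    detailed_description="Remove last basket item; total still shows old value.",
    expected_result="Total = 0.00",
    actual_result="Total = 19.99",
    severity=2,
    priority=2,
)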
Incidents must be graded to identify the severity of incidents and improve the quality of reporting information. Many companies use a simple approach such as a numeric scale of 1 to 4 or high, medium and low. Beizer has devised a list and weighting for faults as follows:
1 Mild Poor alignment, spelling etc.
2 Moderate Misleading information, redundant information
3 Annoying Bills for 0.00, truncation of name fields etc.
4 Disturbing Legitimate actions refused, sometimes it works, sometimes not
5 Serious Loss of important material, system loses track of data, records etc.
6 Very serious The mis-posting of transactions
7 Extreme Frequent and widespread mis-postings
8 Intolerable Long term errors from which it is difficult or impossible to recover
9 Catastrophic Total system failure or out of control actions
10 Infectious Other systems are being brought down
In practice, in the commercial world at least, this list is over the top and many companies use a simple approach such as numeric scale of 1 to 4 as outlined below:
1 Showstopper Very serious fault and includes GPF, assertion failure or complete system hang
2 Unacceptable Serious fault where the software does not meet business requirements and there is no workaround
3 Survivable Fault that has an easy workaround - may involve partial manual operation
4 Cosmetic Covers trivial faults like screen layouts, colors, alignments, etc.
Note that incident priority is not the same as severity. Priority relates to how soon the fault will be fixed and is often classified as follows:
1. Fix immediately.
2. Fix before the software is released.
3. Fix in time for the following release.
4. No plan to fix.
It is quite possible to have a severity 1 priority 4 incident and vice versa although the majority of severity 1 and 2 faults are likely to be assigned a priority of 1 or 2 using the above scheme.
Tracking and analysis
Incidents should be tracked from inception through various stages to eventual close-out and resolution. There should be a central repository holding the details of all incidents.
For management information purposes it is important to record the history of each incident. There must be incident history logs raised at each stage whilst the incident is tracked through to resolution, for traceability and audit purposes. This will also allow the formal documentation of the incidents (and the departments who own them) at a particular point in time.
Typically, entry and exit criteria take the form of the number of incidents outstanding by severity. For this reason it is imperative to have a corporate standard for the severity levels of incidents.
Incidents are often analyzed to monitor test process and to aid in test process improvement. It is often useful to look at sample of incidents and try to determine the root cause.
6. Tool Support for Testing
Overview
When people discuss testing tools they invariably think of automated testing tools and in particular capture/replay tools. However, the market changes all the time and this module is intended to give you a flavor of the many different types of testing tool available. There is also a discussion about how to select and implement a testing tool for your organization. Remember the golden rule, if you automate a mess, you'll get automated chaos; choose tools wisely!
Types of CAST tools
There are numerous types of computer-aided software testing (CAST) tool and these are briefly described below.
Requirements testing tools provide automated support for the verification and validation of requirements models, such as consistency checking and animation.
Static analysis tools provide information about the quality of the software by examining the code, rather than by running test cases through the code. Static analysis tools usually give objective measurements of various characteristics of the software, such as cyclomatic complexity measures and other quality metrics.
Test design tools generate test cases from a specification that must normally be held in a CASE tool repository or from formally specified requirements held in the tool itself. Some tools generate test cases from an analysis of the code.
Test data preparation tools enable data to be selected from existing databases or created, generated, manipulated and edited for use in tests. The most sophisticated tools can deal with a range of file and database formats.
Character-based test running tools provide test capture and replay facilities for dumb-terminal based applications. The tools simulate user-entered terminal keystrokes and capture screen responses for later comparison. Test procedures are normally captured in a programmable script language; data, test cases and expected results may be held in separate test repositories. These tools are most often used to automate regression testing.
GUI test running tools provide test capture and replay facilities for WIMP interface based applications. The tools simulate mouse movement, button clicks and keyboard inputs and can recognize GUI objects such as windows, fields, buttons and other controls. Object states and bitmap images can be captured for later comparison. Test procedures are normally captured in a programmable script language; data, test cases and expected results may be held in separate test repositories. These tools are most often used to automate regression testing.
Test harnesses and drivers are used to execute software under test, which may not have a user interface, or to run groups of existing automated test scripts, which can be controlled by the tester. Some commercially available tools exist, but custom-written programs also fall into this category. Simulators are used to support tests where code or other systems are either unavailable or impracticable to use (e.g. testing software to cope with nuclear meltdowns).
Performance test tools have two main facilities: load generation and test transaction measurement. Load generation is done either by driving the application using its user interface or by test drivers, which simulate the load generated by the application on the architecture. Records of the numbers of transactions executed are logged. When driving the application using its user interface, response time measurements are taken for selected transactions and these are logged. Performance testing tools normally provide reports based on test logs, and graphs of load against response times.
Dynamic analysis tools provide run-time information on state of executing software. These tools are most commonly used to monitor allocation, use and de-allocation of memory, flag memory leaks, unassigned pointers, pointer arithmetic and other errors difficult to find 'statically'.
Debugging tools are mainly used by programmers to reproduce bugs and investigate the state of programs. Debuggers enable programmers to execute programs line by line, to halt program at any program statement and to set and examine program variables.
Comparison tools are used to detect differences between actual results and expected results. Standalone comparison tools normally deal with a range of file or database formats. Test running tools usually have built-in comparators that deal with character screens, GUI objects or bitmap images. These tools often have filtering or masking capabilities, whereby they can 'ignore' rows or columns of data or areas on screens.
Test management tools may have several capabilities. Test ware management is concerned with creation, management and control of test documentation, e.g. test plans, specifications, and results. Some tools support project management aspects of testing, for example, scheduling of tests, logging of results and management of incidents raised during testing. Incident management tools may also have workflow-oriented facilities to track and control allocation, correction and retesting of incidents. Most test management tools provide extensive reporting and analysis facilities.
Coverage measurement (or analysis) tools provide objective measures of structural test coverage when tests are executed. Programs to be tested are instrumented before compilation. Instrumentation code dynamically captures coverage data in a log file without affecting the functionality of the program under test. After execution, the log file is analysed and coverage statistics generated. Most tools provide statistics on the most common coverage measures such as statement or branch coverage.
Tool selection and implementation
There are many test activities, which can be automated, and test execution tools are not necessarily first or only choice. Identify your test activities where tool support could be of benefit and prioritize areas of most importance.
Fit with your test process may be more important than choosing tool with most features in deciding whether you need a tool, and which one you choose. Benefits of tools usually depend on a systematic and disciplined test process. If testing is chaotic, tools may not be useful and may hinder testing. You must have a good process now, or recognize that your process must improve in parallel with tool implementation. The ease by which CAST tools can be implemented might be called 'CAST readiness’.
Tools may have interesting features, but may not necessarily be available on your platforms, e.g., 'works on 15 flavors of Unix, but not yours...’ Some tools, e.g. performance testing tools, require their own hardware, so cost of procuring this hardware should be a consideration in your cost benefit analysis. If you already have tools, you may need to consider level and usefulness of integration with other tools, e.g., you may want test execution tool to integrate with your existing test management tool (or vice versa). Some vendors offer integrated toolkits, e.g. test execution, test management, performance-testing bundles. Integration between some tools may bring major benefits, in other cases; level of integration is cosmetic only.
Once automation requirements are agreed, selection process has four stages:
Creation of a candidate tool shortlist.
Arrange demos.
Evaluation(s) of selected tool(s).
Review and select tool.
Before making a commitment to implementing the tool across all projects, a pilot project is usually undertaken to ensure the benefits of using the tool can actually be achieved. The objectives of the pilot are to gain some experience in use of the tool, identify changes in the test process required and assess the actual costs and benefits of implementation. Roll-out of the tool should be based on a successful result from the evaluation of the pilot. Roll-out normally requires strong commitment from tool users and new projects, as there is an initial overhead in using any tool in new projects.
1.4.5 What happened on October 26th & 27th 1992?
The London Ambulance Service (LAS) covers an area of just over 600 square miles and is the largest ambulance service in the world. It covers a resident population of some 6.8 million, but its daytime population is larger, especially in central London. Since 1974 South West Thames Regional Health Authority has managed it.
The LAS carries over 5000 patients every day and receives between 2000 and 2500 calls daily (including between 1300 and 1600 emergency calls i.e. 999 calls). In terms of resources the LAS has some 2700 staff, 750 ambulances and a small number of various other vehicles including 1 helicopter. LAS make almost 2 million patient journeys each year. In 1992/1993 its budgeted income was £69.7 million.
On the 26th and 27th October 1992 the new system failed, ambulances failed to turn up and people lost their lives. Although no Coroners Court ever apportioned blame for any deaths directly to the computer systems failure, it was by any standards a major disaster and made the main evening news bulletins on several occasions.
1.4.6 London Ambulance Service
In summary the problems were:
Computer Aided Dispatch - 1
The system relied on near perfect information to propose optimum resource to be allocated to an emergency. However, there were imperfections in the information and changes in operational procedures made it difficult for staff to correct the system.
This was not a problem when it went live at 7 am on 26th October 1992 as the system load was light; however, as the number of emergency calls increased throughout the day it became increasingly difficult for staff to correct errors; this led to:
• Poor, duplicated and delayed allocations.
• Build-up of exception messages and awaiting attention list.
• Slow-down of system as messages and lists built up.
• Increased number of callbacks and hence delays in telephone answering.
The cost of these errors was ultimately that ambulances didn't turn up and people lost their lives, although the official enquiry report did not attribute any fatalities to the system problems. The costs in terms of loss of confidence in the computer system, industrial relations and so on were probably also high.
1.4.7 Exhaustive testing why not test everything?
It is now widely accepted that you cannot test everything. Exhausted testers you will find, but exhaustive testing you will not. Complete testing is neither theoretically, nor practically possible. Consider a 10-character string: it has 2^80 possible input streams and corresponding outputs. If you executed one test per microsecond it would take approx. 4 times the age of the Universe to test this completely. For a survey of the methodology and limitations of formal proofs of program correctness see [Manna 78].
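A rough back-of-the-envelope check of that claim, sketched in Python (assuming one test per microsecond and an age of the universe of about 13.8 billion years):

tests = 2 ** 80                                  # possible 80-bit input streams
rate = 1_000_000                                 # tests per second (one per microsecond)
seconds_needed = tests / rate                    # about 1.2e18 seconds

age_of_universe = 13.8e9 * 365.25 * 24 * 3600    # about 4.4e17 seconds
print(seconds_needed / age_of_universe)          # roughly 3, the same order as the figure above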
1.4.8 Testing and risk
How much testing would you be willing to perform if the risk of failure were negligible? Alternatively, how much testing would you be willing to perform if a single defect could cost you your life's savings, or, even more significantly, your life? [Hetzel 88].
The amount of testing performed depends on the risks involved. Risk must be used as the basis for allocating the test time that is available and for selecting what to test and where to place emphasis. A priority must be assigned to each test.
Test Managers and Project Managers come up with different prioritization schemes but the basic principle is that you must focus the testing effort on those areas of the system that are likely to have the most defects. Another key principle is that you must execute the most important tests first. Otherwise, if you run out of time, which is likely, you will not have exercised the tests that give you the best payback in terms of faults found.
1.4.9 Testing and quality
Testing identifies faults whose removal increases the software quality by increasing the software's potential reliability. Testing is the measurement of software quality. We measure how closely we have achieved quality by testing the relevant factors such as correctness, reliability, usability, maintainability, reusability, testability etc.
1.4.10 Testing and legal, contractual, regulatory or mandatory requirements
Other factors that may determine the testing performed may be legal or contractual requirements, normally defined in industry-specific standards or based on agreed best practice (or, more realistically, non-negligent practice).
1.4.11 How much testing is enough?
It is difficult to determine how much testing is enough. Testing is always a matter of judging risks against cost of extra testing effort. Planning test effort thoroughly before you begin, and setting completion criteria will go some way towards ensuring the right amount of testing is attempted. Assigning priorities to tests will ensure that the most important tests have been done should you run out of time.
General Testing Principles:
Principle 1 – Testing shows presence of defects
Testing can show that defects are present, but cannot prove that there are no defects. Testing
reduces the probability of undiscovered defects remaining in the software but, even if no defects are
found, it is not a proof of correctness.
Principle 2 – Exhaustive testing is impossible
Testing everything (all combinations of inputs and preconditions) is not feasible except for trivial
cases. Instead of exhaustive testing, risk analysis and priorities should be used to focus testing
efforts.
Principle 3 – Early testing
Testing activities should start as early as possible in the software or system development life cycle,
and should be focused on defined objectives.
Principle 4 – Defect clustering
A small number of modules contain most of the defects discovered during pre-release testing, or
are responsible for the most operational failures.
Principle 5 – Pesticide paradox
If the same tests are repeated over and over again, eventually the same set of test cases will no
longer find any new defects. To overcome this “pesticide paradox”, the test cases need to be
regularly reviewed and revised, and new and different tests need to be written to exercise different
parts of the software or system to potentially find more defects.
Principle 6 – Testing is context dependent
Testing is done differently in different contexts. For example, safety-critical software is tested
differently from an e-commerce site.
Principle 7 – Absence-of-errors fallacy
Finding and fixing defects does not help if the system built is unusable and does not fulfill the users’
needs and expectations.
1.5 Fundamental Test Processes
Introduction
Testing must be planned. This is one of Bill Hetzel's 6 testing principles [Hetzel 88 p25] and he says we are all agreed on this one. However, he points out that the problem is that most of us do not discipline ourselves to act upon it. Good testing requires thinking out an overall approach, designing tests and establishing expected results for each of the test cases we choose.
You have seen already that we cannot test everything, we must make a selection, and the planning and care we expend on that selection accounts for much of the difference between good and poor testers.
The quality and effectiveness of software testing are primarily determined by the quality of the test processes used [Kit 95]. This is one of Ed Kit's 6 essentials of software testing. Test groups that operate within organizations that have an immature development process will feel more pain than those that do not. However, the test group should strive to improve its own internal testing processes. This section of the course shows a fundamental test process, based on the BS7925-2 standard for software component testing.
The fundamental test process comprises planning, specification, execution, recording and checking for completion. You will find organizations that have slightly different names for each stage of the process and you may find some processes that have just 4 stages, where 4 & 5 are combined, for example. However, you will find that all good test processes adhere to this fundamental structure.
1.5.2 Test process stages
See BS7925-2 for diagram of test process. Test planning involves producing a document that describes an overall approach and test objectives noting any assumptions you have made and stating any exceptions to your overall test strategy for your project. Test planning can be applied at all levels. Completion or exit criteria must be specified so that you know when testing (at any stage) is complete. Often a coverage target is set and used as test completion criteria.
Test specification (sometimes referred to as test design) involves designing test conditions and test cases using recognized test techniques identified at the planning stage. Here it is usual to produce a separate document or documents that fully describe the tests that you will carry out. It is important to determine the expected results prior to test execution.
Test execution involves actually running the specified test on a computer system either manually or by using an automated test tool.
Test recording involves keeping good records of the test activities that you have carried out. The versions of the software you have tested and the test specifications are recorded along with the actual outcomes of each test.
Checking for test completion involves looking at the previously specified test completion criteria to see if they have been met. If not, some test may need to be rerun and in some instances it may be appropriate to design some new test cases to meet a particular coverage target.
Note that BS7925-2 does not specify the format of any test documentation. However, The IEEE standard, known as 829, specifies in detail a standard for software test documentation.
BS7925-2 shows a diagram of a suggested hierarchy of test documentation.
HOME WORK
Exercise
Putting aside management problems, read through the test documentation examples in BS7925-2 and answer the following questions:
What test techniques does component test strategy stipulate?
What percentage of decision coverage is required?
What should be done if errors are found?
The project component test plan is useful because the approach outlined allows:
a) Strict adherence to the component test strategy
b) More faults to be identified in the LOG components
c) A basic working system to be established as early as possible
d) Isolation of the components within the test strategy
The component test plan must consist of a single document? TRUE/FALSE
The component test plan must specify test completion criteria? TRUE/FALSE
Why does component test plan specify 100% DC whereas strategy required 90%?
Which test case deals with non-numeric input?
List the expected outcome and the test condition
Why does the CTP have additional/altered test cases?
What action has been taken as a result of the test report?
1.5.3 Successful tests detect faults
As the objective of a test should be to detect faults, a successful test is one that does detect a fault. This is counter-intuitive, because faults delay progress; a successful test is one that may cause delay. But the successful test reveals a fault which, if found later, may be many times more costly to correct, so in the long run it is a good thing.
1.5.4 Meaning of completion or exit criteria
Completion or exit criteria are used to determine when testing (at any stage) is complete. These criteria may be defined in terms of cost, time, faults found or coverage criteria.
1.5.5 Coverage criteria
Coverage criteria are defined in terms of items that are exercised by test suites, such as branches, user requirements, and most frequently used transactions etc.
The psychology of testing:
1.6.1 Purpose
The purpose of this section is to explore differences in perspective between tester and developer (buyer & builder) and explain some of the difficulties management and staff face when working together developing and testing computer software.
1.6.2 Different mindsets
We have already discussed that one of the primary purposes of testing is to find faults in software, i.e., it can be perceived as a destructive process. The development process on the other hand is a naturally creative one, and experience shows that staff who work in development have different mindsets to those of testers.
We would never argue that one group is intellectually superior to another, merely that they view systems development from another perspective. A developer is looking to build new and exciting software based on user's requirements and really wants it to work (first time if possible). He or she will work long hours and is usually highly motivated and very determined to do a good job.
A tester, however, is concerned that the user really does get a system that does what they want, is reliable and doesn't do things it shouldn't. He or she will also work long hours looking for faults in the software but will often find the job frustrating as their destructive talents take their toll on the poor developers. At this point, there is often much friction between developer and tester. The developer wants to finish the system but the tester wants all faults in the software fixed before their work is done.
In summary:
Developers:
Are perceived as very creative - they write code without which there would be no system!
Are often highly valued within an organization.
Are sent on relevant industry training courses to gain recognized qualifications.
Are rarely good communicators (sorry guys)!
Can often specialize in just one or two skills (e.g. VB, C++, JAVA, SQL).
Testers:
Are perceived as destructive - only happy when they are finding faults!
Are often not valued within the organization.
Usually do not have any industry recognized qualifications, until now
Usually require good communication skills, tact & diplomacy.
Normally need to be multi-talented (technical, testing, team skills).
1.6.3 Communication between developer and tester
It is vitally important that tester can explain and report fault to developer in professional manner to ensure fault gets fixed. Tester must not antagonize developer. Tact and diplomacy are essential, even if you've been up all night trying to test the wretched software.
1.6.4 How not to approach
Tester: "Hey Fred. Here's a fault report AR123. Look at this code. Who wrote this? Was it you? Why, you couldn't program your way out of a paper bag. We really want this fixed by 5 o'clock or else."
We were unable to print Fred's reply because of the language! Needless to say Fred did not fix the fault as requested.
Exercise
Your trainer will split you into small test teams. One of you will be the test team leader. You have found several faults in a program and the team leader must report these to the developer (your trainer). The background is that your team has tested this program twice before and there are still quite a lot of serious faults in the code. There are also several spelling mistakes and wrong colors on the screen layout. The test team is getting a bit fed up. However, you have to be as nice as possible to the developer.
1.6.6 Why can't we test our own work?
This seems to be a human problem in general not specifically related to software development. We find it difficult to spot errors in our own work products. Some of the reasons for this are:
We make assumptions
We are emotionally attached to the product (it's our baby and there's nothing wrong with it).
We are so familiar with the product we cannot easily see the obvious faults.
We're humans.
We see exactly what we want to see.
We have a vested interest in passing the product as ok and not finding faults.
Generally it is thought that objective independent testing is more effective. There are several levels of independence as follows:
Test cases are designed by the person(s) writing the software.
Test cases are designed by another person(s).
Test cases are designed by a person(s) from a different section.
Test cases are designed by a person(s) from a different organization.
Test cases are not chosen by a person.
The discussion of independent test groups and outsourcing is left to another section.
2 Testing throughout the project lifecycle:
2.3 Software Development Model
There are many models used to describe the sequence of activities that make up a Systems Development Life Cycle (SDLC). SDLC is used to describe the activities of both development and maintenance work. Three models are worth mentioning.
• Sequential (the traditional waterfall model).
• Incremental (the function by function incremental model).
• Spiral (the incremental, iterative, evolutionary, RAD, prototype model).
The three models would all benefit from earlier attention to the testing activity that has to be done at some time during the SDLC.
Any reasonable model for the SDLC must allow for change, and the spiral approach allows for this with its emphasis on a slowly changing (evolving) design. We have to assume change is inevitable and will have to design for change.
Fact 1: Business is always changing.
Fact 2: Finding a fault causes change.
Result: The ease of fixing a fault defines the ease of responding to change.
Corollary: If we want systems that can be modified and hence maintained, the earlier we start testing and try the change process, the earlier we will find out how easy it is going to be to maintain the system.
1. Sequential Model
The sequential model often fails to bring satisfactory results because of the late attention to the testing activity. When earlier phases in the development cycle slip, it is the testing phase that gets squeezed. This can lead to a limited amount of testing being carried out with the associated production 'teething' problems.
2. Plan for waterfall model development of a system
The overall project plan for development of a system might be as shown below:
Activities:
Business study
Requirements analysis
User level design
Technical design
Program specification
Creation of code
Unit testing
Integration testing
System testing
Acceptance testing
Implementation
This is a typical Specify, Design and Build project plan.
All testing and quality control points come late in the project and are only done if there is time.
When testing is done so late in project it can reveal costly errors.
Project plan has testing done late because people think that only physical deliverables such as code can be tested. Clearly there has to be a better way.
The challenge is to devise a better way of developing systems. There is a need to introduce quality control points earlier in the SDLC.
3. Sequential model plus testing gives 'V' diagram
The V diagram is another way of looking at the sequential development but this time from viewpoint of testing activities that need to be completed later in SDLC.
The 'V' diagram in this simple form has been around for a long time and is especially useful as it easily demonstrates how testing work done early in SDLC is used as input to assurance work later in development.
The V model of the SDLC offers considerable benefits over others as it emphasizes the building of test data and test scenarios during development and not as an afterthought. The V model also allows for the establishment of versions, incremental development and regression testing.
Management needs to rename activities, referred to variously as systems testing or acceptance testing. There has always been a phase of development traditionally thought of as the testing phase. This is an historical perception. Testing is not a phase but rather an activity that must be carried out all through development, giving rise to the principle of Total Quality.
In the past, system testing was the only type of testing carried out. Testing was checking that programs, when linked together, met the systems specification. Whether design itself was correct was another matter. The concept of "testing" design before programs were coded was given only the most perfunctory attention. This was for two reasons:
1. By the time physical design had been done the system was very difficult to alter; modifications caused by design reviews were therefore very unwelcome.
2. The design was documented in terms of physical file layouts and program specifications, neither of which the user could comprehend. The question of whether the physical design was correct could be reviewed, but the more important question, "Did the system do what the user wanted?", was largely neglected.
4. 'V' with test recognized deliverable
Looking at the diagram above it is clear that the activity of Business Analysis has, as a deliverable, the Specification of Requirements from which the Acceptance Test Plan is constructed. To have created, for example, the System Architecture without an Integration Test Specification is to do only half the job!
5 Revised plan for system development
• Business Study
• Requirements analysis
• User level design
• Technical design
• Program specification
• Unit test planning
• Creation of code
• Unit testing
• Integration test planning
• Integration testing
• System testing
• Acceptance test planning
• Creation of detailed test material
• User system testing
• Acceptance testing
• deployment/go live
• Business benefit analysis
The overall project plan for development of a system might be as shown below. Note the new early quality control points.
This plan shows that the creation and running of actual tests are separated. The creation of test material (acceptance test plans, user system test scripts, technical system tests such as integration, link, recovery restart, etc., and unit test data) is done as the relevant design is done. The potential for automation is very good and the use of tools to capture the test cases, scripts, etc. will play a big part in making the running of tests efficient. The early creation of test material will make the process of developing a system effective. The emphasis must be on first being effective and then being efficient.
6. Rapid Application Development
The spiral, Rapid Application Development (RAD) model has the benefit of the evolutionary approach. This is an incremental process of build a little then test a little, which has the benefit of attempting to produce a usable but limited version early.
The RAD approach relies upon the quality of the RAD team.
The management issues to address are:
Have I got knowledgeable user input in my team?
Have I got experienced designers and developers in my team?
Am I leaving a good audit trail to support future Maintenance?
Test Levels:
COMPONENT TESTING
Component testing is described fully in BS 7925-2; you should be aware that component testing is also known as unit testing, module testing or program testing. The definition from BS 7925-2 is simply the testing of individual software components.
Traditionally, the programmer carries out component testing. This has proved to be less effective than if someone else designs and runs the tests for the component.
"Buddy" testing, where two developers test each other's work is more independent and often more effective. However, the component test strategy should describe what level of independence is applicable to a particular component.
Usually white box (structural) testing techniques are used to design test cases for component tests but some black box tests can be effective as well.
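For illustration only, here is a minimal sketch of a component (unit) test written in Python with the standard unittest module; the component under test, discount_for, is a hypothetical example rather than anything defined in the standard.

import unittest

def discount_for(order_total):
    # Hypothetical component under test: returns a discount rate.
    if order_total < 0:
        raise ValueError("order total cannot be negative")
    return 0.1 if order_total >= 100 else 0.0

class DiscountComponentTest(unittest.TestCase):
    def test_discount_applied_for_large_order(self):
        self.assertEqual(discount_for(150), 0.1)

    def test_negative_total_is_rejected(self):
        with self.assertRaises(ValueError):
            discount_for(-1)

if __name__ == "__main__":
    unittest.main()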
We have already covered a generic test process in this course. The component test process is shown in the following diagram:
INTEGRATION TESTING
Integration is the process of combining components into larger assemblies. In the standard BS 7925, integration testing is defined as "testing performed to expose faults in the interfaces and in the interaction between integrated components". However, in this section we look at two interpretations of integration testing, known as integration testing in the large and integration testing in the small.
By Integration Testing in the Large we mean testing the integration of the new system or software package with other (complete) systems. This would include the identification of, and risk associated with, all interfaces to these other systems. Also included is testing of any interfaces to external organizations (e.g. EDI - electronic data interchange, Internet) but not testing of the processing or operation of those external systems.
We use Integration Testing in the Small in the more traditional sense of integration testing, where components are assembled into sub-systems and sub-systems are linked together to form complete systems. Integration strategies may be incremental or non-incremental and include:
Big-bang
Top-down
Bottom-up
Sandwich
The testing approach is directly related to the integration strategy chosen.
Stubs & Drivers
A stub is a skeletal or special purpose implementation of a software module, used to develop or test a component that calls or is otherwise dependent on it. A test driver is a program or test tool used to execute software against a test case suite.
They are mainly substitutes for a software component that is not yet available when integrating the entire system.
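As a hedged illustration, the Python sketch below shows a stub standing in for an unavailable pricing service and a small driver that executes the component under test against a test case; all names are invented for the example.

# Stub: a skeletal implementation of a pricing service that is not yet available.
class PricingServiceStub:
    def price_of(self, item_code):
        # Always returns a fixed price so the caller can be integrated and tested.
        return 10.0

# Component under test: depends on a pricing service.
def order_total(item_codes, pricing_service):
    return sum(pricing_service.price_of(code) for code in item_codes)

# Driver: a small program that executes the component against a test case.
if __name__ == "__main__":
    total = order_total(["A1", "B2", "C3"], PricingServiceStub())
    print("expected 30.0, got", total)
    assert total == 30.0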
SYSTEM TESTING
System testing is defined as the process of testing an integrated system to verify that it meets specified requirements.
You will come across two very different types of system testing: functional system testing and non-functional system testing. In plain English, functional system testing focuses on testing the system based on what it is supposed to do. Non-functional system testing looks at those aspects that are important yet not directly related to what functions the system performs. For example, if the functional requirement is to issue an airline ticket, the non-functional requirement might be to issue it within 30 seconds.
A functional requirement is "a requirement that specifies a function that a system or system component must perform". Requirements-based testing means that the user requirements specification and the system requirements specification (as used for contracts) are used to derive test cases. Business process-based testing is based on expected user profiles (e.g. scenarios, use cases).
Non-functional requirements cover the following areas:
1. Load
2. Performance
3. Stress
4. Security
5. Usability
6. Storage
7. Volume
8. Installability
9. Documentation
10. Recovery
Non-functional requirements are just as important as functional requirements.
ACCEPTANCE TESTING
The definition of acceptance testing in BS7925 states that "acceptance testing is formal testing conducted to enable a user, customer, or other authorized entity to determine whether to accept a system or component". Acceptance testing may be the only form of testing conducted by and visible to a customer when applied to a software package. The most common usage of the term relates to user acceptance testing (UAT), but you should be aware that there are several other uses of acceptance testing, which we briefly describe here.
User acceptance testing - the final stage of validation. Customer should perform or be closely involved in this. Customers may choose to do any test they wish, normally based on their usual business processes. A common approach is to set up a model office where systems are tested in an environment as close to field use as is achievable.
Contract acceptance testing - a demonstration of the acceptance criteria, which would have been defined in the contract, being met.
Alpha & beta testing - In alpha and beta tests, when the software seems stable, people who represent your market use the product in the same way (s) that they would if they bought the finished version and give you their comments. Alpha tests are performed at the developer's site, while beta tests are performed at the user's sites.
High level test planning
You should be aware that many people use the term 'test plan' to describe a document detailing individual tests for a component of a system. We are introducing the concept of high level test plans to show that there are a lot more activities involved in effective testing than just writing test cases.
The IEEE standard for test documentation (IEEE/ANSI Std 829-1983, affectionately known as '829') defines a master validation test plan as follows:
The purpose of a master test plan is to prescribe the scope, approach, resources and schedule of testing activities. A master test plan should include the following:
1. Test Plan Identifier
2. References
3. Introduction
4. Test Items
5. Software Risk Issues
6. Features to be tested
7. Features not to be tested
8. Approach
9. Item Pass/Fail Criteria
10. Suspension Criteria
11. Resumption Requirements
12. Test Deliverables
13. Remaining Testing Tasks
14. Environmental Needs
15. Staffing and Training Needs
16. Responsibilities
17. Schedule
18. Planning Risks and Contingencies
19. Approvals
20. Glossary
Test Types
Functional Testing: Testing the application against business requirements. Functional testing is done using the functional specifications provided by the client or by using the design specifications like use cases provided by the design team.
Functional Testing covers:
Unit Testing
Smoke testing / Sanity testing
Integration Testing (Top Down, Bottom Up Testing)
Interface & Usability Testing
System Testing
Regression Testing
Pre User Acceptance Testing (Alpha & Beta)
User Acceptance Testing
White Box & Black Box Testing
Globalization & Localization Testing
Non-Functional Testing: Testing the application against the client's non-functional and performance requirements. Non-functional testing is done based on the requirements and test scenarios defined by the client.
Non-Functional Testing covers:
Load and Performance Testing
Ergonomics Testing
Stress & Volume Testing
Compatibility & Migration Testing
Data Conversion Testing
Security / Penetration Testing
Operational Readiness Testing
Installation Testing
Security Testing (Application Security, Network Security, System Security)
STRUCTURAL TESTING
Structural testing is a type of white box testing which tests the software architecture, either the functional flow or the internal code structure. It is testing based on an analysis of the internal workings and structure of a piece of software. Techniques involved include code coverage, decision and branch testing, etc.
RE-TESTING AND REGRESSION TESTING
We find and report a fault, which is duly fixed by the developer and included in the latest release which we now have available for testing. What should we do now?
Examples of failures where regression tests were not carried out include:
The day the phones stopped.
LAS failure on 4th November (perhaps)
Ariane 5 failure.
Whenever a fault is detected and fixed the software should be re-tested to ensure that the original fault has been successfully removed. You should also consider testing for similar and related faults. This is made easier if your tests are designed to be repeatable, whether they are manual or automated.
Regression testing attempts to verify that modifications have not caused unintended adverse side effects in the unchanged software (regression faults) and that the modified system still meets requirements. It is performed whenever the software, or its environment, is changed.
Most companies will build up a regression test suite or regression test pack over time and will add new tests, delete unwanted tests and maintain tests as the system evolves. When a major software modification is made, the entire regression pack is likely to be run (albeit with some modification). For minor planned changes or emergency fixes, the test manager must be selective during the test planning phase and identify how many of the regression tests should be attempted. In order to react quickly to an emergency fix, the test manager may create a subset of the regression test pack for immediate execution in such situations.
Regression tests are often good candidates for automation provided you have designed and developed automated scripts properly (see automation section).
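As an illustration of what such an automated, repeatable regression test might look like, here is a minimal Python sketch; the vat_inclusive function is hypothetical and the fixed expected values stand for the baseline results captured when the test was first written.

# A repeatable regression test: no manual steps, fixed inputs and fixed expected results,
# so it can be re-run unchanged after every software modification.
def vat_inclusive(net_amount, rate=0.20):
    # Hypothetical function under regression test.
    return round(net_amount * (1 + rate), 2)

def test_vat_regression():
    # Baseline values recorded when the original fault was fixed; rerun to detect regression faults.
    assert vat_inclusive(100.00) == 120.00
    assert vat_inclusive(19.99) == 23.99

if __name__ == "__main__":
    test_vat_regression()
    print("regression pack passed")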
In order to have an effective regression test suite, good configuration management of your test assets is desirable if not essential. You must have version control of your test documentation (test plans, scripts etc.) as well as your test data and baseline databases. An inventory of your test environment (hardware configuration, operating system version etc.) is also necessary.
MAINTENANCE TESTING
Maintenance testing is not specifically defined in BS7925 but it is all about testing changes to the software after it has gone into production.
There are several problems with maintenance testing. If you are testing old code the original specifications and design documentation may be poor or in some cases non-existent. When defining the scope of the maintenance testing it has to be judged in relation to the changed code; how much has changed, what areas does it really affect etc. This kind of impact analysis is difficult and so there is a higher risk when making changes - it is difficult to decide how much regression testing to do.
3. Static Technique
Overview
Static testing techniques are used to find errors before the software is actually executed, and contrast therefore with dynamic testing techniques that are applied to a working system. The earlier we catch an error, the cheaper it usually is to correct. This module looks at a variety of different static testing techniques. Some are applied to documentation (e.g. walkthroughs, reviews and Inspections) and some are used to analyze the physical code (e.g. compilers, data flow analyzers). This is a huge subject and we can only hope to give an introduction in this module. You will be expected to appreciate the difference between the various review techniques and you will need to be aware of how and when static analysis tools are used.
Static testing techniques include code reviews, inspections and walkthroughs.
From the black box testing point of view, static testing involves reviewing requirements and specifications. This is done with an eye toward completeness or appropriateness for the task at hand.
From the white box testing point of view, static testing involves reviewing code, functions and structure.
Static testing can be automated. A static testing test suite consists of programs to be analyzed by an interpreter or a compiler that asserts the program's syntactic validity.
The people involved in static testing are application developers, testers and business analysts.
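As a minimal sketch of this kind of automation, the Python fragment below uses the built-in compile() function to assert the syntactic validity of a piece of source code without executing it; the sample inputs are invented for the example.

# Assert the syntactic validity of a piece of source code without executing it.
def is_syntactically_valid(source, name="<under review>"):
    try:
        compile(source, name, "exec")  # parses and compiles, but never runs the code
        return True
    except SyntaxError as error:
        print("syntax fault:", error)
        return False

if __name__ == "__main__":
    print(is_syntactically_valid("total = price * 1.2\n"))      # True
    print(is_syntactically_valid("if total > :\n    pass\n"))   # False, reports the fault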
Review Process
What is a review?
A review is a fundamental technique that must be used throughout the development lifecycle. Basically a review is any of a variety of activities involving evaluation of technical matter by a group of people working together. The objective of any review is to obtain reliable information, usually as to status and/or work quality.
Some history of reviews
During any project, management requires a means of assessing and measuring progress. The so-called progress review evolved as a means of achieving this. However, the results of those early reviews proved to be bitter experiences for many project managers. Just how long can a project remain at 90% complete? They found that they could not measure 'true' progress until they had a means of gauging the quality of the work performed. Thus the concept of the technical review emerged to examine the quality of the work and provide input to the progress reviews.
What can be reviewed?
There are many different types of reviews throughout the development life cycle. Virtually any work produced during development can be (and is) reviewed. This includes requirements documents, designs, database specifications, data models, code, test plans, test scripts, test documentation and so on.
What has this got to do with testing?
The old fashioned view that reviews and testing are totally different things stems from the fact that testing used to be tacked onto the end of the development lifecycle. However as we all now view testing as a continuous activity that must be started as early as possible you can begin to appreciate the benefits of reviews. Reviews are the only testing technique that is available to us in the early stages of testing. At early stages in the development lifecycle we obviously cannot use dynamic testing techniques since the software is simply not ready for testing.
Reviews share similarities to test activities in that they must be planned (what are we testing), what are the criteria for success (expected results) and who will do the work (responsibilities). The next section examines the different types of review techniques in more detail.
TYPES OF REVIEW
Walk-through, informal reviews, technical reviews and Inspections are fundamental techniques that must be used throughout the development process. All have their strengths and weaknesses and their place in the project development cycle. All four techniques have some ground rules in common as follows:
A structured approach must be followed for the review process.
Be sure to know what is being reviewed - each component must have a unique identifier.
Changes must be configuration controlled.
Reviewers must prepare.
Reviewers must concentrate on their own specialization.
Be sure to review the product, not the person.
There must be:
Total group responsibility.
Correct size of review group.
Correct allocation of time.
Correct application of standards.
Checklists must be used.
Reports must be produced.
Quality must be specified in terms of:
Adherence to standards.
Reliability required.
Robustness required.
Modularity.
Accuracy.
Ability to handle errors.
TYPE 1, 2 AND 3 REVIEW PROCESSES
The review process is both the most effective and the most universal test method, and management needs to make sure that the review process is working as effectively as possible. A useful model for the manager is the 1, 2, 3 model.
The 1, 2, 3 model is derived from the work of the first working party of the British Computer Society Specialist Interest Group in Software Testing and the book that came from this work, Testing in Software Development.
Type 1 testing is the process of making sure that the product (document, program, screen design, clerical procedure or Functional Specification) is built to standards and contains those features that we would expect from the name of that product. It is a test to make sure that the product conforms to standards, and is internally consistent, accurate and unambiguous.
Type 2 testing is the process of testing to see if the product conforms to the requirements as specified by the output of the preceding project stage and, ultimately, the Specification of Requirements for the whole project. Type 2 testing is backward looking and checks that the product is consistent with the preceding documentation (including information on change).
Type 3 testing is forward looking and is about the specification of the certification process and the tests that are to be done on the delivered product. It asks the question: can we build the deliverables (test material, training material, next stage analysis documentation)?
Make reviews incremental
Whilst use of the 1, 2, 3 model will improve review technique, the reviewing task can be made easier by having incremental reviews throughout product construction.
This will enable reviewers to have a more in-depth understanding of the product that they are reviewing and to start constructing type 3 material.
General review procedure for documents
The test team will need a general procedure for reviewing documents, as this will probably form a large part of the team's work.
1. Establishing standards and format for document.
2. Check contents list.
3. Check appendix list.
4. Follow up references outside documents.
5. Cross check references inside document.
6. Check for correct and required outputs.
7. Review each screen layout against appropriate standard, data dictionary, processing rules and files/data base access.
8. Review each report layout against appropriate standard, data dictionary, processing rules and files/data base access.
9. Review comments and reports on work and reviews done prior to this review.
10. Documents reviewed will range from whole reports, such as the Specification of Requirements, to pages from the output that the system is to produce. All documents will need careful scrutiny.
Be sensitive to the voice of a concerned but not necessarily assertive tester in the team; this person may well have observed a fault that all the others have missed. It should not be the case that the person with the loudest voice or strongest personality is allowed to dominate.
INSPECTIONS
The Inspection technique was developed further by Caroline L. Jones and Robert Mays at IBM [Jones, 1985], who created a number of useful enhancements:
The kickoff meeting, for training, goal setting, and setting a strategy for the current inspection cycle;
The causal analysis meeting;
The action database;
The action team.
Reviews and walk-through
Reviews and walkthroughs are typically peer group discussion activities - without much focus on fault identification and correction, as we have seen. They are usually without the statistical quality improvement, which is an essential part of Inspection. Walkthroughs are generally a training process, and focus on learning about a single document. Reviews focus more on consensus and buy-in to a particular document.
It may be wasteful to do walkthroughs or consensus reviews unless a document has successfully exited from Inspection. Otherwise you may be wasting people's time by giving them documents of unknown quality, which probably contain far too many opportunities for misunderstanding, learning the wrong thing and agreement about the wrong things.
Inspection is not an alternative to walkthroughs for training, or to reviews for consensus. In some cases it is a pre-requisite. The different processes have different purposes. You cannot expect to remove faults effectively with walkthroughs, reviews or distribution of documents for comment. However, in other cases it may be wasteful to Inspect documents which have not yet 'settled down' technically. Spending time searching for and removing faults in large chunks, which are later discarded, is not a good idea. In this case it may be better to aim for approximate consensus documents. The educational walkthrough could occur either before or after Inspection.
Comparison of Inspection and testing
Inspection and testing both aim at evaluating and improving the quality of the software engineering product before it reaches the customers. The purpose of both is to find and then fix errors, faults and other potential problems.
Inspection and testing can be applied early in software development, although Inspection can be applied earlier than test. Both Inspection and test, applied early, can identify faults, which can then be fixed when it is still much cheaper to do so.
Inspection and testing can be done well or badly. If they are done badly, they will not be effective at finding faults, and this causes problems at later stages, test execution, and operational use.
We need to learn from both Inspection and test experiences. Inspection and testing should both ideally (but all too rarely in practice) produce product-fault metrics and process-improvement metrics, which can be used to evaluate the software development process. Data should be kept on faults found in Inspection, faults found in testing, and faults that escaped both Inspection and test and were only discovered in the field. This data would reflect frequency, document location, severity, cost of finding, and cost of fixing.
There is a trade-off between fixing and preventing. The metrics should be used to fine-tune the balance between the investment in the fault detection and fault prevention techniques used. The cost of Inspection, test design, and test running should be compared with the cost of fixing the faults at the time they were found, in order to arrive at the most cost-effective software development process.
Differences between Inspection and testing
Inspection can be used long before executable code is available to run tests. Inspection can be applied much earlier than dynamic testing, but can also be applied earlier than test design activities. Tests can only be defined when a requirements or design specification has been written, since that specification is the source for knowing the expected result of a test execution.
The one key thing that testing does and Inspection does not, is to evaluate the software while it is actually performing its function in its intended (or simulated) environment. Inspection can only examine static documents and models; testing can evaluate the product working.
Inspection, particularly the process improvement aspect, is concerned with preventing software engineers from inserting any form of fault into what they write. The information gained from faults found in running tests could be used in the same way, but this is rare in practice.
Benefits of Inspection
Opinion is divided over whether Inspection is a worthwhile element of any product 'development' process. Critics argue that it is costly; it demands too much 'upfront' time and is unnecessarily bureaucratic. Supporters claim that the eventual savings and benefits outweigh the costs and the short-term investment is crucial for long-term savings.
Development productivity is improved.
Fagan, in his original article, reported a 23% increase in 'coding productivity alone' using Inspection [Fagan, 1976, IBM Systems Journal, p 187]. He later reported further gains with the introduction of moderator training, design and code change control, and test fault tracking.
Development timescale is reduced.
Considering only the development timescales, typical net savings for project development are 35% to 50%.
Cost and time taken for testing is reduced.
Inspection reduces the number of faults still in place when testing starts because they have been removed at an earlier stage. Testing therefore runs more smoothly, there is less debugging and rework and the testing phase is shorter. At most sites Inspection eliminates 50% to 90% of the faults in the development process before test execution starts.
Lifetime costs are reduced and software reliability increased.
Inspection can be expected to reduce total system maintenance costs due to failure reduction and improvement in document intelligibility, therefore providing a more competitive product.
Management benefits.
Through Inspection, managers can expect access to relevant facts and figures about their software engineering environment, meaning they will be able to identify problems earlier and understand the payoff for dealing with these problems.
Deadline benefits.
Although it cannot guarantee that an unreasonable deadline will be met, through quality and cost metrics Inspection can give early warning of impending problems, helping to avoid the temptation of inadequate correction nearer the deadline.
Costs of Inspection
The cost of running an Inspection is approximately 10% - 15% of the development budget. This percentage is about the same as other walkthrough and review methods. However, Inspection finds far more faults for the time spent and the upstream costs can be justified by the benefits of early detection and the lower maintenance costs that result.
As mentioned earlier, the costs of Inspection include additional 'up front' time in the development process and increased time spent by authors writing documents they know will be Inspected. Implementing and running Inspections will involve long-term costs in new areas. An organization will find that time and money go on:
Inspection leader training.
Management training.
Management of the Inspection leaders.
Metric analysis.
Experimentation with new techniques to try to improve Inspection results.
Planning, checking and meeting activity: the entire Inspection process itself.
Quality improvement: the work of the process improvement teams.
Inspection steps
The Inspection process is initiated with a request for Inspection by the author or owner of a document.
The Inspection leader checks the document against entry criteria, reducing the probability of wasting resources on a deliverable destined to fail.
The Inspection objectives and tactics are planned. Practical details are decided upon and the leader develops a master plan for the team.
A kickoff meeting is held to ensure that the checkers are aware of their individual roles and the ultimate targets of the Inspection process.
Checkers work independently on the document using source documents, rules, procedures and checklists. Potential faults are identified and recorded.
A logging meeting is convened during which potential faults and issues requiring explanation, identified by the individual checkers, are logged. The checkers now work as a team aiming to discover further faults. Finally, suggestions for methods of improving the process itself are logged.
An editor (usually the author) is given the log of issues to resolve. Faults are now classified as such and a request for permission to make the correction and improvements to the document is made by the document's owner. Footnotes might be added to avoid misinterpretation. The editor may also make further process improvement suggestions.
The leader ensures that the editor has taken action to correct all known faults, although the leader need not check the actual corrections.
The exit process is performed by the Inspection leader who uses application generic and specific exit criteria.
The Inspection process is closed and the deliverable made available with an estimate of the remaining faults in a 'warning label'.
Static Analysis by tools:
Static Code Analysis tools:
Static code analysis is the analysis of software that is performed without actually executing the code. In most cases the analysis is performed on some version of the source code and in the other cases some form of the object code. The term is usually applied to the analysis performed by an automated tool, with human analysis being called program understanding or program comprehension.
The sophistication of the analysis performed by tools varies from those that only consider the behavior of individual statements and declarations, to those that include the complete source code of a program in their analysis. Uses of the information obtained from the analysis vary from highlighting possible coding errors (e.g., the lint tool) to formal methods that mathematically prove properties about a given program (e.g., its behavior matches that of its specification).
Finding all possible run-time errors, or more generally any kind of violation of a specification on the final result of a program, is undecidable: there is no mechanical method that can always answer truthfully whether a given program may or may not exhibit runtime errors.
Some of the implementation techniques of formal static analysis include:
Model checking considers systems that have finite state or may be reduced to finite state by abstraction
Data-flow analysis is a lattice-based technique for gathering information about the possible set of values
Abstract interpretation models the effect that every statement has on the state of an abstract machine (i.e., it 'executes' the software based on the mathematical properties of each statement and declaration).
Typical defects discovered by static analysis tools include:
o referencing a variable with an undefined value;
o inconsistent interface between modules and components;
o variables that are never used;
o unreachable (dead) code;
o programming standards violations;
o security vulnerabilities;
o syntax violations of code and software models.
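To make the list above concrete, here is a deliberately simple sketch of a static check written with Python's ast module; it flags names that are assigned but never read (the 'variables that are never used' defect). Real static analysis tools are far more sophisticated than this.

import ast

def unused_assignments(source):
    # Collect names that are assigned and names that are read, then report the difference.
    tree = ast.parse(source)
    assigned, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)
            else:
                used.add(node.id)
    return assigned - used

if __name__ == "__main__":
    sample = "total = 5\nunused = 99\nprint(total)\n"
    print(unused_assignments(sample))  # reports {'unused'}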
Various tools are available based on the programming language.
Some examples are:
CodeWizard is a unique coding standards enforcement tool that uses patented Source Code Analysis technology (patent #5,860,011) to help developers prevent errors and standardize C++ code automatically. CodeWizard spontaneously enforces C++ coding standards, saving hours of labor-intensive analysis.
Jtest is an automatic static analysis and unit testing tool for Java development. With the click of a button, Jtest automatically enforces over 300 industry-respected coding standards, allowing organizations to prevent the most common and damaging errors. Jtest reduces time spent chasing and fixing bugs by automatically generating unit test cases that test code construction and functionality, and performs regression testing to ensure that no new errors have been introduced into code.
METRIC is the software metrics system for the fully integrated TestWorks/Advisor suite of static source code analyzers and measurement tools. METRIC works as a stand-alone product or as part of the TestWorks/Advisor tool suite to quantitatively determine source code quality. After processing a source code file, METRIC automatically computes various software measurements. These metrics include the Halstead Software Science metrics, which measure data complexity in routines; the Cyclomatic Complexity metrics, which measure logic complexity in routines; and size metrics, such as number of lines, comments and executable statements.
4. Categories of test design techniques
Broadly speaking there are two main categories, static and dynamic. However, dynamic techniques are subdivided into two more categories behavioural (black box) and structural (white box). Behavioural techniques can be further subdivided into functional and non-functional techniques.
Specification Based or Black Box Testing:
Specification-based testing is often referred to as 'black box' testing and the common parlance is that we are 'doing black box testing'. These techniques help design test cases based on the functionality of the component or system under test, without necessarily having to understand the underlying detail of the software design. The functionality of the system is considered in order to determine test inputs and expected results.
BLACK BOX TECHNIQUES
The following list of black box techniques is from BS 7925-2. On this course we will describe and give an example of only those ones highlighted in bold:
Equivalence Partitioning
Boundary Value Analysis
State Transition Testing
Decision Table Testing
Use Case Testing
Equivalence partitioning (EP) is a test case design technique that is based on the premise that the inputs and outputs of a component can be partitioned into classes that, according to the component's specification, will be treated similarly by the component. Thus the result of testing a single value from an equivalence partition is considered representative of the complete partition.
Not everyone will necessarily pick the same equivalence classes; there is some subjectivity involved. But the basic assumption you are making is that any one value from the equivalence class is as good as any other when we come to design the test.
This technique can dramatically reduce the number of tests that you may have for a particular software component.
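For example, suppose a hypothetical input field accepts ages from 18 to 65. Equivalence partitioning yields one valid partition (18 to 65) and two invalid partitions (below 18 and above 65), so one representative value per partition is enough; a sketch in Python:

# Representative values, one per equivalence partition, for an age field accepting 18 to 65.
partitions = {
    "invalid_low": 10,    # any value below 18 is treated the same way
    "valid": 40,          # any value from 18 to 65 is treated the same way
    "invalid_high": 80,   # any value above 65 is treated the same way
}

def accepts_age(age):
    # Hypothetical component behaviour derived from the specification.
    return 18 <= age <= 65

for name, value in partitions.items():
    print(name, value, "accepted" if accepts_age(value) else "rejected")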
Boundary Value Analysis is based on the following premise. Firstly, the inputs and outputs of a component can be partitioned into classes that, according to the component's specification, will be treated similarly by the component and, secondly, developers are prone to making errors in their treatment of the boundaries of these classes. Thus test cases are generated to exercise these boundaries.
Partitioning of test data ranges is explained in the equivalence partitioning test case design technique. It is important to consider both valid and invalid partitions when designing test cases.
The boundaries are the values on and around the beginning and end of a partition. If possible test cases should be created to generate inputs or outputs that will fall on and to either side of each boundary.
Where a boundary value falls within the invalid partition the test case is designed to ensure the software component handles the value in a controlled manner. Boundary value analysis can be used throughout the testing cycle and is equally applicable at all testing phases.
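Continuing the hypothetical age-field example (valid range 18 to 65), boundary value analysis adds test values on and either side of each boundary; a sketch:

# Boundary values on and either side of each partition boundary for the 18 to 65 age range.
boundary_values = [17, 18, 19, 64, 65, 66]

def accepts_age(age):
    return 18 <= age <= 65

for value in boundary_values:
    print(value, "accepted" if accepts_age(value) else "rejected")
# 17 and 66 fall in the invalid partitions and should be rejected in a controlled manner.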
After determining the necessary test cases with equivalence partitioning and subsequent boundary value analysis, it is necessary to define the combinations of the test cases when there are multiple inputs to a software component.
Decision tables, like if-then-else and switch-case statements, associate conditions with actions to perform. But, unlike the control structures found in traditional programming languages, decision tables can associate many independent conditions with several actions in an elegant way
A decision table can be used when the combinations of conditions are given. In a decision table, conditions are known as causes, and the serially numbered combinations of conditions are known as business rules.
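As a small invented illustration, the sketch below encodes a decision table with two conditions (causes) and one action per business rule, and looks up the expected action for a given combination:

# Decision table: each rule maps a combination of conditions (causes) to an action.
# Rule:              R1     R2     R3
# in_credit:         True   False  False
# overdraft_agreed:  -      True   False
# action:            pay    pay    refuse
rules = [
    {"in_credit": True,  "overdraft_agreed": None,  "action": "pay"},
    {"in_credit": False, "overdraft_agreed": True,  "action": "pay"},
    {"in_credit": False, "overdraft_agreed": False, "action": "refuse"},
]

def expected_action(in_credit, overdraft_agreed):
    for rule in rules:
        if rule["in_credit"] == in_credit and rule["overdraft_agreed"] in (None, overdraft_agreed):
            return rule["action"]

print(expected_action(True, False))   # pay
print(expected_action(False, False))  # refuse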
State transition testing is used in a state-machine or event-driven type of development scenario, where the program logic has multiple state transitions based on the various events that occur.
A typical example would be an order fulfilment system where the order states are Received, In progress and Completed, with various other states depending on the actions taken and the result of each action.
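A hedged sketch of state transition testing for such an order system: the allowed transitions are listed explicitly, one test follows a valid event sequence and another confirms that an invalid transition is rejected. The states and events are invented for the example.

# Allowed state transitions for a simple, hypothetical order fulfilment system.
transitions = {
    ("Received", "start_processing"): "In progress",
    ("In progress", "complete"): "Completed",
    ("Received", "cancel"): "Cancelled",
    ("In progress", "cancel"): "Cancelled",
}

def next_state(state, event):
    if (state, event) not in transitions:
        raise ValueError(f"invalid transition: {event} in state {state}")
    return transitions[(state, event)]

# Valid path: Received -> In progress -> Completed.
state = "Received"
for event in ["start_processing", "complete"]:
    state = next_state(state, event)
assert state == "Completed"

# Invalid transition should be rejected: cannot complete an order that is already completed.
try:
    next_state("Completed", "complete")
except ValueError as error:
    print(error)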
Use cases describe the system from the user's point of view.
Use cases describe the interaction between a primary actor (the initiator of the interaction) and the system itself, represented as a sequence of simple steps. Actors are something or someone which exists outside the system under study, and that take part in a sequence of activities in a dialogue with the system to achieve some goal. Actors may be end users, other systems, or hardware devices. Each use case is a complete series of events, described from the point of view of the actor.
Use cases may be described at the abstract level (business use case, sometimes called essential use case), or at the system level (system use case). The difference between these is the scope.
A business use case is described in technology-free terminology which treats the business process as a black box and describes the business process that is used by its business actors (people or systems external to the business) to achieve their goals.
A system use case is normally described at the system functionality level (for example, create voucher) and specifies the function or the service that the system provides for the user. A system use case will describe what the actor achieves interacting with the system.
Use case testing will cover the use case specification which includes pre conditions, basic course of action, alternative course of action and exception scenarios and expected outcome in each case and postconditions.
Structure Based or White Box testing:
Structural test techniques are sometimes called white box test techniques, hence the term 'white box testing'. Glass box testing is a less widely used term for structural test case design. Structural test techniques help design test cases based on the internal structure and design of the component or system under test. Look at the code, database spec, data model, etc., to determine test inputs and expected results.
White Box Techniques
The following list of white box techniques is from BS 7925-2. On this course we will describe and give an example of only those highlighted in bold.
Statement testing
Branch Decision Testing.
Data Flow Testing.
Branch Condition Testing.
Branch Condition Combination Testing
Modified Condition Decision Testing.
LCSAJ Testing.
Random Testing.
Statement testing is a structural technique based on the decomposition of a software component into constituent statements. Statements are identified as executable or non-executable. For each test case, you must specify the inputs to the component, the identification of the statement(s) to be executed and the expected outcome of the test case.
Example of a program with 4 statements:
X = INPUT;
Y = 4;
IF X > Y THEN
    Message = "Limit exceeded"
ELSE
    Message = "No problems reported"
END IF
Z = 0
Branch testing requires a model of the source code which identifies decisions and decision outcomes. A decision is an executable statement which may transfer control to another statement depending upon the logic of the decision statement. Typical decisions are found in loops and selections. Each possible transfer of control is a decision outcome. For each test case you must specify the inputs to the component, the identification of the decision outcomes to be executed by the test case and the expected outcome of the test case.
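To tie the two techniques together, here is the earlier four-statement example rewritten as a Python sketch with two test inputs that together execute every statement and both decision outcomes (X > Y true and false):

def check_limit(x):
    y = 4
    if x > y:
        message = "Limit exceeded"
    else:
        message = "No problems reported"
    z = 0
    return message, z

# Test case 1 exercises the true outcome of the decision, test case 2 the false outcome;
# together they achieve full statement and branch (decision) coverage of this component.
assert check_limit(10) == ("Limit exceeded", 0)
assert check_limit(2) == ("No problems reported", 0)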
Comparison of white box techniques from "Software Testing in the Real World"
Each column in this figure represents a distinct method of white-box testing, and each row (1-4) defines a different test characteristic. For a given method (column), "Y" in a given row means that the test characteristic is required for the method. "N" signifies no requirement. "Implicit" means the test characteristic is achieved implicitly by other requirements of the method. (© 1993, 1994 Software Development Technologies; reproduced with permission.)
LCSAJ Testing
LCSAJ stands for linear code sequence and jump.
Random Testing is usually carried out when there is not enough time to conduct formal test planning and execution and the software to be released is not mission critical, life dependent software. To conduct this type of testing usually requires experience of the software programs or functions and key areas where defects might occur.
Experience Based Testing:
Experience-based testing is where tests are derived from the tester's knowledge and intuition and their experience with similar applications and technologies. These techniques can be useful in identifying special tests not easily captured by formal techniques, especially when applied after more formal approaches. However, this technique may yield widely varying degrees of effectiveness, depending on the testers' experience.
A commonly used experience-based technique is error guessing. Generally, testers anticipate defects based on experience. A structured approach to the error guessing technique is to enumerate a list of possible errors and to design tests that attack these errors. This systematic approach is called fault attack. These defect and failure lists can be built based on experience, available defect and failure data, and from common knowledge about why software fails.
Exploratory testing is concurrent test design, test execution, test logging and learning, based on a test charter containing test objectives, and carried out within time-boxes. It is an approach that is most useful where there are few or inadequate specifications and severe time pressure, or in order to augment or complement other, more formal testing. It can serve as a check on the test process, to help ensure that the most serious defects are found.
Choosing Test Techniques:
Choosing between static and dynamic mainly depends on the stage of the test process. They both are required during the testing cycle of a project.
However, choosing between other dynamic techniques usually depends on the type of application, usability, architecture, etc.
For example:
If the application is a public-facing web site, then non-functional behavioural testing to check its behaviour under load would be a good technique.
If there is a user form validation that takes in certain numeric inputs, then applying boundary value analysis would be an appropriate technique.
5. Test Management
Overview:
This module covers the overall management of the test effort for a particular project and attempts to answer several key questions such as:
How many testers do we need?
How shall the testers be organized?
What's the reporting structure and who is in charge?
How will we estimate the amount of testing effort required for this project?
How do we keep versions of our test material in line with the development deliverables?
How do we ensure the test effort remains on track?
How do we know that we have finished testing?
What is the process for logging and tracking incidents?
Test estimation
The effort required to perform the activities specified in the high-level test plan must be calculated in advance. You must remember to allocate time for designing and writing the test scripts as well as estimating the test execution time. If you are going to use test automation, there will be a steep learning curve for new people and you must allow for this as well. If your tests are going to run on multiple test environments, add in extra time here too. Finally, never expect to complete all of the testing in one cycle, as there will be faults to fix and tests will have to be re-run. Decide on how many test cycles you will require and try to estimate the amount of re-work (fault fixing and re-testing time).
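As a rough worked example of such an estimate (every figure below is an assumption chosen only to show the arithmetic):

# Illustrative estimation only: all figures are assumptions, not standard values.
scripts = 120                  # test scripts to design and write
design_hours_per_script = 1.5
execution_hours_per_script = 0.5
cycles = 3                     # planned test cycles (initial run plus two re-test cycles)
rework_factor = 0.3            # extra effort for fault fixing and re-testing

design_effort = scripts * design_hours_per_script
execution_effort = scripts * execution_hours_per_script * cycles
total_hours = (design_effort + execution_effort) * (1 + rework_factor)
print(round(total_hours), "person-hours")  # 468 person-hours under these assumptions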
Test monitoring
Many test efforts fail despite wonderful plans. One of the reasons might be that the test team was so engrossed in detailed testing effort (working long hours, finding many faults) that they did not have time to monitor progress. This however is vitally important if the project is to remain on track. (e.g. use a weekly status report).
Configuration management (CM)
We all appreciate the need for testing and assuring quality in our development systems. But how many of us appreciate that Configuration Management is a precursor to these goals?
Configuration Management provides us with the balance to ensure that:
Systems are complete.
Systems are predictable in content.
Testing required is identified.
Change can be ring-fenced as complete.
An audit trail exists.
We've always practiced CM. Such activities as aggregating and releasing software to the production environment may, in the past, have been uncontrolled - but we did it.
Many famous organizations have found the need to develop a standard for CM that they have then since taken into the market place.
Configuration management encompasses much more than simply keeping version control of your software and test assets, although that is a very good start. Configuration management is crucial to successful testing, especially regression testing, because, in order to make tests repeatable, you must be able to recreate exactly the software and hardware environment that was used in the first instance.
Typical symptoms of poor CM might include:
Unable to match source and object code.
Unable to identify which version of a compiler generated the object code.
Unable to identify the source code changes made in a particular version of the software.
Simultaneous changes are made to the same source code by multiple developers (and changes are lost).
ISO (International Standards Organization) definition of CM:
Configuration management (CM) provides a method to identify, build, move, control and recover any baseline in any part of the life cycle, and ensures that it is secure and free from external corruption.
Configuration identification requires that all configuration items (CIs) and their versions in the test system are known.
Configuration control is the maintenance of CIs in a library and the maintenance of records on how CIs change over time.
Status accounting is the function of recording and tracking problem reports, change requests, etc.
Configuration auditing is the function to check on the contents of libraries, etc. for standards compliance, for instance.
CM can be very complicated in environments where mixed hardware and software platforms are being used, but sophisticated cross platform CM tools are increasingly available.
Simple CM life cycle process:
CM contains a large number of components. Each component has its own process and contributes to the overall process.
Let's take a look at a simple process that raises a change, manages it through the life cycle and finally executes an implementation to the production environment. Here we can see how the CM life cycle operates by equating actions with aspects of CM.
As a precursor activity we 'Evaluate Change'. All changes are evaluated before they enter the CM Life Cycle:
1. Raise Change Packet.
Uses Change Management functions to identify and register a change. Uses Change Control functions to determine that action is valid and authorized.
2. Add Configurable Item to Change Packet.
Select and assign configurable items to the Change Packet. Execute Impact Analysis to determine the items that also require some action as a result of the change and the order in which the actions take place.
3. Check-In.
Apply a version and place the CI back under CM control.
4. Create Executable.
Build an executable for every contained CI in the order indicated.
5. Sign-Off.
Uses Change Control to verify that the actioner signalling that testing is complete for the environment in which the change is contained is authorized to do so, and that the action is valid.
6. Check OK to Propagate.
Uses Change Control and co-requisite checks to verify that the request to move a Change Packet through the life cycle is valid:
a) All precursor tasks have completed successfully.
b) The next life cycle environment is fit to receive.
c) Subsequent change in the current environment has not invalidated the Change Packet.
7. Propagation
Effect the population of the next environment by releasing the Change Packet and then distributing it over a wider area.
Note: you might notice that the cycle is composed of a number of singular functions and series executed as a cycle. Of particular note is that 'Create Executable' is a singular function. This is because we should only ever build once if at all possible. This primarily saves time and computer resources. However, re-building an element in a new environment may negate testing carried out in the preceding one and can lead to a lengthy problem investigation phase.
What does CM control?
CM should control every element that is a part of a system or application.
Nominally, CM:
1. Configurable Items:
Maintains the registration and position of all of our CIs. These may be grouped into logically complete change packets as a part of a development or maintenance exercise.
2. Defines Development Life Cycle:
It is composed of a series of transition points, each having its own entry/exit criteria and each mapping to a specific test execution stage.
3. Movement:
Controls how a Change Packet moves and progresses through the Life Cycle.
4. Environment:
Testing takes place in a physical environment configured specifically for the stage of testing that the life cycle transition point reflects.
How is it controlled?
CM is like every other project in that it requires a plan. It is particularly important that CM has a plan of what it is to provide as it forms a framework for life cycle management in which to work consistently and securely.
The CM plan covers what needs to be done, not by when, and defines three major areas:
The Processes will define or include:
Raising a change
Adding elements
Booking elements in and out
Exit/Entry criteria
Life cycle definition
Life cycle progression
Impact analysis,
Ring-fencing change, release aggregation
Change controls.
Naming standards
Generic processes for build and other activities
The Roles & responsibilities covering who and what can be done:
Configuration Manager & Librarian, Project Manager, Operations Personnel, Users, Developers and others as necessary
Records, providing the necessary audit trail, will include:
What is managed, the status of the life cycle position and the change status.
Who did what, where, when and under what authority; also the success factor for activities.
Only once the CM plan and the processes that support it are defined can we consider automation.
What does CM look like?
CM has several hubs and functions that will make or break it. Hubs of the system are defined as areas where information and source code are stored. Typically the major hubs are the central inventory and the central repository. Surrounding those are four major tool sets that allow us to work on the data:
Version Management
Allows us access to any version or revision of a stored element.
Configuration Control
Allows us to group elements into manageable sets.
Change Control & Management
This is the global name given to the processes that govern change through the application development life cycle and the stages it passes through, from an idea through to implementation. It may include:
Change Control Panel or Board to assess and evaluate change;
Controls: Governing who can do what, when and under what circumstances.
Management: Carrying out an action or movement through the life cycle once the controls have been satisfied
Build & Release
Controls how our elements are built and the manner in which our change is propagated through the life cycle.
This view is about as close to a generic, global view of CM as you can get. It won't match all tools 100% as it covers all aspects of CM - and very few of the tools (although they might claim to) can do this.
Incident Management
An incident is any significant, unplanned event that occurs during testing that requires subsequent investigation and/or correction. Incidents are raised when expected and actual test results differ.
What is an incident?
You may now be thinking that incidents are simply another name for faults, but this is not the case. We cannot determine at the time an incident has occurred whether there is really a fault in the software, whether the environment was perhaps set up incorrectly or whether in fact the test script was incorrect. Therefore we log the incident and move on to the next test activity.
Incidents and the test process
An incident occurs whenever an error, query or problem arises during the test process. There must be procedures in place to ensure accurate capture of all incidents. Incident recording begins as soon as testing is introduced into the system's development life cycle. The first incidents raised are therefore against documentation; as the project proceeds, incidents will be raised against database designs, and eventually the program code of the system under test.
Incident logging
Incidents should be logged when someone other than the author of the product under test performs the testing. When describing an incident, diplomacy is required to avoid unnecessary conflict between the different teams involved in the testing process (e.g. developers and testers). Typically, the information logged on an incident will include the items listed below; a sketch of such a record as a simple data structure follows the list.
. Name of tester(s), date/time of incident, software under test ID
. Expected and actual results
. Any error messages
. Test environment
. Summary description
. Detailed description including anything deemed relevant to reproducing/fixing potential fault (and continuing with work)
. Scope
. Test case reference
. Severity (e.g. showstopper, unacceptable, survivable, trivial)
. Priority (e.g. fix immediately, fix by release date, fix in next release)
. Classification. Status (e.g. opened, fixed, inspected, retested, closed)
. Resolution code (what was done to fix fault)
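A minimal sketch of such an incident record as a data structure in Python; the field names follow the list above and the enumerated default values are examples only.

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Incident:
    tester: str
    software_under_test_id: str
    summary: str
    expected_result: str
    actual_result: str
    severity: str = "survivable"      # e.g. showstopper, unacceptable, survivable, trivial
    priority: str = "fix by release"  # e.g. fix immediately, fix by release date, fix in next release
    status: str = "opened"            # e.g. opened, fixed, retested, closed
    raised_at: datetime = field(default_factory=datetime.now)

incident = Incident("A. Tester", "PAYROLL v2.1", "Net pay truncated",
                    expected_result="1234.56", actual_result="1234.5")
print(incident.status, incident.severity)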
Incidents must be graded to identify the severity of incidents and improve the quality of reporting information. Many companies use a simple approach such as a numeric scale of 1 to 4 or high, medium and low. Beizer has devised a list and weighting for faults as follows:
1 Mild Poor alignment, spelling etc.
2 Moderate Misleading information, redundant information
3 Annoying Bills for 0.00, truncation of name fields etc.
4 Disturbing Legitimate actions refused, sometimes it works, sometimes not
5 Serious Loss of important material, system loses track of data, records etc.
6 Very serious The mis-posting of transactions
7 Extreme Frequent and widespread mis-postings
8 Intolerable Long term errors from which it is difficult or impossible to recover
9 Catastrophic Total system failure or out of control actions
10 Infectious Other systems are being brought down
In practice, in the commercial world at least, this list is over the top and many companies use a simple approach such as a numeric scale of 1 to 4 as outlined below:
1 Showstopper Very serious fault; includes GPF, assertion failure or complete system hang
2 Unacceptable Serious fault where the software does not meet business requirements and there is no workaround
3 Survivable Fault that has an easy workaround - may involve partial manual operation
4 Cosmetic Covers trivial faults like screen layouts, colors, alignments, etc
Note that incident priority is not the same as severity. Priority relates to how soon the fault will be fixed and is often classified as follows:
1. Fix immediately.
2. Fix before the software is released.
3. Fix in time for the following release.
4. No plan to fix.
It is quite possible to have a severity 1 priority 4 incident and vice versa although the majority of severity 1 and 2 faults are likely to be assigned a priority of 1 or 2 using the above scheme.
Tracking and analysis
Incidents should be tracked from inception through various stages to eventual close-out and resolution. There should be a central repository holding the details of all incidents.
For management information purposes it is important to record the history of each incident. There must be incident history logs raised at each stage whilst the incident is tracked through to resolution, for traceability and audit purposes. This will also allow the formal documentation of the incidents (and the departments who own them) at a particular point in time.
Typically, entry and exit criteria take the form of the number of incidents outstanding by severity. For this reason it is imperative to have a corporate standard for the severity levels of incidents.
Incidents are often analyzed to monitor the test process and to aid in test process improvement. It is often useful to look at a sample of incidents and try to determine the root cause.
6. Tool Support for Testing
Overview
When people discuss testing tools they invariably think of automated testing tools and in particular capture/replay tools. However, the market changes all the time and this module is intended to give you a flavor of the many different types of testing tool available. There is also a discussion about how to select and implement a testing tool for your organization. Remember the golden rule, if you automate a mess, you'll get automated chaos; choose tools wisely!
Types of CAST tools
There are numerous types of computer-aided software testing (CAST) tool and these are briefly described below.
Requirements testing tools provide automated support for the verification and validation of requirements models, such as consistency checking and animation.
Static analysis tools provide information about the quality of the software by examining the code, rather than by running test cases through the code. Static analysis tools usually give objective measurements of various characteristics of the software, such as cyclomatic complexity measures and other quality metrics.
Test design tools generate test cases from a specification that must normally be held in a CASE tool repository or from formally specified requirements held in the tools itself. Some tools generate test cases from an analysis of the code.
Test data preparation tools enable data to be selected from existing databases or created, generated, manipulated and edited for use in tests. The most sophisticated tools can deal with a range of file and database formats.
Character-based test running tools provide test capture and replay facilities for dumb-terminal based applications. The tools simulate user-entered terminal keystrokes and capture screen responses for later comparison. Test procedures are normally captured in a programmable script language; data, test cases and expected results may be held in separate test repositories. These tools are most often used to automate regression testing.
GUI test running tools provide test capture and replay facilities for WIMP interface based applications. The tools simulate mouse movement, button clicks and keyboard inputs and can recognize GUI objects such as windows, fields, buttons and other controls. Object states and bitmap images can be captured for later comparison. Test procedures are normally captured in a programmable script language; data, test cases and expected results may be held in separate test repositories. These tools are most often used to automate regression testing.
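To indicate what such a captured script typically looks like, here is a minimal sketch using Selenium-style Python calls; it assumes Selenium and a matching browser driver are installed, and the URL, element IDs and expected text are invented for illustration.

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get("https://example.test/login")          # hypothetical application URL
    driver.find_element(By.ID, "username").send_keys("tester")
    driver.find_element(By.ID, "password").send_keys("secret")
    driver.find_element(By.ID, "login-button").click()
    assert "Welcome" in driver.page_source            # expected result check
    driver.quit()

In practice the script, the test data and the expected results would usually be held separately so the same script can be replayed as a regression test with different data.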
Test harnesses and drivers are used to execute software under test, which may not have a user interface, or to run groups of existing automated test scripts, which can be controlled by the tester. Some commercially available tools exist, but custom-written programs also fall into this category. Simulators are used to support tests where code or other systems are either unavailable or impracticable to use (e.g. testing software to cope with nuclear meltdowns).
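A simple custom-written harness might look like the sketch below: a driver built on Python's unittest framework exercising a function that has no user interface. The function and its test values are invented for illustration.

    import unittest

    # Software under test: a routine with no user interface (illustrative).
    def apply_discount(price: float, percent: float) -> float:
        if not 0 <= percent <= 100:
            raise ValueError("percent out of range")
        return round(price * (1 - percent / 100), 2)

    # The harness acts as the test driver, calling the code directly.
    class DiscountTests(unittest.TestCase):
        def test_typical_discount(self):
            self.assertEqual(apply_discount(200.0, 10), 180.0)

        def test_rejects_invalid_percentage(self):
            with self.assertRaises(ValueError):
                apply_discount(200.0, 150)

    if __name__ == "__main__":
        unittest.main()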
Performance test tools have two main facilities: load generation and test transaction measurement. Load generation is done either by driving the application through its user interface or by test drivers, which simulate the load the application generates on the architecture. Records of the number of transactions executed are logged. When the application is driven through its user interface, response time measurements are taken for selected transactions and these are logged. Performance testing tools normally provide reports based on the test logs, and graphs of load against response times.
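The sketch below shows both facilities in miniature: it generates load with a pool of concurrent virtual users and logs a response time per transaction. The transaction itself is a placeholder; in a real test it would be a request to the system under test.

    import time
    from concurrent.futures import ThreadPoolExecutor

    # Stand-in for one transaction against the system under test (hypothetical).
    def execute_transaction(transaction_id: int) -> float:
        started = time.perf_counter()
        time.sleep(0.01)   # replace with a real call to the system under test
        return time.perf_counter() - started

    # Generate load with 20 concurrent virtual users and collect response times.
    with ThreadPoolExecutor(max_workers=20) as pool:
        response_times = list(pool.map(execute_transaction, range(200)))

    print(f"transactions: {len(response_times)}, "
          f"mean response: {sum(response_times) / len(response_times):.3f}s, "
          f"worst: {max(response_times):.3f}s")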
Dynamic analysis tools provide run-time information on the state of the executing software. These tools are most commonly used to monitor the allocation, use and de-allocation of memory, and to flag memory leaks, unassigned pointers, pointer arithmetic and other errors that are difficult to find 'statically'.
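As a small illustration of run-time memory monitoring, the sketch below uses Python's built-in tracemalloc module to report which lines allocated the most memory while a deliberately leaky loop runs; the leak itself is contrived for the example.

    import tracemalloc

    tracemalloc.start()

    # Code under observation: keeps growing a list to mimic a leak (illustrative).
    leaky_cache = []
    for i in range(100_000):
        leaky_cache.append("record-%d" % i)

    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.statistics("lineno")[:3]:
        print(stat)   # shows the source lines responsible for the largest allocations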
Debugging tools are mainly used by programmers to reproduce bugs and investigate the state of programs. Debuggers enable programmers to execute programs line by line, to halt the program at any program statement and to set and examine program variables.
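In Python, for example, the standard pdb debugger can be entered at a chosen statement; the function below is invented purely to show the mechanism.

    def average(values):
        total = sum(values)
        breakpoint()          # halts here and drops into the pdb debugger (Python 3.7+)
        return total / len(values)

    average([2, 4, 6])
    # Inside pdb you can step line by line (n), examine variables (p total)
    # and continue execution (c).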
Comparison tools are used to detect differences between actual results and expected results. Standalone comparison tools normally deal with a range of file or database formats. Test running tools usually have built-in comparators that deal with character screens, GUI objects or bitmap images. These tools often have filtering or masking capabilities, whereby they can 'ignore' rows or columns of data or areas on screens.
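The sketch below shows the masking idea: volatile timestamps are masked out before comparison so that only genuine differences are reported. The log lines are invented for illustration.

    import difflib
    import re

    # Mask volatile fields (here, HH:MM:SS timestamps) so the comparison ignores them.
    def mask(line: str) -> str:
        return re.sub(r"\d{2}:\d{2}:\d{2}", "<TIME>", line)

    expected = ["12:00:01 balance=100.00", "12:00:02 status=OK"]
    actual   = ["17:45:09 balance=100.00", "17:45:10 status=FAILED"]

    diff = difflib.unified_diff([mask(line) for line in expected],
                                [mask(line) for line in actual],
                                lineterm="")
    print("\n".join(diff))   # only the 'status' line is reported as a difference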
Test management tools may have several capabilities. Testware management is concerned with the creation, management and control of test documentation, e.g. test plans, specifications and results. Some tools support the project management aspects of testing, for example the scheduling of tests, the logging of results and the management of incidents raised during testing. Incident management tools may also have workflow-oriented facilities to track and control the allocation, correction and retesting of incidents. Most test management tools provide extensive reporting and analysis facilities.
Coverage measurement (or analysis) tools provide objective measures of structural test coverage when tests are executed. Programs to be tested are instrumented before compilation. The instrumentation code dynamically captures coverage data in a log file without affecting the functionality of the program under test. After execution, the log file is analysed and coverage statistics are generated. Most tools provide statistics on the most common coverage measures, such as statement or branch coverage.
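For interpreted languages the instrumentation is typically done at run time. The sketch below assumes the coverage.py package is installed; the function under test and the single test value are invented, and the report shows the untested branch as missed.

    import coverage

    cov = coverage.Coverage()
    cov.start()

    def classify(value):
        if value < 0:
            return "negative"
        return "non-negative"

    classify(5)        # only the 'non-negative' branch is exercised

    cov.stop()
    cov.report()       # statement coverage report; the 'negative' branch shows as missed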
Tool selection and implementation
There are many test activities that can be automated, and test execution tools are not necessarily the first or only choice. Identify the test activities where tool support could be of benefit and prioritize the areas of most importance.
Fit with your test process may be more important than choosing the tool with the most features when deciding whether you need a tool, and which one to choose. The benefits of tools usually depend on a systematic and disciplined test process. If testing is chaotic, tools may not be useful and may hinder testing. You must have a good process now, or recognize that your process must improve in parallel with tool implementation. The ease with which CAST tools can be implemented might be called 'CAST readiness'.
Tools may have interesting features but may not necessarily be available on your platforms, e.g. 'works on 15 flavors of Unix, but not yours...'. Some tools, e.g. performance testing tools, require their own hardware, so the cost of procuring this hardware should be a consideration in your cost-benefit analysis. If you already have tools, you may need to consider the level and usefulness of integration with other tools, e.g. you may want the test execution tool to integrate with your existing test management tool (or vice versa). Some vendors offer integrated toolkits, e.g. test execution, test management and performance-testing bundles. Integration between some tools may bring major benefits; in other cases the level of integration is cosmetic only.
Once automation requirements are agreed, the selection process has four stages:
1. Create a candidate tool shortlist.
2. Arrange demonstrations.
3. Evaluate the selected tool(s).
4. Review and select the tool.
Before making a commitment to implementing the tool across all projects, a pilot project is usually undertaken to ensure that the benefits of using the tool can actually be achieved. The objectives of the pilot are to gain experience in the use of the tool, identify the changes required in the test process and assess the actual costs and benefits of implementation. Roll-out of the tool should be based on a successful evaluation of the pilot. Roll-out normally requires strong commitment from tool users and new projects, as there is an initial overhead in using any tool on a new project.