TOSEM21

On the Interaction between Test-Suite Reduction and Regression-Test Selection Strategies

Abstract

Unit testing is one of the most established and practically applicable quality-assurance techniques in modern software development workflows. One major advantage of unit testing compared to more heavy-weight techniques is the arbitrarily adjustable trade-off between efficiency (i.e., testing effort) and effectiveness (i.e., fault-detection probability). To this end, a variety of testing strategies have been proposed to exploit, and explicitly control, this trade-off. In particular, test-suite reduction techniques are concerned with reducing the number of (presumably redundant) test cases in a test suite while testing a single program version. Regression-test selection strategies pursue a similar goal, yet focus on selecting test cases for consecutive program revisions. As a consequence, both kinds of strategies may potentially influence, or even obstruct, each other's performance in various ways. For instance, test cases discarded during test-suite reduction for a particular program version may become relevant again during regression-test selection after a program revision, and vice versa. Hence, how to find a suitable combination of both strategies that leads to a reasonable efficiency/effectiveness trade-off throughout the entire version history of a program is an open question. The goal of this paper is to gain a better understanding of the potential interactions between both kinds of unit-testing strategies with respect to efficiency and effectiveness. To this end, we present a configurable experimental evaluation framework for automated unit testing of C programs comprising different strategies for test-suite reduction and regression-test selection as well as possible combinations thereof. We apply this framework to a collection of subject systems, delivering several crucial insights: (1) test-suite reduction almost always has a negative impact on the effectiveness of regression-test selection, yet a positive impact on efficiency, and (2) test cases that reveal to testers the effect of program modifications between consecutive program versions are far more effective than test cases that simply cover modified code parts, yet cause much more testing effort.

Supplementary Data

Artifact Download

  • RegreTS: zip archives 1, 2, 3, 4
  • Results: zip archives 1, 2, 3
  • TestCov: zip archive

The zip files for RegreTS contain the test-case generation framework used for generating the test suites, as explained in "Reproducing the Results".

The zip files for Results contain all results, including files with the results for each test suite and the aggregated results for each strategy.

Additionally, the programs used for creating traversing and revealing test cases are located in the corresponding folders (i.e., the "traversing" and "compare" folders), and the programs used to check whether a test suite actually manages to find the bug are located in the corresponding "check" folders.

Lastly, the zip files for TestCov contain the test-suite validator TestCov, which is used for reducing the test suites and checking whether they find the corresponding bugs.

Reproducing the Results

To create a test suite for a given comparator program (traversing or revealing), RegreTS needs to be unpacked and executed with the following call:

  • scripts/cpa.sh tigertestcomp20 -setprop "tiger.testSuiteFolder=pathToTestSuite" -setprop "tiger.fqlQuery=query" -setprop "tiger.numberOfTestCasesPerGoal=NRT" -benchmark -heap 15000M pathToProgram

where

  • pathToProgram is the path to the comparator program
  • pathToTestSuite is the path for the resulting test suite
  • query is either "Goals:G1" for revealing comparator programs (in the "compare" folders) or "GoalRegex:.*_CHANGE_.*" for traversing comparator programs
  • NRT is the number of test cases created per goal

This step should be repeated for each comparator program of the current subject system, and the test suites of prior versions should be merged into the test suites of later versions (to re-use the previously generated test cases).
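For illustration, a call for a revealing comparator program could look as follows; the test-suite folder, the value for the number of test cases per goal, and the program path are hypothetical placeholders and need to be adapted to the actual subject system:

  • scripts/cpa.sh tigertestcomp20 -setprop "tiger.testSuiteFolder=output/testsuite-v2" -setprop "tiger.fqlQuery=Goals:G1" -setprop "tiger.numberOfTestCasesPerGoal=1" -benchmark -heap 15000M compare/subject_v1_v2.c

For a traversing comparator program, the query "GoalRegex:.*_CHANGE_.*" would be used instead.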

The resulting test-suite can then be reduced by calling TestCov with the following parameters:

  • python3 bin/testcov --goal goalFile --test-suite testsuite.zip --timelimit-per-run 30 --reduction reductionStrategy --no-plots program

where

  • goalFile is the corresponding goal file in the TestCov folder (either goal_traversing.prp or goal_revealing.prp)
  • testsuite.zip is the zipped test suite previously generated
  • reductionStrategy is either DIFF, ILP, or FAST_PP
  • program is the comparator program the test suite was generated on

The result will be a reduced test suite.
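As an illustrative example, a reduction call using the ILP strategy on a test suite generated for a revealing comparator program could look as follows; the archive name testsuite-v2.zip and the program path are hypothetical placeholders:

  • python3 bin/testcov --goal goal_revealing.prp --test-suite testsuite-v2.zip --timelimit-per-run 30 --reduction ILP --no-plots compare/subject_v1_v2.c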

To check whether a test suite finds the corresponding bug, call TestCov as follows:

  • python3 bin/testcov --goal goal_revealing.prp --test-suite testsuite.zip --timelimit-per-run 30 --no-plots checkProgram

where

  • testsuite.zip is the corresponding test suite
  • checkProgram is the comparator program (with the same version number as the comparator program the test suite was generated on) that compares the buggy and the original version (located in the "check" folder)
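As an illustrative example, such a check could look as follows; the archive name and the path to the check program are hypothetical placeholders:

  • python3 bin/testcov --goal goal_revealing.prp --test-suite testsuite-v2.zip --timelimit-per-run 30 --no-plots check/subject_v2_check.c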