Maintenance study

How Do Automatically Generated Unit Tests Influence Software Maintenance?

Sina Shamshiri, José Miguel Rojas, Juan Pablo Galeotti, Neil Walkinshaw and Gordon Fraser.

Download PDF

Abstract

Generating unit tests automatically saves time over
writing tests manually and can lead to higher code coverage.
However, automatically generated tests are usually not based
on realistic scenarios, and are therefore generally considered
to be less readable. This places a question mark over their
practical value: Every time a test fails, a developer has to decide
whether this failure has revealed a regression fault in the program
under test, or whether the test itself needs to be updated. Does
the fact that automatically generated tests are harder to read
outweigh the time-savings gained by their automated generation,
and render them more of a hindrance than a help for software
maintenance? In order to answer this question, we performed
an empirical study in which participants were presented with an
automatically generated or manually written failing test, and
were asked to identify and fix the cause of the failure. Our
experiment and two replications resulted in a total of 150 data
points based on 75 participants. Whilst maintenance activities
take longer when working with automatically generated tests, we
found developers to be equally effective with manually written
and automatically generated tests. This has implications for how
automated test generation is best used in practice, and it indicates
a need for research into the generation of more realistic tests.
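
The readability gap the abstract refers to can be illustrated with a small, hypothetical example (the class and tests below are not taken from the study; the second test only mimics the style of output produced by automated generation tools such as EvoSuite):

```java
import org.junit.Test;
import static org.junit.Assert.assertFalse;

// Hypothetical class under test, invented for illustration only.
class StringStack {
    private final StringBuilder data = new StringBuilder();
    void push(String s) { data.append(s).append('\n'); }
    boolean isEmpty() { return data.length() == 0; }
}

public class StringStackTest {

    // A manually written test typically encodes a realistic usage scenario,
    // with descriptive names and meaningful input values.
    @Test
    public void pushMakesStackNonEmpty() {
        StringStack stack = new StringStack();
        stack.push("hello");
        assertFalse(stack.isEmpty());
    }

    // A test in the style of automated generators: generic names, arbitrary
    // inputs, and assertions derived from observed behaviour rather than
    // from an intended scenario.
    @Test
    public void test0() {
        StringStack stringStack0 = new StringStack();
        stringStack0.push("");
        boolean boolean0 = stringStack0.isEmpty();
        assertFalse(boolean0);
    }
}
```

When either test fails after a code change, the developer must decide whether the failure signals a regression or an outdated test; the study examines how long this decision takes and how often it is made correctly for each kind of test.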

Download Artefacts and Replication Package
