
How to initialize test data in unit tests?


In this post, we cover the three key ways of initializing test data for your unit tests: direct initialization, the Object Mother pattern, and the Test Data Builder pattern.

When testing to ensure your software applications are up to scratch, you’ll go through several phases of the testing workflow. As you start doing unit testing, you’ll inevitably need test data to test your functions against. Normally, you’ll just rely on object creation when initializing classes.

However, more often than not, you’ll find yourself initializing similar test data over and over. In such scenarios, advanced techniques like using Object Mothers or Test Data Builders can help reduce the effort. (Of course, mocking reduces the need for test data: you either use a single value or a small set of values, or simply return the desired result, e.g. when mocking an external database.)

⚡️ Reduce unit testing effort with test generation

Symflower enables you to generate JUnit 4 and JUnit 5 tests for Java, Spring, and Spring Boot with minimal effort. Create ready-to-go test templates or generate complete test suites (beta) with our plugin for VS Code, IntelliJ, CLI & more. Try it now.

Wait, just what is test data again?

To determine whether your application functions as intended, you’ll test it against some example data. That example data is what we call test data. Ideally, your test dataset will cover the whole variety of both possible and impossible inputs so that you can test how your software handles data input errors.

Test data are categorized into three key groups:

  • Normal data is the stuff that actually “makes sense”. It’s the kind of data your application is supposed to take and process.
  • Boundary data (or extreme data) is still valid, but sits at or near the limits of the allowed data range, where problems are more likely to surface.
  • Erroneous data is what you’ll use to test how your software handles errors in data input. These are values that your program isn’t expected to accept.
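To make the three categories concrete, here is a minimal sketch. It assumes a hypothetical AgeValidator that accepts ages from 0 to 130; both the class and the range are illustrative assumptions, not part of any real library.

```java
import java.util.List;

// Hypothetical validator used only for illustration: accepts ages 0..130.
class AgeValidator {
    static final int MIN_AGE = 0;
    static final int MAX_AGE = 130;

    static boolean isValid(int age) {
        return age >= MIN_AGE && age <= MAX_AGE;
    }
}

// Example test values grouped by category (values are assumptions).
class TestDataCategories {
    // Normal data: typical values the application is meant to process.
    static final List<Integer> NORMAL = List.of(25, 43, 67);

    // Boundary data: still valid, but at the edges of the allowed range.
    static final List<Integer> BOUNDARY = List.of(0, 130);

    // Erroneous data: values the application is expected to reject.
    static final List<Integer> ERRONEOUS = List.of(-1, 131);
}
```

A test suite would typically feed each group into the code under test and expect the first two groups to be accepted and the third to be rejected.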

🤔 Curious about the ways to generate test data?

Check out our post: Methods for automated test value generation

When you want to test against any of the above types of data, you’ll first need to initialize that data in your tests. But in most cases, you’ll have several tests that use the same (or similar) data as input. So it makes sense to optimize the way you initialize that data. Let’s see how you can do that to optimize your testing workflow!

Test data initialization: 3 ways and best practices

There are, in general, three ways you can initialize test data: direct initialization, the Object Mother pattern, and the Test Data Builder pattern. Each has its own characteristics, pros, and cons that we’ll dive into below.

🤨 Testing pyramid or testing trophy?

Read our introduction to the testing trophy to see how it stacks up against the tried-and-tested testing pyramid!

Directly initializing test data

Direct initialization is the most straightforward option. It means directly using the initializing options that the class in question provides. Specifically, if you choose to go with directly initializing test data, you have the following options:

1) Using constructors

This option has you pass relevant initialization values to a constructor. Example:

new Person("Peter", "Pan", 43);

The advantages of this are obvious: directly initializing test data is short and simple. It’s done in a single line of code that is easy to read and understand. Another benefit is that, provided the constructor sanitizes or validates the values it receives, you can’t create invalid objects this way. It makes sense to use the constructor that lets you set the most test values, because this keeps your tests nice and concise.
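The validation benefit only holds if the constructor actually checks its arguments. As a minimal sketch, here is what such a constructor could look like. The article does not show the real Person class, so the fields and the rules below are assumptions made for illustration.

```java
// Hypothetical Person class; fields and validation rules are assumptions.
class Person {
    private final String firstName;
    private final String lastName;
    private final int age;

    Person(String firstName, String lastName, int age) {
        // Reject values that would leave the object in an invalid state.
        if (firstName == null || firstName.trim().isEmpty()) {
            throw new IllegalArgumentException("firstName must not be blank");
        }
        if (lastName == null || lastName.trim().isEmpty()) {
            throw new IllegalArgumentException("lastName must not be blank");
        }
        if (age < 0) {
            throw new IllegalArgumentException("age must not be negative");
        }
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
    int getAge() { return age; }
}
```

With a constructor like this, new Person("Peter", "Pan", 43) succeeds, while new Person("Peter", "Pan", -1) throws an IllegalArgumentException, so a test cannot accidentally run against an invalid object.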

2) Manually assigning data to the fields of the object

If you cannot initialize everything using a constructor, you can either look for public fields that you can manipulate directly, or just use the setter methods the class provides. Here’s how manual assignment would work:

Person p = new Person();
p.name = "Peter";

In the following case, a setter method is used:

Person p = new Person();
p.setName("Peter");

One advantage is that setter methods work even with fields that are not public. However, if you choose to go down this path, be aware that your code can get bloated very quickly, since every value you set requires its own statement.

3) Using reflection

You can also initialize objects using Java’s reflection API, which lets you inspect and modify fields at runtime regardless of their visibility. One downside, though, is that this is very complex to do. Consider it a brute-force method that isn’t very elegant and should only be used for private fields that can’t really be initialized any other way.

If you’re using reflection, keep in mind that you’re testing an object state that might never be possible in production. Be sure to carefully weigh your options! Answer the following questions:

  • Why do you need to use reflection to get an object in a certain state? Can this state even be reached in production?
  • Does your test even make sense?

In case you have a good enough reason to use it, reflection enables you to change the content of a private field when there’s no other way to access it directly.
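As a minimal sketch of the reflection approach, the following example sets a private field from test code using java.lang.reflect.Field. The Account class, its failedLogins field, and the ReflectionTestSupport helper are hypothetical names introduced for illustration.

```java
import java.lang.reflect.Field;

// Hypothetical class with a private field and no setter.
class Account {
    private int failedLogins = 0;

    boolean isLocked() {
        return failedLogins >= 3;
    }
}

// Hypothetical test helper; intended for test code only.
class ReflectionTestSupport {
    // Sets a private int field, declared directly on the target's class, by name.
    static void setPrivateInt(Object target, String fieldName, int value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // bypass the private modifier
            field.setInt(target, value);
        } catch (ReflectiveOperationException e) {
            // Covers NoSuchFieldException and IllegalAccessException.
            throw new IllegalStateException("Could not set field " + fieldName, e);
        }
    }
}
```

A test could call ReflectionTestSupport.setPrivateInt(account, "failedLogins", 3) to put the account into the locked state. Note that the field name is a plain string, so a rename in production code breaks the test only at runtime, which is one more reason to treat this as a last resort.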

Whichever method you choose to use, here are some key DOs and DON’Ts for direct initialization:

  • Use the best-fitting constructor for your test data to initialize as much as possible.
  • If some fields are missing, use the data manipulation options (i.e. accessing fields directly or setters) that the class provides to finish the initialization of your test data.
  • That said, refrain from changing the data manipulation options of a class “just for testing”, i.e. do not make a field public just because you cannot access it any other way in a test. This increases the chance of bugs as another team member may end up using this option in production code.

The problem is that each of the above options requires quite a bit of manual effort when initializing lots of objects. That makes them less than ideal for that use case. Let us present two other, more advanced options for when you need to initialize a large volume of objects.

🤓 Generating readable tests

You can use Symflower to generate tests with the best-fitting version of the above scenarios to maximize the readability of your tests. Check out Symflower’s documentation to learn more.

Using Object Mothers

Stepping up from direct initialization, the next option is using the Object Mother pattern. As defined by Martin Fowler, “An object mother is a kind of class used in testing to help create example objects that you use for testing.”

For a lot of tests, you’ll likely need similar data that you’ll want to reuse across test classes (aka your test fixture). In such cases, it’s a good idea to create a factory class to return your test fixtures/objects instead of initializing a fixture/object by hand when setting up your test. The term object mother refers to this object factory.

In essence, the object mother produces “canned” objects you can use (and reuse) in your tests. Think of it as a shortcut object generator, a factory class that returns a fixed (set of) dummy object(s).

A huge advantage of using Object Mothers is that these predefined dummy objects can be reused with practically no manual effort. Upon reusing, you can configure different variations for common scenarios to further reduce the effort that goes into initializing data. The actual initialization of objects is hidden within the factory class, so if the constructor changes, you’ll only need to adapt the factory class once.

Despite all those benefits, Object Mothers don’t scale well if you need a wide range of different configurations for your fixtures/objects. On the other hand, if you only need to initialize a few objects with a couple of values, using the Object Mother pattern can be overkill.

Keep in mind that you’ll still need to apply some form of direct initialization inside the mother object.

Example:

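import java.time.LocalDate;

// Object Mother: a factory class that returns predefined Person fixtures.
public class PersonFactory {
    public static Person createMiddleAgedMan() {
        // LocalDate.of is a static factory method, so it is called without "new".
        return new Person("Peter", "Pan", LocalDate.of(1978, 5, 12));
    }

    public static Person createJaneDoe() {
        return new Person("Jane", "Doe", LocalDate.of(1990, 5, 12));
    }
}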

Using the above Object Mother with a static import of createMiddleAgedMan would look like this:

@Test
public void getIncomeWithObjectMothers() {
	Accounting a = new Accounting();
	Person person = createMiddleAgedMan();

	// Testing logic
}

So let’s summarize: direct initialization provides ample flexibility but does not scale well. The Object Mother option enables scalability but isn’t very flexible. You may find you need some in-between option that offers the best of both worlds. That’s the Test Data Builder pattern!

Test Data Builder pattern

Going with the Test Data Builder option has you create a builder class that is already set up with good default data, but lets you customize the test object with minimal effort. You can choose to customize only certain attributes of your test fixture/object, making Test Data Builder a very flexible option.

Like an Object Mother, a Test Data Builder gives you easy “drop-in” predefined dummy objects and lets you define different objects for common scenarios. The actual object initialization is hidden within the builder class, which is great because if the constructor changes, all you need to do is adapt it once rather than at every instance.

🦾 Looking for more advanced testing techniques?

Check out this post about mutation testing and our series on JUnit testing tips and tricks!

The Test Data Builder pattern makes it easy to modify certain aspects or values of the built object while leaving everything else as default. Another benefit is that instantiation reads like a sentence and only contains the values that are important for the test, leaving everything in the default state. Better still, going with the Test Data Builder option enables you to nest multiple builders for different classes to create more complex data.
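Nesting builders can be sketched as follows. The Customer and Address classes, their builders, and all default values are hypothetical and only serve to illustrate the idea. For brevity, these builders mutate themselves and return this, which is a simpler variant than a builder that returns a fresh copy from every with* call.

```java
// Hypothetical domain classes used only for this illustration.
class Address {
    final String city;
    final String zipCode;

    Address(String city, String zipCode) {
        this.city = city;
        this.zipCode = zipCode;
    }
}

class Customer {
    final String name;
    final Address address;

    Customer(String name, Address address) {
        this.name = name;
        this.address = address;
    }
}

class AddressTestBuilder {
    private String city = "Linz";     // assumed default
    private String zipCode = "4020";  // assumed default

    static AddressTestBuilder anAddress() {
        return new AddressTestBuilder();
    }

    AddressTestBuilder withCity(String city) {
        this.city = city;
        return this;
    }

    AddressTestBuilder withZipCode(String zipCode) {
        this.zipCode = zipCode;
        return this;
    }

    Address build() {
        return new Address(city, zipCode);
    }
}

class CustomerTestBuilder {
    private String name = "Peter Pan";
    // The nested builder supplies a default address unless a test overrides it.
    private AddressTestBuilder address = AddressTestBuilder.anAddress();

    static CustomerTestBuilder aCustomer() {
        return new CustomerTestBuilder();
    }

    CustomerTestBuilder withName(String name) {
        this.name = name;
        return this;
    }

    CustomerTestBuilder withAddress(AddressTestBuilder address) {
        this.address = address;
        return this;
    }

    Customer build() {
        // Building the outer object also builds the nested one.
        return new Customer(name, address.build());
    }
}
```

With static imports, a test that only cares about the city could then write aCustomer().withAddress(anAddress().withCity("Vienna")).build() and leave every other value at its default.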

In short, the Test Data Builder is a versatile and powerful option for initializing test data. The only downside is that it might be overkill if you only need to initialize a limited number of objects with a few values. In addition, you’ll still need to do some form of direct initialization within the builder class.

A test data builder for our person class would look as follows:

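import java.time.LocalDate;

public class PersonTestBuilder {
    // Default values used unless a test overrides them.
    private String firstName = "Peter";
    private String lastName = "Pan";
    private LocalDate birthDay = LocalDate.of(1978, 1, 1);

    // Entry point: the only way to obtain a builder.
    public static PersonTestBuilder aPerson() {
        return new PersonTestBuilder();
    }

    private PersonTestBuilder() {}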

    private PersonTestBuilder(PersonTestBuilder builder) {
        this.firstName = builder.firstName;
        this.lastName = builder.lastName;
        this.birthDay = builder.birthDay;
    }

    public PersonTestBuilder withFirstName(String firstName) {
        final var copy = new PersonTestBuilder(this);
        copy.firstName = firstName;
        return copy;
    }

    public PersonTestBuilder withLastName(String lastName) {
        final var copy = new PersonTestBuilder(this);
        copy.lastName = lastName;
        return copy;
    }

    public PersonTestBuilder withBirthDate(LocalDate birthDay) {
        final var copy = new PersonTestBuilder(this);
        copy.birthDay = birthDay;
        return copy;
    }

    public Person build() {
        final var person = new Person();
        person.setFirstName(firstName);
        person.setLastName(lastName);
        person.setBirthDay(birthDay);

        return person;
    }
}

Note that the constructors are kept private on purpose so the only way to start working with a PersonTestBuilder is to use its static method aPerson. The builder stores three member variables, which mirror the data used for initializing a person. Those are initialized with meaningful default values, e.g. aPerson().build() could be used directly when you are happy with receiving a person of the form new Person("Peter", "Pan", LocalDate.of(1978,1,1)).

If you want to overwrite individual members, you can call the provided with* methods and even chain these calls. See the following example, where we want the test data to be a person named ‘Jane Doe’ rather than ‘Peter Pan’:

@Test
public void getIncomeWithObjectBuilder() {
	Accounting a = new Accounting();
	Person person = aPerson().withFirstName("Jane").withLastName("Doe").build();

	// Testing logic
}
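One consequence of with* methods that return a copy instead of modifying the builder is that a partially configured builder can be shared and reused safely. The sketch below uses trimmed-down, hypothetical versions of Person and PersonTestBuilder (two fields only) so that it is self-contained. SharedBuilderExample and its members are names invented for this illustration.

```java
// Trimmed-down, hypothetical Person with only two fields.
class Person {
    private String firstName;
    private String lastName;

    void setFirstName(String firstName) { this.firstName = firstName; }
    void setLastName(String lastName) { this.lastName = lastName; }
    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
}

class PersonTestBuilder {
    private String firstName = "Peter";
    private String lastName = "Pan";

    static PersonTestBuilder aPerson() {
        return new PersonTestBuilder();
    }

    private PersonTestBuilder() {}

    private PersonTestBuilder(PersonTestBuilder other) {
        this.firstName = other.firstName;
        this.lastName = other.lastName;
    }

    // Each with* method works on a copy, so the original stays unchanged.
    PersonTestBuilder withFirstName(String firstName) {
        PersonTestBuilder copy = new PersonTestBuilder(this);
        copy.firstName = firstName;
        return copy;
    }

    PersonTestBuilder withLastName(String lastName) {
        PersonTestBuilder copy = new PersonTestBuilder(this);
        copy.lastName = lastName;
        return copy;
    }

    Person build() {
        Person person = new Person();
        person.setFirstName(firstName);
        person.setLastName(lastName);
        return person;
    }
}

class SharedBuilderExample {
    // A shared base configuration that individual tests can derive from.
    static final PersonTestBuilder A_DOE = PersonTestBuilder.aPerson().withLastName("Doe");

    static Person jane() {
        return A_DOE.withFirstName("Jane").build();
    }

    static Person john() {
        return A_DOE.withFirstName("John").build();
    }
}
```

Because deriving Jane does not touch A_DOE, building John (or A_DOE itself) afterwards still starts from the untouched base values. With a builder that mutates itself, the same code would leak state from one test into the next.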

Summary: advanced test data initialization

So those are the three main ways to initialize test data, from the simplest to the most advanced. It’s a good idea to carefully weigh your options and stick with the simplest one that is easy to use and easy to read.

If initialization is a bit of a headache, you’ll be delighted to learn that Symflower’s generated test templates take care of initialization automatically. What’s more, our new beta feature even generates test values for your test suites to check all possible code paths! So not only can you save the time and hassle of initialization, but your entire testing workflow may be reduced to reviewing generated tests.

Try Symflower in your IDE to generate unit test templates & test suites


| 2024-04-30