A flaky test is one that fails unpredictably, without a corresponding change in the code under test. These often show up in CI runs where a test unrelated to any change made suddenly fails, and then mysteriously passes when re-run. There are many type and causes of “flakes”. Today I want to talk about flaky tests caused by time precision, and time-order dependence.
The specific tools mentioned here will be Ruby, Rails, and PostgreSQL specific. But analogs exist in nearly all other languages and frameworks, so the techniques are broadly applicable.
🥶 Freeze Time
Sometimes a test assumes that time will elapse (or not elapse) during the execution of the test. In such tests, when comparing time/timestamps, try to work with entirely predictable time. Ideally by making time… umm… more static.
In RSpec, you can ensure that Time.now
and friends always return the same value during a single test execution by freezing time with Rails’ freeze_time
helper.
From the docs:
Time.current #=> Sun, 09 Jul 2017 15:34:49 EST -05:00
freeze_time do
sleep(1)
Time.current #=> Sun, 09 Jul 2017 15:34:49 EST -05:00
end
Time.current #=> Sun, 09 Jul 2017 15:34:50 EST -05:00
I prefer to add a bit of RSpec metadata to do the magic. 🪄
# spec/spec_helper.rb
config.around(:each, :time_frozen) do |example|
freeze_time do
example.call
end
end
With that, you can add the :time_frozen
metadata to a wrapping context
or describe
block:
RSpec.describe MyClass, :time_frozen do
# Time is frozen here… 🥶
end
Now all references to Time.now
and friends are frozen.
Time will not move forward inside that block.
It’s important to know that any setup (like RSpec let!
, before
, or around
blocks) happening before the time-frozen block did not have time frozen, and so can have slightly different values Time.now
than inside the block.
We’ll come back to this later. 😉
⏱️ Platform-specific Time Precision
Ruby’s Time
and DateTime
objects have up to nanosecond precision.
That is, up to 9 significant digits after the decimal point.
e.g. Time.utc(2024, 5, 20, 0, 0, 0.123456789)
would result in a Time
object with 123,456,789 nanoseconds of precision.
However, the precision of Time.now
(and friends) varies by platform.
On macOS, it’s limited to microseconds - 6 significant digits after the decimal point.
Linux tends to be nanoseconds.
This usually doesn’t matter because any single test run happens on the same machine. However, there’s a wrinkle: PostgreSQL. Or, more generally, the precision of time-like data in your database.
PostgreSQL’s time
, timestamp
and interval
data types have a resolution of microseconds.
And MySQL has a similar limitation on precision.
So during a CI run, for example, it’s possible to do something like this:
now = Time.now #=> 2024-05-20 19:58:45.660142789 +0000
BlogPost.create(published_at: now)
post = BlogPost.last
expect(post.published_at).to eq(now) #=> Flakes b/c post.published_at lost resolution
post.published_at #=> 2024-05-20 19:58:45.660142000 +0000
In my experience, the need for directly comparing to a time field from the database is rare.
But in those rare cases, I prefer to #round
the in-memory time to the same precision as the database column.
expect(post.published_at).to eq(now.round(6))
Another option is more fuzzy expectations/matching.
RSpec has the be_within
matcher, for example.
Though I find this more confusing than helpful, and tend to avoid it.
🤹 Time-order Dependent Tests
In that contrived example it’s easy to see how we end up with mismatched precision. A more tricky variation occurs when we create a bunch of test data and then make assertions on that data, based on the order of various time fields.
posts = FactoryBot.create_list(:post, :published) #=> Generate Post's with #published_at = Time.now at time of creation
sorted_posts = posts.sort_by(:published_at).reverse #=> sort the in-memory list, "newest" first
recent_posts = RecentPostQuery.call
expect(recent_posts).to eq(sorted_posts) #=> Flakes
In this case, the posts
collection was held in memory, with each Post#published_at
value having full nanosecond precision.
When saving to PostgreSQL, the precision was reduced to microsecond.
Meaning it’s possible two Post
s were created fast enough that when sub-microsecond precision is lost, they have the “same” #published_at
in the database.
Sorting equal values is non-deterministic, so Post
instances with the same #published_at
can have different ordering within the recent_posts
collection.
And so… flaky test failures.
A naive way to handle this would be to #reload
the posts collection before sorting it in-memory.
Though that can result in the same problem of loss of precision, with the in-memory collection having non-deterministic ordering.
A robust way to solve this is to use explicit Time
/DateTime
values when creating the test data.
After all, the ordering-by-time is actually the thing we’re trying to test.
Being more explicit about that in test setup helps our future selves understand which parts of setup are important, vs incidental.
oldest = FactoryBot.create(:post, published_at: 2.days.ago)
middle = FactoryBot.create(:post, published_at: 1.day.ago)
newest = FactoryBot.create(:post, published_at: Time.now)
recent_posts = RecentPostQuery.call
expect(recent_posts).to eq([newest, middle, oldest]) #=> works every time!
🥶⏱️ Precision of Frozen Time
Using Rails’ built in #freeze_time
, #travel_to
, etc… will result in all sub-second precision being lost, by default.
This can be surprising, or at least unexpected.
I know it surprised me!
As with freezing time in general, this loss of precision usually doesn’t matter. However, it can cause issues similar to those above where a loss of precision leads to non-deterministic ordering. Consider this terribly made up example:
let!(:post) { FactoryBot.create(:post, :published) }
context "with a draft post" do
let(:draft) { FactoryBot.create(:post, :draft) }
before do
freeze_time
end
it "includes a published draft as the newest Post" do
draft.publish! #=> sets published_at = Time.now
recent_posts = RecentPostsQuery.call
expect(recent_posts).to eq([draft, post]) #=> Flakes
end
end
This can be tricky to spot.
Here the let!(:post)
happened before we froze time, so it might have a #published_at = 2024-05-20 0:0:0.123456000 +0000
.
Then, a fraction of a second later (because computers are fast), we freeze time, create the draft
, and #publish!
it.
Since we know the creation and publication of the draft
happens after the creation of post
, this setup seems to make sense.
Except frozen time has no sub-second precision.
So it’s possible that thedraft
’s published_at = 2024-05-20 0:0:0.000000 +0000
.
Which is a fraction of a second before the post
’s, rather than after it.
We might try reaching for our :time_frozen
RSpec helper here, relying on RSpec’s behavior to ensure all before
blocks (which include let!
) run before the actual test code (it
block).
And indeed that is the case.
But this is also subtly broken as now all of the times lack sub-second precision.
Meaning we’re back in the non-deterministic ordering problem.
let!(:post) { FactoryBot.create(:post, :published) }
context "with a draft post", :time_frozen do
let(:draft) { FactoryBot.create(:post, :draft) }
it "includes a published draft as the newest Post" do
draft.publish! #=> sets published_at = Time.now
recent_posts = RecentPostsQuery.call
expect(recent_posts).to eq([draft, post]) #=> Flakes b/c time's can be equal
end
end
The solution, like before, is to be explicit about time values when ordering based on them is important to the test.
Preserving sub-second precision
With newer versions of Rails, sub-second precision can optionally be preserved using the with_usec
option.
You need to opt into this at each freeze_time
, travel
, and travel_to
call site, but at least it’s possible!
freeze_time(with_usec: true) do
# Time here is frozen, and includes sub-second precision.
end
🎁 Wrapping up
In summary, take care when directly comparing or sorting on time (or time-like) values. And when setting up test data where time is important, be explicit about the exact time values being used.
Good luck. And happy testing!
🙏 Thanks
Thank you to Hazel for feedback and proofing!