Bug Deep Dive: Fixing Guardian Angels
A deep dive into one of the bugs that was bugging all of you… and us!
Hey Crewmates!
Yesterday we pushed out a fix to solve the bug where if an Impostor tries to kill a Crewmate protected by a Guardian Angel, they lose the ability to kill ever again. It was such a strange bug we wanted to share the process with you here, and show you exactly how weird and mysterious programming can be sometimes.
TL;DR – the Guardian Angel was broken because of … timezones.
The Problem
On November 10th, 2021, one day after the Among Us roles update launched, a new bug ticket popped up from our QA team.
Occasionally, after an Impostor’s kill is blocked by a Guardian Angel, the Impostor can no longer kill any other player for the remainder of the match. Our team couldn’t reproduce the bug, and bug reports seemed sparse and inconsistent. Amongst all the conflicting priorities at the end of 2021, the bug was, while concerning, thought to be an uncommon issue.
Once we were finally able to take a deep look at this bug, things got weird. Adriel, our lead programmer, could now reproduce the bug nearly 100% of the time on the live servers. However, when trying to reproduce the bug with another programmer, the bug would consistently not occur (WHAT). Even though we were using a non-live server to test together as a team, the servers’ versions were exactly the same. Noting the difference between live and non-live, we pulled a live server to our non-live environment and guess what happened—the bug didn’t occur (more WHAT??). Let’s call that server “Server A” and we’ll get back to that one later.
At this point, we had already been frantically combing the code for programming “gotchas” and theorizing that the amount of load on the live servers compared to the non-live servers was the root cause of the bug. And then, after many hours of pulling our hair out, Forest (our CEO) saw a very suspicious line of server code that decides if a player is protected by a Guardian Angel:
We found a DateTime.Now.
If you’re a backend programmer, I’d bet you’re in full facepalm mode right now. Dealing with the current time is a programming nightmare due to time zones and daylight saving time. The value provided by DateTime.Now is the local time of the machine the application is running on. The fix is to always use “Coordinated Universal Time”, abbreviated as UTC, which is the same value for every machine at any given moment (we won’t get into the nitty gritty, but check Google for more details). We do use the corresponding C# property, DateTime.UtcNow, elsewhere; we simply missed it in this single spot.
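As a rough illustration of the two properties (a sketch, not the actual server code), the C# snippet below reads both clocks and normalizes the local reading to UTC. The class name ClockKinds and the helper ToUtc are made up for this example; DateTime.Now, DateTime.UtcNow, Kind, and ToUniversalTime() are standard .NET members.

```csharp
using System;

public static class ClockKinds
{
    // Hypothetical helper: normalizes a timestamp to UTC before it is stored or compared.
    public static DateTime ToUtc(DateTime value)
    {
        // ToUniversalTime() leaves Kind == Utc values unchanged and converts
        // Kind == Local values (Unspecified is treated as local).
        return value.ToUniversalTime();
    }

    public static void Main()
    {
        DateTime local = DateTime.Now;    // Kind == DateTimeKind.Local
        DateTime utc = DateTime.UtcNow;   // Kind == DateTimeKind.Utc

        Console.WriteLine($"Local wall clock: {local:o}");
        Console.WriteLine($"UTC wall clock:   {utc:o}");

        // After normalizing, the readings differ only by the time between the two calls.
        Console.WriteLine($"Gap after normalizing: {(ToUtc(local) - utc).Duration()}");
    }
}
```

On a machine whose system time zone is UTC, the two wall-clock lines print the same time of day; anywhere else they differ by the zone offset.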
From the screenshot above, the player.LastProtected time was generated from DateTime.UtcNow, while the current time during the IsPlayerProtected() calculation was derived from DateTime.Now. So, if the latter value is local time, the result of our time calculation would likely be hours off, depending on the time zone of the server’s machine. We also confirmed that by forcing this time check to fail, we could reproduce the Guardian Angel bug 100% of the time.
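The mismatch described above can be sketched in a few lines of C#. This is a hypothetical reconstruction for illustration only, not the real server code: the 10-second ProtectionDuration, the method signature, and the UTC-5 machine are all assumptions, and the clock readings are passed in so the example behaves the same on any machine.

```csharp
using System;

public static class ProtectionCheck
{
    // Assumed value: the real protection duration is not stated in this post.
    public static readonly TimeSpan ProtectionDuration = TimeSpan.FromSeconds(10);

    // Hypothetical stand-in for the server check. 'lastProtected' is a UTC reading;
    // 'now' is whatever clock reading the caller passes in.
    public static bool IsPlayerProtected(DateTime lastProtected, DateTime now)
    {
        // DateTime subtraction compares raw ticks and ignores Kind, so mixing a
        // local 'now' with a UTC 'lastProtected' silently bakes in the zone offset.
        TimeSpan elapsed = now - lastProtected;
        return elapsed >= TimeSpan.Zero && elapsed < ProtectionDuration;
    }

    public static void Main()
    {
        DateTime lastProtected = new DateTime(2021, 11, 10, 18, 0, 0, DateTimeKind.Utc);

        // Three seconds later, as DateTime.UtcNow would report it.
        DateTime utcNow = lastProtected.AddSeconds(3);

        // The same instant as a wall-clock reading on an assumed UTC-5 machine,
        // i.e. what DateTime.Now would show there.
        DateTime localNow = utcNow.AddHours(-5);

        Console.WriteLine(IsPlayerProtected(lastProtected, utcNow));   // True
        Console.WriteLine(IsPlayerProtected(lastProtected, localNow)); // False
    }
}
```

With matching clocks the elapsed time is 3 seconds and the check passes. With the local reading, the elapsed time comes out as roughly minus five hours and the check fails. Which way the real check went wrong would depend on the sign of the server's offset and on how the actual comparison was written.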
But wait a second! How did we miss this error? How did this bug occur consistently on our live servers, but so elusively and inconsistently on our non-live servers? What about Server A?
The Solution
It turns out that most of our live servers are hosted on Hosting Environment 1, while our non-live servers and select live servers, including Server A, are hosted on Hosting Environment 2.
These two hosting services likely configure time zones differently. We can confirm that our Hosting Environment 2 servers are configured to use UTC as system time, meaning that the value of DateTime.Now should be the same as DateTime.UtcNow. That configuration would explain why we could not reproduce the bug on our non-live servers or Server A. While we can’t say definitively, we presume that most machines hosted on Hosting Environment 1 are providing a non-UTC time from DateTime.Now, causing the bug. In essence, the bug itself and its severity were hidden from us because of varying hosting environments.
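One way to check how a given machine is configured is to log its local zone offset at startup. The C# sketch below shows the idea; ClockDiagnostics and LocalClockMatchesUtc are hypothetical names for this example, not something taken from the server code, while TimeZoneInfo.Local and GetUtcOffset are standard .NET APIs.

```csharp
using System;

public static class ClockDiagnostics
{
    // True when a machine using 'zone' reads the same wall clock as UTC
    // at the given instant, i.e. DateTime.Now and DateTime.UtcNow would agree.
    public static bool LocalClockMatchesUtc(TimeZoneInfo zone, DateTime utcInstant)
    {
        return zone.GetUtcOffset(utcInstant) == TimeSpan.Zero;
    }

    public static void Main()
    {
        TimeZoneInfo zone = TimeZoneInfo.Local;
        DateTime now = DateTime.UtcNow;

        Console.WriteLine($"Zone: {zone.Id}, offset from UTC: {zone.GetUtcOffset(now)}");
        Console.WriteLine($"Local clock matches UTC: {LocalClockMatchesUtc(zone, now)}");
    }
}
```

A zero offset only says the clocks agree at that instant; a zone that observes daylight saving time could match UTC for part of the year and drift by an hour for the rest.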
We updated our servers with our fix and the bug is no longer reproducible, giving us confidence that the bug has been squashed. Nonetheless, there are a few remaining mysteries.
- As stated in the last paragraph, we aren’t 100% sure about what time values we were getting on different hosting environments.
- While our code and logging analysis explains why an Impostor couldn’t kill the Crewmate originally protected by the Guardian Angel, it doesn’t explain why the Impostor couldn’t kill any other Crewmate afterwards.
While we feel the urge to completely understand the root of the problem, sometimes we must accept a few mysteries and move on. Such is life. We have plenty of other Among Us bugs to squash and we’d rather first move on to those.
Thanks for reading!
Mik

