Becoming a Dungeon Master for an Interview

Interviews should be as close as possible to the work that the candidate would do if they join.
While that seems like a pretty tautological statement, it is unfortunately not always true. When interviewing as a software engineer, for example, you’ll run into places that lean heavily on algorithms interviews.
And don’t get me wrong, those interviews can be fun (in the same way that Project Euler can be fun), but they’re not very relevant for most jobs. For some jobs, they’re a proxy for the technical skills you’ll need, but often it feels like they’re testing whether you took an algo class that covered dynamic programming.
So, when we were designing our interviews for Support Engineers, we were met with the same problem - how can we best simulate and test what the job would be like?
Support has a lot of unique aspects to it - you have to be able to context switch, you have to be able to debug issues well, you have to be able to make customers feel heard, and so much more all at the same time. For a technical product, you also need to be able to read and write code.
The Interview
After we talked about it, we decided on a format:
- We describe a fake product. To avoid everyone needing to ramp up on it, it’s just a simplified clone of a very well known product.
- We provide docs for the fake product.
- We provide a queue of emails/questions for that fake product, loosely based on real questions / customer issues we’ve debugged.
- The goal is to resolve the queue, just like you would in real life.
There are so many aspects to this that actually give us a lot of signal. Candidates first have to prioritize the queue, which means deciding which issues seem important and which can wait. They also have to parse questions that are vague, or that misinterpret what’s actually happening.
It’s a pretty interesting exercise, but there’s one problem:
You can’t really resolve tickets in isolation.
Real customer issues often require back and forth. You may need to clarify something, get more information, dig through logs, etc. Without that, you are just blindly guessing at what the underlying issue is.
Becoming a Dungeon Master (DM)
To solve that, we took each of the tickets that we made and we created extra assets for them. These are things like code snippets that a customer would send over if prompted, answers that engineers would give if asked about the product, or what the actual underlying issue is (in case it isn’t obvious from the ticket).
The idea was that the candidate would basically just say what they would do next and we would provide what happened.
It’s been a while since I’ve played DnD, but it feels like a very, very light version of the prep you do when designing a campaign as a DM. You want to be prepared for the different things that might happen, because you aren’t 100% sure which direction the players will take it.
In the DnD context, this could be someone carefully going over every detail of a room that you didn’t expect them to care about. In the interview context, this could be someone asking to check the status page before responding to a ticket about API errors - something that is very reasonable, but you may not have thought of ahead of time.
Fortunately, relative to an entire fantasy world, a small product with minimal docs is pretty easy to prep for.
What does it look like in practice?
- Candidate: Hmm, ok, so the user is getting timeouts in production but not locally. Can they clarify if all requests are timing out or just some?
- Us: They say it’s every request.
- Candidate: As a sanity check, could we have them make a request to a different service in production and see if that times out?
- Us: They said that also timed out.
- Candidate: Great, so this is probably a networking issue that isn’t specific to us. I’d probably ask another question about how their infrastructure is set up.
- Us: They then apologized and said they figured out the issue from there. They were using Lambda functions in prod but hadn’t set it up to have internet access yet.
To be clear, we aren’t expecting the candidate to say “Oh it’s probably in a private subnet,” or even go into the networking weeds with the user, but we are expecting that the right prompts will lead the user to figure it out.
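For the curious, the prep behind an exchange like that one can be written down as little more than a lookup table. Here’s a rough Python sketch of a ticket plus its extra assets. To be clear, this is a hypothetical illustration: the class name, field names, and action keys are all made up for this example and aren’t a description of any real prep material.

```python
from dataclasses import dataclass, field


@dataclass
class PreparedTicket:
    """One ticket in the fake queue plus the interviewer-only prep (hypothetical)."""

    subject: str
    # What is actually wrong. The candidate never sees this directly.
    underlying_issue: str
    # Maps a kind of candidate action to what the interviewer says happened.
    responses: dict[str, str] = field(default_factory=dict)

    def respond(self, action: str) -> str:
        """Return the prepared outcome, or a reminder that we have to improvise."""
        return self.responses.get(
            action, "(not prepped: improvise a reasonable answer)"
        )


# The timeout ticket from the exchange above, with made-up action keys.
timeouts = PreparedTicket(
    subject="Timeouts in production but not locally",
    underlying_issue="Lambda functions in prod have no internet access",
    responses={
        "ask_scope": "They say it's every request.",
        "try_other_service": "They said that also timed out.",
        "ask_infrastructure": "They apologized and said they figured it out from there.",
    },
)

print(timeouts.respond("ask_scope"))
print(timeouts.respond("check_status_page"))
```

The fallback in `respond` is the interesting part: no matter how much you prep, someone will ask for something you didn’t plan for, and the interviewer has to make up a reasonable answer on the spot.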
The whole exercise takes about 45 minutes and provides a lot of good insights. For that same example, there are a bunch of paths to the same answer. Some candidates will ask if we are down before saying anything to the user, some candidates will try to reproduce the issue themselves (which is hard for this ticket, though not for all of them), and some candidates will lean more heavily on their intuition to figure out what’s wrong.
Any problems with it?
So far, it’s been going pretty well. I will say there are probably two areas that could be better.
The first is that there are occasions where the candidate doesn’t really know the bounds (or lack thereof) of what they can do from the prompt. In these cases, I switch to an easy ticket and walk them through it to show how it works.
The second is that it can be tricky to ramp other people up on. The prep work helps standardize the question, but a group of people would likely need to shadow each other for a little while before they can all ask it and compare notes correctly.
Summary
While it requires a bit of up-front work (and some creativity), designing an interview process that mimics the actual day-to-day responsibilities can give you much more signal than something more divorced from reality.
We’ve also received the feedback that it’s a much more enjoyable interview process than most - and it never hurts to provide an excellent candidate experience!


