How does latency affect my live music laptop?
Latency is one of the things that has always been, and always will be, with us in music performance, whether it involves digital instruments or not. But what is it? Why is it there? And do we need to do anything about it?
Latency is the delay between when you want a sound to happen, and when it actually happens. If you have your head right up against a drum, and you hit the drum, the sound will get from the drum to your ear pretty much instantly, and so you don't perceive any latency. (It will also be really loud. Not recommended.)
But in fact, the speed of sound in air is around 340 meters per second, or 1120 ft/sec, so even if your ear is only one foot (30cm) from the drum, the sound will take about a millisecond to get to your ear. If you are standing in front of a drum kit, let's say about 10ft away, the sound of the drums will reach your ears about 10ms after the drummer has hit them. If you are the conductor of an orchestra, it could take about 35ms for the sound of the percussionist’s snare drum to reach your ears after they have hit it.
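The distances above can be turned into delays with one division. Here is a minimal sketch, assuming the rounded 340 m/s figure used in the text (the real speed of sound varies with temperature); the function names are made up for illustration.

```python
# Approximate speed of sound in air, as quoted in the text.
SPEED_OF_SOUND_M_PER_S = 340.0
FEET_TO_METERS = 0.3048


def acoustic_delay_ms(distance_m: float) -> float:
    """Milliseconds for sound to travel distance_m through air."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def distance_for_delay_m(delay_ms: float) -> float:
    """Distance in meters that sound covers in delay_ms milliseconds."""
    return delay_ms / 1000.0 * SPEED_OF_SOUND_M_PER_S


for label, feet in [("ear one foot from the drum", 1), ("listener 10 ft from the kit", 10)]:
    print(f"{label}: {acoustic_delay_ms(feet * FEET_TO_METERS):.1f} ms")

# How far away is a snare drum that arrives 35 ms late?
print(f"35 ms of delay is about {distance_for_delay_m(35):.1f} m of air")
```

With these assumptions, 10 ft works out to a little under 10ms, and the conductor's 35ms corresponds to a snare drum roughly 12 meters away.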
Likewise, some instruments have a built-in latency. If you hit a drum, it will produce sound pretty much instantly, but if you blow into a tuba, it takes a while for the vibration to travel through its tubing before a note comes out. So throughout music history there has always been a concept of latency, which has broadly been assimilated and compensated for by the complex relationship between conductors, performers and their instruments.
When we talk in the context of electronic music, and music involving digital equipment, the main source of latency that people refer to is the processing delay of a computer or electronic instrument. When you play a note on a synthesizer or electronic drum kit, how long does it take the sound to come out of the speakers? Usually, the electronic device has to recognize that you are playing a note, send that note data down some cables, receive that information, process that information and play or generate some kind of sound, send that sound to some kind of digital-to-analogue converter, and then to some speakers. Finally, that sound has to get from the speakers to your ears.
All these stages of processing take some time, and that time can add up to a number that's much larger than the amounts of latency that are encountered with traditional instruments.
Where does latency come from in a digital audio system?
There are many stages of a digital musical instrument that introduce latency. Let's take a look at where all these delays are coming from.
Computer latency (aka audio buffer)
Audio drivers used for audio interfaces have a buffer size setting associated with them. This is the number of individual samples that are collected by the software before they can be processed. This is because often it's more efficient for software to process audio in larger chunks of samples rather than one sample at a time. (Samples are the micro measurements of digital audio, determined by the sample rate of your analogue-to-digital converter. For example, 48kHz audio captures 48,000 individual samples per second.)
The audio buffer is measured in samples, and usually ranges from something like 32 to 1024 samples with several increments in between. On my own system, I feel that running at 32 samples is possibly asking for trouble with DSP-heavy plugins, so I tend to stick to a minimum of 64 samples. 64 samples is 1.3 ms at 48kHz. In reality, the output latency of a system is this number plus another buffer that is set by the manufacturer, and so usually an application will list an output latency of around 2 ms for a 64-sample buffer. Different interfaces will list a slightly different latency given the same sample buffer size.
If you wanted to play an instrument like a guitar through a computer, you would need to at least double this number since there is input and output latency. This is often referred to as ‘roundtrip’ latency. But for live keys playing, it's only really the output latency we are interested in, as there is no audio going into the computer.
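The buffer arithmetic is just samples divided by sample rate. The sketch below shows the conversion for the common buffer sizes. The function names are hypothetical, and the figures are theoretical minimums: they leave out the extra manufacturer buffer mentioned above, so a real interface will report a bit more.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int = 48_000) -> float:
    """Duration of one audio buffer in milliseconds."""
    return buffer_samples / sample_rate_hz * 1000.0


def roundtrip_lower_bound_ms(buffer_samples: int, sample_rate_hz: int = 48_000) -> float:
    """Input buffer plus output buffer: the 'at least double' case for live audio input."""
    return 2 * buffer_latency_ms(buffer_samples, sample_rate_hz)


for size in (32, 64, 128, 256, 512, 1024):
    out_ms = buffer_latency_ms(size)
    trip_ms = roundtrip_lower_bound_ms(size)
    print(f"{size:>4} samples: {out_ms:5.2f} ms output, at least {trip_ms:5.2f} ms roundtrip")
```

Passing a different `sample_rate_hz` (for example 44_100) shows how the same buffer size lasts slightly longer at a lower sample rate.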
So, we have this number of 1.3 ms output latency for a 64-sample buffer. This sounds great — but maybe too great. So I measured it.
How long does it take for a sound to come out of a speaker once you have played a key?
To work this out, I set up two microphones: one to record the sound of me physically hitting a key, and another right up next to a speaker to measure the sound coming out. The gap between these two recordings (assuming the microphones are as close as possible to the key and the speaker respectively) would give me the overall latency. And... It was kind of disappointing!
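If you want to repeat this kind of two-microphone test, the gap can be read off by finding where each recording first gets loud. This is only a sketch of one way to do it, not necessarily how the measurements here were taken: the threshold value, the function names and the synthetic "recordings" are all assumptions for illustration.

```python
def first_onset(samples, threshold):
    """Index of the first sample whose absolute value reaches threshold, or None."""
    for index, value in enumerate(samples):
        if abs(value) >= threshold:
            return index
    return None


def gap_ms(key_mic, speaker_mic, sample_rate_hz, threshold=0.1):
    """Delay between the onset in the key recording and the onset in the speaker recording."""
    key_onset = first_onset(key_mic, threshold)
    speaker_onset = first_onset(speaker_mic, threshold)
    if key_onset is None or speaker_onset is None:
        raise ValueError("no onset found in one of the recordings")
    return (speaker_onset - key_onset) / sample_rate_hz * 1000.0


# Synthetic stand-ins for two time-aligned recordings at 48 kHz:
# the key click starts at sample 1000, the speaker sound 432 samples later.
sample_rate = 48_000
key_mic = [0.0] * 1000 + [0.8] * 100
speaker_mic = [0.0] * 1432 + [0.5] * 100

print(f"measured gap: {gap_ms(key_mic, speaker_mic, sample_rate):.1f} ms")
```

Both recordings must share the same clock (for example, two inputs of one interface recorded together), otherwise the sample offsets are not comparable. A fixed threshold is crude, so real recordings with background noise would need a more careful onset detector.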
Using StageBox, I measured around 9ms for a 64-sample buffer, 13ms for a 128-sample buffer, 14ms for a 256-sample buffer and 25ms for a 512-sample buffer. All these numbers are way higher than the latency the software reports at these buffer settings, so I wanted to dig deeper into what is going on here.
Turns out there are a couple of other factors that might be causing more delays:
MIDI keyboard velocity sensing
In order to work out how hard you hit a key, a MIDI keyboard needs to measure the time difference between the key hitting two different contacts. This takes a little processing, but should be in the region of 0.5 ms to 1 ms for a good modern MIDI controller.
USB to MIDI interface latency
If you are using USB MIDI for connecting the keyboard to the computer, there is another approximately 1ms of delay caused by the USB bus.
Audio interface converter latency
In order for an audio interface to convert a digital signal from inside a computer into an analogue audio signal that can play through speakers, another delay is encountered. This delay varies from one interface to another depending on exactly the technology they are using for the conversion, but it is usually in the 0.5ms region. Newer converters tend to have smaller conversion delays than older ones.
So adding up the 1.3 ms buffer delay, the 0.5 ms MIDI keyboard processing delay, the 1ms MIDI interface delay and the 0.5ms converter delay gives around 3.3ms. That is still way less than the 9ms I recorded.
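The running total is easier to keep track of as a small table. This sketch simply adds up the per-stage estimates quoted in the text; the stage labels are my own shorthand, and the lower 0.5 ms figure is used for velocity sensing.

```python
# Per-stage estimates in milliseconds, taken from the figures quoted in the text.
STAGES_MS = {
    "audio buffer (64 samples at 48 kHz)": 1.3,
    "MIDI keyboard velocity sensing": 0.5,
    "USB MIDI": 1.0,
    "digital-to-analogue conversion": 0.5,
}

total_ms = sum(STAGES_MS.values())

for stage, delay_ms in STAGES_MS.items():
    print(f"{stage:<38} {delay_ms:4.1f} ms")
print(f"{'total':<38} {total_ms:4.1f} ms")
```

Swapping in the numbers for a different buffer size or interface only means changing the dictionary values.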
Mechanical properties
Then I started to think about how long it actually takes to press a key down. I don't have any super accurate way to measure this, so the best I could come up with was to use an iPhone and record some key presses in Slo-Mo. Slo-Mo videos are 240 frames per second, so each frame lasts around 4ms.
Looking at these Slo-Mo videos, I could see that the key takes somewhere between 1 and 2 frames to get from where my finger touches it (the onset of the microphone waveform) to around halfway down. That halfway point is where the lower key contact is, and so the point at which the keyboard works out how hard you just hit the key and can send out a MIDI note to the software. So let's call it 1.5 frames, which is about 6ms.
Now, if we add this 6ms to the 3.3 ms of the input, processing and output latencies, we get 9.3 ms, which is pretty much what I recorded!
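Because the Slo-Mo estimate is only good to within a frame, it is worth seeing the range it implies rather than a single number. The sketch below assumes exactly 240 frames per second and reuses the 3.3 ms electronic total and the roughly 9 ms measurement from the text; the names are illustrative.

```python
SLO_MO_FPS = 240              # frame rate of the Slo-Mo video, as stated in the text
ELECTRONIC_LATENCY_MS = 3.3   # buffer + keyboard + USB MIDI + converter
MEASURED_MS = 9.0             # key press to sound, 64-sample buffer


def frames_to_ms(frames: float, fps: int = SLO_MO_FPS) -> float:
    """Duration of a number of video frames in milliseconds."""
    return frames / fps * 1000.0


for frames in (1, 1.5, 2):
    key_travel_ms = frames_to_ms(frames)
    total_ms = key_travel_ms + ELECTRONIC_LATENCY_MS
    print(f"{frames} frames: key travel {key_travel_ms:.2f} ms, total {total_ms:.2f} ms")

low_total = frames_to_ms(1) + ELECTRONIC_LATENCY_MS
high_total = frames_to_ms(2) + ELECTRONIC_LATENCY_MS
print(f"measured {MEASURED_MS} ms falls inside the estimate: {low_total < MEASURED_MS < high_total}")
```

Without rounding, 1.5 frames is 6.25 ms rather than 6 ms, so the estimate moves by a few tenths of a millisecond. The measured figure still sits comfortably between the 1-frame and 2-frame cases.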
And just to be sure, I did some loopback recordings with the same system to verify that I wasn't encountering any other weird delays. The majority of my recorded delay really was in the key travel of the MIDI keyboard.
So in reality, even with a buffer size of 64 samples, my total recorded key press to sound out was around 9ms.
If you can run your system with the buffer set at 32 samples, you can probably make it respond faster than a hardware Nord…
Hardware keyboards have latency too!
If you’ve followed this far, you have read about the latency encountered in a laptop-based system running with a 64-sample audio buffer. It’s about 9ms. Just for fun I did the same tests with some other all-hardware synths to see what happened. Here's what I found. The Nord Wave 2 was 8ms, the Roland Juno 60 was 20ms, and my Hammond M3 was 4ms.
I think the Hammond measures as being fastest because the key contacts are really close to the top of the key travel, so you only need to make the key move a tiny bit downward for it to start making a sound. Other key mechanisms register the switch closure towards the bottom of the travel to accurately measure velocity. I was honestly surprised by the Juno 60, and the Nord Wave 2 is only 1ms faster than my laptop rig (with a 64-sample buffer setting), which is indiscernible.
So in summary, the computer-based system is definitely in the same realm of overall latency as the hardware synths when it’s set to a 64-sample buffer size. Differences start to occur when raising the buffer size, which is something you might need to do to optimise performance when you have a lot of plugins running. When setting the audio buffer to 128 samples, the latency was 13ms versus 8ms for the Nord, but even at 128 samples it felt very fast in practice and perfectly playable (and still less overall latency than the classic Juno 60!). At 256 samples StageBox still feels really good to me. At 512 I can start to tell something is up! All these numbers are lower than the equivalent latencies for each buffer size in MainStage. In MainStage, a buffer of 256 feels off to me (128 is fine), so the two bits of software must be doing something slightly different as the buffer size increases.
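To put all the measurements side by side, here is a small sketch that compares each laptop buffer setting against the hardware instruments. The numbers are the ones reported in this article; the dictionary and function names are just for illustration.

```python
# Measured key-press-to-sound latencies in milliseconds, as reported in the text.
LAPTOP_MS = {64: 9, 128: 13, 256: 14, 512: 25}   # keyed by buffer size in samples
HARDWARE_MS = {"Nord Wave 2": 8, "Roland Juno 60": 20, "Hammond M3": 4}


def slower_hardware(laptop_ms):
    """Names of the hardware instruments that measured slower than laptop_ms."""
    return [name for name, ms in HARDWARE_MS.items() if ms > laptop_ms]


for buffer_size, laptop_ms in LAPTOP_MS.items():
    slower = ", ".join(slower_hardware(laptop_ms)) or "none"
    behind_nord_ms = laptop_ms - HARDWARE_MS["Nord Wave 2"]
    print(f"{buffer_size:>3} samples: {laptop_ms:>2} ms "
          f"({behind_nord_ms} ms behind the Nord; slower hardware: {slower})")
```

These are single measurements on one setup, so treat the comparison as a rough guide rather than a benchmark.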
Here’s a YouTube video where I look into some of these things: