Music Synthesis

To make any sense of the following we need to get semi-technical and yes, I'm going to have to force some things into nice neat categories that they don't really fit in. Let's start by forcing everything into two major categories.

  1. How is the sound generated?
  2. How is the sound modulated?

1. Sound Generation

What is the physical device that is creating the sound?
A good way to categorize this is:

  • Analog
  • Digital
  • Virtual

In the 60's and 70's synthesizers were entirely analog. There was literally a physical electric current that was modulated and this current was amplified into a sound. They did use transistors to adjust the current but at its base, it was an electric signal from start to finish.

In the 80's came the digital synthesizers. The technology to do this existed a decade earlier but it took a while for the cost to come down to a level where it was commercially viable. Instead of an electric current, a computer converted the sound wave to a set of numbers which approximated the sound wave. The numbers were then manipulated and only at the last step were they transformed back into sound. This had both positive and negative results.
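
To make "a set of numbers which approximated the sound wave" concrete, here is a minimal sketch in Python. The function name, the 8 kHz sample rate, and the 8-bit depth are illustrative assumptions, not the specification of any particular synthesizer.

```python
import math

def sample_sine(freq_hz, sample_rate, n_samples, bits=8):
    """Sample a sine wave and round each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1  # 127 for the assumed 8-bit depth
    samples = []
    for n in range(n_samples):
        t = n / sample_rate  # time of the n-th sample, in seconds
        value = math.sin(2 * math.pi * freq_hz * t)
        samples.append(round(value * max_level))
    return samples

# The "sound" is now just a list of integers that can be manipulated.
wave = sample_sine(freq_hz=440, sample_rate=8000, n_samples=8)
quieter = [s // 2 for s in wave]  # e.g. roughly halve the volume
print(wave)
print(quieter)
```

Everything a digital synth does to the sound happens on lists like these; only the final step turns the numbers back into a voltage.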

On the positive side, consistency and precision.

Consistency. A true analog signal can vary with the slightest change in voltage. Think of a trumpet in an orchestra whose pitch changes as the trumpet warms or cools and where every instrument in the orchestra has to tune before every single performance. A digital instrument does not have this issue.

Less obvious is the increase in precision. Many of the ways in which the signal is adjusted can change a great deal with only the very smallest modification. Take the situation where you send the signal through filter A and then send its output to filter B. If both A and B multiply the signal, then even the smallest change in A produces a large change in the final result. Such fine changes can be difficult to make on a true analog system, where you literally twist a knob to make the change.
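
A small worked example of that multiplication effect, as a sketch only. The gain values are made-up numbers, and real filters do more than multiply, but the arithmetic shows why a tiny nudge to A matters.

```python
def chain(signal, gain_a, gain_b):
    """Send a signal through stage A, then send A's output through stage B."""
    return signal * gain_a * gain_b

# Hypothetical settings: B multiplies by 50, and A is nudged from 2.00 to 2.05.
before = chain(1.0, 2.00, 50.0)
after = chain(1.0, 2.05, 50.0)

# The 0.05 nudge on A is multiplied by B's gain on the way out.
print(after - before)
```

Typing 2.05 into a digital synth is trivial and repeatable; landing on the same setting twice with a physical knob is not.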

On the negative side, fidelity and richness.

Fidelity. A purely analog signal can go higher and lower than what a human can hear. A digital synth is limited by the hardware and the hardware of the 80's was such that the highs were not as high and the lows were not as low.

Richness. When talking about analog and digital synthesizers you always hear how the sound of an analog synth is fat or rich. It takes a lot of processing power to calculate every possible harmonic and secondary frequency change and this was definitely not possible in the 80's.

The third category, virtual, may seem like a meaningless variation: yes, if it's digital you are already making a virtual representation of the sound and then processing it. What I am looking to differentiate are specialized hardware synthesizers from what are now commonly called VSTs (Virtual Studio Technology plug-ins) or "virtual synthesizers", which are software programs with no dedicated hardware.

The first of these took advantage of the faster processing of modern computers to remove the need for the specialized digital signal processing chips that were required for speed in the 80's. The next round moved the technology a step further by developing complicated programs that could closely mimic the sounds of analog synths.

2. Sound Modulation

Once you have the sound created, you modulate and change it. Again we will simplify, this time into four basic types.

  • Subtractive
  • Additive/Frequency Modulation (FM)
  • Wavetable/Sampler/Rompler
  • Looping/PCM

Subtractive
Make a sound, modulate it with filters, and remove the parts you don't want.
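
As a rough sketch of the subtractive idea, the Python below starts with a bright sawtooth wave and runs it through a simple low-pass filter to remove some of the high end. The function names, the one-pole filter design, and the 220 Hz / 500 Hz values are my own illustrative choices; real synths use more elaborate oscillators and filters.

```python
import math

def sawtooth(freq_hz, sample_rate, n_samples):
    """Naive sawtooth in [-1, 1): a bright sound, rich in harmonics."""
    return [2.0 * ((freq_hz * n / sample_rate) % 1.0) - 1.0
            for n in range(n_samples)]

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Simple one-pole low-pass filter: attenuates content above the cutoff."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)  # move a fraction of the way toward the input
        out.append(y)
    return out

def max_jump(samples):
    """Largest sample-to-sample change, a crude measure of sharp edges."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))

raw = sawtooth(220, 44100, 1000)
filtered = one_pole_lowpass(raw, cutoff_hz=500, sample_rate=44100)
print(max_jump(raw), max_jump(filtered))
```

The filtered wave has much softer edges than the raw sawtooth, which is the "removing the parts you don't want" step.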

Additive/Frequency Modulation (FM)
Additive: Take very simple sounds and combine them into more complicated sounds.
Frequency Modulation: Take a simple sound and use a second simple sound to modulate it into something more complex.
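
Both ideas can be sketched in a few lines of Python. The function names and parameters are hypothetical, and the FM function is written in the phase-modulation form, which is an assumption about how one might implement it rather than a description of any specific instrument.

```python
import math

def additive(freq_hz, amplitudes, t):
    """Additive: sum sine harmonics; amplitudes[k] scales harmonic k + 1."""
    return sum(a * math.sin(2 * math.pi * freq_hz * (k + 1) * t)
               for k, a in enumerate(amplitudes))

def fm(carrier_hz, modulator_hz, index, t):
    """FM: a modulator sine wobbles the phase of a carrier sine.

    index controls how strongly the modulator bends the carrier;
    index = 0 leaves a plain sine wave.
    """
    return math.sin(2 * math.pi * carrier_hz * t
                    + index * math.sin(2 * math.pi * modulator_hz * t))

t = 0.001  # one millisecond into the note
print(additive(440, [1.0, 0.5, 0.25], t))  # three simple sounds combined
print(fm(440, 220, 2.0, t))                # one simple sound bent by another
```

Additive builds complexity by stacking many sines; FM gets complexity from just two by letting one reshape the other.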

Wavetable/Sampler/Rompler
Record a sound and play it back. Simple and straightforward. This idea suffered from the limits of early technology, in that the cost of memory made it impossible to store every single possible sound. Early samplers would record a couple of sounds and then interpolate between them, leading to a less than ideal result.
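
One common way to stretch a single recording across several notes is to read through it faster or slower, filling in the in-between values by linear interpolation. The sketch below assumes that approach; the function name and the tiny five-value "recording" are purely illustrative.

```python
def play_at_pitch(recording, ratio):
    """Resample a recording; ratio 2.0 plays one octave up (and half as long)."""
    out = []
    pos = 0.0
    while pos < len(recording) - 1:
        i = int(pos)
        frac = pos - i
        # Blend the two neighbouring recorded values.
        out.append(recording[i] * (1.0 - frac) + recording[i + 1] * frac)
        pos += ratio
    return out

recording = [0.0, 1.0, 2.0, 3.0, 4.0]  # stand-in for real recorded samples
octave_up = play_at_pitch(recording, 2.0)
octave_down = play_at_pitch(recording, 0.5)
print(octave_up)
print(octave_down)
```

The further a note is from the original recording, the more of the output is guessed rather than recorded, which is one reason the results were less than ideal.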

Looping/PCM
Record only part of a sound and play it back. Then use either subtractive or additive synthesis to fill in the end of the sound.

This sounds needlessly complex, but in the 80's, when this was developed, memory was extremely expensive. It was not economically feasible to record the entire sound, and so a shortcut was developed. It was determined that only the first half or quarter second of a sound, like the strike of a piano key, was unique. After the strike of the key, the next few seconds of sustain were relatively similar and could be approximated with either subtractive or additive synthesis. This increased the realism of the sound compared with pure synthesis and reduced the amount of memory required to store it. As computer memory increased, this type of sound generation slowly disappeared.
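
A minimal sketch of the shortcut, assuming the sustain is approximated by a single decaying sine wave. The function name, the decay value, and the short stand-in "attack" list are all hypothetical; a real instrument would use a proper recording and a richer synthesized tail.

```python
import math

def hybrid_note(attack, freq_hz, sample_rate, sustain_samples, decay=0.9995):
    """Play a recorded attack, then a synthesized, decaying sine as the sustain."""
    out = list(attack)  # the only part that has to be stored in memory
    level = 1.0
    for n in range(sustain_samples):
        out.append(level * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        level *= decay  # let the sustain fade gradually
    return out

attack = [0.0, 0.8, -0.6, 0.9, -0.7]  # stand-in for a recorded key strike
note = hybrid_note(attack, freq_hz=440, sample_rate=44100, sustain_samples=2000)
print(len(attack), len(note))
```

Only the attack is stored; the rest of the note is computed on the fly, which is where the memory saving comes from.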