Reading Spectrograms
Most of the work of reading live spectrograms is done by listening to the audio and knowing what sequence of phonemes should appear. However, it is important to be able to recognize stressed and drawn-out vowels, which serve as landmarks.
We will traverse the vowels in order of their constriction location, starting with the palatal vowels and ending with the lower pharyngeal vowels.
The palatal and velar vowels have formants that are far apart (i.e. they are far away from the Y=X line on the formant plot).
Every voice has different formants due to differences in vocal tract length, so we must judge formant height relative to the i (Fleece) vowel, which is the most extreme vowel, with the lowest first formant and the highest second formant. The ɪ (Kit) vowel is the lax version of the i vowel. Lax vowels have formants that are more centralized in the formant plot, so ɪ will have a slightly higher first formant and a slightly lower second formant.
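One way to make this relative judgment concrete is to express each vowel's formants as ratios to the same speaker's i formants. The following is a minimal sketch only: the function name `relative_to_fleece` and every frequency value are hypothetical placeholders, not measurements.

```python
def relative_to_fleece(f1, f2, fleece_f1, fleece_f2):
    """Express a vowel's formants as ratios to the speaker's own i (Fleece) vowel.

    A ratio above 1.0 means the formant is higher than in i; below 1.0, lower.
    """
    return f1 / fleece_f1, f2 / fleece_f2


# Hypothetical formant values in Hz for one speaker (placeholders, not measurements).
fleece = (300.0, 2400.0)  # i: lowest F1, highest F2
kit = (400.0, 2100.0)     # ɪ: slightly higher F1, slightly lower F2

f1_ratio, f2_ratio = relative_to_fleece(*kit, *fleece)
print(f"F1 ratio: {f1_ratio:.2f}, F2 ratio: {f2_ratio:.2f}")
```

With these placeholder numbers the F1 ratio comes out above 1 and the F2 ratio below 1, which is the pattern described for a lax vowel relative to i.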
The ɛ (Dress) vowel is very similar to the ɪ (Kit) vowel, with a slightly higher first formant and a slightly lower second formant. It is the degree of jaw opening that distinguishes these two vowels, so cues other than formant height are important for telling them apart. The e (Air) vowel is distinguished from the ɛ vowel in that it is actually a diphthong ending in a schwa.
The u (Goose) vowel has a first formant similar to that of the i vowel but a lower second formant. The second formant varies significantly with the degree of lip rounding: more rounded lips lower the second formant. It is actually a diphthong, starting with spread lips close to an i and ending at u as the lips finish rounding. The ʊ (Foot) vowel is the lax version of u and again will look similar but more central (higher first and lower second formant).
Whereas the first and second formants of the palatal and velar vowels are far apart, the formants of the pharyngeal vowels are close together.
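The "far apart" versus "close together" contrast can be read off the formant plot as the distance of a vowel's (F1, F2) point from the Y=X line. The sketch below computes that perpendicular distance; the function name and the formant values are hypothetical placeholders chosen only to illustrate the geometry.

```python
import math


def distance_from_diagonal(f1, f2):
    """Perpendicular distance (in Hz) of the point (F1, F2) from the Y=X line."""
    return abs(f2 - f1) / math.sqrt(2)


# Hypothetical formant values in Hz (placeholders, not measurements).
vowels = {
    "i (Fleece)": (300.0, 2400.0),   # palatal: formants far apart
    "ɔ (Thought)": (500.0, 800.0),   # pharyngeal: formants close together
}

for name, (f1, f2) in vowels.items():
    print(name, round(distance_from_diagonal(f1, f2)))
```

A larger distance corresponds to formants that are further apart, so under these assumed values i sits much further from the diagonal than ɔ.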
The ɔ (Thought) vowel is the other extreme vowel. Whereas i had the highest second formant, ɔ has the lowest.
The ɑ (Bath) and ʌ (Strut) vowels are similar to the ɔ vowel but with slightly higher formants.
The æ (Trap) vowel’s first and second formants are the furthest apart of all the pharyngeal vowels, such that it is similar to the ɛ (Dress) vowel. We have come full circle around the formant chart.
The ɜ (Earth) vowel has a neutral vocal tract shape and resides in the center of the formant plot.
Diphthongs are vowels that change constriction location and thus formant frequencies.
Fricatives are distinguished by a period of noise.
Voiceless stop consonants can be recognized by a period of silence followed by a period of noise. Unfortunately, voiced and unaspirated stop consonants can be recognized only by their effect on surrounding sounds (i.e. on the formants of the following vowel).
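The silence-then-noise pattern can be sketched in code by labelling short frames of audio and looking for a silent frame followed by a noisy one. This is a rough illustration using frame energy and zero-crossing rate, a common heuristic rather than anything prescribed here; the function names, the thresholds `energy_floor` and `zcr_noise`, and the synthetic test signals are all assumptions made for the example.

```python
import math
import random


def classify_frame(samples, energy_floor=1e-4, zcr_noise=0.3):
    """Label one frame as 'silence', 'noise', or 'voiced'.

    energy_floor and zcr_noise are hypothetical thresholds chosen for the
    synthetic signals below; real recordings would need tuning.
    """
    energy = sum(s * s for s in samples) / len(samples)
    if energy < energy_floor:
        return "silence"
    # Count sign changes between neighbouring samples.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    zcr = crossings / (len(samples) - 1)
    # Noise changes sign far more often than a low-pitched periodic sound.
    return "noise" if zcr > zcr_noise else "voiced"


def find_stop_candidates(labels):
    """Return indices of silent frames that are immediately followed by noise."""
    return [
        i for i in range(len(labels) - 1)
        if labels[i] == "silence" and labels[i + 1] == "noise"
    ]


# Synthetic frames (not real speech): a 150 Hz tone, silence, and white noise.
rate, n = 16000, 400
rng = random.Random(0)
tone = [0.5 * math.sin(2 * math.pi * 150 * t / rate) for t in range(n)]
silence = [0.0] * n
noise = [rng.uniform(-0.5, 0.5) for _ in range(n)]

labels = [classify_frame(frame) for frame in (tone, silence, noise, tone)]
print(labels)
print(find_stop_candidates(labels))
```

A classifier this simple does not separate voiced fricatives, which combine noise with voicing, and it cannot find voiced or unaspirated stops, which have to be inferred from the neighbouring vowel formants.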
Nasals have harmonics like vowels but far fewer of them. The first and second harmonics are usually present. With voiced fricatives, only the first few formants are visible.
Approximants are extreme versions of diphthongs and are recognizable by a rapid change in their formants. w, l, and r have first and second formants that are very low, whereas j has a low first and a high second formant. As such, w, l, and r are most easily recognized as the onset/rhyme of a palatal or velar vowel, and j as that of a pharyngeal vowel.
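The "rapid change" that separates an approximant from an ordinary diphthong can be expressed as the steepest frame-to-frame slope of a formant track. The sketch below assumes a formant track has already been extracted by some other tool; the function name, the 10 ms frame step, and the track values are hypothetical placeholders, not measurements.

```python
def max_formant_slope(track_hz, frame_step_ms):
    """Largest absolute frame-to-frame formant change, in Hz per millisecond."""
    steps = [abs(b - a) for a, b in zip(track_hz, track_hz[1:])]
    return max(steps) / frame_step_ms


# Hypothetical F2 tracks sampled every 10 ms (placeholders, not measurements).
glide_f2 = [2200.0, 1900.0, 1600.0, 1300.0, 1200.0]      # rapid fall, as from j into a pharyngeal vowel
diphthong_f2 = [1200.0, 1250.0, 1300.0, 1350.0, 1400.0]  # slower drift

print(max_formant_slope(glide_f2, 10.0))
print(max_formant_slope(diphthong_f2, 10.0))
```

Under these assumed values the glide changes several times faster than the diphthong; where to draw the line between the two on real speech is left open here.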