Abstract
Objective
In 1974 Minoru Hirano proposed his theory of voice production that is now known as the cover-body theory. He described the thyroarytenoid (TA) and cricothyroid (CT) muscles as the major determinants of vocal fold shape and stiffness, and theorized four typical laryngeal configurations resulting from unique TA/CT activations, with implications for the resulting voice quality. In this study, we directly observed the vocal fold medial surface shape under Hirano’s unique TA/CT activation conditions to obtain a 3-Dimensional (3D) understanding of these laryngeal configurations during muscle activation.
Study Design
In vivo canine hemilarynx model.
Methods
Fleshpoints were marked along the medial surface of the vocal fold. Selective TA and CT activation were performed via respective laryngeal nerves. 3D reconstructions of the vocal fold medial surface were derived using digital image correlation.
Results
(1) Low level TA and CT activation yielded anteroposterior lengthening and vertical thinning of the vocal fold. (2) When TA activation was far greater than CT activation, the vocal fold shortened and thickened. (3) With slightly greater TA than CT activation, vocal fold length was maintained on average while vertical thickness decreased. (4) With CT far greater than TA activation, the vocal fold lengthened and thinned. In all conditions, glottal contour changes remained minimal.
Conclusions
Analysis of the 3D geometry of the vocal fold medial surface under Hirano’s four typical laryngeal configurations revealed that the key geometric changes during TA/CT interactions lie within the anteroposterior length and the vertical thickness of the vocal fold.
Keywords: Larynx, Voice, Canine, Cover-body, Vocal Fold, Vocal Register, Intrinsic Laryngeal Muscle, Hirano
INTRODUCTION
The human larynx, a single sound generator, is capable of producing a voice of tremendous variety in pitch, intensity, and quality. In the 1970s Minoru Hirano set out to understand how singers can produce such variety in voice using only a single set of vocal folds. In 1974, Hirano published the cover-body theory of vocal fold vibration. Here he described the unique morphological structure of the vocal folds and divided them biomechanically into the body and the cover layer. The main substance of the vocal fold is the thyroarytenoid (TA) muscle. The TA, innervated by the recurrent laryngeal nerve, has mechanical properties that vary with activation. Overlying the TA is the elastic conus, or vocal ligament. This fibrous vocal ligament interdigitates with vocalis muscle fibers allowing the TA and vocal ligament to function as a single vibratory unit (body) during phonation. Superficial to the body lies the vocal fold epithelium and superficial layer of the lamina propria loosely associated with the vocal ligament. The superficial lamina propria and epithelium move as a vibratory unit (cover) weakly coupled to the body. Therefore, this cover-body theory of vocal fold vibration dictates the vocal folds function at least as a double-structured vibrator.
The mechanical properties of the vocal fold are primarily determined by the intrinsic laryngeal muscles. Hirano recognized the TA muscle controlled body stiffness and TA interaction with the cricothyroid (CT) muscle controlled tension of the vocal fold cover layer. He then described four typical laryngeal adjustments achieving unique relationships between the body and cover for combinations of TA and CT activation. These four laryngeal adjustments traverse the gamut of voice production, characterized by different vocal registers and intensities, demonstrating how the vocal folds function as many different sound generators1,2.
Vocal registers are defined by a series of frequencies of similar quality produced through common physiologic means3. Vocal registers include glottal fry, modal, chest, head, falsetto, and whistle registers. Vocal registers may be referred to as light, such as head or falsetto registers, or heavy, such as chest register. Current theories suggest that register control is a primary laryngeal event, dependent upon laryngeal muscle activity, vocal fold adduction and glottal shape4. Hirano’s first laryngeal adjustment corresponds to low level TA and CT activation (Low TA Low CT) producing soft phonation. The body and cover are flexible and both involved in vibration. The second laryngeal adjustment corresponds to a much greater TA than CT (High TA Low CT). This results in a stiff body and slackened cover, producing loud heavy voice. The third adjustment corresponds to a slightly greater TA activation than CT (TA Slight > CT). This yields heavy or modal register and vocal fold deformation is evenly shared between body and cover. The fourth adjustment corresponds to very low TA activation with high level CT (CT ≫ TA). This results in passive stretching of the body and cover achieving light or falsetto register. In his 1974 publication Hirano provides a coronal pictorial representation of these four laryngeal adjustments recreated here, Figure 1A. He alludes to the importance of the vocal fold medial surface shape in his images, but this remains to be quantitatively described. In this report, we use an in vivo canine hemilarynx model to directly measure vocal fold adduction, thickness, and length. We provide precise and graded stimulation of the TA and CT muscles over 36 activation combinations to fully characterize changes in glottal shape as they relate to Hirano’s four laryngeal paradigms and vocal registers.
MATERIALS AND METHODS
This study was approved by the Institutional Animal Care and Use Committee. One mongrel canine was used. The laryngotracheal framework was exposed in the neck as previously described5–8. Tracheostomy was performed followed by an infrahyoid pharyngotomy and pharyngeal division. A right hemilaryngectomy was performed exposing the left vocal fold. India ink was used to mark fleshpoints spaced 1.3 mm apart (fleshpoint diameter = 130 to 220 microns) in a grid-like fashion along the vocal fold medial surface. The hypotenuse of a glass right-angled prism abutted the anatomic glottal midline. The prism allowed two distinct views of the vocal fold medial surface captured by a high-speed digital camera.
Mapping functions for 3D analysis were calculated by calibrating the camera (384 × 672 pixel resolution; 0.04 mm/pixel) to a standardized calibration plate as previously described7–9. These mapping functions helped create 3D contour plots of the vocal fold medial surface from which adduction, thickness and length were measured.
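As a rough sanity check on the imaging scale, the quoted resolution and the fleshpoint dimensions can be converted between pixels and millimeters. The sketch below is illustrative only. It assumes the stated 0.04 mm/pixel applies uniformly across the image, which the calibration-plate mapping functions do not require, and all constant and function names are hypothetical.

```python
# Values quoted in the text; names are hypothetical and for illustration only.
MM_PER_PIXEL = 0.04           # stated camera scale
IMAGE_SIZE_PX = (384, 672)    # stated pixel resolution


def px_to_mm(pixels: float) -> float:
    """Convert a pixel distance to millimeters, assuming a uniform scale."""
    return pixels * MM_PER_PIXEL


def mm_to_px(mm: float) -> float:
    """Convert a millimeter distance to pixels, assuming a uniform scale."""
    return mm / MM_PER_PIXEL


# Approximate field of view implied by the resolution and scale.
field_of_view_mm = tuple(round(px_to_mm(n), 2) for n in IMAGE_SIZE_PX)

# Fleshpoint spacing (1.3 mm) and diameter (130 to 220 microns) in pixels.
spacing_px = mm_to_px(1.3)
diameter_px = (mm_to_px(0.130), mm_to_px(0.220))

print(field_of_view_mm, spacing_px, diameter_px)
```

Under this assumption each fleshpoint spans only a few pixels while neighboring fleshpoints sit tens of pixels apart, which is consistent with tracking them as distinct markers.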
The recurrent laryngeal nerve (RLN) branch to the TA and the external branch of the superior laryngeal nerve (SLN) to the CT were isolated, ligated and fashioned with a cuff electrode for stimulation as previously described7,8. Graded stimulation of the TA and CT was performed over 8 levels, from 0 (no activation) to 7 (maximal activation). Both TA and CT saturated their glottal shape deformation at activation level 5. As such, we studied all combinations of TA and CT from activation level 0–5 (36 combinations). Vocal fold deformation was captured at 3,000 frames per second with a high-speed digital camera (Phantom v210, Vision Research Inc., Wayne, NJ).
The image-processing program DaVis (LaVision Inc., Version 7.2, Goettingen, Germany) was used for time series cross-correlation analysis for 3D deformation calculations of the medial surface for the 36 TA and CT combinations7,8. From surface height measurements vocal fold adduction, thickness, and length were extracted. We grouped TA and CT activation combinations with each of Hirano’s laryngeal adjustments. Condition 1, Low TA Low CT, includes TA and CT activation levels 0–2 (n = 8). Condition 2, High TA Low CT, includes TA activation levels 3–5 and CT levels 0–2 (n = 8). Condition 3, TA Slight > CT, includes TA and CT levels 3–5 (n = 6). Lastly, condition 4, CT ≫ TA, includes TA levels 0–2 and CT levels 3–5 (n = 7).
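The activation grid and the level ranges used for grouping can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: it enumerates the 36 TA/CT level pairs and assigns each to a condition purely by the level ranges stated above. The names `condition`, `LOW`, and `HIGH` are hypothetical.

```python
from collections import Counter
from itertools import product

LEVELS = range(6)   # activation levels 0-5 retained for analysis
LOW = range(0, 3)   # levels 0-2
HIGH = range(3, 6)  # levels 3-5


def condition(ta: int, ct: int) -> int:
    """Return the condition number (1-4) for a TA/CT level pair.

    Assignment is by level range only; any further selection within a
    range is not modeled here.
    """
    if ta not in LEVELS or ct not in LEVELS:
        raise ValueError("activation levels must be integers 0-5")
    if ta in LOW and ct in LOW:
        return 1  # Low TA Low CT
    if ta in HIGH and ct in LOW:
        return 2  # High TA Low CT
    if ta in HIGH and ct in HIGH:
        return 3  # TA and CT both at levels 3-5
    return 4      # TA levels 0-2, CT levels 3-5


# All (TA, CT) pairs and the number of pairs falling in each level block.
grid = list(product(LEVELS, LEVELS))
block_sizes = Counter(condition(ta, ct) for ta, ct in grid)

print(len(grid), dict(block_sizes))
```

Each level block contains nine pairs, whereas the reported group sizes are 8, 8, 6, and 7. The groups therefore reflect some further selection within each block, which this sketch does not attempt to reproduce.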
RESULTS
In Hirano’s 1974 publication he drew 4 laryngeal configurations that he felt pictorially represented the vocal fold as a double-structured vibrator capable of different vocal registers and pitch. Figure 1A re-creates these coronal sections. Condition 1, Low TA Low CT, represents a flexible body and cover. Condition 2, High TA Low CT, represents a firm, stiff body with a slackened cover. Condition 3, TA Slight > CT, the body and cover contribute to vocal fold vibration producing heavy or modal register. Condition 4, CT ≫ TA, represents a maximally stretched body and cover.
In Titze’s Principles of Voice Production, he described the concept of muscle activation plots10, namely plots of the percent maximum CT activity versus percent maximum TA activity. He divided these plots into four quadrants which spanned the gamut of vocal registers (e.g. pressed, chest, speech/modal, and falsetto) much as Hirano’s 4 laryngeal configurations did. Figure 1B provides a muscle activation plot showing the array of TA and CT combinations we evaluated in this report. Here we looked at TA and CT activation from level 0, inactive, to level 5, the level at which glottal shape deformation saturated. Based on Hirano’s 4 configurations we then grouped different TA and CT combinations into one of these 4 conditions.
Using our hemilarynx model we set out to quantitatively recreate the glottal coronal sections for Hirano’s 4 conditions. Figure 2 demonstrates the coronal sections through 2 points along the anteroposterior (A-P) axis of the vocal fold: the mid membranous vocal fold (Mid Fold) and a point midway between the mid fold and the anterior commissure (Ant Fold). Blue lines represent resting coronal posture while red lines depict final posture 117 ms following stimulation. We chose this end-time based on our prior work7,8. Here we see a slight favoring of a convergent glottis in condition 1, near complete vocal fold adduction in conditions 2 and 3, and slight abduction in condition 4. Overall, despite the unique coronal sections Hirano drew for each of his 4 conditions, we fail to appreciate much change amongst conditions beyond simple adduction and abduction. Importantly, conditions 2 and 3 must be interpreted with caution as adduction ultimately leads to the vocal fold medial surface contacting the glass prism in the midline which will directly influence the coronal shape. However, the glass prism will function much like an endogenous contralateral vocal fold and ultimately this suggests that with robust vocal fold adduction the glottal channel assumes a rectangular configuration.
Our model also allows for 3D reconstruction of the entire medial surface contour. In Figure 3, we show the baseline medial surface contour plot (first column) and the final posture medial surface contour plot 117 ms after stimulation (second column) for these 4 conditions. There is a subtle difference in the baseline contour for each condition, the absolute value of which is insignificant. These differences are a product of the software algorithm which must estimate the baseline contour plot each time a new condition is presented. For Low TA Low CT we see lengthening of the vocal fold, maintained thickness, and minimal in-plane motion. For High TA Low CT the vocal fold shortens, thickens, and adducts. For TA Slight > CT the vocal fold length is maintained or increased, and the vocal fold thins and adducts. For CT ≫ TA the vocal fold lengthens, thins, and abducts.
In Figure 4 we average the change in vocal fold length, anterior and mid-fold thickness, and vocal fold adduction for all combinations of TA and CT within Hirano’s 4 paradigms. For Low TA Low CT the vocal fold lengthens by 6%, thins by 2.7% and 1.9%, while adduction is relatively unaffected. For High TA Low CT the vocal fold shortens by 2.5%, thickens by 6.6% and 1.1%, and adducts by 60.5%. For TA Slight > CT the vocal fold length is unchanged, the vertical height thins by 8.0% and 8.3% and adducts by 77.2%. Lastly, for CT ≫ TA the vocal fold lengthens by 8.4%, thins by 10.5% and 8.4% and abducts by 10.6%. Uniquely for condition 2 and 3 (High TA Low CT and TA Slight > CT), the balance between TA and CT activation leaves a relatively unchanged vocal fold length while adducting and thickening the fold in an isometric fashion. Overall the glottal shape is uniquely altered for each of these 4 laryngeal conditions with condition 1 lengthening, condition 2 thickening, condition 3 thinning, and condition 4 lengthening and thinning.
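The averaging described above can be illustrated with a short sketch. The text does not give the exact formula, so two assumptions are made here: each measure is expressed as a percent change of the final posture relative to rest, and the per-combination percent changes are then averaged within a condition. The function names are hypothetical and the example values are made up for illustration. They are not data from this study.

```python
from statistics import mean


def percent_change(rest: float, final: float) -> float:
    """Percent change from resting to final posture (assumed definition).

    Positive values mean lengthening or thickening; negative values mean
    shortening or thinning.
    """
    if rest == 0:
        raise ValueError("resting value must be nonzero")
    return 100.0 * (final - rest) / rest


def mean_percent_change(pairs) -> float:
    """Average the per-combination percent changes within one condition."""
    return mean(percent_change(rest, final) for rest, final in pairs)


# Made-up (rest, final) lengths in mm for three activation combinations.
example_lengths_mm = [(20.0, 21.0), (20.0, 22.0), (20.0, 21.5)]
example_mean = mean_percent_change(example_lengths_mm)

print(example_mean)
```

Averaging percent changes rather than raw measurements keeps each combination's contribution relative to its own baseline, which matters if the estimated baseline differs slightly from one condition to the next.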
DISCUSSION
Hirano’s G. Paul Moore Lecture in 1988 summarized two decades of his research2. The concept of the vocal fold with a cover and body layer having distinct loosely coupled mechanical properties was updated to include three distinct layers: the cover, comprising the epithelium and superficial layer of the lamina propria; a transition zone, comprising the vocal ligament; and the body, consisting of the vocalis muscle. Each layer has distinct mechanical properties, with pliability increasing toward the surface.
Van den Berg was the first to discuss glottal shape as it relates to vocal registers11. In 1968 he showed coronal glottal schemes based on x-ray tomograms in a male larynx producing chest voice and a female larynx in falsetto. This alluded to vocal fold thickness as a key difference between chest and falsetto voice11. More recent work by Zhang using a 3D continuum model of phonation supported this finding12. Furthermore, chest voice is characterized by active longitudinal tensions in the vocalis muscle, whereas falsetto has thin vocal folds with passive longitudinal tensions in the vocal ligaments produced by the cricothyroid muscle.
Hirano proposed four laryngeal conditions to explain the spectrum of vocal registers as a function of the ratio of vocalis to cricothyroid muscle activation. Current theories of register control hold that it is dependent upon intrinsic laryngeal muscle activity. Studies to date have observed changes in glottal shape indirectly through x-ray tomograms and videokymography. They have also used electromyography to directly measure TA and CT muscle activity during various vocal exercises. In this report, we provide the first direct observation and quantitative analysis of pre-phonatory glottal shape under these 4 conditions using the canine hemilarynx model. In doing so, we see how the TA:CT ratio relates to vocal fold length, thickness, adduction, and medial surface contour.
The most developed analyses of the mechanisms of vocal registers have evaluated the activity of TA and CT muscles using laryngeal EMG. In the 1980s Hirano published his work on laryngeal EMG to understand the role of laryngeal muscles in vocal register control2. Decades later Kochis-Jennings performed similar EMG work to understand the role of TA and CT in register control and the idea of muscle dominance. At low pitch, vocalists maintain or increase TA activity as they transition to a heavier register. TA activity is lowest for falsetto, low for head register, and greatest for higher pitch in the heavier registers. CT activity was more variable between subjects2–4,13. These data corroborate Hirano’s and Titze’s grouping of CT/TA activation into four conditions and support our grouping of TA/CT activation levels for analysis.
Activity of the TA thickens the body of the vocal fold and slackens the cover. On videokymographic images of the vocal folds these changes are reflected by increased sharpness of the lateral peaks in vibration patterns. Sharper lateral peaks suggest greater vertical phase difference, more activity of the TA, and a thicker vocal fold. Such was seen by Herbst for chest register in contrast to falsetto, suggesting adjustments in vocal fold thickness to achieve unique registers14–16. Here we directly appreciate the variation in vocal fold thickness as a function of register. We see increased vocal fold thickness for High TA Low CT, Hirano’s second condition and Titze’s Pressed vocal register. For TA Slight > CT and CT ≫ TA, average vocal fold thickness decreases and chest and falsetto registers are achieved respectively.
Kochis-Jennings also details variations in vocal fold adduction across vocal registers. Most studies support that greater vocal process adduction occurs as singers move to heavier registers. Here we looked at adduction of the mid membranous vocal fold. Under conditions of TA and CT activation that correspond to lighter registers, Hirano’s conditions 1 and 4, and Titze’s speech/modal/falsetto register, vocal fold adduction is weak, absent, or opposite (abduction). Conversely, vocal fold adduction is strong for Hirano’s conditions 2 and 3, and Titze’s pressed/chest registers.
Lastly, we provide a quantitative look at vocal fold length as it relates to vocal register and Hirano’s 4 configurations. Vocal fold length is increased for light, speech and falsetto registers (Hirano’s conditions 1 and 4) while unchanged for pressed and chest registers (Hirano’s conditions 2 and 3). In vocal register control TA and CT are antagonistic to one another. In this way, a near-isometric activation of TA is possible in heavier registers. Such fine tuning may help explain how singers can maintain a vocal register while altering pitch and vice versa.
Although our focus has been TA and CT activity, surely the lateral cricoarytenoid (LCA) also plays a part. Hirano’s G. Paul Moore Lecture in 1988 demonstrates this clearly with LCA laryngeal EMG activity across different registers2. Here we chose to focus on TA and CT muscle activity to simplify activation combinations but plan to evaluate all 3 intrinsic laryngeal muscles in future work. It must also be noted that Hirano’s original depictions of these 4 laryngeal adjustments were described as coronal sections during vocal fold vibration. Here, we do not incorporate vocal fold vibration or phonation. We are interested only in pre-phonatory vocal fold shape as a target to better modify and direct laryngeal framework surgery. In future work, we aim to incorporate vocal fold vibration and the resulting acoustics. We also aim to translate the canine hemilarynx model to a human ex vivo hemilarynx model and, in doing so, better understand which aspects of laryngeal physiology are shared between humans and canines and which are unique.
The canine larynx is a well-accepted model of human laryngeal physiology. Histologically, anatomically and geometrically the canine larynx does exhibit features distinct from human larynges. In canines, there is no vocal ligament as the elastic conus ends within the superficial lamina propria without forming a true ligament. The canine lamina propria is also thicker contributing to its greater thickness while the collagen and elastin density are less concentrated17–19. Despite these differences, the function of the intrinsic laryngeal muscles is qualitatively the same. It has also been shown that similar glottographic waveforms can be achieved in canine and human larynges. Furthermore, the canine vocal folds generate similar vibration patterns to human vocal folds with mucosal waves and vertical phase differences while the stiffness in the canine and human cover layers is no different2,18,20. In theory, these interspecies differences could explain our inability to appreciate a significant change in medial surface coronal contour. However, if so, we would expect prior studies to have found more functional differences between canine and human larynges. Nevertheless, translation of these techniques to the human ex vivo larynx will obviate such concerns. It should also be highlighted that the findings from this study are from a single canine experiment. We have repeated these experiments in 2 other canines which showed similar results. However, the canine presented in this study was the one with the most complete data set across all stimulation conditions. As such, for simplicity and clarity we present the full data set from this single canine.
CONCLUSION
While the concept of vocal registers in part remains an enigma, Hirano’s early work made great strides in conceptualizing the relationship between unique glottal configurations and vocal registers. The importance of glottal posture or shape on vocal registers is well recognized but poorly described. Here we provide a direct view of the vocal fold medial surface during each of Hirano’s 4 paradigms. Low level TA and CT activation lengthens the vocal fold, an excess of TA to CT thickens and adducts the vocal fold, a slight excess of TA to CT thins and adducts the vocal fold, and a gross excess of CT to TA lengthens, thins, and abducts the vocal fold. Meanwhile, change in contour of coronal glottal sections is subtle.
Acknowledgments
This study was supported by grants R01DC011299 and R01DC011300 from the National Institutes of Health.
Footnotes
CONFLICT OF INTEREST: None
FINANCIAL DISCLOSURES: None
Presented as an oral presentation at the American Laryngological Association’s 2017 Spring Meeting at COSM in San Diego, California on April 26–28, 2017.
LEVEL OF EVIDENCE
N/A
Contributor Information
Andrew M. Vahabzadeh-Hagh, Email: AVahabzadehhagh@mednet.ucla.edu.
Zhaoyan Zhang, Email: zyzhang@ucla.edu.
Dinesh K. Chhetri, Email: DChhetri@mednet.ucla.edu.
References
- 1. Hirano M. Morphological structure of the vocal cord as a vibrator and its variations. Folia Phoniatr (Basel) 1974;26(2):89–94. doi: 10.1159/000263771.
- 2. Hirano M. Vocal Mechanisms in Singing: Laryngological and Phoniatric Aspects. Journal of Voice. 1988;2(1):51–69.
- 3. Kochis-Jennings KA, Finnegan EM, Hoffman HT, Jaiswal S. Laryngeal muscle activity and vocal fold adduction during chest, chestmix, headmix, and head registers in females. Journal of voice: official journal of the Voice Foundation. 2012;26(2):182–193. doi: 10.1016/j.jvoice.2010.11.002.
- 4. Kochis-Jennings KA, Finnegan EM, Hoffman HT, Jaiswal S, Hull D. Cricothyroid muscle and thyroarytenoid muscle dominance in vocal register control: preliminary results. Journal of voice: official journal of the Voice Foundation. 2014;28(5):652.e621–652.e629. doi: 10.1016/j.jvoice.2014.01.017.
- 5. Chhetri DK, Neubauer J, Bergeron JL, Sofer E, Peng KA, Jamal N. Effects of asymmetric superior laryngeal nerve stimulation on glottic posture, acoustics, vibration. The Laryngoscope. 2013;123(12):3110–3116. doi: 10.1002/lary.24209.
- 6. Chhetri DK, Neubauer J, Berry DA. Graded activation of the intrinsic laryngeal muscles for vocal fold posturing. J Acoust Soc Am. 2010;127(4):El127–133. doi: 10.1121/1.3310274.
- 7. Vahabzadeh-Hagh AM, Zhang Z, Chhetri DK. Three-dimensional posture changes of the vocal fold from paired intrinsic laryngeal muscles. The Laryngoscope. 2017;127(3):656–664. doi: 10.1002/lary.26145.
- 8. Vahabzadeh-Hagh AM, Zhang Z, Chhetri DK. Quantitative Evaluation of the In Vivo Vocal Fold Medial Surface Shape. Journal of voice: official journal of the Voice Foundation. 2017;31(4):513.e515–513.e523. doi: 10.1016/j.jvoice.2016.12.004.
- 9. Zhang Z, Neubauer J, Berry DA. Aerodynamically and acoustically driven modes of vibration in a physical model of the vocal folds. J Acoust Soc Am. 2006;120(5 Pt 1):2841–2849. doi: 10.1121/1.2354025.
- 10. Titze IR. Principles of Voice Production. Iowa City, IA: National Center for Voice and Speech; 2000. pp. 281–303.
- 11. Van den Berg J. Register problems. Annals of the New York Academy of Sciences. 1968;155(1):129–134. doi: 10.1111/j.1749-6632.1968.tb56756.x.
- 12. Zhang Z. Cause-effect relationship between vocal fold physiology and voice production in a three-dimensional phonation model. J Acoust Soc Am. 2016;139(4):1493. doi: 10.1121/1.4944754.
- 13. Hirano M. The laryngeal muscles in singing. In: Kirchner J, Bless D, editors. Neurolaryngology. Boston, MA: College Hill Press; 1987. pp. 209–230.
- 14. Herbst CT, Ternstrom S, Svec JG. Investigation of four distinct glottal configurations in classical singing–a pilot study. J Acoust Soc Am. 2009;125(3):El104–109. doi: 10.1121/1.3057860.
- 15. Svec JG, Sundberg J, Hertegard S. Three registers in an untrained female singer analyzed by videokymography, strobolaryngoscopy and sound spectrography. J Acoust Soc Am. 2008;123(1):347–353. doi: 10.1121/1.2804939.
- 16. Svec JG, Sram F, Schutte HK. Videokymography in voice disorders: what to look for? The Annals of otology, rhinology, and laryngology. 2007;116(3):172–180. doi: 10.1177/000348940711600303.
- 17. Berke GS, Moore DM, Hantke DR, Hanson DG, Gerratt BR, Burstein F. Laryngeal modeling: theoretical, in vitro, in vivo. The Laryngoscope. 1987;97(7 Pt 1):871–881.
- 18. Garrett CG, Coleman JR, Reinisch L. Comparative histology and vibration of the vocal folds: implications for experimental studies in microlaryngeal surgery. The Laryngoscope. 2000;110(5 Pt 1):814–824. doi: 10.1097/00005537-200005000-00011.
- 19. Sanders I, Rai S, Han Y, Biller HF. Human vocalis contains distinct superior and inferior subcompartments: possible candidates for the two masses of vocal fold vibration. The Annals of otology, rhinology, and laryngology. 1998;107(10 Pt 1):826–833. doi: 10.1177/000348949810701003.
- 20. Chhetri DK, Rafizadeh S. Young’s modulus of canine vocal fold cover layers. Journal of voice: official journal of the Voice Foundation. 2014;28(4):406–410. doi: 10.1016/j.jvoice.2013.12.003.