Volume II - 1994

Beyond MTV: Music Videos as Foreign Language Text
by Thomas J. Garza

      Dr. Thomas J. Garza is assistant professor and coordinator in the Department of Slavic Languages at the University of Texas at Austin. He has lectured and conducted teacher development sessions in the U.S., Russia, Ukraine, Yugoslavia, Hungary, Armenia, and Belarus, demonstrating the use of authentic broadcast video in foreign language teaching.

      Rock music videos are rarely, if ever, referred to as "text," even in the broad definition of the term in foreign language pedagogy. Indeed, while this much-maligned medium may claim responsibility for setting trends in popular music, fashion, and culture, it does not often appear in the syllabus of print, audio and video materials used in the development of language skills. And yet, the well-selected music video clip may provide precisely the potent music/lyric/image combination necessary to unleash the imagination of the learner and promote proficiency in the foreign language.

      In the realm of representing the arts and humanities in language learning and teaching, poetry and song have received considerable attention. Maley (1987), for example, suggests and elucidates no fewer than ten qualities of poetry and song that make them appropriate devices in language learning: Memorability, Rhythmicality, Performance/Recitability, Ambiguity, Non-Triviality, Universality, Playfulness, Reactional Language, Motivation, and Interaction. Maley contends that these qualities support the retention of lyrics and verse once learned, even when communicative competence has diminished. Further, such qualities are particularly positive when they occur in the context of authentic language samples; that is, language created by and produced for native speakers of the same language. Songs fulfill this criterion of authenticity, both in linguistic and cultural dimensions.

Pragmatic Considerations of Video

      As a medium for presenting a foreign language teaching text, video offers language instructors and students a highly accessible and manipulable product. Linear videotape is easily obtained, inexpensive and readily adapted, modified and edited into a usable classroom tool. Even in its unmodified prerecorded form, the videotape format offers the instructor a variety of choices for presenting and manipulating filmed material in the course of instruction: even the most primitive videotape player allows the user to stop action, freeze frame, view in fast-forward or slow-motion, and add or remove the sound track in order to exploit the video material to its fullest advantage. The instructor or student can focus on specific points in the video, isolating paralinguistic information, such as gestures, proxemics, or other markers of body language. In addition, through the use of editing equipment, much more sophisticated enhancements of the video text can be achieved. Of the wide array of post-production editing techniques possible, the one that has the greatest implications for language teaching is the addition of open captions, or original language subtitles, which often assist the student in comprehending the language of the segment. Several major studies over the past decade, such as Price (1983) and Garza (1991), strongly suggest a positive correlation between the addition of target-language captions to video materials and increased comprehension. If comprehension does, indeed, precede production in a foreign language, then captioning may serve as a valuable aid in bringing our students to the level of proficiency needed to understand and more fully appreciate authentic television broadcasts, motion pictures, documentaries--or even music videos!

      As with any text for foreign language instruction, video materials must be carefully chosen to meet the needs and goals established for the course, especially in a proficiency- or performance-based curriculum. Since not all material that appears in the video format is necessarily appropriate for classroom use, certain criteria have been generally agreed upon as essential for selecting "good" video segments (Lonergan 1984, Altman 1989). First, useful video must contain the linguistic material (lexical, syntactic, phonetic, functional, etc.) desired for instruction. Second, the video segment should be thematically interesting and culturally relevant for the target audience. Third, the selected materials should be multi-layered; that is, they should be able to maintain student interest in the face of repeated close viewing. Fourth, in the ideal segment, the visual images are no less important than the accompanying spoken text, and the two depend on each other for complete comprehension of the text. Finally, there remains the issue of length of the video segments for classroom use. Here, too, various foreign language researchers agree that approximately four minutes of video provides the optimal amount of layered information for processing at one time (Altman 1989, Lavery 1984).

      In the proficiency-oriented foreign language classroom, many rock music videos fulfill virtually all of the above-mentioned criteria for use as text. The products of MTV (cable television's Music Television network, with twenty-four channels broadcasting music videos and related features) may well be considered the international format and standard for all music videos produced. Clips are three to four minutes in length, which is ideal for video-based instruction. Many are saturated with evocative images thematically linked to the lyrics, while others--the so-called "concept videos"--present a truncated narrative and a cast of characters, telling one of the many stories contained within the lyrics. Though the correlation of images to lyrics may not be very high, the effect is to create precisely the environment needed to encourage repeated viewings (i.e., repeated exposures to the language and cultural material) and autonomous interaction of the student/class with the video.

Textual Considerations

      While the lyrics of a typical video contain substantial lexical, grammatical, and functional material for classroom use, it is actually the visual text that overlays the lyrics with images and activates the learner's imagination. Once engaged, the imagination can provide unlimited contexts in which the student can manipulate the newly-acquired linguistic material. In this connection, the very malleable nature of video technology plays a most important role in making classroom treatment of these video segments interactive and directed towards student performance in the target language. The video can be stopped and reviewed to examine more closely individual moments and images encapsulated in the video -- often in very quick succession -- and allow the students, rather than the instructor, to discover much of the underlying visual text of the segment. To facilitate this more student-centered approach, it helps to leave the remote control in the hands of students, allowing them to dictate the selection of items and the order of their explication. Similarly, instead of the usual teacher-centered explanations of cultural elements, the students themselves are allowed to draw first on their own individual and collective prior knowledge about the world and the target culture to try to ascribe meaning and textual order to the video images.

      To illustrate this potential engagement of the student's imagination in the context of a music video, consider the recent video released to accompany Sting's award-winning song "If I Ever Lose My Faith In You." The lyrics are an excellent example of the poetic, yet still quite functional, repetition that is typical of many contemporary rock videos:

You could say I lost my faith in science and progress.
You could say I lost my belief in the holy church.
You could say I lost my sense of direction.
You could say all of this and worse, but
If I ever lose my faith in you,
There'd be nothing left for me to do.
(Sting, 1993)

      On the surface, the lyrics might provide the EFL instructor with excellent material to present and practice modal constructions in English. To ignore the video modality of this clip, however, would be to sacrifice much of the power of this medium as an effective text for teaching and learning the target culture. In the video clip, the lyrics are sung over extraordinary images of authentic Arthurian and chivalric legends, Glastonbury Tor, King Richard III, and Saint Joan. These images evoke the literary and cultural icons from British history and literature, all recognizable on sight by most educated native speakers of British English. It is such culturally authentic material that provides several types of production tasks for creating performance-based classroom interaction. First, since the prior texts involved in such video montages may not be known to the students learning the language, the original texts might be told, read, retold by the students, illustrated, or matched with the appropriate images in the music video. Second, since the connection between lyrics and video text may be thematic at best, students can be challenged to identify such relationships or to create their own links between the music lyrics and visual elements. This can generate lively and creative oral or written pieces. Third, using the original lyrics to inspire a class-produced video- or photo-montage can be a full-scale project for a class at any level of proficiency. Even for novice and intermediate level learners, the music without the lyrics may evoke particular images that students can assemble, produce or draw, and try to explicate in oral or written form.

      Just as the Sting video text can be effectively exploited in an EFL classroom for British English, American rock video sources provide no less material. Exemplary of the text-rich concept videos from American artists is Madonna's "Like a Prayer," which combines simple but evocative lyrics with powerful and controversial visual images:

Life is a mystery
Everyone must stand alone
I hear you call my name
And it feels like home.
When you call my name, it's like a little prayer;
I'm down on my knees, I want to take you there;
In the midnight hour, I can feel your power;
Just like a prayer, you know I'll take you there.
(Madonna and Leonard, 1989)

      These seemingly innocuous romantic lyrics are layered over an explosive visual montage depicting a racially-provoked murder, a miscarriage of justice in an American courtroom, and an intermingling of an interracial love story and religious affirmation. Visual literary allusions are made to American literary classics such as Elmer Gantry and To Kill a Mockingbird, requiring students to express and support opinions on complex social issues, all the while keeping the language material well-grounded in the relevant cultural setting. As Swaffar (1992) and Kramsch (1993) maintain, using literary texts with interactive communicative activities helps the students understand and acquire the shared meanings contained within literary works. With the teacher acting as native or near-native informant, the students are taught how to extract and exploit the cultural information often found in popular literature.

On Michael Jackson and Minimal Pairs

      Songs, like poetry, are among the most powerful aids in helping the learner commit limited phrases and word combinations to memory, to be put into active service at a later time in communication. Rock music videos can also be most effective in teaching pronunciation and intonation. The success of basic introductory EFL course materials such as Graham's Jazz Chants (1978) attests to the appropriateness of the musical text in language learning. Songs often contain the elements of repetition, rhyme and rhythm that facilitate quick memorization and easy imitation of the original text material. It is no wonder, then, that in song a "foreign accent" pronunciation is much more easily masked or eliminated than in normal conversational speech. Recent rock groups such as Sweden's "Abba" and Norway's "A-ha" and the scores of Japanese Elvis impersonators are not unique in their mastery of spoken English after only singing and recording for several years from a written phonetic English transcription! Even native English-speaking musicians such as Eric Clapton and Billy Idol often choose to sing with native-like American phonetics, though their normal speaking voices produce pure British English sounds.

      Music videos also have the advantage of being in the highly manipulable video format, allowing for on-screen exploitation, such as captioning, as well as other task-specific procedures. The most relevant of these techniques for pronunciation is colorization, in which certain items in the on-screen captions appear in different colors. This technique was developed and used in the acclaimed PBS series "ColorSounds" throughout the 1980s and into the 1990s as a means of teaching basic literacy skills to English-deficient school students and adults throughout the United States (Bell 1984). Selected contemporary music videos were captioned with particular sounds (e.g., /th/, word-final /r/, schwa, etc.) or grammatical items (e.g., nouns, adjectives, plurals, etc.) colorized in the on-screen lyrics throughout. Students were encouraged to sing along with the video and note the occurrences of the particular item that was in color. In 1985, some of these materials were adapted for inclusion in the video-based EFL package In America, aimed at teaching functional American English to speakers of Japanese (Dow 1985). For this project, music videos were selected to represent sounds which presented difficulty for the native speaker of Japanese. These sounds were colored in the captions to focus the learner's attention on their articulation in the song's lyrics. Sometimes these sounds would be contrasted with their allophones, such as /r/ and /l/, with the two sounds appearing in different colors in the same song to emphasize each phoneme's distinctive articulation and pronunciation. Thus, the specialized captioning of these videos provided an excellent aid to presenting and teaching the phonetic material, while the actual performance of the music video created a perfect vehicle for production practice.
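
      For instructors who wish to prepare colorized caption text of their own, the idea can be approximated in a few lines of code. The following is only a minimal sketch, not the method used by ColorSounds or In America: it assumes that the target sounds can be roughly located by spelling (the letters "r" and "l"), it uses ANSI terminal color codes as a stand-in for on-screen caption colors, and the names ANSI_COLORS and colorize_caption are hypothetical.

```python
import re

# Assumption: one ANSI terminal color per target spelling (red "r", blue "l").
# Spelling only approximates pronunciation; a real project would need a
# phonetic transcription of the lyrics rather than the raw letters.
ANSI_COLORS = {"r": "\033[31m", "l": "\033[34m"}
RESET = "\033[0m"


def colorize_caption(line, colors=ANSI_COLORS):
    """Wrap every target letter in its color code; other text is unchanged."""
    pattern = re.compile("|".join(re.escape(key) for key in colors), re.IGNORECASE)

    def paint(match):
        letter = match.group(0)
        return colors[letter.lower()] + letter + RESET

    return pattern.sub(paint, line)


if __name__ == "__main__":
    # A practice phrase (not taken from any song) that contrasts the two sounds.
    print(colorize_caption("Red lorry, yellow lorry"))
```

      Run in a terminal that supports ANSI colors, the practice phrase appears with each "r" in red and each "l" in blue, which mirrors the two-color contrast described above.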

Foreign Language Music Video

      The rock music format enjoys popularity both as English-language broadcast-quality videos and as locally-produced native-language music videos in countries all over the world. Spanish, French and German-speaking countries all tout MTV-like channels or programming, usually playing a mix of indigenous and imported videos. Just as the MTV videos can serve the EFL classroom, these foreign language rock videos can provide a wealth of memorable, functional language units contextualized by relevant, culturally-saturated visual images. Materials development projects over the last three years in the Department of Slavic Languages at the University of Texas have provided valuable opportunities to observe students of Russian working with authentic Russian-language rock music videos selected from the Moscow television program "Muzykal'noe obozrenie" ("Musical Review") which has emerged in the post-Gorbachev Commonwealth of Independent States as the local version of MTV. Like its American counterpart (which began limited broadcasting in Russia in 1993, complete with a Russian-speaking video jockey), "Muzoboz," as it is called in Moscow, broadcasts primarily performance or "concert" video footage, while the pedagogically more useful "concept" video format is still in its nascent stages of development. Still, many quite acceptable music video texts for Russian language teaching have emerged over the past five years.

      One such video is from the St. Petersburg-based group "Kino." The video is for the song "Videli noch'" ["We Saw the Night"] and, like any good rock music video text, has linguistically interesting lyrics combined with useful visuals. The lyrics alone provide considerable material for production activities in the Russian class:

My vyšli iz doma kogda vo vsex oknax
Pogasli ogni odin za odnim.
My videli kak uezžaet poslednij tramvaj.
Ezdjat taksi no nam nečem platit'
I nam nezačem exat'.
My guljaem odni.
Na našem kassetnike končilas' pljonka.
Est' sigarety, spički i butylka vina
I ona pomožet nam ždat', pomožet poverit'
Čto vse spjat i my zdes' vdvojom.
Videli noč', guljali vsju noč' do utra.

We left home when in all the windows
The lights went out one by one.
We watched the last tram leave.
Taxis are out but we don't have money
And we have no reason to go.
So we walk around alone.
The tape on our player has ended.
We've got cigarettes, matches and a bottle of wine
And it'll help us wait, it'll help us believe
That everyone's asleep and just we two are here.
We saw the night, we walked all night till dawn.

(Tsoy, 1986)

      These song lyrics provide excellent examples of verbal tense movement in colloquial narration from past to present to future activity. With a preponderance of palatalized consonants, the sounds in the video are well-suited for work on phonetics as well. But the video also allows the students to explore and begin to discover the Russians' world--in this case for university students, the world of their peers in St. Petersburg. Since many of our students may not get the opportunity to travel to Russia during their study, such brief video excursions into the culture of the language being studied are crucial, both to inform and to motivate. By watching the video scenes of Petersburg's streets late at night, and how Russian youth try to find places and ways to be out with friends, our students come to understand more about the nature of the lives of the Russians, about the ways they are alienated in Russian society, and how that is like and unlike their own experience in the U.S. For many of our students, the mere fact that many Russian university students live with their parents all the way through and long after their schooling is new information and a cultural discovery. Our students speak about an immediate sense of identification with their Russian peers that comes from finally being able to see them in their native Russian environment. How the Russian students dress, how they talk, how they behave with one another all help our students find more relevance in their study of Russian language and culture.

      For teachers wanting to exploit the potential of rock music videos as part of language and culture instruction, but who do not have access to broadcast or prerecorded sources of these videos, motion pictures can often provide a satisfactory substitute. In Russian, for example, the classic romantic comedy "Ironija sud'by" ["The Irony of Fate"] has several songs integrated into the visuals and plot of the film. One famous scene, set in snow-covered St. Petersburg on New Year's Day, shows stunning views of the city's most famous architectural and historical landmarks as the heroine of the story walks along the banks of the Neva River. This footage could not be a better introduction to the city for students and has the potential for a variety of performance-based applications in class, from simple elicitation of where something is located, or observations on the weather, to advanced tasks of deducing what a character might be thinking and why. All of this material is shown beneath the lyrics of the ballad "Ja sprosil u jasenja" ["I Asked the Ash Tree"], which succinctly reviews the high-frequency u + genitive case construction in Russian!

      If the adage "A picture is worth a thousand words" has any validity to it, then the use of video materials as part of a performance-based curriculum for foreign language instruction seems to be a natural consequence of including the most useful texts to encourage and stimulate active interaction of our students in the target language. By extension, I would contend that the inclusion of music videos as the source text exponentially increases the utility and benefits of regular video materials by adding the equally evocative modalities of lyric poetry and music, creating potentially one of the richest three-minute segments of usable text to be brought into a classroom. With limited additional preparation time, rock music videos offer both teacher and student endless variations of situational, functional, and truly communicative activities for developing proficiency. Turn on MTV and give music videos another critical review. Perhaps something in them will excite your own imagination in language teaching and you'll find a video that suits you and your needs. Then, in the immortal words of David Essex, "Rock on!"


Altman, R. (1989). The Video Connection: Integrating Video Into Language Teaching. Boston: Houghton Mifflin Company.

Bell, J. Michael. (1984). "The ColorSounds story." ColorSounds Monthly 1(9): 1-2, Austin: The ColorSounds Educational Foundation.

Dow, Anne R. (1985). In America. Reeves Corporate Services for International Horizons (Curaçao), N.V., A Learning Technologies Ltd. Company.

Garza, T.J. (1991). "Evaluating the use of captioned video materials in advanced foreign language learning." Foreign Language Annals 24(3): 239-258.

Graham, Carolyn. (1978). Jazz Chants. New York: Oxford University Press.

Kramsch, Claire. (1993). Context and Culture in Language Teaching. Oxford: Oxford University Press.

Lavery, M. (1984). Active Viewing Plus. Oxford: Modern English Publications, Ltd.

Lonergan, Jack. (1984). Video in Language Teaching. Cambridge: Cambridge University Press.

Madonna and Patrick Leonard. (1989). "Like a Prayer." Like a Prayer, Warner Brothers Music.

Maley, Alan. (1987). "Poetry and song as effective language-learning activities." Interactive Language Learning, Wilga Rivers, ed. New York: Cambridge University Press, 93-109.

Price, Karen. (1983). "Closed-captioned TV: An untapped resource," MATSOL Newsletter, vol. 12, no. 2.

Sting. (1993). "If I ever lose my faith in you." Ten Summoner's Tales, A&M Records.

Swaffar, Janet. (1992). "Written texts and cultural readings." Text and Context: Cross-Disciplinary Perspectives on Language Study, Kramsch and McConnell-Ginet, eds. Lexington: D.C. Heath.

Tsoy, Viktor. (1986). "Videli noč'." Kino, Melodiya Records.
