Understanding (Diegetic and Non-Diegetic) Sound

The moving image can be found everywhere. You can see it on TV, in film, in games and animation, on the Internet, in advertising, and on hand-held devices such as mobile phones and tablets. Sound is also ever-present in our lives. While sound and the moving image can each exist separately, there is a close relationship between them, especially in the media.

Sound can be broken down into three categories: voice, music, and other sounds, such as ambient sound and sound effects. Each of these kinds of sound adds something to the image and changes the feel of it.

Diegetic and Non-Diegetic Sound

Sounds can also be either diegetic or non-diegetic. Diegetic sounds are those that originate within the world of the story, usually from something visible on screen, and can be heard by the characters. This includes dialogue and the sounds of objects on the screen. Non-diegetic sound is, by contrast, any sound that the audience hears but the characters cannot. This could be narration, “mood” music, and some sound effects. This blog post refers to both forms of sound in a number of examples ranging from TV to film to animation. All of the sounds featured in the examples that follow had to be created somehow, so this analysis also explains how they were created and who is responsible for creating them.


Voice in the Moving Image

Voice is frequently used alongside video. It could be in the form of dialogue, narration, or interviews. Often, the use of voice with the moving image helps communicate ideas to the audience easily, and it can change the tone of the piece.


In this scene from Harry Potter and the Deathly Hallows Part 2, hearing the conversation helps the audience to understand the series of shots they are seeing. Without the dialogue, the audience would struggle to understand the conflict between these characters. The dialogue shows that Ron is extremely frustrated by how slow their search for Horcruxes is. It also reveals that Harry and Hermione have made a crucial discovery in their quest to destroy the Horcruxes.

This scene is a clear example of diegetic sound because the characters are able to hear the sound that the audience is hearing.

Without the sound, the visuals alone do not set the tone of the scene. By hearing the characters’ booming voices yelling at one another, the audience can understand that this is a tense moment.

The dialogue in this scene was recorded during filming using boom microphones. These large microphones are mounted on a boom pole and suspended over the actors’ heads, where they pick up the voices clearly so that the audio can be synced to the visuals later in the editing process. In this case, a boom operator is responsible for capturing the sound.


However, in some situations, other methods are used to record dialogue. In the first video, from the film 300, there are too many people and too much movement in the scene to allow boom operators to stand still and capture the sound. In this case, the actors had to record the dialogue for this scene separately so that it could be edited in during post-production. This was done in a studio with mounted microphones and sound technicians. In the case of animation, as shown in the clip from the film Corpse Bride, the visuals aren’t recorded with sound at all, because the animated characters have no voices of their own. The voices were recorded separately by actors in the studio, then added to the scene later on.


Voice can also accompany video in the form of narration. Narration refers to a voice added to the video that belongs to a third person outside the scene. In a way, it is as though this external person is recounting the scene and its events to the audience. Here is an example of narration from Tim Burton’s short film Vincent:

The narration for this scene is a non-diegetic sound because it does not belong to anyone present in the frame. Unlike dialogue added during the animation, the sound we hear is not coming from the characters but rather an external person who sees the action and explains it.




What is helpful about the narration is that it provides viewers with added information they couldn’t gather from the visuals alone, such as Vincent’s age, the type of person he is, and what his interests are. It also sets the tone for the video. This video is about a boy who dreams of being like the actor Vincent Price. This actor often played parts in horror films where he was a sort of mad scientist, creating all sorts of things in his basement. Using Vincent Price’s own voice to accompany this clip gives the film a dark and creepy feel, like that of the monster movies it references. Just like dialogue, the narration is recorded by an actor in the studio, then added to the film when it is in its final stages. Mounted microphones were used in the studio with the help of sound technicians.



Voice can also come in the form of interviews. This type of sound is diegetic because it comes from what we can see in the video and the people in the clip can hear it too.

Their voices were recorded during filming with quality microphones placed very close to the subjects. This is apparent because every sound is picked up, including the low scratching noise coming from Lance Armstrong’s jacket as he moves his arms.


Music in the Moving Image

Pairing music with visuals is probably the most effective way of conveying emotion. Music adds to the feel that is already hinted at on-screen. Music in the Moving Image could come in the form of soundtracks or library sounds.


Soundtracks are music that has been specifically created to pair with video. Music that relates to the themes touched upon in the video sets the tone and reinforces the message. Adele recorded the song “Skyfall” specifically for use in the James Bond film of the same name. The clip below shows how her song highlights key themes and sets the tone for the film.



In this scene from the opening credits, various symbols are shown that give clues as to what happens later on in the film. However, they don’t tell us very much without the music accompanying them. The music is therefore a key way of making sense of what will happen as the film progresses. The lyrics highlight themes of solidarity and death, both of which are present in the rest of the film. The audio track itself has a dramatic, melancholy feel that sets the tone early on for the entire film.

The song is another clear example of non-diegetic sound because its origin is not present in the scene. The song was recorded by Adele and a number of other people who specialize in sound production. The singing was recorded with microphones in a studio, and the instrumental was either recorded with live instruments (with clip-on lavalier microphones attached to them) or created using software such as Logic Pro.

Library Sounds

Instead of creating music specifically for the video, it is also possible to use music that already exists. This is also a good way to set the tone for the video.


Sound in the Moving Image

Sound effects (SFX) and ambient sound can also accompany the moving image. Just like the other types of sound, they have an impact on how the image is perceived.

Sound Effects

Sound effects are an important part of video production. They can be used to enhance an existing sound or to create a sound that did not exist. Here are examples of both situations.

In this fight scene, work was done to emphasize the sound of the punches, kicks, and slaps. If this were to happen in real life, it wouldn’t sound nearly as dramatic! This is done to draw our attention to specific actions that we might otherwise miss, or to make the scene more interesting.

Even though the sound has been over-emphasized, it is still a diegetic sound because its origin is present on the screen and the characters can also hear it. Sound effects are often created by Foley artists. It is their job to watch the video during post-production and create the necessary sounds for it. In the studio, Foley artists manipulate props and record sound to match what they see in the video. For example, the punches were probably the result of punching a cabbage and manipulating the resulting sound so that it syncs with the actions we see on screen.

It is exactly the same process for “new” sound, such as the sound of a Transformer. Since it would be rather difficult to find a Transformer and record the sound directly, it had to be created. The sound we hear is a mixture of a number of ordinary sounds which are manipulated to give the desired effect. For things like footsteps and other sounds, Foley artists had to recreate them after the scene was shot.

Ambient Sound

Ambient sounds are background sounds that help the audience understand where the action is taking place. They can also help to set the tone.



In this scene from The Pursuit of Happyness, the ambient sound makes the audience aware of the scene’s location. We register that it is a subway station partly because we hear the familiar sounds of the trains.

The ambient sound in this video is diegetic because it comes from trains we can see and the characters can hear the sound too.

The sound of the subway was likely recorded by boom operators during the filming process. They probably went into the tunnels to record it after this specific scene was finished, and it was edited in afterward.

As you can see, sound makes all the difference when it is paired with the Moving Image. Sound can help to communicate ideas easily, evoke emotion, and set the tone for the visuals. It can enhance the viewing experience and provide the audience with clues that the visuals alone might not be able to give.

