The blueprint of the spatial sonic field/ chapter 3
3. Wave Field Synthesis
Wave Field Synthesis, also written wavefield synthesis or WFS, is a spatial audio reproduction procedure. Unlike conventional audio procedures, it does not depend on the psychoacoustic perception of phantom sound sources; instead, the sound field is reconstructed physically. For this purpose the synthesis emulates natural wave fronts according to Huygens' principle by assembling them from elementary waves. A computer drives each individual loudspeaker membrane in an array arranged around the listener at exactly the moment when the wave front of a virtual point source would pass through that point in space. As the following animation shows, the original wave front is restored physically.
3.1 Mathematical basis
In the late 1980s a procedure for creating such virtual sound sources was developed at the Delft University of Technology. The mathematical basis of this “Wave Field Synthesis” procedure is the Kirchhoff-Helmholtz integral. It states that if the sound pressure and particle velocity are known on the surface of a source-free volume, the sound pressure at any point within this volume is determined. According to the Rayleigh II integral, the sound pressure at a point A within a half-space is determined if only the pressure distribution on a plane is known. An acoustic field arises on both sides of this plane. If the rearward sound is suppressed, half-space radiation results.
3.2 Physical principle
Huygens' principle is more intuitive in this matter. Christiaan Huygens discovered that each point of a wave front is the starting point of an elementary wave. More than 300 years ago, the Dutch mathematician explained diffraction effects by that principle. The principle applies to every kind of wave propagation, to light waves as well as sound waves. Huygens' principle is one of the most important insights in physics. In acoustics today, this knowledge makes it possible to restore true-to-original sound waves from such elementary waves:
In this animation we consider the holes in the baffle, or equivalently the loudspeakers, as the starting points of such elementary waves. As long as the size and spacing of the holes remain small compared with the sound wavelength, the sound pressure will not differ between the two sides of the hole. The superposition of a sufficient number of such elementary waves completely restores the original wave front. All we need is the dry recorded source signal and the distance from the source to each starting point of the elementary waves.
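The timing rule described above can be sketched in a few lines of Python. This is only a simplified illustration, not a complete WFS driving function: the speed of sound, the plain 1/r gain law, the function name driving_parameters and the nine-speaker layout are all assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 °C


def driving_parameters(source, speakers, c=SPEED_OF_SOUND):
    """Return one (delay_seconds, gain) pair per loudspeaker.

    Simplified model: the delay is the travel time from the virtual source
    to the loudspeaker, and the gain follows a plain 1/r law (assumption).
    """
    params = []
    for x, y in speakers:
        r = math.hypot(x - source[0], y - source[1])
        params.append((r / c, 1.0 / max(r, 1e-6)))
    return params


# Hypothetical setup: nine loudspeakers on a line, 20 cm apart,
# virtual point source 2 m behind the centre of the array.
speakers = [(i * 0.2, 0.0) for i in range(-4, 5)]
source = (0.0, -2.0)
params = driving_parameters(source, speakers)

for (x, _), (delay, gain) in zip(speakers, params):
    print(f"x = {x:+.1f} m  delay = {delay * 1000:.3f} ms  gain = {gain:.3f}")
```

The loudspeaker nearest to the virtual source receives the shortest delay, so it radiates first, and the delays grow symmetrically towards both ends of the row.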
Unfortunately, the sound field in the recording room is not formed by the direct wave front alone. The major part of the sound energy is contained in reflections. For true spatial audio we cannot simply radiate the reflections from the direction of the main source, as is often done in conventional audio. The difference in direction between the first reflections and the starting point of the direct wave delivers the most important cues about the source distance and the distances of the recording room walls. The huge number of later reflections that make up the reverberation tail are less important regarding direction, but they provide information about the fine structure and properties of the recording room surfaces.
WFS loudspeaker arrangements can create more than one virtual sound source. Their signal content is independent and may originate from different sources. Identical signal content radiated from different source positions imitates reflections of the main source from arbitrary starting points. As described in the second chapter, the original sound field in the recording room is formed by a huge number of such starting points with the same signal content. If we reconstructed all those positions, the spatial sound field would be completely recreated from a single, dry recorded mono signal. The recording room acoustics work no differently. The main difficulty in restoring the original sound field is determining the starting points of all the reflections in the recording room.
3.2.1 The model based approach
Wave Field Synthesis offers two different ways to do this. The simpler method is the model based approach. According to the mirror source model, the starting points are calculated simply from the recording room geometry. The calculated distance between each virtual sound source position and each loudspeaker position determines delay and level. The reflection factors of the walls are included in this calculation, as is the directional radiation of the primary source. Such a procedure is practicable for restoring the direct wave and the first reflections in the recording room, but in practice the huge number of discrete reflections in the reverberation tail makes it impossible to reconstruct the complete sound field correctly with the model based approach.
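As an illustration of the mirror source model, the following Python sketch computes the four first-order mirror sources of a rectangular room seen in plan view. The rectangular shape, the coordinate convention (walls at x = 0, x = width, y = 0, y = depth), the single reflection factor of 0.8 and the function name are assumptions for this example only.

```python
def first_order_mirror_sources(source, room, reflection_factor=0.8):
    """Mirror the source at each of the four walls of a rectangular room.

    source: (x, y) position of the primary source in metres
    room:   (width, depth) of the room in metres
    Returns a list of ((x, y), level_factor) pairs, one per wall.
    """
    x, y = source
    width, depth = room
    positions = [
        (-x, y),             # mirrored at the wall x = 0
        (2 * width - x, y),  # mirrored at the wall x = width
        (x, -y),             # mirrored at the wall y = 0
        (x, 2 * depth - y),  # mirrored at the wall y = depth
    ]
    # Each first-order reflection is attenuated once by the wall (assumption:
    # the same frequency-independent factor for all walls).
    return [(pos, reflection_factor) for pos in positions]


# Hypothetical room of 4 m x 5 m with the source at (1 m, 2 m)
mirrors = first_order_mirror_sources((1.0, 2.0), (4.0, 5.0))
for pos, factor in mirrors:
    print(pos, factor)
```

Each mirror source can then be treated like an additional virtual source with the same dry signal. Higher-order reflections would be obtained by mirroring the mirror sources again, which is where the number of positions grows so quickly.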
3.2.2 Impulse response based approach
For that reason, the scientific institutes that refine Berkhout's idea commonly apply the impulse response based approach. In preparation for the transmission, the spatial impulse response of the recording room is captured. For this purpose a line array of microphones is arranged in the recording room in the same way as the loudspeakers are arranged in the playback room. To capture the spatial impulse response, a short impulse is emitted at the later position of the primary sound source and picked up by the microphone array. The impulse hits the nearest microphone first. During playback, the loudspeaker at the corresponding position will radiate the audio signal ahead of all other loudspeakers. The other microphones in the recording room are struck by the impulse later, one after another.
Convolving each loudspeaker signal with the recorded impulse response of the assigned microphone recreates the direct wave and all its reflections in the recording room from the correct starting points.[1]
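The convolution step can be shown with a very small Python example. The signal and the impulse response are invented toy values, and the direct-form loop is chosen for readability; a real system would use fast convolution on much longer impulse responses.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Toy data (assumption): a three-sample dry signal and an impulse response
# with the direct sound at sample 1 and one reflection at sample 3.
dry = [1.0, 0.5, 0.25]
impulse_response = [0.0, 1.0, 0.0, 0.5]

wet = convolve(dry, impulse_response)
print(wet)  # [0.0, 1.0, 0.5, 0.75, 0.25, 0.125]
```

The output contains the dry signal delayed by one sample, overlaid with a copy at half level two samples later. In the same way, each loudspeaker signal inherits the delay and the reflections that its assigned microphone has recorded.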
In practice, however, it is not possible to record the spatial impulse response at all possible microphone positions in the recording room for all possible positions of the sound source. The measured results must be extrapolated during playback for all other positions. This calculation needs to include all of the mirror source positions. Moreover, the acoustic length of the microphone array differs from that of the loudspeaker array when the temperature in the playback room is different. That would cause a loss in the upper frequency range unless the different propagation speed is included in the calculation. In addition, the loudspeaker positions normally differ from the microphone positions. Especially for moving sound sources, the amount of calculation is hardly manageable in real time with currently available computing power.
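How strongly the temperature influences the timing can be estimated with a short Python sketch. It uses the common ideal-gas approximation for the speed of sound in dry air; the path length of 5 m, the two temperatures and the sample rate of 48 kHz are assumed example values.

```python
import math


def speed_of_sound(temp_celsius):
    """Approximate speed of sound in dry air in m/s (ideal-gas formula)."""
    return 331.3 * math.sqrt(1.0 + temp_celsius / 273.15)


def delay_samples(distance_m, temp_celsius, sample_rate=48000):
    """Propagation delay over distance_m, expressed in samples."""
    return distance_m / speed_of_sound(temp_celsius) * sample_rate


# Assumed example: impulse response measured at 20 °C, playback at 30 °C,
# over an acoustic path of 5 m.
mismatch = delay_samples(5.0, 20.0) - delay_samples(5.0, 30.0)
print(f"c(20 °C) = {speed_of_sound(20.0):.1f} m/s")
print(f"c(30 °C) = {speed_of_sound(30.0):.1f} m/s")
print(f"delay mismatch = {mismatch:.1f} samples")
```

Under these assumptions the mismatch comes to about 11 to 12 samples at 48 kHz, which is more than one period at 5 kHz. That indicates why an uncorrected temperature difference first shows up in the upper frequency range.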
3.3 Procedure advantages
In principle, though, wave field synthesis is able to produce a virtual copy of the original sound field. All sound sources and all of their reflections in the recording room can be produced virtually at any point within the horizontal plane of the listener. This differs from conventional audio reproduction procedures, which try to transmit the whole information in a few separate audio channels. Synthesis from the dry recorded source signal, in the same manner as the recording room itself forms the spatial sound field, is a possible way to true spatial loudspeaker playback.
Such a volume based solution is not confined to a narrow sweet spot. Because wave field synthesis restores the field itself, changes of the listener position in the playback room cause the same changes in perception as listener movements in the recording room. That would never be possible with psychoacoustically based phantom sources, because their position in the room depends on the listener position. The virtual acoustic sources, however, produced from a sufficient number of elementary waves, behave like real sound sources. The loudspeaker itself is no longer the reference point.
Wave Field Synthesis makes it possible to place the virtual sound source in front of the loudspeaker array. In the principle animation, the delay times and levels would show no difference whether the virtual source is behind or in front of the microphone row. Thus we would perceive the starting point behind the speakers in either case. But if the delay times are inverted by the “Time Mirror Approach”, the loudspeakers produce concave wave fronts. In that case the virtual source appears at the focus point inside the playback area. To a certain degree, we can even walk around it.
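The inversion of the delay times can be sketched as follows. In this simplified Python example the loudspeaker farthest from the focus point fires first and the nearest one last, so that all elementary waves meet at the focus at the same instant. The array layout, the focus position, the speed of sound and the function name are assumptions for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def focused_source_delays(focus, speakers, c=SPEED_OF_SOUND):
    """Delays in seconds that make all elementary waves meet at the focus.

    The delays are the mirrored travel times: the farthest loudspeaker gets
    zero delay, the nearest loudspeaker gets the largest delay.
    """
    distances = [math.hypot(x - focus[0], y - focus[1]) for x, y in speakers]
    r_max = max(distances)
    return [(r_max - r) / c for r in distances]


# Hypothetical setup: nine loudspeakers on a line, 20 cm apart,
# focus point 1 m in front of the centre of the array.
speakers = [(i * 0.2, 0.0) for i in range(-4, 5)]
focus = (0.0, 1.0)
delays = focused_source_delays(focus, speakers)

# Arrival time at the focus = loudspeaker delay + travel time to the focus
arrivals = [
    d + math.hypot(x - focus[0], y - focus[1]) / SPEED_OF_SOUND
    for d, (x, y) in zip(delays, speakers)
]
print([round(d * 1000, 3) for d in delays])    # delays in ms
print([round(a * 1000, 3) for a in arrivals])  # identical arrival times in ms
```

Beyond the focus point the wave front diverges again like that of a real source, which is why the virtual source is perceived there only by listeners on the far side of the focus.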
3.4 Remaining problems
3.4.1 Horizontal restriction
In principle, Wave Field Synthesis is not limited to a plane; the procedure would be able to restore the sound field in all three room dimensions. But with the impulse response based solution, the computing power needed for 3D audio has remained out of reach until today. Besides, covering all playback room walls with loudspeakers is hardly a usable approach in practice.
Looking for a practicable solution, the developers gave up the representation of elevation as a compromise. Reducing the speakers to a single line around the listener was an acceptable solution, realizable as early as the nineties. Our localization in azimuth is based mainly on time differences, which are reconstructed perfectly by such horizontal loudspeaker lines.
Such solutions are possible today with hundreds of loudspeakers without insurmountable problems. Yet the horizontal limitation remains clearly audible, especially in the damped environments that are needed to suppress the playback room acoustics in such a WFS approach. Other procedures, like Ambisonics or Vector Base Amplitude Panning (VBAP), have shown that a truly three-dimensional reproduction of the sound event is essential.
3.4.2 Disturbing playback room acoustics
Apart from the question of acceptance of such loudspeaker rows all around the room, the loudspeaker rows cannot solve the problem of the disturbing additional playback room acoustics in the transmission chain. To reproduce the acoustics of the recording room alone, the playback room acoustics must be suppressed completely.
Horizontal rows of loudspeakers do not really produce parallel wave fronts; they radiate cylindrical waves. Such wave fronts lose 3 dB in level every time the distance is doubled. Since the listener is relatively near the speakers, the increasing level of the nearby speakers becomes disturbing. Moreover, the lost sound energy comes back from the playback room walls if damping is insufficient. The strong reflection from the playback room ceiling in particular is hardly avoidable with horizontal cylindrical wave radiation.
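The 3 dB figure follows from the geometry of the spreading. A small Python sketch, assuming ideal free-field spreading without room reflections, compares a line source (cylindrical wave) with a point source (spherical wave); the function name and the source_type parameter are chosen for this example.

```python
import math


def level_drop_db(r1, r2, source_type="line"):
    """Level drop in dB when moving from distance r1 to distance r2.

    "line":  cylindrical spreading, intensity falls with 1/r
    "point": spherical spreading, intensity falls with 1/r^2
    """
    factor = 10.0 if source_type == "line" else 20.0
    return factor * math.log10(r2 / r1)


print(f"line source, distance doubled:  {level_drop_db(1.0, 2.0, 'line'):.2f} dB")
print(f"point source, distance doubled: {level_drop_db(1.0, 2.0, 'point'):.2f} dB")
```

A real point source in the recording room would lose about 6 dB per doubling of distance, so the level behaviour of the loudspeaker row does not match that of the source it is meant to reproduce.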
3.4.3 Aliasing Effects
The Kirchhoff-Helmholtz integral describes an unlimited number of elementary waves. In practice, however, the number of loudspeakers is limited. As with any quantisation in audio, this causes aliasing effects.
Inside the playback area, depending on the audio wavelength, spots of higher level alternate with spots of reduced level across the room. At any one point, the notches and peaks have a very small bandwidth. Fortunately, such effects are less disturbing in perception than the measured frequency response curve suggests.
The difference in magnitude between notches and peaks depends on the spacing of the elementary wave sources and on the positions of listener and source relative to the radiating loudspeaker array. For aliasing-free reproduction, a spacing of less than one inch would be needed.
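The required spacing can be estimated with the spatial sampling rule that the loudspeaker spacing should not exceed half the shortest wavelength. The Python sketch below uses this rule as a worst-case assumption; the actual limit also depends on the source and listener positions, and the speed of sound of 343 m/s is an assumed value.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def aliasing_frequency(spacing_m, c=SPEED_OF_SOUND):
    """Worst-case frequency in Hz above which spatial aliasing can occur."""
    return c / (2.0 * spacing_m)


def max_spacing(frequency_hz, c=SPEED_OF_SOUND):
    """Largest spacing in metres that stays alias-free up to frequency_hz."""
    return c / (2.0 * frequency_hz)


print(f"20 cm spacing: aliasing above about {aliasing_frequency(0.2):.0f} Hz")
print(f"alias-free up to 20 kHz: spacing below {max_spacing(20000.0) * 1000:.1f} mm")
```

Under this assumption a spacing of 20 cm is alias-free only up to roughly 860 Hz, while covering the full audio band up to 20 kHz would need a spacing of under 9 mm, which is indeed less than one inch.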
3.4.4 Truncation effect
If the radiating loudspeaker arrangement is not closed around the listener, the ends of the radiating surface cause the “Truncation Effect”. As visible in the principle animation, at these ends no further elementary waves suddenly contribute sound pressure. This abruptly changes the resulting superposition of all elementary waves as well, and a shadow wave arises.
To a certain degree, this effect can be avoided by reducing the level of the outer speakers. If the virtual acoustic source lies behind the loudspeakers, the shadow wave arrives at the listener later than the direct wave front. But if the source lies inside the playback room, the shadow wave arrives first, which is very disturbing for perception.
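Reducing the level of the outer speakers is usually done with a smooth window over the array. The following Python sketch applies a raised-cosine fade to both ends of the row; the window shape, the tapered fraction of 20 percent and the function name are assumptions for illustration, not values taken from a particular installation.

```python
import math


def taper_weights(n_speakers, taper_fraction=0.2):
    """Gain weights for a loudspeaker row with raised-cosine fades at both ends.

    The inner loudspeakers keep full level (1.0); the outermost
    taper_fraction of the row on each side is faded down smoothly.
    """
    n_taper = max(1, int(n_speakers * taper_fraction))
    weights = [1.0] * n_speakers
    for i in range(n_taper):
        w = 0.5 * (1.0 - math.cos(math.pi * (i + 1) / (n_taper + 1)))
        weights[i] = w                   # fade in at the left end
        weights[n_speakers - 1 - i] = w  # mirrored fade at the right end
    return weights


weights = taper_weights(20)
print([round(w, 3) for w in weights])
```

The smoother the fade, the weaker the shadow wave from the ends, at the price of a shorter effective array.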
3.4.5 Concave wave fronts
Another problem for such virtual sound sources inside the playback area is the wrong ITDs of concave wave fronts. All surfaces of real sound sources in nature are convexly curved; we have no listening experience from inside a sound source. Thus the time difference cues of such wave fronts produce an utterly odd perception.
If the listener is situated between the radiating loudspeakers, such misleading cues arise. Two different ways of solving the problem are described in the protected solution EP1637012 and in the application DE 10 2006 054 961 A1, which is no longer protected.
3.4.6 Parallax problems
The last-mentioned proposal can also be used to solve another problem of the acoustic blueprint. We are able to produce virtual sound sources inside the audience area, but we cannot produce the sound source of an associated picture at that point. The described way of combining the physical principle with a psychoacoustically simulated source position reveals the breathtaking possibilities of the wave field synthesis principle.
3.5 Compatibility
The wave field synthesis principle is an object based approach. We have to transmit the pure, dry recorded audio (content) and, in addition, the data describing the recording room properties. In computing, such object based transmission has long been standard because of its efficiency. The German Fraunhofer Institute worked on developing the MPEG-4 standard, which is suited to such object based audio broadcast.
Unfortunately, many traditional components cannot play that standard at present. On the other hand, traditional audio can be played over WFS loudspeakers, but the fundamental advantage of producing a true spatial impression of the recording room is lost in such reproduction. We can feed the channels into virtual panning spots, that is, virtual loudspeakers far beyond the real playback room walls. That lessens the influence of the listener position in the playback room, because the angles and levels relative to the distant loudspeakers hardly change between different points in the playback room. That extends the sweet spot to … almost the whole playback room. But the perception remains traditional phantom source based audio, including all the disadvantages of the traditional procedures described in the first chapter.
3.7 Subjective impression
Impressions are always subjective, but I want to describe as neutrally as possible my own impressions from different occasions of listening to wave field synthesis loudspeaker rows:
Until today, the installations remain clearly short of the goal of a perception congruent with the original sound event. Most notably audible was the reduction to the horizontal plane. The loudspeakers in the rows were at least 20 cm apart from each other, though aliasing effects were not really disturbing. More audible was a tonal inaccuracy, especially a loss in the upper frequency range. In some cases the loss increased with the number of demonstrations; possibly some solvable problems with including the room temperature in the calculation were responsible.
On the other hand, the spatial impression is incomparably better than is possible with any traditional procedure. The positions of the sound sources are absolutely stable. Never in traditional audio is it possible to estimate the distance of the sound source so clearly. No loudspeakers remain audible; the sound seems to move independently of all loudspeakers, outside and inside the playback room.
The position remains constant in the playback room, even if the listener moves across the area. The source level changes according to the distance from the virtual sound source. Only very near a virtual sound source inside the room does perception become vague, without a definite source position.
Most of the problems seem solvable in the foreseeable future; the first installations of tightly packed two-dimensional loudspeaker arrays show promising results. Such speaker fields would be suitable for putting a WFS “Holophony” approach into practice, which should be the final goal of the procedure.