Speech Analysis Technology for Voice Identification
Tech Area / Field
- INF-OTH/Other/Information and Communications
- INF-SIG/Sensors and Signal Processing/Information and Communications
3 Approved without Funding
Institute of Control Systems, Georgia, Tbilisi
- Center of Audiology and Hearing Rehabilitation, Georgia, Tbilisi
- Universität Trier / Phonetik, Germany, Trier
- Postanschrift Klinikum rechts der Isar, Experimentelle Audiologie, Germany, Munich
- Forensic Communication Associates, USA, FL, Gainesville
The purpose of the proposed project is to develop an integrated system for analyzing the individual characteristics of a speech signal, as represented by a speech phonogram. The system is intended to support voice identification in criminology.
Experience in developing biometric methods for identifying a speaker by voice and manner of pronunciation has been widely reviewed in the legal literature of the past three decades. The problem became topical once the phonogram was accepted as objective material evidence in legal investigations and practice. Despite its obvious urgency, however, the problem has proved very complex: it has yet to be resolved, let alone applied in contemporary criminology.
The difficulties arise, in the first instance, because the object of investigation (the speech phonogram) changes under the influence of different factors. These factors include several forms of disturbance (noise, nonstandard characteristics of the sound recording apparatus, changing acoustic conditions during recording, the varying skill of the person making the recording, etc.) and factors connected with the intrinsic psychological and physiological state of the speaker under scrutiny (health, emotional state, various semantic and pragmatic characteristics of the analyzed speech fragments, possible deliberate voice distortion by a suspect, etc.).
These difficulties alone explain why a perfect method of voice identification, one that detects and accounts for all factors affecting a speech phonogram and provides a versatile and adequate analysis of voice and speech behavior, has yet to be designed.
The project participants have extensive experience in investigating the acoustic characteristics of the speech signal that reflect voice individuality, and in developing systems for authorized voice control of special-purpose objects (speech verification). Moreover, they created the first speaker identification system in the former USSR, which was used by forensic institutions of the Ministry of Justice. The project participants have published more than one hundred scientific works, including inventions and 4 monographs.
In light of this experience, a system should be created that combines subjective and automatic approaches, i.e. auditory, linguistic and automatic (computer) assessments of the similarity of the voices represented by the disputed and sample phonograms.
Accordingly, the analysis of voice individuality and speech behavior, irrespective of the speech segment pronounced, should be performed at different levels of speech signal description, from phonetic features to integral properties that are revealed only over long speech segments. Dynamic features of pronunciation should be expressed as prosodic speech parameters.
Auditory appraisal of the phonogram, as the most adequate way of estimating the voice and articulatory skills of the speaker, is needed as a basis for determining the group to which the speaker belongs and for direct recognition of the speaker's identity. To this end, a method should be developed for selecting qualified experts and estimating their auditory abilities. Special equipment, together with this method, would make it possible to conduct experiments that reveal new features for the parametric description of individuality, such as evoked otoacoustic emission (EOE) and a signal obtained by the method of bone conduction (MBC), which reflects the anatomical peculiarities of the individual. Specialists of the Center of Audiology and Hearing Rehabilitation, drawing on their high qualifications and experience, will build the acoustic sensor elements, determine the spectral and temporal parameters of MBC and EOE, and evaluate their long-term stability. EOE inter-aural asymmetry, which may also be individual in nature, is of interest as well.
Linguistic appraisal should be based on features that are, on the one hand, connected with deviations from the linguistic norm and, on the other hand, lie within the norm itself. Visual appraisal of so-called voice "imprints" (standard speech segments) can be counted among the subjective methods of speech investigation. In addition to the subjective experiments, objective appraisal by computer should be organized, i.e. appraisal using spectral, amplitude and temporal parameters of the speech signal measured in both static and dynamic modes.
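As a rough illustration of what static and dynamic measurement of such parameters can mean, the following Python sketch computes one temporal parameter (short-time energy) and one spectral parameter (the dominant frequency of a frame) on a synthetic tone, then takes frame-to-frame differences as a simple dynamic feature. The frame size, hop, sample rate, test signal and all function names are assumptions chosen for the example; they are not taken from the project and do not describe its actual feature set.

```python
import cmath
import math


def split_frames(signal, size, hop):
    """Cut the signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    """Temporal parameter: mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def magnitude_spectrum(frame):
    """Spectral parameter: magnitudes of a naive DFT (bins 0..N/2)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(frame, sample_rate):
    """Frequency in Hz of the strongest non-DC bin of the frame."""
    mags = magnitude_spectrum(frame)
    k = max(range(1, len(mags)), key=mags.__getitem__)
    return k * sample_rate / len(frame)


def deltas(values):
    """Dynamic mode: frame-to-frame change of a static parameter."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical test signal: a 200 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(400)]

frames = split_frames(tone, size=80, hop=40)
energies = [short_time_energy(f) for f in frames]        # static values
energy_deltas = deltas(energies)                         # dynamic values
peak_hz = [dominant_frequency(f, SAMPLE_RATE) for f in frames]
```

For a steady tone the static values are constant from frame to frame and the deltas stay near zero; in real speech the deltas would carry information about the dynamics of pronunciation.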
In addition to the tasks mentioned above, a mathematical decision-making model will be developed, based on probabilistic and statistical modeling of the appraisal process and using methods of informational classification of samples.
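The abstract does not specify the decision model. One common probabilistic formulation, shown here purely as an assumed illustration, compares how likely an observed voice-comparison score is under a "same speaker" model and under a "different speakers" model. The sketch below fits a Gaussian to each set of reference scores and returns the log-likelihood ratio; the score values, the Gaussian assumption and all names are hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2 * sigma ** 2))


def log_likelihood_ratio(score, same_scores, diff_scores):
    """Log LR of a score: same-speaker model vs. different-speaker model.

    Each model is a Gaussian fitted to reference scores (an assumption
    made for this sketch, not the project's stated method).
    """
    same = gaussian_logpdf(score, mean(same_scores), stdev(same_scores))
    diff = gaussian_logpdf(score, mean(diff_scores), stdev(diff_scores))
    return same - diff


# Hypothetical reference data: distances between feature vectors,
# small for recordings of one speaker, larger for different speakers.
same_speaker_scores = [0.9, 1.1, 1.0, 1.2, 0.8]
diff_speaker_scores = [2.9, 3.1, 3.0, 3.3, 2.7]

llr_close = log_likelihood_ratio(1.0, same_speaker_scores, diff_speaker_scores)
llr_far = log_likelihood_ratio(3.0, same_speaker_scores, diff_speaker_scores)
```

A positive log-likelihood ratio supports the hypothesis that the disputed and sample phonograms come from the same speaker, and a negative one supports the opposite hypothesis; where to set a decision threshold is a separate question that this sketch leaves open.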
The project results, i.e. a design for an expert criminologist's workplace, all the facilities needed to carry out the appraisal, and a method for organizing the sequence of operations, will be recommended to the institutions of the Ministry of Justice and to independent forensic experts.
The basic activities of the project participants were previously devoted to research in closed fields, namely the construction of systems for authorized voice control of moving objects and means of delivery (projects of the Ministry of Defense and the Military-Industrial Commission of the USSR Council of Ministers: "Kub", "Koler", "Karta", "Konvert", etc.). The proposed tasks therefore correspond fully to ISTC objectives: the skills of weapons researchers will be redirected to peaceful purposes (the struggle against corruption, terrorism, blackmail, etc.).
The project is planned for 36 months.
Twenty specialists will be employed, including 2 Doctors of Technical Sciences, 1 Doctor of Medical Sciences, 4 Candidates of Technical Sciences, 2 Candidates of Biological Sciences and 1 Candidate of Physical and Mathematical Sciences.
The total project effort is 400 man-months.
The contribution of foreign collaborators will take the form of information exchange in the course of project implementation and discussion of results at joint seminars and workshops.