Speaker Recognition, Still a Long Way to Go

For my last post on this blog, I want to say a little about the current state and the future of speaker recognition.

Although speaker recognition has been researched for several decades, it still has a long way to go. We have good algorithms that push recognition accuracy above 98%, and even to 100% with very long training times. Those prototypes aim for high performance rather than low computational cost. As a result, computation time is a barrier that keeps such systems out of real applications like banking. Generating the models for a group of 630 people takes a few hours, so it is hard to imagine how much time a bank with millions of customers would need. You can at least divide the computation across several processors, but that requires investing more money.
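To put a rough number on this, here is a back-of-the-envelope sketch. It assumes that training cost grows linearly with the number of enrolled speakers and splits perfectly across processors, which is optimistic on both counts. The value of 3 hours for "a few hours" and the function name `estimate_training_hours` are my own assumptions for illustration, not measurements.

```python
def estimate_training_hours(n_speakers, ref_speakers=630, ref_hours=3.0, n_workers=1):
    """Rough wall-clock estimate for model generation.

    Assumptions (illustrative only): cost is linear in the number of
    speakers, and the work divides evenly across n_workers processors.
    ref_hours=3.0 is a guess standing in for "a few hours".
    """
    if n_speakers < 0 or n_workers < 1:
        raise ValueError("n_speakers must be >= 0 and n_workers >= 1")
    hours_per_speaker = ref_hours / ref_speakers
    return n_speakers * hours_per_speaker / n_workers


if __name__ == "__main__":
    customers = 1_000_000  # hypothetical bank size
    for workers in (1, 10, 100):
        hours = estimate_training_hours(customers, n_workers=workers)
        print(f"{workers:>3} processor(s): {hours:8.1f} h (~{hours / 24:.1f} days)")
```

Even under these friendly assumptions, a single processor would need months for a million customers, and bringing that down to days means paying for on the order of a hundred processors.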

Another barrier is that the high accuracy figures are obtained on very clean speech, which contains no silence and no voices other than the speaker’s. In real applications and devices, however, it is difficult to filter out all the noise and other sounds. This poses a big challenge for the front-end processing and calls for innovation in recording hardware.
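As a small illustration of what the front end has to do, here is a minimal sketch of the simplest possible step: dropping silent frames with an energy threshold. This is a generic textbook technique, not necessarily what any particular system uses, and the frame length, the -40 dB threshold and the 16 kHz sample rate are assumptions chosen for the toy example. It removes only silence; background noise and competing voices need far more than this.

```python
import math


def drop_silent_frames(samples, frame_len=400, threshold_db=-40.0):
    """Keep only frames whose RMS level exceeds threshold_db.

    samples: sequence of floats in [-1.0, 1.0]; levels are measured
    relative to full scale (1.0). frame_len and threshold_db are
    illustrative defaults, not tuned values.
    """
    kept = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        level_db = 20 * math.log10(rms) if rms > 0 else float("-inf")
        if level_db > threshold_db:
            kept.extend(frame)
    return kept


# Toy signal: 0.1 s silence, 0.1 s of a 440 Hz tone, 0.1 s silence (16 kHz assumed).
rate = 16000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
signal = silence + tone + silence

voiced = drop_silent_frames(signal)
print(len(signal), "samples in,", len(voiced), "samples kept")
```

A fixed threshold like this works on a clean recording but breaks down as soon as the noise floor rises, which is exactly the gap between lab results and real devices.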

The biggest problem will still be security. Since someone can simply mimic a person’s voice or play back a recording of it, the speaker recognition system has to become more complex, for example by detecting channel differences between the source and the receiver. This will require some sophisticated algorithms.

So a very good speaker recognition system needs many peripheral components, and a balance between the core system and the surrounding processing stages. All in all, it seems we still have a long way to go before we achieve a simple, accurate and secure speaker recognition system!

4 comments

timvanloock · March 31, 2015

It would be great to see this evolve, and I am confident it will. This will have great applications in the future, but for security reasons I think it is better for now to focus on accuracy. This has to be 100% for sure before I would even consider using voice recognition in security systems. Speed is also important, but I think that will be easier to improve.

mathiasder · March 31, 2015

I think this will have some useful applications. But not many people will put their money in it (for security), because there are other, better and cheaper systems on the market right now, and they don’t have big disadvantages.

Sam B. · March 31, 2015

I find it interesting how even after so much effort to understand human voices, our voices can be easily replicated by the most trivial of recording systems. This seems to be such a huge security flaw that I can hardly imagine its use as the sole defender of our possessions.

haoyuguo314 · April 1, 2015

Yes, it’s true. A human voice is so easy to mimic. I don’t think this technology can be used in situations that need high security. I remember that in a previous post you said the technology can be used for old people who are alone at home; for that kind of situation I suppose it works. Anyway, for the sake of security, people should never end up with just one method. Combining voice, fingerprint and retina would be quite helpful, I suppose.

Speaker Recognition

Master Thesis – Speaker Recognition – We tell you who you are!
