Picture it: it is just after midnight and your mother’s number is calling your phone. When you pick up, her voice on the other end of the line – it is clearly hers and, after all, the caller ID shows that she is the one calling – shouts “Let me go!”, “Help!” and other panicked phrases. A man then takes the phone and informs you that they broke into her house but, having found nothing of value, are holding her hostage. He demands that you send a sum of money immediately – before you even hang up – so that they do not kill her.
Until now, telephone scams have followed a specific pattern and, for the most part, targeted specific victims. Someone calls impersonating the police and informs the victim that a relative has been involved in a serious car accident; to keep the relative out of prison, the caller says, the victim will have to pay a substantial sum. Although most victims have been elderly, the most recent was a 13-year-old boy in Athens, Greece, who handed the scammers €2,500 in cash, jewelry and watches. If you think you are unlikely to fall for such a trick, you might want to reconsider, as fraudsters are now enlisting Artificial Intelligence to create voice clones for a horrifying new generation of scams…
Three seconds of speech. That is all an AI app needs to clone your voice – or that of a loved one. Beyond that, to pull off vishing – as voice phishing is called – the criminals need to intercept personal information and contacts, clone the victim’s SIM card (criminal practices that are already widespread) and feed the AI application the message to be spoken, in the right tone, of course. The deception has become so convincing that it can fool even the best informed. And everything is made much easier by a snippet of video and other information posted on social networks.
It is not difficult to imagine how easily your voice, or that of a loved one, can be captured: a single phone call, or a short stroll through TikTok, Facebook and the other social networks, where each user posts dozens of videos every day…
“Millions of Targets”
Although very new, this type of phone scam has already become widespread in the US and Asia and is now arriving in Europe. The British bank Starling Bank issued a statement last week warning that “millions of people could fall victim to fraud that uses AI programs to clone their voice.”
In the US, the scam became widely known when it was revealed that thousands of Democrats had received a call, generated with this technology, in which the voice of Joe Biden urged them to abstain from voting. In Korea, a case caused shock in which a businessman was convinced he was taking part in a video call with his company’s executives – who were in fact criminals – and was extorted out of $25 million.
AI voice mimics now make phone scams more believable and more dangerous than ever. Globally, around one in four people have either fallen victim to such a scam or know someone who has, according to the report “Beware the Artificial Imposter” by cybersecurity firm McAfee, with 77% of victims losing money.
With the same firm finding that 6.5 billion scam calls are made worldwide each quarter and that the average American receives 12 scam calls a month, it was only a matter of time before this criminal industry was “modernized”. “Fraud techniques will become significantly more sophisticated,” the report warns. “When combined with leaked or stolen data, fraudsters can pretend to be someone their victim knows and add personal details to increase the credibility of the scam.” Scams of this kind have grown so large that a senator in Arizona has proposed changing the law so that AI is considered a weapon when used to commit a crime, in order to increase the penalties for fraudsters.
It doesn’t take much for the gangs that run this type of fraud: an “investment” of a few hundred euros is enough to enter the world of deepfake audio. Several companies offer AI and deep-learning applications that can produce copies of any voice, of such high quality that they can fool anyone. All of them were created for other purposes, such as film productions, narrating e-books in different languages, lending famous voices to commercials, or more… imaginative uses. For example, a restaurant chain in the US uses the cloned voice of famous American football player Keith Byars to take orders, while a company in South Korea recreates for relatives the voices of loved ones who have passed away. The technology is also used for other good purposes, such as the Voice Keeper application, which builds artificial samples of the voice of a person suffering from conditions that cause loss of speech – pharyngeal cancer, Parkinson’s, ALS – so that texts can be read aloud in that voice. But it was only a matter of time before this technology was put to bad use.
The scripts used
With voice-impersonation technology, gangs now have a much wider range of scenarios at their disposal in pursuit of their ultimate goal: extracting as much money as possible from their victims.
The most popular scenario among these gangs in the US is a fake kidnapping. Families with young children are often targeted: when the children are away at activities the parents cannot easily and quickly check on, the parents are persuaded that they have been kidnapped and that a ransom must be paid immediately. In these cases the money is very often handed over through apps that offer anonymity, or even in person, as in real kidnappings. The gangs usually demand unrealistically large sums (over $1 million) and settle for much less (for example, $50,000).
In other cases, the stolen voice is used to extort money from the victims in other ways. Using the AI application, the scammers call an acquaintance or relative (usually from a hidden number), who hears their supposed loved one say they have had a car accident on a remote road: “I was injured and must be operated on immediately. I’ve lost everything – wallet, cards, cell phone – and the hospital is asking for X amount to proceed because my insurance doesn’t cover it. Please send it to their account on this app and I’ll pay you back as soon as I’m out, because I have to go into surgery right away.”
In still other cases, “professional” voices are used to convince victims to hand over large sums, and to do so immediately. Who wouldn’t panic on hearing their accountant say they must pay a large amount to the IRS to avoid seizure or jail, their lawyer say something similar, or, even more so, a trusted business associate? More rarely, gangs target company accounts, using the voice of an executive to give a verbal order to disburse a sum or to immediately pay a debt to a supposed partner.
Banking institutions are also on alert, not only to block suspicious transactions linked to such frauds, but mainly because, as deep-learning AI becomes ever more convincing at imitating the voice it is asked to clone, it grows ever more likely to be used by criminal groups to gain access to victims’ bank accounts, perfecting existing scams and bypassing some existing security safeguards. That is why, when contacting banks by phone, customers are required to confirm certain personal details, such as their tax identification number, ID number and father’s name.
In the same way, of course, voice clones can be used – as has already happened with the mass robocall purporting to come from Joe Biden – to spread fake news and to try to influence hitherto inviolable processes, such as elections. But that is a whole different category…
Of course, the usual scenarios – familiar in Greece too – are still in use, such as that of a traffic accident involving death or injury, in which the relatives of the supposed victim demand money in exchange for not pressing criminal charges against the (alleged) perpetrator, or for a supposed bail payment to release them after the accident.
There is, of course, the other side. A few months ago, UC Berkeley professor Hany Farid took part in a Zoom call with former US President Barack Obama. Obama told him that he wanted to learn more about the use of Artificial Intelligence, but there was a… small problem: for more than 10 minutes, the professor doubted that it really was the former American president on the other end of the line. “There are so many deepfakes of Obama that for a long time I kept telling myself that it probably isn’t him talking to me,” he explains. So even Obama had to somehow prove that he was… who he said he was.
The “keys” to protection
Experts say that although applications are being developed at high speed that will be able to detect scam calls before they begin (some types of phones already display caller-ID warnings about possible fraud) and to detect the use of Artificial Intelligence, “there’s not going to be anything that effective.” For this reason, they advise citizens to be careful and to take their precautions in advance, so that falling victim becomes practically impossible.
First and most important is learning to recognize scam calls. The same rules that apply to today’s scams still hold: exercise extra care, avoid revealing personal information to any caller (whoever they claim to be), and secure the personal data and information that is so often exposed on the Internet. “It is important,” they say, “not to engage in discussion when we suspect a fraud. It is better to simply end the conversation by hanging up.”
Perhaps the most useful tip, however, is to use a family password. Experts recommend that families agree in advance on one or more words or phrases that family members will use in case of danger. These “passwords” should never be revealed to third parties or written in messages exchanged on social networks, and they should be words or phrases that no one outside the family could come up with.
That way, the grandparent, mother or child will know that the voice on the other end of the line telling them “I’m being held by some bad people” or “I’m in really bad trouble and I need money right now” is not actually their loved one.