This time I will show how to use Beckhoff TwinCAT Speech (TF4500). This function lets us build communication between humans and machines through voice input and output.
It supports English as well as other languages, so it can be used at factory automation (FA) production sites in many regions.
We can use TF4500 for:
- Voice output from a PC-based controller
- Speech input and output using Microsoft SAPI
- Speech output using Amazon Polly
Microsoft SAPI works in offline applications; Amazon Polly is also an option when the controller is connected to the internet.
TwinCAT Speech consists of two main components: ASR (Automatic Speech Recognition) and TTS (Text-to-Speech).
System Requirement
Pay attention to the supported Visual Studio and HMI versions.
Sound Card Test
Before installing TF4500 on the machine, test the sound card first.
Open your Control Panel.
Click Speech Recognition.
Click Text to Speech.
Choose the voice you want to output and click Preview Voice to check whether your sound card is working.
Install
In addition to TF4500, you also need to install the TF2000 HMI Server and the .NET Runtime.
TF4500
Access the following link to download the exe file.
Please wait a minute…
Next>.
Accept the license agreement and click Next>.
Next.
Click Install.
Please wait a minute..
OK, now TF4500 is installed.
HMI Server install
Access the following link to download the exe file.
Choose English and OK.
Please wait a minute..
Next>.
Accept the license agreement and click Next>.
Enter your information and Next.
Choose Complete and Next>.
Click Install.
Please wait a minute..
Now the TF2000 is also installed.
Visual Studio download
Please use the 2017 or 2019 version.
https://my.visualstudio.com/Downloads?q=visual%20studio%202017&wt.mc_id=o~msft~vscom~older-downloads
.NET Runtime Install
You can download the Runtime from the link below or install it through Microsoft Visual Studio.
https://dotnet.microsoft.com/download
Microsoft Visual Studio Error
If you get an error message such as "VSPackageTcSpeechConfigurator package did not load correctly", check your Visual Studio version.
Concept
TwinCAT Speech uses the sound card for speech input and output. We configure it with the Speech Configurator inside the TwinCAT Engineering system and then download the configuration file to the TwinCAT Runtime.
The HMI Server is not explained in this post.
SAPI is the Speech Application Programming Interface, an API provided by Microsoft.
Support ASR
ASR (Automatic Speech Recognition) is the technology that detects speech input and converts it to text.
Support TTS
TTS (Text-to-Speech) is the technology that generates human-sounding speech output from your system.
(For example, Siri on a Mac.)
Function Block
FB_SpeechRecognition
This is the function block for speech recognition.
Recognition starts on a rising edge of bListen.
VAR_INPUT | ||
bListen | BOOL | A rising edge starts the speech recognition. |
nConfigurationId | UINT | Reference to the ASR Configuration ID |
VAR_OUTPUT | ||
bBusy | BOOL | True=Executing |
bError | BOOL | True=Error |
nErrorId | ETcsSpeechCommandExitCode | Error information |
eRecognitionEngineState | ETcsRecognitionEngineState | The current status of the speech recognition engine |
fRecognitionConfidence | REAL | The confidence of the last recognition result |
nLastCommandExitCode | UINT | The exit code of the last operation |
sRecognitionTag | STRING(255) | The tag of the last recognition |
sRecognitionRule | STRING(255) | The rule of the last recognition |
sRecognitionUtterance | STRING(4095) | The text of the last recognized utterance |
FB_TextToSpeech
This is the function block for speech output.
The speech is output on a rising edge of bSpeak, using nConfigurationId (the TTS configuration ID), nLanguageId (1033, etc.) and sUtterance (the text to speak).
VAR_INPUT | ||
bSpeak | BOOL | A rising edge starts the speech output |
sUtterance | STRING | The text to output |
nConfigurationId | UINT | Reference to the TTS Configuration ID |
nLanguageId | UINT | Default=0, which selects the default language of the TTS configuration |
VAR_OUTPUT | ||
bBusy | BOOL | True=Executing |
bError | BOOL | True=Error |
nErrorId | ETcsSpeechCommandExitCode | Error information |
nLastCommandExitCode | UINT | The exit code of the last speech output |
nPlaybackPosition | ULINT | The current position of the speech output |
nPlaybackTotal | ULINT | The total length of the speech output (ms) |
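As a rough illustration of the interface in the table above, here is a minimal sketch of a single speech output. The program name P_TtsSketch, the configuration ID 200 and the variable eLastError are assumptions made for this example; only the inputs and outputs listed in the table are used.

```iecst
PROGRAM P_TtsSketch
VAR
    fbTTS      : FB_TextToSpeech;
    bSpeak     : BOOL;                        // set TRUE once to request an output
    nConfigId  : UINT := 200;                 // assumed TTS configuration ID
    eLastError : ETcsSpeechCommandExitCode;   // copy of nErrorId after a fault
END_VAR

// Call the block every cycle; a rising edge on bSpeak starts the output.
fbTTS(
    bSpeak           := bSpeak,
    sUtterance       := '<speak>Hello world!</speak>',
    nConfigurationId := nConfigId,
    nLanguageId      := 1033);

// Keep the error code so it can be inspected online.
IF fbTTS.bError THEN
    eLastError := fbTTS.nErrorId;
END_IF

// Drop the request once the block is idle, so the next TRUE is a new edge.
IF NOT fbTTS.bBusy THEN
    bSpeak := FALSE;
END_IF
```

Because the block reacts to an edge rather than a level, bSpeak has to return to FALSE before the next sentence can be triggered.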
Example
Insert the TwinCAT Speech Configurator
File>New>Project.
Click the TwinCAT Controller>Empty TwinCAT Controller Project to create a new Controller project.
We can see that TwinCAT Controller1 is inserted in our project.
Select the TwinCAT Controller, right-click it and choose New Item.
Choose TwinCAT Speech Configurator and add it to your project.
The Speech Configurator is inserted.
ASR
Insert the Configuration
Right-click the TwinCAT Speech Configurator that we inserted before and select Start ASR Wizard.
Press the + Button to create a new Configuration.
The Device Wizard is shown.
We can test the sound input device here.
Open the Select Device drop-down list, choose the device you want to test, and press the microphone button to test it.
Press Save if there is no problem.
Insert the Service
Now we need to configure the ASR Service.
Press the + Button to add a new ASR Service.
Insert the Grammar File
Enter your ASR Service name.
And then press the + button to insert the Grammar File.
Choose “Create new Grammar File”.
Choose English as Default Language.
Press the + Button to add a new grammar entry.
Each entry has three items:
- Recognition Tag
- Recognition Text
- Recognition Group
For example, if the recognition engine detects the Recognition Text "Good Morning" in the input, the result is reported with the Recognition Tag "Say_Good_Morning" in the group "OpenCommand".
*The group setting is optional.
Press Save to save your configuration.
Press the Finish button to complete the ASR configuration.
TTS
Insert the Configuration
Right-click TTS and select Start TTS Wizard.
Press the + Button to configure the speech output device.
The popup below is displayed. It works the same way as for ASR, so I will not explain it again.
Press Next to Finish the Configuration.
Insert the Service
Press the + Button to start the TTS Service Configuration.
The below screen is displayed.
We can choose to use TTS Synthesis Service or Amazon Polly.
SAPI is configured in this tutorial.
Enter the Service name and Save it.
Choose English in the Language Selection.
We can choose David or Zira as the output voice.
Finally we press the + Button to insert it.
Press Save to save your configuration.
Storage
Storage is not covered in this tutorial. Press Finish to save the configuration.
ASR Configuration Id
We will use this Id in the program, so please note it down.
TTS Configuration Id
Note this Id in the same way as the ASR Configuration Id.
Activate Configuration
Finally open the General Tab and Activate the Configuration.
Press OK to confirm overwriting the configuration.
Press OK to Finish the Configuration.
Insert the TwinCAT Project
Insert your TwinCAT Project.
Add library
Error
PROGRAM
This is a program that I modified from the Beckhoff samples.
It checks which command was recognized and produces a different speech output for each one.
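MAIN_ASR and MAIN_TTS exchange data through a global variable list referenced as gvl, whose declaration is not listed in this post. A minimal sketch of what it could look like is shown here; the variable names are taken from the program code, but the types and the qualified_only attribute are assumptions, so compare them with the downloadable project.

```iecst
// Assumed declaration of the global variable list "GVL"
{attribute 'qualified_only'}
VAR_GLOBAL
    bListen : BOOL;          // TRUE requests a new speech recognition (read by MAIN_ASR)
    bSpeak  : BOOL;          // TRUE requests a speech output (read by MAIN_TTS)
    cmd     : STRING(255);   // last recognition tag, same length as sRecognitionTag
END_VAR
```

With this list, setting gvl.bListen to TRUE (for example from the online view) starts one recognition, and MAIN_ASR hands the recognized tag to MAIN_TTS through gvl.cmd.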
MAIN_ASR
Make sure nConfigIdASR matches the ASR Configuration Id of your application.
VAR
    VAR
        // Start speech recognition by setting to true
        bListen      : BOOL := FALSE;
        bRecognition : BOOL;
        // ASR Configuration
        nConfigIdASR : UINT := 100;
        fbASR        : FB_SpeechRecognition := (nConfigurationId := nConfigIdASR);
        // ASR Variables
        nLastRecoId  : ULINT := 0;
        timer        : TON;
    END_VAR
Code
The result is only used when fRecognitionConfidence is higher than 0.7.
    // Set bListen to true, to start speech recognition
    fbASR(bListen := gvl.bListen, nConfigurationId := nConfigIdASR);

    // If new recognition is available and recognition confidence is high enough (over 70%) set bRecognition to true
    IF nLastRecoId <> fbASR.nRecognitionId THEN
        nLastRecoId := fbASR.nRecognitionId;
        IF fbASR.fRecognitionConfidence > 0.7 THEN
            bRecognition := TRUE;
        END_IF
    END_IF

    // Keep bRecognition true for just a second
    IF bRecognition THEN
        gvl.bSpeak := TRUE;
        gvl.bListen := FALSE;
        gvl.cmd := fbASR.sRecognitionTag;
        timer(IN := TRUE, PT := T#1S);
        IF timer.Q THEN
            timer(IN := FALSE);
            bRecognition := FALSE;
        END_IF
    END_IF
MAIN_TTS
Make sure nLanguageId matches the language configured for your application.
VAR
    VAR
        // TTS Variables
        bSpeak       : BOOL := FALSE;
        {attribute 'TcEncoding' := 'UTF-8'}
        sText2Speech : STRING(4095) := '<speak>Hello world!</speak>';
        // TTS Configuration
        nConfigIdTTS : UINT := 200;
        nLanguageId  : UINT := 1033;
        fbTTS        : FB_TextToSpeech := (nConfigurationId := nConfigIdTTS, nLanguageId := nLanguageId);
    END_VAR
Code
Different speech is output depending on gvl.Cmd.
    IF gvl.Cmd = 'Say_Good_Morning' THEN
        sText2Speech := '<speak>Good morning Sir. I am your TwinCAT system help. What can i Help you?</speak>';
    ELSIF gvl.Cmd = 'Motor1_on' THEN
        sText2Speech := '<speak>OK. I get your order and turn on the motor 1 after 5 seconds.</speak>';
    END_IF

    fbTTS(sUtterance := sText2Speech, bSpeak := gvl.bSpeak, nConfigurationId := nConfigIdTTS);

    IF NOT fbTTS.bBusy THEN
        fbTTS(sUtterance := sText2Speech, bSpeak := FALSE, nConfigurationId := nConfigIdTTS);
        bSpeak := FALSE;
        gvl.bSpeak := FALSE;
    END_IF
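In the code above, sText2Speech keeps its previous value when gvl.Cmd holds a tag that is not listed, so the last sentence would be repeated. One possible way to handle this, sketched here as an assumption rather than as part of the original sample, is to move the mapping into a small function with a fallback branch. The function name F_ResponseForTag and the shortened response texts are hypothetical.

```iecst
// Hypothetical helper: returns the text to speak for a recognition tag.
FUNCTION F_ResponseForTag : STRING(255)
VAR_INPUT
    sTag : STRING(255);   // recognition tag, e.g. the value of gvl.Cmd
END_VAR

IF sTag = 'Say_Good_Morning' THEN
    F_ResponseForTag := '<speak>Good morning.</speak>';
ELSIF sTag = 'Motor1_on' THEN
    F_ResponseForTag := '<speak>Motor 1 will be turned on after 5 seconds.</speak>';
ELSE
    // Fallback for tags that have no dedicated response
    F_ResponseForTag := '<speak>Sorry, I did not understand.</speak>';
END_IF
```

MAIN_TTS could then assign sText2Speech := F_ResponseForTag(gvl.Cmd); before calling fbTTS, which keeps the response texts in one place.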
MAIN
Code
    MAIN_ASR();
    MAIN_TTS();
Result
Here is the sample:
You can download the code from this link:
https://github.com/soup01Threes/TwinCAT3/blob/main/Project_TF4500_1.tnzip