How to implement an Alexa Skill for SAP HR

fechterf · ‎05-30-2017

Hi everybody,

During my internship at T.CON (a German SAP consulting company), I built a skill for Amazon's voice service Alexa that is integrated with SAP HR via the SAP Cloud Platform. With this skill, Alexa can answer questions about, for example, the leave you have taken or the leave taken by your team members, and it can create leave requests.

T.CON had already created an SAP Cloud Platform solution called "T.CON HR Portal", which provides a simple and intuitive interface for the most important HR processes. This solution also provides RESTful web services. To keep things simple, the backend of the Alexa skill reuses those services and runs in the same instance.

In the end, the user can book a holiday while sitting on the couch, as you can see in the following video (language is German):

In this blog post I want to give you a short technical overview of the necessary steps and some of the problems I ran into:

Overview of the architecture and communication flow

After the Amazon Echo recognizes the wake word (e.g. "Alexa"), it sends the audio data to the Alexa service. The Alexa service converts the audio into text with an "Automatic Speech Recognition" (ASR) engine and passes the text to the "Natural Language Understanding" (NLU) module. This module derives the user's intent from the text and also looks for values/variables such as a date. For this to work, Alexa needs some information from the developer: the possible intents and their values, which are defined in the intent schema with slots, and some sample utterances for each intent, i.e. things the user might say. Alexa passes the result (the intent and the slots) to the backend built by the developer. The backend returns the answer for the user, which consists of a string and, if needed, some so-called SSML tags (Speech Synthesis Markup Language), such as a break, to make it sound better. Alexa then generates the voice output with the "Text to Speech" (TTS) engine and forwards it to the Echo device.
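To make the SSML part of this flow more concrete, here is a minimal sketch of how a backend could assemble an answer with a pause between two sentences. The class and method names (SsmlText, escape, withBreak) are made up for illustration and are not part of the Alexa Skills Kit; only the <speak> and <break> tags come from SSML itself.

```java
// Minimal sketch: builds an SSML string with a pause between two sentences.
// SsmlText, escape and withBreak are illustrative names, not Skills Kit API.
final class SsmlText {

    private SsmlText() {
    }

    // Escapes the characters that are reserved in XML so the text cannot break the markup.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Wraps two sentences in <speak> and separates them with a break of the given length.
    static String withBreak(String first, String second, int pauseMillis) {
        return "<speak>" + escape(first)
                + " <break time=\"" + pauseMillis + "ms\"/> "
                + escape(second) + "</speak>";
    }
}
```

With the Java Skills Kit, a string like this would typically be handed to an SsmlOutputSpeech instead of a PlainTextOutputSpeech.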

Entry in the Alexa skill development view.

First I had to choose the [first] language of the skill.
Then I entered a name (the title for the skill catalogue) and an invocation name (for calling the skill).

The interaction model page is the most interesting one. With the intent schema I defined the structure of the dialog. I created a custom slot type to catch statements that I couldn't assign to an intent. The last step on this page was to specify some sample statements/utterances.

Since my backend runs on the SAP CP, I had to choose an HTTPS endpoint. I also use account linking to map the Amazon account to an HR Portal account.
The information needed for account linking comes from the SAP CP:

First of all, you have to create an authentication client in your CP instance for your application. You can find the redirect link on the configuration page of your skill in the Amazon developer view.

The links for your OAuth endpoints are defined in the OAuth -> Branding view.

You also need a scope (you can use scopes to define permissions in your backend code).

On the SSL certificate view, I had to use the second certificate option, because an SAP CP application/backend isn't directly certified.

Now your Alexa skill is able to communicate with your backend.

Background and the skill kit (Java)(Eclipse)

Prepare the environment.

Open Eclipse. I use Eclipse Neon.2 (4.6.2).
Next, create a new Dynamic Web Project (don't forget to change the JRE System Library and add the SAP Web Profile).
For the skill you also need the following libraries (put them into the WebContent\WEB-INF\lib folder):

Alexa-skill-kit-X.X.jar, commons-codec-X.X.jar, commons-io-X.X.jar, commons-lang3-X.X.jar, jackson-annotations-….jar, jackson-core-….jar, jackson-databind-….jar

They should appear in Libraries/Web App Libraries.

Important classes of the Alexa Skills Kit:

SpeechletServlet:

This class is handled like a servlet. It calls the request validator and the content parser, so the Speechlet object only receives valid Java objects. A SpeechletServlet object also has to be given the Speechlet object that should receive the intent.

Speechlet (V2 is the new version):

You need a class that implements the Speechlet interface to react to intents. That class has to implement the following methods:

onIntent(SpeechletRequestEnvelope<IntentRequest>)
This method is called if you ask your skill something like "ask Skill can you help me?", "[open Skill] how much leave do I have?" or "ask Skill to request holiday for next week".
The IntentRequest contains the information about the intent/question, for example the name of the intent and the slots/values. It also includes information about the request itself: the request id and the user's language.
The Session attribute contains the session information, such as the user with his token, skill-defined session variables and the session id.

onLaunch(SpeechletRequestEnvelope<LaunchRequest>)
If the user opens the skill with "open/ask/start Skill", the onLaunch method is called.

onSessionEnded(SpeechletRequestEnvelope<SessionEndedRequest>)
This method is called when Alexa closes a session.

onSessionStarted(SpeechletRequestEnvelope<SessionStartedRequest>)
This method is called when Alexa starts a session.
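The core of an onIntent implementation is usually a decision based on the intent name and the slot values. The following is only a sketch of that routing, kept free of the Skills Kit classes so it can be run on its own. The intent names GetRemainingLeaveIntent and CreateLeaveRequestIntent and the slot name StartDate are hypothetical (AMAZON.HelpIntent is one of Amazon's built-in intents); in the real skill the name and the slots would be read from the IntentRequest.

```java
import java.util.Map;

// Minimal sketch of routing on the intent name.
// IntentRouter, the custom intent names and the slot name are hypothetical.
final class IntentRouter {

    private IntentRouter() {
    }

    // Returns the text Alexa should speak for the given intent and slot values.
    static String answerFor(String intentName, Map<String, String> slots) {
        switch (intentName) {
            case "GetRemainingLeaveIntent":
                return "Let me look up your remaining leave.";
            case "CreateLeaveRequestIntent":
                String start = slots.get("StartDate");
                if (start == null || start.isEmpty()) {
                    // The slot is missing, so ask the user for it.
                    return "From which day would you like to take leave?";
                }
                return "I will request leave starting on " + start + ".";
            case "AMAZON.HelpIntent":
                return "You can ask how much leave you have left or request a holiday.";
            default:
                // Unknown intents get a generic fallback answer.
                return "Sorry, I did not understand that.";
        }
    }
}
```

The returned string would then be wrapped in an OutputSpeech object and put into the SpeechletResponse described in the next section.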

SpeechletResponse

The SpeechletResponse contains your answer. It consists of the following attributes:

OutputSpeech
This attribute contains the text that the Amazon Echo reads to the user.
PlainTextOutputSpeech outputSpeech = new PlainTextOutputSpeech();
outputSpeech.setText("Welcome to my first skill.");
response.setOutputSpeech(outputSpeech);
Card
If you want to send a persistent visual answer (shown in the Alexa app), use a card.
SimpleCard card = new SimpleCard();
card.setTitle("Welcome");
card.setContent("Welcome to my first skill.");
response.setCard(card);
Directives
If you want to request something from the user, you can use this attribute to send a predefined request.
List<Directive> directives = new LinkedList<Directive>();
directives.add(new ConfirmIntentDirective());
response.setDirectives(directives);
Reprompt
If the user hasn't said anything for a while, you can use this attribute to help him.
Reprompt reprompt = new Reprompt();
PlainTextOutputSpeech repromptSpeech = new PlainTextOutputSpeech();
repromptSpeech.setText("If you need help say, help me.");
reprompt.setOutputSpeech(repromptSpeech);
response.setReprompt(reprompt);
ShouldEndSession
Use this attribute to tell Alexa to close the session.
response.setShouldEndSession(true);

First create a Servlet which extends the SpeechletServlet.

@WebServlet("/Skill")
public class SkillServlet extends SpeechletServlet {}

If you want to get some information from the request before sending the response, override the doPost method of SpeechletServlet.
But don't forget to call the doPost method of the superclass.

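@Override
protected void doPost(HttpServletRequest arg0, HttpServletResponse arg1) throws IOException {
    // Log who called the skill before the request is processed
    SKILL_LOGGER.info(String.format("Received a message from %s -- IP: %s", arg0.getRemoteHost(), arg0.getRemoteAddr()));
    // Let the SpeechletServlet validate the request and call the Speechlet
    super.doPost(arg0, arg1);
}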

After authentication, the SAP CP writes the username into the request, so I had to read and save it in the doPost method.

User user = new User(request.getRemoteUser(),getToken(request));

Next create a class which implements the SpeechletV2 interface.

public class SkillSpeechlet2 implements SpeechletV2 {
      //...
}

Then implement the methods with the code you want to execute.

Connect to the RESTful web services

The SAP SDK provides functions to connect to a web service on the same cloud instance (only in the Java EE 6 Web Profile SDK).

//You define the destination name in the Cloud manager
public static HttpClient getHttpClientFor(String destination) throws NamingException, DestinationNotFoundException, DestinationException {
      //The server context defined by the SAP Cloud
      Context ctx = new InitialContext();
      DestinationFactory destinationFactory = (DestinationFactory) ctx.lookup(DestinationFactory.JNDI_NAME);
      HttpDestination httpDest = (HttpDestination) destinationFactory.getDestination(destination);
      //With this client you can request information without setting the Authorization header
      return httpDest.createHttpClient();
}

Problems with sending the OAuth token / token not found in the header

After I activated account linking (OAuth), Alexa wasn't able to connect to my backend. After many hours of checking the configuration, I activated the com.sap.cloud.security.oauth2 logger, which reported that it couldn't find a token. So I searched for a solution and found somebody who had the same problem (https://forums.developer.amazon.com/questions/46982/can-alexa-return-the-access-token-in-the-header-...).

He found out that Alexa doesn't send the token in the header (as the standard would require) but in the body. The good news was that I hadn't caused the problem, but how could I solve it? I decided to create a gateway without authentication that reads the token from the body, writes it into the header and forwards the request to the protected skill. I know it is a workaround, but it has the benefit that I can log the request from Alexa and the response from my skill if a communication error occurs.

For that workaround I created a second servlet which forwards the request (with the Authorization header) to the SpeechletServlet and replies with the response from that servlet. Finally, I had to change the web.xml so that only the SpeechletServlet is in the authenticated zone.
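The central step of such a gateway is to pull the access token out of the JSON body and turn it into a header value. The following is a minimal sketch of that step only, not the original servlet. It assumes the token appears in the body as an "accessToken" field (as in the user object of an Alexa request) and that the protected servlet expects the usual "Bearer <token>" form. The class and method names are made up, and a regular expression is used only to keep the example free of dependencies; a JSON parser such as Jackson, which is already among the libraries listed above, would be the more robust choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch of the gateway's core step.
// AccessTokenExtractor and its methods are illustrative names.
final class AccessTokenExtractor {

    // Matches "accessToken": "<value>" anywhere in the body.
    private static final Pattern ACCESS_TOKEN =
            Pattern.compile("\"accessToken\"\\s*:\\s*\"([^\"]+)\"");

    private AccessTokenExtractor() {
    }

    // Returns the first access token found in the request body, if any.
    static Optional<String> findToken(String requestBody) {
        Matcher matcher = ACCESS_TOKEN.matcher(requestBody);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    // Builds the value for the Authorization header of the forwarded request.
    static Optional<String> authorizationHeader(String requestBody) {
        return findToken(requestBody).map(token -> "Bearer " + token);
    }
}
```

In the gateway servlet, the request body would be read once, passed to authorizationHeader, and then forwarded unchanged together with the new header. If no token is found (the user has not linked his account yet), the gateway can answer directly instead of forwarding.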

Tutorials for Alexa Skills

https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/getting-started-guide

https://speakerdeck.com/wolfpaulus/building-alexa-skills-with-the-java-alexaskillskit-sdk

I hope this little description helps you with your first Alexa Skill on the SAP CP - have fun with Alexa!

Cheers, Florian