[incubator-nlpcraft-website] branch master updated: WIP.

aradzinski Sun, 04 Apr 2021 19:37:03 -0700

This is an automated email from the ASF dual-hosted git repository.

aradzinski pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git



The following commit(s) were added to refs/heads/master by this push:
     new 8fe9a31  WIP.
8fe9a31 is described below

commit 8fe9a31af0f5c0475ef97b73dae186ce3e6498d6
Author: Aaron Radzinski <[email protected]>
AuthorDate: Sun Apr 4 19:36:42 2021 -0700

    WIP.
---
 data-model.html      |   2 +-
 intent-matching.html | 298 +++++++++++++++++++++++++++++++++++----------------
 2 files changed, 204 insertions(+), 96 deletions(-)

diff --git a/data-model.html b/data-model.html
index 8576a56..c1a6f3b 100644
--- a/data-model.html
+++ b/data-model.html
@@ -531,7 +531,7 @@ intents:
             </li>
             <li>
                 Model elements can have many-to-many group memberships.
-            </li>
+            
</li>(UNI_CHAR|UNDERSCORE|LETTER|DOLLAR)+(UNI_CHAR|DOLLAR|LETTER|[0-9]|COLON|MINUS|UNDERSCORE)*
             <li>
                 Model elements can form a hierarchical structure.
             </li>
diff --git a/intent-matching.html b/intent-matching.html
index 482f59f..a153af6 100644
--- a/intent-matching.html
+++ b/intent-matching.html
@@ -57,80 +57,7 @@ id: intent_matching
     <section>
         <h2 id="idl" class="section-title">IDL - Intent Definition 
Language</h2>
         <p>
-            NLPCraft intents are written in Intent Definition Language (IDL). 
IDL declarations can be placed in
-            different locations based on user preferences:
-        </p>
-        <ul>
-            <li>
-                <p>
-                    <a target="javadoc" 
href="/apis/latest/org/apache/nlpcraft/model/NCIntent.html">@NCIntent</a> 
annotation
-                    takes a string as its parameter that should be a valid IDL 
declaration. For example, Scala code snippet:
-                </p>
-                <pre class="brush: scala, highlight: [1, 2]">
-                &#64;NCIntent("import('/opt/myproj/global_fragments.idl')")
-                &#64;NCIntent("intent=act term(act)={has(tok_groups(), 'act')} 
fragment(f1)")
-                def onMatch(
-                    &#64;NCIntentTerm("act") actTok: NCToken,
-                    &#64;NCIntentTerm("loc") locToks: List[NCToken]
-                ): NCResult = {
-                    ...
-                }
-            </pre>
-
-            </li>
-            <li>
-                <p>
-                    External JSON/YAML <a href="/data-model.html#config">data 
model configuration</a> can provide one or more
-                    IDL declarations in <code>intents</code> field. For 
example:
-                </p>
-                <pre class="brush: js, highlight: [7]">
-                {
-                    "id": "nlpcraft.alarm.ex",
-                    "name": "Alarm Example Model",
-                    .
-                    .
-                    .
-                    "intents": [
-                        "import('/opt/myproj/global_fragments.idl')",
-                        "import('/opt/myproj/my_intents.idl')",
-                        "intent=alarm term~{tok_id()=='x:alarm'}"
-                    ]
-                }
-            </pre>
-            </li>
-            <li>
-                External <code>*.idl</code> files contain IDL declarations and 
can be imported in any other places where
-                IDL declarations are allowed. See <code>import()</code> 
statement explanation below. For example:
-                <pre class="brush: idl">
-                    /*
-                     * File 'my_intents.idl'.
-                     * ======================
-                     */
-
-                    import('/opt/globals.idl') // Import global intents and 
fragments.
-
-                    // Fragments.
-                    // ----------
-                    fragment=buzz term~{tok_id() == 'x:alarm'}
-                    fragment=when
-                        term(nums)~{
-                            // Term variables.
-                            @type = meta_tok('nlpcraft:num:unittype')
-                            @iseq = meta_tok('nlpcraft:num:isequalcondition')
-
-                            tok_id() == 'nlpcraft:num' && @type != 'datetime' 
&& @iseq == true
-                        }[0,7]
-
-                    // Intents.
-                    // --------
-                    intent=alarm
-                        fragment(buzz)
-                        fragment(when)
-                </pre>
-            </li>
-        </ul>
-        <h2 id="ild_grammar" class="section-sub-title">IDL Grammar</h2>
-        <p>
+            NLPCraft intents are written in Intent Definition Language (IDL).
             IDL is a relatively straightforward and simple language. You can 
review the formal
             <a target="github" href="https://github.com/apache/incubator-nlpcraft/blob/master/nlpcraft/src/main/scala/org/apache/nlpcraft/model/intent/compiler/antlr4/NCIdl.g4">ANTLR4 grammar</a> for IDL. Here
             are the common properties of IDL:
@@ -162,24 +89,27 @@ id: intent_matching
                 Numeric literals use Java string conversions.
             </li>
             <li>
-                Only 10 reserved keywords: <code>flow fragment import intent 
meta ordered term true false null</code>
+                IDL has only 10 reserved keywords: <code>flow fragment import 
intent meta ordered term true false null</code>
             </li>
             <li>
                 Identifiers and literals can use the same Unicode space as 
Java.
             </li>
             <li>
-                IDL provides over 50 built-in functions to aid in intent 
matching. IDL functions are pure immutable mathematical functions
+                IDL provides over 50 <a href="#idl_functions">built-in 
functions</a> to aid in intent matching. IDL functions are pure immutable 
mathematical functions
                 that work on a runtime stack. In other words, they look like 
Python functions: IDL <code>length(trim(" text "))</code> vs.
                 OOP-style <code>" text ".trim().length()</code>.
             </li>
         </ul>
         <p>
-            IDL script consists of one or more statements, where each 
statement can be one of the following three types:
+            IDL program consists of one or more statements of type 
<code>import</code>, <code>intent</code>, or <code>fragment</code>:
         </p>
         <ul>
             <li>
                 <p>
-                    <b>Import</b>
+                    <b><code>import</code> statement</b>
+                </p>
+                <p>
+                    The import statement lets you import IDL declarations from 
a local file, a classpath resource, or a URL:
                 </p>
                 <pre class="brush: idl">
                     // Import using absolute path.
@@ -192,17 +122,35 @@ id: intent_matching
                     import('ftp://user:password@myhost:22/opt/globals.idl')
                 </pre>
                 <p>
-                    Import statement allows to import IDL declarations from 
either local file, classpath resource or URL.
-                    The effect of importing is the same as if the imported 
declarations were inserted in place of import
-                    statement. Recursive and cyclic imports are detected and 
ignored. Import statement starts
-                    with <code>import</code> keyword and has a string 
parameter that indicates
-                    the location of the resource to import. Note that for the 
classpath resource you don't need to specify
-                    leading forward slash.
+                    <b>NOTES:</b>
                 </p>
+                <ul>
+                    <li>
+                        The effect of importing is the same as if the imported 
declarations were inserted in place of import
+                        statement.
+                    </li>
+                    <li>
+                        Recursive and cyclic imports are detected and safely 
ignored.
+                    </li>
+                    <li>
+                        Import statement starts with <code>import</code> 
keyword and has a string parameter that indicates
+                        the location of the resource to import.
+                    </li>
+                    <li>
+                        For the classpath resource you don't need to specify 
leading forward slash.
+                    </li>
+                </ul>
+                <p></p>
             </li>
             <li>
                 <p>
-                    <b>Intent Definition</b>
+                    <b><code>intent</code> statement</b>
+                </p>
+                <p>
+                    Intent is defined as one or more terms. Each term is a 
predicate over an instance of
+                    <a target="javadoc" 
href="/apis/latest/org/apache/nlpcraft/model/NCToken.html">NCToken</a> 
interface.
+                    For an intent to match all of its terms have to evaluate 
to true.
+                    Intent definition statement can be informally explained 
using the following example:
                 </p>
                 <pre class="brush: idl">
                     intent=xa
@@ -220,24 +168,184 @@ id: intent_matching
                         flow=/#flowModelMethod/
                         ordered=true
                         term(a)=/org.mypackage.MyClass#termMethod/?
-                        term(b)~{
-                            @x = 2
-                            @xx = ((@x * @x) / 2) * 3
-
-                            @xx == 6 && has(
-                                json(meta_req('user_json_payload')),
-                                list("موسكو\"", 'v1\'v1', "k2", "v2")
-                            )
-                        }
+                        fragment(frag1)
                 </pre>
+                <dl>
+                    <dt>
+                        <code>intent=xa</code> <sup><small>line 
1</small></sup><br/>
+                        <code>intent=xb</code> <sup><small>line 
12</small></sup>
+                    </dt>
+                    <dd>
+                        Mandatory intent IDs. Intent ID is any arbitrary 
unique string matching the following lexer template:
+                        
<code>(UNI_CHAR|UNDERSCORE|LETTER|DOLLAR)+(UNI_CHAR|DOLLAR|LETTER|[0-9]|COLON|MINUS|UNDERSCORE)*</code>
+                    </dd>
+                    <dt><code>ordered=true</code> <sup><small>line 
14</small></sup></dt>
+                    <dd>
+                        <em>Optional.</em>
+                        Whether or not this intent is ordered. Default is 
<code>false</code>.
+                        For ordered intent the specified order of terms is 
important for matching this intent.
+                        If intent is unordered its terms can be found in any 
order in the input text.
+                        Note that ordered intent significantly limits the user 
input it can match. In most cases
+                        the ordered intent is only applicable to processing of 
a formal grammar (like a programming language)
+                        and mostly unsuitable for the natural language 
processing.
+                    </dd>
+                    <dt>
+                        <code>flow="^(?:xx)(^:zz)*$"</code> <sup><small>line 
2</small></sup><br/>
+                        <code>flow=/#flowModelMethod/</code> <sup><small>line 
13</small></sup>
+                    </dt>
+                    <dd>
+                        <p>
+                            <em>Optional.</em> Dialog flow is a history of 
previously matched intents to match on. If provided,
+                            the intent will first match on the history of the 
previously matched intents before processing its
+                            terms. There are two ways to define a match on the 
dialog flow:
+                        </p>
+                        <ul>
+                            <li>
+                                <p><b>Regular Expression</b></p>
+                                <p>
+                                    Dialog flow specification is a standard <a target=_blank href="https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Pattern.html">Java regular expression</a>.
+                                    The history of previously matched intents 
is presented as a space separated string of intent IDs that were
+                                    selected as the best match during the 
current conversation, in the chronological order with the most
+                                    recent matched intent ID being the first 
element in the string. Dialog flow regular expression
+                                    will be matched against that string 
representing intent IDs.
+                                </p>
+                                <p>
+                                    In this example, the 
<code>^(?:xx)(^:zz)*$</code> dialog flow regular expression defines that intent
+                                    should only match when the immediate 
previous intent was <code>xx</code> and no <code>zz</code> intents
+                                    are in the history. If history is 
<code>"xx yy yy"</code> - this intent will match. However, for
+                                    <code>"xx zz"</code> or <code>"yy 
xx"</code> history this dialog flow will not match.
+                                </p>
+                            </li>
+                            <li>
+                                <p><b>User-Defined Callback</b></p>
+                            </li>
+                        </ul>
+                        <p>
+                            Note that if dialog flow is defined and it doesn't 
match the history the terms of the intent won't be tested at all.
+                        </p>
+                    </dd>
+                    <dt>
+                        <code>term(term1)={group @@ 'my_group'}?</code><br>
+                        <code>term(term2)~{trim(partId.partAlias.id) == 
'token1:id'}[1,3]</code>
+                    </dt>
+                    <dd>
+                        <p>
+                            Term, also known as a slot, is a building block of 
the intent. Term has optional ID, predicate and quantifiers.
+                            It can support conversation context if it uses 
<code>'~'</code> symbol or not if it uses <code>'='</code>
+                            symbol in its definition. For conversational term 
the system will search for a match using tokens from
+                            the current request as well as the tokens from 
conversation STM (short-term-memory). For a non-conversational
+                            term - only tokens from the current request will 
be considered.
+                        </p>
+                        <p>
+                            A term represents one or more tokens, sequential 
or not, detected in the user input. Intent has a list of terms
+                            (always at least one) that all have to be found in 
the user input for the intent to match. Note that term
+                            can be optional if its min quantifier is zero. 
Whether or not the order of the terms is important
+                            for matching is governed by 
<code>ordered=true</code> parameter.
+                        </p>
+                        <p>
+                            Term ID (<code>term1</code> and 
<code>term2</code>) is optional, and when provided, is used by 
<code>@NCIntentTerm</code>
+                            annotation to link term's tokens to a formal 
parameter of the callback method. Note that term ID follows
+                            the same lexical rules as intent ID.
+                        </p>
+                        <p>
+                            Inside of curly brackets <code>{</code> 
<code>}</code> is a <a href="data-model.html#dsl">token DSL</a>
+                            expression: <code>group @@ 'my_group'</code> and 
<code>trim(partId.partAlias.id) == 'token1:id'</code>.
+                            Note that exactly the same syntax is used for 
token DSL as well as for intent DSL for defining a token predicate.
+                            Consult <a href="data-model.html#dsl">token 
DSL</a> documentation for details on its syntax.
+                        </p>
+                        <p>
+                            <code>?</code> and <code>[1,3]</code> define an 
inclusive quantifier for that term (how many times this term should appear
+                            for it to be considered found). You can also use 
the following standard abbreviations:
+                        </p>
+                        <ul>
+                            <li><code>*</code> is equal to 
<code>[0,∞]</code></li>
+                            <li><code>+</code> is equal to 
<code>[1,∞]</code></li>
+                            <li><code>?</code> is equal to 
<code>[0,1]</code></li>
+                            <li>No quantifier defaults to 
<code>[1,1]</code></li>
+                        </ul>
+                    </dd>
+                </dl>
             </li>
             <li>
                 <p>
-                    <b>Fragment Definition</b>
+                    <b><code>fragment</code> statement</b>
                 </p>
             </li>
         </ul>
         <h2 id="idl_functions" class="section-sub-title">IDL Functions</h2>
+        <h2 id="idl_location" class="section-sub-title">IDL Location</h2>
+        <p>
+            IDL declarations can be placed in different locations based on 
user preferences:
+        </p>
+        <ul>
+            <li>
+                <p>
+                    <a target="javadoc" 
href="/apis/latest/org/apache/nlpcraft/model/NCIntent.html">@NCIntent</a> 
annotation
+                    takes a string as its parameter that should be a valid IDL 
declaration. For example, Scala code snippet:
+                </p>
+                <pre class="brush: scala, highlight: [1, 2]">
+                &#64;NCIntent("import('/opt/myproj/global_fragments.idl')")
+                &#64;NCIntent("intent=act term(act)={has(tok_groups(), 'act')} 
fragment(f1)")
+                def onMatch(
+                    &#64;NCIntentTerm("act") actTok: NCToken,
+                    &#64;NCIntentTerm("loc") locToks: List[NCToken]
+                ): NCResult = {
+                    ...
+                }
+            </pre>
+
+            </li>
+            <li>
+                <p>
+                    External JSON/YAML <a href="/data-model.html#config">data 
model configuration</a> can provide one or more
+                    IDL declarations in <code>intents</code> field. For 
example:
+                </p>
+                <pre class="brush: js, highlight: [7]">
+                {
+                    "id": "nlpcraft.alarm.ex",
+                    "name": "Alarm Example Model",
+                    .
+                    .
+                    .
+                    "intents": [
+                        "import('/opt/myproj/global_fragments.idl')",
+                        "import('/opt/myproj/my_intents.idl')",
+                        "intent=alarm term~{tok_id()=='x:alarm'}"
+                    ]
+                }
+            </pre>
+            </li>
+            <li>
+                External <code>*.idl</code> files contain IDL declarations and 
can be imported in any other places where
+                IDL declarations are allowed. See <code>import()</code> 
statement explanation below. For example:
+                <pre class="brush: idl">
+                    /*
+                     * File 'my_intents.idl'.
+                     * ======================
+                     */
+
+                    import('/opt/globals.idl') // Import global intents and 
fragments.
+
+                    // Fragments.
+                    // ----------
+                    fragment=buzz term~{tok_id() == 'x:alarm'}
+                    fragment=when
+                        term(nums)~{
+                            // Term variables.
+                            @type = meta_tok('nlpcraft:num:unittype')
+                            @iseq = meta_tok('nlpcraft:num:isequalcondition')
+
+                            tok_id() == 'nlpcraft:num' && @type != 'datetime' 
&& @iseq == true
+                        }[0,7]
+
+                    // Intents.
+                    // --------
+                    intent=alarm
+                        fragment(buzz)
+                        fragment(when)
+                </pre>
+            </li>
+        </ul>
         <h2 id="idl_syntax_highlight" class="section-sub-title">IDL Syntax 
Highlighting</h2>
     </section>
     <section id="annotations">
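
The dialog-flow text added to intent-matching.html in this commit describes
matching a Java regular expression against a space-separated string of
previously matched intent IDs, most recent first. A minimal sketch of that
behavior is below. It is only an illustration under stated assumptions: the
class name, the helper method, and the pattern itself are hypothetical and are
not taken from NLPCraft. The pattern here is a negative-lookahead formulation
of "the most recent intent is xx and zz never appears in the history", not the
expression quoted in the diff.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of NLPCraft.
class DialogFlowSketch {
    // Assumed pattern: history starts with "xx", and no later ID is exactly "zz".
    static final Pattern FLOW = Pattern.compile("^xx(?: (?!zz(?: |$))\\S+)*$");

    // History is ordered with the most recently matched intent ID first.
    static boolean flowMatches(List<String> history) {
        String joined = String.join(" ", history);
        return FLOW.matcher(joined).matches();
    }

    public static void main(String[] args) {
        System.out.println(flowMatches(List.of("xx", "yy", "yy"))); // true
        System.out.println(flowMatches(List.of("xx", "zz")));       // false
        System.out.println(flowMatches(List.of("yy", "xx")));       // false
    }
}
```

The three calls mirror the histories discussed in the added documentation:
"xx yy yy" is accepted, while "xx zz" and "yy xx" are rejected.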

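The term quantifier rules added in this commit (the abbreviations *, + and ?,
the explicit inclusive [min,max] form, and the [1,1] default) can be sketched
as a small range check. This is an illustration only: the class and method
names are hypothetical, and Integer.MAX_VALUE is an assumed stand-in for the
unbounded upper limit. It does not reflect how the NLPCraft IDL compiler
actually parses quantifiers.

```java
// Hypothetical helper, not part of NLPCraft.
class TermQuantifierSketch {
    // Assumed representation of the unbounded upper limit.
    static final int UNBOUNDED = Integer.MAX_VALUE;

    // Returns {min, max}, both inclusive, for a quantifier string.
    static int[] parse(String quantifier) {
        // No quantifier defaults to [1,1].
        if (quantifier == null || quantifier.isEmpty()) return new int[] {1, 1};
        switch (quantifier) {
            case "*": return new int[] {0, UNBOUNDED};
            case "+": return new int[] {1, UNBOUNDED};
            case "?": return new int[] {0, 1};
            default: break;
        }
        // Explicit form such as "[1,3]".
        if (quantifier.startsWith("[") && quantifier.endsWith("]")) {
            String[] parts = quantifier.substring(1, quantifier.length() - 1).split(",");
            if (parts.length == 2) {
                int min = Integer.parseInt(parts[0].trim());
                int max = Integer.parseInt(parts[1].trim());
                if (min >= 0 && max >= min) return new int[] {min, max};
            }
        }
        throw new IllegalArgumentException("Unsupported quantifier: " + quantifier);
    }

    // True when the number of occurrences falls inside the inclusive range.
    static boolean accepts(String quantifier, int occurrences) {
        int[] range = parse(quantifier);
        return occurrences >= range[0] && occurrences <= range[1];
    }

    public static void main(String[] args) {
        System.out.println(accepts("?", 0));     // true: a term with min 0 is optional
        System.out.println(accepts("[1,3]", 4)); // false: above the inclusive maximum
        System.out.println(accepts("", 1));      // true: default is [1,1]
    }
}
```

A term whose minimum is zero is optional, which matches the note in the added
documentation that a term "can be optional if its min quantifier is zero".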