This is an automated email from the ASF dual-hosted git repository.
aradzinski pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git
The following commit(s) were added to refs/heads/master by this push:
new 8fe9a31 WIP.
8fe9a31 is described below
commit 8fe9a31af0f5c0475ef97b73dae186ce3e6498d6
Author: Aaron Radzinski <[email protected]>
AuthorDate: Sun Apr 4 19:36:42 2021 -0700
WIP.
---
data-model.html | 2 +-
intent-matching.html | 298 +++++++++++++++++++++++++++++++++++----------------
2 files changed, 204 insertions(+), 96 deletions(-)
diff --git a/data-model.html b/data-model.html
index 8576a56..c1a6f3b 100644
--- a/data-model.html
+++ b/data-model.html
@@ -531,7 +531,7 @@ intents:
</li>
<li>
Model elements can have many-to-many group memberships.
- </li>
+
</li>(UNI_CHAR|UNDERSCORE|LETTER|DOLLAR)+(UNI_CHAR|DOLLAR|LETTER|[0-9]|COLON|MINUS|UNDERSCORE)*
<li>
Model elements can form a hierarchical structure.
</li>
diff --git a/intent-matching.html b/intent-matching.html
index 482f59f..a153af6 100644
--- a/intent-matching.html
+++ b/intent-matching.html
@@ -57,80 +57,7 @@ id: intent_matching
<section>
<h2 id="idl" class="section-title">IDL - Intent Definition
Language</h2>
<p>
- NLPCraft intents are written in Intent Definition Language (IDL).
IDL declarations can be placed in
- different locations based on user preferences:
- </p>
- <ul>
- <li>
- <p>
- <a target="javadoc"
href="/apis/latest/org/apache/nlpcraft/model/NCIntent.html">@NCIntent</a>
annotation
- takes a string as its parameter that should be a valid IDL
declaration. For example, Scala code snippet:
- </p>
- <pre class="brush: scala, highlight: [1, 2]">
- @NCIntent("import('/opt/myproj/global_fragments.idl')")
- @NCIntent("intent=act term(act)={has(tok_groups(), 'act')}
fragment(f1)")
- def onMatch(
- @NCIntentTerm("act") actTok: NCToken,
- @NCIntentTerm("loc") locToks: List[NCToken]
- ): NCResult = {
- ...
- }
- </pre>
-
- </li>
- <li>
- <p>
- External JSON/YAML <a href="/data-model.html#config">data
model configuration</a> can provide one or more
- IDL declarations in <code>intents</code> field. For
example:
- </p>
- <pre class="brush: js, highlight: [7]">
- {
- "id": "nlpcraft.alarm.ex",
- "name": "Alarm Example Model",
- .
- .
- .
- "intents": [
- "import('/opt/myproj/global_fragments.idl')",
- "import('/opt/myproj/my_intents.idl')",
- "intent=alarm term~{tok_id()=='x:alarm'}"
- ]
- }
- </pre>
- </li>
- <li>
- External <code>*.idl</code> files contain IDL declarations and
can be imported in any other places where
- IDL declarations are allowed. See <code>import()</code>
statement explanation below. For example:
- <pre class="brush: idl">
- /*
- * File 'my_intents.idl'.
- * ======================
- */
-
- import('/opt/globals.idl') // Import global intents and
fragments.
-
- // Fragments.
- // ----------
- fragment=buzz term~{tok_id() == 'x:alarm'}
- fragment=when
- term(nums)~{
- // Term variables.
- @type = meta_tok('nlpcraft:num:unittype')
- @iseq = meta_tok('nlpcraft:num:isequalcondition')
-
- tok_id() == 'nlpcraft:num' && @type != 'datetime'
&& @iseq == true
- }[0,7]
-
- // Intents.
- // --------
- intent=alarm
- fragment(buzz)
- fragment(when)
- </pre>
- </li>
- </ul>
- <h2 id="ild_grammar" class="section-sub-title">IDL Grammar</h2>
- <p>
+ NLPCraft intents are written in Intent Definition Language (IDL).
IDL is a relatively straightforward and simple language. You can
review the formal
<a target="github"
href="https://github.com/apache/incubator-nlpcraft/blob/master/nlpcraft/src/main/scala/org/apache/nlpcraft/model/intent/compiler/antlr4/NCIdl.g4">ANTLR4
grammar</a> for IDL. Here
are the common properties of IDL:
@@ -162,24 +89,27 @@ id: intent_matching
Numeric literals use Java string conversions.
</li>
<li>
- Only 10 reserved keywords: <code>flow fragment import intent
meta ordered term true false null</code>
+ IDL has only 10 reserved keywords: <code>flow fragment import
intent meta ordered term true false null</code>
</li>
<li>
Identifiers and literals can use the same Unicode space as
Java.
</li>
<li>
- IDL provides over 50 built-in functions to aid in intent
matching. IDL functions are pure immutable mathematical functions
+ IDL provides over 50 <a href="#idl_functions">built-in
functions</a> to aid in intent matching. IDL functions are pure immutable
mathematical functions
that work on a runtime stack. In other words, they look like
Python functions: IDL <code>length(trim(" text "))</code> vs.
OOP-style <code>" text ".trim().length()</code>.
</li>
</ul>
<p>
- IDL script consists of one or more statements, where each
statement can be one of the following three types:
+ An IDL program consists of one or more statements of type
<code>import</code>, <code>intent</code>, or <code>fragment</code>:
</p>
<ul>
<li>
<p>
- <b>Import</b>
+ <b><code>import</code> statement</b>
+ </p>
+ <p>
+ The import statement imports IDL declarations from
a local file, a classpath resource, or a URL:
</p>
<pre class="brush: idl">
// Import using absolute path.
@@ -192,17 +122,35 @@ id: intent_matching
import('ftp://user:password@myhost:22/opt/globals.idl')
</pre>
<p>
- Import statement allows to import IDL declarations from
either local file, classpath resource or URL.
- The effect of importing is the same as if the imported
declarations were inserted in place of import
- statement. Recursive and cyclic imports are detected and
ignored. Import statement starts
- with <code>import</code> keyword and has a string
parameter that indicates
- the location of the resource to import. Note that for the
classpath resource you don't need to specify
- leading forward slash.
+ <b>NOTES:</b>
</p>
+ <ul>
+ <li>
+ The effect of importing is the same as if the imported
declarations were inserted in place of import
+ statement.
+ </li>
+ <li>
+ Recursive and cyclic imports are detected and safely
ignored.
+ </li>
+ <li>
+ The import statement starts with the <code>import</code>
keyword and takes a string parameter that indicates
+ the location of the resource to import.
+ </li>
+ <li>
+ For a classpath resource you don't need to specify
the leading forward slash.
+ </li>
+ </ul>
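The import notes above (in-place insertion, cyclic imports ignored) can be illustrated with a minimal Java sketch. It is not NLPCraft's implementation: the class `ImportSketch`, the method `expand`, and the in-memory map standing in for files, classpath resources, and URLs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ImportSketch {
    // Hypothetical stand-in for real resources: location -> IDL statements.
    // Returns the statements of 'location' with every import replaced
    // by the statements of the imported resource.
    static List<String> expand(String location,
                               Map<String, List<String>> resources,
                               Set<String> seen) {
        List<String> out = new ArrayList<>();

        // A location that was already visited (recursive or cyclic import)
        // contributes nothing the second time.
        if (!seen.add(location))
            return out;

        for (String stmt : resources.getOrDefault(location, List.of())) {
            if (stmt.startsWith("import('") && stmt.endsWith("')")) {
                String target = stmt.substring("import('".length(), stmt.length() - 2);
                // Imported declarations are inserted in place of the import.
                out.addAll(expand(target, resources, seen));
            }
            else
                out.add(stmt);
        }

        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> resources = Map.of(
            "a.idl", List.of("import('b.idl')", "intent=alarm term~{tok_id()=='x:alarm'}"),
            "b.idl", List.of("import('a.idl')", "fragment=buzz term~{tok_id() == 'x:alarm'}")
        );

        // 'a.idl' imports 'b.idl', which imports 'a.idl' again: the cycle is skipped.
        expand("a.idl", resources, new HashSet<>()).forEach(System.out::println);
    }
}
```

In this sketch the fragment from `b.idl` ends up before the intent from `a.idl`, because the import statement comes first in `a.idl`.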
+ <p></p>
</li>
<li>
<p>
- <b>Intent Definition</b>
+ <b><code>intent</code> statement</b>
+ </p>
+ <p>
+ An intent is defined as one or more terms. Each term is a
predicate over an instance of the
+ <a target="javadoc"
href="/apis/latest/org/apache/nlpcraft/model/NCToken.html">NCToken</a>
interface.
+ For an intent to match, all of its terms have to evaluate
to true.
+ An intent definition statement can be informally explained
using the following example:
</p>
<pre class="brush: idl">
intent=xa
@@ -220,24 +168,184 @@ id: intent_matching
flow=/#flowModelMethod/
ordered=true
term(a)=/org.mypackage.MyClass#termMethod/?
- term(b)~{
- @x = 2
- @xx = ((@x * @x) / 2) * 3
-
- @xx == 6 && has(
- json(meta_req('user_json_payload')),
- list("موسكو\"", 'v1\'v1', "k2", "v2")
- )
- }
+ fragment(frag1)
</pre>
+ <dl>
+ <dt>
+ <code>intent=xa</code> <sup><small>line
1</small></sup><br/>
+ <code>intent=xb</code> <sup><small>line
12</small></sup>
+ </dt>
+ <dd>
+ Mandatory intent IDs. Intent ID is any arbitrary
unique string matching the following lexer template:
+
<code>(UNI_CHAR|UNDERSCORE|LETTER|DOLLAR)+(UNI_CHAR|DOLLAR|LETTER|[0-9]|COLON|MINUS|UNDERSCORE)*</code>
+ </dd>
+ <dt><code>ordered=true</code> <sup><small>line
14</small></sup></dt>
+ <dd>
+ <em>Optional.</em>
+ Whether or not this intent is ordered. Default is
<code>false</code>.
+ For an ordered intent, the specified order of terms is
important for matching.
+ If the intent is unordered, its terms can be found in any
order in the input text.
+ Note that an ordered intent significantly limits the user
input it can match. In most cases
+ an ordered intent is only applicable to processing of
a formal grammar (like a programming language)
+ and is mostly unsuitable for natural language
processing.
+ </dd>
+ <dt>
+ <code>flow="^(?:xx)(^:zz)*$"</code> <sup><small>line
2</small></sup><br/>
+ <code>flow=/#flowModelMethod/</code> <sup><small>line
13</small></sup>
+ </dt>
+ <dd>
+ <p>
+ <em>Optional.</em> Dialog flow is a history of
previously matched intents to match on. If provided,
+ the intent will first match on the history of the
previously matched intents before processing its
+ terms. There are two ways to define a match on the
dialog flow:
+ </p>
+ <ul>
+ <li>
+ <p><b>Regular Expression</b></p>
+ <p>
+ Dialog flow specification is a standard <a
target=_blank
href="https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Pattern.html">Java
regular expression</a>.
+ The history of previously matched intents
is presented as a space-separated string of the intent IDs that were
+ selected as the best match during the
current conversation, in chronological order, with the most
+ recently matched intent ID being the first
element in the string. The dialog flow regular expression
+ is matched against that string
of intent IDs.
+ </p>
+ <p>
+ In this example, the
<code>^(?:xx)(^:zz)*$</code> dialog flow regular expression defines that intent
+ should only match when the immediate
previous intent was <code>xx</code> and no <code>zz</code> intents
+ are in the history. If history is
<code>"xx yy yy"</code> - this intent will match. However, for
+ <code>"xx zz"</code> or <code>"yy
xx"</code> history this dialog flow will not match.
+ </p>
+ </li>
+ <li>
+ <p><b>User-Defined Callback</b></p>
+ </li>
+ </ul>
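The regular-expression form of dialog flow can be tried out in plain Java. The sketch below only assumes what the text states: the history is a space-separated string of intent IDs with the most recent one first. The class `DialogFlowSketch`, the method `flowMatches`, and the pattern itself are hypothetical; the pattern is a negative-lookahead variant written for this illustration, not the one quoted in the example, and how NLPCraft applies the pattern internally is not shown here.

```java
import java.util.regex.Pattern;

class DialogFlowSketch {
    // Hypothetical pattern: the most recent intent is 'xx' and no 'zz'
    // intent appears anywhere later in the history string.
    static final Pattern FLOW = Pattern.compile("^xx(?= |$)(?!.* zz(?: |$)).*$");

    // 'history' is a space-separated string of intent IDs, most recent first.
    static boolean flowMatches(String history) {
        return FLOW.matcher(history).matches();
    }

    public static void main(String[] args) {
        for (String h : new String[] {"xx yy yy", "xx zz", "yy xx"})
            System.out.println("\"" + h + "\" -> " + flowMatches(h));
    }
}
```

With this pattern the history `"xx yy yy"` matches, while `"xx zz"` and `"yy xx"` do not, which is the behavior the paragraph above describes.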
+ <p>
+ Note that if a dialog flow is defined and it doesn't
match the history, the terms of the intent won't be tested at all.
+ </p>
+ </dd>
+ <dt>
+ <code>term(term1)={group @@ 'my_group'}?</code><br>
+ <code>term(term2)~{trim(partId.partAlias.id) ==
'token1:id'}[1,3]</code>
+ </dt>
+ <dd>
+ <p>
+ A term, also known as a slot, is a building block of
the intent. A term has an optional ID, a predicate, and quantifiers.
+ A term is conversational if it is defined with the
<code>'~'</code> symbol and non-conversational if it is defined with the <code>'='</code>
+ symbol. For a conversational term,
the system will search for a match using tokens from
+ the current request as well as the tokens from
the conversation STM (short-term memory). For a non-conversational
+ term, only tokens from the current request will
be considered.
+ </p>
+ <p>
+ A term represents one or more tokens, sequential
or not, detected in the user input. Intent has a list of terms
+ (always at least one) that all have to be found in
the user input for the intent to match. Note that term
+ can be optional if its min quantifier is zero.
Whether or not the order of the terms is important
+ for matching is governed by
<code>ordered=true</code> parameter.
+ </p>
+ <p>
+ Term ID (<code>term1</code> and
<code>term2</code>) is optional, and when provided, is used by
<code>@NCIntentTerm</code>
+ annotation to link term's tokens to a formal
parameter of the callback method. Note that term ID follows
+ the same lexical rules as intent ID.
+ </p>
+ <p>
+ Inside of curly brackets <code>{</code>
<code>}</code> is a <a href="data-model.html#dsl">token DSL</a>
+ expression: <code>group @@ 'my_group'</code> and
<code>trim(partId.partAlias.id) == 'token1:id'</code>.
+ Note that exactly the same syntax is used for
token DSL as well as for intent DSL for defining a token predicate.
+ Consult <a href="data-model.html#dsl">token
DSL</a> documentation for details on its syntax.
+ </p>
+ <p>
+ <code>?</code> and <code>[1,3]</code> define an
inclusive quantifier for that term (how many times this term should appear
+ for it to be considered found). You can also use
the following standard abbreviations:
+ </p>
+ <ul>
+ <li><code>*</code> is equal to
<code>[0,∞]</code></li>
+ <li><code>+</code> is equal to
<code>[1,∞]</code></li>
+ <li><code>?</code> is equal to
<code>[0,1]</code></li>
+ <li>No quantifier defaults to
<code>[1,1]</code></li>
+ </ul>
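The quantifier abbreviations listed above can be expressed as a small Java sketch. The class `QuantifierSketch` and its methods are hypothetical and not part of NLPCraft. The sketch only mirrors the table: inclusive `[min,max]` bounds, with `Integer.MAX_VALUE` standing in for ∞.

```java
class QuantifierSketch {
    // Returns {min, max} for a quantifier string; the empty string means
    // "no quantifier". Integer.MAX_VALUE stands in for the infinity bound.
    static int[] bounds(String q) {
        switch (q) {
            case "":  return new int[] {1, 1};
            case "?": return new int[] {0, 1};
            case "*": return new int[] {0, Integer.MAX_VALUE};
            case "+": return new int[] {1, Integer.MAX_VALUE};
            default:
                // Explicit form such as "[1,3]".
                String[] parts = q.substring(1, q.length() - 1).split(",");
                return new int[] {
                    Integer.parseInt(parts[0].trim()),
                    Integer.parseInt(parts[1].trim())
                };
        }
    }

    // True if a term found 'count' times satisfies the inclusive quantifier.
    static boolean satisfied(String q, int count) {
        int[] b = bounds(q);
        return count >= b[0] && count <= b[1];
    }

    public static void main(String[] args) {
        System.out.println(satisfied("[1,3]", 3)); // within the inclusive range
        System.out.println(satisfied("?", 0));     // optional term, not found
        System.out.println(satisfied("", 0));      // default [1,1], not found
    }
}
```

A term whose minimum bound is zero (`?`, `*`, or an explicit `[0,n]`) is satisfied even when it is not found, which is what makes such a term optional.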
+ </dd>
+ </dl>
</li>
<li>
<p>
- <b>Fragment Definition</b>
+ <b><code>fragment</code> statement</b>
</p>
</li>
</ul>
<h2 id="idl_functions" class="section-sub-title">IDL Functions</h2>
+ <h2 id="idl_location" class="section-sub-title">IDL Location</h2>
+ <p>
+ IDL declarations can be placed in different locations based on
user preferences:
+ </p>
+ <ul>
+ <li>
+ <p>
+ <a target="javadoc"
href="/apis/latest/org/apache/nlpcraft/model/NCIntent.html">@NCIntent</a>
annotation
+ takes a string as its parameter that should be a valid IDL
declaration. For example, Scala code snippet:
+ </p>
+ <pre class="brush: scala, highlight: [1, 2]">
+ @NCIntent("import('/opt/myproj/global_fragments.idl')")
+ @NCIntent("intent=act term(act)={has(tok_groups(), 'act')}
fragment(f1)")
+ def onMatch(
+ @NCIntentTerm("act") actTok: NCToken,
+ @NCIntentTerm("loc") locToks: List[NCToken]
+ ): NCResult = {
+ ...
+ }
+ </pre>
+
+ </li>
+ <li>
+ <p>
+ External JSON/YAML <a href="/data-model.html#config">data
model configuration</a> can provide one or more
+ IDL declarations in <code>intents</code> field. For
example:
+ </p>
+ <pre class="brush: js, highlight: [7]">
+ {
+ "id": "nlpcraft.alarm.ex",
+ "name": "Alarm Example Model",
+ .
+ .
+ .
+ "intents": [
+ "import('/opt/myproj/global_fragments.idl')",
+ "import('/opt/myproj/my_intents.idl')",
+ "intent=alarm term~{tok_id()=='x:alarm'}"
+ ]
+ }
+ </pre>
+ </li>
+ <li>
+ External <code>*.idl</code> files contain IDL declarations and
can be imported in any other place where
+ IDL declarations are allowed. See the <code>import()</code>
statement explanation above. For example:
+ <pre class="brush: idl">
+ /*
+ * File 'my_intents.idl'.
+ * ======================
+ */
+
+ import('/opt/globals.idl') // Import global intents and
fragments.
+
+ // Fragments.
+ // ----------
+ fragment=buzz term~{tok_id() == 'x:alarm'}
+ fragment=when
+ term(nums)~{
+ // Term variables.
+ @type = meta_tok('nlpcraft:num:unittype')
+ @iseq = meta_tok('nlpcraft:num:isequalcondition')
+
+ tok_id() == 'nlpcraft:num' && @type != 'datetime'
&& @iseq == true
+ }[0,7]
+
+ // Intents.
+ // --------
+ intent=alarm
+ fragment(buzz)
+ fragment(when)
+ </pre>
+ </li>
+ </ul>
<h2 id="idl_syntax_highlight" class="section-sub-title">IDL Syntax
Highlighting</h2>
</section>
<section id="annotations">