This is an automated email from the ASF dual-hosted git repository.
sergeykamov pushed a commit to branch NLPCRAFT-513
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git
The following commit(s) were added to refs/heads/NLPCRAFT-513 by this push:
new a8b4418 WIP.
a8b4418 is described below
commit a8b44184199169a773f4efdb606a7ee28447827f
Author: skhdl <[email protected]>
AuthorDate: Tue Oct 25 12:18:14 2022 +0400
WIP.
---
api-review.html | 125 ++++++++++++++++++++++++-------------------------
built-components.html | 8 +++-
custom-components.html | 5 +-
semantic.html | 6 ++-
4 files changed, 75 insertions(+), 69 deletions(-)
diff --git a/api-review.html b/api-review.html
index 4043395..8134bc4 100644
--- a/api-review.html
+++ b/api-review.html
@@ -26,33 +26,30 @@ id: overview
<h2 class="section-title">Library API review <a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
- NlpCraft library contains two base concepts: <code>Model</code>
and <code>Client</code> which have API representations
+ The NlpCraft library is based on two main concepts,
<code>Model</code> and <code>Client</code>,
+ which have API representations
<a href="apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a>
and
<a
href="apis/latest/org/apache/nlpcraft/NCModelClient.html">NCModelClient</a>.
- When you work with the system - you should prepare model
configuring its parameters and defining its components.
+ When you work with the system, you should prepare the model by
configuring its parameters and defining its components.
After that, you communicate with this model via the client's methods.
</p>
<ul>
<li>
<code>Model</code> is a domain-specific object which is responsible
for user input interpretation.
- <code>Model</code> contains intents, defined via NlpCraft IDL
with related code callbacks.
- Intent is user defined callback and rule, according to which
this callback should be called.
- Rule is most often some template, based on expected set of
entities in user input, but it can be more flexible.
</li>
-
<li>
- <code>Client</code> is object, which allows to communicate
with the given model.
+ <code>Client</code> is an object which allows you to communicate with
the given model.
</li>
</ul>
<p>Typical part of code:</p>
<pre class="brush: scala, highlight: []">
- // Prepares domain model.
+ // Initializes the prepared domain model.
val mdl = new CustomNlpModel()
- // Prepares client for given model.
+ // Creates client for given model.
val client = new NCModelClient(mdl)
// Sends text request to model by user ID "userId".
@@ -62,24 +59,12 @@ id: overview
client.clearDialog("userId")
</pre>
- <p>
- <code>Model</code> definition includes two parts:
- </p>
- <ul>
- <li>
- <code>Configuration</code>. Static configuration parameters
including name, version, etc.
- </li>
- <li>
- <code>Pipeline</code>. Most important component, which defines
user input processing chain.
- <code>Pipeline</code> can be based on standard and custom user
defined components.
- </li>
- </ul>
</section>
<section id="model">
- <h2 class="section-title">Model overview<a href="#"><i class="top-link
fas fa-fw fa-angle-double-up"></i></a></h2>
+ <h2 class="section-title">Model responsibility overview<a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
- Let's start with terminology and describe work workflow.
+ Let's start with terminology and describe the system workflow.
</p>
<ul>
@@ -87,27 +72,38 @@ id: overview
<code>Token</code> represented as <a
href="apis/latest/org/apache/nlpcraft/NCToken.html">NCToken</a>.
It is a simple string, part of the user input, which is split according
to some rules,
for instance by spaces and some additional conditions, which
depend on the language and some expectations.
- So user input "<b>Where is it?</b>" contains four tokens:
"<b>Where</b>", "<b>is</b>", "<b>it</b>", "<b>?</b>".
- Usually <code>tokens</code>are words and punctuation symbols
which can also contain some additional
+ So user input "<b>Where is it?</b>" contains four tokens:
+ "<code>Where</code>", "<code>is</code>", "<code>it</code>",
"<code>?</code>".
+ Usually <code>tokens</code> are words and punctuation symbols
which can also contain some additional
information such as part of speech, etc.
- Tokens data are input for searching <code>entities</code>.
+ <code>Tokens</code> are input for searching the
<code>entities</code>.
</li>
<li>
<code>Entity</code> represented as <a
href="apis/latest/org/apache/nlpcraft/NCEntity.html">NCEntity</a>.
According to Wikipedia, a named entity is a real-world object,
such as a person, location, organization,
product, etc., that can be denoted with a proper name. It can
be abstract or have a physical existence.
- Each entity can contain one or more tokens.
- Entities data are input for searching <a
href="intent-matching.html">Intent matching</a> conditions.
+ Each <code>entity</code> can contain one or more tokens.
+ <code>Entities</code> are input for searching
<code>intents</code> according to <a href="intent-matching.html">Intent
matching</a> conditions.
</li>
<li>
<code>Variant</code> represented as <a
href="apis/latest/org/apache/nlpcraft/NCVariant.html">NCVariant</a>.
- List of entities. Potentially, each token can be recognized as
different entities,
- so user input can be processed as set of variants.
- For example user input "Mercedes" can be processed as 2
variants,
- both of them contains single element list of entities: car
brand or Spanish family name.
+ It is a list of <code>entities</code>. Potentially, each
<code>token</code> can be recognized as
+ different <code>entities</code>,
+ so user input can be processed as set of <code>variants</code>.
+ For example user input "Mercedes" can be processed as two
<code>variants</code>,
+ each of which contains a single-element list of
<code>entities</code>: <b>car brand</b> or <b>Spanish female name</b>.
When words do not overlap between different
<code>entities</code>, only one
<code>variant</code> is detected.
</li>
+ <li>
+ <code>Intent</code> is a user-defined callback together with a rule
according to which this callback should be called.
+ The rule is most often a template based on the expected set of
<code>entities</code> in the user input,
+ but it can be more flexible.
+ Parameters extracted from the user text input are passed into
callback methods.
+ The results of executing these methods are returned to the user as the answer
to the request.
+ <code>Intent</code> callbacks are methods defined in the
<code>Model</code> class and annotated with
+ <code>intent</code> rules via <a
href="intent-matching.html">IDL</a>.
+ </li>
</ul>
<p>
@@ -117,71 +113,72 @@ id: overview
<ul>
<li>
Parse user input text as the <code>tokens</code>.
- They are input for searching <code>name entities</code>.
- Tokens parsing components should be included into <a
href="#model-pipeline">Model pipeline</a>.
+ They are input for searching <code>named entities</code>.
+ <code>Token</code> parsing components should be included in the
<a href="#model-pipeline">Model pipeline</a>.
</li>
<li>
- Find <code>name entities</code> based on these parsed tokens.
- <code>name entities</code>.
+ Find <code>named entities</code> based on these parsed
<code>tokens</code>.
They are input for searching <code>intents</code>.
- Entity parsing components should be included into <a
href="#model-pipeline">Model pipeline</a>.
+ <code>Entity</code> parsing components should be included in the
<a href="#model-pipeline">Model pipeline</a>.
</li>
<li>
- Prepare callback methods which contain business logic
- and rules for matching user requests on them.
- These callbacks with their rules and named as
<code>intents</code>.
- These matched callback methods execution with parameters
extracted from user text input and
- these execution results returned to user as answer on his
request.
- These callbacks methods should be defined in the model or
model should have reference in them.
+ Prepare <code>intents</code> with their callback methods
which contain business logic.
+ These methods should be defined directly in the model class
definition, or the model should have references to them.
This is described below.
</li>
</ul>
<p>
- Let's prepare the system which can call persons from your contact
list.
- Typical commands are: "Please call to John Smith" or "Connect me
with Barbara Dillan".
- This model should be able to recognize in user text following
entities:
+ As an example, let's prepare a system which can call people from
your contact list.
+ Typical commands are: "<b>Please call to John Smith</b>" or
"<b>Connect me with Barbara Dillan</b>".
+ To solve this task, the model should be able to recognize the following entities in the
user text:
<code>command</code> and the <code>person</code> to whom this command applies.
</p>
+ <p>
+ So, when the request "<b>Please call to John Smith</b>" is received, our
model should be able to:
+ </p>
+
<ul>
<li>
- Parsing split your input on tokens ["<code>please</code>",
"<code>call</code>", "<code>to</code>", "<code>john</code>",
"<code>smith</code>"]
+ Parse tokens by splitting the user text input:
+ "<code>please</code>", "<code>call</code>", "<code>to</code>",
"<code>john</code>", "<code>smith</code>".
</li>
<li>
- By these tokens model should be able to found two named
entities:
+ Find two named entities:
<ul>
<li>
- <code>command</code> by token <code>call</code>.
+ <code>command</code> by token "<code>call</code>".
</li>
<li>
- <code>person</code> by tokens <code>john</code> and
<code>smith</code>.
+ <code>person</code> by tokens "<code>john</code>" and
"<code>smith</code>".
</li>
</ul>
</li>
<li>
- Also intents should be prepared:
+ Have a prepared intent:
<pre class="brush: scala, highlight: [1, 2, 5, 6]">
@NCIntent("intent=call term(command)={# == 'command'}
term(person)={# == 'person'}")
- def onMatch(
+ def onCommand(
ctx: NCContext,
im: NCIntentMatch,
@NCIntentTerm("command") command: NCEntity,
@NCIntentTerm("person") person: NCEntity
- ): NCResult = ...
+ ): NCResult = ??? // Implement business logic here.
</pre>
<ul>
<li>
- <code>Line 1</code> defines intent <code>ls</code>
with two conditions.
+ <code>Line 1</code> defines intent <code>call</code>
with two conditions.
</li>
<li>
- <code>Line 2</code> defines related callback method.
+ <code>Line 2</code> defines related callback method
<code>onCommand()</code>.
</li>
<li>
<code>Lines 5 and 6</code> define two callback
method arguments which correspond to
- <code>call</code> intent conditions. You can extract
normalized value
- <code>john smith</code> from the <code>person</code>
parameter and use in the method body.
+ <code>call</code> intent term conditions. You can
extract the normalized value
+ <code>john smith</code> from the <code>person</code>
parameter and use it in the method body,
+ for example, to get this person's phone number.
</li>
</ul>
@@ -196,11 +193,13 @@ id: overview
</section>
<section id="client">
- <h2 class="section-title">Client overview<a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
+ <h2 class="section-title">Client responsibility overview<a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
- Base client methods:
+ The client, which is represented as <a
href="apis/latest/org/apache/nlpcraft/NCModelClient.html">NCModelClient</a>,
+ is necessary for communication with the model. The base client methods
are described below.
</p>
+
<ul>
<li>
<code>ask()</code> passes user text input to the model and
receives back execution
@@ -233,8 +232,8 @@ id: overview
<h2 class="section-title">Model configuration <a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
- <code>Model configuration</code> which is represented as is set of
model parameter values.
- Its API representation is <a
href="apis/latest/org/apache/nlpcraft/NCModelConfig.html">NCModelConfig</a>.
+ Model configuration <a
href="apis/latest/org/apache/nlpcraft/NCModelConfig.html">NCModelConfig</a>
represents a set of model parameter values.
+ Its properties are described below.
</p>
<ul>
<li>
@@ -364,8 +363,8 @@ id: overview
<ul class="side-nav">
<li class="side-nav-title">On This Page</li>
<li><a href="#overview">Overview</a></li>
- <li><a href="#model">Model overview</a></li>
- <li><a href="#client">Client overview</a></li>
+ <li><a href="#model">Model responsibility overview</a></li>
+ <li><a href="#client">Client responsibility overview</a></li>
<li><a href="#model-configuration">Model configuration</a></li>
<li><a href="#model-pipeline">Model pipeline</a></li>
<li><a href="#model-behavior">Model behavior overriding</a></li>
diff --git a/built-components.html b/built-components.html
index c876535..4e8b991 100644
--- a/built-components.html
+++ b/built-components.html
@@ -26,8 +26,9 @@ id: overview
<h2 class="section-title">Built components <a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
- Model <a
href="apis/latest/org/apache/nlpcraft/NCPipeline.html">NCPipeline</a> is base
component which responsible for sentence processing.
- It is consists of a number of traits, some built implementations
of them are described below.
+ Model <a
href="apis/latest/org/apache/nlpcraft/NCPipeline.html">NCPipeline</a>
+ is the base model element. It defines a chain of component traits which
are responsible for sentence processing.
+ Some built-in implementations of these traits are described below.
</p>
<div class="bq info">
@@ -120,9 +121,12 @@ id: overview
Look at the supported <b>Name Finder</b> models <a
href="https://opennlp.sourceforge.net/models-1.5/">here</a>.
For example, the following are available for English:
<code>Location</code>, <code>Money</code>,
<code>Person</code>, <code>Organization</code>,
<code>Date</code>, <code>Time</code> and <code>Percentage</code>.
+ Models for some other
languages are also available.
</li>
<li>
<code>NCStanfordNLPEntityParser</code> is wrapper on
<a href="https://nlp.stanford.edu/">Stanford NLP</a> NER components.
+ For example, the following are available for English:
<code>Location</code>, <code>Money</code>,
+ <code>Person</code>, <code>Organization</code>,
<code>Date</code>, <code>Time</code> and <code>Percent</code>.
Look at the detailed information <a
href="https://nlp.stanford.edu/software/CRF-NER.shtml">here</a>.
</li>
<li>
diff --git a/custom-components.html b/custom-components.html
index 529acbd..51c6f67 100644
--- a/custom-components.html
+++ b/custom-components.html
@@ -27,14 +27,15 @@ id: overview
<p>
NlpCraft provides a number of useful built-in components for the English
language.
- You can use them to prepare <code>Pipeline</code> for your
<code>Model</code>.
+ You can use them to prepare <a
href="apis/latest/org/apache/nlpcraft/NCPipeline.html">NCPipeline</a>
+ for your <a
href="apis/latest/org/apache/nlpcraft/NCModel.html">NCModel</a>.
You also can use provided wrappers on <a
href="https://opennlp.apache.org/">Apache OpenNLP</a> and
<a href="https://nlp.stanford.edu/">Stanford NLP</a> projects NER
components.
Their models work with English and some other languages.
</p>
<p>
However, you may need to extend the provided functionality and develop your
own components.
- Let's review these components step by step.
+ Let's look, step by step, at how to do this and when it can be useful for each kind of
component.
</p>
</section>
<section id="token-parser">
diff --git a/semantic.html b/semantic.html
index c53cdf6..bc2e186 100644
--- a/semantic.html
+++ b/semantic.html
@@ -26,8 +26,10 @@ id: semantic
<h2 class="section-title">Semantic parser<a href="#"><i
class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
- Semantic entity parser <a
href="apis/latest/org/apache/nlpcraft/nlp/parsers/NCSemanticEntityParser.html">NCSemanticEntityParser</a>
- is the implementation of <a
href="apis/latest/org/apache/nlpcraft/NCEntityParser.html">NCEntityParser</a>.
+ Semantic entity parser
+ <a
href="apis/latest/org/apache/nlpcraft/nlp/parsers/NCSemanticEntityParser.html">NCSemanticEntityParser</a>
+ is the implementation of <a
href="apis/latest/org/apache/nlpcraft/NCEntityParser.html">NCEntityParser</a>,
+ which in turn is a component of the model pipeline <a
href="apis/latest/org/apache/nlpcraft/NCPipeline.html">NCPipeline</a>.
Although it is just one of the built-in components defined in NlpCraft,
it deserves a special mention.
This parser provides a simple but very powerful way to find domain
specific data in the input text.
It defines a list of <a
href="apis/latest/org/apache/nlpcraft/nlp/parsers/NCSemanticElement.html">NCSemanticElement</a>