This is an automated email from the ASF dual-hosted git repository.
sergeykamov pushed a commit to branch NLPCRAFT-520
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft.git
The following commit(s) were added to refs/heads/NLPCRAFT-520 by this push:
new 459edd5b WIP.
459edd5b is described below
commit 459edd5bb3b57b73249f7d93695532fedd6d69e6
Author: Sergey Kamov <[email protected]>
AuthorDate: Thu Jan 5 19:39:54 2023 +0400
WIP.
---
.../apache/nlpcraft/nlp/parsers/NCSemanticElement.scala | 14 +++++++++++---
.../nlpcraft/nlp/parsers/NCSemanticEntityParser.scala | 16 ++++++++++------
2 files changed, 21 insertions(+), 9 deletions(-)
diff --git a/nlpcraft/src/main/scala/org/apache/nlpcraft/nlp/parsers/NCSemanticElement.scala b/nlpcraft/src/main/scala/org/apache/nlpcraft/nlp/parsers/NCSemanticElement.scala
index 24ab4195..218f2a59 100644
--- a/nlpcraft/src/main/scala/org/apache/nlpcraft/nlp/parsers/NCSemanticElement.scala
+++ b/nlpcraft/src/main/scala/org/apache/nlpcraft/nlp/parsers/NCSemanticElement.scala
@@ -26,10 +26,17 @@ import org.apache.nlpcraft.*
*
* Each trait contains a set of synonyms to match on named entity.
* A synonym can have one or more individual words.
- * Note that element's type is its implicit synonym so that even if no additional synonyms are defined at least one synonym always exists.
- * Note also that synonym matching is performed on normalized and stemmatized forms of both a synonym and user input on first phase and if first attempt is not successful, it tries to match stemmatized forms of synonyms with stemmatized forms of user input which were lemmatized preliminarily.
+ * Note that element's type is its implicit synonym so that even if no additional synonyms are defined at least one synonym
+ * always exists.
+ * Note also that synonym matching is performed on normalized and stemmatized forms of both a synonym and user input on
+ * first phase and if first attempt is not successful, it tries to match stemmatized forms of synonyms
+ * with stemmatized forms of user input which were lemmatized preliminarily.
* This approach allows to provide more accurate matching and doesn't force users to prepare synonyms in initial words form.
*
+ * Also **Semantic** element can have an optional set of special synonyms called values or "proper nouns" for this element.
+ * Unlike basic synonyms, each value is a pair of a name and a set of standard synonyms by which that value,
+ * and ultimately its element, can be recognized in the user input.
+ *
* See detailed description on the website [[https://nlpcraft.apache.org/built-in-entity-parser.html#parser-semantic Semantic Parser]].
*
* @see [[NCSemanticEntityParser]]
@@ -55,13 +62,14 @@ trait NCSemanticElement:
* Gets values map. Each element can contain multiple value,
* each value is described as name and list of its synonyms.
* They allows to find element's value in text.
+ * Note that macros can be used for synonyms definition.
*
* @return Values.
*/
def getValues: Map[String, Set[String]] = Map.empty
/**
- * Gets elements synonyms list. They allows to find element in text.
+ * Gets elements synonyms list. They allows to find element in text. Note that macros can be used for synonyms definition.
*
* @return Synonyms.
*/
diff --git a/nlpcraft/src/main/scala/org/apache/nlpcraft/nlp/parsers/NCSemanticEntityParser.scala b/nlpcraft/src/main/scala/org/apache/nlpcraft/nlp/parsers/NCSemanticEntityParser.scala
index 283bcab3..2ca49df1 100644
--- a/nlpcraft/src/main/scala/org/apache/nlpcraft/nlp/parsers/NCSemanticEntityParser.scala
+++ b/nlpcraft/src/main/scala/org/apache/nlpcraft/nlp/parsers/NCSemanticEntityParser.scala
@@ -123,14 +123,18 @@ private object NCSemanticEntityParser:
import NCSemanticEntityParser.*
/**
- * **Semantic** [[NCEntityParser entity parser]] synonyms based implementation.
+ * **Semantic** [[NCEntityParser entity parser]] implementation.
*
- * This parser provides simple but very powerful way to find domain specific data in the input text.
- * It configured by list of [[NCSemanticElement]] which are represent [[NCEntity name entities]] and
+ * This synonyms based parser provides simple but very powerful way to find domain specific data in the input text.
+ * It is configured via [[NCSemanticElement]] list which are represent [[NCEntity name entities]] and
* can be produced by this parser.
*
- * [[NCSemanticElement]] elements can be configured via YAML or JSON files in special format or created and passed
- * programmatically.
+ * [[NCSemanticElement]] elements list can be configured via YAML or JSON files in special format or
+ * programmatically prepared list of [[NCSemanticElement]] can be passed in this parser directly.
+ *
+ * [[NCSemanticElement]] elements synonyms can be based on special
+ * [[https://nlpcraft.apache.org/built-in-entity-parser.html#macros macros]] definitions which can be provided
+ * in YAML and JSON files or passed directly in case when programmatically prepared list of [[NCSemanticElement]] passed in this parser.
*
* See detailed description on the website [[https://nlpcraft.apache.org/built-in-entity-parser.html#parser-semantic Semantic Parser]].
*
@@ -143,7 +147,7 @@ import NCSemanticEntityParser.*
* There are several constructors with different set of parameters.
* - **stemmer** [[NCStemmer]] implementation which used for matching tokens and given [[NCSemanticElement]] synonyms.
* - **parser** [[NCTokenParser]] implementation which used for given [[NCSemanticElement]] synonyms tokenization. It should be same implementation as used in [[NCPipeline.getTokenParser]].
- * - **macros** Macros map which are used for extracting [[NCSemanticElement]] synonyms defined via **macros**. Empty by default. Look more at the website [[https://nlpcraft.apache.org/built-in-entity-parser.html#parser-semantic Macros]].
+ * - **macros** Macros map which are used for extracting [[NCSemanticElement]] synonyms defined via **macros**. Empty by default. Look more at the website [[https://nlpcraft.apache.org/built-in-entity-parser.html#macros Macros]].
* - **elements** Programmatically prepared [[NCSemanticElement]] instances.
* - **mdlRes** Relative path, absolute path, classpath resource or URL to YAML or JSON semantic model which contains [[NCSemanticElement]] definitions.
*