[incubator-nlpcraft] branch NLPCRAFT-468 updated: WIP.

sergeykamov Thu, 14 Oct 2021 00:29:09 -0700

This is an automated email from the ASF dual-hosted git repository.

sergeykamov pushed a commit to branch NLPCRAFT-468
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft.git



The following commit(s) were added to refs/heads/NLPCRAFT-468 by this push:
     new 742b715  WIP.
742b715 is described below

commit 742b715474e7e040405312a2701e4548f092912a
Author: Sergey Kamov <[email protected]>
AuthorDate: Thu Oct 14 10:28:54 2021 +0300

    WIP.
---
 nlpcraft/src/main/scala/org/apache/nlpcraft/interfaces.txt | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/nlpcraft/src/main/scala/org/apache/nlpcraft/interfaces.txt b/nlpcraft/src/main/scala/org/apache/nlpcraft/interfaces.txt
index d9c5bc7..9ed5d65 100644
--- a/nlpcraft/src/main/scala/org/apache/nlpcraft/interfaces.txt
+++ b/nlpcraft/src/main/scala/org/apache/nlpcraft/interfaces.txt
@@ -18,10 +18,11 @@
 Interfaces (pluggable components). All of them have built-in implementations.
 
 1. Text-to-words tokenizer - org.apache.nlpcraft.model.nlp.NCNlpTokenizer.
-All frameworks allow to configure this low-level component according your requirements
-Look it :
+This component should be pluggable because different tokenization approaches can be needed. Provided default is fine in 90% - 99% (for EN)
+Look at:
 https://nlp.stanford.edu/nlp/javadoc/javanlp-3.5.0/edu/stanford/nlp/process/Tokenizer.html,
 https://opennlp.apache.org/docs/1.8.2/apidocs/opennlp-tools/opennlp/tools/tokenize/Tokenizer.html
+All frameworks allow to configure this low-level component according users' requirements.
 Delivered:
   - org.apache.nlpcraft.model.components.tokenizer.NCOpenNlpTokenizer (not configured)
   - Stanford impl (not configured)
@@ -29,7 +30,7 @@ Mandatory.
 Default in config - NCOpenNlpTokenizer.
 When user needs to implement his own:
  - another standard logic required (look at different variants by links above)
- - own logic required (tokenization for commands in own format like: 'give_me_coffee_please')
+ - user's own logic required (tokenization for commands in own format like: 'give_me_coffee_please')
  - new languages support
 
 2. Ners finder - org.apache.nlpcraft.model.nlp.NCNlpNerParse.
@@ -52,7 +53,7 @@ Optional (if null, stop, swear and suspicious words are not detected, these prop
 Default in config - NCDefaultStopWordsDetector, NCDefaultSwearWordsDetector.
 (`suspicious` detector is not set by default. Can be configured if necessary by NCConfiguredWordsDetector)
 When user needs to implement his own:
-  - own sophisticated logic implementation, which cannot be configured by NCConfiguredWordsDetector.
+  - user's own sophisticated logic implementation, which cannot be configured by NCConfiguredWordsDetector.
   - new languages support
 
 4. org.apache.nlpcraft.model.NCModelBehaviour
