This is an automated email from the ASF dual-hosted git repository.

aradzinski pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 13b514b  WIP.
13b514b is described below

commit 13b514b18c483921878e586e767dff0951688310
Author: Aaron Radzinski <[email protected]>
AuthorDate: Sat Nov 14 21:48:02 2020 -0800

    WIP.
---
 _data/news.yml               |   4 +-
 blogs/short_term_memory.html | 180 ++++++++++++++++++++++++++++++++++++++++++-
 images/stm1.png              | Bin 0 -> 50047 bytes
 tools/embedded_probe.html    |   8 +-
 4 files changed, 185 insertions(+), 7 deletions(-)

diff --git a/_data/news.yml b/_data/news.yml
index 5c8aefa..8d9da28 100644
--- a/_data/news.yml
+++ b/_data/news.yml
@@ -17,7 +17,7 @@
 
 - title: πŸ“ƒ A Brief Overview of the Apache NlpCraft System
   url: https://habr.com/ru/post/526950/
-  excerpt: The project's goal β€” the total simplification of application developers' access to NLP capabilities. Striking a balance between an easy entry into the NLP domain and support for the wide range of capabilities of an industrial-grade library.
+  excerpt: The project's goal - the total simplification of application developers' access to NLP capabilities. Striking a balance between an easy entry into the NLP domain and support for the wide range of capabilities of an industrial-grade library.
   author: Π‘Π΅Ρ€Π³Π΅ΠΉ Камов
   publish_date: November 8, 2020
   avatar_url: images/sergey_kamov_avatar.png
@@ -49,7 +49,7 @@
 
 - title: πŸ“œ Short-Term Memory - Maintaining Conversation Context
   url: /blogs/short_term_memory.html
-  excerpt: Short-Term Memory (STM), a technique used to maintain conversational context in NLPCraft. Maintaining the proper conversation context β€” remembering what the current conversation is about β€” is essential for all human interaction and thus essential for computer-based natural language understanding.
+  excerpt: Short-Term Memory (STM), a technique used to maintain conversational context in NLPCraft. Maintaining the proper conversation context - remembering what the current conversation is about - is essential for all human interaction and thus essential for computer-based natural language understanding.
   author: Aaron Radzinski
   avatar_url: images/lion.jpg
   publish_date: July 26, 2019
diff --git a/blogs/short_term_memory.html b/blogs/short_term_memory.html
index 12b20fa..ce09d25 100644
--- a/blogs/short_term_memory.html
+++ b/blogs/short_term_memory.html
@@ -28,7 +28,185 @@ publish_date: July 26, 2019
 -->
 
 <section id="">
-
+    <p>
+        <img alt="" class="img-fluid" src="/images/stm1.png">
+    </p>
+    <p>
+        In this blog, I’ll try to give a high-level overview of STM - Short-Term Memory, a technique used to maintain conversational context in NLPCraft. Maintaining the proper conversation context - remembering what the current conversation is about - is essential for all human interaction and thus essential for computer-based natural language understanding. To my knowledge, NLPCraft provides one of the most advanced implementations of STM, especially considering how tightly it is integrated with NLPCraft’s unique intent-based matching (Google’s <a target=google href="https://cloud.google.com/dialogflow/">DialogFlow</a> is the closest comparable approach).
+    </p>
+    <p>
+        Let’s dive in.
+    </p>
+</section>
+<section>
+    <h2 class="section-title">Parsing User Input</h2>
+    <p>
+        One of the key objectives when parsing a user input sentence for Natural Language Understanding (NLU) is to detect all possible semantic entities, a.k.a. <em>named entities</em>. Let’s consider a few examples:
+    </p>
+    <ul>
+        <li>
+            <code>"What’s the current weather in Tokyo?"</code><br/>
+            This sentence is fully sufficient for processing since it contains the topic <code>weather</code> as well as all the necessary parameters like time (<code>current</code>) and location (<code>Tokyo</code>).
+        </li>
+        <li>
+            <code>"What about Tokyo?"</code><br/>
+            This sentence is unclear since it lacks the subject of the question - what is it about Tokyo?
+        </li>
+        <li>
+            <code>"What’s the weather?"</code><br/>
+            This is also unclear since it is missing the important location and time parameters of our request.
+        </li>
+    </ul>
+    <p>
+        When these parameters are missing, we can sometimes substitute default values like the current user’s location and the current time. However, this can easily lead to a wrong interpretation if the conversation already has an existing context.
+    </p>
+    <p>
+        In real life, as well as in NLP-based systems, we always try to start a conversation with a fully defined sentence, since without a context the missing information cannot be obtained and the sentence cannot be interpreted.
+    </p>
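+    <p>
+        To make this concrete, below is a minimal sketch of the completeness check described above (plain Java; all names are illustrative and none of these types come from NLPCraft’s API): a query is processed only when its subject and parameters are present, and defaults are applied only when there is no prior conversation context that could contradict them.
+    </p>
+    <pre class="brush: java">
+import java.util.Optional;
+
+// Illustrative only - a toy completeness check, not NLPCraft's API.
+final class WeatherQuery {
+    Optional&lt;String&gt; subject = Optional.empty();  // e.g. "weather"
+    Optional&lt;String&gt; location = Optional.empty(); // e.g. "Tokyo, Japan"
+    Optional&lt;String&gt; time = Optional.empty();     // e.g. "now"
+
+    boolean isComplete() {
+        return subject.isPresent() &amp;&amp; location.isPresent() &amp;&amp; time.isPresent();
+    }
+
+    // Defaults are only safe when there is no prior conversation context.
+    void applyDefaults(String userLocation, boolean hasContext) {
+        if (!hasContext) {
+            if (location.isEmpty()) location = Optional.of(userLocation);
+            if (time.isEmpty()) time = Optional.of("now");
+        }
+    }
+}
+    </pre>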
+</section>
+<section>
+    <h2 class="section-title">Semantic Entities</h2>
+    <p>
+        Let’s take a closer look at the named entities from the above examples:
+    </p>
+    <ul>
+        <li>
+            <code>weather</code> - this is an indicator of the subject of the conversation. Note that it indicates the type of question rather than being an entity with multiple possible values.
+        </li>
+        <li>
+            <code>current</code> - this is an entity of type <code>Date</code> with the value of <code>now</code>.
+        </li>
+        <li>
+            <code>Tokyo</code> - this is an entity of type <code>Location</code> with two values:
+            <ul>
+                <li><code>city</code> - the type of the location.</li>
+                <li><code>Tokyo, Japan</code> - the normalized name of the location.</li>
+            </ul>
+            </ul>
+        </li>
+    </ul>
+    <p>
+        We have two distinct classes of entities:
+    </p>
+    <ul>
+        <li>
+            Entities that have no values and only act as indicators or types. The entity <code>weather</code> is the type indicator for the subject of the user input.
+        </li>
+        <li>
+            Entities that additionally have one or more specific values, like the <code>current</code> and <code>Tokyo</code> entities.
+        </li>
+    </ul>
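+    <p>
+        As a rough illustration, an entity can be modeled as a type plus an optional map of normalized values (a plain-Java sketch; these types are mine, not NLPCraft’s API) - the <code>weather</code> indicator carries no values, while <code>current</code> and <code>Tokyo</code> do:
+    </p>
+    <pre class="brush: java">
+import java.util.Map;
+
+// Illustrative sketch (Java 16+) - these types are not NLPCraft's API.
+record Entity(String type, Map&lt;String, String&gt; values) {}
+
+class EntityExamples {
+    // "weather" is a pure type indicator: it has no values.
+    static final Entity WEATHER = new Entity("weather", Map.of());
+
+    // "current" and "Tokyo" additionally carry normalized values.
+    static final Entity CURRENT = new Entity("Date", Map.of("value", "now"));
+    static final Entity TOKYO = new Entity("Location",
+        Map.of("kind", "city", "name", "Tokyo, Japan"));
+}
+    </pre>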
+    <div class="bq success">
+        <div style="display: inline-block; margin-bottom: 20px">
+            <a style="margin-right: 10px" target="opennlp" 
href="https://opennlp.apache.org";><img src="/images/opennlp-logo.png" 
height="32px" alt=""></a>
+            <a style="margin-right: 10px" target="google" 
href="https://cloud.google.com/natural-language/";><img 
src="/images/google-cloud-logo-small.png" height="32px" alt=""></a>
+            <a style="margin-right: 10px" target="stanford" 
href="https://stanfordnlp.github.io/CoreNLP";><img 
src="/images/corenlp-logo.gif" height="48px" alt=""></a>
+            <a style="margin-right: 10px" target="spacy" 
href="https://spacy.io";><img src="/images/spacy-logo.png" height="32px" 
alt=""></a>
+        </div>
+        <p>
+            Note that NLPCraft provides <a href="/integrations.html">support</a> for a wide variety of named entities (with all built-in ones being properly normalized), including <a href="/integrations.html">integrations</a> with
+            <a target="spacy" href="https://spacy.io/">spaCy</a>,
+            <a target="stanford" href="https://stanfordnlp.github.io/CoreNLP">Stanford CoreNLP</a>,
+            <a target="opennlp" href="https://opennlp.apache.org/">OpenNLP</a> and
+            <a target="google" href="https://cloud.google.com/natural-language/">Google Natural Language</a>.
+        </p>
+    </div>
+</section>
+<section>
+    <h2 class="section-title">Incomplete Sentences</h2>
+    <p>
+        Assuming we have previously asked about the weather in Tokyo (within the span of the ongoing conversation), one could presumably ask the following questions using a <em>shorter, incomplete</em> form:
+    </p>
+    <ul>
+        <li>
+            <code>"What about Kyoto?</code><br/>
+            This question is missing both the subject and the time. However, we can safely assume we are still talking about the current weather.
+        </li>
+        <li>
+            <code>"What about tomorrow?"</code><br/>
+            Just like above, we automatically assume the weather subject but use <code>Kyoto</code> as the location since it was mentioned last.
+        </li>
+    </ul>
+    <p>
+        These are incomplete sentences. This type of shorthand cannot be interpreted without prior context (neither by humans nor by machines) since, by itself, it is missing necessary information. In the context of the conversation, however, these incomplete sentences work. We can simply provide one or two entities and rely on the <em>"listener"</em> to recall the rest of the missing information from a <em>short-term memory</em>, a.k.a. the conversation context.
+    </p>
+    <p>
+        In NLPCraft, the intent-matching logic will automatically try to find the missing information in the conversation context (which is maintained automatically). Moreover, it will properly treat such recalled information during weighted intent matching, since it naturally carries less "weight" than something found explicitly in the user input.
+    </p>
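+    <p>
+        For illustration, here is roughly how a weather intent could be declared in NLPCraft’s intent DSL, with conversational terms marked by <code>~</code>. This is a sketch only: the exact DSL syntax, the callback signature and the token IDs below are assumptions that vary by NLPCraft version - consult the documentation for your release.
+    </p>
+    <pre class="brush: java">
+// Sketch only: the DSL details and the IDs ('wt:subject', 'nlpcraft:city',
+// 'nlpcraft:date') are assumptions, not verified against a specific release.
+&#64;NCIntent("intent=weather " +
+    "term(subject)={id == 'wt:subject'} " + // must be found in the current input
+    "term(city)~{id == 'nlpcraft:city'} " + // may be recalled from STM
+    "term(date)~{id == 'nlpcraft:date'}")   // may be recalled from STM
+NCResult onWeather(NCIntentMatch ctx) {
+    // Recalled tokens carry less weight in matching than tokens
+    // found explicitly in the user input.
+    return NCResult.text("...");
+}
+    </pre>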
+</section>
+<section>
+    <h2 class="section-title">Short-Term Memory</h2>
+    <p>
+        The short-term memory is exactly that... a memory that keeps only a small amount of recently used information and that evicts its contents after a short period of inactivity.
+    </p>
+    <p>
+        Let’s look at an example from real life. If you called your friend a couple of hours later asking <code>"What about a day after?"</code> (still talking about the weather in Kyoto), he’d likely be thoroughly confused. The conversation has timed out, and your friend has lost (forgotten) its context. You would have to explain again to your confused friend what it is that you are asking about...
+    </p>
+    <p>
+        NLPCraft has a simple rule: a 5-minute pause in the conversation resets its context. However, what happens if we switch the topic before this timeout elapses?
+    </p>
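+    <p>
+        The mechanics are easy to picture with a toy sketch (plain Java, illustrative only - this is not NLPCraft’s actual implementation): a store that tracks the last access time and evicts everything after a fixed period of inactivity.
+    </p>
+    <pre class="brush: java">
+import java.util.ArrayList;
+import java.util.List;
+
+// Toy short-term memory: forgets everything after a period of inactivity.
+// Illustrative only - not NLPCraft's implementation.
+class ShortTermMemory&lt;T&gt; {
+    private final long timeoutMs;
+    private final List&lt;T&gt; items = new ArrayList&lt;&gt;();
+    private long lastAccessMs = System.currentTimeMillis();
+
+    ShortTermMemory(long timeoutMs) { this.timeoutMs = timeoutMs; }
+
+    synchronized void remember(T item) {
+        evictIfStale();
+        items.add(item);
+        lastAccessMs = System.currentTimeMillis();
+    }
+
+    synchronized List&lt;T&gt; recall() {
+        evictIfStale();
+        lastAccessMs = System.currentTimeMillis();
+        return List.copyOf(items);
+    }
+
+    private void evictIfStale() {
+        if (System.currentTimeMillis() - lastAccessMs &gt; timeoutMs)
+            items.clear(); // conversation context reset
+    }
+}
+    </pre>
+    <p>
+        With <code>new ShortTermMemory&lt;&gt;(5 * 60 * 1000)</code> this mirrors the 5-minute rule above.
+    </p>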
+</section>
+<section>
+    <h2 class="section-title">Context Switch</h2>
+    <p>
+        Resetting the context on a timeout is, obviously, not a hard thing to do. What can be trickier is detecting when the conversation topic switches and the previous context needs to be forgotten to avoid confusing interpretation errors. It is uncanny how humans detect such a switch with seemingly no effort, and yet automating this task is anything but effortless...
+    </p>
+    <p>
+        Let’s continue our weather-related conversation. All of a sudden, we ask about something completely different:
+    </p>
+    <ul>
+        <li>
+            <code>"How much mocha latter at Starbucks?"</code><br/>
+            At this point we should forget all about previous conversation 
about weather and assume going forward
+            that we are talking about coffee in Starbucks.
+        </li>
+        <li>
+            <code>"What about Peet’s?"</code><br/>
+            We are now talking about a latte at Peet’s.
+        </li>
+        <li>
+            <code>"...and croissant?"</code><br/>
+            Asking about Peet’s crescent-shaped fresh rolls.
+        </li>
+    </ul>
+    <p>
+        Despite the seemingly obvious logic, the implementation of context switching is not an exact science. Sometimes you can have a "soft" context switch where you don’t change the topic of the conversation 100%, yet change it sufficiently to forget at least some parts of the previously collected context. NLPCraft has a built-in algorithm to detect a hard switch in the conversation. It also exposes an API to perform a selective reset of the conversation context in the case of a "soft" switch.
+    </p>
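+    <p>
+        As a sketch of such a selective reset (hedged: the exact conversation API differs across NLPCraft versions, so the method names below are assumptions - check your version’s Javadoc), a model callback handling the "soft" switch could drop only the location entities while keeping the rest of the context:
+    </p>
+    <pre class="brush: java">
+// Assumption: the conversation handle exposes a predicate-based selective
+// clear; verify the exact names against your NLPCraft version.
+NCConversation conv = ctx.getContext().getConversation();
+
+// "Soft" switch: forget only the location tokens, keep subject and date.
+conv.clearStm(tok -&gt; "nlpcraft:city".equals(tok.getId()));
+    </pre>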
 </section>
 
 
diff --git a/images/stm1.png b/images/stm1.png
new file mode 100644
index 0000000..73bafbf
Binary files /dev/null and b/images/stm1.png differ
diff --git a/tools/embedded_probe.html b/tools/embedded_probe.html
index 0be9f61..1bcfa4b 100644
--- a/tools/embedded_probe.html
+++ b/tools/embedded_probe.html
@@ -83,11 +83,11 @@ public class AlarmTest {
 
     &#64;BeforeEach
     void setUp() throws NCException, IOException {
-        NCEmbeddedProbe.start(AlarmModel.class);
+        // Create and open the test client only if the embedded probe started.
+        if (NCEmbeddedProbe.start(AlarmModel.class)) {
+            cli = new NCTestClientBuilder().newBuilder().build();
 
-        cli = new NCTestClientBuilder().newBuilder().build();
-
-        cli.open("nlpcraft.alarm.ex");
+            cli.open("nlpcraft.alarm.ex");
+        }
     }
 
     &#64;AfterEach
