Author: schor
Date: Fri May 20 15:14:26 2016
New Revision: 1744753

URL: http://svn.apache.org/viewvc?rev=1744753&view=rev
Log:
no Jira - add table consolidating useful comparative information about the 
alternative CAS Serialization capabilities

Modified:
    
uima/uimaj/trunk/uima-docbook-tutorials-and-users-guides/src/docbook/tug.application.xml

Modified: 
uima/uimaj/trunk/uima-docbook-tutorials-and-users-guides/src/docbook/tug.application.xml
URL: 
http://svn.apache.org/viewvc/uima/uimaj/trunk/uima-docbook-tutorials-and-users-guides/src/docbook/tug.application.xml?rev=1744753&r1=1744752&r2=1744753&view=diff
==============================================================================
--- 
uima/uimaj/trunk/uima-docbook-tutorials-and-users-guides/src/docbook/tug.application.xml
 (original)
+++ 
uima/uimaj/trunk/uima-docbook-tutorials-and-users-guides/src/docbook/tug.application.xml
 Fri May 20 15:14:26 2016
@@ -485,17 +485,21 @@ ae.destroy();</programlisting></para>
       <title>Saving CASes to file systems or general Streams</title>
       
       <para>The UIMA framework provides multiple APIs to save and restore the 
contents of a CAS to streams. 
+      Two common uses of this are to save CASes to the file system, and to 
send CASes to other processes, running
+      on remote systems.</para>
+      
+      <para>
         The CASes can be serialized in multiple formats:
         <itemizedlist>
           <listitem>
             <para>Binary formats:
               <itemizedlist>
                 <listitem>
-                  <para>plain binary:  This is used to communicate with remote 
services, and also for interfacing with
+                  <para>plain binary: This is used to communicate with remote 
services, and also for interfacing with
                   annotators written in C/C++ or related languages via the JNI 
Java interface, from Java</para>
                 </listitem>
                 <listitem>
-                  <para>Two forms of compressed binary.  The recommend one is 
form 6, which also allows
+                  <para>Compressed binary: There are two forms of compressed 
binary.  The recommended one is form 6, which also allows
                   type filtering. See <olink targetdoc="&uima_docs_ref;" 
targetptr="ugr.ref.compress.overview"/>.</para>
                 </listitem>
               </itemizedlist>
@@ -515,6 +519,141 @@ ae.destroy();</programlisting></para>
         </itemizedlist>
       </para>
       
+      <para>Each of these serializations has different capabilities, 
summarized in the table below.
+       <table frame="all" id="ugr.tug.tbl.serialization_capabilities">
+          <title>Serialization Capabilities</title>
+          <tgroup cols="7" rowsep="1" colsep="1">
+            <colspec colname="c1"/>
+            <colspec colname="c2"/>
+            <colspec colname="c3"/>
+            <colspec colname="c4"/>
+            <colspec colname="c5"/>
+            <colspec colname="c6"/>
+            <colspec colname="c7"/>
+            <thead>
+              <row>
+                <entry align="center"></entry>
+                <entry align="center">XCAS</entry>
+                <entry align="center">XMI</entry>
+                <entry align="center">JSON</entry>
+                <entry align="center">Binary</entry>
+                <entry align="center">Cmpr 4</entry>
+                <entry align="center">Cmpr 6</entry>
+              </row>
+            </thead>
+            <tbody>
+              <row>
+                <entry>Output</entry>
+                <entry>Output Stream</entry>
+                <entry>Output Stream</entry>
+                <entry>Output Stream, File, Writer</entry>
+                <entry>Output Stream</entry>
+                <entry>Output Stream, Data Output Stream, File</entry>
+                <entry>Output Stream, Data Output Stream, File</entry>
+              </row>
+              <row>
+                <entry>Lists/Arrays inline formatting?</entry>
+                <entry>-</entry>
+                <entry>Yes</entry>
+                <entry>Yes</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+              </row>
+              <row>
+                <entry>Formatted?</entry>
+                <entry>-</entry>
+                <entry>Yes</entry>
+                <entry>Yes</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+              </row>
+              <row>
+                <entry>Type Filtering?</entry>
+                <entry>-</entry>
+                <entry>Yes</entry>
+                <entry>Yes</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+                <entry>Yes</entry>
+              </row>
+              <row>
+                <entry>Delta Cas?</entry>
+                <entry>-</entry>
+                <entry>Yes</entry>
+                <entry>-</entry>
+                <entry>Yes</entry>
+                <entry>Yes</entry>
+                <entry>Yes</entry>
+              </row>
+              <row>
+                <entry>OOTS?</entry>
+                <entry>Yes</entry>
+                <entry>Yes</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+              </row>
+              <row>
+                <entry>Only send indexed + reachable FSs?</entry>
+                <entry>Yes</entry>
+                <entry>Yes</entry>
+                <entry>Yes</entry>
+                <entry>send all</entry>
+                <entry>send all</entry>
+                <entry>Yes</entry>
+              </row>
+              <row>
+                <entry>NameSpace/Schemas?</entry>
+                <entry>-</entry>
+                <entry>Yes</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+                <entry>-</entry>
+              </row>
+            </tbody>
+          </tgroup>
+          
+        </table>
+      </para>
+      
+      <para>In the above table, Cmpr 4 and Cmpr 6 refer to Compressed forms of 
the serialization.</para>
+      
+      <para>For the XMI and JSON formats, lists and arrays can sometimes be 
formatted "inline".
+      In this representation, the elements are formatted directly as the value 
of a particular
+      feature.  This is only done if the arrays and lists are not 
multiply-referenced.</para>
+      
+      <para>Type Filtering support enables only a subset of the types and/or 
features to be
+      serialized. An additional type system object is used to specify the 
types to be included
+      in the serialization.  This can be useful, for instance, when sending a 
CAS to a remote service,
+      where the remote service only uses a small number of the types and 
features, to reduce the size
+      of the serialized CAS.</para>
+      
+      <para>Delta Cas support makes use of a "mark" set in the CAS, and only 
serializes changes in the CAS,
+      both new and modified Feature Structures, that were added or changed 
after the mark was set.
+      This is useful for remote services, supporting the use-case where a 
large CAS is sent to the service,
+      which sets the mark in the received CAS, and then adds a small amount of 
information; 
+      the Delta CAS then serializes only that small amount as the "reply" sent 
back to the sender.</para>
+      
+      <para>OOTS means "Out of Type System" support, intended to support the 
use-case where a CAS is being sent
+      to a remote application.  This supports deserializing an incoming CAS 
where
+      some of the types and/or features may not be present in the receiving 
CAS's type system.  A "lenient" 
+      option on the deserialization permits the deserialization to proceed, 
with the out-of-type-system
+      information preserved so that when the CAS is subsequently reserialized 
(in the use-case, to be 
+      returned back to the sender), the out-of-type-system information is 
re-merged back into the output stream.
+      </para>
+      
+      <para>The Binary and Compressed Form 4 serializations send all the 
Feature Structures in the CAS,
+      in the order they were created in the CAS.  The other methods only 
+      send Feature Structures that are reachable, either by 
+      their being in some CAS index, or being referenced 
+      as a feature of another Feature Structure which is reachable.</para>
+      
+      <para>The NameSpace/Schema support allows specifying a set of schemas, 
each one corresponding to a particular
+      namespace, used in XMI serialization.</para>
       <para>To save an XMI representation of a CAS, use the 
<literal>serialize</literal> method of the class
         <literal>org.apache.uima.util.XmlCasSerializer</literal>. To save an 
XCAS representation of a CAS,
         use the class 
<literal>org.apache.uima.cas.impl.XCASSerializer</literal> instead; see the 
Javadocs
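
Note on the "Only send indexed + reachable FSs?" row in the diff above: the
difference between "send all" and "reachable only" can be pictured with a
small toy model. The sketch below is an illustration only and does not use
the UIMA API; the class ReachabilitySketch, its Node type, and the methods
sendAll and sendReachable are hypothetical names invented for this example.
It assumes a "heap" list standing in for the CAS (in creation order) and an
"indexed" list standing in for the Feature Structures held in CAS indexes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these types are hypothetical and are NOT the UIMA API.
class ReachabilitySketch {

    /** Stand-in for a Feature Structure: a name plus references to other nodes. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** "Send all": every node is emitted, in the order it was created. */
    static List<String> sendAll(List<Node> heap) {
        List<String> out = new ArrayList<>();
        for (Node n : heap) {
            out.add(n.name);
        }
        return out;
    }

    /** "Reachable only": nodes that are indexed, or referenced (transitively) from an indexed node. */
    static List<String> sendReachable(List<Node> indexed) {
        Set<Node> seen = new LinkedHashSet<>();      // identity-based, keeps visit order
        Deque<Node> work = new ArrayDeque<>(indexed);
        while (!work.isEmpty()) {
            Node n = work.removeFirst();
            if (seen.add(n)) {                       // first visit: follow its references
                for (Node r : n.refs) {
                    work.addLast(r);
                }
            }
        }
        List<String> out = new ArrayList<>();
        for (Node n : seen) {
            out.add(n.name);
        }
        return out;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        a.refs.add(b);                               // b is not indexed, but is referenced by a

        List<Node> heap = Arrays.asList(a, b, c);    // creation order
        List<Node> indexed = Arrays.asList(a);       // only a is in an index

        System.out.println(sendAll(heap));           // prints [a, b, c]
        System.out.println(sendReachable(indexed));  // prints [a, b]
    }
}
```

In this model, node c is neither indexed nor referenced, so only the
"send all" style would include it. This mirrors the distinction the new
documentation text draws between Binary / Compressed Form 4 and the other
serializations, without claiming anything about how UIMA implements it.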

