Revision: 17348
          http://sourceforge.net/p/gate/code/17348
Author:   adamfunk
Date:     2014-02-19 17:05:15 +0000 (Wed, 19 Feb 2014)
Log Message:
-----------
Partial documentation of the PMI bank.

Modified Paths:
--------------
    userguide/trunk/misc-creole.tex
    userguide/trunk/recent-changes.tex

Modified: userguide/trunk/misc-creole.tex
===================================================================
--- userguide/trunk/misc-creole.tex     2014-02-19 17:03:32 UTC (rev 17347)
+++ userguide/trunk/misc-creole.tex     2014-02-19 17:05:15 UTC (rev 17348)
@@ -3254,15 +3254,16 @@
 TermRaider is a set of term extraction and scoring tools developed in the NeOn
 and ARCOMEM projects.  Although the plugin is still experimental, we are now
 including it in GATE as a response to frequent requests from GATE users who have
-read publications related to those projects.  Please note that although the
-TermRaider GUI and API are themselves fairly stable, they are subject to change
-and the output formats are unstable.
+read publications related to those projects.
 
 The easiest way to test TermRaider is to populate a corpus with related
 documents, load the sample
 application (\texttt{plugins/TermRaider/applications/termraider-eng.gapp}),
 and run it.  This application will process the documents and create instances of
 three termbank language resources with sensible parameters.
+
+All the language resources in TermRaider are properly serializable and so can be
+stored in GATE datastores.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \subsect{Termbank language resources}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -3376,12 +3377,34 @@
 \emph{docFrequencyFeature} unless that is blank.  Note that the default values
 are not blank---you need to clear either or both parameters to prevent these
 annotation features from being filled in.
-%
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-% OBVIOUSLY I'm not finished documenting this.
-% -- Adam
+\subsect{The PMI bank language resource}
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+Like termbanks, the \emph{PMI Bank} is a GATE language resource derived from
+annotations on one or more GATE corpora.  The PMI Bank, however, works on
+\emph{collocations}---pairs of ``inner'' annotations (e.g., \emph{Token} or
+named entity types) within a sliding window defined as a number of ``outer''
+annotations (usually 1 or 2 \emph{Sentence} annotations).
 
+The documents need to be processed to create the required inner and outer
+annotations, as shown in the \texttt{pmi-example.gapp} sample application
+provided in this plugin.  The PMI Bank can then be created with the following
+init parameters.
+%%
+\begin{description}
+\item[allowOverlapCollocations] default \texttt{false}
+\item[corpora]
+\item[debugMode] default \texttt{false}
+\item[innerAnnotationTypes] default \verb![Entity]!
+% TODO change the default to something more sensible
+\item[inputASName]
+\item[inputAnnotationFeature] default \texttt{canonical}
+\item[languageFeature] default \texttt{lang}
+\item[outerAnnotationType] default \texttt{Sentence}
+\item[outerAnnotationWindow] default \texttt{2}
+\item[requireTypeDifference] default \texttt{false}
+\item[scoreProperty] default \texttt{pmiScore}
+\end{description}
 %
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \sect[sec:misc-creole:doc-normalizer]{Document Normalizer}

Modified: userguide/trunk/recent-changes.tex
===================================================================
--- userguide/trunk/recent-changes.tex  2014-02-19 17:03:32 UTC (rev 17347)
+++ userguide/trunk/recent-changes.tex  2014-02-19 17:05:15 UTC (rev 17348)
@@ -21,6 +21,11 @@
 
 \rcSect[next-release]{Next Release}
 
+\rcSubsect{February 2014}
+
+The Twitter JSON document format and corpus population tool are now documented
+in the Twitter plugin (Section~\ref{sec:creole:tweet}).
+
 \rcSubsect{January 2014}
 
 A new plugin that allows for document normalization has been added. This

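Note on the PMI Bank text added above: the new subsection describes collocations as pairs of inner annotations found inside a sliding window of outer annotations, but the commit does not show how a score is derived from them. The sketch below is only an illustration using the textbook definition of pointwise mutual information, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), with probabilities estimated as fractions of windows. It is not TermRaider code: the function name `pmi_scores`, its arguments, and the choice of the window as the counting unit are all assumptions made for this example, and the plugin's actual computation may differ.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(sentences, window=2):
    """Hypothetical helper, not part of TermRaider.

    sentences: list of lists of strings; each inner list stands for the
               inner annotations found in one outer annotation.
    window:    number of consecutive outer annotations per sliding window
               (plays the role described for outerAnnotationWindow).
    Returns a dict mapping sorted (term_a, term_b) tuples to PMI in bits.
    """
    term_counts = Counter()   # windows containing each term
    pair_counts = Counter()   # windows containing both terms of a pair
    n_windows = 0

    # Slide over the outer annotations; a short input yields one window.
    for start in range(max(1, len(sentences) - window + 1)):
        terms = set()
        for sentence in sentences[start:start + window]:
            terms.update(sentence)
        n_windows += 1
        term_counts.update(terms)
        # sorted() makes each pair key order-independent
        pair_counts.update(combinations(sorted(terms), 2))

    scores = {}
    for (a, b), joint in pair_counts.items():
        # p(a,b) / (p(a) p(b)) with every probability a fraction of n_windows
        ratio = joint * n_windows / (term_counts[a] * term_counts[b])
        scores[(a, b)] = math.log2(ratio)
    return scores
```

For example, `pmi_scores([["London", "Thames"], ["Paris", "Seine"]] * 2, window=1)` scores only the pairs that share a sentence, since terms from different sentences never fall in the same one-sentence window.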