Revision: 19350
          http://sourceforge.net/p/gate/code/19350
Author:   markagreenwood
Date:     2016-05-25 08:01:21 +0000 (Wed, 25 May 2016)
Log Message:
-----------
first cut of documentation for GATE-Time

Modified Paths:
--------------
    userguide/branches/release-8.2/misc-creole.tex
    userguide/branches/release-8.2/recent-changes.tex

Modified: userguide/branches/release-8.2/misc-creole.tex
===================================================================
--- userguide/branches/release-8.2/misc-creole.tex      2016-05-25 01:21:54 UTC (rev 19349)
+++ userguide/branches/release-8.2/misc-creole.tex      2016-05-25 08:01:21 UTC (rev 19350)
@@ -3646,3 +3646,99 @@
 application, as it is not possible to provide this in an easily portable way.
 See Section \ref{sec:misc-creole:wn} for details of how to load WordNet into
 GATE.
+
+%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\sect[sec:misc-creole:gate-time]{GATE-Time}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%
+This plugin provides a number of components and applications for annotating
+time-related information and events within documents.
+
+\subsect{DCTParser}
+When processing news documents (news-style and also colloquial), it is important
+that later components (based around HeidelTime) know the document creation time
+(DCT) of each document.
+
+Note that the DCT is not the time at which the document was loaded into GATE.
+Instead, it is the time at which the document was written, e.g., when a news
+document was published. The DCTParser provides the DCT for a single document
+or for all documents in a corpus. It can be used in two ways:
+
+\begin{itemize}
+\item to parse the DCT out of TimeML-style XML documents, e.g., the corpora
+TempEval-3 TimeBank, TempEval-3 Aquaint, and TempEval-3 platinum
+contain DCT information in this format (cf. very last section).
+\item to manually set the DCT for a document or a full corpus.
+\end{itemize}
+
+%It might make sense to add further parsing formats to DCTParser, e.g.,
+%that the DCT can be parsed out of a document's name (e.g., if the documents
+%in a corpus are named like ``NYT-20100910-article1.txt'' and
+%``NYT-20100911-article1.txt'').
+Note that if a corpus contains many documents, the documents typically have
+differing DCTs. Currently, the DCT can only be parsed if it is available in
+TimeML-style format; otherwise it has to be provided manually for the document
+or the full corpus. If HeidelTime processes news documents with wrong DCT
+information, relative and underspecified expressions will be normalized
+incorrectly.
+
+If the documents to be processed are narrative documents (e.g., Wikipedia
+documents), no document creation time is required. The HeidelTime GATE wrapper
+handles this automatically if the documentType parameter of the HeidelTime
+component is set to ``narratives'' (see next section).
+
+The DCTParser is configured through the following runtime parameters:
+\begin{itemize}
+\item[dctParsingFormat] timeml or manualdate
+\item[inputASName] name of the annotation set where DCT is stored
+\item[manuallySetDct] if the format is set to ``manualdate'', the date given
+here is stored as the DCT by DCTParser
+\item[outputASName] name of annotation set for output
+\end{itemize}
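+
+As an illustration, the following settings would make DCTParser assign one
+manually chosen DCT to every document it processes. This is only a sketch: the
+date value and its year-month-day form are assumptions made for the example,
+not values taken from the plugin.
+
+\begin{verbatim}
+dctParsingFormat = manualdate
+manuallySetDct   = 2016-05-25
+\end{verbatim}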
+
+\subsect{HeidelTime}
+HeidelTime can be used for many languages and four domains (in particular
+news and narrative, but also colloquial and autonomic for English -- see the
+HeidelTime standalone manual). Note that HeidelTime can perform linguistic
+preprocessing for all of these languages if the respective tools are installed
+and configured correctly in the config.props file.
+
+If HeidelTime is processing narrative-style documents, DCT information does
+not need to be available for the documents. If news-style (or colloquial)
+documents are processed, DCT information is crucial and processing fails if
+no DCT information is available. In that case, creationDateAnnotationType has
+to name the annotation type that holds the DCT (see above).
+
+HeidelTime can be used in such a way that the linguistic preprocessing is
+performed internally. For this, further tools have to be set up and the
+parameter doPreprocessing has to be set to `true'. In this case, the
+parameters concerning Sentence, Token, and POS annotations are ignored.
+If other preprocessing annotations are to be used (e.g., those of ANNIE), then
+doPreprocessing has to be set to `false' and the parameters concerning
+Sentence, Token, and POS annotations have to be provided correctly.
+
+HeidelTime is configured via three init parameters (these are init parameters
+because different models have to be loaded depending on language and domain):
+
+\begin{itemize}
+\item[configFile] the location of the config.props file
+\item[documentType] narratives, news, colloquial, or scientific
+\item[language] english, german, dutch, .......
+\end{itemize}
+
+and the following runtime parameters:
+
+\begin{itemize}
+\item[creationDateAnnotationType] if DCTParser is used to set the DCT, then
+the value is ``DCT''
+\item[doPreprocessing] set to false to use existing annotations, true if you want HeidelTime to pre-process the document
+\item[inputASName] name of annotation set, where token, sentence, pos information are stored (if any)
+\item[outputASName] name of annotation set for output
+\item[posAnnotationNameAsTokenAttribute] name of the `part-of-speech' attribute of the `Token' annotations (if using ANNIE, this is `category')
+\item[sentenceAnnotationType] name of the `Sentence' annotation (if using ANNIE, this is `Sentence')
+\item[tokenAnnotationType] name of the `Token' annotation (if using ANNIE, this is `Token')
+\end{itemize}
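+
+As an illustration, an English news-style document that has already been
+processed by ANNIE and DCTParser might be handled with settings such as the
+following. This is a sketch assembled from the parameter descriptions above,
+not a tested configuration, and the configFile value is left as a placeholder.
+
+\begin{verbatim}
+init parameters:
+  configFile   = <path to config.props>
+  documentType = news
+  language     = english
+
+runtime parameters:
+  creationDateAnnotationType        = DCT
+  doPreprocessing                   = false
+  posAnnotationNameAsTokenAttribute = category
+  sentenceAnnotationType            = Sentence
+  tokenAnnotationType               = Token
+\end{verbatim}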
+
+\subsect{TimeML Event Detection}
+
+The plugin also contains a ``Ready Made'' application for detecting TimeML-based events.

Modified: userguide/branches/release-8.2/recent-changes.tex
===================================================================
--- userguide/branches/release-8.2/recent-changes.tex   2016-05-25 01:21:54 UTC (rev 19349)
+++ userguide/branches/release-8.2/recent-changes.tex   2016-05-25 08:01:21 UTC (rev 19350)
@@ -21,6 +21,10 @@
 
 \rcSect[next]{Next Release}
 
+\rcSubsect{May 2016}
+Added the \verb!Tagger_GATE-Time! plugin, which wraps HeidelTime and provides a TimeML event detection app. See
+Section~\ref{sec:misc-creole:gate-time} for more details.
+
 \rcSubsect{April 2016}
 
 Added \verb!Lang_Welsh! plugin, providing a Welsh language named entity
