On 12/7/06, Marshall Schor <[EMAIL PROTECTED]> wrote:
To support "chunking" as used in OmniFind, the reference chapter for the
CPE says things like:
<term>throttleID</term>
<listitem><para>[String] special attribute currently used by
OmniFind.</para></listitem>
It seems a bit strange to have OmniFind-specific information in the
Apache UIMA documentation.
Does anyone have a suggestion on how to better handle this? -Marshall
Well, can we just document how to use the feature (which I think
involves using a particular type system for document metadata and
populating it in the CollectionReader)? Others might have similar
requirements and want to use this.
I guess one problem is that the type name might not be what we want
(e.g., does it start with com.ibm?). We might want to consider making
this a UIMA built-in type (or perhaps not truly built-in, but a type
system descriptor provided with the SDK). Maybe it should be combined
with the org.apache.uima.examples.SourceDocumentInformation type,
which already defines a document URI and an isLastSegment flag; we'd
need to add the SegmentNumber.
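As a rough sketch only, a combined type system descriptor might look
something like the fragment below. The feature names and ranges are
illustrative assumptions rather than a copy of the shipped example
descriptor, and segmentNumber in particular is a hypothetical name for
the feature that would have to be added.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: feature names below are assumptions for discussion. -->
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <name>SourceDocumentInformation</name>
  <description>Document metadata set by the CollectionReader.</description>
  <types>
    <typeDescription>
      <name>org.apache.uima.examples.SourceDocumentInformation</name>
      <description>Where a document (or segment) came from.</description>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription>
          <name>uri</name>
          <description>URI of the source document.</description>
          <rangeTypeName>uima.cas.String</rangeTypeName>
        </featureDescription>
        <featureDescription>
          <name>lastSegment</name>
          <description>True if this CAS holds the final segment.</description>
          <rangeTypeName>uima.cas.Boolean</rangeTypeName>
        </featureDescription>
        <!-- Hypothetical addition: position of this segment in the document. -->
        <featureDescription>
          <name>segmentNumber</name>
          <description>Index of this segment within the document.</description>
          <rangeTypeName>uima.cas.Integer</rangeTypeName>
        </featureDescription>
      </features>
    </typeDescription>
  </types>
</typeSystemDescription>
```

The CollectionReader would then create one instance of this type per
CAS and fill in the three features for each chunk it emits.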
-Adam