[ https://issues.apache.org/jira/browse/UIMA-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905646#action_12905646 ]
Marshall Schor commented on UIMA-1861:
--------------------------------------
Thanks for the fixes/patch. Here are a few suggested changes that make
better use of JCas. (I've attached this version as patch2 above.)
1) Although this annotator is set up as a JCas annotator, it is missing the
JCas type for TokenAnnotation. Because of this, it goes to some lengths to
avoid using this type where it would be useful. Adding the JCas cover types
is easy: open the desc/SnowballAnnotator.xml descriptor in the
Component Descriptor editor in Eclipse, select the Type System page, and press
the JCasGen button. This will generate the missing classes for the types and
add them to the project.
If the TokenAnnotation JCas type were available, the lines:
(original)
// iterate over all token annotations and add stem if available
FSIterator tokenIterator =
aJCas.getCas().getAnnotationIndex(this.tokenAnnotation).iterator();
(with patch)
// iterate over all token annotations and add stem if available
FSIterator tokenIterator =
aJCas.getAnnotationIndex(this.tokenAnnotation).iterator();
// note: causes a generics-related warning, which then needs a suppress-warnings annotation
could be written
// iterate over all token annotations and add stem if available
FSIterator<TokenAnnotation> tokenIterator =
(FSIterator<TokenAnnotation>)(FSIterator<?>) // very ugly "double-fisted cast"
aJCas.getAnnotationIndex(TokenAnnotation.type).iterator();
and the code in the bottom method (typeSystemInit) would not be needed. The
"double-fisted cast" is described here:
http://markmail.org/message/w5kpympalj6tvqq3
Alternatively, to avoid the double cast, the FSIterator could be over the type
Annotation, and the result of next() could be cast explicitly to TokenAnnotation:
// iterate over all token annotations and add stem if available
FSIterator<Annotation> tokenIterator =
aJCas.getAnnotationIndex(TokenAnnotation.type).iterator();
...
TokenAnnotation annot = (TokenAnnotation) tokenIterator.next();
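To make the difference between the two casting styles concrete, here is a minimal sketch using only java.util types. The nested Annotation and TokenAnnotation classes and the sampleIndex() helper are hypothetical stand-ins, not the real UIMA classes; java.util.Iterator plays the role of FSIterator. The generics rules for the casts are the same.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class IteratorCastSketch {

    // Hypothetical stand-in for UIMA's Annotation base type.
    static class Annotation {
        final String coveredText;
        Annotation(String coveredText) { this.coveredText = coveredText; }
    }

    // Hypothetical stand-in for the JCasGen-generated TokenAnnotation class.
    static class TokenAnnotation extends Annotation {
        TokenAnnotation(String coveredText) { super(coveredText); }
    }

    // Stand-in for an annotation index that holds only TokenAnnotation objects.
    static List<Annotation> sampleIndex() {
        List<Annotation> index = new ArrayList<>();
        index.add(new TokenAnnotation("running"));
        index.add(new TokenAnnotation("jumps"));
        return index;
    }

    // Option 1: cast the iterator itself by going through the wildcard type.
    // A direct cast from Iterator<Annotation> to Iterator<TokenAnnotation>
    // is rejected by the compiler, hence the intermediate Iterator<?>.
    @SuppressWarnings("unchecked")
    static List<String> viaDoubleCast(List<Annotation> index) {
        Iterator<TokenAnnotation> it =
            (Iterator<TokenAnnotation>) (Iterator<?>) index.iterator();
        List<String> texts = new ArrayList<>();
        while (it.hasNext()) {
            TokenAnnotation token = it.next(); // no per-element cast needed
            texts.add(token.coveredText);
        }
        return texts;
    }

    // Option 2: keep the iterator over the base type and cast each element.
    // No unchecked warning, at the cost of one explicit cast per next().
    static List<String> viaElementCast(List<Annotation> index) {
        Iterator<Annotation> it = index.iterator();
        List<String> texts = new ArrayList<>();
        while (it.hasNext()) {
            TokenAnnotation token = (TokenAnnotation) it.next();
            texts.add(token.coveredText);
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(viaDoubleCast(sampleIndex()));
        System.out.println(viaElementCast(sampleIndex()));
    }
}
```

Both methods visit the same elements. The first moves the unchecked cast to the iterator declaration; the second keeps the declaration warning-free and checks the type at each element.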
The line further down, which reads
// get stemmer result and set annotation feature
annot.setStringValue(this.tokenAnnotationStemmFeature,
stemmer.getCurrent());
would be better written (using JCas style) as:
// get stemmer result and set annotation feature
annot.setStem(stemmer.getCurrent());
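As a rough plain-Java analogy of why the typed setter reads better, the sketch below contrasts a generic "set feature by key" call with a typed accessor that wraps it. This is not UIMA's actual implementation: GenericFs, its String-keyed setStringValue, and the feature name "stem" are assumptions made for illustration (the real API takes a Feature object, and the real cover class is generated by JCasGen).

```java
import java.util.HashMap;
import java.util.Map;

public class CoverClassSketch {

    // Hypothetical generic structure: features are addressed by a key,
    // so a misspelled key is only discovered at run time.
    static class GenericFs {
        private final Map<String, String> features = new HashMap<>();

        void setStringValue(String feature, String value) {
            features.put(feature, value);
        }

        String getStringValue(String feature) {
            return features.get(feature);
        }
    }

    // Hypothetical cover class: typed accessors hide the feature lookup,
    // so callers need no feature constants of their own.
    static class TokenAnnotation extends GenericFs {
        private static final String STEM_FEATURE = "stem"; // assumed name

        void setStem(String stem) {
            setStringValue(STEM_FEATURE, stem);
        }

        String getStem() {
            return getStringValue(STEM_FEATURE);
        }
    }

    // Sets the stem through the typed accessor and reads it back.
    static String stemRoundTrip(String stem) {
        TokenAnnotation annot = new TokenAnnotation();
        annot.setStem(stem);
        return annot.getStem();
    }

    public static void main(String[] args) {
        System.out.println(stemRoundTrip("run"));
    }
}
```

With the typed accessor, the caller no longer has to look up and hold the feature itself, which is the bookkeeping that typeSystemInit performs in the current annotator.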
If the JCas style is used, the typeSystemInit method can be deleted, along with
all the constants added to support it, because the values it computes are no
longer used. In any case, it should not be called in the process method. (The
UIMA framework calls it directly, but only when the type system changes.)
> SnowballAnnotator needs refactoring
> -----------------------------------
>
> Key: UIMA-1861
> URL: https://issues.apache.org/jira/browse/UIMA-1861
> Project: UIMA
> Issue Type: Bug
> Components: Sandbox-SnowballAnnotator
> Affects Versions: 2.3.1
> Reporter: Tommaso Teofili
> Assignee: Tommaso Teofili
> Fix For: 2.3.1
>
> Attachments: SnowballAnnotatorPatch2.txt, UIMA1861-patch.txt
>
>
> SnowballAnnotator extends the deprecated JTextAnnotator_ImplBase, has
> some unused imports, and generics should be enabled.
> Moreover, the initialize() method fails because the AnnotatorContext object
> is null when run in a 2.3.1-SNAPSHOT distribution.