[ 
https://issues.apache.org/jira/browse/OPENNLP-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987572#action_12987572
 ] 

Jörn Kottmann commented on OPENNLP-62:
--------------------------------------

I have now reviewed your code.

ChunkSample.phrasesAsSpanList:
The phrases ArrayList should be initialized with the size of the input
sentence as its capacity. That is always the maximum number of Span
objects the code below can create. Otherwise the internal array must be
resized whenever the sentence is longer than the default capacity.
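
A minimal sketch of the sizing suggestion, not the actual ChunkSample
code. The class and method names (SpanListSizing, newPhraseList) are
hypothetical. It assumes each token can begin at most one phrase, so the
token count is an upper bound on the number of spans.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of OpenNLP.
class SpanListSizing {

    // Each token can begin at most one phrase, so the sentence length
    // is an upper bound on the number of spans that will be added.
    // Passing it as the initial capacity means the backing array never
    // has to grow while the list is filled.
    static <T> List<T> newPhraseList(String[] sentence) {
        return new ArrayList<T>(sentence.length);
    }
}
```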

The test for this method should contain a case with a B-O-B tag
sequence, to verify that everything still works correctly after an O.
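
The kind of case meant here could look roughly like the sketch below. It
is self-contained and does not use the real OpenNLP classes:
ChunkSpanSketch, PhraseSpan and phrasesAsSpans are hypothetical
stand-ins, and the handling of a stray I- tag (treated as outside any
phrase) is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; not the OpenNLP classes.
class ChunkSpanSketch {

    // Half-open token range [start, end) with a phrase type.
    static final class PhraseSpan {
        final int start;
        final int end;
        final String type;

        PhraseSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + start + ".." + end + ")";
        }
    }

    // Converts B-/I-/O chunk tags into phrase spans.
    static List<PhraseSpan> phrasesAsSpans(String[] tags) {
        // At most one span per token, so tags.length is a safe capacity.
        List<PhraseSpan> phrases = new ArrayList<PhraseSpan>(tags.length);
        int start = -1;      // start index of the open phrase, -1 if none
        String type = null;  // type of the open phrase

        for (int i = 0; i < tags.length; i++) {
            String tag = tags[i];
            if (start >= 0 && tag.equals("I-" + type)) {
                continue; // the open phrase simply grows by one token
            }
            // Anything other than a matching I- tag closes the open phrase.
            if (start >= 0) {
                phrases.add(new PhraseSpan(start, i, type));
                start = -1;
                type = null;
            }
            if (tag.startsWith("B-")) {
                start = i;
                type = tag.substring(2);
            }
            // Assumption: an O tag or a stray I- tag opens nothing.
        }
        if (start >= 0) {
            phrases.add(new PhraseSpan(start, tags.length, type));
        }
        return phrases;
    }
}
```

With the tags B-NP, I-NP, O, B-VP, I-VP this yields two spans: NP over
tokens [0, 2) and VP over tokens [3, 5). The O has to close the first
phrase, and the following B has to open a new one.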

> Chunker should output chunks also as Spans
> ------------------------------------------
>
>                 Key: OPENNLP-62
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-62
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Chunker
>            Reporter: Jörn Kottmann
>            Assignee: William Colen
>             Fix For: tools-1.5.1-incubating
>
>
> The chunker currently takes a string array as input and outputs a tag
> for each input string.
> The interface should be extended so that it can output an array of
> Spans instead, where each Span contains the type and the begin/end
> offsets in the input array, as the name finder does and as it is done
> by ChunkSample.getPhrasesAsSpanList().

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
