[ https://issues.apache.org/jira/browse/UIMA-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734606#action_12734606 ]
Jérôme Rocheteau edited comment on UIMA-1447 at 7/23/09 7:46 AM:
-----------------------------------------------------------------

I suggest this patch: it merely checks that the current character isn't a whitespace character when creating a token annotation for a special character.

was (Author: jerome.rocheteau):
I suggest this patch: it merely checks if the current character isn't a whitespace while creating a token annotation is created for a special character.

> Tabulations are annotated as tokens after a space
> -------------------------------------------------
>
>                 Key: UIMA-1447
>                 URL: https://issues.apache.org/jira/browse/UIMA-1447
>             Project: UIMA
>          Issue Type: Bug
>          Components: Sandbox-WhitespaceTokenizer
>    Affects Versions: 2.3S
>         Environment: Unix (ubuntu 8.04), Eclipse Galileo 3.5
>            Reporter: Jérôme Rocheteau
>         Attachments: patch-an-wst.txt
>
>
> This is a test-text for the Whitespace Tokenizer in the UIMA Sandbox.
> It behaves as follows: a '\t' character after a space is
> annotated as a token, and its covered text is set to the empty string ""!
> I suppose this shouldn't be the case; am I wrong?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
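
For readers without the attachment at hand, the guard described in the comment can be sketched roughly as follows. This is not the content of patch-an-wst.txt and not the actual WhitespaceTokenizer source: the class name WhitespaceGuardSketch, the tokenize method, and the simplified character classes (letters and digits form words, everything else is "special") are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified tokenizer; NOT the UIMA Sandbox WhitespaceTokenizer.
class WhitespaceGuardSketch {

    /** Splits text into word tokens and single-character special tokens. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int wordStart = -1; // start offset of the word being read, or -1 if none

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                if (wordStart < 0) {
                    wordStart = i; // a new word begins here
                }
                continue;
            }
            // Any non-word character ends the word currently being read.
            if (wordStart >= 0) {
                tokens.add(text.substring(wordStart, i));
                wordStart = -1;
            }
            // The guard from the comment: create a token for a special
            // character only if that character is not whitespace.
            if (!Character.isWhitespace(c)) {
                tokens.add(String.valueOf(c));
            }
        }
        // Flush a word that runs to the end of the text.
        if (wordStart >= 0) {
            tokens.add(text.substring(wordStart));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A tab directly after a space, as in the reported case.
        System.out.println(tokenize("a \tb, c"));
    }
}
```

With the whitespace check in place, the tab that follows the space in this sketch produces no token at all, while the comma is still emitted as a special-character token.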