Dude, thanks for your interest again. Your reply helped me solve my problem. I noticed that you have <START:person> where I have <START person>. I had a space where I should have had a colon. Thanks a ton!
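For anyone else who hits "Found unexpected annotation", here is a rough sketch of a checker that would have caught the space-for-colon mistake above. It assumes the only valid tags are <START>, <START:type> and <END>; the function names find_malformed_tags and scan are made up for this example and are not part of OpenNLP.

```python
import re

# Tags assumed valid in this sketch: <START>, <START:type> and <END>.
VALID_TAG = re.compile(r"<START(?::[^\s<>:]+)?>|<END>")
# Anything that looks like it was meant to be a START or END tag.
TAG_LIKE = re.compile(r"<\s*(?:START|END)[^<>]*>")


def find_malformed_tags(line):
    """Return tag-like substrings of `line` that are not valid tags."""
    return [m.group(0) for m in TAG_LIKE.finditer(line)
            if not VALID_TAG.fullmatch(m.group(0))]


def scan(lines):
    """Yield (line_number, tag) for every malformed tag, numbering from 1."""
    for number, line in enumerate(lines, start=1):
        for tag in find_malformed_tags(line):
            yield number, tag
```

To run it over a training file, pass the open file to scan(), for example one opened with open("train.txt", encoding="utf-16"), where the file name is only a placeholder. It prints nothing by itself, so loop over the result and print each pair.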
On Wed, Nov 20, 2013 at 9:53 AM, Jim - FooBar(); <[email protected]> wrote:

> It puts spaces within the tag - are you 100% positive it puts spaces
> outside the tag too? Generally this doesn't matter as tokens are already
> space-tokenised but surely you can imagine a case where the text was
> "...said to Mr. John." and the annotation would normally be "... said to
> Mr. <START:person> John <END>.". If your script forces spaces before and
> after the tag, in the majority of cases you would end up with double spaces
> everywhere. My annotator for opennlp does not enforce spaces outside the
> tag and assumes the user can sort these few weird cases with an editor
> which supports regex.
>
> Jim
>
>
> On 20/11/13 17:03, Walrus theCat wrote:
>
>> Hi Jim,
>>
>> Thanks for your interest. I realize that's how most other people solved
>> this error message, but it's not applicable in my case. The code errors
>> out on the first document, which doesn't commit this formatting error, and
>> it's not possible for any of my text to be formatted like that because the
>> script that generates it puts in spaces. To be thorough, I did search the
>> docs and nothing comes up. Does anyone have any ideas what could be wrong
>> here?
>>
>> Thanks
>>
>>
>> On Wed, Nov 20, 2013 at 2:38 AM, Jim - FooBar(); <[email protected]> wrote:
>>
>>> On 20/11/13 07:23, Walrus theCat wrote:
>>>
>>>> In training my NameFinderME, I get the following error message:
>>>>
>>>> Computing event counts... java.io.IOException: Found unexpected
>>>> annotation:
>>>>
>>>> In everything else Google has found me for this error message, it's
>>>> always a simple error in the spacing of the training data (e.g., change
>>>> <START:entity>some text<END> to <START:entity> some text <END>). This
>>>> isn't applicable to me (it's all correctly spaced). It's all UTF-16, and
>>>> specified to be so when I set up the objects to do the training. Any
>>>> ideas on what could be wrong?
>>>>
>>>> Thank you
>>>
>>> press ctrl+f on your favourite editor and search 'n' replace ">." with
>>> "> ." and possibly ">," with "> ,". I've been bitten by this before :)
>>>
>>> hope that helps,
>>> Jim
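Jim's search-and-replace advice can also be scripted. The sketch below is a narrower variant of it: rather than replacing every ">." and ">,", it only inserts a space when the ">" closes a <START>, <START:type> or <END> tag. The name space_after_tags is invented for this example.

```python
import re

# A complete tag immediately followed by '.' or ',' (the two cases Jim mentions).
TAG_THEN_PUNCT = re.compile(r"(<START(?::[^\s<>:]+)?>|<END>)([.,])")


def space_after_tags(text):
    """Insert one space between a tag and a '.' or ',' that directly follows it."""
    return TAG_THEN_PUNCT.sub(r"\1 \2", text)
```

Text that is already spaced correctly passes through unchanged, so it should be safe to run more than once over the same file.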
