RE: Context Dependent Annotation

I needed a fast local writer (a Consumer) whose output could be imported directly into a SQL database for further analysis, independent of cTAKES. Consuming annotations this way is MUCH faster than what I was doing before.
The consumer supports fast CSV->SQL inserts for the following types in a standard database table format:

* Sentence (annotation)
* BaseToken
* NewLineToken
* ContractionToken
* PunctuationToken
* SymbolToken
* WordToken
* NumToken

It creates a database table for each token type.

If folks are interested in "yet another annotation consumer", I can upload it.

--Andy

On May 1, 2013, at 3:02 AM, "Masanz, James J." <masanz.ja...@mayo.edu> wrote:

> I was thinking something like this:
>
> public void process(JCas jcas) throws AnalysisEngineProcessException {
>   // Split every sentence into windows that end at each comma.
>   for (Sentence sentence : JCasUtil.select(jcas, Sentence.class)) {
>     int sentenceBegin = sentence.getBegin();
>     int fromIndex = 0;
>     String s = sentence.getCoveredText();
>     while (fromIndex < s.length()) {
>       LookupWindowAnnotation lwa = new LookupWindowAnnotation(jcas);
>       lwa.setBegin(sentenceBegin + fromIndex);
>       int commaPosition = s.indexOf(COMMA, fromIndex);
>       if (commaPosition >= 0) { // indexOf returns -1 when no comma remains
>         lwa.setEnd(sentenceBegin + commaPosition + 1); // include the comma
>       } else {
>         lwa.setEnd(sentence.getEnd());
>       }
>       lwa.addToIndexes(jcas);
>       fromIndex = lwa.getEnd() - sentenceBegin;
>     }
>   }
> }
>
> -- James
>
>> -----Original Message-----
>> From: dev-return-1566-Masanz.James=mayo....@ctakes.apache.org
>> [mailto:dev-return-1566-Masanz.James=mayo....@ctakes.apache.org]
>> On Behalf Of Piyush Jain
>> Sent: Tuesday, April 30, 2013 8:55 AM
>> To: dev@ctakes.apache.org
>> Subject: Re: cTAKES - Context Annotation
>>
>> Hello James,
>>
>> Thanks for your reply. Apologies for the late reply.
>>
>> Could you please help me with how I can create a new
>> WindowAnnotationClass that keeps commas out of the scope of a sentence?
>>
>> Regards,
>> Piyush
>>
>>
>> On Wed, Apr 3, 2013 at 11:14 AM, Masanz, James J.
>> <masanz.ja...@mayo.edu> wrote:
>>
>>> In addition to "MaxLeftScopeSize" and "MaxRightScopeSize", the context
>>> annotations created by ctakes-ne-contexts use the parameter called
>>> WindowAnnotationClass to control the boundaries.
>>> By default, when the context annotator collects the context annotations
>>> for a named entity, it will not look beyond the boundaries of the
>>> sentence that the named entity is found in, because within
>>> NegationAnnotator.xml and within StatusAnnotator.xml,
>>> WindowAnnotationClass is set to Sentence
>>> (org.apache.ctakes.typesystem.type.textspan.Sentence).
>>>
>>> If you want a narrower range (to not cross commas) then I suggest you
>>> create new annotations whose boundaries are determined by the start
>>> and end of sentences and by occurrences of commas, and use the name of
>>> that new annotation as the WindowAnnotationClass.
>>>
>>> -- James
>>>
>>>
>>>> -----Original Message-----
>>>> From: dev-return-1419-Masanz.James=mayo....@ctakes.apache.org
>>>> [mailto:dev-return-1419-Masanz.James=mayo....@ctakes.apache.org]
>>>> On Behalf Of Piyush Jain
>>>> Sent: Monday, April 01, 2013 10:56 AM
>>>> To: dev@ctakes.apache.org
>>>> Subject: cTAKES - Context Annotation
>>>>
>>>> Hello Sir/Madam,
>>>>
>>>> I am new to this forum and I have a few questions regarding cTAKES.
>>>> For now I am posting my question in this email only; please let me
>>>> know if I should post my questions somewhere else.
>>>>
>>>> Question:
>>>> I am using cTAKES context annotation. I am trying to set
>>>> "MaxLeftScopeSize" and "MaxRightScopeSize" to 2; earlier they were 10.
>>>> Also, I need the context annotation lookup to be limited to the
>>>> same sentence and the same "," span.
>>>>
>>>> Could you please help me with this, or if you are not the right
>>>> person, could you please redirect me to the right person?
>>>>
>>>>
>>>> Regards,
>>>> Piyush
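
For anyone who wants to check the comma-window offsets outside of a pipeline, the splitting that James describes (windows bounded by the start and end of the sentence and by each comma) can be sketched in plain Java with no UIMA or cTAKES classes. This is only a sketch under assumptions: the class name CommaWindows and the method split are hypothetical and are not part of cTAKES, and each window is returned as a {begin, end} pair of document offsets rather than as an annotation. It treats a comma as found whenever indexOf returns 0 or more, so a comma at the very first character of a sentence also closes a window.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of cTAKES: computes comma-delimited
// windows for one sentence as {begin, end} document offsets.
class CommaWindows {

    // sentenceText is the covered text of the sentence; sentenceBegin is
    // the sentence's begin offset in the document. End offsets are exclusive.
    static List<int[]> split(String sentenceText, int sentenceBegin) {
        List<int[]> windows = new ArrayList<>();
        int fromIndex = 0;
        while (fromIndex < sentenceText.length()) {
            int commaPosition = sentenceText.indexOf(',', fromIndex);
            // indexOf returns -1 when no comma remains; the last window
            // then runs to the end of the sentence.
            int end = (commaPosition >= 0)
                    ? commaPosition + 1          // include the comma
                    : sentenceText.length();
            windows.add(new int[] {sentenceBegin + fromIndex, sentenceBegin + end});
            fromIndex = end;
        }
        return windows;
    }

    public static void main(String[] args) {
        String sentence = "No fever, chills, or cough.";
        int sentenceBegin = 100; // made-up offset for illustration
        for (int[] w : split(sentence, sentenceBegin)) {
            String covered = sentence.substring(w[0] - sentenceBegin, w[1] - sentenceBegin);
            System.out.println(w[0] + "-" + w[1] + ": \"" + covered + "\"");
        }
    }
}
```

In an annotator, each pair would become the begin and end of one window annotation, and the name of that annotation type would then be given as the WindowAnnotationClass.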