lvg entries

2014-04-17 Thread Miller, Timothy
The LVG annotator creates an enormous number of "lemmas" for every WordToken in the CAS, and I'm wondering what the original purpose was? I think this is probably a minor bottleneck for speed but mostly a pretty big space hog (at least 50% of the space of xmi files in my tests). As of right now I'

Re: lvg entries

2014-04-17 Thread Dligach, Dmitriy
Tim, this is a very interesting observation. Could you please send a few examples of what LVG generates? Both sensical and non :) Dima On Apr 17, 2014, at 11:28, Miller, Timothy wrote: > The LVG annotator creates an enormous number of "lemmas" for every > WordToken in the CAS, and I'm wond

Re: lvg entries

2014-04-17 Thread Miller, Timothy
Sure, just as an example, I gave it a note with about 1000 words. It generates 11500 NonEmptyFSList elements (each is basically one lexical variant). For the word "symptomatic", these are the first 10 of 20 lexical variants: Symptomaticer/JJ Symptomaticer/RB Symptomaticed/VB Symptomaticcing/VB Sym

Re: lvg entries

2014-04-17 Thread Dligach, Dmitriy
I don’t know of any applications within cTAKES that make use of this… The reverse (mapping from these “variants” to the normal form) may be useful though. Dima On Apr 17, 2014, at 11:50, Miller, Timothy wrote: > Sure, just as an example, I gave it a note with about 1000 words. It > generat

Re: lvg entries

2014-04-17 Thread Miller, Timothy
Pei and I had a similar discussion in person -- mapping from lexical variants to a stem might be useful. Pei also mentioned that one intended use might have been searching the dictionary with lexical variants, but I don't think that is done. Looking at the precision of the variants, I think it's hig

RE: lvg entries

2014-04-17 Thread Finan, Sean
17, 2014 1:25 PM To: dev@ctakes.apache.org Subject: Re: lvg entries Pei and I had a similar discussion in person -- mapping from lexical variants to a stem might be useful. Pei also mentioned that one intended use might have been searching the dictionary with lexical variants, but I don't

RE: lvg entries

2014-04-17 Thread Masanz, James J.
...@childrens.harvard.edu] Sent: Thursday, April 17, 2014 11:27 AM To: dev@ctakes.apache.org Subject: lvg entries The LVG annotator creates an enormous number of "lemmas" for every WordToken in the CAS, and I'm wondering what the original purpose was? I think this is probably a minor bottlenec

Re: lvg entries

2014-04-17 Thread Miller, Timothy
t sure if still does. > > There is an option for turning off the posting of the lemmas to the cas. > > Hope that helps > > -Original Message- > From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] > Sent: Thursday, April 17, 2014 11:27 AM > To:

Re: lvg entries

2014-04-17 Thread Miller, Timothy
cas. > > Hope that helps > > -Original Message- > From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] > Sent: Thursday, April 17, 2014 11:27 AM > To: dev@ctakes.apache.org > Subject: lvg entries > > The LVG annotator creates an enormous number of "lemm

RE: lvg entries

2014-04-17 Thread Masanz, James J.
) output of the normalizer function of the LVG component -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Thursday, April 17, 2014 3:34 PM To: dev@ctakes.apache.org Subject: Re: lvg entries Thanks James. Does it ring a bell to you that the origina

RE: lvg entries

2014-04-17 Thread Masanz, James J.
"node", being the normalized form of "nodes", would be used when searching dictionary entries (in addition to searching dictionary entries for "nodes") -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Thursday, Apr

Re: lvg entries

2014-04-17 Thread andy mcmurry
"nodes", would be used when searching dictionary entries (in addition to > searching dictionary entries for "nodes") > > -Original Message- > From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] > Sent: Thursday, April 17, 2014 4:33 PM > To

Re: lvg entries

2014-04-18 Thread Miller, Timothy
e dependency parsers used the Lemma >> annotations at one point. >> Not sure if still does. >> >> There is an option for turning off the posting of the lemmas to the cas. >> >> Hope that helps >> >> -Original Message- >> From: Miller, Timothy [mai

RE: lvg entries

2014-04-18 Thread Masanz, James J.
dText Not sure what the intent there was. -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Friday, April 18, 2014 11:16 AM To: dev@ctakes.apache.org Subject: Re: lvg entries Hmm... I don't see normalizedForm filled in. I see LVG filling in

Re: lvg entries

2014-04-18 Thread Miller, Timothy
ry entries for "nodes") >> >> -Original Message- >> From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] >> Sent: Thursday, April 17, 2014 4:33 PM >> To: dev@ctakes.apache.org >> Subject: Re: lvg entries >> >> Quick follo

RE: lvg entries

2014-04-18 Thread Finan, Sean
+1 false -Original Message- From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] Sent: Friday, April 18, 2014 2:54 PM To: dev@ctakes.apache.org Subject: Re: lvg entries Thanks for tracking that down Andy. I am making a pass at UimaFit-izing the configuration parameters

Re: lvg entries

2014-04-18 Thread andy mcmurry
arvard.edu] > Sent: Friday, April 18, 2014 2:54 PM > To: dev@ctakes.apache.org > Subject: Re: lvg entries > > Thanks for tracking that down Andy. > > I am making a pass at UimaFit-izing the configuration parameters for all > the annotators in the default pipeline, before I c

new dictionary lookup {was RE: lvg entries]

2014-04-21 Thread Masanz, James J.
, 2014 12:52 PM To: dev@ctakes.apache.org Subject: RE: lvg entries Those variants are not used by the dictionary lookup. I did look at them to see if it was worthwhile for the new dictionary, but they are all over the place so I passed. From: Miller

Re: new dictionary lookup {was RE: lvg entries]

2014-04-22 Thread andy mcmurry
rd.edu] > Sent: Thursday, April 17, 2014 12:52 PM > To: dev@ctakes.apache.org > Subject: RE: lvg entries > > Those variants are not used by the dictionary lookup. I did look at them > to see if it was worthwhile for the new dictionary, but they are all over > the place so I pa

RE: new dictionary lookup {was RE: lvg entries]

2014-04-22 Thread Finan, Sean
ril 22, 2014 4:23 AM To: dev@ctakes.apache.org Subject: Re: new dictionary lookup {was RE: lvg entries] Highly Relevant *DNorm: disease name normalization* http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3810844/ "Disease names are often created by combining roots and affixes from Greek or