Sorry, I meant to suggest searching for 'soft' in the dictionary file, not 'short':
grep -i ,\'soft\', *.script

On Sat, Aug 1, 2020 at 7:47 PM Jeffrey Miller <jeff...@gmail.com> wrote:

> Hi Peter,
>
> To my knowledge, there isn't any drastic difference in the behavior of the
> dictionary gui creator and the way the sno_rx dictionary was created. I
> originally thought there was, but I realized the difference was that I had
> not installed all of UMLS to my machine (just the vocabularies I was
> interested in) and I was missing synonyms. The first thing I would check,
> are you able to find a matching entry in the .script file for your ctakes
> dictionary when you do this:
>
> grep -i ,\'short\', *.script
>
> That would confirm whether or not you have a term in your dictionary made
> up only of 'short' and whether it mapped to the CUI equal to "SHORT
> STATURE, ONYCHODYSPLASIA, FACIAL DYSMORPHISM, AND HYPOTRICHOSIS SYNDROME".
> If it's not in there, something else is going on. You could do the same for
> 'bed'.
>
> If not, another thing I might check is that I noticed you are using
> the OverlapJCasTermAnnotator in your prior e-mail. I don't have much
> experience with it, and I don't think it should cause this behavior, but I
> wonder if that could be making the difference (as compared
> to DefaultJCasTermAnnotator).
>
> Jeff
>
> On Sat, Aug 1, 2020 at 5:27 PM Peter Abramowitsch <pabramowit...@gmail.com>
> wrote:
>
>>
>> Hi All
>>
>> Having created a new dictionary from the 2020AA UMLS and added Genes and
>> Receptors to the dictionary-creator's default selections, I have a curious
>> problem where cTakes now assigns the most bizarre acronyms to ordinary
>> words used in POS contexts where it shouldn't find <XXX>Mentions.
>>
>> Here are two examples:
>>
>> 1. soft (in "soft tissue...")
>> becomes "SHORT STATURE, ONYCHODYSPLASIA, FACIAL DYSMORPHISM, AND
>> HYPOTRICHOSIS SYNDROME",
>>
>> 2. bed in ("The wound bed was...")
>> becomes "BORNHOLM EYE DISEASE"
>>
>> I have not changed the TermConsumer type in the descriptor XML.
>>
>> Are the DictionaryCreator's defaults, the equivalent to the default
>> sno_rx that's delivered with the app?
>>
>> Attached is the vocab subsets list I used
>>
>>
>> Peter
>>
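
If it helps, the exact-term check discussed in this thread could be wrapped in a small shell function so the same search is easy to repeat for 'soft', 'bed', or any other word. This is only a sketch: the name find_exact_term is made up for illustration, and it assumes (as the grep in this thread does) that the term appears in the .script file as a single-quoted field with a comma on each side.

```shell
# Hypothetical helper (not part of cTAKES): print every row of the given
# .script file(s) in which one comma-delimited field is exactly '<term>'.
# Assumes the term is stored as a single-quoted field, e.g. ...,'soft',...
find_exact_term() {
    term="$1"
    shift
    # -i  ignore case, since entries may differ in capitalization
    # -F  treat the pattern as a fixed string, not a regular expression
    # --  end of options, so odd file names are not parsed as flags
    grep -i -F -- ",'${term}'," "$@"
}

# Example usage, run from the directory that holds the dictionary:
#   find_exact_term soft *.script
#   find_exact_term bed *.script
```

If a call prints nothing, there is no entry made up only of that word, and the unexpected annotation is probably coming from somewhere else.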