It works!
With my new dictionary, I added one entry, "bed", to the case-sensitive
blacklist, blocking it from disorders, S&S, procedures, etc.
When I spell it bed, it is blacklisted.
When I spell it BED, I get:
"ontologyConceptArr": [{
"_type": "UmlsConcept",
"codingScheme": "SNOMEDCT_US",
"code": "718718009",
"score": 0.0,
"disambiguated": false,
"cui": "C3159311",
"tui": "T047",
"preferredText": "BORNHOLM EYE DISEASE"
}],
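
A minimal sketch of the behaviour described above, written in Python rather than taken from the cTAKES source. It assumes the corrected "codes|text" blacklist format quoted below; the function names and the semantic-code labels (DISORDER, FINDING, PROCEDURE, DRUG) are placeholders for illustration, not the real identifiers.

```python
from collections import defaultdict


def parse_blacklist(lines, fold_case=False):
    """Map each blocked text to the set of semantic codes it is blocked for.

    Each line is assumed to look like: code1, code2,...|text
    """
    blocked = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or "|" not in line:
            continue  # skip blank or malformed lines
        codes, text = line.split("|", 1)
        text = text.strip()
        if fold_case:
            text = text.lower()  # case-insensitive list is keyed in lower case
        blocked[text].update(c.strip() for c in codes.split(",") if c.strip())
    return blocked


def is_blocked(text, code, case_sensitive, case_insensitive):
    """Return True if `text` is suppressed for the semantic group `code`."""
    # Exact spelling must match an entry in the case-sensitive list.
    if code in case_sensitive.get(text, ()):
        return True
    # Any spelling matches an entry in the case-insensitive list.
    return code in case_insensitive.get(text.lower(), ())


# Placeholder semantic codes; the real cTAKES identifiers may differ.
CASE_SENSITIVE = parse_blacklist(["DISORDER, FINDING, PROCEDURE|bed"])
CASE_INSENSITIVE = parse_blacklist([], fold_case=True)

print(is_blocked("bed", "DISORDER", CASE_SENSITIVE, CASE_INSENSITIVE))
print(is_blocked("BED", "DISORDER", CASE_SENSITIVE, CASE_INSENSITIVE))
```

With the entry in the case-sensitive list only, lower-case "bed" is suppressed as a disorder while upper-case "BED" still passes through, which matches the result above.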
On Tue, Aug 4, 2020 at 4:41 PM Jeffrey Miller <[email protected]> wrote:
> Where in the source code is this feature implemented?
>
> On Tue, Aug 4, 2020 at 7:30 PM Peter Abramowitsch <[email protected]> wrote:
>
> > Blacklist format
> > Actually I got it inverted; it's:
> >
> > semantic_code1, semantic_code2,...|text1
> > semantic_code1, semantic_code2,...|text2
> >
> > Peter
> >
> > On Tue, Aug 4, 2020 at 4:16 PM Peter Abramowitsch <[email protected]> wrote:
> >
> > > Ok Thanks Jeff. I'm glad I wasn't missing something important.
> > >
> > > There already is a blacklist text mechanism which suppresses
> > > identification of specific text by clinical domain.
> > > Looking at the code it collects entries like
> > > cTakesSemanticCode,texta,textb,textc
> > > NE_TYPE_ID_DRUG, jasmine, coriander, bleach
> > > There's a case sensitive list and a case insensitive one.
> > >
> > > So I will try that.
> > > > In one of my examples, I'll say that 'bed' is not a disorder, while
> > > > 'BED' could be one.
> > >
> > >
> > >
> > > > On Tue, Aug 4, 2020 at 2:12 PM Jeffrey Miller <[email protected]> wrote:
> > >
> > >> Hi Peter,
> > >>
> > > >> To your question about sno_rx_16ab: I suspect that the CUI is new since
> > > >> 2016 or, if it existed in UMLS back then, it was not associated with a
> > > >> term in SNOMED or RxNorm at that time.
> > >>
> > > >> As for those solutions: if you are able to use the trunk, I know Sean
> > > >> said there was a suppression text feature. Otherwise, in the past I have
> > > >> removed the lines from the .script file.
> > >>
> > >> I definitely think the acronym case sensitive feature would be great.
> > >>
> > >> Jeff
> > >>
> > > >> On Tue, Aug 4, 2020 at 3:28 PM Peter Abramowitsch <[email protected]> wrote:
> > >>
> > >> > Hi Jeff et al
> > >> >
> > > >> > To take up the thread from a few days ago, where a simple English word
> > > >> > such as bed, soft, or shop also maps to a legitimate but rarely used
> > > >> > acronym and shows up in the same POS as a potentially interesting
> > > >> > entity: what mechanism would you use to disambiguate?
> > >> >
> > > >> > This problem only started after I constructed a SNO+RX+HGNC dictionary
> > > >> > from the 2020A UMLS dump. Adding more TUIs where a more conventional
> > > >> > word sense of the target word occurs does not fix it.
> > >> >
> > > >> > For instance, why does the sno_rx dictionary not contain this disease,
> > > >> > which aliases to "bed"?
> > >> >
> > >> > ucsf_dict_v1 $ grep 3159311 *.script
> > >> > *INSERT INTO CUI_TERMS VALUES(3159311,0,1,'bed','bed')*
> > > >> > INSERT INTO CUI_TERMS VALUES(3159311,5,8,'myopia , high , with nonprogressive cone dysfunction','nonprogressive')
> > > >> > INSERT INTO CUI_TERMS VALUES(3159311,0,3,'bornholm eye disease','bornholm')
> > > >> > INSERT INTO CUI_TERMS VALUES(3159311,5,6,'x-linked cone dysfunction syndrome with myopia','myopia')
> > >> > INSERT INTO TUI VALUES(3159311,47)
> > >> > *INSERT INTO PREFTERM VALUES(3159311,'BORNHOLM EYE DISEASE')*
> > >> > INSERT INTO SNOMEDCT_US VALUES(3159311,718718009)
> > >> >
> > >> >
> > >> > sno_rx_16ab $ grep 3159311 *.script
> > >> > nada
> > >> >
> > >> > Solutions good or evil?
> > >> >
> > > >> > - Strip the relevant lines out of the dict.script file?
> > >> > - Blacklist the text?
> > >> > - Add to my stopCUI list (a little feature I added)?
> > >> > - Some other configuration I don't know about?
> > >> > For instance, is there a CUI:ACRONYM table?
> > > >> > I'm tempted to create one. This would require the matching term
> > > >> > to be present in upper case.
> > >> >
> > >> > Peter
> > >> >
> > >>
> > >
> >
>