RE Tuning custom dictionary recommendations

Peter Abramowitsch Tue, 04 Aug 2020 12:29:10 -0700

Hi Jeff et al

To take up the thread from a few days ago: when a simple English word such
as bed, soft, or shop also maps to a legitimate but rarely used acronym and
shows up with the same POS as a potentially interesting entity, what
mechanism would you use to disambiguate?


This problem only started after I constructed a SNO+RX+HGNC dictionary
from the 2020A UMLS dump. Adding more TUIs in which a more conventional
word sense of the target word occurs does not fix the problem.

For instance, why does the sno_rx dictionary not contain this disease, which
aliases to "bed"?

ucsf_dict_v1 $ grep 3159311 *.script
*INSERT INTO CUI_TERMS VALUES(3159311,0,1,'bed','bed')*
INSERT INTO CUI_TERMS VALUES(3159311,5,8,'myopia , high , with nonprogressive cone dysfunction','nonprogressive')
INSERT INTO CUI_TERMS VALUES(3159311,0,3,'bornholm eye disease','bornholm')
INSERT INTO CUI_TERMS VALUES(3159311,5,6,'x-linked cone dysfunction syndrome with myopia','myopia')
INSERT INTO TUI VALUES(3159311,47)
*INSERT INTO PREFTERM VALUES(3159311,'BORNHOLM EYE DISEASE')*
INSERT INTO SNOMEDCT_US VALUES(3159311,718718009)


sno_rx_16ab $ grep 3159311 *.script
nada

Solutions good or evil?

   - Strip the relevant lines out of the dict.script file?
   - Blacklist the text?
   - Add to my stopCUI list (a little feature I added)?
   - Some other configuration I don't know about?
   For instance, is there a CUI:ACRONYM table?
   I'm tempted to create one. A match on such a term would then count only
   if the matched text appears in upper case.
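
To make the CUI:ACRONYM idea concrete, here is a minimal sketch of the check I have in mind. Everything in it is hypothetical: Match, ACRONYM_TERMS, keep_match and filter_matches are names made up for illustration and are not part of cTAKES or of the dictionary schema. It assumes each dictionary hit exposes its CUI and the covered text exactly as written in the document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """Hypothetical dictionary hit: the CUI plus the text exactly as written."""
    cui: int
    text: str


# Hypothetical CUI:ACRONYM table. Keys are CUIs, values are the terms
# (lowercased, as stored in CUI_TERMS) that are acronyms for that CUI.
ACRONYM_TERMS = {
    3159311: {"bed"},  # "bed" aliases to BORNHOLM EYE DISEASE
}


def keep_match(match: Match) -> bool:
    """Keep a hit unless it is an acronym term that is not written in upper case."""
    acronyms = ACRONYM_TERMS.get(match.cui)
    if acronyms is None or match.text.lower() not in acronyms:
        # Not an acronym term for this CUI: leave the hit alone.
        return True
    # Acronym term: accept only an all-upper-case surface form.
    return match.text.isupper()


def filter_matches(matches):
    """Drop lower-case or mixed-case hits on acronym terms."""
    return [m for m in matches if keep_match(m)]
```

With this, "BED" would still resolve to the CUI while "bed" and "Bed" would be dropped, and the long-form terms such as 'bornholm eye disease' would be unaffected. Whether the check belongs in the lookup itself or in a later filtering step is an open question.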

Peter
