Ok, hopefully one last question. Based on your example everything runs; however, the Anat and Snomed runs don't produce any valid CUIs, but RxNorm does. I'm not sure if this has anything to do with it, but every UMLS source read is against MRSTY.
Here's my command:

java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls /path/to/UMLS/META -fd ./data/tiny -atui ./data/tiny/CtakesAnatTuis.txt -tui ./data/tiny/CtakesSnomedTuis.txt -ol \path\to\file\Umls2015.bsv

Any suggestions?

Thanks again,
Brandon

-----Original Message-----
From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 3:05 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Yes, that will make the rare word dictionary in a memory-based hsql database - the same as the default for the dictionary-lookup-fast module.

-----Original Message-----
From: Geise, Brandon D. [mailto:bdge...@geisinger.edu]
Sent: Wednesday, September 16, 2015 2:42 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Thanks Sean, much appreciated. To clarify, the example below would create the dictionary for use for the rare word approach?

Thanks,
Brandon

-----Original Message-----
From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 2:16 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Hi Brandon,

I just checked in a bin/dictionarytool.zip
It should have everything that you need (.jar, lib/, data/).

java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 [args]

Should do the trick.

To recreate a 2015 version of the current ctakes dictionary, the arguments are:

-umls my/path/to/2015AA/META
-fd ./data/tiny
-atui ./data/tiny/CtakesAnatTuis.txt
-tui ./data/tiny/CtakesSnomedTuis.txt
-db jdbc:hsqldb:file:my/path/to/snorx2015
-tbl CUI_TERMS

Create my/path/to/snorx2015 by copying resources/memdbtemplate/ctakesumls.properties to my/path/to/snorx2015.properties - there is a resources/README about this.

Before populating a DB, I usually do a trial run first, writing to a flat file. Replace "-db ... -tbl ..." with "-ol my/path/to/testout.bsv"

Sean

-----Original Message-----
From: Geise, Brandon D.
[mailto:bdge...@geisinger.edu]
Sent: Wednesday, September 16, 2015 1:49 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Hi Sean,

That'd be great. I think I'm building it incorrectly because after I build the jar and try to run specifying DictionaryCreator2 as the main class it says it can't find it. I'm not too familiar with Java and building projects/jars so it could be my ignorance causing the problem.

Thanks,
Brandon

-----Original Message-----
From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 1:45 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Hi Brandon,

I can send you a jar or commit one pre-built. What goes wrong when you try to build the tool?

Sean

-----Original Message-----
From: Geise, Brandon D. [mailto:bdge...@geisinger.edu]
Sent: Wednesday, September 16, 2015 1:23 PM
To: 'dev@ctakes.apache.org'
Subject: Fast Dictionary Update

Does someone have the DictionaryTool jar available? I'm having trouble creating the jar file from the project and would like to be able to create an updated UMLS fast dictionary for 2015.

Thanks,
Brandon
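
Since the question at the top of this thread turns on the TUI files and MRSTY, one way to narrow it down outside the tool is to count how many distinct CUIs MRSTY.RRF holds for each TUI listed in a TUI file. The following is only a rough sketch, not part of the dictionary tool: the function names are made up for illustration, and it assumes that the TUI files contain codes of the form T followed by three digits somewhere on each line (the actual file layout is not shown in this thread). It relies on the UMLS MRSTY.RRF layout, where each row is pipe-delimited with CUI in the first column and TUI in the second.

```python
import re
import sys
from collections import defaultdict

# Assumption: a TUI appears in the file as "T" plus three digits, e.g. T023.
TUI_PATTERN = re.compile(r"\bT\d{3}\b")


def read_tuis(lines):
    """Collect every TUI-looking token found in the given lines."""
    tuis = set()
    for line in lines:
        tuis.update(TUI_PATTERN.findall(line))
    return tuis


def count_cuis_per_tui(mrsty_lines, wanted_tuis):
    """Count distinct CUIs per wanted TUI in MRSTY rows (CUI|TUI|STN|STY|ATUI|CVF|)."""
    cuis_by_tui = defaultdict(set)
    for line in mrsty_lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        cui, tui = fields[0], fields[1]
        if tui in wanted_tuis:
            cuis_by_tui[tui].add(cui)
    # Report every wanted TUI, including those with no matching rows.
    return {tui: len(cuis_by_tui[tui]) for tui in sorted(wanted_tuis)}


# Hypothetical usage: python check_tuis.py <path to MRSTY.RRF> <path to TUI file>
if __name__ == "__main__" and len(sys.argv) == 3:
    mrsty_path, tui_path = sys.argv[1], sys.argv[2]
    with open(tui_path, encoding="utf-8") as tui_file:
        wanted = read_tuis(tui_file)
    with open(mrsty_path, encoding="utf-8") as mrsty_file:
        for tui, count in count_cuis_per_tui(mrsty_file, wanted).items():
            print(tui, count)
```

If every TUI in CtakesAnatTuis.txt or CtakesSnomedTuis.txt comes back with a count of zero, the TUI file or the META path is the first thing to look at; if the counts are non-zero, the filtering is more likely happening at a later step in the tool.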