Did you add it to data/default/CtakesSources.txt? If not, then you need to specify -src ./data/tiny/CtakesSources.txt
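For anyone checking the same thing: the replies further down this thread suggest that the names in CtakesSources.txt need to match the source names actually present in MRCONSO.RRF (SNOMEDCT versus SNOMEDCT_US). Here is a minimal sketch, not part of the dictionary tool, that counts MRCONSO.RRF rows per source. It assumes the standard pipe-delimited MRCONSO layout with SAB as the 12th field; the class name SabCounter and the three names it prints are only illustrative choices.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of the cTAKES dictionary tool.
class SabCounter {

    // Assumption: MRCONSO.RRF is pipe-delimited and SAB (source abbreviation)
    // is the 12th field, i.e. index 11 when counting from zero.
    private static final int SAB_INDEX = 11;

    // Counts how many rows each source vocabulary contributes.
    static Map<String, Integer> countSabs(Stream<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        lines.forEach(line -> {
            // A limit of -1 keeps trailing empty fields instead of dropping them.
            String[] fields = line.split("\\|", -1);
            if (fields.length > SAB_INDEX) {
                counts.merge(fields[SAB_INDEX], 1, Integer::sum);
            }
        });
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java SabCounter path/to/META/MRCONSO.RRF");
            return;
        }
        // Files.lines streams the file, so the whole RRF is never held in memory.
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            Map<String, Integer> counts = countSabs(lines);
            for (String sab : new String[] {"SNOMEDCT", "SNOMEDCT_US", "RXNORM"}) {
                System.out.println(sab + "\t" + counts.getOrDefault(sab, 0));
            }
        }
    }
}
```

Run it as java SabCounter my/path/to/2015AA/META/MRCONSO.RRF. A count of zero for a name that is listed in CtakesSources.txt would point to a naming mismatch like the one described in this thread.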
Sorry for any confusion. As soon as my inet isn't overloaded I'll download 2015AA and see if I can build a dictionary.

-----Original Message-----
From: Geise, Brandon D. [mailto:bdge...@geisinger.edu]
Sent: Wednesday, September 16, 2015 8:14 PM
To: dev@ctakes.apache.org; dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Sean,

I added that and still had the same issue.

Thanks,
Brandon

_____________________________
From: Finan, Sean <sean.fi...@childrens.harvard.edu>
Sent: Wednesday, September 16, 2015 7:56 PM
Subject: RE: Fast Dictionary Update
To: <dev@ctakes.apache.org>

And you added "SNOMEDCT_US" to data/tiny/CtakesSources.txt?

-----Original Message-----
From: Tomasz Oliwa [mailto:ol...@uchicago.edu]
Sent: Wednesday, September 16, 2015 7:13 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

I have exactly the same problem with the tool. A grep on MRCONSO.RRF for "SNOMEDCT" or for "SNOMEDCT_US" shows many lines.

________________________________________
From: Geise, Brandon D. [bdge...@geisinger.edu]
Sent: Wednesday, September 16, 2015 5:05 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Yes, it finds "SNOMEDCT_US".

-----Original Message-----
From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 5:17 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Ah, now I see what you mean. Can you do a grep on your MRCONSO.RRF for "SNOMEDCT"?

-----Original Message-----
From: Geise, Brandon D. [mailto:bdge...@geisinger.edu]
Sent: Wednesday, September 16, 2015 4:04 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

I tried changing as suggested. Below is what I see for the snomed piece, but for RXNorm it writes terms at the end.
Reading list of Source Types from ./data/default/CtakesSources.txt
File Lines 1 list of Source Types 1
Reading list of Tuis from ./data/tiny/CtakesSnomedTuis.txt
File Lines 24 list of Tuis 24
Compiling list of Cuis with wanted Tuis using /patto/UMLS_Current_Version/META/MRSTY.RRF
File Line 200000 Cuis 60895
File Line 300000 Cuis 85750
File Line 400000 Cuis 135098
File Line 600000 Cuis 183925
File Line 1700000 Cuis 376338
File Line 1800000 Cuis 471009
File Line 1900000 Cuis 568375
File Line 2100000 Cuis 674715
File Line 2800000 Cuis 903583
File Line 3300000 Cuis 973791
File Lines 3370173 Cuis 999451
..................................................File Line 100000 Valid Cuis 0
..................................................File Line 200000 Valid Cuis 0
..................................................File Line 300000 Valid Cuis 0
..................................................File Line 400000 Valid Cuis 0
..................................................File Line 500000 Valid Cuis 0
..................................................File Line 600000 Valid Cuis 0
..................................................File Line 700000 Valid Cuis 0
..................................................File Line 800000 Valid Cuis 0
..................................................File Line 900000 Valid Cuis 0
..................................................File Line 1000000 Valid Cuis 0
..................................................File Line 1100000 Valid Cuis 0
..................................................File Line 1200000 Valid Cuis 0
..................................................File Line 1300000 Valid Cuis 0
..................................................File Line 1400000 Valid Cuis 0
..................................................File Line 1500000 Valid Cuis 0
..................................................File Line 1600000 Valid Cuis 0
..................................................File Line 1700000 Valid Cuis 0
..................................................File Line 1800000 Valid Cuis 0
..................................................File Line 1900000 Valid Cuis 0
..................................................File Line 2000000 Valid Cuis 0
..................................................File Line 2100000 Valid Cuis 0
..................................................File Line 2200000 Valid Cuis 0
..................................................File Line 2300000 Valid Cuis 0
..................................................File Line 2400000 Valid Cuis 0
..................................................File Line 2500000 Valid Cuis 0
..................................................File Line 2600000 Valid Cuis 0
..................................................File Line 2700000 Valid Cuis 0
..................................................File Line 2800000 Valid Cuis 0
..................................................File Line 2900000 Valid Cuis 0
..................................................File Line 3000000 Valid Cuis 0
..................................................File Line 3100000 Valid Cuis 0
..................................................File Line 3200000 Valid Cuis 0
..................................................File Line 3300000 Valid Cuis 0
..................................................File Line 3400000 Valid Cuis 0
..................................................File Line 3500000 Valid Cuis 0
..................................................File Line 3600000 Valid Cuis 0
..................................................File Line 3700000 Valid Cuis 0
..................................................File Line 3800000 Valid Cuis 0
..................................................File Line 3900000 Valid Cuis 0
..................................................File Line 4000000 Valid Cuis 0
..................................................File Line 4100000 Valid Cuis 0
..................................................File Line 4200000 Valid Cuis 0
..................................................File Line 4300000 Valid Cuis 0
..................................................File Line 4400000 Valid Cuis 0
..................................................File Line 4500000 Valid Cuis 0
..................................................File Line 4600000 Valid Cuis 0
..................................................File Line 4700000 Valid Cuis 0
..................................................File Line 4800000 Valid Cuis 0
..................................................File Line 4900000 Valid Cuis 0
..................................................File Line 5000000 Valid Cuis 0
..................................................File Line 5100000 Valid Cuis 0
..................................................File Line 5200000 Valid Cuis 0
..................................................File Line 5300000 Valid Cuis 0
..................................................File Line 5400000 Valid Cuis 0
..................................................File Line 5500000 Valid Cuis 0
..................................................File Line 5600000 Valid Cuis 0
..................................................File Line 5700000 Valid Cuis 0
..................................................File Line 5800000 Valid Cuis 0
..................................................File Line 5900000 Valid Cuis 0
..................................................File Line 6000000 Valid Cuis 0
..................................................File Line 6100000 Valid Cuis 0
..................................................File Line 6200000 Valid Cuis 0
..................................................File Line 6300000 Valid Cuis 0
..................................................File Line 6400000 Valid Cuis 0
..................................................File Line 6500000 Valid Cuis 0
..................................................File Line 6600000 Valid Cuis 0
..................................................File Line 6700000 Valid Cuis 0
..................................................File Line 6800000 Valid Cuis 0
..................................................File Line 6900000 Valid Cuis 0
..................................................File Line 7000000 Valid Cuis 0
..................................................File Line 7100000 Valid Cuis 0
..................................................File Line 7200000 Valid Cuis 0
..................................................File Line 7300000 Valid Cuis 0
..................................................File Line 7400000 Valid Cuis 0
..................................................File Line 7500000 Valid Cuis 0
..................................................File Line 7600000 Valid Cuis 0
..................................................File Line 7700000 Valid Cuis 0
..................................................File Line 7800000 Valid Cuis 0
..................................................File Line 7900000 Valid Cuis 0
..................................................File Line 8000000 Valid Cuis 0
..................................................File Line 8100000 Valid Cuis 0
..................................................File Line 8200000 Valid Cuis 0
..................................................File Line 8300000 Valid Cuis 0
..................................................File Line 8400000 Valid Cuis 0
..................................................File Line 8500000 Valid Cuis 0
..................................................File Line 8600000 Valid Cuis 0
..................................................File Line 8700000 Valid Cuis 0
..................................................File Line 8800000 Valid Cuis 0
.............File Lines 8827152 Valid Cuis 0
Compiling map of Umls Cuis and Texts
..................................................File Line 100000 Terms 0
..................................................File Line 200000 Terms 0
..................................................File Line 300000 Terms 0
..................................................File Line 400000 Terms 0
..................................................File Line 500000 Terms 0
..................................................File Line 600000 Terms 0
..................................................File Line 700000 Terms 0
..................................................File Line 800000 Terms 0
..................................................File Line 900000 Terms 0
..................................................File Line 1000000 Terms 0
..................................................File Line 1100000 Terms 0
..................................................File Line 1200000 Terms 0
..................................................File Line 1300000 Terms 0
..................................................File Line 1400000 Terms 0
..................................................File Line 1500000 Terms 0
..................................................File Line 1600000 Terms 0
..................................................File Line 1700000 Terms 0
..................................................File Line 1800000 Terms 0
..................................................File Line 1900000 Terms 0
..................................................File Line 2000000 Terms 0
..................................................File Line 2100000 Terms 0
..................................................File Line 2200000 Terms 0
..................................................File Line 2300000 Terms 0
..................................................File Line 2400000 Terms 0
..................................................File Line 2500000 Terms 0
..................................................File Line 2600000 Terms 0
..................................................File Line 2700000 Terms 0
..................................................File Line 2800000 Terms 0
..................................................File Line 2900000 Terms 0
..................................................File Line 3000000 Terms 0
..................................................File Line 3100000 Terms 0
..................................................File Line 3200000 Terms 0
..................................................File Line 3300000 Terms 0
..................................................File Line 3400000 Terms 0
..................................................File Line 3500000 Terms 0
..................................................File Line 3600000 Terms 0
..................................................File Line 3700000 Terms 0
..................................................File Line 3800000 Terms 0
..................................................File Line 3900000 Terms 0
..................................................File Line 4000000 Terms 0
..................................................File Line 4100000 Terms 0
..................................................File Line 4200000 Terms 0
..................................................File Line 4300000 Terms 0
..................................................File Line 4400000 Terms 0
..................................................File Line 4500000 Terms 0
..................................................File Line 4600000 Terms 0
..................................................File Line 4700000 Terms 0
..................................................File Line 4800000 Terms 0
..................................................File Line 4900000 Terms 0
..................................................File Line 5000000 Terms 0
..................................................File Line 5100000 Terms 0
..................................................File Line 5200000 Terms 0
..................................................File Line 5300000 Terms 0
..................................................File Line 5400000 Terms 0
..................................................File Line 5500000 Terms 0
..................................................File Line 5600000 Terms 0
..................................................File Line 5700000 Terms 0
..................................................File Line 5800000 Terms 0
..................................................File Line 5900000 Terms 0
..................................................File Line 6000000 Terms 0
..................................................File Line 6100000 Terms 0
..................................................File Line 6200000 Terms 0
..................................................File Line 6300000 Terms 0
..................................................File Line 6400000 Terms 0
..................................................File Line 6500000 Terms 0
..................................................File Line 6600000 Terms 0
..................................................File Line 6700000 Terms 0
..................................................File Line 6800000 Terms 0
..................................................File Line 6900000 Terms 0
..................................................File Line 7000000 Terms 0
..................................................File Line 7100000 Terms 0
..................................................File Line 7200000 Terms 0
..................................................File Line 7300000 Terms 0
..................................................File Line 7400000 Terms 0
..................................................File Line 7500000 Terms 0
..................................................File Line 7600000 Terms 0
..................................................File Line 7700000 Terms 0
..................................................File Line 7800000 Terms 0
..................................................File Line 7900000 Terms 0
..................................................File Line 8000000 Terms 0
..................................................File Line 8100000 Terms 0
..................................................File Line 8200000 Terms 0
..................................................File Line 8300000 Terms 0
..................................................File Line 8400000 Terms 0
..................................................File Line 8500000 Terms 0
..................................................File Line 8600000 Terms 0
..................................................File Line 8700000 Terms 0
..................................................File Line 8800000 Terms 0
.............File Line 8827152 Terms 0
Writing map of Cuis and Texts to pathtoUmls2015.bsv

-----Original Message-----
From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
Sent: Wednesday, September 16, 2015 4:00 PM
To: dev@ctakes.apache.org
Subject: RE: Fast Dictionary Update

Thank you! I believe that was a change post 2011! You should actually be ok with both SNOMEDCT and SNOMEDCT_US in CtakesSources.txt

Cheers,
Sean

-----Original Message-----
From: Maite Meseure Hugues [mailto:meseure.ma...@gmail.com]
Sent: Wednesday, September 16, 2015 3:43 PM
To: dev@ctakes.apache.org
Subject: Re: Fast Dictionary Update

If this helps, I had to replace 'SNOMEDCT' with 'SNOMEDCT_US' in CtakesSources.txt.

On Wed, Sep 16, 2015 at 2:33 PM, Finan, Sean <sean.fi...@childrens.harvard.edu> wrote:

> I'm not sure that I understand your question. As I sent it, the anat,
> snomed and rxnorm are not separate runs.
> The args line I sent earlier
> is for a single run that will create a dictionary with snomed and
> rxnorm terms. The anatomy tui list has a special use in correctly
> processing snomed codes.
>
> -----Original Message-----
> From: Geise, Brandon D. [mailto:bdge...@geisinger.edu]
> Sent: Wednesday, September 16, 2015 3:27 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Ok, hopefully one last question.
>
> Based on your example everything runs, however the Anat and Snomed
> runs don't produce any valid CUIs but RXNorm does. I'm not sure if
> this has anything to do with it but every UMLS source read is against MRSTY.
>
> Here's my command
>
> java -cp dictionarytool.jar;lib/*
> org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls
> /path/to/UMLS/META -fd ./data/tiny -atui
> ./data/tiny/CtakesAnatTuis.txt -tui ./data/tiny/CtakesSnomedTuis.txt
> -ol path o ileUmls2015.bsv
>
> Any suggestions?
>
> Thanks again,
> Brandon
>
> -----Original Message-----
> From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> Sent: Wednesday, September 16, 2015 3:05 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Yes, that will make the rare word dictionary in a memory-based hsql
> database - the same as the default for the dictionary-lookup-fast module.
>
> -----Original Message-----
> From: Geise, Brandon D. [mailto:bdge...@geisinger.edu]
> Sent: Wednesday, September 16, 2015 2:42 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Thanks Sean, much appreciated. To clarify, the example below would
> create the dictionary for use for the rare word approach?
>
> Thanks,
> Brandon
>
> -----Original Message-----
> From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> Sent: Wednesday, September 16, 2015 2:16 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Hi Brandon,
>
> I just checked in a bin/dictionarytool.zip. It should have everything
> that you need (.jar, lib/, data/).
>
> java -cp dictionarytool.jar;lib/*
> org.apache.ctakes.dictionarytool.DictionaryCreator2 [args]
>
> Should do the trick.
>
> To recreate a 2015 version of the current ctakes dictionary, the
> arguments are:
>
> -umls my/path/to/2015AA/META -fd ./data/tiny -atui
> ./data/tiny/CtakesAnatTuis.txt -tui ./data/tiny/CtakesSnomedTuis.txt
> -db jdbc:hsqldb:file:my/path/to/snorx2015 -tbl CUI_TERMS
>
> Create my/path/to/snorx2015 by copying
> resources/memdbtemplate/ctakesumls.properties to
> my/path/to/snorx2015.properties - there is a resources/README about this.
>
> Before populating a DB, I usually do a trial run first, writing to a
> flat file. Replace "-db ... -tbl ..." with "-ol my/path/to/testout.bsv"
>
> Sean
>
> -----Original Message-----
> From: Geise, Brandon D. [mailto:bdge...@geisinger.edu]
> Sent: Wednesday, September 16, 2015 1:49 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Hi Sean,
>
> That'd be great.
>
> I think I'm building it incorrectly because after I build the jar and
> try to run specifying DictionaryCreator2 as the main class it says it
> can't find it. I'm not too familiar with Java and building
> projects/jars so it could be my ignorance causing the problem.
>
> Thanks,
> Brandon
>
> -----Original Message-----
> From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
> Sent: Wednesday, September 16, 2015 1:45 PM
> To: dev@ctakes.apache.org
> Subject: RE: Fast Dictionary Update
>
> Hi Brandon,
>
> I can send you a jar or commit one pre-built. What goes wrong when you
> try to build the tool?
>
> Sean
>
> -----Original Message-----
> From: Geise, Brandon D. [mailto:bdge...@geisinger.edu]
> Sent: Wednesday, September 16, 2015 1:23 PM
> To: 'dev@ctakes.apache.org'
> Subject: Fast Dictionary Update
>
> Does someone have the DictionaryTool jar available? I'm having trouble
> creating the jar file from the project and would like to be able to
> create an updated UMLS fast dictionary for 2015.
>
> Thanks,
> Brandon
>
> IMPORTANT WARNING: The information in this message (and the documents
> attached to it, if any) is confidential and may be legally privileged.
> It is intended solely for the addressee. Access to this message by
> anyone else is unauthorized. If you are not the intended recipient,
> any disclosure, copying, distribution or any action taken, or omitted
> to be taken, in reliance on it is prohibited and may be unlawful. If
> you have received this message in error, please delete all electronic
> copies of this message (and the documents attached to it, if any),
> destroy any hard copies you may have created and notify me immediately by
> replying to this email. Thank you.
>
> Geisinger Health System utilizes an encryption process to safeguard
> Protected Health Information and other confidential data contained in
> external e-mail messages. If email is encrypted, the recipient will
> receive an e-mail instructing them to sign on to the Geisinger Health
> System Secure E-mail Message Center to retrieve the encrypted e-mail.