Hi Finan, I am sorry if I am asking too much but I am really stuck ...
1- Could you please give me a link where I can download the latest version of dictionarytool?
2- The current version I have always produces output for icd10pcs, although the -src file lists icd10CM. Is icd10pcs statically added inside dictionarytool? If I change it from within the code, should it work?
3- After running the tool, lines like the ones below are added to the .script file. Am I on the right track?

INSERT INTO CUI_TERMS VALUES(20417,1,2,'hyoid bones','bones')
INSERT INTO CUI_TERMS VALUES(20417,0,2,'os hyoideum','os')

4- As naive as this sounds, what is a tui inside CtakesAnatTuis.txt?
5- Is there any documentation you advise me to read?

On Thu, Dec 10, 2015 at 10:37 AM, Alaa al Barari <[email protected]> wrote:

> Finan, from where can I download the 2015 .properties from sourceforge? Are those all ICDs and snomed ?
>
> I prefer to learn how to generate my own db because I will need to create my own later on, so your help is appreciated.
>
> On Thu, Dec 10, 2015 at 9:13 AM, Alaa al Barari <[email protected]> wrote:
>
>> Thanks, but is what I ended up with wrong ?
>> On Dec 10, 2015 4:26 AM, "Finan, Sean" <[email protected]> wrote:
>>
>>> Hi Alaa,
>>>
>>> If you downloaded the 2015 .properties and .script files then you do not need to run the dictionary creation tool. Those databases are already populated and ready to use.
>>>
>>> Sean
>>>
>>>
>>> -----Original Message-----
>>> From: Alaa al Barari [mailto:[email protected]]
>>> Sent: Wednesday, December 09, 2015 6:33 PM
>>> To: [email protected]
>>> Subject: Re: ctakes with icd10; 2015 versions available on sourceforge!
>>>
>>> so basically it looks like the path had Desktop capitalized, that's why it did not work.
>>>
>>> I ended up having rows like this inside ctakesicd2015.script :
>>>
>>> INSERT INTO CUI_TERMS VALUES(2723481,8,15,'magnesium sulfate 1000 mg / 50 ml - nacl 0 . 9 % intravenous solution','nacl')
>>> INSERT INTO CUI_TERMS VALUES(2723481,9,16,'magnesium sulfate , 2 g / 100 ml - nacl 0 . 9 % intravenous solution','nacl')
>>> INSERT INTO CUI_TERMS VALUES(2723481,0,7,'magnesium sulfate 20 mg / ml injection','magnesium')
>>>
>>> does this mean it worked ?
>>>
>>> On Thu, Dec 10, 2015 at 1:07 AM, Alaa al Barari <[email protected]> wrote:
>>>
>>> > Thanks Finan and Brandon, your help is appreciated a lot.
>>> >
>>> > I downloaded the dictionary tool from
>>> > https://urldefense.proofpoint.com/v2/url?u=https-3A__svn.apache.org_repos_asf_ctakes_sandbox_dictionarytool_bin_dictionarytool.zip&d=BQIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=uJq_3OpLiUaBOz9vqxKBI-gUAtLhJMme9uKXqroHhMM&s=JVOlLM08gTn5rV2T3R_bqeZT8XbMDgLhfKg8Fo5mAQw&e=
>>> > I hope it's the latest and bug free.
>>> >
>>> > my running command is :
>>> > java -cp ./dictionarytool.jar:lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls /home/abarari/Desktop/umls/2015AB/META/ -atui ./data/optional/CtakesAnatTuis.txt -db jdbc:hsqldb:file:/home/abarari/Desktop/dictionarytool/output/ctakesicd2015 -tbl CUI_TERMS -df ./data/optional/ -src ./data/small/ConversionSources.txt -tui ./data/optional/CtakesAllTuis.txt
>>> >
>>> > I am running on ubuntu by the way ... anyway under /home/abarari/Desktop/dictionarytool/output/ there is only
>>> >
>>> > abarari@ubuntu:~/Desktop/dictionarytool/output$ ls
>>> > ctakesicd2015.log ctakesicd2015.properties ctakesicd2015.script
>>> >
>>> > where is the database ? am I doing something wrong ? do I need to create the database before executing the dictionarytool or what ?
>>> >
>>> > I found a couple of issues in the dictionary tool, it does not work well with relative paths.
>>> >
>>> > On Wed, Dec 9, 2015 at 7:11 AM, Pei Chen <[email protected]> wrote:
>>> >
>>> >> Brandon,
>>> >> That sounds great!
>>> >> Please open a Jira ticket for any contributions (anyone should be able to create a Jira account). There are some legal items built into the ASF Jira attachments for accepting contributions/donations. It will also credit the contributors with the merit appropriately. Anyone who is interested can follow the Jira item. (Even better if contributions were open discussion/open development.)
>>> >> --Pei
>>> >>
>>> >> On Tue, Dec 8, 2015 at 10:36 PM, Geise, Brandon D. <[email protected]> wrote:
>>> >> > I'd be interested in contributing to making the dictionary tool more user friendly with a GUI.
>>> >> >
>>> >> > Thanks,
>>> >> > Brandon
>>> >> >
>>> >> > -----Original Message-----
>>> >> > From: Finan, Sean [mailto:[email protected]]
>>> >> > Sent: Tuesday, December 08, 2015 6:12 PM
>>> >> > To: [email protected]
>>> >> > Subject: RE: ctakes with icd10; 2015 versions available on sourceforge!
>>> >> >
>>> >> > Hi Dave,
>>> >> >
>>> >> > I'm always happy to see interest in our stuff!
>>> >> >
>>> >> >>Step 1
>>> >> > I built the tool to be able to build a dictionary using anything in the umls - snomed, icd9, hpo, etc. so using the veterinary extension shouldn't be a problem. You just add it to the CtakesSources file (or create an alternate file and point to it with -src). To answer another of your questions, there can be zero or more sources - you saw snomedct and snomedct_us (each valid in a different umls version).
>>> >> > It also can include any semantic type, just add (or remove) the appropriate tuis in a different data file.
>>> >> >
>>> >> >>Step 2
>>> >> > You have it right - you copy the templates to another location and output to that location. Otherwise you 'lose' your templates.
>>> >> >
>>> >> >>Step 3 and 4
>>> >> > The jar is built from source. I need to (soon) check in updates to the source, and at the same time I can check in a default prebuilt .jar. The lib/ directory is in the source repository.
>>> >> >
>>> >> > Various people have toyed with the idea of putting the tool into a ctakes module, putting it into an "installation package", making a gui ... The best option (imo) is probably to make an easy to use gui and keep a pre-built version in sandbox. Someday, after the rainbow, maybe I'll get a chance to do that ...
>>> >> >
>>> >> > Sean
>>> >> >
>>> >> >
>>> >> > -----Original Message-----
>>> >> > From: David Kincaid [mailto:[email protected]]
>>> >> > Sent: Tuesday, December 08, 2015 4:57 PM
>>> >> > To: [email protected]
>>> >> > Subject: Re: ctakes with icd10; 2015 versions available on sourceforge!
>>> >> >
>>> >> > Thanks, Sean! It's great that cTAKES may soon have an up to date database out of the box. Hopefully it will cut down on the need for many to build their own DB's. Thank you much for doing that.
>>> >> >
>>> >> > Unfortunately, I still will need to build a custom one for us. I work in veterinary medicine so I need to add in the veterinary extension for SNOMED-CT into the database.
>>> >> >
>>> >> > I looked over the steps below that Brandon included and have some questions:
>>> >> >
>>> >> > step 1 says to "Change /data/default/CtakesSources.txt from "SNOMEDCT" to "SNOMEDCT_US". The file that I have has two lines in it. First line is SNOMED, second line is SNOMEDCT_US. So this step doesn't really make sense.
>>> >> >
>>> >> > step 2 should reference the two scripts as being in resource/memdbtemplate so others don't have to search for them. Not sure what it means to move them to "location to put new UMLS DB". Does that mean move them into a new directory where the newly created UMLS DB will get written?
>>> >> >
>>> >> > steps 3 and 4 for running the tools reference dictionarytool.jar which doesn't exist. Does one need to build that somehow from the source before running it? The command line also adds "lib/*" to the classpath. Is that the lib directory inside the dictionarytool source code or some other location?
>>> >> >
>>> >> > What else would I need to do to include the SNOMED-CT Veterinary Extension along with the snomedct and rxnorm sources?
>>> >> >
>>> >> > I'll probably not have time to try this out for a while yet, but when I do I'd be happy to write up an easy to follow tutorial for building a custom dictionary assuming I am able to get it to work.
>>> >> >
>>> >> > Has anyone considered making this tool available outside of the source code itself? Like including it in the main cTAKES release? It seems there is demand for it.
>>> >> >
>>> >> > - Dave
>>> >> >
>>> >> > On Tue, Dec 8, 2015 at 3:22 PM, Finan, Sean <[email protected]> wrote:
>>> >> >
>>> >> >> Hi Brandon, thanks for finding and forwarding the instructions!
>>> >> >>
>>> >> >> I have checked in two new hsqldb dictionaries, both from the 2015AB version of the UMLS. They both have codes for snomedct_us, rxnorm, icd9cm and icd10pcs - as well as the usual cui, tui, preferred term mappings.
>>> >> >>
>>> >> >> One uses cuis filtered by snomed and rxnorm, the other adds cuis filtered by icd9 and icd10.
>>> >> >> What this means: Cuis that exist for a [filter source] are added to the dictionary, as are all text variations from all sources that contain that cui. Both dictionaries also use the standard ctakes semantic group tui filters.
>>> >> >>
>>> >> >> The names are ctakessnorx2015 and ctakesicd2015
>>> >> >>
>>> >> >> The snomed rxnorm :
>>> >> >>
>>> >> >> https://urldefense.proofpoint.com/v2/url?u=http-3A__sourceforge.net_p_ctakesresources_code_HEAD_tree_trunk_ctakes-2Dresources-2Dsnomed-2Drword-2Dhsqldb-2D2011ab_src_main_resources_org_apache_ctakes_dictionary_lookup_fast_ctakessnorx2015_&d=BQIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=SRqwsl3FmuUXq77GmVlfXn0lE0pVRkL53DNhukcaW6c&s=kWCcj3-hcqYWZXIPhsERggDLCO-5gppCRoS1Gav7r2A&e=
>>> >> >>
>>> >> >> The snomed rxnorm icd9 icd10:
>>> >> >>
>>> >> >> https://urldefense.proofpoint.com/v2/url?u=http-3A__sourceforge.net_p_ctakesresources_code_HEAD_tree_trunk_ctakes-2Dresources-2Dsnomed-2Drword-2Dhsqldb-2D2011ab_src_main_resources_org_apache_ctakes_dictionary_lookup_fast_ctakesicd2015_&d=BQIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=SRqwsl3FmuUXq77GmVlfXn0lE0pVRkL53DNhukcaW6c&s=RZ--ZQ2qvGnhm4h2Vvz1oU97qA8BG2G39Tww7EdYgKA&e=
>>> >> >>
>>> >> >> The svn root for the whole ugly thing is:
>>> >> >> svn checkout svn://svn.code.sf.net/p/ctakesresources/code/trunk
>>> >> >>
>>> >> >> Stats:
>>> >> >> ctakessnorx2015
>>> >> >> 545,913 Terms
>>> >> >> 229,251 Concepts (Cuis)
>>> >> >> 272,987 Snomed codes
>>> >> >> 32,419 Rxnorm codes
>>> >> >> 11,321 icd9 codes
>>> >> >> 61 icd10 codes
>>> >> >>
>>> >> >> Ctakesicd2015
>>> >> >> 611,230 Terms
>>> >> >> 282,211 Concepts
>>> >> >> 18,626 icd9 codes
>>> >> >> 45,818 icd10 codes
>>> >> >> Snomed and Rxnorm counts are the same
>>> >> >>
>>> >> >> So, adding the icd filters gave us an extra ~53,000 concepts and ~65,000 terms.
>>> >> >>
>>> >> >> I would like to move this all to a better root (not ctakes-resources-snomed-rword-hsqldb-2011ab) but I wasn't able to write directly in trunk (??) and need to get moving on to other things.
>>> >> >>
>>> >> >> There is help on the ctakes wiki:
>>> >> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_CTAKES_cTAKES-2B3.2-2B-2D-2BFast-2BDictionary-2BLookup&d=BQIBaQ&c=qS4goWBT7poplM69zy_3xhKwEW14JZMSdioCoppxeFU&r=fs67GvlGZstTpyIisCYNYmQCP6r0bcpKGd4f7d4gTao&m=SRqwsl3FmuUXq77GmVlfXn0lE0pVRkL53DNhukcaW6c&s=98W_vAHGZ2FLEMPfrSgEHtZt-mQ3XJjF6yQYM26tqP4&e=
>>> >> >> Though I should probably add a few items ...
>>> >> >>
>>> >> >>
>>> >> >> Sean
>>> >> >>
>>> >> >>
>>> >> >> -----Original Message-----
>>> >> >> From: Geise, Brandon D. [mailto:[email protected]]
>>> >> >> Sent: Tuesday, December 08, 2015 12:51 PM
>>> >> >> To: [email protected]
>>> >> >> Subject: RE: ctakes with icd10
>>> >> >>
>>> >> >> Not to perpetuate the instructions again but I sent these out not long ago when I was going through the process and Sean was helping me.
>>> >> >>
>>> >> >> 1. Change /data/default/CtakesSources.txt from "SNOMEDCT" to "SNOMEDCT_US"
>>> >> >> 2. Copy ctakesumls.properties and ctakesumls.script from memdbtemplate to location to put new UMLS DB
>>> >> >> 3. Run DictionaryCreator2
>>> >> >> java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls "\pathToUmls\META" -atui ./data/tiny/CtakesAnatTuis.txt -db jdbc:hsqldb:file:pathTonewDB\snorx2015 -tbl CUI_TERMS
>>> >> >> 4. Run CodeMapCreator
>>> >> >> java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.CodeMapCreator -umls "\pathToUmls\META" -atui ./data/tiny/CtakesAnatTuis.txt -db jdbc:hsqldb:file:pathTonewDB\snorx2015 -tbl CUI_TERMS
>>> >> >> 5. Copy new DB files to new location and create a copy of cTakesHsql.xml and update dictionary location
>>> >> >>
>>> >> >> Thanks,
>>> >> >> Brandon
>>> >> >>
>>> >> >> -----Original Message-----
>>> >> >> From: David Kincaid [mailto:[email protected]]
>>> >> >> Sent: Tuesday, December 08, 2015 12:47 PM
>>> >> >> To: [email protected]
>>> >> >> Subject: Re: ctakes with icd10
>>> >> >>
>>> >> >> This seems like a pretty common request and with such an old version of UMLS database shipped with cTAKES it's only going to get worse. I've been wanting to build a dictionary using the latest UMLS release (as well as a custom database), so would be happy to write up the steps as I go through it. That assumes that I can dig up the instructions in the dev list.
>>> >> >>
>>> >> >> - Dave
>>> >> >>
>>> >> >> On Tue, Dec 8, 2015 at 11:36 AM, Finan, Sean <[email protected]> wrote:
>>> >> >>
>>> >> >> > Hi Alaa,
>>> >> >> >
>>> >> >> > The -shortest- answer is that you'll need to run the dictionary creation tool. There are instructions in older devlist threads.
>>> >> >> > By default the dictionary creation tool does add icd9 and icd10 tables to the dictionary.
>>> >> >> > The problem is that in Umls 2011AB those codes weren't very well populated. The 2015AB icd# set is much more rich so those tables should be pretty good.
>>> >> >> > Then in ctakes you would look up annotations by icd9 or icd10 codes instead of by cui:
>>> >> >> > OntologyConceptUtil.getAnnotationsByCode( jcas, lookupWindow, icd#Code );
>>> >> >> > OntologyConceptUtil.getAnnotationsByCode( jcas, icd#Code );
>>> >> >> >
>>> >> >> > Sean
>>> >> >> >
>>> >> >> > -----Original Message-----
>>> >> >> > From: Savova, Guergana [mailto:[email protected]]
>>> >> >> > Sent: Tuesday, December 08, 2015 12:17 PM
>>> >> >> > To: [email protected]
>>> >> >> > Subject: RE: ctakes with icd10
>>> >> >> >
>>> >> >> > Hi Alaa,
>>> >> >> > You need to create a resource off the terminology/ontology you want to use (in this case ICD9 or ICD10). Then run that resource with cTAKES for the fast dictionary lookup. There is cTAKES code and some documentation on how to create that resource. By default, cTAKES runs with a resource created from the English version of SNOMED CT and RxNORM.
>>> >> >> > Hope this helps.
>>> >> >> > --Guergana
>>> >> >> >
>>> >> >> > -----Original Message-----
>>> >> >> > From: Alaa al Barari [mailto:[email protected]]
>>> >> >> > Sent: Tuesday, December 8, 2015 10:01 AM
>>> >> >> > To: [email protected]
>>> >> >> > Subject: ctakes with icd10
>>> >> >> >
>>> >> >> > Hi,
>>> >> >> >
>>> >> >> > I downloaded Latest umls version, and I want to know how to make ctakes work with icd10 and icd9.
>>> >> >> >
>>> >> >> >
>>> >> >> > Thanks
>>> >> >> >

--
Eng Alaa Al-Barari
phone 0599297470
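
On question 3 at the top of this message: one rough way to check whether the generated .script file looks sane is to parse the INSERT INTO CUI_TERMS rows and confirm each one is internally consistent. The sketch below is plain Python and is not part of dictionarytool. It assumes, judging only from the rows quoted in this thread, that the second value is the zero-based position of the last column's word within the term text and that the third value is the number of space-separated tokens in the term. The name check_rows is made up for illustration.

```python
import re

# One HSQLDB row of the form quoted in this thread, e.g.
#   INSERT INTO CUI_TERMS VALUES(20417,1,2,'hyoid bones','bones')
# A doubled '' inside a quoted value is treated as an escaped quote.
ROW = re.compile(
    r"INSERT INTO CUI_TERMS VALUES\((\d+),(\d+),(\d+),"
    r"'((?:[^']|'')*)','((?:[^']|'')*)'\)"
)


def check_rows(script_text):
    """Split parsed rows into (consistent, inconsistent) lists.

    ASSUMPTION (inferred from the sample rows, not from documentation):
    value 2 is the index of the lookup word inside the term and
    value 3 is the term's token count.
    """
    good, bad = [], []
    for match in ROW.finditer(script_text):
        cui, index, count = (int(match.group(i)) for i in (1, 2, 3))
        text = match.group(4).replace("''", "'")
        word = match.group(5).replace("''", "'")
        tokens = text.split(" ")
        ok = len(tokens) == count and index < count and tokens[index] == word
        (good if ok else bad).append((cui, index, count, text, word))
    return good, bad


# Rows copied from this thread.
sample = """INSERT INTO CUI_TERMS VALUES(20417,1,2,'hyoid bones','bones')
INSERT INTO CUI_TERMS VALUES(20417,0,2,'os hyoideum','os')
INSERT INTO CUI_TERMS VALUES(2723481,0,7,'magnesium sulfate 20 mg / ml injection','magnesium')"""

good, bad = check_rows(sample)
print(len(good), "consistent rows,", len(bad), "inconsistent rows")
```

To run it against a real file, read the .script file into a string and pass that to check_rows. A large number of inconsistent rows would suggest that either the run went wrong or the column assumption above does not hold.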
