Hi Brandon,
Sorry for the late reply - I've been out for an extended weekend.
The coding scheme change is fairly simple to explain (imo). The plain old CUI
is not a snomed code. If the snomed codes are reported by ctakes (uncomment
the snomed line in cTakesHsql.xml), then their UmlsConcept entries in the
ontology array have the coding scheme name "SNOMEDCT".
<!-- Optional tables for optional term info.
Uncommenting these lines alone may not persist term information;
persistence depends upon the TermConsumer. -->
<property key="snomedTable" value="snomedct"/>
Basically, the "CTAKES" name indicates that the scheme only contains Umls Cuis
that have TUIs of the default ctakes configuration. ctakes does not use all
umls tuis, therefore I did not name the scheme "UMLS". If you make a custom
scheme (etc.) you can change the name in cTakesHsql.xml or in a custom .xml file:
<!-- Depending upon the consumer, the value of codingScheme may or
may not be used. With the packaged consumers,
codingScheme is a default value used only for cuis that do not have
secondary codes (snomed, rxnorm, etc.) -->
<property key="codingScheme" value="CTAKES"/>
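To make the fallback described in that comment concrete, here is a minimal Python sketch. It is an illustration only, not the cTAKES consumer code: the function name `concepts_for_cui`, the dictionary layout and the placeholder CUI are all made up for the example. The only things taken from this thread are the default value "CTAKES" and the rule that it applies only to cuis without secondary codes.

```python
DEFAULT_SCHEME = "CTAKES"  # value of the codingScheme property shown above


def concepts_for_cui(cui, secondary_codes, default_scheme=DEFAULT_SCHEME):
    """Return (scheme, cui, code) tuples for one CUI.

    secondary_codes maps a scheme name such as "SNOMEDCT" or "RXNORM"
    to the list of codes stored for this CUI (hypothetical layout).
    """
    concepts = [
        (scheme, cui, code)
        for scheme, codes in sorted(secondary_codes.items())
        for code in codes
    ]
    if not concepts:
        # No secondary codes: fall back to the default scheme, CUI only.
        concepts.append((default_scheme, cui, None))
    return concepts
```

With an empty `secondary_codes` mapping the single returned entry carries the scheme "CTAKES"; as soon as a snomed code is present, the entry carries "SNOMEDCT" instead and the default is not used.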
The "RelationsExtractor" in the dictionary creator tool is completely
experimental and unfinished - but perhaps some day it will throw umls relations
into a format that ctakes can directly use. For the time being it should be
avoided.
Sean
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Thursday, September 17, 2015 10:23 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
You can disregard my question about the relation extraction as I fixed this by
building the new dictionary with the default data files in the dictionarytool.
I am curious about the SNOMED change still though.
Thanks,
Brandon
-----Original Message-----
From: Geise, Brandon D.
Sent: Thursday, September 17, 2015 9:40 PM
To: cTAKES Developer list <[email protected]>
Subject: RE: Fast Dictionary Update
Thanks Dmitriy. I was referring to the RelationsExtractor class found in the
dictionarytool. On a similar note, the coding scheme for all SNOMEDCT codes
for the new dictionary is CTAKES compared to SNOMED with the UMLS version
packaged with cTakes. Is there something else I need to run for the dictionary
creation that I'm missing?
Thanks,
Brandon
-----Original Message-----
From: Dligach, Dmitriy [mailto:[email protected]]
Sent: Thursday, September 17, 2015 8:42 PM
To: cTAKES Developer list <[email protected]>
Subject: Re: Fast Dictionary Update
Hi Brandon,
Relation extraction at the moment only handles two specific relation types:
LocationOf and DegreeOf. You are welcome to run it if you need these specific
relations.
Dima
--
Dmitriy (Dima) Dligach, Ph.D.
Boston Children's Hospital and Harvard Medical School
(617) 651-0397
On Sep 17, 2015, at 17:08, Geise, Brandon D.
<[email protected]> wrote:
Does the RelationsExtractor need to be run in order to generate information on
relationships from cTakes? When running with 2011 UMLS dictionary I'm able to
get relationships for BodyLocationMentions but with the dictionary I created I
am not getting this information. Any advice?
Thanks,
Brandon
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Thursday, September 17, 2015 1:18 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
It claims that the database is connected, and the preceding line of dots is spat
out during loading, which took ~3-4 seconds (so something was there):
............
17 Sep 2015 12:58:58 INFO JdbcConnectionFactory - Database connected
Strange. I don't really know what to tell you right now. Perhaps something
will click with me later ...
Did you also run org.apache.ctakes.dictionarytool.CodeMapCreator? It isn't
strictly necessary but it stores the tuis in the database so that cTakes can
identify the semantic group of a mention.
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Thursday, September 17, 2015 1:02 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Not specifically loaded. Here's what I see when loading the pipeline:
17 Sep 2015 12:58:54 INFO JdbcConnectionFactory - Connecting to
jdbc:hsqldb:file:path/to/ctakes/ctakes-dictionary-lookup-fast-res/src/main/resources/org/apache/ctakes/dictionary/lookup/fast/UMLS2015/snorx2015:
............
17 Sep 2015 12:58:58 INFO JdbcConnectionFactory - Database connected
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Thursday, September 17, 2015 12:57 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Making an alternate copy of cTakesHsql.xml and pointing to the new dictionary
is all that is necessary. Do you see a message in the initialization output
indicating that the dictionary db has been loaded?
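When working from a copy of cTakesHsql.xml, it can help to dump the key/value properties it actually contains and confirm that they point where you expect. The following is a minimal sketch using only the Python standard library. The root element name in `SAMPLE` is a placeholder assumption and the sample is heavily trimmed; the two property lines are the ones quoted elsewhere in this thread.

```python
import xml.etree.ElementTree as ET


def list_properties(xml_text):
    """Return {key: value} for every <property key=".." value=".."/> element."""
    root = ET.fromstring(xml_text)
    return {
        prop.get("key"): prop.get("value")
        for prop in root.iter("property")
        if prop.get("key") is not None
    }


# Hypothetical, trimmed descriptor; a real file has more elements.
SAMPLE = """
<lookupSpecification>
  <property key="snomedTable" value="snomedct"/>
  <property key="codingScheme" value="CTAKES"/>
</lookupSpecification>
"""

if __name__ == "__main__":
    for key, value in list_properties(SAMPLE).items():
        print(key, "=", value)
```

Commented-out properties do not appear in the result, so a snomed line that is still inside an XML comment shows up as missing.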
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Thursday, September 17, 2015 12:54 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Great, thanks. Both seemed to work for populating the script table.
Besides the path to the new dictionary needing to be changed in cTakesHsql.xml,
does anything else need to be modified to use the new dictionary? My pipeline
runs however there aren't any annotations related to the UMLS concepts. The
only annotations I'm seeing are date, roman numeral, or modifier related. (My
pipeline is UMLSFastProcessor with additions for modifiers and templatefiller).
Any suggestions would be appreciated.
Thanks,
Brandon
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Thursday, September 17, 2015 10:40 AM
To: [email protected]
Subject: RE: Fast Dictionary Update
Correct, Hsql should automatically read the .log file upon first use, and then
perform the inserts into the .script file.
In case you want to play it safe, check the README in the resource/ directory
(where you got the hsqldb template). The last paragraph indicates how you can
launch a simple sql tool to play with the db. You will need to change the name
of the db accordingly. Upon first launch of the sql tool everything should be
moved from the .log to the .script file. It is a strange setup/workflow, but
it seems to work.
Sean
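One way to check that the inserts really moved from the .log to the .script file is to count them in both. This is a rough sketch under stated assumptions: both files are treated as plain text with one statement per line, and an insert is any line starting with "INSERT INTO". The helper names `count_inserts` and `inserts_moved` are invented for the example and are not part of the dictionary tool or of HSQLDB.

```python
from pathlib import Path


def count_inserts(path):
    """Count lines that start with INSERT INTO in a text file."""
    count = 0
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.startswith("INSERT INTO"):
                count += 1
    return count


def inserts_moved(db_base):
    """True if <db_base>.script has inserts and <db_base>.log has none or is absent."""
    script = Path(f"{db_base}.script")
    log = Path(f"{db_base}.log")
    in_script = count_inserts(script) if script.exists() else 0
    in_log = count_inserts(log) if log.exists() else 0
    return in_script > 0 and in_log == 0
```

For a database created as my/path/to/snorx2015, the call would be `inserts_moved("my/path/to/snorx2015")`, run after the db has been opened once.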
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Thursday, September 17, 2015 10:31 AM
To: [email protected]
Subject: RE: Fast Dictionary Update
When I run the tool it outputs a file with a .log extension that has all the
insert statements. Do I copy this to the .script template from memcachedb in
the dictionarytool project or should the inserts be put into the .script file
by default on the program execution?
Thanks,
Brandon
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 9:59 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Excellent!
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 9:55 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
No, I had changed it on the Tiny source file. I just changed the default file
and it looks to be running as expected now.
Thank you for all your help and patience,
Brandon
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 9:35 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Did you add it to data/default/CtakesSources.txt?
If not, then you need to specify -src ./data/tiny/CtakesSources.txt
Sorry for any confusion.
As soon as my inet isn't overloaded I'll download 2015AA and see if I can build
a dictionary.
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 8:14 PM
To: [email protected]; [email protected]
Subject: RE: Fast Dictionary Update
Sean,
I added that and still had the same issue.
Thanks,
Brandon
_____________________________
From: Finan, Sean <[email protected]>
Sent: Wednesday, September 16, 2015 7:56 PM
Subject: RE: Fast Dictionary Update
To: <[email protected]>
And you added "SNOMEDCT_US" to data/tiny/CtakesSources.txt?
-----Original Message-----
From: Tomasz Oliwa [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 7:13 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
I have exactly the same problem with the tool.
A grep on MRCONSO.RRF for "SNOMEDCT" or for "SNOMEDCT_US" shows many lines.
________________________________________
From: Geise, Brandon D. [[email protected]]
Sent: Wednesday, September 16, 2015 5:05 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Yes, it finds "SNOMEDCT_US".
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 5:17 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Ah, now I see what you mean. Can you do a grep on your MRCONSO.RRF for
"SNOMEDCT"?
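A plain grep matches "SNOMEDCT" anywhere on a line, so it cannot tell "SNOMEDCT" from "SNOMEDCT_US". As a more precise alternative, here is a small Python sketch that counts rows per source abbreviation. It relies on MRCONSO.RRF being pipe-delimited with the source abbreviation (SAB) in the 12th field; the function name `count_sources` is made up for the example.

```python
from collections import Counter

SAB_COLUMN = 11  # 12th pipe-delimited field of MRCONSO.RRF: source abbreviation


def count_sources(lines, prefix="SNOMEDCT"):
    """Count MRCONSO rows per source abbreviation that starts with prefix."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) > SAB_COLUMN and fields[SAB_COLUMN].startswith(prefix):
            counts[fields[SAB_COLUMN]] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_sources.py /path/to/UMLS/META/MRCONSO.RRF
    with open(sys.argv[1], encoding="utf-8") as rrf:
        for sab, n in count_sources(rrf).most_common():
            print(sab, n)
```

If the output lists only SNOMEDCT_US, the source name in CtakesSources.txt has to match that spelling.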
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 4:04 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
I tried changing as suggested.
Below is what I see for the snomed piece, but for RXNorm it writes terms at the
end.
Reading list of Source Types from ./data/default/CtakesSources.txt
File Lines 1 list of Source Types 1
Reading list of Tuis from ./data/tiny/CtakesSnomedTuis.txt
File Lines 24 list of Tuis 24
Compiling list of Cuis with wanted Tuis using /patto/UMLS_Current_Version/META/MRSTY.RRF
File Line 200000 Cuis 60895
File Line 300000 Cuis 85750
File Line 400000 Cuis 135098
File Line 600000 Cuis 183925
File Line 1700000 Cuis 376338
File Line 1800000 Cuis 471009
File Line 1900000 Cuis 568375
File Line 2100000 Cuis 674715
File Line 2800000 Cuis 903583
File Line 3300000 Cuis 973791
File Lines 3370173 Cuis 999451
..................................................File Line 100000 Valid Cuis 0
..................................................File Line 200000 Valid Cuis 0
..................................................File Line 300000 Valid Cuis 0
..................................................File Line 400000 Valid Cuis 0
..................................................File Line 500000 Valid Cuis 0
..................................................File Line 600000 Valid Cuis 0
..................................................File Line 700000 Valid Cuis 0
..................................................File Line 800000 Valid Cuis 0
..................................................File Line 900000 Valid Cuis 0
..................................................File Line 1000000 Valid Cuis 0
..................................................File Line 1100000 Valid Cuis 0
..................................................File Line 1200000 Valid Cuis 0
..................................................File Line 1300000 Valid Cuis 0
..................................................File Line 1400000 Valid Cuis 0
..................................................File Line 1500000 Valid Cuis 0
..................................................File Line 1600000 Valid Cuis 0
..................................................File Line 1700000 Valid Cuis 0
..................................................File Line 1800000 Valid Cuis 0
..................................................File Line 1900000 Valid Cuis 0
..................................................File Line 2000000 Valid Cuis 0
..................................................File Line 2100000 Valid Cuis 0
..................................................File Line 2200000 Valid Cuis 0
..................................................File Line 2300000 Valid Cuis 0
..................................................File Line 2400000 Valid Cuis 0
..................................................File Line 2500000 Valid Cuis 0
..................................................File Line 2600000 Valid Cuis 0
..................................................File Line 2700000 Valid Cuis 0
..................................................File Line 2800000 Valid Cuis 0
..................................................File Line 2900000 Valid Cuis 0
..................................................File Line 3000000 Valid Cuis 0
..................................................File Line 3100000 Valid Cuis 0
..................................................File Line 3200000 Valid Cuis 0
..................................................File Line 3300000 Valid Cuis 0
..................................................File Line 3400000 Valid Cuis 0
..................................................File Line 3500000 Valid Cuis 0
..................................................File Line 3600000 Valid Cuis 0
..................................................File Line 3700000 Valid Cuis 0
..................................................File Line 3800000 Valid Cuis 0
..................................................File Line 3900000 Valid Cuis 0
..................................................File Line 4000000 Valid Cuis 0
..................................................File Line 4100000 Valid Cuis 0
..................................................File Line 4200000 Valid Cuis 0
..................................................File Line 4300000 Valid Cuis 0
..................................................File Line 4400000 Valid Cuis 0
..................................................File Line 4500000 Valid Cuis 0
..................................................File Line 4600000 Valid Cuis 0
..................................................File Line 4700000 Valid Cuis 0
..................................................File Line 4800000 Valid Cuis 0
..................................................File Line 4900000 Valid Cuis 0
..................................................File Line 5000000 Valid Cuis 0
..................................................File Line 5100000 Valid Cuis 0
..................................................File Line 5200000 Valid Cuis 0
..................................................File Line 5300000 Valid Cuis 0
..................................................File Line 5400000 Valid Cuis 0
..................................................File Line 5500000 Valid Cuis 0
..................................................File Line 5600000 Valid Cuis 0
..................................................File Line 5700000 Valid Cuis 0
..................................................File Line 5800000 Valid Cuis 0
..................................................File Line 5900000 Valid Cuis 0
..................................................File Line 6000000 Valid Cuis 0
..................................................File Line 6100000 Valid Cuis 0
..................................................File Line 6200000 Valid Cuis 0
..................................................File Line 6300000 Valid Cuis 0
..................................................File Line 6400000 Valid Cuis 0
..................................................File Line 6500000 Valid Cuis 0
..................................................File Line 6600000 Valid Cuis 0
..................................................File Line 6700000 Valid Cuis 0
..................................................File Line 6800000 Valid Cuis 0
..................................................File Line 6900000 Valid Cuis 0
..................................................File Line 7000000 Valid Cuis 0
..................................................File Line 7100000 Valid Cuis 0
..................................................File Line 7200000 Valid Cuis 0
..................................................File Line 7300000 Valid Cuis 0
..................................................File Line 7400000 Valid Cuis 0
..................................................File Line 7500000 Valid Cuis 0
..................................................File Line 7600000 Valid Cuis 0
..................................................File Line 7700000 Valid Cuis 0
..................................................File Line 7800000 Valid Cuis 0
..................................................File Line 7900000 Valid Cuis 0
..................................................File Line 8000000 Valid Cuis 0
..................................................File Line 8100000 Valid Cuis 0
..................................................File Line 8200000 Valid Cuis 0
..................................................File Line 8300000 Valid Cuis 0
..................................................File Line 8400000 Valid Cuis 0
..................................................File Line 8500000 Valid Cuis 0
..................................................File Line 8600000 Valid Cuis 0
..................................................File Line 8700000 Valid Cuis 0
..................................................File Line 8800000 Valid Cuis 0
.............File Lines 8827152 Valid Cuis 0
Compiling map of Umls Cuis and Texts
..................................................File Line 100000 Terms 0
..................................................File Line 200000 Terms 0
..................................................File Line 300000 Terms 0
..................................................File Line 400000 Terms 0
..................................................File Line 500000 Terms 0
..................................................File Line 600000 Terms 0
..................................................File Line 700000 Terms 0
..................................................File Line 800000 Terms 0
..................................................File Line 900000 Terms 0
..................................................File Line 1000000 Terms 0
..................................................File Line 1100000 Terms 0
..................................................File Line 1200000 Terms 0
..................................................File Line 1300000 Terms 0
..................................................File Line 1400000 Terms 0
..................................................File Line 1500000 Terms 0
..................................................File Line 1600000 Terms 0
..................................................File Line 1700000 Terms 0
..................................................File Line 1800000 Terms 0
..................................................File Line 1900000 Terms 0
..................................................File Line 2000000 Terms 0
..................................................File Line 2100000 Terms 0
..................................................File Line 2200000 Terms 0
..................................................File Line 2300000 Terms 0
..................................................File Line 2400000 Terms 0
..................................................File Line 2500000 Terms 0
..................................................File Line 2600000 Terms 0
..................................................File Line 2700000 Terms 0
..................................................File Line 2800000 Terms 0
..................................................File Line 2900000 Terms 0
..................................................File Line 3000000 Terms 0
..................................................File Line 3100000 Terms 0
..................................................File Line 3200000 Terms 0
..................................................File Line 3300000 Terms 0
..................................................File Line 3400000 Terms 0
..................................................File Line 3500000 Terms 0
..................................................File Line 3600000 Terms 0
..................................................File Line 3700000 Terms 0
..................................................File Line 3800000 Terms 0
..................................................File Line 3900000 Terms 0
..................................................File Line 4000000 Terms 0
..................................................File Line 4100000 Terms 0
..................................................File Line 4200000 Terms 0
..................................................File Line 4300000 Terms 0
..................................................File Line 4400000 Terms 0
..................................................File Line 4500000 Terms 0
..................................................File Line 4600000 Terms 0
..................................................File Line 4700000 Terms 0
..................................................File Line 4800000 Terms 0
..................................................File Line 4900000 Terms 0
..................................................File Line 5000000 Terms 0
..................................................File Line 5100000 Terms 0
..................................................File Line 5200000 Terms 0
..................................................File Line 5300000 Terms 0
..................................................File Line 5400000 Terms 0
..................................................File Line 5500000 Terms 0
..................................................File Line 5600000 Terms 0
..................................................File Line 5700000 Terms 0
..................................................File Line 5800000 Terms 0
..................................................File Line 5900000 Terms 0
..................................................File Line 6000000 Terms 0
..................................................File Line 6100000 Terms 0
..................................................File Line 6200000 Terms 0
..................................................File Line 6300000 Terms 0
..................................................File Line 6400000 Terms 0
..................................................File Line 6500000 Terms 0
..................................................File Line 6600000 Terms 0
..................................................File Line 6700000 Terms 0
..................................................File Line 6800000 Terms 0
..................................................File Line 6900000 Terms 0
..................................................File Line 7000000 Terms 0
..................................................File Line 7100000 Terms 0
..................................................File Line 7200000 Terms 0
..................................................File Line 7300000 Terms 0
..................................................File Line 7400000 Terms 0
..................................................File Line 7500000 Terms 0
..................................................File Line 7600000 Terms 0
..................................................File Line 7700000 Terms 0
..................................................File Line 7800000 Terms 0
..................................................File Line 7900000 Terms 0
..................................................File Line 8000000 Terms 0
..................................................File Line 8100000 Terms 0
..................................................File Line 8200000 Terms 0
..................................................File Line 8300000 Terms 0
..................................................File Line 8400000 Terms 0
..................................................File Line 8500000 Terms 0
..................................................File Line 8600000 Terms 0
..................................................File Line 8700000 Terms 0
..................................................File Line 8800000 Terms 0
.............File Line 8827152 Terms 0
Writing map of Cuis and Texts to pathtoUmls2015.bsv
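Scanning a long progress log by eye is error-prone, so here is a minimal Python sketch that pulls the last reported count for each stage out of such output. It assumes the "File Line N <label> <count>" shape seen in this log and nothing more; the names `COUNT_PATTERN` and `final_counts` are invented for the example.

```python
import re

# Matches e.g. "File Line 200000 Cuis 60895" or "File Lines 8827152 Valid Cuis 0".
COUNT_PATTERN = re.compile(r"File Lines?\s+(\d+)\s+(Valid Cuis|Cuis|Terms)\s+(\d+)")


def final_counts(log_text):
    """Return the last count reported for each label in the log text."""
    counts = {}
    for match in COUNT_PATTERN.finditer(log_text):
        _line_no, label, value = match.groups()
        counts[label] = int(value)  # later matches overwrite earlier ones
    return counts
```

A result where "Cuis" is large but "Valid Cuis" and "Terms" are both 0 is the situation reported in this message: the tui list matched, but no rows survived the later filtering.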
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 4:00 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Thank you! I believe that was a change post 2011! You should actually be ok
with both SNOMEDCT and SNOMEDCT_US in CtakesSources.txt
Cheers,
Sean
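Why the old name alone finds nothing can be shown with a tiny Python sketch. Two things here are assumptions, not confirmed behaviour of the dictionary tool: that CtakesSources.txt holds one source name per line, and that names are compared by exact match. The function names are hypothetical.

```python
def load_wanted_sources(text):
    """Parse the contents of a CtakesSources.txt-style file, one name per line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def is_wanted(sab, wanted):
    """Exact match on the source abbreviation (assumed behaviour)."""
    return sab in wanted


if __name__ == "__main__":
    old_only = load_wanted_sources("SNOMEDCT\n")
    both = load_wanted_sources("SNOMEDCT\nSNOMEDCT_US\n")
    print(is_wanted("SNOMEDCT_US", old_only))  # old name does not cover the new one
    print(is_wanted("SNOMEDCT_US", both))
```

Under those assumptions, listing both names keeps one sources file usable for the 2011 data as well as for releases that report SNOMEDCT_US.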
-----Original Message-----
From: Maite Meseure Hugues [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 3:43 PM
To: [email protected]
Subject: Re: Fast Dictionary Update
If this helps, I had to replace 'SNOMEDCT' with 'SNOMEDCT_US' in
CtakesSources.txt.
On Wed, Sep 16, 2015 at 2:33 PM, Finan, Sean <[email protected]> wrote:
I'm not sure that I understand your question. As I sent it, the anat, snomed
and rxnorm are not separate runs. The args line I sent earlier is for a single
run that will create a dictionary with snomed and rxnorm terms. The anatomy tui
list has a special use in correctly processing snomed codes.
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 3:27 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Ok, hopefully one last question.
Based on your example everything runs, however the Anat and Snomed runs don't
produce any valid CUIs but RXNorm does. I'm not sure if this has anything to do
with it but every UMLS source read is against MRSTY.
Here's my command
java -cp dictionarytool.jar;lib/*
org.apache.ctakes.dictionarytool.DictionaryCreator2 -umls /path/to/UMLS/META
-fd ./data/tiny -atui ./data/tiny/CtakesAnatTuis.txt -tui
./data/tiny/CtakesSnomedTuis.txt -ol path\to\fileUmls2015.bsv
Any suggestions?
Thanks again,
Brandon
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 3:05 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Yes, that will make the rare word dictionary in a memory-based hsql database -
the same as the default for the dictionary-lookup-fast module.
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 2:42 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Thanks Sean, much appreciated. To clarify, the example below would create the
dictionary for use with the rare word approach?
Thanks,
Brandon
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 2:16 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Hi Brandon,
I just checked in a bin/dictionarytool.zip. It should have everything that you
need (.jar, lib/, data/).
java -cp dictionarytool.jar;lib/* org.apache.ctakes.dictionarytool.DictionaryCreator2 [args]
should do the trick.
To recreate a 2015 version of the current ctakes dictionary, the arguments
are:
-umls my/path/to/2015AA/META -fd ./data/tiny -atui
./data/tiny/CtakesAnatTuis.txt -tui ./data/tiny/CtakesSnomedTuis.txt -db
jdbc:hsqldb:file:my/path/to/snorx2015 -tbl CUI_TERMS
Create my/path/to/snorx2015 by copying
resources/memdbtemplate/ctakesumls.properties to
my/path/to/snorx2015.properties - there is a resources/README about this.
Before populating a DB, I usually do a trial run first, writing to a flat file.
Replace "-db ... -tbl ..." with "-ol my/path/to/testout.bsv"
Sean
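The two variants of the run (flat-file trial versus database) differ only in the trailing arguments, which a small wrapper can make explicit. The following Python sketch only assembles the argument list; the flags, file names and main class are the ones given in this message, while the function name `build_command` and its parameters are made up for the example. It uses `os.pathsep` so the classpath separator is ";" on Windows and ":" elsewhere.

```python
import os

MAIN_CLASS = "org.apache.ctakes.dictionarytool.DictionaryCreator2"


def build_command(umls_meta, data_dir="./data/tiny", out_bsv=None,
                  db_url=None, table="CUI_TERMS"):
    """Return the argv list for one DictionaryCreator2 run.

    Pass out_bsv for a flat-file trial run, or db_url to populate a database.
    """
    if (out_bsv is None) == (db_url is None):
        raise ValueError("give exactly one of out_bsv or db_url")
    classpath = os.pathsep.join(["dictionarytool.jar", "lib/*"])
    args = [
        "java", "-cp", classpath, MAIN_CLASS,
        "-umls", umls_meta,
        "-fd", data_dir,
        "-atui", f"{data_dir}/CtakesAnatTuis.txt",
        "-tui", f"{data_dir}/CtakesSnomedTuis.txt",
    ]
    if out_bsv is not None:
        args += ["-ol", out_bsv]          # trial run: write a .bsv flat file
    else:
        args += ["-db", db_url, "-tbl", table]  # real run: fill the hsql table
    return args
```

The returned list can be handed to `subprocess.run(args)` from the directory that holds dictionarytool.jar; passing a list rather than a shell string leaves the `lib/*` wildcard for java itself to expand.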
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 1:49 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Hi Sean,
That'd be great.
I think I'm building it incorrectly because after I build the jar and try to
run specifying DictionaryCreator2 as the main class it says it can't find it.
I'm not too familiar with Java and building projects/jars so it could be my
ignorance causing the problem.
Thanks,
Brandon
-----Original Message-----
From: Finan, Sean [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 1:45 PM
To: [email protected]
Subject: RE: Fast Dictionary Update
Hi Brandon,
I can send you a jar or commit one pre-built. What goes wrong when you try to
build the tool?
Sean
-----Original Message-----
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Wednesday, September 16, 2015 1:23 PM
To: [email protected]
Subject: Fast Dictionary Update
Does someone have the DictionaryTool jar available? I'm having trouble creating
the jar file from the project and would like to be able to create an updated
UMLS fast dictionary for 2015.
Thanks,
Brandon