That parameter isn’t in my xml file. This is what the file contains:
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!-- New format for the .xml lookup specification. Uses table name and
value type/class for Concept Factories. -->
<lookupSpecification>
<dictionaries>
<dictionary>
<name>custom2Terms</name>
<implementationName>org.apache.ctakes.dictionary.lookup2.dictionary.JdbcRareWordDictionary</implementationName>
<properties>
<!-- urls for hsqldb memory connections must be file types in hsql 1.8.
These file urls must be either absolute path or relative to current working
directory.
They cannot be based upon the classpath.
Though JdbcConnectionFactory will attempt to "find" a db based upon the parent
dir of the url
for the sake of ide ease-of-use, the user should be aware of these hsql
limitations.
-->
<property key="jdbcDriver" value="org.hsqldb.jdbcDriver"/>
<property key="jdbcUrl"
value="jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/custom2/custom2"/>
<property key="jdbcUser" value="sa"/>
<property key="jdbcPass" value=""/>
<property key="rareWordTable" value="cui_terms"/>
<property key="umlsUrl"
value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser"/>
<property key="umlsVendor" value="NLM-6515182895"/>
<property key="umlsUser" value="XXXX"/>
<property key="umlsPass" value="XXXX"/>
</properties>
</dictionary>
</dictionaries>
<conceptFactories>
<conceptFactory>
<name>custom2Concepts</name>
<implementationName>org.apache.ctakes.dictionary.lookup2.concept.JdbcConceptFactory</implementationName>
<properties>
<property key="jdbcDriver" value="org.hsqldb.jdbcDriver"/>
<property key="jdbcUrl"
value="jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/custom2/custom2"/>
<property key="jdbcUser" value="sa"/>
<property key="jdbcPass" value=""/>
<property key="umlsUrl"
value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser"/>
<property key="umlsVendor" value="NLM-6515182895"/>
<property key="umlsUser" value="XXXX"/>
<property key="umlsPass" value="XXXX"/>
<property key="tuiTable" value="tui"/>
<property key="prefTermTable" value="prefTerm"/>
<!-- Optional tables for optional term info.
Uncommenting these lines alone may not persist term information;
persistence depends upon the TermConsumer. -->
<property key="snomedct_usTable" value="long"/>
</properties>
</conceptFactory>
</conceptFactories>
<!-- Defines what terms and concepts will be used -->
<dictionaryConceptPairs>
<dictionaryConceptPair>
<name>custom2Pair</name>
<dictionaryName>custom2Terms</dictionaryName>
<conceptFactoryName>custom2Concepts</conceptFactoryName>
</dictionaryConceptPair>
</dictionaryConceptPairs>
<!-- DefaultTermConsumer will persist all spans.
PrecisionTermConsumer will persist only the longest overlapping span of
any semantic group.
SemanticCleanupTermConsumer works as Precision** but also removes signs/symptoms
contained within disease/disorder,
and (just in case) removes any s/s and d/d that are also (exactly) anatomical
sites. -->
<rareWordConsumer>
<name>Term Consumer</name>
<implementationName>org.apache.ctakes.dictionary.lookup2.consumer.DefaultTermConsumer</implementationName>
<!--<implementationName>org.apache.ctakes.dictionary.lookup2.consumer.PrecisionTermConsumer</implementationName>-->
<!--<implementationName>org.apache.ctakes.dictionary.lookup2.consumer.SemanticCleanupTermConsumer</implementationName>-->
<properties>
<!-- Depending upon the consumer, the value of codingScheme may or may not be
used. With the packaged consumers,
codingScheme is a default value used only for cuis that do not have secondary
codes (snomed, rxnorm, etc.) -->
<property key="codingScheme" value="custom2"/>
</properties>
</rareWordConsumer>
</lookupSpecification>
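A quick way to confirm which code tables a lookup specification like the one above declares is to list the `<property>` keys ending in "Table" inside each `<conceptFactory>`. The following is only a minimal sketch using the JDK's built-in DOM parser; the class name `ConceptTableCheck` and the method `conceptTableKeys` are made up for illustration and are not part of cTAKES.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not a cTAKES class.
public class ConceptTableCheck {

    /** Returns the keys of all <property> elements inside <conceptFactory> whose key ends in "Table". */
    static List<String> conceptTableKeys(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse lookup specification", e);
        }
        List<String> keys = new ArrayList<>();
        NodeList factories = doc.getElementsByTagName("conceptFactory");
        for (int i = 0; i < factories.getLength(); i++) {
            // Only look at properties nested under this concept factory.
            NodeList props = ((Element) factories.item(i)).getElementsByTagName("property");
            for (int j = 0; j < props.getLength(); j++) {
                String key = ((Element) props.item(j)).getAttribute("key");
                if (key.endsWith("Table")) {
                    keys.add(key);
                }
            }
        }
        return keys;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to the lookup specification .xml file.
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String key : conceptTableKeys(xml)) {
            System.out.println(key);
        }
    }
}
```

Run against the file above, this should list tuiTable, prefTermTable and snomedct_usTable, since those are the "Table" properties declared in its conceptFactory section. It only checks the declaration in the .xml, not whether the database table actually holds rows.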
On 14/03/2017, 09:58, "Finan, Sean" <[email protected]> wrote:
Are you pointing your "DictionaryDescriptor" parameter to your custom
.xml configuration file?
-----Original Message-----
From: Martijn [mailto:[email protected]]
Sent: Tuesday, March 14, 2017 12:50 PM
To: [email protected]
Subject: Re: 2016AB UMLS (ctakessnorx)
Apparently, my database browser created a new database file instead of
opening the dictionary file (sorry for that!). When I open the correct file, it
shows the SNOMEDCT_US table.
Unfortunately, that doesn’t explain why cTAKES won’t return SNOMED concepts…
On 14/03/2017, 09:44, "Finan, Sean" <[email protected]> wrote:
That is very strange. How large is the database .script file? It is
unlikely, but I wonder if the db library is running out of memory but not
reporting the problem.
-----Original Message-----
From: Martijn [mailto:[email protected]]
Sent: Tuesday, March 14, 2017 12:41 PM
To: [email protected]
Subject: Re: 2016AB UMLS (ctakessnorx)
According to the output the database should be filled, but when I
browse it, it’s empty.
INFO RareWordDbWriter:168 - Main Table Rows 341512
INFO RareWordDbWriter:169 - Tui Table Rows 242342
INFO RareWordDbWriter:170 - Preferred Term Table Rows 220791
INFO RareWordDbWriter:184 - SNOMEDCT_US Table Rows 230545
INFO MainPanel:182 - Dictionary custom2 successfully built in
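To cross-check the row counts in this log against what is actually on disk, one option is to count the INSERT statements per table in the database .script file. This is only a sketch under an assumption: that HSQLDB stored the tables as MEMORY tables, so each row appears as an `INSERT INTO <table> VALUES(...)` line in the .script file (with CACHED tables the rows live in a separate data file and this would report nothing). The class name `ScriptRowCounter` is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of cTAKES or HSQLDB.
public class ScriptRowCounter {

    // Matches the start of a row statement, e.g. INSERT INTO CUI_TERMS VALUES(...)
    private static final Pattern INSERT =
            Pattern.compile("INSERT INTO \"?([A-Za-z0-9_]+)\"? VALUES", Pattern.CASE_INSENSITIVE);

    /** Counts INSERT statements per table name (upper-cased), one statement per line. */
    static Map<String, Integer> countRows(BufferedReader reader) {
        Map<String, Integer> counts = new TreeMap<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = INSERT.matcher(line);
                // lookingAt() anchors the match at the start of the line.
                if (m.lookingAt()) {
                    counts.merge(m.group(1).toUpperCase(), 1, Integer::sum);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the .script file, e.g. custom2.script.
        // ISO-8859-1 decodes any byte, and only the ASCII prefix of each line matters here.
        try (BufferedReader reader =
                     Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.ISO_8859_1)) {
            countRows(reader).forEach((table, rows) -> System.out.println(table + " " + rows));
        }
    }
}
```

If the assumption holds, the counts printed for the dictionary's .script file should be comparable to the "Table Rows" figures logged by RareWordDbWriter.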
On 14/03/2017, 09:31, "Kean Kaufmann" <[email protected]> wrote:
FWIW, I also ran the GUI in the last few weeks and got all the secondary
tables for the sources I selected, including SNOMEDCT_US.
On Tue, Mar 14, 2017 at 12:18 PM, Finan, Sean <[email protected]> wrote:
> Hi Martijn,
> That is very strange. I don't know why the database would have an empty
> table. It is acting like snomed codes were not found for any of your cuis
> in your local umls installation, but that is terribly unlikely.
>
> I ran the gui about two weeks ago and the secondary database tables were
> populated.
>
> Sorry that I can't help, this is unfortunate,
> Sean
>
> -----Original Message-----
> From: Martijn [mailto:[email protected]]
> Sent: Tuesday, March 14, 2017 12:07 PM
> To: [email protected]
> Subject: Re: 2016AB UMLS (ctakessnorx)
>
> Hi Sean,
>
> It’s an array, but I only sent you a single item out of that array.
>
> The .xml file lists the snomedct_us table, but the database is empty. I’m
> sure that I selected SNOMEDCT_US in the dictionarygui tool. Do you have any
> idea why the database could be empty?
>
> Thanks.
>
> - Martijn
>
> On 10/03/2017, 10:33, "Finan, Sean" <[email protected]> wrote:
>
> Hi Martijn,
>
> The UmlsConcept should be in an array in the IdentifiedAnnotation.
> Does your array only contain a single concept?
>
> When you use the gui, it should store codes in the database as a
> unique table for every (target) vocabulary that you selected in the left
> panel. The .xml file that it creates should list all of those types in the
> <conceptFactory> section. In your .xml you should see the line:
> <property key="snomedct_usTable" value="long"/>
>
> If that line is in the .xml, you can inspect your database directly
> with a hsqldb tool. I can help you do that if needed. The db should have
> a table named "snomedct_us".
>
> If all of those are ok then I will need to look at the lookup code as
> something must have broken.
>
> Sean
>
> -----Original Message-----
> From: Martijn [mailto:[email protected]]
> Sent: Thursday, March 09, 2017 5:35 PM
> To: [email protected]
> Subject: Re: 2016AB UMLS (ctakessnorx)
>
> Hi Sean,
>
> Thanks! I used the GUI to generate a custom dictionary, but I still
> get the UMLS code and not the SNOMED CT one.
>
> If I print the concept that was detected it returns:
> UmlsConcept
> codingScheme: "custom"
> code: <null>
> oid: "null#custom"
> oui: <null>
> score: 0.0
> disambiguated: false
> cui: "C1281583"
> tui: "T023"
> preferredText: "Entire hand"
>
> As you can see, the code is <null>. The concept is present in the UMLS
> subset and there is also a SNOMED CT code listed there:
>
> Entire hand [A3421866/SNOMEDCT_US/PT] CUI:C1281583 SCUI:302539009
>
>
> Am I doing something wrong?
>
> - Martijn
>
> On 08/03/2017, 08:12, "Finan, Sean" <[email protected]> wrote:
>
> Hi Martijn,
>
> The dictionary creator gui is in sandbox just like the command
> line tool, but it is newer and easier to use.
>
> OntologyConceptUtil is in org.apache.ctakes.core.util.
>
> Sean
>
> -----Original Message-----
> From: Martijn [mailto:[email protected]]
> Sent: Tuesday, March 07, 2017 5:38 PM
> To: [email protected]
> Subject: Re: 2016AB UMLS (ctakessnorx)
>
> Hi Sean,
>
> Thanks so much for your quick reply.
> I used the command-line dictionary tool. Is that different from the
> gui? Can that explain the decrease in tagged concepts?
>
> I’m not able to find the OntologyConceptUtil class. May I ask what
> the path for that class is?
>
> - Martijn
>
> On 07/03/2017, 14:24, "Finan, Sean" <[email protected]> wrote:
>
> Hi Martijn,
>
> Since you say that you've created your own dictionary I will
> assume that you used the gui in sandbox to do so. If that isn't the case
> then let me know.
>
> Any dictionary created using the default settings on the
> gui does have snomedct and rxnorm codes in addition to the cuis. However,
> umls cui is always used as the primary normalization code for ctakes
> annotations.
>
> To obtain codes for an annotation, check the
> OntologyConceptUtil in ctakes core. It has methods that will return all
> associated codes as well as one to get all associated codes for a
> scheme/vocabulary (like snomedct_us, etc.). It can do this for a single
> annotation, a collection of annotations, the entire document, or a section
> of the document (sentence, paragraph, section). It also has methods that
> allow you to fetch annotations found in the document by codes other than
> the umls cui.
>
> Sean
>
>
> -----Original Message-----
> From: Martijn [mailto:[email protected]]
> Sent: Tuesday, March 07, 2017 5:13 PM
> To: [email protected]
> Subject: 2016AB UMLS (ctakessnorx)
>
> Hi,
>
> I've been using cTAKES for a bit now, but I still can't figure
> out how to upgrade the UMLS version to the most recent one.
> If I create my own dictionary, cTAKES only returns UMLS
> concepts and no SNOMED CT ones (I'm interested in those). The number of
> concepts returned is also far lower than with the 2011 UMLS that's
> included with cTAKES.
>
> Can someone help me out by providing me a proper 2016
> dictionary or a clear explanation of how to implement the newest version
> of the UMLS (with SNOMED CT)?
>
>
> Thanks!
>
> - Martijn
>
--
_____________________________________________________
*Kean Kaufmann*
NLP Developer
RecordsOne