Hi Ted,

In addition to performing searches, the hyperSql ( http://hsqldb.org/ ) 
database tool should allow you to perform inserts into the umls dictionary 
database used by cTakes.
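
For illustration only, here is a minimal sketch of what such an insert might 
look like. It uses Python's built-in sqlite3 module as a stand-in so that it 
runs without an HSQLDB install; it is not the cTakes database itself. Only the 
table name UMLS_MS_2011AB and the FWORD column appear in this thread. The 
other column names, the add_term helper, and the idea that FWORD is simply the 
first word of the term are assumptions, so check the real schema in the hsql 
tool before inserting anything.

```python
import sqlite3

# Stand-in table, NOT the real cTakes dictionary. Only the table name and
# the FWORD column come from this thread; the other columns are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UMLS_MS_2011AB "
    "(CUI TEXT, FWORD TEXT, TERM TEXT, CODE TEXT, SOURCETYPE TEXT, TUI TEXT)"
)


def add_term(conn, cui, term, code=None, sourcetype=None, tui=None):
    """Hypothetical helper: insert one term, keyed by its first word."""
    fword = term.split()[0]  # assumption: FWORD is the first token of the term
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO UMLS_MS_2011AB "
            "(CUI, FWORD, TERM, CODE, SOURCETYPE, TUI) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (cui, fword, term, code, sourcetype, tui),
        )


# "CIN 1" with CUI C0349458 is the concept Ted asked about.
add_term(conn, "C0349458", "CIN 1")

added = conn.execute(
    "SELECT CUI, FWORD, TERM FROM UMLS_MS_2011AB WHERE FWORD = 'CIN'"
).fetchall()
print(added)
```

Against the real dictionary you would run the equivalent INSERT statement from 
the hsql tool, and it would be wise to work on a backup copy of the database 
files first.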

You can also create your own customized dictionary and run cTakes using only 
that dictionary or with umls plus that dictionary.  There are several ways to 
create a custom dictionary, and I think that you can start by looking in the 
resources/ ... /dictionary/lookup/ directory for examples.  It can be a little 
overwhelming if you just want to add one or two terms, and I am in the process 
of trying to make this a little easier for any user.  It may be a while before 
I can add my work to the trunk.   Until then, if you decide to go with the csv 
approach you can probably make it through with the examples in cTakes 
resources.  If you want to create a new hsql database then I can send you my 
(old) instructions on that process - but it might be overkill.

If you really want to know what lies behind the mask of the cTakes umls 
dictionary then I highly recommend that you just interface with it directly 
using the hsql tool.

Sean

________________________________________
From: Assur, Ted [theodore.as...@providence.org]
Sent: Friday, November 01, 2013 5:36 PM
To: dev@ctakes.apache.org
Subject: RE: specificity in selecting EntityMentions when using 
AggregatePlaintextUMLSProcessor

OK, Kind of resurfacing the original topic on this one, after I redirected it 
towards ICD codes last month:

I have several examples, like the one below, where it would be very helpful to 
be able to include UMLS terms that are in the UMLS 2011AB release, e.g. "CIN 1" 
(CUI = C0349458).

So if I have particular UMLS concepts I want to make sure and include, is there 
a way for me to *add* them to the umls dictionary used by cTAKES?

Ted


-----Original Message-----
From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
Sent: Wednesday, September 04, 2013 9:37 AM
To: dev@ctakes.apache.org
Subject: RE: specificity in selecting EntityMentions when using 
AggregatePlaintextUMLSProcessor

I don't know if this is exactly what you want, but you can use the hyperSql ( 
http://hsqldb.org/ ) database tool to perform searches on the umls dictionary 
used by cTakes.
For instance " select * from UMLS_MS_2011AB where FWORD = 'CIN' " will provide 
all the available terms starting with CIN.  In the result you'll see that there 
is no term "CIN I", and you'll also see that the only listing from ICD9 is for 
"CIN III" [C0851140, T191, MTHICD9 233.1].
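
As a rough sketch of how that search behaves, the snippet below runs the same 
SELECT against a tiny stand-in table. It uses Python's built-in sqlite3 module 
rather than HSQLDB so that it is self-contained. The two rows are built from 
values quoted in this thread (the "CIN III" listing and the C0851140 concept 
name); the column names other than FWORD and the terms_starting_with helper 
are assumptions, not the actual cTakes schema.

```python
import sqlite3

# Stand-in table, NOT the real cTakes dictionary. Column names other than
# FWORD are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UMLS_MS_2011AB "
    "(CUI TEXT, FWORD TEXT, TERM TEXT, CODE TEXT, SOURCETYPE TEXT, TUI TEXT)"
)
conn.executemany(
    "INSERT INTO UMLS_MS_2011AB VALUES (?, ?, ?, ?, ?, ?)",
    [
        # The one ICD9 listing mentioned in this message.
        ("C0851140", "CIN", "CIN III", "233.1", "MTHICD9", "T191"),
        # Same concept under its longer name; code and source left unknown.
        ("C0851140", "Carcinoma", "Carcinoma in situ of uterine cervix",
         None, None, "T191"),
    ],
)


def terms_starting_with(conn, first_word):
    """Hypothetical helper: all rows whose first word matches exactly."""
    return conn.execute(
        "SELECT * FROM UMLS_MS_2011AB WHERE FWORD = ?", (first_word,)
    ).fetchall()


rows = terms_starting_with(conn, "CIN")
for row in rows:
    print(row)
```

The match on FWORD is an exact string comparison, so a term only shows up if 
its first word is stored exactly as 'CIN'.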

If you want an icd9 code that isn't in the cTakes umls dictionary then you can 
find it online ... but that won't do you much good wrt cTakes.

Sean

-----Original Message-----
From: Assur, Ted [mailto:theodore.as...@providence.org]
Sent: Wednesday, September 04, 2013 11:56 AM
To: dev@ctakes.apache.org
Subject: RE: specificity in selecting EntityMentions when using 
AggregatePlaintextUMLSProcessor

Thanks for looking into this, it's been puzzling me.

On another note, I know the cTAKES dictionary uses ICD9, but I'm not familiar 
with how to access that information: In the example I've described below, where 
would I locate the ICD9 for a specific entity?

Thank you

Ted

-----Original Message-----
From: Pei Chen [mailto:chen...@apache.org]
Sent: Tuesday, September 03, 2013 7:13 PM
To: dev@ctakes.apache.org
Subject: Re: specificity in selecting EntityMentions when using 
AggregatePlaintextUMLSProcessor

You're right, it should have gotten "CIN I"- that's a strange one, probably 
needs to be debugged/looked into further...

On Tue, Sep 3, 2013 at 10:05 PM, Miller, Timothy 
<timothy.mil...@childrens.harvard.edu> wrote:
> Ah. So it will get
> CIN 2 (in SNOMED)
> CIN III (in SNOMED)
> CIN 3 (in SNOMED)
>
> but the rest are not in SNOMED?
>
> I wonder why it doesn't get CIN I? It looks like that exists in SNOMED
> (though I don't fully understand what all the symbols mean in the umls
> browser).
>
>> CIN I - Cervical intraepithelial neoplasia 1
>> [A3002690/SNOMEDCT/SY/285836003]
>
>
> On 09/03/2013 09:55 PM, Pei Chen wrote:
>> It has the correct parse (POS, chunks, and lookupwindow)- but some of
>> the terms do not exist in SNOMED- CIN 2 - Cervical intraepithelial
>> neoplasia 2 [A3002688/SNOMEDCT/SY/285838002] exists but not CIN II.
>> CIN III [A3333965/SNOMEDCT/SY/20365006] also exists that's why it was
>> able to perform the lookup successfully.
>> Note that CIN II synonyms do exist in other UMLS thesauri such as
>> MEDCIN and CCPSS, though.  However, the bundled cTAKES dictionaries only
>> contain (MeSH, SNOMEDCT, RxNORM, NCI, ICD9) IIRC.
>>
>> --Pei
>>
>> On Tue, Sep 3, 2013 at 9:44 PM, Miller, Timothy
>> <timothy.mil...@childrens.harvard.edu> wrote:
>>> That is a good question, Ted!
>>>
>>> I tried it with a simple context: "The patient has a CIN III." I'm
>>> not sure if that is a correct context but I was able to duplicate
>>> your findings. (Finds a CUI for CIN III but not if you change it to
>>> CIN II)
>>>
>>> My first thought was that it is the chunker. But the chunker seems
>>> to get it right, as CIN II and CIN III are both called NPs, and
>>> similarly the LookupWindowAnnotator handles them both identically.
>>> So that suggests it is a problem with the actual lookup of the
>>> tokens in the LookupWindow.
>>>
>>> That's all I can do for now but maybe someone else who knows more
>>> about its behavior offhand will have an idea.
>>>
>>> Tim
>>>
>>>
>>>
>>>
>>> On 09/03/2013 08:24 PM, Assur, Ted wrote:
>>>> I'm trying to understand what would prevent the 
>>>> AggregatePlaintextUMLSProcessor AE from correctly parsing specific 
>>>> problems that are defined in the UMLS version used by cTAKES.
>>>>
>>>> For example,
>>>> CIN (Cervical Intraepithelial Neoplasia) in its general usage is parsed 
>>>> out as UMLS CUI C0206708.
>>>>
>>>> CIN comes in 3 grades: 1, 2, and 3. Sometimes this is reported with Roman 
>>>> numerals: I, II, and III.
>>>>
>>>> cTAKES correctly identifies "CIN 3" and "CIN III" with UMLS CUI C0851140: 
>>>> "Carcinoma in situ of uterine cervix."
>>>>
>>>> However, I cannot get it to recognize CIN 1, CIN I, CIN 2, or CIN II as 
>>>> their correct concepts, "Cervical intraepithelial neoplasia grade 1" and 
>>>> "Cervical intraepithelial neoplasia grade 2" respectively.
>>>>
>>>> Is there a way to tune the detection of UMLS concepts?
>>>>
>>>>
>>>>
>>>>
>>>> --------------------------------------------
>>>> Ted Assur
>>>> IT Solutions Architect for Cancer Research
>>>> Providence Health & Services
>>>> ted.as...@providence.org
>>>> 503-215-6476
>>>>
>>>> Crede, ut intelligas.
>>>> Intellego, ut credam.
>>>>
>>>>
>>>>
>>>>
>>>>   ________________________________
>>>>
>>>> This message is intended for the sole use of the addressee, and may 
>>>> contain information that is privileged, confidential and exempt from 
>>>> disclosure under applicable law. If you are not the addressee you are 
>>>> hereby notified that you may not use, copy, disclose, or distribute to 
>>>> anyone the message or any information contained in the message. If you 
>>>> have received this message in error, please immediately advise the sender 
>>>> by reply email and delete this message.
>>>>
>

