[+dev]
Hi Arohi,
I'm glad that you have it working.
I think a good place to get started would be to take a look at 
the current type system[1] which outlines the output[2] that cTAKES currently 
supports.
As you already found, the IdentifiedAnnotation (and its subclasses such as 
xMention, and the UmlsConcept codes) are probably the most useful parts of the output.
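As a rough illustration of how a type system descriptor can be explored, here is a minimal Python sketch that lists each type, its supertype, and its features. It assumes the standard UIMA type system descriptor layout (typeDescription elements with name, supertypeName and features). The embedded XML is a small hand-written sample, not the real cTAKES TypeSystem.xml, and the example.* type names and the helper names list_types and subtypes_of are made up for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace used by UIMA descriptor files.
NS = {"u": "http://uima.apache.org/resourceSpecifier"}

# Hand-written sample for illustration only. The real type names and
# features live in the cTAKES TypeSystem.xml linked as [2].
SAMPLE = """
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <types>
    <typeDescription>
      <name>example.IdentifiedAnnotation</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription>
          <name>ontologyConceptArr</name>
          <rangeTypeName>uima.cas.FSArray</rangeTypeName>
        </featureDescription>
      </features>
    </typeDescription>
    <typeDescription>
      <name>example.SignSymptomMention</name>
      <supertypeName>example.IdentifiedAnnotation</supertypeName>
    </typeDescription>
  </types>
</typeSystemDescription>
"""


def list_types(xml_text):
    """Return one dict per declared type: name, supertype, and features."""
    root = ET.fromstring(xml_text.strip())
    types = []
    for td in root.findall("u:types/u:typeDescription", NS):
        features = [
            (
                fd.findtext("u:name", default="", namespaces=NS),
                fd.findtext("u:rangeTypeName", default="", namespaces=NS),
            )
            for fd in td.findall("u:features/u:featureDescription", NS)
        ]
        types.append(
            {
                "name": td.findtext("u:name", default="", namespaces=NS),
                "supertype": td.findtext("u:supertypeName", default="", namespaces=NS),
                "features": features,
            }
        )
    return types


def subtypes_of(types, parent):
    """Return the names of types whose direct supertype is `parent`."""
    return [t["name"] for t in types if t["supertype"] == parent]


if __name__ == "__main__":
    for t in list_types(SAMPLE):
        print(t["name"], "extends", t["supertype"])
        for feature_name, range_type in t["features"]:
            print("   ", feature_name, ":", range_type)
```

To try it on the real descriptor, save the file from [2] locally and pass its contents to list_types in place of SAMPLE.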
Unfortunately, there isn't much more documentation than what's in the current 
guides[3] at this point in time.  However, the mailing lists are a great place 
to look for answers to questions you may have.  To learn more about the flow of 
control of the code, you may want to check out the UIMA[4] framework, which 
cTAKES is built on top of.
[1] 
http://ctakes.apache.org/user-faqs.html#what-are-the-available-attributes-types-in-ctakes
[2] 
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
[3] 
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+Component+Use+Guide
[4] http://uima.apache.org

I hope that helps.
--Pei

From: Arohi Kumar [mailto:ar...@mobipulse.in] 
Sent: Tuesday, September 03, 2013 2:53 PM
To: Chen, Pei
Subject: Re: Information Regarding Apache cTAKES-3.0

Hi Pei,
Thanks for your suggestion. That worked like a charm. I also made it work using 
Lucene 3.6 to write a new index which was subsequently readable by the Lucene 
4.0 jars present in the project. Just out of curiosity, I have found (by 
experimenting with Lucene versions) that the original OrangeBook index was 
written by a Lucene version preceding 1.9. I hope that I am right?
Now that I am obtaining the output, I want to be able to understand what I am 
getting. I have looked at the output and things like the 
LookupWindowAnnotation, SignSymptomMention, Concept, and UmlsConcept jump out 
as being really useful. I want to understand the other outputs as well as how 
the code gave them to me. I have looked at the Component Use Guide, which gives 
me an overall idea of the cTAKES pipeline. I am looking for a more detailed 
explanation. 

I understand that ultimately I will have to get my hands dirty and delve into 
the code. Are there any other resources to help me get started, such as an 
explanation of the output and the flow of control of the code?
Thank you
Arohi Kumar
Ex-CSE, IIT Kharagpur

On Tue, Sep 3, 2013 at 7:07 PM, Chen, Pei <pei.c...@childrens.harvard.edu> 
wrote:
Hi Arohi,
OrangeBook is included in cTAKES' ctakes-dictionary-lookup-res project now:
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-res/src/main/resources/org/apache/ctakes/dictionary/lookup/
Feel free to let us know if that works for you.
--Pei
 
From: Arohi Kumar [mailto:ar...@mobipulse.in] 
Sent: Tuesday, September 03, 2013 6:29 AM
To: Chen, Pei
Subject: Re: Information Regarding Apache cTAKES-3.0
 
I'm sorry, the link is 
https://sourceforge.net/p/ctakesresources/code/HEAD/tree/trunk/ctakes-resources-dictionary/src/main/resources/org/apache/ctakes/dictionary/lookup/OrangeBook/
 
On Tue, Sep 3, 2013 at 3:58 PM, Arohi Kumar <ar...@mobipulse.in> wrote:
Hi Pei,
I am a newbie and am learning Apache cTAKES-3.0 for a project. 
I was facing an error caused when lucene-4.0 (included in Apache 
cTAKES) tries to read the OrangeBook index. 
I went through the mail archives and found that clearing up and replacing the 
OrangeBook index with 


will solve the problem. The above link seems to be broken. I will be grateful 
if you could send me an updated link if one exists.
Some alternative ways of solving the problem:
1. Since the OrangeBook index has a size of only about 19,000, I think that we 
can also write a new index using lucene-3.0 (because 4.0 is able to read indexes 
written by 3.0 and later).
2. Change the lucene-4.0 jars in the maven dependencies to lucene-3.0 jars, but 
that would lead to dependencies being broken, so I don't want to get into that.
Your suggestions are most welcome.
Thanks
Arohi Kumar
Ex- CSE, IIT Kharagpur 
 
 
