[+dev]

Hi Arohi,

I'm glad that you have it working. A good place to get started is the current type system[1], which outlines the output[2] that cTAKES currently supports. As you already found, the IdentifiedAnnotation (and its subclasses such as xMention, and the UmlsConcept codes)

Unfortunately, there isn't much more documentation than what's in the current guides[3] at this point in time. However, the mailing lists are a great place to look for answers to questions you may have. To learn more about the flow of control of the code, you may want to check out the UIMA[4] framework, which cTAKES is built on top of.

[1] http://ctakes.apache.org/user-faqs.html#what-are-the-available-attributes-types-in-ctakes
[2] http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-type-system/src/main/resources/org/apache/ctakes/typesystem/types/TypeSystem.xml
[3] https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+Component+Use+Guide
[4] http://uima.apache.org
I hope that helps.
--Pei

From: Arohi Kumar [mailto:ar...@mobipulse.in]
Sent: Tuesday, September 03, 2013 2:53 PM
To: Chen, Pei
Subject: Re: Information Regarding Apache cTAKES-3.0

Hi Pei,

Thanks for your suggestion. That worked like a charm. I also made it work by using Lucene 3.6 to write a new index, which was subsequently readable by the Lucene 4.0 jars present in the project. Just out of curiosity: by experimenting with Lucene versions, I found that the original OrangeBook index was written by a Lucene version preceding 1.9. I hope that I am right?

Now that I am obtaining the output, I want to be able to understand what I am getting. I have looked at the output, and things like the LookupWindowAnnotation, SignSymptomMention, Concept, and UmlsConcept jump out as being really useful. I want to understand the other outputs as well, and how the code gave them to me.

I have looked at the Component Use Guide, which gives me an overall idea of the cTAKES pipeline. I am looking for a more detailed explanation. I understand that ultimately I will have to get my hands dirty and delve into the code. Are there any other resources to help me get started, such as an explanation of the output and the flow of control of the code?

Thank you
Arohi Kumar
Ex-CSE, IIT Kharagpur

On Tue, Sep 3, 2013 at 7:07 PM, Chen, Pei <pei.c...@childrens.harvard.edu> wrote:

Hi Arohi,

OrangeBook is included in cTAKES' ctakes-dictionary-lookup-res project now:
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-res/src/main/resources/org/apache/ctakes/dictionary/lookup/

Feel free to let us know if that works for you.
--Pei

From: Arohi Kumar [mailto:ar...@mobipulse.in]
Sent: Tuesday, September 03, 2013 6:29 AM
To: Chen, Pei
Subject: Re: Information Regarding Apache cTAKES-3.0

I'm sorry, the link is
https://sourceforge.net/p/ctakesresources/code/HEAD/tree/trunk/ctakes-resources-dictionary/src/main/resources/org/apache/ctakes/dictionary/lookup/OrangeBook/

On Tue, Sep 3, 2013 at 3:58 PM, Arohi Kumar <ar...@mobipulse.in> wrote:

Hi Pei,

I am a newbie learning Apache cTAKES-3.0 for a project. I was facing an error that occurs when lucene-4.0 (included in Apache cTAKES) tries to read the OrangeBook index. I went through the mail archives and found that clearing up and replacing the OrangeBook index with will solve the problem. The above link seems to be broken. I would be grateful if you could send me an updated link, if one exists.

Some alternative ways of solving the problem:

1. Since the OrangeBook index has a size of only 19,000 (approx.), I think that we can also write a new index using lucene-3.0 (because 4.0 is able to read indexes written by 3.0 and later).
2. Change the lucene-4.0 jars in the Maven dependencies to lucene-3.0 jars, but that would break other dependencies, so I don't want to get into that.

Your suggestions are most welcome.

Thanks
Arohi Kumar
Ex-CSE, IIT Kharagpur