Hi John,

As far as documentation, I doubt you are missing anything. It all really starts from http://ctakes.apache.org/, with most of it on the Confluence wiki:
https://cwiki.apache.org/confluence/display/CTAKES/
And then yes, there are bits and pieces spread across old posts.

As far as code, you might try taking a look at any of these:
https://issues.apache.org/jira/i#browse/CTAKES-217
https://issues.apache.org/jira/i#browse/CTAKES-155
https://issues.apache.org/jira/i#browse/CTAKES-66
Or any other JIRA issue that doesn't have an assignee. Or, if you see an open issue with an assignee that looks interesting, you could also ask whether someone is in the middle of a fix.

I also plan to post or check in a Groovy script for running the AggregatePlaintextUMLSProcessor today or tomorrow. The version I am working on doesn't do the dynamic downloading of the required components; I am taking the tactic of assuming the user has downloaded a cTAKES binary and the separately downloadable resources. I think it would be great if you could try it out, and perhaps extend it so there is a way to run an example of each component, to give people an idea of what each component can do, as has been discussed on the dev@ mailing list.

Another task might be to point out where you think the documentation is most scattered, and maybe we can address that.

-- James

-----Original Message-----
From: dev-return-2334-Masanz.James=mayo....@ctakes.apache.org [mailto:dev-return-2334-Masanz.James=mayo....@ctakes.apache.org] On Behalf Of John Green
Sent: Friday, December 20, 2013 3:15 PM
To: dev@ctakes.apache.org
Subject: Documentation

Hi all,

Happy Holidays! I have a week off, then 6 weeks of insanity, then I'll finally be regularly free to try and help out, not that anyone is holding their breath or anything. When February rolls around and I really start applying myself to some development corner of cTAKES, is there anything I can do in the meantime that is pressing slop-work? Anything where I can leverage my clinical experience to help cTAKES?
Other than committing some more notes. Or maybe menial coding that is pressing? Like I've said before, I'm no computer scientist (only aspiring), but I can definitely knock out some grunt-work coding.

In the meantime, this week, I'm still trying to get a real working understanding of all the moving parts in cTAKES, both from a user side (building annotators, pipelines, dictionaries, etc.) and a development side. I don't want to trouble anyone with individual questions before I've tackled all the literature/code documentation; however, what --is-- all the literature? The documentation, beyond installing the software, seems to be very spread out. Am I missing something obvious? Or is it just "dig in and read a billion different posts from 2008 till present" time (including all the UIMA documentation, Lucene documentation, etc.)?

As is the patent phrase in moments like these: forgive me if this has been asked before.

John Green