Sean,

Thanks for getting back to me on this.  I was afraid that was what the answer was going to be.

I appreciate you taking the time to fill in some of the gaps.  If it's so dependent on Java 1.8, someone should probably remove the "or higher" on the download page.


I look forward to getting this application up and running.

Until then,

rik.

On 2/3/23 15:57, Finan, Sean wrote:
Hi Rick,

Thank you for the questions and for reminding us that the documentation is 
sparse, outdated and not very detailed.  Everybody needs a prod now and then to 
get things done.

I hope that we can get a solid README and Wiki going on GitHub, as well as an 
update to the primary website.  It will take a lot of work and some cooperation 
by committers and users alike.

I have tried to address your questions inline below.

Sean

________________________________
From: Rick Coleman <rcole...@jilocasin.net>
Sent: Friday, February 3, 2023 3:14 PM
To: dev@ctakes.apache.org <dev@ctakes.apache.org>
Subject: Crash course in cTakes [EXTERNAL]


Hello everyone,

Can anyone point me to an exhaustive set of documentation regarding cTakes?

   *   Not really.  The wiki that you found is the most that there is.
   *   Most information is scattered across emails written on the dev and user 
lists.  You can search them here:  https://apache.markmail.org/

The main site feels like it was written by a marketing major, lots of
flash and catchiness, but little in the way of detailed documentation.
Even the User Install Guide and the Developer Install guide read like
what they are, install guides.

For example:
Is cTakes the whole package, or just the front end?

   *   ctakes is a clinical nlp platform (vague enough?).   I would say "whole 
package", but extendable.
   *   It is built on Apache UIMA and allows users to create pipelines of 
various nlp and i/o components.
   *   It comes with many components that have been built for clinical nlp.
   *   It is extendable; UIMA components from other sources can be placed in 
the pipelines.
   *   There are front-ends for some tasks, such as running a pipeline or 
creating a custom dictionary.
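
As a rough illustration of the "pipeline of components" idea in the list above, 
here is a minimal plain-Java sketch.  It does not use the UIMA or ctakes APIs; 
the names Doc, Component, tokenizer, counter and run are hypothetical stand-ins 
that only show how components are chained over one shared document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PipelineSketch {
    // Hypothetical stand-in for a document plus the annotations added to it.
    public static class Doc {
        public final String text;
        public final List<String> annotations = new ArrayList<>();
        public Doc(String text) { this.text = text; }
    }

    // Hypothetical component interface; real UIMA components use a different API.
    public interface Component {
        void process(Doc doc);
    }

    // Toy component: adds one "token:..." annotation per whitespace-separated word.
    public static Component tokenizer() {
        return doc -> {
            for (String t : doc.text.split("\\s+")) {
                doc.annotations.add("token:" + t);
            }
        };
    }

    // Toy component: records how many annotations earlier components produced.
    public static Component counter() {
        return doc -> doc.annotations.add("tokenCount:" + doc.annotations.size());
    }

    // Runs each component, in order, over the same document.
    public static Doc run(String text, List<Component> pipeline) {
        Doc doc = new Doc(text);
        for (Component c : pipeline) {
            c.process(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        Doc result = run("patient denies chest pain",
                Arrays.asList(tokenizer(), counter()));
        System.out.println(result.annotations);
    }
}
```

Each component only sees the shared document, so swapping in a component from 
another source amounts to changing the list passed to run.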

If it's just the front end, what's the back end?

   *   I would say that each UIMA component is a bit of back-end, as is the 
controller that actually runs the pipeline.
   *   As mentioned above, you can extend it with non-ctakes back-end 
components.

It mentions using my UMLS credentials, can you use a local copy of the
relevant UMLS data?  If so how?

   *   If you are compiling and running the source then ctakes will 
automatically download a default dictionary.
   *   If you are running a packaged binary then you'll need to manually pull 
down a dictionary.
   *   Prior to ctakes 5, downloading, unzipping and copying the dictionary 
was a manual process.
   *   If you are using v5 then you can run bin/getUmlsDictionary and a simple 
gui will do it for you.
   *   You can also create your own custom dictionary.
   *   The wiki has a page on the dictionary creator gui.
   *   There are instructions on youtube that start with first steps.

Are the requirements listed, 1GB drive space, Oracle Java 1.8 the
minimum or the recommended?  What about RAM or CPU? Is non-Oracle Java
acceptable?  What about Java 17, the current LTS version?

1GB disk
== Java 1.8
2GB RAM  (>= 4 recommended)
= 64bit CPU
OpenJDK seems to be fine.

Every java release after 8 is a problem for ctakes.  ctakes has a lot of 
dependencies, many of which are old and rely on a java 8 feature here and 
there.  ctakes itself probably relies on a java 8 specific here and there, but 
I honestly don't know.  Unfortunately, ctakes needs a serious update effort - 
maybe for v6.  Part of the problem is actually its capabilities and 
versatility - the number of available components and workflows.  A 'minor' 
change can require a dozen end-to-end tests in dev and user environments on 
multiple platforms.  Unit tests do not suffice.
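
Given the Java 1.8 requirement above, a quick check of the running JVM can 
save some debugging.  The following is only a sketch, not part of ctakes; the 
class and method names are made up for illustration.  It relies on the 
standard java.specification.version system property, which reports "1.8" on 
Java 8 and a plain number such as "17" on later releases.

```java
public class JavaVersionCheck {
    // Returns true when the given java.specification.version value denotes Java 8.
    // Java 8 reports "1.8"; Java 9 and later report "9", "11", "17", and so on.
    public static boolean isJava8(String specVersion) {
        return "1.8".equals(specVersion);
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        if (isJava8(spec)) {
            System.out.println("Java " + spec + ": matches the Java 1.8 requirement.");
        } else {
            System.out.println("Java " + spec + ": not Java 1.8; ctakes may not run correctly.");
        }
    }
}
```

Running it with the same java executable that will launch ctakes shows 
whether that JVM is Java 8.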


So, does anyone know where I can find out this information?


Thanks.

rik.

