Andy,

That would be pretty cool. Currently, the pre-built cTAKES dictionaries are available in Maven Central, and we can add more as contributions come in: http://search.maven.org/#search%7Cga%7C1%7Cctakes-resources
Agree that it would be nice if there was an apt-get or similar install that downloads and unpacks for each use case...

> -----Original Message-----
> From: andy mcmurry [mailto:[email protected]]
> Sent: Tuesday, September 09, 2014 5:33 PM
> To: [email protected]
> Subject: Recommendation for ctakes default (UMLS) dictionaries
>
> Greetings ctakes-dev:
>
> *UMLS license restrictions have been getting more lax over the years --
> *much of the UMLS can be downloaded directly from the NCBI official FTP
> site.
>
> In fact, the NIH (and implicitly the NLM) *have already made the standard
> terms public for some medical specialities*.
>
> For example: Here is the UMLS subset specific to Medical Genetics
> (MedGen) and Genetic Testing (GTR) complete with SNOMED-CT concept
> CUI(s) and names, etc.:
>
> [ ftp://ftp.ncbi.nlm.nih.gov/pub/medgen/README.html ]
>
> My team has developed a JVM based wrapper for MetaMap 2013AB which I
> intend to open source soon (Clojure). It includes REST support for invoking
> MetaMap with any or all of the command line arguments.
> We do not integrate with UIMA, we are basically a wrapper around the
> binary installation of MetaMap. The emphasis is on publication text not
> clinical text, still, some services are common (such as LVG).
>
> Strangely, the NLM still requires UMLS licenses to download MetaMap
> execution binaries. The MetaMap binary install is better but customizing
> dictionaries (DataFileBuilder) is not as easy to use as cTAKES with YTEX.
>
> [ https://cwiki.apache.org/confluence/display/CTAKES/YTEX+Installation ]
>
> *** Hence, there is a real opportunity here to enable Apache cTAKES to have
> a stronger default dictionary. ***
>
> Imagine if we could
> *$ apt-get install apache-ctakes*
>
> and instantly have a working package for SOME problem domain.
> In my case (Medical Genetics) the UMLS definitions are already available and
> the UMLS license problem becomes a non-issue, at least for many first time
> users.
>
> Your thoughts?
> AndyMC
