+1000 on this! Great, let's make a JIRA!!!
> On Nov 11, 2014, at 5:02 PM, andy mcmurry <mcmurry.a...@gmail.com> wrote:
>
> Hello!
>
> https://bitbucket.org/invitae/medgen-mysql (Apache Licensed ASL2)
>
> We just released a new library containing a huge chunk of UMLS concepts
> which are available without registering accounts/username/passwords.
> LEGALLY. Yes, really!
>
> The subset is from NCBI and it contains *thousands of concepts from SNOMED
> and other vocabularies*.
>
> The code is essentially
> 1. a list of WGET targets to various NCBI FTP site mirrors
> 2. Makefile for building the databases of interest
>
> Our legal team has approved distribution for Open Access work, ASL2
> LICENSE.
>
> I recommend we use this opportunity to make this the default distribution
> for CTAKES UMLS connections, because it obviates the need for so much
> painful credentialing and back and forth agreements with the US National
> Library of Medicine.
>
> Cheers!
> --Andy
>
>
> On Wed, Sep 10, 2014 at 12:13 PM, Masanz, James J. <masanz.ja...@mayo.edu>
> wrote:
>
>>
>> I would love to see the install be as simple as apt-get install to end up
>> with some working dictionary that have more than a handful of entries to
>> get them started.
>>
>> Regards,
>> James Masanz
>>
>> -----Original Message-----
>> From: andy mcmurry [mailto:mcmurry.a...@gmail.com]
>> Sent: Tuesday, September 09, 2014 4:32 PM
>> To: ctakes-...@incubator.apache.org
>> Subject: Recommendation for ctakes default (UMLS) dictionaries
>>
>> Greetings ctakes-dev:
>>
>> *UMLS license restrictions have been getting more lax over the years --
>> *much of the UMLS can be downloaded directly from the NCBI official FTP
>> site.
>>
>> In fact, the NIH (and implicitly the NLM) *have already made the standard
>> terms public for some medical specialities*.
>>
>> For example: Here is the UMLS subset specific to Medical Genetics (MedGen)
>> and Genetic Testing (GTR) complete with SNOMED-CT concept CUI(s) and names,
>> etc :
>>
>> [ ftp://ftp.ncbi.nlm.nih.gov/pub/medgen/README.html ]
>>
>> My team has developed a JVM based wrapper for MetaMap 2013AB which I
>> intend to open source soon (Clojure). It includes REST support for
>> invoking MetaMap with any or all of the command line arguments.
>> We do not integrate with UIMA, we are basically a wrapper around the
>> binary installation of MetaMap. The emphasis is on publication text not
>> clinical text, still, some services are common (such as LVG).
>>
>> Strangely, the NLM still requires UMLS licenses to download MetaMap
>> execution binaries. The MetaMap binary install is better but customizing
>> dictionaries (DataFileBuilder) is not as easy to use as CTAKES with YTEXT
>>
>> [ https://cwiki.apache.org/confluence/display/CTAKES/YTEX+Installation ]
>>
>> *** Hence, there is a real opportunity here to enable Apache cTAKES to
>> have a stronger default dictionary. ** *
>>
>> Imagine if we could
>> *$ apt-get install apache-ctakes *
>>
>> and instantly have a working package for SOME problem domain.
>> In my case (Medical Genetics) the UMLS definitions are already available
>> and the UMLS license problem becomes a non issue, at least for many first
>> time users
>>
>> Your thoughts?
>> AndyMC
>>