Peter is absolutely correct. It is possible to use cTAKES and the UMLS dictionary completely offline, but it isn't recommended for regular use. If you have any way to connect to the internet, please use the standard methods.
Many years ago the initial creators of cTAKES negotiated with the NLM to enable the unique manner in which cTAKES uses the UMLS. At the time it had never been done, and the UMLS could not be redistributed by outside agencies. A legal (and amicable) partnership between the NLM and cTAKES is absolutely necessary, and upholding our side of the agreement is how we make that happen.

The NLM maintains the UMLS, and without proof of importance this maintenance would cease. NLM grants are one mechanism by which cTAKES development gets funding, so it really is important that they know who is using the UMLS and how frequently. There is no detriment to providing them this information. You will never be charged for use, no matter how heavy it may be.

The NLM sends annual requests for users to complete a survey. It is extremely important that you complete the survey and indicate that you use the UMLS for NLP and cTAKES. The NLM and other agencies fund projects based in part on user base. The larger the user base of cTAKES, the greater the chance of funding for its development. Any funding for development turns into more accurate annotation engines, more capabilities and simpler usage for everybody.

Of course, private funding would also help ...

Sean

________________________________________
From: Peter Abramowitsch <pabramowit...@gmail.com>
Sent: Sunday, May 24, 2020 2:49 AM
To: dev@ctakes.apache.org; Akram
Subject: Re: using UMLS Metathesaurus in cTAKES offline

Having the data is not synonymous with UMLS authentication. Out of the box, you do need internet connectivity for the authentication to take place. It will happen once during startup. That will be sufficient for as long as the instance is running. The authentication is really meant to be UMLS's way to measure usage as much as it is a permission scheme.
There is a mechanism for the authentication to be proxied through a different URL, which can be built upon to create something like what you're thinking of. I've used that, but for a different purpose. But in these days of ever-diminishing government, it's valuable for the NLM to have those authentication hits.

Peter

On Sat, May 23, 2020, 6:30 PM Akram <as...@yahoo.com.invalid> wrote:

> I want to use cTAKES offline
>
> I am using command line
>
> run\runClinicalPipeline -i E:\cTAKES\files\MedReps\Input --xmiOut E:\cTAKES\files\MedReps\Output --user myuser --pass mypassword
>
> This is my piper file
>
> load DefaultTokenizerPipeline.piper
>
> add DefaultJCasTermAnnotator
>
> load AttributeCleartkSubPipe.piper
>
> writeXmis
>
> I thought UMLS will be accessed once and download all needed files so the next time it does not need the internet to access **UMLS**
>
> but I was wrong.
>
> When I work offline cTAKES does not work in attempt to access UMLS and gives error.
>
> I found that UMLS offers to download its data, so I did
>
> I downloaded **umls-2020AA-full.zip**
>
> I extracted Metathesaurus using MetamorphoSys and added it to
>
> E:\cTAKES\resources\org\apache\ctakes\dictionary\lookup\umls2020aa
>
> It is a huge folder 30GB+ full of .RRF files but did not work
>
> Not sure where the problem is
>
> do I have to change pipers?
>
> do I have to change the command?
>
> do I have to change files in the folder umls2020aa?
>
> How to fix these problems to use cTAKES offline?
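
The .RRF files that MetamorphoSys extracts, as described in the quoted message, are pipe-delimited text tables. As a minimal sketch for checking what such a folder actually contains, the Python below lists the .RRF files in a directory and parses rows of MRCONSO.RRF into named fields. The function names (`list_rrf_files`, `parse_rrf_line`) are illustrative assumptions and not part of cTAKES or the UMLS tooling. Nothing here makes cTAKES run offline or replaces UMLS authentication; it only inspects the downloaded data.

```python
import os

# Column order of MRCONSO.RRF as documented in the UMLS Reference Manual.
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def list_rrf_files(directory):
    """Return (file name, size in bytes) for each .RRF file directly inside directory."""
    found = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and name.upper().endswith(".RRF"):
            found.append((name, os.path.getsize(path)))
    return found


def parse_rrf_line(line, columns=MRCONSO_COLUMNS):
    """Split one pipe-delimited RRF row into a dict keyed by column name."""
    fields = line.rstrip("\r\n").split("|")
    # Each RRF row ends with a trailing '|', which produces one extra empty field.
    if fields and fields[-1] == "":
        fields = fields[:-1]
    if len(fields) != len(columns):
        raise ValueError(
            f"expected {len(columns)} fields, got {len(fields)}"
        )
    return dict(zip(columns, fields))


if __name__ == "__main__":
    # Path copied from the quoted message; adjust it to your own folder.
    # The .RRF files may sit in a subfolder of the extraction directory.
    folder = r"E:\cTAKES\resources\org\apache\ctakes\dictionary\lookup\umls2020aa"
    if os.path.isdir(folder):
        for name, size in list_rrf_files(folder):
            print(f"{name}\t{size} bytes")
```

To look at terms, open MRCONSO.RRF with `encoding="utf-8"` and pass each line to `parse_rrf_line`; the `LAT` field holds the language code and `STR` the term string.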