Hi Harpreet,

If you are willing to use cTAKES 3.2, try the dictionary-lookup-fast module as a replacement for the default dictionary-lookup module. It has a new dictionary resource (hsql, not Lucene) and slightly different methods for lookup and matching. In time trials it has been faster than the default module (hence the name). Accuracy depends on the parameter settings, but in the tests performed so far the results are comparable or better.

The new dictionary is much leaner than the current default dictionary, small enough to port from the hsql cached version to an hsql in-memory version. Using the in-memory version makes dictionary lookup practically instantaneous (hundredths of a second).

Limited documentation is available in the module's doc/ directory.
I will be on vacation for a week, but please don't hesitate to write if you have any questions.

Sean

________________________________________
From: Harpreet Khanduja [hsk5...@rit.edu]
Sent: Thursday, July 17, 2014 5:07 PM
To: dev@ctakes.apache.org
Subject: Lucene for UMLS2014

Hello,

I would be grateful if someone could help. I created a Lucene index for UMLS 2014, but only for the SNOMED vocabulary. I did this because I thought it would reduce the dictionary lookup time, but it is still almost the same. Is there any other way to improve the dictionary lookup time?

Thank you,
Harpreet