Re: iterate on the features of CAS consumer (FileWriterCasConsumer)
Hi Tim,

I was able to use uimaFIT's CasIOUtil class to iterate over the CAS features. First, I needed to create a new CAS, and I used JCasFactory for that. Below are the two lines of code. Thanks for your help.

    JCas jcas = JCasFactory.createJCas(); // create a new CAS
    CasIOUtil.readJCas(jcas, new File("C:\\temp\\uima\\xcas\\xCasAbstrct.xcas")); // load the existing CAS into the new one

Samir

On Wednesday, April 15, 2015 2:53 PM, samir chabou wrote:

Thanks, Tim, for your suggestion. I'll experiment with the CasIOUtil methods and keep the user/dev lists posted.

On Wednesday, April 15, 2015 7:07 AM, "Miller, Timothy" wrote:

The standard way that we save redundant processing time is by writing the CAS for each file to an XMI file after one pass over the data that runs all the analysis engines. For example, when we are working on experiments, we have one pipeline that does all the NLP feature generation (POS tags, dependency parsing, dictionary lookup, etc.) and writes each document to an XMI file in a directory using uimaFIT's CasIOUtil class:

https://uima.apache.org/d/uimafit-current/api/org/apache/uima/fit/util/CasIOUtil.html

Then, in a second machine learning pipeline, we read the XMI files (using a different CasIOUtil method) and vary any machine learning parameters we want using the same standard annotations.

Hope this helps.
Tim

From: samir chabou [samir...@yahoo.com.INVALID]
Sent: Monday, April 13, 2015 11:22 PM
To: dev@ctakes.apache.org; u...@ctakes.apache.org
Subject: Re: iterate on the features of CAS consumer (FileWriterCasConsumer)

Hi, how can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?

Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.

Please advise.
Thanks

On Saturday, April 11, 2015 12:54 AM, samir chabou wrote:

Hi, how can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?

Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.

Thanks
RE: DB DictionaryLookupAnnotator sqlserver exception
Ah, that led me in the right direction. I was accidentally combining the Lucene resources for RxNorm and Orange Book with JDBC for the lookup. I swapped RxNorm and Orange Book to the JDBC version and it works.

Thanks!

-----Original Message-----
From: Finan, Sean [mailto:sean.fi...@childrens.harvard.edu]
Sent: Wednesday, April 15, 2015 2:59 PM
To: dev@ctakes.apache.org
Subject: RE: DB DictionaryLookupAnnotator sqlserver exception

Hi Alex,

This is some pretty odd behavior. Obviously, it is indicating that the resource type loaded or specified is not the correct class. The specification (for the standard UMLS pipeline) is in ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml, lines #226 and #289. Both should be

    org.apache.ctakes.core.resource.JdbcConnectionResourceImpl

There is an identical specification on line #352, but that is for Orangebook, which (I'm pretty sure) is no longer used. I think this is one of a couple of sections that were missed during refactoring, so you can ignore it.

If you are running from source, you could try editing org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.java, lines #140 and #141, and add to the exception message something like

    + " instead of " + (extResrc == null ? "NULL" : extResrc.getClass().getName())

to find out what it thinks it has underfoot.

Sean

From: Milinovich, Alex [mailto:mili...@ccf.org]
Sent: Wednesday, April 15, 2015 12:50 PM
To: dev@ctakes.apache.org
Subject: DB DictionaryLookupAnnotator sqlserver exception

I am attempting to use the sqlserver JDBC connection for the DictionaryLookupAnnotator. When loading the aggregate engine, the connection is established fine, but then it gives this error:

java.lang.Exception: Expected external resource to be:interface org.apache.ctakes.core.resource.JdbcConnectionResource
    at org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.parseDictionaryXml(LookupParseUtilities.java:140)
    at org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.parseDictionaries(LookupParseUtilities.java:94)
    at org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.parseDescriptor(LookupParseUtilities.java:80)
    at org.apache.ctakes.dictionary.lookup.ae.DictionaryLookupAnnotator.configInit(DictionaryLookupAnnotator.java:88)
    ... 26 more

Any ideas as to why this isn't working?

Alex Milinovich | System Analyst III | Quantitative Health Sciences
9500 Euclid Ave. - JJN3 | Cleveland, OH 44195 | p: (216) 444-9931 | m: (216) 245-7655

===
Please consider the environment before printing this e-mail

Cleveland Clinic is ranked as one of the top hospitals in America by U.S.News & World Report (2014). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations.

Confidentiality Note: This message is intended for use only by the individual or entity to which it is addressed and may contain information that is privileged, confidential, and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient or the employee or agent responsible for delivering the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and destroy the material in its entirety, whether electronic or hard copy. Thank you.
RE: DB DictionaryLookupAnnotator sqlserver exception
Hi Alex,

This is some pretty odd behavior. Obviously, it is indicating that the resource type loaded or specified is not the correct class. The specification (for the standard UMLS pipeline) is in ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml, lines #226 and #289. Both should be

    org.apache.ctakes.core.resource.JdbcConnectionResourceImpl

There is an identical specification on line #352, but that is for Orangebook, which (I'm pretty sure) is no longer used. I think this is one of a couple of sections that were missed during refactoring, so you can ignore it.

If you are running from source, you could try editing org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.java, lines #140 and #141, and add to the exception message something like

    + " instead of " + (extResrc == null ? "NULL" : extResrc.getClass().getName())

to find out what it thinks it has underfoot.

Sean

From: Milinovich, Alex [mailto:mili...@ccf.org]
Sent: Wednesday, April 15, 2015 12:50 PM
To: dev@ctakes.apache.org
Subject: DB DictionaryLookupAnnotator sqlserver exception

I am attempting to use the sqlserver JDBC connection for the DictionaryLookupAnnotator. When loading the aggregate engine, the connection is established fine, but then it gives this error:

java.lang.Exception: Expected external resource to be:interface org.apache.ctakes.core.resource.JdbcConnectionResource
    at org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.parseDictionaryXml(LookupParseUtilities.java:140)
    at org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.parseDictionaries(LookupParseUtilities.java:94)
    at org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.parseDescriptor(LookupParseUtilities.java:80)
    at org.apache.ctakes.dictionary.lookup.ae.DictionaryLookupAnnotator.configInit(DictionaryLookupAnnotator.java:88)
    ... 26 more

Any ideas as to why this isn't working?

Alex Milinovich | System Analyst III | Quantitative Health Sciences
9500 Euclid Ave. - JJN3 | Cleveland, OH 44195 | p: (216) 444-9931 | m: (216) 245-7655
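The diagnostic Sean suggests above can be sketched in isolation. The snippet below appends the runtime class of the resource (or "NULL") to the exception message, so a mismatch names what was actually bound. The types JdbcResource, JdbcResourceImpl, and LuceneResource, and the class ResourceCheckDemo, are hypothetical stand-ins so the example runs on its own; they are not the cTAKES classes, and the real check in LookupParseUtilities may be structured differently.

```java
// Hypothetical stand-ins for illustration only; these are NOT the cTAKES classes.
interface JdbcResource { }

final class JdbcResourceImpl implements JdbcResource { }

final class LuceneResource { }

final class ResourceCheckDemo {

    // Names the runtime class of the resource, or "NULL" if nothing was bound.
    static String describe(Object extResrc) {
        return extResrc == null ? "NULL" : extResrc.getClass().getName();
    }

    // Builds the extended message: the expected type plus what was actually found.
    static String mismatchMessage(Class<?> expected, Object extResrc) {
        return "Expected external resource to be:" + expected
                + " instead of " + describe(extResrc);
    }

    // A type check of the same general kind as the one failing in the stack trace.
    static JdbcResource requireJdbc(Object extResrc) {
        if (!(extResrc instanceof JdbcResource)) {
            throw new IllegalStateException(mismatchMessage(JdbcResource.class, extResrc));
        }
        return (JdbcResource) extResrc;
    }

    public static void main(String[] args) {
        // A wrong resource type and a missing resource both produce a descriptive message.
        System.out.println(mismatchMessage(JdbcResource.class, new LuceneResource()));
        System.out.println(mismatchMessage(JdbcResource.class, null));
    }
}
```

With a message like this, a descriptor that binds a Lucene-backed resource where a JDBC one is expected would show up by name in the exception text rather than only as "Expected external resource to be".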
Re: iterate on the features of CAS consumer (FileWriterCasConsumer)
Thanks, Tim, for your suggestion. I'll experiment with the CasIOUtil methods and keep the user/dev lists posted.

On Wednesday, April 15, 2015 7:07 AM, "Miller, Timothy" wrote:

The standard way that we save redundant processing time is by writing the CAS for each file to an XMI file after one pass over the data that runs all the analysis engines. For example, when we are working on experiments, we have one pipeline that does all the NLP feature generation (POS tags, dependency parsing, dictionary lookup, etc.) and writes each document to an XMI file in a directory using uimaFIT's CasIOUtil class:

https://uima.apache.org/d/uimafit-current/api/org/apache/uima/fit/util/CasIOUtil.html

Then, in a second machine learning pipeline, we read the XMI files (using a different CasIOUtil method) and vary any machine learning parameters we want using the same standard annotations.

Hope this helps.
Tim

From: samir chabou [samir...@yahoo.com.INVALID]
Sent: Monday, April 13, 2015 11:22 PM
To: dev@ctakes.apache.org; u...@ctakes.apache.org
Subject: Re: iterate on the features of CAS consumer (FileWriterCasConsumer)

Hi, how can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?

Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.

Please advise.
Thanks

On Saturday, April 11, 2015 12:54 AM, samir chabou wrote:

Hi, how can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?

Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.

Thanks
DB DictionaryLookupAnnotator sqlserver exception
I am attempting to use the sqlserver JDBC connection for the DictionaryLookupAnnotator. When loading the aggregate engine, the connection is established fine, but then it gives this error:

java.lang.Exception: Expected external resource to be:interface org.apache.ctakes.core.resource.JdbcConnectionResource
    at org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.parseDictionaryXml(LookupParseUtilities.java:140)
    at org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.parseDictionaries(LookupParseUtilities.java:94)
    at org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.parseDescriptor(LookupParseUtilities.java:80)
    at org.apache.ctakes.dictionary.lookup.ae.DictionaryLookupAnnotator.configInit(DictionaryLookupAnnotator.java:88)
    ... 26 more

Any ideas as to why this isn't working?

Alex Milinovich | System Analyst III | Quantitative Health Sciences
9500 Euclid Ave. - JJN3 | Cleveland, OH 44195 | p: (216) 444-9931 | m: (216) 245-7655
RE: iterate on the features of CAS consumer (FileWriterCasConsumer)
The standard way that we save redundant processing time is by writing the CAS for each file to an XMI file after one pass over the data that runs all the analysis engines. For example, when we are working on experiments, we have one pipeline that does all the NLP feature generation (POS tags, dependency parsing, dictionary lookup, etc.) and writes each document to an XMI file in a directory using uimaFIT's CasIOUtil class:

https://uima.apache.org/d/uimafit-current/api/org/apache/uima/fit/util/CasIOUtil.html

Then, in a second machine learning pipeline, we read the XMI files (using a different CasIOUtil method) and vary any machine learning parameters we want using the same standard annotations.

Hope this helps.
Tim

From: samir chabou [samir...@yahoo.com.INVALID]
Sent: Monday, April 13, 2015 11:22 PM
To: dev@ctakes.apache.org; u...@ctakes.apache.org
Subject: Re: iterate on the features of CAS consumer (FileWriterCasConsumer)

Hi, how can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?

Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.

Please advise.
Thanks

On Saturday, April 11, 2015 12:54 AM, samir chabou wrote:

Hi, how can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?

Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.

Thanks
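The two-pass workflow Tim describes above (run the expensive pipeline once, store one file per document, then reload those files in later runs) can be illustrated without any UIMA dependency. This is only a sketch of the pattern under stated assumptions: where the thread stores each CAS as an XMI file through CasIOUtil, the code below stores a plain list of tokens in a text file, and TwoPassDemo, expensiveAnnotate, writeAnnotations, readAnnotations, and the ".ann" extension are all hypothetical names invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class TwoPassDemo {

    // Hypothetical stand-in for the slow NLP pipeline: it only splits on whitespace.
    static List<String> expensiveAnnotate(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Pass 1: annotate one document and persist the result, one file per document.
    static Path writeAnnotations(Path dir, String docId, String text) {
        try {
            Files.createDirectories(dir);
            Path out = dir.resolve(docId + ".ann");
            Files.write(out, expensiveAnnotate(text), StandardCharsets.UTF_8);
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Pass 2: reload the stored annotations without rerunning the pipeline.
    static List<String> readAnnotations(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("annotations");

        // First pass: pay the processing cost once per document.
        writeAnnotations(dir, "note1", "patient denies chest pain");
        writeAnnotations(dir, "note2", "no fever reported");

        // Second pass: iterate over the stored files as often as needed.
        try (Stream<Path> files = Files.list(dir)) {
            for (Path file : files.sorted().collect(Collectors.toList())) {
                System.out.println(file.getFileName() + " -> " + readAnnotations(file));
            }
        }
    }
}
```

The design point is the same as in the thread: the second pass depends only on the files in the directory, so experiments can be repeated with different parameters without reloading the first pipeline.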