RE: SimpleServer, instantiating CAS with custom typesystem?

2013-02-19 Thread Helen Johnson -X (heljohns - Infobahn Softworld Inc at Cisco)
Thanks for your reply, Jens.

I admit I had been avoiding setting the text of the CAS to be the entire XML 
string I get back from the first REST service because it is a massive string 
and I only want a couple nodes from that xml string to be processed throughout 
the UIMA pipeline. But I see your point.


So then, in this new AE, I retrieve the entire XML string from the CAS and do 
the zone-information processing on the specific nodes of the XML. I assume it 
is straightforward to then reset the CAS text to be just the text I have found 
in the original XML. Specifically, I would use CAS.reset() to empty the CAS of 
the original (full XML) text, then call jCAS.setDocumentText() with the new 
string of just the relevant text, and load all the doc-zone annotations at this 
point. Is this right?
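
The XML side of that plan can be sketched without touching UIMA at all. The 
following is a minimal sketch, not code from this thread: it uses only the 
JDK's DOM parser, and the class name ZoneExtractor, its helper types, and the 
element names title, body, and footnotes are hypothetical stand-ins for 
whatever the REST response really contains. It concatenates the text of the 
chosen elements into one reduced string and records begin/end offsets for each, 
which is the information an annotator would need to create the doc-zone 
annotations over the reduced text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: extracts selected elements and tracks their offsets. */
final class ZoneExtractor {

    /** One document zone: element name plus [begin, end) offsets in the reduced text. */
    static final class Zone {
        final String name;
        final int begin;
        final int end;

        Zone(String name, int begin, int end) {
            this.name = name;
            this.begin = begin;
            this.end = end;
        }
    }

    /** The reduced text together with the zones that index into it. */
    static final class Result {
        final String text;
        final List<Zone> zones;

        Result(String text, List<Zone> zones) {
            this.text = text;
            this.zones = zones;
        }
    }

    /** Collects the text of each named element, in the order the names are given. */
    static Result extract(String xml, String... elementNames) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The XML comes from a remote service, so refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            StringBuilder text = new StringBuilder();
            List<Zone> zones = new ArrayList<>();
            for (String name : elementNames) {
                NodeList nodes = doc.getElementsByTagName(name);
                for (int i = 0; i < nodes.getLength(); i++) {
                    String content = nodes.item(i).getTextContent().trim();
                    int begin = text.length();
                    text.append(content);
                    zones.add(new Zone(name, begin, text.length()));
                    text.append('\n'); // separator, deliberately outside the zone
                }
            }
            return new Result(text.toString(), zones);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML input", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input; the real element names are an assumption.
        String xml = "<article><meta>ignored</meta><title>A Title</title>"
                + "<body>Some body text.</body><footnotes>Note 1.</footnotes></article>";
        Result r = extract(xml, "title", "body", "footnotes");
        for (Zone z : r.zones) {
            System.out.println(z.name + " [" + z.begin + "," + z.end + "): "
                    + r.text.substring(z.begin, z.end));
        }
    }
}
```

Because the offsets are relative to the reduced string rather than to the 
original XML, they stay valid wherever that string ends up being stored as 
document text.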

Cheers,
Helen

-----Original Message-----
From: Jens Grivolla [mailto:j+...@grivolla.net] 
Sent: Tuesday, February 19, 2013 3:20 AM
To: user@uima.apache.org
Subject: Re: SimpleServer,  instantiating CAS with custom typesystem?

Hi, SimpleServer itself is in a way your CR, creating a CAS with the document 
text you sent. Why do you want to change SimpleServer? It seems that you only 
want to add annotations to the CAS, not fundamentally change how the CAS is 
created.

It seems to me that it would be far easier to just create an AE that adds those 
annotations. Then you won't have any typesystem issues either, since the AE 
would have the appropriate typesystem.

HTH,
Jens

On 02/18/2013 10:37 PM, Helen Johnson -X (heljohns - Infobahn Softworld Inc at 
Cisco) wrote:
 I'm stumped:

 I have a UIMA pipeline that starts with a CollectionReader that

 -  reads XML input (response from a REST service),

 -  identifies a couple of relevant XML nodes

 -  makes document-level annotations from the relevant nodes (title, 
 document body, footnote section)
 From there, the AnalysisEngine portion of the pipeline has many AEs that 
 I've wrapped into a single AggregateAnalysisEngine.
 The CollectionReader and the AAE all work correctly in this pipeline.

 Now I need to transfer this pipeline into a SimpleServer REST service 
 environment.
 I've created a PEAR of the AAE portion of the pipeline, but I can't include 
 the CollectionReader in this PEAR.
 First question:
 It is my understanding that the CR cannot be included in the PEAR for 
 SimpleServer. Am I correct in this?

 In order to get those document-zoning annotations of title, body, and 
 footnote, I have added some methods to the Service.java class in the 
 SimpleServer package that parse the XML and then add these annotations 
 to the JCAS before the AAE is called. The error thrown at this 
 point is:

 The server encountered an internal error (JCas type 
 myPackage.DocClass.ArticleMainTitle used in Java code, but was not declared 
 in the XML type descriptor.) that prevented it from fulfilling this request.

 Second question:
 Where does Service.java look for the typesystem XML file? I have tried 
 all of the following, with the same error result:

 -  put the typesystem descriptor file, myTSD.xml, in SimpleServer/lib

 -  create a jar containing myTSD.xml, put it into SimpleServer/lib 
 and add that to the build path

 -  (after the two above attempts), in SimpleServer project 
 properties, add lib to the UIMA CDE Property Page

 -  in SimpleServer project properties, in UIMA Type System, point to 
 the myTSD.xml file in lib

 -  put myTSD.xml in SimpleServer/WebContent/WEB-INF/lib

 -  put the jar containing myTSD.xml in the 
 SimpleServer/WebContent/WEB-INF/lib

 -  put myTSD.xml in SimpleServer/WebContent/WEB-INF/resources

 Final question:
 When a CAS gets instantiated (or reset, as it does in Service.java), how can 
 I tell it to use a custom typesystem, and where will it look for that 
 typesystem.xml file within the SimpleServer project?

 Thank you,
 Helen Johnson






Re: SimpleServer, instantiating CAS with custom typesystem?

2013-02-19 Thread Thomas Ginter
Helen,

You might also consider using UIMA-AS instead.  UIMA-AS allows you to deploy a 
service (your AAE) that can be remotely accessed by UIMA-AS clients on other 
machines or in other JVMs for scalable deployments.  Each client provides a 
CollectionReader to supply documents to the service and a Listener to catch 
return events from the service to know when processing is complete.  You can 
find some additional getting started information about UIMA-AS at the following:

http://uima.apache.org/doc-uimaas-what.html
  

Thanks,

Thomas Ginter
801-448-7676
thomas.gin...@utah.edu







RE: SimpleServer, instantiating CAS with custom typesystem?

2013-02-19 Thread Helen Johnson -X (heljohns - Infobahn Softworld Inc at Cisco)
So it turns out I cannot invoke
cas.reset()
from inside an annotator, nor can I replace the text to be processed using
cas.setDocumentText()
once the document text has already been set in the SimpleServer Service.java 
class.

What are my options for altering the document text inside an annotator after 
the SimpleServer Service.java has already set the document text?
-Helen







Re: SimpleServer, instantiating CAS with custom typesystem?

2013-02-19 Thread Chris Roeder

Hi Helen,

We use CAS views. Chapter six should help:
http://uima.apache.org/d/uimaj-2.4.0/tutorials_and_users_guides.pdf


We read the XML into one view and then have an annotator
scrape the plain text and put it in another.
You can set this up so that the XML-to-plain-text code is sofa-aware and puts
the plain text where your existing code expects it to be, without modification.
It's been a while since I've looked at this, but I left some notes:
http://code.google.com/p/uimafit/wiki/WorkingWithSofaViews
Also pay attention to the section called "Sofa Incompatibilities between UIMA
version 1 and version 2" in the user guide.

The uimafit-users Google group is another good resource.
https://groups.google.com/forum/?fromgroups=#!forum/uimafit-users

-Chris


SimpleServer, instantiating CAS with custom typesystem?

2013-02-18 Thread Helen Johnson -X (heljohns - Infobahn Softworld Inc at Cisco)
I'm stumped:

I have a UIMA pipeline that starts with a CollectionReader that

-  reads XML input (response from a REST service),

-  identifies a couple of relevant XML nodes

-  makes document-level annotations from the relevant nodes (title, 
document body, footnote section)
From there, the AnalysisEngine portion of the pipeline has many AEs that I've 
wrapped into a single AggregateAnalysisEngine.
The CollectionReader and the AAE all work correctly in this pipeline.

Now I need to transfer this pipeline into a SimpleServer REST service 
environment.
I've created a PEAR of the AAE portion of the pipeline, but I can't include the 
CollectionReader in this PEAR.
First question:
It is my understanding that the CR cannot be included in the PEAR for 
SimpleServer. Am I correct in this?

In order to get those document-zoning annotations of title, body, and footnote, 
I have added some methods to the Service.java class in the SimpleServer package 
that parse the XML and then add these annotations to the JCAS before the AAE is 
called. The error thrown at this point is:

The server encountered an internal error (JCas type 
myPackage.DocClass.ArticleMainTitle used in Java code, but was not declared 
in the XML type descriptor.) that prevented it from fulfilling this request.

Second question:
Where does Service.java look for the typesystem XML file? I have tried 
all of the following, with the same error result:

-  put the typesystem descriptor file, myTSD.xml, in SimpleServer/lib

-  create a jar containing myTSD.xml, put it into SimpleServer/lib and 
add that to the build path

-  (after the two above attempts), in SimpleServer project properties, 
add lib to the UIMA CDE Property Page

-  in SimpleServer project properties, in UIMA Type System, point to 
the myTSD.xml file in lib

-  put myTSD.xml in SimpleServer/WebContent/WEB-INF/lib

-  put the jar containing myTSD.xml in the 
SimpleServer/WebContent/WEB-INF/lib

-  put myTSD.xml in SimpleServer/WebContent/WEB-INF/resources

Final question:
When a CAS gets instantiated (or reset, as it does in Service.java), how can I 
tell it to use a custom typesystem, and where will it look for that 
typesystem.xml file within the SimpleServer project?

Thank you,
Helen Johnson