RE: DUCC web server interfacing

2014-11-20 Thread D. Heinze
Eddie... thanks.  Yes, that sounds like I would not have the advantage of DUCC 
managing the UIMA pipeline. 

To break it down a little for the uninitiated (me), 

1. How do I start a DUCC job that stays resident because it has high startup 
cost (e.g. 2 minutes to load all the resources for the UIMA pipeline vs. about 2 
seconds to process each request)?

2. Once I have a resident job, how do I get the Job Driver to iteratively feed 
references to each next document (as they are received) to the resident Job 
Process?  Because all the input jobs will be archived anyhow, I'm okay with 
passing them through the file system if needed.

Thanks / Dan

-----Original Message-----
From: Eddie Epstein [mailto:eaepst...@gmail.com] 
Sent: Thursday, November 20, 2014 6:06 PM
To: user@uima.apache.org
Subject: Re: DUCC web server interfacing

Ooops, in this case the web server would be feeding the service directly.

On Thu, Nov 20, 2014 at 9:04 PM, Eddie Epstein  wrote:

> The preferred approach is to run the analytics as a DUCC service, and 
> have an application driver that feeds the service instances with incoming 
> data.
> This service would be a scalable UIMA-AS service, which could have as 
> many instances as are needed to keep up with the load. The driver 
> would use the uima-as client API to feed the service. The application 
> driver could itself be another DUCC service.
>
> DUCC manages the life cycle of its services, including restarting them 
> on failure.
>
> Eddie
>
>
> On Thu, Nov 20, 2014 at 6:45 PM, Daniel Heinze  wrote:
>
>> I just installed DUCC this week and can process batch jobs.  I would 
>> like DUCC to initiate/manage one or more copies of the same UIMA 
>> pipeline that has high startup overhead and keep it/them active and 
>> feed it/them with documents that arrive periodically over a web 
>> service.  Any suggestions on the preferred way (if any) to do this in DUCC?
>>
>>
>>
>> Thanks / Dan
>>
>>
>




Re: DUCC web server interfacing

2014-11-20 Thread Eddie Epstein
Ooops, in this case the web server would be feeding the service directly.

On Thu, Nov 20, 2014 at 9:04 PM, Eddie Epstein  wrote:

> The preferred approach is to run the analytics as a DUCC service, and have
> an application driver that feeds the service instances with incoming data.
> This service would be a scalable UIMA-AS service, which could have as
> many instances as are needed to keep up with the load. The driver would
> use the uima-as client API to feed the service. The application driver
> could itself be another DUCC service.
>
> DUCC manages the life cycle of its services, including restarting them on
> failure.
>
> Eddie
>
>
> On Thu, Nov 20, 2014 at 6:45 PM, Daniel Heinze  wrote:
>
>> I just installed DUCC this week and can process batch jobs.  I would like
>> DUCC to initiate/manage one or more copies of the same UIMA pipeline that
>> has high startup overhead and keep it/them active and feed it/them with
>> documents that arrive periodically over a web service.  Any suggestions on
>> the preferred way (if any) to do this in DUCC?
>>
>>
>>
>> Thanks / Dan
>>
>>
>


Re: DUCC web server interfacing

2014-11-20 Thread Eddie Epstein
The preferred approach is to run the analytics as a DUCC service, and have
an application driver that feeds the service instances with incoming data.
This service would be a scalable UIMA-AS service, which could have as
many instances as are needed to keep up with the load. The driver would
use the uima-as client API to feed the service. The application driver
could itself be another DUCC service.

DUCC manages the life cycle of its services, including restarting them on
failure.

Eddie


On Thu, Nov 20, 2014 at 6:45 PM, Daniel Heinze  wrote:

> I just installed DUCC this week and can process batch jobs.  I would like
> DUCC to initiate/manage one or more copies of the same UIMA pipeline that
> has high startup overhead and keep it/them active and feed it/them with
> documents that arrive periodically over a web service.  Any suggestions on
> the preferred way (if any) to do this in DUCC?
>
>
>
> Thanks / Dan
>
>


DUCC web server interfacing

2014-11-20 Thread Daniel Heinze
I just installed DUCC this week and can process batch jobs.  I would like
DUCC to initiate/manage one or more copies of the same UIMA pipeline that
has high startup overhead and keep it/them active and feed it/them with
documents that arrive periodically over a web service.  Any suggestions on
the preferred way (if any) to do this in DUCC?

 

Thanks / Dan 



Re: DUCC stuck Waiting for Resources - new install on CentOS 6.5 VM

2014-11-20 Thread Daniel Heinze
Jim... thanks.  Yes, something did go awry with installation.  I had done
all the 'is it plugged in' checks you suggest (I had initially had an issue
with the passwordless ssh).  In the end, the problem turned out to be that
I must have been logged in as root at some point during the install because
various directories were owned by root (e.g. the log dir) and could not
be rwx by user ducc.  Chown fixed the problem.

 

-Dan 

 

===

Dan,

I need more information.  The lack of logs tells me something went awry with
the installation.

 

First off, some 'is it plugged in' checks -

- did you run ducc_post_install error free?
- if you run check_ducc -c does it show errors, or clean configuration?
- do you have passwordless ssh set up?
   To verify, ssh to your ducc userid on your centos machine; it should work
   without a password prompt.  This is a necessary check even if you only
   have one machine, and is the most common cause of this problem for me
   (because I disable ssh on my laptop when I'm not testing DUCC on it and
   forget!)

Jim

 

On 11/18/14, 3:37 PM, Dan Heinze wrote:

> I've read the "DUCC stuck Waiting for Resources on Amazon..." thread.
> I have a similar problem.  I did my first install of DUCC yesterday on a
> CentOS 6.5 VM with 9GB RAM.  No problems with the install. ./start_ducc -s
> seems to work fine, but when I look at ducc-mon Reservations, I find that
> Job Driver is stuck "Waiting for Resources".  I have given it hours, but it
> just stays stuck there.  Also, nothing is being written to the logs... the
> ${DUCC_HOME}/logs directory is empty.  Any help will be appreciated.
>
> -Dan
>

 



Re: can't remove duplicate Annotations with Java Set Collection

2014-11-20 Thread Marshall Schor
Sorry, the pictures/images don't come through this email list...  If you want to
include them, please post them on a well-known clip-site, and include a link to
them in your email.

I think the issue you're having is that you wrote:

...
@Override
public int compare(Annotation o1, Annotation o2) {
...

The @Override indicates an error if the method signature you're defining can't
be matched to a method in the supertype.

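The supertype here is the raw type "Comparator" (declared without a type
argument), and it only has a signature for compare with 2 args which are both
"Object"s.

You can remove the @Override to get rid of this check, but the anonymous class
would then still not implement compare(Object, Object).  Declaring the
comparator as Comparator<Annotation> makes your compare(Annotation, Annotation)
match the supertype's signature.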

-Marshall
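
P.S. Here is a minimal sketch of the typed-comparator idea, not a drop-in
fix.  Assumptions: "Span" is a hypothetical stand-in for UIMA's Annotation so
that the snippet compiles without UIMA on the classpath, and "DedupeSketch"
is just an illustrative class name.  I also swapped the == test for
String.compareTo, since == compares references and a comparator that returns
1 for every unequal pair gives TreeSet no consistent ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class DedupeSketch {

    // Hypothetical stand-in for org.apache.uima.jcas.tcas.Annotation.
    public static class Span {
        private final String coveredText;

        public Span(String coveredText) {
            this.coveredText = coveredText;
        }

        public String getCoveredText() {
            return coveredText;
        }
    }

    // Keeps one Span per distinct covered text.
    public static List<Span> dedupe(List<Span> input) {
        // Comparator<Span> rather than the raw Comparator, so that
        // compare(Span, Span) really overrides the interface method
        // and @Override compiles.
        Set<Span> set = new TreeSet<Span>(new Comparator<Span>() {
            @Override
            public int compare(Span o1, Span o2) {
                // Compare by content; 0 means "same text", so TreeSet
                // treats the second element as a duplicate and drops it.
                return o1.getCoveredText().compareTo(o2.getCoveredText());
            }
        });
        set.addAll(input);
        return new ArrayList<Span>(set);
    }

    public static void main(String[] args) {
        List<Span> spans = new ArrayList<Span>();
        spans.add(new Span("cat"));
        spans.add(new Span("dog"));
        spans.add(new Span(new String("cat"))); // equal text, different object
        System.out.println("deduped size: " + dedupe(spans).size());
    }
}
```

Note that the result comes back ordered by covered text rather than in
document order; if the original order matters, the HashMap suggestion quoted
below is probably the better fit.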

On 11/18/2014 2:06 PM, Kameron Cole wrote:
>
> Awesome.  Your change will work.  And I will try it, thank you!
>
> But maybe you can help me to get this to work?   As I posted, if I use Object
> as the parameter in the compare method signature, Eclipse is ok; but when I
> change it to Annotation, it says I must override the methods - as though
> something about Annotation confuses Eclipse.  Here's the code I really want to
> work:
>
>
> ---
>
> public static ArrayList dedupe (AnnotationIndex idx2){
>
>   ArrayList tempList = new ArrayList(idx2.size());
>   FSIterator it2  = idx2.iterator();
>   while(it2.hasNext())
>   {
>
>     tempList.add((Annotation) it2.next());
>
>   }
>
>   Set set = new TreeSet(new Comparator() {
>     @Override
>     public int compare(Annotation o1, Annotation o2) {
>       if(o1.getCoveredText()==o2.getCoveredText()){
>         return 0;
>       }
>       return 1;
>     }
>   });
>
>   set.addAll(tempList);
>
>   tempList.clear();
>   tempList.addAll(set);
>   System.out.println("templist length: "+tempList.size());
>   return tempList;
>
> -
>
> But look at what Eclipse gives me:
>
> Kameron Arthur Cole
> Watson Content Analytics Applications and Support
> email: kameronc...@us.ibm.com | Tel: 305-389-8512
>
> From: Marshall Schor 
> To: user@uima.apache.org
> Date: 11/18/2014 11:54 AM
> Subject: Re: can't remove duplicate Annotations with Java Set Collection
>
> 
>
>
>
> An even simpler approach:
>
> Use a HashMap, where the key is the annotation.getCoveredText() and the value is
> the annotation, instead of a HashSet.
>
> replace this (in your original):
>
> // push tempList into HashSet
> HashSet hs = new HashSet();
> hs.addAll(tempList);
>
>
> with
>
> // push tempList into HashMap
> HashMap<String, Annotation> hm = new HashMap<String, Annotation>();
> for (Annotation a : tempList) {
>  hm.put(a.getCoveredText(), a);
> }
>
> -Marshall
>
> On 11/18/2014 9:45 AM, Marshall Schor wrote:
> > Eclipse pointed out a bug in my code, fix is below
> > On 11/18/2014 9:37 AM, Marshall Schor wrote:
> >> Hi Kameron,
> >>
> >> Based on this code snip, the two "cat" annotations you create are "different"
> >> using the HashSet definition, because they correspond to two distinct UIMA
> >> Annotations.  You could, for instance, update one of them, and not the other;
> >> that is the sense in which they are distinct.  In the case below, the two "cat"
> >> annotations would have different begin and end offsets.
> >>
> >> I'm guessing that your goal was to have one of the two cat annotations be
> >> dropped.
> >>
> >> You could do that by using your hash set approach, if you defined equal to mean
> >> that just the covered text of the annotation was equal.
> >>
> >> Here's one way to do this:  Create a "cover object" for your annotations, that
> >> contains a reference to the annotation and defines equals and hashcode (you have
> >> to define these together).  The easy way to do this is using Eclipse - define a
> >> new class: e.g.
> >>
> >> public class MyAnnotationWithSpecialEquals {
> >>   final public Annotation annotation;   // the covered annotation
> >>  
> >>   public MyAnnotationWithSpecialEquals(Annotation annotation) {
> >>     this.annotation = annotation;
> >>   }
> >> }
> >>
> >> and then use Eclipse to define the equals and hashcode:  go to Menu -> Source ->
> >> Generate hashCode() and equals()
> >> and