Re: DUCC 1.1.0- How to Run two DUCC version on same machines with different user

2014-11-17 Thread Simon Hafner
2014-11-17 0:00 GMT-06:00 reshu.agarwal :
> I want to run two DUCC versions, i.e. 1.0.0 and 1.1.0, on the same
> machines with different users. Is this possible?
Yes, that should be possible. You'll have to make sure there are no
port conflicts, I'd guess the ActiveMQ port is hardcoded, the rest
might be randomly assigned. Just set that port manually and watch out
for any errors during the start to see which other components have
hardcoded port numbers.

Personally, I'd just fire up a VM with qemu or VirtualBox.


Re: DUCC-1.1.0: Machines are going down very frequently

2014-11-17 Thread Lou DeGenaro
Reshu,

Have you tried looking at the log files in DUCC's log directory for signs
of errors or exceptions?  Are any daemons producing core dumps?

Lou.

On Mon, Nov 17, 2014 at 1:21 AM, reshu.agarwal 
wrote:

>
> Dear Lou,
>
> I am using default configuration:
>
> ducc.agent.node.metrics.publish.rate=3
> ducc.rm.node.stability = 5
>
> Reshu.
>
>
> Signature On 11/12/2014 05:03 PM, Lou DeGenaro wrote:
>
>> What do you have defined in your ducc.properties for
>> ducc.rm.node.stability and ducc.agent.node.metrics.publish.rate?  The
>> Web Server considers a node down according to the following
>> calculation:
>>
>> private long getAgentMillisMIA() {
>>  String location = "getAgentMillisMIA";
>>  long secondsMIA = DOWN_AFTER_SECONDS*SECONDS_PER_MILLI;
>>  Properties properties = DuccWebProperties.get();
>>  String s_tolerance = properties.getProperty("ducc.rm.node.stability");
>>  String s_rate = properties.getProperty("ducc.agent.node.metrics.publish.rate");
>>  try {
>>  long tolerance = Long.parseLong(s_tolerance.trim());
>>  long rate = Long.parseLong(s_rate.trim());
>>  secondsMIA = (tolerance * rate) / 1000;
>>  }
>>  catch(Throwable t) {
>>  logger.warn(location, jobid, t);
>>  }
>>  return secondsMIA;
>>  }
>>
>> The default is 65 seconds. Note that the Web Server has no effect on
>> actual operations in this case.  It is just a reporter of information.
>>
>> Lou.
>>
>> On Wed, Nov 12, 2014 at 12:45 AM, reshu.agarwal
>>  wrote:
>>
>>> Hi,
>>>
>>> When I was trying DUCC-1.1.0 on multiple machines, I faced an up-down
>>> status problem: I configured two machines, and they keep going down one
>>> by one. This disables the DUCC Services and forces Jobs to initialize
>>> again and again.
>>>
>>> DUCC 1.0.0 was working fine on the same machines.
>>>
>>> How can I fix this problem? I have also compared the ducc.properties
>>> files for both versions; both use the same configuration to check
>>> heartbeats.
>>>
>>> Re-initialization of Jobs increases the processing time. Can I change
>>> or re-configure this process?
>>>
>>> Services are getting disabled automatically, showing an excessive
>>> initialization error status on mouse-over of the disabled status, but
>>> the logs show no errors.
>>>
>>> I have to use DUCC 1.0.0 instead of DUCC 1.1.0.
>>>
>>> Thanks in Advance.
>>>
>>> --
>>> Signature *Reshu Agarwal*
>>>
>>>
>


DUCC doesn't use all available machines

2014-11-17 Thread Simon Hafner
I fired the DuccRawTextSpec.job on a cluster consisting of three
machines, with 100 documents. The scheduler only runs the processes on
two machines instead of all three. Can I mess with a few config
variables to make it use all three?

id:22 state:Running total:100 done:0 error:0 retry:0 procs:1
id:22 state:Running total:100 done:0 error:0 retry:0 procs:2
id:22 state:Running total:100 done:0 error:0 retry:0 procs:4
id:22 state:Running total:100 done:1 error:0 retry:0 procs:8
id:22 state:Running total:100 done:6 error:0 retry:0 procs:8


Re: DUCC-1.1.0: Machines are going down very frequently

2014-11-17 Thread reshu.agarwal

Lou,

I tried to find any sign of errors or exceptions but didn't find any.

Reshu.
On 11/17/2014 05:18 PM, Lou DeGenaro wrote:

Reshu,

Have you tried looking at the log files in DUCC's log directory for signs
of errors or exceptions?  Are any daemons producing core dumps?

Lou.

On Mon, Nov 17, 2014 at 1:21 AM, reshu.agarwal 
wrote:


Dear Lou,

I am using default configuration:

ducc.agent.node.metrics.publish.rate=3
ducc.rm.node.stability = 5

Reshu.


Signature On 11/12/2014 05:03 PM, Lou DeGenaro wrote:


What do you have defined in your ducc.properties for
ducc.rm.node.stability and ducc.agent.node.metrics.publish.rate?  The
Web Server considers a node down according to the following
calculation:

private long getAgentMillisMIA() {
  String location = "getAgentMillisMIA";
  long secondsMIA = DOWN_AFTER_SECONDS*SECONDS_PER_MILLI;
  Properties properties = DuccWebProperties.get();
  String s_tolerance = properties.getProperty("ducc.rm.node.stability");
  String s_rate = properties.getProperty("ducc.agent.node.metrics.publish.rate");
  try {
  long tolerance = Long.parseLong(s_tolerance.trim());
  long rate = Long.parseLong(s_rate.trim());
  secondsMIA = (tolerance * rate) / 1000;
  }
  catch(Throwable t) {
  logger.warn(location, jobid, t);
  }
  return secondsMIA;
  }

The default is 65 seconds. Note that the Web Server has no effect on
actual operations in this case.  It is just a reporter of information.

Lou.

On Wed, Nov 12, 2014 at 12:45 AM, reshu.agarwal
 wrote:


Hi,

When I was trying DUCC-1.1.0 on multiple machines, I faced an up-down
status problem: I configured two machines, and they keep going down one
by one. This disables the DUCC Services and forces Jobs to initialize
again and again.

DUCC 1.0.0 was working fine on the same machines.

How can I fix this problem? I have also compared the ducc.properties
files for both versions; both use the same configuration to check
heartbeats.

Re-initialization of Jobs increases the processing time. Can I change or
re-configure this process?

Services are getting disabled automatically, showing an excessive
initialization error status on mouse-over of the disabled status, but the
logs show no errors.

I have to use DUCC 1.0.0 instead of DUCC 1.1.0.

Thanks in Advance.

--
Signature *Reshu Agarwal*






Re: DUCC-1.1.0: Machines are going down very frequently

2014-11-17 Thread Lou DeGenaro
Also, do all the daemons on the System -> Daemons page show status "up"?

Have a look at the Broker page of the live demo on Apache here:
http://uima-ducc-vm.apache.org:42133/system.broker.jsp and compare with
yours.  Do all of the Topics appear with consumers > 0?

On Mon, Nov 17, 2014 at 6:48 AM, Lou DeGenaro 
wrote:

> Reshu,
>
> Have you tried looking at the log files in DUCC's log directory for signs
> of errors or exceptions?  Are any daemons producing core dumps?
>
> Lou.
>
> On Mon, Nov 17, 2014 at 1:21 AM, reshu.agarwal 
> wrote:
>
>>
>> Dear Lou,
>>
>> I am using default configuration:
>>
>> ducc.agent.node.metrics.publish.rate=3
>> ducc.rm.node.stability = 5
>>
>> Reshu.
>>
>>
>> Signature On 11/12/2014 05:03 PM, Lou DeGenaro wrote:
>>
>>> What do you have defined in your ducc.properties for
>>> ducc.rm.node.stability and ducc.agent.node.metrics.publish.rate?  The
>>> Web Server considers a node down according to the following
>>> calculation:
>>>
>>> private long getAgentMillisMIA() {
>>>  String location = "getAgentMillisMIA";
>>>  long secondsMIA = DOWN_AFTER_SECONDS*SECONDS_PER_MILLI;
>>>  Properties properties = DuccWebProperties.get();
>>>  String s_tolerance = properties.getProperty("ducc.rm.node.stability");
>>>  String s_rate = properties.getProperty("ducc.agent.node.metrics.publish.rate");
>>>  try {
>>>  long tolerance = Long.parseLong(s_tolerance.trim());
>>>  long rate = Long.parseLong(s_rate.trim());
>>>  secondsMIA = (tolerance * rate) / 1000;
>>>  }
>>>  catch(Throwable t) {
>>>  logger.warn(location, jobid, t);
>>>  }
>>>  return secondsMIA;
>>>  }
>>>
>>> The default is 65 seconds. Note that the Web Server has no effect on
>>> actual operations in this case.  It is just a reporter of information.
>>>
>>> Lou.
>>>
>>> On Wed, Nov 12, 2014 at 12:45 AM, reshu.agarwal
>>>  wrote:
>>>
 Hi,

 When I was trying DUCC-1.1.0 on multiple machines, I faced an up-down
 status problem: I configured two machines, and they keep going down one
 by one. This disables the DUCC Services and forces Jobs to initialize
 again and again.

 DUCC 1.0.0 was working fine on the same machines.

 How can I fix this problem? I have also compared the ducc.properties
 files for both versions; both use the same configuration to check
 heartbeats.

 Re-initialization of Jobs increases the processing time. Can I change or
 re-configure this process?

 Services are getting disabled automatically, showing an excessive
 initialization error status on mouse-over of the disabled status, but
 the logs show no errors.

 I have to use DUCC 1.0.0 instead of DUCC 1.1.0.

 Thanks in Advance.

 --
 Signature *Reshu Agarwal*


>>
>


Re: DUCC 1.1.0- How to Run two DUCC version on same machines with different user

2014-11-17 Thread Lou DeGenaro
The broker port is specifiable in ducc.properties.  The default is
ducc.broker.port = 61617.
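
For a second DUCC instance on the same machines, the relevant entries in
that instance's ducc.properties might look like the sketch below. The
alternate port values are made up for illustration, and `ducc.ws.port`
(the web server port) is assumed from the discussion, not quoted from the
default file:

```properties
# Second instance's ducc.properties -- alternate ports chosen arbitrarily
# so they don't collide with the first instance's defaults.
ducc.broker.port = 61618
ducc.ws.port = 42134
```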

Lou.

On Mon, Nov 17, 2014 at 5:29 AM, Simon Hafner  wrote:

> 2014-11-17 0:00 GMT-06:00 reshu.agarwal :
> > I want to run two DUCC version i.e. 1.0.0 and 1.1.0 on same machines with
> > different user. Can this be possible?
> Yes, that should be possible. You'll have to make sure there are no
> port conflicts, I'd guess the ActiveMQ port is hardcoded, the rest
> might be randomly assigned. Just set that port manually and watch out
> for any errors during the start to see which other components have
> hardcoded port numbers.
>
> Personally, I'd just fire up a VM with qemu or VirtualBox.
>


Re: DUCC 1.1.0- How to Run two DUCC version on same machines with different user

2014-11-17 Thread reshu.agarwal

Lou,

I have changed the broker port and ws port too, but still face a problem
starting the DUCC 1.1.0 version simultaneously.


Reshu.

On 11/17/2014 05:34 PM, Lou DeGenaro wrote:

The broker port is specifiable in ducc.properties.  The default is
ducc.broker.port = 61617.

Lou.

On Mon, Nov 17, 2014 at 5:29 AM, Simon Hafner  wrote:


2014-11-17 0:00 GMT-06:00 reshu.agarwal :

I want to run two DUCC version i.e. 1.0.0 and 1.1.0 on same machines with
different user. Can this be possible?

Yes, that should be possible. You'll have to make sure there are no
port conflicts, I'd guess the ActiveMQ port is hardcoded, the rest
might be randomly assigned. Just set that port manually and watch out
for any errors during the start to see which other components have
hardcoded port numbers.

Personally, I'd just fire up a VM with qemu or VirtualBox.





Re: DUCC 1.1.0- How to Run two DUCC version on same machines with different user

2014-11-17 Thread Lou DeGenaro
Are these problems related?  That is, are you having the node down problem
and the multiple DUCC's problem together on the same set of nodes?

Can you run either configuration alone without issue?

Lou.

On Mon, Nov 17, 2014 at 7:41 AM, reshu.agarwal 
wrote:

> Lou,
>
> I have changed the broker port and ws port too, but still face a problem
> starting the DUCC 1.1.0 version simultaneously.
>
> Reshu.
>
>
> On 11/17/2014 05:34 PM, Lou DeGenaro wrote:
>
>> The broker port is specifiable in ducc.properties.  The default is
>> ducc.broker.port = 61617.
>>
>> Lou.
>>
>> On Mon, Nov 17, 2014 at 5:29 AM, Simon Hafner 
>> wrote:
>>
>>  2014-11-17 0:00 GMT-06:00 reshu.agarwal :
>>>
 I want to run two DUCC version i.e. 1.0.0 and 1.1.0 on same machines
 with
 different user. Can this be possible?

>>> Yes, that should be possible. You'll have to make sure there are no
>>> port conflicts, I'd guess the ActiveMQ port is hardcoded, the rest
>>> might be randomly assigned. Just set that port manually and watch out
>>> for any errors during the start to see which other components have
>>> hardcoded port numbers.
>>>
>>> Personally, I'd just fire up a VM with qemu or VirtualBox.
>>>
>>>
>


can't remove duplicate Annotations with Java Set Collection

2014-11-17 Thread Kameron Cole

Hello,

I am trying to get rid of duplicates in the FSIndex.  I thought a very
clever way to do this would be to just push them into a Set Collection in
Java, which does not allow duplicates. This is very (very) standard Java:

ArrayList<String> al = new ArrayList<String>();
// add elements to al, including duplicates
HashSet<String> hs = new HashSet<String>();
hs.addAll(al);
al.clear();
al.addAll(hs);

This list will contain no duplicates.

However, I am not getting this to work in my UIMA code:


AnnotationIndex idx = aJCas.getAnnotationIndex();

System.out.println("Index size is: " + idx.size());

ArrayList<Annotation> tempList = new ArrayList<Annotation>(idx.size());

FSIterator it = idx.iterator();

// load the Annotations into a temporary list; includes duplicates
while (it.hasNext()) {
    tempList.add((Annotation) it.next());
}

// remove all Annotations from the index; this works fine
Iterator<Annotation> tempIt = tempList.iterator();
while (tempIt.hasNext()) {
    tempIt.next().removeFromIndexes(aJCas);
}

// push tempList into a HashSet; this should not allow duplicates
HashSet<Annotation> hs = new HashSet<Annotation>();
hs.addAll(tempList);

// size should be less than the size of the FSIndex by the number of
// duplicates. It is not. This is the main problem.
System.out.println("HS length: " + hs.size());

tempList.clear();
tempList.addAll(hs);

System.out.println("templist length: " + tempList.size());

// this should now be the clean list
Iterator<Annotation> it2 = tempList.iterator();
while (it2.hasNext()) {
    it2.next().addToIndexes(aJCas);
}

Re: DUCC doesn't use all available machines

2014-11-17 Thread Eddie Epstein
DuccRawTextSpec.job specifies that each job process (JP)
run 8 analytic pipeline threads. So for this job with 100 work
items, no more than 13 JPs would ever be started.

After successful initialization of the first JP, DUCC begins scaling
up the number of JPs using doubling. During JP scale up the
scheduler monitors the work item completion rate, compares that
with the JP initialization time, and stops scaling up JPs when
starting more JPs will not make the job run any faster.

Of course JP scale up is also limited by the job's "fair share"
of resources relative to total resources available for all preemptable jobs.

To see more JPs, increase the number and/or size of the input text files,
or decrease the number of pipeline threads per JP.

Note that it can be counterproductive to run "too many" pipeline
threads per machine. Assuming analytic threads are 100% CPU bound,
running more threads than real cores will often slow down the overall
document processing rate.
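
As a rough illustration of the arithmetic above (a sketch, not DUCC
source code): with 100 work items and 8 pipeline threads per JP, the
useful JP ceiling and the doubling scale-up look like this:

```java
// Sketch (not DUCC source): upper bound on job processes (JPs) for a job,
// given the number of work items and pipeline threads per JP.
public class JpCapSketch {
    // With 100 work items and 8 threads per JP, at most
    // ceil(100 / 8) = 13 JPs can ever have work to do.
    static int maxUsefulJps(int workItems, int threadsPerJp) {
        return (workItems + threadsPerJp - 1) / threadsPerJp; // ceiling division
    }

    // DUCC scales JPs up by doubling after the first successful init:
    // 1, 2, 4, 8, ... capped at the useful maximum.
    static int nextJpCount(int current, int cap) {
        return Math.min(current * 2, cap);
    }

    public static void main(String[] args) {
        System.out.println(maxUsefulJps(100, 8)); // prints 13
        System.out.println(nextJpCount(8, 13));   // prints 13
    }
}
```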


On Mon, Nov 17, 2014 at 6:48 AM, Simon Hafner  wrote:

> I fired the DuccRawTextSpec.job on a cluster consisting of three
> machines, with 100 documents. The scheduler only runs the processes on
> two machines instead of all three. Can I mess with a few config
> variables to make it use all three?
>
> id:22 state:Running total:100 done:0 error:0 retry:0 procs:1
> id:22 state:Running total:100 done:0 error:0 retry:0 procs:2
> id:22 state:Running total:100 done:0 error:0 retry:0 procs:4
> id:22 state:Running total:100 done:1 error:0 retry:0 procs:8
> id:22 state:Running total:100 done:6 error:0 retry:0 procs:8
>


Re: DUCC doesn't use all available machines

2014-11-17 Thread Jim Challenger
It is also possible that RM "prediction" has decided that additional
processes are not needed. It appears that there were likely 64 work items
dispatched, plus the 6 completed, leaving only 30 that were "idle". If
these work items appeared to be completing quickly, the RM would decide
that scale-up would be wasteful and not do it.

Very gory details if you're interested:
The time to start a new process is measured by the RM based on the
observed initialization time of the processes plus an estimate of how
long it would take to get a new process actually running. A fudge factor
is added on top of this because in a large operation it is wasteful to
start processes (with associated preemptions) that only end up doing a
"few" work items. All is subjective and configurable.

The average time per work item is also reported to the RM.

The RM then looks at the number of work items remaining and the estimated
time needed to process this work based on the above, and if it determines
that the job will be completed before new processes can be scaled up and
initialized, it does not scale up.

For short jobs, this can be a bit inaccurate, but those jobs are short :)

For longer jobs, the time-per-work-item becomes increasingly accurate, so
the RM prediction tends to improve, and ramp-up WILL occur if the
work-item time turns out to be larger than originally thought. (Our
experience is that work-item times are mostly uniform with occasional
outliers, but the prediction seems to work well.)

Relevant configuration parameters in ducc.properties:

# Predict when a job will end and avoid expanding if not needed. Set to
# false to disable prediction.
ducc.rm.prediction = true
# Add this fudge factor (milliseconds) to the expansion target when using
# prediction
ducc.rm.prediction.fudge = 12

You can observe this in the rm log; see the example below. I'm preparing
a guide to this log; for now, the net of these two log lines is: the
projection for the job in question (job 208927) is that 16 processes are
needed to complete this job, even though the job could use 20 processes
at full expansion - the BaseCap - so a max of 16 will be scheduled for
it, subject to fair-share constraints.

17 Nov 2014 15:07:38,880  INFO RM.RmJob - getPrjCap 208927 bobuser
O 2 T 343171 NTh 128 TI 143171 TR 6748.601431980907 R 1.8967e-02 QR 5043
P 6509 F 0 ST 1416254363603 return 16
17 Nov 2014 15:07:38,880  INFO RM.RmJob - initJobCap 208927 bobuser
O 2 Base cap: 20 Expected future cap: 16 potential cap 16 actual cap 16
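
The decision described above can be sketched roughly like this. This is
illustrative only: the method names and the exact formula are invented
for the sketch, not taken from the RM source.

```java
// Illustrative sketch of the RM "prediction" decision: don't expand if the
// remaining work will finish before a new process could initialize
// (observed init time plus the configured fudge).
public class RmPredictionSketch {
    static boolean worthExpanding(long itemsRemaining, long msPerItem,
                                  long threadsRunning, long initTimeMs,
                                  long fudgeMs) {
        // Projected time to drain the remaining items with current threads.
        long projectedRemainingMs = (itemsRemaining * msPerItem) / threadsRunning;
        return projectedRemainingMs > initTimeMs + fudgeMs;
    }

    public static void main(String[] args) {
        // 30 idle items at ~2s each spread over 64 running threads finish in
        // under a second -- far less than a 60s init time, so no scale-up.
        System.out.println(worthExpanding(30, 2000, 64, 60000, 12)); // prints false
    }
}
```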


Jim

On 11/17/14, 3:44 PM, Eddie Epstein wrote:

DuccRawTextSpec.job specifies that each job process (JP)
run 8 analytic pipeline threads. So for this job with 100 work
items, no more than 13 JPs would ever be started.

After successful initialization of the first JP, DUCC begins scaling
up the number of JPs using doubling. During JP scale up the
scheduler monitors the work item completion rate, compares that
with the JP initialization time, and stops scaling up JPs when
starting more JPs will not make the job run any faster.

Of course JP scale up is also limited by the job's "fair share"
of resources relative to total resources available for all preemptable jobs.

To see more JPs, increase the number and/or size of the input text files,
or decrease the number of pipeline threads per JP.

Note that it can be counterproductive to run "too many" pipeline
threads per machine. Assuming analytic threads are 100% CPU bound,
running more threads than real cores will often slow down the overall
document processing rate.


On Mon, Nov 17, 2014 at 6:48 AM, Simon Hafner  wrote:


I fired the DuccRawTextSpec.job on a cluster consisting of three
machines, with 100 documents. The scheduler only runs the processes on
two machines instead of all three. Can I mess with a few config
variables to make it use all three?

id:22 state:Running total:100 done:0 error:0 retry:0 procs:1
id:22 state:Running total:100 done:0 error:0 retry:0 procs:2
id:22 state:Running total:100 done:0 error:0 retry:0 procs:4
id:22 state:Running total:100 done:1 error:0 retry:0 procs:8
id:22 state:Running total:100 done:6 error:0 retry:0 procs:8





Re: can't remove duplicate Annotations with Java Set Collection

2014-11-17 Thread Marshall Schor
Hi,

Two Feature Structures are considered "equal" in the sense used by HashSet if
fs1.equals(fs2).  The definition of "equals" for feature structures is: they
are equal if they refer to the same underlying CAS and the same "spot" in the
CAS Heap.

How did you create the Annotations that you think are "equal" in the HashSet 
sense?

Here's an example of two annotations which are "equal" in the UIMA sorted index
sense, but unequal in the HashSet sense.

Annotation fs1 = new Annotation(myJCas, 0, 4); // create an instance of
Annotation in myJCas, with a begin = 0, and end = 4.
Annotation fs2 = new Annotation(myJCas, 0, 4); // create an instance of
Annotation in myJCas, with a begin = 0, and end = 4.

These will be "equal" in the UIMA sense - the same kind of annotation, in the
same CAS, with the same feature values, but will be two distinct feature
structures, so HashSet will consider them to be unequal.

Could this be what is happening in your case?  Please respond so we can see if
there's another straightforward solution that does what you're looking for.
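
If the goal is to collapse feature structures that cover the same span, one
straightforward workaround is to deduplicate by a value-based key instead of
relying on FS identity. Here is a minimal sketch in plain Java; the
UIMA-specific key function shown in the comment is a hypothetical usage, not
tested code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Deduplicate by a value-based key rather than object identity. For UIMA
// annotations the key function would be something like
//   a -> a.getType().getName() + ":" + a.getBegin() + "-" + a.getEnd()
// (hypothetical; UIMA classes are not included in this sketch).
public class DedupByKey {
    static <T, K> List<T> dedup(List<T> items, Function<T, K> key) {
        Map<K, T> byKey = new LinkedHashMap<>(); // keeps first item per key
        for (T item : items) byKey.putIfAbsent(key.apply(item), item);
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        // Stand-ins for two distinct Annotation instances with equal spans.
        List<String> spans = Arrays.asList("Animal:0-4", "Animal:0-4", "Animal:6-9");
        System.out.println(dedup(spans, s -> s).size()); // prints 2
    }
}
```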

-Marshall
on 11/17/2014 2:59 PM, Kameron Cole wrote:
> Hello,
>
> I am trying to get rid of duplicates in the FSIndex.  I thought a very
> clever way to do this would be to just push them into a Set Collection in
> Java, which does not allow duplicates. This is very (very) standard Java:
>
> ArrayList al = new ArrayList();
> // add elements to al, including duplicates
> HashSet hs = new HashSet();
> hs.addAll(al);
> al.clear();
> al.addAll(hs);
>
> This list will contain no duplicates.
>
> However, I am not getting this to work in my UIMA code:
>
>
> System.out.println("Index size is: "+idx.size());
>
> AnnotationIndex idx = aJCas.getAnnotationIndex();
>
> ArrayList tempList = new ArrayList(idx.size());
>
>   FSIterator it  = idx.iterator();
>
> //load the Annotations into a temporary list.  includes duplicates
>
>   while(it.hasNext())
>   {
>
>   tempList.add((Annotation) it.next());
>
>   }
>
> Iterator tempIt = tempList.iterator();
>
> // remove all Annotations from the index.  this works fine
>
>   while(tempIt.hasNext()){
>   ((Annotation) tempIt.next()).removeFromIndexes(aJCas);
>   }
>
> // push tempList into HashSet
>
>   HashSet hs = new HashSet();
>
>   hs.addAll(tempList);
>
> // this should not allow duplicates
>
> System.out.println("HS length: "+hs.size()); // size should be less the
> size of the FSIndex by the number of duplicates.  it is not. This is the
> main problem
>
> tempList.clear();
>
>   tempList.addAll(hs);
>
>   System.out.println("templist length: "+tempList.size());
>
>
> Iterator it2 = tempList.iterator(); // this should now be the
> clean list
>
>
>   while(it2.hasNext()){
>   it2.next().addToIndexes(aJCas);
>   }



Re: can't remove duplicate Annotations with Java Set Collection

2014-11-17 Thread Kameron Cole

Input text:

--

bird, cat, bush, cat



Create the Annotations:

---
docText = aJCas.getDocumentText();

int index = docText.indexOf("cat");
while (index >= 0) {
    int begin = index;
    int end = begin + 3;
    Animal animal = new Animal(aJCas);
    animal.setBegin(begin);
    animal.setEnd(end);
    animal.addToIndexes();

    index = docText.indexOf("cat", index + 1);
}

index = docText.indexOf("bird");
while (index >= 0) {
    int begin = index;
    int end = begin + 4;
    Animal animal = new Animal(aJCas);
    animal.setBegin(begin);
    animal.setEnd(end);
    animal.addToIndexes();

    index = docText.indexOf("bird", index + 1);
}

index = docText.indexOf("bush");
while (index >= 0) {
    int begin = index;
    int end = begin + 4;
    Vegetable animal = new Vegetable(aJCas);
    animal.setBegin(begin);
    animal.setEnd(end);
    animal.addToIndexes();

    index = docText.indexOf("bush", index + 1);
}
--
   
   
   
 Kameron Arthur Cole   
 Watson Content
 Analytics Applications
 and Support   
 email:
 kameronc...@us.ibm.com
 | Tel: 305-389-8512   
 upload logs here  
   
   
   
   
   





From:   Marshall Schor 
To: user@uima.apache.org
Date:   11/17/2014 04:35 PM
Subject:Re: can't remove duplicate Annotations with Java Set Collection



Hi,

Two Feature Structures are considered "equal" in the sense used by HashSet,
if
fs1.equals(fs2).  The definition of "equals" for feature structures is:
they are equal if they refer to the same underlying CAS and the same
"spot" in the CAS Heap.

How did you create the Annotations that you think are "equal" in the
HashSet sense?

Here's an example of two annotations which are "equal" in the UIMA sorted
index
sense, but unequal in the HashSet sense.

Annotation fs1 = new Annotation(myJCas, 0, 4); // create an instance of
Annotation in myJCas, with a begin = 0, and end = 4.
Annotation fs2 = new Annotation(myJCas, 0, 4); // create an instance of
Annotation in myJCas, with a begin = 0, and end = 4.

These will be "equal" in the UIMA sense - the same kind of annotation, in
the
same CAS, with the same feature values, but will be two distinct feature
structures, so HashSet will consider them to be unequal.

Could this be what is happening in your case?  Please respond so we can see
if
there's another straightforward solution that does what you're looking
for.

-Marshall
on 11/17/2014 2:59 PM, Kameron Cole wrote:
> Hello,
>
> I am trying to get rid of duplicates in the FSIndex.  I thought a very
> clever way to do this would be to just push them into a Set Collection in
> Java, which does not allow duplicates. This is very (very) standard Java:
>
> ArrayList al = new ArrayList();
> // add elements to al, including duplicates
> HashSet hs = new HashSet();
> hs.addAll(al);
> al.clear();
> al.addAll(hs);
>
> This list will contain no duplicates.
>
> However, I am not getting this to work in my UIMA code:
>
>
> System.out.println("Index size is: "+idx.size());
>
> AnnotationIndex idx = aJCas.getAnnotationIndex();
>
> ArrayList tempList = new ArrayList(idx.size());
>
>FSIterator it  = idx.iterator();
>
> //load the Annotations into a temporary list.  includes duplicates
>
> 

Re: DUCC 1.1.0- How to Run two DUCC version on same machines with different user

2014-11-17 Thread Jim Challenger

An excellent question by Lou.

Been wracking my brains over what would cause the agent instability, and
this would absolutely do it.

What will likely happen if you try to run 2 DUCCs on the same machines
without changing the broker port is that the second DUCC will see the
broker alive and figure all is well. There are a number of use-cases where
this is acceptable, so we don't throw an alert, e.g. if you choose to use
a non-DUCC-managed broker (as we do here).

To add to the confusion, sometimes the 'activemq stop' that DUCC issues
doesn't work, for reasons out of DUCC's control, so when you think the
broker is down, it isn't.

Try this:
1.  Stop all the DUCCs, then use the ps command to make sure there is no
    errant broker. I use this:
        ps auxw | grep DUCC_AMQ_PORT
    and kill -9 any process that it shows.
2.  Now start JUST ONE DUCC, I suggest the 1.1.0, and see if life gets
    better. 1.1.0 has some nice things, so you'll be better off with that
    if we can make it work for you.
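
If it's useful, whether anything is still listening on the broker port can
also be checked programmatically. A small sketch follows; it only tells
you that the port is bound, not which process owns it, so the ps/kill step
above remains the real cleanup:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Quick check for a lingering listener (e.g. an errant broker) on a port.
public class BrokerPortCheck {
    static boolean portInUse(String host, int port) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), 1000); // 1s timeout
            return true;   // something accepted the connection
        } catch (IOException e) {
            return false;  // nothing listening (or unreachable)
        }
    }

    public static void main(String[] args) {
        // 61617 is DUCC's default broker port per ducc.properties.
        System.out.println(portInUse("localhost", 61617));
    }
}
```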

Jim

On 11/17/14, 8:03 AM, Lou DeGenaro wrote:

Are these problems related?  That is, are you having the node down problem
and the multiple DUCC's problem together on the same set of nodes?

Can you run either configuration alone without issue?

Lou.

On Mon, Nov 17, 2014 at 7:41 AM, reshu.agarwal 
wrote:


Lou,

I have changed the broker port and ws port too but still faced a problem
in starting the ducc1.1.0 version simultaneously.

Reshu.


On 11/17/2014 05:34 PM, Lou DeGenaro wrote:


The broker port is specifiable in ducc.properties.  The default is
ducc.broker.port = 61617.

Lou.

On Mon, Nov 17, 2014 at 5:29 AM, Simon Hafner 
wrote:

  2014-11-17 0:00 GMT-06:00 reshu.agarwal :

I want to run two DUCC version i.e. 1.0.0 and 1.1.0 on same machines
with
different user. Can this be possible?


Yes, that should be possible. You'll have to make sure there are no
port conflicts, I'd guess the ActiveMQ port is hardcoded, the rest
might be randomly assigned. Just set that port manually and watch out
for any errors during the start to see which other components have
hardcoded port numbers.

Personally, I'd just fire up a VM with qemu or VirtualBox.






Re: DUCC 1.1.0- How to Run two DUCC version on same machines with different user

2014-11-17 Thread reshu.agarwal


Dear Lou,

These two problems are different, but the nodes are the same. DUCC 1.0.0
is running perfectly on the same nodes.

When I was trying to configure DUCC 1.1.0 by itself, I faced the node
down problem. Because of this, I thought I would try to run DUCC 1.0.0
and DUCC 1.1.0 simultaneously on the same node to compare their
behaviour. But now I am facing problems with this too.


Reshu.


Signature On 11/17/2014 06:33 PM, Lou DeGenaro wrote:

Are these problems related?  That is, are you having the node down problem
and the multiple DUCC's problem together on the same set of nodes?

Can you run either configuration alone without issue?

Lou.

On Mon, Nov 17, 2014 at 7:41 AM, reshu.agarwal 
wrote:


Lou,

I have changed the broker port and ws port too but still faced a problem
in starting the ducc1.1.0 version simultaneously.

Reshu.


On 11/17/2014 05:34 PM, Lou DeGenaro wrote:


The broker port is specifiable in ducc.properties.  The default is
ducc.broker.port = 61617.

Lou.

On Mon, Nov 17, 2014 at 5:29 AM, Simon Hafner 
wrote:

  2014-11-17 0:00 GMT-06:00 reshu.agarwal :

I want to run two DUCC version i.e. 1.0.0 and 1.1.0 on same machines
with
different user. Can this be possible?


Yes, that should be possible. You'll have to make sure there are no
port conflicts, I'd guess the ActiveMQ port is hardcoded, the rest
might be randomly assigned. Just set that port manually and watch out
for any errors during the start to see which other components have
hardcoded port numbers.

Personally, I'd just fire up a VM with qemu or VirtualBox.






Re: DUCC 1.1.0- How to Run two DUCC version on same machines with different user

2014-11-17 Thread reshu.agarwal

Dear Jim,

When I was trying DUCC 1.1.0 on the nodes on which DUCC 1.0.0 was running
perfectly, I first stopped DUCC 1.0.0 using ./check_ducc -k. My broker
ports were different at that time, and I also changed the duccling path.
When I started DUCC 1.1.0, it looked like it was working fine, but then I
faced the agent instability problem, so I re-configured DUCC 1.0.0.

Then I tried to configure DUCC 1.0.0 and DUCC 1.1.0 together. My broker
ports were different, and I made all possible changes to the ports so
that they wouldn't conflict with the other DUCC's ports. Right now, I am
still working to configure both on the same nodes.

I will try what you suggested and will let you know if I succeed.

If I missed something, please let me know.

Thanks.

Reshu.

On 11/18/2014 04:06 AM, Jim Challenger wrote:

An excellent question by Lou.

Been wracking my brains over what would cause the agent instability 
and this would absolutely do it.


What will likely happen if you try to run 2 DUCCs on the same machines 
if you don't change the broker
port, is the second DUCC will see the broker alive and figure all is 
well.  There are a number of
use-cases where this is acceptable so we don't throw an alert, e.g. if 
you choose to use a non-DUCC

managed broker (as we do here).

To add to the confusion, sometimes the 'activemq stop' that DUCC 
issues doesn't work, for reasons out

of DUCC control, so when you think the broker is down, it isn't.

Try this:
1.  stop all the DUCCS, then use the ps command to make sure there is 
no errant broker. I use this:

   ps auxw | grep DUCC_AMQ_PORT
 and kill -9 any process that it shows.
2.  Now start JUST ONE DUCC, I suggest the 1.1.0, and see if life gets 
better.  1.1.0 has some nice

 things so you'll be better with that if we can make it work for you.

Jim

On 11/17/14, 8:03 AM, Lou DeGenaro wrote:
Are these problems related?  That is, are you having the node down 
problem

and the multiple DUCC's problem together on the same set of nodes?

Can you run either configuration alone without issue?

Lou.

On Mon, Nov 17, 2014 at 7:41 AM, reshu.agarwal 


wrote:


Lou,

I have changed the broker port and ws port too but still faced a 
problem

in starting the ducc1.1.0 version simultaneously.

Reshu.


On 11/17/2014 05:34 PM, Lou DeGenaro wrote:


The broker port is specifiable in ducc.properties.  The default is
ducc.broker.port = 61617.

Lou.

On Mon, Nov 17, 2014 at 5:29 AM, Simon Hafner wrote:

  2014-11-17 0:00 GMT-06:00 reshu.agarwal:

I want to run two DUCC versions, i.e. 1.0.0 and 1.1.0, on the same 
machines with different users. Is this possible?


Yes, that should be possible. You'll have to make sure there are no
port conflicts, I'd guess the ActiveMQ port is hardcoded, the rest
might be randomly assigned. Just set that port manually and watch out
for any errors during the start to see which other components have
hardcoded port numbers.

Personally, I'd just fire up a VM with qemu or VirtualBox.
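
For reference, a minimal sketch of how the two instances' ducc.properties could keep out of each other's way. Only ducc.broker.port (default 61617) is confirmed in this thread; the ducc.ws.port name is inferred from the "ws port" mention above, and all the concrete values below are illustrative assumptions:

```properties
# DUCC 1.0.0 instance (first user)
ducc.broker.port = 61617
ducc.ws.port     = 42133

# DUCC 1.1.0 instance (second user) - every port must differ
ducc.broker.port = 61618
ducc.ws.port     = 42134
```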








DUCC-Un-managed Reservation??

2014-11-17 Thread reshu.agarwal


Hi,

I am a bit confused. Why do we need an un-managed reservation? Suppose 
we give 5 GB of memory to this reservation. Can this RAM be consumed by 
any process if required?


In my scenario, when all the RAM on the nodes was consumed by jobs, all 
processes went into a waiting state. I need to reserve some RAM so that 
it cannot be consumed by shares for job processes, but can still be 
used internally if required.


Can an un-managed reservation be used for this?

Thanks in advance.

Reshu.




Re: DUCC 1.1.0- How to Run two DUCC version on same machines with different user

2014-11-17 Thread reshu.agarwal

Hi,

I am getting this error in the job process as well as on login.

 arg[14]: org.apache.uima.ducc.common.main.DuccService
1001 Command launching...
/usr/java/jdk1.7.0_17/jre/bin/java: error while loading shared libraries: 
libjli.so: cannot open shared object file: No such file or directory


How do I resolve this?

Thanks in advance.
Reshu.





Re: can't remove duplicate Annotations with Java Set Collection

2014-11-17 Thread Richard Eckart de Castilho
On 17.11.2014, at 20:59, Kameron Cole wrote:

> I am trying to get rid of duplicates in the FSIndex.  I thought a very
> clever way to do this would be to just push them into a Set Collection in
> Java, which does not allow duplicates. This is very (very) standard Java:
> 
> ArrayList al = new ArrayList();
> // add elements to al, including duplicates
> HashSet hs = new HashSet();
> hs.addAll(al);
> al.clear();
> al.addAll(hs);

There is no universal definition of equality other than object equality. And 
this is what Java defaults to unless equals() and hashCode() are implemented.
Since each UIMA user might have a different opinion on what is equal, UIMA 
defers this decision to its indexing mechanism instead of hard-baking it into 
equals()/hashCode() methods.

I suggest you do the following:

- implement a Comparator for your annotation type according to your 
definition of equality

- create a TreeSet based on your comparator

- drop all your annotations into this TreeSet

- "duplicates" according to your definition are dropped. The rest is sorted (or 
not) depending on what your comparator returns in a non-equality case (return 
value != 0). 
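The recipe above can be sketched with plain Java collections. The Ann class below is a hypothetical stand-in for a UIMA Annotation (just the begin/end offsets and a type name), and the comparator's definition of equality (same type over the same span) is only one possible choice:

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class DedupAnnotations {

    // Stand-in for a UIMA annotation: just the fields we compare on.
    static final class Ann {
        final int begin;
        final int end;
        final String type;
        Ann(int begin, int end, String type) {
            this.begin = begin;
            this.end = end;
            this.type = type;
        }
    }

    // "Equal" means same span and same type; adjust to your own definition.
    static final Comparator<Ann> BY_SPAN_AND_TYPE =
            Comparator.comparingInt((Ann a) -> a.begin)
                      .thenComparingInt(a -> a.end)
                      .thenComparing(a -> a.type);

    // TreeSet drops elements the comparator deems equal; the survivors
    // come out sorted by the same comparator.
    static TreeSet<Ann> dedup(List<Ann> annotations) {
        TreeSet<Ann> unique = new TreeSet<>(BY_SPAN_AND_TYPE);
        unique.addAll(annotations);
        return unique;
    }

    public static void main(String[] args) {
        TreeSet<Ann> unique = dedup(List.of(
                new Ann(0, 4, "Person"),
                new Ann(0, 4, "Person"),   // duplicate by our definition -> dropped
                new Ann(5, 9, "Place")));
        System.out.println(unique.size()); // prints 2
    }
}
```

Note that, unlike the HashSet approach, this never touches equals()/hashCode(), so it works even though UIMA leaves those at object identity.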

Cheers,

-- Richard