Re: small memory footprint tradeoff configuration

2009-02-24 Thread Eddie Epstein
Eddie Epstein wrote:
> Process calls to a Vinci service have always broken FS references.
> Same for calls thru the compatibility wrapper that allows calling
> colocated UIMA 1.4x annotators from Apache UIMA.

Actually, I think that the compatibility wrapper does preserve FS
addresses because it uses binary serialization. But Vinci definitely
does not.

Eddie


Re: small memory footprint tradeoff configuration

2009-02-24 Thread Adam Lally
On Tue, Feb 24, 2009 at 2:53 AM, Thilo Goetz <twgo...@gmx.de> wrote:
> I have found the discussion again that I was referring to.  It wasn't
> on this list, it was in the OASIS spec discussions.  Sorry about the
> confusion.  I don't feel at liberty to publish that conversation here,
> but maybe Adam would like to comment?  He and I were debating this at
> the time (nearly two years ago).


I'm not sure about what OASIS discussion you mean (is it about xmi:id
consistency?), but I thought the link that Marshall posted was a
reasonable summary of the discussion, including the concerns that I
had:
http://markmail.org/thread/aolbz4nrvmgjhuyb.

The only sticking point I was really concerned about was the
invalidation of the FS handle held by an application.  But, it was
definitely not my intention to shoot down any work in this area (in
fact you'll see in that email thread where I explicitly said I'm in
favor of doing something in this space).  I just want to discuss it
and see if we can come to a mutually acceptable plan.

To address Eddie's point about Vinci services breaking FS handles
already - I consider that a bug, so am not happy using that as a
rationale to invalidate FS handles as a general policy.  And I'm
worried that users who haven't been using Vinci services (I bet we
have plenty of those) have built applications that rely on this
behavior.  I remember suggesting that we post on the user list about
this, but am not sure if we ever did.

If you do a GC approach, is there not any way to include
application-created FeatureStructures as part of the root set?  Or
to look at it another way, the set of FSs that you do the GC over is
only those created since the CAS was input to the current AE (possibly
an aggregate).
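
To make the root-set idea concrete, here is a minimal sketch of a mark-and-sweep pass in which every FS that already existed when the CAS was handed to the current AE is treated as a root. This is not UIMA code: the FS class, the collect function, and the created_since_input set are all hypothetical stand-ins, and a real CAS heap is laid out quite differently.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality and hashing, like object handles
class FS:
    """Hypothetical stand-in for a feature structure."""
    name: str
    refs: list = field(default_factory=list)


def collect(heap, indexed, created_since_input):
    """Sweep only unreachable FSs that were created since the CAS was input.

    Roots are the indexed FSs plus every FS that predates the current AE,
    so any handle the application already holds stays valid.
    """
    roots = list(indexed) + [fs for fs in heap if fs not in created_since_input]
    reachable = set()
    stack = list(roots)
    while stack:  # mark phase: follow references from the roots
        fs = stack.pop()
        if fs in reachable:
            continue
        reachable.add(fs)
        stack.extend(fs.refs)
    # sweep phase: keep heap order, drop whatever was never marked
    return [fs for fs in heap if fs in reachable]


app = FS("app")                  # created by the application, not indexed
child = FS("child")
kept = FS("kept", refs=[child])  # created by the AE and indexed
tmp = FS("tmp")                  # created by the AE, never indexed or referenced
heap = [app, kept, child, tmp]
survivors = collect(heap, indexed=[kept], created_since_input={kept, child, tmp})
print([fs.name for fs in survivors])  # ['app', 'kept', 'child']
```

Only tmp is reclaimed here; app survives even though nothing references it, which is the property I care about.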

It seems like Marshall's angle (if I understood it) is not really GC
at all, but a model where an annotator decides to explicitly delete
FS.  I could be okay with that idea, too.  A GC model by definition
should preserve any referenced FSs, but if we say we have an explicit
deletion model where anybody can delete anyone else's stuff, at least
we won't confuse people about what's going on.  Current applications
that use existing annotators would not break (because the annotators
would not delete anything), and if a new annotator is introduced that
breaks the application, it's the annotator's fault for being too
aggressive in deleting stuff that someone else might still need.

-Adam


[jira] Commented: (UIMA-1245) Processing order of parent CAS different on UIMA and UIMA AS

2009-02-24 Thread Burn Lewis (JIRA)

[ https://issues.apache.org/jira/browse/UIMA-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676308#action_12676308 ]

Burn Lewis commented on UIMA-1245:
----------------------------------

To summarize my understanding of recent discussions ...

First I'd like to suggest that the default should not change.  Processing the 
parent last does not guarantee that UIMA-AS will act like core UIMA ... in 
addition the size of all downstream pools must be set to 1 to ensure that each 
child is processed sequentially.  We should document the settings needed for 
UIMA-like processing but I think the default should be UIMA-AS style 
processing, i.e. processParentLast=false.

With the current design parents are held in the final step of an aggregate 
until all children have completed processing in that aggregate.  This ensures 
that any child errors can be reported on the input CAS, and that aggregate CMs 
satisfy the CM contract of not processing the parent until all children have 
been returned.  If this aggregate is nested in another, the same conditions 
hold at the final step of the outer aggregate.

But with this new processParentLast=true option the parent must be held after 
the CM until all of its children have completed processing in all aggregates, 
i.e. have been returned to their pool.  Unlike the previous case we must track 
the number of children active in any of the nested aggregates.
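
The difference between the two release conditions can be sketched as a small simulation. This is only an illustration under simplifying assumptions: the event names ("created", "cm_done", "left_aggregate", "returned_to_pool") and the release_point function are hypothetical and do not correspond to UIMA AS internals.

```python
def release_point(events, process_parent_last):
    """Return the index of the event after which the parent may proceed.

    events is a list of (cas_id, kind) tuples. With process_parent_last=False
    a child stops counting once it leaves the aggregate; with
    process_parent_last=True it counts until it is returned to its pool.
    """
    done_kind = "returned_to_pool" if process_parent_last else "left_aggregate"
    pending = set()   # children still holding the parent back
    cm_done = False   # has the CM returned the parent yet?
    for i, (cas_id, kind) in enumerate(events):
        if kind == "created":
            pending.add(cas_id)
        elif kind == "cm_done":
            cm_done = True
        elif kind == done_kind:
            pending.discard(cas_id)
        if cm_done and not pending:
            return i
    return None  # parent is still held at the end of the log


events = [
    ("c1", "created"),
    ("c2", "created"),
    ("p", "cm_done"),
    ("c1", "left_aggregate"),
    ("c2", "left_aggregate"),
    ("c1", "returned_to_pool"),
    ("c2", "returned_to_pool"),
]
print(release_point(events, process_parent_last=False))  # 4
print(release_point(events, process_parent_last=True))   # 6
```

In the second case the parent stays held while the children travel through outer aggregates, which is why the count has to span all nesting levels.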

 Processing order of parent CAS different on UIMA and UIMA AS
 ------------------------------------------------------------

                 Key: UIMA-1245
                 URL: https://issues.apache.org/jira/browse/UIMA-1245
             Project: UIMA
          Issue Type: Bug
          Components: Async Scaleout
            Reporter: Eddie Epstein

 Arron Kaplan raised the question of when parent CASes are processed relative 
 to their children (see http://markmail.org/message/5cop7iv2nshouhgs). As of 
 now, the processing order for a multi-threaded UIMA AS aggregate is different 
 from that for a single-threaded UIMA aggregate.
 A discussion with Burn, Adam, Jerry, Marshall and myself concluded that the 
 default processing order for UIMA AS should be changed to be the same as in 
 UIMA, in order to have the same application behavior for both. This will be 
 done by suspending flow of a parent CAS after it is returned from a 
 CasMultiplier delegate until all its child CASes have finished processing.
 However, there also needs to be a UIMA AS deployment option for CasMultiplier 
 delegates that allows the parent CAS to resume processing immediately after 
 being returned from the CM. This option is needed to enable parallel 
 processing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (UIMA-1294) Enable access of service's ipaddr from process Cas replies

2009-02-24 Thread Eddie Epstein (JIRA)
Enable access of service's ipaddr from process Cas replies
----------------------------------------------------------

                 Key: UIMA-1294
                 URL: https://issues.apache.org/jira/browse/UIMA-1294
             Project: UIMA
          Issue Type: Improvement
          Components: Async Scaleout
            Reporter: Eddie Epstein
            Assignee: Eddie Epstein
            Priority: Minor


Process Cas reply messages contain the service's host ipaddr, but there is no 
mechanism to retrieve this info. Also, it would be nice for the sample program 
RunRemoteAsyncAE to show how to access this info and display it.




[jira] Commented: (UIMA-1293) Replies from remote CasMultipliers that don't always generate CASes are not handled correctly

2009-02-24 Thread Jerry Cwiklik (JIRA)

[ https://issues.apache.org/jira/browse/UIMA-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676422#action_12676422 ]

Jerry Cwiklik commented on UIMA-1293:
-------------------------------------

A remote (primitive) CAS multiplier's process() method returns an input CAS 
back to the client only if ALL CASes produced by the CM, for any input CAS, 
have been released. This is wrong. The correct logic is to release the input 
CAS when all *its* children have been released. Modified the code to use a 
child count associated with the input CAS when deciding whether or not to send 
the input CAS back to the client.
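
As a rough illustration of the per-parent bookkeeping (not the actual patch), the sketch below keeps one child count per input CAS instead of a single count for the whole CM. The ChildTracker class and its method names are hypothetical.

```python
class ChildTracker:
    """Hypothetical per-input-CAS child counter for a CAS multiplier."""

    def __init__(self):
        self._outstanding = {}  # parent id -> number of unreleased children
        self._finished = set()  # parents whose process() call has completed

    def child_produced(self, parent_id):
        self._outstanding[parent_id] = self._outstanding.get(parent_id, 0) + 1

    def process_finished(self, parent_id):
        """Record that process() is done; True means the parent can go back now."""
        self._finished.add(parent_id)
        return self._can_return(parent_id)

    def child_released(self, parent_id):
        """Record a released child; True means the parent can go back now."""
        self._outstanding[parent_id] -= 1
        return self._can_return(parent_id)

    def _can_return(self, parent_id):
        # Only this parent's own children matter, not the CM-wide total.
        return (parent_id in self._finished
                and self._outstanding.get(parent_id, 0) == 0)


tracker = ChildTracker()
tracker.child_produced("A")            # input A has one child in flight
print(tracker.process_finished("A"))   # False: A's child is not released yet
print(tracker.process_finished("B"))   # True: B is childless, A's child is irrelevant
print(tracker.child_released("A"))     # True: A's last child is back
```

With a single CM-wide count, B would have been stuck behind A's child, which is the symptom described in this issue.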


 Replies from remote CasMultipliers that don't always generate CASes are not 
 handled correctly
 ---------------------------------------------------------------------------

                 Key: UIMA-1293
                 URL: https://issues.apache.org/jira/browse/UIMA-1293
             Project: UIMA
          Issue Type: Bug
          Components: Async Scaleout
            Reporter: Burn Lewis
             Fix For: 2.3

         Attachments: UIMA1293.patch


 If a remote CM generates 1 CAS for every N inputs, some of the childless 
 parents do not continue in the flow. Since the default FC uses 
 dropIfNewCasProduced, all CASes should continue in the flow, with only every 
 N-th one being replaced by its child.
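
The expected behavior can be written down as a tiny model. This is a sketch only: it assumes the CM emits exactly one child on every N-th input, and the expected_flow function and the "child-of-" naming are made up for illustration.

```python
def expected_flow(inputs, n):
    """Return the CASes that should continue in the flow after the CM.

    Models a drop-if-new-CAS-produced policy: a parent that produced a child
    is replaced by that child, and every childless parent continues unchanged.
    """
    continuing = []
    for position, cas in enumerate(inputs, start=1):
        if position % n == 0:
            continuing.append(f"child-of-{cas}")  # parent dropped, child continues
        else:
            continuing.append(cas)                # childless parent continues
    return continuing


print(expected_flow(["p1", "p2", "p3", "p4", "p5", "p6"], 3))
# ['p1', 'p2', 'child-of-p3', 'p4', 'p5', 'child-of-p6']
```

The bug is that some of the childless parents (p1, p2, p4, p5 in this model) never show up downstream.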




[jira] Closed: (UIMA-1194) JMX stats for UIMA AS seem inconsistent

2009-02-24 Thread Jerry Cwiklik (JIRA)

 [ 
https://issues.apache.org/jira/browse/UIMA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Cwiklik closed UIMA-1194.
-------------------------------

Resolution: Fixed

 JMX stats for UIMA AS seem inconsistent
 ---------------------------------------

                 Key: UIMA-1194
                 URL: https://issues.apache.org/jira/browse/UIMA-1194
             Project: UIMA
          Issue Type: Bug
          Components: Async Scaleout
            Reporter: Jerry Cwiklik

 The aggregate's JMX stats for a remote delegate seem different from those 
 shown by the delegate's own JMX stats. Specifically, analysis times are 
 different. These numbers should be the same in both. It appears that the 
 numbers shown in the delegate's stats are always larger.
