Re: small memory footprint tradeoff configuration

2009-03-27 Thread Marshall Schor
Another way to reduce the footprint of UIMA: One user reported the basic UIMA framework as taking approx. 5 MB (not sure exactly what was measured). I investigated to see if UIMA might be loading more classes than needed. I found that at startup time, UIMA reads a factory configuration file and

Re: small memory footprint tradeoff configuration

2009-03-13 Thread Thilo Goetz
Marshall Schor wrote: I agree with both of these concepts: only GC'ing things which are not in the index and also not reachable from something that is in the index, and making GC'ing (mostly) automatic, based on thresholds, etc, when a component exits back to the framework. This would be

Re: small memory footprint tradeoff configuration

2009-03-13 Thread Adam Lally
On Fri, Mar 13, 2009 at 3:07 AM, Thilo Goetz twgo...@gmx.de wrote: Marshall Schor wrote: I agree with both of these concepts:  only GC'ing things which are not in the index and also not reachable from something that is in the index, and making GC'ing (mostly) automatic, based on thresholds,

Re: small memory footprint tradeoff configuration

2009-03-13 Thread Adam Lally
On Thu, Mar 12, 2009 at 3:49 PM, Eddie Epstein eaepst...@gmail.com wrote: On Thu, Mar 12, 2009 at 12:14 PM, Adam Lally ala...@alum.rpi.edu wrote: The next question is under what conditions would a GC execute. Requiring an explicit call seems counter to what other garbage collecting runtime

Re: small memory footprint tradeoff configuration

2009-03-13 Thread Eddie Epstein
On Fri, Mar 13, 2009 at 10:03 AM, Adam Lally ala...@alum.rpi.edu wrote: One user scenario that motivated this thread was that an aggregate designer knew exactly when they wanted to do GC. What is wrong with giving them a CAS method to call? Where would they call this method from?  An

Re: small memory footprint tradeoff configuration

2009-03-13 Thread Marshall Schor
Thilo Goetz wrote: Marshall Schor wrote: I agree with both of these concepts: only GC'ing things which are not in the index and also not reachable from something that is in the index, and making GC'ing (mostly) automatic, based on thresholds, etc, when a component exits back to the

Re: small memory footprint tradeoff configuration

2009-03-12 Thread Adam Lally
On Wed, Mar 11, 2009 at 8:53 AM, Marshall Schor m...@schor.com wrote: I agree in general about not making things more complicated at least to the user.  I can imagine education working for  1) things like string interning  2) things like deleting features from type systems where they're not

Re: small memory footprint tradeoff configuration

2009-03-12 Thread Thilo Goetz
Adam Lally wrote: On Wed, Mar 11, 2009 at 8:53 AM, Marshall Schor m...@schor.com wrote: I agree in general about not making things more complicated at least to the user. I can imagine education working for 1) things like string interning 2) things like deleting features from type systems

Re: small memory footprint tradeoff configuration

2009-03-12 Thread Marshall Schor
I agree with both of these concepts: only GC'ing things which are not in the index and also not reachable from something that is in the index, and making GC'ing (mostly) automatic, based on thresholds, etc, when a component exits back to the framework. This would be fine for now - if use cases
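
The rule in this message (collect only what is neither in the index nor reachable from something in the index) amounts to a mark phase that uses the indexed feature structures as roots. A minimal sketch follows, assuming feature structures are modelled as integer ids and references as an adjacency map. Both are illustrative assumptions, not UIMA's actual heap layout, and ReachabilitySketch is a hypothetical name rather than a UIMA class.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of the UIMA API.
final class ReachabilitySketch {

    /** Returns the ids that are indexed or reachable from an indexed id. */
    static Set<Integer> live(Set<Integer> indexed, Map<Integer, List<Integer>> refs) {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>(indexed); // roots = indexed ids
        while (!work.isEmpty()) {
            Integer id = work.pop();
            if (!marked.add(id)) {
                continue; // already visited
            }
            for (Integer target : refs.getOrDefault(id, Collections.emptyList())) {
                if (!marked.contains(target)) {
                    work.push(target);
                }
            }
        }
        return marked;
    }

    /** Returns the ids that could be collected: everything that is not live. */
    static Set<Integer> garbage(Set<Integer> all, Set<Integer> indexed,
                                Map<Integer, List<Integer>> refs) {
        Set<Integer> result = new HashSet<>(all);
        result.removeAll(live(indexed, refs));
        return result;
    }
}
```

Under this sketch, an object that is not indexed survives as long as some indexed object refers to it, directly or transitively. When the collection would be triggered (thresholds, component exit) is left out.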

Re: small memory footprint tradeoff configuration

2009-03-12 Thread Eddie Epstein
On Thu, Mar 12, 2009 at 12:14 PM, Adam Lally ala...@alum.rpi.edu wrote: The next question is under what conditions would a GC execute. Requiring an explicit call seems counter to what other garbage collecting runtime environments do, and like Thilo I'm confused about who would call this and

Re: small memory footprint tradeoff configuration

2009-03-12 Thread D.J. McCloskey
Subject: Re: small memory footprint tradeoff configuration

Re: small memory footprint tradeoff configuration

2009-03-11 Thread Thilo Goetz
Marshall Schor wrote: After reviewing the previous chain of discussion on this topic, I would like to start the next round, hopefully getting to some convergence :-). 1) On the topic of doing GC (garbage collection) versus copy to another CAS - GC is conceptually perhaps less complex - you

Re: small memory footprint tradeoff configuration

2009-03-11 Thread Marshall Schor
Thanks for your comments. Thilo Goetz wrote: Marshall Schor wrote: After reviewing the previous chain of discussion on this topic, I would like to start the next round, hopefully getting to some convergence :-). 1) On the topic of doing GC (garbage collection) versus copy to another CAS

Re: small memory footprint tradeoff configuration

2009-03-11 Thread Thilo Goetz
Marshall Schor wrote: [...] I agree that backward compatibility is important and is an issue. To help the transition to this new scheme, I think an overall global switch is needed (similar to the switches we have for JCas interning) that would by default make things work the way they do now.

Re: small memory footprint tradeoff configuration

2009-03-11 Thread Marshall Schor
Thilo Goetz wrote: Marshall Schor wrote: [...] I agree that backward compatibility is important and is an issue. To help the transition to this new scheme, I think an overall global switch is needed (similar to the switches we have for JCas interning) that would by default make things

Re: small memory footprint tradeoff configuration

2009-03-10 Thread Marshall Schor
After reviewing the previous chain of discussion on this topic, I would like to start the next round, hopefully getting to some convergence :-). 1) On the topic of doing GC (garbage collection) versus copy to another CAS - GC is conceptually perhaps less complex - you don't have multiple CASes

Re: small memory footprint tradeoff configuration

2009-02-25 Thread Eddie Epstein
On Tue, Feb 24, 2009 at 9:36 AM, Adam Lally ala...@alum.rpi.edu wrote: To address Eddie's point about Vinci services breaking FS handles already - I consider that a bug, so am not happy using that as a rationale to invalidate FS handles as a general policy.  And I'm worried that users who

Re: small memory footprint tradeoff configuration

2009-02-25 Thread Adam Lally
On Wed, Feb 25, 2009 at 9:07 AM, Eddie Epstein eaepst...@gmail.com wrote: It seems like Marshall's angle (if I understood it) is not really GC at all, but a model where an annotator decides to explicitly delete FS.  I could be okay with that idea, too.  A GC model by definition should preserve

Re: small memory footprint tradeoff configuration

2009-02-24 Thread Eddie Epstein
Eddie Epstein wrote: Process calls to a Vinci service have always broken FS references. Same for calls thru the compatibility wrapper that allows calling colocated UIMA 1.4x annotators from Apache UIMA. Actually, I think that the compatibility wrapper does preserve FS addresses because it

Re: small memory footprint tradeoff configuration

2009-02-24 Thread Adam Lally
On Tue, Feb 24, 2009 at 2:53 AM, Thilo Goetz twgo...@gmx.de wrote: I have found the discussion again that I was referring to.  It wasn't on this list, it was in the OASIS spec discussions.  Sorry about the confusion.  I don't feel at liberty to publish that conversation here, but maybe Adam

Re: small memory footprint tradeoff configuration

2009-02-23 Thread Thilo Goetz
Marshall Schor wrote: Thilo Goetz wrote: Marshall Schor wrote: One of the ideas for GC was to change the basic heap design to use Java objects for feature structures. I'm thinking of some kind of explicit GC, called by the user, at a point where they know a bunch of objects is no longer

Re: small memory footprint tradeoff configuration

2009-02-23 Thread Marshall Schor
Thilo Goetz wrote: Marshall Schor wrote: Thilo Goetz wrote: Marshall Schor wrote: One of the ideas for GC was to change the basic heap design to use Java objects for feature structures. I'm thinking of some kind of explicit GC, called by the user, at a point where

Re: small memory footprint tradeoff configuration

2009-02-23 Thread Eddie Epstein
On Mon, Feb 23, 2009 at 12:10 PM, Thilo Goetz twgo...@gmx.de wrote: So the users are supposed to figure out if they need internal IDs? I don't think that's a good idea. Either we make guarantees about things like references into the CAS surviving calls to process(), or we don't. Process

Re: small memory footprint tradeoff configuration

2009-02-23 Thread Thilo Goetz
Eddie Epstein wrote: On Mon, Feb 23, 2009 at 12:10 PM, Thilo Goetz twgo...@gmx.de wrote: So the users are supposed to figure out if they need internal IDs? I don't think that's a good idea. Either we make guarantees about things like references into the CAS surviving calls to process(), or

Re: small memory footprint tradeoff configuration

2009-02-21 Thread Thilo Goetz
Eddie Epstein wrote: On Fri, Feb 20, 2009 at 7:23 AM, Thilo Goetz twgo...@gmx.de wrote: It would change the internal IDs of FSs, which was always a big no-no for some people. True, ID's would change, but this would be documented behavior, and there should be no problem if an annotator

Re: small memory footprint tradeoff configuration

2009-02-20 Thread Thilo Goetz
Marshall Schor wrote: Some users are beginning to ask for the ability to shift the internal tradeoffs UIMA takes toward having a smaller memory footprint, at some cost in performance. Several areas in particular have come up: 1) interning string objects, so that only one copy exists

Re: small memory footprint tradeoff configuration

2009-02-20 Thread Marshall Schor
One of the ideas for GC was to change the basic heap design to use Java objects for feature structures. I'm thinking of some kind of explicit GC, called by the user, at a point where they know a bunch of objects is no longer needed (because they've just deleted things out of the index, for

Re: small memory footprint tradeoff configuration

2009-02-20 Thread Thilo Goetz
Marshall Schor wrote: One of the ideas for GC was to change the basic heap design to use Java objects for feature structures. I'm thinking of some kind of explicit GC, called by the user, at a point where they know a bunch of objects is no longer needed (because they've just deleted things

Re: small memory footprint tradeoff configuration

2009-02-20 Thread Marshall Schor
Thilo Goetz wrote: Marshall Schor wrote: One of the ideas for GC was to change the basic heap design to use Java objects for feature structures. I'm thinking of some kind of explicit GC, called by the user, at a point where they know a bunch of objects is no longer needed (because

small memory footprint tradeoff configuration

2009-02-18 Thread Marshall Schor
Some users are beginning to ask for the ability to shift the internal tradeoffs UIMA takes toward having a smaller memory footprint, at some cost in performance. Several areas in particular have come up: 1) interning string objects, so that only one copy exists 2) having some way to compact
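
Item 1 above (interning string objects so that only one copy exists) can be illustrated with a small canonicalizing pool. This is a minimal sketch under stated assumptions: StringPool is a hypothetical helper, not a UIMA class, and it says nothing about how the framework itself stores string feature values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the UIMA API.
final class StringPool {
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the canonical instance equal to s, registering s if its value is new. */
    String intern(String s) {
        if (s == null) {
            return null;
        }
        String existing = pool.putIfAbsent(s, s); // null when s was not yet pooled
        return existing != null ? existing : s;
    }

    /** Number of distinct string values currently held. */
    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        StringPool p = new StringPool();
        String a = p.intern(new String("NounPhrase"));
        String b = p.intern(new String("NounPhrase"));
        // Both variables now refer to one shared instance.
        System.out.println(a == b);
    }
}
```

The tradeoff matches the one named in the message: each intern call costs a hash lookup, in exchange for holding one copy of every repeated value. A per-pool map is used here instead of String.intern() so the pool can be discarded as a whole; that is a design choice of this sketch, not something stated in the thread.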