[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-11-18 Thread Richard Eckart de Castilho (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677858#comment-15677858
 ] 

Richard Eckart de Castilho commented on UIMA-5106:
--

In WebAnno, we're using the FS address for this - it remains stable as long as 
we use SerialFormat.SERIALIZED_TSI or SerialFormat.SERIALIZED.

> uv3 constant "id" for FSs (Proposed new Feature for uv3)
> 
>
> Key: UIMA-5106
> URL: https://issues.apache.org/jira/browse/UIMA-5106
> Project: UIMA
>  Issue Type: New Feature
>  Components: Core Java Framework
> Reporter: Marshall Schor
> Priority: Minor
>
> Add constant ID for FSs. This would be an incrementing, long value. It would 
> be constant through serialization/ deserialization cycles. There would be a 
> lazily created map from longs to FSs (via weak links) to allow direct access 
> from the ID to the FS.  Lazy intent is to not have a cost for this 
> (space/time) other than the cost for 1 long / FS, if it is not used.
> We could make this feature optional, as well, to avoid the 8 bytes per FS 
> overhead, but in V3, I think that's not a good tradeoff (space savings vs 
> complexity).  
> Issues: 
> * Current design allows parallelism of services, with returned results 
> "stacked" into receiving CAS; would need to change (some of) the IDs coming 
> back.
> CAS would need to have the high-water-mark value as part of serializations.
> Backwards compatibility:
> * loading V2 CASs: generate new IDs upon loading.
> * serializing to V2: (for connecting to V2 services): drop the IDs.
> This is a proposed new V3 feature; comments appreciated.
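
The lazily created ID-to-FS map described in the issue text can be sketched 
roughly as follows. This is only an illustration under stated assumptions, not 
UIMA code: LazyFsIdRegistry, its nested Fs class, and the "reachable" supplier 
(standing in for a walk over the CAS indexes) are all hypothetical names.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical registry; none of these names come from the UIMA API. */
final class LazyFsIdRegistry {

    /** Hypothetical stand-in for a feature structure carrying a constant id. */
    static final class Fs {
        final long id;
        final String label;

        Fs(long id, String label) {
            this.id = id;
            this.label = label;
        }
    }

    private long highWaterMark = 0;              // last id handed out
    private Map<Long, WeakReference<Fs>> byId;   // stays null until the first lookup
    private final Supplier<Iterable<Fs>> reachable;

    /** reachable: assumed way to enumerate live FSs, e.g. by walking indexes. */
    LazyFsIdRegistry(Supplier<Iterable<Fs>> reachable) {
        this.reachable = reachable;
    }

    /** Creating an FS costs one long; the map is only touched if it already exists. */
    Fs create(String label) {
        Fs fs = new Fs(++highWaterMark, label);
        if (byId != null) {
            byId.put(fs.id, new WeakReference<>(fs));
        }
        return fs;
    }

    /** The first lookup builds the map; weak links do not keep FSs alive. */
    Fs get(long id) {
        if (byId == null) {
            byId = new HashMap<>();
            for (Fs fs : reachable.get()) {
                byId.put(fs.id, new WeakReference<>(fs));
            }
        }
        WeakReference<Fs> ref = byId.get(id);
        return ref == null ? null : ref.get();
    }

    boolean mapBuilt() {
        return byId != null;
    }

    /** Value that would have to travel with a serialized CAS. */
    long highWaterMark() {
        return highWaterMark;
    }

    public static void main(String[] args) {
        List<Fs> index = new ArrayList<>();
        LazyFsIdRegistry reg = new LazyFsIdRegistry(() -> index);
        index.add(reg.create("token"));
        index.add(reg.create("sentence"));
        System.out.println(reg.mapBuilt());      // no lookup has happened yet
        System.out.println(reg.get(2L).label);   // this lookup builds the map
    }
}
```

In this sketch an FS that was created before the first lookup and is not 
reachable through the supplier cannot be found by ID; whether that is 
acceptable is a design question the issue text leaves open.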



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-11-17 Thread Daniel Gruhl (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15674095#comment-15674095
 ] 

Daniel Gruhl commented on UIMA-5106:


In systems with persistent analytics (that is, where CASes are stored long term 
and incrementally annotated, often by humans), it is very helpful to have a 
stable UUID for a feature structure. For example, there may be a document in a 
CAS that is under analysis. Being able to refer to a span of that sofa and send 
it to a human for review or adjudication is very helpful. It also allows the 
use of CASes to hold "entity information", that is, frames of knowledge, or to 
represent higher-level concepts (e.g., a web site CAS can be pointed to by all 
its page CASes).

This was critical in large persistent UIMA systems such as WebFountain, and it 
would be nice to see it make its way into the standard.



[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-10-25 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605866#comment-15605866
 ] 

Marshall Schor commented on UIMA-5106:
--

# re: getting a blessed id feature that all FSs would inherit: I wasn't 
thinking of users adding a new feature on TOP or Annotation - you are correct 
in observing that this would be restricted to some user-defined type which 
wasn't one of the built-in "feature-final" types.  A key idea is that only 
selected types would have this - the ones you wanted it for.

# re: users wanting their own stable IDs - they can manage this themselves.  
True, but the dev list has had requests for UIMA to help here.  There were 2 
kinds of help wanted:  
#* a) assigning unique ids, and 
#* b) having a way to go from those IDs to the associated Feature Structures (a 
map).  

As we move into more distributed environments, having some principled way to 
have a hierarchical naming that results in guaranteed unique names seems 
useful; this context is where the OID idea came up.  But you may be right - 
this may not be of very much interest (yet) to the wider community.  (Community 
- if this is wrong, please speak up :-) ).

> uv3 constant "id" for FSs (Proposed new Feature for uv3)
> 
>
> Key: UIMA-5106
> URL: https://issues.apache.org/jira/browse/UIMA-5106
> Project: UIMA
>  Issue Type: New Feature
>  Components: Core Java Framework
> Reporter: Marshall Schor
> Priority: Minor
> Fix For: 3.0.0SDKexp
>
>
> Add constant ID for FSs. This would be an incrementing, long value. It would 
> be constant through serialization/ deserialization cycles. There would be a 
> lazily created map from longs to FSs (via weak links) to allow direct access 
> from the ID to the FS.  Lazy intent is to not have a cost for this 
> (space/time) other than the cost for 1 long / FS, if it is not used.
> We could make this feature optional, as well, to avoid the 8 bytes per FS 
> overhead, but in V3, I think that's not a good tradeoff (space savings vs 
> complexity).  
> Issues: 
> * Current design allows parallelism of services, with returned results 
> "stacked" into receiving CAS; would need to change (some of) the IDs coming 
> back.
> CAS would need to have the high-water-mark value as part of serializations.
> Backwards compatibility:
> * loading V2 CASs: generate new IDs upon loading.
> * serializing to V2: (for connecting to V2 services): drop the IDs.
> This is a proposed new V3 feature; comments appreciated.





[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-10-25 Thread Richard Eckart de Castilho (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605577#comment-15605577
 ] 

Richard Eckart de Castilho commented on UIMA-5106:
--

Hm, ok. So then I am not sure whether this feature is needed. If users want to 
assign their own stable IDs, they can create a feature for that and are good. 
But users cannot create new features on high-level types (e.g. on TOP or 
Annotation) - is it that you wanted to introduce a "blessed" external ID 
feature that all FSes would inherit?



[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-10-25 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605501#comment-15605501
 ] 

Marshall Schor commented on UIMA-5106:
--

V3 isn't planning to remove the existing IDs, which work as you mention above, 
so you could continue to use that.  It also currently assigns values and 
adjusts values coming back from parallel executing remote services so they 
remain unique.  None of this is planned to change, except it is somewhat likely 
the actual IDs will be changed from incrementing-by-1 to 
incrementing-exactly-like-v2-increments-them :-) .
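
A minimal sketch of the kind of adjustment meant here, assuming (purely for 
illustration) that IDs increment by 1 and that both services start numbering 
from the client's high-water mark. ParallelIdMerge and restack are hypothetical 
names, not UIMA APIs.

```java
import java.util.Arrays;

/** Hypothetical helper; not part of UIMA. */
final class ParallelIdMerge {

    /**
     * Both replies numbered their new FSs starting just above highWaterMark,
     * so their ids collide. The first reply keeps its ids; the second reply's
     * new FSs are renumbered to follow the first reply's highest id.
     */
    static int[] restack(int highWaterMark, int[] firstReplyNewIds, int[] secondReplyNewIds) {
        int next = highWaterMark;
        for (int id : firstReplyNewIds) {
            next = Math.max(next, id);
        }
        int[] adjusted = new int[secondReplyNewIds.length];
        for (int i = 0; i < adjusted.length; i++) {
            adjusted[i] = ++next;
        }
        return adjusted;
    }

    public static void main(String[] args) {
        int[] adjusted = restack(10, new int[] {11, 12}, new int[] {11, 12, 13});
        System.out.println(Arrays.toString(adjusted)); // [13, 14, 15]
    }
}
```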



[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-10-25 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605494#comment-15605494
 ] 

Marshall Schor commented on UIMA-5106:
--

Another user suggests we consider using OIDs: 
https://en.wikipedia.org/wiki/Object_identifier

The general use case is for future UIMA uses where Feature Structures are 
generated and stored in a potentially widely distributed manner.

This could solve this problem:
* a client sends a CAS to 2 services (for parallel processing), who both 
process it and return it
* the client (first thought) would adjust the unique IDs for one of the 
returned CAS's new feature structures.   This is actually done today for the 
internal IDs.

But, on reflection, we might imagine that the purpose for having the unique ID 
was to put that value into other features as well.  There is no reasonable way 
to find all those uses and re-adjust them as well, I think.

Using OIDs solves this, because they don't need adjusting.  It could be 
implemented along these lines:
* normal OIDs for new FSs would be, for instance ".1", ".2", ...
* OIDs for new FSs produced at a service from a client would have OIDs of .8.1, 
.8.2, ... for one service, and .9.1, .9.2 etc, for another. 
* These OIDs would never need adjusting.  
* The prefix (.8 .9, in the above example) could be generated by the client, 
and sent along with the CAS to each remote service call

Combining this with the facility to attach these IDs only to the subset of 
Feature Structures users want unique IDs for (using the reserved feature 
name, which we might call uimaOID), this feels like a good direction to 
consider, especially for use cases further in the future.
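
The prefix scheme above can be sketched as follows. This is an illustration 
only: OidAllocator and its methods are hypothetical names, and drawing the 
service prefix from the same counter as ordinary OIDs is an assumption made 
here so that a prefix is never reused.

```java
/** Hypothetical allocator for dotted OIDs; not part of UIMA. */
final class OidAllocator {
    private final String prefix;   // "" on the client, e.g. ".8" on a service
    private int next = 1;

    OidAllocator(String prefix) {
        this.prefix = prefix;
    }

    /** Next OID under this allocator's prefix: ".1", ".2", ... or ".8.1", ".8.2", ... */
    String nextOid() {
        return prefix + "." + next++;
    }

    /** Reserves the next local number as the prefix sent along to a remote service. */
    OidAllocator forService() {
        return new OidAllocator(nextOid());
    }

    public static void main(String[] args) {
        OidAllocator client = new OidAllocator("");
        for (int i = 0; i < 7; i++) {
            client.nextOid();                        // .1 through .7 for local FSs
        }
        OidAllocator first = client.forService();   // prefix .8
        OidAllocator second = client.forService();  // prefix .9
        System.out.println(first.nextOid());   // .8.1
        System.out.println(first.nextOid());   // .8.2
        System.out.println(second.nextOid());  // .9.1
    }
}
```

Because each service only ever allocates under its own prefix, the OIDs it 
returns need no adjusting, and copies of them stored in other features stay 
valid.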



[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-10-25 Thread Richard Eckart de Castilho (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605216#comment-15605216
 ] 

Richard Eckart de Castilho commented on UIMA-5106:
--

Being used to UIMA managing IDs, I would prefer that UIMA continues to manage 
the IDs and that it automatically and always assigns them - just like a 
database auto-increment primary key. I would also prefer if they continue to be 
kept separate from the feature space.

IMHO using an int ID should be sufficient. Int worked so far... long would be 
nice... but not really important.

IMHO all feature structures should have the ID, even the FSArray.



[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-10-25 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605203#comment-15605203
 ] 

Marshall Schor commented on UIMA-5106:
--

Thinking harder about this, I'd like to close this Jira as 
won't-do-it-this-way, and open a new one that changes the goal slightly to 
support a user-specified unique ID feature which could be added selectively to 
Feature Structure (FS) type declarations.

The main difference is that this allows users to specify which FS Types they 
want this additional ID on.  This allows other FSs to remain more light-weight.  
Some consequences:

* The built-in FSArray would not have this ID (it doesn't have fields).
* No space cost in FSs of this when not being used
* No space/time cost for doing the special indexing by id for FSs the user is 
not interested in (for example, the little FSs that make up the list cells in 
the various FSLists).

2 approaches come to mind:
# having a "reserved" feature name.  The user would declare this feature with 
range "long" on any FS where they wanted the unique ID
# letting users designate one or more features of type long to be a unique-id, 
using an API call.

The 2nd approach has some difficulties with type merging - the "application" 
consuming someone else's aggregate+typesystem may not know the other's 
assumptions about unique-id.

So I think the "reserved name" approach would be best.  Possible feature name: 
uimaBuiltInUID  or uimaUID (UIMA Unique ID).

Other thoughts welcome.  



[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-10-24 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15603636#comment-15603636
 ] 

Marshall Schor commented on UIMA-5106:
--

I was planning to use the existing _id() for this; making it a long turns out 
to cascade into a whole lot of work. I think this is designing ahead of need - 
so I plan to keep this as an int as it is now, until we see some real use 
requirement.




[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-09-17 Thread Richard Eckart de Castilho (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15498418#comment-15498418
 ] 

Richard Eckart de Castilho commented on UIMA-5106:
--

I understand, thanks.

Regarding the preservation of IDs across serialization: this is very useful. 

Maybe it should not always be mandatory. A user may intentionally want to 
"garbage collect" the ID space. E.g. right now with v2, I use a variant of 
SERIALIZED if I want to preserve IDs and COMPRESSED_FILTERED if I want to 
garbage-collect IDs (and FSes). I could imagine that with v3, the preservation 
of IDs could become a parameter to some serialization/deserialization formats.



[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-09-16 Thread Marshall Schor (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15498033#comment-15498033
 ] 

Marshall Schor commented on UIMA-5106:
--

Only because the low-level CAS address in v2 is not guaranteed to be 
preserved across the various serializations/deserializations.

This would be "elevating" this previously "internal use" value (which many 
users relied on, in spite of its lack of stability guarantees) to a more 
official and stable status.



[jira] [Commented] (UIMA-5106) uv3 constant "id" for FSs (Proposed new Feature for uv3)

2016-09-16 Thread Richard Eckart de Castilho (JIRA)

[ 
https://issues.apache.org/jira/browse/UIMA-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15497557#comment-15497557
 ] 

Richard Eckart de Castilho commented on UIMA-5106:
--

I thought the ID property (it is not a feature as in Feature Structure) 
resembles the LowLevelCas address in v2, so I'm not exactly sure why this is 
considered to be a new feature.
