[ 
https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292786#comment-16292786
 ] 

Marshall Schor commented on UIMA-5662:
--------------------------------------

I think an essential point Richard was making is that it would be nice for 
UIMA to support addressing FSs by an int key, with the following properties:
1) the int keys would be "stable", like fs-ids, and preserved across 
serialization/deserialization
2) it would be "automatic" or "built-in" (but could be under the control of 
some enabling switch)
3) it would support additional APIs to allow better management in v3, 
including "deleting" entries from the int -> FS map that remembers them.

The current access API for this is the CAS low-level API, using the 
getFSForRef kinds of calls.  It would be nice to keep this for backwards 
compatibility.

Here's a proposal, in 2 parts: one for deserializers, and one for the "access 
API", that's more "built-in" than the previous proposal.

# under control of an enabler switch, have deserializers create FSs having the 
same id as in the serialized form; also, add all created FSs to the low-level 
CAS's built-in internal ref-to-FS map, which is already present, to support 
low-level CAS getFSForRef calls.
# add to the low-level API the ability to remove an int -> FS entry, or to 
"clear" the entire map, to allow FSs to be reclaimed by GC
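
To make the two parts above concrete, here is a minimal sketch of the map 
behavior being proposed. It is not UIMA code: the class name IdToFsMap and 
all of its method names are hypothetical, and the type parameter F only 
stands in for a feature structure type. The rule that newly assigned ids 
start 1 beyond the highest deserialized id is taken from the issue 
description.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not UIMA code): an int -> FS map that keeps
 * deserialized ids stable and lets callers release entries for GC.
 * The type parameter F stands in for a feature structure type.
 */
class IdToFsMap<F> {
    private final Map<Integer, F> byId = new HashMap<>();
    private int nextId = 1; // next id handed out to newly created FSs

    /** Register an FS under the id it had in the serialized form. */
    void putDeserialized(int serializedId, F fs) {
        byId.put(serializedId, fs);
        // Keep future ids beyond every deserialized id seen so far.
        if (serializedId >= nextId) {
            nextId = serializedId + 1;
        }
    }

    /** Register a newly created FS and return its freshly assigned id. */
    int putNew(F fs) {
        int id = nextId++;
        byId.put(id, fs);
        return id;
    }

    /** Look up by id; returns null if the id is unknown or was removed. */
    F get(int id) {
        return byId.get(id);
    }

    /** Drop one mapping so the FS can be reclaimed once otherwise unreachable. */
    F remove(int id) {
        return byId.remove(id);
    }

    /** Drop all mappings; id assignment continues from where it was. */
    void clear() {
        byId.clear();
    }
}
```

In this sketch, clear() deliberately does not reset the id counter, so an id 
that was visible before the clear is never reused for a different FS. Whether 
the real implementation should behave that way is an open design question.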

Details: 
# enabler switch: a ThreadLocal-style parameter, with its default set from a -D 
system property.  Using a ThreadLocal allows keeping APIs unmodified.
# the deserializers covered could include those for forms having imputed FS 
addresses, as well as explicit ones, so this could work for more than just, 
say, cascomplete style.
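
As an illustration of the enabler switch, here is a minimal sketch of a 
per-thread flag whose default comes from a -D system property. It is not 
UIMA code: the class name PreserveIdsSwitch, its methods, and the property 
name example.preserve_ids are all made up for this example.

```java
/**
 * Hypothetical sketch (not UIMA code): a per-thread enabler switch whose
 * default comes from a -D system property.
 */
final class PreserveIdsSwitch {
    // Assumed property name, for illustration only.
    static final String PROPERTY = "example.preserve_ids";

    // Each thread starts from the system property value, read on first use.
    private static final ThreadLocal<Boolean> ENABLED =
        ThreadLocal.withInitial(() -> Boolean.getBoolean(PROPERTY));

    private PreserveIdsSwitch() {}

    /** Current setting for the calling thread. */
    static boolean isEnabled() {
        return ENABLED.get();
    }

    /** Run an action with the switch set, restoring the previous value after. */
    static void runWith(boolean enabled, Runnable action) {
        boolean previous = ENABLED.get();
        ENABLED.set(enabled);
        try {
            action.run();
        } finally {
            ENABLED.set(previous);
        }
    }
}
```

A deserializer could consult PreserveIdsSwitch.isEnabled() internally, which 
is why no existing method signature would need to change. A caller that wants 
the behavior for a single call would wrap that call in runWith(true, ...).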

Does this sound more aligned with your use case?  WDYT?

> uv3 support CAS deserialization subsequent low level access
> -----------------------------------------------------------
>
>                 Key: UIMA-5662
>                 URL: https://issues.apache.org/jira/browse/UIMA-5662
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 3.0.0SDK-beta
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>            Priority: Minor
>             Fix For: 3.0.0SDK
>
>
> Some users depend on 1) constant v2-ids for FSs, preserved across 
> deserialization and serialization, and 2) low-level CAS API access to these.
> V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak 
> refs are used) prevent GC of unreachable FSs.
> Based on a mode, set by -Duima.deserialize_preserve_ids, and also 
> controllable by new config option per deserialize call, alter the 
> deserialization for those deserializers which know about v2 ids, to put these 
> into the map used for low-level CAS access, using the actual v2 ids, and 
> change the v3 next available id for future new FSs to be 1 beyond the end.
> The -Duima.deserialize-preserve_ids global setting is needed to handle the 
> use case of some annotators using low-level APIs, when part of a pipeline is 
> "remoted". 



