[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16295557#comment-16295557 ]
Marshall Schor edited comment on UIMA-5662 at 12/18/17 7:46 PM:
----------------------------------------------------------------

Here's my thinking on the way forward at this point.

# Add a mode. This mode will be on a CAS instance, not a thread local, since the (one) user of this prefers that. If other users ask for the thread-local approach, we could do that too (with errors signaled if they didn't agree, or something like that).
# The mode can be set by default using a system property, and also by some methods on the LowLevelCas instance, including the use of AutoCloseable to support try-with-resources.
# If the mode is set:
#* all new FSs are added to the ll_getFSForRef table, including those created via deserialization
#* deserializations are modified to use explicit or imputed FSids. IDs for new items start after the highest deserialized one.
#* serializations are changed to include FSs only reachable via the ll_getFSForRef table.

Implications of this mode being on:
* FSs in the ref table can't be GC'd
* There is no way to remove these FSs from the ref table (this might be added later, if needed)
* FSids ought to be stable across many different serializations/deserializations
* CasCopy of an entire CAS is not guaranteed to preserve IDs. If needed, make this request in a new Jira.

WDYT?

was (Author: schor):
Here's my thinking on the way forward at this point.

# Add a mode. This mode will be on a CAS instance, not a thread local, since the (one) user of this prefers that. If other users ask for the thread-local approach, we could do that too (with errors signaled if they didn't agree, or something like that).
# If the mode is set:
#* all new FSs are added to the ll_getFSForRef table, including those created via deserialization
#* deserializations are modified to use explicit or imputed FSids. IDs for new items start after the highest deserialized one.
#* serializations are changed to include FSs only reachable via the ll_getFSForRef table.

Implications of this mode being on:
* FSs in the ref table can't be GC'd
* There is no way to remove these FSs from the ref table (this might be added later, if needed)
* FSids ought to be stable across many different serializations/deserializations
* CasCopy of an entire CAS is not guaranteed to preserve IDs. If needed, make this request in a new Jira.

WDYT?

> uv3 support CAS deserialization subsequent low level access
> -----------------------------------------------------------
>
>                 Key: UIMA-5662
>                 URL: https://issues.apache.org/jira/browse/UIMA-5662
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 3.0.0SDK-beta
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>            Priority: Minor
>             Fix For: 3.0.0SDK
>
>
> Some users depend on 1) constant v2 ids for FSs, preserved across deserialization and
> serialization, and 2) low-level CAS API access to these.
> V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak
> refs are used) prevent GC of unreachable FSs.
> Based on a mode, set by -Duima.deserialize_preserve_ids, and also
> controllable by a new config option per deserialize call, alter the
> deserialization for those deserializers which know about v2 ids, to put these
> into the map used for low-level CAS access, using the actual v2 ids, and
> change the v3 next available id for future new FSs to be 1 beyond the end.
> The -Duima.deserialize-preserve_ids global setting is needed to handle the
> use case of some annotators using low-level APIs, when part of a pipeline is
> "remoted".

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
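
For illustration only, the proposed mode could be modeled roughly as below. This is a toy sketch, not the UIMA API: the names SketchCas, SketchFs, ModeScope, withPreservedIds, deserializeFs and getFsForRef, and the system property sketch.preserve_ids, are all hypothetical. It only tries to show the three behaviours discussed in the comment: a per-CAS-instance mode that can be scoped with try-with-resources, a ref table that holds every FS created while the mode is on, and new ids starting after the highest deserialized one.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a feature structure; not the UIMA FeatureStructure type. */
final class SketchFs {
    final int id;
    final String label;

    SketchFs(int id, String label) {
        this.id = id;
        this.label = label;
    }
}

/** Toy model of a CAS with a per-instance id-preserving mode; not the UIMA API. */
final class SketchCas {
    /** Scope handle whose close() throws no checked exception. */
    interface ModeScope extends AutoCloseable {
        @Override
        void close();
    }

    // Plays the role of the ll_getFSForRef table: strong references, so entries are never GC'd.
    private final Map<Integer, SketchFs> refTable = new HashMap<>();
    private boolean preserveIds;
    private int nextId = 1;

    SketchCas(boolean preserveIdsDefault) {
        this.preserveIds = preserveIdsDefault;
    }

    /** Turns the mode on; closing the returned scope restores the previous setting. */
    ModeScope withPreservedIds() {
        final boolean previous = preserveIds;
        preserveIds = true;
        return () -> preserveIds = previous;
    }

    /** Creates a new FS with the next free id; registers it in the table only if the mode is on. */
    SketchFs createFs(String label) {
        SketchFs fs = new SketchFs(nextId++, label);
        if (preserveIds) {
            refTable.put(fs.id, fs);
        }
        return fs;
    }

    /** With the mode on, keeps the explicit id and moves nextId past the highest id seen. */
    SketchFs deserializeFs(int explicitId, String label) {
        if (!preserveIds) {
            return createFs(label); // mode off: the incoming id is ignored
        }
        SketchFs fs = new SketchFs(explicitId, label);
        refTable.put(explicitId, fs);
        nextId = Math.max(nextId, explicitId + 1);
        return fs;
    }

    /** Looks up an FS by id; returns null if it was never registered. */
    SketchFs getFsForRef(int id) {
        return refTable.get(id);
    }
}

final class IdModeDemo {
    public static void main(String[] args) {
        // Hypothetical property name, standing in for the global default discussed above.
        SketchCas cas = new SketchCas(Boolean.getBoolean("sketch.preserve_ids"));
        try (SketchCas.ModeScope scope = cas.withPreservedIds()) {
            cas.deserializeFs(7, "a");
            cas.deserializeFs(42, "b");
            SketchFs fresh = cas.createFs("c");
            System.out.println("new FS id: " + fresh.id);
            System.out.println("lookup of 42: " + cas.getFsForRef(42).label);
        }
    }
}
```

The sketch never removes entries from the table, which mirrors the implication listed above that FSs in the ref table can't be GC'd while the mode is in use.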