[ https://issues.apache.org/jira/browse/UIMA-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285754#comment-14285754 ]
Marshall Schor commented on UIMA-3399:
--------------------------------------

Found a bunch more issues with offset handling; working on the fixes.

> More consistent handling of multiple add-to-index behavior for same Feature Structure
> -------------------------------------------------------------------------------------
>
>                 Key: UIMA-3399
>                 URL: https://issues.apache.org/jira/browse/UIMA-3399
>             Project: UIMA
>          Issue Type: Improvement
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>            Priority: Minor
>             Fix For: 2.7.0SDK
>
>
> UIMA has a somewhat unusual indexing architecture. You can define indexes
> (sorted, bag, set), and then add / remove a feature structure (FS) to / from
> all of the defined indexes.
> The design intention (I think) was to support the concept of a FS being
> indexed, or not. However, the current design allows some anomalies that
> behave inconsistently between code being run "locally" versus as remote
> services (due to how serialization handles this). Serialization encodes only
> the concept of a FS being either in an index or not.
> The problem arises in the edge case where the same identical FS is added to
> the indexes multiple times. For local (non-remote) cases, for bag and sorted
> indexes, the same exact FS would be added multiple times. This would have
> the following consequences:
> - Iterating would return multiple == FSs.
> - Removing a multiply-added FS from the indexes would reduce the number of
>   entries by 1; the FS would still be in the index unless the last remaining
>   one was removed.
> For the same code running remotely, serialization would have "collapsed" the
> multiple additions into one, so it would behave differently.
> This Jira changes the behavior of "add-to-index" so that subsequent
> add-to-indexes of the same identical FS are a no-op. To cover users who
> might be exploiting the old behavior, the JVM property
> "uima.allow_duplicate_add_to_indices", read when the UIMA classes are loaded,
> restores the previous behavior.
> Note that with this change, the UIMA "Set" index still has a distinct
> purpose, separate from the "Bag" index, because it defines Feature Structure
> equivalence based not on identity, but rather on specified key feature values
> being equal.
> This change better aligns how code running locally or remotely works.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
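
The add-to-index change described in the issue can be illustrated with a small toy model. This is only a sketch, not the UIMA implementation: the class `BagIndexSketch` and its methods are hypothetical names made up for this example, and treating the JVM property as "enabled whenever it is defined" is an assumption, since the issue text says only that the property is read when the UIMA classes are loaded.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a bag index (hypothetical; not part of the UIMA API). */
class BagIndexSketch {
    // Assumption: the property counts as "on" if it is defined at all.
    // Read once, at class-load time, as the issue describes.
    static final boolean ALLOW_DUPLICATES =
        System.getProperty("uima.allow_duplicate_add_to_indices") != null;

    private final boolean allowDuplicates;
    private final List<Object> entries = new ArrayList<>();

    BagIndexSketch(boolean allowDuplicates) {
        this.allowDuplicates = allowDuplicates;
    }

    /** Adds fs; with duplicates disallowed, re-adding the identical object is a no-op. */
    void add(Object fs) {
        if (!allowDuplicates && containsIdentical(fs)) {
            return;
        }
        entries.add(fs);
    }

    /** Removes one occurrence of the identical object; returns false if none was found. */
    boolean remove(Object fs) {
        for (int i = 0; i < entries.size(); i++) {
            if (entries.get(i) == fs) {
                entries.remove(i); // int argument: removes by position
                return true;
            }
        }
        return false;
    }

    /** Identity (==) membership test, matching the "same identical FS" wording. */
    boolean containsIdentical(Object fs) {
        for (Object e : entries) {
            if (e == fs) {
                return true;
            }
        }
        return false;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        Object fs = new Object();

        BagIndexSketch legacy = new BagIndexSketch(true);   // old behavior
        legacy.add(fs);
        legacy.add(fs);
        legacy.remove(fs);
        System.out.println(legacy.containsIdentical(fs));   // prints "true"

        BagIndexSketch current = new BagIndexSketch(false); // new behavior
        current.add(fs);
        current.add(fs);
        current.remove(fs);
        System.out.println(current.containsIdentical(fs));  // prints "false"
    }
}
```

With duplicates allowed, one remove after two adds leaves the FS indexed, which is the local behavior that serialization could not reproduce. With duplicates disallowed, "indexed or not" is a plain yes/no state, matching what serialization encodes. In real UIMA code the corresponding operations would be `CAS.addFsToIndexes` and `CAS.removeFsFromIndexes`; the model above only mimics their effect on a single bag index.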