[jira] [Created] (UIMA-5704) uv3 edge case failures in setting up JCas
Marshall Schor created UIMA-5704: Summary: uv3 edge case failures in setting up JCas Key: UIMA-5704 URL: https://issues.apache.org/jira/browse/UIMA-5704 Project: UIMA Issue Type: Bug Components: Core Java Framework Reporter: Marshall Schor Assignee: Marshall Schor Fix For: 3.0.0SDK After adding the capability to merge from JCas class definitions into existing type systems, some edge cases appeared, causing failures or replication in conformance testing and in setting up the offsets in JCas classes when loading them with type systems. Rework to remove redundant checking, and ensure offsets are set up in all cases. Use a simplified supertype/superclass validity checker. Remove the 2nd map used just for the class loader associated with PEARs (not needed). Make the type2jcci map only have entries for types having loaded JCas classes (others return null). Ensure updateOrValidateAllCallSitesForJCasClass is called for each combination of a type system and class loader, to either set up or validate the offsets. Fix a bug in setting the threadlocal used for a previous (alpha) version of some JCas class impl, in case those JCas classes are hanging around (one is, in a PEAR test case). Skip conformance checking for built-ins. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (UIMA-5661) uv3 Ref doc: add all -D settings
[ https://issues.apache.org/jira/browse/UIMA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5661. -- Resolution: Fixed > uv3 Ref doc: add all -D settings > > > Key: UIMA-5661 > URL: https://issues.apache.org/jira/browse/UIMA-5661 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > -D settings for some v3 things are missing from the v2 chapter having the > table of these. Example -Duima.v2_pretty_print_format > Scan code and add all missing ones. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5655) uv3 migrate to current versions of slf4j and log4j and other dependencies
[ https://issues.apache.org/jira/browse/UIMA-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5655. -- Resolution: Fixed > uv3 migrate to current versions of slf4j and log4j and other dependencies > - > > Key: UIMA-5655 > URL: https://issues.apache.org/jira/browse/UIMA-5655 > Project: UIMA > Issue Type: Improvement >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > move to log4j 2.10.0 > move to slf4j 1.7.25 > move to jackson-core 2.9.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (UIMA-5661) uv3 Ref doc: add all -D settings
[ https://issues.apache.org/jira/browse/UIMA-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5661: - Description: -D settings for some v3 things are missing from the v2 chapter having the table of these. Example -Duima.v2_pretty_print_format Scan code and add all missing ones. was: -D settings for some v3 things are missing from the v2 chapter having the table of these. Example -Duima.42_pretty_print_format Scan code and add all missing ones. > uv3 Ref doc: add all -D settings > > > Key: UIMA-5661 > URL: https://issues.apache.org/jira/browse/UIMA-5661 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > -D settings for some v3 things are missing from the v2 chapter having the > table of these. Example -Duima.v2_pretty_print_format > Scan code and add all missing ones. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (UIMA-5653) uv3 update trace logging using defined markers
[ https://issues.apache.org/jira/browse/UIMA-5653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5653: - Fix Version/s: (was: 3.0.0SDK) > uv3 update trace logging using defined markers > -- > > Key: UIMA-5653 > URL: https://issues.apache.org/jira/browse/UIMA-5653 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > > The docs define some tracing markers. Add tracing logging using these > markers. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5698) uv3 create type systems with JCas features merged in
[ https://issues.apache.org/jira/browse/UIMA-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5698. -- Resolution: Fixed > uv3 create type systems with JCas features merged in > > > Key: UIMA-5698 > URL: https://issues.apache.org/jira/browse/UIMA-5698 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Some use cases involve setting up (over time) multiple CASs with similar but > not identical type systems, differing in the number of features the types > have. The use case for this also has JCas class definitions defining > features. The following currently fails: > # load type T with no features, load JCas for T with features f1, f2, f3. > This works correctly. > # Now load type T with features f1, f2, f3, using the same classloader that > loaded the JCas for T previously. This fails because the offsets for f1, f2, > f3 in the JCas were computed previously, and cannot be changed (because > existing instances of the JCas class for T would stop working). > Implement a fix for this that works by having the initial setup of the JCas > feature offsets be preceded by a step which, prior to committing the type > system, loads the JCas classes and programmatically adds to the type > system any features defined in the JCas but not in the type system. > Include logging messages showing what's happening.
[jira] [Created] (UIMA-5698) uv3 create type systems with JCas features merged in
Marshall Schor created UIMA-5698: Summary: uv3 create type systems with JCas features merged in Key: UIMA-5698 URL: https://issues.apache.org/jira/browse/UIMA-5698 Project: UIMA Issue Type: Improvement Components: Core Java Framework Affects Versions: 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK Some use cases involve setting up (over time) multiple CASs with similar but not identical type systems, differing in the number of features the types have. The use case for this also has JCas class definitions defining features. The following currently fails: # load type T with no features, load JCas for T with features f1, f2, f3. This works correctly. # Now load type T with features f1, f2, f3, using the same classloader that loaded the JCas for T previously. This fails because the offsets for f1, f2, f3 in the JCas were computed previously, and cannot be changed (because existing instances of the JCas class for T would stop working). Implement a fix for this that works by having the initial setup of the JCas feature offsets be preceded by a step which, prior to committing the type system, loads the JCas classes and programmatically adds to the type system any features defined in the JCas but not in the type system. Include logging messages showing what's happening.
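The pre-commit merge step described in this issue can be illustrated with a small, self-contained sketch. This is not the UIMA implementation: the class `JCasFeatureMerge`, the method `mergeBeforeCommit`, and the use of plain maps of names in place of a real type system and real JCas classes are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; a type system is modelled as type name -> ordered feature names. */
final class JCasFeatureMerge {
    private JCasFeatureMerge() {}

    /**
     * Adds to the (not yet committed) type system every feature that a JCas class
     * defines for a type but the type system does not. Returns the additions as
     * "Type:feature" strings so the caller can log them.
     */
    static List<String> mergeBeforeCommit(Map<String, Set<String>> typeSystem,
                                          Map<String, List<String>> jcasFeatures) {
        List<String> added = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : jcasFeatures.entrySet()) {
            Set<String> declared = typeSystem.get(entry.getKey());
            if (declared == null) {
                continue; // JCas class exists for a type this type system does not define
            }
            for (String feature : entry.getValue()) {
                if (declared.add(feature)) { // true only when the feature was missing
                    added.add(entry.getKey() + ":" + feature);
                }
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> typeSystem = new LinkedHashMap<>();
        typeSystem.put("T", new LinkedHashSet<>()); // type T loaded with no features
        Map<String, List<String>> jcas = new LinkedHashMap<>();
        jcas.put("T", List.of("f1", "f2", "f3"));   // JCas class for T defines f1..f3
        for (String a : mergeBeforeCommit(typeSystem, jcas)) {
            System.out.println("merged JCas feature into type system: " + a);
        }
    }
}
```

Because the features are added before commit, a later type system that declares T with f1, f2, f3 under the same class loader would see the same feature set, which is the situation the issue says failed before.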
[jira] [Updated] (UIMA-5656) uv3 consider enabling Java 9
[ https://issues.apache.org/jira/browse/UIMA-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5656: - Fix Version/s: (was: 3.0.0SDK) > uv3 consider enabling Java 9 > > > Key: UIMA-5656 > URL: https://issues.apache.org/jira/browse/UIMA-5656 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > > Investigate and see what's needed to enable running with Java 9. > Modify code to make future use of Java 9 better. > Continue to run with Java 8. > A main thing to do now is to make automatic naming of Jars produce appropriate > names. See > http://blog.joda.org/2017/05/java-se-9-jpms-automatic-modules.html .
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310069#comment-16310069 ] Marshall Schor commented on UIMA-5662: -- Current v2 design for Xmi deserialization creates the internal CAS structures in a different order from the original CAS, so the "addresses" do not match. However, the deserialization can return extra metadata, so that a subsequent reserialization will have the original Xmi ids. For now, V3 will keep this same behavior (so no changes are needed to the XmiCasDeserializer code). > uv3 support CAS deserialization subsequent low level access > --- > > Key: UIMA-5662 > URL: https://issues.apache.org/jira/browse/UIMA-5662 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Some users depend on 1) constant v2-ids for FSs preserved in deserialization and > serialization, and 2) low level cas API access to these. > V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak > refs are used) prevent GC of unreachable FSs. > Based on a mode, set by -Duima.deserialize_perserve_ids, and also > controllable by new config option per deserialize call, alter the > deserialization for those deserializers which know about v2 ids, to put these > into the map used for low-level CAS access, using the actual v2 ids, and > change the v3 next available id for future new FSs to be 1 beyond the end. > The -Duima.deserialize-preserve_ids global setting is needed to handle the > use case of some annotators using low-level APIs, when part of a pipeline is > "remoted".
[jira] [Closed] (UIMA-5691) hex to byte conversion routine wrong for lower case hex
[ https://issues.apache.org/jira/browse/UIMA-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor closed UIMA-5691. Resolution: Fixed > hex to byte conversion routine wrong for lower case hex > --- > > Key: UIMA-5691 > URL: https://issues.apache.org/jira/browse/UIMA-5691 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta, 2.10.2SDK >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK, 2.10.3SDK > > > bug in XmiDeserialization code in hex char to byte when converting lower-case > hex chars - using wrong lower bound char (should be 'a', but is using '1'). > This bug is from 2008. Since no one has noticed, it's probably true that > lower case hex representations are never being used in Xmi byte array > serializations. But this should be fixed anyways. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5691) hex to byte conversion routine wrong for lower case hex
Marshall Schor created UIMA-5691: Summary: hex to byte conversion routine wrong for lower case hex Key: UIMA-5691 URL: https://issues.apache.org/jira/browse/UIMA-5691 Project: UIMA Issue Type: Bug Components: Core Java Framework Affects Versions: 2.10.2SDK, 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK, 2.10.3SDK bug in XmiDeserialization code in hex char to byte when converting lower-case hex chars - using wrong lower bound char (should be 'a', but is using '1'). This bug is from 2008. Since no one has noticed, it's probably true that lower case hex representations are never being used in Xmi byte array serializations. But this should be fixed anyways. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
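The lower-bound mistake described in this issue is easy to see in a minimal hex-digit conversion. The sketch below is not the XmiDeserialization code; the class `HexNibble` and both method names are hypothetical and only show the range checks the issue refers to.

```java
/** Hypothetical helper for illustration; not the actual UIMA deserializer code. */
final class HexNibble {
    private HexNibble() {}

    /** Converts one hex digit (0-9, A-F, a-f) to its numeric value 0..15. */
    static int hexCharToNibble(char c) {
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        if (c >= 'A' && c <= 'F') {
            return c - 'A' + 10;
        }
        // The lower bound here must be 'a'. The issue reports that '1' was used,
        // which gives wrong results for lower-case digits.
        if (c >= 'a' && c <= 'f') {
            return c - 'a' + 10;
        }
        throw new NumberFormatException("Invalid hex character: " + c);
    }

    /** Combines two hex digits into one byte, high digit first. */
    static byte hexPairToByte(char high, char low) {
        return (byte) ((hexCharToNibble(high) << 4) | hexCharToNibble(low));
    }

    public static void main(String[] args) {
        // Upper-case and lower-case spellings of the same byte should agree.
        System.out.println(hexPairToByte('7', 'F') == hexPairToByte('7', 'f'));
    }
}
```

A test that round-trips every byte value through both upper-case and lower-case hex would catch this class of bug regardless of which form the serializer happens to emit.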
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309678#comment-16309678 ] Marshall Schor commented on UIMA-5662: -- I'm trying to support XCAS and Xmi in this new mode, as well. For Xmi, the serialized form may contain sequences of UIMA Lists, encoded as just the item values; this serialization doesn't have any fsId information for these. (Note: some list elements may be multiply referenced; these will have fsIds). For the missing fsId case, I'm thinking of assigning fsIds to these, following the deserialization. XCAS should be OK - all Feature Structures (I believe) have id's in the serialized format.
[jira] [Comment Edited] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308681#comment-16308681 ] Marshall Schor edited comment on UIMA-5662 at 1/2/18 8:47 PM: -- Committed support for the v2 mode. It works I think for both cas object serialization (including cas complete), as well as the 3 binary forms (plain, and compressed4 and 6). In this mode, too, new FSs will have the v2-style "ids". See docs for this in the v3 users guide (section 2.4 Preserving V2 ids, with low level CAS Api accessibility in the backwards compatibility chapter). was (Author: schor): Committed support for the v2 mode. It works I think for both cas object serialization (including cas complete), as well as the 3 binary forms (plain, and compressed4 and 6). In this mode, too, new FSs will have the v2-style "ids". See docs for this in the v3 users guide.
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308681#comment-16308681 ] Marshall Schor commented on UIMA-5662: -- Committed support for the v2 mode. It works I think for both cas object serialization (including cas complete), as well as the 3 binary forms (plain, and compressed4 and 6). In this mode, too, new FSs will have the v2-style "ids". See docs for this in the v3 users guide.
[jira] [Resolved] (UIMA-5687) uv3 index flush: don't do the copy-on-write
[ https://issues.apache.org/jira/browse/UIMA-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5687. -- Resolution: Fixed > uv3 index flush: don't do the copy-on-write > > > Key: UIMA-5687 > URL: https://issues.apache.org/jira/browse/UIMA-5687 > Project: UIMA > Issue Type: Improvement >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Index flush operations are used in cas reset and in the "removeAll" > operations. These operations are so unlikely to benefit from having existing > iterators continue to work that the copy-on-write step can be skipped. > Needs documentation.
[jira] [Resolved] (UIMA-5689) uv3 CasCompare edge case with type filtering - sort order wrong
[ https://issues.apache.org/jira/browse/UIMA-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5689. -- Resolution: Fixed > uv3 CasCompare edge case with type filtering - sort order wrong > --- > > Key: UIMA-5689 > URL: https://issues.apache.org/jira/browse/UIMA-5689 > Project: UIMA > Issue Type: Bug >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > When sorting FSs prior to compare, for those FS types which have no > corresponding type in the other cas, substitute null when comparing, because > the other cas will have null there. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5689) uv3 CasCompare edge case with type filtering - sort order wrong
Marshall Schor created UIMA-5689: Summary: uv3 CasCompare edge case with type filtering - sort order wrong Key: UIMA-5689 URL: https://issues.apache.org/jira/browse/UIMA-5689 Project: UIMA Issue Type: Bug Affects Versions: 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Fix For: 3.0.0SDK When sorting FSs prior to compare, for those FS types which have no corresponding type in the other cas, substitute null when comparing, because the other cas will have null there. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
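One possible reading of this fix, sketched under stated assumptions: when a sort key depends on the type of a referenced FS, and that type does not exist in the other CAS, the key is treated as null so that both sides sort the same way. The classes `FilteredSortSketch` and `Fs`, and the choice of sorting nulls first, are hypothetical and are not taken from the CasCompare code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of null substitution in a sort key; not the CasCompare code. */
final class FilteredSortSketch {
    private FilteredSortSketch() {}

    /** Toy feature structure: a name plus the type name of one referenced FS (may be null). */
    static final class Fs {
        final String name;
        final String refType;

        Fs(String name, String refType) {
            this.name = name;
            this.refType = refType;
        }
    }

    /** Sort key: the referenced type, or null when the other CAS has no such type. */
    static String key(Fs fs, Set<String> typesInOtherCas) {
        if (fs.refType != null && typesInOtherCas.contains(fs.refType)) {
            return fs.refType;
        }
        return null; // the other CAS will hold null here, so compare as null too
    }

    static Comparator<Fs> comparator(Set<String> typesInOtherCas) {
        Comparator<String> nullsFirst = Comparator.nullsFirst(Comparator.<String>naturalOrder());
        Comparator<Fs> byKey = Comparator.comparing((Fs fs) -> key(fs, typesInOtherCas), nullsFirst);
        return byKey.thenComparing((Fs fs) -> fs.name);
    }

    public static void main(String[] args) {
        List<Fs> list = new ArrayList<>();
        list.add(new Fs("a", "Token"));
        list.add(new Fs("b", "OnlyHere")); // type missing from the other CAS
        list.add(new Fs("c", null));
        list.sort(comparator(Set.of("Token")));
        for (Fs fs : list) {
            System.out.println(fs.name);
        }
    }
}
```

Without the substitution, "b" would sort by a type name the other CAS never sees, and the two sorted lists could pair up different elements during the compare.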
[jira] [Created] (UIMA-5687) uv3 index flush: don't do the copy-on-write
Marshall Schor created UIMA-5687: Summary: uv3 index flush: don't do the copy-on-write Key: UIMA-5687 URL: https://issues.apache.org/jira/browse/UIMA-5687 Project: UIMA Issue Type: Improvement Affects Versions: 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK Index flush operations are used in cas reset and in the "removeAll" operations. These operations are so unlikely to benefit from having existing iterators continue to work that the copy-on-write step can be skipped. Needs documentation.
[jira] [Created] (UIMA-5686) uv3 efficiency - wrong set of fs flag for in set or sorted index
Marshall Schor created UIMA-5686: Summary: uv3 efficiency - wrong set of fs flag for in set or sorted index Key: UIMA-5686 URL: https://issues.apache.org/jira/browse/UIMA-5686 Project: UIMA Issue Type: Bug Reporter: Marshall Schor Assignee: Marshall Schor The add-to-indexes is setting the flag for in-set-or-sorted-indexes for FSs which are only in bag indexes. This causes inefficiencies. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5683) uv3 space calc for v2 size off by 1 for non-heap-stored arrays
[ https://issues.apache.org/jira/browse/UIMA-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5683. -- Resolution: Fixed > uv3 space calc for v2 size off by 1 for non-heap-stored arrays > -- > > Key: UIMA-5683 > URL: https://issues.apache.org/jira/browse/UIMA-5683 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > Bug in the calc of v2 space for arrays - mistook the meaning of a method > name. Rename the method, fix callers -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (UIMA-5683) uv3 space calc for v2 size off by 1 for non-heap-stored arrays
[ https://issues.apache.org/jira/browse/UIMA-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5683: - Affects Version/s: 3.0.0SDK-beta Fix Version/s: 3.0.0SDK Component/s: Core Java Framework > uv3 space calc for v2 size off by 1 for non-heap-stored arrays > -- > > Key: UIMA-5683 > URL: https://issues.apache.org/jira/browse/UIMA-5683 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > Bug in the calc of v2 space for arrays - mistook the meaning of a method > name. Rename the method, fix callers -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5683) uv3 space calc for v2 size off by 1 for non-heap-stored arrays
Marshall Schor created UIMA-5683: Summary: uv3 space calc for v2 size off by 1 for non-heap-stored arrays Key: UIMA-5683 URL: https://issues.apache.org/jira/browse/UIMA-5683 Project: UIMA Issue Type: Bug Reporter: Marshall Schor Assignee: Marshall Schor Bug in the calc of v2 space for arrays - mistook the meaning of a method name. Rename the method, fix callers -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298751#comment-16298751 ] Marshall Schor edited comment on UIMA-5662 at 12/25/17 2:37 PM: Initial commit done, not tested, no test cases, so may be buggy... but if you want to test :-) - the mode is documented in the uima-docbook-v3-users-guide in the backwards compatibility chapter. It should work for casComplete, and the various binary serializations, but not for Xmi or XCAS. Not yet quite ready - needs fixes to assign the same ids as in the serialized form. (12/25) was (Author: schor): Initial commit done, not tested, no test cases, so may be buggy... but if you want to test :-) - the mode is documented in the uima-docbook-v3-users-guide in the backwards compatibility chapter. It should work for casComplete, and the various binary serializations, but not for Xmi or XCAS.
[jira] [Comment Edited] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300538#comment-16300538 ] Marshall Schor edited comment on UIMA-5662 at 12/23/17 7:21 PM: V3's "select" which iterates over the contents of FSs in the CAS does not respect this mode. It does not include unreachable Feature Structures. Is this OK? was (Author: schor): V3's "select" which iterates over the contents of FSs in the CAS does not respect this mode. It dies not include unreachable Feature Structures. Is this OK?
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300651#comment-16300651 ] Marshall Schor commented on UIMA-5662: -- Testing reveals it's pretty easy to mess up how this is used (I did so ...). If you "turn on" the mode on a non-empty CAS, the IDs don't reflect the V2 addresses. In order for that to work, you have to have the mode on before you put FSs into the CAS. I'll add an exception if the CAS isn't empty when the mode is set. Also, I'll add a "static" method on LowLevelCas, which will set a thread local, which will not be used except when creating new CASs - if it is set, it will set the mode for the new CAS.
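The two safeguards proposed in this comment (reject the mode switch on a non-empty CAS, and a thread-local default consulted only when a new CAS is created) can be sketched as follows. Everything here is a stand-in: `ToyCas`, `setDefaultPreserveIds`, `setPreserveIds`, and the choice of `IllegalStateException` are hypothetical names and choices, not the LowLevelCas API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a CAS; illustrates the guard and thread-local default only. */
final class ToyCas {
    // Per-thread default, read once by the constructor and nowhere else.
    private static final ThreadLocal<Boolean> DEFAULT_PRESERVE_IDS =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    private final List<Object> featureStructures = new ArrayList<>();
    private boolean preserveIds;

    ToyCas() {
        this.preserveIds = DEFAULT_PRESERVE_IDS.get();
    }

    /** Sets the mode that CASs created later on this thread will start with. */
    static void setDefaultPreserveIds(boolean on) {
        DEFAULT_PRESERVE_IDS.set(on);
    }

    /** Switches the mode; refused once the CAS holds FSs, because existing ids would not match. */
    void setPreserveIds(boolean on) {
        if (!featureStructures.isEmpty()) {
            throw new IllegalStateException("mode must be set before any FS is added");
        }
        this.preserveIds = on;
    }

    void add(Object fs) {
        featureStructures.add(fs);
    }

    boolean isPreserveIds() {
        return preserveIds;
    }

    public static void main(String[] args) {
        ToyCas.setDefaultPreserveIds(true);
        ToyCas cas = new ToyCas();
        System.out.println("new CAS starts in preserve-ids mode: " + cas.isPreserveIds());
        ToyCas.setDefaultPreserveIds(false);
    }
}
```

Failing fast on a non-empty CAS turns the silent id mismatch described above into an immediate, diagnosable error.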
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300538#comment-16300538 ] Marshall Schor commented on UIMA-5662: -- V3's "select" which iterates over the contents of FSs in the CAS does not respect this mode. It dies not include unreachable Feature Structures. Is this OK?
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300534#comment-16300534 ] Marshall Schor commented on UIMA-5662: -- writing some tests for this. Found/fixed one bug already...
[jira] [Comment Edited] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298832#comment-16298832 ] Marshall Schor edited comment on UIMA-5662 at 12/20/17 6:05 PM: The CasCompleteSerializer I believe will now store unreachable FSs (if running in this special mode, of course). I guess some docs need improving - can you point to where?
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298832#comment-16298832 ] Marshall Schor commented on UIMA-5662: -- The CasCompleteSerializer I believe will now store unreachable FSs. I guess some docs need improving - can you point to where?
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298751#comment-16298751 ] Marshall Schor commented on UIMA-5662: -- Initial commit done, not tested, no test cases, so may be buggy... but if you want to test :-) - the mode is documented in the uima-docbook-v3-users-guide, in the backwards compatibility chapter. It should work for casComplete and the various binary serializations, but not for Xmi or XCAS.
[jira] [Comment Edited] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295557#comment-16295557 ] Marshall Schor edited comment on UIMA-5662 at 12/18/17 7:46 PM: Here's my thinking on the way forward at this point.
# Add a mode. This mode will be on a CAS instance, not a thread local, since the (one) user of this prefers that. If other users ask for the thread-local approach, we could do that too (with errors signaled if they didn't agree, or something like that).
# The mode can be set by default using a system property, and also by some methods on the LowLevelCas instance, including the use of AutoCloseable to support try-with-resources.
# If the mode is set:
#* all new FSs are added to the ll_getFSForRef table, including those created via deserialization
#* deserializations are modified to use explicit or imputed FSids. IDs for new items start after the highest deserialized one.
#* serializations are changed to also include FSs reachable only via the ll_getFSForRef table.
Implications of this mode being on:
* FSs in the ref table can't be GC'd
* No way to remove these FSs from the ref table (this might be added later, if needed)
* FSids ought to be stable across many different serializations/deserializations
* CasCopy of entire CAS not guaranteed to preserve IDs. If needed, make this request in a new Jira.
WDYT?
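A rough sketch of the per-CAS mode from the plan above, to make the id handling concrete. ToyCas and all of its method names are hypothetical stand-ins, not the UIMA API; the sketch only assumes that the mode is a flag on the instance, that the ref table is a plain map, and that ids are small ints.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a CAS; hypothetical names, not the UIMA API. */
final class ToyCas {
  private final boolean preserveIds;   // the per-instance mode
  private final Map<Integer, Object> refTable = new HashMap<>();
  private int nextId = 1;

  ToyCas(boolean preserveIds) {
    this.preserveIds = preserveIds;
  }

  /** Creates a new FS and returns its id; registers it only when the mode is on. */
  int create(Object fs) {
    int id = nextId++;
    if (preserveIds) {
      refTable.put(id, fs);
    }
    return id;
  }

  /** Adds an FS whose serialized form carried (or implied) serializedId. */
  int deserialize(int serializedId, Object fs) {
    if (!preserveIds) {
      return create(fs);               // mode off: the id is simply reassigned
    }
    refTable.put(serializedId, fs);    // mode on: keep the serialized id
    nextId = Math.max(nextId, serializedId + 1);
    return serializedId;
  }

  /** Low-level lookup; finds only FSs registered while the mode was on. */
  Object getFsForRef(int id) {
    return refTable.get(id);
  }
}
```

With the mode on, deserializing ids 5 and 12 and then creating a new FS hands out 13. The strong references held by the map are also why FSs in the ref table can't be GC'd.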
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295557#comment-16295557 ] Marshall Schor commented on UIMA-5662: -- Here's my thinking on the way forward at this point.
# Add a mode. This mode will be on a CAS instance, not a thread local, since the (one) user of this prefers that. If other users ask for the thread-local approach, we could do that too (with errors signaled if they didn't agree, or something like that).
# If the mode is set:
#* all new FSs are added to the ll_getFSForRef table, including those created via deserialization
#* deserializations are modified to use explicit or imputed FSids. IDs for new items start after the highest deserialized one.
#* serializations are changed to also include FSs reachable only via the ll_getFSForRef table.
Implications of this mode being on:
* FSs in the ref table can't be GC'd
* No way to remove these FSs from the ref table (this might be added later, if needed)
* FSids ought to be stable across many different serializations/deserializations
* CasCopy of entire CAS not guaranteed to preserve IDs. If needed, make this request in a new Jira.
WDYT?
[jira] [Comment Edited] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295161#comment-16295161 ] Marshall Schor edited comment on UIMA-5662 at 12/18/17 4:55 PM: I'm thinking that v3 wants (over time) to deprecate using FSids for these kinds of things, and replace them with structures which store sets or maps (between ids -> FSs for example) as objects in the CAS, customized by users for what they want (e.g. weak references, etc.). I see this issue as focused on supporting more backwards compatibility. Towards that end, having this mode, and setting the FsId equal to the imputed or explicit "address" in the serialized form, helps. But as noted in the comment two above, the CasComplete format doesn't prevent unreachable FSs from being dropped on subsequent serializations, and because the ids are not written out (but rather imputed from order information), this represents a breaking change, I think. (Of course, perhaps I read the code wrong - have you tried this with v3 where something becomes unreachable?) I should note that we could change the design for cas complete (and other serializers) to include FSs only reachable via the getFSForRef table. I'm beginning to think that that should be included if the special mode is set.
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295161#comment-16295161 ] Marshall Schor commented on UIMA-5662: -- I'm thinking that v3 wants (over time) to deprecate using FSids for these kinds of things, and replace them with structures which store sets or maps (between ids -> FSs for example) as objects in the CAS, customized by users for what they want (e.g. weak references, etc.). I see this issue as focused on supporting more backwards compatibility. Towards that end, having this mode, and setting the FsId equal to the imputed or explicit "address" in the serialized form, helps. But as noted in the comment two above, the CasComplete format doesn't prevent unreachable FSs from being dropped on subsequent serializations, and because the ids are not written out (but rather imputed from order information), this represents a breaking change, I think. (Of course, perhaps I read the code wrong - have you tried this with v3 where something becomes unreachable?)
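For the idea of user-customized id -> FS maps with weak references, something along these lines might serve as a starting point. WeakIdTable is a hypothetical user-side helper, not part of UIMA, and it lives outside the CAS here purely to keep the sketch short.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical user-side id table; entries do not keep their FSs alive. */
final class WeakIdTable<T> {
  private final Map<Integer, WeakReference<T>> byId = new HashMap<>();

  /** Registers fs under id without preventing its garbage collection. */
  void put(int id, T fs) {
    byId.put(id, new WeakReference<>(fs));
  }

  /** Returns the FS for id, or null if it was never registered or was collected. */
  T get(int id) {
    WeakReference<T> ref = byId.get(id);
    if (ref == null) {
      return null;
    }
    T fs = ref.get();
    if (fs == null) {
      byId.remove(id);   // drop the stale entry
    }
    return fs;
  }
}
```

A lookup succeeds only while something else still holds the FS, which is the trade-off against the strong-reference table: no memory retention, but also no guarantee that an id still resolves.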
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295148#comment-16295148 ] Marshall Schor commented on UIMA-5662: -- Re: "choose between stable addresses and GC on a per-CAS-instance base even when working with multiple CASes simultaneously in a single thread". Both the bit-in-the-CAS and the thread-local approaches would support this requirement. The bit-in-the-cas is obvious - set the bit in the cas for the mode and then be careful to use that particular CAS in a way that's aligned with that mode. Make sure the mode is proper if you "check out" a new cas from a cas pool, etc. The thread-local approach would take code that was doing some deserialization and surround it with a (e.g.) try-with-resources, setting the proper mode. This might be more visible in the code, and the try-with-resources would document exact boundaries where this mode was to be in effect. I'm still leaning slightly in favor of this approach.
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295141#comment-16295141 ] Marshall Schor commented on UIMA-5662: -- Thanks, Richard, for the nice use case. Re:
* all FSs always have an address: in v3, all FSs have an "id" which is an int, like an address
* FS's ids are written out for some serializations (XCAS, Xmi, JSON), and are "imputed" for others (Binary, CasComplete, Compressed). By imputed, I mean the ids are not written, but the FSs are output in a specific order, and that order can be used to determine "ids".
As a side effect, if a CAS is deserialized (and has all of its FSs "reachable"), then as long as the reachability of those FSs doesn't change, the ids (written or imputed) will be the same when written out (and subsequently deserialized) because
* for written-out ids, the ids haven't changed, and
* for imputed ids, the order is kept by sorting all the FSs by id order.
In V3 (currently) the map in the CAS that enables low-level getFSForRef(int) to work isn't consulted when serializers determine what FSs are "reachable", so in that sense, the serializers all perform a kind of GC when serializing. But, for serializers writing the fsId into the serialized form, these will write the actual id, so the ids will be "stable" even if some of the FSs are no longer reachable. This might mean, though, that one of your use cases won't work, unless you change the "save the stable id's form" to xmi. (The v3 cas complete kind will miss collecting no-longer-reachable FSs, and that form doesn't explicitly encode the ids; it is using the impute approach.)
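To illustrate written-out versus imputed ids, here is a toy model. Everything in it is an assumption made for illustration: ImputedIdSketch is a made-up class, the "serialized form" is just a list of strings, and imputed ids are taken to be positions 1..n, which is simpler than what the real binary formats do.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of explicit vs. imputed ids; not UIMA's serializers. */
final class ImputedIdSketch {
  static final class Fs {
    final int id;
    final String payload;
    Fs(int id, String payload) { this.id = id; this.payload = payload; }
  }

  private static List<Fs> sortedById(List<Fs> fss) {
    List<Fs> sorted = new ArrayList<>(fss);
    sorted.sort(Comparator.comparingInt((Fs fs) -> fs.id));
    return sorted;
  }

  /** Explicit form: every record carries its id. */
  static List<String> writeExplicit(List<Fs> reachable) {
    List<String> out = new ArrayList<>();
    for (Fs fs : sortedById(reachable)) out.add(fs.id + "=" + fs.payload);
    return out;
  }

  /** Imputed form: only payloads are written, in id order. */
  static List<String> writeImputed(List<Fs> reachable) {
    List<String> out = new ArrayList<>();
    for (Fs fs : sortedById(reachable)) out.add(fs.payload);
    return out;
  }

  static Map<Integer, String> readExplicit(List<String> records) {
    Map<Integer, String> byId = new LinkedHashMap<>();
    for (String r : records) {
      String[] parts = r.split("=", 2);
      byId.put(Integer.parseInt(parts[0]), parts[1]);
    }
    return byId;
  }

  /** The reader reconstructs ids from position alone (assumed here: 1..n). */
  static Map<Integer, String> readImputed(List<String> records) {
    Map<Integer, String> byId = new LinkedHashMap<>();
    int id = 1;
    for (String r : records) byId.put(id++, r);
    return byId;
  }
}
```

If FSs 1, 2 and 3 are all reachable, both forms round-trip with the same ids. Once FS 2 is dropped as unreachable, the explicit form still reads FS 3 back as id 3, while the imputed form reads it back as id 2.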
[jira] [Comment Edited] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293977#comment-16293977 ] Marshall Schor edited comment on UIMA-5662 at 12/17/17 1:24 AM: The try (x = xxx) was "wishful thinking" - it must be contagious :-) It ought to read try (AutoCloseable x = xxx) or something like that. Re static vs method on a cas instance: I see these as two alternative implementations - the static one goes with a ThreadLocal value, and the per CAS instance uses another flag in the CAS instance. I think there are some benefits to the thread local, overall: it is more likely that those using this would want it set for all CASs in an application, and just setting it once (perhaps in a try-with-resources block surrounding the main "produceAnalysisEngine" - run things through it) would be clearer and more reliable. If it was on the individual CAS, the app would need to be examined in detail to see where it creates CASs, and each CAS would need to be set individually. But perhaps I'm not thinking of the "popular" use cases I should be?
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293977#comment-16293977 ] Marshall Schor commented on UIMA-5662: -- The try (x = xxx) was "wishful thinking" - it must be contagious :-) It ought to read try (AutoCloseable x = xxx) or something like that. Re static vs method on a cas instance: I see these as two alternative implementations - the static one goes with a ThreadLocal value, and the per CAS instance uses another flag in the CAS instance. I think there are some benefits to the thread local, overall. I think it is more likely that those using this would want it set for all CASs in an application, and just setting it once (perhaps in a try-with-resources block surrounding the main "produceAnalysisEngine" - run things through it) would be clearer and more reliable. If it was on the individual CAS, the app would need to be examined in detail to see where it creates CASs, and each CAS would need to be set individually. But perhaps I'm not thinking of the "popular" use cases I should be?
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293814#comment-16293814 ] Marshall Schor commented on UIMA-5662: -- Good ideas! +1 to putting the enabler methods on the LowLevelCas. I think these belong as static methods on the LowLevelCas, not as methods on instances. I like your idea of try-with-resources, and I think a couple of characters (e.g. s = ) extra to accommodate the syntax of that is a pretty small syntactic cost. I think the Java 8 lambda thing would work except for the pesky exception issues. So, I'll probably do something like
{code}
try (w = LowLevelCas.ll_enableGetFSForRef()) {
  ... some code, including deserializers ...
}
{code}
The ll_enableGetFSForRef would return an AutoCloseable, whose close method would restore the previous value. You could use this without the try-with-resources, to just set it true or false. The no-arg case would be the same as a *true* arg.
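A minimal sketch of the enable-and-restore idea, assuming the thread-local variant. FsRefMode, Restorer and enable() are hypothetical names for illustration and are not the LowLevelCas API. Two Java details shape it: a resource declared in the try header needs a type, and AutoCloseable.close() is declared to throw Exception, so the handle narrows it to avoid forcing a catch block on callers.

```java
/** Hypothetical thread-local mode switch; not the UIMA LowLevelCas API. */
final class FsRefMode {
  // Per-thread flag: should FSs be registered for low-level id lookup?
  private static final ThreadLocal<Boolean> ENABLED =
      ThreadLocal.withInitial(() -> Boolean.FALSE);

  /** Handle whose close() restores the value in effect before enable(). */
  interface Restorer extends AutoCloseable {
    @Override
    void close();   // narrowed: no checked exception
  }

  static boolean isEnabled() {
    return ENABLED.get();
  }

  /** The no-arg case is the same as passing true. */
  static Restorer enable() {
    return enable(true);
  }

  static Restorer enable(boolean value) {
    final boolean previous = ENABLED.get();
    ENABLED.set(value);
    return () -> ENABLED.set(previous);
  }

  public static void main(String[] args) {
    try (Restorer w = enable()) {
      // ... some code, including deserializers ...
      System.out.println("inside: " + isEnabled());
    }
    System.out.println("after: " + isEnabled());
  }
}
```

Because each handle remembers the previous value, nested blocks unwind in order. Calling enable(false) without a try block and ignoring the returned handle just sets the flag.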
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1629#comment-1629 ] Marshall Schor commented on UIMA-5662: -- I'm thinking that the enabler should be on the "topic" of enabling Feature Structure access using the low level CAS API, e.g. enable_ll_getFSForRef() or enable_ll_getFSForRef(true/false). If turned on, and deserialization occurs, the deserializers would use the externally present FsIDs if available, or their imputed values if not explicitly in the serialized form. This has the advantage of also enabling low level access for FSs created using new XXX(jcas), which could be useful for backwards compatibility. These methods would return an AutoCloseable that would restore the previous state.
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293310#comment-16293310 ] Marshall Schor commented on UIMA-5662: -- I thought try-with-resources required a variable id, such as the "ac" below: {code} try (AutoCloseable ac = withFsMapping()) { deserialize... } {code} Is there some way to write this without the variable? If so, what is that statement called? (The try-with-resources statement in the Java Language Spec has this issue, but maybe there's some other thing that works without variables?)
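A minimal sketch of the pattern in the comment above, assuming a hypothetical withFsMapping() that turns on a per-thread switch and returns a handle that restores the previous value when closed. None of the names below are UIMA APIs. The resource is still bound to a variable ("ac"), as the try-with-resources statement expects; Java 9 and later also accept an existing effectively final variable in the resource position, but that is still a variable.

```java
final class FsMappingScope {
    /** Narrowed AutoCloseable whose close() declares no checked exception. */
    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    // Hypothetical per-thread switch; false unless a scope is open on this thread.
    private static final ThreadLocal<Boolean> ENABLED =
        ThreadLocal.withInitial(() -> Boolean.FALSE);

    static boolean isEnabled() {
        return ENABLED.get();
    }

    /** Turns the switch on and returns a handle that restores the previous value. */
    static Scope withFsMapping() {
        final boolean previous = ENABLED.get();
        ENABLED.set(Boolean.TRUE);
        return () -> ENABLED.set(previous);
    }

    /** Records what isEnabled() reports before, inside, and after the scope. */
    static boolean[] demo() {
        boolean[] seen = new boolean[3];
        seen[0] = isEnabled();
        try (Scope ac = withFsMapping()) {
            seen[1] = isEnabled();   // a deserialize call would go here
        }
        seen[2] = isEnabled();
        return seen;
    }
}
```

Narrowing close() so that it throws nothing keeps the call site free of a catch clause; a plain AutoCloseable would force callers to handle Exception.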
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293004#comment-16293004 ] Marshall Schor commented on UIMA-5662: -- I generally share your view on ThreadLocal - it has poor visibility properties... The number of APIs being modified could be quite large. These would be APIs, I guess, taking an additional boolean argument. We've done several iterations on these APIs, and have multiple styles. The last one IIRC is the CasIOUtils, with the load(...) in many forms. These are mostly cover APIs that translate into other public API alternative calls. Additional arguments would need to be passed down through multiple layers of these. So this ends up being lots of user-facing APIs needing additional args, times lots of "layers" in the API calls. I guess I was dreading getting all of this right... and so, on balance, I thought the ThreadLocal approach would be a safer, more reliable change. The reason it's a "thread-local" is to support multi-threaded apps, where only some threads want this; the threads could easily be working on different CASes, and have nothing in particular to do with each other (more like a kind of scale-up within a single machine having many CPUs).
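To illustrate why a thread-local switch suits the multi-threaded case described in the comment above, here is a small sketch. The property name demo.preserve_ids and the class are made up for this example and are not UIMA settings. Each thread starts from the -D default, and a change made on one thread is not seen by another.

```java
final class PerThreadSwitchDemo {
    // Hypothetical property name, used only for this illustration.
    static final String PROP = "demo.preserve_ids";

    // Each thread gets its own copy, initialised from the -D system property.
    private static final ThreadLocal<Boolean> PRESERVE_IDS =
        ThreadLocal.withInitial(() -> Boolean.getBoolean(PROP));

    static void set(boolean value) {
        PRESERVE_IDS.set(value);
    }

    static boolean get() {
        return PRESERVE_IDS.get();
    }

    /** Reads the switch from a freshly started thread. */
    static boolean readInNewThread() {
        final boolean[] seen = new boolean[1];
        Thread t = new Thread(() -> seen[0] = get());
        t.start();
        try {
            t.join();   // join() makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Calling set(true) on the current thread and then readInNewThread() returns the property default rather than true, which is the isolation the comment relies on.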
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292786#comment-16292786 ] Marshall Schor commented on UIMA-5662: -- I think an essential point Richard was making is it would be nice for UIMA to support addressing FSs by an int key, with the following properties: 1) the int keys would be "stable" as in fs-ids, preserved across serialization/deserialization 2) it would be "automatic" or "built-in" (but could be under the control of some enabling switch) 3) it would support additional APIs to allow better management in v3 - including "deleting" from the int -> fs map remembering this. The current access API for this is the CAS LowLevel api, using the getFSForRef kinds of calls. Would be nice for backwards compatibility to keep this. Here's a proposal, in 2 parts: one for deserializers, and one for the "access API", that's more "built-in" than the previous proposal. # under control of an enabler switch, have deserializers create FSs having the same id as the serialized form id; also, add all created FSs to the low level CAS ref-to-FS internal built-in map, already present, to support low level cas getFSForRef calls. # add to the lowlevel API the ability to remove an int->fs, or to "clear" the entire map, to allow FSs to be reclaimed by GC Details: # enabler switch: a ThreadLocal kind of param, set by default from a -D system property. ThreadLocal allows keeping APIs unmodified. # deserializers could include things having imputed FS addresses, as well as explicit ones, so this could work for more than just, say cascomplete style. Does this sound more aligned with your use case? WDYT? 
[jira] [Resolved] (UIMA-5619) uv3 consider converting Iterable to Collection
[ https://issues.apache.org/jira/browse/UIMA-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5619. -- Resolution: Won't Fix This was addressed in other ways. > uv3 consider converting Iterable to Collection > -- > > Key: UIMA-5619 > URL: https://issues.apache.org/jira/browse/UIMA-5619 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > > As part of the integration with Java 8, consider switching classes that > implement Iterable, to Collection. This requires that the item implement > size(). If done, then stream and spliterator are included in the class. > Example: JCas xxxArray, xxxList -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5677) uv3 internal list iterators - align with definition of previous
[ https://issues.apache.org/jira/browse/UIMA-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5677. -- Resolution: Fixed > uv3 internal list iterators - align with definition of previous > --- > > Key: UIMA-5677 > URL: https://issues.apache.org/jira/browse/UIMA-5677 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > The Java definition of previous and next says that next followed by previous > followed by next etc. always returns the same value. > In our impls, this means that > * next returns the currently pointed-to object, then advances, and > * previous first goes backwards, and then returns that object > Some of our internal implementations implement previous by returning the > currently pointed-to object and then going backwards. Fix the implementations > and uses (if any) (including the hasPrevious) so that they follow the same > conventions as Java. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
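The convention referred to above can be observed with the standard java.util.ListIterator. The sketch below uses only JDK classes; the class and method names are chosen for this example. It alternates next() and previous() from the start of a list and records what comes back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

final class ListIteratorContractDemo {
    /** Alternates next() and previous() from the start and records each result. */
    static List<String> alternate(List<String> items, int calls) {
        ListIterator<String> it = items.listIterator();
        List<String> seen = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            // even calls move forward, odd calls move back
            seen.add(i % 2 == 0 ? it.next() : it.previous());
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [a, a, a, a]
        System.out.println(alternate(Arrays.asList("a", "b", "c"), 4));
    }
}
```

Every call returns the same element because next() returns the element at the cursor and then advances, while previous() steps back first and then returns the element it landed on.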
[jira] [Resolved] (UIMA-5675) uv3 upgrade internal listIterator interfaces to support no-checking next, previous
[ https://issues.apache.org/jira/browse/UIMA-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5675. -- Resolution: Fixed > uv3 upgrade internal listIterator interfaces to support no-checking next, > previous > -- > > Key: UIMA-5675 > URL: https://issues.apache.org/jira/browse/UIMA-5675 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > add the nextNvc, previousNvc methods, default the next and previous methods, > add the required impls, change the usage of next/prev to nextNvc/previousNvc > where appropriate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
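One way to read the description above, as a sketch only: the interface declares unchecked nextNvc and previousNvc methods, and supplies next and previous as default methods that check first and then delegate. The interface name, the array-backed implementation, and the choice of NoSuchElementException are assumptions made for this example; the actual UIMA internal interfaces may differ.

```java
import java.util.NoSuchElementException;

/** Sketch of an int list iterator with unchecked ("Nvc") and checked variants. */
interface IntListIteratorSketch {
    boolean hasNext();
    boolean hasPrevious();

    /** Returns the current element, then advances; caller guarantees hasNext(). */
    int nextNvc();

    /** Steps back first, then returns that element; caller guarantees hasPrevious(). */
    int previousNvc();

    /** Checked variant, defaulted in terms of the unchecked one. */
    default int next() {
        if (!hasNext()) throw new NoSuchElementException();
        return nextNvc();
    }

    /** Checked variant, defaulted in terms of the unchecked one. */
    default int previous() {
        if (!hasPrevious()) throw new NoSuchElementException();
        return previousNvc();
    }
}

final class ArrayIntListIterator implements IntListIteratorSketch {
    private final int[] items;
    private int pos = 0;   // index of the element next() would return

    ArrayIntListIterator(int... items) {
        this.items = items;
    }

    @Override public boolean hasNext() { return pos < items.length; }
    @Override public boolean hasPrevious() { return pos > 0; }
    @Override public int nextNvc() { return items[pos++]; }
    @Override public int previousNvc() { return items[--pos]; }

    /** Sums all elements; the loop condition already did the check, so nextNvc is safe. */
    static int sum(int... values) {
        IntListIteratorSketch it = new ArrayIntListIterator(values);
        int total = 0;
        while (it.hasNext()) {
            total += it.nextNvc();
        }
        return total;
    }
}
```

The unchecked calls fit loops that have already tested hasNext() or hasPrevious(); other callers keep the checked methods.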
[jira] [Resolved] (UIMA-5674) uv3 upgrade multiple hash map impls
[ https://issues.apache.org/jira/browse/UIMA-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5674. -- Resolution: Fixed > uv3 upgrade multiple hash map impls > --- > > Key: UIMA-5674 > URL: https://issues.apache.org/jira/browse/UIMA-5674 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > The package org.apache.uima.internal.util has several hash maps and sets, all > implemented in the same general way. Consolidate all of the "same" parts of > these impls into a superclass, and fix various edge case bugs revealed from > the comparisons of the multiple implementations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5666) uv3 previous call in list style iterators not quite right
[ https://issues.apache.org/jira/browse/UIMA-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5666. -- Resolution: Duplicate > uv3 previous call in list style iterators not quite right > - > > Key: UIMA-5666 > URL: https://issues.apache.org/jira/browse/UIMA-5666 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > Java has a ListIterator interface, which defines previous() and next() such > that repeated operations of next(), previous(), ... return the same item. > This means that the next impl returns the current item and then advances the > position, while the previous impl decrements the position (first) and then > returns that item. > v3 added impl of ListIterator to fsIterator, but implemented this backwards > for previous. Change this to comply with the definition in ListIterator. > Also, change other list-iterator-like things that have the same issue. This > change should not cause backwards compatibility issues, because v2 didn't > implement the ListIterator interface for FsIterators. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (UIMA-5675) uv3 upgrade internal listIterator interfaces to support no-checking next, previous
[ https://issues.apache.org/jira/browse/UIMA-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5675: - Summary: uv3 upgrade internal listIterator interfaces to support no-checking next, previous (was: uv3 upgrade internal IntListIterator interface to support no-checking next, previous) > uv3 upgrade internal listIterator interfaces to support no-checking next, > previous > -- > > Key: UIMA-5675 > URL: https://issues.apache.org/jira/browse/UIMA-5675 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > add the nextNvc, previousNvc methods, default the next and previous methods, > add the required impls, change the usage of next/prev to nextNvc/previousNvc > where appropriate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5677) uv3 internal list iterators - align with definition of previous
Marshall Schor created UIMA-5677: Summary: uv3 internal list iterators - align with definition of previous Key: UIMA-5677 URL: https://issues.apache.org/jira/browse/UIMA-5677 Project: UIMA Issue Type: Improvement Components: Core Java Framework Affects Versions: 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Fix For: 3.0.0SDK The Java definition of previous and next says that next followed by previous followed by next etc. always returns the same value. In our impls, this means that * next returns the currently pointed-to object, then advances, and * previous first goes backwards, and then returns that object Some of our internal implementations implement previous by returning the currently pointed-to object and then going backwards. Fix the implementations and uses (if any) (including the hasPrevious) so that they follow the same conventions as Java. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (UIMA-5675) uv3 upgrade internal IntListIterator interface to support no-checking next, previous
[ https://issues.apache.org/jira/browse/UIMA-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5675: - Summary: uv3 upgrade internal IntListIterator interface to support no-checking next, previous (was: uv3 upgrade internal IntListIterator interface to support no-checking next, prev) > uv3 upgrade internal IntListIterator interface to support no-checking next, > previous > > > Key: UIMA-5675 > URL: https://issues.apache.org/jira/browse/UIMA-5675 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > add the nextNvc, previousNvc methods, default the next and previous methods, > add the required impls, change the usage of next/prev to nextNvc/previousNvc > where appropriate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5675) uv3 upgrade internal IntListIterator interface to support no-checking next, prev
Marshall Schor created UIMA-5675: Summary: uv3 upgrade internal IntListIterator interface to support no-checking next, prev Key: UIMA-5675 URL: https://issues.apache.org/jira/browse/UIMA-5675 Project: UIMA Issue Type: Improvement Components: Core Java Framework Affects Versions: 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK add the nextNvc, previousNvc methods, default the next and previous methods, add the required impls, change the usage of next/prev to nextNvc/previousNvc where appropriate. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5674) uv3 upgrade multiple hash map impls
Marshall Schor created UIMA-5674: Summary: uv3 upgrade multiple hash map impls Key: UIMA-5674 URL: https://issues.apache.org/jira/browse/UIMA-5674 Project: UIMA Issue Type: Bug Components: Core Java Framework Affects Versions: 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Fix For: 3.0.0SDK The package org.apache.uima.internal.util has several hash maps and sets, all implemented in the same general way. Consolidate all of the "same" parts of these impls into a superclass, and fix various edge case bugs revealed from the comparisons of the multiple implementations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284040#comment-16284040 ] Marshall Schor commented on UIMA-5662: -- Some brief responses; will be thinking more about this in general. First, thanks for your discussion, always useful! Re: "do not really plan on adding a new built-in type...", the thought was to add a new "semi-built-in" type. These are just like built-in types, but you have to explicitly import them (for backwards compatibility when working with type systems from v2 with binary serializations). Although the "type" would be built-in, instances of it would not be - it would be up to the user to create 1 or more instances. Perhaps this is what you were trying to say. Re: "approach taken in the XMI deserializer-serializer where ID info is kept": yes, that is similar. Re: "I'd have to manually figure out the next ID" - yes, in v3, that's a simple API call on the FS: fs._id(). Re: circumstances: * lookups FS -> ID are fast - yes they are * Lookups ID -> FS are fast - that would depend on what kind of map was used, but in general it should be like a hash map. * maps stored in the CAS - I think not, in the proposal, if you mean in the sense that the client code could use some special CAS APIs to access via ints (such as the low-level CAS APIs). The idea I am exploring is generalizing this, which would involve the client knowing about the map. * The client code can set up an Id assignment - that could be "optional" - in that the client could choose to use the fs._id() int instead. * "one such strategy ..." - I think that was part of this proposal, if you restrict the "reader components" to deserializers. Re: some questions: * removing the need for XmiSerializationSharedData - that is used for multiple purposes, not just id mapping. * "out-of-typesystem info" - there's no generalization proposed for that - it is supported for some (not all) kinds of (de)serializations, more as an internal implementation detail. Re: risks: * if the maps are stored like FSes - the proposal would be to store these using the v3 support for arbitrary Java objects in the CAS. This support already accommodates serialization / deserialization to v2 systems, by arranging the transportable form to be common UIMA objects. (see https://uima.apache.org/d/uimaj-3.0.0-beta/version_3_users_guide.html#uv3.custom_java_objects )
[jira] [Comment Edited] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283707#comment-16283707 ] Marshall Schor edited comment on UIMA-5662 at 12/8/17 3:16 PM: --- I'm thinking about an alternative approach. Another Jira (UIMA-5664) adds a semi-built-in new type, a map from int -> FS. Another approach to deserialization could be: 1) change most deserializers to set the fsId to the v2-fs-id value imputed from the serialized form, either explicitly or implicitly. * This allows a subsequent serialization to reuse these, keeping them more-or-less stable. * This would be independent of supporting low-level-access, remembering the fs-id -> fs relation in a map, etc. 2) Allow users calling deserializers to specify a map from int -> FS (map of their choice, for instance a HashMap or LinkedHashMap). If such a parameter is provided, the deserializers would add all the deserialized FSs to the map, with key being the v2-fs-id. 3) The deserializers would **not** add the FSs to the internal hidden map that makes the low-level CAS API getFSForRef calls work. The pluses and minuses of this seem to be: * 3a) minus: low level CAS access doesn't work. Code using this needs to be upgraded. * 3b) plus: the internal low-level CAS access map not being used allows future garbage collection. * 3c) plus: users have more explicit control of what stays in the map. Things not in the map could be GCd. Depending on their use case, they could ** clear the map after use ** remove selected entries * 3d) Going forward, they could convert their application to save the map in the CAS (using the new semi-built-in type); they could support multiple maps, etc. Does this sound like a good general direction? Other thoughts?
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283707#comment-16283707 ] Marshall Schor commented on UIMA-5662: -- I'm thinking about an alternative approach. Another Jira (UIMA-5664) adds a semi-built-in new type, a map from int -> FS. Another approach to deserialization could be: 1) change most deserializers to set the fsId to the v2-fs-id value imputed from the serialized form, either explicitly or implicitly. * This allows a subsequent serialization to reuse these, keeping them more-or-less stable. * This would be independent of supporting low-level-access, remembering the fs-id -> fs relation in a map, etc. 2) Allow users calling deserializers to specify a map from int -> FS (map of their choice, for instance a HashMap or LinkedHashMap). If such a parameter is provided, the deserializers would add all the deserialized FSs to the map, with key being the v2-fs-id. 3) The deserializers would **not** add the FSs to the internal hidden map that makes the low-level CAS API getFSForRef calls work. The pluses and minuses of this seem to be: * 3a) minus: low level CAS access doesn't work. Code using this needs to be upgraded. * 3b) plus: the internal low-level CAS access map not being used allows future garbage collection. * 3c) plus: users have more explicit control of what stays in the map. Things not in the map could be GCd. Depending on their use case, they could ** clear the map after use ** remove selected entries * 3d) Going forward, they could convert their application to save the map in the CAS (using the new semi-built-in type); they could support multiple maps, etc. Does this sound like a good general direction? Other thoughts?
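Point 2 of the proposal above, where the caller hands the deserializer a map of its own choosing, could look roughly like the following. This is a sketch under stated assumptions: the Fs class is a stand-in rather than the UIMA feature structure type, and the "id:type" record format and the deserialize signature are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class IdMapSketch {
    /** Stand-in for a feature structure; not the UIMA class. */
    static final class Fs {
        final int id;
        final String type;

        Fs(int id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    /**
     * Hypothetical deserializer over records of the form "id:type".
     * If idToFs is non-null, every created Fs is also registered under its id.
     */
    static List<Fs> deserialize(List<String> records, Map<Integer, Fs> idToFs) {
        List<Fs> created = new ArrayList<>();
        for (String record : records) {
            String[] parts = record.split(":", 2);
            Fs fs = new Fs(Integer.parseInt(parts[0]), parts[1]);
            created.add(fs);
            if (idToFs != null) {
                idToFs.put(fs.id, fs);
            }
        }
        return created;
    }

    /** Registers two records, drops one entry, and returns the ids still mapped. */
    static List<Integer> demo() {
        Map<Integer, Fs> idToFs = new LinkedHashMap<>();
        deserialize(Arrays.asList("12:Token", "40:Sentence"), idToFs);
        idToFs.remove(12);   // the application decides what stays reachable
        return new ArrayList<>(idToFs.keySet());
    }
}
```

Because the map belongs to the caller, removing an entry or clearing the map is an ordinary java.util.Map operation, and anything no longer referenced elsewhere becomes eligible for garbage collection.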
[jira] [Created] (UIMA-5666) uv3 previous call in list style iterators not quite right
Marshall Schor created UIMA-5666: Summary: uv3 previous call in list style iterators not quite right Key: UIMA-5666 URL: https://issues.apache.org/jira/browse/UIMA-5666 Project: UIMA Issue Type: Bug Components: Core Java Framework Reporter: Marshall Schor Assignee: Marshall Schor Fix For: 3.0.0SDK Java has a ListIterator interface, which defines previous() and next() such that repeated operations of next(), previous(), ... return the same item. This means that the next impl returns the current item and then advances the position, while the previous impl decrements the position (first) and then returns that item. v3 added impl of ListIterator to fsIterator, but implemented this backwards for previous. Change this to comply with the definition in ListIterator. Also, change other list-iterator-like things that have the same issue. This change should not cause backwards compatibility issues, because v2 didn't implement the ListIterator interface for FsIterators. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5664) uv3 add semibuiltin map from int -> fs
Marshall Schor created UIMA-5664: Summary: uv3 add semibuiltin map from int -> fs Key: UIMA-5664 URL: https://issues.apache.org/jira/browse/UIMA-5664 Project: UIMA Issue Type: Improvement Components: Core Java Framework Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK Some applications find it so convenient to have a map from ints to FSs that they use the low level CAS APIs, and the fs's v2 "address", for this. In v3, use of the low level APIs in this manner disables garbage collection, as there is no way to clear the map. Support an alternative way to have int -> FS maps, with user control over what exactly gets added to them, and with removes and clearing under application control, by implementing a new semi-built-in type. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280247#comment-16280247 ] Marshall Schor commented on UIMA-5662: -- Thinking more (out loud) about this topic, mainly from the perspective of backwards compatibility: there seem to be 2 distinct issues: deserializing FSs with the fsIds in the serialized form, and populating the map that enables ll_getFSForRef( int ). The map already is populated, normally, only when a FS is created using a ll_ interface. It can be forced to populate for all FS creations using the -D swtich. Normally, it isn't populated for * regular fs creation * creation via cas copier * creation via deserializations * creation via side effects (e.g., if you do a cas.getDocumentAnnotation() an one doesn't exist). A general observation: if an application uses the ll_getFSForRef(int) for some FS access, this map must be populated for those FSs, in order to work. When deserializing, some serialization forms store explicitly an FsId, some "impute" an FsId from the layout. The latter forms don't preserve constant FsIds over a sequence of operations such as: # deserialize -> CAS # update by removing some of the deserialized FSs # reserialize out The reserialize (in v3) only serializes "reachable" FSs, so the "layout" will be different. When deserializing, we can create FSs with the same explicit FsIds. It makes sense to do this for just those forms where the FsId is stored explicitly, so they can remain "constant". For other forms, there's no point in doing this as far as I can see, because the ids are not constant. It would be possible to design things so that the deserialization (for those forms having an explicit fsId, and as long as the special "merge" form of deserialization isn't being used), we could implement this to always keep the same fsIds. 
(The merge form is used when sending a cas to multiple remote services in parallel, and "merging" back the results from all of those, when they return). This would allow backwards compatibility with applications using deserialization + getFSForRef() calls, without a global -D flag. I don't think it would have any negative impacts. For casCopying, if copying just a single fs, or a single view, the target cas may already have FSs with the same fsId. Rather than handle special cases where this might be made to work, for now, we can just say that cas copying won't preserve fsIds. > uv3 support CAS deserialization subsequent low level access > --- > > Key: UIMA-5662 > URL: https://issues.apache.org/jira/browse/UIMA-5662 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Some users depend 1) constant v2-ids for FSs preserved in deserialization and > serialization, and 2) low level cas API access to these. > V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak > refs are used) prevent GC of unreachable FSs. > Based on a mode, set by -Duima.deserialize_perserve_ids, and also > controllable by new config option per deserialize call, alter the > deserialization for those deserializers which know about v2 ids, to put these > into the map used for low-level CAS access, using the actual v2 ids, and > change the v3 next available id for future new FSs to be 1 beyond the end. > The -Duima.deserialize-preserve_ids global setting is needed to handle the > use case of some annotators using low-level APIs, when part of a pipeline is > "remoted". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
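Why "imputed" ids are not constant across a deserialize / remove / reserialize cycle can be illustrated with a small sketch. It makes a simplifying assumption that is not from the thread: the imputed id is just the 1-based position of an item in the serialized layout (real v2 ids are derived from the layout in a more involved way). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ImputedIdDemo {
  /** Assumption: the n-th item in the serialized layout gets id n (1-based). */
  static Map<String, Integer> imputeIds(List<String> serializedOrder) {
    Map<String, Integer> ids = new LinkedHashMap<>();
    int next = 1;
    for (String fs : serializedOrder) {
      ids.put(fs, next++);
    }
    return ids;
  }

  public static void main(String[] args) {
    List<String> fss = new ArrayList<>(Arrays.asList("A", "B", "C"));
    System.out.println(imputeIds(fss)); // {A=1, B=2, C=3}

    // Remove one FS; only the remaining (reachable) items are reserialized.
    fss.remove("B");
    System.out.println(imputeIds(fss)); // {A=1, C=2}  (C's id changed)
  }
}
```

With an explicitly stored id, C would keep its id after B is removed, which is why preserving ids only makes sense for the forms that store them.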
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280197#comment-16280197 ] Marshall Schor commented on UIMA-5662: -- The ll_getFSForRef of course goes the opposite way, mapping from int -> fs. The fs.hashcode() goes from fs -> int. > uv3 support CAS deserialization subsequent low level access > --- > > Key: UIMA-5662 > URL: https://issues.apache.org/jira/browse/UIMA-5662 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Some users depend 1) constant v2-ids for FSs preserved in deserialization and > serialization, and 2) low level cas API access to these. > V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak > refs are used) prevent GC of unreachable FSs. > Based on a mode, set by -Duima.deserialize_perserve_ids, and also > controllable by new config option per deserialize call, alter the > deserialization for those deserializers which know about v2 ids, to put these > into the map used for low-level CAS access, using the actual v2 ids, and > change the v3 next available id for future new FSs to be 1 beyond the end. > The -Duima.deserialize-preserve_ids global setting is needed to handle the > use case of some annotators using low-level APIs, when part of a pipeline is > "remoted". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16279200#comment-16279200 ] Marshall Schor commented on UIMA-5662: -- There are two parts to this solution. One is having the FsIds of the deserialized items (retrievable using fs.hashcode()) match what was serialized. This is doable for Object serialization, Binary (non-compressed) serialization, Xmi and XCAS serialization, but not for Binary compressed (form 4 and 6). The second part is to have low level fs retrieval using the ids (fs.hashcode()) work. These are independent; there already is a "global" switch (-Duima.enable_id_to_feature_structure_map_for_all_fss) that will do the 2nd part (at the expense of disabling garbage collection). A use case for this would be v2 code that did things like a) get a ref to a feature structure b) extract the fsId (e.g. fs.hashcode()) c) later, try to use this id with a low level cas API to retrieve the fs. This use case doesn't require any serialization/deserialization. It can be supported in v3 by enabling that flag. I'm wondering if the right implementation for this issue is just to have it do part 1, for the subset of serialized forms where the fs id can be retrieved. Users using the low level cas APIs would need to use the global -D switch, and then this change would make things work. I'm also wondering if we can dispense with all configuration switches (except for -Duima.enable_id_to_feature_structure_map_for_all_fss), by having the deserialization logic test if this -D... switch is on, and if so, installing the right fsIds when deserializing (those forms that have the fsIds)? That would seem to be a good set of trade-offs. WDYT? 
> uv3 support CAS deserialization subsequent low level access > --- > > Key: UIMA-5662 > URL: https://issues.apache.org/jira/browse/UIMA-5662 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Some users depend 1) constant v2-ids for FSs preserved in deserialization and > serialization, and 2) low level cas API access to these. > V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak > refs are used) prevent GC of unreachable FSs. > Based on a mode, set by -Duima.deserialize_perserve_ids, and also > controllable by new config option per deserialize call, alter the > deserialization for those deserializers which know about v2 ids, to put these > into the map used for low-level CAS access, using the actual v2 ids, and > change the v3 next available id for future new FSs to be 1 beyond the end. > The -Duima.deserialize-preserve_ids global setting is needed to handle the > use case of some annotators using low-level APIs, when part of a pipeline is > "remoted". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
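The proposal in the comment above (let the deserializer test the existing -D switch and install the stored fsIds only for forms that carry them) might look roughly like the following. This is a sketch under assumptions: the property name is taken from the comment, but how UIMA actually interprets its value is not stated there, so the rule "defined and not false means on" is a guess, and the class and method names are hypothetical.

```java
class FlagCheck {
  /** Property name as given in the comment (without the -D prefix). */
  static final String FLAG = "uima.enable_id_to_feature_structure_map_for_all_fss";

  /** Assumption: the switch is "on" when defined and not equal to "false". */
  static boolean isOn(String value) {
    return value != null && !"false".equalsIgnoreCase(value.trim());
  }

  /** Preserve ids only if the switch is on and the form stores explicit ids. */
  static boolean preserveIdsOnDeserialize(boolean formHasExplicitIds) {
    return formHasExplicitIds && isOn(System.getProperty(FLAG));
  }

  public static void main(String[] args) {
    // Run with -Duima.enable_id_to_feature_structure_map_for_all_fss to see "true" first.
    System.out.println(preserveIdsOnDeserialize(true));
    System.out.println(preserveIdsOnDeserialize(false)); // false: no ids to preserve
  }
}
```

A flag given as a bare -Dname (no value) shows up in System.getProperty as an empty string, which this sketch treats as "on".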
[jira] [Updated] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
[ https://issues.apache.org/jira/browse/UIMA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5662: - Description: Some users depend 1) constant v2-ids for FSs preserved in deserialization and serialization, and 2) low level cas API access to these. V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak refs are used) prevent GC of unreachable FSs. Based on a mode, set by -Duima.deserialize_perserve_ids, and also controllable by new config option per deserialize call, alter the deserialization for those deserializers which know about v2 ids, to put these into the map used for low-level CAS access, using the actual v2 ids, and change the v3 next available id for future new FSs to be 1 beyond the end. The -Duima.deserialize-preserve_ids global setting is needed to handle the use case of some annotators using low-level APIs, when part of a pipeline is "remoted". was: Some users depend 1) constant v2-ids for FSs preserved in deserialization and serialization, and 2) low level cas API access to these. V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak refs are used) prevent GC of unreachable FSs. Based on a mode, set by -Duima.deserialize_perserve_v2_ids, and also controllable by new config option per deserialize call, alter the deserialization for those deserializers which know about v2 ids, to put these into the map used for low-level CAS access, using the actual v2 ids, and change the v3 next available id for future new FSs to be 1 beyond the end. 
> uv3 support CAS deserialization subsequent low level access > --- > > Key: UIMA-5662 > URL: https://issues.apache.org/jira/browse/UIMA-5662 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Some users depend 1) constant v2-ids for FSs preserved in deserialization and > serialization, and 2) low level cas API access to these. > V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak > refs are used) prevent GC of unreachable FSs. > Based on a mode, set by -Duima.deserialize_perserve_ids, and also > controllable by new config option per deserialize call, alter the > deserialization for those deserializers which know about v2 ids, to put these > into the map used for low-level CAS access, using the actual v2 ids, and > change the v3 next available id for future new FSs to be 1 beyond the end. > The -Duima.deserialize-preserve_ids global setting is needed to handle the > use case of some annotators using low-level APIs, when part of a pipeline is > "remoted". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5660) uv3 JCas loading superclass test too strong
[ https://issues.apache.org/jira/browse/UIMA-5660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5660. -- Resolution: Fixed > uv3 JCas loading superclass test too strong > --- > > Key: UIMA-5660 > URL: https://issues.apache.org/jira/browse/UIMA-5660 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > When JCas Classes are checked for conformity to UIMA Type Systems, a check is > made that the super types match; if they don't, and error is thrown. > There are cases when JCas classes could be changed to have a different super > class inserted, which would be allowable. Accommodate this by changing the > action when this happens to be just a warning. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5662) uv3 support CAS deserialization subsequent low level access
Marshall Schor created UIMA-5662: Summary: uv3 support CAS deserialization subsequent low level access Key: UIMA-5662 URL: https://issues.apache.org/jira/browse/UIMA-5662 Project: UIMA Issue Type: Improvement Components: Core Java Framework Affects Versions: 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK Some users depend on 1) constant v2-ids for FSs being preserved across deserialization and serialization, and 2) low level cas API access to these. V3 normally doesn't maintain tables linking ids to FSs, as these (unless weak refs are used) prevent GC of unreachable FSs. Based on a mode, set by -Duima.deserialize_perserve_v2_ids, and also controllable by a new config option per deserialize call, alter the deserialization for those deserializers which know about v2 ids, to put these into the map used for low-level CAS access, using the actual v2 ids, and change the v3 next available id for future new FSs to be 1 beyond the end. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
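The two mechanical steps in this description (register each deserialized FS under its stored id, then move the next available id to 1 beyond the largest one seen) can be sketched as follows. This is an illustration only; IdTable and its methods are hypothetical names, and a strong-reference HashMap is used, which is exactly the kind of table that keeps its entries from being garbage collected.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical id table; a strong map like this prevents GC of its entries. */
class IdTable<T> {
  private final Map<Integer, T> byId = new HashMap<>();
  private int nextId = 1;

  /** Register a deserialized object under the id stored in the serialized form. */
  void registerDeserialized(int storedId, T fs) {
    if (byId.containsKey(storedId)) {
      throw new IllegalStateException("duplicate id " + storedId);
    }
    byId.put(storedId, fs);
    // Keep the next available id 1 beyond the largest stored id.
    nextId = Math.max(nextId, storedId + 1);
  }

  /** Allocate a fresh id for a newly created object. */
  int registerNew(T fs) {
    int id = nextId++;
    byId.put(id, fs);
    return id;
  }

  T get(int id) {
    return byId.get(id);
  }

  public static void main(String[] args) {
    IdTable<String> table = new IdTable<>();
    table.registerDeserialized(5, "fs-a");
    table.registerDeserialized(12, "fs-b");
    System.out.println(table.registerNew("fs-c")); // 13
    System.out.println(table.get(12));             // fs-b
  }
}
```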
[jira] [Created] (UIMA-5661) uv3 Ref doc: add all -D settings
Marshall Schor created UIMA-5661: Summary: uv3 Ref doc: add all -D settings Key: UIMA-5661 URL: https://issues.apache.org/jira/browse/UIMA-5661 Project: UIMA Issue Type: Bug Components: Core Java Framework Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK -D settings for some v3 things are missing from the v2 chapter having the table of these. Example -Duima.v2_pretty_print_format Scan code and add all missing ones. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5660) uv3 JCas loading superclass test too strong
Marshall Schor created UIMA-5660: Summary: uv3 JCas loading superclass test too strong Key: UIMA-5660 URL: https://issues.apache.org/jira/browse/UIMA-5660 Project: UIMA Issue Type: Improvement Components: Core Java Framework Affects Versions: 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Fix For: 3.0.0SDK When JCas Classes are checked for conformity to UIMA Type Systems, a check is made that the super types match; if they don't, an error is thrown. There are cases when JCas classes could be changed to have a different super class inserted, which would be allowable. Accommodate this by changing the action when this happens to be just a warning. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
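The relaxed check described here (report a supertype mismatch as a warning instead of throwing) might be sketched like this. It is not the UIMA conformance checker; SupertypeCheck, its method, and the demo classes are hypothetical, and java.util.logging stands in for whatever logger the framework uses.

```java
import java.util.logging.Logger;

class SupertypeCheck {
  private static final Logger LOG = Logger.getLogger(SupertypeCheck.class.getName());

  // Demo hierarchy: Inserted was added between Leaf and Base.
  static class Base {}
  static class Inserted extends Base {}
  static class Leaf extends Inserted {}

  /**
   * Returns true if the class's direct superclass is the expected one.
   * On a mismatch it logs a warning and returns false; it never throws.
   */
  static boolean checkSuper(Class<?> jcasClass, Class<?> expectedSuper) {
    Class<?> actual = jcasClass.getSuperclass();
    if (actual == expectedSuper) {
      return true;
    }
    LOG.warning("JCas class " + jcasClass.getName() + " has superclass "
        + (actual == null ? "<none>" : actual.getName())
        + " but the type system expects " + expectedSuper.getName());
    return false;
  }

  public static void main(String[] args) {
    System.out.println(checkSuper(Leaf.class, Inserted.class)); // true
    System.out.println(checkSuper(Leaf.class, Base.class));     // false, with a warning logged
  }
}
```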
[jira] [Resolved] (UIMA-5612) update the uima-wide shared superPOM
[ https://issues.apache.org/jira/browse/UIMA-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5612. -- Resolution: Fixed > update the uima-wide shared superPOM > > > Key: UIMA-5612 > URL: https://issues.apache.org/jira/browse/UIMA-5612 > Project: UIMA > Issue Type: Improvement > Components: Build, Packaging and Test >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: parent-pom-11 > > > Things to update: > 1) version of it's parent > 2) version of many plugins. e.g. the maven release plugin > 3) The Eclipse update site building - to support code-signing > Please add your candidates:-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (UIMA-5612) update the uima-wide shared superPOM
[ https://issues.apache.org/jira/browse/UIMA-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274637#comment-16274637 ] Marshall Schor commented on UIMA-5612: -- eclipse update site build process updated to support build signing > update the uima-wide shared superPOM > > > Key: UIMA-5612 > URL: https://issues.apache.org/jira/browse/UIMA-5612 > Project: UIMA > Issue Type: Improvement > Components: Build, Packaging and Test >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: parent-pom-11 > > > Things to update: > 1) version of it's parent > 2) version of many plugins. e.g. the maven release plugin > 3) The Eclipse update site building - to support code-signing > Please add your candidates:-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (UIMA-5612) update the uima-wide shared superPOM
[ https://issues.apache.org/jira/browse/UIMA-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor reassigned UIMA-5612: Assignee: Marshall Schor Fix Version/s: parent-pom-11 > update the uima-wide shared superPOM > > > Key: UIMA-5612 > URL: https://issues.apache.org/jira/browse/UIMA-5612 > Project: UIMA > Issue Type: Improvement > Components: Build, Packaging and Test >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: parent-pom-11 > > > Things to update: > 1) version of it's parent > 2) version of many plugins. e.g. the maven release plugin > 3) The Eclipse update site building - to support code-signing > Please add your candidates:-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (UIMA-4412) upgrade uima-wide parent pom to ref Apache pom version 17 (from 14)
[ https://issues.apache.org/jira/browse/UIMA-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor closed UIMA-4412. Resolution: Fixed > upgrade uima-wide parent pom to ref Apache pom version 17 (from 14) > --- > > Key: UIMA-4412 > URL: https://issues.apache.org/jira/browse/UIMA-4412 > Project: UIMA > Issue Type: Task > Components: Build, Packaging and Test >Affects Versions: parent-pom-10 >Reporter: Marshall Schor >Priority: Minor > Fix For: parent-pom-11 > > > I did a diff - this appears to be mostly version upgrades of various tools. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (UIMA-5612) update the uima-wide shared superPOM
[ https://issues.apache.org/jira/browse/UIMA-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273347#comment-16273347 ] Marshall Schor commented on UIMA-5612: -- First batch of updates to 11-SNAPSHOT (you can test by changing your project's parent pom from 10 to 11-SNAPSHOT) Apache-wide superpom - update to current: apache pom 14 -> 18 switch http to https: mail archives (except uima.markmail.org) repositories: repo1.maven.org maven prereq 3.0.0 -> 3.3.9 (could go higher) defined version properties: 0.12 3.0.0-M1 3.0.5 3.0.0-M1 3.3.0 1.7 1.7 512m removed defunct repository removed inherited repository plugin upgrades: deploy -> 2.8.2 source -> 3.0.1 resources -> 3.0.2 remote-resources -> 1.5 build-helper 1.8 -> 3.0.0 enforcer 1.3 -> 3.0.0-M1 plugin-plugin - 3.3 to 3.5, removed: true compiler plugin 1.7 -> 3.7.0 dependency 2.10 -> 3.0.2 jar 2.3.2 -> 3.0.2 assembly 2.6 -> 3.1.0 findbugs 2.5.4 -> 3.0.5 eclipse-plugin 2.8 -> 2.10 (may not be used) java source/target 1.6 -> 1.7 pearPackagingMavenPlugin 2.5 -> 2.10.2 antrun 1.7 -> 1.8 ant dependencies: ant-contrib 1.0b3 -> 20020829, ant-apache-regexp 1.9.2 -> 1.10.1 bundle 2.3.7 -> 3.3.0 rat 0.11 -> 0.12 changes 2.10 -> 2.12.1 findbugs 2.5.4 -> 3.0.5 cobertura 2.6 -> 2.7 pmd 3.1 -> 3.8 configured javadoc plugin for source = ${maven.compiler.source} and javadocVersion = ${maven.compiler.source} configured enforcer to require java version 1.7, and maven 3.3.9 > update the uima-wide shared superPOM > > > Key: UIMA-5612 > URL: https://issues.apache.org/jira/browse/UIMA-5612 > Project: UIMA > Issue Type: Improvement > Components: Build, Packaging and Test >Reporter: Marshall Schor >Priority: Minor > > Things to update: > 1) version of its parent > 2) version of many plugins. e.g. the maven release plugin > 3) The Eclipse update site building - to support code-signing > Please add your candidates:-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5658) 2 uimaj eclipse plugins being dropped from update site when using recent bundle plugin version
[ https://issues.apache.org/jira/browse/UIMA-5658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5658. -- Resolution: Fixed > 2 uimaj eclipse plugins being dropped from update site when using recent > bundle plugin version > -- > > Key: UIMA-5658 > URL: https://issues.apache.org/jira/browse/UIMA-5658 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta, 2.10.2SDK >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK, 2.10.3SDK > > > For a while, we've had to run the "bundle" step for the uimaj-ep-jcasgenp and > uimaj-ep-configurator eclipse plugins using maven-bundle-plugin version > 2.3.7, which is an old release of the bundle plugin. If run with more > current versions, no errors are reported, and the build works OK, but the > subsequent build of the eclipse-update-site drops these two plugins when > running the features-and-bundles-publishing step. The workaround was to use > the older version of the maven-bundle-plugin. > By trial and error, found the fix for this was changing the clause to keep the line having just "*" as the last line. This involved > reordering the lines following this up before this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (UIMA-5658) 2 uimaj eclipse plugins being dropped from update site when using recent bundle plugin version
[ https://issues.apache.org/jira/browse/UIMA-5658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5658: - Summary: 2 uimaj eclipse plugins being dropped from update site when using recent bundle plugin version (was: 2 uimaj eclipse plugins dropped from update site with recent bundle plugin) > 2 uimaj eclipse plugins being dropped from update site when using recent > bundle plugin version > -- > > Key: UIMA-5658 > URL: https://issues.apache.org/jira/browse/UIMA-5658 > Project: UIMA > Issue Type: Bug > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta, 2.10.2SDK >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK, 2.10.3SDK > > > For a while, we've had to run the "bundle" step for the uimaj-ep-jcasgenp and > uimaj-ep-configurator eclipse plugins using maven-bundle-plugin version > 2.3.7, which is an old release of the bundle plugin. If run with more > current versions, no errors are reported, and the build works OK, but the > subsequent build of the eclipse-update-site drops these two plugins when > running the features-and-bundles-publishing step. The workaround was to use > the older version of the maven-bundle-plugin. > By trial and error, found the fix for this was changing the clause to keep the line having just "*" as the last line. This involved > reordering the lines following this up before this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5658) 2 uimaj eclipse plugins dropped from update site with recent bundle plugin
Marshall Schor created UIMA-5658: Summary: 2 uimaj eclipse plugins dropped from update site with recent bundle plugin Key: UIMA-5658 URL: https://issues.apache.org/jira/browse/UIMA-5658 Project: UIMA Issue Type: Bug Components: Core Java Framework Affects Versions: 2.10.2SDK, 3.0.0SDK-beta Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK, 2.10.3SDK For a while, we've had to run the "bundle" step for the uimaj-ep-jcasgenp and uimaj-ep-configurator eclipse plugins using maven-bundle-plugin version 2.3.7, which is an old release of the bundle plugin. If run with more current versions, no errors are reported, and the build works OK, but the subsequent build of the eclipse-update-site drops these two plugins when running the features-and-bundles-publishing step. The workaround was to use the older version of the maven-bundle-plugin. By trial and error, found the fix for this was changing the
[jira] [Updated] (UIMA-5655) uv3 migrate to current versions of slf4j and log4j and other dependencies
[ https://issues.apache.org/jira/browse/UIMA-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5655: - Description: move to log4j 2.10.0 move to slf4j 1.7.25 move to jackson-core 2.9.2 was: log4j is now at 2.10.0 slf4j is now at 1.7.25 > uv3 migrate to current versions of slf4j and log4j and other dependencies > - > > Key: UIMA-5655 > URL: https://issues.apache.org/jira/browse/UIMA-5655 > Project: UIMA > Issue Type: Improvement >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > move to log4j 2.10.0 > move to slf4j 1.7.25 > move to jackson-core 2.9.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (UIMA-5655) uv3 migrate to current versions of slf4j and log4j and other dependencies
[ https://issues.apache.org/jira/browse/UIMA-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5655: - Summary: uv3 migrate to current versions of slf4j and log4j and other dependencies (was: uv3 migrate to current versions of slf4j and log4j) > uv3 migrate to current versions of slf4j and log4j and other dependencies > - > > Key: UIMA-5655 > URL: https://issues.apache.org/jira/browse/UIMA-5655 > Project: UIMA > Issue Type: Improvement >Reporter: Marshall Schor >Assignee: Marshall Schor > Fix For: 3.0.0SDK > > > log4j is now at 2.10.0 > slf4j is now at 1.7.25 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (UIMA-5656) uv3 consider enabling Java 9
[ https://issues.apache.org/jira/browse/UIMA-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267075#comment-16267075 ] Marshall Schor commented on UIMA-5656: -- A fairly big problem is modularization and split packages. Uimaj's artifacts - core, cpe, etc., have split packages. For instance, both uimaj-core and uimaj-cpe projects have classes in package org.apache.uima.collection.impl . I think that two projects, one having package a.b , the other package a.b.c , are not considered split packages. Can anyone confirm that? > uv3 consider enabling Java 9 > > > Key: UIMA-5656 > URL: https://issues.apache.org/jira/browse/UIMA-5656 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Investigate and see what's needed to enable running with Java 9. > Modify code to make future use of Java 9 better. > Continue to run with Java 8 > A main thing to do now is to make automatic naming of Jars have appropriate > names. See > http://blog.joda.org/2017/05/java-se-9-jpms-automatic-modules.html . -- This message was sent by Atlassian JIRA (v6.4.14#64029)
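A split package in the sense discussed above is the same package name appearing in more than one artifact, so detecting candidates is a set intersection over exact package names. The sketch below assumes the package lists have already been collected per artifact; apart from org.apache.uima.collection.impl, which the comment names, the package lists are illustrative, and the class name is hypothetical. Since Java compares package names exactly, a.b and a.b.c count as two different packages and would not show up as a split here.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class SplitPackages {
  /** Packages present in both artifacts, compared by exact name. */
  static Set<String> splitPackages(Set<String> first, Set<String> second) {
    Set<String> common = new TreeSet<>(first);
    common.retainAll(second);
    return common;
  }

  public static void main(String[] args) {
    // Illustrative package lists, not an inventory of the real jars.
    Set<String> core = new TreeSet<>(Arrays.asList(
        "org.apache.uima.cas",
        "org.apache.uima.collection.impl"));
    Set<String> cpe = new TreeSet<>(Arrays.asList(
        "org.apache.uima.collection.impl",
        "org.apache.uima.collection.impl.cpm"));

    System.out.println(splitPackages(core, cpe)); // [org.apache.uima.collection.impl]
  }
}
```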
[jira] [Updated] (UIMA-5656) uv3 consider enabling Java 9
[ https://issues.apache.org/jira/browse/UIMA-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5656: - Description: Investigate and see what's needed to enable running with Java 9. Modify code to make future use of Java 9 better. Continue to run with Java 8 A main thing to do now is to make automatic naming of Jars have appropriate names. See http://blog.joda.org/2017/05/java-se-9-jpms-automatic-modules.html . was: Investigate and see what's needed to enable running with Java 9. Modify code to make future use of Java 9 better. Continue to run with Java 8 > uv3 consider enabling Java 9 > > > Key: UIMA-5656 > URL: https://issues.apache.org/jira/browse/UIMA-5656 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Investigate and see what's needed to enable running with Java 9. > Modify code to make future use of Java 9 better. > Continue to run with Java 8 > A main thing to do now is to make automatic naming of Jars have appropriate > names. See > http://blog.joda.org/2017/05/java-se-9-jpms-automatic-modules.html . -- This message was sent by Atlassian JIRA (v6.4.14#64029)
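One way a Jar gets a stable automatic module name on Java 9 is an Automatic-Module-Name entry in its manifest; whether that is the mechanism ultimately chosen for this issue is not stated here. The sketch below only shows writing and reading that attribute with java.util.jar, which also works on Java 8. The module name org.apache.uima.core and the class name are made up for illustration.

```java
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class AutoModuleName {
  static final Attributes.Name AUTOMATIC_MODULE_NAME =
      new Attributes.Name("Automatic-Module-Name");

  /** Build an in-memory manifest carrying the given module name. */
  static Manifest withModuleName(String moduleName) {
    Manifest mf = new Manifest();
    Attributes main = mf.getMainAttributes();
    main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    main.put(AUTOMATIC_MODULE_NAME, moduleName);
    return mf;
  }

  /** Returns the declared name, or null if the attribute is absent. */
  static String moduleNameOf(Manifest mf) {
    return mf.getMainAttributes().getValue(AUTOMATIC_MODULE_NAME);
  }

  public static void main(String[] args) {
    // "org.apache.uima.core" is a hypothetical name, not a project decision.
    Manifest mf = withModuleName("org.apache.uima.core");
    System.out.println(moduleNameOf(mf));
    System.out.println(moduleNameOf(new Manifest())); // null: no name declared
  }
}
```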
[jira] [Created] (UIMA-5656) uv3 consider enabling Java 9
Marshall Schor created UIMA-5656: Summary: uv3 consider enabling Java 9 Key: UIMA-5656 URL: https://issues.apache.org/jira/browse/UIMA-5656 Project: UIMA Issue Type: Improvement Components: Core Java Framework Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK Investigate and see what's needed to enable running with Java 9. Modify code to make future use of Java 9 better. Continue to run with Java 8 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5655) uv3 migrate to current versions of slf4j and log4j
Marshall Schor created UIMA-5655: Summary: uv3 migrate to current versions of slf4j and log4j Key: UIMA-5655 URL: https://issues.apache.org/jira/browse/UIMA-5655 Project: UIMA Issue Type: Improvement Reporter: Marshall Schor Assignee: Marshall Schor Fix For: 3.0.0SDK log4j is now at 2.10.0 slf4j is now at 1.7.25 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5642) uv3 migration tool should add comment about the conversion
[ https://issues.apache.org/jira/browse/UIMA-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5642. -- Resolution: Fixed > uv3 migration tool should add comment about the conversion > -- > > Key: UIMA-5642 > URL: https://issues.apache.org/jira/browse/UIMA-5642 > Project: UIMA > Issue Type: Improvement > Components: Tools >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > The current v3 migration tool should add a comment for each converted class, > similar to what JCasGen does, documenting the running of the tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (UIMA-5642) uv3 migration tool should add comment about the conversion
[ https://issues.apache.org/jira/browse/UIMA-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor reassigned UIMA-5642: Assignee: Marshall Schor Affects Version/s: (was: 3.0.0SDK-beta) Fix Version/s: 3.0.0SDK > uv3 migration tool should add comment about the conversion > -- > > Key: UIMA-5642 > URL: https://issues.apache.org/jira/browse/UIMA-5642 > Project: UIMA > Issue Type: Improvement > Components: Tools >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > The current v3 migration tool should add a comment for each converted class, > similar to what JCasGen does, documenting the running of the tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (UIMA-5550) Eclipse Update Site improvement for 3.0.0SDK
[ https://issues.apache.org/jira/browse/UIMA-5550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor closed UIMA-5550. Resolution: Fixed Fix Version/s: (was: 3.0.0SDK) fixed - new uima.apache.org website pages describe this process; implemented it to prune the update site for 2.10.2 > Eclipse Update Site improvement for 3.0.0SDK > > > Key: UIMA-5550 > URL: https://issues.apache.org/jira/browse/UIMA-5550 > Project: UIMA > Issue Type: Task > Components: Eclipse plugins >Reporter: Marshall Schor >Assignee: Marshall Schor > > Figure out how to drop the older versions (alpha and beta) from the eclipse > update site for 3.0.0. > Decide on a naming convention for the 3.0.0 "branch" if keeping separate. > Figure out how to move older versions for v2 Uima to "archive" spot, not in > the Apache "mirrors". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5652) uv3 update/impl logging mdc/ndc
[ https://issues.apache.org/jira/browse/UIMA-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5652. -- Resolution: Fixed > uv3 update/impl logging mdc/ndc > --- > > Key: UIMA-5652 > URL: https://issues.apache.org/jira/browse/UIMA-5652 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Replace NDC with MDC, implement suitable context info, update docs. > Context maintained: > 1) id of pipeline being run (an incr int + suitable string if one can be > found, from root context) > 2) id of CAS (an incr int, and/or the CAS UniqueId) > 3) annotator being run (within a pipeline) - both Java Class & the path in > the context to it > Insure values are not held onto to prevent GC problems. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5653) uv3 update trace logging using defined markers
Marshall Schor created UIMA-5653: Summary: uv3 update trace logging using defined markers Key: UIMA-5653 URL: https://issues.apache.org/jira/browse/UIMA-5653 Project: UIMA Issue Type: Improvement Components: Core Java Framework Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK The docs define some tracing markers. Add tracing logging using these markers. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5652) uv3 update/impl logging mdc/ndc
Marshall Schor created UIMA-5652: Summary: uv3 update/impl logging mdc/ndc Key: UIMA-5652 URL: https://issues.apache.org/jira/browse/UIMA-5652 Project: UIMA Issue Type: Improvement Components: Core Java Framework Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK Replace NDC with MDC, implement suitable context info, update docs. Context maintained: 1) id of pipeline being run (an incr int + suitable string if one can be found, from root context) 2) id of CAS (an incr int, and/or the CAS UniqueId) 3) annotator being run (within a pipeline) - both Java Class & the path in the context to it Ensure values are not held onto, to prevent GC problems. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
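The last requirement (context values must not linger once the work is done) is usually met by scoping each value to a try block. The sketch below uses a tiny thread-local stand-in rather than a real logging MDC so that it runs without any logging library; MiniMdc, Scope, and the key names are hypothetical and are not the keys UIMA defines.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal thread-local stand-in for a logging MDC; for illustration only. */
class MiniMdc {
  private static final ThreadLocal<Map<String, String>> CTX =
      ThreadLocal.withInitial(HashMap::new);

  /** A scope whose close() removes the value it put. */
  interface Scope extends AutoCloseable {
    @Override
    void close();
  }

  /** Put a value and return a scope that removes it again when closed. */
  static Scope put(String key, String value) {
    CTX.get().put(key, value);
    return () -> CTX.get().remove(key);
  }

  static String get(String key) {
    return CTX.get().get(key);
  }

  public static void main(String[] args) {
    // Hypothetical key names for the pipeline and annotator context.
    try (Scope pipeline = put("pipeline_id", "1");
         Scope annotator = put("annotator", "com.example.MyAnnotator")) {
      System.out.println(get("pipeline_id") + " / " + get("annotator"));
    }
    System.out.println(get("pipeline_id")); // null: nothing is held after the scope ends
  }
}
```

The try-with-resources form removes the values even when the annotator throws, so no reference to the context strings outlives the call.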
[jira] [Resolved] (UIMA-5650) uv3 add FSLinkedHashSet
[ https://issues.apache.org/jira/browse/UIMA-5650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5650. -- Resolution: Fixed > uv3 add FSLinkedHashSet > --- > > Key: UIMA-5650 > URL: https://issues.apache.org/jira/browse/UIMA-5650 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > Many users of HashSet find they need to switch to a LinkedHashSet in order to > have a predictable iteration order, for reproducibility. Add this style. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
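Editorial sketch of the motivation given in UIMA-5650. It uses only java.util collections, not the FSLinkedHashSet class the issue adds, and the class and method names are made up for illustration. It shows the property being asked for: a LinkedHashSet iterates in insertion order, while a HashSet makes no ordering promise.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only; does not use any UIMA class.
class IterationOrderSketch {
  /** Adds the items to a LinkedHashSet and returns them in iteration order. */
  static List<String> insertionOrder(String... items) {
    Set<String> set = new LinkedHashSet<>();
    for (String s : items) {
      set.add(s); // duplicates are ignored; first insertion fixes the position
    }
    return new ArrayList<>(set);
  }

  public static void main(String[] args) {
    String[] items = {"zeta", "alpha", "mu", "alpha"};

    Set<String> hashed = new HashSet<>();
    for (String s : items) {
      hashed.add(s);
    }
    // HashSet order depends on hash codes and table size; it is unspecified.
    System.out.println("HashSet:       " + hashed);

    // LinkedHashSet order is always the insertion order.
    System.out.println("LinkedHashSet: " + insertionOrder(items));
  }
}
```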
[jira] [Created] (UIMA-5650) uv3 add FSLinkedHashSet
Marshall Schor created UIMA-5650: Summary: uv3 add FSLinkedHashSet Key: UIMA-5650 URL: https://issues.apache.org/jira/browse/UIMA-5650 Project: UIMA Issue Type: Improvement Components: Core Java Framework Reporter: Marshall Schor Assignee: Marshall Schor Priority: Minor Fix For: 3.0.0SDK Many users of HashSet find they need to switch to a LinkedHashSet in order to have a predictable iteration order, for reproducibility. Add this style. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (UIMA-4666) UV3 JCasGen
[ https://issues.apache.org/jira/browse/UIMA-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor closed UIMA-4666. Resolution: Fixed Fix Version/s: 3.0.0SDK > UV3 JCasGen > --- > > Key: UIMA-4666 > URL: https://issues.apache.org/jira/browse/UIMA-4666 > Project: UIMA > Issue Type: Story > Components: Core Java Framework >Reporter: Marshall Schor > Fix For: 3.0.0SDK > > > Changes to JCasGen approach. > Allow users an easy migration from existing JCasGen classes > - a reporting tool that identifies existing JCasGen classes being used > - tooling support for setting up "merged" JCasGen class definitions from > multiple sources > -- Goal: support PEARs having different (but mergeable) JCasGen definitions > Internal: only one class per type - the xxx_type class is removed in UV3 > - document migration path for users previously using xxx_type capabilities > for low-level CAS operation > Document use-cases for JCasGen customization and changes possible due to new > support for JavaObjects as data in Features. > Document the use-cases supported, including > - A single set of JCasGen classes used for different type systems > - External semi-manual merging of JCasGen classes from PEARs and their > containers > - Continuing to have JCasGen optional -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (UIMA-5642) uv3 migration tool should add comment about the conversion
Marshall Schor created UIMA-5642: Summary: uv3 migration tool should add comment about the conversion Key: UIMA-5642 URL: https://issues.apache.org/jira/browse/UIMA-5642 Project: UIMA Issue Type: Improvement Components: Tools Affects Versions: 3.0.0SDK-beta Reporter: Marshall Schor Priority: Minor The current v3 migration tool should add a comment for each converted class, similar to what JCasGen does, documenting the running of the tool. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (UIMA-4662) UIMA Version 3 Epic: A container for the various changes in UIMA V3
[ https://issues.apache.org/jira/browse/UIMA-4662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor closed UIMA-4662. Resolution: Fixed Fix Version/s: 3.0.0SDK > UIMA Version 3 Epic: A container for the various changes in UIMA V3 > --- > > Key: UIMA-4662 > URL: https://issues.apache.org/jira/browse/UIMA-4662 > Project: UIMA > Issue Type: Epic > Components: Core Java Framework >Reporter: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK > > > I'm trying something new: organizing the various issues for UIMA v3 under > this "Epic". > Version 3 of UIMA adds some often-requested features to UIMA. > It is planned to be mostly backwards compatible, with some needed > changes supported by migration aids (JCasGen in particular). > It will require and make use of Java 8 capabilities. > It will trade off more storage for performance, given the trends in hardware > over the past 15 years. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (UIMA-5637) JCasUtil.selectAt has different meaning than expected
[ https://issues.apache.org/jira/browse/UIMA-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236487#comment-16236487 ] Marshall Schor commented on UIMA-5637: -- I agree, I think the && should be ||. > JCasUtil.selectAt has different meaning than expected > - > > Key: UIMA-5637 > URL: https://issues.apache.org/jira/browse/UIMA-5637 > Project: UIMA > Issue Type: Bug > Components: uimaFIT >Affects Versions: 2.3.0uimaFIT >Reporter: Mario Juric >Priority: Major > >
> I was wondering what exactly the semantics of JCasUtil.selectAt are supposed to be after looking into the implementation, since the JavaDoc isn’t very precise. I initially thought that it would select annotations of the given type with the exact begin and end, but this is not the case when inspecting the implementation. The problem is in CasUtil.selectAt with the following while loop:
> while (it.isValid()) {
>   AnnotationFS a = it.get();
>   // If the offsets do not match the specified offsets, we're done
>   if (a.getBegin() != aBegin && a.getEnd() != aEnd) {
>     break;
>   }
>   it.moveToNext();
>   list.add(a);
> }
> I would have expected that either begin or end must be different to drop the item, i.e. "if (a.getBegin() != aBegin || a.getEnd() != aEnd)" instead. This is obviously not the case, and it does not have the same behaviour as selectCovered either, so what is the intent if it’s not a bug?
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
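Editorial sketch of the difference discussed in UIMA-5637. It does not use uimaFIT: Span is a hypothetical stand-in for an annotation, and collect() imitates only the stop condition of the quoted loop over an already-positioned, sorted list (assumed order: begin ascending, then end descending). The flag switches between the reported "&&" condition and the proposed "||" condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only; not the uimaFIT implementation.
class SelectAtSketch {
  /** Hypothetical stand-in for an annotation: a begin and an end offset. */
  static final class Span {
    final int begin;
    final int end;

    Span(int begin, int end) {
      this.begin = begin;
      this.end = end;
    }

    @Override
    public String toString() {
      return "(" + begin + "," + end + ")";
    }
  }

  /** Collects spans until the stop condition fires, as the quoted loop does. */
  static List<Span> collect(List<Span> sorted, int aBegin, int aEnd, boolean useOr) {
    List<Span> result = new ArrayList<>();
    for (Span a : sorted) {
      boolean beginDiffers = a.begin != aBegin;
      boolean endDiffers = a.end != aEnd;
      // "&&" stops only when BOTH offsets differ; "||" stops when EITHER differs.
      boolean stop = useOr ? (beginDiffers || endDiffers) : (beginDiffers && endDiffers);
      if (stop) {
        break;
      }
      result.add(a);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Span> spans = Arrays.asList(new Span(0, 5), new Span(0, 3), new Span(7, 9));
    // With "&&", (0,3) is kept because its begin still matches.
    System.out.println("&& variant: " + collect(spans, 0, 5, false));
    // With "||", only the exact match (0,5) is kept.
    System.out.println("|| variant: " + collect(spans, 0, 5, true));
  }
}
```

For a query of begin 0 and end 5, the "&&" variant returns two spans and the "||" variant returns one, which is the behaviour gap the reporter describes.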
[jira] [Resolved] (UIMA-5633) uv3 generic spec poor for select api in some Interfaces
[ https://issues.apache.org/jira/browse/UIMA-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5633. -- Resolution: Fixed > uv3 generic spec poor for select api in some Interfaces > --- > > Key: UIMA-5633 > URL: https://issues.apache.org/jira/browse/UIMA-5633 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK-beta > > > For situations where an interface is generically typed (e.g. FSIndex), the > generic spec for methods like select should have a param like , > not . This allows better automation for types > while using these forms in user code. Do a review of the generic specs and > update where needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5631) uv3 jcas interface missing the emptyXXXlist methods
[ https://issues.apache.org/jira/browse/UIMA-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5631. -- Resolution: Fixed > uv3 jcas interface missing the emptyXXXlist methods > --- > > Key: UIMA-5631 > URL: https://issues.apache.org/jira/browse/UIMA-5631 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK-beta > > > I noticed that the jcas interface in uv3 was missing the emptyFloatList, > emptyStringList, etc. methods that are part of the cas interface. Add these. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5630) uv3 documentation - update 4 older books to reflect better uv3
[ https://issues.apache.org/jira/browse/UIMA-5630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5630. -- Resolution: Fixed > uv3 documentation - update 4 older books to reflect better uv3 > -- > > Key: UIMA-5630 > URL: https://issues.apache.org/jira/browse/UIMA-5630 > Project: UIMA > Issue Type: Improvement > Components: Documentation >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK-beta > > > Make a pass through the 4 "books" of docs for UIMA sdk, and update to use new > v3 idioms, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5632) uv3 javadocs missing on some cas/jcas/fsindex APIs for select
[ https://issues.apache.org/jira/browse/UIMA-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5632. -- Resolution: Fixed > uv3 javadocs missing on some cas/jcas/fsindex APIs for select > - > > Key: UIMA-5632 > URL: https://issues.apache.org/jira/browse/UIMA-5632 > Project: UIMA > Issue Type: Improvement >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK-beta > > > add missing javadocs for methods like select in the various APIs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (UIMA-5625) uv3 update Examples project to use UV3 idioms
[ https://issues.apache.org/jira/browse/UIMA-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor resolved UIMA-5625. -- Resolution: Fixed > uv3 update Examples project to use UV3 idioms > - > > Key: UIMA-5625 > URL: https://issues.apache.org/jira/browse/UIMA-5625 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK-beta > > > UV3 has many simplified idioms for accessing things in the CAS, integrating > with Java 8, etc. Upgrade the examples in the uimaj-examples project to make > use of these. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (UIMA-5625) uv3 update Examples project to use UV3 idioms
[ https://issues.apache.org/jira/browse/UIMA-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor reassigned UIMA-5625: Assignee: Marshall Schor > uv3 update Examples project to use UV3 idioms > - > > Key: UIMA-5625 > URL: https://issues.apache.org/jira/browse/UIMA-5625 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK-beta > > > UV3 has many simplified idioms for accessing things in the CAS, integrating > with Java 8, etc. Upgrade the examples in the uimaj-examples project to make > use of these. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (UIMA-5633) uv3 generic spec poor for select api in some Interfaces
[ https://issues.apache.org/jira/browse/UIMA-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5633: - Fix Version/s: (was: 3.0.0SDK) 3.0.0SDK-beta > uv3 generic spec poor for select api in some Interfaces > --- > > Key: UIMA-5633 > URL: https://issues.apache.org/jira/browse/UIMA-5633 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK-beta > > > For situations where an interface is generically typed (e.g. FSIndex), the > generic spec for methods like select should have a param like , > not . This allows better automation for types > while using these forms in user code. Do a review of the generic specs and > update where needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (UIMA-5633) uv3 generic spec poor for select api in some Interfaces
[ https://issues.apache.org/jira/browse/UIMA-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marshall Schor updated UIMA-5633: - Description: For situations where an interface is generically typed (e.g. FSIndex), the generic spec for methods like select should have a param like , not . This allows better automation for types while using these forms in user code. Do a review of the generic specs and update where needed. (was: For situations where an interface is generically typed (e.g. FSIndex), the generic spec for methods like select should have a param like , not . This allows better automation for types while using these forms in user code.) > uv3 generic spec poor for select api in some Interfaces > --- > > Key: UIMA-5633 > URL: https://issues.apache.org/jira/browse/UIMA-5633 > Project: UIMA > Issue Type: Improvement > Components: Core Java Framework >Affects Versions: 3.0.0SDK-beta >Reporter: Marshall Schor >Assignee: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDK-beta > > > For situations where an interface is generically typed (e.g. FSIndex), the > generic spec for methods like select should have a param like , > not . This allows better automation for types > while using these forms in user code. Do a review of the generic specs and > update where needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)