[ https://issues.apache.org/jira/browse/UIMA-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240559#comment-17240559 ]
Mario Juric commented on UIMA-6295:
-----------------------------------

I found the cause of the problem, which occurs during serialization: _save_to_cas_data is called twice. The first call happens during processIndexedFeatureStructures (l. 753, BinaryCasSerDes6 and l. 157, AllFSs), where the feature structures are collected for serialization. The second call follows immediately in initSrcTgtIdMapsAndStrings (l. 775, BinaryCasSerDes6 and l. 3041, BinaryCasSerDes6), where the serialization identifier mappings are constructed.

After the second save call, the collected feature structure is no longer the same as the one inside the FeatureMap. When the array field is then serialized, the id lookup fails for the new FSArray (l. 835, BinaryCasSerDes6 and l. 986, BinaryCasSerDes6), because the serialization identifier mappings hold the id of the previous FSArray.

Everything works after I commented out the second save call (l. 3040-3042, BinaryCasSerDes6), but I am not sure whether this is the correct solution, since I don't know what motivated the second call, and there could be edge cases that it is supposed to solve. However, all tests still succeed after the change, and I could add simple unit tests that capture the current issue, so that the problem is not re-introduced if such an edge case shows up in the future. To be honest, I wouldn't know what else to do without knowing why this additional call was introduced.

The alternative is to leave the code unchanged and rely on a workaround in which users must check whether the data structure has changed before initializing a new feature structure. I would consider it flawed to rely on users to figure this out themselves, and it is inefficient when there is more than one save call.
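The failure mode described above can be reproduced in miniature without UIMA. The sketch below is only an illustration under stated assumptions: MapHolder, saveNaive, saveGuarded and the identity-keyed id map are hypothetical stand-ins, not the actual BinaryCasSerDes6 or UimaSerializable code. It shows that a save method which allocates a new array on every call invalidates ids assigned to the previously collected array, and that a save method which only rebuilds the array after a change (the user-side workaround) keeps the lookup intact.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of the double-save problem; no UIMA classes are used.
class DoubleSaveSketch {

    // Stand-in for a UIMA type that wraps a Java HashMap (assumption, not real API).
    static class MapHolder {
        final Map<String, String> javaMap = new HashMap<>();
        Object[] casArray;            // stand-in for the FSArray feature value
        private boolean dirty = true; // true when javaMap changed since the last save

        void put(String key, String value) {
            javaMap.put(key, value);
            dirty = true;
        }

        // Naive save: allocates a fresh array object on every call.
        void saveNaive() {
            casArray = javaMap.entrySet().toArray();
        }

        // Guarded save: rebuilds the array only if the map changed (the workaround).
        void saveGuarded() {
            if (dirty || casArray == null) {
                casArray = javaMap.entrySet().toArray();
                dirty = false;
            }
        }
    }

    static void save(MapHolder holder, boolean guarded) {
        if (guarded) {
            holder.saveGuarded();
        } else {
            holder.saveNaive();
        }
    }

    // Returns true if the array referenced at serialization time has an id.
    static boolean lookupSucceeds(boolean guarded) {
        MapHolder holder = new MapHolder();
        holder.put("k", "v");

        // Pass 1: save, then collect the array for serialization.
        save(holder, guarded);
        List<Object> collected = new ArrayList<>();
        collected.add(holder.casArray);

        // Pass 2: save again, then assign ids to the objects collected in pass 1.
        save(holder, guarded);
        Map<Object, Integer> ids = new IdentityHashMap<>();
        for (int i = 0; i < collected.size(); i++) {
            ids.put(collected.get(i), i + 1);
        }

        // Pass 3: serialize the field, which looks up the id of the current array.
        return ids.containsKey(holder.casArray);
    }

    public static void main(String[] args) {
        System.out.println("naive save:   lookup succeeds = " + lookupSucceeds(false));
        System.out.println("guarded save: lookup succeeds = " + lookupSucceeds(true));
    }
}
```

In this sketch the naive variant loses the id because the identity map still holds the array created by the first save, while the guarded variant reuses the same array across both calls. Removing the second save call, as proposed above, would make the naive variant behave like the guarded one without requiring users to track changes themselves.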
> CAS transportable Java object not serialised or deserialised with compressed binary
> -----------------------------------------------------------------------------------
>
>                 Key: UIMA-6295
>                 URL: https://issues.apache.org/jira/browse/UIMA-6295
>             Project: UIMA
>          Issue Type: Bug
>          Components: uimaj
>    Affects Versions: 3.1.1SDK
>         Environment: [^cas-transported-java-objects.zip]
>            Reporter: Mario Juric
>            Priority: Major
>         Attachments: cas-transported-java-objects.zip
>
>
> I have been experimenting with wrapping a CAS transportable Java HashMap
> inside an UIMA type, and I found that the internal UIMA FSArray is either not
> stored or restored, although _save_to_cas_data and _init_from_cas_data of
> UimaSerializableFSs are called during serialisation and deserialisation of a
> compressed CAS binary. I have not yet been able to pinpoint where it goes
> wrong, serialisation or deserialisation, but I attached a simple Maven
> project with a test that reproduces the problem. Notice that the test that
> uses XMI succeeds, while the one that uses
> SerialFormat.COMPRESSED_FILTERED_TS fails.
> [^cas-transported-java-objects.zip]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)