[ https://issues.apache.org/jira/browse/UIMA-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240559#comment-17240559 ]
Mario Juric commented on UIMA-6295:
-----------------------------------
I found the reason for the problem, which happens during serialization:
_save_to_cas_data is called twice. The first call happens during
processIndexedFeatureStructures (l. 753, BinaryCasSerDes6 and l. 157, AllFSs),
where the feature structures are collected for serialization. The second call
happens right after, during initSrcTgtIdMapsAndStrings (l. 775,
BinaryCasSerDes6 and l. 3041, BinaryCasSerDes6), where the serialization
identifier mappings are constructed. After the second save call, and by the
time the array field is serialized, the collected feature structure is no
longer the same as the one inside the FeatureMap. The id lookup therefore
fails for the new FSArray (l. 835, BinaryCasSerDes6 and l. 986,
BinaryCasSerDes6), because only the id of the previous FSArray was stored in
the serialization identifier mappings.
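
To illustrate the mechanism, here is a minimal, self-contained sketch in plain
Java. It does not use the UIMA classes: MapHolder, saveToCasData and
DoubleSaveDemo are hypothetical stand-ins, and the three steps only mimic the
order of calls described above, assuming a save method that allocates a fresh
array on every call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a feature structure wrapping a Java map.
final class MapHolder {
    final Map<String, String> javaMap = new HashMap<>();
    String[] casArray; // stands in for the FSArray feature

    // Mimics a _save_to_cas_data that allocates a new array on every call.
    void saveToCasData() {
        casArray = javaMap.keySet().toArray(new String[0]);
    }
}

final class DoubleSaveDemo {
    // Returns true if the array referenced at write time has a serialization id.
    static boolean lookupSucceeds(boolean saveAgainBeforeIdMapping) {
        MapHolder holder = new MapHolder();
        holder.javaMap.put("key", "value");

        // Step 1: save, then collect the reachable array for serialization.
        holder.saveToCasData();
        List<Object> collected = new ArrayList<>();
        collected.add(holder.casArray);

        // Step 2: build the id mappings; optionally save a second time first.
        if (saveAgainBeforeIdMapping) {
            holder.saveToCasData(); // replaces the array that was collected
        }
        Map<Object, Integer> ids = new IdentityHashMap<>();
        for (Object fs : collected) {
            ids.put(fs, ids.size() + 1);
        }

        // Step 3: serialize the array field by looking up the id of the
        // array the holder references now.
        return ids.containsKey(holder.casArray);
    }

    public static void main(String[] args) {
        System.out.println("one save call:  " + lookupSucceeds(false)); // true
        System.out.println("two save calls: " + lookupSucceeds(true));  // false
    }
}
```

With a single save call, the array that was collected is the one referenced at
write time, so the lookup succeeds. With the second save call, the holder
points to an array that never received an id.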

It all works after I commented out the second save call (l. 3040-3042,
BinaryCasSerDes6), but I am not sure whether this is the correct solution,
since I don't know what motivated this second call, and there could be edge
cases that it is supposed to solve. However, all tests still succeed after the
change, and I could add simple unit tests that capture the current issue to
make sure the problem isn't just re-introduced if some edge case shows up in
the future. To be honest, I wouldn't know what else to do anyway without
knowing why this additional call was introduced. The alternative is to do
nothing and rely on a workaround in which users are required to check whether
there has been any change to the data structure before initializing a new
feature structure, but I would consider it flawed to rely on users to figure
this out themselves, and it is not efficient with more than one save call.
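
For reference, the user-side workaround could look roughly like the following
plain-Java sketch. It is not UIMA code: GuardedMapHolder, its dirty flag and
saveToCasData are hypothetical names, and the String array only stands in for
the FSArray. The idea is that the save method allocates a new array only when
the wrapped map has changed since the last save.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a user type that guards its save method.
final class GuardedMapHolder {
    private final Map<String, String> javaMap = new HashMap<>();
    private String[] casArray;     // stands in for the FSArray feature
    private boolean dirty = true;  // true while the map has unsaved changes

    void put(String key, String value) {
        javaMap.put(key, value);
        dirty = true;
    }

    String[] casArray() {
        return casArray;
    }

    // Mimics a _save_to_cas_data that keeps the existing array when
    // nothing has changed since the previous call.
    void saveToCasData() {
        if (!dirty) {
            return;
        }
        casArray = javaMap.keySet().toArray(new String[0]);
        dirty = false;
    }
}
```

Two consecutive save calls then leave the same array in place, so ids assigned
after the first call remain valid. Every user type would have to carry this
bookkeeping itself.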
> CAS transportable Java object not serialised or deserialised with compressed
> binary
> -----------------------------------------------------------------------------------
>
> Key: UIMA-6295
> URL: https://issues.apache.org/jira/browse/UIMA-6295
> Project: UIMA
> Issue Type: Bug
> Components: uimaj
> Affects Versions: 3.1.1SDK
> Environment: [^cas-transported-java-objects.zip]
> Reporter: Mario Juric
> Priority: Major
> Attachments: cas-transported-java-objects.zip
>
>
> I have been experimenting with wrapping a CAS transportable Java HashMap
> inside an UIMA type, and I found that the internal UIMA FSArray is either not
> stored or restored, although _save_to_cas_data and _init_from_cas_data of
> UimaSerializableFSs are called during serialisation and deserialisation of a
> compressed CAS binary. I have not yet been able to pinpoint where it goes
> wrong, serialisation or deserialisation, but I attached a simple Maven
> project with a test that reproduces the problem. Notice that the test that
> uses XMI succeeds, while the one that uses
> SerialFormat.COMPRESSED_FILTERED_TS fails.
> [^cas-transported-java-objects.zip]