[ 
https://issues.apache.org/jira/browse/UIMA-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240559#comment-17240559
 ] 

Mario Juric commented on UIMA-6295:
-----------------------------------

I found the reason for the problem: it happens during serialization, because 
_save_to_cas_data is called twice. The first call is made during 
processIndexedFeatureStructures (l. 753, BinaryCasSerDes6 and l. 157, AllFSs), 
where the feature structures are collected for serialization. The second call 
follows right after, during initSrcTgtIdMapsAndStrings (l. 775, BinaryCasSerDes6 
and l. 3041, BinaryCasSerDes6), where the serialization identifier mappings are 
constructed. After the second save call, the collected feature structure is no 
longer the same as the one inside the FeatureMap. When the array field is 
serialized, the id lookup therefore fails for the new FSArray (l. 835, 
BinaryCasSerDes6 and l. 986, BinaryCasSerDes6), because the serialization 
identifier mappings hold the id of the previous FSArray.
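
To make the failure mode concrete, here is a minimal plain-Java sketch. It 
does not use the UIMA API: ArraySketch, FeatureMapSketch, saveToCasData and 
lookupAtWriteTime are hypothetical stand-ins, and the id map is a plain 
HashMap rather than the real structures in BinaryCasSerDes6. It assumes only 
what is described above, namely that every save call allocates a fresh array 
with a new id.

```java
import java.util.HashMap;
import java.util.Map;

public class DoubleSaveSketch {

    /** Hypothetical stand-in for an FSArray: it only carries a unique id. */
    static final class ArraySketch {
        final int id;

        ArraySketch(int id) {
            this.id = id;
        }
    }

    /** Hypothetical stand-in for the feature structure that wraps a Java map. */
    static final class FeatureMapSketch {
        private int nextId = 1;
        ArraySketch array;

        /** Stand-in for _save_to_cas_data: allocates a new array (new id) on every call. */
        void saveToCasData() {
            array = new ArraySketch(nextId++);
        }
    }

    /**
     * Calls save the given number of times, but registers an id mapping only
     * for the array produced by the first call. Returns the mapped sequence
     * number for the array held at write time, or null if the lookup misses.
     */
    public static Integer lookupAtWriteTime(int saveCalls) {
        FeatureMapSketch fm = new FeatureMapSketch();
        Map<Integer, Integer> idToSeq = new HashMap<>();

        fm.saveToCasData();            // first call: array is collected
        idToSeq.put(fm.array.id, 0);   // mapping is built for the collected array

        for (int i = 1; i < saveCalls; i++) {
            fm.saveToCasData();        // later call: field now holds a different array
        }
        return idToSeq.get(fm.array.id);
    }

    public static void main(String[] args) {
        System.out.println("one save call:  " + lookupAtWriteTime(1));
        System.out.println("two save calls: " + lookupAtWriteTime(2));
    }
}
```

With one save call the lookup finds the registered entry. With two it 
returns null, which corresponds to the failed id lookup described above.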

It all works after I commented out the second save call (l. 3040-3042, 
BinaryCasSerDes6), but I am not sure whether this is the correct solution, 
since I don't know what motivated this second call, and there could be edge 
cases that it is supposed to solve. However, all tests still succeed after the 
change, and I could add simple unit tests that capture the current issue, so 
that the problem isn't simply reintroduced if such an edge case shows up in 
the future. To be honest, I wouldn't know what else to do anyway without 
knowing why this additional call was introduced. The alternative is to leave 
the code unchanged and rely on a workaround in which users must check whether 
the data structure has changed before initializing a new feature structure. 
I would consider it flawed to rely on users to figure this out themselves, 
and it remains inefficient to make more than one save call.
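
For reference, the user-side workaround could look roughly like the 
following plain-Java sketch. It is an illustration under assumptions, not 
UIMA code: GuardedFeatureMap, ArraySketch, the dirty flag and the helper 
methods are all hypothetical names. The idea is that the save method only 
allocates a new array when the wrapped map has changed since the last save, 
so repeated save calls keep the same array id.

```java
import java.util.HashMap;
import java.util.Map;

public class GuardedSaveSketch {

    /** Hypothetical stand-in for an FSArray: a unique id plus a length. */
    static final class ArraySketch {
        final int id;
        final int length;

        ArraySketch(int id, int length) {
            this.id = id;
            this.length = length;
        }
    }

    /** Hypothetical wrapper that rebuilds its array only when the map has changed. */
    static final class GuardedFeatureMap {
        private final Map<String, String> map = new HashMap<>();
        private boolean dirty = true;
        private int nextId = 1;
        private ArraySketch array;

        void put(String key, String value) {
            map.put(key, value);
            dirty = true; // remember that the array is out of date
        }

        /** Stand-in for _save_to_cas_data with a change check in front. */
        void saveToCasData() {
            if (!dirty && array != null) {
                return; // nothing changed: keep the existing array and its id
            }
            array = new ArraySketch(nextId++, map.size() * 2);
            dirty = false;
        }

        int arrayId() {
            return array.id;
        }
    }

    /** True if two consecutive saves without changes keep the same array id. */
    public static boolean idStableAcrossSaves() {
        GuardedFeatureMap fm = new GuardedFeatureMap();
        fm.put("a", "b");
        fm.saveToCasData();
        int first = fm.arrayId();
        fm.saveToCasData();
        return first == fm.arrayId();
    }

    /** True if a modification between saves leads to a new array id. */
    public static boolean idChangesAfterModification() {
        GuardedFeatureMap fm = new GuardedFeatureMap();
        fm.put("a", "b");
        fm.saveToCasData();
        int first = fm.arrayId();
        fm.put("c", "d");
        fm.saveToCasData();
        return first != fm.arrayId();
    }

    public static void main(String[] args) {
        System.out.println("stable without changes: " + idStableAcrossSaves());
        System.out.println("new id after a change:  " + idChangesAfterModification());
    }
}
```

This only hides the double call from the id lookup. Every implementer of a 
custom type would have to remember the guard, which is the drawback noted 
above.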

> CAS transportable Java object not serialised or deserialised with compressed 
> binary
> -----------------------------------------------------------------------------------
>
>                 Key: UIMA-6295
>                 URL: https://issues.apache.org/jira/browse/UIMA-6295
>             Project: UIMA
>          Issue Type: Bug
>          Components: uimaj
>    Affects Versions: 3.1.1SDK
>         Environment: [^cas-transported-java-objects.zip]
>            Reporter: Mario Juric
>            Priority: Major
>         Attachments: cas-transported-java-objects.zip
>
>
> I have been experimenting with wrapping a CAS transportable Java HashMap 
> inside a UIMA type, and I found that the internal UIMA FSArray is either not 
> stored or restored, although _save_to_cas_data and _init_from_cas_data of 
> UimaSerializableFSs are called during serialisation and deserialisation of a 
> compressed CAS binary. I have not yet been able to pinpoint where it goes 
> wrong, serialisation or deserialisation, but I attached a simple Maven 
> project with a test that reproduces the problem. Notice that the test that 
> uses XMI succeeds, while the one that uses 
> SerialFormat.COMPRESSED_FILTERED_TS fails.
> [^cas-transported-java-objects.zip]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
