[ https://issues.apache.org/jira/browse/UIMA-6162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000929#comment-17000929 ]
Richard Eckart de Castilho commented on UIMA-6162:
--------------------------------------------------

I have set up a unit test on a PR branching off master before your fix:

https://github.com/apache/uima-uimaj/pull/16

The test in this PR fails because it builds on a version which doesn't
include your fix yet. When merging master into it, it should work.

> Concurrent binary serialization produces corrupt output
> -------------------------------------------------------
>
>                 Key: UIMA-6162
>                 URL: https://issues.apache.org/jira/browse/UIMA-6162
>             Project: UIMA
>          Issue Type: Bug
>          Components: UIMA
>    Affects Versions: 3.1.1SDK
>            Reporter: Richard Eckart de Castilho
>            Priority: Major
>         Attachments: admin.ser
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I suspect there could be an issue in `BinaryCasSerDes`.
> When deserializing the attached file `admin.ser`, I get this stack trace:
> {code:java}
> Caused by: java.lang.ClassCastException: class org.apache.uima.jcas.tcas.Annotation cannot be cast to class org.apache.uima.jcas.cas.Sofa (org.apache.uima.jcas.tcas.Annotation and org.apache.uima.jcas.cas.Sofa are in unnamed module of loader org.apache.catalina.loader.ParallelWebappClassLoader @4593ff34)
>     at org.apache.uima.cas.impl.BinaryCasSerDes.makeSofaFromHeap(BinaryCasSerDes.java:1823) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.BinaryCasSerDes.getSofaFromAnnotBase(BinaryCasSerDes.java:1817) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.BinaryCasSerDes.createFSsFromHeaps(BinaryCasSerDes.java:1701) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.BinaryCasSerDes.reinit(BinaryCasSerDes.java:259) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.BinaryCasSerDes.reinit(BinaryCasSerDes.java:328) ~[uimaj-core-3.1.1.jar:3.1.1]
>     at org.apache.uima.cas.impl.Serialization.deserializeCASComplete(Serialization.java:129) ~[uimaj-core-3.1.1.jar:3.1.1]
> {code}
> The code used to read the file before deserializing is as
> follows:
> {code:java}
> public static void readSerializedCas(CAS aCas, File aFile)
>     throws IOException
> {
>     try (ObjectInputStream is = new ObjectInputStream(new FileInputStream(aFile))) {
>         CASCompleteSerializer serializer = (CASCompleteSerializer) is.readObject();
>         deserializeCASComplete(serializer, (CASImpl) aCas);
>     }
>     catch (ClassNotFoundException e) {
>         throw new IOException(e);
>     }
> }
> {code}
> I set a breakpoint at BinaryCasSerDes:1608, which is a for loop iterating
> over the heap. Apparently, the first feature structure that is encountered
> is an annotation type which is NOT the SOFA. Then in line 1700, the
> deserializer tries to resolve the SOFA for this annotation but fails because
> it has not yet been deserialized. Eventually makeSofaFromHeap is called and
> checks if a SOFA needs to be created. It tries to look up the SOFA's ID (1)
> via csds.addr2fs.get(sofaAddr) (BinaryCasSerDes:1821) and generates a new
> SOFA.
> However, when the SECOND annotation is read, csds.addr2fs.get(sofaAddr)
> (BinaryCasSerDes:1821) is called again to resolve the SOFA from addr 1, and
> it gets the previously deserialized annotation instead of the SOFA that had
> been created.
> The SOFA that has been implicitly created is added to the csds.addr2fs map
> at key 1... however, later in BinaryCasSerDes:1723, the key 1 is overwritten
> by the deserialized annotation:
> {code}
> if (!isSofa) { // if it was a sofa, other code added or pended it
>   csds.addFS(fs, heapIndex); // this overwrites the SOFA that was created at key 1 because heapIndex is also 1
> }
> {code}
> The heap looks something like this:
> {code}
> [0, 187, 1, 33, 46, 199, 200, 201, 44, 202, 187, 1, 33, 46, 203, 204, 205,
> 45, 206, 187, 1, 33, 46, 207, 208, 209, 46, 210, 187, 1, 33, 46, 211, 212,
> 213, 47, 214, 187, 1, 33, 46, 215, 216, 217, 48, 1, 187, 1,...
> {code}
> I guess that 187 is the type code of the first annotation and we can see it
> repeats a couple of times.
> The 1 seems to be the SOFA ID, the first feature of the feature structures.
> However, instead of 1 referring to the address of the SOFA, it points at the
> first annotation, which is NOT a SOFA.
> Could this be a bug in the serialization code assuming that the SOFA is
> always in the first position?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
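
The overwrite described in the quoted report can be reproduced in a toy model. The following is only a sketch in plain Java, not the UIMA code: the class ToyFs, the map field addr2fs, and the methods resolveSofa and readAnnotation are hypothetical stand-ins for csds.addr2fs, makeSofaFromHeap, and the loop body around csds.addFS(fs, heapIndex). It assumes, as the heap dump suggests, that the first annotation sits at heap index 1 and the second at heap index 10, both carrying SOFA reference 1.

```java
import java.util.HashMap;
import java.util.Map;

class Addr2FsOverwriteSketch {
    /** Toy stand-in for a feature structure; not a UIMA class. */
    static class ToyFs {
        final String kind;
        ToyFs(String kind) { this.kind = kind; }
    }

    /** Hypothetical stand-in for csds.addr2fs: heap address -> feature structure. */
    final Map<Integer, ToyFs> addr2fs = new HashMap<>();

    /** Looks up the SOFA at sofaAddr, creating it implicitly if nothing is registered there. */
    ToyFs resolveSofa(int sofaAddr) {
        ToyFs fs = addr2fs.get(sofaAddr);
        if (fs == null) {
            fs = new ToyFs("Sofa");
            addr2fs.put(sofaAddr, fs);
        }
        if (!fs.kind.equals("Sofa")) {
            // Mirrors the ClassCastException from the reported stack trace.
            throw new ClassCastException(fs.kind + " cannot be cast to Sofa");
        }
        return fs;
    }

    /** Reads one annotation: resolve its SOFA first, then register the annotation itself. */
    void readAnnotation(int heapIndex, int sofaAddr) {
        resolveSofa(sofaAddr);
        // Unconditional registration: replaces the implicit SOFA when heapIndex == sofaAddr.
        addr2fs.put(heapIndex, new ToyFs("Annotation"));
    }

    public static void main(String[] args) {
        Addr2FsOverwriteSketch sketch = new Addr2FsOverwriteSketch();
        sketch.readAnnotation(1, 1); // first annotation: heap index 1, SOFA reference 1
        try {
            sketch.readAnnotation(10, 1); // second annotation: SOFA reference 1 again
            System.out.println("no failure");
        } catch (ClassCastException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model the second call fails because key 1 now holds the first annotation rather than the implicitly created SOFA. If the annotations start at an address different from the SOFA reference, the same sequence runs without error.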