[ https://issues.apache.org/jira/browse/GORA-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated GORA-392:
--------------------------------------
    Fix Version/s: 0.7

> Move PersistentSerialization to the top of serializations list
> --------------------------------------------------------------
>
>                 Key: GORA-392
>                 URL: https://issues.apache.org/jira/browse/GORA-392
>             Project: Apache Gora
>          Issue Type: Improvement
>          Components: gora-core
>    Affects Versions: 0.5
>            Reporter: Sergey Weiss
>             Fix For: 0.7
>
>
> While getting Nutch2 to run on Hadoop 2.3.0 + HBase 0.98.1, we encountered
> java.io.EOFExceptions like the ones described in this mail thread:
> http://www.mail-archive.com/user%40nutch.apache.org/msg12644.html
> We applied the patch mentioned there and got our setup running, but it was
> very unstable: it would fail with an ArrayIndexOutOfBoundsException whenever
> we tried to generate a batch of some 50 or more pages to fetch.
> We investigated the problem and discovered that in a working setup of
> Nutch2 + Hadoop 1.2.0 + HBase 0.94.14, PersistentDeserializer is used for
> deserialization during the reduce phase, and not
> AvroSerialization.AvroDeserializer. The reason for this sudden swap of
> deserializers lies in the GoraMapReduceUtils#setIOSerializations method. It
> uses StringUtils.joinStringArrays, and that method uses a HashSet under the
> hood, so the order of the resulting list is not guaranteed. Two more
> serializations were added to the io.serializations property in Hadoop 2.3.0
> compared to Hadoop 1.2.0, and this results in AvroSpecificSerialization
> being placed at the top of the serializations list.
> After we patched GoraMapReduceUtils#setIOSerializations to explicitly put
> PersistentSerialization at the top of the list, the instability went away.
> Moreover, we no longer need to patch Avro at all: one simple change in Gora
> and everything works like a charm!
> So we propose to move PersistentSerialization to the top of the
> serializations list.
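
The ordering fix proposed in the issue can be sketched roughly as follows. This is not the actual Gora patch: the class SerializationOrder and the method withPreferredFirst are hypothetical names used only for illustration, and the sketch works on plain class-name strings rather than on a Hadoop Configuration. It merges the existing io.serializations entries through a LinkedHashSet, which removes duplicates like a HashSet does but keeps insertion order, so the preferred entry stays first.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Gora or Hadoop.
final class SerializationOrder {
    private SerializationOrder() {}

    /**
     * Returns the preferred class name first, followed by the existing
     * entries in their original order, with duplicates removed.
     */
    static String[] withPreferredFirst(String preferred, String... existing) {
        // LinkedHashSet iterates in insertion order; re-adding an element
        // that is already present does not change its position.
        Set<String> ordered = new LinkedHashSet<>();
        ordered.add(preferred);
        if (existing != null) {
            for (String name : existing) {
                if (name != null && !name.isEmpty()) {
                    ordered.add(name);
                }
            }
        }
        return ordered.toArray(new String[0]);
    }
}
```

In a real job the result would presumably be written back to the io.serializations property, with "org.apache.gora.mapreduce.PersistentSerialization" as the preferred entry and the current property values as the existing ones.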