Hi Mark,
Oh! I forgot to mention that I've checked the Cassandra and Pig classpaths; even more drastic, I removed the pig-0.10 jar from Cassandra's build/lib path, but I still get the same problem.
I checked the Cassandra and Pig classpaths by adding an extra echo line just before the java invocation in their launch scripts. Could you confirm whether that is enough, or do I also have to add the same echo line to hadoop-env.sh?
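In case it helps to be concrete, the extra line is just something like this (the exact variable name depends on what each launch script builds right before it calls java, so treat it as an example):

    # added just before the final java command in bin/pig and bin/cassandra
    echo "effective classpath: $CLASSPATH" >&2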
Thank you,
Claudio
On 08/11/13 19:39, Mark Wagner wrote:
Hey Claudio,
I saw this internally, and another person on the list had the same
issue. Every time I've seen it, the cause was a Pig version <= 0.10
that had sneaked onto the classpath. Removing it will solve the issue.
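If you want to double-check what Hadoop actually sees, something like this should list any Pig jar on its effective classpath (assuming bin/hadoop from your 1.2.1 install is on your PATH; the grep is just an example):

    # print each classpath entry on its own line and look for stray Pig jars
    hadoop classpath | tr ':' '\n' | grep -i pig
    # if an entry is a wildcard like .../lib/*, list that directory as well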
Thanks,
Mark
On Tue, Nov 5, 2013 at 1:02 PM, Claudio Romo Otto
<[email protected]> wrote:
Hi guys,
For some reason I cannot set up any Pig version higher than 0.10 with Hadoop
1.2.1 and Cassandra 1.2.10. For example, with Pig 0.12, when I try a very
simple dump I get this error in the JobTracker log:
2013-11-05 17:44:12,000 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201311051740_0002_m_000002_0: java.io.IOException: Deserialization error: invalid stream header: 2DD01810
        at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:55)
        at org.apache.pig.impl.util.UDFContext.deserialize(UDFContext.java:192)
        at org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil.setupUDFContext(MapRedUtil.java:159)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:229)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:275)
        at org.apache.hadoop.mapred.Task.initialize(Task.java:515)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.StreamCorruptedException: invalid stream header: 2DD01810
        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:802)
        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
        at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:52)
        ... 11 more
When I change back to Pig 0.10 everything works fine.
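For reference, the test is essentially just a load through CassandraStorage followed by a dump, along these lines (keyspace and column family names are placeholders; connection settings come from the usual PIG_INITIAL_ADDRESS / PIG_RPC_PORT / PIG_PARTITIONER environment variables as in Cassandra's Pig examples):

    # minimal repro: load one column family via CassandraStorage and dump it
    pig -e "rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily' USING org.apache.cassandra.hadoop.pig.CassandraStorage(); DUMP rows;"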
Now for the record, things I've tried:
- Compiling Pig 0.12 / 0.11
- Compiling with ant clean jar-withouthadoop -Dhadoopversion=20
- Compiling with ant clean jar-withouthadoop -Dhadoopversion=23 (mega fail, as expected with Hadoop 1.2)
- Compiling Hadoop to get 1.2.2 instead of the default 1.2.1
- Compiling Cassandra 1.2.10 (including Pig 0.10 in the examples dir works fine too)
I want to leverage Pig 0.12, but this problem is driving me nuts. Can someone
tell me what I'm doing wrong?