ERROR 2135: Received error from store function.Premature EOF: no length prefix available

2015-06-09 Thread pth001

Hi,

My Pig job on Tez (which stores a dataset into a partitioned Hive table) 
throws the following exception. What could be wrong, and how can I fix it?


2015-06-09 10:59:57,268 ERROR [TezChild] runtime.PigProcessor: Encountered exception while processing:
org.apache.pig.backend.executionengine.ExecException: ERROR 2135: Received error from store function.Premature EOF: no length prefix available
    at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:141)
    at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:316)
    at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:195)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException: Premature EOF: no length prefix available
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2208)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1440)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1362)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:589)


BR,
Patcharee


filter by query result

2015-05-27 Thread pth001

Hi,

I am new to Pig. First I queried a Hive table (x = LOAD 'x' USING 
org.apache.hive.hcatalog.pig.HCatLoader();) and got a single 
record/value. How can I use this single value as a filter in another 
query? I hope to get better performance by filtering as early as possible.


BR,
Patcharee


create a pipeline

2015-04-15 Thread pth001

Hi,

How can I create a pipeline (that is, run a sequence of Pig scripts)?

BR,
Patcharee