Re: InputSizeReducerEstimator cannot get PhysicalOperators, so my Pig job always makes one reducer in Hadoop 2.2.0

2014-02-06 Thread 최종원
Thank you for your answer. But where can I find the Pig source for the pig-0.12.0-h2 version? I think there must be a difference between pig-0.12.0 and pig-0.12.0-h2, but I cannot find the source for version 0.12.0-h2. When I extract the jar file, there are additional packages (like org.apache.pig.backend.had

Re: InputSizeReducerEstimator cannot get PhysicalOperators, so my Pig job always makes one reducer in Hadoop 2.2.0

2014-02-06 Thread Cheolsoo Park
Hi, sounds like you're bitten by PIG-3512 - https://issues.apache.org/jira/browse/PIG-3512. Can you try to apply the patch and rebuild the jar? Thanks, Cheolsoo On Thu, Feb 6, 2014 at 7:27 PM, 최종원 wrote: > This is the log ... > 2014-02-06 17:29:19,087 [Thread-42] INFO > org.apache.pig
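Until the patched jar is in place, one possible interim workaround (sketched here with made-up alias names, not taken from the thread) is to bypass the estimator by declaring parallelism explicitly, since an explicit PARALLEL clause or default_parallel overrides the estimate:

    -- force a fixed reducer count instead of relying on the (broken) estimate
    SET default_parallel 10;
    -- or per operator:
    grouped = GROUP logs BY user PARALLEL 10;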

Re: java.lang.OutOfMemoryError: Java heap space

2014-02-06 Thread Cheolsoo Park
Looks like you're running out of space in MapOutputBuffer. Two suggestions: 1) You said that io.sort.mb is already set to 768 MB, but did you try to lower io.sort.spill.percent in order to spill earlier and more often? Page 12 - http://www.slideshare.net/Hadoop_Summit/optimizing-mapreduce-job-perf
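For reference, a sketch of how those settings could be passed per job from the Pig script itself (the values are illustrative only, not recommendations):

    -- spill earlier and more often so MapOutputBuffer fills up less
    set io.sort.mb '768';
    set io.sort.spill.percent '0.60';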

Invalid "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "hivecluster2:8021"

2014-02-06 Thread Russell Jurney
I have the following error, which makes no sense, because my configuration is correct... hivecluster2:8021 is the jobtracker. Any idea what I'm supposed to do? grunt> bluecoat = LOAD '/securityx/test' USING AvroStorage(); grunt> dump bluecoat 2014-02-06 17:21:49,256 [main] INFO org.apache.pig.t
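A guess at the usual cause, not confirmed in this thread: Hadoop falls back to LocalJobRunner when the job does not pick up the cluster's MapReduce framework settings, and the local runner then rejects the real jobtracker address. One might try overriding the properties from the grunt session, roughly like this (the values are assumed from the error message):

    -- keep the job off the local runner and point it at the jobtracker
    set mapreduce.framework.name 'classic';
    set mapreduce.jobtracker.address 'hivecluster2:8021';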

Re: InputSizeReducerEstimator cannot get PhysicalOperators, so my Pig job always makes one reducer in Hadoop 2.2.0

2014-02-06 Thread 최종원
This is the log ... 2014-02-06 17:29:19,087 [Thread-42] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers. 2014-02-06 17:29:19,087 [Thread-42] INFO org.apache.pig.backend.hadoop.executionengine.mapReduc

Re: java.lang.OutOfMemoryError: Java heap space

2014-02-06 Thread praveenesh kumar
It's a normal join. I can't use a replicated join, as the data is very large. Regards Prav On Thu, Feb 6, 2014 at 7:52 PM, abhishek wrote: > Hi Praveenesh, > > Did you use "replicated join" in your pig script or is it a regular join? > > Regards > Abhishek > > Sent from my iPhone > > > On Feb 6
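For readers following the thread, the difference in question, sketched with hypothetical aliases: a replicated (fragment-replicate) join loads the second relation into memory on every map task, so it only works when that relation is small.

    -- regular reduce-side join: both relations can be large
    joined = JOIN big BY key, other BY key;
    -- replicated map-side join: 'other' must fit in memory
    joined_rep = JOIN big BY key, other BY key USING 'replicated';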

Re: java.lang.OutOfMemoryError: Java heap space

2014-02-06 Thread abhishek
Hi Praveenesh, Did you use "replicated join" in your Pig script or is it a regular join? Regards Abhishek Sent from my iPhone > On Feb 6, 2014, at 11:25 AM, praveenesh kumar wrote: > > Hi all, > > I am running a Pig script which is running fine for small data. But when I > scale the data,

java.lang.OutOfMemoryError: Java heap space

2014-02-06 Thread praveenesh kumar
Hi all, I am running a Pig script which runs fine for small data. But when I scale the data, I get the following error at the map stage. Please refer to the map logs below. My Pig script does a group by first, followed by a join on the grouped data. Any clues to understand wh
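The script itself was not posted; as a rough, hypothetical reconstruction of the shape being described (group by first, then a join on the grouped data), with made-up alias and field names:

    -- group by first ...
    grouped = GROUP events BY user_id;
    agg = FOREACH grouped GENERATE group AS user_id, COUNT(events) AS cnt;
    -- ... followed by a join on the grouped data
    joined = JOIN agg BY user_id, profiles BY user_id;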

InputSizeReducerEstimator cannot get PhysicalOperators, so my Pig job always makes one reducer in Hadoop 2.2.0

2014-02-06 Thread 최종원
Hello. My Pig job always makes one reduce task in version 0.12.0-h2, ... because the InputSizeReducerEstimator class always returns -1 for the input file size. I'm not sure of the reason, but the PlanHelper.getPhysicalOperators method always returns a 0-size list. public int estimateNumberOfReducers(Job jo
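Roughly speaking, the estimator works out the reducer count as min(pig.exec.reducers.max, ceil(total input bytes / pig.exec.reducers.bytes.per.reducer)), and when the input size comes back as -1 it falls back to a single reducer, which matches the symptom here. The knobs involved, with what are believed to be the 0.12 defaults:

    -- roughly 1 GB per reducer, capped at 999 reducers, by default
    set pig.exec.reducers.bytes.per.reducer '1000000000';
    set pig.exec.reducers.max '999';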