Hi,

Sounds like you're bitten by PIG-3512:
https://issues.apache.org/jira/browse/PIG-3512

Can you try to apply the patch and rebuild the jar?
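Roughly, the workflow looks like the following. The patch filename and ant flags below are assumptions on my part -- check the actual attachment name on the JIRA and the build docs in your checkout:

```shell
# In a Pig source checkout you would run, roughly (hypothetical filenames):
#   patch -p0 < PIG-3512.patch
#   ant clean jar -Dhadoopversion=23    # build against Hadoop 2.x
# Self-contained illustration of applying a unified diff with patch -p0:
mkdir -p demo
printf 'old line\n' > demo/File.java
printf -- '--- demo/File.java\n+++ demo/File.java\n@@ -1 +1 @@\n-old line\n+new line\n' > demo/fix.patch
patch -p0 < demo/fix.patch
cat demo/File.java
```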

Thanks,
Cheolsoo



On Thu, Feb 6, 2014 at 7:27 PM, 최종원 <jongwons.c...@gmail.com> wrote:

> This is the log ...
>
> 2014-02-06 17:29:19,087 [Thread-42] INFO
>
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Reduce phase detected, estimating # of required reducers.
> 2014-02-06 17:29:19,087 [Thread-42] INFO
>
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Using reducer estimator:
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
> 2014-02-06 17:29:19,087 [Thread-42] INFO
>
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
> - BytesPerReducer=100000000 maxReducers=999 totalInputFileSize=-1
> 2014-02-06 17:29:19,087 [Thread-42] INFO
>
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Could not estimate number of reducers and no requested or default
> parallelism set. Defaulting to 1 reducer.
> 2014-02-06 17:29:19,087 [Thread-42] INFO
>
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Setting Parallelism to 1
> 2014-02-06 17:29:19,104 [Thread-42] INFO
>
>  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 1 map-reduce job(s) waiting for submission.
>
> InputSizeReducerEstimator cannot calculate the input file size, so it
> doesn't estimate the number of reducers.
> But I think I gave the right Hadoop file path.
> I tried many possible paths, such as:
>
>   relative-path/to/file
>   /user/myuser/absolute-path/to/file
>   hdfs://host:8020/user/myuser/absolute-path/to/file
>   hdfs://host:9000/user/myuser/absolute-path/to/file/change-the-hdfs-port
>
> etc...
>
> but Pig still failed to estimate the number of reducers.
>
> I am almost defeated by this problem.
>
>
>
> 2014-02-06 21:31 GMT+09:00 최종원 <jongwons.c...@gmail.com>:
>
> > Hello.
> >
> > My Pig job always gets one reduce task in version 0.12.0-h2, because the
> >
> > InputSizeReducerEstimator class always returns -1 for the input file size.
> >
> > I'm not sure of the reason, but the PlanHelper.getPhysicalOperators
> > method always returns an empty list.
> >
> >
> >   public int estimateNumberOfReducers(Job job, MapReduceOper mapReduceOper)
> >           throws IOException {
> >       Configuration conf = job.getConfiguration();
> >       long bytesPerReducer = conf.getLong(BYTES_PER_REDUCER_PARAM,
> >               DEFAULT_BYTES_PER_REDUCER);
> >       int maxReducers = conf.getInt(MAX_REDUCER_COUNT_PARAM,
> >               DEFAULT_MAX_REDUCER_COUNT_PARAM);
> >       List<POLoad> poLoads = PlanHelper.getPhysicalOperators(
> >               mapReduceOper.mapPlan, POLoad.class);
> >       long totalInputFileSize = getTotalInputFileSize(conf, poLoads, job);
> >       log.info("BytesPerReducer=" + bytesPerReducer + " maxReducers="
> >               + maxReducers + " totalInputFileSize=" + totalInputFileSize);
> >       // if totalInputFileSize == -1, we couldn't get the input size
> >       // so we can't estimate.
> >       if (totalInputFileSize == -1) { return -1; }
> >       int reducers = (int) Math.ceil((double) totalInputFileSize / bytesPerReducer);
> >       reducers = Math.max(1, reducers);
> >       reducers = Math.min(maxReducers, reducers);
> >       return reducers;
> >   }
> >
> >
> >
> > and the Pig job finishes successfully.
> >
> > But since the plan has only one reduce task, it takes a very long time.
> >
> >
> > I tried this with Apache Hadoop 2.2.0 and Pig 0.12.0 (h2),
> >
> > and also with another version installed via Ambari 1.4.3.
> >
> > The result is always the same.
> >
> >
> > What is wrong?
> >
>
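For what it's worth, the clamp-and-ceil arithmetic at the end of the quoted estimateNumberOfReducers is easy to sanity-check in isolation. A minimal standalone sketch -- the ReducerEstimate class and its method signature are hypothetical, only the formula comes from the quoted Pig code:

```java
// Standalone sketch of the estimator's arithmetic (class name is hypothetical).
public class ReducerEstimate {
    /**
     * One reducer per bytesPerReducer of input, clamped to [1, maxReducers].
     * A size of -1 means the input size could not be determined, so no estimate.
     */
    static int estimate(long totalInputFileSize, long bytesPerReducer, int maxReducers) {
        if (totalInputFileSize == -1) {
            return -1; // unknown input size -> Pig falls back to 1 reducer
        }
        int reducers = (int) Math.ceil((double) totalInputFileSize / bytesPerReducer);
        reducers = Math.max(1, reducers);
        return Math.min(maxReducers, reducers);
    }

    public static void main(String[] args) {
        // 1 GB of input at the default 100 MB per reducer
        System.out.println(estimate(1_000_000_000L, 100_000_000L, 999)); // 10
        // the failing case from the log: size unknown
        System.out.println(estimate(-1L, 100_000_000L, 999)); // -1
    }
}
```

So the "Defaulting to 1 reducer" message in the log is not an estimate of 1; it is the fallback taken when the input size comes back as -1, which is exactly what PIG-3512 addresses. Until the patched jar is in place, requesting parallelism explicitly (PARALLEL on the operator, or default_parallel) bypasses the estimator, as the "no requested or default parallelism set" wording in the log suggests.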
