Hi,

Sounds like you're bitten by PIG-3512: https://issues.apache.org/jira/browse/PIG-3512
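In case it helps, applying a JIRA patch to a source checkout and rebuilding usually looks something like the sketch below. The directory and patch filename are assumptions — use the actual attachment from the PIG-3512 JIRA page.

```shell
# Rough sketch, assuming a Pig 0.12.0 source checkout and that the
# patch attached to PIG-3512 has been downloaded to the current directory.
cd pig-0.12.0-src

# Apply the patch (filename is an assumption; use the JIRA attachment's name).
patch -p0 < PIG-3512.patch

# Rebuild the Pig jar against Hadoop 2.x.
ant clean jar -Dhadoopversion=23
```

Then replace the pig jar on your client with the rebuilt one.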
Can you try to apply the patch and rebuild the jar?

Thanks,
Cheolsoo

On Thu, Feb 6, 2014 at 7:27 PM, 최종원 <jongwons.c...@gmail.com> wrote:
> This is the log ...
>
> 2014-02-06 17:29:19,087 [Thread-42] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
> 2014-02-06 17:29:19,087 [Thread-42] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
> 2014-02-06 17:29:19,087 [Thread-42] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=100000000 maxReducers=999 totalInputFileSize=-1
> 2014-02-06 17:29:19,087 [Thread-42] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Could not estimate number of reducers and no requested or default parallelism set. Defaulting to 1 reducer.
> 2014-02-06 17:29:19,087 [Thread-42] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
> 2014-02-06 17:29:19,104 [Thread-42] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
>
> InputSizeReducerEstimator cannot calculate the input file size, so it doesn't estimate the reducer count.
> But I think I gave the right Hadoop file path. I tried many possible paths, like:
>
> relative-path/to/file
> /user/myuser/absolute-path/to/file
> hdfs://host:8020/user/myuser/absolute-path/to/file
> hdfs://host:9000/user/myuser/absolute-path/to/file/change-the-hdfs-port
>
> etc., but Pig still failed to estimate the reducer count.
>
> I am almost defeated by this problem.
>
>
> 2014-02-06 21:31 GMT+09:00 최종원 <jongwons.c...@gmail.com>:
>
>> Hello.
>> My Pig job always creates just one reduce task in version 0.12.0-h2, because
>> the InputSizeReducerEstimator class always returns -1 for the input file size.
>> I'm not sure of the reason, but actually, the PlanHelper.getPhysicalOperators
>> method always returns a 0-size list.
>>
>> public int estimateNumberOfReducers(Job job, MapReduceOper mapReduceOper) throws IOException {
>>     Configuration conf = job.getConfiguration();
>>     long bytesPerReducer = conf.getLong(BYTES_PER_REDUCER_PARAM, DEFAULT_BYTES_PER_REDUCER);
>>     int maxReducers = conf.getInt(MAX_REDUCER_COUNT_PARAM, DEFAULT_MAX_REDUCER_COUNT_PARAM);
>>
>>     List<POLoad> poLoads = PlanHelper.getPhysicalOperators(mapReduceOper.mapPlan, POLoad.class);
>>     long totalInputFileSize = getTotalInputFileSize(conf, poLoads, job);
>>
>>     log.info("BytesPerReducer=" + bytesPerReducer + " maxReducers="
>>             + maxReducers + " totalInputFileSize=" + totalInputFileSize);
>>
>>     // if totalInputFileSize == -1, we couldn't get the input size so we can't estimate.
>>     if (totalInputFileSize == -1) { return -1; }
>>
>>     int reducers = (int) Math.ceil((double) totalInputFileSize / bytesPerReducer);
>>     reducers = Math.max(1, reducers);
>>     reducers = Math.min(maxReducers, reducers);
>>     return reducers;
>> }
>>
>> The Pig job ends successfully, but with the reduce phase planned as a single
>> task it takes a very long time.
>>
>> I tried this on Apache Hadoop 2.2.0 with Pig 0.12.0 (h2),
>> and also on another version installed via Ambari 1.4.3.
>> The result is always the same.
>>
>> What went wrong?
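To make the arithmetic in the quoted method concrete, here is a small standalone sketch of the same estimation formula. The class and constants are assumptions for illustration, not the real Pig classes; the byte-per-reducer value mirrors the `BytesPerReducer=100000000` figure from the log above. It shows how a `totalInputFileSize` of -1 short-circuits the estimate, which `JobControlCompiler` then turns into the "Defaulting to 1 reducer" behavior seen in the log.

```java
// Standalone sketch of the estimation formula in
// InputSizeReducerEstimator.estimateNumberOfReducers (names are illustrative).
public class ReducerEstimate {
    // Mirrors the values printed in the quoted log line.
    static final long BYTES_PER_REDUCER = 100_000_000L; // 100 MB per reducer
    static final int MAX_REDUCERS = 999;

    // Returns -1 when the input size is unknown; the caller then
    // falls back to 1 reducer (or default_parallel, if set).
    static int estimate(long totalInputFileSize) {
        if (totalInputFileSize == -1) {
            return -1; // couldn't get input size, so we can't estimate
        }
        int reducers = (int) Math.ceil((double) totalInputFileSize / BYTES_PER_REDUCER);
        reducers = Math.max(1, reducers);   // at least one reducer
        return Math.min(MAX_REDUCERS, reducers); // capped at the max
    }

    public static void main(String[] args) {
        System.out.println(estimate(-1L));          // unknown size: -1
        System.out.println(estimate(50_000_000L));  // under one chunk: 1
        System.out.println(estimate(250_000_000L)); // 2.5 chunks, rounded up: 3
    }
}
```

So the symptom in the thread (always one reducer) is exactly what happens whenever `getTotalInputFileSize` returns -1, regardless of the actual data size — which is why the fix in PIG-3512 matters more than the path format.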