date:20140905

[jira] [Resolved] (MAPREDUCE-6067) native-task: fix some counter issues

2014-09-05 Thread Sean Zhong (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Zhong resolved MAPREDUCE-6067.
---
Resolution: Fixed

mark as resolved.

 native-task: fix some counter issues
 

 Key: MAPREDUCE-6067
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6067
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
  Components: task
Reporter: Todd Lipcon
Assignee: Binglin Chang
 Attachments: MAPREDUCE-6067.v1.patch, MAPREDUCE-6067.v2.patch, 
 MAPREDUCE-6067.v3.patch, MAPREDUCE-6067.v4.patch, MAPREDUCE-6067.v5.patch, 
 native-counters.html, trunk-counters.html


 After running a terasort, I see the spilled records counter at 5028651606, 
 which is about half what I expected to see. Using the non-native collector I 
 see the expected count of 100. It seems the correct number of records 
 were indeed spilled, because the job's output record count is correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Hadoop-Mapreduce-trunk - Build # 1887 - Still Failing

2014-09-05 Thread Apache Jenkins Server

See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1887/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 27690 lines...]
[INFO] hadoop-mapreduce-client-shuffle ... SKIPPED
[INFO] hadoop-mapreduce-client-app ... SKIPPED
[INFO] hadoop-mapreduce-client-hs  SKIPPED
[INFO] hadoop-mapreduce-client-jobclient . SKIPPED
[INFO] hadoop-mapreduce-client-hs-plugins  SKIPPED
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] hadoop-mapreduce .. SKIPPED
[INFO] Apache Hadoop MapReduce Streaming . SKIPPED
[INFO] Apache Hadoop Distributed Copy  SKIPPED
[INFO] Apache Hadoop Archives  SKIPPED
[INFO] Apache Hadoop Rumen ... SKIPPED
[INFO] Apache Hadoop Gridmix . SKIPPED
[INFO] Apache Hadoop Data Join ... SKIPPED
[INFO] Apache Hadoop Extras .. SKIPPED
[INFO] Apache Hadoop Pipes ... SKIPPED
[INFO] Apache Hadoop OpenStack support ... SKIPPED
[INFO] Apache Hadoop Azure support ... SKIPPED
[INFO] Apache Hadoop Client .. SKIPPED
[INFO] Apache Hadoop Mini-Cluster  SKIPPED
[INFO] Apache Hadoop Scheduler Load Simulator  SKIPPED
[INFO] Apache Hadoop Tools Dist .. SKIPPED
[INFO] Apache Hadoop Amazon Web Services support . SKIPPED
[INFO] Apache Hadoop Tools ... SKIPPED
[INFO] Apache Hadoop Distribution  SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 22:44 min
[INFO] Finished at: 2014-09-05T13:47:50+00:00
[INFO] Final Memory: 65M/833M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
project hadoop-common: ExecutionException; nested exception is 
java.util.concurrent.ExecutionException: java.lang.RuntimeException: The forked 
VM terminated without saying properly goodbye. VM crash or System.exit called ?
[ERROR] Command was/bin/sh -c cd 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/hadoop-common-project/hadoop-common
 && /home/jenkins/tools/java/jdk1.6.0_45-64/jre/bin/java -Xmx1024m 
-XX:+HeapDumpOnOutOfMemoryError -jar 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/hadoop-common-project/hadoop-common/target/surefire/surefirebooter6003405202689870343.jar
 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/hadoop-common-project/hadoop-common/target/surefire/surefire857949756753728249tmp
 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/hadoop-common-project/hadoop-common/target/surefire/surefire_158809730311942119770tmp
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-common
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Updating HDFS-6996
Updating HDFS-6714
Updating YARN-2431
Updating HADOOP-11063
Updating HDFS-6905
Updating HADOOP-11015
Updating HADOOP-11054
Updating YARN-2509
Updating YARN-2511
Updating HADOOP-11060
Updating HDFS-6886
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Created] (MAPREDUCE-6073) Description of mapreduce.job.speculative.slowtaskthreshold in mapred-default should be moved into description tags

2014-09-05 Thread Tsuyoshi OZAWA (JIRA)

Tsuyoshi OZAWA created MAPREDUCE-6073:
-

 Summary: Description of 
mapreduce.job.speculative.slowtaskthreshold in mapred-default should be moved 
into description tags
 Key: MAPREDUCE-6073
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6073
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: documentation
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
Priority: Trivial


Currently, the description of mapreduce.job.speculative.slowtaskthreshold in 
mapred-default sits outside the description tags. We should move it inside 
the description tags.

{code}
<property>
  <name>mapreduce.job.speculative.slowtaskthreshold</name>
  <value>1.0</value>The number of standard deviations by which a task's 
  ave progress-rates must be lower than the average of all running tasks'
  for the task to be considered too slow.
  <description>
  </description>
</property>
{code}




[jira] [Created] (MAPREDUCE-6074) native-task: fix release audit warnings

2014-09-05 Thread Todd Lipcon (JIRA)

Todd Lipcon created MAPREDUCE-6074:
--

 Summary: native-task: fix release audit warnings
 Key: MAPREDUCE-6074
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6074
 Project: Hadoop Map/Reduce
  Issue Type: Sub-task
Reporter: Todd Lipcon
Assignee: Todd Lipcon


RAT is showing some release audit warnings. They all look spurious - just need 
to do a little cleanup and add excludes




[jira] [Created] (MAPREDUCE-6075) HistoryServerFileSystemStateStore can create zero-length files

2014-09-05 Thread Jason Lowe (JIRA)

Jason Lowe created MAPREDUCE-6075:
-

 Summary: HistoryServerFileSystemStateStore can create zero-length 
files
 Key: MAPREDUCE-6075
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6075
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe


When the history server state store writes a token file, it uses 
IOUtils.cleanup() to close the file, which silently ignores errors. This 
can lead to empty token files in the state store.
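
The failure mode can be illustrated with a minimal standalone sketch. It does
not use Hadoop: closeQuietly is a hypothetical helper that imitates a
cleanup-style close, and FailingStream is a made-up stand-in for a stream whose
buffered data cannot be flushed when it is closed. The point is only that the
quiet path reports success even though nothing was written.

```java
import java.io.Closeable;
import java.io.IOException;

class QuietCloseSketch {
    // Stand-in for an output stream whose buffered data fails to flush on close.
    static class FailingStream implements Closeable {
        @Override
        public void close() throws IOException {
            throw new IOException("flush failed on close");
        }
    }

    // Hypothetical helper imitating a cleanup-style close: errors are swallowed.
    static void closeQuietly(Closeable c) {
        try {
            if (c != null) {
                c.close();
            }
        } catch (IOException ignored) {
            // The caller never learns that the data did not reach the file.
        }
    }

    // Returns true when the caller believes the token file was written.
    static boolean writeWithQuietClose() {
        Closeable out = new FailingStream();
        closeQuietly(out);
        return true; // the close failure is invisible here
    }

    // Same write, but the close error is allowed to surface.
    static boolean writeWithCheckedClose() {
        Closeable out = new FailingStream();
        try {
            out.close();
            return true;
        } catch (IOException e) {
            return false; // the caller can retry or remove the partial file
        }
    }

    public static void main(String[] args) {
        System.out.println("quiet close reports success: " + writeWithQuietClose());
        System.out.println("checked close reports success: " + writeWithCheckedClose());
    }
}
```

Closing explicitly on the success path, and keeping the quiet close only for
the error path, is one way to let such failures propagate.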




Review request: YARN-1492 (i.e. the YARN shared cache)

2014-09-05 Thread Chris Trezzo

Hi All,

This email is to draw more attention to YARN-1492 and hopefully gain
traction on code reviews. At Twitter we have been running the shared cache
on our clusters and it has already served over 36 million requests.

The new shared cache feature is relatively isolated from existing code and
is completely enabled/disabled by configuration. When disabled there are no
behavioral changes compared to the existing code base. It would be great to
see this committed to trunk and even more awesome if it makes 2.6.

This is a larger patch, but I have broken it up into a number of sub-tasks
in an attempt to make it more digestible for the review process. A couple
things to note:

1. The two patches that interact with existing code in a substantial way
are:
YARN-2236 https://issues.apache.org/jira/browse/YARN-2236 - This patch adds
the cache uploader service to the node manager.
MAPREDUCE-5951 https://issues.apache.org/jira/browse/MAPREDUCE-5951 - This
patch adds support for the shared cache at the MapReduce layer, allowing jobs
to cache job jars, lib jars, files and archives.

2. If you would like to try out the entire feature there is a big bang
patch in YARN-1492 and instructions on how to set it up in a comment on the
issue here
https://issues.apache.org/jira/browse/YARN-1492?focusedCommentId=14123617&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14123617
.

Please ping me if you have any questions or think there is something else I
could do to make reviewing easier.

Thanks!
Chris Trezzo

[VOTE] Merge branch MAPREDUCE-2841 to trunk

2014-09-05 Thread Todd Lipcon

Hi all,

As I've reported recently [1], work on the MAPREDUCE-2841 branch has
progressed well and the development team working on it feels that it is
ready to be merged into trunk.

For those not familiar with the JIRA (it's a bit lengthy to read from start
to finish!) the goal of this work is to build a native implementation of
the map-side sort code. The native implementation's primary advantage is
its speed: for example, terasort is 30% faster on a wall-clock basis and
60% faster on a resource consumption basis. For clusters which make heavy
use of MapReduce, this is a substantial improvement to their efficiency.
Users may enable the feature by switching a single configuration flag, and
it will fall back to the original implementation in cases where the native
code doesn't support the configured features/types.

The new work is entirely pluggable and off-by-default to mitigate risk. The
merge patch itself does not modify even a single line of existing code: all
necessary plug-points have already been committed to trunk for some time.

Though we do not yet have a full +1 precommit Jenkins run on the JIRA,
there are only a few small nits to fix before merge, so I figured that we
could start the vote in parallel. Of course we will not merge until it has
a positive precommit run.

Though this branch is a new contribution to the Apache repository, it
represents work done over several years by a large community of developers
including the following:

Binglin Chang
Yang Dong
Sean Zhong
Manu Zhang
Zhongliang Zhu
Vincent Wang
Yan Dong
Cheng Lian
Xusen Yin
Fangqin Dai
Jiang Weihua
Gansha Wu
Avik Dey

The vote will run for 7 days, ending Friday 9/12 EOD PST.

I'll start the voting with my own +1.

-Todd

[1]
http://search-hadoop.com/m/09oay13EwlV/native+task+progress&subj=Native+task+branch+progress

[jira] [Created] (MAPREDUCE-6076) Zero map split input length combine with none zero map split input length will cause MR1 job hung.

2014-09-05 Thread zhihai xu (JIRA)

zhihai xu created MAPREDUCE-6076:


 Summary: Zero map split input length combine with none zero  map 
split input length will cause MR1 job hung. 
 Key: MAPREDUCE-6076
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6076
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv1
Reporter: zhihai xu
Assignee: zhihai xu


A mix of zero-length and non-zero-length map split inputs can cause an MR1 
job to hang. 
This problem may happen when using an HBase input split (TableSplit).
The HBase split input length can be zero for unknown regions or non-zero for 
known regions, as in the following code:
{code}
// TableSplit.java
public long getLength() {
  return length;
}

// RegionSizeCalculator.java
public long getRegionSize(byte[] regionId) {
  Long size = sizeMap.get(regionId);
  if (size == null) {
    LOG.debug("Unknown region:" + Arrays.toString(regionId));
    return 0;
  } else {
    return size;
  }
}
{code}
The TableSplit length comes from RegionSizeCalculator.getRegionSize.

The job hangs because, in MR1, if these zero-length map tasks are scheduled 
and complete before all of the non-zero-length map tasks are scheduled, 
scheduling a new map task in JobInProgress.java fails the TaskTracker 
resource check at:
{code}
// findNewMapTask
// Check to ensure this TaskTracker has enough resources to 
// run tasks from this job
long outSize = resourceEstimator.getEstimatedMapOutputSize();
long availSpace = tts.getResourceStatus().getAvailableSpace();
if (availSpace < outSize) {
  LOG.warn("No room for map task. Node " + tts.getHost() +
           " has " + availSpace +
           " bytes free; but we expect map to take " + outSize);

  return -1; //see if a different TIP might work better. 
}
{code}
The resource calculation is at
{code}
// in ResourceEstimator.java
protected synchronized long getEstimatedTotalMapOutputSize() {
  if (completedMapsUpdates < threshholdToUse) {
    return 0;
  } else {
    long inputSize = job.getInputLength() + job.desiredMaps();
    //add desiredMaps() so that randomwriter case doesn't blow up
    //the multiplication might lead to overflow, casting it with
    //double prevents it
    long estimate = Math.round(((double)inputSize *
        completedMapsOutputSize * 2.0)/completedMapsInputSize);
    if (LOG.isDebugEnabled()) {
      LOG.debug("estimate total map output will be " + estimate);
    }
    return estimate;
  }
}

protected synchronized void updateWithCompletedTask(TaskStatus ts,
    TaskInProgress tip) {

  //-1 indicates error, which we don't average in.
  if (tip.isMapTask() && ts.getOutputSize() != -1) {
    completedMapsUpdates++;

    completedMapsInputSize += (tip.getMapInputSize() + 1);
    completedMapsOutputSize += ts.getOutputSize();

    if (LOG.isDebugEnabled()) {
      LOG.debug("completedMapsUpdates:" + completedMapsUpdates + "  " +
          "completedMapsInputSize:" + completedMapsInputSize + "  " +
          "completedMapsOutputSize:" + completedMapsOutputSize);
    }
  }
}
{code}
You can see in the calculation that completedMapsInputSize will be a very 
small number, while inputSize * completedMapsOutputSize will be a very big 
number.
For example, with completedMapsInputSize = 1, inputSize = 100MBytes and 
completedMapsOutputSize = 100MBytes, the estimate will be 5000TB, which is 
more than the disk space of most task trackers.
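
To make the arithmetic concrete, here is a minimal standalone sketch of the
quoted formula. The class and method names are illustrative rather than
Hadoop's, 1 MByte is assumed to be 2^20 bytes, and the small desiredMaps()
term added to inputSize is left out. Under those assumptions the total
estimate comes to 20000 TiB, in the same range as the thousands of TB
mentioned above.

```java
class EstimateSketch {
    // Same arithmetic as the quoted getEstimatedTotalMapOutputSize(), with the
    // three inputs passed in explicitly. Names are illustrative only.
    static long estimateTotalMapOutput(long inputSize,
                                       long completedMapsOutputSize,
                                       long completedMapsInputSize) {
        return Math.round(((double) inputSize
            * completedMapsOutputSize * 2.0) / completedMapsInputSize);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;   // assumption: 1 MByte = 2^20 bytes
        long tib = 1L << 40;

        // One completed map with a zero-length split: input size is 0 + 1.
        long total = estimateTotalMapOutput(100 * mb, 100 * mb, 1);
        System.out.println("estimated total map output: "
            + (total / tib) + " TiB");
    }
}
```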

So I think if the map split input length is 0, it means the split input length 
is unknown and it is reasonable to use map output size as input size for the 
calculation in ResourceEstimator. I will upload a fix based on this method.
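
A rough sketch of that idea follows. It is not the patch itself: the class,
the zeroMeansUnknown switch and the method names are all hypothetical, and it
only mirrors the two quoted methods closely enough to compare the estimate
with and without the fallback for a single completed map whose split length is
zero.

```java
class EstimatorFallbackSketch {
    private final boolean zeroMeansUnknown;   // hypothetical switch for the comparison
    private long completedMapsInputSize;
    private long completedMapsOutputSize;

    EstimatorFallbackSketch(boolean zeroMeansUnknown) {
        this.zeroMeansUnknown = zeroMeansUnknown;
    }

    // Mirrors updateWithCompletedTask, taking plain sizes instead of task objects.
    void updateWithCompletedMap(long mapInputSize, long outputSize) {
        if (outputSize == -1) {
            return; // -1 indicates error, which is not averaged in
        }
        long effectiveInput = mapInputSize;
        if (zeroMeansUnknown && mapInputSize == 0) {
            // Treat a zero split length as unknown and use the output size instead.
            effectiveInput = outputSize;
        }
        completedMapsInputSize += effectiveInput + 1;
        completedMapsOutputSize += outputSize;
    }

    // Mirrors the arithmetic of getEstimatedTotalMapOutputSize.
    long estimateTotalMapOutput(long inputSize) {
        return Math.round(((double) inputSize
            * completedMapsOutputSize * 2.0) / completedMapsInputSize);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;   // assumption: 1 MByte = 2^20 bytes

        EstimatorFallbackSketch current = new EstimatorFallbackSketch(false);
        current.updateWithCompletedMap(0, 100 * mb);
        System.out.println("without fallback: "
            + current.estimateTotalMapOutput(100 * mb) + " bytes");

        EstimatorFallbackSketch proposed = new EstimatorFallbackSketch(true);
        proposed.updateWithCompletedMap(0, 100 * mb);
        System.out.println("with fallback:    "
            + proposed.estimateTotalMapOutput(100 * mb) + " bytes");
    }
}
```

With the fallback, the estimate for this example stays just under 200 MBytes
instead of growing to thousands of TB.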




[VOTE] Release Apache Hadoop 2.5.1 RC0

2014-09-05 Thread Karthik Kambatla

Hi folks,

I have put together a release candidate (RC0) for Hadoop 2.5.1.

The RC is available at: http://people.apache.org/~kasha/hadoop-2.5.1-RC0/
The RC git tag is release-2.5.1-RC0
The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1010/

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the now usual 5
days.

Thanks
Karthik

Re: [VOTE] Merge branch MAPREDUCE-2841 to trunk

2014-09-05 Thread Chris Douglas

The change to the existing code is very limited and the perf is impressive. -C

On Fri, Sep 5, 2014 at 4:58 PM, Todd Lipcon t...@apache.org wrote:
Hi all,