Build failed in Jenkins: Hadoop-Common-trunk #945

2013-11-07 Thread Apache Jenkins Server
See 

Changes:

[cnauroth] HADOOP-9660. Update CHANGES.txt.

[cnauroth] MAPREDUCE-5451. MR uses LD_LIBRARY_PATH which doesn't mean anything 
in Windows. Contributed by Yingda Chen.

--
[...truncated 57370 lines...]
Adding reference: maven.local.repository
[DEBUG] Initialize Maven Ant Tasks
parsing buildfile 
jar:file:/home/jenkins/.m2/repository/org/apache/maven/plugins/maven-antrun-plugin/1.7/maven-antrun-plugin-1.7.jar!/org/apache/maven/ant/tasks/antlib.xml
 with URI = 
jar:file:/home/jenkins/.m2/repository/org/apache/maven/plugins/maven-antrun-plugin/1.7/maven-antrun-plugin-1.7.jar!/org/apache/maven/ant/tasks/antlib.xml
 from a zip file
parsing buildfile 
jar:file:/home/jenkins/.m2/repository/org/apache/ant/ant/1.8.2/ant-1.8.2.jar!/org/apache/tools/ant/antlib.xml
 with URI = 
jar:file:/home/jenkins/.m2/repository/org/apache/ant/ant/1.8.2/ant-1.8.2.jar!/org/apache/tools/ant/antlib.xml
 from a zip file
Class org.apache.maven.ant.tasks.AttachArtifactTask loaded from parent loader 
(parentFirst)
 +Datatype attachartifact org.apache.maven.ant.tasks.AttachArtifactTask
Class org.apache.maven.ant.tasks.DependencyFilesetsTask loaded from parent 
loader (parentFirst)
 +Datatype dependencyfilesets org.apache.maven.ant.tasks.DependencyFilesetsTask
Setting project property: test.build.dir -> 

Setting project property: test.exclude.pattern -> _
Setting project property: hadoop.assemblies.version -> 3.0.0-SNAPSHOT
Setting project property: test.exclude -> _
Setting project property: distMgmtSnapshotsId -> apache.snapshots.https
Setting project property: project.build.sourceEncoding -> UTF-8
Setting project property: java.security.egd -> file:///dev/urandom
Setting project property: distMgmtSnapshotsUrl -> 
https://repository.apache.org/content/repositories/snapshots
Setting project property: distMgmtStagingUrl -> 
https://repository.apache.org/service/local/staging/deploy/maven2
Setting project property: avro.version -> 1.7.4
Setting project property: test.build.data -> 

Setting project property: commons-daemon.version -> 1.0.13
Setting project property: hadoop.common.build.dir -> 

Setting project property: testsThreadCount -> 4
Setting project property: maven.test.redirectTestOutputToFile -> true
Setting project property: jdiff.version -> 1.0.9
Setting project property: distMgmtStagingName -> Apache Release Distribution 
Repository
Setting project property: project.reporting.outputEncoding -> UTF-8
Setting project property: build.platform -> Linux-i386-32
Setting project property: protobuf.version -> 2.5.0
Setting project property: failIfNoTests -> false
Setting project property: protoc.path -> ${env.HADOOP_PROTOC_PATH}
Setting project property: jersey.version -> 1.9
Setting project property: distMgmtStagingId -> apache.staging.https
Setting project property: distMgmtSnapshotsName -> Apache Development Snapshot 
Repository
Setting project property: ant.file -> 

[DEBUG] Setting properties with prefix: 
Setting project property: project.groupId -> org.apache.hadoop
Setting project property: project.artifactId -> hadoop-common-project
Setting project property: project.name -> Apache Hadoop Common Project
Setting project property: project.description -> Apache Hadoop Common Project
Setting project property: project.version -> 3.0.0-SNAPSHOT
Setting project property: project.packaging -> pom
Setting project property: project.build.directory -> 

Setting project property: project.build.outputDirectory -> 

Setting project property: project.build.testOutputDirectory -> 

Setting project property: project.build.sourceDirectory -> 

Setting project property: project.build.testSourceDirectory -> 

Setting project property: localRepository ->id: local
  url: file:///home/jenkins/.m2/repository/
   layout: none
Setting project property: settings.localRepository -> 
/home/jenkins/.m2/repository
Setting project property: maven.project.dependencies.versions -> 
[INFO] Executing tasks
Build sequence for target(s) `main' is [main]
Comple

Re: [DISCUSS] What is the purpose of merge vote threads?

2013-11-07 Thread Chris Nauroth
Thank you to everyone who replied.  Even though it sounds like there is not
complete consensus on some of the finer points, I think I have a clearer
understanding on how to participate now.

I do think posting all requirements in JIRA before calling the merge vote
makes the process more effective.  Contributors who haven't been following
the branch closely can get up to speed quickly by reading a refreshed
design doc and test plan.  Getting a +1 from Jenkins before the vote helps
reviewers focus on the logic instead of problems that could be caught by
static analysis.

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Fri, Oct 25, 2013 at 12:04 PM, Doug Cutting  wrote:

> On Fri, Oct 25, 2013 at 10:56 AM, Vinod Kumar Vavilapalli
>  wrote:
> > Discussing on a voting thread is not productive.
>
> When all votes are +1 then no discussion is needed.  One shouldn't
> call a vote unless one expects all votes to be +1.  But, if
> unexpectedly they're not all +1, then a discussion must ensue, to
> substantiate the veto and to try to establish a remedy.
>
> It seems overly formal to immediately terminate all votes at the first
> expression of concern, restarting them later.  That puts process ahead
> of practicality and progress.  Rather, if an unforeseen yet easily
> addressed concern is raised during a vote then folks might reasonably
> agree to continue without restarting the vote.
>
> The purpose of the vote is to establish consensus.  If consensus is
> determined, then there's no need to delay.  So a vote can pass when
> the -1 voters change their vote to +1.  This might not hold if a
> remedy might be considered controversial, and its inclusion might
> reasonably invalidate prior +1 votes.  Then more time might be given
> for folks to consider the remedy.  But when the remedy is trivial it
> needn't be held to a higher voting standard than a regular patch.
>
> Commits differ from releases since a release cannot be easily altered
> once published.  However a commit can be amended by subsequent
> commits.  We certainly want to minimize the need for subsequent
> commits, but don't need the same level of confidence.  With branch
> merge votes we should focus on the issue of whether the project is
> ready to assume the burden of maintaining the new functionality, since
> it's much harder to remove things than add them.  That's the reason
> for the one-week, 3 +1 rule.  For minor issues like compiler warnings,
> a fix to a merge patch should be held to the same standard as any
> other patch.
>
> Doug
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


Next releases

2013-11-07 Thread Arun C Murthy
Gang,

 Thinking through the next couple of releases here; feedback appreciated.

 # hadoop-2.2.1

 I was looking through the commit logs and there is a *lot* of content here (81 
commits as of 11/7). Some are features/improvements and some are fixes - it's 
really hard to distinguish what is important and what isn't.

 I propose we start with a blank slate (i.e. blow away branch-2.2 and start 
fresh from a copy of branch-2.2.0) and then be very careful and meticulous 
about including only *blocker* fixes in branch-2.2. So, most of the content 
here comes via the next minor release (i.e. hadoop-2.3).

 In the future, we should continue to be *very* parsimonious about what gets 
into a patch release (major.minor.patch) - in general, these should be only 
*blocker* fixes or fixes for key operational issues.

 # hadoop-2.3
 
 I'd like to propose the following features for YARN/MR to make it into 
hadoop-2.3 and punt the rest to hadoop-2.4 and beyond:
 * Application History Server - This is happening in a branch and is close; 
with it we can provide a reasonable experience for new frameworks being built 
on top of YARN.
 * Bug-fixes in RM Restart
 * Minimal support for long-running applications (e.g. security) via YARN-896
 * RM Fail-over via ZKFC
 * Anything else?

 What about HDFS?

 Overall, I feel like we have a decent chance of rolling hadoop-2.3 by the end 
of the year.

 Thoughts?

thanks,
Arun
 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/





release-2.2.0 tag?

2013-11-07 Thread Mohammad Islam
Hi,
I could not find a release tag for 2.2.0.
Which branch should I use instead for the latest released version: branch-2.2.0 
or branch-2.2?

Regards,
Mohammad

Unable to perform terasort for 50GB of data

2013-11-07 Thread Khai Cher LIM (NYP)
Dear all,

I have just started learning Hadoop setup and I am having a problem running 
terasort on my Hadoop cluster. My input folder contains 50 GB of data, but when 
I run terasort, the tasks fail with the error message shown in the following 
screenshot.

[screenshot attachment image001.png: error message from the failed terasort tasks]

I've set my dfs block size to 128 MB. With the default 64 MB, the tasks also 
failed for the same reason.

Server information - HP ProLiant DL380p Gen8 (2U)
* two Intel Xeon E5-2640 processors with 15 MB cache, 2.5 GHz, 
7.2 GT/s
* 48GB RAM
* 12 x 1TB (or a raw capacity of 12TB) 6G SAS 7.2K 3.5 HDD
* RAID controller that supports RAID 5 with at least 512MB 
Flash-Backed Write Cache (FBWC)
* on-board adapter of 4 x 1GbE Ethernet port
* 2 hot-pluggable power supply units

I've configured two servers with virtual machines as described below:
Server 1:
1 Name Node - 32 GB RAM, 300 GB HDD space
4 Data Nodes - 16 GB RAM, 300 GB HDD space

Server 2:
1 Secondary Name Node - 32 GB RAM, 300 GB HDD space
4 Data Nodes - 16 GB RAM, 300 GB HDD space

I've checked that the disk space used per data node is about 20% on average. 
Thus I couldn't understand the error message complaining about "no space left 
on device".

Any help is much appreciated.

Thank you.

Regards,
Khai Cher



Re: release-2.2.0 tag?

2013-11-07 Thread Tsuyoshi OZAWA
Hi Mohammad,

IIUC, 2.2.0 is the latest released version. branch-2.2 is the latest
branch under development.

On Fri, Nov 8, 2013 at 11:52 AM, Mohammad Islam  wrote:
> Hi,
> I could not find out any release tag for 2.2.0.
> Which branch should I use instead for latest released version? branch-2.2.0 
> or branch-2.2?
>
> Regards,
> Mohammad



-- 
- Tsuyoshi


Re: Unable to perform terasort for 50GB of data

2013-11-07 Thread inelu nagamallikarjuna
Hi,

Check the individual data nodes' usage:
hadoop dfsadmin -report
Also, override the config parameter mapred.local.dir to store intermediate
data on some path other than the /tmp directory. Don't use a single reducer;
increase the number of reducers and use TotalOrderPartitioner.
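
As a rough sketch, the advice above could translate into a mapred-site.xml
fragment like the following. The local-disk paths and the reducer count are
illustrative assumptions for this cluster, not recommended values:

```xml
<!-- Illustrative mapred-site.xml fragment; paths and values are placeholders. -->
<configuration>
  <property>
    <name>mapred.local.dir</name>
    <!-- Spread intermediate map output over real data disks instead of /tmp -->
    <value>/data/1/mapred/local,/data/2/mapred/local</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <!-- Default number of reducers per job; use more than one -->
    <value>16</value>
  </property>
</configuration>
```

Note that the terasort example already wires in a total order partitioner, so
the main levers here are the local directories and the reducer count.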

Thanks
Nagamallikarjuna
On Nov 8, 2013 10:40 AM, "Khai Cher LIM (NYP)" 
wrote:
