[jira] [Resolved] (HDFS-6037) TestIncrementalBlockReports#testReplaceReceivedBlock fails occasionally in trunk

2015-03-07 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HDFS-6037.
--
Resolution: Cannot Reproduce

> TestIncrementalBlockReports#testReplaceReceivedBlock fails occasionally in 
> trunk
> 
>
> Key: HDFS-6037
> URL: https://issues.apache.org/jira/browse/HDFS-6037
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Ted Yu
>
> From 
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/1688/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestIncrementalBlockReports/testReplaceReceivedBlock/
>  :
> {code}
> datanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(
> <any>,
> <any>,
> <any>
> );
> Wanted 1 time:
> -> at 
> org.apache.hadoop.hdfs.server.datanode.TestIncrementalBlockReports.testReplaceReceivedBlock(TestIncrementalBlockReports.java:198)
> But was 2 times. Undesired invocation:
> -> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.reportReceivedDeletedBlocks(BPServiceActor.java:303)
> {code}
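
For context, a minimal sketch of this failure mode (the protocol interface
and names below are hypothetical stand-ins, not the actual Hadoop test): the
test verifies exactly one blockReceivedAndDeleted() call, but a background
actor thread, like BPServiceActor.reportReceivedDeletedBlocks() in the stack
trace above, can send a second report before verify() runs.

{code}
// Hedged sketch with hypothetical names; not the real Hadoop test.
// It reproduces the "Wanted 1 time ... But was 2 times" failure mode:
// a background thread issues a second report before verify() runs.
import static org.mockito.Mockito.*;

public class FlakyVerifySketch {
  // Stand-in for the mocked DatanodeProtocol client translator.
  interface BlockReportTarget {
    void blockReceivedAndDeleted(String reg, String poolId, Object[] reports);
  }

  public static void main(String[] args) throws InterruptedException {
    BlockReportTarget mockNN = mock(BlockReportTarget.class);

    // Report triggered directly by the test body.
    mockNN.blockReceivedAndDeleted("dn-1", "bp-1", new Object[0]);

    // A heartbeat/actor thread re-sends pending incremental reports.
    Thread actor = new Thread(
        () -> mockNN.blockReceivedAndDeleted("dn-1", "bp-1", new Object[0]));
    actor.start();
    actor.join();

    // Throws TooManyActualInvocations: wanted 1 time, but was 2 times.
    verify(mockNN, times(1)).blockReceivedAndDeleted(
        anyString(), anyString(), any(Object[].class));
  }
}
{code}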



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Looking to a Hadoop 3 release

2015-03-07 Thread Eric Yang
Some lessons learned during 2.x: WebHDFS, HDFS ACLs, QJM HA, and rolling
upgrade are great features.  MapReduce 1.x used resources more efficiently;
YARN containers have rigid constraints, and applications get killed
prematurely.  When a node hosts a lot of containers, YARN takes a
significant amount of system resources.  Running an existing daemon-based
application on top of YARN without code changes is impossible.  It is
difficult to pinpoint where services will run, and extra client-to-server
routing code needs to be written for the application.  Hence, the existing
MapReduce approach of spawning off parallelized workloads and writing
results to a durable file system is better.  A client-serving service
doesn't need to track state; it can read from HDFS, so some level of HA for
an external serving service can be achieved without YARN.  Slider provides
a better interface for exposing an API to deploy applications.
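
As an aside, a hedged sketch of that stateless-serving pattern (the path and
class names are illustrative, though the FileSystem API calls are standard
Hadoop): the serving tier keeps no local state and reads job output from
HDFS, so any replica can answer a request and failover needs no YARN
support.

{code}
// Hedged sketch of a stateless serving tier reading its state from HDFS.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StatelessServer {
  // Any replica of the service can run this; no local state is kept.
  public static byte[] readResult(String pathStr) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path path = new Path(pathStr);              // output of a batch job
    long len = fs.getFileStatus(path).getLen();
    byte[] buf = new byte[(int) len];
    try (FSDataInputStream in = fs.open(path)) {
      in.readFully(0, buf);                     // durable, shared source of truth
    }
    return buf;
  }
}
{code}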

It would be nice to support the following in 3.x:

- JDK 8
- Upgrade to the most recent version of Jetty; most hang or busy-CPU
problems come from Jetty 6.1.x being incompatible with JDK 7 in its NIO
design.
- Improve default security; there is a gap between the Default Container
Executor and the Linux Container Executor.  It would be nicer if default
security used the Linux Container Executor, to ensure developers remember
to run with doAs when designing services to run on top of Hadoop.
- Since 3.x is a major release number change, there may be
backward-incompatible API breakage initially in order to gain new
functionality.  Backward-compatibility patches can be added over time.
- Reduce YARN framework resource usage
- Improve usability of the YARN UI.  Drilling down from application to
container and back to the application view is almost unusable.
- Smarter strategy for container placement.  Some call this anti-affinity
support for YARN, but there are only a few types to support.  The
identified ones are shared, silo, and dedicated: in shared, containers can
co-locate on the same node; in silo, only one container of a given type can
run per node; dedicated reserves the entire node for the workload (see the
sketch after this list).
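
A hedged sketch (hypothetical API, not an existing YARN interface) of those
three placement modes, expressed as a per-node admission check at
scheduling time:

{code}
// Hedged sketch; hypothetical API, not an existing YARN interface.
enum Placement { SHARED, SILO, DEDICATED }

final class PlacementPolicy {
  private final Placement mode;

  PlacementPolicy(Placement mode) { this.mode = mode; }

  /**
   * Per-node admission check at scheduling time.
   * @param sameTypeOnNode containers of this type already on the node
   * @param totalOnNode    all containers already on the node
   */
  boolean admits(int sameTypeOnNode, int totalOnNode) {
    switch (mode) {
      case SHARED:    return true;                 // co-location allowed
      case SILO:      return sameTypeOnNode == 0;  // one of this type per node
      case DEDICATED: return totalOnNode == 0;     // whole node reserved
      default:        return false;
    }
  }

  public static void main(String[] args) {
    PlacementPolicy silo = new PlacementPolicy(Placement.SILO);
    System.out.println(silo.admits(0, 3));  // true: none of this type yet
    System.out.println(silo.admits(1, 3));  // false: type already present
  }
}
{code}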


regards,
Eric

On Fri, Mar 6, 2015 at 5:20 PM, Chris Douglas  wrote:

> On Fri, Mar 6, 2015 at 4:32 PM, Vinod Kumar Vavilapalli
>  wrote:
> > I'd encourage everyone to post their wish list on the Roadmap wiki that
> *warrants* making incompatible changes forcing us to go 3.x.
>
> This is a useful exercise, but not a prerequisite to releasing 3.0.0
> as an alpha off of trunk, right? Andrew summarized the operating
> assumptions for anyone working on it: rolling upgrades still work,
> wire compat is preserved, breaking changes may get rolled back when
> branch-3 is in beta (so be very conservative, notify others loudly).
> This applies to branches merged to trunk, also.
>
> > +1 to Jason's comments in general. We can keep rolling alphas that
> downstream can pick up, but I'd also like us to clarify the exit criterion
> for a GA release of 3.0 and its relation to the life of 2.x if we are going
> this route. This brings us back to the roadmap discussion, and a collective
> agreement about a logical step at a future point in time where we say we
> have enough incompatible features in 3.x that we can stop putting more of
> them and start stabilizing it.
>
> We'll have this discussion again. We don't need to reach consensus on
> the roadmap, just that each artifact reflects the output of the
> project.
>
> > Irrespective of that, here is my proposal in the interim:
> >  - Run JDK7 + JDK8 first in a compatible manner like I mentioned before
> for at least two releases in branch-2, say 2.8 and 2.9, before we consider
> taking up the gauntlet on 3.0.
> >  - Continue working on the classpath isolation effort and try making it
> as compatible as is possible for users to opt in and migrate easily.
>
> +1 for 2.x, but again I don't understand the sequencing. -C
>
> > On Mar 5, 2015, at 1:44 PM, Jason Lowe 
> wrote:
> >
> >> I'm OK with a 3.0.0 release as long as we are minimizing the pain of
> maintaining yet another release line and conscious of the incompatibilities
> going into that release line.
> >> For the former, I would really rather not see a branch-3 cut so soon.
> It's yet another line onto which to cherry-pick, and I don't see why we
> need to add this overhead at such an early phase.  We should only create
> branch-3 when there's an incompatible change that the community wants and
> it should _not_ go into the next major release (i.e.: it's for Hadoop
> 4.0).  We can develop 3.0 alphas and betas on trunk and release from trunk
> in the interim.  IMHO we need to stop treating trunk as a place to exile
> patches.
> >>
> >> For the latter, I think as a community we need to evaluate the benefits
> of breaking compatibility against the costs of migrating.  Each time we
> break compatibility we create a hurdle for people to jump when they move to
> the new release, and we should make those hurdles worth their time.  For
> example, wire-compatibility has bee

Jenkins build is back to normal : Hadoop-Hdfs-trunk-Java8 #116

2015-03-07 Thread Apache Jenkins Server
See 



Hadoop-Hdfs-trunk - Build # 2057 - Still Failing

2015-03-07 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2057/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 7500 lines...]
[INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-hdfs-project 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/hadoop-hdfs-project/target/test-dir
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-source-plugin:2.3:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-source-plugin:2.3:test-jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Skipping javadoc generation
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.12.1:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.0:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS ................................ FAILURE [  02:58 h]
[INFO] Apache Hadoop HttpFS .............................. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SKIPPED
[INFO] Apache Hadoop HDFS-NFS ............................ SKIPPED
[INFO] Apache Hadoop HDFS Project ........................ SUCCESS [  2.194 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 02:58 h
[INFO] Finished at: 2015-03-07T14:33:01+00:00
[INFO] Final Memory: 67M/619M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on 
project hadoop-hdfs: There was a timeout or other error in the fork -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Updating HDFS-7885
Updating YARN-3227
Updating YARN-3275
Updating HADOOP-11653
Updating YARN-2190
Updating HADOOP-11642
Updating HDFS-6488
Updating HDFS-7818
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
1 tests failed.
REGRESSION:  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.testFailoverRightBeforeCommitSynchronization

Error Message:
test timed out after 30000 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 30000 milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:236)
at 
org.apache.hadoop.test.GenericTestUtils$DelayAnswer.waitForCall(GenericTestUtils.java:226)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.testFailoverRightBeforeCommitSynchronization(TestPipelinesFailover.java:379)
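
For readers unfamiliar with this trace: GenericTestUtils.DelayAnswer parks
the test thread on a CountDownLatch until an instrumented call arrives; if
the call never comes, the JUnit timeout fires with exactly this
park()/await() stack.  A minimal sketch of the pattern (simplified; the
real DelayAnswer has more machinery, and the timed await below is an
illustrative assumption):

{code}
// Hedged sketch of the latch-wait pattern shown in the stack trace above.
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class DelayAnswerSketch {
  private final CountDownLatch fireLatch = new CountDownLatch(1);

  /** Called from the instrumented method when the awaited call arrives. */
  public void callArrived() {
    fireLatch.countDown();
  }

  /** The test thread blocks here; returns false if the call never happens. */
  public boolean waitForCall(long millis) throws InterruptedException {
    return fireLatch.await(millis, TimeUnit.MILLISECONDS);
  }

  public static void main(String[] args) throws InterruptedException {
    DelayAnswerSketch answer = new DelayAnswerSketch();
    // No thread ever calls callArrived(), so this waits the full second,
    // mirroring a test that hangs until its @Test(timeout=...) expires.
    System.out.println("call arrived? " + answer.waitForCall(1000));
  }
}
{code}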




Build failed in Jenkins: Hadoop-Hdfs-trunk #2057

2015-03-07 Thread Apache Jenkins Server
See 

Changes:

[jing9] HDFS-7885. Datanode should not trust the generation stamp provided by 
client. Contributed by Tsz Wo Nicholas Sze.

[xgong] YARN-3227. Timeline renew delegation token fails when RM user's TGT is

[aw] HADOOP-11653. shellprofiles should require .sh extension (Brahma Reddy 
Battula via aw)

[jianhe] YARN-2190. Added CPU and memory limit options to the default container 
executor for Windows containers. Contributed by Chuan Liu

[wheat9] HDFS-7818. OffsetParam should return the default value instead of 
throwing NPE when the value is unspecified. Contributed by Eric Payne.

[jlowe] YARN-3275. CapacityScheduler: Preemption happening on non-preemptable 
queues. Contributed by Eric Payne

[brandonli] HDFS-6488. Support HDFS superuser in NFSv3 gateway. Contributed by 
Brandon Li

[cnauroth] HADOOP-11642. Upgrade azure sdk version from 0.6.0 to 2.0.0. 
Contributed by Shashank Khandelwal and Ivan Mitic.

--
[...truncated 7307 lines...]
Running org.apache.hadoop.hdfs.qjournal.server.TestJournalNode
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.378 sec - in 
org.apache.hadoop.hdfs.qjournal.server.TestJournalNode
Running org.apache.hadoop.hdfs.qjournal.server.TestJournal
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.811 sec - in 
org.apache.hadoop.hdfs.qjournal.server.TestJournal
Running org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeMXBean
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.63 sec - in 
org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeMXBean
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManagerUnit
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.46 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManagerUnit
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager
Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.974 sec - 
in org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager
Running org.apache.hadoop.hdfs.qjournal.client.TestSegmentRecoveryComparator
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.3 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestSegmentRecoveryComparator
Running org.apache.hadoop.hdfs.qjournal.client.TestIPCLoggerChannel
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.163 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestIPCLoggerChannel
Running org.apache.hadoop.hdfs.qjournal.client.TestEpochsAreUnique
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.24 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestEpochsAreUnique
Running org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 151.937 sec - 
in org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumCall
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.262 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestQuorumCall
Running org.apache.hadoop.hdfs.qjournal.TestMiniJournalCluster
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.814 sec - in 
org.apache.hadoop.hdfs.qjournal.TestMiniJournalCluster
Running org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.552 sec - in 
org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
Running org.apache.hadoop.hdfs.TestConnCache
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.866 sec - in 
org.apache.hadoop.hdfs.TestConnCache
Running org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 61.882 sec - in 
org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
Running org.apache.hadoop.hdfs.TestFileAppend
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.62 sec - in 
org.apache.hadoop.hdfs.TestFileAppend
Running org.apache.hadoop.hdfs.TestFileAppend3
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 41.363 sec - 
in org.apache.hadoop.hdfs.TestFileAppend3
Running org.apache.hadoop.hdfs.TestClientReportBadBlock
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.115 sec - in 
org.apache.hadoop.hdfs.TestClientReportBadBlock
Running org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.068 sec - in 
org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
Running org.apache.hadoop.hdfs.TestFileCreation
Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 381.732 sec - 
in org.apache.hadoop.hdfs.TestFileCreation
Running org.apache.hadoop.hdfs.TestDFSRemove
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.364 sec - in 
org.apache.hadoop.hdfs.TestDFSRemove
Running org.apache.hadoop.hdfs.TestHdfsAdmin
Tests run: 2, F