[jira] [Resolved] (HDFS-7925) truncate RPC should not be considered as idempotent

2015-03-12 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey resolved HDFS-7925.

Resolution: Duplicate

Duplicate of HDFS-7926

> truncate RPC should not be considered as idempotent
> ---
>
> Key: HDFS-7925
> URL: https://issues.apache.org/jira/browse/HDFS-7925
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Brandon Li
>
> Currently truncate is considered an idempotent call in ClientProtocol. 
> However, the retried RPC request can get a lease error like the following:
> 2015-03-12 11:45:47,320 INFO  ipc.Server (Server.java:run(2053)) - IPC Server 
> handler 6 on 8020, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.truncate from 
> 192.168.76.4:49763 Call#1 Retry#1: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to 
> TRUNCATE_FILE /user/testuser/testFileTr for 
> DFSClient_NONMAPREDUCE_171671673_1 on 192.168.76.4 because 
> DFSClient_NONMAPREDUCE_171671673_1 is already the current lease holder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7911) Buffer Overflow when running HBase on HDFS Encryption Zone

2015-03-12 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu resolved HDFS-7911.
--
Resolution: Duplicate

> Buffer Overflow when running HBase on HDFS Encryption Zone
> --
>
> Key: HDFS-7911
> URL: https://issues.apache.org/jira/browse/HDFS-7911
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 2.6.0
>Reporter: Xiaoyu Yao
>Assignee: Yi Liu
>Priority: Blocker
>
> We created an HDFS encryption zone (EZ) for HBase under /apps/hbase, and basic testing passed, 
> including creating tables, listing, adding a few rows, scanning them, etc. 
> However, when bulk loading hundreds of thousands of rows, after 10 minutes or so we get 
> the following error on the Region Server that owns the table.
> {code}
> 2015-03-02 10:25:47,784 FATAL [regionserver60020-WAL.AsyncSyncer0] 
> wal.FSHLog: Error while AsyncSyncer sync, request close of hlog 
> java.io.IOException: java.nio.BufferOverflowException 
> at 
> org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:156)
> at 
> org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.encrypt(JceAesCtrCryptoCodec.java:127)
> at 
> org.apache.hadoop.crypto.CryptoOutputStream.encrypt(CryptoOutputStream.java:162)
>  
> at 
> org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:232)
>  
> at 
> org.apache.hadoop.crypto.CryptoOutputStream.hflush(CryptoOutputStream.java:267)
>  
> at 
> org.apache.hadoop.crypto.CryptoOutputStream.sync(CryptoOutputStream.java:262) 
> at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:123) 
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
>  
> at 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
>  
> at java.lang.Thread.run(Thread.java:744) 
> Caused by: java.nio.BufferOverflowException 
> at java.nio.DirectByteBuffer.put(DirectByteBuffer.java:357) 
> at javax.crypto.CipherSpi.bufferCrypt(CipherSpi.java:823) 
> at javax.crypto.CipherSpi.engineUpdate(CipherSpi.java:546) 
> at javax.crypto.Cipher.update(Cipher.java:1760) 
> at 
> org.apache.hadoop.crypto.JceAesCtrCryptoCodec$JceAesCtrCipher.process(JceAesCtrCryptoCodec.java:145)
> ... 9 more 
> {code}
> It looks like the HBase WAL (Write Ahead Log) use case is broken on 
> CryptoOutputStream. The use case has one flusher thread that keeps calling 
> hflush() on the WAL file while other roller threads write 
> concurrently to the same file handle.
> As the class comment says, *"CryptoOutputStream encrypts data. It is 
> not thread-safe."* I checked the code and it seems the buffer overflow is 
> caused by a race between CryptoOutputStream#write() and 
> CryptoOutputStream#flush(), as both can call CryptoOutputStream#encrypt(). The 
> inBuffer/outBuffer of the CryptoOutputStream are not thread safe: they can be 
> changed by encrypt() during flush() while write() is coming in from other threads.
> I have validated this with multi-threaded unit tests that mimic the HBase WAL 
> use case. For a file not under an encryption zone (*DFSOutputStream*), the 
> multi-threaded flusher/writer works fine. For a file under an encryption zone 
> (*CryptoOutputStream*), the multi-threaded flusher/writer randomly fails with 
> Buffer Overflow/Underflow.
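> A minimal sketch of that access pattern (illustrative only, not the actual 
> unit test; the path, record size, and iteration counts are made up, and 
> {{fs}} is assumed to be an open FileSystem for the cluster under test):
> {code}
> final FSDataOutputStream out = fs.create(new Path("/apps/hbase/wal-sketch"));
> ExecutorService pool = Executors.newFixedThreadPool(2);
> pool.submit(() -> {                // writer: keeps appending WAL-like records
>   byte[] record = new byte[512];
>   for (int i = 0; i < 10000; i++) {
>     out.write(record);             // may call CryptoOutputStream#encrypt()
>   }
>   return null;
> });
> pool.submit(() -> {                // flusher: plays the AsyncSyncer role
>   for (int i = 0; i < 10000; i++) {
>     out.hflush();                  // also calls CryptoOutputStream#encrypt()
>   }
>   return null;
> });
> pool.shutdown();
> {code}
> Outside an encryption zone this pattern runs cleanly; on a file inside an 
> encryption zone it randomly hits the Buffer Overflow/Underflow above.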



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HDFS-7267) TestBalancer#testUnknownDatanode occasionally fails in trunk

2015-03-12 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko reopened HDFS-7267:
---

Reopening. Just saw a failure on Jenkins.
There are a bunch of NPEs; the assertion failure is probably a consequence.
Attaching the logs from the failed run.

> TestBalancer#testUnknownDatanode occasionally fails in trunk
> 
>
> Key: HDFS-7267
> URL: https://issues.apache.org/jira/browse/HDFS-7267
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Ted Yu
>Priority: Minor
> Attachments: testUnknownDatanode-failed-log.html
>
>
> In build #1907 (https://builds.apache.org/job/Hadoop-Hdfs-trunk/1907/):
> {code}
> REGRESSION:  
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.testUnknownDatanode
> Error Message:
> expected:<0> but was:<-3>
> Stack Trace:
> java.lang.AssertionError: expected:<0> but was:<-3>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.testUnknownDatanode(TestBalancer.java:737)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7926) NameNode implementation of ClientProtocol.truncate(..) is not idempotent

2015-03-12 Thread Tsz Wo Nicholas Sze (JIRA)
Tsz Wo Nicholas Sze created HDFS-7926:
-

 Summary: NameNode implementation of ClientProtocol.truncate(..) is 
not idempotent
 Key: HDFS-7926
 URL: https://issues.apache.org/jira/browse/HDFS-7926
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze


If the DFSClient drops the first response of a truncate RPC call, the retry 
served by the retry cache will fail with "DFSClient ... is already the current 
lease holder".  The truncate RPC is annotated as @Idempotent in ClientProtocol, 
but the NameNode implementation is not idempotent.
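
A minimal sketch (illustrative only, nothing attached to this issue) of how the 
retried call can be mimicked against a MiniDFSCluster; the second truncate below 
stands in for the retry-cache replay, and the file length/offsets are arbitrary:
{code}
Configuration conf = new Configuration();
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
try {
  DistributedFileSystem fs = cluster.getFileSystem();
  Path file = new Path("/user/testuser/testFileTr");
  DFSTestUtil.createFile(fs, file, 1024, (short) 3, 0L);

  // Truncating to a mid-block offset needs block recovery, so the first call
  // returns false and this client keeps the lease while recovery runs.
  boolean done = fs.truncate(file, 100);

  // Stand-in for the retried RPC: expected to fail with
  // AlreadyBeingCreatedException because this DFSClient already holds the lease.
  fs.truncate(file, 100);
} finally {
  cluster.shutdown();
}
{code}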



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7925) truncate RPC should not be considered idempotent

2015-03-12 Thread Brandon Li (JIRA)
Brandon Li created HDFS-7925:


 Summary: truncate RPC should not be considered idempotent
 Key: HDFS-7925
 URL: https://issues.apache.org/jira/browse/HDFS-7925
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Brandon Li


Currently truncate is considered an idempotent call in ClientProtocol. 
However, the retried RPC request can get a lease error like the following:

2015-03-12 11:45:47,320 INFO  ipc.Server (Server.java:run(2053)) - IPC Server 
handler 6 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.truncate 
from 192.168.76.4:49763 Call#1 Retry#1: 
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to 
TRUNCATE_FILE /user/hrt_qa/testFileTr for DFSClient_NONMAPREDUCE_171671673_1 on 
192.168.76.4 because DFSClient_NONMAPREDUCE_171671673_1 is already the current 
lease holder.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7924) NameNode goes into infinite lease recovery

2015-03-12 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-7924:
---

 Summary: NameNode goes into infinite lease recovery
 Key: HDFS-7924
 URL: https://issues.apache.org/jira/browse/HDFS-7924
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: 2.6.0
Reporter: Arpit Agarwal


We encountered an HDFS lease recovery issue. All DataNodes and NameNodes were 
restarted while a client was running. A block had been created on the NN but 
not yet on the DNs. On restart the NN tried to recover the lease for the block 
but was unable to do so, getting stuck in an infinite loop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages

2015-03-12 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-7923:
--

 Summary: The DataNodes should rate-limit their full block reports 
by asking the NN on heartbeat messages
 Key: HDFS-7923
 URL: https://issues.apache.org/jira/browse/HDFS-7923
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Colin Patrick McCabe
Assignee: Charles Lamb


The DataNodes should rate-limit their full block reports.  They can do this by 
first sending a heartbeat message to the NN with an optional boolean set that 
requests permission to send a full block report.  If the NN responds with 
another optional boolean set, the DN sends the FBR; if not, it waits until a 
later heartbeat.  Because the fields are optional, this can be done compatibly.
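
A hypothetical model of that handshake (the type and method names below are 
invented for illustration; they are not the actual DatanodeProtocol/heartbeat 
API):
{code}
class HeartbeatResponseModel {
  // Optional field: null on NameNodes that predate the feature, which keeps
  // the exchange wire-compatible.
  final Boolean fullBlockReportApproved;
  HeartbeatResponseModel(Boolean approved) { this.fullBlockReportApproved = approved; }
}

class BlockReportThrottleSketch {
  boolean fullBlockReportDue = true;   // DN-side state: an FBR is pending

  void onHeartbeatResponse(HeartbeatResponseModel resp) {
    // The request side (not shown) would set an optional boolean asking for
    // permission whenever fullBlockReportDue is true.
    if (fullBlockReportDue && Boolean.TRUE.equals(resp.fullBlockReportApproved)) {
      sendFullBlockReport();
      fullBlockReportDue = false;
    }
    // Otherwise wait and ask again on a later heartbeat; an NN that never sets
    // the field leaves the DN on its existing block report schedule.
  }

  void sendFullBlockReport() { /* placeholder */ }
}
{code}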



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7922) ShortCircuitCache#close is not releasing ScheduledThreadPoolExecutors

2015-03-12 Thread Rakesh R (JIRA)
Rakesh R created HDFS-7922:
--

 Summary: ShortCircuitCache#close is not releasing 
ScheduledThreadPoolExecutors
 Key: HDFS-7922
 URL: https://issues.apache.org/jira/browse/HDFS-7922
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Rakesh R
Assignee: Rakesh R


ShortCircuitCache has the following executors. It would be good to shut down 
these pools during ShortCircuitCache#close to avoid leaks.

{code}
  /**
   * The executor service that runs the cacheCleaner.
   */
  private final ScheduledThreadPoolExecutor cleanerExecutor
  = new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder().
  setDaemon(true).setNameFormat("ShortCircuitCache_Cleaner").
  build());

  /**
   * The executor service that runs the slot releaser.
   */
  private final ScheduledThreadPoolExecutor releaserExecutor
  = new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder().
  setDaemon(true).setNameFormat("ShortCircuitCache_SlotReleaser").
  build());
{code}
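
A minimal sketch of what the fix could look like in ShortCircuitCache#close 
(just an illustration; the await timeout is an assumption, not the committed 
change):
{code}
// Stop accepting new tasks, then give in-flight cleaner/releaser runs a
// bounded time to finish before forcing shutdown.
cleanerExecutor.shutdown();
releaserExecutor.shutdown();
try {
  if (!cleanerExecutor.awaitTermination(30, TimeUnit.SECONDS)) {
    cleanerExecutor.shutdownNow();
  }
  if (!releaserExecutor.awaitTermination(30, TimeUnit.SECONDS)) {
    releaserExecutor.shutdownNow();
  }
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
}
{code}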



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7921) FileSystem listFiles doesn't list the directories if recursive is false

2015-03-12 Thread Sowmya Ramesh (JIRA)
Sowmya Ramesh created HDFS-7921:
---

 Summary: FileSystem listFiles doesn't list the directories if 
recursive is false
 Key: HDFS-7921
 URL: https://issues.apache.org/jira/browse/HDFS-7921
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Sowmya Ramesh


The code below lists only files, and not directories, when recursive is set to 
false. If recursive is set to true it lists all dirs and files. If recursive is 
set to false it should behave similarly to hadoop fs -ls, which is not the case.

{code}
FileSystem fs = FileSystem.get(uri, conf);
RemoteIterator<LocatedFileStatus> fileStatusListIterator =
    fs.listFiles(new Path("/tmp"), false);

while (fileStatusListIterator.hasNext()) {
  LocatedFileStatus fileStatus = fileStatusListIterator.next();
  System.out.println("Path: " + fileStatus.getPath());
}

Test results :
Path: hdfs://240.0.0.10:8020/tmp/idtest.hadoopqe.580215.29151.in
Path: hdfs://240.0.0.10:8020/tmp/idtest.hadoopqe.580215.29151.pig

[root@node-1 hive-repl-recipe]# hadoop fs -ls /tmp
Found 4 items
drwx-wx-wx   - hadoopqe hdfs  0 2015-03-02 17:52 /tmp/hive
drwxr-xr-x   - hadoopqe hdfs  0 2015-03-02 17:51 /tmp/id.out
-rw-r--r--   3 hadoopqe hdfs   2605 2015-03-02 17:58 
/tmp/idtest.hadoopqe.580215.29151.in
-rw-r--r--   3 hadoopqe hdfs159 2015-03-02 17:58 
/tmp/idtest.hadoopqe.580215.29151.pig

{code}

Is this the intended behavior? It is weird to list just the files and not the 
directories when recursive is set to false.

If listStatus should be used instead, can we deprecate the listFiles API?
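
For reference, a small sketch of the listStatus alternative (the path is just 
an example); it returns both files and directories, non-recursively:
{code}
for (FileStatus status : fs.listStatus(new Path("/tmp"))) {
  // FileStatus distinguishes files from directories, unlike listFiles().
  System.out.println((status.isDirectory() ? "dir:  " : "file: ") + status.getPath());
}
{code}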

Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7920) Fix WebHDFS AuthFilter to use DelegationTokenAuthenticationFilter

2015-03-12 Thread Arun Suresh (JIRA)
Arun Suresh created HDFS-7920:
-

 Summary: Fix WebHDFS AuthFilter to use 
DelegationTokenAuthenticationFilter
 Key: HDFS-7920
 URL: https://issues.apache.org/jira/browse/HDFS-7920
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Reporter: Arun Suresh
Assignee: Arun Suresh


The {{AuthFilter}} currently overrides the {{AuthenticationFilter}} to bypass 
Kerberos authentication if it finds a DelegationToken param in the request. It 
doesn't verify/validate the token. This is handled properly in the 
{{DelegationTokenAuthenticationFilter}} / 
{{KerberosDelegationTokenAuthenticationHandler}}.

This will also work in an HA setup if the DelegationTokenHandler is configured 
to use a distributed DelegationTokenSecretManager like 
{{ZKDelegationTokenSecretManager}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Hadoop-Hdfs-trunk-Java8 - Build # 121 - Still Failing

2015-03-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/121/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 8924 lines...]
[INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-hdfs-project 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/target/test-dir
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-source-plugin:2.3:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-source-plugin:2.3:test-jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.12.1:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.0:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  FAILURE [  03:13 h]
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [  1.723 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 03:13 h
[INFO] Finished at: 2015-03-12T14:48:41+00:00
[INFO] Final Memory: 55M/244M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on 
project hadoop-hdfs: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Updating HADOOP-11703
Updating HADOOP-11693
Updating HADOOP-10027
Updating HDFS-7491
Updating YARN-1884
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
9 tests failed.
FAILED:  
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect

Error Message:
The map of version counts returned by DatanodeManager was not what it was 
expected to be on iteration 365 expected:<0> but was:<1>

Stack Trace:
java.lang.AssertionError: The map of version counts returned by DatanodeManager 
was not what it was expected to be on iteration 365 expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at 
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect(TestDatanodeManager.java:157)


REGRESSION:  
org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline

Error Message:
After waiting the operation updatePipeline still has not taken effect on NN yet

Stack Trace:
java.lang.AssertionError: After waiting the operation updatePipeline still has 
not taken effect on NN yet
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testClientRetryWithFailover(TestRetryCacheWithHA.java:1280)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline(TestRetryCacheWithHA.java:1178)


FAILED:  
org.apache.hadoop.hdfs.server.namenode.Tes

Build failed in Jenkins: Hadoop-Hdfs-trunk-Java8 #121

2015-03-12 Thread Apache Jenkins Server
See 

Changes:

[aw] HADOOP-11703. git should ignore .DS_Store files on Mac OS X (Abin Shahab 
via aw)

[cnauroth] HDFS-7491. Add incremental blockreport latency to DN metrics. 
Contributed by Ming Ma.

[cnauroth] HADOOP-11693. Azure Storage FileSystem rename operations are 
throttled too aggressively to complete HBase WAL archiving. Contributed by Duo 
Xu.

[zjshen] YARN-1884. Added nodeHttpAddress into ContainerReport and fixed the 
link to NM web page. Contributed by Xuan Gong.

[cnauroth] HADOOP-10027. *Compressor_deflateBytesDirect passes instance instead 
of jclass to GetStaticObjectField. Contributed by Hui Zheng.

--
[...truncated 8731 lines...]
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.961 sec - in 
org.apache.hadoop.fs.TestSymlinkHdfsDisable
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestSymlinkHdfsFileSystem
Tests run: 72, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 7.682 sec - in 
org.apache.hadoop.fs.TestSymlinkHdfsFileSystem
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestUrlStreamHandler
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.445 sec - in 
org.apache.hadoop.fs.TestUrlStreamHandler
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestXAttr
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.088 sec - in 
org.apache.hadoop.fs.TestXAttr
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestUnbuffer
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.989 sec - in 
org.apache.hadoop.fs.TestUnbuffer
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestFcHdfsCreateMkdir
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.408 sec - in 
org.apache.hadoop.fs.TestFcHdfsCreateMkdir
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestEnhancedByteBufferAccess
Tests run: 10, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 17.118 sec - 
in org.apache.hadoop.fs.TestEnhancedByteBufferAccess
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestHdfsNativeCodeLoader
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.163 sec - in 
org.apache.hadoop.fs.TestHdfsNativeCodeLoader
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestUrlStreamHandlerFactory
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.213 sec - in 
org.apache.hadoop.fs.TestUrlStreamHandlerFactory
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestHDFSFileContextMainOperations
Tests run: 68, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.42 sec - in 
org.apache.hadoop.fs.TestHDFSFileContextMainOperations
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.shell.TestHdfsTextCommand
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.537 sec - in 
org.apache.hadoop.fs.shell.TestHdfsTextCommand
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestResolveHdfsSymlink
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.81 sec - in 
org.apache.hadoop.fs.TestResolveHdfsSymlink
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.TestVolumeId
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.178 sec - in 
org.apache.hadoop.fs.TestVolumeId
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.viewfs.TestViewFsWithAcls
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.672 sec - in 
org.apache.hadoop.fs.viewfs.TestViewFsWithAcls
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.viewfs.TestViewFsFileStatusHdfs
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.031 sec - in 
org.apache.hadoop.fs.viewfs.TestViewFsFileStatusHdfs
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.fs.viewfs.TestViewFileSyste

Re: subscribe the dev mailing list

2015-03-12 Thread 张铎
Yeah, I noticed...
I was wondering why no confirmation message was sent back, and finally I found
that I had pasted the wrong address...
Sorry for the spam...

2015-03-12 22:28 GMT+08:00 Ted Yu :

> Please send email to hdfs-dev-subscr...@hadoop.apache.org
>
> On Thu, Mar 12, 2015 at 6:56 AM, 张铎  wrote:
>
> > Thanks.
> >
>


Hadoop-Hdfs-trunk - Build # 2062 - Still Failing

2015-03-12 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2062/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 6530 lines...]
[INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-hdfs-project 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/hadoop-hdfs-project/target/test-dir
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-source-plugin:2.3:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-source-plugin:2.3:test-jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Skipping javadoc generation
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.12.1:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.0:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  FAILURE [  03:02 h]
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [  2.208 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 03:02 h
[INFO] Finished at: 2015-03-12T14:36:19+00:00
[INFO] Final Memory: 57M/591M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on 
project hadoop-hdfs: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Updating HADOOP-11703
Updating HADOOP-11693
Updating HADOOP-10027
Updating HDFS-7491
Updating YARN-1884
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
2 tests failed.
FAILED:  
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect

Error Message:
The map of version counts returned by DatanodeManager was not what it was 
expected to be on iteration 276 expected:<0> but was:<1>

Stack Trace:
java.lang.AssertionError: The map of version counts returned by DatanodeManager 
was not what it was expected to be on iteration 276 expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at 
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect(TestDatanodeManager.java:157)


REGRESSION:  
org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown

Error Message:
Bad safemode status: 'Safe mode is ON. The reported blocks 3 needs additional 3 
blocks to reach the threshold 0.9990 of total blocks 6.
The number of live datanodes 3 has reached the minimum number 0. Safe mode will 
be turned off automatically once the thresholds have been reached.'

Stack Trace:
java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported 
blocks 3 needs additional 3 blocks to reach the threshold 0.9990 of total 
blocks 6.
The number of live datanodes 3 has reached the minimum number 0. Safe mode will 
be turned off automatically once the thresholds have been reached.'
at org.junit.Assert.fail(Assert.java:88)
at org.j

Build failed in Jenkins: Hadoop-Hdfs-trunk #2062

2015-03-12 Thread Apache Jenkins Server
See 

Changes:

[aw] HADOOP-11703. git should ignore .DS_Store files on Mac OS X (Abin Shahab 
via aw)

[cnauroth] HDFS-7491. Add incremental blockreport latency to DN metrics. 
Contributed by Ming Ma.

[cnauroth] HADOOP-11693. Azure Storage FileSystem rename operations are 
throttled too aggressively to complete HBase WAL archiving. Contributed by Duo 
Xu.

[zjshen] YARN-1884. Added nodeHttpAddress into ContainerReport and fixed the 
link to NM web page. Contributed by Xuan Gong.

[cnauroth] HADOOP-10027. *Compressor_deflateBytesDirect passes instance instead 
of jclass to GetStaticObjectField. Contributed by Hui Zheng.

--
[...truncated 6337 lines...]
Running org.apache.hadoop.hdfs.qjournal.server.TestJournal
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.772 sec - in 
org.apache.hadoop.hdfs.qjournal.server.TestJournal
Running org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeMXBean
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.64 sec - in 
org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeMXBean
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManagerUnit
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.457 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManagerUnit
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager
Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.969 sec - 
in org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager
Running org.apache.hadoop.hdfs.qjournal.client.TestSegmentRecoveryComparator
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.299 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestSegmentRecoveryComparator
Running org.apache.hadoop.hdfs.qjournal.client.TestIPCLoggerChannel
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.158 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestIPCLoggerChannel
Running org.apache.hadoop.hdfs.qjournal.client.TestEpochsAreUnique
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.822 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestEpochsAreUnique
Running org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 151.709 sec - 
in org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumCall
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.268 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestQuorumCall
Running org.apache.hadoop.hdfs.qjournal.TestMiniJournalCluster
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.823 sec - in 
org.apache.hadoop.hdfs.qjournal.TestMiniJournalCluster
Running org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.556 sec - in 
org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
Running org.apache.hadoop.hdfs.TestConnCache
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.933 sec - in 
org.apache.hadoop.hdfs.TestConnCache
Running org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 62.063 sec - in 
org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
Running org.apache.hadoop.hdfs.TestFileAppend
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.509 sec - 
in org.apache.hadoop.hdfs.TestFileAppend
Running org.apache.hadoop.hdfs.TestFileAppend3
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 41.227 sec - 
in org.apache.hadoop.hdfs.TestFileAppend3
Running org.apache.hadoop.hdfs.TestClientReportBadBlock
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.079 sec - in 
org.apache.hadoop.hdfs.TestClientReportBadBlock
Running org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.086 sec - in 
org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
Running org.apache.hadoop.hdfs.TestFileCreation
Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 381.311 sec - 
in org.apache.hadoop.hdfs.TestFileCreation
Running org.apache.hadoop.hdfs.TestDFSRemove
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.321 sec - in 
org.apache.hadoop.hdfs.TestDFSRemove
Running org.apache.hadoop.hdfs.TestHdfsAdmin
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.928 sec - in 
org.apache.hadoop.hdfs.TestHdfsAdmin
Running org.apache.hadoop.hdfs.TestDFSUtil
Tests run: 30, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 75.031 sec - 
in org.apache.hadoop.hdfs.TestDFSUtil
Running org.apache.hadoop.hdfs.TestWriteBlockGetsBlockLengthHint
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.103 sec - in 
org.apache.hadoop.hdfs.TestWriteBlockGetsBlockLengthHint
Running org.apache.hadoop.hdfs.Te

Re: subscribe the dev mailing list

2015-03-12 Thread Ted Yu
Please send email to hdfs-dev-subscr...@hadoop.apache.org

On Thu, Mar 12, 2015 at 6:56 AM, 张铎  wrote:

> Thanks.
>


subscribe the dev mailing list

2015-03-12 Thread 张铎
Thanks.


[jira] [Created] (HDFS-7919) Time.NANOSECONDS_PER_MILLISECOND - use class level final constant instead of method variable

2015-03-12 Thread Ajith S (JIRA)
Ajith S created HDFS-7919:
-

 Summary: Time.NANOSECONDS_PER_MILLISECOND - use class level final 
constant instead of method variable 
 Key: HDFS-7919
 URL: https://issues.apache.org/jira/browse/HDFS-7919
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ajith S
Assignee: Ajith S
Priority: Trivial


The NANOSECONDS_PER_MILLISECOND constant can be moved to class level instead of 
being re-created on each method call.
{code}
// org.apache.hadoop.util.Time
  public static long monotonicNow() {
    final long NANOSECONDS_PER_MILLISECOND = 1000000;

    return System.nanoTime() / NANOSECONDS_PER_MILLISECOND;
  }
{code}
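
A minimal sketch of the proposed change (assumed form, not an attached patch):
{code}
public final class Time {
  // Hoisted to class level so the constant is not re-declared on every call.
  private static final long NANOSECONDS_PER_MILLISECOND = 1000000;

  public static long monotonicNow() {
    return System.nanoTime() / NANOSECONDS_PER_MILLISECOND;
  }
}
{code}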



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Hadoop 3.x: what about shipping trunk as a 2.x release in 2015?

2015-03-12 Thread Karthik Kambatla
On Mon, Mar 9, 2015 at 2:15 PM, Steve Loughran 
wrote:

>
> If 3.x is going to be Java 8 & not backwards compatible, I don't expect
> anyone wanting to use this in production until some time deep into 2016.
>
> Issue: JDK 8 vs 7
>
> It will require Hadoop clusters to move up to Java 8. While there's dev
> pull for this, there's ops pull against this: people are still in the
> moving-off Java 6 phase due to that "it's working, don't update it"
> philosophy. Java 8 is compelling to us coders, but that doesn't mean ops
> want it.
>
> You can run JDK-8 code in a YARN cluster running on Hadoop 2.7 *today*;
> the main thing is setting up JAVA_HOME. That's something we could make
> easier somehow (maybe some min Java version field in resource requests that
> will let apps say java 8, java 9, ...). YARN could not only set up JVM
> paths, it could fail-fast if a Java version wasn't available.
>
> What we can't do in Hadoop core today is set javac.version=1.8 & use Java
> 8 code. Downstream code can do that (Hive, etc); they just need to accept
> that they don't get to play on JDK7 clusters if they embrace lambda expressions.
>
> So...we need to stay on java 7 for some time due to ops pull; downstream
> apps get to choose what they want. We can/could enhance YARN to make JVM
> choice more declarative.
>
> Issue: Incompatible changes
>
> Without knowing what is proposed for "an incompatible classpath change", I
> can't say whether this is something that could be made optional. If it
> isn't, then it is a Python 3-class, "rewrite your code" event, which
> is going to be particularly traumatic to things like Hive that already do
> complex CP games. I'm currently against any mandatory change here, though
> would love to see an optional one. And if optional, it ceases to become an
> incompatible change...
>

We should probably start qualifying the word incompatible more often.

Are we okay with an API incompatible Hadoop-3? No.

Are we okay with a wire-incompatible Hadoop-3? Likely not.

Are we okay with breaking other forms of compatibility for Hadoop-3, like
behavior, dependencies, JDK, classpath, environment? I think so. Are we
okay with breaking these forms of compatibility in future Hadoop-2.x?
Likely not. Does our compatibility policy allow these changes in 2.x?
Mostly yes, but that is because we don't have policies for a lot of these
things that affect end-users. The reason we don't have a policy, IMO, is a
combination of (1) we haven't spent enough time thinking about them, (2)
without things like classpath isolation, we end up tying developers' hands
if we don't let them change the dependencies. I propose we update our
compat guidelines to be stricter, and do whatever is required to get there.
Is it okay to change our compat guidelines incompatibly? Maybe it
warrants a Hadoop-3? I don't know yet.

Also, some other changes, like bumping the minimum JDK requirement, are allowed
in minor releases. Users might be okay with certain JDK bumps (6 to 7, since
no one seems to be using 6 anymore), but users most definitely care about
other bumps (7 to 8). If we want to remove this subjective evaluation,
I am open to requiring a major version for JDK upgrades (not support, but
language features) even if it means we have to wait until 3.0 for a JDK
upgrade.



>
> Issue: Getting trunk out the door
>
> The main diff between branch-2 and trunk is currently the bash script
> changes. These don't break client apps. They may or may not break bigtop & other
> downstream hadoop stacks, but developers don't need to worry about this:
> no recompilation necessary
>
> Proposed: ship trunk as a 2.x release, compatible with JDK7 & Java code.
>
> It seems to me that I could go
>
> git checkout trunk
> mvn versions:set -DnewVersion=2.8.0-SNAPSHOT
>
> We'd then have a version of Hadoop-trunk we could ship later this year,
> compatible at the JDK and API level with the existing java code & JDK7+
> clusters.
>
> A classpath fix that is optional/compatible can then go out on the 2.x
> line, saving the 3.x tag for something that really breaks things, forces
> all downstream apps to set up new hadoop profiles, have separate modules &
> generally hate the hadoop dev team
>
> This lets us tick off the "recent trunk release" and "fixed shell scripts"
> items, pushing out those benefits to people sooner rather than later, and
> puts off the "Hello, we've just broken your code" event for another 12+
> months.
>
> Comments?
>
> -Steve
>
>
>
>


-- 
Karthik Kambatla
Software Engineer, Cloudera Inc.

http://five.sentenc.es


Re: upstream jenkins build broken?

2015-03-12 Thread Vinayakumar B
I think the problem started from here.

https://builds.apache.org/job/PreCommit-HDFS-Build/9828/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/

As Chris mentioned, TestDataNodeVolumeFailure changes the permissions.
But in this run, ReplicationMonitor hit an NPE and got a terminate signal,
due to which MiniDFSCluster.shutdown() threw an exception.

TestDataNodeVolumeFailure#tearDown() restores those permissions only after
shutting down the cluster, so in this case, since shutdown() threw, IMO the
permissions were never restored.


  @After
  public void tearDown() throws Exception {
    // Write permissions are restored before the cluster is shut down...
    if (data_fail != null) {
      FileUtil.setWritable(data_fail, true);
    }
    if (failedDir != null) {
      FileUtil.setWritable(failedDir, true);
    }
    if (cluster != null) {
      cluster.shutdown();   // ...but if this throws, the loop below never runs
    }
    // ...and the executable permissions are restored only after the shutdown.
    for (int i = 0; i < 3; i++) {
      FileUtil.setExecutable(new File(dataDir, "data" + (2 * i + 1)), true);
      FileUtil.setExecutable(new File(dataDir, "data" + (2 * i + 2)), true);
    }
  }
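
A possible hardening (just a sketch, not a patch): restore the permissions in a
finally block so that a shutdown() failure cannot skip them:

  @After
  public void tearDown() throws Exception {
    try {
      if (cluster != null) {
        cluster.shutdown();
      }
    } finally {
      // Runs even if shutdown() throws, so the dirs are always restored.
      if (data_fail != null) {
        FileUtil.setWritable(data_fail, true);
      }
      if (failedDir != null) {
        FileUtil.setWritable(failedDir, true);
      }
      for (int i = 0; i < 3; i++) {
        FileUtil.setExecutable(new File(dataDir, "data" + (2 * i + 1)), true);
        FileUtil.setExecutable(new File(dataDir, "data" + (2 * i + 2)), true);
      }
    }
  }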


Regards,
Vinay

On Thu, Mar 12, 2015 at 12:35 PM, Vinayakumar B 
wrote:

> When I see the history of these kind of builds, All these are failed on
> node H9.
>
> I think some or the other uncommitted patch would have created the problem
> and left it there.
>
>
> Regards,
> Vinay
>
> On Thu, Mar 12, 2015 at 6:16 AM, Sean Busbey  wrote:
>
>> You could rely on a destructive git clean call instead of maven to do the
>> directory removal.
>>
>> --
>> Sean
>> On Mar 11, 2015 4:11 PM, "Colin McCabe"  wrote:
>>
>> > Is there a maven plugin or setting we can use to simply remove
>> > directories that have no executable permissions on them?  Clearly we
>> > have the permission to do this from a technical point of view (since
>> > we created the directories as the jenkins user), it's simply that the
>> > code refuses to do it.
>> >
>> > Otherwise I guess we can just fix those tests...
>> >
>> > Colin
>> >
>> > On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu  wrote:
>> > > Thanks a lot for looking into HDFS-7722, Chris.
>> > >
>> > > In HDFS-7722:
>> > > TestDataNodeVolumeFailureXXX tests reset data dir permissions in
>> > TearDown().
>> > > TestDataNodeHotSwapVolumes reset permissions in a finally clause.
>> > >
>> > > Also I ran mvn test several times on my machine and all tests passed.
>> > >
>> > > However, since in DiskChecker#checkDirAccess():
>> > >
>> > > private static void checkDirAccess(File dir) throws
>> DiskErrorException {
>> > >   if (!dir.isDirectory()) {
>> > > throw new DiskErrorException("Not a directory: "
>> > >  + dir.toString());
>> > >   }
>> > >
>> > >   checkAccessByFileMethods(dir);
>> > > }
>> > >
>> > > One potentially safer alternative is replacing data dir with a regular
>> > > file to simulate disk failures.
>> > >
>> > > On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <
>> cnaur...@hortonworks.com>
>> > wrote:
>> > >> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
>> > >> TestDataNodeVolumeFailureReporting, and
>> > >> TestDataNodeVolumeFailureToleration all remove executable permissions
>> > from
>> > >> directories like the one Colin mentioned to simulate disk failures at
>> > data
>> > >> nodes.  I reviewed the code for all of those, and they all appear to
>> be
>> > >> doing the necessary work to restore executable permissions at the
>> end of
>> > >> the test.  The only recent uncommitted patch I've seen that makes
>> > changes
>> > >> in these test suites is HDFS-7722.  That patch still looks fine
>> > though.  I
>> > >> don't know if there are other uncommitted patches that changed these
>> > test
>> > >> suites.
>> > >>
>> > >> I suppose it's also possible that the JUnit process unexpectedly died
>> > >> after removing executable permissions but before restoring them.
>> That
>> > >> always would have been a weakness of these test suites, regardless of
>> > any
>> > >> recent changes.
>> > >>
>> > >> Chris Nauroth
>> > >> Hortonworks
>> > >> http://hortonworks.com/
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> On 3/10/15, 1:47 PM, "Aaron T. Myers"  wrote:
>> > >>
>> > >>>Hey Colin,
>> > >>>
>> > >>>I asked Andrew Bayer, who works with Apache Infra, what's going on
>> with
>> > >>>these boxes. He took a look and concluded that some perms are being
>> set
>> > in
>> > >>>those directories by our unit tests which are precluding those files
>> > from
>> > >>>getting deleted. He's going to clean up the boxes for us, but we
>> should
>> > >>>expect this to keep happening until we can fix the test in question
>> to
>> > >>>properly clean up after itself.
>> > >>>
>> > >>>To help narrow down which commit it was that started this, Andrew
>> sent
>> > me
>> > >>>this info:
>> > >>>
>> > >>>"/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-
>> >
>> >>>Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/
>> > has
>> > >>>500 perms, so I'm guessing that's the problem. Been that way since
>> 9:32
>> > >>>

Re: upstream jenkins build broken?

2015-03-12 Thread Vinayakumar B
When I see the history of these kind of builds, All these are failed on
node H9.

I think some or the other uncommitted patch would have created the problem
and left it there.


Regards,
Vinay

On Thu, Mar 12, 2015 at 6:16 AM, Sean Busbey  wrote:

> You could rely on a destructive git clean call instead of maven to do the
> directory removal.
>
> --
> Sean
> On Mar 11, 2015 4:11 PM, "Colin McCabe"  wrote:
>
> > Is there a maven plugin or setting we can use to simply remove
> > directories that have no executable permissions on them?  Clearly we
> > have the permission to do this from a technical point of view (since
> > we created the directories as the jenkins user), it's simply that the
> > code refuses to do it.
> >
> > Otherwise I guess we can just fix those tests...
> >
> > Colin
> >
> > On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu  wrote:
> > > Thanks a lot for looking into HDFS-7722, Chris.
> > >
> > > In HDFS-7722:
> > > TestDataNodeVolumeFailureXXX tests reset data dir permissions in
> > TearDown().
> > > TestDataNodeHotSwapVolumes reset permissions in a finally clause.
> > >
> > > Also I ran mvn test several times on my machine and all tests passed.
> > >
> > > However, since in DiskChecker#checkDirAccess():
> > >
> > > private static void checkDirAccess(File dir) throws DiskErrorException
> {
> > >   if (!dir.isDirectory()) {
> > > throw new DiskErrorException("Not a directory: "
> > >  + dir.toString());
> > >   }
> > >
> > >   checkAccessByFileMethods(dir);
> > > }
> > >
> > > One potentially safer alternative is replacing data dir with a regular
> > > file to simulate disk failures.
> > >
> > > On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <
> cnaur...@hortonworks.com>
> > wrote:
> > >> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
> > >> TestDataNodeVolumeFailureReporting, and
> > >> TestDataNodeVolumeFailureToleration all remove executable permissions
> > from
> > >> directories like the one Colin mentioned to simulate disk failures at
> > data
> > >> nodes.  I reviewed the code for all of those, and they all appear to
> be
> > >> doing the necessary work to restore executable permissions at the end
> of
> > >> the test.  The only recent uncommitted patch I've seen that makes
> > changes
> > >> in these test suites is HDFS-7722.  That patch still looks fine
> > though.  I
> > >> don't know if there are other uncommitted patches that changed these
> > test
> > >> suites.
> > >>
> > >> I suppose it's also possible that the JUnit process unexpectedly died
> > >> after removing executable permissions but before restoring them.  That
> > >> always would have been a weakness of these test suites, regardless of
> > any
> > >> recent changes.
> > >>
> > >> Chris Nauroth
> > >> Hortonworks
> > >> http://hortonworks.com/
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> On 3/10/15, 1:47 PM, "Aaron T. Myers"  wrote:
> > >>
> > >>>Hey Colin,
> > >>>
> > >>>I asked Andrew Bayer, who works with Apache Infra, what's going on
> with
> > >>>these boxes. He took a look and concluded that some perms are being
> set
> > in
> > >>>those directories by our unit tests which are precluding those files
> > from
> > >>>getting deleted. He's going to clean up the boxes for us, but we
> should
> > >>>expect this to keep happening until we can fix the test in question to
> > >>>properly clean up after itself.
> > >>>
> > >>>To help narrow down which commit it was that started this, Andrew sent
> > me
> > >>>this info:
> > >>>
> > >>>"/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-
> > >>>Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/
> > has
> > >>>500 perms, so I'm guessing that's the problem. Been that way since
> 9:32
> > >>>UTC
> > >>>on March 5th."
> > >>>
> > >>>--
> > >>>Aaron T. Myers
> > >>>Software Engineer, Cloudera
> > >>>
> > >>>On Tue, Mar 10, 2015 at 1:24 PM, Colin P. McCabe 
> > >>>wrote:
> > >>>
> >  Hi all,
> > 
> >  A very quick (and not thorough) survey shows that I can't find any
> >  jenkins jobs that succeeded from the last 24 hours.  Most of them
> seem
> >  to be failing with some variant of this message:
> > 
> >  [ERROR] Failed to execute goal
> >  org.apache.maven.plugins:maven-clean-plugin:2.5:clean
> (default-clean)
> >  on project hadoop-hdfs: Failed to clean project: Failed to delete
> > 
> > 
> >
> >
> /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-pr
> > oject/hadoop-hdfs/target/test/data/dfs/data/data3
> >  -> [Help 1]
> > 
> >  Any ideas how this happened?  Bad disk, unit test setting wrong
> >  permissions?
> > 
> >  Colin
> > 
> > >>
> > >
> > >
> > >
> > > --
> > > Lei (Eddy) Xu
> > > Software Engineer, Cloudera
> >
>