[jira] [Created] (HDFS-7242) Code improvement for FSN#checkUnreadableBySuperuser

2014-10-14 Thread Yi Liu (JIRA)
Yi Liu created HDFS-7242:


 Summary: Code improvement for FSN#checkUnreadableBySuperuser
 Key: HDFS-7242
 URL: https://issues.apache.org/jira/browse/HDFS-7242
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.6.0
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Minor


_checkUnreadableBySuperuser_ checks whether a user can access a specific path. 
The current logic is inefficient: it iterates over the inode's xattrs for every 
user, although the check only applies to the _super user_, so we can save a few 
CPU cycles by testing for the superuser first.
{code}
private void checkUnreadableBySuperuser(FSPermissionChecker pc,
    INode inode, int snapshotId)
    throws IOException {
  for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
    if (XAttrHelper.getPrefixName(xattr).
        equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
      if (pc.isSuperUser()) {
        throw new AccessControlException("Access is denied for " +
            pc.getUser() + " since the superuser is not allowed to " +
            "perform this operation.");
      }
    }
  }
}
{code}
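A minimal sketch of the suggested refinement (same fields and helpers as above; 
the early return for non-superusers is the whole point):
{code}
// Sketch of the proposed change: test for the superuser once, before
// touching the xattrs, so calls by ordinary users return immediately.
private void checkUnreadableBySuperuser(FSPermissionChecker pc,
    INode inode, int snapshotId)
    throws IOException {
  if (!pc.isSuperUser()) {
    return; // only the superuser can be denied by this xattr
  }
  for (XAttr xattr : dir.getXAttrs(inode, snapshotId)) {
    if (XAttrHelper.getPrefixName(xattr).
        equals(SECURITY_XATTR_UNREADABLE_BY_SUPERUSER)) {
      throw new AccessControlException("Access is denied for " +
          pc.getUser() + " since the superuser is not allowed to " +
          "perform this operation.");
    }
  }
}
{code}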



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7243) HDFS concat operation should not be allowed in Encryption Zone

2014-10-14 Thread Yi Liu (JIRA)
Yi Liu created HDFS-7243:


 Summary: HDFS concat operation should not be allowed in Encryption 
Zone
 Key: HDFS-7243
 URL: https://issues.apache.org/jira/browse/HDFS-7243
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption, namenode
Affects Versions: 2.6.0
Reporter: Yi Liu
Assignee: Yi Liu


With HDFS encryption at rest, each file in an encryption zone is encrypted with 
its own data encryption key, so concat should be disallowed there.
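
A hedged sketch of the guard this implies in the concat path (helper names such 
as _dir.isInAnEZ_ are illustrative, borrowed from other FSNamesystem code, not 
necessarily the final patch):
{code}
// Illustrative only: reject concat when the target lies in an
// encryption zone, since its files are encrypted under distinct keys.
if (dir.isInAnEZ(iip)) {
  throw new HadoopIllegalArgumentException(
      "concat can not be called for files in an encryption zone.");
}
{code}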



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7244) Reduce Namenode memory using Flyweight pattern

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7244:
-

 Summary: Reduce Namenode memory using Flyweight pattern
 Key: HDFS-7244
 URL: https://issues.apache.org/jira/browse/HDFS-7244
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Amir Langer


Using the flyweight pattern can dramatically reduce memory usage in the 
Namenode. The pattern also abstracts the actual storage type, so whether 
storage is off-heap and which serialisation mechanism is used can be configured 
per deployment. 

The idea is to move all BlockInfo data (as a first step) to this storage using 
the flyweight pattern. The cost of doing so is higher latency when accessing or 
modifying a block. The idea is that this will be offset by the reduction in 
memory; in the off-heap case the reduction is dramatic (effectively, heap 
memory used for BlockInfo shrinks to a very small constant value).
This reduction will also have a huge impact on latency, since GC pauses will be 
reduced considerably; we may even end up with better latency results than the 
original code.

I wrote a stand-alone project as a proof of concept to show the pattern, the 
data structures we can use, and the performance costs of this approach.

see [Slab|https://github.com/langera/slab]
and [Slab performance 
results|https://github.com/langera/slab/wiki/Performance-Results].

Slab abstracts the storage, provides several storage implementations, and 
implements the flyweight pattern for the application (the Namenode in our case).
The stages to incorporate Slab into the Namenode are outlined in the sub-task 
JIRAs.
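
For intuition, a minimal flyweight-over-a-slab sketch (the names here are 
hypothetical, not the actual Slab API): block records live as fixed-width 
entries in one flat array and are addressed by id, so no per-block Java object 
is allocated.
{code}
// Hypothetical sketch, not the real Slab API: a fixed-width record slab.
// Each block occupies FIELDS consecutive longs, addressed by its id, so
// the only heap object is the single backing array (or an off-heap
// ByteBuffer in the off-heap variant).
class BlockSlab {
  private static final int FIELDS = 3; // e.g. blockId, numBytes, genStamp
  private final long[] storage;

  BlockSlab(int capacity) {
    storage = new long[capacity * FIELDS];
  }

  long getNumBytes(int id)           { return storage[id * FIELDS + 1]; }
  void setNumBytes(int id, long len) { storage[id * FIELDS + 1] = len; }
}
{code}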





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7245) Introduce Slab code in HDFS

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7245:
-

 Summary: Introduce Slab code in HDFS
 Key: HDFS-7245
 URL: https://issues.apache.org/jira/browse/HDFS-7245
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: performance
Reporter: Amir Langer


see [Slab|https://github.com/langera/slab]




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7246) Use ids for DatanodeStorageInfo in the BlockInfo triplets

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7246:
-

 Summary: Use ids for DatanodeStorageInfo in the BlockInfo triplets
 Key: HDFS-7246
 URL: https://issues.apache.org/jira/browse/HDFS-7246
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


Identical to HDFS-6660




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7248) Use ids for blocks in InodeFile

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7248:
-

 Summary: Use ids for blocks in InodeFile
 Key: HDFS-7248
 URL: https://issues.apache.org/jira/browse/HDFS-7248
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


Access to a block will be via a lookup by id in the BlocksMap.
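
A hedged illustration of the shape of this change (field and method names are 
hypothetical):
{code}
// Hypothetical sketch: the INodeFile keeps block ids instead of
// BlockInfo references, and resolves them through the BlocksMap.
private long[] blockIds;     // was: private BlockInfo[] blocks;

BlockInfo getBlock(int index, BlocksMap blocksMap) {
  return blocksMap.getStoredBlock(blockIds[index]); // lookup by id
}
{code}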




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7247) Use ids for Block collection in Block

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7247:
-

 Summary: Use ids for Block collection in Block
 Key: HDFS-7247
 URL: https://issues.apache.org/jira/browse/HDFS-7247
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


Getting the BlockCollection will be done via a lookup by id.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7249) Define block slabs per replication factor and initialise them in advance (inc. size config)

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7249:
-

 Summary: Define block slabs per replication factor and initialise 
them in advance (inc. size config)
 Key: HDFS-7249
 URL: https://issues.apache.org/jira/browse/HDFS-7249
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


The plan is to create a slab per replication factor inside the BlocksMap.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7250) Store blocks in slabs rather than a Map inside BlocksMap

2014-10-14 Thread Amir Langer (JIRA)
Amir Langer created HDFS-7250:
-

 Summary: Store blocks in slabs rather than a Map inside BlocksMap
 Key: HDFS-7250
 URL: https://issues.apache.org/jira/browse/HDFS-7250
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Amir Langer


The key to every block is its replication factor plus the slab key (address).
When a block's replication factor changes, its data must be moved from one slab 
to another.
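
A hedged sketch of one way such a composite key could be packed (purely 
illustrative; the real encoding is up to the patch):
{code}
// Illustrative only: replication factor in the high byte, slab-local
// address in the low 56 bits of a single long key.
static long blockKey(short replication, long slabAddress) {
  return ((long) replication << 56) | (slabAddress & 0x00FFFFFFFFFFFFFFL);
}

static short replicationOf(long key) { return (short) (key >>> 56); }
static long addressOf(long key)      { return key & 0x00FFFFFFFFFFFFFFL; }
{code}
Under this encoding, changing a block's replication factor means copying its 
record into the slab for the new factor and handing out a new key.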





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7251) Hadoop fs -put documentation issue

2014-10-14 Thread Sai Srikanth (JIRA)
Sai Srikanth created HDFS-7251:
--

 Summary: Hadoop fs -put documentation issue
 Key: HDFS-7251
 URL: https://issues.apache.org/jira/browse/HDFS-7251
 Project: Hadoop HDFS
  Issue Type: Task
  Components: nfs
Reporter: Sai Srikanth
Priority: Minor


In the documentation for the Hadoop fs -put command, most versions state that 
the source should be a file. 

https://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#put
 

Usage: hdfs dfs -put <localsrc> ... <dst>

Copy single src, or multiple srcs from local file system to the destination 
file system. Also reads input from stdin and writes to destination file system.

hdfs dfs -put localfile /user/hadoop/hadoopfile
hdfs dfs -put localfile1 localfile2 /user/hadoop/hadoopdir
hdfs dfs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
hdfs dfs -put - hdfs://nn.example.com/hadoop/hadoopfile
Reads the input from stdin.


I have tested with a directory as the source and it worked fine. I think the 
documentation needs to be updated.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Hadoop-Hdfs-trunk - Build # 1901 - Still Failing

2014-10-14 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1901/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 6198 lines...]
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.6:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:2.3.2:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] ** FindBugsMojo execute ***
[INFO] canGenerate is false
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  FAILURE [  02:21 h]
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [  2.266 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 02:21 h
[INFO] Finished at: 2014-10-14T13:56:47+00:00
[INFO] Final Memory: 66M/851M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
project hadoop-hdfs: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports
 for the individual test results.
[ERROR] - [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Updating YARN-2667
Updating YARN-2641
Updating YARN-2651
Updating HDFS-7090
Updating MAPREDUCE-6115
Updating YARN-2308
Updating MAPREDUCE-6125
Updating HDFS-6544
Updating HDFS-7236
Updating YARN-2377
Updating HDFS-7237
Updating YARN-2566
Updating HADOOP-11198
Updating HADOOP-11176
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
2 tests failed.
FAILED:  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend

Error Message:
expected:<18> but was:<12>

Stack Trace:
java.lang.AssertionError: expected:<18> but was:<12>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencing.testQueueingWithAppend(TestDNFencing.java:448)


FAILED:  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress

Error Message:
Deferred

Stack Trace:
java.lang.RuntimeException: Deferred
at 
org.apache.hadoop.test.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:130)
at 
org.apache.hadoop.test.MultithreadedTestUtil$TestContext.stop(MultithreadedTestUtil.java:166)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication.testFencingStress(TestDNFencingWithReplication.java:135)
Caused by: java.io.IOException: Timed out waiting for 2 replicas on path 
/test-15
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler.waitForReplicas(TestDNFencingWithReplication.java:96)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication$ReplicationToggler.doAnAction(TestDNFencingWithReplication.java:78)
at 
org.apache.hadoop.test.MultithreadedTestUtil$RepeatingTestThread.doWork(MultithreadedTestUtil.java:222)
at 
org.apache.hadoop.test.MultithreadedTestUtil$TestingThread.run(MultithreadedTestUtil.java:189)




Build failed in Jenkins: Hadoop-Hdfs-trunk #1901

2014-10-14 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1901/changes

Changes:

[jlowe] MAPREDUCE-6125. TestContainerLauncherImpl sometimes fails. Contributed 
by Mit Desai

[jlowe] YARN-2667. Fix the release audit warning caused by 
hadoop-yarn-registry. Contributed by Yi Liu

[jlowe] MAPREDUCE-6115. TestPipeApplication#testSubmitter fails in trunk. 
Contributed by Binglin Chang

[jing9] HDFS-7236. Fix 
TestOpenFilesWithSnapshot#testOpenFilesWithMultipleSnapshots. Contributed by 
Yongjun Zhang.

[wheat9] HDFS-6544. Broken Link for GFS in package.html. Contributed by Suraj 
Nayak M.

[zjshen] YARN-2651. Spun off LogRollingInterval from LogAggregationContext. 
Contributed by Xuan Gong.

[cnauroth] HDFS-7090. Use unbuffered writes when persisting in-memory replicas. 
Contributed by Xiaoyu Yao.

[jlowe] YARN-2377. Localization exception stack traces are not passed as 
diagnostic info. Contributed by Gera Shegalov

[jianhe] YARN-2308. Changed CapacityScheduler to explicitly throw exception if 
the queue

[jianhe] Missing Changes.txt for YARN-2308

[kasha] YARN-2641. Decommission nodes on -refreshNodes instead of next NM-RM 
heartbeat. (Zhihai Xu via kasha)

[kasha] YARN-2566. DefaultContainerExecutor should pick a working directory 
randomly. (Zhihai Xu via kasha)

[atm] HADOOP-11176. KMSClientProvider authentication fails when both currentUgi 
and loginUgi are a proxied user. Contributed by Arun Suresh.

[szetszwo] HDFS-7237. The command hdfs namenode -rollingUpgrade throws 
ArrayIndexOutOfBoundsException.

[wheat9] HADOOP-11198. Fix typo in javadoc for FileSystem#listStatus(). 
Contributed by Li Lu.

--
[...truncated 6005 lines...]
Running org.apache.hadoop.hdfs.qjournal.client.TestEpochsAreUnique
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.856 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestEpochsAreUnique
Running org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 151.103 sec - 
in org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumCall
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.222 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestQuorumCall
Running org.apache.hadoop.hdfs.qjournal.TestMiniJournalCluster
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.617 sec - in 
org.apache.hadoop.hdfs.qjournal.TestMiniJournalCluster
Running org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.521 sec - in 
org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
Running org.apache.hadoop.hdfs.TestConnCache
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.59 sec - in 
org.apache.hadoop.hdfs.TestConnCache
Running org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 68.184 sec - in 
org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
Running org.apache.hadoop.hdfs.TestFileAppend
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.413 sec - in 
org.apache.hadoop.hdfs.TestFileAppend
Running org.apache.hadoop.hdfs.TestFileAppend3
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.195 sec - in 
org.apache.hadoop.hdfs.TestFileAppend3
Running org.apache.hadoop.hdfs.TestClientReportBadBlock
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.046 sec - in 
org.apache.hadoop.hdfs.TestClientReportBadBlock
Running org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.491 sec - in 
org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
Running org.apache.hadoop.hdfs.TestFileCreation
Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 380.486 sec - 
in org.apache.hadoop.hdfs.TestFileCreation
Running org.apache.hadoop.hdfs.TestDFSRemove
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.872 sec - in 
org.apache.hadoop.hdfs.TestDFSRemove
Running org.apache.hadoop.hdfs.TestHdfsAdmin
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.529 sec - in 
org.apache.hadoop.hdfs.TestHdfsAdmin
Running org.apache.hadoop.hdfs.TestDFSUtil
Tests run: 30, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.545 sec - in 
org.apache.hadoop.hdfs.TestDFSUtil
Running org.apache.hadoop.hdfs.TestDatanodeBlockScanner
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 154.356 sec - 
in org.apache.hadoop.hdfs.TestDatanodeBlockScanner
Running org.apache.hadoop.hdfs.TestWriteBlockGetsBlockLengthHint
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.69 sec - in 
org.apache.hadoop.hdfs.TestWriteBlockGetsBlockLengthHint
Running org.apache.hadoop.hdfs.TestDataTransferKeepalive
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.092 sec - in 
org.apache.hadoop.hdfs.TestDataTransferKeepalive
Running 

Re: Thinking ahead to hadoop-2.6

2014-10-14 Thread Arun C Murthy
2.6.0 is close now.

Here are the remaining blockers; I'm hoping to cut an RC in the next week or so:
http://s.apache.org/hadoop-2.6.0-blockers

thanks,
Arun

On Sep 30, 2014, at 10:42 AM, Arun C Murthy a...@hortonworks.com wrote:

 Folks,
 
  I've created branch-2.6 to stabilize the release.
  
  Committers, please exercise caution henceforth on commits other than the 
 ones we've discussed on this thread already.
 
  By default, new features should now be targeted to version 2.7 
 henceforth - I've ensured all the projects have that version on jira.
 
 thanks,
 Arun
 
 On Sep 26, 2014, at 1:08 AM, Arun Murthy a...@hortonworks.com wrote:
 
 Sounds good. I'll branch this weekend and we can merge the jiras we
 discussed in this thread as they get wrapped next week.
 
 Thanks everyone.
 
 Arun
 
 
 On Sep 24, 2014, at 7:39 PM, Vinod Kumar Vavilapalli vino...@apache.org 
 wrote:
 
 We can branch off in a week or two so that work on branch-2 itself can go
 ahead with other features that can't fit in 2.6. Independent of that, we
 can then decide on the timeline of the release candidates once branch-2.6
 is close to being done w.r.t the planned features.
 
 Branching it off can let us focus on specific features that we want in for
 2.6 and then eventually blockers for the release, nothing else. There is a
 trivial pain of committing to one more branch, but it's worth it in this
 case IMO.
 
 A lot of efforts are happening in parallel from the YARN side from where I
 see. 2.6 is a little bulky if only on the YARN side and I'm afraid if we
 don't branch off and selectively try to get stuff in, it is likely to be in
 a perpetual delay.
 
 My 2 cents.
 
 +Vinod
 
 On Wed, Sep 24, 2014 at 3:28 PM, Suresh Srinivas sur...@hortonworks.com
 wrote:
 
 Given some of the features are in the final stages of stabilization,
 Arun, should we hold off creating the 2.6 branch or building an RC by a week?
 All the features in flux are important ones and worth delaying the release
 by a week.
 
 On Wed, Sep 24, 2014 at 11:36 AM, Andrew Wang andrew.w...@cloudera.com
 wrote:
 
 Hey Nicholas,
 
 My concern about Archival Storage isn't related to the code quality or
 the
 size of the feature. I think that you and Jing did good work. My concern
 is
 that once we ship, we're locked into that set of archival storage APIs,
 and
 these APIs are not yet finalized. Simply being able to turn off the
 feature
 does not change the compatibility story.
 
 I'm willing to devote time to help review these JIRAs and kick the tires
 on
 the APIs, but my point above was that I'm not sure it'd all be done by
 the
 end of the week. Testing might also reveal additional changes that need
 to
 be made, which also might not happen by end-of-week.
 
 I guess the question before us is if we're comfortable putting something
 in
 branch-2.6 and then potentially adding API changes after. I'm okay with
 that as long as we're all aware that this might happen.
 
 Arun, as RM, is this cool with you? Again, I like this feature and I'm fine
 with its inclusion, just a heads up that we might need some extra time to
 finalize things before an RC can be cut.
 
 Thanks,
 Andrew
 
 On Tue, Sep 23, 2014 at 7:30 PM, Tsz Wo (Nicholas), Sze 
 s29752-hadoop...@yahoo.com.invalid wrote:
 
 Hi,
 
 I am worried about KMS and transparent encryption since there are quite
 many
 bugs discovered after it got merged to branch-2.  It gives us an
 impression
 that the feature is not yet well tested.  Indeed, transparent
 encryption
 is
 a complicated feature which changes the core part of HDFS.  It is not
 easy
 to get everything right.
 
 
 For HDFS-6584: Archival Storage, it is a relatively simple and low risk
 feature.  It introduces a new storage type ARCHIVE and the concept of
 block
 storage policy to HDFS.  When a cluster is configured with ARCHIVE
 storage,
 the blocks will be stored using the appropriate storage types specified
 by
 storage policies assigned to the files/directories.  Cluster admin
 could
 disable the feature by simply not configuring any storage type and not
 setting any storage policy as before.   As Suresh mentioned, HDFS-6584
 is
 in the final stages to be merged to branch-2.
 
 Regards,
 Tsz-Wo
 
 
 
 On Wednesday, September 24, 2014 7:00 AM, Suresh Srinivas 
 sur...@hortonworks.com wrote:
 
 
 
 
 I actually would like to see both archival storage and single replica
 memory writes to be in 2.6 release. Archival storage is in the final
 stages
 of getting ready for branch-2 merge as Nicholas has already indicated
 on
 the dev mailing list. Hopefully HDFS-6581 gets ready sooner. Both of these
 features have been in development for some time.
 
 On Tue, Sep 23, 2014 at 3:27 PM, Andrew Wang 
 andrew.w...@cloudera.com
 wrote:
 
 Hey Arun,
 
 Maybe we could do a quick run through of the Roadmap wiki and
 add/retarget
 things accordingly?
 
 I think the KMS and transparent encryption are ready to go. We've
 got
 a
 very few further bug fixes pending, but that's it.
 
 

[jira] [Resolved] (HDFS-7241) Unable to create encryption zone for viewfs:// after namenode federation is enabled

2014-10-14 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb resolved HDFS-7241.

Resolution: Not a Problem
  Assignee: Charles Lamb

Since creating an encryption zone is an administrative function, it makes the 
most sense to just create it on the underlying HDFS namenode rather than 
through viewfs.
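
For example, pointing the tool at the underlying namespace directly (the 
namenode URI below is hypothetical; CryptoAdmin is a standard Tool, so the 
generic -fs option applies):

# Hypothetical namenode URI: run the admin command against the backing
# HDFS namespace instead of the viewfs:// root
hdfs crypto -fs hdfs://nn1.example.com:8020 -createZone -keyName key1 -path /user/test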


 Unable to create encryption zone for viewfs:// after namenode federation is 
 enabled
 ---

 Key: HDFS-7241
 URL: https://issues.apache.org/jira/browse/HDFS-7241
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption, federation
Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
Reporter: Xiaomin Zhang
Assignee: Charles Lamb

 After configuring namenode federation for the cluster, I also enabled the 
 client mount table and viewfs as the default URI. The hdfs crypto commands now 
 fail with the error below:
 # hdfs crypto -createZone -keyName key1 -path /user/test
 IllegalArgumentException: FileSystem viewfs://cluster18/ is not an HDFS file 
 system
 This blocks the whole encryption-at-rest feature, as no zone can be defined.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Thinking ahead to hadoop-2.6

2014-10-14 Thread Neeraj Asrani
Hi Arun,

Can you assign one of the issues to me? I'll be glad to help out.

Thanks,
Nick.


On Tue, Oct 14, 2014 at 11:25 AM, Arun C Murthy a...@hortonworks.com wrote:

 2.6.0 is close now.

 Here are the remaining blockers, I'm hoping cut an RC in the next week or
 so:
 http://s.apache.org/hadoop-2.6.0-blockers

 thanks,
 Arun


[jira] [Resolved] (HDFS-7070) TestWebHdfsFileSystemContract fails occassionally

2014-10-14 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang resolved HDFS-7070.
-
Resolution: Cannot Reproduce

Haven't seen the reported tests fail for 3 weeks; the issue might have been 
addressed by some other fix. Closing it for now. Please feel free to reopen if 
it happens again.


 TestWebHdfsFileSystemContract fails occassionally
 -

 Key: HDFS-7070
 URL: https://issues.apache.org/jira/browse/HDFS-7070
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.6.0
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang

 org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract.testResponseCode
 and  
 org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract.testRenameDirToSelf 
 failed recently.
 Need to determine whether it was introduced by a recent code change that leaks 
 file descriptors, or whether it is a similar issue to the one HDFS-6694 reported.
 E.g. 
 https://builds.apache.org/job/PreCommit-HDFS-Build/8026/testReport/org.apache.hadoop.hdfs.web/TestWebHdfsFileSystemContract/testResponseCode/.
 {code}
 2014-09-15 12:52:18,866 INFO  datanode.DataNode 
 (DataXceiver.java:writeBlock(749)) - opWriteBlock 
 BP-23833599-67.195.81.147-1410785517350:blk_1073741827_1461 received 
 exception java.io.IOException: Cannot run program "stat": 
 java.io.IOException: error=24, Too many open files
 2014-09-15 12:52:18,867 ERROR datanode.DataNode (DataXceiver.java:run(243)) - 
 127.0.0.1:47221:DataXceiver error processing WRITE_BLOCK operation  src: 
 /127.0.0.1:38112 dst: /127.0.0.1:47221
 java.io.IOException: Cannot run program "stat": java.io.IOException: 
 error=24, Too many open files
   at java.lang.ProcessBuilder.start(ProcessBuilder.java:470)
   at org.apache.hadoop.util.Shell.runCommand(Shell.java:485)
   at org.apache.hadoop.util.Shell.run(Shell.java:455)
   at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
   at org.apache.hadoop.fs.HardLink.getLinkCount(HardLink.java:495)
   at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInfo.unlinkBlock(ReplicaInfo.java:288)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:702)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:680)
   at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.append(FsDatasetImpl.java:101)
   at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:193)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:604)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:126)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:72)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:225)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: java.io.IOException: error=24, Too many open 
 files
   at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
   at java.lang.ProcessImpl.start(ProcessImpl.java:65)
   at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
   ... 14 more
 2014-09-15 12:52:18,867 INFO  hdfs.DFSClient 
 (DFSOutputStream.java:createBlockOutputStream(1400)) - Exception in 
 createBlockOutputStream
 java.io.EOFException: Premature EOF: no length prefix available
   at 
 org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2101)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1368)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1210)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:530)
 2014-09-15 12:52:18,870 WARN  hdfs.DFSClient (DFSOutputStream.java:run(883)) 
 - DFSOutputStream ResponseProcessor exception  for block 
 BP-23833599-67.195.81.147-1410785517350:blk_1073741827_1461
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2099)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:798)
 2014-09-15 12:52:18,870 WARN  hdfs.DFSClient (DFSOutputStream.java:run(627)) 
 - DataStreamer Exception
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:273)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:579)
 {code}



--
This message was sent by Atlassian JIRA

[jira] [Created] (HDFS-7252) small refine for use of isInAnEZ in FSNamesystem

2014-10-14 Thread Yi Liu (JIRA)
Yi Liu created HDFS-7252:


 Summary: small refine for use of isInAnEZ in FSNamesystem
 Key: HDFS-7252
 URL: https://issues.apache.org/jira/browse/HDFS-7252
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Trivial


In {{FSN#startFileInt}}, _EncryptionZoneManager#getEncryptionZoneForPath_ is 
invoked 3 times (via _dir.isInAnEZ(iip)_, _dir.getEZForPath(iip)_, and 
_dir.getKeyName(iip)_) in the following code; we actually need just one call.
{code}
if (dir.isInAnEZ(iip)) {
  EncryptionZone zone = dir.getEZForPath(iip);
  protocolVersion = chooseProtocolVersion(zone, supportedVersions);
  suite = zone.getSuite();
  ezKeyName = dir.getKeyName(iip);

  Preconditions.checkNotNull(protocolVersion);
  Preconditions.checkNotNull(suite);
  Preconditions.checkArgument(!suite.equals(CipherSuite.UNKNOWN),
      "Chose an UNKNOWN CipherSuite!");
  Preconditions.checkNotNull(ezKeyName);
}
{code}
The following code also invokes it twice, but needs just one call:
{code}
if (dir.isInAnEZ(iip)) {
  // The path is now within an EZ, but we're missing encryption parameters
  if (suite == null || edek == null) {
throw new RetryStartFileException();
  }
  // Path is within an EZ and we have provided encryption parameters.
  // Make sure that the generated EDEK matches the settings of the EZ.
  String ezKeyName = dir.getKeyName(iip);
  if (!ezKeyName.equals(edek.getEncryptionKeyName())) {
throw new RetryStartFileException();
  }
  feInfo = new FileEncryptionInfo(suite, version,
  edek.getEncryptedKeyVersion().getMaterial(),
  edek.getEncryptedKeyIv(),
  ezKeyName, edek.getEncryptionKeyVersionName());
  Preconditions.checkNotNull(feInfo);
}
{code}
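A minimal sketch of the first refinement (hedged: it assumes _getEZForPath_ 
returns null outside any zone and that the returned zone exposes its key name; 
the other names follow the snippet above):
{code}
// Sketch: resolve the encryption zone once and reuse it, rather than
// walking the path three times via isInAnEZ/getEZForPath/getKeyName.
EncryptionZone zone = dir.getEZForPath(iip);
if (zone != null) {
  protocolVersion = chooseProtocolVersion(zone, supportedVersions);
  suite = zone.getSuite();
  ezKeyName = zone.getKeyName();   // assumes the zone carries its key name

  Preconditions.checkNotNull(protocolVersion);
  Preconditions.checkNotNull(suite);
  Preconditions.checkArgument(!suite.equals(CipherSuite.UNKNOWN),
      "Chose an UNKNOWN CipherSuite!");
  Preconditions.checkNotNull(ezKeyName);
}
{code}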




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)