[jira] [Updated] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7743:

Attachment: (was: HDFS-7743.000.patch)

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch


 In the work on erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo with two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To make syncing the HDFS-EC branch with trunk easier, this jira 
 renames the current BlockInfo to BlockReplicationInfo in trunk.
 In the meantime, we can also use this chance to do some minor code cleanup, 
 e.g., removing the unnecessary overridden {{hashCode}} and {{equals}} methods 
 since they are identical to those of the superclass {{Block}}.
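 As an illustration of the cleanup (a hypothetical pair of classes, not the 
 patch itself), an override that merely mirrors the superclass can simply be 
 deleted:
 {code}
 class Block {
   long blockId;
   @Override public boolean equals(Object o) {
     return o instanceof Block && ((Block) o).blockId == blockId;
   }
   @Override public int hashCode() { return Long.hashCode(blockId); }
 }

 class BlockInfo extends Block {
   // Redundant: these behave exactly like the inherited implementations.
   @Override public boolean equals(Object o) { return super.equals(o); }
   @Override public int hashCode() { return super.hashCode(); }
 }
 {code}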





[jira] [Updated] (HDFS-7732) Fix the order of the parameters in DFSConfigKeys

2015-02-06 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7732:
---
Attachment: HDFS-7732.patch

 Fix the order of the parameters in DFSConfigKeys
 

 Key: HDFS-7732
 URL: https://issues.apache.org/jira/browse/HDFS-7732
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Priority: Trivial
  Labels: newbie
 Attachments: HDFS-7732.patch


 In DFSConfigKeys.java, there are some parameters between 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY}} and 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT}}.
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
 {code}
 The order should be corrected as:
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
 {code}





[jira] [Updated] (HDFS-7720) Quota by Storage Type API, tools and ClientNameNode Protocol changes

2015-02-06 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7720:
-
Attachment: HDFS-7720.4.patch

Update the patch that fixes TestHDFSCLI.

 Quota by Storage Type API, tools and ClientNameNode Protocol changes
 

 Key: HDFS-7720
 URL: https://issues.apache.org/jira/browse/HDFS-7720
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7720.0.patch, HDFS-7720.1.patch, HDFS-7720.2.patch, 
 HDFS-7720.3.patch, HDFS-7720.4.patch


 Split the patch into small ones based on the feedback. This one covers the 
 HDFS API changes, tool changes and ClientNameNode protocol changes. 





[jira] [Created] (HDFS-7744) Fix potential NPE in DFSInputStream after setDropBehind or setReadahead is called

2015-02-06 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-7744:
--

 Summary: Fix potential NPE in DFSInputStream after setDropBehind 
or setReadahead is called
 Key: HDFS-7744
 URL: https://issues.apache.org/jira/browse/HDFS-7744
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: dfsclient
Affects Versions: 2.6.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


Fix a potential NPE in DFSInputStream after setDropBehind or setReadahead is 
called.  These functions clear the {{blockReader}}, but don't set {{blockEnd}} 
to -1, which could lead to {{DFSInputStream#seek}} attempting to dereference 
{{blockReader}} even though it is {{null}}.
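A minimal sketch of the fix pattern (field and method names mirror 
DFSInputStream but are illustrative, not the actual patch):
{code}
// Whenever the block reader is discarded, also invalidate blockEnd so that
// seek() cannot assume a reader is still open for the current block.
private synchronized void closeCurrentBlockReader() {
  if (blockReader != null) {
    try {
      blockReader.close();
    } catch (IOException e) {
      LOG.warn("error closing block reader", e);
    }
    blockReader = null;
  }
  blockEnd = -1;  // the missing step: without it, seek() NPEs on blockReader
}
{code}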





[jira] [Updated] (HDFS-7738) Add more negative tests for truncate

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7738:
--
Attachment: h7738_20150206.patch

Thanks, Konstantin.
# Java enum is implicitly static, i.e. there is no non-static enum; see the 
sketch after this comment.
# done
# yes, it is better to have some random test.  Let me decrease the number of 
blocks from 1000 to 100 so that the overhead is much smaller.
# Sure, let's use my previous version of the test.  Will file a new JIRA for 
the new test.  BTW, how does it fail? 

Here is a new patch: h7738_20150206.patch
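A minimal illustration of point 1 (a hypothetical class, not from the patch):
{code}
public class Outer {
  // A nested enum is implicitly static: it never captures an instance of
  // Outer, and writing "static enum Color" is legal but redundant.
  enum Color { RED, GREEN, BLUE }
}
{code}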

 Add more negative tests for truncate
 

 Key: HDFS-7738
 URL: https://issues.apache.org/jira/browse/HDFS-7738
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7738_20150204.patch, h7738_20150205.patch, 
 h7738_20150205b.patch, h7738_20150206.patch


 The following are negative test cases for truncate.
 - new length > old length
 - truncating a directory
 - truncating a non-existing file
 - truncating a file without write permission
 - truncating a file opened for append
 - truncating a file in safemode





[jira] [Commented] (HDFS-7732) Fix the order of the parameters in DFSConfigKeys

2015-02-06 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309700#comment-14309700
 ] 

Akira AJISAKA commented on HDFS-7732:
-

The fix just reorders the parameters, so we don't need to wait for Jenkins. 
Checking this in.

 Fix the order of the parameters in DFSConfigKeys
 

 Key: HDFS-7732
 URL: https://issues.apache.org/jira/browse/HDFS-7732
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula
Priority: Trivial
 Attachments: HDFS-7732.patch


 In DFSConfigKeys.java, there are some parameters between 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY}} and 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT}}.
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
 {code}
 The order should be corrected as:
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
 {code}





[jira] [Commented] (HDFS-7716) Erasure Coding: extend BlockInfo to handle EC info

2015-02-06 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309739#comment-14309739
 ] 

Zhe Zhang commented on HDFS-7716:
-

Thanks Jing for the great work! I haven't finished reading the patch, but here 
are some high level comments so far:
# bq. To make the review easier, I plan to do the 
BlockInfo -> BlockReplicationInfo rename in trunk so that the size of the 
patch can be decreased.
Great idea. I believe it will also make it easier to maintain the branch. Just 
to clarify, the plan is to commit the updated {{BlockInfo}} and 
{{BlockReplicationInfo}} classes to trunk, right?
# Is the below because of {{BlockGroupInfo}} and {{BlockReplicationInfo}}'s 
names? If we do want this sanity check, how about {{BlockInfoStriped}}, 
{{BlockInfoStripedUnderConstruction}}, {{BlockInfoContiguous}} (or 
{{BlockInfoReplicated}})? I prefer _contiguous_ because later we can have 
contiguous + EC blocks, which will also use this {{BlockInfoContiguous}} / 
{{BlockInfoReplicated}} class. Of course we can also change the assert to 
check that both Block and Info are in the name.
{code}
-assert info == null || 
-info.getClass().getName().startsWith(BlockInfo.class.getName()) : 
-  "BlockInfo is expected at " + index*3;
{code}
# {{BlockGroupInfo#addStorage}} requires additional information (the block 
reported from the DN) to determine the index inside the block group.
I agree. I originally planned to add this logic under HDFS-7652 but then 
realized it's better to wait until this JIRA is finalized. The current 
HDFS-7652 patch has a smaller scope -- it just adds a method to calculate the 
group ID from received block reports, and finds the entry in {{blocksMap}} 
using the group ID instead of individual block IDs. Could you take a look at 
that patch? I guess this JIRA requires it to handle reports of striped blocks?
# I'm still digesting the details of {{BlockGroupInfo}} and will post another 
comment later today.

 Erasure Coding: extend BlockInfo to handle EC info
 --

 Key: HDFS-7716
 URL: https://issues.apache.org/jira/browse/HDFS-7716
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-7716.000.patch


 The current BlockInfo implementation only supports the replication 
 mechanism. To use the same blocksMap for handling a block group and its 
 data/parity blocks, we need to define a new BlockGroupInfo class.





[jira] [Commented] (HDFS-7738) Add more negative tests for truncate

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309818#comment-14309818
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7738:
---

 In general I don't like random tests because they imply intermittent 
 failures, ...

If there are intermittent failures, it means that there are bugs either in the 
code or in the test.  I guess what you don't like is poorly written random 
tests, which may experience intermittent failures.  Well written tests won't 
have intermittent failures.

Why do we need random tests?  Because the problem space is huge, it is 
impossible to try all the cases.  We have to do random sampling.

testBasicTruncate, which is a well written test, does cover a lot of cases.  
However, it only tests a 12-byte file with 3 blocks.  Also, toTruncate is 
consecutive.  For example, it does not test the case of calling truncate to 
take out 10 blocks at once.
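One common way to keep a random test reproducible, as a sketch ({{LOG}}, 
{{BLOCK_SIZE}} and the truncate target are placeholders):
{code}
// Pick a seed, log it, and derive all randomness from it; any failure can
// then be replayed exactly by hard-coding the logged seed.
long seed = new java.util.Random().nextLong();
LOG.info("random truncate test: seed=" + seed);
java.util.Random rand = new java.util.Random(seed);
int fileLength = 100 * BLOCK_SIZE;             // hypothetical test constant
int newLength = rand.nextInt(fileLength + 1);  // random truncate target
{code}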

 Add more negative tests for truncate
 

 Key: HDFS-7738
 URL: https://issues.apache.org/jira/browse/HDFS-7738
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7738_20150204.patch, h7738_20150205.patch, 
 h7738_20150205b.patch, h7738_20150206.patch


 The following are negative test cases for truncate.
 - new length > old length
 - truncating a directory
 - truncating a non-existing file
 - truncating a file without write permission
 - truncating a file opened for append
 - truncating a file in safemode





[jira] [Commented] (HDFS-316) Balancer should run for a configurable # of iterations

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309897#comment-14309897
 ] 

Hadoop QA commented on HDFS-316:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697066/HDFS-316.3.patch
  against trunk revision 1425e3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9459//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9459//console

This message is automatically generated.

 Balancer should run for a configurable # of iterations
 --

 Key: HDFS-316
 URL: https://issues.apache.org/jira/browse/HDFS-316
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 2.4.1
Reporter: Brian Bockelman
Assignee: Xiaoyu Yao
Priority: Minor
  Labels: newbie
 Attachments: HDFS-316.0.patch, HDFS-316.1.patch, HDFS-316.2.patch, 
 HDFS-316.3.patch


 The balancer currently exits if nothing has changed after 5 iterations.
 Our site would like to constantly balance a stream of incoming data; we would 
 like to be able to set the number of iterations it does nothing for before 
 exiting; even better would be if we set it to a negative number and could 
 continuously run this as a daemon.





[jira] [Updated] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7743:

Status: Open  (was: Patch Available)

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch


 In the work on erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo with two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To make syncing the HDFS-EC branch with trunk easier, this jira 
 renames the current BlockInfo to BlockReplicationInfo in trunk.
 In the meantime, we can also use this chance to do some minor code cleanup, 
 e.g., removing the unnecessary overridden {{hashCode}} and {{equals}} methods 
 since they are identical to those of the superclass {{Block}}.





[jira] [Commented] (HDFS-7647) DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs

2015-02-06 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309643#comment-14309643
 ] 

Arpit Agarwal commented on HDFS-7647:
-

Thanks for your patience [~milandesai], the latest patch looks good.

There are a few reproducible test failures flagged by Jenkins that we'll need 
to address before committing.

 DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs
 --

 Key: HDFS-7647
 URL: https://issues.apache.org/jira/browse/HDFS-7647
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Milan Desai
Assignee: Milan Desai
 Attachments: HDFS-7647-2.patch, HDFS-7647-3.patch, HDFS-7647-4.patch, 
 HDFS-7647.patch


 DatanodeManager.sortLocatedBlocks() sorts the array of DatanodeInfos inside 
 each LocatedBlock, but does not touch the array of StorageIDs and 
 StorageTypes. As a result, the DatanodeInfos and StorageIDs/StorageTypes are 
 mismatched. The method is called by FSNamesystem.getBlockLocations(), so the 
 client will not know which StorageID/Type corresponds to which DatanodeInfo.  
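 A minimal sketch (not the HDFS code) of the fix idea: sort an index 
 permutation once, then apply it to every parallel array so entries stay 
 matched.
 {code}
 import java.util.Arrays;
 import java.util.Comparator;

 class ParallelSort {
   // Sorts nodes (here simply by name, standing in for the real
   // network-distance ordering) and applies the same permutation to
   // storageIds so the i-th entries still describe the same replica.
   static void sortTogether(String[] nodes, String[] storageIds) {
     Integer[] idx = new Integer[nodes.length];
     for (int i = 0; i < idx.length; i++) idx[i] = i;
     Arrays.sort(idx, Comparator.comparing(i -> nodes[i]));
     String[] n = nodes.clone(), s = storageIds.clone();
     for (int i = 0; i < idx.length; i++) {
       nodes[i] = n[idx[i]];
       storageIds[i] = s[idx[i]];
     }
   }
 }
 {code}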





[jira] [Updated] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7743:

Attachment: HDFS-7743.000.patch

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch, HDFS-7743.000.patch


 In the work on erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo with two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To make syncing the HDFS-EC branch with trunk easier, this jira 
 renames the current BlockInfo to BlockReplicationInfo in trunk.
 In the meantime, we can also use this chance to do some minor code cleanup, 
 e.g., removing the unnecessary overridden {{hashCode}} and {{equals}} methods 
 since they are identical to those of the superclass {{Block}}.





[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer

2015-02-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309780#comment-14309780
 ] 

Colin Patrick McCabe commented on HDFS-7694:


bq. CanUnbuffer ain't too pretty. Unbufferable is about as ugly. Its fine I 
suppose as is.

It's consistent with our other input stream extension interfaces such as 
{{Syncable}}, {{CanSetReadahead}}, etc.  The problem is that we can't add the 
new APIs to {{FSInputStream}}, or else we'd break a bunch of non-HDFS streams 
(in and out of the tree) that don't implement the new API.  I guess Java is 
adding default implementations for interface functions in some future 
version... too bad we're not there yet.
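A sketch of that optional-capability pattern (the helper method is 
illustrative; only the interface shape follows the discussion above):
{code}
public interface CanUnbuffer {
  /** Release any readahead buffers and sockets held by the stream. */
  void unbuffer();
}

class StreamUtil {
  // Callers probe for the capability instead of extending FSInputStream,
  // so streams that don't implement it keep working unmodified.
  static void tryUnbuffer(java.io.InputStream in) {
    if (in instanceof CanUnbuffer) {
      ((CanUnbuffer) in).unbuffer();
    }
  }
}
{code}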

bq. In DFSIS#unbuffer, should we be resetting data members back to zero, etc?

I'm not sure what else we'd reset.  This isn't changing the {{closed}} state, 
it's not a seek so the {{pos}} is not affected, it's not changing the 
{{cachingStrategy}} or {{fileEncryptionInfo}}... we certainly don't want to 
clear the block location info because then we need to do an RPC to the NN to 
get it again...

Actually I do see one thing we should change.  We should set {{blockEnd}} to 
-1.  Otherwise, {{seek}} may attempt to use {{blockReader}} even though it's 
{{null}}.  It seems like this is also a problem in {{closeCurrentBlockReader}}. 
 And let me add a {{seek}} after the unbuffer in {{testUnbufferClosesSockets}} 
to make sure that this doesn't regress.

bq. In testOpenManyFilesViaTcp, we assert we can read but is there a reason why 
we would not be able to that unbuffer enables? (pardon if dumb question)

Not a dumb question at all.  What I was testing here was that opening a lot of 
files didn't consume too many resources.  In my local test environment, I 
increased {{NUM_OPENS}} to be a really big number... I didn't want to burden 
Jenkins too much, though.  {{testUnbufferClosesSockets}} is a more direct and 
straightforward test than {{testOpenManyFilesViaTcp}}... the latter is perhaps 
more of a stress test.

 FSDataInputStream should support unbuffer
 ---

 Key: HDFS-7694
 URL: https://issues.apache.org/jira/browse/HDFS-7694
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7694.001.patch, HDFS-7694.002.patch


 For applications that have many open HDFS (or other Hadoop filesystem) files, 
 it would be useful to have an API to clear readahead buffers and sockets.  
 This could be added to the existing APIs as an optional interface, in much 
 the same way as we added setReadahead / setDropBehind / etc.





[jira] [Commented] (HDFS-7694) FSDataInputStream should support unbuffer

2015-02-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309862#comment-14309862
 ] 

Colin Patrick McCabe commented on HDFS-7694:


I filed HDFS-7744 for the existing seek after setReadahead bug.  It's not 
good form to combine a new feature with a bugfix.

 FSDataInputStream should support unbuffer
 ---

 Key: HDFS-7694
 URL: https://issues.apache.org/jira/browse/HDFS-7694
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7694.001.patch, HDFS-7694.002.patch, 
 HDFS-7694.003.patch


 For applications that have many open HDFS (or other Hadoop filesystem) files, 
 it would be useful to have an API to clear readahead buffers and sockets.  
 This could be added to the existing APIs as an optional interface, in much 
 the same way as we added setReadahead / setDropBehind / etc.





[jira] [Commented] (HDFS-6133) Make Balancer support exclude specified path

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309919#comment-14309919
 ] 

Tsz Wo Nicholas Sze commented on HDFS-6133:
---

- PBHelper.convert(..) only adds one FALSE when targetPinnings == null.  Should 
we add n FALSEs, where n is the number of targets?
- Remove "use sticky bit" from the javadoc below, since an implementation may 
not use the sticky bit.
{code}
//FsDatasetSpi
+  /**
+   * Use sticky bit to pin the block
+   */
+  public void setPinning(ExtendedBlock block) throws IOException;
{code}
- FileUtil currently does not support the sticky bit, so using LocalFileSystem 
is correct.  Is there a way to add sticky bit support to FileUtil?
- There are some tab characters.  Please replace them by spaces.

Thanks for working hard on this!
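On the FileUtil question, one possible approach as a sketch (a hypothetical 
helper; NIO's PosixFilePermission has no sticky-bit flag, so this shells out 
to chmod):
{code}
import java.io.File;
import java.io.IOException;

class StickyBitUtil {
  // Sets the sticky bit ("chmod +t") on a local file; illustrative only.
  static void setStickyBit(File f) throws IOException, InterruptedException {
    Process p = new ProcessBuilder("chmod", "+t", f.getAbsolutePath())
        .inheritIO().start();
    if (p.waitFor() != 0) {
      throw new IOException("chmod +t failed for " + f);
    }
  }
}
{code}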


 Make Balancer support exclude specified path
 

 Key: HDFS-6133
 URL: https://issues.apache.org/jira/browse/HDFS-6133
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover, namenode
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-6133-1.patch, HDFS-6133-2.patch, HDFS-6133-3.patch, 
 HDFS-6133-4.patch, HDFS-6133-5.patch, HDFS-6133-6.patch, HDFS-6133-7.patch, 
 HDFS-6133-8.patch, HDFS-6133.patch


 Currently, running Balancer will destroy the Regionserver's data locality.
 If getBlocks could exclude blocks belonging to files which have a specific 
 path prefix, like /hbase, then we could run Balancer without destroying the 
 Regionserver's data locality.





[jira] [Updated] (HDFS-7732) Fix the order of the parameters in DFSConfigKeys

2015-02-06 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7732:
---
Assignee: Brahma Reddy Battula
  Status: Patch Available  (was: Open)

Attached the patch.

 Fix the order of the parameters in DFSConfigKeys
 

 Key: HDFS-7732
 URL: https://issues.apache.org/jira/browse/HDFS-7732
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula
Priority: Trivial
  Labels: newbie
 Attachments: HDFS-7732.patch


 In DFSConfigKeys.java, there are some parameters between 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY}} and 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT}}.
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
 {code}
 The order should be corrected as:
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
 {code}





[jira] [Commented] (HDFS-7732) Fix the order of the parameters in DFSConfigKeys

2015-02-06 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309656#comment-14309656
 ] 

Brahma Reddy Battula commented on HDFS-7732:


Thanks [~ajisakaa] for filing this issue..

 Fix the order of the parameters in DFSConfigKeys
 

 Key: HDFS-7732
 URL: https://issues.apache.org/jira/browse/HDFS-7732
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula
Priority: Trivial
 Attachments: HDFS-7732.patch


 In DFSConfigKeys.java, there are some parameters between 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY}} and 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT}}.
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
 {code}
 The order should be corrected as:
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
 {code}





[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS

2015-02-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309746#comment-14309746
 ] 

Jing Zhao commented on HDFS-7285:
-

I tried to merge trunk into the HDFS-EC branch and got a lot of conflicts. It 
looks like some of the trunk changes were applied manually to the EC branch? 
git rebase origin/trunk shows several hundred diverged commits. Does someone 
know how to fix this? Or we can create a different EC branch.

 Erasure Coding Support inside HDFS
 --

 Key: HDFS-7285
 URL: https://issues.apache.org/jira/browse/HDFS-7285
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Weihua Jiang
Assignee: Zhe Zhang
 Attachments: ECAnalyzer.py, ECParser.py, 
 HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, 
 HDFSErasureCodingDesign-20150204.pdf, fsimage-analysis-20150105.pdf


 Erasure Coding (EC) can greatly reduce the storage overhead without 
 sacrificing data reliability, compared to the existing HDFS 3-replica 
 approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate 
 the loss of 4 blocks, with the storage overhead being only 40%. This makes EC 
 a quite attractive alternative for big data storage, particularly for cold 
 data. 
 Facebook had a related open source project called HDFS-RAID. It used to be 
 one of the contributed packages in HDFS but has been removed since Hadoop 2.0 
 for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and 
 depends on MapReduce to do encoding and decoding tasks; 2) it can only be 
 used for cold files that are intended not to be appended anymore; 3) the pure 
 Java EC coding implementation is extremely slow in practical use. Due to 
 these, it might not be a good idea to just bring HDFS-RAID back.
 We (Intel and Cloudera) are working on a design to build EC into HDFS that 
 gets rid of any external dependencies, making it self-contained and 
 independently maintained. This design lays the EC feature on top of the 
 storage type support and is designed to be compatible with existing HDFS 
 features like caching, snapshots, encryption, high availability, etc. This 
 design will also support different EC coding schemes, implementations and 
 policies for different deployment scenarios. By utilizing advanced libraries 
 (e.g. the Intel ISA-L library), an implementation can greatly improve the 
 performance of EC encoding/decoding and make the EC solution even more 
 attractive. We will post the design document soon. 
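 To spell out the overhead arithmetic: with 10+4 Reed-Solomon, every 10 data 
 blocks are stored alongside 4 parity blocks, so raw storage is 14/10 = 1.4x 
 the data size (40% overhead), while 3-replication stores 3x the data size 
 (200% overhead).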





[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309787#comment-14309787
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7285:
---

Since there are only a few patches committed, let's recreate the branch in 
order to fix the divergence.

 Erasure Coding Support inside HDFS
 --

 Key: HDFS-7285
 URL: https://issues.apache.org/jira/browse/HDFS-7285
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Weihua Jiang
Assignee: Zhe Zhang
 Attachments: ECAnalyzer.py, ECParser.py, 
 HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, 
 HDFSErasureCodingDesign-20150204.pdf, fsimage-analysis-20150105.pdf


 Erasure Coding (EC) can greatly reduce the storage overhead without 
 sacrificing data reliability, compared to the existing HDFS 3-replica 
 approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate 
 the loss of 4 blocks, with the storage overhead being only 40%. This makes EC 
 a quite attractive alternative for big data storage, particularly for cold 
 data. 
 Facebook had a related open source project called HDFS-RAID. It used to be 
 one of the contributed packages in HDFS but has been removed since Hadoop 2.0 
 for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and 
 depends on MapReduce to do encoding and decoding tasks; 2) it can only be 
 used for cold files that are intended not to be appended anymore; 3) the pure 
 Java EC coding implementation is extremely slow in practical use. Due to 
 these, it might not be a good idea to just bring HDFS-RAID back.
 We (Intel and Cloudera) are working on a design to build EC into HDFS that 
 gets rid of any external dependencies, making it self-contained and 
 independently maintained. This design lays the EC feature on top of the 
 storage type support and is designed to be compatible with existing HDFS 
 features like caching, snapshots, encryption, high availability, etc. This 
 design will also support different EC coding schemes, implementations and 
 policies for different deployment scenarios. By utilizing advanced libraries 
 (e.g. the Intel ISA-L library), an implementation can greatly improve the 
 performance of EC encoding/decoding and make the EC solution even more 
 attractive. We will post the design document soon. 





[jira] [Updated] (HDFS-7694) FSDataInputStream should support unbuffer

2015-02-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7694:
---
Attachment: HDFS-7694.003.patch

 FSDataInputStream should support unbuffer
 ---

 Key: HDFS-7694
 URL: https://issues.apache.org/jira/browse/HDFS-7694
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7694.001.patch, HDFS-7694.002.patch, 
 HDFS-7694.003.patch


 For applications that have many open HDFS (or other Hadoop filesystem) files, 
 it would be useful to have an API to clear readahead buffers and sockets.  
 This could be added to the existing APIs as an optional interface, in much 
 the same way as we added setReadahead / setDropBehind / etc.





[jira] [Updated] (HDFS-7736) Typos in dfsadmin/fsck/snapshotDiff Commands

2015-02-06 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7736:
---
Summary: Typos in dfsadmin/fsck/snapshotDiff Commands  (was: [HDFS]Few 
Command print incorrect command usage)

 Typos in dfsadmin/fsck/snapshotDiff Commands
 

 Key: HDFS-7736
 URL: https://issues.apache.org/jira/browse/HDFS-7736
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: tools
Affects Versions: 2.6.0
Reporter: Archana T
Assignee: Brahma Reddy Battula
Priority: Minor
 Attachments: HDFS-7736-branch-2-001.patch, HDFS-7736.patch


 Scenario --
 Try the following hdfs commands --
 1. 
 # ./hdfs dfsadmin -getStoragePolicy
 Usage:*{color:red} java DFSAdmin {color}*[-getStoragePolicy path]
 Expected- 
 Usage:*{color:green} hdfs dfsadmin {color}*[-getStoragePolicy path]
 2.
 # ./hdfs dfsadmin -setStoragePolicy
 Usage:*{color:red} java DFSAdmin {color}*[-setStoragePolicy path policyName]
 Expected- 
 Usage:*{color:green} hdfs dfsadmin {color}*[-setStoragePolicy path policyName]
 3.
 # ./hdfs fsck
 Usage:*{color:red} DFSck path {color}*[-list-corruptfileblocks | [-move | 
 -delete | -openforwrite] [-files [-blocks [-locations | -racks
 Expected- 
 Usage:*{color:green} hdfs fsck path {color}*[-list-corruptfileblocks | 
 [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks
 4.
 # ./hdfs snapshotDiff
 Usage:
 *{color:red}SnapshotDiff{color}* snapshotDir from to:
 Expected- 
 Usage:
 *{color:green}snapshotDiff{color}* snapshotDir from to:





[jira] [Updated] (HDFS-7652) Process block reports for erasure coded blocks

2015-02-06 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7652:

Attachment: HDFS-7652.001.patch

It's actually possible to test this feature by altering the block reports. 

 Process block reports for erasure coded blocks
 --

 Key: HDFS-7652
 URL: https://issues.apache.org/jira/browse/HDFS-7652
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-7652.001.patch


 HDFS-7339 adds support in NameNode for persisting block groups. For memory 
 efficiency, erasure coded blocks under the striping layout are not stored in 
 {{BlockManager#blocksMap}}. Instead, entire block groups are stored in 
 {{BlockGroupManager#blockGroups}}. When a block report arrives from the 
 DataNode, it should be processed under the block group that it belongs to. 
 The following naming protocol is used to calculate the group of a given block:
 {code}
  * HDFS-EC introduces a hierarchical protocol to name blocks and groups:
  * Contiguous: {reserved block IDs | flag | block ID}
  * Striped: {reserved block IDs | flag | block group ID | index in group}
  *
  * Following n bits of reserved block IDs, The (n+1)th bit in an ID
  * distinguishes contiguous (0) and striped (1) blocks. For a striped block,
  * bits (n+2) to (64-m) represent the ID of its block group, while the last m
  * bits represent its index of the group. The value m is determined by the
  * maximum number of blocks in a group (MAX_BLOCKS_IN_GROUP).
 {code}
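 A minimal sketch of the grouping arithmetic implied above, assuming the low 
 m bits are the in-group index (the names and the value of m here are 
 illustrative, not from the patch):
 {code}
 // With m index bits (MAX_BLOCKS_IN_GROUP = 2^m), all blocks of a group
 // share the ID obtained by clearing those bits, so a single blocksMap
 // entry per group suffices.
 static final int  INDEX_BITS = 4;                       // m
 static final long INDEX_MASK = (1L << INDEX_BITS) - 1;  // low m bits

 static long groupIdOf(long stripedBlockId) {
   return stripedBlockId & ~INDEX_MASK;
 }

 static int indexInGroup(long stripedBlockId) {
   return (int) (stripedBlockId & INDEX_MASK);
 }
 {code}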





[jira] [Commented] (HDFS-7738) Add more negative tests for truncate

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309819#comment-14309819
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7738:
---

 TestHAAppend fails waiting on checkBlockRecovery() for me. Does it not fail 
 for you?

The machine you used is probably slow.  It also passed the previous Jenkins run.

 Add more negative tests for truncate
 

 Key: HDFS-7738
 URL: https://issues.apache.org/jira/browse/HDFS-7738
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7738_20150204.patch, h7738_20150205.patch, 
 h7738_20150205b.patch, h7738_20150206.patch


 The following are negative test cases for truncate.
 - new length > old length
 - truncating a directory
 - truncating a non-existing file
 - truncating a file without write permission
 - truncating a file opened for append
 - truncating a file in safemode





[jira] [Commented] (HDFS-7738) Add more negative tests for truncate

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309829#comment-14309829
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7738:
---

 If the desire to have these is strong then they need sufficient logging 
 information, which would describe the full conditions under which a failure 
 occurs when it does.

We already have 
{code}
LOG.info("newLength=" + newLength + ", isReady=" + isReady);
{code}
I think it is good enough.  Agree?

 Add more negative tests for truncate
 

 Key: HDFS-7738
 URL: https://issues.apache.org/jira/browse/HDFS-7738
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7738_20150204.patch, h7738_20150205.patch, 
 h7738_20150205b.patch, h7738_20150206.patch


 The following are negative test cases for truncate.
 - new length > old length
 - truncating a directory
 - truncating a non-existing file
 - truncating a file without write permission
 - truncating a file opened for append
 - truncating a file in safemode





[jira] [Updated] (HDFS-7744) Fix potential NPE in DFSInputStream after setDropBehind or setReadahead is called

2015-02-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7744:
---
Status: Patch Available  (was: Open)

 Fix potential NPE in DFSInputStream after setDropBehind or setReadahead is 
 called
 -

 Key: HDFS-7744
 URL: https://issues.apache.org/jira/browse/HDFS-7744
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: dfsclient
Affects Versions: 2.6.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7744.001.patch


 Fix a potential NPE in DFSInputStream after setDropBehind or setReadahead is 
 called.  These functions clear the {{blockReader}}, but don't set 
 {{blockEnd}} to -1, which could lead to {{DFSInputStream#seek}} attempting to 
 dereference {{blockReader}} even though it is {{null}}.





[jira] [Updated] (HDFS-7744) Fix potential NPE in DFSInputStream after setDropBehind or setReadahead is called

2015-02-06 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7744:
---
Attachment: HDFS-7744.001.patch

 Fix potential NPE in DFSInputStream after setDropBehind or setReadahead is 
 called
 -

 Key: HDFS-7744
 URL: https://issues.apache.org/jira/browse/HDFS-7744
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: dfsclient
Affects Versions: 2.6.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7744.001.patch


 Fix a potential NPE in DFSInputStream after setDropBehind or setReadahead is 
 called.  These functions clear the {{blockReader}}, but don't set 
 {{blockEnd}} to -1, which could lead to {{DFSInputStream#seek}} attempting to 
 dereference {{blockReader}} even though it is {{null}}.





[jira] [Updated] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7743:

Status: Patch Available  (was: Open)

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch


 In the work on erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo with two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To make syncing the HDFS-EC branch with trunk easier, this jira 
 renames the current BlockInfo to BlockReplicationInfo in trunk.
 In the meantime, we can also use this chance to do some minor code cleanup, 
 e.g., removing the unnecessary overridden {{hashCode}} and {{equals}} methods 
 since they are identical to those of the superclass {{Block}}.





[jira] [Updated] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7743:

Attachment: HDFS-7743.000.patch

The first patch mainly does the rename and removes the unnecessary overridden 
methods from BlockInfo.

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch


 In the work on erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo with two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To make syncing the HDFS-EC branch with trunk easier, this jira 
 renames the current BlockInfo to BlockReplicationInfo in trunk.
 In the meantime, we can also use this chance to do some minor code cleanup, 
 e.g., removing the unnecessary overridden {{hashCode}} and {{equals}} methods 
 since they are identical to those of the superclass {{Block}}.





[jira] [Updated] (HDFS-7732) Fix the order of the parameters in DFSConfigKeys

2015-02-06 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-7732:
---
Labels:   (was: newbie)

 Fix the order of the parameters in DFSConfigKeys
 

 Key: HDFS-7732
 URL: https://issues.apache.org/jira/browse/HDFS-7732
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula
Priority: Trivial
 Attachments: HDFS-7732.patch


 In DFSConfigKeys.java, there are some parameters between 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY}} and 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT}}.
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
 {code}
 The order should be corrected as:
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
 {code}





[jira] [Updated] (HDFS-7684) The host:port settings of dfs.namenode.secondary.http-address should be trimmed before use

2015-02-06 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-7684:
---
Status: Patch Available  (was: Open)

 The host:port settings of dfs.namenode.secondary.http-address should be 
 trimmed before use
 --

 Key: HDFS-7684
 URL: https://issues.apache.org/jira/browse/HDFS-7684
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.5.1, 2.4.1
Reporter: Tianyin Xu
Assignee: Anu Engineer
 Attachments: HDFS.7684.001.patch


 With the following setting,
 <property>
   <name>dfs.namenode.secondary.http-address</name>
   <value>myhostname:50090 </value>
 </property>
 The secondary NameNode could not be started
 $ hadoop-daemon.sh start secondarynamenode
 starting secondarynamenode, logging to 
 /home/hadoop/hadoop-2.4.1/logs/hadoop-hadoop-secondarynamenode-xxx.out
 /home/hadoop/hadoop-2.4.1/bin/hdfs
 Exception in thread "main" java.lang.IllegalArgumentException: Does not 
 contain a valid host:port authority: myhostname:50090
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:196)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.getHttpAddress(SecondaryNameNode.java:203)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:214)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.init(SecondaryNameNode.java:192)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:651)
 We were really confused and misled by the log message: we thought about DNS 
 problems (changed to the IP address with no success) and network problems 
 (tried to test the connections with no success...)
 It turned out that the setting is not trimmed, and the additional space 
 character at the end of the setting caused the problem... OMG!!!...
 Searching on the Internet, we found we are really not alone.  So many users 
 encountered similar trim problems! The following lists a few:
 http://solaimurugan.blogspot.com/2013/10/hadoop-multi-node-cluster-configuration.html
 http://stackoverflow.com/questions/11263664/error-while-starting-the-hadoop-using-strat-all-sh
 https://issues.apache.org/jira/browse/HDFS-2799
 https://issues.apache.org/jira/browse/HBASE-6973
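 One likely fix pattern, as a sketch ({{Configuration#getTrimmed}} is the 
 Hadoop API that strips surrounding whitespace; the class wrapping it here is 
 hypothetical):
 {code}
 import java.net.InetSocketAddress;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.net.NetUtils;

 class SecondaryHttpAddr {
   static InetSocketAddress resolve(Configuration conf) {
     // getTrimmed() strips leading/trailing whitespace from the value,
     // unlike plain get(), so "myhostname:50090 " parses cleanly.
     String addr = conf.getTrimmed(
         "dfs.namenode.secondary.http-address", "0.0.0.0:50090");
     return NetUtils.createSocketAddr(addr);
   }
 }
 {code}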





[jira] [Commented] (HDFS-7716) Erasure Coding: extend BlockInfo to handle EC info

2015-02-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309776#comment-14309776
 ] 

Jing Zhao commented on HDFS-7716:
-

Thanks for the review, Zhe!

bq. Just to clarify, the plan is to commit the updated BlockInfo and 
BlockReplicationInfo classes to trunk, right?

Yes. A patch has already been submitted to HDFS-7743.

bq.  If we do want this sanity check, how about BlockInfoStriped, 
BlockInfoStripedUnderConstruction, BlockInfoContiguous (or BlockInfoReplicated)?

Actually I do not think we need that sanity check since the current code 
already does the conversion before the check. But I think the class names you 
proposed look better than BlockReplicationInfo and BlockGroupInfo. I will make 
this change in HDFS-7743.
{code}
BlockInfo info = (BlockInfo)triplets[index*3+1];
assert info == null || 
    info.getClass().getName().startsWith(BlockInfo.class.getName()) : 
  "BlockInfo is expected at " + index*3;
{code}

bq. I guess this JIRA requires it to handle reports of striped blocks

Yes, that's why my current patch does not include this part of the logic. We 
can commit HDFS-7652 first.

 Erasure Coding: extend BlockInfo to handle EC info
 --

 Key: HDFS-7716
 URL: https://issues.apache.org/jira/browse/HDFS-7716
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-7716.000.patch


 The current BlockInfo implementation only supports the replication 
 mechanism. To use the same blocksMap for handling a block group and its 
 data/parity blocks, we need to define a new BlockGroupInfo class.





[jira] [Created] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-7743:
---

 Summary: Code cleanup of BlockInfo and rename BlockInfo to 
BlockReplicationInfo
 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor


In the work on erasure coding (HDFS-7285), we plan to extend the class 
BlockInfo with two subclasses: BlockReplicationInfo and BlockGroupInfo 
(HDFS-7716). To make syncing the HDFS-EC branch with trunk easier, this jira 
renames the current BlockInfo to BlockReplicationInfo in trunk.

In the meantime, we can also use this chance to do some minor code cleanup, 
e.g., removing the unnecessary overridden {{hashCode}} and {{equals}} methods 
since they are identical to those of the superclass {{Block}}.





[jira] [Updated] (HDFS-7732) Fix the order of the parameters in DFSConfigKeys

2015-02-06 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-7732:

  Resolution: Fixed
   Fix Version/s: 2.7.0
Target Version/s: 2.7.0
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed this to trunk and branch-2. Thanks [~brahmareddy] for the 
contribution, and thanks [~cmccabe] for the review.

 Fix the order of the parameters in DFSConfigKeys
 

 Key: HDFS-7732
 URL: https://issues.apache.org/jira/browse/HDFS-7732
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula
Priority: Trivial
 Fix For: 2.7.0

 Attachments: HDFS-7732.patch


 In DFSConfigKeys.java, there are some parameters between 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY}} and 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT}}.
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
 {code}
 The order should be corrected as:
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY =
       "dfs.client.read.shortcircuit.buffer.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 1024 * 1024;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY =
       "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY =
       "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7738) Add more negative tests for truncate

2015-02-06 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309727#comment-14309727
 ] 

Konstantin Shvachko commented on HDFS-7738:
---

bq. it is better to have some random test.

In general I don't like random tests, because they imply intermittent failures, 
which are hard to reproduce and therefore hard to fix.
If the desire to have these is strong, then they need sufficient logging to 
describe the full conditions under which a failure occurs when it does.
In this particular case your test is a subset of the existing test on every 
run.
Will look at the patch in a bit.

 Add more negative tests for truncate
 

 Key: HDFS-7738
 URL: https://issues.apache.org/jira/browse/HDFS-7738
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7738_20150204.patch, h7738_20150205.patch, 
 h7738_20150205b.patch, h7738_20150206.patch


 The following are negative test cases for truncate.
 - new length > old length
 - truncating a directory
 - truncating a non-existing file
 - truncating a file without write permission
 - truncating a file opened for append
 - truncating a file in safemode
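 A minimal JUnit-style sketch of one of these cases (hypothetical; assumes 
 {{fs}} is a DistributedFileSystem backed by a running MiniDFSCluster):
 {code}
 // Truncating a directory should fail: truncate is only defined for
 // regular files, so the namenode rejects non-file paths.
 Path dir = new Path("/truncateDir");
 fs.mkdirs(dir);
 try {
   fs.truncate(dir, 0);
   fail("truncate should fail on a directory");
 } catch (IOException expected) {
   // expected: the path does not resolve to a regular file
 }
 {code}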



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7738) Add more negative tests for truncate

2015-02-06 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309734#comment-14309734
 ] 

Konstantin Shvachko commented on HDFS-7738:
---

TestHAAppend fails waiting on checkBlockRecovery() for me. Does it not fail for 
you?
{code}
java.lang.AssertionError: inode should complete in ~3 ms.
Expected: is <true>
     but: was <false>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:865)
at 
org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.checkBlockRecovery(TestFileTruncate.java:944)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestHAAppend.testMultipleAppendsDuringCatchupTailing(TestHAAppend.java:213)
{code}

 Add more negative tests for truncate
 

 Key: HDFS-7738
 URL: https://issues.apache.org/jira/browse/HDFS-7738
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7738_20150204.patch, h7738_20150205.patch, 
 h7738_20150205b.patch, h7738_20150206.patch


 The following are negative test cases for truncate.
 - new length > old length
 - truncating a directory
 - truncating a non-existing file
 - truncating a file without write permission
 - truncating a file opened for append
 - truncating a file in safemode



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7720) Quota by Storage Type API, tools and ClientNameNode Protocol changes

2015-02-06 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309934#comment-14309934
 ] 

Arpit Agarwal commented on HDFS-7720:
-

+1 pending Jenkins.

 Quota by Storage Type API, tools and ClientNameNode Protocol changes
 

 Key: HDFS-7720
 URL: https://issues.apache.org/jira/browse/HDFS-7720
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7720.0.patch, HDFS-7720.1.patch, HDFS-7720.2.patch, 
 HDFS-7720.3.patch, HDFS-7720.4.patch


 Split the patch into small ones based on the feedback. This one covers the 
 HDFS API changes, tool changes and ClientNameNode protocol changes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7732) Fix the order of the parameters in DFSConfigKeys

2015-02-06 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309686#comment-14309686
 ] 

Colin Patrick McCabe commented on HDFS-7732:


+1

 Fix the order of the parameters in DFSConfigKeys
 

 Key: HDFS-7732
 URL: https://issues.apache.org/jira/browse/HDFS-7732
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula
Priority: Trivial
 Attachments: HDFS-7732.patch


 In DFSConfigKeys.java, there are some parameters between 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY}} and 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT}}.
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY = 
 "dfs.client.read.shortcircuit.buffer.size";
   public static final String 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY = 
 "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY = 
 "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 
 1024 * 1024;
 {code}
 The order should be corrected as follows:
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY = 
 "dfs.client.read.shortcircuit.buffer.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 
 1024 * 1024;
   public static final String 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY = 
 "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY = 
 "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309730#comment-14309730
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7743:
---

There are 4 tabs in the assert statements in BlockReplicationInfo.  Could you 
fix them as well?

+1 patch looks good.

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch


 In the work of erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo to two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To ease the HDFS-EC branch syncing with trunk, this jira plans 
 to rename the current BlockInfo to BlockReplicationInfo in trunk.
 In the meantime, we can also use this chance to do some minor code cleanup. 
 E.g., removing the unnecessary overridden {{hashCode}} and {{equals}} methods, 
 since they are just the same as in the super class {{Block}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7736) Typos in dfsadmin/fsck/snapshotDiff Commands

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309783#comment-14309783
 ] 

Hadoop QA commented on HDFS-7736:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697056/HDFS-7736.patch
  against trunk revision 1425e3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestDFSClientRetries
  
org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9458//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9458//console

This message is automatically generated.

 Typos in dfsadmin/fsck/snapshotDiff Commands
 

 Key: HDFS-7736
 URL: https://issues.apache.org/jira/browse/HDFS-7736
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: tools
Affects Versions: 2.6.0
Reporter: Archana T
Assignee: Brahma Reddy Battula
Priority: Minor
 Attachments: HDFS-7736-branch-2-001.patch, HDFS-7736.patch


 Scenario --
 Try the following hdfs commands --
 1. 
 # ./hdfs dfsadmin -getStoragePolicy
 Usage:*{color:red} java DFSAdmin {color}*[-getStoragePolicy path]
 Expected- 
 Usage:*{color:green} hdfs dfsadmin {color}*[-getStoragePolicy path]
 2.
 # ./hdfs dfsadmin -setStoragePolicy
 Usage:*{color:red} java DFSAdmin {color}*[-setStoragePolicy path policyName]
 Expected- 
 Usage:*{color:green} hdfs dfsadmin {color}*[-setStoragePolicy path policyName]
 3.
 # ./hdfs fsck
 Usage:*{color:red} DFSck <path> {color}*[-list-corruptfileblocks | [-move | 
 -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
 Expected- 
 Usage:*{color:green} hdfs fsck <path> {color}*[-list-corruptfileblocks | 
 [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
 4.
 # ./hdfs snapshotDiff
 Usage:
 *{color:red}SnapshotDiff{color}* <snapshotDir> <from> <to>:
 Expected- 
 Usage:
 *{color:green}snapshotDiff{color}* <snapshotDir> <from> <to>:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7736) Typos in dfsadmin/fsck/snapshotDiff Commands

2015-02-06 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309917#comment-14309917
 ] 

Akira AJISAKA commented on HDFS-7736:
-

Thanks [~brahmareddy] for creating the patch. Two comments:
{code}
-String description = "SnapshotDiff <snapshotDir> <from> <to>:\n" +
+String description = "snapshotDiff <snapshotDir> <from> <to>:\n" +
{code}
# Would "hdfs snapshotDiff" be more consistent with the other commands than 
"snapshotDiff"?
# There are typos in the lsSnapshottableDir and mover commands also. Would you 
fix them?

 Typos in dfsadmin/fsck/snapshotDiff Commands
 

 Key: HDFS-7736
 URL: https://issues.apache.org/jira/browse/HDFS-7736
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: tools
Affects Versions: 2.6.0
Reporter: Archana T
Assignee: Brahma Reddy Battula
Priority: Minor
 Attachments: HDFS-7736-branch-2-001.patch, HDFS-7736.patch


 Scenario --
 Try the following hdfs commands --
 1. 
 # ./hdfs dfsadmin -getStoragePolicy
 Usage:*{color:red} java DFSAdmin {color}*[-getStoragePolicy path]
 Expected- 
 Usage:*{color:green} hdfs dfsadmin {color}*[-getStoragePolicy path]
 2.
 # ./hdfs dfsadmin -setStoragePolicy
 Usage:*{color:red} java DFSAdmin {color}*[-setStoragePolicy path policyName]
 Expected- 
 Usage:*{color:green} hdfs dfsadmin {color}*[-setStoragePolicy path policyName]
 3.
 # ./hdfs fsck
 Usage:*{color:red} DFSck <path> {color}*[-list-corruptfileblocks | [-move | 
 -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
 Expected- 
 Usage:*{color:green} hdfs fsck <path> {color}*[-list-corruptfileblocks | 
 [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]]
 4.
 # ./hdfs snapshotDiff
 Usage:
 *{color:red}SnapshotDiff{color}* <snapshotDir> <from> <to>:
 Expected- 
 Usage:
 *{color:green}snapshotDiff{color}* <snapshotDir> <from> <to>:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6133) Make Balancer support exclude specified path

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309942#comment-14309942
 ] 

Tsz Wo Nicholas Sze commented on HDFS-6133:
---

bq. ... where n = targetPinnings.length ...

It should be n = nodes.length.

 Make Balancer support exclude specified path
 

 Key: HDFS-6133
 URL: https://issues.apache.org/jira/browse/HDFS-6133
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer  mover, namenode
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-6133-1.patch, HDFS-6133-2.patch, HDFS-6133-3.patch, 
 HDFS-6133-4.patch, HDFS-6133-5.patch, HDFS-6133-6.patch, HDFS-6133-7.patch, 
 HDFS-6133-8.patch, HDFS-6133.patch


 Currently, running Balancer will destroy the Regionserver's data locality.
 If getBlocks could exclude blocks belonging to files which have a specific 
 path prefix, like /hbase, then we could run Balancer without destroying the 
 Regionserver's data locality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7743:

Attachment: HDFS-7743.001.patch

[~zhz] 
[suggested|https://issues.apache.org/jira/browse/HDFS-7716?focusedCommentId=14309739&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14309739]
 we name the new class {{BlockInfoContiguous}}. I think this is a better 
name, so I have updated the patch. The patch also removes the tabs to address 
Nicholas's comments.

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch, HDFS-7743.001.patch


 In the work of erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo to two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To ease the HDFS-EC branch syncing with trunk, this jira plans 
 to rename the current BlockInfo to BlockReplicationInfo in trunk.
 In the meantime, we can also use this chance to do some minor code cleanup. 
 E.g., removing the unnecessary overridden {{hashCode}} and {{equals}} methods, 
 since they are just the same as in the super class {{Block}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310051#comment-14310051
 ] 

Hadoop QA commented on HDFS-7743:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697085/HDFS-7743.000.patch
  against trunk revision 1425e3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 21 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9463//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9463//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9463//console

This message is automatically generated.

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch, HDFS-7743.001.patch


 In the work of erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo to two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To ease the HDFS-EC branch syncing with trunk, this jira plans 
 to rename the current BlockInfo to BlockReplicationInfo in trunk.
 In the meantime, we can also use this chance to do some minor code cleanup. 
 E.g., removing the unnecessary overridden {{hashCode}} and {{equals}} methods, 
 since they are just the same as in the super class {{Block}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7604) Track and display failed DataNode storage locations in NameNode.

2015-02-06 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310052#comment-14310052
 ] 

Chris Nauroth commented on HDFS-7604:
-

Thank you for taking a look, Xiaoyu.  I actually had the same thought as you a 
few days ago and updated my code to change the name of the page.  I think 
changing the name is preferable to displaying all volumes of all nodes.  The 
existing Datanode Information page already gives us a view of all nodes, and 
I'd like to keep the new page focused on just the nodes with problems, so that 
an admin can use it as a task list for routine cluster maintenance.

 Track and display failed DataNode storage locations in NameNode.
 

 Key: HDFS-7604
 URL: https://issues.apache.org/jira/browse/HDFS-7604
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-7604-screenshot-1.png, HDFS-7604-screenshot-2.png, 
 HDFS-7604-screenshot-3.png, HDFS-7604-screenshot-4.png, HDFS-7604.001.patch, 
 HDFS-7604.prototype.patch


 During heartbeats, the DataNode can report a list of its storage locations 
 that have been taken out of service due to failure (such as due to a bad disk 
 or a permissions problem).  The NameNode can track these failed storage 
 locations and then report them in JMX and the NameNode web UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7746) Add a test randomly mixing append, truncate and snapshot

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)
Tsz Wo Nicholas Sze created HDFS-7746:
-

 Summary: Add a test randomly mixing append, truncate and snapshot
 Key: HDFS-7746
 URL: https://issues.apache.org/jira/browse/HDFS-7746
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor


TestFileTruncate.testSnapshotWithAppendTruncate already does a good job of 
covering many test cases.  Let's add a random test mixing many append, 
truncate and snapshot operations.
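One way such a test could log enough to reproduce failures (hypothetical 
sketch; the per-operation helpers are assumed, and the seed is printed to 
address the reproducibility concern raised on HDFS-7738):
{code}
// Log the seed so an intermittent failure can be replayed deterministically.
final long seed = System.nanoTime();
LOG.info("Random test seed: " + seed);
final Random rand = new Random(seed);
for (int i = 0; i < NUM_OPS; i++) {
  switch (rand.nextInt(3)) {
    case 0:  appendRandomBytes(rand);   break; // assumed helper
    case 1:  truncateRandomly(rand);    break; // assumed helper
    default: takeOrDeleteSnapshot(rand);       // assumed helper
  }
}
{code}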



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7647) DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs

2015-02-06 Thread Milan Desai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310123#comment-14310123
 ] 

Milan Desai commented on HDFS-7647:
---

Some of the tests were failing because the .equals() and .hashCode() methods in 
DatanodeInfoWithStorage also checked for storage equality, and tests like 
TestShortCircuitCache relied on the ability to identify DatanodeInfos in a map 
using a DatanodeInfoWithStorage object. The fix was to remove the equals() and 
hashCode() overrides so that they rely on the superclass.

TestDecommission.testDecommissionWithOpenfile is failing because the test is 
waiting for the DatanodeInfoWithStorage object to become decommissioned, but it 
won't ever be decommissioned because it is not the same reference as the 
DatanodeInfo in the datanode map. This poses a larger problem in that the 
client may expect changes to the datanode state to be reflected in the output 
of LocatedBlock.getLocations(), but they won't be, because we create a new 
DatanodeInfoWithStorage object during construction and return it in 
getLocations().

Instead of having DatanodeInfoWithStorage be a subclass of DatanodeInfo, we 
could make it a wrapper around a DatanodeInfo, a StorageID, and a StorageType. 
Then, to make NetworkTopology.sortByDistance work, we would have to make it 
implement Node and/or DatanodeInfo.
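A rough sketch of the wrapper alternative (hypothetical; delegation of the 
Node methods is omitted):
{code}
// Composition instead of inheritance: the wrapped DatanodeInfo stays the
// same reference that lives in the datanode map, so state changes such as
// decommissioning remain visible through getLocations().
class DatanodeInfoWithStorage {
  private final DatanodeInfo datanode;  // shared reference, not a copy
  private final String storageID;
  private final StorageType storageType;

  DatanodeInfoWithStorage(DatanodeInfo datanode, String storageID,
      StorageType storageType) {
    this.datanode = datanode;
    this.storageID = storageID;
    this.storageType = storageType;
  }

  DatanodeInfo getDatanodeInfo() { return datanode; }
  String getStorageID()          { return storageID; }
  StorageType getStorageType()   { return storageType; }
}
{code}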

 DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs
 --

 Key: HDFS-7647
 URL: https://issues.apache.org/jira/browse/HDFS-7647
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Milan Desai
Assignee: Milan Desai
 Attachments: HDFS-7647-2.patch, HDFS-7647-3.patch, HDFS-7647-4.patch, 
 HDFS-7647.patch


 DatanodeManager.sortLocatedBlocks() sorts the array of DatanodeInfos inside 
 each LocatedBlock, but does not touch the array of StorageIDs and 
 StorageTypes. As a result, the DatanodeInfos and StorageIDs/StorageTypes are 
 mismatched. The method is called by FSNamesystem.getBlockLocations(), so the 
 client will not know which StorageID/Type corresponds to which DatanodeInfo.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2084) Sometimes backup node/secondary name node stops with exception

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-2084:
---
Status: Open  (was: Patch Available)

 Sometimes backup node/secondary name node stops with exception
 --

 Key: HDFS-2084
 URL: https://issues.apache.org/jira/browse/HDFS-2084
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.21.0
 Environment: FreeBSD
Reporter: Vitalii Tymchyshyn
 Attachments: patch.diff


 2011-06-17 11:43:23,096 ERROR 
 org.apache.hadoop.hdfs.server.namenode.Checkpointer: Throwable Exception in 
 doCheckpoint: 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1765)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1753)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:708)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:411)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:378)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1209)
 at 
 org.apache.hadoop.hdfs.server.namenode.BackupStorage.loadCheckpoint(BackupStorage.java:158)
 at 
 org.apache.hadoop.hdfs.server.namenode.Checkpointer.doCheckpoint(Checkpointer.java:243)
 at 
 org.apache.hadoop.hdfs.server.namenode.Checkpointer.run(Checkpointer.java:141)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2084) Sometimes backup node/secondary name node stops with exception

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-2084:
---
Status: Patch Available  (was: Open)

 Sometimes backup node/secondary name node stops with exception
 --

 Key: HDFS-2084
 URL: https://issues.apache.org/jira/browse/HDFS-2084
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.21.0
 Environment: FreeBSD
Reporter: Vitalii Tymchyshyn
 Attachments: patch.diff


 2011-06-17 11:43:23,096 ERROR 
 org.apache.hadoop.hdfs.server.namenode.Checkpointer: Throwable Exception in 
 doCheckpoint: 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1765)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedSetTimes(FSDirectory.java:1753)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:708)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:411)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:378)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1209)
 at 
 org.apache.hadoop.hdfs.server.namenode.BackupStorage.loadCheckpoint(BackupStorage.java:158)
 at 
 org.apache.hadoop.hdfs.server.namenode.Checkpointer.doCheckpoint(Checkpointer.java:243)
 at 
 org.apache.hadoop.hdfs.server.namenode.Checkpointer.run(Checkpointer.java:141)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-957) FSImage layout version should be written only once file is complete

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-957:
--
Status: Open  (was: Patch Available)

Patch no longer applies.

 FSImage layout version should be written only once file is complete
 --------------------------------------------------------------------

 Key: HDFS-957
 URL: https://issues.apache.org/jira/browse/HDFS-957
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hdfs-957.txt


 Right now, the FSImage save code writes the LAYOUT_VERSION at the head of the 
 file, along with some other headers, and then dumps the directory into the 
 file. Instead, it should write a special IMAGE_IN_PROGRESS entry for the 
 layout version, dump all of the data, then seek back to the head of the file 
 to write the proper LAYOUT_VERSION. This would make it very easy to detect 
 the case where the FSImage save got interrupted.
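 A minimal sketch of the seek-back idea (hypothetical constants and helper; 
 not an actual patch):
 {code}
 // Write a placeholder marker first, dump the image, then seek back and
 // stamp the real layout version only once the file is complete.
 RandomAccessFile out = new RandomAccessFile(imageFile, "rw");
 out.writeInt(IMAGE_IN_PROGRESS);   // placeholder at the head of the file
 writeNamespaceData(out);           // assumed helper dumping the directory
 out.seek(0);                       // back to the head of the file
 out.writeInt(LAYOUT_VERSION);      // now the image is marked complete
 out.getFD().sync();                // push the header to disk
 out.close();
 {code}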



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7471) TestDatanodeManager#testNumVersionsReportedCorrect occasionally fails

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310169#comment-14310169
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7471:
---

It fails again.
https://builds.apache.org/job/PreCommit-HDFS-Build/9440//testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestDatanodeManager/testNumVersionsReportedCorrect/

 TestDatanodeManager#testNumVersionsReportedCorrect occasionally fails
 -

 Key: HDFS-7471
 URL: https://issues.apache.org/jira/browse/HDFS-7471
 Project: Hadoop HDFS
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Ted Yu
Assignee: Binglin Chang
 Attachments: HDFS-7471.001.patch


 From https://builds.apache.org/job/Hadoop-Hdfs-trunk/1957/ :
 {code}
 FAILED:  
 org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect
 Error Message:
 The map of version counts returned by DatanodeManager was not what it was 
 expected to be on iteration 237 expected:<0> but was:<1>
 Stack Trace:
 java.lang.AssertionError: The map of version counts returned by 
 DatanodeManager was not what it was expected to be on iteration 237 
 expected:<0> but was:<1>
 at org.junit.Assert.fail(Assert.java:88)
 at org.junit.Assert.failNotEquals(Assert.java:743)
 at org.junit.Assert.assertEquals(Assert.java:118)
 at org.junit.Assert.assertEquals(Assert.java:555)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect(TestDatanodeManager.java:150)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310192#comment-14310192
 ] 

Jing Zhao commented on HDFS-7743:
-

bq. It might make sense if BlockInfo became a common super-class for various 
impls like contiguous, sparse, etc.

Yes, this is the case for our current erasure coding implementation. I think 
the motivation has been clearly stated in the description. Please also see 
HDFS-7285 and HDFS-7716 for the design details of EC.

bq. If yes, I'd please request refraining since I'm extremely close to posting 
a reference-free block triplets replacement and would like to avoid unnecessary 
merge issues.

Cool. Then we're both working on big features that will require merging later. 
But instead of just mentioning it here, could you please open a jira and post 
your patch there? I would be very happy to help you with the merging. 


 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch, HDFS-7743.001.patch, 
 HDFS-7743.002.patch


 In the work of erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo to two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To ease the HDFS-EC branch syncing with trunk, this jira plans 
 to rename the current BlockInfo to BlockReplicationInfo in trunk. In the 
 meantime, we can also use this chance to do some minor code cleanup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS

2015-02-06 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309995#comment-14309995
 ] 

Zhe Zhang commented on HDFS-7285:
-

[~jingzhao], [~szetszwo] I do the following every week (Monday) to merge trunk 
into HDFS-EC:
{code}
 git rebase apache/trunk
 git rebase apache/HDFS-EC
 git push apache HDFS-EC:HDFS-EC
{code}
Does it look correct? I just rebased again, didn't see any conflict.

 Erasure Coding Support inside HDFS
 --

 Key: HDFS-7285
 URL: https://issues.apache.org/jira/browse/HDFS-7285
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Weihua Jiang
Assignee: Zhe Zhang
 Attachments: ECAnalyzer.py, ECParser.py, 
 HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, 
 HDFSErasureCodingDesign-20150204.pdf, fsimage-analysis-20150105.pdf


 Erasure Coding (EC) can greatly reduce the storage overhead without 
 sacrificing data reliability, compared to the existing HDFS 3-replica 
 approach. For example, if we use a 10+4 Reed-Solomon coding, we can allow the 
 loss of 4 blocks, with the storage overhead only being 40%. This makes EC a 
 quite attractive alternative for big data storage, particularly for cold 
 data. 
 Facebook had a related open source project called HDFS-RAID. It used to be 
 one of the contrib packages in HDFS but has been removed since Hadoop 2.0 for 
 maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends 
 on MapReduce to do encoding and decoding tasks; 2) it can only be used for 
 cold files that are intended not to be appended anymore; 3) the pure Java EC 
 coding implementation is extremely slow in practical use. Due to these, it 
 might not be a good idea to just bring HDFS-RAID back.
 We (Intel and Cloudera) are working on a design to build EC into HDFS that 
 gets rid of any external dependencies, making it self-contained and 
 independently maintained. This design builds the EC feature on top of the 
 storage type support and is designed to be compatible with existing HDFS 
 features like caching, snapshots, encryption, and high availability. It will 
 also support different EC coding schemes, implementations and policies for 
 different deployment scenarios. By utilizing advanced libraries (e.g. the 
 Intel ISA-L library), an implementation can greatly improve the performance 
 of EC encoding/decoding and make the EC solution even more attractive. We 
 will post the design document soon. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-3107) HDFS truncate

2015-02-06 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309989#comment-14309989
 ] 

Konstantin Shvachko edited comment on HDFS-3107 at 2/6/15 9:50 PM:
---

Attaching a new version of the design document.
You guys keep referring to the document attached to HDFS-7056, which I am 
really not in control of. It was uploaded by [~rguo], and according to the jira 
comments it reflects Pivotal's internal implementation of truncate. It is quite 
close to what we did, but not an exact match.
I think my design doc has all the necessary details for the truncate-snapshot 
integration. So I just removed the reference to Guo's doc, because it keeps 
causing confusion. It is still a good source of introduction to the snapshots 
impl. in general.


was (Author: shv):
Attaching new version of the design document.
You guys keep referring to the document attached to HDFS-7056, which I am 
really not in control of. It was uploaded by [~rguo], and it reflects according 
to the jira comments Pivotal's internal implementation of truncate. It is quite 
close to what we did, but the exact match.
I think my design doc has all necessary details for the truncate-snapshot 
integration. So I just removed the reference to Guo's doc, because it keeps 
causing confusion.

 HDFS truncate
 -

 Key: HDFS-3107
 URL: https://issues.apache.org/jira/browse/HDFS-3107
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Reporter: Lei Chang
Assignee: Plamen Jeliazkov
 Fix For: 2.7.0

 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
 HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
 HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
 HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
 HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml

   Original Estimate: 1,344h
  Remaining Estimate: 1,344h

 Systems with transaction support often need to undo changes made to the 
 underlying storage when a transaction is aborted. Currently HDFS does not 
 support truncate (a standard Posix operation), the reverse operation of 
 append, which makes upper layer applications use ugly workarounds (such as 
 keeping track of the discarded byte range per file in a separate metadata 
 store, and periodically running a vacuum process to rewrite compacted files) 
 to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309994#comment-14309994
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7692:
---

I think we don't need DFS_DATANODE_DIRECTORYSCAN_THREADS_KEY.  Since the number 
of dataDirs won't be large, why not always use dataDirs.size()?

 DataStorage#addStorageLocations(...) should support MultiThread to speedup 
 the upgrade of block pool at multi storage directories.
 --

 Key: HDFS-7692
 URL: https://issues.apache.org/jira/browse/HDFS-7692
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.2
Reporter: Leitao Guo
 Attachments: HDFS-7692.01.patch


 {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
 for (StorageLocation dataDir : dataDirs) {
   File root = dataDir.getFile();
  ... ...
 bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
 startOpt);
 addBlockPoolStorage(bpid, bpStorage);
 ... ...
   successVolumes.add(dataDir);
 }
 {code}
 In the above code the storage directories will be analyzed one by one, which 
 is really time consuming when upgrading HDFS on datanodes that have dozens of 
 large volumes.  Multi-threaded dataDir analysis should be supported here to 
 speed up the upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7738) Add more negative tests for truncate

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310024#comment-14310024
 ] 

Hadoop QA commented on HDFS-7738:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697079/h7738_20150206.patch
  against trunk revision 1425e3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestHFlush

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9460//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9460//console

This message is automatically generated.

 Add more negative tests for truncate
 

 Key: HDFS-7738
 URL: https://issues.apache.org/jira/browse/HDFS-7738
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7738_20150204.patch, h7738_20150205.patch, 
 h7738_20150205b.patch, h7738_20150206.patch


 The following are negative test cases for truncate.
 - new length > old length
 - truncating a directory
 - truncating a non-existing file
 - truncating a file without write permission
 - truncating a file opened for append
 - truncating a file in safemode



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3107) HDFS truncate

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-3107:
--
Attachment: HDFS_truncate_semantics_Mar21.pdf

HDFS_truncate_semantics_Mar21.pdf: obsoletes the previous doc.

 HDFS truncate
 -

 Key: HDFS-3107
 URL: https://issues.apache.org/jira/browse/HDFS-3107
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Reporter: Lei Chang
Assignee: Plamen Jeliazkov
 Fix For: 2.7.0

 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
 HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
 HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
 HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
 HDFS_truncate_semantics_Mar21.pdf, HDFS_truncate_semantics_Mar21.pdf, 
 editsStored, editsStored.xml

   Original Estimate: 1,344h
  Remaining Estimate: 1,344h

 Systems with transaction support often need to undo changes made to the 
 underlying storage when a transaction is aborted. Currently HDFS does not 
 support truncate (a standard Posix operation), the reverse operation of 
 append, which makes upper layer applications use ugly workarounds (such as 
 keeping track of the discarded byte range per file in a separate metadata 
 store, and periodically running a vacuum process to rewrite compacted files) 
 to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3107) HDFS truncate

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-3107:
--
Attachment: HDFS_truncate_semantics_Mar15.pdf

HDFS_truncate_semantics_Mar15.pdf: obsoletes the previous doc.

 HDFS truncate
 -

 Key: HDFS-3107
 URL: https://issues.apache.org/jira/browse/HDFS-3107
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Reporter: Lei Chang
Assignee: Plamen Jeliazkov
 Fix For: 2.7.0

 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
 HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
 HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
 HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
 HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, 
 HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml

   Original Estimate: 1,344h
  Remaining Estimate: 1,344h

 Systems with transaction support often need to undo changes made to the 
 underlying storage when a transaction is aborted. Currently HDFS does not 
 support truncate (a standard Posix operation), the reverse operation of 
 append, which makes upper layer applications use ugly workarounds (such as 
 keeping track of the discarded byte range per file in a separate metadata 
 store, and periodically running a vacuum process to rewrite compacted files) 
 to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6743) Put IP address into a new column on the new NN webUI

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-6743:
---
Status: Open  (was: Patch Available)

 Put IP address into a new column on the new NN webUI
 

 Key: HDFS-6743
 URL: https://issues.apache.org/jira/browse/HDFS-6743
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Ming Ma
Assignee: Siqi Li
 Attachments: HDFS-6743.v1.patch, HDFS-6743.v2.patch


 The new NN webUI combines hostname and IP into one column in the datanode 
 list. It is more convenient for admins if the IP address is put in a separate 
 column, as in the legacy NN webUI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6133) Make Balancer support exclude specified path

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309965#comment-14309965
 ] 

Tsz Wo Nicholas Sze commented on HDFS-6133:
---

We should also add a configuration to enable/disable this feature, since block 
pinning may not be preferable for some deployments.

BTW, just heard from [~cnauroth] that Windows does not support the sticky bit.  
Are there other ways to implement block pinning?  Use the file name?

 Make Balancer support exclude specified path
 

 Key: HDFS-6133
 URL: https://issues.apache.org/jira/browse/HDFS-6133
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer  mover, namenode
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-6133-1.patch, HDFS-6133-2.patch, HDFS-6133-3.patch, 
 HDFS-6133-4.patch, HDFS-6133-5.patch, HDFS-6133-6.patch, HDFS-6133-7.patch, 
 HDFS-6133-8.patch, HDFS-6133.patch


 Currently, running Balancer will destroy the Regionserver's data locality.
 If getBlocks could exclude blocks belonging to files which have a specific 
 path prefix, like /hbase, then we could run Balancer without destroying the 
 Regionserver's data locality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7745) HDFS should have its own daemon command and not rely on the one in common

2015-02-06 Thread Sanjay Radia (JIRA)
Sanjay Radia created HDFS-7745:
--

 Summary: HDFS should have its own daemon command  and not rely on 
the one in common
 Key: HDFS-7745
 URL: https://issues.apache.org/jira/browse/HDFS-7745
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sanjay Radia


HDFS should have its own daemon command and not rely on the one in common.  BTW, 
Yarn split out its own daemon command during the project split. Note that the 
hdfs command does have a --daemon flag, and hence the daemon script is merely a 
wrapper. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310019#comment-14310019
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7692:
---

Some more comments:
- InterruptedException should be re-thrown as InterruptedIOException rather 
than ignored.
- Instead of waiting forever as below, 
{code}
+  addStorageThreadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
{code}
how about waiting a short time period, say 1 minute, and printing an info 
message?  It would be great if we can print the progress for each dir.
{code}
  for(; !addStorageThreadPool.awaitTermination(1, TimeUnit.MINUTES); ) {
LOG.info(..);
  }
{code}

BTW, thanks a lot for working on this!
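
Putting these together, the loop could look roughly like this (hypothetical 
sketch; {{analyzeStorage}} stands in for the per-directory work):
{code}
ExecutorService pool = Executors.newFixedThreadPool(dataDirs.size());
for (final StorageLocation dataDir : dataDirs) {
  pool.execute(new Runnable() {
    @Override
    public void run() {
      analyzeStorage(dataDir);  // assumed helper wrapping the per-dir work
    }
  });
}
pool.shutdown();
try {
  // Wait in short intervals and report progress instead of blocking silently.
  while (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
    LOG.info("Still analyzing storage directories...");
  }
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
  throw new InterruptedIOException("Interrupted while adding storage dirs");
}
{code}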

 DataStorage#addStorageLocations(...) should support MultiThread to speedup 
 the upgrade of block pool at multi storage directories.
 --

 Key: HDFS-7692
 URL: https://issues.apache.org/jira/browse/HDFS-7692
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.5.2
Reporter: Leitao Guo
 Attachments: HDFS-7692.01.patch


 {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
 for (StorageLocation dataDir : dataDirs) {
   File root = dataDir.getFile();
  ... ...
 bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
 startOpt);
 addBlockPoolStorage(bpid, bpStorage);
 ... ...
   successVolumes.add(dataDir);
 }
 {code}
 In the above code the storage directories will be analyzed one by one, which 
 is really time consuming when upgrading HDFS on datanodes that have dozens of 
 large volumes.  Multi-threaded dataDir analysis should be supported here to 
 speed up the upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7743:

Attachment: HDFS-7743.002.patch

We still need the equals and hashCode methods to avoid findbug warning. Update 
the patch.

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch, HDFS-7743.001.patch, 
 HDFS-7743.002.patch


 In the work of erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo to two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To ease the HDFS-EC branch syncing with trunk, this jira plans 
 to rename the current BlockInfo to BlockReplicationInfo in trunk.
 In the meanwhile, we can also use this chance to do some minor code cleanup. 
 E.g., removing unnecessary overrided {{hashCode}} and {{equals}} methods 
 since they are just the same with the super class {{Block}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7684) The host:port settings of dfs.namenode.secondary.http-address should be trimmed before use

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310081#comment-14310081
 ] 

Hadoop QA commented on HDFS-7684:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697096/HDFS.7684.001.patch
  against trunk revision eaab959.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.balancer.TestBalancer
  org.apache.hadoop.hdfs.TestLeaseRecovery2

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9464//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9464//artifact/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9464//console

This message is automatically generated.

 The host:port settings of dfs.namenode.secondary.http-address should be 
 trimmed before use
 --

 Key: HDFS-7684
 URL: https://issues.apache.org/jira/browse/HDFS-7684
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.1, 2.5.1
Reporter: Tianyin Xu
Assignee: Anu Engineer
 Attachments: HDFS.7684.001.patch


 With the following setting,
 <property>
 <name>dfs.namenode.secondary.http-address</name>
 <value>myhostname:50090 </value>
 </property>
 The secondary NameNode could not be started
 $ hadoop-daemon.sh start secondarynamenode
 starting secondarynamenode, logging to 
 /home/hadoop/hadoop-2.4.1/logs/hadoop-hadoop-secondarynamenode-xxx.out
 /home/hadoop/hadoop-2.4.1/bin/hdfs
 Exception in thread "main" java.lang.IllegalArgumentException: Does not 
 contain a valid host:port authority: myhostname:50090
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:196)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.getHttpAddress(SecondaryNameNode.java:203)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:214)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.init(SecondaryNameNode.java:192)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:651)
 We were really confused and misled by the log message: we thought about DNS 
 problems (changed to the IP address but no success) and network problems 
 (tried to test the connections, with no success...)
 It turned out that the setting is not trimmed, and the additional space 
 character at the end of the setting caused the problem... OMG!!!...
 Searching on the Internet, we find we are really not alone.  So many users 
 encountered similar trim problems! The following lists a few:
 http://solaimurugan.blogspot.com/2013/10/hadoop-multi-node-cluster-configuration.html
 http://stackoverflow.com/questions/11263664/error-while-starting-the-hadoop-using-strat-all-sh
 https://issues.apache.org/jira/browse/HDFS-2799
 https://issues.apache.org/jira/browse/HBASE-6973
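 For what it's worth, Hadoop's {{Configuration#getTrimmed}} already strips 
 such whitespace; a minimal sketch of a fix along those lines (hypothetical, 
 not the attached patch):
 {code}
 // getTrimmed() removes leading/trailing whitespace, so a value such as
 // "myhostname:50090 " no longer breaks NetUtils.createSocketAddr().
 String addr = conf.getTrimmed(
     DFSConfigKeys.DFS_NAMENODE_SECONDARY_HTTP_ADDRESS_KEY,
     DFSConfigKeys.DFS_NAMENODE_SECONDARY_HTTP_ADDRESS_DEFAULT);
 InetSocketAddress socketAddr = NetUtils.createSocketAddr(addr);
 {code}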



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7322) deprecate sbin/*.sh

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-7322:
---
Status: Patch Available  (was: Open)

 deprecate sbin/*.sh
 ---

 Key: HDFS-7322
 URL: https://issues.apache.org/jira/browse/HDFS-7322
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Allen Wittenauer
 Fix For: 3.0.0

 Attachments: HDFS-7322-00.patch


 The HDFS-related sbin commands (except for \*-dfs.sh) should be marked as 
 deprecated in trunk so that they may be removed from a future release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7748) Separate ECN flags from the Status in the DataTransferPipelineAck

2015-02-06 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-7748:


 Summary: Separate ECN flags from the Status in the 
DataTransferPipelineAck
 Key: HDFS-7748
 URL: https://issues.apache.org/jira/browse/HDFS-7748
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai


Per the discussions on HDFS-7270, old clients might fail to talk to a newer 
server when ECN is turned on. This jira proposes to move the ECN flags into a 
separate protobuf field to keep the ack compatible across both versions.
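
A minimal illustration of the compatibility rule at work (hypothetical field 
names, not the actual protobuf schema):
{code}
// With ECN packed into the existing reply status, an old reader can hit an
// enum value it does not know. As a separate optional field, the flags are
// simply skipped by readers that predate them. Status stands in for the
// existing data transfer status enum.
class PipelineAckSketch {
  final List<Status> replies;  // existing field, parsed by old and new clients
  final int ecnFlags;          // new optional field; old clients ignore it

  PipelineAckSketch(List<Status> replies, int ecnFlags) {
    this.replies = replies;
    this.ecnFlags = ecnFlags;
  }
}
{code}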



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7270) Add congestion signaling capability to DataNode write protocol

2015-02-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310146#comment-14310146
 ] 

Haohui Mai commented on HDFS-7270:
--

Filed HDFS-7748 to track the follow up changes.

 Add congestion signaling capability to DataNode write protocol
 --

 Key: HDFS-7270
 URL: https://issues.apache.org/jira/browse/HDFS-7270
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7270.000.patch, HDFS-7270.001.patch, 
 HDFS-7270.002.patch, HDFS-7270.003.patch, HDFS-7270.004.patch


 When a client writes to HDFS faster than the disk bandwidth of the DNs, it 
 saturates the disk bandwidth and renders the DNs unresponsive. The client 
 only backs off by aborting / recovering the pipeline, which leads to failed 
 writes and unnecessary pipeline recovery.
 This jira proposes to add explicit congestion control mechanisms to the 
 write pipeline. 
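
As a rough illustration of the idea (the threshold and names below are made 
up, not the patch's actual logic):
{code}
// Hypothetical sketch of congestion signaling in the write pipeline.
public class CongestionSketch {
  enum ECN { DISABLED, SUPPORTED, CONGESTED }

  // A DN under disk pressure marks its ack CONGESTED so the client can
  // throttle its send rate instead of aborting/recovering the pipeline.
  static ECN ackFlag(double diskUtilization) {
    return diskUtilization > 0.95 ? ECN.CONGESTED : ECN.SUPPORTED;
  }
}
{code}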



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7747) Add a truncate test with cached data

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)
Tsz Wo Nicholas Sze created HDFS-7747:
-

 Summary: Add a truncate test with cached data 
 Key: HDFS-7747
 URL: https://issues.apache.org/jira/browse/HDFS-7747
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Priority: Minor


Let's add a truncate test with cached data to verify that a new client won't 
read beyond the truncated length from the cached data.
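
A minimal sketch of such a test, assuming {{FileSystem#truncate}} from the 
truncate work and a second client ({{newClientFs}} is a made-up name); 
cluster and file setup are omitted:
{code}
// Hypothetical test sketch.
void verifyTruncateWithCachedData(DistributedFileSystem dfs,
    FileSystem newClientFs, Path path) throws IOException {
  final int newLength = 512;
  dfs.truncate(path, newLength);
  // Read through a new client so stale cached data would be exposed.
  try (FSDataInputStream in = newClientFs.open(path)) {
    byte[] buf = new byte[1024];
    int read = in.read(0, buf, 0, buf.length);  // positional read
    assert read <= newLength : "read past truncated length";
  }
}
{code}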



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-5324) Make Namespace implementation pluggable in the namenode

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-5324:
---
Status: Open  (was: Patch Available)

 Make Namespace implementation pluggable in the namenode
 ---

 Key: HDFS-5324
 URL: https://issues.apache.org/jira/browse/HDFS-5324
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.1.1-beta
 Environment: All
Reporter: Milind Bhandarkar
Assignee: Milind Bhandarkar
 Attachments: AbstractNamesystem.java, Checklist Of Changes.docx, 
 trunk_1544305_12-12-13.patch


 For the last couple of months, we have been working on making the Namespace
 implementation in the namenode pluggable. We have demonstrated that it can
 be done without major surgery on the namenode and does not have a noticeable
 performance impact. We would like to contribute it back to Apache HDFS.
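
For readers following along, a rough sketch of what a pluggable namespace 
boundary might look like; the real proposal is in the attached 
AbstractNamesystem.java, and the names below are illustrative only:
{code}
// Hypothetical sketch; not the attached design.
public interface NamespaceSketch {
  FileStatus getFileInfo(String path) throws IOException;
  boolean mkdir(String path) throws IOException;
  boolean delete(String path) throws IOException;
  // ... the remaining namespace operations FSNamesystem depends on,
  // so an alternative backing store can be swapped in.
}
{code}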



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7749) Add stripped block support in INodeFile

2015-02-06 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-7749:
---

 Summary: Add stripped block support in INodeFile
 Key: HDFS-7749
 URL: https://issues.apache.org/jira/browse/HDFS-7749
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao


This jira plans to add a new INodeFile feature to store the stripped block 
information when the INodeFile is erasure coded.
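
As a sketch of the INodeFile feature pattern this refers to (class and field 
names below are assumptions, not the committed design):
{code}
// Hypothetical sketch: INodeFile features are small optional attribute
// objects attached to the inode.
class StripedBlocksFeatureSketch implements INode.Feature {
  // Block groups of an erasure-coded file, kept apart from the
  // contiguous-block array used by ordinary replicated files.
  private BlockInfoStriped[] blockGroups;  // BlockInfoStriped is assumed

  BlockInfoStriped[] getBlockGroups() {
    return blockGroups;
  }
}
{code}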



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HDFS-7749) Erasure Coding: Add striped block support in INodeFile

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-7749:
---
Comment: was deleted

(was: stripped or striped?)

 Erasure Coding: Add striped block support in INodeFile
 --

 Key: HDFS-7749
 URL: https://issues.apache.org/jira/browse/HDFS-7749
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao

 This jira plans to add a new INodeFile feature to store the stripped block 
 information when the INodeFile is erasure coded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7749) Erasure Coding: Add striped block support in INodeFile

2015-02-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7749:

Summary: Erasure Coding: Add striped block support in INodeFile  (was: 
Erasure Coding: Add stripped block support in INodeFile)

 Erasure Coding: Add striped block support in INodeFile
 --

 Key: HDFS-7749
 URL: https://issues.apache.org/jira/browse/HDFS-7749
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao

 This jira plans to add a new INodeFile feature to store the stripped block 
 information when the INodeFile is erasure coded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310176#comment-14310176
 ] 

Daryn Sharp commented on HDFS-7743:
---

What is the rationale for renaming to {{BlockInfoContiguous}}?  It might make 
sense if {{BlockInfo}} became a common super-class for various impls like 
contiguous, sparse, etc.   Since the patch also renames all the users of the 
object, it seems like this was a rename for the sake of renaming?  If so, I'd 
request that we refrain, since I'm extremely close to posting a reference-free 
block triplets replacement and would like to avoid unnecessary merge issues.

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch, HDFS-7743.001.patch, 
 HDFS-7743.002.patch


 In the work of erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo to two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To ease the HDFS-EC branch syncing with trunk, this jira plans 
 to rename the current BlockInfo to BlockReplicationInfo in trunk. In the 
 meanwhile, we can also use this chance to do some minor code cleanup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7647) DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs

2015-02-06 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310191#comment-14310191
 ] 

Arpit Agarwal commented on HDFS-7647:
-

bq. This poses a larger problem in that the client may expect changes to the 
datanode state to be reflected in the output of LocatedBlock.getLocations(), 
but they won't be because we create a new DatanodeInfoWithStorage object during 
construction and return it in getLocations().
I think this is a test-specific problem. Clients will never get access to the 
NameNode object; instead they will have a replica constructed from the wire 
message. The test case directly calls FSNamesystem.getBlockLocations. The 
test should be fixed to not poll the cached object but to query a new one 
periodically.
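
For context, a sketch of the fix shape (illustrative, not the attached 
patch): sort an index array once, then permute the three parallel arrays 
together so DatanodeInfos, StorageIDs and StorageTypes stay matched.
{code}
// Hypothetical sketch only.
static void sortTogether(final DatanodeInfo[] nodes, String[] storageIDs,
    StorageType[] storageTypes, final Comparator<DatanodeInfo> cmp) {
  Integer[] idx = new Integer[nodes.length];
  for (int i = 0; i < idx.length; i++) idx[i] = i;
  Arrays.sort(idx, new Comparator<Integer>() {
    @Override public int compare(Integer a, Integer b) {
      return cmp.compare(nodes[a], nodes[b]);
    }
  });
  DatanodeInfo[] n = nodes.clone();
  String[] s = storageIDs.clone();
  StorageType[] t = storageTypes.clone();
  for (int i = 0; i < idx.length; i++) {
    nodes[i] = n[idx[i]];        // all three arrays are permuted by the
    storageIDs[i] = s[idx[i]];   // same index order, so entry i still
    storageTypes[i] = t[idx[i]]; // describes one (node, storage) pair
  }
}
{code}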

 DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs
 --

 Key: HDFS-7647
 URL: https://issues.apache.org/jira/browse/HDFS-7647
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Milan Desai
Assignee: Milan Desai
 Attachments: HDFS-7647-2.patch, HDFS-7647-3.patch, HDFS-7647-4.patch, 
 HDFS-7647.patch


 DatanodeManager.sortLocatedBlocks() sorts the array of DatanodeInfos inside 
 each LocatedBlock, but does not touch the array of StorageIDs and 
 StorageTypes. As a result, the DatanodeInfos and StorageIDs/StorageTypes are 
 mismatched. The method is called by FSNamesystem.getBlockLocations(), so the 
 client will not know which StorageID/Type corresponds to which DatanodeInfo.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2145) Add tests for FsShell -copyFromLocal/put without dst path

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-2145:
---
Status: Open  (was: Patch Available)

Cancelling, as patch no longer applies.

 Add tests for FsShell -copyFromLocal/put without dst path
 -

 Key: HDFS-2145
 URL: https://issues.apache.org/jira/browse/HDFS-2145
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: XieXianshan
 Attachments: HADOOP_2145.patch


 Add tests for HADOOP-7441.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7743:
--
Hadoop Flags: Reviewed

BlockInfoContiguous sounds good.

+1 on the new patch.

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch, HDFS-7743.001.patch, 
 HDFS-7743.002.patch


 In the work of erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo to two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To ease the HDFS-EC branch syncing with trunk, this jira plans 
 to rename the current BlockInfo to BlockReplicationInfo in trunk. In the 
 meanwhile, we can also use this chance to do some minor code cleanup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7714) Simultaneous restart of HA NameNodes and DataNode can cause DataNode to register successfully with only one NameNode.

2015-02-06 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310060#comment-14310060
 ] 

Chris Nauroth commented on HDFS-7714:
-

Thank you for taking this issue, Vinay.  I think catching {{EOFException}} is a 
fine approach.  Changing the {{isAlive}} logic has some other side effects, and 
I believe that's what caused the test failures in the last Jenkins run.
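
A minimal sketch of the approach being discussed; the method names are only 
illustrative of BPServiceActor, not the actual patch:
{code}
// Hypothetical sketch: treat EOFException during registration as
// transient (NN still starting up) and retry, rather than letting the
// BPServiceActor thread die and leave the DN half-registered.
while (shouldRun()) {
  try {
    register();   // registration RPC to this NameNode
    break;        // success: proceed to heartbeats and block reports
  } catch (EOFException e) {
    LOG.info("NameNode not ready yet, retrying registration", e);
    sleepAndLogInterrupts(1000, "registering with NN");
  }
}
{code}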

 Simultaneous restart of HA NameNodes and DataNode can cause DataNode to 
 register successfully with only one NameNode.
 -

 Key: HDFS-7714
 URL: https://issues.apache.org/jira/browse/HDFS-7714
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Chris Nauroth
Assignee: Vinayakumar B
 Attachments: HDFS-7714-001.patch


 In an HA deployment, DataNodes must register with both NameNodes and send 
 periodic heartbeats and block reports to both.  However, if NameNodes and 
 DataNodes are restarted simultaneously, then this can trigger a race 
 condition in registration.  The end result is that the {{BPServiceActor}} for 
 one NameNode terminates, but the {{BPServiceActor}} for the other NameNode 
 remains alive.  The DataNode process is then in a half-alive state where it 
 only heartbeats and sends block reports to one of the NameNodes.  This could 
 cause a loss of storage capacity after an HA failover.  The DataNode process 
 would have to be restarted to resolve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7743) Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo

2015-02-06 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7743:

Description: In the work of erasure coding (HDFS-7285), we plan to extend 
the class BlockInfo to two subclasses: BlockReplicationInfo and BlockGroupInfo 
(HDFS-7716). To ease the HDFS-EC branch syncing with trunk, this jira plans to 
rename the current BlockInfo to BlockReplicationInfo in trunk. In the 
meanwhile, we can also use this chance to do some minor code cleanup.  (was: In 
the work of erasure coding (HDFS-7285), we plan to extend the class BlockInfo 
to two subclasses: BlockReplicationInfo and BlockGroupInfo (HDFS-7716). To ease 
the HDFS-EC branch syncing with trunk, this jira plans to rename the current 
BlockInfo to BlockReplicationInfo in trunk.

In the meanwhile, we can also use this chance to do some minor code cleanup, 
e.g., removing unnecessary overridden {{hashCode}} and {{equals}} methods since 
they are identical to those in the super class {{Block}}.)
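
For reference, the kind of override the cleanup removes looks like this 
(sketch; the point is that the subclass adds nothing over {{Block}}):
{code}
// Redundant overrides of this shape can simply be deleted, since the
// inherited Block#equals / Block#hashCode behave identically.
@Override
public boolean equals(Object obj) {
  return super.equals(obj);
}

@Override
public int hashCode() {
  return super.hashCode();
}
{code}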

 Code cleanup of BlockInfo and rename BlockInfo to BlockReplicationInfo
 --

 Key: HDFS-7743
 URL: https://issues.apache.org/jira/browse/HDFS-7743
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
 Attachments: HDFS-7743.000.patch, HDFS-7743.001.patch, 
 HDFS-7743.002.patch


 In the work of erasure coding (HDFS-7285), we plan to extend the class 
 BlockInfo to two subclasses: BlockReplicationInfo and BlockGroupInfo 
 (HDFS-7716). To ease the HDFS-EC branch syncing with trunk, this jira plans 
 to rename the current BlockInfo to BlockReplicationInfo in trunk. In the 
 meanwhile, we can also use this chance to do some minor code cleanup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7341) Add initial snapshot support based on pipeline recovery

2015-02-06 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-7341:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: HDFS-3107)

 Add initial snapshot support based on pipeline recovery
 ---

 Key: HDFS-7341
 URL: https://issues.apache.org/jira/browse/HDFS-7341
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, namenode
Reporter: Colin Patrick McCabe
 Attachments: HDFS-3107_Nov3.patch, editsStored_Nov3, 
 editsStored_Nov3.xml


 Add initial truncate support based on pipeline recovery.  This iteration does 
 not support snapshots or rollback; that support will be added in the 
 HDFS-3107 branch by later subtasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2145) Add tests for FsShell -copyFromLocal/put without dst path

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-2145:
---
Status: Open  (was: Patch Available)

 Add tests for FsShell -copyFromLocal/put without dst path
 -

 Key: HDFS-2145
 URL: https://issues.apache.org/jira/browse/HDFS-2145
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: XieXianshan
 Attachments: HADOOP_2145.patch


 Add tests for HADOOP-7441.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2138) fix aop.xml to refer to the right hadoop-common.version variable

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-2138:
---
Status: Open  (was: Patch Available)

Patch no longer applies.

 fix aop.xml to refer to the right hadoop-common.version variable
 

 Key: HDFS-2138
 URL: https://issues.apache.org/jira/browse/HDFS-2138
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0
Reporter: Giridharan Kesavan
Assignee: Giridharan Kesavan
 Attachments: HDFS-2138-trunk.patch, HDFS-2138.PATCH


 aop.xml refers to the hadoop-common version through the project.version 
 variable; instead, the hadoop-common version should be referenced through 
 hadoop-common.version, set in the ivy/libraries.properties file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2306) NameNode web UI should show information about recent checkpoints

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-2306:
---
Status: Open  (was: Patch Available)

 NameNode web UI should show information about recent checkpoints
 

 Key: HDFS-2306
 URL: https://issues.apache.org/jira/browse/HDFS-2306
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 0.24.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Fix For: 0.24.0

 Attachments: checkpoint-history.1.png, checkpoint-history.png, 
 hdfs-2306.0.patch, hdfs-2306.1.patch


 It would be nice if the NN web UI showed the 2NN address, timestamp, number 
 of edits, etc. of the last few checkpoints.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-2306) NameNode web UI should show information about recent checkpoints

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-2306:
---
Status: Patch Available  (was: Open)

 NameNode web UI should show information about recent checkpoints
 

 Key: HDFS-2306
 URL: https://issues.apache.org/jira/browse/HDFS-2306
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 0.24.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Fix For: 0.24.0

 Attachments: checkpoint-history.1.png, checkpoint-history.png, 
 hdfs-2306.0.patch, hdfs-2306.1.patch


 It would be nice if the NN web UI showed the 2NN address, timestamp, number 
 of edits, etc. of the last few checkpoints.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6743) Put IP address into a new column on the new NN webUI

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-6743:
---
Status: Patch Available  (was: Open)

 Put IP address into a new column on the new NN webUI
 

 Key: HDFS-6743
 URL: https://issues.apache.org/jira/browse/HDFS-6743
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Ming Ma
Assignee: Siqi Li
 Attachments: HDFS-6743.v1.patch, HDFS-6743.v2.patch


 The new NN webUI combines hostname and IP into one column in the datanode 
 list. It would be more convenient for admins if the IP address were put in a 
 separate column, as in the legacy NN webUI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3107) HDFS truncate

2015-02-06 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-3107:
--
Attachment: HDFS_truncate.pdf

Attaching a new version of the design document.
You guys keep referring to the document attached to HDFS-7056, which I am 
really not in control of. It was uploaded by [~rguo], and, according to the 
jira comments, it reflects Pivotal's internal implementation of truncate. It 
is quite close to what we did, but not an exact match.
I think my design doc has all the necessary details for the truncate-snapshot 
integration, so I just removed the reference to Guo's doc, because it keeps 
causing confusion.

 HDFS truncate
 -

 Key: HDFS-3107
 URL: https://issues.apache.org/jira/browse/HDFS-3107
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Reporter: Lei Chang
Assignee: Plamen Jeliazkov
 Fix For: 2.7.0

 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
 HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
 HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
 HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
 HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml

   Original Estimate: 1,344h
  Remaining Estimate: 1,344h

 Systems with transaction support often need to undo changes made to the 
 underlying storage when a transaction is aborted. Currently HDFS does not 
 support truncate (a standard POSIX operation), which is the reverse operation 
 of append, and this makes upper-layer applications use ugly workarounds (such 
 as keeping track of the discarded byte range per file in a separate metadata 
 store, and periodically running a vacuum process to rewrite compacted files) 
 to overcome this limitation of HDFS.
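
To make the transaction-undo use case concrete, a small sketch against the 
new API ({{waitForRecovery}} is a made-up helper):
{code}
// Roll a file back to its pre-transaction length instead of rewriting it.
long committedLength = 1024L;  // offset recorded before the aborted txn
boolean done = fs.truncate(path, committedLength);
if (!done) {
  // Truncate returns false when the last block must be trimmed; block
  // recovery has to finish before the file can be reopened.
  waitForRecovery(fs, path);
}
{code}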



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7745) HDFS should have its own daemon command and not rely on the one in common

2015-02-06 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310010#comment-14310010
 ] 

Allen Wittenauer edited comment on HDFS-7745 at 2/6/15 10:00 PM:
-

My plan is to mark all of the daemon shell scripts in sbin as deprecated: 
HDFS-7322, YARN-2796, ... and it looks like I'm missing one for MR.

In any case, this would make the hdfs command itself the official way to 
launch daemons.  The dependencies on common's shell infrastructure 
(hadoop-functions, etc.) would remain.

With the commit of HADOOP-11485, there are very few HDFS-specific bits still 
remaining in common that are actually used.  (The big one is 
HADOOP_HDFS_HOME resolution, but that one pretty much has to stay there.)

The only thing that could really be moved would be the HDFS definitions in 
hadoop-env.sh to hdfs-env.sh (which is already supported by the code).


was (Author: aw):
My plan is to mark all of the daemon shell scripts in sbin as deprecated: 
HDFS-7322, YARN-2796, ... and looks like I'm missing one for MR.

In any case, this would make the hdfs command itself be the official way to 
launch daemons.  The dependencies on common's shell infrastructure would remain.

 HDFS should have its own daemon command  and not rely on the one in common
 --

 Key: HDFS-7745
 URL: https://issues.apache.org/jira/browse/HDFS-7745
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sanjay Radia

 HDFS should have its own daemon command and not rely on the one in common.  
 BTW, YARN split out its own daemon command during the project split. Note 
 that the hdfs command does have a --daemon flag, and hence the daemon script 
 is merely a wrapper. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7604) Track and display failed DataNode storage locations in NameNode.

2015-02-06 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310030#comment-14310030
 ] 

Xiaoyu Yao commented on HDFS-7604:
--

Thanks [~cnauroth] for working on this, which will be very useful for admins. 
The changes look good to me, with one comment about the UI:

The title of the new page, "Datanode Volumes", is a little bit confusing as it 
only shows failed volumes. A more accurate title could be "Failed Datanode 
Volumes".  An alternative is to keep "Datanode Volumes" as the title but 
display all the volumes with their status in a dedicated column (e.g., red for 
failed ones and green for healthy ones). This way, admins can have a complete 
view of all the volumes in the cluster and their status. However, displaying 
all the volumes in a large cluster with thousands of nodes/volumes could be 
slow, so the UI should give the option to filter what to display, e.g. only 
failed volumes or all volumes. 

 Track and display failed DataNode storage locations in NameNode.
 

 Key: HDFS-7604
 URL: https://issues.apache.org/jira/browse/HDFS-7604
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: HDFS-7604-screenshot-1.png, HDFS-7604-screenshot-2.png, 
 HDFS-7604-screenshot-3.png, HDFS-7604-screenshot-4.png, HDFS-7604.001.patch, 
 HDFS-7604.prototype.patch


 During heartbeats, the DataNode can report a list of its storage locations 
 that have been taken out of service due to failure (such as a bad disk or a 
 permissions problem).  The NameNode can track these failed storage locations 
 and then report them in JMX and the NameNode web UI.
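
A sketch of what such a heartbeat payload could carry; the field names below 
are assumptions, not the patch:
{code}
// Hypothetical failed-storage report piggybacked on the heartbeat.
class VolumeFailureSummarySketch {
  String[] failedStorageLocations;   // e.g. /data/3/dfs/dn
  long lastVolumeFailureDate;        // millis since epoch
  long estimatedCapacityLostTotal;   // bytes no longer available
}
{code}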



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7738) Add more negative tests for truncate

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7738:
--
Attachment: h7738_20150206b.patch

h7738_20150206b.patch: fixes TestHFlush.

 Add more negative tests for truncate
 

 Key: HDFS-7738
 URL: https://issues.apache.org/jira/browse/HDFS-7738
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7738_20150204.patch, h7738_20150205.patch, 
 h7738_20150205b.patch, h7738_20150206.patch, h7738_20150206b.patch


 The following are negative test cases for truncate.
 - new length > old length
 - truncating a directory
 - truncating a non-existing file
 - truncating a file without write permission
 - truncating a file opened for append
 - truncating a file in safemode
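
A sketch of the first case above (the exact exception type is an assumption; 
the test only asserts that the call must not succeed):
{code}
// Hypothetical negative-test sketch.
try {
  fs.truncate(path, oldLength + 1);  // new length > old length
  Assert.fail("truncate to a larger length should fail");
} catch (Exception e) {
  // expected: truncate cannot extend a file
}
{code}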



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7732) Fix the order of the parameters in DFSConfigKeys

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310050#comment-14310050
 ] 

Hadoop QA commented on HDFS-7732:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697087/HDFS-7732.patch
  against trunk revision 1425e3d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

  {color:red}-1 javac{color}.  The applied patch generated 1156 javac 
compiler warnings (more than the trunk's current 154 warnings).

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
48 warning messages.
See 
https://builds.apache.org/job/PreCommit-HDFS-Build/9462//artifact/patchprocess/diffJavadocWarnings.txt
 for details.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.crypto.TestHdfsCryptoStreams

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9462//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9462//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9462//console

This message is automatically generated.

 Fix the order of the parameters in DFSConfigKeys
 

 Key: HDFS-7732
 URL: https://issues.apache.org/jira/browse/HDFS-7732
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula
Priority: Trivial
 Fix For: 2.7.0

 Attachments: HDFS-7732.patch


 In DFSConfigKeys.java, there are some parameters between 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY}} and 
 {{DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT}}.
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY = 
 "dfs.client.read.shortcircuit.buffer.size";
   public static final String 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY = 
 "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY = 
 "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 
 1024 * 1024;
 {code}
 The order should be corrected as 
 {code}
   public static final String DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_KEY = 
 "dfs.client.read.shortcircuit.buffer.size";
   public static final int DFS_CLIENT_READ_SHORTCIRCUIT_BUFFER_SIZE_DEFAULT = 
 1024 * 1024;
   public static final String 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_KEY = 
 "dfs.client.read.shortcircuit.streams.cache.size";
   public static final int 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_SIZE_DEFAULT = 256;
   public static final String 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_KEY = 
 "dfs.client.read.shortcircuit.streams.cache.expiry.ms";
   public static final long 
 DFS_CLIENT_READ_SHORTCIRCUIT_STREAMS_CACHE_EXPIRY_MS_DEFAULT = 5 * 60 * 1000;
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-1348) Improve NameNode responsiveness while it is checking if datanode decommissions are complete

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-1348:
---
Status: Patch Available  (was: Open)

 Improve NameNode responsiveness while it is checking if datanode 
 decommissions are complete
 --

 Key: HDFS-1348
 URL: https://issues.apache.org/jira/browse/HDFS-1348
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Attachments: decomissionImp1.patch, decomissionImp2.patch, 
 decommission.patch, decommission1.patch


 NameNode normally is busy all the time. Its log is full of activities every 
 second. But once in a while, NameNode seems to pause for more than 10 
 seconds without doing anything, leaving a blank in its log even though no 
 garbage collection is happening.  All other requests to the NameNode are 
 blocked when this is happening.
 One culprit is the DecommissionManager. Its monitor holds the FSNamesystem 
 lock during the whole process of checking whether decommissioning DataNodes 
 have finished or not, during which it checks every block of up to a default 
 of 5 datanodes.
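
A sketch of the direction such an improvement could take ({{batches}} and 
{{checkDecommissionState}} are illustrative names, not the attached patches):
{code}
// Hypothetical sketch: scan decommissioning nodes in small batches and
// release the namesystem lock between batches, so other RPCs can run.
for (List<DatanodeDescriptor> batch : batches(decommissioningNodes, 5)) {
  namesystem.writeLock();
  try {
    for (DatanodeDescriptor node : batch) {
      checkDecommissionState(node);  // per-node block scan
    }
  } finally {
    namesystem.writeUnlock();
  }
}
{code}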



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-1497) Write pipeline sequence numbers should be sequential with no skips or duplicates

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-1497:
---
Status: Open  (was: Patch Available)

 Write pipeline sequence numbers should be sequential with no skips or 
 duplicates
 

 Key: HDFS-1497
 URL: https://issues.apache.org/jira/browse/HDFS-1497
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 0.22.0, 0.20-append
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hdfs-1497.txt, hdfs-1497.txt, hdfs-1497.txt, 
 hdfs-1497.txt, hdfs-1497.txt


 In HDFS-895 we discovered that multiple hflush() calls in a row without 
 intervening writes could cause a skip in sequence number. This doesn't seem 
 to have any direct consequences, but we should maintain and assert the 
 invariant that sequence numbers have no gaps or duplicates.
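
A sketch of the invariant to maintain and assert; the names are illustrative 
of the client's output stream, not the attached patch:
{code}
// Every queued packet, including heartbeat-less hflush packets, must use
// exactly lastQueuedSeqno + 1.
private long lastQueuedSeqno = -1;

private void queuePacket(long seqno) {
  assert seqno == lastQueuedSeqno + 1
      : "seqno gap or duplicate: " + seqno + " after " + lastQueuedSeqno;
  lastQueuedSeqno = seqno;
  // ... enqueue the packet for the pipeline ...
}
{code}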



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-1497) Write pipeline sequence numbers should be sequential with no skips or duplicates

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-1497:
---
Status: Patch Available  (was: Open)

 Write pipeline sequence numbers should be sequential with no skips or 
 duplicates
 

 Key: HDFS-1497
 URL: https://issues.apache.org/jira/browse/HDFS-1497
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 0.22.0, 0.20-append
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hdfs-1497.txt, hdfs-1497.txt, hdfs-1497.txt, 
 hdfs-1497.txt, hdfs-1497.txt


 In HDFS-895 we discovered that multiple hflush() calls in a row without 
 intervening writes could cause a skip in sequence number. This doesn't seem 
 to have any direct consequences, but we should maintain and assert the 
 invariant that sequence numbers have no gaps or duplicates.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7684) The host:port settings of dfs.namenode.secondary.http-address should be trimmed before use

2015-02-06 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310181#comment-14310181
 ] 

Akira AJISAKA commented on HDFS-7684:
-

Hi [~anu], thank you for creating the patch. I'm +1 for trimming the values.
bq. -1 release audit. The applied patch generated 1 release audit warnings.
Would you add a license header to TestMalformedURLs.java?
In addition, would you please change the indent to 2 spaces instead of 4 in 
TestMalformedURLs.java?
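
For reference, the usual shape of such a trim fix, assuming the patch follows 
the common pattern ({{Configuration#getTrimmed}} is the existing API for this):
{code}
// Read the address with getTrimmed() so stray whitespace in the XML
// value cannot corrupt the host:port string.
String addr = conf.getTrimmed(
    DFSConfigKeys.DFS_NAMENODE_SECONDARY_HTTP_ADDRESS_KEY,
    DFSConfigKeys.DFS_NAMENODE_SECONDARY_HTTP_ADDRESS_DEFAULT);
InetSocketAddress sockAddr = NetUtils.createSocketAddr(addr);
{code}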

 The host:port settings of dfs.namenode.secondary.http-address should be 
 trimmed before use
 --

 Key: HDFS-7684
 URL: https://issues.apache.org/jira/browse/HDFS-7684
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.1, 2.5.1
Reporter: Tianyin Xu
Assignee: Anu Engineer
 Attachments: HDFS.7684.001.patch


 With the following setting,
 <property>
   <name>dfs.namenode.secondary.http-address</name>
   <value>myhostname:50090 </value>
 </property>
 The secondary NameNode could not be started
 $ hadoop-daemon.sh start secondarynamenode
 starting secondarynamenode, logging to 
 /home/hadoop/hadoop-2.4.1/logs/hadoop-hadoop-secondarynamenode-xxx.out
 /home/hadoop/hadoop-2.4.1/bin/hdfs
 Exception in thread "main" java.lang.IllegalArgumentException: Does not 
 contain a valid host:port authority: myhostname:50090
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:196)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163)
   at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.getHttpAddress(SecondaryNameNode.java:203)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:214)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.init(SecondaryNameNode.java:192)
   at 
 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:651)
 We were really confused and misled by the log message: we thought about 
 DNS problems (changed to the IP address, but no success) and network 
 problems (tried to test the connections, with no success...).
 It turned out that the setting is not trimmed, and the additional space 
 character at the end of the setting caused the problem... OMG!!!
 Searching on the Internet, we found we are really not alone.  So many users 
 have encountered similar trim problems! The following lists a few:
 http://solaimurugan.blogspot.com/2013/10/hadoop-multi-node-cluster-configuration.html
 http://stackoverflow.com/questions/11263664/error-while-starting-the-hadoop-using-strat-all-sh
 https://issues.apache.org/jira/browse/HDFS-2799
 https://issues.apache.org/jira/browse/HBASE-6973



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-1348) Improve NameNode responsiveness while it is checking if datanode decommissions are complete

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310186#comment-14310186
 ] 

Hadoop QA commented on HDFS-1348:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12455885/decomissionImp2.patch
  against trunk revision da2fb2b.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9473//console

This message is automatically generated.

 Improve NameNode responsiveness while it is checking if datanode 
 decommissions are complete
 --

 Key: HDFS-1348
 URL: https://issues.apache.org/jira/browse/HDFS-1348
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Attachments: decomissionImp1.patch, decomissionImp2.patch, 
 decommission.patch, decommission1.patch


 NameNode normally is busy all the time. Its log is full of activities every 
 second. But once in a while, NameNode seems to pause for more than 10 
 seconds without doing anything, leaving a blank in its log even though no 
 garbage collection is happening.  All other requests to the NameNode are 
 blocked when this is happening.
 One culprit is the DecommissionManager. Its monitor holds the FSNamesystem 
 lock during the whole process of checking whether decommissioning DataNodes 
 have finished or not, during which it checks every block of up to a default 
 of 5 datanodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-2145) Add tests for FsShell -copyFromLocal/put without dst path

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310185#comment-14310185
 ] 

Hadoop QA commented on HDFS-2145:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12486301/HADOOP_2145.patch
  against trunk revision da2fb2b.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9472//console

This message is automatically generated.

 Add tests for FsShell -copyFromLocal/put without dst path
 -

 Key: HDFS-2145
 URL: https://issues.apache.org/jira/browse/HDFS-2145
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: XieXianshan
 Attachments: HADOOP_2145.patch


 Add tests for HADOOP-7441.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7720) Quota by Storage Type API, tools and ClientNameNode Protocol changes

2015-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310208#comment-14310208
 ] 

Hadoop QA commented on HDFS-7720:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12697118/HDFS-7720.4.patch
  against trunk revision c1957fe.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9465//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9465//console

This message is automatically generated.

 Quota by Storage Type API, tools and ClientNameNode Protocol changes
 

 Key: HDFS-7720
 URL: https://issues.apache.org/jira/browse/HDFS-7720
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7720.0.patch, HDFS-7720.1.patch, HDFS-7720.2.patch, 
 HDFS-7720.3.patch, HDFS-7720.4.patch


 Split the patch into small ones based on the feedback. This one covers the 
 HDFS API changes, tool changes and ClientNameNode protocol changes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7745) HDFS should have its own daemon command and not rely on the one in common

2015-02-06 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310010#comment-14310010
 ] 

Allen Wittenauer commented on HDFS-7745:


My plan is to mark all of the daemon shell scripts in sbin as deprecated: 
HDFS-7322, YARN-2796, ... and looks like I'm missing one for MR.

In any case, this would make the hdfs command itself be the official way to 
launch daemons.  The dependencies on common's shell infrastructure would remain.

 HDFS should have its own daemon command  and not rely on the one in common
 --

 Key: HDFS-7745
 URL: https://issues.apache.org/jira/browse/HDFS-7745
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sanjay Radia

 HDFS should have its own daemon command and not rely on the one in common.  
 BTW, YARN split out its own daemon command during the project split. Note 
 that the hdfs command does have a --daemon flag, and hence the daemon script 
 is merely a wrapper. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3107) HDFS truncate

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-3107:
--
Attachment: HDFS_truncate_semantics_Mar21.pdf

 HDFS truncate
 -

 Key: HDFS-3107
 URL: https://issues.apache.org/jira/browse/HDFS-3107
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Reporter: Lei Chang
Assignee: Plamen Jeliazkov
 Fix For: 2.7.0

 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
 HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
 HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
 HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
 HDFS_truncate_semantics_Mar21.pdf, HDFS_truncate_semantics_Mar21.pdf, 
 editsStored, editsStored.xml

   Original Estimate: 1,344h
  Remaining Estimate: 1,344h

 Systems with transaction support often need to undo changes made to the 
 underlying storage when a transaction is aborted. Currently HDFS does not 
 support truncate (a standard POSIX operation), which is the reverse operation 
 of append, and this makes upper-layer applications use ugly workarounds (such 
 as keeping track of the discarded byte range per file in a separate metadata 
 store, and periodically running a vacuum process to rewrite compacted files) 
 to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3107) HDFS truncate

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-3107:
--
Attachment: (was: HDFS_truncate_semantics_Mar21.pdf)

 HDFS truncate
 -

 Key: HDFS-3107
 URL: https://issues.apache.org/jira/browse/HDFS-3107
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: datanode, namenode
Reporter: Lei Chang
Assignee: Plamen Jeliazkov
 Fix For: 2.7.0

 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, 
 HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, 
 HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, 
 HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, 
 HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, 
 HDFS_truncate_semantics_Mar21.pdf, HDFS_truncate_semantics_Mar21.pdf, 
 editsStored, editsStored.xml

   Original Estimate: 1,344h
  Remaining Estimate: 1,344h

 Systems with transaction support often need to undo changes made to the 
 underlying storage when a transaction is aborted. Currently HDFS does not 
 support truncate (a standard POSIX operation), which is the reverse operation 
 of append, and this makes upper-layer applications use ugly workarounds (such 
 as keeping track of the discarded byte range per file in a separate metadata 
 store, and periodically running a vacuum process to rewrite compacted files) 
 to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-1300) Decommissioning nodes does not increase replication priority

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-1300:
---
Status: Open  (was: Patch Available)

Patch no longer applies.

 Decommissioning nodes does not increase replication priority
 

 Key: HDFS-1300
 URL: https://issues.apache.org/jira/browse/HDFS-1300
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0, 0.21.0, 0.20.2, 0.20.1, 0.20-append, 0.20.3
Reporter: Dmytro Molkov
Assignee: Dmytro Molkov
 Attachments: HDFS-1300.2.patch, HDFS-1300.3.patch, HDFS-1300.patch


 Currently, when you decommission a node, each block is only inserted into 
 neededReplications if it is not there yet. This causes a problem: a block 
 can sit in a low-priority queue when all of its replicas are on the nodes 
 being decommissioned.
 The common use case for decommissioning nodes for us is to proactively 
 exclude them before they go bad, so it would be great to get the blocks at 
 risk onto the live datanodes as quickly as possible.
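
A sketch of the fix idea; the iteration and the {{getExpectedReplicas}} helper 
are illustrative, while {{UnderReplicatedBlocks#update}} is the existing call 
that moves a block between priority queues:
{code}
// Hypothetical sketch: on decommission, re-evaluate every block's
// priority instead of skipping blocks already queued.
for (Block block : srcNode.getBlocks()) {
  NumberReplicas num = countNodes(block);
  neededReplications.update(block,
      num.liveReplicas(), num.decommissionedReplicas(),
      getExpectedReplicas(block),  // illustrative helper
      0, 0);                       // no delta in counts
}
{code}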



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-1348) Improve NameNode responsiveness while it is checking if datanode decommissions are complete

2015-02-06 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-1348:
---
Status: Open  (was: Patch Available)

 Improve NameNode responsiveness while it is checking if datanode 
 decommissions are complete
 --

 Key: HDFS-1348
 URL: https://issues.apache.org/jira/browse/HDFS-1348
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Attachments: decomissionImp1.patch, decomissionImp2.patch, 
 decommission.patch, decommission1.patch


 NameNode normally is busy all the time. Its log is full of activities every 
 second. But once in a while, NameNode seems to pause for more than 10 
 seconds without doing anything, leaving a blank in its log even though no 
 garbage collection is happening.  All other requests to the NameNode are 
 blocked when this is happening.
 One culprit is the DecommissionManager. Its monitor holds the FSNamesystem 
 lock during the whole process of checking whether decommissioning DataNodes 
 have finished or not, during which it checks every block of up to a default 
 of 5 datanodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

