[jira] [Updated] (HDFS-7610) Fix removal of dynamically added DN volumes

2015-08-13 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated HDFS-7610:
--
Labels: 2.6.1-candidate  (was: )

> Fix removal of dynamically added DN volumes
> ---
>
> Key: HDFS-7610
> URL: https://issues.apache.org/jira/browse/HDFS-7610
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>  Labels: 2.6.1-candidate
> Fix For: 2.7.0
>
> Attachments: HDFS-7610.000.patch, HDFS-7610.001.patch
>
>
> In the hot swap feature, {{FsDatasetImpl#addVolume}} uses the base volume dir 
> (e.g. "{{/foo/data0}}") instead of the volume's current dir 
> ("{{/foo/data/current}}") to construct {{FsVolumeImpl}}. As a result, the DataNode 
> cannot remove this newly added volume, because its 
> {{FsVolumeImpl#getBasePath}} returns "{{/foo}}".
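
A minimal sketch of why the wrong directory breaks removal, assuming the base path is derived as the parent of whatever directory the volume object was built from (which is what the description implies). {{BasePathSketch}} and {{basePathOf}} are illustrative names, not Hadoop's:

```java
import java.io.File;

final class BasePathSketch {
  // Assumption: the base path is the parent of the directory passed to the volume.
  static File basePathOf(File dirPassedToVolume) {
    return dirPassedToVolume.getParentFile();
  }

  public static void main(String[] args) {
    // Built from the volume's current dir: the base path is the volume dir itself.
    System.out.println(basePathOf(new File("/foo/data/current")));
    // Built from the base volume dir by mistake: the base path is one level too high,
    // so a later lookup by the real volume dir will not match.
    System.out.println(basePathOf(new File("/foo/data")));
  }
}
```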



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694855#comment-14694855
 ] 

Yongjun Zhang commented on HDFS-8828:
-

Hi [~yufeigu],

Thanks for the new rev 006  which tries to address the issue we discussed (to 
avoid re-copying an already copied dir/file which is moved to a newly created 
dir since last snapshot/distcp).

I have some more comments:

# Change {{Number of path in the copy list}} to {{Number of paths in the copy 
list}}
# Change
{code}
if (LOG.isDebugEnabled()) {
  LOG.debug("Path in the copy list: " + lastFileStatus.getPath().toUri().getPath());
}
{code}
to (add an idx and only print in usediff && debug mode):
{code}
if (options.shouldUseDiff() && LOG.isDebugEnabled()) {
  LOG.debug("Copy list entry " + idx + ": " + lastFileStatus.getPath().toUri().getPath());
}
++idx;
{code}
# Add some more explanation to the javadoc of {{static HashSet 
getExcludeList(Path dir, DiffInfo[] renameDiffs, Path prefix)}}, such as:
{code}
Given a newly created directory newDir in the snapshot diff, if a previously 
copied file/directory itemX is moved (renamed) to below newDir, itemX should be 
excluded so it will not be copied again. 
{code}
# The goal of this jira is to only copy modified/created files, all of which 
would have entries in the snapshot diff report, so why do we have to call 
{{traverseDirectory}} to recursively traverse everything in 
{{doBuildListingWithSnapshotDiff(..)}} in this mode? It sounds to me that we only 
need to look at each snapshot diff item and its direct children. ("mv ./x/y 
./p/q" would make two entries in the snapshot diff: ./x and ./y, so we do need to 
care about the first level children of a snapshot diff entry.) Right?
If so, to reuse the code in {{traverseDirectory}}, we can modify 
{{traverseDirectory}} to support a mode that only cares about the current 
source and its first level children, but not recursively. 
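
The exclusion rule proposed for the javadoc can be read as a simple filter over the rename entries. The following is only a sketch under that reading; {{Rename}} and {{excludeList}} are hypothetical names and plain strings stand in for {{Path}} and {{DiffInfo}}, so it does not reflect the patch's actual code:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ExcludeListSketch {
  /** Hypothetical stand-in for one rename entry of a snapshot diff. */
  static final class Rename {
    final String source;
    final String target;
    Rename(String source, String target) {
      this.source = source;
      this.target = target;
    }
  }

  /** Returns the rename targets that now live below the newly created directory. */
  static Set<String> excludeList(String newDir, List<Rename> renames) {
    String prefix = newDir.endsWith("/") ? newDir : newDir + "/";
    Set<String> excluded = new HashSet<>();
    for (Rename r : renames) {
      // An item moved below newDir was already copied under its old name.
      if (r.target.startsWith(prefix)) {
        excluded.add(r.target);
      }
    }
    return excluded;
  }

  public static void main(String[] args) {
    List<Rename> renames = new ArrayList<>();
    renames.add(new Rename("/x/y", "/p/q")); // moved into the new dir /p
    renames.add(new Rename("/a", "/b"));     // unrelated rename
    System.out.println(excludeList("/p", renames));
  }
}
```

Matching on {{newDir + "/"}} rather than on {{newDir}} alone keeps a sibling such as {{/pq}} from being excluded by accident.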

Thanks.



> Utilize Snapshot diff report to build copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp (30 
> hours for 1.6M files). We can leverage the snapshot diff report to build a file 
> copy list that includes only the files/dirs changed between two snapshots 
> (or a snapshot and a normal dir). It speeds up the process in two ways: 1. 
> less copy list building time. 2. fewer file copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory creation, 
> deletion, rename and modification between two snapshots or a snapshot and a 
> normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to 
> the default distcp. So it still relies on the default distcp to build the complete 
> list of files under the source dir. This patch only puts created and 
> modified files into the copy list based on the snapshot diff report. We can 
> minimize the number of files to copy. 





[jira] [Commented] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%

2015-08-13 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694879#comment-14694879
 ] 

Yi Liu commented on HDFS-8859:
--

Thanks [~szetszwo] for the review! I updated the patch to address your comments.
{quote}
How about calling it LightWeightResizableGSet?
{quote}
Agreed, I renamed it in the new patch.

{quote}
From your calculation, the patch improve each block replica object size about 
45%. The JIRA summary is misleading. It seems claiming that it improves the 
overall DataNode memory footprint by about 45%. For 10m replicas, the original 
overall map entry object size is ~900 MB and the new size is ~500MB. Is it 
correct?
{quote}
It's correct. Actually I did put {{ReplicaMap}} in the JIRA summary (yes, in 
{{()}} :)), considering that {{ReplicaMap}} is the major long-lived in-memory 
object of the DataNode. Of course, there are other aspects (most are transient: 
data read/write buffers, RPC buffers, etc.); I just highlighted the improvement.

{quote}
 Subclass can call super.put(..)
{quote}
Updated in the new patch. I had just used a new internal method. 

{quote}
There is a rewrite for LightWeightGSet.remove(..)
{quote}
I reverted it in the new patch and kept the original one. The original 
implementation has duplicate logic; we could share the same logic for all the 
{{if...else..}} branches.

{quote}
I think we need some long running tests to make sure the correctness. See 
TestGSet.runMultipleTestGSet()
{quote}
Agreed, updated in the new patch. 


The test failures of {{003}} happen because one place (BlockPoolSlice) adds a 
replicaInfo to replicaMap from a tmp replicaMap while the replicaInfo is still 
in the tmp one. We can remove it from the tmp one before adding it (for 
LightWeightGSet, an element is not allowed to exist in two gsets).  
In the {{002}} patch the failure doesn't exist; there we have a new 
implementation of {{SetIterator}} which is very similar to the logic in Java's 
HashMap and a bit different from the original one. Both are correct; the major 
difference is the time at which the next element is found. In the new patch, I 
keep the original one and make a few changes in BlockPoolSlice.  All tests pass 
locally with the new patch.
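
To illustrate why an element cannot sit in two gsets at once, here is a minimal sketch of an intrusive, resizable hash set in which each element carries its own key and its own chain link. It is not the patch's {{LightWeightResizableGSet}}; all names, the initial capacity and the resize policy are assumptions made for the example:

```java
final class IntrusiveSetSketch {
  /** Element that stores its own key and its own chain link. */
  static final class Replica {
    final long blockId;
    Replica next; // intrusive link: usable by at most one set at a time
    Replica(long blockId) { this.blockId = blockId; }
  }

  static final class ReplicaSet {
    private Replica[] buckets = new Replica[4];
    private int size;

    private static int index(long blockId, int length) {
      return (Long.hashCode(blockId) & 0x7fffffff) % length;
    }

    Replica get(long blockId) {
      for (Replica r = buckets[index(blockId, buckets.length)]; r != null; r = r.next) {
        if (r.blockId == blockId) return r;
      }
      return null;
    }

    /** Inserts the element, replacing and returning any element with the same key. */
    Replica put(Replica e) {
      Replica old = remove(e.blockId);
      if (size + 1 > buckets.length) resize(buckets.length * 2);
      int i = index(e.blockId, buckets.length);
      e.next = buckets[i]; // overwrites the link, which is why e must not be in another set
      buckets[i] = e;
      size++;
      return old;
    }

    Replica remove(long blockId) {
      int i = index(blockId, buckets.length);
      Replica prev = null;
      for (Replica r = buckets[i]; r != null; prev = r, r = r.next) {
        if (r.blockId == blockId) {
          if (prev == null) buckets[i] = r.next; else prev.next = r.next;
          r.next = null;
          size--;
          return r;
        }
      }
      return null;
    }

    /** Re-links every element into a larger bucket array. */
    private void resize(int newLength) {
      Replica[] old = buckets;
      buckets = new Replica[newLength];
      for (Replica head : old) {
        while (head != null) {
          Replica next = head.next;
          int i = index(head.blockId, newLength);
          head.next = buckets[i];
          buckets[i] = head;
          head = next;
        }
      }
    }

    int size() { return size; }
  }

  public static void main(String[] args) {
    ReplicaSet tmp = new ReplicaSet();
    ReplicaSet live = new ReplicaSet();
    tmp.put(new Replica(1L));
    // Remove from the temporary set first, then add to the live one.
    live.put(tmp.remove(1L));
    System.out.println(tmp.size() + " " + live.size());
  }
}
```

Because {{put}} rewrites the element's {{next}} field, adding an element that is still linked into another set would silently corrupt that other set's chain; removing it first avoids this.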

> Improve DataNode (ReplicaMap) memory footprint to save about 45%
> 
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch
>
>
> By using the following approach we can save about *45%* of the memory footprint for each 
> block replica in DataNode memory (this JIRA only talks about *ReplicaMap* in 
> DataNode). The details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block id of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each HashMap Entry 
> has an object overhead.  We can implement a lightweight Set which is similar 
> to {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed 
> size for the entries array, usually a big value; an example is 
> {{BlocksMap}}. This can avoid full GC since there is no need to resize).  Also we 
> should be able to get an Element through its key.
> The following is a comparison of the memory footprint if we implement a lightweight set 
> as described:
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%*  memory for each block replica 
> in DataNode.
> 
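
The arithmetic in the tables above can be checked with a few lines of Java. This is only a worked calculation using the byte counts quoted in the description (alignment ignored, as stated there); the class and method names are made up for the example. The ratio works out to 50/90, so the saving is roughly 44.4%, which the summary rounds to "about 45%", and 10 million replicas go from about 900 MB to about 500 MB:

```java
final class ReplicaFootprintSketch {
  // Per-replica byte counts taken from the description (memory alignment ignored).
  static final int SAVED_BYTES = 20 + 12 + 4 + 4 + 4; // key + entry overhead + two refs + hash
  static final int ADDED_BYTES = 4;                   // reference to the next element
  static final int REPLICA_BYTES = 46;                // one finalized replica

  static int bytesBefore() { return SAVED_BYTES + REPLICA_BYTES; }

  static int bytesAfter() { return ADDED_BYTES + REPLICA_BYTES; }

  /** Fraction of per-replica memory saved by the lightweight set. */
  static double savedFraction() {
    return 1.0 - (double) bytesAfter() / bytesBefore();
  }

  public static void main(String[] args) {
    long replicas = 10_000_000L;
    System.out.println("bytes per replica before: " + bytesBefore());
    System.out.println("bytes per replica after:  " + bytesAfter());
    System.out.println("saved fraction:           " + savedFraction());
    System.out.println("10M replicas, MB before:  " + replicas * bytesBefore() / 1_000_000);
    System.out.println("10M replicas, MB after:   " + replicas * bytesAfter() / 1_000_000);
  }
}
```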





[jira] [Updated] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%

2015-08-13 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-8859:
-
Attachment: HDFS-8859.004.patch

> Improve DataNode (ReplicaMap) memory footprint to save about 45%
> 
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch
>
>





[jira] [Updated] (HDFS-8891) HDFS concat should keep srcs order

2015-08-13 Thread Yong Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Zhang updated HDFS-8891:
-
Attachment: HDFS-8891.001.patch

First patch, please review

> HDFS concat should keep srcs order
> --
>
> Key: HDFS-8891
> URL: https://issues.apache.org/jira/browse/HDFS-8891
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yong Zhang
>Assignee: Yong Zhang
> Attachments: HDFS-8891.001.patch
>
>
> FSDirConcatOp.verifySrcFiles may change the order of the src files, but it should keep their 
> order as input.
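
One common way to keep the caller's order while still detecting repeated entries is an insertion-ordered set. The sketch below assumes the reordering comes from collecting the sources into an unordered set, which the description does not state; {{ConcatOrderSketch}} and {{inInputOrder}} are hypothetical names, and whether duplicates should be dropped or rejected is outside this sketch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ConcatOrderSketch {
  /** Returns the distinct sources in the order the caller supplied them. */
  static List<String> inInputOrder(List<String> srcs) {
    // LinkedHashSet iterates in insertion order; a plain HashSet gives no such guarantee.
    Set<String> seen = new LinkedHashSet<>(srcs);
    return new ArrayList<>(seen);
  }

  public static void main(String[] args) {
    // prints [/c, /a, /b]
    System.out.println(inInputOrder(Arrays.asList("/c", "/a", "/b", "/a")));
  }
}
```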





[jira] [Updated] (HDFS-8891) HDFS concat should keep srcs order

2015-08-13 Thread Yong Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Zhang updated HDFS-8891:
-
Status: Patch Available  (was: Open)

> HDFS concat should keep srcs order
> --
>
> Key: HDFS-8891
> URL: https://issues.apache.org/jira/browse/HDFS-8891
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yong Zhang
>Assignee: Yong Zhang
> Attachments: HDFS-8891.001.patch
>
>
> FSDirConcatOp.verifySrcFiles may change the order of the src files, but it should keep their 
> order as input.





[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695077#comment-14695077
 ] 

Hadoop QA commented on HDFS-8808:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 28s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 44s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 42s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 27s | The applied patch generated  1 new checkstyle issues (total was 574, now 574). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 22s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 32s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  3s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 177m  4s | Tests failed in hadoop-hdfs. |
| | | 221m 22s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestDFSClientRetries |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750240/HDFS-8808-03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 40f8151 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11986/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11986/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11986/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11986/console |


This message was automatically generated.

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting.
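
The requested behaviour amounts to choosing the throttle per kind of transfer. The following is a rough sketch of that decision only; the enum, the method and the convention that 0 means "no limit" are assumptions made for the example and do not describe the attached patches:

```java
final class ImageTransferThrottleSketch {
  /** Hypothetical kinds of fsimage transfer. */
  enum TransferKind { REGULAR_CHECKPOINT, BOOTSTRAP_STANDBY }

  /** Assumption of this sketch: 0 means the transfer is not throttled. */
  static final long UNTHROTTLED = 0L;

  /** Returns the bytes-per-second limit to apply to a transfer, or UNTHROTTLED. */
  static long effectiveBandwidth(long configuredBytesPerSec, TransferKind kind) {
    if (kind == TransferKind.BOOTSTRAP_STANDBY) {
      // Bootstrapping a standby ignores the limit meant for regular use.
      return UNTHROTTLED;
    }
    // Negative settings are treated as "no limit" here.
    return Math.max(configuredBytesPerSec, UNTHROTTLED);
  }

  public static void main(String[] args) {
    long configured = 1_048_576L; // 1 MB/s, an arbitrary example value
    System.out.println(effectiveBandwidth(configured, TransferKind.REGULAR_CHECKPOINT));
    System.out.println(effectiveBandwidth(configured, TransferKind.BOOTSTRAP_STANDBY));
  }
}
```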





[jira] [Created] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too

2015-08-13 Thread Ravikumar (JIRA)
Ravikumar created HDFS-8892:
---

 Summary: ShortCircuitCache.CacheCleaner can add Slot.isInvalid() 
check too
 Key: HDFS-8892
 URL: https://issues.apache.org/jira/browse/HDFS-8892
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 2.7.1
Reporter: Ravikumar
Priority: Minor


Currently the CacheCleaner thread checks only cache-expiry times. It would be 
nice if it also handled invalid slots in an extra pass over the evictable map…

for (ShortCircuitReplica replica : evictable.values()) {
  if (!replica.getSlot().isValid()) {
    purge(replica);
  }
}
//Existing code...
int numDemoted = demoteOldEvictableMmaped(curMs);
int numPurged = 0;
Long evictionTimeNs = Long.valueOf(0);
….
…..

Apps like HBase can tweak the expiry/staleness/cache-size params in DFS-Client, 
so that ShortCircuitReplica will never be closed except when Slot is declared 
invalid. 

I assume slot-invalidation will happen during block-invalidation/deletes 
{Primarily triggered by compaction/shard-takeover etc..}
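
One detail worth keeping in mind for such an extra pass: if purging removes the replica from the same map that is being iterated, the loop should collect candidates first and remove them afterwards. The sketch below models that with a plain {{TreeMap}}; the {{Replica}} class and {{purgeInvalid}} are hypothetical stand-ins, not the {{ShortCircuitCache}} internals:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class InvalidSlotPassSketch {
  /** Hypothetical stand-in for a cached replica and the validity of its slot. */
  static final class Replica {
    final String name;
    final boolean slotValid;
    Replica(String name, boolean slotValid) {
      this.name = name;
      this.slotValid = slotValid;
    }
  }

  /** Removes every replica whose slot is invalid and returns how many were purged. */
  static int purgeInvalid(TreeMap<Long, Replica> evictable) {
    // First pass: only collect keys, so the map is not modified while iterating.
    List<Long> toPurge = new ArrayList<>();
    for (Map.Entry<Long, Replica> e : evictable.entrySet()) {
      if (!e.getValue().slotValid) {
        toPurge.add(e.getKey());
      }
    }
    // Second pass: remove the collected entries.
    for (Long key : toPurge) {
      evictable.remove(key);
    }
    return toPurge.size();
  }

  public static void main(String[] args) {
    TreeMap<Long, Replica> evictable = new TreeMap<>();
    evictable.put(1L, new Replica("a", true));
    evictable.put(2L, new Replica("b", false));
    evictable.put(3L, new Replica("c", false));
    System.out.println(purgeInvalid(evictable) + " purged, " + evictable.size() + " left");
  }
}
```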





[jira] [Assigned] (HDFS-8892) ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too

2015-08-13 Thread kanaka kumar avvaru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kanaka kumar avvaru reassigned HDFS-8892:
-

Assignee: kanaka kumar avvaru

> ShortCircuitCache.CacheCleaner can add Slot.isInvalid() check too
> -
>
> Key: HDFS-8892
> URL: https://issues.apache.org/jira/browse/HDFS-8892
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.7.1
>Reporter: Ravikumar
>Assignee: kanaka kumar avvaru
>Priority: Minor
>
> Currently the CacheCleaner thread checks only cache-expiry times. It would be 
> nice if it also handled invalid slots in an extra pass over the evictable map…
> for (ShortCircuitReplica replica : evictable.values()) {
>   if (!replica.getSlot().isValid()) {
>     purge(replica);
>   }
> }
> //Existing code...
> int numDemoted = demoteOldEvictableMmaped(curMs);
> int numPurged = 0;
> Long evictionTimeNs = Long.valueOf(0);
> ….
> …..
> Apps like HBase can tweak the expiry/staleness/cache-size params in 
> DFS-Client, so that ShortCircuitReplica will never be closed except when Slot 
> is declared invalid. 
> I assume slot-invalidation will happen during block-invalidation/deletes 
> {Primarily triggered by compaction/shard-takeover etc..}





[jira] [Updated] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-08-13 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-8859:
-
Summary: Improve DataNode ReplicaMap memory footprint to save about 45%  
(was: Improve DataNode (ReplicaMap) memory footprint to save about 45%)

> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch
>
>





[jira] [Comment Edited] (HDFS-8859) Improve DataNode (ReplicaMap) memory footprint to save about 45%

2015-08-13 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694879#comment-14694879
 ] 

Yi Liu edited comment on HDFS-8859 at 8/13/15 12:02 PM:


Thanks [~szetszwo] for the review! I updated the patch to address your comments.
{quote}
How about calling it LightWeightResizableGSet?
{quote}
Agreed, I renamed it in the new patch.

{quote}
From your calculation, the patch improve each block replica object size about 
45%. The JIRA summary is misleading. It seems claiming that it improves the 
overall DataNode memory footprint by about 45%. For 10m replicas, the original 
overall map entry object size is ~900 MB and the new size is ~500MB. Is it 
correct?
{quote}
It's correct. I did put {{ReplicaMap}} in the JIRA summary (yes, in {{()}} :)), 
considering that {{ReplicaMap}} is the major long-lived in-memory object of the 
DataNode and could be large. Of course, there are other aspects (many are 
transient: data read/write buffers, RPC buffers, etc.); I just highlighted the 
improvement.  
Let me remove the {{()}}.

{quote}
 Subclass can call super.put(..)
{quote}
Updated in the new patch. I had just used a new internal method. 

{quote}
There is a rewrite for LightWeightGSet.remove(..)
{quote}
I reverted it in the new patch and kept the original one. The original 
implementation has duplicate logic; we could share the same logic for all the 
{{if...else..}} branches.

{quote}
I think we need some long running tests to make sure the correctness. See 
TestGSet.runMultipleTestGSet()
{quote}
Agreed, updated in the new patch. 


The test failures of {{003}} happen because one place (BlockPoolSlice) adds a 
replicaInfo to replicaMap from a tmp replicaMap while the replicaInfo is still 
in the tmp one. We can remove it from the tmp one before adding it (for 
LightWeightGSet, an element is not allowed to exist in two gsets).  
In the {{002}} patch the failure didn't exist; there we had a new 
implementation of {{SetIterator}} which was very similar to the logic in Java's 
HashMap and a bit different from the original one. Both are correct; the major 
difference is the time at which the next element is found. In the new patch, I 
keep the original one and make a few changes in BlockPoolSlice.  All tests pass 
locally with the new patch.



> Improve DataNode (ReplicaMap) memory footprint to save about 45%
> 
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch
>
>
> By using followi

[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695116#comment-14695116
 ] 

Hudson commented on HDFS-8879:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #286 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/286/])
HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode 
restart. Contributed by Xiaoyu Yao. (xyao: rev 
3e715a4f4c46bcd8b3054cb0566e526c46bd5d66)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java


> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit tests 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
>  failed to detect this because they were using an obsolete
> FsDirectory instance. Once the highlighted line below is added, the issue can be 
> reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}
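
The reason a stale handle hides the problem can be shown with a small model. Everything in it ({{Cluster}}, {{Directory}}, {{restartLosingUsage}}) is a hypothetical stand-in for the mini cluster and its directory object, used only to show why the test must re-fetch the handle after a restart:

```java
import java.util.HashMap;
import java.util.Map;

final class StaleHandleSketch {
  /** Hypothetical stand-in for the in-memory directory state. */
  static final class Directory {
    final Map<String, Long> usage = new HashMap<>();
  }

  /** Hypothetical stand-in for a test cluster whose restart rebuilds its state. */
  static final class Cluster {
    private Directory directory = new Directory();

    Directory getDirectory() { return directory; }

    /** Simulates a restart that fails to restore usage counters (the bug to detect). */
    void restartLosingUsage() {
      directory = new Directory(); // fresh instance, nothing reloaded
    }
  }

  public static void main(String[] args) {
    Cluster cluster = new Cluster();
    Directory dir = cluster.getDirectory();
    dir.usage.put("/test", 1024L);

    cluster.restartLosingUsage();

    // Stale handle: still shows the pre-restart value, so the bug goes unnoticed.
    System.out.println("stale: " + dir.usage.get("/test"));
    // Re-fetched handle: shows that the usage was not restored.
    dir = cluster.getDirectory();
    System.out.println("fresh: " + dir.usage.get("/test"));
  }
}
```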





[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695119#comment-14695119
 ] 

Hudson commented on HDFS-8622:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #286 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/286/])
HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. 
Contributed by Jagadesh Kiran N. (aajisaka: rev 
40f815131e822f5b7a8e6a6827f4b85b31220c43)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java


> Implement GETCONTENTSUMMARY operation for WebImageViewer
> 
>
> Key: HDFS-8622
> URL: https://issues.apache.org/jira/browse/HDFS-8622
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jagadesh Kiran N
>Assignee: Jagadesh Kiran N
> Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, 
> HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, 
> HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, 
> HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch
>
>
>  It would be better for administrators if {code} GETCONTENTSUMMARY {code} were 
> supported.





[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695123#comment-14695123
 ] 

Hudson commented on HDFS-8879:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1016 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1016/])
HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode 
restart. Contributed by Xiaoyu Yao. (xyao: rev 
3e715a4f4c46bcd8b3054cb0566e526c46bd5d66)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java


> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit tests 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
>  failed to detect this because they were using an obsolete
> FsDirectory instance. Once the highlighted line below is added, the issue can be 
> reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}





[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695126#comment-14695126
 ] 

Hudson commented on HDFS-8622:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1016 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1016/])
HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. 
Contributed by Jagadesh Kiran N. (aajisaka: rev 
40f815131e822f5b7a8e6a6827f4b85b31220c43)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java


> Implement GETCONTENTSUMMARY operation for WebImageViewer
> 
>
> Key: HDFS-8622
> URL: https://issues.apache.org/jira/browse/HDFS-8622
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jagadesh Kiran N
>Assignee: Jagadesh Kiran N
> Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, 
> HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, 
> HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, 
> HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch
>
>
> It would be better for administrators if {code} GETCONTENTSUMMARY {code} is 
> supported.





[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695178#comment-14695178
 ] 

Hadoop QA commented on HDFS-8859:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m 21s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 52s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 51s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 50s | The applied patch generated  6 new checkstyle issues (total was 12, now 16). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 2  line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  22m 33s | Tests failed in hadoop-common. |
| {color:red}-1{color} | hdfs tests |  76m 49s | Tests failed in hadoop-hdfs. |
| | | 145m 35s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.ha.TestZKFailoverController |
|   | hadoop.net.TestNetUtils |
|   | hadoop.hdfs.TestReplication |
|   | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.TestDatanodeRegistration |
|   | hadoop.hdfs.tools.TestDebugAdmin |
|   | hadoop.hdfs.TestSetrepIncreasing |
|   | hadoop.hdfs.TestDatanodeReport |
|   | hadoop.hdfs.TestDFSShellGenericOptions |
|   | hadoop.hdfs.TestParallelRead |
|   | hadoop.hdfs.tools.TestStoragePolicyCommands |
|   | hadoop.hdfs.TestDFSRemove |
|   | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.web.TestWebHdfsTokens |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.TestPersistBlocks |
|   | hadoop.hdfs.TestParallelShortCircuitReadNoChecksum |
|   | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.hdfs.TestQuota |
|   | hadoop.hdfs.TestDFSClientFailover |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForAcl |
|   | hadoop.hdfs.tools.TestDFSAdmin |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
|   | hadoop.hdfs.web.TestWebHDFS |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.TestFileLengthOnClusterRestart |
|   | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForContentSummary |
|   | hadoop.hdfs.TestFSOutputSummer |
|   | hadoop.hdfs.TestEncryptionZonesWithHA |
|   | hadoop.hdfs.TestBlockReaderFactory |
|   | hadoop.hdfs.TestDFSFinalize |
|   | hadoop.hdfs.TestDisableConnCache |
|   | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerForXAttr |
|   | hadoop.hdfs.web.TestHttpsFileSystem |
|   | hadoop.hdfs.web.TestWebHdfsWithAuthenticationFilter |
|   | hadoop.hdfs.web.TestWebHDFSAcl |
|   | hadoop.hdfs.TestHDFSTrash |
|   | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.TestDataTransferKeepalive |
|   | hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer |
|   | hadoop.hdfs.web.TestWebHDFSForHA |
|   | hadoop.hdfs.TestBlockMissingException |
|   | hadoop.hdfs.TestPipelines |
|   | hadoop.hdfs.TestRenameWhileOpen |
|   | hadoop.hdfs.TestFileCreationClient |
|   | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.TestFileAppend3 |
|   | hadoop.hdfs.TestBalancerBandwidth |
|   | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | hadoop.hdfs.TestSeekBug |
|   | hadoop.hdfs.TestParallelShortCircuitReadUnCached |
|   | hadoop.hdfs.TestBlockReaderLocal |
|   | hadoop.hdfs.TestListFilesInFileContext |
|   | hadoop.hdfs.web.TestWebHDFSXAttr |
|   | hadoop.hdfs.TestFileStatus |
|   | hadoop.hdfs.web.TestFSMainOperationsWebHdfs |
| Timed out tests | org.apache.hadoop.hdfs.TestFileCreation |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750254/HDFS-8859.004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 53bef9c |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11987/artifact/patchprocess/diffcheckstylehadoop-common.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11987/artifact/patchprocess/whitespace.txt |
| hadoop-common test log | 
https://build

[jira] [Created] (HDFS-8893) DNs with failed volumes stop serving during rolling upgrade

2015-08-13 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created HDFS-8893:


 Summary: DNs with failed volumes stop serving during rolling 
upgrade
 Key: HDFS-8893
 URL: https://issues.apache.org/jira/browse/HDFS-8893
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Rushabh S Shah
Priority: Critical


When a rolling upgrade starts, all DNs try to write a rolling_upgrade marker to 
each of their volumes. If one of the volumes is bad, this will fail. When this 
failure happens, the DN does not update the key it received from the NN.
Unfortunately, we had one failed volume on each of the 3 datanodes that held a 
replica.

Keys expire after 20 hours so at about 20 hours into the rolling upgrade, the 
DNs with failed volumes will stop serving clients.

Here is the stack trace on the datanode side:
{noformat}
2015-08-11 07:32:28,827 [DataNode: heartbeating to 8020] WARN datanode.DataNode: IOException in offerService
java.io.IOException: Read-only file system
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:947)
at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.setRollingUpgradeMarkers(BlockPoolSliceStorage.java:721)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.setRollingUpgradeMarker(DataStorage.java:173)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.setRollingUpgradeMarker(FsDatasetImpl.java:2357)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.signalRollingUpgrade(BPOfferService.java:480)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.handleRollingUpgradeStatus(BPServiceActor.java:626)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:677)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:833)
at java.lang.Thread.run(Thread.java:722)

{noformat}
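
To illustrate the failure mode in the report above: the trace shows an IOException from a single marker write surfacing in offerService. The sketch below is only an illustration of handling each volume independently, not the HDFS code and not the fix for this issue. The class MarkerWriter, the method writeMarkers and the marker file name passed in are all hypothetical. It attempts the marker on every volume directory and returns the directories that failed instead of throwing, so a caller could carry on with its remaining work.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of HDFS.
class MarkerWriter {

    // Tries to create a marker file in every volume directory.
    // Returns the directories where creation failed, rather than
    // letting one bad volume abort the whole pass.
    static List<File> writeMarkers(List<File> volumeDirs, String markerName) {
        List<File> failed = new ArrayList<>();
        for (File dir : volumeDirs) {
            File marker = new File(dir, markerName);
            try {
                // Returns false if the marker already exists, which is fine here.
                marker.createNewFile();
            } catch (IOException e) {
                // For example a read-only or missing volume: remember it, keep going.
                failed.add(dir);
            }
        }
        return failed;
    }
}
```

A caller could log the returned list and treat those volumes as failed while still completing the rest of the heartbeat handling.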






[jira] [Updated] (HDFS-8893) DNs with failed volumes stop serving during rolling upgrade

2015-08-13 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-8893:
-
Assignee: Daryn Sharp

> DNs with failed volumes stop serving during rolling upgrade
> ---
>
> Key: HDFS-8893
> URL: https://issues.apache.org/jira/browse/HDFS-8893
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Daryn Sharp
>Priority: Critical
>
> When a rolling upgrade starts, all DNs try to write a rolling_upgrade marker 
> to each of their volumes. If one of the volumes is bad, this will fail. When 
> this failure happens, the DN does not update the key it received from the NN.
> Unfortunately, we had one failed volume on each of the 3 datanodes that held 
> a replica.
> Keys expire after 20 hours so at about 20 hours into the rolling upgrade, the 
> DNs with failed volumes will stop serving clients.
> Here is the stack trace on the datanode side:
> {noformat}
> 2015-08-11 07:32:28,827 [DataNode: heartbeating to 8020] WARN datanode.DataNode: IOException in offerService
> java.io.IOException: Read-only file system
> at java.io.UnixFileSystem.createFileExclusively(Native Method)
> at java.io.File.createNewFile(File.java:947)
> at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.setRollingUpgradeMarkers(BlockPoolSliceStorage.java:721)
> at org.apache.hadoop.hdfs.server.datanode.DataStorage.setRollingUpgradeMarker(DataStorage.java:173)
> at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.setRollingUpgradeMarker(FsDatasetImpl.java:2357)
> at org.apache.hadoop.hdfs.server.datanode.BPOfferService.signalRollingUpgrade(BPOfferService.java:480)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.handleRollingUpgradeStatus(BPServiceActor.java:626)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:677)
> at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:833)
> at java.lang.Thread.run(Thread.java:722)
> {noformat}





[jira] [Commented] (HDFS-8891) HDFS concat should keep srcs order

2015-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695320#comment-14695320
 ] 

Hadoop QA commented on HDFS-8891:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 16s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 39s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 42s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 21s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 2  line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 22s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  3s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 175m 19s | Tests failed in hadoop-hdfs. |
| | | 219m 12s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750259/HDFS-8891.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 53bef9c |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11988/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11988/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11988/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11988/console |


This message was automatically generated.

> HDFS concat should keep srcs order
> --
>
> Key: HDFS-8891
> URL: https://issues.apache.org/jira/browse/HDFS-8891
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yong Zhang
>Assignee: Yong Zhang
> Attachments: HDFS-8891.001.patch
>
>
> FSDirConcatOp.verifySrcFiles may change the order of the src files, but it 
> should keep their order as input.
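
As a generic illustration of the ordering concern in the description above (not the HDFS code, and making no claim about how verifySrcFiles is actually written): collecting items into a hash-based set does not guarantee iteration in input order, whereas java.util.LinkedHashSet iterates in insertion order. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical example; the paths are made up for illustration.
class OrderedSources {

    // Removes duplicate sources while keeping first-seen (input) order.
    static List<String> dedupeKeepingOrder(List<String> srcs) {
        Set<String> seen = new LinkedHashSet<>(srcs); // iterates in insertion order
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> srcs =
            Arrays.asList("/d/part-3", "/d/part-1", "/d/part-2", "/d/part-1");
        // Prints [/d/part-3, /d/part-1, /d/part-2]
        System.out.println(dedupeKeepingOrder(srcs));
    }
}
```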





[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695387#comment-14695387
 ] 

Hudson commented on HDFS-8879:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2213 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2213/])
HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode 
restart. Contributed by Xiaoyu Yao. (xyao: rev 
3e715a4f4c46bcd8b3054cb0566e526c46bd5d66)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit tests 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
> failed to detect this because they were using an obsolete
> FSDirectory instance. Once the highlighted line below is added, the issue can 
> be reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}





[jira] [Updated] (HDFS-7926) NameNode implementation of ClientProtocol.truncate(..) is not idempotent

2015-08-13 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated HDFS-7926:
--
Labels:   (was: 2.6.1-candidate)

Removing the 2.6.1-candidate label as truncate is not a feature in 2.6.

> NameNode implementation of ClientProtocol.truncate(..) is not idempotent
> 
>
> Key: HDFS-7926
> URL: https://issues.apache.org/jira/browse/HDFS-7926
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.7.0
>
> Attachments: h7926_20150313.patch, h7926_20150313b.patch
>
>
> If dfsclient drops the first response of a truncate RPC call, the retry by 
> retry cache will fail with "DFSClient ... is already the current lease 
> holder".  The truncate RPC is annotated as @Idempotent in ClientProtocol but 
> the NameNode implementation is not.





[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695390#comment-14695390
 ] 

Hudson commented on HDFS-8622:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2213 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2213/])
HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. 
Contributed by Jagadesh Kiran N. (aajisaka: rev 
40f815131e822f5b7a8e6a6827f4b85b31220c43)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java


> Implement GETCONTENTSUMMARY operation for WebImageViewer
> 
>
> Key: HDFS-8622
> URL: https://issues.apache.org/jira/browse/HDFS-8622
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jagadesh Kiran N
>Assignee: Jagadesh Kiran N
> Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, 
> HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, 
> HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, 
> HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch
>
>
> It would be better for administrators if {code} GETCONTENTSUMMARY {code} is 
> supported.





[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695454#comment-14695454
 ] 

Hudson commented on HDFS-8879:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #275 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/275/])
HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode 
restart. Contributed by Xiaoyu Yao. (xyao: rev 
3e715a4f4c46bcd8b3054cb0566e526c46bd5d66)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java


> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit tests 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
> failed to detect this because they were using an obsolete
> FSDirectory instance. Once the highlighted line below is added, the issue can 
> be reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}





[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695457#comment-14695457
 ] 

Hudson commented on HDFS-8622:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #275 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/275/])
HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. 
Contributed by Jagadesh Kiran N. (aajisaka: rev 
40f815131e822f5b7a8e6a6827f4b85b31220c43)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java


> Implement GETCONTENTSUMMARY operation for WebImageViewer
> 
>
> Key: HDFS-8622
> URL: https://issues.apache.org/jira/browse/HDFS-8622
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jagadesh Kiran N
>Assignee: Jagadesh Kiran N
> Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, 
> HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, 
> HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, 
> HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch
>
>
> It would be better for administrators if {code} GETCONTENTSUMMARY {code} is 
> supported.





[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695467#comment-14695467
 ] 

Yufei Gu commented on HDFS-8828:


Hi [~yzhangal],

Thank you for the detailed review.

For 3, we do need to traverse recursively because a created directory item in a 
snapshot diff report could have multiple levels of subdirectories.
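
The point about multiple levels can be sketched as follows. This is a minimal illustration against the local file system using java.io.File, not the distcp code; CreatedDirWalker and collectAll are hypothetical names. A directory treated as newly created is walked recursively so that every nested subdirectory and file ends up in the list, however deep it sits.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration using the local file system, not HDFS.
class CreatedDirWalker {

    // Returns the path of root followed by the paths of everything under it.
    static List<String> collectAll(File root) {
        List<String> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private static void walk(File node, List<String> out) {
        out.add(node.getPath());
        File[] children = node.listFiles(); // null when node is not a directory
        if (children == null) {
            return;
        }
        Arrays.sort(children); // deterministic order for the example
        for (File child : children) {
            walk(child, out); // descend into each level
        }
    }
}
```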

> Utilize Snapshot diff report to build copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp (30 
> hours for 1.6M files). We can leverage the snapshot diff report to build a file 
> copy list that includes only the files/dirs changed between two snapshots 
> (or a snapshot and a normal dir). It speeds up the process in two ways: 1. 
> less copy list building time. 2. fewer file copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory 
> creation, deletion, rename and modification between two snapshots or a 
> snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, 
> then falls back to the default distcp. So it still relies on the default 
> distcp to build the complete list of files under the source dir. This patch 
> only puts created and modified files into the copy list based on the snapshot 
> diff report. We can minimize the number of files to copy.
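
The idea in the description above, keeping only created and modified entries for the copy list, can be sketched roughly as below. The types DiffType and DiffEntry and the method buildCopyList are hypothetical stand-ins invented for this example; they are not the HDFS snapshot diff API and not the patch itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a snapshot diff report, for illustration only.
class DiffCopyList {

    enum DiffType { CREATE, MODIFY, DELETE, RENAME }

    static final class DiffEntry {
        final DiffType type;
        final String path;

        DiffEntry(DiffType type, String path) {
            this.type = type;
            this.path = path;
        }
    }

    // Keeps only created and modified paths; deletions and renames are
    // assumed to be handled by a separate synchronization step.
    static List<String> buildCopyList(List<DiffEntry> report) {
        List<String> copyList = new ArrayList<>();
        for (DiffEntry e : report) {
            if (e.type == DiffType.CREATE || e.type == DiffType.MODIFY) {
                copyList.add(e.path);
            }
        }
        return copyList;
    }
}
```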





[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695479#comment-14695479
 ] 

Hudson commented on HDFS-8879:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #283 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/283/])
HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode 
restart. Contributed by Xiaoyu Yao. (xyao: rev 
3e715a4f4c46bcd8b3054cb0566e526c46bd5d66)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit tests 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
> failed to detect this because they were using an obsolete
> FSDirectory instance. Once the highlighted line below is added, the issue can 
> be reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}





[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695482#comment-14695482
 ] 

Hudson commented on HDFS-8622:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #283 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/283/])
HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. 
Contributed by Jagadesh Kiran N. (aajisaka: rev 
40f815131e822f5b7a8e6a6827f4b85b31220c43)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java


> Implement GETCONTENTSUMMARY operation for WebImageViewer
> 
>
> Key: HDFS-8622
> URL: https://issues.apache.org/jira/browse/HDFS-8622
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jagadesh Kiran N
>Assignee: Jagadesh Kiran N
> Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, 
> HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, 
> HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, 
> HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch
>
>
> It would be better for administrators if {code} GETCONTENTSUMMARY {code} is 
> supported.





[jira] [Commented] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695521#comment-14695521
 ] 

Hudson commented on HDFS-8622:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2232 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2232/])
HDFS-8622. Implement GETCONTENTSUMMARY operation for WebImageViewer. 
Contributed by Jagadesh Kiran N. (aajisaka: rev 
40f815131e822f5b7a8e6a6827f4b85b31220c43)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewerForContentSummary.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageHandler.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsImageViewer.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/FSImageLoader.java


> Implement GETCONTENTSUMMARY operation for WebImageViewer
> 
>
> Key: HDFS-8622
> URL: https://issues.apache.org/jira/browse/HDFS-8622
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jagadesh Kiran N
>Assignee: Jagadesh Kiran N
> Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, 
> HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, 
> HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, 
> HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch
>
>
> It would be better for administrators if {code} GETCONTENTSUMMARY {code} is 
> supported.





[jira] [Commented] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695518#comment-14695518
 ] 

Hudson commented on HDFS-8879:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2232 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2232/])
HDFS-8879. Quota by storage type usage incorrectly initialized upon namenode 
restart. Contributed by Xiaoyu Yao. (xyao: rev 
3e715a4f4c46bcd8b3054cb0566e526c46bd5d66)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestQuotaByStorageType.java


> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Fix For: 2.8.0
>
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit tests 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
> failed to detect this because they were using an obsolete
> FSDirectory instance. Once the highlighted line below is added, the issue can 
> be reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}





[jira] [Commented] (HDFS-8865) Improve quota initialization performance

2015-08-13 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695574#comment-14695574
 ] 

Xiaoyu Yao commented on HDFS-8865:
--

Thanks for the patch, [~kihwal]! It looks pretty good to me. 

Just a few comments:
1. The numbers for the large namespace look impressive. Do you have numbers for 
small/medium namespaces? 

2. Is it possible to add some profiling info between these logs below so that 
we can easily find how long it takes to finish quota initialization from the 
log?
{code}
LOG.info("Initializing quota with " + threads + " thread(s)");

...
LOG.info("Quota initialization complete.\n" + counts);
{code}

3. Can you change to parameterized logging to avoid parameter construction in 
case the log statement is disabled? For example, 
{code}
LOG.debug("Setting quota for {}\n{}", dir, myCounts);
{code}

4. NIT: typo chached -> cached?
{code}
// Directly access the name system to obtain the current chached usage.
{code}

5. Now that HDFS-8879 is in, can you rebase and update the patch? Thanks!

> Improve quota initialization performance
> 
>
> Key: HDFS-8865
> URL: https://issues.apache.org/jira/browse/HDFS-8865
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, 
> HDFS-8865.v2.patch
>
>
> After replaying edits, the whole file system tree is recursively scanned in 
> order to initialize the quota. For a big namespace, this can take a very long 
> time.  Since this is done during namenode failover, it also affects failover 
> latency.
> By using the Fork-Join framework, I was able to greatly reduce the 
> initialization time.  The following is the test result using the fsimage from 
> one of the big name nodes we have.
> || threads || seconds||
> | 1 (existing) | 55|
> | 1 (fork-join) | 68 |
> | 4 | 16 |
> | 8 | 8 |
> | 12 | 6 |
> | 16 | 5 |
> | 20 | 4 |
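
The Fork-Join approach described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory tree (`Node`) rather than the real namenode inode classes, and it is not the code from the attached patches; it only shows the general shape of a recursive usage computation split across a `ForkJoinPool`, with the elapsed time printed at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical stand-in for a directory tree; not the real INode classes.
final class QuotaInitSketch {
  static final class Node {
    final long usage;
    final List<Node> children = new ArrayList<>();

    Node(long usage) {
      this.usage = usage;
    }

    Node add(Node child) {
      children.add(child);
      return this;
    }
  }

  // Sums the usage of a subtree, forking one subtask per child.
  static final class UsageTask extends RecursiveTask<Long> {
    private final Node node;

    UsageTask(Node node) {
      this.node = node;
    }

    @Override
    protected Long compute() {
      List<UsageTask> subtasks = new ArrayList<>();
      for (Node child : node.children) {
        UsageTask task = new UsageTask(child);
        task.fork();
        subtasks.add(task);
      }
      long total = node.usage;
      for (UsageTask task : subtasks) {
        total += task.join();
      }
      return total;
    }
  }

  // Runs the computation on a pool of the given size and prints the elapsed time.
  static long computeUsage(Node root, int threads) {
    ForkJoinPool pool = new ForkJoinPool(threads);
    try {
      long start = System.nanoTime();
      long total = pool.invoke(new UsageTask(root));
      long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
      System.out.println("Usage computed with " + threads
          + " thread(s) in " + elapsedMs + " ms");
      return total;
    } finally {
      pool.shutdown();
    }
  }
}
```

Forking one task per child keeps the sketch short; a real implementation would probably process small subtrees inline to limit task overhead.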





[jira] [Commented] (HDFS-8890) Allow admin to specify which blockpools the balancer should run on

2015-08-13 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695580#comment-14695580
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8890:
---

We probably already have this feature since we can specify paths when running 
Balancer.

> Allow admin to specify which blockpools the balancer should run on
> --
>
> Key: HDFS-8890
> URL: https://issues.apache.org/jira/browse/HDFS-8890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>
> Currently the balancer runs on all blockpools. Allow an admin to run the 
> balancer on a set of blockpools. This will enable the balancer to skip 
> blockpools that should not be balanced. For example, a tmp blockpool that has 
> a large amount of churn.
> An example of the command line interface would be an additional flag that 
> specifies the blockpools by id:
> -blockpools 
> BP-6299761-10.55.116.188-1415904647555,BP-47348528-10.51.120.139-1415904199257
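
A rough sketch of how such a flag could be parsed, assuming the `-blockpools` syntax proposed above (a single comma-separated argument). The class and method names are hypothetical and are not taken from the Balancer code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper; the real Balancer argument parsing may look different.
final class BlockpoolFlagSketch {
  // Returns the blockpool ids given after "-blockpools", or an empty set
  // (meaning "balance all blockpools") when the flag is absent.
  static Set<String> parseBlockpools(String[] args) {
    Set<String> ids = new LinkedHashSet<>();
    for (int i = 0; i < args.length - 1; i++) {
      if ("-blockpools".equals(args[i])) {
        for (String id : args[i + 1].split(",")) {
          String trimmed = id.trim();
          if (!trimmed.isEmpty()) {
            ids.add(trimmed);
          }
        }
        break;
      }
    }
    return ids;
  }
}
```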





[jira] [Updated] (HDFS-8854) Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs

2015-08-13 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8854:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7285
   Status: Resolved  (was: Patch Available)

Jenkins is still generating unrelated failures sometimes, but we have 1 successful 
[run | https://builds.apache.org/job/Hadoop-HDFS-7285-Merge/84/]. 

Committed to both HDFS-7285-merge and HDFS-7285. Thanks Walter for the 
contribution, and Rakesh for reviewing!

> Erasure coding: add ECPolicy to replace schema+cellSize in hadoop-hdfs
> --
>
> Key: HDFS-8854
> URL: https://issues.apache.org/jira/browse/HDFS-8854
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Walter Su
> Fix For: HDFS-7285
>
> Attachments: HDFS-8854-Consolidated-20150806.02.txt, 
> HDFS-8854-HDFS-7285-merge.03.patch, HDFS-8854-HDFS-7285-merge.03.txt, 
> HDFS-8854-HDFS-7285.00.patch, HDFS-8854-HDFS-7285.01.patch, 
> HDFS-8854-HDFS-7285.02.patch, HDFS-8854-HDFS-7285.03.patch, HDFS-8854.00.patch
>
>






[jira] [Commented] (HDFS-8890) Allow admin to specify which blockpools the balancer should run on

2015-08-13 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695585#comment-14695585
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8890:
---

Oops, my previous comment is incorrect.  I was mixing something up.

> Allow admin to specify which blockpools the balancer should run on
> --
>
> Key: HDFS-8890
> URL: https://issues.apache.org/jira/browse/HDFS-8890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>
> Currently the balancer runs on all blockpools. Allow an admin to run the 
> balancer on a set of blockpools. This will enable the balancer to skip 
> blockpools that should not be balanced. For example, a tmp blockpool that has 
> a large amount of churn.
> An example of the command line interface would be an additional flag that 
> specifies the blockpools by id:
> -blockpools 
> BP-6299761-10.55.116.188-1415904647555,BP-47348528-10.51.120.139-1415904199257





[jira] [Created] (HDFS-8894) Set SO_KEEPALIVE on DN server sockets

2015-08-13 Thread Nathan Roberts (JIRA)
Nathan Roberts created HDFS-8894:


 Summary: Set SO_KEEPALIVE on DN server sockets
 Key: HDFS-8894
 URL: https://issues.apache.org/jira/browse/HDFS-8894
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.7.1
Reporter: Nathan Roberts


SO_KEEPALIVE is not set on things like datastreamer sockets which can cause 
lingering ESTABLISHED sockets when there is a network glitch.
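
For reference, a minimal sketch of enabling the option with the plain `java.net` API. The helper class `KeepAliveSketch` is hypothetical and is not the DataNode's actual socket setup code; only `Socket#setKeepAlive` and `ServerSocket#accept` are standard JDK calls.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper showing where SO_KEEPALIVE would be switched on.
final class KeepAliveSketch {
  // Enables SO_KEEPALIVE so the OS eventually probes idle connections
  // and tears down peers that silently disappeared.
  static void enableKeepAlive(Socket socket) throws SocketException {
    socket.setKeepAlive(true);
  }

  // Accepts one connection and enables keepalive on the accepted socket.
  static Socket acceptWithKeepAlive(ServerSocket server) throws IOException {
    Socket socket = server.accept();
    enableKeepAlive(socket);
    return socket;
  }
}
```

How quickly a dead peer is detected still depends on the operating system's keepalive timers, which this sketch does not change.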





[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-13 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695599#comment-14695599
 ] 

Zhe Zhang commented on HDFS-8808:
-

Both reported test issues are unrelated and pass locally. The error message 
from Jenkins test result of {{testIdempotentAllocateBlockAndClose}} is 
interesting though. We should examine it in a separate JIRA.

The checkstyle issue was pre-existing. 

> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting





[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode

2015-08-13 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-8078:
---
Description: 
1st exception, on put:

15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
java.lang.IllegalArgumentException: Does not contain a valid host:port 
authority: 2401:db00:1010:70ba:face:0:8:0:50010
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
at 
org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)

Appears to actually stem from code in DataNodeID which assumes it's safe to 
append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for 
IPv6.  NetUtils.createSocketAddr( ) assembles a Java URI object, which requires 
the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010

Currently using InetAddress.getByName() to validate IPv6 (guava 
InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra 
object creation shouldn't be problematic, and for me the slight risk of passing 
in bad input that is not actually an IPv4 or IPv6 address and thus calling an 
external DNS lookup is outweighed by getting the address normalized and 
avoiding rewriting parsing.)

Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()

---

2nd exception (on datanode)
15/04/13 13:18:07 ERROR datanode.DataNode: 
dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation  
src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
/2401:db00:11:d010:face:0:2f:0:50010
java.io.EOFException
at java.io.DataInputStream.readShort(DataInputStream.java:315)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
at java.lang.Thread.run(Thread.java:745)

Which also comes as client error "-get: 2401 is not an IP string literal."

This one has existing parsing logic which needs to shift to the last colon 
rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
rather than split.  Could alternatively use the techniques above.
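
Both points (bracketing IPv6 literals when building host:port, and splitting on the last colon when parsing) can be sketched as follows. This is an illustration only: the class `HostPortSketch` and its methods are hypothetical and are not the DataNodeID or NetUtils code.

```java
// Hypothetical helper; not the actual Hadoop implementation.
final class HostPortSketch {
  // Joins host and port, wrapping a bare IPv6 literal in brackets so the
  // result has an unambiguous host:port form.
  static String join(String host, int port) {
    boolean bareIpv6 = host.indexOf(':') >= 0 && !host.startsWith("[");
    return (bareIpv6 ? "[" + host + "]" : host) + ":" + port;
  }

  // Splits on the LAST colon, so unbracketed IPv6 literals keep all their
  // groups in the host part. Returns {host, port}.
  static String[] split(String hostPort) {
    int idx = hostPort.lastIndexOf(':');
    if (idx < 0) {
      throw new IllegalArgumentException("No port in: " + hostPort);
    }
    String host = hostPort.substring(0, idx);
    if (host.startsWith("[") && host.endsWith("]")) {
      host = host.substring(1, host.length() - 1);
    }
    return new String[] {host, hostPort.substring(idx + 1)};
  }
}
```

Splitting on the last colon is only safe when a port is always present, which is the case for the addresses in the log lines above.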


[jira] [Updated] (HDFS-8078) HDFS client gets errors trying to to connect to IPv6 DataNode

2015-08-13 Thread Nemanja Matkovic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemanja Matkovic updated HDFS-8078:
---
Description: 
/patch1st exception, on put:

15/03/23 18:43:18 WARN hdfs.DFSClient: DataStreamer Exception
java.lang.IllegalArgumentException: Does not contain a valid host:port 
authority: 2401:db00:1010:70ba:face:0:8:0:50010
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:212)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
at 
org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1607)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)

Appears to actually stem from code in DataNodeID which assumes it's safe to 
append together (ipaddr + ":" + port) -- which is OK for IPv4 and not OK for 
IPv6.  NetUtils.createSocketAddr( ) assembles a Java URI object, which requires 
the format proto://[2401:db00:1010:70ba:face:0:8:0]:50010

Currently using InetAddress.getByName() to validate IPv6 (guava 
InetAddresses.forString has been flaky) but could also use our own parsing. 
(From logging this, it seems like a low-enough frequency call that the extra 
object creation shouldn't be problematic, and for me the slight risk of passing 
in bad input that is not actually an IPv4 or IPv6 address and thus calling an 
external DNS lookup is outweighed by getting the address normalized and 
avoiding rewriting parsing.)

Alternatively, sun.net.util.IPAddressUtil.isIPv6LiteralAddress()

---

2nd exception (on datanode)
15/04/13 13:18:07 ERROR datanode.DataNode: 
dev1903.prn1.facebook.com:50010:DataXceiver error processing unknown operation  
src: /2401:db00:20:7013:face:0:7:0:54152 dst: 
/2401:db00:11:d010:face:0:2f:0:50010
java.io.EOFException
at java.io.DataInputStream.readShort(DataInputStream.java:315)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
at java.lang.Thread.run(Thread.java:745)

Which also comes as client error "-get: 2401 is not an IP string literal."

This one has existing parsing logic which needs to shift to the last colon 
rather than the first.  Should also be a tiny bit faster by using lastIndexOf 
rather than split.  Could alternatively use the techniques above.


[jira] [Commented] (HDFS-8891) HDFS concat should keep srcs order

2015-08-13 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695634#comment-14695634
 ] 

Jing Zhao commented on HDFS-8891:
-

Thanks for working on this, Yong! Agree we should keep the srcs order. For the 
fix, maybe we only need to replace "HashSet" with "LinkedHashSet"?
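
A small sketch of why that swap would be enough: `LinkedHashSet` removes duplicates like `HashSet` but iterates in insertion order. The class name and sample paths are made up for illustration; this is not the FSDirConcatOp code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration of order-preserving de-duplication.
final class SrcOrderSketch {
  // Drops duplicate entries while keeping the order in which they were given.
  static List<String> dedupeKeepingOrder(List<String> srcs) {
    return new ArrayList<>(new LinkedHashSet<>(srcs));
  }
}
```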

> HDFS concat should keep srcs order
> --
>
> Key: HDFS-8891
> URL: https://issues.apache.org/jira/browse/HDFS-8891
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yong Zhang
>Assignee: Yong Zhang
> Attachments: HDFS-8891.001.patch
>
>
> FSDirConcatOp.verifySrcFiles may change src files order, but it should keep their 
> order as input.





[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695685#comment-14695685
 ] 

Yongjun Zhang commented on HDFS-8828:
-

Hello [~yufeigu],

I expect every new CREATE/MODIFICATION below the newly created dir would also 
have an entry in the snapshot diff report (maybe except the first level 
children case described in my last comment), is this not the case?

Thanks.


> Utilize Snapshot diff report to build copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp (30 
> hours for 1.6M files). We can leverage the snapshot diff report to build a file 
> copy list that includes only the files/dirs changed between two snapshots 
> (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy list building time. 2. fewer file copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory creation, 
> deletion, rename and modification between two snapshots or a snapshot and a 
> normal directory. HDFS-7535 synchronizes deletions and renames, then falls back to 
> the default distcp, so it still relies on the default distcp to build the complete 
> list of files under the source dir. This patch puts only created and 
> modified files into the copy list based on the snapshot diff report, so we can 
> minimize the number of files to copy. 





[jira] [Updated] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-08-13 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-8859:
--
Priority: Major  (was: Critical)

This is a good change although it does not reduce the overall datanode memory 
footprint much.  (For 10m blocks, it saves only 400MB of memory.  However, a 
datanode does not even have 1m blocks in practice.)

> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch
>
>
> By using the following approach we can save about *45%* of the memory footprint of each 
> block replica in DataNode memory (this JIRA only talks about *ReplicaMap* in 
> DataNode). The details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
> new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block id of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each HashMap Entry 
> has an object overhead.  We can implement a lightweight Set which is similar 
> to {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed 
> size for the entries array, usually a big value; an example is 
> {{BlocksMap}}. This avoids full GC since there is no need to resize).  We 
> should also be able to get an element through its key.
> The following is a comparison of the memory footprint if we implement a lightweight set 
> as described:
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block replica 
> in DataNode.
> 
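
The arithmetic above can be checked with a short sketch. The constants are the byte counts quoted in the description; the class name is hypothetical. It also reproduces the 400MB figure for 10 million blocks mentioned in the comment (40 bytes saved per replica).

```java
// Hypothetical calculator using the byte counts quoted in the description.
final class ReplicaMapSavingsSketch {
  // Key object + Entry overhead + key ref + value ref + hash = 44 bytes.
  static final int HASHMAP_OVERHEAD_BYTES = 20 + 12 + 4 + 4 + 4;
  // Extra "next" reference needed by the lightweight set.
  static final int NEXT_REFERENCE_BYTES = 4;
  // Approximate size of one finalized replica, ignoring alignment.
  static final int REPLICA_BYTES = 46;

  static int savedBytesPerReplica() {
    return HASHMAP_OVERHEAD_BYTES - NEXT_REFERENCE_BYTES;
  }

  // Fraction saved per replica: 1 - (4 + 46) / (44 + 46).
  static double savedFraction() {
    double before = HASHMAP_OVERHEAD_BYTES + REPLICA_BYTES;
    double after = NEXT_REFERENCE_BYTES + REPLICA_BYTES;
    return 1.0 - after / before;
  }

  static long totalSavedBytes(long replicas) {
    return replicas * savedBytesPerReplica();
  }
}
```

The fraction evaluates to roughly 0.444, which the description rounds to 45%.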





[jira] [Updated] (HDFS-8861) Remove unnecessary log from method FSNamesystem.getCorruptFiles

2015-08-13 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-8861:

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Closing this and leaving getCorruptFiles unchanged; that warn log is fine. 
However, the HDFS-8522 patch is still necessary.

> Remove unnecessary log from method FSNamesystem.getCorruptFiles
> ---
>
> Key: HDFS-8861
> URL: https://issues.apache.org/jira/browse/HDFS-8861
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>Priority: Minor
> Attachments: HDFS-8861.1.patch
>
>
> The log in FSNamesystem.getCorruptFiles prints out too many messages 
> mixed with other log entries, which makes the whole log quite verbose and hard to 
> understand and analyze, especially in those cases where the SuperuserPrivilege 
> check and Operation check are not satisfied in frequent calls of 
> listCorruptFileBlocks.





[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length

2015-08-13 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8883:

Priority: Major  (was: Minor)

> NameNode Metrics : Add FSNameSystem lock Queue Length
> -
>
> Key: HDFS-8883
> URL: https://issues.apache.org/jira/browse/HDFS-8883
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-8883.001.patch
>
>
> FSNameSystemLock can have contention when NameNode is under load. This patch 
> adds  LockQueueLength -- the number of threads waiting on FSNameSystemLock -- 
> as a metric in NameNode. 
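
As a rough illustration of where such a number can come from, `java.util.concurrent.locks.ReentrantReadWriteLock` exposes `getQueueLength()`, an estimate of the number of threads waiting to acquire the lock. The wrapper class below is hypothetical and is not the code from the attached patch.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper exposing the lock's waiter count as a gauge value.
final class LockQueueLengthSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

  ReentrantReadWriteLock lock() {
    return lock;
  }

  // Estimate of threads currently queued on the read or write lock.
  // The JDK documents this as an estimate meant for monitoring.
  int getLockQueueLength() {
    return lock.getQueueLength();
  }
}
```

Because the value is only an estimate, it suits a metrics gauge but should not drive synchronization decisions.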





[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-08-13 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695800#comment-14695800
 ] 

Chang Li commented on HDFS-6407:


[~wheat9] how soon could you check in this code? Are you still waiting for some 
more reviews?

> new namenode UI, lost ability to sort columns in datanode tab
> -
>
> Key: HDFS-6407
> URL: https://issues.apache.org/jira/browse/HDFS-6407
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Nathan Roberts
>Assignee: Haohui Mai
>Priority: Critical
>  Labels: BB2015-05-TBR
> Attachments: 002-datanodes-sorted-capacityUsed.png, 
> 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
> HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, 
> HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, 
> HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, 
> HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 
> 2.png, sorting table.png
>
>
> The old UI supported clicking on a column header to sort on that column. The new UI 
> seems to have dropped this very useful feature.
> There are a few tables in the Namenode UI to display datanode information, 
> directory listings and snapshots.
> When there are many items in the tables, it is useful to have the ability to sort 
> on the different columns.





[jira] [Created] (HDFS-8895) Remove deprecated BlockStorageLocation APIs

2015-08-13 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-8895:
-

 Summary: Remove deprecated BlockStorageLocation APIs
 Key: HDFS-8895
 URL: https://issues.apache.org/jira/browse/HDFS-8895
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang


HDFS-8887 supersedes DistributedFileSystem#getFileBlockStorageLocations, so it 
can be removed from trunk.





[jira] [Updated] (HDFS-7649) Multihoming docs should emphasize using hostnames in configurations

2015-08-13 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-7649:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

+1 Committed for 2.8.0. Thanks [~brahmareddy].

> Multihoming docs should emphasize using hostnames in configurations
> ---
>
> Key: HDFS-7649
> URL: https://issues.apache.org/jira/browse/HDFS-7649
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Arpit Agarwal
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0
>
> Attachments: HDFS-7649.patch
>
>
> The docs should emphasize that master and slave configurations should use 
> hostnames wherever possible.
> Link to current docs: 
> https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html





[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696000#comment-14696000
 ] 

Yongjun Zhang commented on HDFS-8828:
-

Hi [~yufeigu],

Thanks for answering my question in person. So for a newly created dir, there is 
indeed one "CREATE" entry in the snapshot diff report, and no entries for new 
elements created below this dir.

So please take care of comments 1 and 2 in my previous review, plus: 

3.  I suggest renaming the {{getExcludeList}} method to 
{{getTraverseExcludeList}}  (hopefully a better name), with the following 
javadoc as we agreed.
{code}
This method returns a list of items to be excluded when recursively traversing 
newDir to build the copy list.

Specifically, given a newly created directory newDir (a CREATE entry in the 
snapshot diff), if a previously copied file/directory itemX is moved (a RENAME 
entry in the snapshot diff) into newDir, itemX should be excluded when 
recursively traversing newDir in #traverseDirectory,  so that it will not be 
copied again.

If the same itemX also has a MODIFY entry in the snapshot diff report, meaning 
it was modified after it was previously copied, it will still be added to the 
copy list (handled in the main loop of doBuildListingWithSnapshotDiff).
{code}

4. Refactor to consolidate the duplicated code in the test code that we discussed.
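
To make the exclude-list idea in comment 3 concrete, here is a rough sketch. It assumes a simplified, hypothetical model of diff entries (`DiffEntry` with a type, source and target path) and is not the patch's actual {{getTraverseExcludeList}} implementation: it collects the targets of RENAME entries that land under the newly created directory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of snapshot diff entries.
final class TraverseExcludeSketch {
  enum DiffType { CREATE, MODIFY, RENAME, DELETE }

  static final class DiffEntry {
    final DiffType type;
    final String source;
    final String target; // only meaningful for RENAME

    DiffEntry(DiffType type, String source, String target) {
      this.type = type;
      this.source = source;
      this.target = target;
    }
  }

  // Returns paths that were renamed into newDir and therefore should be
  // skipped when recursively traversing newDir to build the copy list.
  static List<String> getTraverseExcludeList(String newDir, List<DiffEntry> diff) {
    String prefix = newDir.endsWith("/") ? newDir : newDir + "/";
    List<String> exclude = new ArrayList<>();
    for (DiffEntry entry : diff) {
      if (entry.type == DiffType.RENAME
          && entry.target != null
          && entry.target.startsWith(prefix)) {
        exclude.add(entry.target);
      }
    }
    return exclude;
  }
}
```

An item that also has a MODIFY entry would still be added to the copy list by the main loop, as the javadoc above describes; the sketch covers only the exclusion step.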
 
Hi [~jingzhao], I had quite a bit of side discussion with Yufei, and I am +1 on the 
change after the above comments are addressed. Would you please take a look at 
it if you wish? I'm targeting committing it next Monday.

Thanks much.



> Utilize Snapshot diff report to build copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp (30 
> hours for 1.6M files). We can leverage the snapshot diff report to build a file 
> copy list that includes only the files/dirs changed between two snapshots 
> (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy list building time. 2. fewer file copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory creation, 
> deletion, rename and modification between two snapshots or a snapshot and a 
> normal directory. HDFS-7535 synchronizes deletions and renames, then falls back to 
> the default distcp, so it still relies on the default distcp to build the complete 
> list of files under the source dir. This patch puts only created and 
> modified files into the copy list based on the snapshot diff report, so we can 
> minimize the number of files to copy. 





[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696023#comment-14696023
 ] 

Jing Zhao commented on HDFS-8828:
-

Sure. I will review the patch. Thanks for the work, Yufei and Yongjun!



[jira] [Updated] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase

2015-08-13 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-8435:
--
Status: Open  (was: Patch Available)

> createNonRecursive support needed in WebHdfsFileSystem to support HBase
> ---
>
> Key: HDFS-8435
> URL: https://issues.apache.org/jira/browse/HDFS-8435
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.6.0
>Reporter: Vinoth Sathappan
>Assignee: Jakob Homan
> Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, 
> HDFS-8435.002.patch
>
>
> The WebHdfsFileSystem implementation doesn't support createNonRecursive. 
> HBase depends on it extensively to function properly. Currently, when the 
> region servers are started over WebHDFS, they crash with:
> createNonRecursive unsupported for this filesystem class 
> org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
> at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137)
> at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112)
> at org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088)
> at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85)
> at org.apache.hadoop.hbase.regionserver.wal.HLogFactory.createWriter(HLogFactory.java:198)





[jira] [Commented] (HDFS-8888) Support volumes in HDFS

2015-08-13 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696039#comment-14696039
 ] 

Konstantin Shvachko commented on HDFS-8888:
---

Could you please explain your concept of volumes? HDFS already has one from 
federation. I guess you are thinking of something different?

> Support volumes in HDFS
> ---
>
> Key: HDFS-8888
> URL: https://issues.apache.org/jira/browse/HDFS-8888
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>
> There are multiple types of zones (e.g., snapshottable directories, 
> encryption zones, directories with quotas) which are conceptually close to 
> namespace volumes in traditional file systems.
> This jira proposes to introduce the concept of volume to simplify the 
> implementation of snapshots and encryption zones.





[jira] [Updated] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase

2015-08-13 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-8435:
--
Status: Patch Available  (was: Open)



[jira] [Updated] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase

2015-08-13 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-8435:
--
Attachment: HDFS-8435.003.patch

New patch that applies to both trunk and branch-2.  

The tests failed because the createParent param in WebHDFS defaulted to false 
but was never used by the actual call; the create call on the dfsclient 
overrode it to true.  I've fixed this to honor the parameter and updated the 
spec to be correct.
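
To make the createParent semantics concrete: a non-recursive create is expected to fail when the parent directory is missing, while a recursive create makes the missing parents first. The sketch below is only a local-filesystem analogy built on `java.nio.file`, not WebHDFS or DFSClient code, and the `create` helper is a hypothetical name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class CreateParentSketch {
    /**
     * Hypothetical helper: creates an empty file. When createParent is false the
     * parent directory must already exist, otherwise Files.createFile throws an
     * IOException (a NoSuchFileException on typical platforms).
     */
    static void create(Path file, boolean createParent) throws IOException {
        Path parent = file.getParent();
        if (createParent && parent != null) {
            Files.createDirectories(parent);
        }
        Files.createFile(file);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("create-parent-sketch");
        Path target = root.resolve("missing").resolve("a.txt");
        try {
            create(target, false);
            System.out.println("unexpected: created without a parent directory");
        } catch (IOException e) {
            System.out.println("non-recursive create failed: " + e);
        }
        create(target, true);
        System.out.println("recursive create succeeded: " + Files.exists(target));
    }
}
```

The failure described above corresponds to the flag being accepted but ignored, so that the second behaviour was always applied.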

Good catch on the throw.  Removed.

I had played around with that uber test a bit.  Using the annotation loses the 
explicit message about what went wrong in each test.  I put as much into the 
helper method as looked reasonable (judgment call here); when I put more of the 
per-test logic into the helper (expected exception, subsequent message), it got 
really crowded and ugly.  



[jira] [Commented] (HDFS-7649) Multihoming docs should emphasize using hostnames in configurations

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696079#comment-14696079
 ] 

Hudson commented on HDFS-7649:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8295 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8295/])
HDFS-7649. Multihoming docs should emphasize using hostnames in configurations. 
(Contributed by Brahma Reddy Battula) (arp: rev 
ae57d60d8239916312bca7149e2285b2ed3b123a)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsMultihoming.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Multihoming docs should emphasize using hostnames in configurations
> ---
>
> Key: HDFS-7649
> URL: https://issues.apache.org/jira/browse/HDFS-7649
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Arpit Agarwal
>Assignee: Brahma Reddy Battula
> Fix For: 2.8.0
>
> Attachments: HDFS-7649.patch
>
>
> The docs should emphasize that master and slave configurations should 
> use hostnames wherever possible.
> Link to current docs: 
> https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html





[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-08-13 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696103#comment-14696103
 ] 

Ravi Prakash commented on HDFS-6407:


The patch looks good to me. +1.

> new namenode UI, lost ability to sort columns in datanode tab
> -
>
> Key: HDFS-6407
> URL: https://issues.apache.org/jira/browse/HDFS-6407
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Nathan Roberts
>Assignee: Haohui Mai
>Priority: Critical
>  Labels: BB2015-05-TBR
> Attachments: 002-datanodes-sorted-capacityUsed.png, 
> 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
> HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.008.patch, 
> HDFS-6407.009.patch, HDFS-6407.010.patch, HDFS-6407.011.patch, 
> HDFS-6407.4.patch, HDFS-6407.5.patch, HDFS-6407.6.patch, HDFS-6407.7.patch, 
> HDFS-6407.patch, browse_directory.png, datanodes.png, snapshots.png, sorting 
> 2.png, sorting table.png
>
>
> The old UI supported clicking on a column header to sort on that column. The new UI 
> seems to have dropped this very useful feature.
> There are a few tables in the NameNode UI that display datanode information, 
> directory listings and snapshots.
> When there are many items in the tables, it is useful to be able to sort 
> on the different columns.





[jira] [Updated] (HDFS-8895) Remove deprecated BlockStorageLocation APIs

2015-08-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8895:
--
Status: Patch Available  (was: Open)

> Remove deprecated BlockStorageLocation APIs
> ---
>
> Key: HDFS-8895
> URL: https://issues.apache.org/jira/browse/HDFS-8895
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-8895.001.patch
>
>
> HDFS-8887 supersedes DistributedFileSystem#getFileBlockStorageLocations, so 
> it can be removed from trunk.





[jira] [Updated] (HDFS-8895) Remove deprecated BlockStorageLocation APIs

2015-08-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8895:
--
Attachment: HDFS-8895.001.patch

Patch attached, deleting lots of code. I looked at the original patch in 
HDFS-3672 for guidance on what to delete; I would appreciate a second look 
to confirm that I didn't miss anything.



[jira] [Updated] (HDFS-8895) Remove deprecated BlockStorageLocation APIs

2015-08-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8895:
--
Release Note: This removes the deprecated 
DistributedFileSystem#getFileBlockStorageLocations API used for getting 
VolumeIds of block replicas. Instead, use BlockLocation#getStorageIds to get 
very similar information.



[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces

2015-08-13 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-6244:
--
Status: Patch Available  (was: Open)

> Make Trash Interval configurable for each of the namespaces
> ---
>
> Key: HDFS-6244
> URL: https://issues.apache.org/jira/browse/HDFS-6244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Assignee: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, 
> HDFS-6244.v3.patch, HDFS-6244.v4.patch
>
>
> Somehow we need to avoid the cluster filling up.
> One solution is to have a different trash policy per namespace. However, if 
> we can simply make the property configurable per namespace, then the same 
> config can be rolled everywhere and we'd be done. This seems simple enough.
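
One way to read the proposal above is a per-namespace key that falls back to the cluster-wide value, so the same config file can be rolled out everywhere. The sketch below uses `java.util.Properties` instead of Hadoop's `Configuration` to stay self-contained. `fs.trash.interval` (minutes, 0 meaning trash is disabled) is the existing property; the suffixed form `fs.trash.interval.<nameserviceId>` and the `trashIntervalMinutes` helper are assumptions made for illustration and may not match what the attached patches do.

```java
import java.util.Properties;

final class TrashIntervalSketch {
    static final String BASE_KEY = "fs.trash.interval";

    /**
     * Hypothetical lookup: prefer the key suffixed with the nameservice id,
     * fall back to the global key, and finally to 0 (trash disabled).
     */
    static long trashIntervalMinutes(Properties conf, String nameserviceId) {
        String value = conf.getProperty(BASE_KEY + "." + nameserviceId);
        if (value == null) {
            value = conf.getProperty(BASE_KEY, "0");
        }
        return Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("fs.trash.interval", "1440");    // global default: one day
        conf.setProperty("fs.trash.interval.ns1", "60");  // assumed per-namespace override
        System.out.println("ns1: " + trashIntervalMinutes(conf, "ns1"));
        System.out.println("ns2: " + trashIntervalMinutes(conf, "ns2"));
    }
}
```

With this shape, a namespace without an override keeps the cluster-wide interval.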





[jira] [Updated] (HDFS-6244) Make Trash Interval configurable for each of the namespaces

2015-08-13 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-6244:
--
Status: Open  (was: Patch Available)



[jira] [Updated] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated HDFS-8828:
---
Attachment: HDFS-8828.007.patch



[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696162#comment-14696162
 ] 

Yufei Gu commented on HDFS-8828:


No. Just one CREATE item in snapshot diff report in this case.



[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696164#comment-14696164
 ] 

Yufei Gu commented on HDFS-8828:


Hi [~yzhangal],

Thanks very much for the code review. I've made the modifications and uploaded the 
new patch. 



[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696167#comment-14696167
 ] 

Yufei Gu commented on HDFS-8828:


Thank you, [~jingzhao]. Glad to have you review the code. 



[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696224#comment-14696224
 ] 

Yongjun Zhang commented on HDFS-8828:
-

Thank you [~yufeigu] and [~jingzhao]!




[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-08-13 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696262#comment-14696262
 ] 

Yi Liu commented on HDFS-8859:
--

It seems Jenkins has some problem and all the tests timed out. I randomly selected 
10 of them and they ran successfully and quickly, so let me re-trigger Jenkins.

> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch
>
>
> Using the following approach, we can save about *45%* of the memory footprint of each 
> block replica in DataNode memory (this JIRA only covers *ReplicaMap* in the 
> DataNode). The details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block id of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each HashMap Entry 
> has an object overhead.  We can implement a lightweight Set which is similar 
> to {{LightWeightGSet}} but not of fixed size ({{LightWeightGSet}} uses a fixed 
> size for the entries array, usually a big value; an example is 
> {{BlocksMap}}. This avoids full GC since there is no need to resize). We 
> should also be able to get an element through its key.
> The following is a comparison of the memory footprint if we implement a lightweight set 
> as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) ≈ 44.4%, i.e. roughly *45%*, of the memory for each block replica 
> in DataNode.
> 
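
The arithmetic in the description above can be checked with a few lines of Java. This is only a worked calculation using the byte counts quoted in the description (memory alignment ignored, as stated there); the class and method names are hypothetical. The fraction comes out at about 44.4%, which the description rounds to about 45%.

```java
import java.util.Locale;

final class ReplicaMapSavingsSketch {
    // Bytes removed per replica: Long key + Entry overhead + key ref + value ref + hash.
    static final int BYTES_REMOVED = 20 + 12 + 4 + 4 + 4;
    // Bytes added per replica: the "next element" reference inside ReplicaInfo.
    static final int BYTES_ADDED = 4;
    // Approximate size of one finalized replica, as quoted in the description.
    static final int REPLICA_BYTES = 46;

    static int netBytesSaved() {
        return BYTES_REMOVED - BYTES_ADDED;
    }

    /** 1 - (new footprint / old footprint), following the formula in the description. */
    static double savedFraction() {
        double before = BYTES_REMOVED + REPLICA_BYTES;
        double after = BYTES_ADDED + REPLICA_BYTES;
        return 1.0 - after / before;
    }

    public static void main(String[] args) {
        System.out.println("net bytes saved per replica: " + netBytesSaved());
        System.out.println(String.format(Locale.ROOT, "fraction saved: %.1f%%", savedFraction() * 100));
    }
}
```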





[jira] [Commented] (HDFS-7649) Multihoming docs should emphasize using hostnames in configurations

2015-08-13 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696288#comment-14696288
 ] 

Brahma Reddy Battula commented on HDFS-7649:


[~arpitagarwal] thanks a lot for your review and commit!!


[jira] [Updated] (HDFS-7278) Add a command that allows sysadmins to manually trigger full block reports from a DN

2015-08-13 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated HDFS-7278:
--
Labels: 2.6.1-candidate  (was: )

> Add a command that allows sysadmins to manually trigger full block reports 
> from a DN
> 
>
> Key: HDFS-7278
> URL: https://issues.apache.org/jira/browse/HDFS-7278
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>  Labels: 2.6.1-candidate
> Fix For: 2.7.0
>
> Attachments: HDFS-7278.002.patch, HDFS-7278.003.patch, 
> HDFS-7278.004.patch, HDFS-7278.005.patch
>
>
> We should add a command that allows sysadmins to manually trigger full block 
> reports from a DN.





[jira] [Updated] (HDFS-7915) The DataNode can sometimes allocate a ShortCircuitShm slot and fail to tell the DFSClient about it because of a network error

2015-08-13 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-7915:

Attachment: HDFS-7915.branch-2.6.patch

HADOOP-11802 depends on this issue. If we are going to cherry-pick 
HADOOP-11802, we need to cherry-pick this issue first. Attaching a patch for 
branch-2.6.

> The DataNode can sometimes allocate a ShortCircuitShm slot and fail to tell 
> the DFSClient about it because of a network error
> -
>
> Key: HDFS-7915
> URL: https://issues.apache.org/jira/browse/HDFS-7915
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.7.0
>
> Attachments: HDFS-7915.001.patch, HDFS-7915.002.patch, 
> HDFS-7915.004.patch, HDFS-7915.005.patch, HDFS-7915.006.patch, 
> HDFS-7915.branch-2.6.patch
>
>
> The DataNode can sometimes allocate a ShortCircuitShm slot and fail to tell 
> the DFSClient about it because of a network error.  In 
> {{DataXceiver#requestShortCircuitFds}}, the DataNode can succeed at the first 
> part (mark the slot as used) and fail at the second part (tell the DFSClient 
> what it did). The "try" block for unregistering the slot only covers a 
> failure in the first part, not the second part. In this way, a divergence can 
> form between the views of which slots are allocated on DFSClient and on 
> server.
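
The failure mode described above (the slot is marked as used, the reply to the client fails, and nothing unregisters the slot) and the general shape of a fix can be sketched as below. This is not the `DataXceiver` code and not the attached patch; `SlotRegistry`, `Reply` and `requestSlot` are hypothetical names. The point is only that the cleanup has to cover the second step (telling the client) as well as the first.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

final class SlotCleanupSketch {
    /** Hypothetical stand-in for sending the response to the client. */
    interface Reply {
        void send(int slotId) throws IOException;
    }

    /** Hypothetical server-side record of which slots are in use. */
    static final class SlotRegistry {
        private final Set<Integer> used = new HashSet<>();
        private int next = 0;

        int register() {
            int id = next++;
            used.add(id);
            return id;
        }

        void unregister(int id) {
            used.remove(id);
        }

        int usedCount() {
            return used.size();
        }
    }

    /** Registers a slot and unregisters it again if the reply cannot be delivered. */
    static int requestSlot(SlotRegistry registry, Reply reply) throws IOException {
        int slot = registry.register();
        boolean success = false;
        try {
            reply.send(slot);
            success = true;
            return slot;
        } finally {
            if (!success) {
                registry.unregister(slot);
            }
        }
    }

    public static void main(String[] args) {
        SlotRegistry registry = new SlotRegistry();
        try {
            requestSlot(registry, slot -> { throw new IOException("simulated network error"); });
        } catch (IOException e) {
            // The failed reply left no slot registered on the server side.
            System.out.println("reply failed; slots still in use: " + registry.usedCount());
        }
    }
}
```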





[jira] [Updated] (HDFS-8070) Pre-HDFS-7915 DFSClient cannot use short circuit on post-HDFS-7915 DataNode

2015-08-13 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-8070:

Attachment: HDFS-8070.branch-2.6.patch

Attaching a patch for branch-2.6. If we are going to include HADOOP-11802, we 
need to include HDFS-7915 and this issue as well.

> Pre-HDFS-7915 DFSClient cannot use short circuit on post-HDFS-7915 DataNode
> ---
>
> Key: HDFS-8070
> URL: https://issues.apache.org/jira/browse/HDFS-8070
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching
>Affects Versions: 2.7.0
>Reporter: Gopal V
>Assignee: Colin Patrick McCabe
>Priority: Blocker
> Fix For: 2.7.1
>
> Attachments: HDFS-8070.001.patch, HDFS-8070.branch-2.6.patch
>
>
> HDFS ShortCircuitShm layer keeps the task locked up during multi-threaded 
> split-generation.
> I hit this immediately after I upgraded the data, so I wonder if the 
> ShortCircuitShim wire protocol has trouble when 2.8.0 DN talks to a 2.7.0 
> Client?
> {code}
> 2015-04-06 00:04:30,780 INFO [ORC_GET_SPLITS #3] orc.OrcInputFormat: ORC 
> pushdown predicate: leaf-0 = (IS_NULL ss_sold_date_sk)
> expr = (not leaf-0)
> 2015-04-06 00:04:30,781 ERROR [ShortCircuitCache_SlotReleaser] 
> shortcircuit.ShortCircuitCache: ShortCircuitCache(0x29e82045): failed to 
> release short-circuit shared memory slot Slot(slotIdx=2, 
> shm=DfsClientShm(a86ee34576d93c4964005d90b0d97c38)) by sending 
> ReleaseShortCircuitAccessRequestProto to /grid/0/cluster/hdfs/dn_socket.  
> Closing shared memory segment.
> java.io.IOException: ERROR_INVALID: there is no shared memory segment 
> registered with shmId a86ee34576d93c4964005d90b0d97c38
>   at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache$SlotReleaser.run(ShortCircuitCache.java:208)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-04-06 00:04:30,781 INFO [ORC_GET_SPLITS #5] orc.OrcInputFormat: ORC 
> pushdown predicate: leaf-0 = (IS_NULL ss_sold_date_sk)
> expr = (not leaf-0)
> 2015-04-06 00:04:30,781 WARN [ShortCircuitCache_SlotReleaser] 
> shortcircuit.DfsClientShmManager: EndpointShmManager(172.19.128.60:50010, 
> parent=ShortCircuitShmManager(5e763476)): error shutting down shm: got 
> IOException calling shutdown(SHUT_RDWR)
> java.nio.channels.ClosedChannelException
>   at 
> org.apache.hadoop.util.CloseableReferenceCount.reference(CloseableReferenceCount.java:57)
>   at 
> org.apache.hadoop.net.unix.DomainSocket.shutdown(DomainSocket.java:387)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.DfsClientShmManager$EndpointShmManager.shutdown(DfsClientShmManager.java:378)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache$SlotReleaser.run(ShortCircuitCache.java:223)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-04-06 00:04:30,783 INFO [ORC_GET_SPLITS #7] orc.OrcInputFormat: ORC 
> pushdown predicate: leaf-0 = (IS_NULL cs_sold_date_sk)
> expr = (not leaf-0)
> 2015-04-06 00:04:30,785 ERROR [ShortCircuitCache_SlotReleaser] 
> shortcircuit.ShortCircuitCache: ShortCircuitCache(0x29e82045): failed to 
> release short-circuit shared memory slot Slot(slotIdx=4, 
> shm=DfsClientShm(a86ee34576d93c4964005d90b0d97c38)) by sending 
> ReleaseShortCircuitAccessRequestProto to /grid/0/cluster/hdfs/dn_socket.  
> Closing shared memory segment.
> java.io.IOException: ERROR_INVALID: there is no shared memory segment 
> registered with shmId a86ee34576d93c4964005d90b0d97c38
>   at 
> org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache$SlotReleaser.run(ShortCircuitCache.java:208)
>   at 
> java.util.concurrent.E

[jira] [Updated] (HDFS-8891) HDFS concat should keep srcs order

2015-08-13 Thread Yong Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Zhang updated HDFS-8891:
-
Attachment: HDFS-8891.002.patch

Thanks [~jingzhao] for the review.
Uploaded the 2nd patch based on [~jingzhao]'s comment. The UT code also needs to change.

> HDFS concat should keep srcs order
> --
>
> Key: HDFS-8891
> URL: https://issues.apache.org/jira/browse/HDFS-8891
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yong Zhang
>Assignee: Yong Zhang
> Attachments: HDFS-8891.001.patch, HDFS-8891.002.patch
>
>
> FSDirConcatOp.verifySrcFiles may change the order of the src files, but it should keep 
> their order as given in the input.
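
A minimal sketch of the idea, assuming the reordering comes from collecting the sources into an unordered set: an insertion-ordered set keeps the caller's order while still detecting duplicates. ConcatOrderSketch and verifySrcs are hypothetical names and plain strings stand in for the inodes; this is not the FSDirConcatOp code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of order-preserving source verification. */
final class ConcatOrderSketch {
    /** Rejects duplicate sources and returns them in the order the caller gave. */
    static List<String> verifySrcs(List<String> srcs) {
        // LinkedHashSet iterates in insertion order, unlike HashSet.
        Set<String> seen = new LinkedHashSet<>();
        for (String src : srcs) {
            if (!seen.add(src)) {
                throw new IllegalArgumentException("duplicate source: " + src);
            }
        }
        return new ArrayList<>(seen);
    }
}
```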





[jira] [Updated] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode

2015-08-13 Thread Zhihua Deng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HDFS-7980:
--
Attachment: hadoop-241.patch

> Incremental BlockReport will dramatically slow down the startup of  a namenode
> --
>
> Key: HDFS-7980
> URL: https://issues.apache.org/jira/browse/HDFS-7980
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Hui Zheng
>Assignee: Walter Su
>  Labels: 2.6.1-candidate
> Fix For: 2.7.1
>
> Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, 
> HDFS-7980.003.patch, HDFS-7980.004.patch, HDFS-7980.004.repost.patch, 
> hadoop-241.patch
>
>
> In the current implementation the datanode calls the 
> reportReceivedDeletedBlocks() method, which sends an IncrementalBlockReport, before 
> calling the bpNamenode.blockReport() method. So in a large (several thousand 
> datanodes) and busy cluster it will slow down (more than one hour) the 
> startup of the namenode. 
> {code}
> List<DatanodeCommand> blockReport() throws IOException {
>   // send block report if timer has expired.
>   final long startTime = now();
>   if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
>     return null;
>   }
>   final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();
>   // Flush any block information that precedes the block report. Otherwise
>   // we have a chance that we will miss the delHint information
>   // or we will report an RBW replica after the BlockReport already reports
>   // a FINALIZED one.
>   reportReceivedDeletedBlocks();
>   lastDeletedReport = startTime;
> .
>   // Send the reports to the NN.
>   int numReportsSent = 0;
>   int numRPCs = 0;
>   boolean success = false;
>   long brSendStartTime = now();
>   try {
>     if (totalBlockCount < dnConf.blockReportSplitThreshold) {
>       // Below split threshold, send all reports in a single message.
>       DatanodeCommand cmd = bpNamenode.blockReport(
>           bpRegistration, bpos.getBlockPoolId(), reports);
> {code}





[jira] [Commented] (HDFS-6244) Make Trash Interval configurable for each of the namespaces

2015-08-13 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696329#comment-14696329
 ] 

Ming Ma commented on HDFS-6244:
---

Thanks [~l201514]!

The patch adds a key with the prefix "dfs.federation" to 
CommonConfigurationKeysPublic. I am not sure that is a good place to put it, 
given that federation is specific to HDFS, while CommonConfigurationKeysPublic and 
Trash are under hadoop-common-project and might be designed to be used by any 
FileSystem.

Your earlier patch had the NameNode read the new property defined in hdfs-site.xml 
and set the value for {{fs.trash.interval}} before creating {{Trash}}. Any 
reason not to go with that?

{{dfs.federation.trash.interval.ns.}} might be misleading as ns might mean 
nanosecond. "minutes" might be better. Another thing, maybe we can drop 
federation from the name; {{dfs.trash.interval.minutes}} is good enough; just 
like how {{dfs.namenode.rpc-address}} is used as prefix for different 
namespaces.

It might be useful to add some description for the new property and how it 
overrides the {{fs.trash.interval}}.
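
To make the override behaviour concrete, here is a rough sketch under stated assumptions: the per-namespace key is written as {{dfs.trash.interval.minutes.<nameserviceId>}}, which is only the naming suggested above and not a committed property, and a plain Map stands in for the Hadoop Configuration. TrashIntervalSketch is a hypothetical name.

```java
import java.util.Map;

/** Hypothetical lookup of a per-namespace trash interval with a global fallback. */
final class TrashIntervalSketch {
    /** Existing global key; its value is in minutes and 0 disables trash. */
    static final String DEFAULT_KEY = "fs.trash.interval";
    /** Assumed per-namespace prefix, following the naming suggested in the comment. */
    static final String NS_PREFIX = "dfs.trash.interval.minutes.";

    static long trashIntervalMinutes(Map<String, String> conf, String nameserviceId) {
        // A namespace-specific value, if present, overrides the global one.
        String override = conf.get(NS_PREFIX + nameserviceId);
        if (override != null) {
            return Long.parseLong(override);
        }
        return Long.parseLong(conf.getOrDefault(DEFAULT_KEY, "0"));
    }
}
```

The NameNode for a given namespace could then set {{fs.trash.interval}} from this value before creating {{Trash}}, as the earlier patch did.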

The patch includes an unrelated FairSchedulerPage change.

> Make Trash Interval configurable for each of the namespaces
> ---
>
> Key: HDFS-6244
> URL: https://issues.apache.org/jira/browse/HDFS-6244
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Assignee: Siqi Li
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6244.v1.patch, HDFS-6244.v2.patch, 
> HDFS-6244.v3.patch, HDFS-6244.v4.patch
>
>
> Somehow we need to avoid the cluster filling up.
> One solution is to have a different trash policy per namespace. However, if 
> we can simply make the property configurable per namespace, then the same 
> config can be rolled everywhere and we'd be done. This seems simple enough.





[jira] [Updated] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode

2015-08-13 Thread Zhihua Deng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HDFS-7980:
--
Attachment: (was: hadoop-241.patch)

> Incremental BlockReport will dramatically slow down the startup of  a namenode
> --
>
> Key: HDFS-7980
> URL: https://issues.apache.org/jira/browse/HDFS-7980
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Hui Zheng
>Assignee: Walter Su
>  Labels: 2.6.1-candidate
> Fix For: 2.7.1
>
> Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, 
> HDFS-7980.003.patch, HDFS-7980.004.patch, HDFS-7980.004.repost.patch
>
>
> In the current implementation the datanode calls the 
> reportReceivedDeletedBlocks() method, which sends an IncrementalBlockReport, before 
> calling the bpNamenode.blockReport() method. So in a large (several thousand 
> datanodes) and busy cluster it will slow down (more than one hour) the 
> startup of the namenode. 
> {code}
> List<DatanodeCommand> blockReport() throws IOException {
>   // send block report if timer has expired.
>   final long startTime = now();
>   if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
>     return null;
>   }
>   final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();
>   // Flush any block information that precedes the block report. Otherwise
>   // we have a chance that we will miss the delHint information
>   // or we will report an RBW replica after the BlockReport already reports
>   // a FINALIZED one.
>   reportReceivedDeletedBlocks();
>   lastDeletedReport = startTime;
> .
>   // Send the reports to the NN.
>   int numReportsSent = 0;
>   int numRPCs = 0;
>   boolean success = false;
>   long brSendStartTime = now();
>   try {
>     if (totalBlockCount < dnConf.blockReportSplitThreshold) {
>       // Below split threshold, send all reports in a single message.
>       DatanodeCommand cmd = bpNamenode.blockReport(
>           bpRegistration, bpos.getBlockPoolId(), reports);
> {code}





[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2015-08-13 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696349#comment-14696349
 ] 

Kai Sasaki commented on HDFS-8287:
--

I rebased HDFS-7285. Could you please check it? Thank you!

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets.  It calls waitAndQueuePacket sequentially, so the user client cannot 
> continue to write data until it finishes.
> We should instead allow the user client to continue writing, rather than blocking it 
> while parity is written.
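
One possible shape for this, as a sketch only: parity generation is handed to a background worker so the writing thread returns immediately, and outstanding parity is awaited on flush or close. XOR parity is used here as a simple stand-in for the real erasure codec, and AsyncParitySketch, onStripeFull and awaitParity are hypothetical names, not the DFSStripedOutputStream API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration of computing parity off the writer thread. */
final class AsyncParitySketch {
    private final ExecutorService parityWorker = Executors.newSingleThreadExecutor();
    private final List<Future<byte[]>> pending = new ArrayList<>();

    /** XOR of all data cells; a stand-in for the real erasure codec. */
    static byte[] xorParity(byte[][] cells) {
        byte[] parity = new byte[cells[0].length];
        for (byte[] cell : cells) {
            for (int i = 0; i < parity.length; i++) {
                parity[i] ^= cell[i];
            }
        }
        return parity;
    }

    /** Called when a stripe is full: queue the parity work and return at once. */
    void onStripeFull(byte[][] cells) {
        // Copy the cells so the writer can reuse its buffers right away.
        byte[][] copy = new byte[cells.length][];
        for (int i = 0; i < cells.length; i++) {
            copy[i] = cells[i].clone();
        }
        pending.add(parityWorker.submit(() -> xorParity(copy)));
    }

    /** Called on flush or close: wait for all outstanding parity results. */
    List<byte[]> awaitParity() {
        List<byte[]> results = new ArrayList<>();
        try {
            for (Future<byte[]> f : pending) {
                results.add(f.get());
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parity computation failed", e);
        }
        pending.clear();
        return results;
    }

    void close() {
        parityWorker.shutdown();
    }
}
```

The trade-off is extra buffering: the cells must be copied (or their buffers not reused) until the parity for that stripe has been produced.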





[jira] [Updated] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2015-08-13 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated HDFS-8287:
-
Attachment: HDFS-8287-HDFS-7285.03.patch

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets.  It calls waitAndQueuePacket sequentially, so the user client cannot 
> continue to write data until it finishes.
> We should instead allow the user client to continue writing, rather than blocking it 
> while parity is written.





[jira] [Commented] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode

2015-08-13 Thread Zhihua Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696376#comment-14696376
 ] 

Zhihua Deng commented on HDFS-7980:
---

   
Recently, we encountered the same problem in our cluster running version 2.4.1 
and created a 
patch (https://github.com/dengzhhu653/hdfs-2.4.1/blob/master/hadoop-241.patch) 
based on the attached patch. It lets the restarted NN process the first full 
report with the faster processFirstBlockReport method, and adds the condition 
AddBlockResult.ADDED==result in the addStoredBlockImmediate method where 
FSNameSystem invokes the incrementSafeBlockCount method.

I am not sure whether the patch has any potential issues 
when applied to our cluster. Any advice and opinions would be greatly 
appreciated and taken seriously. Thanks!

> Incremental BlockReport will dramatically slow down the startup of  a namenode
> --
>
> Key: HDFS-7980
> URL: https://issues.apache.org/jira/browse/HDFS-7980
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Hui Zheng
>Assignee: Walter Su
>  Labels: 2.6.1-candidate
> Fix For: 2.7.1
>
> Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, 
> HDFS-7980.003.patch, HDFS-7980.004.patch, HDFS-7980.004.repost.patch
>
>
> In the current implementation the datanode calls the 
> reportReceivedDeletedBlocks() method, which sends an IncrementalBlockReport, before 
> calling the bpNamenode.blockReport() method. So in a large (several thousand 
> datanodes) and busy cluster it will slow down (more than one hour) the 
> startup of the namenode. 
> {code}
> List<DatanodeCommand> blockReport() throws IOException {
>   // send block report if timer has expired.
>   final long startTime = now();
>   if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
>     return null;
>   }
>   final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();
>   // Flush any block information that precedes the block report. Otherwise
>   // we have a chance that we will miss the delHint information
>   // or we will report an RBW replica after the BlockReport already reports
>   // a FINALIZED one.
>   reportReceivedDeletedBlocks();
>   lastDeletedReport = startTime;
> .
>   // Send the reports to the NN.
>   int numReportsSent = 0;
>   int numRPCs = 0;
>   boolean success = false;
>   long brSendStartTime = now();
>   try {
>     if (totalBlockCount < dnConf.blockReportSplitThreshold) {
>       // Below split threshold, send all reports in a single message.
>       DatanodeCommand cmd = bpNamenode.blockReport(
>           bpRegistration, bpos.getBlockPoolId(), reports);
> {code}





[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696383#comment-14696383
 ] 

Hadoop QA commented on HDFS-8828:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 43s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 40s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 26s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  4s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 24s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 49s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests |   6m 26s | Tests passed in 
hadoop-distcp. |
| | |  43m 13s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12750404/HDFS-8828.007.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0a03054 |
| hadoop-distcp test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11990/artifact/patchprocess/testrun_hadoop-distcp.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11990/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11990/console |


This message was automatically generated.

> Utilize Snapshot diff report to build copy list in distcp
> -
>
> Key: HDFS-8828
> URL: https://issues.apache.org/jira/browse/HDFS-8828
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp, snapshots
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
> HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch, 
> HDFS-8828.006.patch, HDFS-8828.007.patch
>
>
> Some users reported a huge time cost to build the file copy list in distcp (30 
> hours for 1.6M files). We can leverage the snapshot diff report to build a file 
> copy list that includes only the files/dirs changed between two snapshots 
> (or between a snapshot and a normal dir). This speeds up the process in two ways: 1. 
> less copy list building time. 2. fewer file copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory creation, 
> deletion, rename and modification between two snapshots or a snapshot and a 
> normal directory. HDFS-7535 synchronizes deletion and rename, then falls back to 
> the default distcp. So it still relies on the default distcp to build the complete 
> list of files under the source dir. This patch puts only created and 
> modified files into the copy list, based on the snapshot diff report, so we can 
> minimize the number of files to copy. 
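
The selection step described above can be sketched roughly as follows. DiffCopyListSketch, DiffType and DiffEntry are local, hypothetical types written for this illustration; they are not the snapshot diff report classes or the distcp code from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of building a copy list from diff entries. */
final class DiffCopyListSketch {
    /** Kinds of change a diff entry can describe (local stand-in type). */
    enum DiffType { CREATE, MODIFY, DELETE, RENAME }

    /** One diff entry: what happened and to which path. */
    static final class DiffEntry {
        final DiffType type;
        final String path;

        DiffEntry(DiffType type, String path) {
            this.type = type;
            this.path = path;
        }
    }

    /** Keeps only created and modified paths; deletes and renames are synced separately. */
    static List<String> buildCopyList(List<DiffEntry> diff) {
        List<String> copyList = new ArrayList<>();
        for (DiffEntry entry : diff) {
            if (entry.type == DiffType.CREATE || entry.type == DiffType.MODIFY) {
                copyList.add(entry.path);
            }
        }
        return copyList;
    }
}
```

The copy list then scales with the size of the diff rather than with the number of files under the source dir.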





[jira] [Commented] (HDFS-8435) createNonRecursive support needed in WebHdfsFileSystem to support HBase

2015-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696394#comment-14696394
 ] 

Hadoop QA commented on HDFS-8435:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  24m 56s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:red}-1{color} | javac |   7m 46s | The applied patch generated  1  
additional warning messages. |
| {color:green}+1{color} | javadoc |  10m 10s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 58s | Site still builds. |
| {color:red}-1{color} | checkstyle |   3m 38s | The applied patch generated  1 
new checkstyle issues (total was 104, now 105). |
| {color:red}-1{color} | whitespace |   0m  1s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   6m 34s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |  23m 20s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests | 118m 24s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 28s | Tests passed in 
hadoop-hdfs-client. |
| | | 200m 44s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.ha.TestZKFailoverController |
|   | hadoop.net.TestNetUtils |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockReportRateLimiting 
|
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12750390/HDFS-8435.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / 0a03054 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/diffJavacWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11989/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11989/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11989/console |


This message was automatically generated.

> createNonRecursive support needed in WebHdfsFileSystem to support HBase
> ---
>
> Key: HDFS-8435
> URL: https://issues.apache.org/jira/browse/HDFS-8435
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.6.0
>Reporter: Vinoth Sathappan
>Assignee: Jakob Homan
> Attachments: HDFS-8435-branch-2.7.001.patch, HDFS-8435.001.patch, 
> HDFS-8435.002.patch, HDFS-8435.003.patch
>
>
> The WebHdfsFileSystem implementation doesn't support createNonRecursive. 
> HBase depends on it extensively to function properly. Currently, when the 
> region servers are started over WebHDFS, they crash with:
> createNonRecursive unsupported for this filesystem class 
> org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
> at 
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1137)
> at 
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1112)
> at 
> org.apache.hadoop.fs.FileSystem.createNonRecursive(FileSystem.java:1088)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.init(ProtobufLogWriter.java:85)
> at 
> org.apac

[jira] [Commented] (HDFS-8895) Remove deprecated BlockStorageLocation APIs

2015-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696399#comment-14696399
 ] 

Hadoop QA commented on HDFS-8895:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  19m 23s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 56s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 45s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 34s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 30s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 10s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |   0m 26s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 28s | Tests passed in 
hadoop-hdfs-client. |
| | |  50m 47s | |
\\
\\
|| Reason || Tests ||
| Failed build | hadoop-hdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12750398/HDFS-8895.001.patch |
| Optional Tests | javac unit javadoc findbugs checkstyle |
| git revision | trunk / 0a03054 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11991/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11991/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11991/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11991/console |


This message was automatically generated.

> Remove deprecated BlockStorageLocation APIs
> ---
>
> Key: HDFS-8895
> URL: https://issues.apache.org/jira/browse/HDFS-8895
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-8895.001.patch
>
>
> HDFS-8887 supersedes DistributedFileSystem#getFileBlockStorageLocations, so 
> it can be removed from trunk.





[jira] [Updated] (HDFS-8622) Implement GETCONTENTSUMMARY operation for WebImageViewer

2015-08-13 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-8622:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed this to trunk and branch-2. Thanks [~jagadesh.kiran] for the 
continuous work!

> Implement GETCONTENTSUMMARY operation for WebImageViewer
> 
>
> Key: HDFS-8622
> URL: https://issues.apache.org/jira/browse/HDFS-8622
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jagadesh Kiran N
>Assignee: Jagadesh Kiran N
> Fix For: 2.8.0
>
> Attachments: HDFS-8622-00.patch, HDFS-8622-01.patch, 
> HDFS-8622-02.patch, HDFS-8622-03.patch, HDFS-8622-04.patch, 
> HDFS-8622-05.patch, HDFS-8622-06.patch, HDFS-8622-07.patch, 
> HDFS-8622-08.patch, HDFS-8622-09.patch, HDFS-8622-10.patch
>
>
> It would be better for administrators if {code} GETCONTENTSUMMARY {code} were 
> supported.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity

2015-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696458#comment-14696458
 ] 

Hadoop QA commented on HDFS-8287:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 43s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 47s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 57s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 39s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:red}-1{color} | eclipse:eclipse |   0m 15s | The patch failed to build 
with eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   0m 25s | Post-patch findbugs 
hadoop-hdfs-project/hadoop-hdfs compilation is broken. |
| {color:green}+1{color} | findbugs |   0m 25s | The patch does not introduce 
any new Findbugs (version ) warnings. |
| {color:red}-1{color} | native |   0m 23s | Failed to build the native portion 
 of hadoop-common prior to running the unit tests in   
hadoop-hdfs-project/hadoop-hdfs |
| | |  37m  9s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12750432/HDFS-8287-HDFS-7285.03.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / 1d37a88 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11993/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11993/console |


This message was automatically generated.

> DFSStripedOutputStream.writeChunk should not wait for writing parity 
> -
>
> Key: HDFS-8287
> URL: https://issues.apache.org/jira/browse/HDFS-8287
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Kai Sasaki
> Attachments: HDFS-8287-HDFS-7285.00.patch, 
> HDFS-8287-HDFS-7285.01.patch, HDFS-8287-HDFS-7285.02.patch, 
> HDFS-8287-HDFS-7285.03.patch
>
>
> When a striping cell is full, writeChunk computes and generates parity 
> packets.  It calls waitAndQueuePacket sequentially, so the user client cannot 
> continue to write data until it finishes.
> We should instead allow the user client to continue writing, rather than blocking it 
> while parity is written.





[jira] [Assigned] (HDFS-6939) Support path-based filtering of inotify events

2015-08-13 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore reassigned HDFS-6939:


Assignee: Surendra Singh Lilhore

> Support path-based filtering of inotify events
> --
>
> Key: HDFS-6939
> URL: https://issues.apache.org/jira/browse/HDFS-6939
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client, namenode, qjm
>Reporter: James Thomas
>Assignee: Surendra Singh Lilhore
>
> Users should be able to specify that they only want events involving 
> particular paths.
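
As a rough illustration of what path-based filtering could mean, the sketch below keeps only events whose path equals a watched path or lies beneath it. {{PathFilterSketch}}, {{isUnder}} and {{filter}} are hypothetical names, and events are reduced to plain path strings; the real inotify API is not modelled here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; events are represented only by their paths. */
final class PathFilterSketch {
  /** True if path equals prefix or is a descendant of it. */
  static boolean isUnder(String path, String prefix) {
    if (prefix.equals("/")) {
      return path.startsWith("/");
    }
    // Appending "/" prevents "/ab" from matching the watched path "/a".
    return path.equals(prefix) || path.startsWith(prefix + "/");
  }

  /** Keeps the event paths that fall under at least one watched path. */
  static List<String> filter(List<String> eventPaths, List<String> watched) {
    List<String> kept = new ArrayList<>();
    for (String p : eventPaths) {
      for (String w : watched) {
        if (isUnder(p, w)) {
          kept.add(p);
          break;
        }
      }
    }
    return kept;
  }
}
```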





[jira] [Assigned] (HDFS-8894) Set SO_KEEPALIVE on DN server sockets

2015-08-13 Thread kanaka kumar avvaru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kanaka kumar avvaru reassigned HDFS-8894:
-

Assignee: kanaka kumar avvaru

> Set SO_KEEPALIVE on DN server sockets
> -
>
> Key: HDFS-8894
> URL: https://issues.apache.org/jira/browse/HDFS-8894
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: kanaka kumar avvaru
>
> SO_KEEPALIVE is not set on things like datastreamer sockets which can cause 
> lingering ESTABLISHED sockets when there is a network glitch.
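
For reference, the JDK exposes this option through {{java.net.Socket#setKeepAlive}}. The sketch below only shows the call on a plain, unconnected socket; {{KeepAliveSketch}} and its methods are hypothetical, and where the DataNode would apply the option is not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketException;

/** Hypothetical sketch; shows only the JDK call, not DataNode code. */
final class KeepAliveSketch {
  /** Enables SO_KEEPALIVE so the OS probes idle connections for dead peers. */
  static void configure(Socket s) {
    try {
      s.setKeepAlive(true);
    } catch (SocketException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Creates a socket, enables the option, and reads it back. */
  static boolean demo() {
    try (Socket s = new Socket()) {
      configure(s);
      return s.getKeepAlive();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

How quickly a dead peer is detected still depends on the operating system's keepalive timers, which this option does not change.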





[jira] [Commented] (HDFS-8824) Do not use small blocks for balancing the cluster

2015-08-13 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696520#comment-14696520
 ] 

Jitendra Nath Pandey commented on HDFS-8824:


+1 for the latest patch.

> Do not use small blocks for balancing the cluster
> -
>
> Key: HDFS-8824
> URL: https://issues.apache.org/jira/browse/HDFS-8824
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h8824_20150727b.patch, h8824_20150811b.patch
>
>
> Balancer gets datanode block lists from NN and then moves the blocks in order 
> to balance the cluster.  It should not use small blocks, since moving them 
> generates a lot of overhead and does not help much in balancing the cluster.
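
The idea can be sketched as a size threshold applied to the candidate block list before any moves are scheduled. {{SmallBlockFilterSketch}}, {{movable}} and the {{minBlockSize}} parameter are hypothetical names chosen for illustration, and blocks are reduced to their sizes in bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; blocks are represented only by their sizes. */
final class SmallBlockFilterSketch {
  /** Returns the block sizes that are at least minBlockSize bytes. */
  static List<Long> movable(List<Long> blockSizes, long minBlockSize) {
    List<Long> result = new ArrayList<>();
    for (long size : blockSizes) {
      if (size >= minBlockSize) {
        result.add(size);
      }
    }
    return result;
  }
}
```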





[jira] [Commented] (HDFS-7213) processIncrementalBlockReport performance degradation

2015-08-13 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696524#comment-14696524
 ] 

Vinayakumar B commented on HDFS-7213:
-

Cherry-picked to 2.6.1.

> processIncrementalBlockReport performance degradation
> -
>
> Key: HDFS-7213
> URL: https://issues.apache.org/jira/browse/HDFS-7213
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Daryn Sharp
>Assignee: Eric Payne
>Priority: Critical
>  Labels: 2.6.1-candidate
> Fix For: 2.6.1
>
> Attachments: HDFS-7213.1412804753, HDFS-7213.1412806496.txt
>
>
> {{BlockManager#processIncrementalBlockReport}} has a debug line that is 
> missing an {{isDebugEnabled}} check.  The write lock is being held.  Coupled 
> with the increase in incremental block reports from receiving blocks, under 
> heavy load this log line noticeably degrades performance.
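
The cost of the missing check comes from building the log message even when debug logging is off. The sketch below uses a minimal stand-in {{Log}} class (an assumption, so that the example is self-contained) and a counter to show that the guarded call skips the string construction; it is not the actual BlockManager code.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; Log is a stand-in, not the real logging API. */
final class DebugGuardSketch {
  static final class Log {
    private final boolean debug;
    Log(boolean debug) { this.debug = debug; }
    boolean isDebugEnabled() { return debug; }
    void debug(String msg) { if (debug) System.out.println(msg); }
  }

  /** Counts how often the (potentially expensive) message is rendered. */
  static final AtomicInteger renders = new AtomicInteger();

  static String describe(long blockId) {
    renders.incrementAndGet();
    return "blk_" + blockId;
  }

  /** The argument is built before debug() can discard it. */
  static void unguarded(Log log, long blockId) {
    log.debug("received " + describe(blockId));
  }

  /** The argument is built only when debug logging is enabled. */
  static void guarded(Log log, long blockId) {
    if (log.isDebugEnabled()) {
      log.debug("received " + describe(blockId));
    }
  }
}
```

With debug disabled, {{unguarded}} still pays for {{describe}} on every call, which matters when the call sits inside a section holding the write lock.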





[jira] [Updated] (HDFS-7213) processIncrementalBlockReport performance degradation

2015-08-13 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-7213:

Fix Version/s: (was: 2.7.0)
   2.6.1

> processIncrementalBlockReport performance degradation
> -
>
> Key: HDFS-7213
> URL: https://issues.apache.org/jira/browse/HDFS-7213
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Daryn Sharp
>Assignee: Eric Payne
>Priority: Critical
>  Labels: 2.6.1-candidate
> Fix For: 2.6.1
>
> Attachments: HDFS-7213.1412804753, HDFS-7213.1412806496.txt
>
>
> {{BlockManager#processIncrementalBlockReport}} has a debug line that is 
> missing an {{isDebugEnabled}} check.  The write lock is being held.  Coupled 
> with the increase in incremental block reports from receiving blocks, under 
> heavy load this log line noticeably degrades performance.





[jira] [Updated] (HDFS-7213) processIncrementalBlockReport performance degradation

2015-08-13 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-7213:

Fix Version/s: 2.7.0

> processIncrementalBlockReport performance degradation
> -
>
> Key: HDFS-7213
> URL: https://issues.apache.org/jira/browse/HDFS-7213
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Daryn Sharp
>Assignee: Eric Payne
>Priority: Critical
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7213.1412804753, HDFS-7213.1412806496.txt
>
>
> {{BlockManager#processIncrementalBlockReport}} has a debug line that is 
> missing an {{isDebugEnabled}} check.  The write lock is being held.  Coupled 
> with the increase in incremental block reports from receiving blocks, under 
> heavy load this log line noticeably degrades performance.





[jira] [Updated] (HDFS-7213) processIncrementalBlockReport performance degradation

2015-08-13 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-7213:

Labels:   (was: 2.6.1-candidate)

> processIncrementalBlockReport performance degradation
> -
>
> Key: HDFS-7213
> URL: https://issues.apache.org/jira/browse/HDFS-7213
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Daryn Sharp
>Assignee: Eric Payne
>Priority: Critical
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7213.1412804753, HDFS-7213.1412806496.txt
>
>
> {{BlockManager#processIncrementalBlockReport}} has a debug line that is 
> missing an {{isDebugEnabled}} check.  The write lock is being held.  Coupled 
> with the increase in incremental block reports from receiving blocks, under 
> heavy load this log line noticeably degrades performance.





[jira] [Updated] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock

2015-08-13 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-7235:

Labels:   (was: 2.6.1-candidate)

> DataNode#transferBlock should report blocks that don't exist using 
> reportBadBlock
> -
>
> Key: HDFS-7235
> URL: https://issues.apache.org/jira/browse/HDFS-7235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, 
> HDFS-7235.003.patch, HDFS-7235.004.patch, HDFS-7235.005.patch, 
> HDFS-7235.006.patch, HDFS-7235.007.patch, HDFS-7235.007.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is, when NN chooses a replica as a source to replicate data on 
> the to-be-decommissioned DN to other DNs, it favors choosing this DN 
> to-be-decommissioned as the source of transfer (see BlockManager.java).  
> However, because of the bad disk, the DN would detect the source block to be 
> transferred as invalidBlock with the following logic in FsDatasetImpl.java:
> {code}
>   /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
>     final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(),
>         b.getLocalBlock());
>     return replicaInfo != null
>         && replicaInfo.getState() == state
>         && replicaInfo.getBlockFile().exists();
>   }
> {code}
> The reason that this method returns false (detecting invalid block) is 
> because the block file doesn't exist due to bad disk in this case. 
> The key issue we found here is, after DN detects an invalid block for the 
> above reason, it doesn't report the invalid block back to NN, thus NN doesn't 
> know that the block is corrupted, and keeps sending the data transfer request 
> to the same DN to be decommissioned, again and again. This caused an infinite 
> loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.
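
The behaviour the issue title asks for can be sketched as follows: when the validity check fails, the DN records a bad-block report instead of silently dropping the request. Everything here is a simplified assumption: {{TransferBlockSketch}}, its fields, and the {{long}} block ids are hypothetical stand-ins, and the report to the NN is modelled as a list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; not the actual DataNode#transferBlock code. */
final class TransferBlockSketch {
  /** Stand-in for block files that actually exist on disk. */
  final Set<Long> blocksOnDisk = new HashSet<>();
  /** Stand-in for bad-block reports sent to the NN. */
  final List<Long> reportedBad = new ArrayList<>();

  /** Simplified validity check: the block file must exist. */
  boolean isValid(long blockId) {
    return blocksOnDisk.contains(blockId);
  }

  /** Returns true if the transfer can start, false if the block was reported bad. */
  boolean transferBlock(long blockId) {
    if (!isValid(blockId)) {
      // Tell the NN, so it stops choosing this DN as the transfer source.
      reportedBad.add(blockId);
      return false;
    }
    return true;
  }
}
```

Once the NN learns that the replica is bad, it can pick another source, which breaks the retry loop described above.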





[jira] [Updated] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock

2015-08-13 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-7235:

Fix Version/s: 2.6.1

> DataNode#transferBlock should report blocks that don't exist using 
> reportBadBlock
> -
>
> Key: HDFS-7235
> URL: https://issues.apache.org/jira/browse/HDFS-7235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, 
> HDFS-7235.003.patch, HDFS-7235.004.patch, HDFS-7235.005.patch, 
> HDFS-7235.006.patch, HDFS-7235.007.patch, HDFS-7235.007.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is, when NN chooses a replica as a source to replicate data on 
> the to-be-decommissioned DN to other DNs, it favors choosing this DN 
> to-be-decommissioned as the source of transfer (see BlockManager.java).  
> However, because of the bad disk, the DN would detect the source block to be 
> transferred as invalidBlock with the following logic in FsDatasetImpl.java:
> {code}
>   /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
>     final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(),
>         b.getLocalBlock());
>     return replicaInfo != null
>         && replicaInfo.getState() == state
>         && replicaInfo.getBlockFile().exists();
>   }
> {code}
> The reason that this method returns false (detecting invalid block) is 
> because the block file doesn't exist due to bad disk in this case. 
> The key issue we found here is, after DN detects an invalid block for the 
> above reason, it doesn't report the invalid block back to NN, thus NN doesn't 
> know that the block is corrupted, and keeps sending the data transfer request 
> to the same DN to be decommissioned, again and again. This caused an infinite 
> loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.





[jira] [Commented] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock

2015-08-13 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696535#comment-14696535
 ] 

Vinayakumar B commented on HDFS-7235:
-

Cherry-picked to 2.6.1

> DataNode#transferBlock should report blocks that don't exist using 
> reportBadBlock
> -
>
> Key: HDFS-7235
> URL: https://issues.apache.org/jira/browse/HDFS-7235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, 
> HDFS-7235.003.patch, HDFS-7235.004.patch, HDFS-7235.005.patch, 
> HDFS-7235.006.patch, HDFS-7235.007.patch, HDFS-7235.007.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is, when NN chooses a replica as a source to replicate data on 
> the to-be-decommissioned DN to other DNs, it favors choosing this DN 
> to-be-decommissioned as the source of transfer (see BlockManager.java).  
> However, because of the bad disk, the DN would detect the source block to be 
> transferred as invalidBlock with the following logic in FsDatasetImpl.java:
> {code}
>   /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
>     final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(),
>         b.getLocalBlock());
>     return replicaInfo != null
>         && replicaInfo.getState() == state
>         && replicaInfo.getBlockFile().exists();
>   }
> {code}
> The reason that this method returns false (detecting invalid block) is 
> because the block file doesn't exist due to bad disk in this case. 
> The key issue we found here is, after DN detects an invalid block for the 
> above reason, it doesn't report the invalid block back to NN, thus NN doesn't 
> know that the block is corrupted, and keeps sending the data transfer request 
> to the same DN to be decommissioned, again and again. This caused an infinite 
> loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.





[jira] [Commented] (HDFS-7213) processIncrementalBlockReport performance degradation

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696547#comment-14696547
 ] 

Hudson commented on HDFS-7213:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8298 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8298/])
HDFS-7213. processIncrementalBlockReport performance degradation. Contributed 
by Eric Payne. (vinayakumarb: rev d25cb8fe12d00faf3e8f3bfd23fd1b01981a340f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> processIncrementalBlockReport performance degradation
> -
>
> Key: HDFS-7213
> URL: https://issues.apache.org/jira/browse/HDFS-7213
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.4.0
>Reporter: Daryn Sharp
>Assignee: Eric Payne
>Priority: Critical
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7213.1412804753, HDFS-7213.1412806496.txt
>
>
> {{BlockManager#processIncrementalBlockReport}} has a debug line that is 
> missing an {{isDebugEnabled}} check.  The write lock is being held.  Coupled 
> with the increase in incremental block reports from receiving blocks, under 
> heavy load this log line noticeably degrades performance.





[jira] [Commented] (HDFS-7235) DataNode#transferBlock should report blocks that don't exist using reportBadBlock

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696548#comment-14696548
 ] 

Hudson commented on HDFS-7235:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8298 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8298/])
HDFS-7235. DataNode#transferBlock should report blocks that don't exist using 
reportBadBlock (yzhang via cmccabe) (vinayakumarb: rev 
f2b4bc9b6a1bd3f9dbfc4e85c1b9bde238da3627)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> DataNode#transferBlock should report blocks that don't exist using 
> reportBadBlock
> -
>
> Key: HDFS-7235
> URL: https://issues.apache.org/jira/browse/HDFS-7235
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, 
> HDFS-7235.003.patch, HDFS-7235.004.patch, HDFS-7235.005.patch, 
> HDFS-7235.006.patch, HDFS-7235.007.patch, HDFS-7235.007.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is, when NN chooses a replica as a source to replicate data on 
> the to-be-decommissioned DN to other DNs, it favors choosing this DN 
> to-be-decommissioned as the source of transfer (see BlockManager.java).  
> However, because of the bad disk, the DN would detect the source block to be 
> transferred as invalidBlock with the following logic in FsDatasetImpl.java:
> {code}
>   /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
>     final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(),
>         b.getLocalBlock());
>     return replicaInfo != null
>         && replicaInfo.getState() == state
>         && replicaInfo.getBlockFile().exists();
>   }
> {code}
> The reason that this method returns false (detecting invalid block) is 
> because the block file doesn't exist due to bad disk in this case. 
> The key issue we found here is, after DN detects an invalid block for the 
> above reason, it doesn't report the invalid block back to NN, thus NN doesn't 
> know that the block is corrupted, and keeps sending the data transfer request 
> to the same DN to be decommissioned, again and again. This caused an infinite 
> loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.





[jira] [Commented] (HDFS-7263) Snapshot read can reveal future bytes for appended files.

2015-08-13 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696550#comment-14696550
 ] 

Vinayakumar B commented on HDFS-7263:
-

Cherry-picked to 2.6.1

> Snapshot read can reveal future bytes for appended files.
> -
>
> Key: HDFS-7263
> URL: https://issues.apache.org/jira/browse/HDFS-7263
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.5.0
>Reporter: Konstantin Shvachko
>Assignee: Tao Luo
>  Labels: 2.6.1-candidate
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7263.patch, HDFS-7263.patch, HDFS-7263.patch, 
> TestSnapshotRead.java
>
>
> The following sequence of steps will produce extra bytes, that should not be 
> visible, because they are not in the snapshot.
> * Create a file of size L, where {{L % blockSize != 0}}.
> * Create a snapshot
> * Append bytes to the file
> * Read file in the snapshot (not the current file)
> * You will see that bytes are read beyond the original file size L
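
The expected behaviour in the steps above is that a snapshot read never returns more than the length recorded when the snapshot was taken. The sketch below models a file as a byte array and caps the read at that length; {{SnapshotReadSketch}} and {{readSnapshot}} are hypothetical names, and this is not a description of how the attached patch fixes the client.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch; a file is modelled as an in-memory byte array. */
final class SnapshotReadSketch {
  /** Returns only the bytes that existed when the snapshot was taken. */
  static byte[] readSnapshot(byte[] currentFile, int snapshotLength) {
    int visible = Math.min(snapshotLength, currentFile.length);
    return Arrays.copyOf(currentFile, visible);
  }

  /** Convenience wrapper for text content, used for illustration only. */
  static String readSnapshotAsString(String currentContent, int snapshotLength) {
    byte[] bytes = currentContent.getBytes(StandardCharsets.UTF_8);
    return new String(readSnapshot(bytes, snapshotLength), StandardCharsets.UTF_8);
  }
}
```

For example, if the file held "hello" at snapshot time (L = 5) and " world" was appended afterwards, a snapshot read should still yield only "hello".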





[jira] [Updated] (HDFS-7263) Snapshot read can reveal future bytes for appended files.

2015-08-13 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-7263:

Fix Version/s: 2.6.1

> Snapshot read can reveal future bytes for appended files.
> -
>
> Key: HDFS-7263
> URL: https://issues.apache.org/jira/browse/HDFS-7263
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.5.0
>Reporter: Konstantin Shvachko
>Assignee: Tao Luo
>  Labels: 2.6.1-candidate
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7263.patch, HDFS-7263.patch, HDFS-7263.patch, 
> TestSnapshotRead.java
>
>
> The following sequence of steps will produce extra bytes, that should not be 
> visible, because they are not in the snapshot.
> * Create a file of size L, where {{L % blockSize != 0}}.
> * Create a snapshot
> * Append bytes to the file
> * Read file in the snapshot (not the current file)
> * You will see that bytes are read beyond the original file size L





[jira] [Updated] (HDFS-7263) Snapshot read can reveal future bytes for appended files.

2015-08-13 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-7263:

Labels:   (was: 2.6.1-candidate)

> Snapshot read can reveal future bytes for appended files.
> -
>
> Key: HDFS-7263
> URL: https://issues.apache.org/jira/browse/HDFS-7263
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.5.0
>Reporter: Konstantin Shvachko
>Assignee: Tao Luo
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7263.patch, HDFS-7263.patch, HDFS-7263.patch, 
> TestSnapshotRead.java
>
>
> The following sequence of steps will produce extra bytes, that should not be 
> visible, because they are not in the snapshot.
> * Create a file of size L, where {{L % blockSize != 0}}.
> * Create a snapshot
> * Append bytes to the file
> * Read file in the snapshot (not the current file)
> * You will see that bytes are read beyond the original file size L





[jira] [Commented] (HDFS-7263) Snapshot read can reveal future bytes for appended files.

2015-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696558#comment-14696558
 ] 

Hudson commented on HDFS-7263:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8299 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8299/])
HDFS-7263. Snapshot read can reveal future bytes for appended files. 
Contributed by Tao Luo. (vinayakumarb: rev 
fa2641143c0d74c4fef122d79f27791e15d3b43f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Snapshot read can reveal future bytes for appended files.
> -
>
> Key: HDFS-7263
> URL: https://issues.apache.org/jira/browse/HDFS-7263
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.5.0
>Reporter: Konstantin Shvachko
>Assignee: Tao Luo
> Fix For: 2.7.0, 2.6.1
>
> Attachments: HDFS-7263.patch, HDFS-7263.patch, HDFS-7263.patch, 
> TestSnapshotRead.java
>
>
> The following sequence of steps will produce extra bytes, that should not be 
> visible, because they are not in the snapshot.
> * Create a file of size L, where {{L % blockSize != 0}}.
> * Create a snapshot
> * Append bytes to the file
> * Read file in the snapshot (not the current file)
> * You will see that bytes are read beyond the original file size L


