[jira] [Commented] (HDFS-8913) Documentation clarity regarding Secondary node, Checkpoint node & Backup node

2016-01-03 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080782#comment-15080782
 ] 

Jeff Zhang commented on HDFS-8913:
--

+1, I think we do need to highlight the differences between these roles. 

> Documentation clarity regarding Secondary node, Checkpoint node & Backup node
> -
>
> Key: HDFS-8913
> URL: https://issues.apache.org/jira/browse/HDFS-8913
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7.1
> Environment: Content in documentation
>Reporter: Ravindra Babu
>Priority: Trivial
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I checked with many people and almost all of them are confused about the 
> responsibilities of the Secondary NameNode, Checkpoint node and Backup node.
> Link:
> http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
> Confusion:
> Secondary NameNode
> The NameNode stores modifications to the file system as a log appended to a 
> native file system file, edits. When a NameNode starts up, it reads HDFS 
> state from an image file, fsimage, and then applies edits from the edits log 
> file. It then writes new HDFS state to the fsimage and starts normal 
> operation with an empty edits file. Since NameNode merges fsimage and edits 
> files only during start up, the edits log file could get very large over time 
> on a busy cluster. Another side effect of a larger edits file is that next 
> restart of NameNode takes longer.
> The secondary NameNode merges the fsimage and the edits log files 
> periodically and keeps edits log size within a limit. It is usually run on a 
> different machine than the primary NameNode since its memory requirements are 
> on the same order as the primary NameNode.
> Checkpoint Node
> NameNode persists its namespace using two files: fsimage, which is the latest 
> checkpoint of the namespace and edits, a journal (log) of changes to the 
> namespace since the checkpoint. When a NameNode starts up, it merges the 
> fsimage and edits journal to provide an up-to-date view of the file system 
> metadata. The NameNode then overwrites fsimage with the new HDFS state and 
> begins a new edits journal.
> Backup Node
> The Backup node provides the same checkpointing functionality as the 
> Checkpoint node, as well as maintaining an in-memory, up-to-date copy of the 
> file system namespace that is always synchronized with the active NameNode 
> state. Along with accepting a journal stream of file system edits from the 
> NameNode and persisting this to disk, the Backup node also applies those 
> edits into its own copy of the namespace in memory, thus creating a backup of 
> the namespace.
> Now all three nodes have overlapping functionalities. To add to the confusion, 
> http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
> states that the NameNode never makes RPC calls to other nodes.
> The Communication Protocols
> All HDFS communication protocols are layered on top of the TCP/IP protocol. A 
> client establishes a connection to a configurable TCP port on the NameNode 
> machine. It talks the ClientProtocol with the NameNode. The DataNodes talk to 
> the NameNode using the DataNode Protocol. A Remote Procedure Call (RPC) 
> abstraction wraps both the Client Protocol and the DataNode Protocol. By 
> design, the NameNode never initiates any RPCs. Instead, it only responds to 
> RPC requests issued by DataNodes or clients.
> We need clarification regarding these points. Please enhance your 
> documentation to avoid confusion among readers.
> 1) Secondary NameNode, Checkpoint node & Backup node - clear separation of roles
> 2) For High Availability, do we require only one of them, two of them, or 
> all of them? If not all of them, what combination is allowed?
> 3) Without RPCs from the NameNode to DataNodes, how do writes and reads happen?
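
To illustrate question 3: reads and writes are driven entirely by the client. Below is a minimal, hypothetical client-side sketch (not taken from the HDFS documentation); the client talks to the NameNode only via ClientProtocol RPCs and streams data directly to the DataNodes.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ClientDrivenWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);   // client-side handle; talks to the NameNode over ClientProtocol
    try (FSDataOutputStream out = fs.create(new Path("/tmp/example.txt"))) {
      // create() and each block allocation are client -> NameNode RPCs;
      // the bytes themselves are streamed from the client to the DataNode pipeline.
      out.writeBytes("Hello HDFS!");
    }
    // DataNodes report the new replicas to the NameNode through heartbeats and
    // block reports, so the NameNode never has to initiate an RPC itself.
  }
}
{code}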



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9376) TestSeveralNameNodes fails occasionally

2016-01-03 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080780#comment-15080780
 ] 

Masatake Iwasaki commented on HDFS-9376:


Thanks, [~cnauroth].

> TestSeveralNameNodes fails occasionally
> ---
>
> Key: HDFS-9376
> URL: https://issues.apache.org/jira/browse/HDFS-9376
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Kihwal Lee
>Assignee: Masatake Iwasaki
> Fix For: 3.0.0
>
> Attachments: HDFS-9376.001.patch, HDFS-9376.002.patch
>
>
> TestSeveralNameNodes has been failing in precommit builds.  It usually times 
> out on waiting for the last thread to finish writing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2016-01-03 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9601:
---
Attachment: HDFS-9601.002.patch

I attached 002.
* simplified retrying
* added a comment
* got rid of logging on every retry, because NNThroughputBenchmark could be 
used from the command line

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9601.001.patch, HDFS-9601.002.patch
>
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
> itself, as {{DFSOutputStream#addBlock}} does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9601) NNThroughputBenchmark.BlockReportStats should handle NotReplicatedYetException on adding block

2016-01-03 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080778#comment-15080778
 ] 

Masatake Iwasaki commented on HDFS-9601:


Thanks for the comment, [~liuml07].

As you say, we cannot reuse {{DFSOutputStream#addBlock}} as is, since it depends on a 
real DFS client. I think it is not a problem to retry {{addBlock}} directly, because 
{{generateInputs}} is called in the preparation phase of the benchmark. It does not 
need to have the same behavior as {{DFSOutputStream}}; it should not even need 
exponential backoff.
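
A minimal sketch of that retry idea (not the actual patch; {{addBlockOnce}} and {{BlockAllocator}} are hypothetical stand-ins for the {{ClientProtocol#addBlock}} call in {{generateInputs}}):

{code:java}
import org.apache.hadoop.hdfs.protocol.LocatedBlock;
import org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException;

class AddBlockRetry {
  private static final int MAX_RETRIES = 30;

  /** Hypothetical abstraction over the real ClientProtocol#addBlock call. */
  interface BlockAllocator {
    LocatedBlock addBlockOnce() throws Exception;
  }

  static LocatedBlock addBlockWithRetry(BlockAllocator allocator) throws Exception {
    for (int i = 0; i < MAX_RETRIES; i++) {
      try {
        return allocator.addBlockOnce();
      } catch (NotReplicatedYetException e) {
        Thread.sleep(100);  // previous block not yet reported by DataNodes; fixed wait, no backoff
      }
    }
    throw new Exception("addBlock still failing after " + MAX_RETRIES + " retries");
  }
}
{code}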

> NNThroughputBenchmark.BlockReportStats should handle 
> NotReplicatedYetException on adding block
> --
>
> Key: HDFS-9601
> URL: https://issues.apache.org/jira/browse/HDFS-9601
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
> Attachments: HDFS-9601.001.patch
>
>
> TestNNThroughputBenchmark intermittently fails due to 
> NotReplicatedYetException. Because 
> {{NNThroughputBenchmark.BlockReportStats#generateInputs}} directly uses 
> {{ClientProtocol#addBlock}}, it must handle {{NotReplicatedYetException}} by 
> itself, as {{DFSOutputStream#addBlock}} does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9607) Advance Hadoop Architecture (AHA) - HDFS

2016-01-03 Thread Dinesh S. Atreya (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080683#comment-15080683
 ] 

Dinesh S. Atreya commented on HDFS-9607:


Personal preference regarding API naming.

* We can prefer to use "writeInPlace" for the method name (as it is 
descriptive). Other suggestions are welcome.
* We can have auxiliary methods "updtInPlace" and "deleteInPlace", which will be 
no-ops that perform the "checkWriteInPlace" check or call "writeInPlace". 

Hadoop sub-systems such as ORC, Hive, etc. may then override the "updtInPlace" 
and "deleteInPlace" methods to implement functionality relevant to each. 

> Advance Hadoop Architecture (AHA) - HDFS
> 
>
> Key: HDFS-9607
> URL: https://issues.apache.org/jira/browse/HDFS-9607
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Dinesh S. Atreya
>
> Link to Umbrella JIRA
> https://issues.apache.org/jira/browse/HADOOP-12620 
> Provide capability to carry out in-place writes/updates. Only writes in-place 
> are supported where the existing length does not change.
> For example, "Hello World" can be replaced by "Hello HDFS!"
> See 
> https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
>  for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9445) Datanode may deadlock while handling a bad volume

2016-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080680#comment-15080680
 ] 

Hadoop QA commented on HDFS-9445:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} HDFS-9445 does not apply to branch-2. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12780259/HDFS-9445-branch-2.6_02.patch
 |
| JIRA Issue | HDFS-9445 |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14016/console |


This message was automatically generated.



> Datanode may deadlock while handling a bad volume
> -
>
> Key: HDFS-9445
> URL: https://issues.apache.org/jira/browse/HDFS-9445
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9445-branch-2.6_02.patch, HDFS-9445.00.patch, 
> HDFS-9445.01.patch, HDFS-9445.02.patch
>
>
> {noformat}
> Found one Java-level deadlock:
> =
> "DataXceiver for client DFSClient_attempt_xxx at /1.2.3.4:100 [Sending block 
> BP-x:blk_123_456]":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> "Thread-565":
>   waiting for ownable synchronizer 0xd55613c8, (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "DataNode: heartbeating to my-nn:8020"
> "DataNode: heartbeating to my-nn:8020":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9445) Datanode may deadlock while handling a bad volume

2016-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080679#comment-15080679
 ] 

Hadoop QA commented on HDFS-9445:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s {color} 
| {color:red} HDFS-9445 does not apply to branch-2. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12780259/HDFS-9445-branch-2.6_02.patch
 |
| JIRA Issue | HDFS-9445 |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14015/console |


This message was automatically generated.



> Datanode may deadlock while handling a bad volume
> -
>
> Key: HDFS-9445
> URL: https://issues.apache.org/jira/browse/HDFS-9445
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9445-branch-2.6_02.patch, HDFS-9445.00.patch, 
> HDFS-9445.01.patch, HDFS-9445.02.patch
>
>
> {noformat}
> Found one Java-level deadlock:
> =
> "DataXceiver for client DFSClient_attempt_xxx at /1.2.3.4:100 [Sending block 
> BP-x:blk_123_456]":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> "Thread-565":
>   waiting for ownable synchronizer 0xd55613c8, (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "DataNode: heartbeating to my-nn:8020"
> "DataNode: heartbeating to my-nn:8020":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HDFS-9445) Datanode may deadlock while handling a bad volume

2016-01-03 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su reopened HDFS-9445:
-

> Datanode may deadlock while handling a bad volume
> -
>
> Key: HDFS-9445
> URL: https://issues.apache.org/jira/browse/HDFS-9445
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9445-branch-2.6_02.patch, HDFS-9445.00.patch, 
> HDFS-9445.01.patch, HDFS-9445.02.patch
>
>
> {noformat}
> Found one Java-level deadlock:
> =
> "DataXceiver for client DFSClient_attempt_xxx at /1.2.3.4:100 [Sending block 
> BP-x:blk_123_456]":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> "Thread-565":
>   waiting for ownable synchronizer 0xd55613c8, (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "DataNode: heartbeating to my-nn:8020"
> "DataNode: heartbeating to my-nn:8020":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9445) Datanode may deadlock while handling a bad volume

2016-01-03 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-9445:

Status: Patch Available  (was: Reopened)

> Datanode may deadlock while handling a bad volume
> -
>
> Key: HDFS-9445
> URL: https://issues.apache.org/jira/browse/HDFS-9445
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9445-branch-2.6_02.patch, HDFS-9445.00.patch, 
> HDFS-9445.01.patch, HDFS-9445.02.patch
>
>
> {noformat}
> Found one Java-level deadlock:
> =
> "DataXceiver for client DFSClient_attempt_xxx at /1.2.3.4:100 [Sending block 
> BP-x:blk_123_456]":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> "Thread-565":
>   waiting for ownable synchronizer 0xd55613c8, (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "DataNode: heartbeating to my-nn:8020"
> "DataNode: heartbeating to my-nn:8020":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9607) Advance Hadoop Architecture (AHA) - HDFS

2016-01-03 Thread Dinesh S. Atreya (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080678#comment-15080678
 ] 

Dinesh S. Atreya commented on HDFS-9607:


A few checks that should be implemented by the method checkWriteInPlace (or 
alternatively checkUpdtInPlace) are as follows:

* The maximum length of either readLength or writeBuffer should be less than 
the blocksize.
** Multiple updates can be chained with separate method calls, each update 
less than or equal to the blocksize.
* readLength should be equal to writeBuffer.length.
** If it is not, the check fails.
* desiredWritePos + readLength (and desiredWritePos + writeBuffer.length) 
cannot extend beyond the block boundary.
* It is presumed that updates must not extend beyond the end of the file. 
Hence desiredWritePos + readLength (and desiredWritePos + writeBuffer.length) 
should be less than or equal to the length of the file. This ensures that the 
EOF is not compromised.

Please indicate any other checks that should be implemented.

*Since we are talking about HDFS, data integrity is very _critical_. The checks 
should ensure that data does not get corrupted.*
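
A minimal sketch of the checks listed above (parameter names follow the API proposal; {{blockSize}} and {{fileLength}} are assumed to be available to the stream):

{code:java}
boolean checkWriteInPlace(long desiredWritePos, int readLength, byte[] writeBuffer,
                          long blockSize, long fileLength) {
  if (readLength > blockSize || writeBuffer.length > blockSize) {
    return false;                              // each update must fit within one block
  }
  if (readLength != writeBuffer.length) {
    return false;                              // lengths must match exactly
  }
  long blockStart = (desiredWritePos / blockSize) * blockSize;
  if (desiredWritePos + readLength > blockStart + blockSize) {
    return false;                              // must not cross the block boundary
  }
  if (desiredWritePos + readLength > fileLength) {
    return false;                              // must not extend beyond the end of file
  }
  return true;
}
{code}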


> Advance Hadoop Architecture (AHA) - HDFS
> 
>
> Key: HDFS-9607
> URL: https://issues.apache.org/jira/browse/HDFS-9607
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Dinesh S. Atreya
>
> Link to Umbrella JIRA
> https://issues.apache.org/jira/browse/HADOOP-12620 
> Provide capability to carry out in-place writes/updates. Only writes in-place 
> are supported where the existing length does not change.
> For example, "Hello World" can be replaced by "Hello HDFS!"
> See 
> https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
>  for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8430) Erasure coding: update DFSClient.getFileChecksum() logic for stripe files

2016-01-03 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080672#comment-15080672
 ] 

Walter Su commented on HDFS-8430:
-

bq. 1. Use CRC64 (or some other linear code) for block checksum instead of MD5.
Agreed. CRC works fine as a hash function. Our purpose is file comparison; MD5 is 
overkill.
MD5 is 128 bits, so I think you mean CRC128?

bq. The datanode may compute cell CRC64s...
We may have many policies and many cell sizes. Let's say the minimal cell size is 
64k. Do you mean calculating a CRC per 64k (instead of the default value of 
_dfs.bytes-per-checksum_)? It does reduce network traffic, but I thought we 
could use the block metadata, which already has the CRCs, and avoid 
re-calculation.

bq. Instead of sending all CRCs to the client, send all CRCs to one of the 
datanode in a block group. 
Either way, we still need to fetch all CRCs from 6 (or 9) DNs and change the 
ordering, so that the hash value can be the same as for a replicated block.

bq. The hard part would be to consider the block missing, decoding and checksum 
computing case.
Agreed.

> Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
> -
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for striped 
> block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8430) Erasure coding: update DFSClient.getFileChecksum() logic for stripe files

2016-01-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080667#comment-15080667
 ] 

Kai Zheng commented on HDFS-8430:
-

Thanks, Nicholas, for the correction. Yeah, I misunderstood. It's smart to adjust 
the algorithm on the replicated files side to conform with striped files. The 
impact might be big for existing clusters, because they will find that their 
previously identical replicated files no longer compare equal. To avoid the impact, 
how about adding a new API for the new behaviour? In the new approach, would we need 
to introduce a {{cell}} for replicated files, similar to striped files, when computing 
the checksum? If so, how would we determine it? When a replicated file is compared to 
a striped file, I guess we can use the cell size of the striped file for 
the replicated file. But then the cell value needs to be passed in when 
calling {{getFileChecksum}}, which should be fine if we introduce a new API.

I guess you want to use CRC64 to be safer against collisions than CRC32 and to keep 
network traffic smaller than with MD5: {{64 bits x numCellsInOneBlock}} instead 
of {{16 bytes x numCellsInOneBlock}}. Please correct me if I don't get your 
point. Thanks.

> Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
> -
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for striped 
> block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9445) Datanode may deadlock while handling a bad volume

2016-01-03 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-9445:

Attachment: HDFS-9445-branch-2.6_02.patch

yes. Uploaded HDFS-9445-branch-2.6_02.patch

> Datanode may deadlock while handling a bad volume
> -
>
> Key: HDFS-9445
> URL: https://issues.apache.org/jira/browse/HDFS-9445
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9445-branch-2.6_02.patch, HDFS-9445.00.patch, 
> HDFS-9445.01.patch, HDFS-9445.02.patch
>
>
> {noformat}
> Found one Java-level deadlock:
> =
> "DataXceiver for client DFSClient_attempt_xxx at /1.2.3.4:100 [Sending block 
> BP-x:blk_123_456]":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> "Thread-565":
>   waiting for ownable synchronizer 0xd55613c8, (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "DataNode: heartbeating to my-nn:8020"
> "DataNode: heartbeating to my-nn:8020":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8430) Erasure coding: update DFSClient.getFileChecksum() logic for stripe files

2016-01-03 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080639#comment-15080639
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8430:
---

> ... It looks like it's acceptable the file checksum for striped files are not 
> compatible or comparable with replicated files. ...

No, I mean changing the checksum algorithm for replicated files so that the new 
algorithm can also handle striped files.  That's why I said that the new 
algorithm was incompatible with the existing algorithm.

> Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
> -
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for striped 
> block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9608) Disk IO imbalance in HDFS with heterogeneous storages

2016-01-03 Thread Wei Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zhou updated HDFS-9608:
---
Attachment: HDFS-9608.01.patch

This patch ensures that volumes are chosen in RR mode for each storage type. Besides, 
it uses a storage-type-level lock to allow concurrent choosing across different 
storage types. Thanks!
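
A simplified sketch of the idea (not the patch itself): keep one round-robin index per storage type so that choosing a volume of one type never disturbs the rotation of another; here an atomic counter per type stands in for the storage-type-level lock used in the patch.

{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class PerStorageTypeRoundRobin<V> {
  private final Map<String, AtomicInteger> indices = new ConcurrentHashMap<>();

  V choose(String storageType, List<V> volumesOfType) {
    AtomicInteger idx = indices.computeIfAbsent(storageType, t -> new AtomicInteger());
    int i = Math.floorMod(idx.getAndIncrement(), volumesOfType.size());
    return volumesOfType.get(i);   // round-robin within this storage type only
  }
}
{code}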

> Disk IO imbalance in HDFS with heterogeneous storages
> -
>
> Key: HDFS-9608
> URL: https://issues.apache.org/jira/browse/HDFS-9608
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Attachments: HDFS-9608.01.patch
>
>
> Currently RoundRobinVolumeChoosingPolicy uses a shared index to choose volumes 
> in HDFS with heterogeneous storages, which leads to a non-RR choosing mode for 
> certain types of storage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9608) Disk IO imbalance in HDFS with heterogeneous storages

2016-01-03 Thread Wei Zhou (JIRA)
Wei Zhou created HDFS-9608:
--

 Summary: Disk IO imbalance in HDFS with heterogeneous storages
 Key: HDFS-9608
 URL: https://issues.apache.org/jira/browse/HDFS-9608
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Wei Zhou
Assignee: Wei Zhou


Currently RoundRobinVolumeChoosingPolicy uses a shared index to choose volumes 
in HDFS with heterogeneous storages, which leads to a non-RR choosing mode for 
certain types of storage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8430) Erasure coding: update DFSClient.getFileChecksum() logic for stripe files

2016-01-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080608#comment-15080608
 ] 

Kai Zheng commented on HDFS-8430:
-

So, if the above approach is used, the involved change would be much smaller: we can 
reuse {{DataXceiver#blockChecksum}} without changing it at all, with a small 
change to {{DFSClient#getFileChecksum}} to handle block groups, and a small 
change in {{distcp}} to be aware of block layout differences (if the layouts 
differ, no file checksum comparison is needed since the files surely differ). 
No extra RPC is involved. The hard part would be handling the case of missing 
blocks, decoding and checksum computation.

> Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
> -
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for striped 
> block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8430) Erasure coding: update DFSClient.getFileChecksum() logic for stripe files

2016-01-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080604#comment-15080604
 ] 

Kai Zheng commented on HDFS-8430:
-

Thanks [~szetszwo] for the ideas! It looks like it is acceptable that the file 
checksums for striped files are not compatible or comparable with those of 
replicated files. That sounds fine, since we may seldom compare striped files with 
replicated files, and if we do, they will surely differ since their block 
layouts are entirely different. So in this direction, I guess things could be 
simpler, since we can consider different algorithms for striped files as you 
said, and we could avoid the increased network traffic. 
bq. Or simply compute cell checksums for replicated files instead of block 
checksums.
I guess you mean *striped files*? Keeping it {{simple}} along these lines, do you 
think it would work if we do as illustrated in detail below?

Assume a block group of blocks {{b0}} to {{b5}}, with {{n+1}} stripes (rows). 
The first stripe consists of cells {{c00}} to {{c05}}, and so on. In the first 
column, the cells {{c00}}, {{c10}}, ..., {{cn0}} reside on block {{b0}}, and 
so on for the other columns.
{noformat}
b0   b1   b2   b3   b4   b5
c00  c01  c02  c03  c04  c05
c10  c11  c12  c13  c14  c15
...
cn0  cn1  cn2  cn3  cn4  cn5
{noformat}
Similar to the block MD5 algorithm for replicated files in 
{{DataXceiver#blockChecksum}}, we could compute the checksum result for block 
{{b0}} by aggregating the MD5 (or other algorithm) hash results for the located 
cells (c00, c10, ..., cn0).
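
A rough sketch of that column-wise aggregation, with hypothetical types (a real implementation would likely reuse the per-chunk CRCs already present in the block metadata rather than hashing raw cell bytes):

{code:java}
import java.security.MessageDigest;
import java.util.List;

class CellChecksumSketch {
  /** Compute one block's checksum (e.g. for b0) from its cells c00, c10, ..., cn0 in column order. */
  static byte[] blockChecksumFromCells(List<byte[]> cellsOfOneBlock) throws Exception {
    MessageDigest blockDigest = MessageDigest.getInstance("MD5");
    for (byte[] cell : cellsOfOneBlock) {
      byte[] cellDigest = MessageDigest.getInstance("MD5").digest(cell);
      blockDigest.update(cellDigest);   // aggregate per-cell hashes, mirroring DataXceiver#blockChecksum
    }
    return blockDigest.digest();
  }
}
{code}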

> Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
> -
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for striped 
> block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9294) DFSClient deadlock when close file and failed to renew lease

2016-01-03 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080594#comment-15080594
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9294:
---

+1, we should cherry-pick this to branch-2.6 as well.

> DFSClient  deadlock when close file and failed to renew lease
> -
>
> Key: HDFS-9294
> URL: https://issues.apache.org/jira/browse/HDFS-9294
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.2.0, 2.7.1
> Environment: Hadoop 2.2.0
>Reporter: 邓飞
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Fix For: 2.7.2
>
> Attachments: HDFS-9294-002.patch, HDFS-9294-002.patch, 
> HDFS-9294-branch-2.7.patch, HDFS-9294-branch-2.patch, HDFS-9294.patch
>
>
> We found a deadlock on our HBase (0.98) cluster (the Hadoop version is 
> 2.2.0), and it appears to be an HDFS bug; at the time our network was not stable.
>  Below is the stack:
> *
> Found one Java-level deadlock:
> =
> "MemStoreFlusher.1":
>   waiting to lock monitor 0x7ff27cfa5218 (object 0x0002fae5ebe0, a 
> org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
>   waiting to lock monitor 0x7ff2e67e16a8 (object 0x000486ce6620, a 
> org.apache.hadoop.hdfs.DFSOutputStream),
>   which is held by "MemStoreFlusher.0"
> "MemStoreFlusher.0":
>   waiting to lock monitor 0x7ff27cfa5218 (object 0x0002fae5ebe0, a 
> org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> Java stack information for the threads listed above:
> ===
> "MemStoreFlusher.1":
>   at org.apache.hadoop.hdfs.LeaseRenewer.addClient(LeaseRenewer.java:216)
>   - waiting to lock <0x0002fae5ebe0> (a 
> org.apache.hadoop.hdfs.LeaseRenewer)
>   at org.apache.hadoop.hdfs.LeaseRenewer.getInstance(LeaseRenewer.java:81)
>   at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:648)
>   at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:659)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1882)
>   - locked <0x00055b606cb0> (a org.apache.hadoop.hdfs.DFSOutputStream)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:71)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:104)
>   at 
> org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.finishClose(AbstractHFileWriter.java:250)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:402)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:78)
>   at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
>   - locked <0x00059869eed8> (a java.lang.Object)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:812)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1974)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1795)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1678)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1591)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:472)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:211)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$500(MemStoreFlusher.java:66)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:238)
>   at java.lang.Thread.run(Thread.java:744)
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.abort(DFSOutputStream.java:1822)
>   - waiting to lock <0x000486ce6620> (a 
> org.apache.hadoop.hdfs.DFSOutputStream)
>   at 
> org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:780)
>   at org.apache.hadoop.hdfs.DFSClient.abort(DFSClient.java:753)
>   at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:453)
>   - locked <0x0002fae5ebe0> (a org.apache.hadoop.hdfs.LeaseRenewer)
>   at org.apache.ha

[jira] [Updated] (HDFS-9445) Datanode may deadlock while handling a bad volume

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9445:
-
Target Version/s: 2.7.2, 2.6.4  (was: 2.7.2)

> Datanode may deadlock while handling a bad volume
> -
>
> Key: HDFS-9445
> URL: https://issues.apache.org/jira/browse/HDFS-9445
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9445.00.patch, HDFS-9445.01.patch, 
> HDFS-9445.02.patch
>
>
> {noformat}
> Found one Java-level deadlock:
> =
> "DataXceiver for client DFSClient_attempt_xxx at /1.2.3.4:100 [Sending block 
> BP-x:blk_123_456]":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> "Thread-565":
>   waiting for ownable synchronizer 0xd55613c8, (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "DataNode: heartbeating to my-nn:8020"
> "DataNode: heartbeating to my-nn:8020":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9445) Datanode may deadlock while handling a bad volume

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080586#comment-15080586
 ] 

Junping Du commented on HDFS-9445:
--

Hi [~walter.k.su], [~vinayrpet] and [~kihwal], does the same issue exist in 
branch-2.6 as well? If so, we may consider cherry-picking the fix to branch-2.6.

> Datanode may deadlock while handling a bad volume
> -
>
> Key: HDFS-9445
> URL: https://issues.apache.org/jira/browse/HDFS-9445
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9445.00.patch, HDFS-9445.01.patch, 
> HDFS-9445.02.patch
>
>
> {noformat}
> Found one Java-level deadlock:
> =
> "DataXceiver for client DFSClient_attempt_xxx at /1.2.3.4:100 [Sending block 
> BP-x:blk_123_456]":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> "Thread-565":
>   waiting for ownable synchronizer 0xd55613c8, (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "DataNode: heartbeating to my-nn:8020"
> "DataNode: heartbeating to my-nn:8020":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9294) DFSClient deadlock when close file and failed to renew lease

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9294:
-
Target Version/s: 2.7.2, 2.6.4  (was: 2.7.2)

> DFSClient  deadlock when close file and failed to renew lease
> -
>
> Key: HDFS-9294
> URL: https://issues.apache.org/jira/browse/HDFS-9294
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.2.0, 2.7.1
> Environment: Hadoop 2.2.0
>Reporter: 邓飞
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Fix For: 2.7.2
>
> Attachments: HDFS-9294-002.patch, HDFS-9294-002.patch, 
> HDFS-9294-branch-2.7.patch, HDFS-9294-branch-2.patch, HDFS-9294.patch
>
>
> We found a deadlock on our HBase (0.98) cluster (the Hadoop version is 
> 2.2.0), and it appears to be an HDFS bug; at the time our network was not stable.
>  Below is the stack:
> *
> Found one Java-level deadlock:
> =
> "MemStoreFlusher.1":
>   waiting to lock monitor 0x7ff27cfa5218 (object 0x0002fae5ebe0, a 
> org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
>   waiting to lock monitor 0x7ff2e67e16a8 (object 0x000486ce6620, a 
> org.apache.hadoop.hdfs.DFSOutputStream),
>   which is held by "MemStoreFlusher.0"
> "MemStoreFlusher.0":
>   waiting to lock monitor 0x7ff27cfa5218 (object 0x0002fae5ebe0, a 
> org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> Java stack information for the threads listed above:
> ===
> "MemStoreFlusher.1":
>   at org.apache.hadoop.hdfs.LeaseRenewer.addClient(LeaseRenewer.java:216)
>   - waiting to lock <0x0002fae5ebe0> (a 
> org.apache.hadoop.hdfs.LeaseRenewer)
>   at org.apache.hadoop.hdfs.LeaseRenewer.getInstance(LeaseRenewer.java:81)
>   at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:648)
>   at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:659)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1882)
>   - locked <0x00055b606cb0> (a org.apache.hadoop.hdfs.DFSOutputStream)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:71)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:104)
>   at 
> org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.finishClose(AbstractHFileWriter.java:250)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:402)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:78)
>   at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
>   - locked <0x00059869eed8> (a java.lang.Object)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:812)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1974)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1795)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1678)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1591)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:472)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:211)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$500(MemStoreFlusher.java:66)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:238)
>   at java.lang.Thread.run(Thread.java:744)
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.abort(DFSOutputStream.java:1822)
>   - waiting to lock <0x000486ce6620> (a 
> org.apache.hadoop.hdfs.DFSOutputStream)
>   at 
> org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:780)
>   at org.apache.hadoop.hdfs.DFSClient.abort(DFSClient.java:753)
>   at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:453)
>   - locked <0x0002fae5ebe0> (a org.apache.hadoop.hdfs.LeaseRenewer)
>   at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
>   at org.apache.ha

[jira] [Commented] (HDFS-9294) DFSClient deadlock when close file and failed to renew lease

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080582#comment-15080582
 ] 

Junping Du commented on HDFS-9294:
--

Hi [~szetszwo] and [~brahmareddy], shall we cherry-pick this fix to branch-2.6 
as well?

> DFSClient  deadlock when close file and failed to renew lease
> -
>
> Key: HDFS-9294
> URL: https://issues.apache.org/jira/browse/HDFS-9294
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.2.0, 2.7.1
> Environment: Hadoop 2.2.0
>Reporter: 邓飞
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Fix For: 2.7.2
>
> Attachments: HDFS-9294-002.patch, HDFS-9294-002.patch, 
> HDFS-9294-branch-2.7.patch, HDFS-9294-branch-2.patch, HDFS-9294.patch
>
>
> We found a deadlock on our HBase (0.98) cluster (the Hadoop version is 
> 2.2.0), and it appears to be an HDFS bug; at the time our network was not stable.
>  Below is the stack:
> *
> Found one Java-level deadlock:
> =
> "MemStoreFlusher.1":
>   waiting to lock monitor 0x7ff27cfa5218 (object 0x0002fae5ebe0, a 
> org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
>   waiting to lock monitor 0x7ff2e67e16a8 (object 0x000486ce6620, a 
> org.apache.hadoop.hdfs.DFSOutputStream),
>   which is held by "MemStoreFlusher.0"
> "MemStoreFlusher.0":
>   waiting to lock monitor 0x7ff27cfa5218 (object 0x0002fae5ebe0, a 
> org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> Java stack information for the threads listed above:
> ===
> "MemStoreFlusher.1":
>   at org.apache.hadoop.hdfs.LeaseRenewer.addClient(LeaseRenewer.java:216)
>   - waiting to lock <0x0002fae5ebe0> (a 
> org.apache.hadoop.hdfs.LeaseRenewer)
>   at org.apache.hadoop.hdfs.LeaseRenewer.getInstance(LeaseRenewer.java:81)
>   at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:648)
>   at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:659)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1882)
>   - locked <0x00055b606cb0> (a org.apache.hadoop.hdfs.DFSOutputStream)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:71)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:104)
>   at 
> org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.finishClose(AbstractHFileWriter.java:250)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:402)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:78)
>   at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
>   - locked <0x00059869eed8> (a java.lang.Object)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:812)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1974)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1795)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1678)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1591)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:472)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:211)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$500(MemStoreFlusher.java:66)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:238)
>   at java.lang.Thread.run(Thread.java:744)
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.abort(DFSOutputStream.java:1822)
>   - waiting to lock <0x000486ce6620> (a 
> org.apache.hadoop.hdfs.DFSOutputStream)
>   at 
> org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:780)
>   at org.apache.hadoop.hdfs.DFSClient.abort(DFSClient.java:753)
>   at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:453)
>   - locked <0x0002fae5ebe0> (a org.apache.hadoop.hdfs.LeaseRenewer)
> 

[jira] [Updated] (HDFS-8891) HDFS concat should keep srcs order

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8891:
-
Target Version/s: 2.6.4

> HDFS concat should keep srcs order
> --
>
> Key: HDFS-8891
> URL: https://issues.apache.org/jira/browse/HDFS-8891
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yong Zhang
>Assignee: Yong Zhang
>Priority: Blocker
> Fix For: 2.7.2
>
> Attachments: HDFS-8891.001.patch, HDFS-8891.002.patch
>
>
> FSDirConcatOp.verifySrcFiles may change the order of the src files, but it should 
> keep their order as given in the input.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8891) HDFS concat should keep srcs order

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080579#comment-15080579
 ] 

Junping Du commented on HDFS-8891:
--

Hi [~chris.douglas] and [~jingzhao], shall we cherry-pick this fix to 2.6.4 as 
well? Thanks!

> HDFS concat should keep srcs order
> --
>
> Key: HDFS-8891
> URL: https://issues.apache.org/jira/browse/HDFS-8891
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yong Zhang
>Assignee: Yong Zhang
>Priority: Blocker
> Fix For: 2.7.2
>
> Attachments: HDFS-8891.001.patch, HDFS-8891.002.patch
>
>
> FSDirConcatOp.verifySrcFiles may change the order of the src files, but it should 
> keep their order as given in the input.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9607) Advance Hadoop Architecture (AHA) - HDFS

2016-01-03 Thread Dinesh S. Atreya (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080575#comment-15080575
 ] 

Dinesh S. Atreya commented on HDFS-9607:


Linking to the parent/umbrella JIRA.

> Advance Hadoop Architecture (AHA) - HDFS
> 
>
> Key: HDFS-9607
> URL: https://issues.apache.org/jira/browse/HDFS-9607
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Dinesh S. Atreya
>
> Link to Umbrella JIRA
> https://issues.apache.org/jira/browse/HADOOP-12620 
> Provide capability to carry out in-place writes/updates. Only writes in-place 
> are supported where the existing length does not change.
> For example, "Hello World" can be replaced by "Hello HDFS!"
> See 
> https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
>  for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9607) Advance Hadoop Architecture (AHA) - HDFS

2016-01-03 Thread Dinesh S. Atreya (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080571#comment-15080571
 ] 

Dinesh S. Atreya commented on HDFS-9607:


Please indicate which of the following API signatures is preferable to the HDFS 
team. Note that a POSIX-like “write” is not supported; only limited 
write-in-place is supported. A detailed design document can be prepared after 
the APIs are agreed upon.
First alternative is given hereunder: 

{code:title= FSWriteInPlaceStream.java|borderStyle=solid}

long    getPos()
// Get the current position in the input stream.

void    seek(long desiredWritePos)
// Seek to the given offset.

int writeInPlace(long position, int readLength, byte[] writeBuffer)
// Write/Update bytes from writeBuffer up to previously read length 
// at given position in file

int writeInPlace(int readLength, byte[] writeBuffer)
// Write/Update bytes from writeBuffer up to previously read length 
// after seek in file

boolean  checkWriteInPlace(long position, int readLength, byte[] writeBuffer)
// Check whether Write/Update of bytes from writeBuffer up to 
// previously read length at given position is possible inside file

boolean  checkWriteInPlace(int readLength, byte[] writeBuffer)
// Check whether Write/Update of bytes from writeBuffer up to 
// previously read length after seek is possible inside file

{code}
 
Second alternative is given below:

{code:title= FSWriteInPlaceStream.java|borderStyle=solid}


long    getPos()
// Get the current position in the input stream.

void    seek(long desiredWritePos)
// Seek to the given offset.

int updtInPlace(long position, int readLength, byte[] writeBuffer)
// Write/Update bytes from writeBuffer up to previously read length 
// at given position in file

int updtInPlace(int readLength, byte[] writeBuffer)
// Write/Update bytes from writeBuffer up to previously read length 
// after seek in file

boolean  checkUpdtInPlace(long position, int readLength, byte[] writeBuffer)
// Check whether Write/Update of bytes from writeBuffer up to 
// previously read length at given position is possible inside file

boolean  checkUpdtInPlace(int readLength, byte[] writeBuffer)
// Check whether Write/Update of bytes from writeBuffer up to 
// previously read length after seek is possible inside file

{code}

Please indicate the preferred class name as well. One alternative is 
FSWriteInPlaceStream (which extends FSDataOutputStream), a name that explicitly 
indicates that only in-place writes/updates are supported.
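
For illustration, a hypothetical usage sketch of the first alternative ({{openWriteInPlaceStream}} is an assumed factory method; FSWriteInPlaceStream and its methods are the API proposed above, not an existing class):

{code:java}
byte[] writeBuffer = "Hello HDFS!".getBytes(java.nio.charset.StandardCharsets.UTF_8);
int readLength = writeBuffer.length;       // must equal the length of the bytes being replaced
long desiredWritePos = 0L;                 // "Hello World" starts at offset 0

FSWriteInPlaceStream out = openWriteInPlaceStream("/tmp/example.txt");  // hypothetical factory
if (out.checkWriteInPlace(desiredWritePos, readLength, writeBuffer)) {
  out.writeInPlace(desiredWritePos, readLength, writeBuffer);           // "Hello World" -> "Hello HDFS!"
}
out.close();
{code}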


> Advance Hadoop Architecture (AHA) - HDFS
> 
>
> Key: HDFS-9607
> URL: https://issues.apache.org/jira/browse/HDFS-9607
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Dinesh S. Atreya
>
> Link to Umbrella JIRA
> https://issues.apache.org/jira/browse/HADOOP-12620 
> Provide capability to carry out in-place writes/updates. Only writes in-place 
> are supported where the existing length does not change.
> For example, "Hello World" can be replaced by "Hello HDFS!"
> See 
> https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
>  for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9220) Reading small file (< 512 bytes) that is open for append fails due to incorrect checksum

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9220:
-
Target Version/s: 2.7.2, 2.6.4  (was: 2.7.2)

> Reading small file (< 512 bytes) that is open for append fails due to 
> incorrect checksum
> 
>
> Key: HDFS-9220
> URL: https://issues.apache.org/jira/browse/HDFS-9220
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Jing Zhao
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9220.000.patch, HDFS-9220.001.patch, 
> HDFS-9220.002.patch, test2.java
>
>
> Exception:
> 2015-10-09 14:59:40 WARN  DFSClient:1150 - fetchBlockByteRange(). Got a 
> checksum exception for /tmp/file0.05355529331575182 at 
> BP-353681639-10.10.10.10-1437493596883:blk_1075692769_9244882:0 from 
> DatanodeInfoWithStorage[10.10.10.10]:5001
> All 3 replicas cause this exception and the read fails entirely with:
> BlockMissingException: Could not obtain block: 
> BP-353681639-10.10.10.10-1437493596883:blk_1075692769_9244882 
> file=/tmp/file0.05355529331575182
> Code to reproduce is attached.
> Does not happen in 2.7.0.
> Data is read correctly if checksum verification is disabled.
> More generally, the failure happens when reading from the last block of a 
> file and the last block has <= 512 bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9220) Reading small file (< 512 bytes) that is open for append fails due to incorrect checksum

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080570#comment-15080570
 ] 

Junping Du commented on HDFS-9220:
--

Hi [~jingzhao] and [~kihwal], shall we cherry-pick the fix to branch-2.6 as 
well?

> Reading small file (< 512 bytes) that is open for append fails due to 
> incorrect checksum
> 
>
> Key: HDFS-9220
> URL: https://issues.apache.org/jira/browse/HDFS-9220
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Jing Zhao
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9220.000.patch, HDFS-9220.001.patch, 
> HDFS-9220.002.patch, test2.java
>
>
> Exception:
> 2015-10-09 14:59:40 WARN  DFSClient:1150 - fetchBlockByteRange(). Got a 
> checksum exception for /tmp/file0.05355529331575182 at 
> BP-353681639-10.10.10.10-1437493596883:blk_1075692769_9244882:0 from 
> DatanodeInfoWithStorage[10.10.10.10]:5001
> All 3 replicas cause this exception and the read fails entirely with:
> BlockMissingException: Could not obtain block: 
> BP-353681639-10.10.10.10-1437493596883:blk_1075692769_9244882 
> file=/tmp/file0.05355529331575182
> Code to reproduce is attached.
> Does not happen in 2.7.0.
> Data is read correctly if checksum verification is disabled.
> More generally, the failure happens when reading from the last block of a 
> file and the last block has <= 512 bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6945) BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080567#comment-15080567
 ] 

Junping Du commented on HDFS-6945:
--

I have cherry-picked the patch to branch-2.6.

> BlockManager should remove a block from excessReplicateMap and decrement 
> ExcessBlocks metric when the block is removed
> --
>
> Key: HDFS-6945
> URL: https://issues.apache.org/jira/browse/HDFS-6945
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Critical
>  Labels: metrics
> Fix For: 2.8.0, 2.7.2, 2.6.4
>
> Attachments: HDFS-6945-003.patch, HDFS-6945-004.patch, 
> HDFS-6945-005.patch, HDFS-6945.2.patch, HDFS-6945.patch
>
>
> I'm seeing the ExcessBlocks metric increase to more than 300K in some clusters, 
> even though there are no over-replicated blocks (confirmed by fsck).
> After further research, I noticed that when deleting a block, BlockManager does 
> not remove the block from excessReplicateMap or decrement excessBlocksCount.
> Usually the metric is decremented when processing a block report; however, if 
> the block has already been deleted, BlockManager does not remove the block from 
> excessReplicateMap or decrement the metric.
> As a result, the metric and excessReplicateMap can grow without bound (i.e., a 
> memory leak can occur).
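
A simplified sketch of the bookkeeping the fix needs (the class, field, and method names below are illustrative, not the exact BlockManager internals):
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: whenever a block is removed from the blocks map,
// also purge it from the excess-replica map and keep the metric in step.
class ExcessReplicaTracker {
  // datanode UUID -> block IDs considered excess on that node
  private final Map<String, Set<Long>> excessReplicateMap = new HashMap<>();
  private long excessBlocksCount = 0;

  synchronized void removeBlock(long blockId) {
    for (Set<Long> excess : excessReplicateMap.values()) {
      if (excess.remove(blockId)) {
        excessBlocksCount--;   // without this, the ExcessBlocks metric leaks
      }
    }
  }

  synchronized long getExcessBlocksCount() {
    return excessBlocksCount;
  }
}
{code}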



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6945) BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-6945:
-
Target Version/s: 2.8.0, 2.6.4  (was: 2.8.0)

> BlockManager should remove a block from excessReplicateMap and decrement 
> ExcessBlocks metric when the block is removed
> --
>
> Key: HDFS-6945
> URL: https://issues.apache.org/jira/browse/HDFS-6945
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Critical
>  Labels: metrics
> Fix For: 2.8.0, 2.7.2, 2.6.4
>
> Attachments: HDFS-6945-003.patch, HDFS-6945-004.patch, 
> HDFS-6945-005.patch, HDFS-6945.2.patch, HDFS-6945.patch
>
>
> I'm seeing the ExcessBlocks metric increase to more than 300K in some clusters, 
> even though there are no over-replicated blocks (confirmed by fsck).
> After further research, I noticed that when deleting a block, BlockManager does 
> not remove the block from excessReplicateMap or decrement excessBlocksCount.
> Usually the metric is decremented when processing a block report; however, if 
> the block has already been deleted, BlockManager does not remove the block from 
> excessReplicateMap or decrement the metric.
> As a result, the metric and excessReplicateMap can grow without bound (i.e., a 
> memory leak can occur).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6945) BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-6945:
-
Fix Version/s: 2.6.4

> BlockManager should remove a block from excessReplicateMap and decrement 
> ExcessBlocks metric when the block is removed
> --
>
> Key: HDFS-6945
> URL: https://issues.apache.org/jira/browse/HDFS-6945
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Critical
>  Labels: metrics
> Fix For: 2.8.0, 2.7.2, 2.6.4
>
> Attachments: HDFS-6945-003.patch, HDFS-6945-004.patch, 
> HDFS-6945-005.patch, HDFS-6945.2.patch, HDFS-6945.patch
>
>
> I'm seeing the ExcessBlocks metric increase to more than 300K in some clusters, 
> even though there are no over-replicated blocks (confirmed by fsck).
> After further research, I noticed that when deleting a block, BlockManager does 
> not remove the block from excessReplicateMap or decrement excessBlocksCount.
> Usually the metric is decremented when processing a block report; however, if 
> the block has already been deleted, BlockManager does not remove the block from 
> excessReplicateMap or decrement the metric.
> As a result, the metric and excessReplicateMap can grow without bound (i.e., a 
> memory leak can occur).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080559#comment-15080559
 ] 

Junping Du commented on HDFS-7645:
--

Hi [~kihwal], do you suggest we should backport these three fixes (this JIRA, 
HDFS-8656 and HDFS-9426) to branch-2.6?

> Rolling upgrade is restoring blocks from trash multiple times
> -
>
> Key: HDFS-7645
> URL: https://issues.apache.org/jira/browse/HDFS-7645
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Nathan Roberts
>Assignee: Keisuke Ogiwara
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, 
> HDFS-7645.03.patch, HDFS-7645.04.patch, HDFS-7645.05.patch, 
> HDFS-7645.06.patch, HDFS-7645.07.patch
>
>
> When performing an HDFS rolling upgrade, the trash directory is getting 
> restored twice, when under normal circumstances it shouldn't need to be 
> restored at all. IIUC, the only time these blocks should be restored is if we 
> need to roll back a rolling upgrade. 
> On a busy cluster, this can cause significant and unnecessary block churn, 
> both on the datanodes and, more importantly, in the namenode.
> The two times this happens are:
> 1) restart of DN onto new software
> {code}
>   private void doTransition(DataNode datanode, StorageDirectory sd,
>   NamespaceInfo nsInfo, StartupOption startOpt) throws IOException {
> if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) {
>   Preconditions.checkState(!getTrashRootDir(sd).exists(),
>   sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not 
> " +
>   " both be present.");
>   doRollback(sd, nsInfo); // rollback if applicable
> } else {
>   // Restore all the files in the trash. The restored files are retained
>   // during rolling upgrade rollback. They are deleted during rolling
>   // upgrade downgrade.
>   int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
>   LOG.info("Restored " + restored + " block files from trash.");
> }
> {code}
> 2) When heartbeat response no longer indicates a rollingupgrade is in progress
> {code}
>   /**
>* Signal the current rolling upgrade status as indicated by the NN.
>* @param inProgress true if a rolling upgrade is in progress
>*/
>   void signalRollingUpgrade(boolean inProgress) throws IOException {
> String bpid = getBlockPoolId();
> if (inProgress) {
>   dn.getFSDataset().enableTrash(bpid);
>   dn.getFSDataset().setRollingUpgradeMarker(bpid);
> } else {
>   dn.getFSDataset().restoreTrash(bpid);
>   dn.getFSDataset().clearRollingUpgradeMarker(bpid);
> }
>   }
> {code}
> HDFS-6800 and HDFS-6981 modified this behavior, making it not completely 
> clear whether this is somehow intentional. 
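
One possible direction, sketched below with illustrative names (not necessarily the committed fix), is to restore trash only when a rollback has actually been requested and to skip the restore in the two paths quoted above:
{code}
import java.io.File;
import java.io.IOException;

// Illustrative sketch: restore block files from trash only on an explicit
// rollback request, so normal restarts and "upgrade finished" heartbeats do
// not shuffle blocks around. Method names are hypothetical.
class TrashRestorer {
  int restoreTrashIfRollingBack(File trashRoot, boolean rollbackRequested)
      throws IOException {
    if (!rollbackRequested || !trashRoot.exists()) {
      return 0;   // nothing to do in the common case
    }
    return moveBack(trashRoot, trashRoot.getParentFile());
  }

  private int moveBack(File trashDir, File targetDir) throws IOException {
    int restored = 0;
    File[] files = trashDir.listFiles();
    if (files == null) {
      return 0;
    }
    for (File f : files) {
      File target = new File(targetDir, f.getName());
      if (f.isDirectory()) {
        target.mkdirs();
        restored += moveBack(f, target);   // recurse into subdirectories
      } else if (f.renameTo(target)) {
        restored++;                        // block file moved back into place
      }
    }
    return restored;
  }
}
{code}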



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8676) Delayed rolling upgrade finalization can cause heartbeat expiration and write failures

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080557#comment-15080557
 ] 

Junping Du commented on HDFS-8676:
--

Hi [~walter.k.su] and [~kihwal], I tried to cherry-pick this patch to 
branch-2.6 but it seems to have conflicts. Would you help to commit the patch 
to branch-2.6 or provide a patch against branch-2.6? Thanks!

> Delayed rolling upgrade finalization can cause heartbeat expiration and write 
> failures
> --
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Critical
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-8676.01.patch, HDFS-8676.02.patch
>
>
> In big busy clusters where the deletion rate is also high, a lot of blocks 
> can pile up in the datanode trash directories until an upgrade is finalized.  
> When it is finally finalized, the deletion of trash is done in the service 
> actor thread's context synchronously.  This blocks the heartbeat and can 
> cause heartbeat expiration.  
> We have seen a namenode losing hundreds of nodes after a delayed upgrade 
> finalization.  The deletion of trash directories should be made asynchronous.
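
A rough sketch of the asynchronous approach (illustrative only; the real change belongs in the dataset implementation rather than a standalone class):
{code}
import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch: hand trash deletion to a daemon thread so the service
// actor thread can return to heartbeating immediately.
class AsyncTrashCleaner {
  private final ExecutorService cleaner = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "trash-cleaner");
    t.setDaemon(true);
    return t;
  });

  // Called from the finalize path; does not wait for the deletion to finish.
  void clearTrashAsync(File trashRoot) {
    cleaner.submit(() -> deleteRecursively(trashRoot));
  }

  private void deleteRecursively(File dir) {
    File[] children = dir.listFiles();
    if (children != null) {
      for (File child : children) {
        deleteRecursively(child);
      }
    }
    dir.delete();
  }
}
{code}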



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8676) Delayed rolling upgrade finalization can cause heartbeat expiration and write failures

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8676:
-
Target Version/s: 2.7.2, 2.6.4  (was: 2.7.2)

> Delayed rolling upgrade finalization can cause heartbeat expiration and write 
> failures
> --
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Critical
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-8676.01.patch, HDFS-8676.02.patch
>
>
> In big busy clusters where the deletion rate is also high, a lot of blocks 
> can pile up in the datanode trash directories until an upgrade is finalized.  
> When it is finally finalized, the deletion of trash is done in the service 
> actor thread's context synchronously.  This blocks the heartbeat and can 
> cause heartbeat expiration.  
> We have seen a namenode losing hundreds of nodes after a delayed upgrade 
> finalization.  The deletion of trash directories should be made asynchronous.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8767:
-
Target Version/s: 2.7.2, 2.6.4

> RawLocalFileSystem.listStatus() returns null for UNIX pipefile
> --
>
> Key: HDFS-8767
> URL: https://issues.apache.org/jira/browse/HDFS-8767
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Kanaka Kumar Avvaru
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch, 
> HDFS-8767-02.patch, HDFS-8767.003.patch, HDFS-8767.004.patch
>
>
> Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of 
> the file. The bug breaks Hive when Hive loads data from a UNIX pipe file.
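
A small sketch of the failing call (the FIFO path is hypothetical and assumed to have been created beforehand with mkfifo):
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RawLocalFileSystem;

// Illustrative sketch: list a UNIX pipe (FIFO) through RawLocalFileSystem.
// Assumes "mkfifo /tmp/test.pipe" has already been run; the path is hypothetical.
public class PipeListStatus {
  public static void main(String[] args) throws Exception {
    RawLocalFileSystem raw = new RawLocalFileSystem();
    raw.initialize(URI.create("file:///"), new Configuration());

    FileStatus[] statuses = raw.listStatus(new Path("/tmp/test.pipe"));
    // Expected: a one-element array describing the pipe. With the bug,
    // statuses is null and callers such as Hive fail with an NPE.
    System.out.println(statuses == null ? "null" : statuses.length + " entries");
  }
}
{code}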



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8767) RawLocalFileSystem.listStatus() returns null for UNIX pipefile

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080554#comment-15080554
 ] 

Junping Du commented on HDFS-8767:
--

Hi [~wheat9] and [~cnauroth], should this fix also land on branch-2.6? 
Thanks!

> RawLocalFileSystem.listStatus() returns null for UNIX pipefile
> --
>
> Key: HDFS-8767
> URL: https://issues.apache.org/jira/browse/HDFS-8767
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Kanaka Kumar Avvaru
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8767-00.patch, HDFS-8767-01.patch, 
> HDFS-8767-02.patch, HDFS-8767.003.patch, HDFS-8767.004.patch
>
>
> Calling FileSystem.listStatus() on a UNIX pipe file returns null instead of 
> the file. The bug breaks Hive when Hive loads data from a UNIX pipe file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8995) Flaw in registration bookeeping can make DN die on reconnect

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8995:
-
Target Version/s: 2.7.2, 2.6.4  (was: 2.7.2)

> Flaw in registration bookeeping can make DN die on reconnect
> 
>
> Key: HDFS-8995
> URL: https://issues.apache.org/jira/browse/HDFS-8995
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8995.patch
>
>
> Normally, datanodes re-register with the namenode when it has been unreachable 
> for longer than the heartbeat expiration and then becomes reachable again. 
> Datanodes keep retrying the last RPC call, such as an incremental block report 
> or heartbeat, and when it finally gets through, the namenode tells them to 
> re-register.
> We have observed that some datanodes stay dead in such scenarios. Further 
> investigation revealed that those nodes were told to shut down by the namenode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8995) Flaw in registration bookeeping can make DN die on reconnect

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080553#comment-15080553
 ] 

Junping Du commented on HDFS-8995:
--

Hi [~hitliuyi] and [~kihwal], per Sangjin's earlier comments, shall we backport 
the fix to branch-2.6?

> Flaw in registration bookeeping can make DN die on reconnect
> 
>
> Key: HDFS-8995
> URL: https://issues.apache.org/jira/browse/HDFS-8995
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8995.patch
>
>
> Normally, datanodes re-register with the namenode when it has been unreachable 
> for longer than the heartbeat expiration and then becomes reachable again. 
> Datanodes keep retrying the last RPC call, such as an incremental block report 
> or heartbeat, and when it finally gets through, the namenode tells them to 
> re-register.
> We have observed that some datanodes stay dead in such scenarios. Further 
> investigation revealed that those nodes were told to shut down by the namenode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9106) Transfer failure during pipeline recovery causes permanent write failures

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9106:
-
Target Version/s: 2.8.0, 2.6.4  (was: 2.8.0)

> Transfer failure during pipeline recovery causes permanent write failures
> -
>
> Key: HDFS-9106
> URL: https://issues.apache.org/jira/browse/HDFS-9106
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-9106-poc.patch, HDFS-9106.branch-2.7.patch, 
> HDFS-9106.patch
>
>
> When a new node is added to a write pipeline during flush/sync, if the 
> partial block transfer fails, the write will fail permanently without 
> retrying or continuing with whatever is in the pipeline. 
> The transfer often fails in busy clusters due to timeout. There is no 
> per-packet ACK between client and datanode or between source and target 
> datanodes. If the total transfer time exceeds the configured timeout + 10 
> seconds (2 * 5 seconds slack), it is considered failed.  Naturally, the 
> failure rate is higher with bigger block sizes.
> I propose the following changes:
> - The transfer timeout needs to be different from the per-packet timeout.
> - The transfer should be retried if it fails.
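
A schematic sketch of the two proposed changes, with illustrative names rather than the actual DataStreamer fields:
{code}
import java.io.IOException;

// Illustrative sketch: give the partial-block transfer its own (larger)
// timeout, separate from the per-packet timeout, and retry a bounded number
// of times instead of failing the write permanently.
class TransferRetryPolicy {
  private final int packetTimeoutMs;    // unchanged per-packet socket timeout
  private final int transferTimeoutMs;  // new, sized for copying a partial block
  private final int maxAttempts;

  TransferRetryPolicy(int packetTimeoutMs, int transferTimeoutMs, int maxAttempts) {
    this.packetTimeoutMs = packetTimeoutMs;
    this.transferTimeoutMs = transferTimeoutMs;
    this.maxAttempts = maxAttempts;
  }

  interface Transfer {
    void run(int timeoutMs) throws IOException;
  }

  void transferWithRetry(Transfer transfer) throws IOException {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        transfer.run(transferTimeoutMs);   // use the dedicated transfer timeout
        return;
      } catch (IOException e) {
        last = e;   // e.g. a timeout from a busy source or target datanode
      }
    }
    throw last;     // only give up after maxAttempts failed attempts
  }
}
{code}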



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9106) Transfer failure during pipeline recovery causes permanent write failures

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080549#comment-15080549
 ] 

Junping Du commented on HDFS-9106:
--

Hi [~hitliuyi], [~jingzhao] and [~kihwal], do we think this bug should be fixed 
in branch-2.6 also?

> Transfer failure during pipeline recovery causes permanent write failures
> -
>
> Key: HDFS-9106
> URL: https://issues.apache.org/jira/browse/HDFS-9106
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-9106-poc.patch, HDFS-9106.branch-2.7.patch, 
> HDFS-9106.patch
>
>
> When a new node is added to a write pipeline during flush/sync, if the 
> partial block transfer fails, the write will fail permanently without 
> retrying or continuing with whatever is in the pipeline. 
> The transfer often fails in busy clusters due to timeout. There is no 
> per-packet ACK between client and datanode or between source and target 
> datanodes. If the total transfer time exceeds the configured timeout + 10 
> seconds (2 * 5 seconds slack), it is considered failed.  Naturally, the 
> failure rate is higher with bigger block sizes.
> I propose the following changes:
> - The transfer timeout needs to be different from the per-packet timeout.
> - The transfer should be retried if it fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-9178:
-
Target Version/s: 2.7.2, 2.6.4  (was: 2.7.2)

> Slow datanode I/O can cause a wrong node to be marked bad
> -
>
> Key: HDFS-9178
> URL: https://issues.apache.org/jira/browse/HDFS-9178
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9178.branch-2.6.patch, HDFS-9178.patch
>
>
> When a non-leaf datanode in a pipeline is slow on or stuck at disk I/O, the 
> downstream node can time out on reading a packet, since even the heartbeat 
> packets will not be relayed down.  
> The packet read timeout is set in {{DataXceiver#run()}}:
> {code}
>   peer.setReadTimeout(dnConf.socketTimeout);
> {code}
> When the downstream node times out and closes the connection to the upstream, 
> the upstream node's {{PacketResponder}} gets an {{EOFException}} and sends an 
> ack upstream with the downstream node's status set to {{ERROR}}.  This causes 
> the client to exclude the downstream node, even though the upstream node was 
> the one that got stuck.
> The connection to the downstream has a longer timeout, so the downstream will 
> always time out first. The downstream timeout is set in {{writeBlock()}}:
> {code}
>   int timeoutValue = dnConf.socketTimeout +
>   (HdfsConstants.READ_TIMEOUT_EXTENSION * targets.length);
>   int writeTimeout = dnConf.socketWriteTimeout +
>   (HdfsConstants.WRITE_TIMEOUT_EXTENSION * targets.length);
>   NetUtils.connect(mirrorSock, mirrorTarget, timeoutValue);
>   OutputStream unbufMirrorOut = NetUtils.getOutputStream(mirrorSock,
>   writeTimeout);
> {code}
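
The arithmetic behind "the downstream will always time out first" can be illustrated with hypothetical stand-in values for dnConf.socketTimeout and READ_TIMEOUT_EXTENSION (the real defaults may differ):
{code}
// Illustrative arithmetic only; both constants below are hypothetical stand-ins.
public class PipelineTimeoutExample {
  public static void main(String[] args) {
    int socketTimeout = 60_000;        // stand-in for dnConf.socketTimeout (ms)
    int readTimeoutExtension = 5_000;  // stand-in for READ_TIMEOUT_EXTENSION (ms)

    // The last (downstream) node has no further targets, so it reads from its
    // upstream with just the base timeout.
    int downstreamReadTimeout = socketTimeout;                                  // 60 s

    // The upstream node talking to one downstream target adds one extension.
    int upstreamToDownstreamTimeout = socketTimeout + 1 * readTimeoutExtension; // 65 s

    // Because 60 s < 65 s, the downstream always times out first when the
    // upstream is stuck on disk I/O; the upstream then reports the downstream
    // as ERROR even though the upstream was the slow node.
    System.out.println(downstreamReadTimeout < upstreamToDownstreamTimeout);    // true
  }
}
{code}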



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080547#comment-15080547
 ] 

Junping Du commented on HDFS-9178:
--

Hi [~kihwal], I saw you already attached a patch for the 2.6 branch. Shall we 
commit it to branch-2.6? Thanks!

> Slow datanode I/O can cause a wrong node to be marked bad
> -
>
> Key: HDFS-9178
> URL: https://issues.apache.org/jira/browse/HDFS-9178
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9178.branch-2.6.patch, HDFS-9178.patch
>
>
> When a non-leaf datanode in a pipeline is slow on or stuck at disk I/O, the 
> downstream node can time out on reading a packet, since even the heartbeat 
> packets will not be relayed down.  
> The packet read timeout is set in {{DataXceiver#run()}}:
> {code}
>   peer.setReadTimeout(dnConf.socketTimeout);
> {code}
> When the downstream node times out and closes the connection to the upstream, 
> the upstream node's {{PacketResponder}} gets an {{EOFException}} and sends an 
> ack upstream with the downstream node's status set to {{ERROR}}.  This causes 
> the client to exclude the downstream node, even though the upstream node was 
> the one that got stuck.
> The connection to the downstream has a longer timeout, so the downstream will 
> always time out first. The downstream timeout is set in {{writeBlock()}}:
> {code}
>   int timeoutValue = dnConf.socketTimeout +
>   (HdfsConstants.READ_TIMEOUT_EXTENSION * targets.length);
>   int writeTimeout = dnConf.socketWriteTimeout +
>   (HdfsConstants.WRITE_TIMEOUT_EXTENSION * targets.length);
>   NetUtils.connect(mirrorSock, mirrorTarget, timeoutValue);
>   OutputStream unbufMirrorOut = NetUtils.getOutputStream(mirrorSock,
>   writeTimeout);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8722) Optimize datanode writes for small writes and flushes

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080543#comment-15080543
 ] 

Junping Du commented on HDFS-8722:
--

Hi [~kihwal], shall we backport the fix to the 2.6 branch as well?

> Optimize datanode writes for small writes and flushes
> -
>
> Key: HDFS-8722
> URL: https://issues.apache.org/jira/browse/HDFS-8722
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8722.patch, HDFS-8722.v1.patch
>
>
> After the data corruption fix in HDFS-4660, the CRC recalculation for a partial 
> chunk is executed more frequently if the client repeatedly writes a few bytes 
> and calls hflush/hsync.  This is because the generic logic forces CRC 
> recalculation if the on-disk data is not CRC chunk aligned. Prior to HDFS-4660, 
> the datanode blindly accepted whatever CRC the client provided if the incoming 
> data was chunk-aligned. This was the source of the corruption.
> We can still optimize for the most common case, where a client repeatedly 
> writes a small number of bytes followed by hflush/hsync with no pipeline 
> recovery or append, by allowing the previous behavior for this specific case. 
> If the incoming data has a duplicate portion and that portion is at the last 
> chunk boundary before the partial chunk on disk, the datanode can use the 
> checksum supplied by the client without redoing the checksum on its own.  
> This reduces disk reads as well as CPU load for the checksum calculation.
> If the incoming packet data goes back further than the last on-disk chunk 
> boundary, the datanode will still do a recalculation, but this occurs rarely, 
> only during pipeline recoveries. Thus the optimization for this specific case 
> should be sufficient to speed up the vast majority of cases.
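
A sketch of the alignment check behind this optimization (class and field names are illustrative, not the exact BlockReceiver members):
{code}
// Illustrative sketch: decide whether the datanode can trust the client's
// checksum for the partial chunk instead of re-reading and re-computing it.
class PartialChunkChecksumCheck {
  static final int CHUNK_SIZE = 512;   // bytes covered by one CRC

  /**
   * @param onDiskLen    bytes of block data already on disk
   * @param packetOffset offset in the block where the incoming packet starts
   * @return true when the on-disk data ends in a partial chunk and the packet
   *         resends data starting exactly at the last chunk boundary before it
   *         (the common small-write + hflush pattern), so no recalculation is
   *         needed.
   */
  static boolean canReuseClientChecksum(long onDiskLen, long packetOffset) {
    long lastChunkBoundary = (onDiskLen / CHUNK_SIZE) * CHUNK_SIZE;
    boolean endsMidChunk = onDiskLen % CHUNK_SIZE != 0;
    return endsMidChunk && packetOffset == lastChunkBoundary;
  }
}
{code}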



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8722) Optimize datanode writes for small writes and flushes

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-8722:
-
Target Version/s: 2.7.2, 2.6.4  (was: 2.7.2)

> Optimize datanode writes for small writes and flushes
> -
>
> Key: HDFS-8722
> URL: https://issues.apache.org/jira/browse/HDFS-8722
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8722.patch, HDFS-8722.v1.patch
>
>
> After the data corruption fix in HDFS-4660, the CRC recalculation for a partial 
> chunk is executed more frequently if the client repeatedly writes a few bytes 
> and calls hflush/hsync.  This is because the generic logic forces CRC 
> recalculation if the on-disk data is not CRC chunk aligned. Prior to HDFS-4660, 
> the datanode blindly accepted whatever CRC the client provided if the incoming 
> data was chunk-aligned. This was the source of the corruption.
> We can still optimize for the most common case, where a client repeatedly 
> writes a small number of bytes followed by hflush/hsync with no pipeline 
> recovery or append, by allowing the previous behavior for this specific case. 
> If the incoming data has a duplicate portion and that portion is at the last 
> chunk boundary before the partial chunk on disk, the datanode can use the 
> checksum supplied by the client without redoing the checksum on its own.  
> This reduces disk reads as well as CPU load for the checksum calculation.
> If the incoming packet data goes back further than the last on-disk chunk 
> boundary, the datanode will still do a recalculation, but this occurs rarely, 
> only during pipeline recoveries. Thus the optimization for this specific case 
> should be sufficient to speed up the vast majority of cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-4660) Block corruption can happen during pipeline recovery

2016-01-03 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-4660:
-
Target Version/s: 2.7.1, 2.6.4  (was: 2.7.1)

> Block corruption can happen during pipeline recovery
> 
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Peng Zhang
>Assignee: Kihwal Lee
>Priority: Blocker
> Fix For: 2.7.1
>
> Attachments: HDFS-4660.patch, HDFS-4660.patch, HDFS-4660.v2.patch
>
>
> pipeline DN1  DN2  DN3
> stop DN2
> pipeline added node DN4 located at 2nd position
> DN1  DN4  DN3
> recover RBW
> DN4 after recover rbw
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134144
>   getBytesOnDisk() = 134144
>   getVisibleLength()= 134144
> end at chunk (134144/512=262)
> DN3 after recover rbw
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134028 
>   getBytesOnDisk() = 134028
>   getVisibleLength()= 134028
> the client sends a packet after pipeline recovery
> offset=133632  len=1008
> DN4 after flush 
> 2013-04-01 21:02:31,779 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1063
> // meta end position should be ceil(134640/512)*4 + 7 == 1059, but now it is 
> 1063.
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, 
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, 
> lastPacketInBlock=false, offsetInBlock=134640, 
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing 
> meta file offset of block 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1059
> After checking the meta file on DN4, I found that the checksum of chunk 262 is 
> duplicated, but the data is not.
> Later, after the block was finalized, DN4's scanner detected the bad block and 
> reported it to the NN. The NN sent a command to delete this block and to 
> re-replicate it from another DN in the pipeline to satisfy the replication 
> factor.
> I think this is because BlockReceiver skips data bytes that are already written, 
> but does not skip checksum bytes that are already written. Also, the function 
> adjustCrcFilePosition is only used for the last non-completed chunk, but
> not for this situation.
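
The meta-file arithmetic used in the report (7 header bytes plus 4 CRC bytes per full or partial 512-byte chunk) can be checked with a few lines:
{code}
// Worked example of the meta offset expected in the DN4 log above.
public class MetaOffsetExample {
  public static void main(String[] args) {
    long blockOffset = 134640;   // bytes of block data written
    int bytesPerChunk = 512;
    int checksumSize = 4;        // CRC32 per chunk
    int headerSize = 7;          // meta file header

    long chunks = (blockOffset + bytesPerChunk - 1) / bytesPerChunk; // ceil = 263
    long metaOffset = headerSize + chunks * checksumSize;            // 7 + 263*4 = 1059

    // The DN4 log shows 1063 instead, i.e. one extra 4-byte checksum: the
    // checksum of chunk 262 was written twice while the data was not.
    System.out.println(metaOffset);   // prints 1059
  }
}
{code}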



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4660) Block corruption can happen during pipeline recovery

2016-01-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080542#comment-15080542
 ] 

Junping Du commented on HDFS-4660:
--

Hi [~kihwal], shall we backport this patch to 2.6.x branch?

> Block corruption can happen during pipeline recovery
> 
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Peng Zhang
>Assignee: Kihwal Lee
>Priority: Blocker
> Fix For: 2.7.1
>
> Attachments: HDFS-4660.patch, HDFS-4660.patch, HDFS-4660.v2.patch
>
>
> pipeline DN1  DN2  DN3
> stop DN2
> pipeline added node DN4 located at 2nd position
> DN1  DN4  DN3
> recover RBW
> DN4 after recover rbw
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134144
>   getBytesOnDisk() = 134144
>   getVisibleLength()= 134144
> end at chunk (134144/512=262)
> DN3 after recover rbw
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134028 
>   getBytesOnDisk() = 134028
>   getVisibleLength()= 134028
> the client sends a packet after pipeline recovery
> offset=133632  len=1008
> DN4 after flush 
> 2013-04-01 21:02:31,779 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1063
> // meta end position should be ceil(134640/512)*4 + 7 == 1059, but now it is 
> 1063.
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, 
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, 
> lastPacketInBlock=false, offsetInBlock=134640, 
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing 
> meta file offset of block 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1059
> After checking the meta file on DN4, I found that the checksum of chunk 262 is 
> duplicated, but the data is not.
> Later, after the block was finalized, DN4's scanner detected the bad block and 
> reported it to the NN. The NN sent a command to delete this block and to 
> re-replicate it from another DN in the pipeline to satisfy the replication 
> factor.
> I think this is because BlockReceiver skips data bytes that are already written, 
> but does not skip checksum bytes that are already written. Also, the function 
> adjustCrcFilePosition is only used for the last non-completed chunk, but
> not for this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9607) Advance Hadoop Architecture (AHA) - HDFS

2016-01-03 Thread Dinesh S. Atreya (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080500#comment-15080500
 ] 

Dinesh S. Atreya commented on HDFS-9607:


This write feature proposal is a very restricted version of the POSIX write 
capability, to minimize the impact on HDFS while still providing comprehensive 
functionality.
For example, the following replacements would be allowed:
{noformat} "Hello World" {noformat} by {noformat} "Hello HDFS!" {noformat}, or
{noformat} "Hello World" {noformat} by {noformat} "Hello: HDFS" {noformat}, or
{noformat} "Hello World" {noformat} by {noformat} "Hello  HDFS" {noformat} 
(with an extra space between Hello and HDFS), or
{noformat} "Hello World" {noformat} by {noformat} "   " {noformat} 
(blank spaces or any other *_padding_* instead of letters, while maintaining the 
length).
The following are not allowed:
{noformat} "Hello World" {noformat} by {noformat} "Hello Hadoop" {noformat} 
(greater length), and
{noformat} "Hello World" {noformat} by {noformat} "Hello HDFS" {noformat} 
(shorter length).

Note: The term *_padding_* is used to account for cases in which *_encryption_* 
with associated *_padding_* is used.
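
A hypothetical sketch of what such a same-length, in-place write could look like as an API (the interface, method names, and checks below are purely illustrative; no such API exists in HDFS today):
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Purely hypothetical API sketch illustrating the length-preservation rule.
interface InPlaceWritableFile {
  /** Overwrites exactly data.length bytes starting at offset. */
  void writeInPlace(long offset, byte[] data) throws IOException;

  long length() throws IOException;
}

class InPlaceWriter {
  static void replace(InPlaceWritableFile file, long offset,
                      String oldText, String newText) throws IOException {
    byte[] oldBytes = oldText.getBytes(StandardCharsets.UTF_8);
    byte[] newBytes = newText.getBytes(StandardCharsets.UTF_8);

    if (newBytes.length != oldBytes.length) {
      // "Hello World" -> "Hello Hadoop" or "Hello HDFS" is rejected here.
      throw new IOException("in-place write must preserve length");
    }
    if (offset + newBytes.length > file.length()) {
      throw new IOException("write extends past the end of the file");
    }
    // "Hello World" -> "Hello HDFS!" (same length) is accepted.
    file.writeInPlace(offset, newBytes);
  }
}
{code}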

> Advance Hadoop Architecture (AHA) - HDFS
> 
>
> Key: HDFS-9607
> URL: https://issues.apache.org/jira/browse/HDFS-9607
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Dinesh S. Atreya
>
> Link to Umbrella JIRA
> https://issues.apache.org/jira/browse/HADOOP-12620 
> Provide capability to carry out in-place writes/updates. Only writes in-place 
> are supported where the existing length does not change.
> For example, "Hello World" can be replaced by "Hello HDFS!"
> See 
> https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
>  for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9607) Advance Hadoop Architecture (AHA) - HDFS

2016-01-03 Thread Dinesh S. Atreya (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080498#comment-15080498
 ] 

Dinesh S. Atreya commented on HDFS-9607:


Can somebody delete the above comment and this one, since I do not seem to have 
permission to edit or delete my own comments?

Alternatively, please grant me permission to edit/delete my own comments.

> Advance Hadoop Architecture (AHA) - HDFS
> 
>
> Key: HDFS-9607
> URL: https://issues.apache.org/jira/browse/HDFS-9607
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Dinesh S. Atreya
>
> Link to Umbrella JIRA
> https://issues.apache.org/jira/browse/HADOOP-12620 
> Provide capability to carry out in-place writes/updates. Only writes in-place 
> are supported where the existing length does not change.
> For example, "Hello World" can be replaced by "Hello HDFS!"
> See 
> https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
>  for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9607) Advance Hadoop Architecture (AHA) - HDFS

2016-01-03 Thread Dinesh S. Atreya (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080493#comment-15080493
 ] 

Dinesh S. Atreya commented on HDFS-9607:


This write feature proposal is a very restricted version of the POSIX write 
capability, to minimize the impact on HDFS while still providing comprehensive 
functionality.
For example, the following replacements would be allowed:
{code}"Hello World" {code} by {code} "Hello HDFS!" {code}
{noformat} "Hello World" {noformat} by {noformat} "Hello: HDFS" {noformat}
"Hello World" by "Hello  HDFS" (with an extra space between Hello and HDFS).
"Hello World" by "   " (blank spaces instead of letters while 
maintaining the length).
The following are not allowed:
"Hello World" by "Hello Hadoop" (greater length).
"Hello World" by "Hello HDFS" (shorter length).



> Advance Hadoop Architecture (AHA) - HDFS
> 
>
> Key: HDFS-9607
> URL: https://issues.apache.org/jira/browse/HDFS-9607
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Dinesh S. Atreya
>
> Link to Umbrella JIRA
> https://issues.apache.org/jira/browse/HADOOP-12620 
> Provide capability to carry out in-place writes/updates. Only writes in-place 
> are supported where the existing length does not change.
> For example, "Hello World" can be replaced by "Hello HDFS!"
> See 
> https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
>  for more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9607) Advance Hadoop Architecture (AHA) - HDFS

2016-01-03 Thread Dinesh S. Atreya (JIRA)
Dinesh S. Atreya created HDFS-9607:
--

 Summary: Advance Hadoop Architecture (AHA) - HDFS
 Key: HDFS-9607
 URL: https://issues.apache.org/jira/browse/HDFS-9607
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Dinesh S. Atreya


Link to Umbrella JIRA
https://issues.apache.org/jira/browse/HADOOP-12620 
Provide capability to carry out in-place writes/updates. Only writes in-place 
are supported where the existing length does not change.
For example, "Hello World" can be replaced by "Hello HDFS!"
See 
https://issues.apache.org/jira/browse/HADOOP-12620?focusedCommentId=15046300&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15046300
 for more details.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8430) Erasure coding: update DFSClient.getFileChecksum() logic for stripe files

2016-01-03 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080487#comment-15080487
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8430:
---

[~drankye], I think it is a good start for the first implementation.  We may 
improve it later on.  Some ideas:
# Instead of sending all CRCs to the client, send all CRCs to one of the 
datanodes in a block group.  That datanode computes the block MD5s and returns 
them to the client.  Then, the computation becomes distributed (a rough sketch 
follows after this list).
# We may consider changing the checksum algorithm for replicated files 
(although it is incompatible with the old clusters)
## Use CRC64 (or some other linear code) for block checksum instead of MD5.  
The datanode may compute cell CRC64s and then send them to a client (or a 
datanode). We may combine the cell CRC64s to obtain the block CRC64 since the 
code is linear.  Since datanodes send cell checksums instead of data checksums, 
the network overhead becomes negligible.
## Or simply compute cell checksums for replicated files instead of block 
checksums.
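
A rough sketch of the aggregation step from idea 1, assuming the per-block cell CRCs have already been collected on one datanode of the block group (the helper below is illustrative, not an existing HDFS API):
{code}
import java.security.MessageDigest;
import java.util.List;

// Illustrative sketch: compute one MD5 per block from that block's cell CRCs,
// then an MD5 over the per-block MD5s, so only the final digest travels back
// to the client and the per-block work stays on the datanode.
class BlockGroupChecksumAggregator {
  static byte[] aggregate(List<byte[]> perBlockCellCrcs) throws Exception {
    MessageDigest outer = MessageDigest.getInstance("MD5");
    for (byte[] cellCrcs : perBlockCellCrcs) {
      byte[] blockMd5 = MessageDigest.getInstance("MD5").digest(cellCrcs);
      outer.update(blockMd5);
    }
    return outer.digest();   // returned to the client as the group checksum
  }
}
{code}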

> Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
> -
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so it can work for striped 
> block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9606) Unknown error encountered while tailing edits. Shutting down standby NN.

2016-01-03 Thread tangshangwen (JIRA)
tangshangwen created HDFS-9606:
--

 Summary: Unknown error encountered while tailing edits. Shutting 
down standby NN.
 Key: HDFS-9606
 URL: https://issues.apache.org/jira/browse/HDFS-9606
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: tangshangwen
Assignee: tangshangwen


The standby NN shuts down when applying the edit log:
{noformat}
2016-01-03 14:04:19,293 FATAL 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unknown error 
encountered while tailing edits. Shutting down standby NN.
java.io.IOException: Failed to apply edit log operation ReassignLeaseOp 
[leaseHolder=DFSClient_NONMAPREDUCE_854707399_1, 
path=/tmp/jrdw/kafka2hdfs/log_mobile_gateway-21-1443245603647--6536501137915724876,
 newHolde
r=HDFS_NameNode, opCode=OP_REASSIGN_LEASE, txid=20790808505]: error File is not 
under construction: 
/tmp/jrdw/kafka2hdfs/log_mobile_gateway-21-1443245603647--6536501137915724876
   at 
org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
   at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:205)
   at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:112)
   at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:771)
   at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227)
   at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321)
   at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279)
   at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296)
   at 
org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:456)
   at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:292)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)