[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981798#comment-14981798 ] Hadoop QA commented on HDFS-9289:

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 3m 8s | trunk passed |
| +1 | compile | 1m 4s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 1m 4s | trunk passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvneclipse | 0m 29s | trunk passed |
| -1 | findbugs | 2m 3s | hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs |
| +1 | javadoc | 1m 39s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 2m 18s | trunk passed with JDK v1.7.0_79 |
| +1 | mvninstall | 1m 12s | the patch passed |
| +1 | compile | 1m 2s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 1m 2s | the patch passed |
| +1 | compile | 1m 1s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 1m 1s | the patch passed |
| -1 | checkstyle | 0m 21s | Patch generated 1 new checkstyle issues in hadoop-hdfs-project (total was 247, now 247). |
| +1 | mvneclipse | 0m 29s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 4m 14s | the patch passed |
| +1 | javadoc | 1m 30s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 2m 19s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 69m 1s | hadoop-hdfs in the patch failed with JDK v1.8.0_66. |
| +1 | unit | 0m 55s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 67m 54s | hadoop-hdfs in the patch passed with JDK v1.7.0_79. |
| +1 | unit | 0m 57s | hadoop-hdfs-client in the patch passed with JDK v1.7.0_79. |
| -1 | asflicense | 0m 20s | Patch generated 56 ASF License warnings. |
| | | 168m 34s | |

|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.server.datanode.TestBlockScanner |
| | hadoop.hdfs.server.namenode.TestFSImage |
| | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |

|| Subsystem || Report/Notes ||
| Docker | Client=1.7.0 Server=1.7.0 Image:test-patch-base-hadoop-date2015-10-30 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12769654/HDFS-9289.6.patch |
| JIRA Issue | HDFS-9289 |
| Optional T
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981321#comment-14981321 ] Zhe Zhang commented on HDFS-9289:

A small ask for the next rev:
{code}
// BlockInfo#commitBlock
- this.set(getBlockId(), block.getNumBytes(), block.getGenerationStamp());
+ this.setNumBytes(block.getNumBytes());
{code}
We also need to add a test in the 04 patch. Otherwise LGTM.

> check genStamp when complete file
> ---------------------------------
>
>            Key: HDFS-9289
>            URL: https://issues.apache.org/jira/browse/HDFS-9289
>        Project: Hadoop HDFS
>     Issue Type: Bug
>       Reporter: Chang Li
>       Assignee: Chang Li
>       Priority: Critical
>    Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch, HDFS-9289.3.patch, HDFS-9289.4.patch
>
> We have seen a case of a corrupt block caused by a file completing after a pipelineUpdate, where the file completed with the old block genStamp. This caused the replicas on two datanodes in the updated pipeline to be viewed as corrupt. Propose to check the genStamp when committing the block.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981313#comment-14981313 ] Zhe Zhang commented on HDFS-9289:

Thanks Jing for the explanation. I agree it's reasonable to throw an exception in {{commitBlock}} and rely on lease recovery to bring the block back to full strength in this case.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981291#comment-14981291 ] Jing Zhao commented on HDFS-9289:

bq. In general, if a client misreports GS, does it indicate a likelihood of misreported numBytes -- and therefore we should deny the commitBlock?

Currently the NN only depends on the length reported by the client to determine the block length (not considering the lease recovery scenario). So the only check we can do about the length is the existing one: {{assert block.getNumBytes() <= commitBlock.getNumBytes()}}.

bq. But it's still a data loss because the data written by the client after updatePipeline becomes invisible.

Throwing an exception here does not necessarily mean that the data written after updatePipeline will be lost. In most cases the data can still be recovered during lease recovery, considering the replicas have already been persisted on the DataNodes before the client sends out the commit/complete request to the NN (since the client has received the last response from the pipeline at that time). So throwing an exception here should be the correct behavior and may not be that risky.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981262#comment-14981262 ] Zhe Zhang commented on HDFS-9289:

bq. That's silent data corruption!

[~daryn] I agree it's silent data corruption in the current logic, because we update the NN's copy of the GS with the GS reported by the client:
{code}
// BlockInfo#commitBlock
this.set(getBlockId(), block.getNumBytes(), block.getGenerationStamp());
{code}
Throwing an exception (and therefore denying the commitBlock) turns this into an explicit failure, which is better. But it's still a data loss because the data written by the client after {{updatePipeline}} becomes invisible.

So I think, at least for this particular bug (the missing {{volatile}}), the right thing to do is to avoid changing the NN's copy of the GS when committing the block (and we should avoid changing the block ID as well). The only thing we should commit is {{numBytes}}. Of course we should still print a {{WARN}} or {{ERROR}} when the GSes mismatch. As a safer first step, we should at least avoid decrementing the NN's copy of the block GS.

In general, if a client misreports GS, does it indicate a likelihood of misreported {{numBytes}} -- and therefore we should deny the {{commitBlock}}? It's hard to say; the {{volatile}} bug here is only for GS. But since we have already ensured the NN's copy of the block {{numBytes}} never decrements, the harm of a misreported {{numBytes}} is not severe.
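The behavior proposed above can be sketched in a few lines. This is a hedged, self-contained model with hypothetical, simplified stand-in types (not the real org.apache.hadoop.hdfs BlockInfo API): commit only the client-reported length, enforce that it never shrinks, keep the NN's generation stamp, and surface a GS mismatch as a warning rather than absorbing it.

```java
// Hypothetical sketch of the commit policy discussed above. The Block type,
// field names, and commitBlock signature are simplified stand-ins.
public class BlockCommitSketch {

    /** Simplified stand-in for a block triplet: (id, numBytes, genStamp). */
    public static class Block {
        final long id;
        long numBytes;
        long genStamp;

        public Block(long id, long numBytes, long genStamp) {
            this.id = id;
            this.numBytes = numBytes;
            this.genStamp = genStamp;
        }
    }

    /**
     * Commits only the client-reported length. The NN copy's genStamp is
     * never overwritten; a mismatch is reported instead of absorbed.
     */
    public static String commitBlock(Block nnCopy, Block reported) {
        if (nnCopy.id != reported.id) {
            throw new IllegalStateException("block id mismatch");
        }
        // The NN's recorded length must never shrink at commit time.
        if (reported.numBytes < nnCopy.numBytes) {
            throw new IllegalStateException("reported length shrinks block");
        }
        nnCopy.numBytes = reported.numBytes; // the only field committed
        if (nnCopy.genStamp != reported.genStamp) {
            return "WARN: client reported GS " + reported.genStamp
                + ", NN has GS " + nnCopy.genStamp;
        }
        return "OK";
    }

    public static void main(String[] args) {
        // NN copy after updatePipeline bumped the GS; client reports stale GS.
        Block nn = new Block(3773617405L, 0L, 1001L);
        Block client = new Block(3773617405L, 107761275L, 1000L);
        System.out.println(commitBlock(nn, client));
        System.out.println(nn.genStamp);  // NN GS left untouched
        System.out.println(nn.numBytes);  // length committed
    }
}
```

Under this sketch a stale client GS degrades to a logged warning plus a committed length, which matches the "commit only numBytes, warn on GS mismatch" position; swapping the warning for a thrown exception would implement the stricter behavior Daryn argues for.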
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14981222#comment-14981222 ] Chang Li commented on HDFS-9289:

Thanks [~jingzhao], [~zhz] and [~daryn] for the review and the valuable discussion! Some additional info about the several cases of mismatched GS we encountered: they all happened after a pipelineUpdate for DataNode close recovery, so there was no mismatched size at commit, only a mismatched GS. Could we reach a consensus on whether we should log a warning with the mismatched-GS block info or throw an exception?
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980535#comment-14980535 ] Daryn Sharp commented on HDFS-9289:

I worked with Chang on this issue and can't think of a scenario in which it's legitimate for the client to misreport the genstamp - whether the pipeline was updated or not. Consider a more extreme case: The client wrote more data after the pipeline recovered and misreports the older genstamp. That's silent data corruption! I'd like to see an exception here rather than later.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979400#comment-14979400 ] Jing Zhao commented on HDFS-9289:

bq. What if the updatePipeline RPC call has successfully finished NN side changes but failed in sending response to client? Should we allow the client to commit the block?

The RPC call will fail on the client side, and the client will not use the old GS to commit the block.

bq. How about we use this JIRA to commit the volatile change (which should fix the reported issue) and dedicate a follow-on JIRA to the commitBlock GS validation change?

I agree with this proposal. Let's only log a warning msg on the NN side and not throw an exception in this jira.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979367#comment-14979367 ] Zhe Zhang commented on HDFS-9289:

Thanks Jing for sharing the thoughts. I think the GS validation in {{BlockManager#commitBlock}} is a little tricky. What if the {{updatePipeline}} RPC call has successfully finished NN-side changes but failed in sending the response to the client? Should we allow the client to commit the block?

GS is used to determine whether a replica is stale, but the client doesn't have a replica. Among the 3 attributes of a block (ID, size, GS), the client should always have the same ID as the NN, and should always have a fresher size than the NN. So maybe the right thing to do is to discard the client-reported GS in {{commitBlock}}, but I'm not so sure. How about we use this JIRA to commit the {{volatile}} change (which should fix the reported issue) and dedicate a follow-on JIRA to the {{commitBlock}} GS validation change?
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978896#comment-14978896 ] Jing Zhao commented on HDFS-9289:

Making DataStreamer#block volatile is a good change, and the GS validation on the NN side also looks good to me. Maybe we do not need a new {{InvalidGenStampException}} type, though. Logging the detailed information of the block with the mismatching GS on the NN side will also be useful.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978617#comment-14978617 ] Chang Li commented on HDFS-9289:

[~zhz], no, we don't have this log because we didn't enable the blockStateChangeLog. How do you propose we should proceed with this jira?
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977679#comment-14977679 ] Hadoop QA commented on HDFS-9289:

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| -1 | pre-patch | 30m 31s | Pre-patch trunk has 1 extant Findbugs (version 3.0.0) warnings. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | tests included | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | javac | 11m 6s | There were no new javac warning messages. |
| +1 | javadoc | 16m 4s | There were no new javadoc warning messages. |
| +1 | release audit | 0m 39s | The applied patch does not increase the total number of release audit warnings. |
| -1 | checkstyle | 3m 30s | The applied patch generated 1 new checkstyle issues (total was 161, now 161). |
| +1 | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| +1 | install | 2m 33s | mvn install still works. |
| +1 | eclipse:eclipse | 1m 5s | The patch built with eclipse:eclipse. |
| +1 | findbugs | 8m 5s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| +1 | native | 5m 22s | Pre-build of native portion |
| -1 | hdfs tests | 78m 6s | Tests failed in hadoop-hdfs. |
| +1 | hdfs tests | 0m 59s | Tests passed in hadoop-hdfs-client. |
| | | 159m 31s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| | hadoop.hdfs.TestRecoverStripedFile |
| | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
| | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
| | hadoop.hdfs.TestEncryptionZones |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12769100/HDFS-9289.3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 68ce93c |
| Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/13238/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/13238/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/13238/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/13238/artifact/patchprocess/testrun_hadoop-hdfs-client.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/13238/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/13238/console |

This message was automatically generated.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977260#comment-14977260 ] Zhe Zhang commented on HDFS-9289:

bq. I think there probabaly exist some cache coherence issue

This sounds possible. Maybe the {{DFSOutputStream}} thread uses a stale copy of {{block}} in {{completeFile}}, after {{block}} is updated by the {{DataStreamer}} thread.

bq. Then pipelineupdate happen with only d2 and d3 with new GS. Then file complete with old GS and d2 and d3 were marked corrupt.

Do you have any log showing that "replica marked as corrupt because its GS is newer than the block GS on NN"? Regardless, making {{DataStreamer#block}} volatile is a good change. Ideally we should add a test to emulate the cache coherency problem, but it doesn't look easy.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977156#comment-14977156 ] Chang Li commented on HDFS-9289:

[~zhz], yes, the above log is from the same cluster as the first log I posted. The two replicas on the two datanodes from the updated pipeline had the new GS, but they were marked as corrupt because the block was committed with the old genstamp.

The complete story in that cluster: there were initially 3 datanodes in the pipeline, d1, d2, d3. Then a pipelineUpdate happened with only d2 and d3, with a new GS. Then the file completed with the old GS, and d2 and d3 were marked corrupt. After 1 day, the full block report from d1 came in, and the NN found that d1 had the right block with the "correct" old GS but was under-replicated, so the NN told d1 to replicate its replica with the old GS to two other nodes, d4 and d5. So the 3 DNs I showed above were d1, d4, and d5, all having the old GS.

I think there probably exists some cache coherence issue, since {code}protected ExtendedBlock block;{code} lacks volatile. That could also explain why this issue didn't happen frequently.
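The visibility hazard described above can be illustrated with a minimal, self-contained sketch (hypothetical names; this is not the real DataStreamer, and the generation stamp stands in for the ExtendedBlock reference): one thread bumps the GS during pipeline recovery, and the completing thread later reads it. Declaring the field volatile is what guarantees the reader observes the update without any other synchronization.

```java
// Minimal visibility sketch (hypothetical names, not the real DataStreamer).
// A streamer thread bumps the generation stamp during pipeline recovery;
// the main thread later reads it to complete the file. The volatile
// qualifier is the fix under discussion.
public class GenStampVisibility {

    // Stand-in for DataStreamer#block; volatile guarantees the completing
    // thread sees the streamer thread's update.
    private volatile long generationStamp;

    public GenStampVisibility(long initialGs) {
        this.generationStamp = initialGs;
    }

    /** Emulates updatePipeline bumping the GS on the streamer thread. */
    public void bumpOnStreamerThread(long newGs) throws InterruptedException {
        Thread streamer = new Thread(() -> generationStamp = newGs);
        streamer.start();
        streamer.join();
    }

    /** Emulates completeFile reading the GS on the calling thread. */
    public long gsSeenAtComplete() {
        return generationStamp;
    }

    public static void main(String[] args) throws InterruptedException {
        GenStampVisibility s = new GenStampVisibility(1000L);
        s.bumpOnStreamerThread(1001L);
        System.out.println(s.gsSeenAtComplete());
    }
}
```

Note that this sketch joins the writer thread, which itself establishes a happens-before edge; in the real client there is no such join between the streamer's pipeline recovery and the completeFile call, which is exactly why the volatile qualifier must supply the visibility guarantee, and why the stale-GS symptom appeared only rarely.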
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976925#comment-14976925 ] Zhe Zhang commented on HDFS-9289:

The fact that all 3 DNs have the old GS doesn't mean the client also has an old GS. Is the above log from the same cluster as the previous [logs | https://issues.apache.org/jira/browse/HDFS-9289?focusedCommentId=14972655&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14972655]? In these cases, is there any replica with the correct (new) GS? If so, it doesn't look like a bug. If all replicas of a block have the old GS, then it's more suspicious.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976815#comment-14976815 ] Chang Li commented on HDFS-9289:

[~zhz], I don't have a log showing the file was completed with an old GS. But by looking up the block from the JSP page right now, I can see that block blk_3773617405 currently has replicas on hosts ***657n26.***.com, ***656n04.***.com, and ***656n38.***.com. By going to those datanodes, I see the replicas there have the old genstamp.
{code}
bash-4.1$ hostname
***657n26.***.com
bash-4.1$ ls -l /grid/2/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405*
-rw-r--r-- 1 hdfs users 107761275 Oct 23 18:00 /grid/2/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405
-rw-r--r-- 1 hdfs users    841895 Oct 23 18:00 /grid/2/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405_1106111498065.meta

bash-4.1$ hostname
***656n04.***.com
bash-4.1$ ls -l /grid/1/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405*
-rw-r--r-- 1 hdfs users 107761275 Oct 21 19:14 /grid/1/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405
-rw-r--r-- 1 hdfs users    841895 Oct 21 19:14 /grid/1/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405_1106111498065.meta

bash-4.1$ hostname
***656n38.***.com
bash-4.1$ ls -l /grid/3/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405*
-rw-r--r-- 1 hdfs users 107761275 Oct 23 09:14 /grid/3/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405
-rw-r--r-- 1 hdfs users    841895 Oct 23 09:14 /grid/3/hadoop/var/hdfs/data/current/BP-1052427332-98.138.108.146-1350583571998/current/finalized/subdir236/subdir212/blk_3773617405_1106111498065.meta
{code}
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976735#comment-14976735 ] Zhe Zhang commented on HDFS-9289:

bq. the client after updatepipeline with the new gen stamp it later completed file with the old gen stamp

This looks very strange. But why do you think this happened? Did you see logs showing that the file was completed with an old GS?
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976471#comment-14976471 ] Chang Li commented on HDFS-9289: Hi, [~walter.k.su], I don't know in which cluster this strange case will happen again, and I can't enable the debug messages of NameNode.blockStateChangeLog across all clusters. I will look into the root cause of how this strange problem happened.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975837#comment-14975837 ] Walter Su commented on HDFS-9289: - The patch hides a potentially bigger bug. We should find it and address it. Hi, [~lichangleo], I'd very much appreciate it if you could enable the debug level of {{NameNode.blockStateChangeLog}} and attach more logs, or provide instructions on how to reproduce it.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975381#comment-14975381 ] Chang Li commented on HDFS-9289: [~zhz], you are right: the client had the new genstamp. But the problem I am trying to point out is that after the client updated the pipeline with the new gen stamp, it later completed the file with the old gen stamp. So my patch tries to prevent the client from completing the file with the old genstamp after it has updated the pipeline with the new genstamp.
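The check Chang Li describes can be sketched roughly as below. This is an illustrative sketch only, not the actual HDFS-9289 patch; the class, method, and message are hypothetical stand-ins for the real commit-block path:

```java
// Hypothetical sketch of the proposed genstamp check at block commit time.
// Not actual HDFS code: the real change would live in the NameNode's
// commit/complete path and differ in detail.
public class CommitCheck {

    /**
     * Reject a completeFile whose block carries a generation stamp that
     * disagrees with the one the NameNode recorded at updatePipeline.
     */
    static void commitBlock(long storedGenStamp, long reportedGenStamp) {
        if (reportedGenStamp != storedGenStamp) {
            throw new IllegalStateException(
                "Commit block with mismatching GS: NN has " + storedGenStamp
                + ", client submits " + reportedGenStamp);
        }
    }

    public static void main(String[] args) {
        // Matching GS (post-updatePipeline value from the log): accepted.
        commitBlock(1106111511603L, 1106111511603L);

        // Client completes with the pre-updatePipeline genstamp: rejected.
        boolean rejected = false;
        try {
            commitBlock(1106111511603L, 1106111498065L);
        } catch (IllegalStateException e) {
            rejected = true;
        }
        if (!rejected) throw new AssertionError("old GS must be rejected");
        System.out.println("ok");
    }
}
```

With such a check in place, the completeFile call in the logs above would fail fast instead of silently committing the block with the stale genstamp.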
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975348#comment-14975348 ] Zhe Zhang commented on HDFS-9289: - [~lichangleo] I think the log below shows that the client does have the new GS {{1106111511603}}, because the parameter {{newBlock}} is passed in from the client. So IIUC, even if we check the GS when completing the file, as the patch does, it won't stop the client from completing/closing the file. Or could you describe how you think the patch can avoid this error? Thanks.
{code}
2015-10-20 19:49:20,392 [IPC Server handler 63 on 8020] INFO namenode.FSNamesystem: updatePipeline(BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065) successfully to BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111511603
{code}
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974704#comment-14974704 ] Chang Li commented on HDFS-9289: Hi [~jingzhao], we are currently using the default, and the default is true.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974679#comment-14974679 ] Jing Zhao commented on HDFS-9289: - Hi [~lichangleo], what is the current conf setting of the replace-datanode-on-failure policy in your cluster? From the log it looks like you disabled it?
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974525#comment-14974525 ] Zhe Zhang commented on HDFS-9289: - [~lichangleo] Thanks for sharing the logs! I'll look at the patch and logs and post a review today.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974497#comment-14974497 ] Chang Li commented on HDFS-9289: [~zhz], before we figure out the root cause of this strange case, should we let this jira be a temporary fix?
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974490#comment-14974490 ] Chang Li commented on HDFS-9289: We have hit another case in our cluster:
{code}
2015-10-23 04:38:08,544 [IPC Server handler 11 on 8020] INFO hdfs.StateChange: BLOCK* allocateBlock: /projects/wcc/wcc1/data/2015/10/22/05/Content-9892.temp.gz.temp._COPYING_. BP-1161836467-98.137.240.59-1438814573258 blk_1427767166_354062734{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-a04f60ed-6700-4e93-8a52-555301e07d3b:NORMAL:10.213.43.41:1004|RBW], ReplicaUnderConstruction[[DISK]DS-7e0de56b-17ba-4164-8b19-67a9f9f84c2c:NORMAL:10.213.46.123:1004|RBW], ReplicaUnderConstruction[[DISK]DS-14a850d1-deb9-496b-b5ed-bb57010a8b56:NORMAL:10.213.46.96:1004|RBW]]}
2015-10-23 04:39:35,588 [IPC Server handler 5 on 8020] INFO namenode.FSNamesystem: updatePipeline(block=BP-1161836467-98.137.240.59-1438814573258:blk_1427767166_354062734, newGenerationStamp=354080525, newLength=24505255, newNodes=[10.213.46.123:1004, 10.213.46.96:1004], clientName=DFSClient_NONMAPREDUCE_1262158981_1)
2015-10-23 04:39:35,588 [IPC Server handler 5 on 8020] INFO namenode.FSNamesystem: updatePipeline(BP-1161836467-98.137.240.59-1438814573258:blk_1427767166_354062734) successfully to BP-1161836467-98.137.240.59-1438814573258:blk_1427767166_354080525
2015-10-23 04:39:35,595 [IPC Server handler 50 on 8020] INFO hdfs.StateChange: DIR* completeFile: /projects/wcc/wcc1/data/2015/10/22/05/Content-9892.temp.gz.temp._COPYING_ is closed by DFSClient_NONMAPREDUCE_1262158981_1
{code}
This is also a completeFile right after a pipelineUpdate. The JSP page shows three nodes that currently hold the replica of blk_1427767166. One of the nodes is 10.213.43.41, the first node in the old pipeline, which dropped out of the updated pipeline; the replica currently on that node has the old gen stamp.
The replicas on the other two nodes were created later, after the first node in the old pipeline sent in its block report. The two nodes in the updated pipeline were marked as corrupt until the node 10.213.43.41 sent in its block report.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972655#comment-14972655 ] Chang Li commented on HDFS-9289: Hi [~zhz], here is the log:
{code}
INFO hdfs.StateChange: BLOCK* allocateBlock: /projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/_temporary/1/_temporary/attempt_1444859775697_31140_m_001028_0/part-m-01028. BP-1052427332-98.138.108.146-1350583571998 blk_3773617405_1106111498065{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-0a28b82a-e3fb-4e42-b925-e76ebd98afb4:NORMAL:10.216.32.61:1004|RBW], ReplicaUnderConstruction[[DISK]DS-236c19ee-0a39-4e53-9520-c32941ca1828:NORMAL:10.216.70.49:1004|RBW], ReplicaUnderConstruction[[DISK]DS-fc7c2dab-9309-46be-b5c0-52be8e698591:NORMAL:10.216.70.43:1004|RBW]]}
2015-10-20 19:49:20,392 [IPC Server handler 63 on 8020] INFO namenode.FSNamesystem: updatePipeline(block=BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065, newGenerationStamp=1106111511603, newLength=107761275, newNodes=[10.216.70.49:1004, 10.216.70.43:1004], clientName=DFSClient_attempt_1444859775697_31140_m_001028_0_1424303982_1)
2015-10-20 19:49:20,392 [IPC Server handler 63 on 8020] INFO namenode.FSNamesystem: updatePipeline(BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065) successfully to BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111511603
2015-10-20 19:49:20,400 [IPC Server handler 96 on 8020] INFO hdfs.StateChange: DIR* completeFile: /projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/_temporary/1/_temporary/attempt_1444859775697_31140_m_001028_0/part-m-01028 is closed by DFSClient_attempt_1444859775697_31140_m_001028_0_1424303982_1
{code}
You can see the file completes after a pipeline update. The block changed its genStamp from blk_3773617405_1106111498065 to blk_3773617405_1106111511603.
But then the two nodes in the updated pipeline are marked as corrupt. When I run fsck, it shows:
{code}
hdfs fsck /projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028
Connecting to namenode via http://uraniumtan-nn1.tan.ygrid.yahoo.com:50070
FSCK started by hdfs (auth:KERBEROS_SSL) from /98.138.131.190 for path /projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028 at Wed Oct 21 15:04:56 UTC 2015
.
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028: CORRUPT blockpool BP-1052427332-98.138.108.146-1350583571998 block blk_3773617405
/projects/FETLDEV/Benzene/benzene_stg_transient/primer/201510201900/part-m-01028: Replica placement policy is violated for BP-1052427332-98.138.108.146-1350583571998:blk_3773617405_1106111498065. Block should be additionally replicated on 1 more rack(s).
{code}
It shows the block with the old gen stamp, blk_3773617405_1106111498065.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972209#comment-14972209 ] Zhe Zhang commented on HDFS-9289: - [~lichangleo] Thanks for reporting the issue.
bq. but the file complete with the old block genStamp.
How did that happen? So the client somehow had an old GS? IIUC the {{updatePipeline}} protocol is as below (using {{client_GS}}, {{DN_GS}}, and {{NN_GS}} to denote the 3 copies of the GS):
# Client asks for a new GS from the NN through {{updateBlockForPipeline}}. After this, {{client_GS}} is new; both {{DN_GS}} and {{NN_GS}} are old.
# Client calls {{createBlockOutputStream}} to update the DN's GS. After this, both {{client_GS}} and {{DN_GS}} are new; {{NN_GS}} is old.
# Client calls {{updatePipeline}}. After this, all 3 GSes should be new.
Maybe step 3 failed, and then the client tried to complete the file? It'd be ideal if you could extend the unit test to reproduce the error without the fix (or paste the error log). Thanks!
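Zhe's three-step handshake can be modeled as a tiny state simulation. This is a toy model, not HDFS code: the method names mirror the RPCs but the bodies only track the three copies of the generation stamp, showing how a lost step 3 leaves the NameNode's copy stale:

```java
// Toy model of the updatePipeline generation-stamp handshake described above.
// clientGS / dnGS / nnGS mirror the three copies of the GS; the methods are
// named after the real RPCs but are illustrative only.
public class GsHandshake {
    long clientGS, dnGS, nnGS;

    GsHandshake(long oldGS) { clientGS = dnGS = nnGS = oldGS; }

    // Step 1: NN hands out a new GS to the client.
    void updateBlockForPipeline(long newGS) { clientGS = newGS; }

    // Step 2: client pushes the new GS to the DNs in the new pipeline.
    void createBlockOutputStream() { dnGS = clientGS; }

    // Step 3: NN records the new GS.
    void updatePipeline() { nnGS = clientGS; }

    public static void main(String[] args) {
        GsHandshake h = new GsHandshake(1106111498065L); // old GS from the log
        h.updateBlockForPipeline(1106111511603L);        // new GS from the log
        h.createBlockOutputStream();
        // If step 3 is skipped or lost, the NN's copy stays old, so a
        // subsequent completeFile can disagree with the NN's block map.
        if (h.nnGS == h.clientGS) throw new AssertionError();
        h.updatePipeline();
        // After step 3 all three copies agree.
        if (h.nnGS != h.clientGS || h.dnGS != h.clientGS) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Note that the logs posted earlier show updatePipeline succeeding on the NN, which is why a failed step 3 alone does not fully explain the reported completeFile with the old GS.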
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971590#comment-14971590 ] Elliott Clark commented on HDFS-9289: - It had all of the data and the same md5sums when I checked, so the only thing different was the genstamps. Not really sure why that happened. But I didn't mean to sidetrack this jira. The test looks nice.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971302#comment-14971302 ] Chang Li commented on HDFS-9289: [~eclark], the block on 10.210.31.38 should be marked as corrupt because it's from the old pipeline, right?
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970213#comment-14970213 ] Elliott Clark commented on HDFS-9289: -
{code}
15/10/22 09:37:36 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1190230043 added as corrupt on 10.210.31.38:50010 by hbase4678.test.com/10.210.31.38 because reported RBW replica with genstamp 116735085 does not match COMPLETE block's genstamp in block map 116737586
{code}
The block length on the "corrupt" replicas is the same as on the non-corrupt ones. The only difference is the genstamp.
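The NameNode's decision in this log entry boils down to a genstamp comparison against the block map. The sketch below is an illustrative restatement of that rule, not actual BlockManager code, using the genstamps from the log above:

```java
// Illustrative restatement of the rule visible in the log: a reported
// replica whose genstamp differs from the COMPLETE block's genstamp in the
// block map is treated as corrupt. Not actual BlockManager code.
public class StaleReplicaCheck {
    static boolean isCorrupt(long blockMapGenStamp, long reportedGenStamp) {
        return reportedGenStamp != blockMapGenStamp;
    }

    public static void main(String[] args) {
        // Values from the log: block map has 116737586, the RBW replica on
        // 10.210.31.38 reports the pre-update genstamp 116735085.
        if (!isCorrupt(116737586L, 116735085L)) throw new AssertionError();
        if (isCorrupt(116737586L, 116737586L)) throw new AssertionError();
        System.out.println("ok");
    }
}
```

This is why identical block contents (same length, same md5sum) can still be invalidated: the comparison is on genstamps, not data.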
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969984#comment-14969984 ] Chang Li commented on HDFS-9289: I will update the patch soon with the expected and encountered gen stamps in the message, plus a unit test.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969979#comment-14969979 ] Chang Li commented on HDFS-9289: Hi [~eclark], I think the case you gave is not the same, and the corrupt block doesn't seem to be caused by the gen stamp mismatch described in this jira. Your initial pipeline has nodes 33, 48, 38. After the pipeline update it has nodes 33, 45, 29. Node 38 is then marked corrupt due to a gen stamp mismatch, which is what should happen. Then node 29 (with the correct gen stamp) was told to replicate to some other node, and the client reported the block on node 29 as corrupt. This case of corruption doesn't seem to be caused by a gen stamp mismatch on the namenode side but by a report from the client ("because client machine reported it").
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969946#comment-14969946 ] Hadoop QA commented on HDFS-9289: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 18m 50s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 8m 59s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 11m 27s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 32s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 46s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 2m 9s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 0m 30s | Post-patch findbugs hadoop-hdfs-project/hadoop-hdfs compilation is broken. |
| {color:green}+1{color} | findbugs | 0m 30s | The patch does not introduce any new Findbugs (version ) warnings. |
| {color:green}+1{color} | native | 0m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 0m 25s | Tests failed in hadoop-hdfs. |
| | | | 44m 31s | |
|| Reason || Tests ||
| Failed build | hadoop-hdfs |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12768113/HDFS-9289.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0fce5f9 |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/13136/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/13136/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/13136/console |
This message was automatically generated.
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969933#comment-14969933 ] Elliott Clark commented on HDFS-9289: - Also, can we add the expected and encountered genstamps to the exception message?
[jira] [Commented] (HDFS-9289) check genStamp when complete file
[ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969890#comment-14969890 ] Elliott Clark commented on HDFS-9289: - We just had something very similar happen on a prod cluster. Then the datanode holding the only complete block was shut off for repair.
{code}
15/10/22 06:29:32 INFO hdfs.StateChange: BLOCK* allocateBlock: /TESTCLUSTER-HBASE/WALs/hbase4544.test.com,16020,1444266312515/hbase4544.test.com%2C16020%2C1444266312515.default.1445520572440. BP-1735829752-10.210.49.21-1437433901380 blk_1190230043_116735085{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-8d0a91de-8a69-4f39-816e-de3a0fa8a3aa:NORMAL:10.210.81.33:50010|RBW], ReplicaUnderConstruction[[DISK]DS-52d9a122-a46a-4129-ab3d-d9041de109f8:NORMAL:10.210.31.48:50010|RBW], ReplicaUnderConstruction[[DISK]DS-c734b72e-27de-4dd4-a46c-7ae59f6ef792:NORMAL:10.210.31.38:50010|RBW]]}
15/10/22 06:32:48 INFO namenode.FSNamesystem: updatePipeline(block=BP-1735829752-10.210.49.21-1437433901380:blk_1190230043_116735085, newGenerationStamp=116737586, newLength=201675125, newNodes=[10.210.81.33:50010, 10.210.81.45:50010, 10.210.64.29:50010], clientName=DFSClient_NONMAPREDUCE_1976436475_1)
15/10/22 06:32:48 INFO namenode.FSNamesystem: updatePipeline(BP-1735829752-10.210.49.21-1437433901380:blk_1190230043_116735085) successfully to BP-1735829752-10.210.49.21-1437433901380:blk_1190230043_116737586
15/10/22 06:32:50 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.210.64.29:50010 is added to blk_1190230043_116737586{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-8d0a91de-8a69-4f39-816e-de3a0fa8a3aa:NORMAL:10.210.81.33:50010|RBW], ReplicaUnderConstruction[[DISK]DS-d5f7fff9-005d-4804-a223-b6e6624d3af2:NORMAL:10.210.81.45:50010|RBW], ReplicaUnderConstruction[[DISK]DS-0620aef7-b6b2-4a23-950c-09373f68a815:NORMAL:10.210.64.29:50010|FINALIZED]]} size 201681322
15/10/22 06:32:50 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.210.81.45:50010 is added to blk_1190230043_116737586{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-8d0a91de-8a69-4f39-816e-de3a0fa8a3aa:NORMAL:10.210.81.33:50010|RBW], ReplicaUnderConstruction[[DISK]DS-0620aef7-b6b2-4a23-950c-09373f68a815:NORMAL:10.210.64.29:50010|FINALIZED], ReplicaUnderConstruction[[DISK]DS-52a0a4ba-cf64-4763-99a8-6c9bb5946879:NORMAL:10.210.81.45:50010|FINALIZED]]} size 201681322
15/10/22 06:32:50 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.210.81.33:50010 is added to blk_1190230043_116737586{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-0620aef7-b6b2-4a23-950c-09373f68a815:NORMAL:10.210.64.29:50010|FINALIZED], ReplicaUnderConstruction[[DISK]DS-52a0a4ba-cf64-4763-99a8-6c9bb5946879:NORMAL:10.210.81.45:50010|FINALIZED], ReplicaUnderConstruction[[DISK]DS-4d937567-7184-40b7-a822-c7e3b5d588d4:NORMAL:10.210.81.33:50010|FINALIZED]]} size 201681322
15/10/22 09:37:36 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1190230043 added as corrupt on 10.210.31.38:50010 by hbase4678.test.com/10.210.31.38 because reported RBW replica with genstamp 116735085 does not match COMPLETE block's genstamp in block map 116737586
15/10/22 09:37:36 INFO BlockStateChange: BLOCK* invalidateBlock: blk_1190230043_116735085(stored=blk_1190230043_116737586) on 10.210.31.38:50010
15/10/22 09:37:36 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_1190230043_116735085 to 10.210.31.38:50010
15/10/22 09:37:39 INFO BlockStateChange: BLOCK* BlockManager: ask 10.210.31.38:50010 to delete [blk_1190230043_116735085]
15/10/22 12:45:03 INFO BlockStateChange: BLOCK* ask 10.210.64.29:50010 to replicate blk_1190230043_116737586 to datanode(s) 10.210.64.56:50010
15/10/22 12:45:07 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1190230043 added as corrupt on 10.210.64.29:50010 by hbase4496.test.com/10.210.64.56 because client machine reported it
15/10/22 12:50:49 INFO BlockStateChange: BLOCK* ask 10.210.81.45:50010 to replicate blk_1190230043_116737586 to datanode(s) 10.210.49.49:50010
15/10/22 12:50:55 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1190230043 added as corrupt on 10.210.81.45:50010 by hbase4478.test.com/10.210.49.49 because client machine reported it
15/10/22 12:56:01 WARN blockmanagement.BlockManager: PendingReplicationMonitor timed out blk_1190230043_116737586
{code}
The patch will help, but the issue will still be there. Is there some way to keep the genstamps from getting out of sync?