[jira] [Commented] (HDFS-1401) TestFileConcurrentReader test case is still timing out / failing
[ https://issues.apache.org/jira/browse/HDFS-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038119#comment-13038119 ]

sam rash commented on HDFS-1401:

see todd's find in: https://issues.apache.org/jira/browse/HDFS-1057

TestFileConcurrentReader test case is still timing out / failing
Key: HDFS-1401
URL: https://issues.apache.org/jira/browse/HDFS-1401
Project: Hadoop HDFS
Issue Type: Sub-task
Components: hdfs client
Affects Versions: 0.22.0
Reporter: Tanping Wang
Priority: Critical
Attachments: HDFS-1401.patch

The unit test case TestFileConcurrentReader, after its most recent fix in HDFS-1310, still times out when using java 1.6.0_07: the test case simply hangs. On the Apache Hudson build (which possibly uses a higher sub-version of java) this test case has produced inconsistent results: it sometimes passes and sometimes fails. For example, there is no effective change between the recent builds 423, 424 and 425, yet the test case failed on build 424 and passed on build 425.

build 424, test failed: https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk/424/testReport/org.apache.hadoop.hdfs/TestFileConcurrentReader/
build 425, test passed: https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk/425/testReport/org.apache.hadoop.hdfs/TestFileConcurrentReader/

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036343#comment-13036343 ]

sam rash commented on HDFS-1057:

if it helps, there is only ever 1 writer + 1 reader in the test. The reader 'tails' by opening and closing the file repeatedly, up to 1000 times (hence exposing socket leaks in the past).

Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Key: HDFS-1057
URL: https://issues.apache.org/jira/browse/HDFS-1057
Project: Hadoop HDFS
Issue Type: Sub-task
Components: data-node
Affects Versions: 0.20-append, 0.21.0, 0.22.0
Reporter: Todd Lipcon
Assignee: sam rash
Priority: Blocker
Fix For: 0.20-append, 0.21.0, 0.22.0
Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt

In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable.
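The ordering bug in the issue description above (publish the length via setBytesOnDisk, then flush) can be sketched in miniature. This is a hedged illustration, not the actual BlockReceiver code: `ReplicaSketch`, `receiveBuggy`, `receiveFixed`, and the in-memory "disk" are all invented stand-ins that only demonstrate why a reader can observe a length the block file does not yet have.

```java
import java.io.ByteArrayOutputStream;

// A ReplicaSketch stands in for one block replica: "disk" plays the
// block file, "buffer" plays BlockReceiver's unflushed output buffer.
class ReplicaSketch {
    private final ByteArrayOutputStream disk = new ByteArrayOutputStream();
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private volatile long bytesOnDisk = 0;  // length a concurrent reader sees

    // Buggy ordering from the issue description: the visible length is
    // published before flush() runs, so a reader arriving now sees bytes
    // it cannot actually read (EOFs / checksum errors on the tail).
    void receiveBuggy(byte[] packet) {
        buffer.write(packet, 0, packet.length);
        bytesOnDisk += packet.length;   // reader observes the new length here
    }                                   // ...while the bytes sit in "buffer"

    // Safe ordering: flush first, then publish the new length.
    void receiveFixed(byte[] packet) {
        buffer.write(packet, 0, packet.length);
        flush();
        bytesOnDisk += packet.length;
    }

    private void flush() {
        byte[] pending = buffer.toByteArray();
        disk.write(pending, 0, pending.length);
        buffer.reset();
    }

    long visibleLength() { return bytesOnDisk; } // length published to readers
    long actualLength()  { return disk.size(); } // bytes a reader can really get
}
```

In the buggy ordering, `visibleLength()` runs ahead of `actualLength()` until the deferred flush happens, which is exactly the race window a tailing reader falls into.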
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036461#comment-13036461 ]

sam rash commented on HDFS-1057:

todd: thanks for digging into this

Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Key: HDFS-1057 (full issue details are quoted in the first HDFS-1057 message above)
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035851#comment-13035851 ]

sam rash commented on HDFS-1057:

the last time I debugged the test failure, it exposed a socket/fd leak in a completely unrelated part of the code. The test failing here also has nothing to do with the added feature--because it closes/opens files in rapid succession, it is prone to expose resource leaks. Removing this test (or the feature) won't take away the underlying problem, which should be looked at.

Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Key: HDFS-1057 (full issue details are quoted in the first HDFS-1057 message above)
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035868#comment-13035868 ]

sam rash commented on HDFS-1057:

the test opens/closes files for read/write; that exposed a slow leak last time. I suggest anyone concerned with resource leaks in hadoop investigate. we don't see the test failure in our open-sourced 0.20 fork.

removing the test is an option; so is coming up with a better one (this was my first hdfs feature + test).

Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Key: HDFS-1057 (full issue details are quoted in the first HDFS-1057 message above)
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035884#comment-13035884 ]

sam rash commented on HDFS-1057:

i assume a similar problem as before: code that opened RPC proxies to DNs did not close them in a finally block. The test failure output indicates a socket/fd leak (Too many open files).

https://issues.apache.org/jira/browse/HDFS-1310

the test was succeeding 8 months ago, 2010-09-10, so I'd look at commits that came after that.

Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Key: HDFS-1057 (full issue details are quoted in the first HDFS-1057 message above)
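The leak pattern described in this comment (RPC proxies to DNs not closed in a finally block) is the classic close-on-the-happy-path-only bug. The sketch below is hypothetical: `DatanodeProxy`, `getBlockInfo`, and the socket counter are invented stand-ins for the real proxy objects, used only to make the leaked-descriptor count observable.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

class ProxyLeakSketch {
    // Counts "open sockets" so the leak is visible in a test.
    static final AtomicInteger openSockets = new AtomicInteger();

    // Hypothetical stand-in for an RPC proxy to a datanode.
    static class DatanodeProxy implements Closeable {
        DatanodeProxy() { openSockets.incrementAndGet(); }               // socket opened
        void getBlockInfo() throws IOException {
            throw new IOException("simulated DN-side failure");
        }
        @Override public void close() { openSockets.decrementAndGet(); } // socket released
    }

    // Leaky pattern: the RPC throws, close() is skipped, the fd leaks.
    static void queryLeaky() throws IOException {
        DatanodeProxy proxy = new DatanodeProxy();
        proxy.getBlockInfo();
        proxy.close();   // never reached when the RPC throws
    }

    // Fixed pattern: close() sits in a finally block, so every code
    // path, including the exceptional one, releases the socket.
    static void queryFixed() throws IOException {
        DatanodeProxy proxy = new DatanodeProxy();
        try {
            proxy.getBlockInfo();
        } finally {
            proxy.close();
        }
    }
}
```

Run in a tight open/close loop, as TestFileConcurrentReader effectively does, the leaky variant accumulates descriptors until the process hits "Too many open files"; the fixed variant stays flat.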
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035924#comment-13035924 ]

sam rash commented on HDFS-1057:

Todd: on a different issue, one test in here that looks suspicious is testImmediateReadOfNewFile. It repeatedly opens and closes a file right away, requiring 1k successful opens (at least in our copy).

Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Key: HDFS-1057 (full issue details are quoted in the first HDFS-1057 message above)
[jira] [Commented] (HDFS-941) Datanode xceiver protocol should allow reuse of a connection
[ https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021862#comment-13021862 ]

sam rash commented on HDFS-941:

The last failure I saw with this test was basically unrelated to the test itself--it was a socket leak in the datanode, I think with RPCs. I glanced at the first test failure output and found a similar error:

2011-04-11 21:29:36,962 INFO datanode.DataNode (DataXceiver.java:opWriteBlock(458)) - writeBlock blk_-6878114854540472276_1001 received exception java.io.FileNotFoundException: /grid/0/hudson/hudson-slave/workspace/PreCommit-HDFS-Build/trunk/build/test/data/dfs/data/data1/current/rbw/blk_-6878114854540472276_1001.meta (Too many open files)

Note that this test implicitly finds any socket/fd leaks because it opens/closes files repeatedly. If you can check into this, that'd be great. I'll have some more time later this week to help more.

Datanode xceiver protocol should allow reuse of a connection
Key: HDFS-941
URL: https://issues.apache.org/jira/browse/HDFS-941
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node, hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: bc Wong
Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch, HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.patch, HDFS-941-6.patch, HDFS-941-6.patch, hdfs941-1.png

Right now each connection into the datanode xceiver only processes one operation. In the case that an operation leaves the stream in a well-defined state (eg a client reads to the end of a block successfully) the same connection could be reused for a second operation. This should improve random read performance significantly.
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001134#comment-13001134 ]

sam rash commented on HDFS-1057:

does this test include the patch: https://issues.apache.org/jira/browse/HADOOP-6907 ?

Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Key: HDFS-1057 (full issue details are quoted in the first HDFS-1057 message above)
[jira] Commented: (HDFS-1403) add -truncate option to fsck
[ https://issues.apache.org/jira/browse/HDFS-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913908#action_12913908 ]

sam rash commented on HDFS-1403:

can you elaborate? also, this truncate option will have to work on open files; I think -list-corruptfiles only works on closed ones. we have to handle the missing last block problem (the main reason I filed this).

add -truncate option to fsck
Key: HDFS-1403
URL: https://issues.apache.org/jira/browse/HDFS-1403
Project: Hadoop HDFS
Issue Type: New Feature
Components: hdfs client, name-node
Reporter: sam rash

When running fsck, it would be useful to be able to tell hdfs to truncate any corrupt file to the last valid position in the latest block. Then, when running hadoop fsck, an admin can clean up the filesystem.

--
This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-1403) add -truncate option to fsck
add -truncate option to fsck
Key: HDFS-1403
URL: https://issues.apache.org/jira/browse/HDFS-1403
Project: Hadoop HDFS
Issue Type: New Feature
Components: hdfs client, name-node
Reporter: sam rash

When running fsck, it would be useful to be able to tell hdfs to truncate any corrupt file to the last valid position in the latest block. Then, when running hadoop fsck, an admin can clean up the filesystem.
[jira] Commented: (HDFS-1310) TestFileConcurrentReader fails
[ https://issues.apache.org/jira/browse/HDFS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906912#action_12906912 ]

sam rash commented on HDFS-1310:

my apologies for the delay--i came down with a cold right before the long weekend.

results of test-patch:

[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
[exec] Please justify why no new tests are needed for this patch.
[exec] Also please list what manual steps were performed to verify this patch.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[exec]
[exec] +1 system tests framework. The patch passed system tests framework compile.

TestFileConcurrentReader fails
Key: HDFS-1310
URL: https://issues.apache.org/jira/browse/HDFS-1310
Project: Hadoop HDFS
Issue Type: Test
Affects Versions: 0.22.0
Reporter: Suresh Srinivas
Assignee: sam rash
Attachments: hdfs-1310-1.txt, hdfs-1310-2.txt

For details of test failure see http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/218/testReport/
[jira] Commented: (HDFS-1310) TestFileConcurrentReader fails
[ https://issues.apache.org/jira/browse/HDFS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906964#action_12906964 ]

sam rash commented on HDFS-1310:

I have not run it all the way through yet. Is it 'test' or 'test-core' these days?

TestFileConcurrentReader fails
Key: HDFS-1310 (full issue details are quoted in the first HDFS-1310 message above)
[jira] Commented: (HDFS-1310) TestFileConcurrentReader fails
[ https://issues.apache.org/jira/browse/HDFS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906105#action_12906105 ]

sam rash commented on HDFS-1310:

is that just ant test? I'm not familiar with test-patch.

TestFileConcurrentReader fails
Key: HDFS-1310 (full issue details are quoted in the first HDFS-1310 message above)
[jira] Updated: (HDFS-1310) TestFileConcurrentReader fails
[ https://issues.apache.org/jira/browse/HDFS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sam rash updated HDFS-1310:

Attachment: hdfs-1310-2.txt

create ClientDatanodeProtocol in try{} block so that we don't skip checking additional DNs on an exception

TestFileConcurrentReader fails
Key: HDFS-1310 (full issue details are quoted in the first HDFS-1310 message above)
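The attachment note above describes moving proxy creation inside the per-datanode try block so that one bad DN can't abort the scan of the remaining ones. This is a hedged sketch of that control-flow fix, not the real HDFS code: `openProxyAndGetLength` and `collectLengths` are invented names standing in for "create a ClientDatanodeProtocol proxy and query it".

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class BestNodeSketch {
    // Hypothetical stand-in for creating an RPC proxy to one DN and
    // querying it; proxy creation itself can throw for an unreachable DN.
    static long openProxyAndGetLength(String dn) throws IOException {
        if (dn.startsWith("bad")) {
            throw new IOException("cannot create proxy to " + dn);
        }
        return 1024;  // pretend this DN reports a replica length
    }

    // With creation *inside* the per-DN try block, an exception from one
    // datanode is caught locally and the loop continues to check the
    // additional DNs. Had the proxy been created before the try, the
    // exception would have propagated and skipped the rest of the list.
    static List<Long> collectLengths(List<String> datanodes) {
        List<Long> lengths = new ArrayList<>();
        for (String dn : datanodes) {
            try {
                lengths.add(openProxyAndGetLength(dn));  // create + use in try
            } catch (IOException e) {
                // log and continue with the next datanode
            }
        }
        return lengths;
    }
}
```

With one unreachable DN in a list of three, the loop still collects answers from the other two instead of failing the whole operation.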
[jira] Commented: (HDFS-1310) TestFileConcurrentReader fails
[ https://issues.apache.org/jira/browse/HDFS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905380#action_12905380 ]

sam rash commented on HDFS-1310:

good point, i'll move the init into the try block

TestFileConcurrentReader fails
Key: HDFS-1310 (full issue details are quoted in the first HDFS-1310 message above)
[jira] Updated: (HDFS-1310) TestFileConcurrentReader fails
[ https://issues.apache.org/jira/browse/HDFS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sam rash updated HDFS-1310:

Attachment: hdfs-1310-1.txt

Datanode RPC proxy, once created, is now stopped properly

TestFileConcurrentReader fails
Key: HDFS-1310 (full issue details are quoted in the first HDFS-1310 message above)
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904455#action_12904455 ]

sam rash commented on HDFS-1057:

I'm confused--the jira for the test result indicated you had solved the problem. Can you let me know what you need me to do?

Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Key: HDFS-1057 (full issue details are quoted in the first HDFS-1057 message above)
[jira] Assigned: (HDFS-1310) TestFileConcurrentReader fails
[ https://issues.apache.org/jira/browse/HDFS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sam rash reassigned HDFS-1310:

Assignee: sam rash

TestFileConcurrentReader fails
Key: HDFS-1310 (full issue details are quoted in the first HDFS-1310 message above)
[jira] Commented: (HDFS-1310) TestFileConcurrentReader fails
[ https://issues.apache.org/jira/browse/HDFS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904472#action_12904472 ]

sam rash commented on HDFS-1310:

sorry for the delay, I skimmed this jira and the last comment contradicted the title, so I assumed it was in ok shape. I will have a minute to look at this in more detail tomorrow night.

TestFileConcurrentReader fails
Key: HDFS-1310 (full issue details are quoted in the first HDFS-1310 message above)
[jira] Commented: (HDFS-1310) TestFileConcurrentReader fails
[ https://issues.apache.org/jira/browse/HDFS-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904506#action_12904506 ]

sam rash commented on HDFS-1310:

actually the 2nd two are likely fallout from the first--if it died and didn't clean up the locks, this could happen. as I noted, I'm a bit short on time tonight, so I'll get to this tomorrow evening.

fwiw, this looks familiar--the 'too many open files' with this unit test. I thought I had already seen this and fixed it, where I simply didn't close a file in a thread...maybe I only patched it in our local branch.

thanks for the direct links to the results

TestFileConcurrentReader fails
Key: HDFS-1310 (full issue details are quoted in the first HDFS-1310 message above)
[jira] Commented: (HDFS-1350) make datanodes do graceful shutdown
[ https://issues.apache.org/jira/browse/HDFS-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901483#action_12901483 ]

sam rash commented on HDFS-1350:

actually sorry, i remember your patch to do this. I think the rev I'm using internally is older--i will check 20-append as well. This problem may not appear in 20-append with all the latest patches.

the question still remains whether there is benefit in making the datanode do a clean shutdown of the DataXceiver threads.

make datanodes do graceful shutdown
Key: HDFS-1350
URL: https://issues.apache.org/jira/browse/HDFS-1350
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node
Reporter: sam rash
Assignee: sam rash

we found that the Datanode doesn't do a graceful shutdown and a block can be corrupted (data + checksum amounts off). we can make the DN do a graceful shutdown in case there are open files. if this presents a problem to a timely shutdown, we can make it a parameter of how long to wait for the full graceful shutdown before just exiting.
[jira] Commented: (HDFS-1350) make datanodes do graceful shutdown
[ https://issues.apache.org/jira/browse/HDFS-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901527#action_12901527 ]

sam rash commented on HDFS-1350:

actually, it's in our latest branch here, which is >= 20-append and includes your patch. The problem is that getBlockMetaDataInfo() has this at the end:

{code}
// paranoia! verify that the contents of the stored block
// matches the block file on disk.
data.validateBlockMetadata(stored);
{code}

which includes this check:

{code}
if (f.length() > maxDataSize || f.length() <= minDataSize) {
  throw new IOException("Block " + b + " is of size " + f.length() +
      " but has " + (numChunksInMeta + 1) + " checksums and each checksum size is " +
      checksumsize + " bytes.");
}
{code}

a block is not allowed to participate in lease recovery if this fails.

make datanodes do graceful shutdown
Key: HDFS-1350 (full issue details are quoted in the first HDFS-1350 message above)
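The quoted check boils down to a consistency rule between the data-file length and the number of checksums in the meta file. The sketch below is an illustrative reconstruction, not the real validateBlockMetadata: `BYTES_PER_CHECKSUM = 512` is an assumed chunk size (the real value comes from the block's metadata header), and `numFullChunks` plays the role of numChunksInMeta. A block with n full checksum chunks plus a partial last chunk must have a data length strictly greater than n*512 and at most (n+1)*512; anything outside that range means data and checksum files disagree, and the quoted code throws.

```java
class BlockMetaCheck {
    static final long BYTES_PER_CHECKSUM = 512;  // assumed chunk size

    // numFullChunks: complete checksum chunks implied by the meta file;
    // the data file may additionally hold one partial chunk covered by
    // the final checksum.
    static boolean lengthMatchesMeta(long dataLen, long numFullChunks) {
        long minDataSize = numFullChunks * BYTES_PER_CHECKSUM;
        long maxDataSize = minDataSize + BYTES_PER_CHECKSUM;
        // Mirrors the quoted condition: valid iff min < dataLen <= max.
        return dataLen > minDataSize && dataLen <= maxDataSize;
    }
}
```

A datanode killed mid-write can easily leave dataLen outside this window (data flushed but the last checksum not, or vice versa), which is exactly how an ungraceful shutdown produces a replica that this check then bars from lease recovery.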
[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sam rash updated HDFS-1262:

Attachment: hdfs-1262-5.txt

address todd's comments (except for RPC compatibility--pending discussion)

Failed pipeline creation during append leaves lease hanging on NN
Key: HDFS-1262
URL: https://issues.apache.org/jira/browse/HDFS-1262
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client, name-node
Affects Versions: 0.20-append
Reporter: Todd Lipcon
Assignee: sam rash
Priority: Critical
Fix For: 0.20-append
Attachments: hdfs-1262-1.txt, hdfs-1262-2.txt, hdfs-1262-3.txt, hdfs-1262-4.txt, hdfs-1262-5.txt

Ryan Rawson came upon this nasty bug in HBase cluster testing. What happened was the following:
1) File's original writer died
2) Recovery client tried to open file for append - looped for a minute or so until soft lease expired, then append call initiated recovery
3) Recovery completed successfully
4) Recovery client calls append again, which succeeds on the NN
5) For some reason, the block recovery that happens at the start of append pipeline creation failed on all datanodes 6 times, causing the append() call to throw an exception back to HBase master. HBase assumed the file wasn't open and put it back on a queue to try later
6) Some time later, it tried append again, but the lease was still assigned to the same DFS client, so it wasn't able to recover.

The recovery failure in step 5 is a separate issue, but the problem for this JIRA is that the NN can think it failed to open a file for append when the NN thinks the writer holds a lease. Since the writer keeps renewing its lease, recovery never happens, and no one can open or recover the file until the DFS client shuts down.
[jira] Commented: (HDFS-1350) make datanodes do graceful shutdown
[ https://issues.apache.org/jira/browse/HDFS-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12901269#action_12901269 ] sam rash commented on HDFS-1350: I saw the case of a single replica existing that did not have a matching data + checksum length. It was not used, and we lost the block. I need to double-check the code, but the DN exception was that the block was not valid and couldn't be used. It seems to me the logic is simple: take the longest length you can get. It doesn't matter if data and checksum match as far as I can tell (though I think a matching replica is typically longer than a mismatching one). Truncation only happens after the NN picks the length of the blocks. As I said, I think the bug, at least in our patched rev (need to look at stock 20-append), is that mismatching lengths can't participate at all in lease recovery, which seems broken.
[jira] Created: (HDFS-1350) make datanodes do graceful shutdown
make datanodes do graceful shutdown --- Key: HDFS-1350 URL: https://issues.apache.org/jira/browse/HDFS-1350 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: sam rash Assignee: sam rash we found that the Datanode doesn't do a graceful shutdown and a block can be corrupted (data + checksum amounts off). we can make the DN do a graceful shutdown in case there are open files. if this presents a problem to a timely shutdown, we can make it a parameter of how long to wait for the full graceful shutdown before just exiting
[jira] Commented: (HDFS-1350) make datanodes do graceful shutdown
[ https://issues.apache.org/jira/browse/HDFS-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12901129#action_12901129 ] sam rash commented on HDFS-1350: My understanding of how lease recovery works in 20-append is that on cluster restart, an open file will be recovered by the Namenode. Datanodes will send the longest valid length of the block (ie, if there are 8 bytes of checksum and 1500 bytes of data, the valid length is 1024, assuming a 512-byte chunk size). The block is then truncated to a valid length. 20-append seems to have a bug that for any block where the data + checksum lengths don't match, the block isn't used in lease recovery. the work here might be to fix that?
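The worked example above (8 checksum bytes and 1500 data bytes yielding 1024 valid bytes with 512-byte chunks) can be sketched as a small calculation. The 4-byte CRC size and 512-byte chunk size are the usual HDFS defaults; the method is illustrative rather than the datanode's actual code.

```java
// Illustrative calculation (not the datanode's code): the longest
// checksum-verified length of a replica, derived from the data file length
// and the number of checksum bytes in the meta file. Assumes the usual
// defaults: 4-byte CRC32 checksums, each covering a 512-byte chunk.
public class ValidLengthSketch {
    static final int BYTES_PER_CHECKSUM = 512;
    static final int CHECKSUM_SIZE = 4;

    public static long validLength(long dataLen, long checksumBytes) {
        long completeChecksums = checksumBytes / CHECKSUM_SIZE;
        long coveredBytes = completeChecksums * BYTES_PER_CHECKSUM;
        // Only bytes covered by a complete checksum are trustworthy.
        return Math.min(dataLen, coveredBytes);
    }

    public static void main(String[] args) {
        // 8 checksum bytes -> 2 complete checksums -> at most 1024 valid
        // bytes, even though 1500 data bytes are on disk.
        System.out.println(validLength(1500, 8)); // 1024
    }
}
```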
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12901130#action_12901130 ] sam rash commented on HDFS-1262: my apologies for the delay. I've been caught up in some hi-pri bits at work. thanks for the comments. inlined responses

# why does abandonFile return boolean? looks like right now it can only return true or throw, may as well make it void, no?

good question: I stole abandonBlock(), which has the same behavior. It returns true or throws an exception. I was trying to keep it consistent (rather than logical per se). I do prefer the void option as it makes the method more clear.

# in the log message in FSN.abandonFile it looks like there's a missing '+ src +' in the second log message
# in the log messages, also log the holder argument perhaps

will fix

# in previous append-branch patches we've been trying to keep RPC compatibility with unpatched 0.20 - ie you can run an updated client against an old NN, with the provision that it might not fix all the bugs. Given that, maybe we should catch the exception we get if we call abandonFile() and get back an exception indicating the method doesn't exist? Check out what we did for the HDFS-630 backport for example.

nice idea, I will check this out

# looks like there are some other patches that got conflated into this one - eg testSimultaneousRecoveries is part of another patch on the append branch.

hmm, yea, not sure what happened here... weird, I think I applied one of your patches. Which patch is that test from?

# missing Apache license on new test file

will fix

# typo: Excection instead of Exception

will fix

# (PermissionStatus) anyObject() might generate an unchecked cast warning - I think you can do Matchers.<PermissionStatus>anyObject() or some such to avoid the unchecked cast

ah, nice catch, will fix also

# given the complexity of the unit test, would be good to add some comments for the general flow of what all the mocks/spys are achieving. I found myself a bit lost in the abstractions

yea, sorry, was in a rush before vacation to get some test + patch up. It was a bit tricky to get this case going for both create + append; I'll document the case better (at all)
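The unchecked-cast fix suggested in the review above relies on Java's explicit type-argument syntax for generic methods (`Matchers.<PermissionStatus>anyObject()` in Mockito). A minimal standalone illustration of the same syntax, using `Collections.emptyList` so it compiles without Mockito on the classpath:

```java
import java.util.Collections;
import java.util.List;

public class ExplicitTypeArg {
    public static void main(String[] args) {
        // Casting the result of a generic factory, as in
        // (PermissionStatus) anyObject(), triggers an unchecked warning.
        // The explicit type argument pins the inferred type instead, which
        // is what Matchers.<PermissionStatus>anyObject() does for a matcher.
        List<String> xs = Collections.<String>emptyList();
        System.out.println(xs.size()); // 0
    }
}
```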
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12901134#action_12901134 ] sam rash commented on HDFS-1262: re: RPC compatibility. I'm not 100% sure this is a good idea. If we start to enumerate the cases of how a client can interact with the server, bugs seem more likely. It makes sense with a single method, but not if RPC changes become interdependent. what's the case that mandates using a new client against an old namenode? is it not possible to use the appropriately versioned client? or is it the case of heterogeneous sets of clusters and the simplicity of managing a single client code base? any other thoughts on this?
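For reference, the compatibility pattern being debated (as in the HDFS-630 backport approach mentioned earlier in the thread) amounts to catching the error an old server raises for an unknown RPC method and degrading gracefully. A hedged sketch with illustrative names only, not the real ClientProtocol or RPC machinery:

```java
// Hedged sketch of the backward-compatibility pattern under discussion:
// call the new RPC, and if the server is old and reports an unknown
// method, fall back instead of failing. All names are illustrative.
public class RpcFallbackSketch {
    public interface NameNodeLike {
        void abandonFile(String src, String holder) throws Exception;
    }

    // Returns true if the call succeeded, false if the server is too old
    // to know the method; rethrows anything else.
    public static boolean tryAbandonFile(NameNodeLike nn, String src, String holder) {
        try {
            nn.abandonFile(src, holder);
            return true;
        } catch (Exception e) {
            // An old server typically surfaces a remote error naming the
            // missing method; treat that as "feature unavailable".
            if (String.valueOf(e.getMessage()).contains("abandonFile")) {
                return false;
            }
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Simulate an old server that does not implement abandonFile().
        NameNodeLike oldNN = (src, holder) -> {
            throw new Exception("Unknown method abandonFile");
        };
        System.out.println(tryAbandonFile(oldNN, "/some/file", "client-1")); // false
    }
}
```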
[jira] Commented: (HDFS-1346) DFSClient receives out of order packet ack
[ https://issues.apache.org/jira/browse/HDFS-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12900105#action_12900105 ] sam rash commented on HDFS-1346: not easily--PacketResponder is a non-static inner class. Constructing a BlockReceiver requires a Datanode instance. If you can harness a Datanode, then you need to stub out the DataInputStream and figure out when to fire a callback (somehow when ack.readFields() reads from the DataInputStream, but not before). I think it's possible, but we haven't had time yet DFSClient receives out of order packet ack -- Key: HDFS-1346 URL: https://issues.apache.org/jira/browse/HDFS-1346 Project: Hadoop HDFS Issue Type: Bug Components: data-node, hdfs client Affects Versions: 0.20-append Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.20-append Attachments: outOfOrder.patch When running 0.20 patched with HDFS-101, we sometimes see an error as follows: WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_-2871223654872350746_21421120java.io.IOException: Responseprocessor: Expecting seq no for block blk_-2871223654872350746_21421120 10280 but received 10281 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2570) This indicates that DFS client expects an ack for packet N, but receives an ack for packet N+1.
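The testability problem described above (PacketResponder being a non-static inner class of BlockReceiver) comes down to Java's qualified-`new` requirement: the inner class can only be constructed through a live instance of its enclosing class. A small illustration with made-up names, not the HDFS classes:

```java
// Sketch of why a non-static inner class is hard to unit-test in
// isolation: it captures Outer.this, so the (in real code, heavyweight)
// outer object must be fully constructed first. Names are illustrative.
public class Outer {
    private final String state = "needs-full-outer-setup";

    public class Inner {                     // non-static: captures Outer.this
        public String describe() { return state; }
    }

    public static void main(String[] args) {
        Outer outer = new Outer();               // heavy setup in real code
        Outer.Inner inner = outer.new Inner();   // qualified-new syntax
        System.out.println(inner.describe());
    }
}
```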
[jira] Created: (HDFS-1342) expose DFSOutputStream.getNumCurrentReplicas() in libhdfs
expose DFSOutputStream.getNumCurrentReplicas() in libhdfs - Key: HDFS-1342 URL: https://issues.apache.org/jira/browse/HDFS-1342 Project: Hadoop HDFS Issue Type: New Feature Components: contrib/libhdfs Reporter: sam rash Assignee: sam rash Priority: Minor DFSOutputStream exposes the number of writers in a pipeline. We should make this callable from libhdfs
[jira] Commented: (HDFS-1330) Make RPCs to DataNodes timeout
[ https://issues.apache.org/jira/browse/HDFS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12896684#action_12896684 ] sam rash commented on HDFS-1330: +1 lgtm Make RPCs to DataNodes timeout -- Key: HDFS-1330 URL: https://issues.apache.org/jira/browse/HDFS-1330 Project: Hadoop HDFS Issue Type: New Feature Components: data-node Affects Versions: 0.22.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.22.0 Attachments: hdfsRpcTimeout.patch This jira aims to make client/datanode and datanode/datanode RPCs have a timeout of DataNode#socketTimeout.
[jira] Commented: (HDFS-1252) TestDFSConcurrentFileOperations broken in 0.20-appendj
[ https://issues.apache.org/jira/browse/HDFS-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12892444#action_12892444 ] sam rash commented on HDFS-1252: does the patch preserve the essence of the test: a file that is about to be closed is moved, and lease recovery should still work (ie, recover blocks that are already finalized on DNs)? TestDFSConcurrentFileOperations broken in 0.20-appendj -- Key: HDFS-1252 URL: https://issues.apache.org/jira/browse/HDFS-1252 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.20-append Attachments: hdfs-1252.txt This test currently has several flaws: - It calls DN.updateBlock with a BlockInfo instance, which then causes java.lang.RuntimeException: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.server.namenode.BlocksMap$BlockInfo.<init>() in the logs when the DN tries to send blockReceived for the block - It assumes that getBlockLocations returns an up-to-date length block after a sync, which is false. It happens to work because it calls getBlockLocations directly on the NN, and thus gets a direct reference to the block in the blockmap, which later gets updated This patch fixes this test to use the AppendTestUtil functions to initiate recovery, and generally pass more reliably.
[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1262: --- Attachment: hdfs-1262-3.txt removed empty file MockitoUtil
[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1262: --- Attachment: hdfs-1262-4.txt fixed bug where calling append() to trigger lease recovery resulted in a client-side exception (trying to abandon a file that you don't own the lease on). DFSClient now catches this exception and logs it
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883828#action_12883828 ] sam rash commented on HDFS-1262: one note:
{code}
public void updateRegInfo(DatanodeID nodeReg) {
  name = nodeReg.getName();
  infoPort = nodeReg.getInfoPort();
  // update any more fields added in future.
}
{code}
should be:
{code}
public void updateRegInfo(DatanodeID nodeReg) {
  name = nodeReg.getName();
  infoPort = nodeReg.getInfoPort();
  ipcPort = nodeReg.getIpcPort();
  // update any more fields added in future.
}
{code}
it wasn't copying the ipcPort for some reason. My patch includes this fix. trunk doesn't have this bug
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883830#action_12883830 ] sam rash commented on HDFS-1262: above is from DatanodeID.java
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883857#action_12883857 ] sam rash commented on HDFS-1262: verified the test case passes w/o that patch. we should commit hdfs-894 to 20-append for sure, though. that seems like a potentially gnarly bug in tests to track down (took me a short spell). i can upload the patch w/o the DatanodeID change
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883855#action_12883855 ] sam rash commented on HDFS-1262: that's probably better. this was dependent on it as i was killing the datanodes to simulate the pipeline failure. i ended up tuning the test case to use mockito to throw exceptions at the end of an NN rpc call for both append() and create(), so I think that dependency is gone. can we mark this as dependent on that if it turns out to be needed?
[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1262: --- Attachment: hdfs-1262-2.txt removed hdfs-894 change from patch (commit this to 0.20-append separately)
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883785#action_12883785 ] sam rash commented on HDFS-1057: from the raw console output of hudson:

[exec] [junit] Tests run: 3, Failures: 0, Errors: 1, Time elapsed: 0.624 sec
[exec] [junit] Test org.apache.hadoop.hdfs.security.token.block.TestBlockToken FAILED
--
[exec] [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.706 sec
[exec] [junit] Test org.apache.hadoop.hdfs.server.common.TestJspHelper FAILED
--
[exec] [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 28.477 sec
[exec] [junit] Test org.apache.hadoop.hdfsproxy.TestHdfsProxy FAILED

I ran the tests locally and the first 2 succeed. The third fails on the latest trunk without hdfs-1057. I think from the test perspective, this is safe to commit.

1. TestBlockToken

run-test-hdfs:
[delete] Deleting directory /data/users/srash/apache/hadoop-hdfs/build/test/data
[mkdir] Created dir: /data/users/srash/apache/hadoop-hdfs/build/test/data
[delete] Deleting directory /data/users/srash/apache/hadoop-hdfs/build/test/logs
[mkdir] Created dir: /data/users/srash/apache/hadoop-hdfs/build/test/logs
[junit] WARNING: multiple versions of ant detected in path for junit
[junit] jar:file:/usr/local/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
[junit] and jar:file:/home/srash/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
[junit] Running org.apache.hadoop.hdfs.security.token.block.TestBlockToken
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.248 sec

2. TestJspHelper

run-test-hdfs:
[delete] Deleting directory /data/users/srash/apache/hadoop-hdfs/build/test/data
[mkdir] Created dir: /data/users/srash/apache/hadoop-hdfs/build/test/data
[delete] Deleting directory /data/users/srash/apache/hadoop-hdfs/build/test/logs
[mkdir] Created dir: /data/users/srash/apache/hadoop-hdfs/build/test/logs
[junit] WARNING: multiple versions of ant detected in path for junit
[junit] jar:file:/usr/local/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
[junit] and jar:file:/home/srash/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
[junit] Running org.apache.hadoop.hdfs.server.common.TestJspHelper
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.275 sec

Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, HDFS-1057-0.20-append.patch, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable.
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883319#action_12883319 ] sam rash commented on HDFS-1262: in the 2nd case, can't the client still call close? Or will it hang forever waiting for blocks? Either way, I've got test cases for create() + append() and the fix. It took a little longer to clean up today, but I will post the patch by end of day.
Failed pipeline creation during append leaves lease hanging on NN
Key: HDFS-1262 URL: https://issues.apache.org/jira/browse/HDFS-1262 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Critical Fix For: 0.20-append
Ryan Rawson came upon this nasty bug in HBase cluster testing. What happened was the following:
1) File's original writer died
2) Recovery client tried to open file for append - looped for a minute or so until soft lease expired, then append call initiated recovery
3) Recovery completed successfully
4) Recovery client calls append again, which succeeds on the NN
5) For some reason, the block recovery that happens at the start of append pipeline creation failed on all datanodes 6 times, causing the append() call to throw an exception back to HBase master. HBase assumed the file wasn't open and put it back on a queue to try later
6) Some time later, it tried append again, but the lease was still assigned to the same DFS client, so it wasn't able to recover.
The recovery failure in step 5 is a separate issue, but the problem for this JIRA is that the client can think it failed to open a file for append while the NN thinks the writer holds a lease. Since the writer keeps renewing its lease, recovery never happens, and no one can open or recover the file until the DFS client shuts down.
[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1262: Attachment: hdfs-1262-1.txt
- test case for append and create failures
- tried to get it so both cases fail fast, but create will hit the test timeout (the default for a create that gets AlreadyBeingCreatedException is 5 retries with a 60s sleep)
- the append case fails in 30s without the fix in the worst case
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12883413#action_12883413 ] sam rash commented on HDFS-1057: the one test that failed from my new tests had an fd leak; I've corrected that. The other failed tests I cannot reproduce:
1. org.apache.hadoop.hdfs.TestFileConcurrentReader.testUnfinishedBlockCRCErrorNormalTransferVerySmallWrite - had an fd leak, fixed
2. org.apache.hadoop.hdfs.security.token.block.TestBlockToken.testBlockTokenRpc
[junit] Running org.apache.hadoop.hdfs.security.token.block.TestBlockToken
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.305 sec
3. org.apache.hadoop.hdfs.server.common.TestJspHelper.testGetUgi
[junit] Running org.apache.hadoop.hdfs.server.common.TestJspHelper
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.309 sec
I can submit the patch with the fix for #1 plus the warning fixes.
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882979#action_12882979 ] sam rash commented on HDFS-1262: also, in writing up the test case, I realized DFSClient.create() is not susceptible to the same scenario. While in theory it could happen on the NN side, right now the namenode RPC for create happens and then all we do is start the streamer (hence I don't have a test case for it yet). I still think having a finally block that calls abandonFile() for create is prudent: if we get any exception in the process client-side, abandon the file to be safe.
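The abandon-on-failure pattern proposed above can be sketched as follows. This is a hypothetical, simplified model, not the real DFSClient/ClientProtocol code: the `NameNode` interface, `abandonFile`, and `createFile` names are illustrative stand-ins.

```java
// Hypothetical sketch: if client-side pipeline setup throws after the NN
// has granted the lease, give the lease back in a finally block so later
// writers are not blocked. Names here are assumptions, not the real API.
class LeaseDemo {
    interface NameNode {
        void create(String src, String holder);       // NN grants the lease
        void abandonFile(String src, String holder);  // NN releases the lease
    }

    static void createFile(NameNode nn, String src, String holder,
                           Runnable pipelineSetup) {
        nn.create(src, holder);  // lease is now held on the NN
        boolean ok = false;
        try {
            pipelineSetup.run();  // may fail before any data is written
            ok = true;
        } finally {
            if (!ok) {
                // abandon the file to be safe, per the comment above
                nn.abandonFile(src, holder);
            }
        }
    }
}
```

The point of the `finally` block is that the lease is released on every failure path, not just the exceptions the client anticipates.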
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882609#action_12882609 ] sam rash commented on HDFS-1262: hey, so what should the precise semantics of abandonFile(String src, String holder) be? I have a quick impl now (+ test case) that does this:
1. check that holder owns the lease for src
2. call internalReleaseLeaseOne
So it really is a glorified 'cleanup and close', which has the same behavior as if the lease expired - nice and tidy imo. It does have the slight delay of lease recovery, though.
An alternative option: for the specific case we are fixing here, we could do something simpler, such as just putting the targets in the blockMap and calling completeFile (basically what commitBlockSynchronization would do). However, this doesn't handle the general case if we expose abandonFile at any other time and a client has actually written data to the last block. I think the first option is safer, but maybe I'm too cautious. If the way I've implemented it seems ok, I can post the patch for review asap.
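Option 1 above can be modeled in a few lines. This is a toy sketch only: the class, the lease map, and the `internalReleaseLease` body are illustrative stand-ins for FSNamesystem's lease manager, not its actual code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy model of option 1: abandonFile checks that the caller holds the
// lease, then runs the same cleanup path as a hard lease expiry.
class LeaseManagerSketch {
    private final Map<String, String> leases = new HashMap<>(); // src -> holder

    void addLease(String src, String holder) { leases.put(src, holder); }

    boolean hasLease(String src) { return leases.containsKey(src); }

    void abandonFile(String src, String holder) throws IOException {
        // step 1: holder must own the lease for src
        if (!holder.equals(leases.get(src))) {
            throw new IOException(holder + " does not hold the lease for " + src);
        }
        // step 2: same behavior as if the lease expired
        internalReleaseLease(src);
    }

    private void internalReleaseLease(String src) {
        leases.remove(src);  // stand-in for block recovery + close
    }
}
```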
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882644#action_12882644 ] sam rash commented on HDFS-1057: sorry, I don't understand. This is a race condition where the namenode has assigned locations to the block, but the client hasn't sent data yet. The NN cannot know that the DNs don't have data on disk yet unless we add additional NN coordination. Our choice in this condition is to return 0 or let the exception propagate. I had done the latter, but you asked for the former, unless I misunderstood. Can you clarify what you want?
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882664#action_12882664 ] sam rash commented on HDFS-1057: per our offline discussion, it seems the NN doesn't know when the pipeline is created, but the writer does, so the NN has to return the replicas for the current block in this case. I will change it so we check all DNs for a replica before using the default of 0. I need to think about whether we require all DNs to report ReplicaNotFound specifically (versus some other exception).
[jira] Updated: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1057: Attachment: hdfs-1057-trunk-5.txt
- returns 0 length only if all DNs are missing the replica (any other io exception will cause the client to get an exception, and it can retry)
- my diff viewer does not show any whitespace or indentation changes, but please advise if you see any
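The behavior described in this patch can be sketched roughly as below: probe each DN for the replica's visible length, treat "replica not found" as "try the next DN", and return 0 only when every DN is missing the replica; any other IOException propagates so the client can retry. The names and shapes here are illustrative, not the actual DFSClient code.

```java
import java.io.IOException;
import java.util.List;

// Hedged sketch of "0 length only if all DNs are missing the replica".
class LengthProbe {
    static class ReplicaNotFoundException extends IOException {}

    interface DataNode {
        long visibleLength() throws IOException;
    }

    static long readBlockLength(List<DataNode> dns) throws IOException {
        for (DataNode dn : dns) {
            try {
                return dn.visibleLength();
            } catch (ReplicaNotFoundException e) {
                // this DN has no replica yet; fall through to the next one
            }
            // any other IOException propagates; the client may retry
        }
        return 0;  // every DN was missing the replica: block not created yet
    }
}
```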
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882175#action_12882175 ] sam rash commented on HDFS-1262: we actually use a new FileSystem instance per file in scribe. See http://hadoopblog.blogspot.com/2009/06/hdfs-scribe-integration.html
There are some downsides to this (creating a new FileSystem instance can be expensive and issues fork/exec calls for 'whoami' and 'groups'), but we have patches to minimize this.
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882176#action_12882176 ] sam rash commented on HDFS-1262: I am also wondering why this hasn't shown up in regular create calls before now. Both DFSClient.append() and DFSClient.create() are susceptible to the same problem (the client has the lease, then throws an exception setting up the pipeline).
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882263#action_12882263 ] sam rash commented on HDFS-1262: todd: can you confirm whether the exception was from the namenode.append() call or from creating the output stream? (sounds like the latter, in the lease recovery it initiates)
{code}
OutputStream append(String src, int buffersize, Progressable progress
    ) throws IOException {
  checkOpen();
  FileStatus stat = null;
  LocatedBlock lastBlock = null;
  try {
    stat = getFileInfo(src);
    lastBlock = namenode.append(src, clientName);
  } catch (RemoteException re) {
    throw re.unwrapRemoteException(FileNotFoundException.class,
                                   AccessControlException.class,
                                   NSQuotaExceededException.class,
                                   DSQuotaExceededException.class);
  }
  OutputStream result = new DFSOutputStream(src, buffersize, progress,
      lastBlock, stat, conf.getInt("io.bytes.per.checksum", 512));
  leasechecker.put(src, result);
  return result;
}
{code}
Either way, I think the right way to do this is to add back an abandonFile RPC call in the NN. Even if we don't change function call signatures for abandonBlock, we will break client/server compatibility. Thoughts?
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882283#action_12882283 ] sam rash commented on HDFS-1262: I'd appreciate the chance to implement it, actually. Thanks. Re: the name, according to Dhruba, there used to be one called abandonFile which had the semantics we need. Also, a similar error can occur on non-append creates, so having 'append' in the name probably doesn't make sense. abandonFile, or another idea?
[jira] Assigned: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash reassigned HDFS-1262: Assignee: sam rash
[jira] Commented: (HDFS-1186) 0.20: DNs should interrupt writers at start of recovery
[ https://issues.apache.org/jira/browse/HDFS-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882350#action_12882350 ] sam rash commented on HDFS-1186: hey todd, I was looking at this patch, and while it has certainly reduced the chance of problems, isn't it still possible a new writer thread could be created:
1. between the kill loop in startBlockRecovery() and the synchronized block
2. between the startBlockRecovery() call and the updateBlock() call
I seem to recall reasoning with dhruba that while in theory these could occur from the DN perspective, the circumstances that would have to occur outside the DN could not (once you fixed hdfs-1260 anyway, where genstamp checks work right in concurrent lease recovery). What's your take on this? Is it foolproof now (1 and 2 can't happen)? Or what about introducing a state like RUR here (at least disabling writes to a block while it is under recovery, maybe timing out in case the lease recovery owner dies)?
0.20: DNs should interrupt writers at start of recovery
Key: HDFS-1186 URL: https://issues.apache.org/jira/browse/HDFS-1186 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Attachments: hdfs-1186.txt
When block recovery starts (eg due to NN recovering lease) it needs to interrupt any writers currently writing to those blocks. Otherwise, an old writer (who hasn't realized he lost his lease) can continue to write+sync to the blocks, and thus recovery ends up truncating data that has been sync()ed.
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882359#action_12882359 ] sam rash commented on HDFS-1057: the patch should use --no-prefix to get rid of 'a' and 'b' in the paths, fyi.
[jira] Commented: (HDFS-1186) 0.20: DNs should interrupt writers at start of recovery
[ https://issues.apache.org/jira/browse/HDFS-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882367#action_12882367 ] sam rash commented on HDFS-1186: yea, I think so. Let me repeat it slightly differently to make sure I get this at a higher level:
1. we make sure that a lease recovery that starts with an old gs at one stage (that's synchronized) actually mutates the block data of only the same gs
2. a new writer that comes in between the start of recovery and the actual stamping must have a new gs, since it can only come into being via lease recovery
This is effectively saying that if concurrent lease recoveries get started, the first to complete wins (as it should), and later completions just fail. Sounds like optimistic locking/versioned puts in the cache world, actually: updateBlock requires the source to match an expected source. Nice idea.
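The optimistic-locking analogy above can be made concrete with a minimal sketch: stamping a replica behaves like a compare-and-set on its generation stamp, so of two concurrent recoveries the first to stamp wins and the stale one fails. This models the idea only; the real updateBlock does much more than bump a counter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal model of "updateBlock requires the source to match an expected
// source": a CAS on the generation stamp. Illustrative, not DataNode code.
class ReplicaStampSketch {
    private final AtomicLong genStamp;

    ReplicaStampSketch(long initialGS) { genStamp = new AtomicLong(initialGS); }

    // Succeeds only if the replica still carries the expected (old) stamp.
    boolean updateBlock(long expectedGS, long newGS) {
        return genStamp.compareAndSet(expectedGS, newGS);
    }

    long current() { return genStamp.get(); }
}
```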
[jira] Commented: (HDFS-1186) 0.20: DNs should interrupt writers at start of recovery
[ https://issues.apache.org/jira/browse/HDFS-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882386#action_12882386 ] sam rash commented on HDFS-1186: how could this happen? the GS=2 stamp succeeds on A and B. for GS=3 to win on C, GS=2 had to fail there, which means it went second. the primary for GS=2 would get a failure stamping DN C and would fail the lease recovery, right?
[jira] Commented: (HDFS-1186) 0.20: DNs should interrupt writers at start of recovery
[ https://issues.apache.org/jira/browse/HDFS-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882389#action_12882389 ] sam rash commented on HDFS-1186: I think you can make this argument: 1. each node has to make a transition from x -> x+k 2. at most one node owns any x -> x+k transition as the primary of a recovery 3. success requires all DNs to complete x -> x+k 4. the primary then commits x -> x+k, and until commitBlockSync completes, no transition y -> y+j with y > x can come in. right?
[jira] Commented: (HDFS-1186) 0.20: DNs should interrupt writers at start of recovery
[ https://issues.apache.org/jira/browse/HDFS-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882401#action_12882401 ] sam rash commented on HDFS-1186: hmm, i wonder why only 1? if the client thinks there are 3 DNs in the pipeline and asks to recover 3, i think it should fail with fewer than 3. a client can request fewer if that works (in which case we do have to handle the problem you lay out). so in your solution, you are saying that the lease holder, the client, needs to be contacted to verify that the primary is the only one doing lease recovery (or at least the latest)?
[jira] Commented: (HDFS-1186) 0.20: DNs should interrupt writers at start of recovery
[ https://issues.apache.org/jira/browse/HDFS-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12882404#action_12882404 ] sam rash commented on HDFS-1186: wait, why can't commitBlockSync on the NN just do the same check on genstamps? if two primaries start concurrent lease recoveries and split the remaining nodes as far as who wins the stamping, can't the NN resolve who wins in the end? then the loser's replica would be marked invalid and replication takes over to fix it. or do i have this sinking feeling because i am still missing something?
[jira] Commented: (HDFS-1263) 0.20: in tryUpdateBlock, the meta file is renamed away before genstamp validation is done
[ https://issues.apache.org/jira/browse/HDFS-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881739#action_12881739 ] sam rash commented on HDFS-1263: several thoughts/comments: my reading of the code is that the temp file was to make the creation of a meta file that is both truncated and has the new genstamp an atomic operation on the filesystem. If we rename first, then crash and recover, how do we know that the truncation didn't finish (without information from the NN or another node giving us a new length)? If we truncate first, then we have effectively corrupted the block. can you also explain the error state that results? (truncated blocks, infinite loops, bad metadata, etc.) and do i follow that a client started 2 lease recoveries? or was this a client and the NN somehow? (ie, how were there concurrent recoveries of the same block?) seems like extra synchronization in parts of updateBlock might help as well. also, we check that the genstamp is moving upwards both at the start of updateBlock and at the end of tryUpdateBlock. do you know why? 0.20: in tryUpdateBlock, the meta file is renamed away before genstamp validation is done - Key: HDFS-1263 URL: https://issues.apache.org/jira/browse/HDFS-1263 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.20-append Saw an issue where multiple datanodes are trying to recover at the same time, and all of them failed. I think the issue is in FSDataset.tryUpdateBlock, we do the rename of blk_B_OldGS to blk_B_OldGS_tmpNewGS and *then* check that the generation stamp is moving upwards. Because of this, invalid update block calls are blocked, but they then cause future updateBlock calls to fail with Meta file not found errors. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
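The validate-before-mutate ordering being discussed can be illustrated with a toy model. Names and structure here are hypothetical, not the real FSDataset.tryUpdateBlock:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the fix direction: validate the generation stamp *before*
// renaming the meta file, so a stale recovery cannot rename state away
// and break later recoveries with "Meta file not found" errors.
class MetaStore {
    private final Map<Long, String> metaFiles = new HashMap<>(); // blockId -> "blk_<id>_<gs>"
    private final Map<Long, Long> genStamps = new HashMap<>();   // blockId -> current GS

    void put(long blockId, long gs) {
        genStamps.put(blockId, gs);
        metaFiles.put(blockId, "blk_" + blockId + "_" + gs);
    }

    /** Validate first; only then mutate (the ordering HDFS-1263 is about). */
    boolean tryUpdateBlock(long blockId, long oldGs, long newGs) {
        Long current = genStamps.get(blockId);
        if (current == null || current != oldGs || newGs <= oldGs) {
            return false; // stale recovery: reject without touching the meta file
        }
        metaFiles.put(blockId, "blk_" + blockId + "_" + newGs); // the "rename"
        genStamps.put(blockId, newGs);
        return true;
    }

    String metaFile(long blockId) {
        return metaFiles.get(blockId);
    }
}
```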
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881748#action_12881748 ] sam rash commented on HDFS-1262: i think something along the lines of option 2 sounds cleaner imo. but i have another question: does the error you see happen because the current leaseholder is trying to recreate the file? it sounds like this code is executing:
{code}
//
// We found the lease for this file. And surprisingly the original
// holder is trying to recreate this file. This should never occur.
//
if (lease != null) {
  Lease leaseFile = leaseManager.getLeaseByPath(src);
  if (leaseFile != null && leaseFile.equals(lease)) {
    throw new AlreadyBeingCreatedException(
        "failed to create file " + src + " for " + holder +
        " on client " + clientMachine +
        " because current leaseholder is trying to recreate file.");
  }
}
{code}
and anytime I see a comment "this should never happen", it sounds to me like the handling of that case might be suboptimal. is there any reason that a client shouldn't be able to open a file in the same mode it already has it open? NN-side, it's basically a no-op, or an explicit lease renewal. any reason we can't make the above code do that? (log something and return) Failed pipeline creation during append leaves lease hanging on NN - Key: HDFS-1262 URL: https://issues.apache.org/jira/browse/HDFS-1262 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20-append Reporter: Todd Lipcon Priority: Critical Fix For: 0.20-append Ryan Rawson came upon this nasty bug in HBase cluster testing.
What happened was the following: 1) File's original writer died 2) Recovery client tried to open file for append - looped for a minute or so until soft lease expired, then append call initiated recovery 3) Recovery completed successfully 4) Recovery client calls append again, which succeeds on the NN 5) For some reason, the block recovery that happens at the start of append pipeline creation failed on all datanodes 6 times, causing the append() call to throw an exception back to HBase master. HBase assumed the file wasn't open and put it back on a queue to try later 6) Some time later, it tried append again, but the lease was still assigned to the same DFS client, so it wasn't able to recover. The recovery failure in step 5 is a separate issue, but the problem for this JIRA is that the NN can think it failed to open a file for append when the NN thinks the writer holds a lease. Since the writer keeps renewing its lease, recovery never happens, and no one can open or recover the file until the DFS client shuts down. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881755#action_12881755 ] sam rash commented on HDFS-1262: in looking at the client code, my suggestion above probably isn't a good idea: it would allow concurrent writes. i think the simplest solution is this: add a finally block that removes the path from the LeaseChecker in DFSClient. then the lease will expire in 60s.
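The finally-block idea can be sketched roughly as follows; LeaseRegistry and its methods are hypothetical stand-ins for DFSClient's lease bookkeeping, not the real API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: if stream construction fails after the lease was granted,
// unregister the path so the client stops renewing and the lease can
// expire on the NN. All names here are hypothetical.
class LeaseRegistry {
    private final Map<String, Object> renewing = new ConcurrentHashMap<>();

    Object open(String src) {
        renewing.put(src, new Object());       // lease granted by the NN
        boolean ok = false;
        try {
            Object stream = createStream(src); // may throw (pipeline setup, etc.)
            ok = true;
            return stream;
        } finally {
            if (!ok) {
                renewing.remove(src);          // the fix: stop renewing on failure
            }
        }
    }

    private Object createStream(String src) {
        if (src.startsWith("/bad")) {
            throw new IllegalStateException("append pipeline creation failed");
        }
        return new Object();
    }

    boolean holdsLease(String src) {
        return renewing.containsKey(src);
    }
}
```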
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881757#action_12881757 ] sam rash commented on HDFS-1262: provided there is agreement on the last suggestion, i'm happy to take care of it btw
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881759#action_12881759 ] sam rash commented on HDFS-1262: oops, nevermind--forgot that lease renewal is by client name *only* (hence your option 1). still pondering this a bit, but ~option 2 sounds most appealing: a client should have a way to release the lease it has on a file without necessarily doing a normal close (and hence completeFile).
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881783#action_12881783 ] sam rash commented on HDFS-1262: actually, here's another idea: 3) the NN thinks the client has a lease, and it's right. the client just didn't save enough information to handle the failure. namenode.append() just returns the last block. The code in DFSClient:
{code}
OutputStream result = new DFSOutputStream(src, buffersize, progress,
    lastBlock, stat, conf.getInt("io.bytes.per.checksum", 512));
leasechecker.put(src, result);
return result;
{code}
if in leasechecker we stored a pair, lastBlock and result (and did so in a finally block):
{code}
OutputStream result = null;
try {
  result = new DFSOutputStream(src, buffersize, progress,
      lastBlock, stat, conf.getInt("io.bytes.per.checksum", 512));
} finally {
  Pair<LocatedBlock, OutputStream> pair = new Pair<LocatedBlock, OutputStream>(lastBlock, result);
  leasechecker.put(src, pair);
  return result;
}
{code}
and above, we only call namenode.append() if we don't have a lease already. again, if we do find a solution, i'm happy to help out on this one.
[jira] Commented: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN
[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881795#action_12881795 ] sam rash commented on HDFS-1262: i see, it only solves the re-open-by-the-same-client problem, not the blocking of other clients. the fact is the client does have the lease, and currently the only way to release it is via close. in looking at DFSClient.create(), the same problem can occur there: we make a NN rpc call to get a block and acquire a lease, and then create the DFSOutputStream (which could fail). i think that comes back to the need to be able to release a lease without calling namenode.completeFile(). i guess there's not a clever way to do this with the existing namenode RPC and/or client-initiated lease recovery?
[jira] Commented: (HDFS-1263) 0.20: in tryUpdateBlock, the meta file is renamed away before genstamp validation is done
[ https://issues.apache.org/jira/browse/HDFS-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881801#action_12881801 ] sam rash commented on HDFS-1263: so if i follow: checking that the genstamp of the file is the one we are trying to update, before doing any mutation of blocks or metadata (ie renaming), should fix this issue. regarding throwing an ioe on concurrent recovery in the same node, that might be problematic if: DN A can talk to DN B, not DN C; DN B can talk to DN A and DN C; DN A starts recovery first; DN B starts after. if DN B talks to DN A before DN A times out talking to C, we'll fail a recovery that could succeed, no? i like the idea of failing these as early in the pipeline as possible, but i lean towards fixing the genstamp detection. seems like the whole genstamp process is designed for this--there's just a tiny bug with the rename
[jira] Commented: (HDFS-1260) 0.20: Block lost when multiple DNs trying to recover it to different genstamps
[ https://issues.apache.org/jira/browse/HDFS-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881985#action_12881985 ] sam rash commented on HDFS-1260: about the testing, any reason not to use one of the adapters instead of making this public? {code} public long nextGenerationStampForBlock(Block block) throws IOException { {code} sorry, i'm a stickler for visibility/encapsulation bits when i can be 0.20: Block lost when multiple DNs trying to recover it to different genstamps -- Key: HDFS-1260 URL: https://issues.apache.org/jira/browse/HDFS-1260 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.20-append Attachments: hdfs-1260.txt Saw this issue on a cluster where some ops people were doing network changes without shutting down DNs first. So, recovery ended up getting started at multiple different DNs at the same time, and some race condition occurred that caused a block to get permanently stuck in recovery mode. What seems to have happened is the following: - FSDataset.tryUpdateBlock called with old genstamp 7091, new genstamp 7094, while the block in the volumeMap (and on filesystem) was genstamp 7093 - we find the block file and meta file based on block ID only, without comparing gen stamp - we rename the meta file to the new genstamp _7094 - in updateBlockMap, we do comparison in the volumeMap by oldblock *without* wildcard GS, so it does *not* update volumeMap - validateBlockMetaData now fails with blk_7739687463244048122_7094 does not exist in blocks map After this point, all future recovery attempts to that node fail in getBlockMetaDataInfo, since it finds the _7094 gen stamp in getStoredBlock (since the meta file got renamed above) and then fails since _7094 isn't in volumeMap in validateBlockMetadata Making a unit test for this is probably going to be difficult, but doable. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
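The adapter suggestion might look roughly like this; class names are hypothetical stand-ins for FSNamesystem and the test-side *Adapter classes:

```java
// Keep the production method package-private and reach it from tests
// through a small same-package adapter, instead of widening it to public.
class NameSystem {
    private long genStamp = 1000;

    // package-private: visible to the adapter below, not to other packages
    long nextGenerationStampForBlock(long blockId) {
        return ++genStamp;
    }
}

/** Test-only accessor; production callers never see the method widened. */
final class NameSystemAdapter {
    private NameSystemAdapter() {}

    static long nextGenerationStampForBlock(NameSystem ns, long blockId) {
        return ns.nextGenerationStampForBlock(blockId);
    }
}
```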
[jira] Commented: (HDFS-1260) 0.20: Block lost when multiple DNs trying to recover it to different genstamps
[ https://issues.apache.org/jira/browse/HDFS-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881988#action_12881988 ] sam rash commented on HDFS-1260: oh, other than that, lgtm
[jira] Commented: (HDFS-1260) 0.20: Block lost when multiple DNs trying to recover it to different genstamps
[ https://issues.apache.org/jira/browse/HDFS-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881991#action_12881991 ] sam rash commented on HDFS-1260: yea, looks good. at some point, does it make sense to move the DelayAnswer class out? it seems generally useful (not this patch, but just thinking)
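For readers who haven't seen it, the gist of a DelayAnswer-style helper is a pair of latches that lets a test park a call and release it deterministically, so race windows become reproducible. A Mockito-free sketch with hypothetical names (the real helper wraps a Mockito Answer):

```java
import java.util.concurrent.CountDownLatch;

// Two latches: "fired" announces the call has arrived; "proceed" holds it
// there until the test decides the race window should close.
class DelayGate {
    private final CountDownLatch fired = new CountDownLatch(1);
    private final CountDownLatch proceed = new CountDownLatch(1);

    /** Called from the stubbed method: announce arrival, then block. */
    void await() {
        fired.countDown();
        awaitLatch(proceed);
    }

    /** Called from the test: wait until a call is parked inside await(). */
    void waitForCall() {
        awaitLatch(fired);
    }

    /** Called from the test: release the parked call. */
    void release() {
        proceed.countDown();
    }

    private static void awaitLatch(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }
}
```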
[jira] Commented: (HDFS-1218) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
[ https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881249#action_12881249 ] sam rash commented on HDFS-1218: I racked my brain and can't come up with a case where this could actually occur--keepLength is only set true when doing an append. If any nodes had gone down and come back up (RWR), they either have an old genstamp and will be ignored, or soft-lease-expiry recovery is initiated by the NN with keepLength = false first. i think the idea + patch look good to me (and thanks for taking the time to explain it) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization - Key: HDFS-1218 URL: https://issues.apache.org/jira/browse/HDFS-1218 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.20-append Attachments: hdfs-1281.txt When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1202) DataBlockScanner throws NPE when updated before initialized
[ https://issues.apache.org/jira/browse/HDFS-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881370#action_12881370 ] sam rash commented on HDFS-1202: this looks good. I checked trunk and I think it is needed there also DataBlockScanner throws NPE when updated before initialized --- Key: HDFS-1202 URL: https://issues.apache.org/jira/browse/HDFS-1202 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append, 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.20-append Attachments: hdfs-1202-0.20-append.txt Missing an isInitialized() check in updateScanStatusInternal -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HDFS-1214) hdfs client metadata cache
[ https://issues.apache.org/jira/browse/HDFS-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash reassigned HDFS-1214: -- Assignee: sam rash hdfs client metadata cache -- Key: HDFS-1214 URL: https://issues.apache.org/jira/browse/HDFS-1214 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs client Reporter: Joydeep Sen Sarma Assignee: sam rash In some applications, latency is affected by the cost of making rpc calls to namenode to fetch metadata. the most obvious case is calls to fetch file/directory status. applications like hive like to make optimizations based on file size/number etc. - and for such optimizations - 'recent' status data (as opposed to most up-to-date) is acceptable. in such cases, a cache on the DFS client that transparently caches metadata would greatly benefit applications. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881421#action_12881421 ] sam rash commented on HDFS-1057: i have an updated patch, but it does not yet handle the missing replicas as 0 sized for under construction. there may be other 20 patches to port to make this happen. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1057: --- Attachment: hdfs-1057-trunk-4.txt includes requested changes by hairong. also handles immediate reading of new files by translating a ReplicaNotFoundException into a 0-length block within DFSInputStream for under construction files Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
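The fallback described in this patch note (treating a missing replica of an under-construction file as zero-length) can be sketched in isolation. This is a hedged illustration only; the class and method names below are invented for the sketch and are not the actual DFSInputStream API.

```java
// Sketch of the client-side fallback: if the DN has not yet put the replica
// in its volumeMap (i.e. it threw ReplicaNotFoundException, modeled here as
// a null length), an under-construction file is read as zero-length instead
// of failing. All names are illustrative assumptions.
class UnderConstructionLengthFallback {
    /**
     * @param dnLength          length reported by the DN, or null if the DN
     *                          could not find the replica
     * @param underConstruction whether the file is still being written
     */
    static long visibleLength(Long dnLength, boolean underConstruction) {
        if (dnLength == null) {
            if (underConstruction) {
                // Block allocated in the NN but not yet visible on the DN:
                // nothing readable yet, so report zero bytes.
                return 0L;
            }
            throw new IllegalStateException("replica missing for a finalized block");
        }
        return dnLength;
    }
}
```

A brand-new file thus reads as empty until the datanode registers the replica, rather than surfacing an error to the tailing reader.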
[jira] Commented: (HDFS-1218) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
[ https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12880987#action_12880987 ] sam rash commented on HDFS-1218: a few questions 1. this assumes a DN goes down with the client (either in tandem, or on the same box) and that the NN initiates lease recovery later correct? 2. the idea here is that RBW should have lengths longer than RWR, but both will have the same genstamp? If so, why aren't we just taking the replica with the longest length? Is there a reason to 3. if sync() did not complete, there is no violation. do I follow? i agree we can try to recover more data if it's there, but i just want to make sure i'm on the same page 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization - Key: HDFS-1218 URL: https://issues.apache.org/jira/browse/HDFS-1218 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.20-append Attachments: hdfs-1281.txt When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1218) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
[ https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12880989#action_12880989 ] sam rash commented on HDFS-1218: I realize in the hadoop code we already swallow InterruptedException frequently, but I think you can change the trend here: {code} // wait for all acks to be received back from datanodes synchronized (ackQueue) { if (!closed && ackQueue.size() != 0) { try { ackQueue.wait(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); // add this } continue; } } {code} otherwise, it's very easy to have a thread that I own and manage that has a DFSOutputStream in it that swallows an interrupt. when i check Thread.currentThread().isInterrupted() to see if one of my other threads has interrupted me, i will not see it (the crux here is that swallowing interrupts in threads that hadoop controls is less harmful--this is directly in client code when you call sync()/close()) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization - Key: HDFS-1218 URL: https://issues.apache.org/jira/browse/HDFS-1218 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.20-append Attachments: hdfs-1281.txt When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1218) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
[ https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12880990#action_12880990 ] sam rash commented on HDFS-1218: disregard comment above, was meant for hdfs-895 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization - Key: HDFS-1218 URL: https://issues.apache.org/jira/browse/HDFS-1218 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.20-append Attachments: hdfs-1281.txt When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-895) Allow hflush/sync to occur in parallel with new writes to the file
[ https://issues.apache.org/jira/browse/HDFS-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12880991#action_12880991 ] sam rash commented on HDFS-895: --- re: the patch I realize in the hadoop code we already swallow InterruptedException frequently, but I think you can change the trend here: {code} // wait for all acks to be received back from datanodes synchronized (ackQueue) { if (!closed && ackQueue.size() != 0) { try { ackQueue.wait(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); // add this } continue; } } {code} otherwise, it's very easy to have a thread that I own and manage that has a DFSOutputStream in it that swallows an interrupt. when i check Thread.currentThread().isInterrupted() to see if one of my other threads has interrupted me, i will not see it (the crux here is that swallowing interrupts in threads that hadoop controls is less harmful--this is directly in client code when you call sync()/close()) Allow hflush/sync to occur in parallel with new writes to the file -- Key: HDFS-895 URL: https://issues.apache.org/jira/browse/HDFS-895 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.22.0 Reporter: dhruba borthakur Assignee: Todd Lipcon Fix For: 0.22.0 Attachments: hdfs-895-0.20-append.txt, hdfs-895-20.txt, hdfs-895-trunk.txt, hdfs-895.txt In the current trunk, the HDFS client methods writeChunk() and hflush/sync are synchronized. This means that if a hflush/sync is in progress, an application cannot write data to the HDFS client buffer. This reduces the write throughput of the transaction log in HBase. The hflush/sync should allow new writes to happen to the HDFS client even when a hflush/sync is in progress. It can record the seqno of the message for which it should receive the ack, indicate to the DataStream thread to start flushing those messages, exit the synchronized section and just wait for that ack to arrive.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
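The interrupt-restoring pattern suggested in the comment above can be sketched as a self-contained helper. `waitRestoringInterrupt` is a hypothetical name for illustration, not DFSOutputStream code; the point is only that the interrupt flag is re-asserted instead of swallowed.

```java
// Minimal sketch: wait on a lock, and if interrupted, restore the thread's
// interrupt status so that callers (e.g. of sync()/close()) can still
// observe interrupts sent by their own coordinating threads.
class InterruptPreservingWait {
    /**
     * Waits on lock for up to millis. Returns true if the wait was
     * interrupted; in that case the interrupt flag is re-asserted rather
     * than silently dropped.
     */
    static boolean waitRestoringInterrupt(Object lock, long millis) {
        synchronized (lock) {
            try {
                lock.wait(millis);
                return false;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // re-assert for callers
                return true;
            }
        }
    }
}
```

Note that after restoring the flag the caller must also stop waiting (or handle the pending interrupt); blindly looping back into `wait()` with the flag set would throw `InterruptedException` again immediately.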
[jira] Commented: (HDFS-1218) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
[ https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881005#action_12881005 ] sam rash commented on HDFS-1218: 1. how can there be a pipeline recovery by a client when the client goes down? in client-initiated recovery, it sends in the list of nodes which excludes the node that went down. Even if a node goes down and comes back up, it won't participate in recovery. The only case I can see that this can occur is if the client is not the one to initiate lease recovery--ie hard or soft limits in the NN. I only point this out because I wonder if this recovery code can be simplified. We already pass in a flag that is a surrogate for indicating NN initiated lease recovery (closeFile == true = NN). maybe not, but I wanted to throw it out there. 2. hmm, i think i see, it's sort of like using RBW and RWR as 1 and 0, and tacking to the genstamp so that you take the highest appended genstamp and take the shortest length of those as the length of the block. in this way, you are auto-incrementing the genstamp in a way... but I think there's still an edge case: i. client node has network trouble (slow, falls off) and transfer to next DN in pipeline from primary slowed/stops (going to timeout) ii. DN-1 writes after putting bytes into network buffer iii. bytes make it to first DN disk, but do not leave OS network stack iv. DN-1 comes up before NN starts hard expiry lease recovery v. we use the other DNs length which is shorter or do I misunderstand? 
20 append: Blocks recovered on startup should be treated with lower priority during block synchronization - Key: HDFS-1218 URL: https://issues.apache.org/jira/browse/HDFS-1218 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.20-append Attachments: hdfs-1281.txt When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1218) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
[ https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881011#action_12881011 ] sam rash commented on HDFS-1218: ah, interesting. so the point of this fix isn't to get the best block, but to maintain sync semantics? 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization - Key: HDFS-1218 URL: https://issues.apache.org/jira/browse/HDFS-1218 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.20-append Attachments: hdfs-1281.txt When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1218) 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
[ https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12881025#action_12881025 ] sam rash commented on HDFS-1218: re: the patch shouldn't the skipping of RWRs be inside the else block? if keepLength is passed by a client, the fact the block length matches should be the sole criterion for accepting it right? there is not a notion of better. (tho, I don't think it will ever be the case that we have RWRs participating in a client-initiated recovery. soft expiry even comes from the NN where keepLength=false) {code} if (!shouldRecoverRwrs && info.wasRecoveredOnStartup()) { LOG.info("Not recovering replica " + record + " since it was recovered on " + "startup and we have better replicas"); continue; } if (keepLength) { if (info.getBlock().getNumBytes() == block.getNumBytes()) { syncList.add(record); } } else { syncList.add(record); if (info.getBlock().getNumBytes() < minlength) { minlength = info.getBlock().getNumBytes(); } } {code} 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization - Key: HDFS-1218 URL: https://issues.apache.org/jira/browse/HDFS-1218 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20-append Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.20-append Attachments: hdfs-1281.txt When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
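The selection logic discussed in the patch comment above can be reduced to a standalone sketch. This is a simplified illustration, not the real `FSDataset`/recovery code: replicas are modeled as parallel arrays of lengths and recovered-on-startup (RWR) flags, and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of building the sync list: startup-recovered (RWR) replicas are
// skipped when non-restarted replicas exist, and under keepLength an exact
// length match is the sole admission criterion.
class BlockSyncSelection {
    static List<Long> buildSyncList(long[] lengths, boolean[] rwr,
                                    boolean haveNonRwr, boolean keepLength,
                                    long expectedLen) {
        List<Long> syncList = new ArrayList<>();
        for (int i = 0; i < lengths.length; i++) {
            if (haveNonRwr && rwr[i]) {
                // Skip replicas recovered on startup; better replicas exist.
                continue;
            }
            if (keepLength) {
                // Client-initiated (append) recovery: exact match only.
                if (lengths[i] == expectedLen) {
                    syncList.add(lengths[i]);
                }
            } else {
                // NN-initiated recovery: admit all, truncate to the minimum.
                syncList.add(lengths[i]);
            }
        }
        return syncList;
    }
}
```

Placing the RWR skip before the `keepLength` branch, as in the quoted patch, means an RWR replica with a matching length is also excluded; the comment's question is whether that exclusion should apply only in the `else` (NN-initiated) path.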
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12880278#action_12880278 ] sam rash commented on HDFS-1057: So removing setBytesOnDisk() means: {code} if (replicaInfo instanceof ReplicaBeingWritten) { ((ReplicaBeingWritten) replicaInfo) .setLastChecksumAndDataLen(offsetInBlock, lastChunkChecksum); } replicaInfo.setBytesOnDisk(offsetInBlock); {code} will not have the latter. So all other implementations of Replica will have a valid value for getBytesOnDisk()? Does this also mean that the impl of getBytesOnDisk for ReplicaInPipeline will move to ReplicaBeingWritten? Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12880291#action_12880291 ] sam rash commented on HDFS-1057: another way to ask this: only ReplicaBeingWritten needs to have the bytes on disk set in BlockRecevier? Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
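The consistency property being negotiated in the exchange above (bytes-on-disk and the last partial-chunk checksum must never be observed out of sync) can be sketched with a single synchronized setter. This is a toy model under assumed names, not the actual `ReplicaBeingWritten` implementation.

```java
// Sketch: update the on-disk length and the in-memory last-chunk checksum
// in one synchronized method, so a concurrent reader can never see a new
// length paired with a stale checksum.
class RbwReplicaSketch {
    private long bytesOnDisk;
    private byte[] lastChecksum = new byte[0];

    synchronized void setLastChecksumAndDataLen(long dataLen, byte[] checksum) {
        this.bytesOnDisk = dataLen;   // length and checksum change together
        this.lastChecksum = checksum;
    }

    synchronized long getBytesOnDisk() { return bytesOnDisk; }

    synchronized byte[] getLastChecksum() { return lastChecksum; }
}
```

A separate unsynchronized `setBytesOnDisk` on the same object would reintroduce the race, which is why the discussion favors never calling it for a replica being written.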
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12880346#action_12880346 ] sam rash commented on HDFS-1057: got it. i will make the changes and get a patch soon Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12879460#action_12879460 ] sam rash commented on HDFS-1057: 1. they aren't guaranteed to be since there are methods to change the bytesOnDisk separate from the lastCheckSum bytes. It's entirely conceivable that something could update the bytes on disk w/o updating the lastChecksum with the current set of methods If we are ok with a loosely coupled guarantee, then we can use bytesOnDisk and be careful never to call setBytesOnDisk() for any RBW 2. oh--your previous comments indicated we shouldn't change either ReplicaInPipelineInterface or ReplicaInPipeline. If that's not the case and we can do this, then my comment above doesn't hold. we use bytesOnDisk and guarantee it's in sync with the checksum in a single synchronized method (I like this) 3. will make the update to treat missing last blocks as 0-length and re-instate the unit test. thanks for all the help on this Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12876380#action_12876380 ] sam rash commented on HDFS-1057: I am addressing the last comments. I have one more question, though, as I have one test that still fails and I want to see what you think the expected behavior should be: immediate read of a new file: 1. writer creates a file and starts to write and hence blocks are assigned in the NN 2. a reader gets these locations and contacts DN 3. DN has not yet put the replica in the volumeMap and FSDataset.getVisibleLength() throws a MissingReplicaException In 0.20, I made it so that the client just treats this as a 0-length file. What should the behavior in trunk be? Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.21.0, 0.22.0, 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12876410#action_12876410 ] sam rash commented on HDFS-1057: hmm, i can remove the test case. one of our internal tools saw this rather frequently in 0.20--maybe in trunk it's far less likely? Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.21.0, 0.22.0, 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1057: --- Attachment: hdfs-1057-trunk-3.txt 1. endOffset is either bytesOnDisk or the chunkChecksum.getDataLength() 2. if tmpLen == endOffset this is a write in progress, use the in-memory checksum (else this is a finalized block not ending on a chunk boundary) 3. fixed up whitespace Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.21.0, 0.22.0, 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
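The two points in the trunk-3 patch note above can be illustrated with a small sketch. The names and the `Math.max` formulation are assumptions made for the illustration, not the actual BlockSender code.

```java
// Sketch of the read-path decision: endOffset extends past bytesOnDisk when
// an in-memory checksum covers a longer (partial) last chunk, and a read
// that reaches endOffset of a replica being written must use the in-memory
// checksum, since the on-disk checksum of that chunk is not yet stable.
class LastChunkChecksumChoice {
    /** endOffset: bytesOnDisk, or the data length covered by the in-memory
     *  last-chunk checksum, whichever is larger. */
    static long endOffset(long bytesOnDisk, long lastChunkDataLen) {
        return Math.max(bytesOnDisk, lastChunkDataLen);
    }

    /** True when the requested read range ends at the unstable last chunk. */
    static boolean useInMemoryChecksum(long readEnd, long endOffset) {
        return readEnd == endOffset;
    }
}
```

A read ending short of `endOffset` touches only finalized chunks and can rely on the checksum file as written.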
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12876424#action_12876424 ] sam rash commented on HDFS-1057: also disabled the test on immediate read of a new file for now. if we want to change how this is handled, I can enable it Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.21.0, 0.22.0, 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1057: --- Attachment: hdfs-1057-trunk-2.txt 1. new endOffset calc includes determining if in-memory checksum is needed 2. added methods to RBW only to set/get last checksum and data length -track this dataLength separate as setBytesOnDisk may be called independently and make the length/byte[] not match (in theory bytes on disk *could* be set to more and we still want a checksum + the corresponding length kept) 3. appropriate changes around waiting for start + length did not remove all replicaVisibleLength uses yet--want to clarify what to replace them with in pre-existing code. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.21.0, 0.22.0, 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875801#action_12875801 ] sam rash commented on HDFS-1057: Thanks for the quick review. I understand most of the comments, but have a couple of questions: 1. replicaVisibleLength was here before I made any changes. Why is it not valid? I understood it to be an upper bound on the bytes that could be read from this block. Is it the case that start + length = replicaVisibleLength and we want to optimize? (the for loop to wait for bytes on disk = visible length was here before, I just moved it earlier in the constructor) 2. not sure I understand endOffset. This was again a variable that already existed. What I thought you were getting at was the condition to decide if we should use the in-memory checksum or not (which is what you describe). 3. If we don't put the sync set/get method in ReplicaInPipelineInterface, we will have to use an if/else construct on instanceof in BlockReceiver and call one or the other. I can see the argument for keeping the method out of the interface since it is RBW-specific, but on the other hand, it's effectively a no-op for other implementers of the interface and leads to cleaner code (better natural polymorphism than if-else constructs to force it). either way, just wanted to throw that out there as a question of style Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.21.0, 0.22.0, 0.20-append Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush().
Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
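Question 3 above is a style trade-off that can be shown in isolation. The following is a minimal, hypothetical sketch, with simplified names modeled on ReplicaInPipelineInterface and ReplicaBeingWritten rather than the actual Hadoop classes: declaring the setter on the interface lets BlockReceiver call it polymorphically, while implementers without RBW state treat it as a no-op.

```java
// Simplified sketch, not the real Hadoop types.
interface ReplicaInPipelineSketch {
    void setLastChecksumAndDataLen(long dataLen, byte[] lastChecksum);
}

// the RBW case actually records the length and the partial chunk's checksum
class RbwReplicaSketch implements ReplicaInPipelineSketch {
    private long bytesOnDisk;
    private byte[] lastChecksum;

    public synchronized void setLastChecksumAndDataLen(long dataLen, byte[] lastChecksum) {
        this.bytesOnDisk = dataLen;
        this.lastChecksum = lastChecksum;
    }

    public synchronized long getBytesOnDisk() { return bytesOnDisk; }
}

// other implementers have no in-flight chunk to track; the call is a no-op,
// so the caller needs no instanceof branching at the call site
class NonRbwReplicaSketch implements ReplicaInPipelineSketch {
    public void setLastChecksumAndDataLen(long dataLen, byte[] lastChecksum) { }
}
```

The alternative is `if (replica instanceof RbwReplicaSketch) { ... }` at every call site in BlockReceiver, which is exactly the if/else construct the comment argues against.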
[jira] Commented: (HDFS-1142) Lease recovery doesn't reassign lease when triggered by append()
[ https://issues.apache.org/jira/browse/HDFS-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875303#action_12875303 ] sam rash commented on HDFS-1142: A small side note re: killing writers: recovery does so *after* getting metadata, so there is still a window in which the client could start another lease recovery, have it complete, and then start writing and call sync; the first lease recovery then kills the threads and truncates the block based on the first set of lengths. This violates sync/hflush semantics. I don't know if there's a jira for this, but I had planned to make the change so the writers *are* killed first thing, before getting metadata.

Lease recovery doesn't reassign lease when triggered by append() Key: HDFS-1142 URL: https://issues.apache.org/jira/browse/HDFS-1142 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1142.txt, hdfs-1142.txt If a soft lease has expired and another writer calls append(), it triggers lease recovery but doesn't reassign the lease to a new owner. Therefore, the old writer can continue to allocate new blocks, try to steal back the lease, etc. This is for the testRecoveryOnBlockBoundary case of HDFS-1139
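The ordering the comment proposes can be sketched abstractly. This is a hypothetical illustration, not the actual NameNode/DataNode recovery code; all names are made up. The invariant is simply that writer threads are stopped before replica lengths are read, so no replica can grow between the length snapshot and the truncate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the recovery ordering described above; names are
// illustrative, not the real Hadoop lease-recovery code. Recording the call
// order makes the "kill writers first" invariant checkable.
class LeaseRecoverySketch {
    final List<String> calls = new ArrayList<>();

    long recoverBlock(long[] replicaLengths) {
        stopWriters();  // kill writer threads FIRST: nothing can extend a replica now
        long target = Long.MAX_VALUE;
        for (long len : snapshotLengths(replicaLengths)) {
            target = Math.min(target, len);  // truncate target = min replica length
        }
        truncateTo(target);
        return target;
    }

    void stopWriters()                { calls.add("stopWriters"); }
    long[] snapshotLengths(long[] ls) { calls.add("getLengths"); return ls; }
    void truncateTo(long target)      { calls.add("truncate"); }
}
```

In the pre-fix ordering the comment objects to, snapshotLengths would run before stopWriters, leaving the window in which a client could sync more data that the truncate then discards.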
[jira] Commented: (HDFS-1142) Lease recovery doesn't reassign lease when triggered by append()
[ https://issues.apache.org/jira/browse/HDFS-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12875313#action_12875313 ] sam rash commented on HDFS-1142: todd: ah yeah, I had trunk open and just checked: it's exactly that way. Nice. We do need HDFS-1186, though.
[jira] Commented: (HDFS-1142) Lease recovery doesn't reassign lease when triggered by append()
[ https://issues.apache.org/jira/browse/HDFS-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12873212#action_12873212 ] sam rash commented on HDFS-1142: konstantin: never mind my last comment, I misread yours (and forgot that 0.22 == trunk).
[jira] Updated: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-1057: Attachment: hdfs-1057-trunk-1.txt. Ported the patch to trunk (hairong's idea of storing the last checksum).
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12871285#action_12871285 ] sam rash commented on HDFS-1057: @hairong: I'm looking a little at implementing this in trunk (reading your append/hflush doc from HDFS-265), and I have a question. From above: "In each ReplicaBeingWritten, we could have two more fields to keep track of the last consistent state: the replica length and the last chunk's crc." Why does there need to be another length field? Isn't getVisibleLength() == acked bytes sufficient? If the crc stored in the RBW is for that length, you only need the additional byte[] field holding the last chunk's crc, I think. ReplicaBeingWritten.setBytesAcked() could take the crc and atomically set the length + crc bytes.
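The atomic update suggested in that last sentence can be sketched minimally. This is a hypothetical sketch, with field and method names modeled on the discussion rather than the real ReplicaBeingWritten: the acked length and the last partial chunk's crc are written and read under one lock, so a reader can never pair a new length with a stale checksum.

```java
// Hypothetical sketch: acked length and last-chunk crc updated atomically.
class RbwSketch {
    private long bytesAcked;
    private byte[] lastChunkCrc;

    // writer side: one synchronized call sets both fields together
    synchronized void setBytesAcked(long bytesAcked, byte[] lastChunkCrc) {
        this.bytesAcked = bytesAcked;
        this.lastChunkCrc = (lastChunkCrc == null) ? null : lastChunkCrc.clone();
    }

    // reader side: each snapshot is taken under the same lock
    synchronized long getVisibleLength() { return bytesAcked; }

    synchronized byte[] getLastChunkCrc() {
        return (lastChunkCrc == null) ? null : lastChunkCrc.clone();
    }
}
```

The defensive clone() matters because the writer reuses its checksum buffer; handing the reader a reference instead of a copy would reintroduce the very race being fixed.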
[jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12871288#action_12871288 ] sam rash commented on HDFS-1057: Hmm, looking at the code more, I see that this depends on how many bytes we want to make available to readers:

- visible length (bytes acked): needed for a consistent view of the data, I think
- bytes on disk: this seems like it would give inconsistent reads; and in theory, acked data may be *more* than what is on disk for a given node (if I read the doc + code right). But then how can a DN send data that's not on disk unless it's made available via memory? (very complex)
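The trade-off in this comment reduces to a one-liner. As a hypothetical sketch (not Hadoop code): if a node only ever serves bytes it has flushed, the length exposed to a reader is capped at bytes on disk, even when more bytes have been acked by the pipeline.

```java
// Hypothetical sketch: cap what a reader may see at what this replica has
// actually flushed, since pipeline-acked bytes can exceed this node's
// on-disk bytes.
class ReadableLengthSketch {
    static long readableLength(long bytesAcked, long bytesOnDisk) {
        return Math.min(bytesAcked, bytesOnDisk);
    }
}
```

Serving the full acked length instead would require the DN to hand out data still sitting in BlockReceiver's buffers, which is the "made available via memory" complexity the comment flags.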