[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-21 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286408#comment-14286408
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7430:
---

It seems that the entire BlockScanner is rewritten.  This is not a code 
refactoring.  Is that correct?

If yes, how about making it configurable so that it is possible to use the old 
scanner?  The new scanner will need some time to stabilize.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, HDFS-7430.009.patch, 
 HDFS-7430.010.patch, HDFS-7430.011.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-21 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286495#comment-14286495
 ] 

Colin Patrick McCabe commented on HDFS-7430:


It is fair to call this a rewrite of major parts of the block scanner.

I don't think it makes sense to maintain two block scanners in parallel.  There 
would have to be a lot of glue code and extra interfaces to get both working.  
Let's let this soak in trunk for a while and then merge to branch-2 when it is 
stabilized, the same as we did with other things such as truncate.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, HDFS-7430.009.patch, 
 HDFS-7430.010.patch, HDFS-7430.011.patch, HDFS-7430.012.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279969#comment-14279969
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12692717/HDFS-7430.011.patch
  against trunk revision 5d1cca3.

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9245//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, HDFS-7430.009.patch, 
 HDFS-7430.010.patch, HDFS-7430.011.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14280744#comment-14280744
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12692717/HDFS-7430.011.patch
  against trunk revision 000ca83.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager
  org.apache.hadoop.hdfs.TestSafeMode

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9249//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9249//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, HDFS-7430.009.patch, 
 HDFS-7430.010.patch, HDFS-7430.011.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-15 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279501#comment-14279501
 ] 

Andrew Wang commented on HDFS-7430:
---

Hey Colin, thanks for revving, I took another look. Just nits, I'm +1 pending 
these:

The DFSConfigKeys spacing, we want the variable names to line up. Maybe we're 
talking past each other here...this is the current:

{code}
  public static final int DFS_DATANODE_SCAN_PERIOD_HOURS_DEFAULT = 0;
  public static final String DFS_BLOCK_SCANNER_VOLUME_BYTES_PER_SECOND =
      "dfs.block.scanner.volume.bytes.per.second";
  public static final long DFS_BLOCK_SCANNER_VOLUME_BYTES_PER_SECOND_DEFAULT =
      1048576L;
  public static final String  DFS_DATANODE_TRANSFERTO_ALLOWED_KEY =
      "dfs.datanode.transferTo.allowed";
{code}

Desired:

{code}
  public static final int DFS_DATANODE_SCAN_PERIOD_HOURS_DEFAULT = 0;
  public static final String  DFS_BLOCK_SCANNER_VOLUME_BYTES_PER_SECOND =
      "dfs.block.scanner.volume.bytes.per.second";
  public static final long    DFS_BLOCK_SCANNER_VOLUME_BYTES_PER_SECOND_DEFAULT =
      1048576L;
  public static final String  DFS_DATANODE_TRANSFERTO_ALLOWED_KEY =
      "dfs.datanode.transferTo.allowed";
{code}

FSVolumeImpl
* Unused imports
* Would be good to wrap the LOG.trace in save and load with isTraceEnabled, 
which will let us skip doing the writeValueAsString. slf4j only skips the 
message-string construction cost; the parameters are still evaluated. (See the 
sketch after this list.)
* I guess we can not do the iterator-of-iterator, but could we still rename 
{{eof}} to something more descriptive? Don't need to stick to error code names 
here.
* Regarding the iterator extending Closeable, can't we just add Closeable when 
there's a need for it? This is a private interface after all.
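For illustration, a minimal sketch of the guard being suggested; the cursor type, logger name, and the Jackson 2 API used here are assumptions, not the patch's actual code:

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

class TraceGuardSketch {
  private static final Logger LOG = LoggerFactory.getLogger(TraceGuardSketch.class);
  private static final ObjectMapper MAPPER = new ObjectMapper();

  static void logSavedCursor(Object cursor) throws JsonProcessingException {
    // slf4j parameterized logging only skips formatting the message string;
    // the writeValueAsString argument would still be evaluated, so guard it.
    if (LOG.isTraceEnabled()) {
      LOG.trace("wrote cursor: {}", MAPPER.writeValueAsString(cursor));
    }
  }
}
{code}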

TestOverReplicatedBlocks, opportunity to fix the spacing in this line:
{code}
  for(int i=0; !scanCursor.delete(); i++) {
{code}

BlockScanner:
* INTERNAL_DFS_BLOCK_SCANNER_THRESHOLD_BYTES_PER_SECOND_DEFAULT is never used? 
IIRC this was a min to avoid dribbly little scans, but I'm ok with not having 
that.

TestBlockScanner:
* testVolumeIteratorImpl is long and the flow control is quite complicated, is 
it possible to simplify / break this up? Seems worth it even at the cost of 
some duplicated code.
* A bunch of the log statements prepend TestBlockScanner: but isn't this 
unnecessary? Our log4j format includes the log's name, which is 
TestBlockScanner.


 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, HDFS-7430.009.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279794#comment-14279794
 ] 

Colin Patrick McCabe commented on HDFS-7430:


bq. ... DFSConfigKeys spacing...

Ah, that's what you meant.  Fixed.

* Unused imports \[in FsVolumeImpl\]

Fixed

bq. Would be good to wrap the LOG.trace in save and load with isTraceEnabled, 
which will let us skip doing the writeValueAsString. slf4j just lets us skip 
string construction costs, the parameters are still resolved.

ok

bq. could we still rename eof to something more descriptive? Don't need to 
stick to error code names here.

renamed to 'atEnd'

bq. Regarding the iterator extending Closeable, can't we just add Closeable 
when there's a need for it? This is a private interface after all.

This is going to be needed once we start reference-counting FsVolume objects.  
The close method will decrement a reference count on the volume.  Lei Xu is 
working on that change now.

bq. TestOverReplicatedBlocks, opportunity to fix the spacing in this line:

ok

bq. INTERNAL_DFS_BLOCK_SCANNER_THRESHOLD_BYTES_PER_SECOND_DEFAULT is never 
used? IIRC this was a min to avoid dribbly little scans, but I'm ok with not 
having that.

yeah, it's unused. removed.

bq. testVolumeIteratorImpl is long and the flow control is quite complicated, 
is it possible to simplify / break this up? Seems worth it even at the cost of 
some duplicated code.

So, the issue with this test is that it's testing a bunch of stuff, including 
rewind, load, and save.  I think there is some value in testing the combination 
of all these things, really stressing it.  It might look a little cleaner if 
the {{blocksProcessed}} if-statements were factored out into some kind of 
observer objects, but on the other hand, it would probably be even harder to 
follow the control flow.  I'd say let's refactor this test in a follow-on if 
needed.

bq. A bunch of the log statements prepend TestBlockScanner: but isn't this 
unnecessary? Our log4j format includes the log's name, which is 
TestBlockScanner.

yeah, I'll skip logging that.

Thanks for the +1; I will commit when Jenkins returns.  We can file any more 
work as follow-ons.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, HDFS-7430.009.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279932#comment-14279932
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12692693/HDFS-7430.010.patch
  against trunk revision c4ccbe6.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9242//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9242//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, HDFS-7430.009.patch, 
 HDFS-7430.010.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-12 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14274402#comment-14274402
 ] 

Colin Patrick McCabe commented on HDFS-7430:


rebasing patch

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14274431#comment-14274431
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12691782/HDFS-7430.008.patch
  against trunk revision f3507fa.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9193//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-12 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14274543#comment-14274543
 ] 

Colin Patrick McCabe commented on HDFS-7430:


The {{getStoredBlock}} interface changed.  Let me update the patch to use the 
new API.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2015-01-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14274718#comment-14274718
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12691817/HDFS-7430.009.patch
  against trunk revision 6f3a63a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.web.TestWebHDFSAcl

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9194//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9194//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, HDFS-7430.008.patch, HDFS-7430.009.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14254373#comment-14254373
 ] 

Colin Patrick McCabe commented on HDFS-7430:


I rebased on trunk.

I moved the call to {{BlockScanner#addVolumeScanner}} into {{FsVolumeList}}.  
Since the call to {{BlockScanner#removeVolumeScanner}} was there, it seems to 
make more sense this way.

bq. DFSConfigKeys, I agree the spacing is erratic in this file, but adding 
some spaces to line up the variable names would agree with the variables 
immediately around the new keys

If the spacing is erratic, let's fix the spacing.  I filed HDFS-7557 to do 
this.  It will be nicer to do it in a separate jira to avoid muddying the 
change history.

bq. Still need javadoc <p/> tags in a lot of places. It's not a big deal, so if 
you do another pass and think it looks fine we can leave it.

I did another pass on this and found a few spots I missed the first time.  I 
agree that we can do a follow-on if there are any more... it's not critical, 
just kind of nice.

bq. TestFsDatasetImpl, FsVolumeImpl, FsDatasetSpi, FsDatasetImpl unused imports

OK, I just removed a few and all of them show up as black in Intellij.  
Hopefully I got them all this time

bq. \@VisibleForTesting could be added to 
BlockScanner#Conf#INTERNAL_DFS_BLOCK_SCANNER_THRESHOLD...

added

bq. Still some lines longer than 80 chars

- look at 

bq. VolumeScanner#positiveMsToHours, the else case \[could be done with 
TimeUnit\]

Changed.  Also fixed some cases where {{TimeUnit#convert}} was being used 
improperly by me (src and dst units were flipped)
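As a side note, a small sketch of the two conversion styles (the values here are arbitrary): {{TimeUnit#convert}} takes the destination unit as the receiver and the source unit as the second argument, which is exactly where flipped arguments go unnoticed.

{code}
import java.util.concurrent.TimeUnit;

public class TimeUnitSketch {
  public static void main(String[] args) {
    long durationMs = 7200000L;  // two hours, in milliseconds

    // The form suggested in the review: the receiver names the unit of the argument.
    long hoursA = TimeUnit.MILLISECONDS.toHours(durationMs);                  // 2

    // TimeUnit#convert: receiver = destination unit, source unit comes second.
    long hoursB = TimeUnit.HOURS.convert(durationMs, TimeUnit.MILLISECONDS);  // 2

    // Flipping src and dst (the bug mentioned above) silently converts the
    // other way: this treats durationMs as a number of hours.
    long wrong = TimeUnit.MILLISECONDS.convert(durationMs, TimeUnit.HOURS);

    System.out.println(hoursA + " " + hoursB + " " + wrong);
  }
}
{code}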

bq. testScanRateImpl \[could be done with TimeUnit\]

This is probably a dumb question, but I'm not sure what you are suggesting I 
should use {{TimeUnit}} for here?

bq. I'd still like to use JSON to save the iterator.  Pretty sure Jackson can 
pretty print it for you.

OK.  I'll use JSON here.
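For illustration only, a sketch of saving the cursor as pretty-printed JSON; the field names and the Jackson 2 packaging are assumptions, not the patch's actual serialization code:

{code}
import java.io.File;
import java.io.IOException;
import com.fasterxml.jackson.databind.ObjectMapper;

class CursorJsonSketch {
  /** Hypothetical cursor state; the real saved fields may differ. */
  public static class Cursor {
    public String curFinalizedDir;
    public String curFinalizedSubDir;
    public String curEntry;
    public boolean atEnd;
  }

  private static final ObjectMapper MAPPER = new ObjectMapper();

  static void save(Cursor cursor, File file) throws IOException {
    // writerWithDefaultPrettyPrinter() gives the human-readable output mentioned above.
    MAPPER.writerWithDefaultPrettyPrinter().writeValue(file, cursor);
  }

  static Cursor load(File file) throws IOException {
    return MAPPER.readValue(file, Cursor.class);
  }
}
{code}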

bq. I also still like the iterator-of-iterators idea a lot, since we could 
probably use the same iterator implementation at each level. Iterating would be 
simpler, the serde would be harder, but overall I think simpler code and more 
friendly for Java programmers.

We'll have time to look into this later.  Rebasing a big patch like this is 
annoying, and it's debatable whether iterator-of-iterators is even better (it's 
certainly less efficient)

bq. BlockIterator still implements Closeable, unnecessary?

It might be necessary for other implementations of BlockIterator, besides the 
one in {{FsDatasetImpl}}.  In general, you might need to create some 
storage-system-level iterator.

bq. \[neededBytesPerSec\] Still mismatched?

Very good catch.  Yeah, it needs to convert between bytes/hour and bytes/sec

bq. Guessing the JDK7 file listing goodness is coming in the next patch, since 
it's still using File#list

Now that I rebased on trunk, I can add this.  Added.

bq. Did you look into the failed test I posted earlier? Any RCA?

I think that this probably resulted from the incorrect rate computation.

bq. The bugs found in my previous review seem worth unit testing, e.g the OBO 
with the binarySearch index and the neededBytesPerSec that still looks off, the 
<= in place of < that affected continuous scans. Might be fun trying to write 
some actual stripped down unit tests, rather than poking with a full 
minicluster.

OK that's fair.  The binarySearch index code path didn't get much exercise, 
because the off-by-one appeared only when the directory listing had changed 
between our last listDir and the current one.  I wrote a unit test just for 
{{FsVolumeImpl#nextSorted}} that should verify that this is working as expected.

I will also break {{calculateNeededBytesPerSec}} out into a function and write 
a unit test for that

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-19 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14254474#comment-14254474
 ] 

Colin Patrick McCabe commented on HDFS-7430:


So, I changed the rate calculation a bit.  Now it simply scans at the 
configured bytes per second if the average bytes per second in the last hour is 
lower than the target, and scans nothing otherwise.  I was having some trouble 
getting the rate calculation to be exactly what I wanted (It's kind of a PID 
control problem), so I think a fixed rate is easier to understand.  Added a 
unit test for this as well.
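Roughly, the behaviour described above amounts to something like the following sketch (names assumed; this is not the actual VolumeScanner code):

{code}
class FixedRateSketch {
  private static final long SECONDS_PER_HOUR = 3600L;

  /** Scan at the configured rate while the trailing-hour average is below the
   *  target; otherwise scan nothing until the average drops again. */
  static long scanRateBytesPerSec(long targetBytesPerSec, long bytesScannedLastHour) {
    long avgBytesPerSec = bytesScannedLastHour / SECONDS_PER_HOUR;
    return (avgBytesPerSec < targetBytesPerSec) ? targetBytesPerSec : 0L;
  }
}
{code}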

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14254535#comment-14254535
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12688487/HDFS-7430.007.patch
  against trunk revision 808cba3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 12 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 13 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.fs.TestSymlinkHdfsFileContext

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9100//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9100//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9100//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, 
 HDFS-7430.007.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250452#comment-14250452
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12687802/HDFS-7430.006.patch
  against trunk revision 9b4ba40.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 13 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9060//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9060//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-17 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250832#comment-14250832
 ] 

Andrew Wang commented on HDFS-7430:
---

Cool, looks like you hit a lot of these. I did another review pass:

Nits: 
* DFSConfigKeys, I agree the spacing is erratic in this file, but adding some 
spaces to line up the variable names would agree with the variables immediately 
around the new keys
* Still need javadoc {{<p/>}} tags in a lot of places. It's not a big deal, so 
if you do another pass and think it looks fine we can leave it.
* TestFsDatasetImpl, FsVolumeImpl, FsDatasetSpi, FsDatasetImpl unused imports
* @VisibleForTesting could be added to 
BlockScanner#Conf#INTERNAL_DFS_BLOCK_SCANNER_THRESHOLD...
* Still some lines longer than 80 chars

Some more time conversions that could be done with TimeUnit:
* VolumeScanner#positiveMsToHours, the else case
* testScanRateImpl

FsDatasetImpl
* I'd still like to use JSON to save the iterator :) Pretty sure Jackson can 
pretty print it for you.
* I also still like the iterator-of-iterators idea a lot, since we could 
probably use the same iterator implementation at each level. Iterating would be 
simpler, the serde would be harder, but overall I think simpler code and more 
friendly for Java programmers.
* BlockIterator still implements Closeable, unnecessary?

VolumeScanner

{code}
  // Find out how many bytes per second we should scan.
  long neededBytesPerSec =
conf.targetBytesPerSec - (scannedBytesSum / MINUTES_PER_HOUR);
{code}

Still mismatched?

* Guessing the JDK7 file listing goodness is coming in the next patch, since 
it's still using File#list

Tests:
* Did you look into the failed test I posted earlier? Any RCA?
* The bugs found in my previous review seem worth unit testing, e.g the OBO 
with the binarySearch index and the neededBytesPerSec that still looks off, the 
{{<=}} in place of {{<}} that affected continuous scans. Might be fun trying to 
write some actual stripped down unit tests, rather than poking with a full 
minicluster.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-16 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249132#comment-14249132
 ] 

Colin Patrick McCabe commented on HDFS-7430:


bq. getVolumeStats and VolumeScanner#getStatistics can be @VisibleForTesting

ok

bq. IIUC the getUnitTestLong is for the INTERNAL_ config keys, but it's not 
being used for MAX_STALENESS_MS. This one is also seemingly not set by any unit 
tests, so maybe can be removed.

Yeah, we should use {{getUnitTestLong}} here.  Added.  I would prefer to keep 
this here in case we ever wanted to expose it, or if we added a test for it 
later.  The unit tests do test different staleness settings, but they use the 
setter directly at the moment.

bq. Would prefer Precondition checks that abort with a message rather than 
Math.max'ing with 0. Users should know when they have an invalid config, maybe 
they typoed a minus sign rather than wanting a value of zero. This isn't 
necessary for the INTERNAL_ ones though, since they're only set by tests.

So, we already have a convention going on that setting 
{{dfs.datanode.scan.period.hours}} to -1 indicates that the block scanner 
should be disabled.  It's not my favorite convention, but I don't want to break 
user configs over a bikeshed issue.  I would also prefer to handle all config 
keys uniformly, rather than throwing an exception if some were negative, but 
letting others go negative.  I think the likelihood of accidentally setting a 
negative value is pretty small... I would expect most mistakes to be leaving 
out a zero, or just not setting the key at all, etc.

bq. Again a comment about locking would be nice. I see direct usage of 
FSVolumes which I think is safe, but dunno really. 

[~eddyxu] and I have been discussing this in HDFS-7496.  Once that JIRA is 
complete, FsVolume instances will be reference-counted, ensuring that we never 
perform an operation on a volume that has already been removed.  That fits in 
nicely with the current code, because the volume removal code instructs the 
volume scanner to shut down.

bq. Various races that lead to FNF too, which are already described in 
ScanResultHandler.

Yeah, these already exist in the current block scanner code.  Eventually maybe 
we can do a spin-wait loop to make sure that we can distinguish real block FNF 
from race-condition-induced FNF.  But I think that's something we should do in 
a follow-up... this patch is big enough :)

bq. ScanResultHandler can be package-private

ok

bq. getStatistics and getVolume can be @VisibleForTesting

{{getStatistics}}, sure.  {{getVolume}} is called by {{BlockScanner}}

bq. I notice some Time.now() being used to calculate an interval in 
findNextUsableBlockIter, can this be monotonicNow instead?

The iterator start time is saved in the cursor file, and it is a wall-clock 
time, not a monotonic time.  This is necessary because if we shut down the 
datanode and reboot, we want to be able to pick up in our scan where we left 
off.  But monotonic time resets every time we reboot.  I added a comment in 
{{findNextUsableBlockIter}} to clarify this.

bq. scanBlock, the first series of try/catch, is this going to lead to lots of 
log prints? I believe passing in the exception will print a stack trace. I 
noticed ScanResultHandler quashes a lot of these.

Yeah, let's not print the stack trace... it's not interesting here.  Also, I 
will log a message when {{FsVolumeSpi#getStoredBlock}} returns null instead of 
throwing {{FileNotFoundException}} (apparently it's allowed to do that too :P )

bq. Seems like a mismatch here, since this is subtracting the average num bytes 
scanned per minute, not per second

Yikes.  Yeah, we need the average bytes per *second* here, not the average 
bytes per *minute*, which is what we'd get from just dividing the bytes in the 
last hour by 60.  Fixed.

bq. if (neededBytesPerSec <= conf.thresholdBytesPerSec) { ... should be just <

Yeah, let's use < instead.  This will also allow setting the threshold to 0 to 
get continuous scan.

bq. Comment seems unrelated.

So, actually the comment is related... we're taking one {{ExtendedBlock}} 
structure, which always has 0 for its genstamp, and looking up another 
{{ExtendedBlock}} structure, which will have a valid genstamp.  It's 
frustrating that we don't have a more expressive type system that could express 
something like "this struct has poolId and blockId, but not genstamp".  Instead 
we've got Block.java which has id / genstamp, and ExtendedBlock.java, which has 
all that plus block pool id.

I will see if I can come up with a better comment in the code.

bq. Doesn't BlockIterator fit better in BlockPoolSlice, since it's tied to a 
block pool?

{{BlockPoolSlice}} is internal to {{FsDatasetImpl}}.  It's not a generic class, 
whereas the volume scanner is intended to be able to work on any volume, not 
just ones from {{FsVolumeImpl}}.

bq. 

[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-16 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249317#comment-14249317
 ] 

Colin Patrick McCabe commented on HDFS-7430:


bq. getSubdirEntries, we can directly set long now = Time.monotonicNow and save 
the else case.

ok

bq. It'd be nicer to use Paths.get rather than stringing together all these 
File objects. Yay JDK7!

good idea
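Just to illustrate the style change being agreed on here (the directory names below are made up):

{code}
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathStyleSketch {
  public static void main(String[] args) {
    // Stringing File constructors together, as before.
    File f = new File(new File(new File("/data/dn1", "current"), "BP-1"), "finalized");

    // JDK 7 style: a single Paths.get call.
    Path p = Paths.get("/data/dn1", "current", "BP-1", "finalized");

    System.out.println(f + " vs " + p);
  }
}
{code}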

bq. getNextSubDir, getNextFinalizedDir, getNextFinalizedSubDir, 
getSubdirEntries don't need to throw IOException, neither does nextBlock. I'm a 
bit surprised, but it turns out listing dirs only throws a RuntimeException on 
permission error (which we shouldn't have to worry about). Can clean the 
resulting try/catch in VolumeScanner too.

Let me fix this so that we return {{IOException}} from these functions.  We can 
use the JDK7 APIs to do this.

bq. Is it worth doing rename tricks to atomically save the cursor file? I 
remember reading an LWN article about this being surprisingly hard to do 
correctly, so maybe worth looking that up too. Could also just double buffer 
and look at ctimes too.

Sure.  Let's write to a temporary file and then rename once we're done.  That's 
easy to do.

rename is atomic on almost all production systems.  Anyway, even if we end up 
with an empty or missing cursor file once in a blue moon, it just means we 
restart the scan.
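A minimal sketch of the write-then-rename approach (the file naming and the string payload are assumptions):

{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class AtomicCursorSaveSketch {
  /** Write the serialized cursor to a temporary file, then rename it into place. */
  static void saveAtomically(Path cursorFile, String json) throws IOException {
    Path tmp = cursorFile.resolveSibling(cursorFile.getFileName() + ".tmp");
    Files.write(tmp, json.getBytes(StandardCharsets.UTF_8));
    // rename is atomic on the local filesystems we care about; a crash leaves
    // either the old cursor or the new one, never a half-written file.
    Files.move(tmp, cursorFile, StandardCopyOption.ATOMIC_MOVE);
  }
}
{code}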

bq. The cache timeout seems rather low. With the default params of 1MB/s and 
30s timeout, we'll timeout each time if blocks are bigger than 30MB.

Good point.  I will up it to 15 minutes, that seems reasonable for this and 
should let us scan a few blocks in between.

bq. Is this OBO? If missing, the return value is -index - 1. With the code as 
it is, we'll never return 0, which is what we want if the first item in the 
list was deleted.

Yeah, I think this is wrong.  Will change.
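A sketch of the "next element after prev in a sorted list" logic under discussion (not the actual {{FsVolumeImpl#nextSorted}}), showing where the binarySearch insertion point comes in:

{code}
import java.util.Collections;
import java.util.List;

class NextSortedSketch {
  static String nextSorted(List<String> sorted, String prev) {
    int idx;
    if (prev == null || prev.isEmpty()) {
      idx = 0;  // start from the first entry
    } else {
      int found = Collections.binarySearch(sorted, prev);
      // If prev is still present, advance past it.  If it was deleted,
      // binarySearch returns (-(insertionPoint) - 1); the insertion point is
      // already the next element, so no further +1 -- adding one here as well
      // is the off-by-one discussed above.
      idx = (found >= 0) ? found + 1 : -(found + 1);
    }
    return (idx < sorted.size()) ? sorted.get(idx) : null;
  }
}
{code}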

bq. Spacing is a little off with rest of file

This is actually a problem with the other keys at the end of the file, not with 
the new keys I added.  If you look carefully, you can see they're spaced with 3 
spaces rather than 2 :P

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-12 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245106#comment-14245106
 ] 

Colin Patrick McCabe commented on HDFS-7430:


bq. Any reason BlockScanner is contained in DataNode rather than FsDatasetImpl? 
It seems like their lifecycles are the same. Might also move to the fsdataset 
package if you agree.

I think the logic is sufficiently general that this could be used in other 
{{FsDataset}} implementations than {{FsDatasetImpl}}.  So I'd prefer to keep it 
in {{DataNode}}.  The block iterator abstracts away the details of reading the 
blocks from a volume, and could be implemented by other {{Volume}} 
implementations.  Actually I think the abstraction is better now because we cut 
the link to reading paths.

bq. There's a bunch of time conversion scattered about, it'd be better to use 
TimeUnit.MILLISECONDS.toHours(millis) and similar where we can. I like this 
form better than TimeUnit#convert since it's very obvious.

Good idea.

bq. Javadoc, need to put a <p/> tag to actually get line-breaks.

added... hopefully I got all the spots

bq. Can add a log when removeVolumeScanner is called and not enabled

added

I have to close this window now, will address the other comments in a bit

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243146#comment-14243146
 ] 

Andrew Wang commented on HDFS-7430:
---

This is a heroic and much needed patch. Thanks for taking this on Colin, mondo 
diff here:

{noformat}
 32 files changed, 2268 insertions(+), 2414 deletions(-)
{noformat}

I appreciate that a lot of the new lines are dedicated to code comments, 
configuration, and tests, and it's still a net reduction in the number of lines 
of code. Well done just based on that.

Anyway, as with any patch this large, I do have a few review comments :)

Misc:
* Any reason BlockScanner is contained in DataNode rather than FsDatasetImpl? 
It seems like their lifecycles are the same. Might also move to the fsdataset 
package if you agree.
* There's a bunch of time conversion scattered about, it'd be better to use 
{{TimeUnit.MILLISECONDS.toHours(millis)}} and similar where we can. I like this 
form better than {{TimeUnit#convert}} since it's very obvious.
* Javadoc, need to put a {{<p/>}} tag to actually get line-breaks.
* A few lines longer than 80 chars

BlockScanner:
* Can add a log when removeVolumeScanner is called and not enabled
* Could we have a comment about the lock hierarchy? It seems like it's 
FSDatasetImpl -> BlockScanner -> VolumeScanner, and some things go direct to 
BlockScanner, like initBlockPool and the servlet.
* getVolumeStats and VolumeScanner#getStatistics can be @VisibleForTesting

BlockScanner#Conf:
* IIUC the getUnitTestLong is for the INTERNAL_ config keys, but it's not being 
used for MAX_STALENESS_MS. This one is also seemingly not set by any unit 
tests, so maybe can be removed.
* Would prefer Precondition checks that abort with a message rather than 
Math.max'ing with 0. Users should know when they have an invalid config, maybe 
they typoed a minus sign rather than wanting a value of zero. This isn't 
necessary for the INTERNAL_ ones though, since they're only set by tests.

VolumeScanner:
* Again a comment about locking would be nice. I see direct usage of FSVolumes 
which I think is safe, but dunno really. Various races that lead to FNF too, 
which are already described in ScanResultHandler.
* ScanResultHandler can be package-private
* getStatistics and getVolume can be @VisibleForTesting
* I notice some Time.now() being used to calculate an interval in 
findNextUsableBlockIter, can this be monotonicNow instead?
* scanBlock, the first series of try/catch, is this going to lead to lots of 
log prints? I believe passing in the exception will print a stack trace. I 
noticed ScanResultHandler quashes a lot of these.

{code}
long neededBytesPerSec =
  conf.targetBytesPerSec - (scannedBytesSum / MINUTES_PER_HOUR);
{code}

Seems like a mismatch here, since this is subtracting the average num bytes 
scanned per minute, not per second
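For illustration (names assumed): if {{scannedBytesSum}} holds the bytes scanned over the trailing hour, dividing by 60 yields bytes per minute, while a bytes-per-second target needs a division by 3600:

{code}
class RateUnitsSketch {
  static long neededBytesPerSec(long targetBytesPerSec, long scannedBytesLastHour) {
    // scannedBytesLastHour / 60 would be bytes per *minute* (the mismatch noted
    // above); a bytes-per-second target needs a division by 3600.
    long avgBytesPerSecond = scannedBytesLastHour / 3600L;
    return targetBytesPerSec - avgBytesPerSecond;
  }
}
{code}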

{code}
if (neededBytesPerSec <= conf.thresholdBytesPerSec) {
{code}

This should be just {{<}} based on the following log message.

{code}
// Get the genstamp for the block.
{code}

Comment seems unrelated.

Higher-level FsVolumeSpi/FsVolumeImpl BlockIterator stuff:
* Doesn't BlockIterator fit better in BlockPoolSlice, since it's tied to a 
block pool?
* Implementing {{Iterator}} and {{Iterable}} might make a lot of sense here. 
For instance, {{hasNext}} is more canonical than {{eof}}.
* Also not sure why we bother implementing Closeable, when there's just an 
empty {{close()}} implementation.
* Rather than hand-rolled serde, could we use Jackson to do JSON? I think it'll 
save some lines of code too.
* Finally, have you considered implementing this as an iterator-of-iterators? 
The complexity in this code comes from advancing curDir and curSubDir as 
appropriate, but we could encapsulate the logic for each in its own iterator, 
e.g. BlockIterator -> SubdirIterator -> SubdirIterator -> SubdirBlockIterator. 
Serializing would be saving the current iter position of each. Right now this 
is coded in a very C-style with checking nulls.

Detail/nitty BlockIterator stuff:
* Class javadoc for BlockIterator is a bit incorrect, it returns entries from a 
block pool, not the entire volume.
* Unused imports
* getSubdirEntries, we can directly set {{long now = Time.monotonicNow}} and 
save the else case.
* It'd be nicer to use {{Paths.get}} rather than stringing together all these 
File objects. Yay JDK7!
* getNextSubDir, getNextFinalizedDir, getNextFinalizedSubDir, getSubdirEntries 
don't need to throw IOException, neither does nextBlock. I'm a bit surprised, 
but it turns out listing dirs only throws a RuntimeException on permission 
error (which we shouldn't have to worry about). Can clean the resulting 
try/catch in VolumeScanner too.
* Is it worth doing rename tricks to atomically save the cursor file? I 
remember reading an LWN article about this being surprisingly hard to do 
correctly, so maybe worth looking that up too. Could 

[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-10 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14241924#comment-14241924
 ] 

Colin Patrick McCabe commented on HDFS-7430:


bq. how about blocks which go to already scanned subdirs?  These subdirs 
have to wait for their next turn for scanning?

Yes.  They will have to wait until the next scan period (normally a week or 
two.)

In general, block scanning is always going to be a probabilistic thing.  We 
never know when blocks are going to go bad.  Perhaps a block goes bad right 
after we scan it, and then we don't find out until two weeks later when we 
rescan.  This could have happened under the old block scanner.  The function of 
the block scanner isn't to immediately detect all blocks going bad (this is 
simply not possible), but to ensure that we look at old data every once in a 
while.

bq. If there are very few blocks ( just for example) and completed scanning 
fast, then whether the scanning will continue again?

We will restart scanning after {{dfs.datanode.scan.period.hours}} have gone by.

If we can't complete a full scan within {{dfs.datanode.scan.period.hours}}, 
then we will scan continuously without a break.

bq. How are new blocks identified among already scanned blocks in the same 
subdir, since no record of previous scan information is maintained?

There is no distinction between old and new blocks.  There is just an 
iterator and your position.  Your position is saved to the cursor file 
periodically, so that if the datanode is restarted, we pick up the scan more or 
less where we left off.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.





[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14240698#comment-14240698
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12686175/HDFS-7430.005.patch
  against trunk revision 437322a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 13 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 163 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
  org.apache.hadoop.hdfs.qjournal.TestSecureNNWithQJM

  The following test timeouts occurred in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8989//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8989//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8989//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8989//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-12-09 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14240729#comment-14240729
 ] 

Vinayakumar B commented on HDFS-7430:
-

I haven't gone through the whole patch yet, but I have checked the iterator.
I follow the idea of choosing the next block to scan by walking through the 
subdirectories.

Currently the subdir path is derived directly from the blockId, so it is not 
guaranteed that the sort order of the subdirs matches the creation order of 
the blocks.
I have a few questions here. 

* What about blocks that land in already-scanned subdirs?
   Do these subdirs have to wait for their next turn to be scanned?

* If there are very few blocks (just for example) and scanning completes 
quickly, will the scanning continue again?

* How are new blocks identified among already-scanned blocks in the same 
subdir, given that no previous scan information is maintained?

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, HDFS-7430.005.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223654#comment-14223654
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683394/HDFS-7430.003.patch
  against trunk revision 8e253cb.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1219 javac 
compiler warnings (more than the trunk's current 1218 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
  
org.apache.hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport
  org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot
  
org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage
  org.apache.hadoop.hdfs.server.balancer.TestBalancer
  org.apache.hadoop.hdfs.server.namenode.TestFSImageWithSnapshot
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestINodeFileUnderConstructionWithSnapshot
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots
  org.apache.hadoop.hdfs.server.datanode.TestBlockScanner
  
org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8823//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8823//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8823//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8823//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-24 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223709#comment-14223709
 ] 

Colin Patrick McCabe commented on HDFS-7430:


* Fix findbugs warning

* Fix javac warning

The other failures look like an slf4j problem; I'm getting a bunch of errors like this:
{code}
testRenameDirAndDeleteSnapshot_2(org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots)
  Time elapsed: 0 sec   ERROR!
java.lang.ClassCastException: org.slf4j.impl.Log4jLoggerAdapter cannot be cast 
to org.apache.commons.logging.impl.Log4JLogger
at 
org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotTestHelper.setLevel2OFF(SnapshotTestHelper.java:105)
at 
org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotTestHelper.disableLogs(SnapshotTestHelper.java:92)
at 
org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots.init(TestRenameWithSnapshots.java:80)
{code}

This is because I changed the scanner from using log4j to slf4j, so the log4j 
cast in {{SnapshotTestHelper}} no longer applies.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223768#comment-14223768
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683394/HDFS-7430.003.patch
  against trunk revision 2967c17.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 1219 javac 
compiler warnings (more than the trunk's current 1218 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks
  org.apache.hadoop.hdfs.server.datanode.TestBlockScanner
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots
  
org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
  
org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage
  org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestINodeFileUnderConstructionWithSnapshot
  org.apache.hadoop.hdfs.server.namenode.TestFSImageWithSnapshot
  
org.apache.hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8824//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8824//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8824//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8824//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-24 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223998#comment-14223998
 ] 

Colin Patrick McCabe commented on HDFS-7430:


* Fix issue with {{stats.nextBlockPoolScanStartMs}}

* {{VolumeScanner#Statistics}} should have a {{toString}} method.

* {{TestOverReplicatedBlocks}}: test needs a slightly different way of manually 
triggering the block scanner

* {{BlockReportTestBase}}: 
{{DFSConfigKeys.DFS_DATANODE_DIRECTORYSCAN_INTERVAL_KEY}} is in seconds, but 
the test was treating it as milliseconds.

* {{SnapshotTestHelper}}: the old {{setLevel2OFF}} function for 
programmatically changing unit test logging no longer works (it throws an 
exception), because some of the loggers are now slf4j loggers rather than 
log4j.  I got rid of {{setLevel2OFF}} and added a more sophisticated set of 
methods to {{GenericTestUtils}} that can programmatically disable or change 
logging for commons-logging, slf4j, or log4j loggers, without making the 
caller care which is which (a rough sketch of the idea is below).
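
A rough sketch of that dispatch idea, not the actual {{GenericTestUtils}} code 
from the patch: only cast to {{Log4JLogger}} when the wrapper really is backed 
by log4j, and for slf4j loggers reach through to the underlying log4j logger 
by name:

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.impl.Log4JLogger;
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;

/** Rough sketch of the dispatch idea; not the actual GenericTestUtils code. */
class TestLoggingHelper {
  static void disableLog(org.apache.log4j.Logger logger) {
    logger.setLevel(Level.OFF);
  }

  static void disableLog(org.slf4j.Logger logger) {
    // slf4j has no level-setting API of its own, so reach through to the
    // underlying log4j logger by name (assumes a log4j binding).
    disableLog(LogManager.getLogger(logger.getName()));
  }

  static void disableLog(Log log) {
    if (log instanceof Log4JLogger) {
      // commons-logging wrapper that really is backed by log4j
      disableLog(((Log4JLogger) log).getLogger());
    } else {
      // other commons-logging implementations would need their own handling
      throw new UnsupportedOperationException(
          "unsupported commons-logging implementation: " + log.getClass());
    }
  }
}
{code}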

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14224131#comment-14224131
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683470/HDFS-7430.004.patch
  against trunk revision 61a2510.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 13 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.security.ssl.TestReloadingX509TrustManager

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8830//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8830//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, 
 HDFS-7430.004.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use O(1) memory and use multiple threads

2014-11-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222659#comment-14222659
 ] 

Hadoop QA commented on HDFS-7430:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683267/HDFS-7430.002.patch
  against trunk revision a4df9ee.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8816//console

This message is automatically generated.

 Refactor the BlockScanner to use O(1) memory and use multiple threads
 -

 Key: HDFS-7430
 URL: https://issues.apache.org/jira/browse/HDFS-7430
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7430.002.patch, memory.png


 We should update the BlockScanner to use a constant amount of memory by 
 keeping track of what block was scanned last, rather than by tracking the 
 scan status of all blocks in memory.  Also, instead of having just one 
 thread, we should have a verification thread per hard disk (or other volume), 
 scanning at a configurable rate of bytes per second.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)