[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2015-05-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14524739#comment-14524739
 ] 

Hadoop QA commented on HDFS-4960:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12591131/4960-trunk.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / f1a152c |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10583/console |


This message was automatically generated.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-12-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843883#comment-13843883
 ] 

Hadoop QA commented on HDFS-4960:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12591131/4960-trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
-14 warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
  org.apache.hadoop.hdfs.web.TestWebHDFS

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5683//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5683//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5683//console

This message is automatically generated.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13704848#comment-13704848
 ] 

stack commented on HDFS-4960:
-

bq. You could try increasing the size of the FileInputStreamCache in HDFS by 
setting dfs.client.read.shortcircuit.streams.cache.size to something bigger 
than 100.

Would that cache the meta header Colin?

Varun, Colin gave me crash-course offline on his option #1 above caching the 
meta data header for files in FileInputStreamCache; I can hack up patch when 
you want something to try...

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-10 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13705000#comment-13705000
 ] 

Colin Patrick McCabe commented on HDFS-4960:


bq. Would that cache the meta header Colin?

No.  We would still read it each time.

bq. Varun, Colin gave me crash-course offline on his option #1 above caching 
the meta data header for files in FileInputStreamCache; I can hack up patch 
when you want something to try...

This seems like the way to go for addressing the concerns in this JIRA.  Not a 
huge optimization but it's easy to do.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13703745#comment-13703745
 ] 

Colin Patrick McCabe commented on HDFS-4960:


Two proposals:
* check the version on the DataNode side rather than the client side.  Then, we 
don't have to re-check the header of checksum files that are already in 
FileInputStreamCache.
* Add an optional boolean to OpRequestShortCircuitAccessProto that can be used 
to request *only* the block file, not the checksum file.  This will avoid the 
overhead of duplicating the other file descriptor.  In our testing, this 
overhead was much higher than just doing a read.

Also some reminders:
* if you're optimizing for SSDs (as it says in the description), seeks don't 
matter!
* the .meta file will likely be in the cache after a few reads, so seeks won't 
happen anyway
* let's not commit anything without at least one test that shows it's better.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-09 Thread Varun Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13703810#comment-13703810
 ] 

Varun Sharma commented on HDFS-4960:


Actually, I am currently failing to max out the SSD(s) and I am essentially 
looking out for things that hog CPU resources or could be time sinks. Even with 
this patch, i fail to max out the hardware. 

I will try to run an isolated test perhaps with just HDFS random seeks on the 
drives (no HBase). I agree that we need to have a test showing that this is 
really better. Currently, I doubt that it will be easy because it seems that a 
bunch of other paths which are hogging cpu resources.

As for SSDs, seeks do matter if you are looking max them out or utilize them 
well enough. They obviously dont matter if you are looking at doing  10K iops.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13703828#comment-13703828
 ] 

Colin Patrick McCabe commented on HDFS-4960:


You could try increasing the size of the FileInputStreamCache in HDFS by 
setting {{dfs.client.read.shortcircuit.streams.cache.size}} to something bigger 
than 100.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-09 Thread Varun Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13703840#comment-13703840
 ] 

Varun Sharma commented on HDFS-4960:


IIRC, I ran with 10K or something...

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-08 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13702787#comment-13702787
 ] 

Colin Patrick McCabe commented on HDFS-4960:


This change seems problematic.  The .meta header exists so that we can check 
the version of the block file.  If we ignore it, we've effectively locked 
ourselves into a single version forever.

I don't see how we can accept this change unless an alternate way of versioning 
the block files is also proposed.  One possible way would be through the naming 
of the block files.  But this is something we should discuss before throwing 
out the existing mechanism.  Also, if we're going to ignore the header for 
no-checksum, we should ignore it in the checksum case as well.

Also, did you measure the performance impact?

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-08 Thread Varun Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13702832#comment-13702832
 ] 

Varun Sharma commented on HDFS-4960:


I don't know about the future plans of .meta - however, I think currently its 
only being used for storing checksums and if checksums are not being asked for, 
there is no need for seeking into the .meta file. I think the FB branch already 
has this change. I personally think that inlining metadata in blocks is the way 
to go, for the future, instead of storing separately in a .meta file - it 
cripples hdfs for real time work loads.

I did not have a large enough data set so probably all the .meta files were in 
fs cache.

lseek in .meta ~50 microseconds
lseek in block ~50 microseconds
read .meta  ~50 microseconds
read block  ~ 300 microseconds

So, just looking from system calls perspective, it is ~ 20 % however, it does 
not show up in the end to end test, because in general, there are a bunch of 
other contention issues inside HDFS + HBase which hog CPU resources.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13702877#comment-13702877
 ] 

stack commented on HDFS-4960:
-

bq. The .meta header exists so that we can check the version of the block file.

Thanks for taking a looksee mighty [~cmccabe].

What if we made a patch with a configuration to skip the reading of metadata?  
Or, added a setSkipMetadata API as we have a setVerifyChecksum API?

Agree, it is handy having blocks versioned so can be evolved at later date.  No 
harm having version in side file either since we are probably going to do an 
extra seek to get metadata anyways (would be coolio if DN cached block metadata 
so could save a seek but maybe this would be more trouble than it is worth -- 
we'd need to prove it useful in say a heavy random read scenario).

But the metadata doesn't look to have ever changed.  It is version 1 (It looks 
like the version used to be a define out of FSDataset before it was defined 
inside this file but even then the version was 1).  Meantime the lads here are 
paying a seek to learn something that is unlikely ever to change.  

(Filename would be good place for version but would be a massive change).

Thanks Colin.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701623#comment-13701623
 ] 

Hadoop QA commented on HDFS-4960:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12591131/4960-trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4600//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4600//console

This message is automatically generated.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701670#comment-13701670
 ] 

stack commented on HDFS-4960:
-

+1 on patch.  When you run w/ it [~varunsharma], do you no longer see the 
7-byte extra seek?  All unit tests passing would seem to say you haven't broken 
anything.  Missing I suppose is a test that shows we do not touch the .meta 
file at all when skip checksum is enabled (else it'd have turned up the issue 
you uncovered here).

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4960) Unnecessary .meta seeks even when skip checksum is true

2013-07-07 Thread Varun Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701677#comment-13701677
 ] 

Varun Sharma commented on HDFS-4960:


Yep, that is how I confirmed it - strace now only shows lseek() into the real 
blocks - no more lseeks into the meta file.

 Unnecessary .meta seeks even when skip checksum is true
 ---

 Key: HDFS-4960
 URL: https://issues.apache.org/jira/browse/HDFS-4960
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.1.0-beta
Reporter: Varun Sharma
Assignee: Varun Sharma
 Attachments: 4960-branch2.patch, 4960-trunk.patch


 While attempting to benchmark an HBase + Hadoop 2.0 setup on SSDs, we found 
 unnecessary seeks into .meta files, each seek was a 7 byte read at the head 
 of the file - this attempts to validate the version #. Since the client is 
 requesting no-checksum, we should not be needing to touch the .meta file at 
 all.
 Since the purpose of skip checksum is to also avoid the performance penalty 
 of the extra seek, we should not be seeking into .meta if skip checksum is 
 true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira