[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-10-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13469913#comment-13469913
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-0.94-security-on-Hadoop-23 #8 (See 
[https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/8/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390012)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>
>
> HFiles contain their own checksums to reduce IOPS, so when HBase reads an 
> HFile it does not need to read the checksum from the HDFS meta file.  HLog 
> files, however, do not contain checksums, so when HBase reads an HLog it 
> must read the checksum from the HDFS meta file.  We could add a per-file 
> setSkipChecksum to HDFS, or we could write checksums into the WAL when this 
> skip-checksum facility is enabled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463346#comment-13463346
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390013)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463105#comment-13463105
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-TRUNK #3377 (See 
[https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390013)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463090#comment-13463090
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-0.94 #488 (See 
[https://builds.apache.org/job/HBase-0.94/488/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390012)

 Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463079#comment-13463079
 ] 

Hudson commented on HBASE-6868:
---

Integrated in HBase-0.94-security #57 (See 
[https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? 
(Revision 1390012)

 Result = SUCCESS
larsh : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463017#comment-13463017
 ] 

Lars Hofhansl commented on HBASE-6868:
--

I manually ran these tests (0.94 patch):
* started HBase with HBase checksums off, inserted some data, flushed, 
compacted, scanned
* restarted HBase with HBase checksums on, inserted some more data, 
flushed, compacted, scanned
* restarted HBase again with HBase checksums off, inserted some more data, 
flushed, compacted, scanned

Checked the logs for anything weird. Looks good. Going to commit to 0.94 and 
0.96.

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.2, 0.96.0
>
> Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 
> 6868-0.96-v3.txt
>


[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462925#comment-13462925
 ] 

Lars Hofhansl commented on HBASE-6868:
--

I looked through the run, nothing stuck out... All the tests passed.

I'll do some manual testing today and then commit.

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.3, 0.96.0
>
> Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
>


[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462923#comment-13462923
 ] 

Hadoop QA commented on HBASE-6868:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546437/6868-0.96-v3.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 140 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2931//console

This message is automatically generated.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462457#comment-13462457
 ] 

Lars Hofhansl commented on HBASE-6868:
--

Cool... The only strange spot left is here: 
http://hbase.apache.org/xref/org/apache/hadoop/hbase/io/hfile/HFileBlock.html#1473

We're setting useHBaseChecksum to true in the absence of any information from 
HFileSystem.
As far as I can tell this is only triggered by tests, so I think we're good.

Will commit tomorrow unless there are any objections. Agree on the testing, I 
will add this to the testing spreadsheet. I think I will sink the current 
0.94.2 RC for this.




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462440#comment-13462440
 ] 

stack commented on HBASE-6868:
--

+1 on patch.  Default is to let HDFS worry about checksums.  No chance of 
dbl-checksumming.  If you did enable HBASE-5074 HBase checksums and also 
enabled shortcircuit, you should be good (you'll be doing an unwanted extra 
seek if you have to do a non-local read until HDFS-3429 goes in, but that's 
another issue).  Your setting of dfs.client.read.shortcircuit.skip.checksum on 
the hfile fs will make it so we avoid a dbl-checksum.  Creating the new 
Configuration before you set the boolean, as you have it, is important.  I 
think this is good for trunk and 0.94.
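
A minimal sketch of why the copy matters, assuming a plain HashMap as a stand-in for Hadoop's Configuration (the class and method names here are hypothetical, not taken from the patch). Setting the flag on a private copy keeps it from leaking into the shared configuration that the WAL's filesystem still uses:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a plain Map plays the role of Hadoop's Configuration.
class ConfCopySketch {
    static final String SKIP_CHECKSUM = "dfs.client.read.shortcircuit.skip.checksum";

    // Returns a private copy with skip.checksum forced on.
    // The shared map passed in is never modified.
    static Map<String, String> hfileConf(Map<String, String> shared) {
        Map<String, String> copy = new HashMap<>(shared);
        copy.put(SKIP_CHECKSUM, "true");
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        shared.put(SKIP_CHECKSUM, "false");
        Map<String, String> hfile = hfileConf(shared);
        System.out.println("shared conf: " + shared.get(SKIP_CHECKSUM));
        System.out.println("hfile conf:  " + hfile.get(SKIP_CHECKSUM));
    }
}
```

Had the flag been set on the shared map directly, every filesystem built from it afterwards would also skip HDFS checksum verification.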

We should test starting a 0.94.2 on top of data written when the flag was true 
to see if we skip over the hbase inserted checksums.

I can write a little note for the refguide on this after it goes in.  Will also 
look at verifying it.





[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462410#comment-13462410
 ] 

Lars Hofhansl commented on HBASE-6868:
--

Oh... I see what you meant before. What we could do is:
# disable HBase checksums by default.
# folks can then enable HBase checksums and dfs.client.read.shortcircuit 
together in the respective config files.
# when HBase checksums are enabled, we enable 
dfs.client.read.shortcircuit.skip.checksum for the noChecksumFs in HFileSystem.

That way we'd have no double checksumming and HLog will be checksummed (because 
HLog uses the backingFs - not the noChecksumFs).
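
The proposal above can be sketched as a small decision table. This is only an illustration under stated assumptions: the class, enum, and method names are hypothetical and the real logic lives in HFileSystem.

```java
// Hypothetical model of the proposal; not the actual HFileSystem code.
class ChecksumPlanSketch {
    // Which layer verifies checksums on a given read path.
    enum Verifier { HDFS, HBASE }

    // HFile reads go through the noChecksumFs. With HBase checksums on,
    // HBase verifies its own inline checksums; otherwise HDFS verifies.
    static Verifier hfileVerifier(boolean hbaseChecksums) {
        return hbaseChecksums ? Verifier.HBASE : Verifier.HDFS;
    }

    // HLog reads always go through the backingFs, so HDFS always verifies.
    static Verifier hlogVerifier(boolean hbaseChecksums) {
        return Verifier.HDFS;
    }

    // skip.checksum is set only on the noChecksumFs, and only when
    // HBase checksums are enabled.
    static boolean skipChecksumOnNoChecksumFs(boolean hbaseChecksums) {
        return hbaseChecksums;
    }

    public static void main(String[] args) {
        for (boolean on : new boolean[] {false, true}) {
            System.out.println("hbaseChecksums=" + on
                + " hfile=" + hfileVerifier(on)
                + " hlog=" + hlogVerifier(on)
                + " skipOnNoChecksumFs=" + skipChecksumOnNoChecksumFs(on));
        }
    }
}
```

In both settings each read path has exactly one verifier, which is the point of the proposal.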

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.3, 0.96.0
>
> Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt
>


[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462397#comment-13462397
 ] 

stack commented on HBASE-6868:
--

Only downside is no checksumming when reading (local) WAL blocks.  You ok w/ 
that?  It could be rare enough, but it'd be kinda ugly when we get bitten 
by rotten bits; will we even fall back to a non-local block if the read fails 
because of an unparseable section?



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462368#comment-13462368
 ] 

Lars Hofhansl commented on HBASE-6868:
--

bq. The patch permanently ties the checksumming feature to local short circuit

Yep, this is the reality currently. When that changes, we should change the 
patch, methinks.

On setting both config options... This would lead to the issue reported here, 
that HLogs are not checksummed.

I think with the proposed change we get the best we can get right now. If HDFS 
is set up such that disabling checksumming would have a benefit, it is switched 
on and enabled correctly in HBase.

Once HDFS-3429 is in (and in a mainstream HDFS release, which might be a while) 
we should rethink this.
(Just doing this for 0.94 and leave 0.96 the way it is seems OK too.)




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462319#comment-13462319
 ] 

stack commented on HBASE-6868:
--

Sorry.  Distracted today.  On DFS_CLIENT_READ_SHORTCIRCUIT_KEY, I'd checked and 
yeah, hadoop 1.0.x has it so patch would be fine for 0.96.

The patch permanently ties the checksumming feature to local short circuit.  Is 
that what we want to do?  Might be ok for 0.94 but we might not want it for 
trunk/0.96 which we want working w/ h2 and hopefully it'll get hdfs-3429 soon.

Can we do this Lars:

{quote}
dfs.client.read.shortcircuit = true
dfs.client.read.shortcircuit.skip.checksum=true
{quote}

It means that when we read a WAL with blocks that are local, we'll not 
be checking their checksums.

Should we just turn off the HBASE-5074?  Set 
http://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/HRegionServer.html#468
 to false?  Will that get us the old behavior?

[~liulei.cn] Regards "I think there is another problem in local readI think 
that may lead to RegionServer read wrong data.", I thought it a known hdfs 
issue w/ local read but could not find an issue describing the problem.  I'd 
say file an hdfs issue for it.



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462194#comment-13462194
 ] 

Lars Hofhansl commented on HBASE-6868:
--

Hmm... Just checked Hadoop 0.20.x, 0.21.x, 0.22.x, and 0.23.x do not have 
DFSConfigKeys.DFS_CLIENT_READ_SHORTCIRCUIT_KEY, so this would need to be 
reflected.
Anyway, until somebody else comments here, I won't spend more time on this.

> Skip checksum is broke; are we double-checksumming by default?
> --
>
> Key: HBASE-6868
> URL: https://issues.apache.org/jira/browse/HBASE-6868
> Project: HBase
>  Issue Type: Bug
>  Components: HFile, wal
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Fix For: 0.94.3, 0.96.0
>
> Attachments: 6868-0.96-idea.txt
>


[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462135#comment-13462135
 ] 

Lars Hofhansl commented on HBASE-6868:
--

I would like to commit this (and then respin a 0.94.2RC). Does anyone have an 
issue with the patch I posted?



[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461941#comment-13461941
 ] 

Lars Hofhansl commented on HBASE-6868:
--

Comments? Concerns?
This should fix the first two issues I listed above.




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461494#comment-13461494
 ] 

Lars Hofhansl commented on HBASE-6868:
--

We keep going back and forth between different issues. There are at least three 
issues now discussed here:
# HLogs are not checksummed when dfs.client.read.shortcircuit.skip.checksum and 
dfs.client.read.shortcircuit are both true
# double checksumming when dfs.client.read.shortcircuit = true and 
dfs.client.read.shortcircuit.skip.checksum=false
# local files lingering when the DN is down and the file was deleted via DFS.

The first two are related.





[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461485#comment-13461485
 ] 

Lars Hofhansl commented on HBASE-6868:
--

OK. So can we just document then that 
dfs.client.read.shortcircuit.skip.checksum should not be enabled globally, and 
then we only enable it for the nonCheckSumFs?

If that does not work, let's revert HBASE-5074 (at least from 0.94). It's not 
worth it.
(Although that will be a big task now, since so much has changed since it got 
committed.)

The 2nd part you mention sounds like another bug in HDFS with local reads.
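
One way to picture the proposal above is to leave dfs.client.read.shortcircuit.skip.checksum off in the shared configuration and turn it on only in the copy used to build the non-checksummed filesystem. The sketch below models that with plain java.util.Properties rather than Hadoop's own configuration class. The class name SkipChecksumOverrideSketch and the method forNoChecksumFs are hypothetical and are not taken from the patch.

```java
import java.util.Properties;

/** Hypothetical model of a per-filesystem override; not HBase or HDFS code. */
final class SkipChecksumOverrideSketch {
    static final String SHORT_CIRCUIT = "dfs.client.read.shortcircuit";
    static final String SKIP_CHECKSUM = "dfs.client.read.shortcircuit.skip.checksum";

    /** Returns a copy of the shared settings with skip-checksum enabled; the original is left untouched. */
    static Properties forNoChecksumFs(Properties global) {
        Properties copy = new Properties();
        copy.putAll(global);
        copy.setProperty(SKIP_CHECKSUM, "true");
        return copy;
    }

    public static void main(String[] args) {
        Properties global = new Properties();
        global.setProperty(SHORT_CIRCUIT, "true");
        // Stays off globally, so reads through the regular filesystem keep HDFS checksums.
        global.setProperty(SKIP_CHECKSUM, "false");

        Properties hfileOnly = forNoChecksumFs(global);
        System.out.println("global:       " + global.getProperty(SKIP_CHECKSUM));
        System.out.println("noChecksumFs: " + hfileOnly.getProperty(SKIP_CHECKSUM));
    }
}
```

The point of copying instead of mutating is that the filesystem used for HLog reads never sees the flag, which is the behavior the comment asks for.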





[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-22 Thread LiuLei (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461321#comment-13461321
 ] 

LiuLei commented on HBASE-6868:
---

Hi all,

1. Local Read
When we set dfs.client.read.shortcircuit=true and 
dfs.client.read.shortcircuit.skip.checksum=false, and the verifyChecksum parameter 
is true in the BlockReaderLocal constructor, the DFSClient reads the meta file and 
verifies the checksum.

When we set dfs.client.read.shortcircuit=true and 
dfs.client.read.shortcircuit.skip.checksum=true, the DFSClient doesn't read the meta 
file and doesn't verify the checksum.

2. Remote Read
When we call DistributedFileSystem.setVerifyChecksum(false), the DFSClient 
doesn't verify the checksum.
When we call DistributedFileSystem.setVerifyChecksum(true), the DFSClient 
verifies the checksum.

The verifyChecksum property defaults to true.
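
The rules listed above can be condensed into a single predicate. This is only a sketch of the behavior as this comment describes it, not the real DFSClient code. The class ClientVerifySketch and the blockIsLocal parameter are hypothetical, and the result for a local read with skip.checksum=false and verifyChecksum=false is an assumption, since the comment only covers the verifyChecksum=true case.

```java
/** Hypothetical model of the read rules in this comment; not the real DFSClient. */
final class ClientVerifySketch {
    /**
     * @param shortCircuit   value of dfs.client.read.shortcircuit
     * @param skipChecksum   value of dfs.client.read.shortcircuit.skip.checksum
     * @param verifyChecksum value set through setVerifyChecksum (defaults to true)
     * @param blockIsLocal   whether the block is stored on the client's own node
     */
    static boolean clientVerifies(boolean shortCircuit, boolean skipChecksum,
                                  boolean verifyChecksum, boolean blockIsLocal) {
        if (shortCircuit && blockIsLocal) {
            // Local read: the meta file is read and verified only when skipping is off.
            // Assumption: verifyChecksum=false also disables verification here.
            return !skipChecksum && verifyChecksum;
        }
        // Remote read: only the verifyChecksum setting matters.
        return verifyChecksum;
    }

    public static void main(String[] args) {
        System.out.println("local, skip=false:    " + clientVerifies(true, false, true, true));
        System.out.println("local, skip=true:     " + clientVerifies(true, true, true, true));
        System.out.println("remote, verify=false: " + clientVerifies(true, true, false, false));
    }
}
```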



I think there is another problem with local reads: the BlockReaderLocal class uses the 
"static Map localDatanodeInfoMap" property to store 
the local block file path and local meta file path. When I stop the HDFS cluster, or 
kill the local DataNode and delete a file with the "./hadoop dfs -rm path" command, 
the RegionServer can still read the data from the local file. I think that may lead 
the RegionServer to read wrong data.
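
The concern above is a stale-cache problem: a static map of local paths that nothing invalidates keeps answering after the file is gone from the namespace. The sketch below reproduces only that pattern with an in-memory model. It is not the BlockReaderLocal implementation, and every name in it, including the example path, is made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of a static path cache that is never invalidated. */
final class StalePathCacheSketch {
    // Block IDs that the filesystem namespace still knows about.
    private static final Set<String> NAMESPACE = new HashSet<>();
    // Local block-file paths, filled on the first local read and never cleared.
    private static final Map<String, String> LOCAL_PATHS = new HashMap<>();

    /** First short-circuit read: the local path is looked up once and cached. */
    static void firstLocalRead(String blockId, String localPath) {
        NAMESPACE.add(blockId);
        LOCAL_PATHS.put(blockId, localPath);
    }

    /** Delete through DFS while the local DataNode is down: the cache is never told. */
    static void deleteViaDfs(String blockId) {
        NAMESPACE.remove(blockId);
    }

    static boolean existsInDfs(String blockId) {
        return NAMESPACE.contains(blockId);
    }

    /** Later short-circuit reads consult only the cached path. */
    static String cachedLocalPath(String blockId) {
        return LOCAL_PATHS.get(blockId);
    }

    public static void main(String[] args) {
        firstLocalRead("blk_1", "/data/current/blk_1"); // made-up path
        deleteViaDfs("blk_1");
        System.out.println("exists in DFS: " + existsInDfs("blk_1"));
        System.out.println("cached path:   " + cachedLocalPath("blk_1"));
    }
}
```

In this model the reader still resolves a path for a block the namespace no longer lists, which matches the symptom reported in the comment. Whether the real class behaves exactly this way is not established here.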





[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461319#comment-13461319
 ] 

Lars Hofhansl commented on HBASE-6868:
--

OK. If that is all true, then we only benefit if:
# dfs.client.read.shortcircuit = true
# dfs.client.read.shortcircuit.skip.checksum=true
# the block is local

In all other cases we do not reduce the number of IOs.
Can we just recommend then to enable the two settings above?
The problem is that we'll then lose checksumming for all locally read blocks...?
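
To make the narrowness of that benefit concrete, the three conditions above can be written as one predicate and evaluated over every combination of the flags. This is a sketch of the reasoning in this comment only. The class MetaReadSavingsSketch and its method names are hypothetical.

```java
/** Hypothetical model of the three conditions listed in this comment. */
final class MetaReadSavingsSketch {
    /** True only when short circuit is on, checksums are skipped, and the block is local. */
    static boolean avoidsMetaFileRead(boolean shortCircuit, boolean skipChecksum,
                                      boolean blockIsLocal) {
        return shortCircuit && skipChecksum && blockIsLocal;
    }

    /** Counts how many of the eight flag combinations avoid the extra meta-file read. */
    static int countBeneficialCombinations() {
        int count = 0;
        boolean[] values = {false, true};
        for (boolean shortCircuit : values) {
            for (boolean skipChecksum : values) {
                for (boolean blockIsLocal : values) {
                    if (avoidsMetaFileRead(shortCircuit, skipChecksum, blockIsLocal)) {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countBeneficialCombinations() + " of 8 combinations avoid the meta-file read");
    }
}
```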





[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-22 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461312#comment-13461312
 ] 

binlijin commented on HBASE-6868:
-

@Lars Hofhansl  
Sorry about that, 
{code}
 case (2) dfs.client.read.shortcircuit = true, 
dfs.client.read.shortcircuit.skip.checksum=false, short circuit read turned on.
 If the block is local, the DFSClient will read the file data directly (HRegionServer is 
a DFSClient).
 HFile : DFSClient will read block file and meta file. DFSClient will checksum 
the data, HRegionServer(HFile) will checksum the HFile data.  This is the 
double-checksumming. 
 HLog : DFSClient will read block file and meta file. DFSClient will checksum 
the data, HRegionServer will not checksum HLog data.

(2a) the block is not local.
HFile : DataNode will read block file and meta file. DFSClient will not 
checksum the data, HRegionServer(HFile) will checksum the HFile data.
HLog : DataNode will read block file and meta file. DFSClient will checksum the 
data, HRegionServer will not checksum HLog data.

(3a) the block is not local.
HFile : DataNode will read block file and meta file. DFSClient will not 
checksum the data, HRegionServer(HFile) will checksum the HFile data.
HLog : DataNode will read block file and meta file. DFSClient will checksum the 
data, HRegionServer will not checksum HLog data.

(4) dfs.client.read.shortcircuit = false, 
dfs.client.read.shortcircuit.skip.checksum=true
 The same as case(1)

{code}




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461299#comment-13461299
 ] 

Lars Hofhansl commented on HBASE-6868:
--

@Jerry: That seems like a better approach. It always bothers me when we're 
trying to solve HDFS problems in HBase. On the other hand, that change will 
never (or at least not until much later) be in an official HDFS release.

@binlijin:
Are you sure about case (2)? You're saying that even if 
dfs.client.read.shortcircuit.skip.checksum=false the DFSClient will still skip 
the checksumming?

Are there more cases:
(3a) the block is not local.
Both DFSClient and HRegionServer will calculate the checksum
(4) dfs.client.read.shortcircuit = false, 
dfs.client.read.shortcircuit.skip.checksum=true
Both DFSClient and HRegionServer will calculate the checksum

?

It seems we're mostly good here. The double checksumming for non-local blocks 
is not ideal, of course (but that's HDFS-3429).




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-22 Thread Jerry Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461293#comment-13461293
 ] 

Jerry Chen commented on HBASE-6868:
---

On 89-fb, we are depending on inline HDFS checksum to solve the checksum iop 
overhead. See https://issues.apache.org/jira/browse/HDFS-2699. Our HDFS 
progress can be seen here: https://github.com/facebook/hadoop-20/tree/develop. 
It is code complete (not committed to github yet) and is under production 
testing. 




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-22 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461290#comment-13461290
 ] 

binlijin commented on HBASE-6868:
-

[~lhofhansl]
I checked the current implementation: hbase.regionserver.checksum.verify is 
enabled by default, so when reading an HFile it uses the noChecksumFs in 
HFileSystem, and when reading an HLog it uses the fs in HFileSystem; they use 
different filesystems.
fs in HFileSystem  // filesystem object that has checksum verification turned 
on.
noChecksumFs in HFileSystem // filesystem object that has checksum verification 
turned off.
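
The split described above can be pictured as a small selector that picks one of the two filesystem handles by file type. The sketch below is a stand-alone model, not the HFileSystem class itself. The names FsSelectionSketch, FileKind and Fs are hypothetical, and the fallback to the checksummed filesystem when hbase.regionserver.checksum.verify is off is an assumption that the comment does not state.

```java
/** Hypothetical model of choosing between the two filesystem handles. */
final class FsSelectionSketch {
    enum FileKind { HFILE, HLOG }

    enum Fs { CHECKSUMMED_FS, NO_CHECKSUM_FS }

    /**
     * @param kind                what is being read
     * @param hbaseChecksumVerify value of hbase.regionserver.checksum.verify
     */
    static Fs choose(FileKind kind, boolean hbaseChecksumVerify) {
        // HFiles carry their own checksums, so HDFS-level verification can be skipped for them.
        if (kind == FileKind.HFILE && hbaseChecksumVerify) {
            return Fs.NO_CHECKSUM_FS;
        }
        // HLogs have no inline checksums and always need the HDFS-level check.
        // Assumption: HFiles also use this path when HBase-level verification is off.
        return Fs.CHECKSUMMED_FS;
    }

    public static void main(String[] args) {
        System.out.println("HFile -> " + choose(FileKind.HFILE, true));
        System.out.println("HLog  -> " + choose(FileKind.HLOG, true));
    }
}
```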

(1) dfs.client.read.shortcircuit = false, short circuit read turned off. 
The DataNode reads the file data and sends it to the DFSClient (HRegionServer is a DFSClient).
HFile : DataNode will read block file and meta file. DFSClient will not 
checksum the data, HRegionServer(HFile) will checksum the HFile data.
HLog : DataNode will read block file and meta file. DFSClient will checksum the 
data, HRegionServer will not checksum HLog data.

(2) dfs.client.read.shortcircuit = true, 
dfs.client.read.shortcircuit.skip.checksum=false, short circuit read turned on. 
If the block is local, the DFSClient will read the file data directly (HRegionServer is a 
DFSClient).
HFile : DFSClient will read block file and meta file. DFSClient will not 
checksum the data, HRegionServer(HFile) will checksum the HFile data.
HLog : DFSClient will read block file and meta file. DFSClient will checksum 
the data, HRegionServer will not checksum HLog data.

(3) dfs.client.read.shortcircuit = true, 
dfs.client.read.shortcircuit.skip.checksum=true, short circuit read turned on.
If the block is local, the DFSClient will read the file data directly (HRegionServer is a 
DFSClient).
HFile : DFSClient will read block file only. DFSClient will not checksum the 
data, HRegionServer(HFile) will checksum the HFile data.
HLog : DFSClient will read block file and meta file. DFSClient will checksum 
the data, HRegionServer will not checksum HLog data.

If I am wrong, please correct me.





[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461286#comment-13461286
 ] 

Lars Hofhansl commented on HBASE-6868:
--

So I am looking through the patch in HBASE-5074.

@LiuLei: Did you positively verify that we're not reading the checksums for the 
HLogs?

HBASE-5074 introduces HFileSystem, which has a getNoChecksumFs() method, as 
well as a getBackingFs() method. The backingFs is not checksummed. Looks like 
for the HLog the backingFs is used, which does not have checksums disabled.




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461213#comment-13461213
 ] 

Lars Hofhansl commented on HBASE-6868:
--

Taking a quick glance at the code, this is not actually just a simple switch. 
There's code at various places where we unconditionally create unchecksummed 
filesystems.

I wonder whether it is possible to revert the entire change and start again.




[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?

2012-09-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461209#comment-13461209
 ] 

Lars Hofhansl commented on HBASE-6868:
--

So the first task should be checking that turning it off actually undoes all of 
its effects.
I think we should hold 0.94.2 for this.


