[jira] [Commented] (HBASE-16292) Can't start a local instance

2016-07-27 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15396074#comment-15396074
 ] 

Lars Hofhansl commented on HBASE-16292:
---

Is this Mac OS only?

> Can't start a local instance
> 
>
> Key: HBASE-16292
> URL: https://issues.apache.org/jira/browse/HBASE-16292
> Project: HBase
>  Issue Type: Bug
>Reporter: Elliott Clark
>Priority: Blocker
>
> Currently master is completely broken.
> {code}
> ./bin/start-hbase.sh
> starting master, logging to 
> /Users/elliott/Code/Public/hbase/bin/../logs/hbase-elliott-master-elliott-mbp.out
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; 
> support was removed in 8.0
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; 
> support was removed in 8.0
> {code}
> {code}
> 2016-07-27 10:36:20,992 ERROR [main] regionserver.SecureBulkLoadManager: 
> Failed to create or set permission on staging directory 
> /user/elliott/hbase-staging
> ExitCodeException exitCode=1: chmod: /user/elliott/hbase-staging: No such 
> file or directory
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
>   at org.apache.hadoop.util.Shell.run(Shell.java:456)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
>   at org.apache.hadoop.util.Shell.execCommand(Shell.java:815)
>   at org.apache.hadoop.util.Shell.execCommand(Shell.java:798)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:728)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
>   at 
> org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager.start(SecureBulkLoadManager.java:124)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:624)
>   at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:403)
>   at 
> org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.<init>(HMasterCommandLine.java:307)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:140)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:221)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:156)
>   at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:226)
>   at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)
>   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2417)
> 2016-07-27 10:36:20,994 ERROR [main] master.HMasterCommandLine: Master exiting
> java.lang.RuntimeException: Failed construction of Master: class 
> org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMasterchmod: 
> /user/elliott/hbase-staging: No such file or directory
>   at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:145)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:221)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:156)
>   at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:226)
>   at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)
>   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2417)
> Caused by: java.lang.IllegalStateException: Failed to create or set 
> permission on staging directory /user/elliott/hbase-staging
>   at 
> org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager.start(SecureBulkLoadManager.java:141)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:624)
>   at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:403)
>   at 
> org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.<init>(HMasterCommandLine.java:307)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.refle

[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398048#comment-15398048
 ] 

Lars Hofhansl commented on HBASE-16296:
---

Which version of HBase is this against? I tried the two snippets, and the 
repro code (unchanged, with both caching and page filter set to 5) finishes 
immediately. (This is HBase 0.98.21-SNAPSHOT.)
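For reference, a minimal sketch of the kind of repro setup being discussed - 
assuming the attached snippets do roughly this; the table handle and sizes are 
placeholders, not taken from the attachments:
{code}
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.PageFilter;

// Hypothetical sketch: a reverse scan whose caching matches the PageFilter
// size (5). Assumes an already-open table handle named 'table'.
Scan scan = new Scan();
scan.setReversed(true);
scan.setCaching(5);                                 // scanner cache size
scan.setFilter(new FilterList(new PageFilter(5)));  // page filter size
try (ResultScanner rs = table.getScanner(scan)) {
  for (Result r : rs) {
    // drain the scanner
  }
}
{code}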


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398057#comment-15398057
 ] 

Lars Hofhansl commented on HBASE-16296:
---

I see no difference in speed whether I set caching to 4, 5, or 7.


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398086#comment-15398086
 ] 

Lars Hofhansl commented on HBASE-16296:
---

OK. I find it takes about 10ms, unless I set the caching to 5, in which case it 
takes about 150ms.
(I did run the data loader script and flushed the table.)


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Issue Comment Deleted] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-28 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-16296:
--
Comment: was deleted

(was: OK. I find it takes about 10ms, unless I set the caching to 5, in which 
case it takes about 150ms.
(I did run the data loader script and flushed the table.)
)

> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398097#comment-15398097
 ] 

Lars Hofhansl commented on HBASE-16296:
---

Just deleted that comment. :)
I added a loop around the scanning to measure this better. Now I find that, 
doing this all 1000 times, setting the caching to the page filter size is the 
*fastest*.

OK... So what I'm looking for is the extra 140ms? I thought I was looking for 
50 seconds or something.
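(The measurement loop is just a sketch like the following; 'table' and 'scan' 
are assumed to be set up as in the repro snippets:)
{code}
// Hedged sketch of the timing loop described above.
long start = System.currentTimeMillis();
for (int i = 0; i < 1000; i++) {
  try (ResultScanner rs = table.getScanner(scan)) {
    for (Result r : rs) {
      // drain
    }
  }
}
System.out.println("total ms: " + (System.currentTimeMillis() - start));
{code}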



> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398133#comment-15398133
 ] 

Lars Hofhansl commented on HBASE-16296:
---

Ahh... So this really just happens with a PageFilter wrapped in a FilterList.

> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398138#comment-15398138
 ] 

Lars Hofhansl commented on HBASE-16296:
---

This is terrible. Happens only with reverse scans?

> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398242#comment-15398242
 ] 

Lars Hofhansl commented on HBASE-16296:
---

Yeah, can easily repro this now. Will be traveling the next two days, though.

> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399246#comment-15399246
 ] 

Lars Hofhansl commented on HBASE-16296:
---

Thanks [~ram_krish], please feel free to work on a patch. I'll be in a plane 
for the next 13 hours :(


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399258#comment-15399258
 ] 

Lars Hofhansl commented on HBASE-16296:
---

While I'm still at the airport waiting for my flight... So is the FilterList 
logic incorrect - or at least unnecessary? Why would we filterRowKey only 
because a filter in the list has filterAllRemaining? If the filter wished to 
exhibit this behavior it could (and should) do so itself.
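For context, a hedged reconstruction of the MUST_PASS_ALL logic in question - 
illustrative only, not a verbatim copy of the HBase source:
{code}
// Hedged reconstruction of the 0.98-era FilterList.filterRowKey behavior
// being questioned (MUST_PASS_ALL case).
public boolean filterRowKey(byte[] buffer, int offset, int length) throws IOException {
  for (Filter filter : this.filters) {
    if (filter.filterAllRemaining()
        || filter.filterRowKey(buffer, offset, length)) {
      return true; // one exhausted filter makes the list reject the row key
    }
  }
  return false;
}
{code}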


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399320#comment-15399320
 ] 

Lars Hofhansl commented on HBASE-16296:
---

Even TestFilterList is confusing:
{code}
/* reach MAX_PAGES already, should filter any rows */
rowkey = Bytes.toBytes("yyy");
assertTrue(filterMPONE.filterRowKey(rowkey, 0, rowkey.length));
kv = new KeyValue(rowkey, rowkey, Bytes.toBytes(0), Bytes.toBytes(0));
assertFalse(Filter.ReturnCode.INCLUDE == filterMPONE.filterKeyValue(kv));
assertFalse(filterMPONE.filterRow());
{code}

Note how we apparently expect filterRowKey (a PageFilter in a FilterList) to 
return true, and filterKeyValue to not return INCLUDE, yet filterRow is still 
supposed to return false?! That seems wrong.


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Updated] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-29 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-16296:
--
Attachment: 16296-MAYBE.txt

Here's something to discuss. IMHO this is just as correct as the previous 
behavior, and it is more consistent.

I'd assume we'd need to have a lot of discussion around this.

> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: 16296-MAYBE.txt, generatedata-snippet.java, 
> repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401158#comment-15401158
 ] 

Lars Hofhansl commented on HBASE-16296:
---

Yeah... That's what I wanted to start the discussion on. :)

I think FilterList shouldn't add any logic that should either be in the wrapped 
filter(s) itself or in HRegion. But it *is* a change. Not for any of the 
Filters we have (I think) but folks could conceivably have written their own 
filters and relied on this.

[~ram_krish] is the scenario you outline in TestFilterList? (sorry can't check 
right now)

[~apurtell] any opinions on 0.98.
What about the other releases [~mantonov], [~busbey], [~enis], [~ndimiduk].
{{FilterList}} is {{public}} and {{stable}}.

> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: 16296-MAYBE.txt, generatedata-snippet.java, 
> repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401160#comment-15401160
 ] 

Lars Hofhansl commented on HBASE-16296:
---

{{SkipFilter}} does not implement {{filterAllRemaining}}, but 
{{WhileMatchFilter}} does. And {{WhileMatchFilter}} does not add this logic to 
{{filterRowKey(...)}}.


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: 16296-MAYBE.txt, generatedata-snippet.java, 
> repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Comment Edited] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401158#comment-15401158
 ] 

Lars Hofhansl edited comment on HBASE-16296 at 7/31/16 2:17 PM:


Yeah... That's what I wanted to start the discussion on. :)

I think FilterList shouldn't add any logic that should either be in the wrapped 
filter(s) itself or in HRegion. But it *is* a change. Not for any of the 
Filters we have (I think) but folks could conceivably have written their own 
filters and relied on this.

[~ram_krish] is the scenario you outline in TestFilterList? (sorry can't check 
right now)

[~apurtell] any opinions on 0.98?
What about the other releases [~mantonov], [~busbey], [~enis], [~ndimiduk]?
{{FilterList}} is {{public}} and {{stable}}.


was (Author: lhofhansl):
Yeah... That's what I wanted to start the discussion on. :)

I think FilterList shouldn't add any logic that should either be in the wrapped 
filter(s) itself or in HRegion. But it *is* a change. Not for any of the 
Filters we have (I think) but folks could conceivably have written their own 
filters and relied on this.

[~ram_krish] is the scenario you outline in TestFilterList? (sorry can't check 
right now)

[~apurtell] any opinions on 0.98.
What about the other releases [~mantonov], [~busbey], [~enis], [~ndimiduk].
{{FilterList}} is {{public}} and {{stable}}.

> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: 16296-MAYBE.txt, generatedata-snippet.java, 
> repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401162#comment-15401162
 ] 

Lars Hofhansl commented on HBASE-16296:
---

Actually {{WhileMatchFilter}} is used in {{TestFilterList}} but {{SkipFilter}} 
is not. But since {{SkipFilter}} does not implement {{filterAllRemaining}} (in 
0.98) we should be good.

> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: 16296-MAYBE.txt, generatedata-snippet.java, 
> repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401420#comment-15401420
 ] 

Lars Hofhansl commented on HBASE-16296:
---

One would have to analyze the code to see whether a different end-to-end 
behavior would even be possible.

In these scenarios the Cells would be filtered later in the SQM anyway, so while 
FilterList might return different values, the actual scanning behavior might not 
actually change.


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: 16296-MAYBE.txt, generatedata-snippet.java, 
> repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Updated] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-31 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-16296:
--
Attachment: 16296-MAYBE-v2.txt

What about -v2?

It simply breaks early in RegionScannerImpl if the Filter would 
filterAllRemaining anyway.

Is that correct in all cases? Any scenarios where this would not work?
I confirmed that it fixes the Phoenix performance issue we've seen there.
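Roughly, the early-out looks like this sketch - the actual change is in the 
attached 16296-MAYBE-v2.txt, this is only an illustration:
{code}
// Hedged sketch, inside the RegionScannerImpl next-row loop:
if (this.filter != null && this.filter.filterAllRemaining()) {
  // Nothing the filter would still accept - stop scanning instead of
  // seeking and reading further Cells.
  return false; // no more rows
}
{code}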


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: 16296-MAYBE-v2.txt, 16296-MAYBE.txt, 
> generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-07-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401427#comment-15401427
 ] 

Lars Hofhansl commented on HBASE-16296:
---

I will still maintain that the FilterList behavior is inconsistent and should 
be fixed as well. Maybe for 1.3++

> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: 16296-MAYBE-v2.txt, 16296-MAYBE.txt, 
> generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Commented] (HBASE-16296) Reverse scan performance degrades when scanner cache size matches page filter size

2016-08-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402205#comment-15402205
 ] 

Lars Hofhansl commented on HBASE-16296:
---

The key point of my message is that {{new FilterList(new XYZFilter())}} should 
behave identically to just {{new XYZFilter()}}. Right now it does not, because 
FilterList has this weird logic in filterRowKey - and perhaps other places. In 
my opinion that is a bug.

I do not really care that much how we fix it, as long as it gets fixed. I made 
two proposals now and verified that either of them fixes the issue. Ted's 
suggestion seems fine as long as it fixes the perf problem.
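In test form the invariant would be something like this hedged sketch 
(PageFilter stands in for the placeholder XYZFilter; assume the enclosing test 
method declares throws IOException):
{code}
// Sketch of the invariant stated above: wrapping a single filter in a
// FilterList should not change any of its decisions.
Filter bare = new PageFilter(5);
Filter wrapped = new FilterList(new PageFilter(5));
byte[] row = Bytes.toBytes("row");
// Every hook should agree between the two, e.g.:
assert bare.filterRowKey(row, 0, row.length)
    == wrapped.filterRowKey(row, 0, row.length);
assert bare.filterAllRemaining() == wrapped.filterAllRemaining();
{code}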


> Reverse scan performance degrades when scanner cache size matches page filter 
> size
> --
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>Reporter: James Taylor
> Attachments: 16296-MAYBE-v2.txt, 16296-MAYBE.txt, PAgefilter.patch, 
> generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Updated] (HBASE-16296) Reverse scan performance degrades when using filter lists

2016-08-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-16296:
--
Attachment: 16296-ted-0.98.txt

Here's a 0.98 version of [~yuzhih...@gmail.com]'s fix.

> Reverse scan performance degrades when using filter lists
> -
>
> Key: HBASE-16296
> URL: https://issues.apache.org/jira/browse/HBASE-16296
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Reporter: James Taylor
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: 16296-MAYBE-v2.txt, 16296-MAYBE.txt, 16296-ted-0.98.txt, 
> HBASE-16296-0.98.patch, HBASE-16296.patch, PAgefilter.patch, 
> generatedata-snippet.java, repro-snippet.java
>
>
> When a reverse scan is done, the server seems to not know it's done when the 
> scanner cache size matches the number of rows in a PageFilter. See 
> PHOENIX-3121 for how this manifests itself. We have a standalone, pure HBase 
> API reproducer too that I'll attach (courtesy of [~churromorales] and 
> [~mujtabachohan]).





[jira] [Updated] (HBASE-16321) Ensure findbugs jsr305 jar isn't present

2016-08-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-16321:
--
Attachment: 16321-amend.txt

Any objections to this amendment?

Otherwise this breaks the 0.98 build against Hadoop 2.7.2:
{code}
> mvn -DskipTests -Dhadoop-two.version=2.7.2 install package assembly:single
...
[WARNING] Rule 0: org.apache.maven.plugins.enforcer.BannedDependencies failed 
with message:
We don't allow the JSR305 jar from the Findbugs project, see HBASE-16321.
Found Banned Dependency: com.google.code.findbugs:jsr305:jar:3.0.0
Use 'mvn dependency:tree' to locate the source of the banned dependencies.
{code}


> Ensure findbugs jsr305 jar isn't present
> 
>
> Key: HBASE-16321
> URL: https://issues.apache.org/jira/browse/HBASE-16321
> Project: HBase
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: 16321-amend.txt, HBASE-16321.1.patch, HBASE-16321.2.patch
>
>
> we should be using
> {code}
> <dependency>
>   <groupId>com.github.stephenc.findbugs</groupId>
>   <artifactId>findbugs-annotations</artifactId>
>   <version>${findbugs-annotations}</version>
>   <scope>compile</scope>
> </dependency>
> {code}
>  to ensure we don't have a prohibited dependency, but it looks like we're 
> still bringing in
> {code}
> <dependency>
>   <groupId>com.google.code.findbugs</groupId>
>   <artifactId>jsr305</artifactId>
>   <version>${jsr305.version}</version>
> </dependency>
> {code}
> remove the findbugs version (even though the maven central pom claims the 
> license is ALv2, that doesn't line up with the referenced project sites).





[jira] [Commented] (HBASE-16321) Ensure findbugs jsr305 jar isn't present

2016-08-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410772#comment-15410772
 ] 

Lars Hofhansl commented on HBASE-16321:
---

Cool. Committed to 0.98.

> Ensure findbugs jsr305 jar isn't present
> 
>
> Key: HBASE-16321
> URL: https://issues.apache.org/jira/browse/HBASE-16321
> Project: HBase
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: 16321-amend.txt, HBASE-16321.1.patch, HBASE-16321.2.patch
>
>
> we should be using
> {code}
> <dependency>
>   <groupId>com.github.stephenc.findbugs</groupId>
>   <artifactId>findbugs-annotations</artifactId>
>   <version>${findbugs-annotations}</version>
>   <scope>compile</scope>
> </dependency>
> {code}
>  to ensure we don't have a prohibited dependency, but it looks like we're 
> still bringing in
> {code}
> <dependency>
>   <groupId>com.google.code.findbugs</groupId>
>   <artifactId>jsr305</artifactId>
>   <version>${jsr305.version}</version>
> </dependency>
> {code}
> remove the findbugs version (even though the maven central pom claims the 
> license is ALv2, that doesn't line up with the referenced project sites).





[jira] [Commented] (HBASE-13094) Consider Filters that are evaluated before deletes and see delete markers

2016-08-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15423087#comment-15423087
 ] 

Lars Hofhansl commented on HBASE-13094:
---

I'd like to warm this up again.
There are several ways to do this:
# Add a special Filter that can wrap any other filter. This filter would be 
evaluated before the delete markers are checked. It needs to be the only 
Filter used.
# A variation of #1 that would allow adding this wrapper to a FilterList, plus 
some logic to evaluate the right parts before or after delete markers are 
handled.
# Add a new API to Filter, in order to indicate whether the filter is to be 
evaluated before or after delete markers.
# Add a new API to Scan and Get to allow adding a 2nd filter that is evaluated 
before deletes.

#1 is the simplest and can be done 100% backwards compatibly. #4 seems 
cleanest. Comments?

[~stack], [~apurtell]

> Consider Filters that are evaluated before deletes and see delete markers
> -
>
> Key: HBASE-13094
> URL: https://issues.apache.org/jira/browse/HBASE-13094
> Project: HBase
>  Issue Type: Brainstorming
>  Components: regionserver, Scanners
>Reporter: Lars Hofhansl
>
> That would be good for full control filtering of all cells, such as needed 
> for some transaction implementations.
> [~ghelmling]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13094) Consider Filters that are evaluated before deletes and see delete markers

2016-08-16 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-13094:
--
Attachment: 13094-0.98.txt

Here's a pretty trivial patch doing #1.

It defines a PreDeleteFilter class that can wrap any other filter. When SQM 
detects this class, it will evaluate the wrapped filter's filterKeyValue 
method before delete markers and columns are filtered.

It also creates a convenience BaseDelegateFilter class that can be used to 
implement Filter delegates.

Warning: COMPLETELY UNTESTED. Just want to park it here. Probably a bit hacky.
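The shape is roughly the following hedged sketch; the real code and names are 
in the attached 13094-0.98.txt:
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterBase;

// Forwards every hook to the wrapped filter (only one hook shown here).
class BaseDelegateFilter extends FilterBase {
  protected final Filter delegate;
  public BaseDelegateFilter(Filter delegate) { this.delegate = delegate; }
  @Override
  public ReturnCode filterKeyValue(Cell c) throws IOException {
    return delegate.filterKeyValue(c);
  }
}

// Marker type: when SQM sees this wrapper it evaluates the wrapped filter's
// filterKeyValue before delete markers and column selection are applied.
class PreDeleteFilter extends BaseDelegateFilter {
  public PreDeleteFilter(Filter delegate) { super(delegate); }
}
{code}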


> Consider Filters that are evaluated before deletes and see delete markers
> -
>
> Key: HBASE-13094
> URL: https://issues.apache.org/jira/browse/HBASE-13094
> Project: HBase
>  Issue Type: Brainstorming
>  Components: regionserver, Scanners
>Reporter: Lars Hofhansl
> Attachments: 13094-0.98.txt
>
>
> That would be good for full control filtering of all cells, such as needed 
> for some transaction implementations.
> [~ghelmling]





[jira] [Commented] (HBASE-14921) Memory optimizations

2016-08-17 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424881#comment-15424881
 ] 

Lars Hofhansl commented on HBASE-14921:
---

MSLAB forces another copy of the data backing each Cell coming in, consuming 
memory bandwidth. With G1GC that does not appear to be necessary, since G1 
manages small'ish memory regions anyway.

I will repeat what I always say: Let's not try to be smarter than the garbage 
collector. Chances are we're not :)


> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, 
> HBASE-14921-V04-CA.patch, HBASE-14921-V05-CAO.patch, 
> HBASE-14921-V06-CAO.patch, HBASE-14921-V08-CAO.patch, 
> HBASE-14921-V09-CAO.patch, HBASE-14921-V10-CAO.patch, 
> HBASE-14921-V11-CAO.patch, HBASE-14921-V11-CAO.patch, 
> InitialCellArrayMapEvaluation.pdf, 
> IntroductiontoNewFlatandCompactMemStore.pdf, MemStoreSizes.pdf, 
> MemstoreItrCountissue.patch, NewCompactingMemStoreFlow.pptx
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations





[jira] [Commented] (HBASE-13094) Consider Filters that are evaluated before deletes and see delete markers

2016-08-17 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424894#comment-15424894
 ] 

Lars Hofhansl commented on HBASE-13094:
---

PreDeleteFilter can wrap a FilterList too. But I agree, too much magic.
A wrinkle with #4 that occurred to me: it would then be possible to add two 
filters on a scan/get, and the logic to combine their outcomes would be ... 
interesting. (One filter, which may be a FilterList, would run before deletes, 
and another, also perhaps a FilterList, would run after. When the 1st passes and 
the 2nd doesn't, combining these outcomes would be tricky and hard to understand 
and reason about.)

> Consider Filters that are evaluated before deletes and see delete markers
> -
>
> Key: HBASE-13094
> URL: https://issues.apache.org/jira/browse/HBASE-13094
> Project: HBase
>  Issue Type: Brainstorming
>  Components: regionserver, Scanners
>Reporter: Lars Hofhansl
> Attachments: 13094-0.98.txt
>
>
> That would be good for full control filtering of all cells, such as needed 
> for some transaction implementations.
> [~ghelmling]





[jira] [Assigned] (HBASE-13094) Consider Filters that are evaluated before deletes and see delete markers

2016-08-17 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reassigned HBASE-13094:
-

Assignee: Lars Hofhansl

> Consider Filters that are evaluated before deletes and see delete markers
> -
>
> Key: HBASE-13094
> URL: https://issues.apache.org/jira/browse/HBASE-13094
> Project: HBase
>  Issue Type: Brainstorming
>  Components: regionserver, Scanners
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 13094-0.98.txt
>
>
> That would be good for full control filtering of all cells, such as needed 
> for some transaction implementations.
> [~ghelmling]





[jira] [Commented] (HBASE-13094) Consider Filters that are evaluated before deletes and see delete markers

2016-08-17 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425313#comment-15425313
 ] 

Lars Hofhansl commented on HBASE-13094:
---

Yep. Or we come up with an API that enforces this structurally.

For example, we could adapt #4 to add an API to Get/Scan that instead only 
controls whether the Filter.filterKeyValue (set as before with setFilter) is 
evaluated pre or post deletes. The effect would be the same as the patch I 
have, but with a clean, no-magic API. (Although that almost leads me back to 
the approach of indicating the "pre-postness" in the Filter itself.) Or one 
could think about option #3 above.


> Consider Filters that are evaluated before deletes and see delete markers
> -
>
> Key: HBASE-13094
> URL: https://issues.apache.org/jira/browse/HBASE-13094
> Project: HBase
>  Issue Type: Brainstorming
>  Components: regionserver, Scanners
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 13094-0.98.txt
>
>
> That would be good for full control filtering of all cells, such as needed 
> for some transaction implementations.
> [~ghelmling]





[jira] [Commented] (HBASE-13094) Consider Filters that are evaluated before deletes and see delete markers

2016-08-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431374#comment-15431374
 ] 

Lars Hofhansl commented on HBASE-13094:
---

Another complication with option #4 and two filters: all hooks other than 
filterKeyValue would be ignored (or we'd need complicated logic to combine the 
two filters everywhere, for all hooks - with no added value over just a single 
filter). We could document that only filterKeyValue is used, of course. The 
more I think about this, the less I like it.

I think we simply need a way to decide whether the _one_ filter passed is 
executed before or after the delete/column-selection logic in SQM. It will be 
too confusing otherwise. That can be done with a new flag on Scan/Get or with 
the hack I did in the patch here.

[~giacomotaylor], would the 2nd scheme I describe (only allowing a single 
Filter, which could be a FilterList) suffice for the Phoenix transaction use 
case?

> Consider Filters that are evaluated before deletes and see delete markers
> -
>
> Key: HBASE-13094
> URL: https://issues.apache.org/jira/browse/HBASE-13094
> Project: HBase
>  Issue Type: Brainstorming
>  Components: regionserver, Scanners
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 13094-0.98.txt
>
>
> That would be good for full control filtering of all cells, such as needed 
> for some transaction implementations.
> [~ghelmling]





[jira] [Commented] (HBASE-16499) slow replication for small HBase clusters

2016-08-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439368#comment-15439368
 ] 

Lars Hofhansl commented on HBASE-16499:
---

When I added the parallelization logic I envisioned that we would not want more 
than one batch per remote region server, hence the limit in the logic.

Maybe that assumption wasn't right either. More and smaller batches will 
increase usage of the available bandwidth (since HBase RPC is not streaming, 
it's hard to make use of the full bandwidth * latency product).

Since there are other checks in there - for example not to create batches of 
fewer than 100 items, or not more than we have threads available - I'm in favor 
of increasing the default to something like 25%. So we'll have one peer picked 
up to a size of 7 nodes, after which we'll pick 2.
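Concretely, with the numSinks formula quoted in the description below, the two 
ratios differ like this on a small cluster:
{code}
// Pure arithmetic, mirroring the numSinks computation quoted below.
int slaves = 10;                                   // small cluster
int sinksAt10pct = (int) Math.ceil(slaves * 0.10); // = 1 sink  -> n capped at 1
int sinksAt25pct = (int) Math.ceil(slaves * 0.25); // = 3 sinks -> up to 3 parallel batches
{code}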


> slow replication for small HBase clusters
> -
>
> Key: HBASE-16499
> URL: https://issues.apache.org/jira/browse/HBASE-16499
> Project: HBase
>  Issue Type: Bug
>Reporter: Vikas Vishwakarma
>Assignee: Vikas Vishwakarma
> Fix For: 0.98.20
>
>
> For small clusters 10-20 nodes we recently observed that replication is 
> progressing very slowly when we do bulk writes and there is lot of lag 
> accumulation on AgeOfLastShipped / SizeOfLogQueue. From the logs we observed 
> that the number of threads used for shipping wal edits in parallel comes from 
> the following equation in HBaseInterClusterReplicationEndpoint
> int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
>   replicationSinkMgr.getSinks().size());
> ...
> for (int i=0; i<n; i++) {
>   entryLists.add(new ArrayList<Entry>(entries.size()/n+1)); <-- batch size
> }
> ...
> for (int i=0; i<n; i++) {
>   ...
>   // RuntimeExceptions encountered here bubble up and are handled 
>   // in ReplicationSource
>   pool.submit(createReplicator(entryLists.get(i), i)); <-- concurrency
>   futures++;
> }
> maxThreads is fixed & configurable and since we are taking min of the three 
> values n gets decided based replicationSinkMgr.getSinks().size() when we have 
> enough edits to replicate
> replicationSinkMgr.getSinks().size() is decided based on 
> int numSinks = (int) Math.ceil(slaveAddresses.size() * ratio);
> where ratio is this.ratio = conf.getFloat("replication.source.ratio", 
> DEFAULT_REPLICATION_SOURCE_RATIO);
> Currently DEFAULT_REPLICATION_SOURCE_RATIO is set to 10% so for small 
> clusters of size 10-20 RegionServers  the value we get for numSinks and hence 
> n is very small like 1 or 2. This substantially reduces the pool concurrency 
> used for shipping wal edits in parallel effectively slowing down replication 
> for small clusters and causing lot of lag accumulation in AgeOfLastShipped. 
> Sometimes it takes tens of hours to clear off the entire replication queue 
> even after the client has finished writing on the source side. 
> We are running tests by varying replication.source.ratio and have seen 
> multi-fold improvement in total replication time (will update the results 
> here). I wanted to propose here that we should increase the default value for 
> replication.source.ratio also so that we have sufficient concurrency even for 
> small clusters. We figured it out after lot of iterations and debugging so 
> probably slightly higher default will save the trouble. 





[jira] [Updated] (HBASE-14921) Memory optimizations

2016-08-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-14921:
--
Fix Version/s: 2.0.0

> Memory optimizations
> 
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Anastasia Braginsky
> Fix For: 2.0.0
>
> Attachments: CellBlocksSegmentInMemStore.pdf, 
> CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, 
> HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, 
> HBASE-14921-V04-CA.patch, HBASE-14921-V05-CAO.patch, 
> HBASE-14921-V06-CAO.patch, HBASE-14921-V08-CAO.patch, 
> HBASE-14921-V09-CAO.patch, HBASE-14921-V10-CAO.patch, 
> HBASE-14921-V11-CAO.patch, HBASE-14921-V11-CAO.patch, 
> HBASE-14921-V12-CAO.patch, InitialCellArrayMapEvaluation.pdf, 
> IntroductiontoNewFlatandCompactMemStore.pdf, MemStoreSizes.pdf, 
> MemstoreItrCountissue.patch, NewCompactingMemStoreFlow.pptx
>
>
> Memory optimizations including compressed format representation and offheap 
> allocations





[jira] [Commented] (HBASE-16503) Delay in preScannerNext > rpcTimeout causes scan to fail

2016-08-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440606#comment-15440606
 ] 

Lars Hofhansl commented on HBASE-16503:
---

This is client side of the scanning timing out, right?

> Delay in preScannerNext > rpcTimeout causes scan to fail
> 
>
> Key: HBASE-16503
> URL: https://issues.apache.org/jira/browse/HBASE-16503
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.21
>Reporter: James Taylor
>  Labels: SFDC, phoenix
>
> In HBase 0.98, a delay in the preScannerNext coprocessor hook of greater than 
> the RPC timeout causes the scan to fail. This can happen in Phoenix when a 
> large aggregate is performed without statistics enabled, in which case the 
> work is done during preScannerNext. The same test works fine in HBase 1.1.
> {code}
> package org.apache.hadoop.hbase.client.coprocessor;
> import static org.junit.Assert.assertNull;
> import java.io.IOException;
> import java.util.List;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.HBaseTestingUtility;
> import org.apache.hadoop.hbase.HTableDescriptor;
> import org.apache.hadoop.hbase.client.HConnection;
> import org.apache.hadoop.hbase.client.HConnectionManager;
> import org.apache.hadoop.hbase.client.HTableInterface;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.ResultScanner;
> import org.apache.hadoop.hbase.client.Scan;
> import org.apache.hadoop.hbase.coprocessor.ObserverContext;
> import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
> import org.apache.hadoop.hbase.coprocessor.SimpleRegionObserver;
> import org.apache.hadoop.hbase.regionserver.InternalScanner;
> import org.junit.Test;
> public class TestDelayPreScannerNext {
> private static final long RPC_TIMEOUT = 2000;
> private static final String TABLE_NAME = "foo";
> 
> @Test
> public void testPreScannerNextDelayCausesScanToFail() throws Exception {
> Configuration conf = HBaseConfiguration.create();
> conf.setLong("hbase.rpc.timeout", RPC_TIMEOUT);
> HBaseTestingUtility utility = new HBaseTestingUtility(conf);
> utility.startMiniCluster();
> HTableDescriptor desc = utility.createTableDescriptor(TABLE_NAME);
> 
> desc.addCoprocessor(DelayPreScannerNextRegionObserver.class.getName());
> utility.createTable(desc, new byte[][]{});
> HConnection connection = HConnectionManager.createConnection(conf);
> HTableInterface table = connection.getTable(TABLE_NAME);
> ResultScanner scanner = table.getScanner(new Scan());
> assertNull(scanner.next());
> }
> 
> public static class DelayPreScannerNextRegionObserver extends 
> SimpleRegionObserver {
> @Override
> public boolean preScannerNext(final ObserverContext<RegionCoprocessorEnvironment> c,
> final InternalScanner s, final List<Result> results,
> final int limit, final boolean hasMore) throws IOException {
> try {
> Thread.sleep(RPC_TIMEOUT * 2);
> } catch (InterruptedException e) {
> Thread.currentThread().interrupt();
> throw new IOException(e);
> }
> return super.preScannerNext(c, s, results, limit, hasMore);
> }
> }
> }
> {code}





[jira] [Commented] (HBASE-16503) Delay in preScannerNext > rpcTimeout causes scan to fail

2016-08-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440633#comment-15440633
 ] 

Lars Hofhansl commented on HBASE-16503:
---

1.1 and later have HBASE-13090. In 0.98 there's no way other than increasing 
the rpc timeout as far as I know.

> Delay in preScannerNext > rpcTimeout causes scan to fail
> 
>
> Key: HBASE-16503
> URL: https://issues.apache.org/jira/browse/HBASE-16503
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.21
>Reporter: James Taylor
>  Labels: SFDC, phoenix
>
> In HBase 0.98, a delay in the preScannerNext coprocessor hook of greater than 
> the RPC timeout causes the scan to fail. This can happen in Phoenix when a 
> large aggregate is performed without statistics enabled, in which case the 
> work is done during preScannerNext. The same test works fine in HBase 1.1.
> {code}
> package org.apache.hadoop.hbase.client.coprocessor;
> import static org.junit.Assert.assertNull;
> import java.io.IOException;
> import java.util.List;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.HBaseTestingUtility;
> import org.apache.hadoop.hbase.HTableDescriptor;
> import org.apache.hadoop.hbase.client.HConnection;
> import org.apache.hadoop.hbase.client.HConnectionManager;
> import org.apache.hadoop.hbase.client.HTableInterface;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.ResultScanner;
> import org.apache.hadoop.hbase.client.Scan;
> import org.apache.hadoop.hbase.coprocessor.ObserverContext;
> import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
> import org.apache.hadoop.hbase.coprocessor.SimpleRegionObserver;
> import org.apache.hadoop.hbase.regionserver.InternalScanner;
> import org.junit.Test;
> public class TestDelayPreScannerNext {
> private static final long RPC_TIMEOUT = 2000;
> private static final String TABLE_NAME = "foo";
> 
> @Test
> public void testPreScannerNextDelayCausesScanToFail() throws Exception {
> Configuration conf = HBaseConfiguration.create();
> conf.setLong("hbase.rpc.timeout", RPC_TIMEOUT);
> HBaseTestingUtility utility = new HBaseTestingUtility(conf);
> utility.startMiniCluster();
> HTableDescriptor desc = utility.createTableDescriptor(TABLE_NAME);
> 
> desc.addCoprocessor(DelayPreScannerNextRegionObserver.class.getName());
> utility.createTable(desc, new byte[][]{});
> HConnection connection = HConnectionManager.createConnection(conf);
> HTableInterface table = connection.getTable(TABLE_NAME);
> ResultScanner scanner = table.getScanner(new Scan());
> assertNull(scanner.next());
> }
> 
> public static class DelayPreScannerNextRegionObserver extends 
> SimpleRegionObserver {
> @Override
> public boolean preScannerNext(final ObserverContext<RegionCoprocessorEnvironment> c,
> final InternalScanner s, final List<Result> results,
> final int limit, final boolean hasMore) throws IOException {
> try {
> Thread.sleep(RPC_TIMEOUT * 2);
> } catch (InterruptedException e) {
> Thread.currentThread().interrupt();
> throw new IOException(e);
> }
> return super.preScannerNext(c, s, results, limit, hasMore);
> }
> }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2016-03-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181406#comment-15181406
 ] 

Lars Hofhansl commented on HBASE-13082:
---

So I found a deceptively simple way of (1) avoiding the locks in every method, 
(2) avoiding resetting the scanner stack, (3) avoiding an extra cleanup chore:
# At StoreScanner creation time create a CountDownLatch initialized with 1.
# When a StoreScanner is closed, call countDown on the latch.
# When the compaction calls updateReaders in the StoreScanner, it will block 
until the scanner is done.

So we have an easy way to ensure that no compactions can finish while a scanner 
referring to the store files is running. That also means that we no 
longer need to reset the scanner stack (a compaction only happens after the 
scanner is closed), and lastly the compactions will naturally stack up behind 
the scanners and eventually run.
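Roughly, in code (a minimal sketch of the idea only - the class and method
bodies here are illustrative, not the actual patch):
{code}
import java.util.concurrent.CountDownLatch;

// Minimal sketch of the latch idea; illustrative only, not the real patch.
class LatchedStoreScanner {
  // created together with the scanner, initialized to 1
  private final CountDownLatch closeLatch = new CountDownLatch(1);

  public void close() {
    // ... release the scanner's resources as usual ...
    closeLatch.countDown(); // let any waiting compaction proceed
  }

  public void updateReaders() throws InterruptedException {
    closeLatch.await(); // a compaction/flush blocks here until close()
  }
}
{code}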

The risk is that a scanner is not properly closed and we never count down the 
latch, hence blocking compactions forever. I've not seen this happen in my 
tests; also, every scanner is guarded by a lease, so eventually we will close 
it. The other risk is that we simply might not be able to get any compactions 
through in a busy system.

Comments? Happy to attach a patch (or open another jira for this)

[~stack], [~ram_krish], [~anoop.hbase], [~apurtell]


> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, Scanners
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, 
> HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, 
> HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, 
> HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, 
> HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, 
> HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, 
> LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * Possible starving of flushes and compactions with heavy read load: 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2016-03-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181406#comment-15181406
 ] 

Lars Hofhansl edited comment on HBASE-13082 at 3/5/16 1:35 AM:
---

So I found a deceptively simple way of (1) avoiding the locks in every method, 
(2) avoiding resetting the scanner stack, (3) avoiding an extra cleanup chore:
# At StoreScanner creation time create a CountDownLatch initialized with 1.
# When a StoreScanner is closed, call countDown() on the latch.
# When the compaction calls updateReaders on the StoreScanner, it calls await() 
on the latch and will block until the scanner is done.

So we have an easy way to ensure that no compactions can finish while a scanner 
referring to the store files is running. That also means that we no 
longer need to reset the scanner stack (a compaction only happens after the 
scanner is closed), and lastly the compactions will naturally stack up behind 
the scanners and eventually run.

The risk is that a scanner is not properly closed and we never count down the 
latch, hence blocking compactions forever. I've not seen this happen in my 
tests; also, every scanner is guarded by a lease, so eventually we will close 
it. The other risk is that we simply might not be able to get any compactions 
through in a busy system.

Comments? Happy to attach a patch (or open another jira for this)

[~stack], [~ram_krish], [~anoop.hbase], [~apurtell]



was (Author: lhofhansl):
So I found a deceptively simple way of (1) avoiding the locks in every method, 
(2) avoiding resetting the scanner stack, (3) avoiding an extra cleanup chore:
# At StoreScanner creation time create a CountDownLatch initialized with 1.
# When a StoreScanner is closed, call countDown on the latch.
# When the compaction calls updateReaders in the StoreScanner, it will block 
until the scanner is done.

So we have an easy way to ensure that no compactions can finish while a scanner 
referring to the store files is running. That also means that we no 
longer need to reset the scanner stack (a compaction only happens after the 
scanner is closed), and lastly the compactions will naturally stack up behind 
the scanners and eventually run.

The risk is that a scanner is not properly closed and we never count down the 
latch, hence blocking compactions forever. I've not seen this happen in my 
tests; also, every scanner is guarded by a lease, so eventually we will close 
it. The other risk is that we simply might not be able to get any compactions 
through in a busy system.

Comments? Happy to attach a patch (or open another jira for this)

[~stack], [~ram_krish], [~anoop.hbase], [~apurtell]


> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, Scanners
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, 
> HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, 
> HBASE-13082_15.patch, HBASE-13082_16.patch, HBASE-13082_17.patch, 
> HBASE-13082_18.patch, HBASE-13082_19.patch, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, 
> HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, 
> HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, 
> LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * Possible starving of flushes and compactions with heavy read load: 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2016-03-04 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-13082:
--
Attachment: CountDownLatch-0.98.txt

Lemme attach the patch anyway - just so it's parked somewhere. It does what's 
described above.
Note that updateReaders is now only used to block compactions from finishing, 
not to do any work, so we could rename it.

This should actually be better than that patch even for performance, since 
we're avoiding a volatile reference and hence a memory fence.

But as stated, it is riskier if StoreScanners are not closed correctly.

> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, Scanners
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, 
> HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, 
> HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, 
> HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, 
> HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, 
> HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, 
> HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, 
> LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * Possible starving of flushes and compactions with heavy read load: 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2016-03-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181607#comment-15181607
 ] 

Lars Hofhansl commented on HBASE-13082:
---

[~ramkrishna.s.vasude...@gmail.com] [~anoop.hbase], yep, that's what I 
suggested as a downside ("risks") above.
Everything that calls updateReaders will have to wait behind active 
StoreScanners. Instead of a lot of huh hah, compactions are delayed until they 
can safely finish.
But those would only be the StoreScanners open at the time the compaction/flush 
tries to finish. So actually my point about delaying compactions potentially 
forever is not true. Instead, as soon as the old set of StoreScanners is done, 
the compaction/flush can proceed and the new scanners would only see the new 
files.

The flush part is concerning, though. Thanks for pointing that out.

That was already the case before in other situations (like a Scanner that only 
encounters deleted Cells or has a Filter that filters most Cells, etc.). But 
that would not be the case in trunk anymore - just that there we'd disallow the 
file to be removed, which is better. There is currently still the checkFlushed 
call, which needs a volatile and hence causes a memory barrier.

I'll do some tests. This wasn't meant as an actual patch to be committed, I 
just wanted to get some feedback. I'm always a fan of the simplest code 
possible.


> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, Scanners
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, CountDownLatch-0.98.txt, HBASE-13082.pdf, 
> HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, 
> HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_16.patch, 
> HBASE-13082_17.patch, HBASE-13082_18.patch, HBASE-13082_19.patch, 
> HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, 
> HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, 
> HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, 
> LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * Possible starving of flushes and compactions with heavy read load: 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182252#comment-15182252
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Lemme have a look. Optimize might aggravate the problem, but it should not be 
the cause. If there's a seek to the next row and optimize decides to skip 
instead of seek, that should be the better decision (less overall work to do). 
Does Get have any special logic around this?
In any case, I'll take a look today.

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch, 15392v3.wip.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.hbase.reg

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182324#comment-15182324
 ] 

Lars Hofhansl commented on HBASE-15392:
---

bq. It optimizes for scans but F's random reads (smile). Discrimination!

I still do not follow. Optimize relies on the ColumnTrackers doing the right 
thing, i.e. when we're past the column we're looking for they will issue a SEEK 
(and no longer an INCLUDE_AND_SEEK).

With that I do not see how optimize can make things worse in terms of the 
number of blocks read. If the SEEK would get us to another block, optimize 
allows the seek; otherwise it converts it to INCLUDE or SKIP.
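In pseudo-Java, the decision I mean is roughly this (a sketch only; 
seekTargetInCurrentBlock() is an assumed stand-in for the real 
next-indexed-key compare in StoreScanner):
{code}
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.MatchCode;

// Sketch of the optimize() decision; illustrative only.
abstract class OptimizeSketch {
  MatchCode optimize(MatchCode qcode, Cell cell) {
    switch (qcode) {
      case SEEK_NEXT_COL:
      case SEEK_NEXT_ROW:
        // If the seek target is still inside the current block, a few SKIPs
        // are cheaper than re-seeking through the block index.
        return seekTargetInCurrentBlock(cell) ? MatchCode.SKIP : qcode;
      default:
        return qcode; // INCLUDE, SKIP, DONE, etc. pass through unchanged
    }
  }

  // assumed helper: does the seek key stay within the current block?
  abstract boolean seekTargetInCurrentBlock(Cell cell);
}
{code}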

In either case we're moving to the next ROW. Either we SEEK there or we SKIP 
multiple times, but to the _very same_ row. How would optimize cause us to read 
further than we would have without it? I am missing something.

[~stack], is this only happening when a row is the last one in a block? How can 
I reproduce this easily?


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch, 15392v3.wip.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182325#comment-15182325
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Is this the part that's no longer executed:
{code}
  if (qcode == ScanQueryMatcher.MatchCode.INCLUDE_AND_SEEK_NEXT_ROW) {
if (!matcher.moreRowsMayExistAfter(kv)) {
  return false;
}
seekToNextRow(kv);
  } 
{code}
?

If so, we can even be smarter. We know that a Get will always touch exactly one 
row. So if we do a XXX_SEEK_NEXT_ROW, we know we're done. We do not even have 
to check.
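Something like this (a sketch on top of the quoted block; the 'get' flag 
stands in for Scan.isGetScan() and is my illustration, not a committed change):
{code}
  if (qcode == ScanQueryMatcher.MatchCode.INCLUDE_AND_SEEK_NEXT_ROW) {
    if (get) {
      return false; // a Get touches exactly one row, so we are done
    }
    if (!matcher.moreRowsMayExistAfter(kv)) {
      return false;
    }
    seekToNextRow(kv);
  }
{code}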


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch, 15392v3.wip.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
>   

[jira] [Updated] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15392:
--
Attachment: 15392-0.98-looksee.txt

Hey [~stack], does -looksee fix the problem for you? (Sorry, it's against 0.98.)

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, no_optimize.patch, no_optimize.patch, 
> two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:795)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:624)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(Ke

[jira] [Issue Comment Deleted] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15392:
--
Comment: was deleted

(was: | (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 1s 
{color} | {color:blue} The patch file was not named according to hbase's naming 
conventions. Please see 
https://yetus.apache.org/documentation/latest/precommit-patchnames for 
instructions. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 3s {color} 
| {color:red} HBASE-15392 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/latest/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12791683/15392-0.98-looksee.txt
 |
| JIRA Issue | HBASE-15392 |
| Powered by | Apache Yetus 0.1.0   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/861/console |


This message was automatically generated.

)

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, no_optimize.patch, no_optimize.patch, 
> two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> 

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182342#comment-15182342
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Lastly (maybe): how big are the columns? I'm testing the -looksee patch and it 
is actually _slightly slower_ (because we're doing more work SEEK'ing vs 
SKIP'ing); my Cells are small. Maybe this is more pronounced for big Cells (as 
the likelihood of overshooting by just one column leading to a new block is 
higher).

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, no_optimize.patch, no_optimize.patch, 
> two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
>

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182343#comment-15182343
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Strike that... It's not slower. In any case, please let me know if the -looksee 
patch fixes the problem.


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, no_optimize.patch, no_optimize.patch, 
> two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:795)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:624)
> at 
> org.a

[jira] [Commented] (HBASE-15243) Utilize the lowest seek value when all Filters in MUST_PASS_ONE FilterList return SEEK_NEXT_USING_HINT

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182376#comment-15182376
 ] 

Lars Hofhansl commented on HBASE-15243:
---

Looks good to me.

As for the observation above:
bq. This might be tangential but I think seeking could be a bit more aggressive 
on the MUST_PASS_ALL case as well.

I'd agree. We only have to reverse the compare in the getHint method and we'll 
find the optimal (i.e. max) hint value for MUST_PASS_ALL. Different jira maybe.
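A sketch of that min/max selection (names only approximate the FilterList 
internals of that era; this is not the committed patch):
{code}
import java.io.IOException;
import java.util.Comparator;
import java.util.List;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterList.Operator;

// Sketch only: pick among the child filters' seek hints. MUST_PASS_ONE (OR)
// takes the smallest hint; the proposed MUST_PASS_ALL (AND) variant the
// largest. Caveat elided here: under MUST_PASS_ONE, a child returning no
// hint means we cannot safely seek at all.
class HintSelectionSketch {
  private List<Filter> filters;        // child filters
  private Operator operator;           // MUST_PASS_ONE or MUST_PASS_ALL
  private Comparator<Cell> comparator; // assumed to order cells in scan order

  Cell getNextCellHint(Cell current) throws IOException {
    Cell hint = null;
    for (Filter filter : filters) {
      Cell h = filter.getNextCellHint(current);
      if (h == null) {
        continue;
      }
      if (hint == null
          || (operator == Operator.MUST_PASS_ONE && comparator.compare(h, hint) < 0)
          || (operator == Operator.MUST_PASS_ALL && comparator.compare(h, hint) > 0)) {
        hint = h;
      }
    }
    return hint;
  }
}
{code}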

I will say the FilterList is more tricky than it should be, and we've seen many 
"funny" things with it in the past. The more tests we have the better.


> Utilize the lowest seek value when all Filters in MUST_PASS_ONE FilterList 
> return SEEK_NEXT_USING_HINT
> --
>
> Key: HBASE-15243
> URL: https://issues.apache.org/jira/browse/HBASE-15243
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-15243-v1.txt, HBASE-15243-v2.txt, 
> HBASE-15243-v3.txt, HBASE-15243-v4.txt, HBASE-15243-v5.txt, 
> HBASE-15243-v6.txt, HBASE-15243-v7.txt
>
>
> As Preston Koprivica pointed out at the tail of HBASE-4394, when all filters 
> in a MUST_PASS_ONE FilterList return a SEEK_NEXT_USING_HINT code, we should 
> return SEEK_NEXT_USING_HINT from the FilterList to utilize the lowest seek 
> value.
> This would save unnecessary accesses to blocks which are not in the scan 
> result.
> A unit test has been added which shows the reduction in the number of blocks 
> accessed when this optimization takes effect.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182407#comment-15182407
 ] 

Lars Hofhansl commented on HBASE-15392:
---

I also like your suggestion, [~anoop.hbase]. Although I'm not sure what 
optimize would do in this case.

In the SEEK_NEXT_ROW case we could be DONE. For INCLUDE_AND_SEEK_NEXT_ROW we'd 
have to recheck the condition in the StoreScanner again, unless we add an 
INCLUDE_AND_DONE state.


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, no_optimize.patch, no_optimize.patch, 
> two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182623#comment-15182623
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Thanks for looking...

Generally for Scans we should reduce the per-Cell/Row cost. IMHO, we cannot add 
a check for moreRowsMayExistAfter(...) for every Cell or, as in this case, for 
every Row. That's an extra compare, possibly two, and we've been trying so hard 
to remove those. This should be done for Gets only.

-1 on doing that always.

In all but exceptional cases a Scan will stay within the block most of the time 
(if not, folks should increase their block size for Scans anyway). For Gets 
it's a different story, since we know ahead of time that we want to look at a 
single row only.
(That makes me think we probably want a Cells-per-block metric, but that's a 
different story.)
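For reference, the block-size tuning mentioned above would look roughly like 
this (0.98/1.x HColumnDescriptor API; 128 KB is just an arbitrary example over 
the 64 KB default visible in the trace):
{code}
// Example only: bigger blocks amortize index seeks for sequential Scans.
HColumnDescriptor family = new HColumnDescriptor("cf");
family.setBlocksize(128 * 1024); // default is 64 KB (blocksize=65536 above)
{code}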

In the end, I'm still a fan of my -looksee change. It (1) restores the old 
behavior for Gets, (2) keeps the Scan optimization (which I have verified many 
times over - we had some of our M/R jobs speed up by 3x end-to-end!), and (3) 
does not add any compares back into the mix.


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, HBASE-15392_suggest.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.r

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182684#comment-15182684
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Thanks [~anoop.hbase]... But you left out the important part of my quote: "or, 
as in this case, for every Row" :)

So we're agreeing my patch fixes the issue and is general?
[~stack], can you easily confirm that it fixes the problem you've been seeing?


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, HBASE-15392_suggest.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.hbase.regionserv

[jira] [Comment Edited] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182684#comment-15182684
 ] 

Lars Hofhansl edited comment on HBASE-15392 at 3/7/16 7:20 AM:
---

Thanks [~anoop.hbase]... But you left out the important part of my quote: "or, 
as in this case, for every Row" :)

So we're agreeing my patch fixes the issue and is general?
[~stack], can you easily confirm that it fixes the problem you've been seeing?



was (Author: lhofhansl):
Thanks [~anoop.hbase]... But you left out the important part of my quote: "or, 
as in this case, for every Row" :)

So we're agreeing my patch fixes the issue and is general?
[~stack], can you easily confirm that it fixes the problem you've been seeing?


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, HBASE-15392_suggest.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.ha

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183924#comment-15183924
 ] 

Lars Hofhansl commented on HBASE-15392:
---

[~stack], you cool with this man? If so I'll commit to all branches.
(and sorry I broke this)

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, HBASE-15392_suggest.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:795)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:624)
>  

[jira] [Commented] (HBASE-15389) Write out multiple files when compaction

2016-03-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184006#comment-15184006
 ] 

Lars Hofhansl commented on HBASE-15389:
---

[~Apache9] thanks for looking into this. This is awesome.

Looks good, generally - still have to look through the details.

Questions:
# Where does that leave DateTieredCompactionPolicy? Would we remove that?
# I notice the Windowing logic in DateTieredMultiFileWriter is different from 
the one in DateTieredCompactionPolicy. Can you explain the difference?


> Write out multiple files when compaction
> 
>
> Key: HBASE-15389
> URL: https://issues.apache.org/jira/browse/HBASE-15389
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction
>Affects Versions: 2.0.0, 1.3.0, 0.98.19
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 0.98.19
>
> Attachments: HBASE-15389-uc.patch, HBASE-15389-v1.patch, 
> HBASE-15389-v2.patch, HBASE-15389.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15389) Write out multiple files when compaction

2016-03-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184018#comment-15184018
 ] 

Lars Hofhansl commented on HBASE-15389:
---

3. Is it possible to refactor compactions to make this kind of compaction 
possible with just a CompactionPolicy?

> Write out multiple files when compaction
> 
>
> Key: HBASE-15389
> URL: https://issues.apache.org/jira/browse/HBASE-15389
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction
>Affects Versions: 2.0.0, 1.3.0, 0.98.19
>Reporter: Duo Zhang
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 0.98.19
>
> Attachments: HBASE-15389-uc.patch, HBASE-15389-v1.patch, 
> HBASE-15389-v2.patch, HBASE-15389.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184448#comment-15184448
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Actually, here's another thought. What optimize is trying to figure out is 
whether the SEEK would get us onto a new HFileBlock; if so, it'll do the SEEK. 
So the description is _exactly_ what we want: if we'd SEEK onto the next block, 
it would leave the SEEK in place.

So why isn't it working? In SQM.compareKeyForNextRow I compare the next indexed 
key with HConstants.OLDEST_TIMESTAMP, Type.Minimum.getCode(). It seems for the 
purpose of this optimization we should instead use HConstants.LATEST_TIMESTAMP, 
Type.Maximum.getCode(); that way the indexedKey of the next block would be 
_after_ this key, not before.

I still cannot reproduce this easily, so I'll produce a new patch in a minute. 
Let me know what you think.


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, HBASE-15392_suggest.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
>

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184524#comment-15184524
 ] 

Lars Hofhansl commented on HBASE-15392:
---

OK I get it now... It's doing the right thing. HConstants.OLDEST_TIMESTAMP, 
Type.Minimum.getCode() produces the _last_ KV for the current row.

But note that that would still be the Cell _right before_ the nextIndexedKey 
(since that is the first key of the _next_ block), and hence we'll need to 
check the next one and overshoot by one Cell (and hence load the next block).

Instead of the first key of the next block, what we'd really want is the last 
key of the current block to compare against - but we do not have that readily 
available.
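
To make the comparison concrete, here is a small illustrative snippet (a sketch 
assuming KeyValue-era APIs and a KeyValue comparator in scope, not the shipped 
code) of what the "last KV for the current row" looks like and what the check 
against the next block's first key amounts to:
{code}
// A key with OLDEST_TIMESTAMP and Type.Minimum sorts after every real Cell
// of the same row, i.e. it is the last possible key of that row.
KeyValue lastOnRow = new KeyValue(row, null, null,
    HConstants.OLDEST_TIMESTAMP, KeyValue.Type.Minimum);

// nextIndexedKey is the first key of the NEXT block. Even when lastOnRow
// sorts before nextIndexedKey, the scanner must still inspect one more Cell
// to confirm the row is finished - and that Cell may live in the next
// block, which is exactly the extra block load described above.
boolean restOfRowInThisBlock =
    comparator.compare(lastOnRow, nextIndexedKey) <= 0;
{code}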


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, HBASE-15392_suggest.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.K

[jira] [Comment Edited] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184524#comment-15184524
 ] 

Lars Hofhansl edited comment on HBASE-15392 at 3/8/16 6:40 AM:
---

OK I get it now... It's doing the right thing. HConstants.OLDEST_TIMESTAMP, 
Type.Minimum.getCode() produces the _last_ KV for the current row.

But note that that would still be the Cell _right before_ the nextIndexedKey 
(since that is the first key of the _next_ block), and hence we'll need to 
check the next one and overshoot by one Cell (and hence load the next block).

Instead of the first key of the next block, what we'd really want is the last 
key of the current block to compare against - but we do not have that readily 
available, since we'd have to scan forward in the block to the last Cell in 
order to find its key.

So, it seems like -looksee is still the best option.



was (Author: lhofhansl):
OK I get it now... It's doing the right thing. HConstants.OLDEST_TIMESTAMP, 
Type.Minimum.getCode() produces the _last_ KV for the current row.

But note that that would still be the Cell _right before_ the nextIndexedKey 
(since that is the first key of the _next_ block), and hence we'll need to 
check the next one and overshoot by one Cell (and hence load the next block).

Instead of the first key of the next block, what we'd really want is the last 
key of the current block to compare against - but we do not have that readily 
available.


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, HBASE-15392_suggest.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
>  

[jira] [Created] (HBASE-15431) A bunch of methods are hot and too big to be inline

2016-03-08 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-15431:
-

 Summary: A bunch of methods are hot and too big to be inline
 Key: HBASE-15431
 URL: https://issues.apache.org/jira/browse/HBASE-15431
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
-XX:+PrintInlining" and then looked for "hot method too big" log lines.

I'll attach a log of those messages.
I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods (as 
long as they're hot), but actually didn't see any improvement.
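
For reference, a typical way to gather such a log (the paths and the grep 
pattern are illustrative):
{code}
# Enable the diagnostic flags via HBASE_OPTS, then pull out the methods the
# JIT profiled as hot but refused to inline because of their bytecode size.
export HBASE_OPTS="$HBASE_OPTS -XX:+PrintCompilation \
    -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining"
./bin/start-hbase.sh
grep "hot method too big" logs/*.out > hotMethods.txt
{code}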




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15431) A bunch of methods are hot and too big to be inline

2016-03-08 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15431:
--
Attachment: hotMethods.txt

List of hot and big methods.

> A bunch of methods are hot and too big to be inline
> ---
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot), but actually didn't see any improvement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15431) A bunch of methods are hot and too big to be inlined

2016-03-08 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15431:
--
Summary: A bunch of methods are hot and too big to be inlined  (was: A 
bunch of methods are hot and too big to be inline)

> A bunch of methods are hot and too big to be inlined
> 
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot), but actually didn't see any improvement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15431) A bunch of methods are hot and too big to be inline

2016-03-08 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186342#comment-15186342
 ] 

Lars Hofhansl commented on HBASE-15431:
---

The load I ran was a Phoenix count\(*) query. Note that almost all interesting 
methods used during scanning are _not_ inlined:
* SQM.match
* StoreScanner.next
* HRegion$RegionScannerImpl::nextInternal
* HFileReaderV2$ScannerV2::blockSeek
* FastDiffDeltaEncoder$1::decode
* etc

As said, increasing -XX:FreqInlineSize did not improve things. Presumably 
because now too much stuff is being inlined.

> A bunch of methods are hot and too big to be inline
> ---
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot), but actually didn't see any improvement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15431) A bunch of methods are hot and too big to be inlined

2016-03-08 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186345#comment-15186345
 ] 

Lars Hofhansl commented on HBASE-15431:
---

* KeyValue::createByteArray ... scary

> A bunch of methods are hot and too big to be inlined
> 
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot), but actually didn't see any improvement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15431) A bunch of methods are hot and too big to be inlined

2016-03-08 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15431:
--
Description: 
I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
-XX:+PrintInlining" and then looked for "hot method too big" log lines.

I'll attach a log of those messages.
I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods (as 
long as they're hot), but actually didn't see any improvement.

In all cases I primed the JVM to make sure the JVM gets a chance to profile the 
methods and decide whether they're hot or not.

  was:
I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
-XX:+PrintInlining" and then looked for "hot method too big" log lines.

I'll attach a log of those messages.
I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods (as 
long as they're hot), but actually didn't see any improvement.



> A bunch of methods are hot and too big to be inlined
> 
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot), but actually didn't see any improvement.
> In all cases I primed the JVM to make sure the JVM gets a chance to profile 
> the methods and decide whether they're hot or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15431) A bunch of methods are hot and too big to be inlined

2016-03-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187468#comment-15187468
 ] 

Lars Hofhansl commented on HBASE-15431:
---

[~ram_krish] Yeah... 0.98, sorry. Good catch :)
Most should translate to trunk, but some things will be different.


> A bunch of methods are hot and too big to be inlined
> 
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot), but actually didn't see any improvement.
> In all cases I primed the JVM to make sure the JVM gets a chance to profile 
> the methods and decide whether they're hot or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188405#comment-15188405
 ] 

Lars Hofhansl commented on HBASE-15392:
---

bq. Optimize seems broke to me. We keep SKIPping even though we've found all 
our rows.

No, no... That's the _whole_ point of optimize :)
SEEKing is so expensive that we translate it into a series of SKIPs instead. We 
can do 100s of SKIPs in the time of a single SEEK - unless we expect the SEEK 
to get us onto the next block with high likelihood, in which case we should 
SEEK, which can definitely happen with many versions (so we want to mostly 
SKIP, but not have terrible performance when we store 1000 versions).

A bunch of SKIP, SKIP, SKIP, SKIP, ..., SKIP is much cheaper than a single 
SEEK_TO_NEXT_ROW. In fact it can be 3x faster.
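
In code terms, the heuristic amounts to something like this (a simplified 
sketch of StoreScanner's optimize() step, with the matcher assumed in scope - 
not the verbatim implementation):
{code}
// nextIndexedKey is the first key of the NEXT HFileBlock. If the key we
// would SEEK to (the start of the next row) sorts at or before it, the
// target is still inside the current block, so a run of cheap SKIPs wins.
MatchCode optimize(MatchCode qcode, Cell cell, Cell nextIndexedKey) {
  if (qcode == MatchCode.SEEK_NEXT_ROW && nextIndexedKey != null
      && matcher.compareKeyForNextRow(nextIndexedKey, cell) >= 0) {
    return MatchCode.SKIP;  // seek target is within the current block
  }
  return qcode;             // would cross a block boundary: keep the SEEK
}
{code}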


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 
> HBASE-15392_suggest.patch, no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValu

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188427#comment-15188427
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Lemme try to answer. There seems to be a bunch of confusion about the optimize 
method.

bq. Lars Hofhansl Is this true? If a row of 1M columns and we are looking for 
the second column only, SKIP'ing through the rest is what we want?

Nope. That's the point. If the SEEK gets us to another block it does the SEEK. 
Otherwise it saves time by doing SKIP, SKIP, ... That _is_ cheaper.
If it does that even though the remaining columns are on a different block, 
that's a bug. It should only SKIP as long as the point we would SEEK to is on 
the same block.

SEEK is very expensive. Phoenix has an "optimization" where they always load 
all columns, so that it won't use the ExplicitColumnTracker, because seeking 
between columns or rows was too slow. With wide rows they are f'ed. With 
optimize they can remove that, small rows are performant, and for large rows we 
still SEEK.

The key is: don't count the number of Cells the ColumnTracker is seeing. Count 
the work that HBase is doing when we're not SKIP'ing, but doing a SEEK instead.

bq. Optimize is squashing the SEEK making it an INCLUDE or a SKIP instead when 
the index key is >= current cell.
I do not think so. It's only because of the bug that moreRowsMayExistAfter is 
not called unless we SEEK. We're fixing this for Gets.

bq. Because it suppresses the SEEK, turning it into a SKIP or INCLUDE, then we 
are skirting the check of there being any more rows in this scan so we keep 
going... till we hit the next row... (loading blocks if we have to) and only 
then, do we realize we are actually done.
That is only true for the last row in a block.

bq. -looksee is good. Doesn't do anything for the case when stopRows are 
specified (here again we can over-read) and it could make broader use of the 
fact that we know there are no more rows if the Scan is a Get Scan.

A scan presumably is big. We're doing an extra check for every single row to 
avoid loading a single block in the end. Sure, for small scans the block load 
might be worse, for big scans it is not. I'll point out the 3x improvement 
again in our M/R jobs that do a lot of scanning.

bq. Scenario #6 and #7 above are interesting. For a Get, we should not be doing 
#7.
I do not think they are. In #6, how do you know you're done with the row? You 
need to read the next Cell of the next row. If that's on another block, that 
block needs to be loaded first.
#7 similarly, you need to check the next Cell to see whether you are done. If 
that is on the next block it needs to be loaded.

All of this should only happen when we talk about the last row in a block, no? 
If the next row is in the same block we _want_ to SKIP (not SEEK).


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 
> HBASE-15392_suggest.patch, no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUn

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188810#comment-15188810
 ] 

Lars Hofhansl commented on HBASE-15392:
---

We're optimizing the wrong thing here. Let's fix it for Gets.
[~anoopsamjohn], I was surprised by the cost of SEEKs as well. I've spent some 
time to make that better, but at the end we have to reset the pointers for 
every HFile we're touching. That means SEEKing in the right HFile block of 
every HFile. Then there are a bunch of compares in KeyValueHeap and the 
HFileReader classes. Note that this is true whether we are reading a new block 
from disk or from the block cache; we're doing a bunch of work for every SEEK.

The case I missed is that moreRowsMayExistAfter is not called unless we're 
getting a XXX_SEEK code. My bad. Let's fix that part. For Gets for sure, we 
should fix it.

For Scans, unless we have a degenerate case (a very small scan that fits into 
one HFile block), we might load an extra block if the StopRow falls towards the 
end of a block, which is far outweighed by the time we've saved during the 
scan. That block will be a read of the same file and it will be the _next_ 
block to read (so not a random read, but a sequential read). That is a fair 
trade-off, IMHO.
(Maybe we can disable the next-row optimization for small scans as well.)

Anyway... If you want to remove optimize() or change it significantly, show me 
end-to-end perf numbers (like I have done in HBASE-13109). :)

Areas where it might be degenerate are the following:
* Gets... I missed that.
* Large Cells, such that only a few (maybe just one) fit into an HFileBlock.
* Very small scans (that just read a single block, or very few blocks).


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 
> HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, 
> no_optimize.patch, reads.png, reads.png, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384

[jira] [Comment Edited] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188810#comment-15188810
 ] 

Lars Hofhansl edited comment on HBASE-15392 at 3/10/16 7:03 AM:


We're optimizing the wrong thing here. Let's fix it for Gets.
[~anoopsamjohn], I was surprised by the cost of SEEKs as well. I've spent some 
time to make that better, but in the end we have to reset the pointers for 
every HFile we're touching. That means SEEKing to the right HFile block of 
every HFile. Then there are a bunch of compares in KeyValueHeap and the 
HFileReader classes. Note that this is true whether we are reading a new block 
from disk or from the block cache; we're doing a bunch of work for every SEEK.

The case I missed is that moreRowsMayExistAfter is not called unless we're 
getting an XXX_SEEK code. My bad. Let's fix that part. For Gets for sure, we 
should fix it.

For Scans, unless we have a degenerate case (a very small scan that fits into 
one HFile block), we might load an extra block if the StopRow falls towards the 
end of a block, which is far outweighed by the time we've saved during the 
scan. That block will be a read of the same file, and it will be the _next_ 
block to read (so not a random read, but a sequential read). That is a fair 
trade-off, IMHO. HBase is usually CPU bound (which is in part what I am trying 
to fix).

Maybe we can disable the next row optimization for small scans as well.

Anyway... If you want to remove optimize() or change it significantly, show me 
end-to-end perf numbers (like I have done in HBASE-13109). :)

Areas where it might be degenerate are the following:
* Gets... I missed that.
* Large Cells, such that only a few (maybe just one) fit into an 
HFileBlock.
* Very small scans (that just read a single block, or very few blocks)




[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188816#comment-15188816
 ] 

Lars Hofhansl commented on HBASE-15392:
---

(Maybe it's time to look at HBASE-11811. I started that, but never finished. 
It had a pretty big impact for Gets, since we can find the offset of any Cell 
in an HFileBlock in O(log\(n)) time rather than O\(n) (n being the number of 
Cells in a block), and we also have a notion of the nth Cell in a block, which 
will enable other optimizations like sampling.)
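
To illustrate the idea, a minimal, self-contained sketch (the layout and all 
names here are made up; HBASE-11811's actual format may differ): if each block 
carried an index of Cell start offsets, locating a Cell becomes a binary 
search over that index.
{code}
import java.util.function.IntUnaryOperator;

// cellOffsets: start offset of each Cell in the block, in key order.
// cmpAt.applyAsInt(off): compares the search key with the Cell serialized at
// offset 'off' (<0 if the key sorts before that Cell, 0 if equal, >0 if after).
static int findCell(int[] cellOffsets, IntUnaryOperator cmpAt) {
  int lo = 0, hi = cellOffsets.length - 1;
  while (lo <= hi) {
    int mid = (lo + hi) >>> 1;        // O(log n) probes instead of an O(n) walk
    int c = cmpAt.applyAsInt(cellOffsets[mid]);
    if (c == 0) return mid;           // exact hit: this is the nth Cell
    if (c < 0) hi = mid - 1; else lo = mid + 1;
  }
  return lo;                          // index of the first Cell >= search key
}
{code}
The "nth Cell" falling directly out of such an index is also what would enable 
the sampling mentioned above.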



[jira] [Commented] (HBASE-15431) A bunch of methods are hot and too big to be inlined

2016-03-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188823#comment-15188823
 ] 

Lars Hofhansl commented on HBASE-15431:
---

bq. Was 1010 enough to get the list of methods inlined?

Yep. The largest is 1006 bytes. After that I no longer see any "hot method too 
big" messages. But performance actually seemed to be slower, so maybe this is a 
dud...? I looked into breaking up StoreScanner.next(...); it's pretty messy 
now, with many exits (continue with the loop, break out of the loop, return 
from the method with result true, return with false, etc.). What we want to do 
is break up the larger method so that the hot code is separate from the 
not-so-hot code. So in StoreScanner.next we'd try to move the DONE, DONE_SCAN, 
and maybe XXX_SEEK code to separate methods. That would allow the more frequent 
and very hot SKIP and INCLUDE code paths to be inlined (sketched below)... But 
as I said, it's not a trivial quick fix; it needs a rethinking of the structure 
of this method. Others are similar.
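
A minimal sketch of that restructuring (Cell, QCode, handleRareCode, and the 
bookkeeping are all stand-ins, not the real classes): only the very hot 
INCLUDE/SKIP cases stay in the loop, and the rare exits move out of line so 
next() stays under the -XX:FreqInlineSize threshold.
{code}
import java.util.List;

class ScannerSketch {
  enum QCode { INCLUDE, SKIP, DONE }
  interface Cell {}

  private int i = 0;
  private final List<Cell> cells;
  private final List<QCode> codes;

  ScannerSketch(List<Cell> cells, List<QCode> codes) {
    this.cells = cells;
    this.codes = codes;
  }

  // Hot loop: only the frequent cases live here, keeping the method's
  // bytecode small enough for the JIT to inline it.
  boolean next(List<Cell> out) {
    while (i < cells.size()) {
      QCode q = codes.get(i);
      if (q == QCode.INCLUDE) { out.add(cells.get(i++)); continue; }
      if (q == QCode.SKIP)    { i++;                     continue; }
      return handleRareCode(q, out);  // DONE, seeks, etc. moved out of line
    }
    return false;
  }

  // Cold exits: taken rarely, so their size no longer blocks inlining next().
  private boolean handleRareCode(QCode q, List<Cell> out) {
    return q != QCode.DONE && !out.isEmpty();  // stub for the many real exits
  }
}
{code}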


> A bunch of methods are hot and too big to be inlined
> 
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot), but actually didn't see any improvement.
> In all cases I primed the JVM to make sure the JVM gets a chance to profile 
> the methods and decide whether they're hot or not.





[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189542#comment-15189542
 ] 

Lars Hofhansl commented on HBASE-15392:
---

# rows can straddle block boundaries
# versions of a column can straddle block boundaries (i.e. column C1 at T1 
might be in a different block than column C1 at T2)
# we want to SKIP and INCLUDE if the chance is high that we'll find the desired 
Cell after a few of those
# we want INCLUDE_AND_SEEK and SEEK when the chance is high that we'll be able 
to seek past many Cells this way
# a good proxy (best effort) to determine whether INCLUDE/SKIP is better than 
SEEK is whether we'd likely be seeking to the next block (or past the next 
block).

Example:
{noformat}
|block 1              |block 2             |
|  r1/c1, r1/c2, r1/c3|r1/c4, r1/c5, r2/c1 |
                       ^             ^
                       |             |
              Next Index Key    SEEK_NEXT_ROW (before r2/c2)

|block 1                       |block 2                      |
|  r1/c1/T5, r1/c1/T4, r1/c1/t3|r1/c1/t2, r1/c1/T1, r1/c2/T3 |
                                ^                   ^
                                |                   |
                      Next Index Key           SEEK_NEXT_COL
{noformat}

Now imagine in case one we want columns c1 and c3; then we seek to the next 
column, which will land us on the next block, hence we'll SEEK.
In case two, say we only want one version of c1; after we have it, a 
SEEK_NEXT_COL will be issued, and since it would land us on the next block, 
we'll SEEK.

In other scenarios, where the SEEK would not land us in the next block, it is 
very likely better to issue a series of SKIPs.

Is it clearer now as to what optimize is doing?
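
A minimal sketch of that proxy check (the signature and names are 
illustrative, not the actual StoreScanner code):
{code}
enum MatchCode { SKIP, INCLUDE, SEEK_NEXT_COL, SEEK_NEXT_ROW }

// If the key we would seek to still sorts before the first key of the next
// block (the "next indexed key"), the target must be in the current block:
// a few SKIPs are cheaper than a full SEEK. Otherwise the SEEK pays off.
static MatchCode optimize(MatchCode qcode, byte[] seekKey,
    byte[] nextIndexedKey, java.util.Comparator<byte[]> keyComparator) {
  if (qcode != MatchCode.SEEK_NEXT_COL && qcode != MatchCode.SEEK_NEXT_ROW) {
    return qcode;                      // only SEEK hints are reconsidered
  }
  if (nextIndexedKey != null
      && keyComparator.compare(seekKey, nextIndexedKey) < 0) {
    return MatchCode.SKIP;             // target is still within this block
  }
  return qcode;                        // target is on (or past) the next block
}
{code}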



[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189685#comment-15189685
 ] 

Lars Hofhansl commented on HBASE-15392:
---

bq. We check rows in both and they are same. So the return value of 
compareKeyForNextRow will be 0.

No, we don't!

The seek keys will never be equal to anything, because they are not real keys. 
We are not comparing rows but keys.
You can change it to > if you want. It won't make a difference.

Check the compareKeyForNextRow method. The next indexed key is r1/c4; 
compareKeyForNextRow will check against the last key on the row, which sorts 
right _between_ r1/c5 and r2/c1 (it is equal to neither). Look at the 
comparison when the type is minimum and the column is empty (for the compare 
key).
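
As a toy illustration (a simplified ordering, not the real KVComparator): the 
fake "last key on row" carries an empty column plus a marker standing in for 
the minimum type, and the comparator special-cases it to sort after every real 
key of its row but before any key of the next row, so it can never equal a 
real key.
{code}
import java.util.Comparator;

record Key(String row, String column, boolean lastOnRow) {}

Comparator<Key> cmp = (a, b) -> {
  int r = a.row().compareTo(b.row());
  if (r != 0) return r;                  // different rows: row order decides
  if (a.lastOnRow() != b.lastOnRow()) {
    return a.lastOnRow() ? 1 : -1;       // marker sorts after all real keys
  }
  return a.column().compareTo(b.column());
};

// cmp.compare(new Key("r1", "", true), new Key("r1", "c5", false)) > 0
// cmp.compare(new Key("r1", "", true), new Key("r2", "c1", false)) < 0
{code}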



[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189687#comment-15189687
 ] 

Lars Hofhansl commented on HBASE-15392:
---

bq. Yes. All cells corresponding to a row are together in one HFile.

That, actually, is not strictly true. Versions of the same Cell or columns of 
the same row could be in different HFiles. This is just a heuristic 
optimization; it is not perfect.



[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189755#comment-15189755
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Sorry... I'm known for not reading the whole comment before responding. Sorry 
about that.


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189752#comment-15189752
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Yep. Sorry, I've been in this code so much that this seems obvious to me. I 
should have put a nice comment on the method.
Want me to make a new "looksee" patch with improved comments?


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189775#comment-15189775
 ] 

Lars Hofhansl commented on HBASE-15392:
---

bq. Coming back to reading the whole row. Could we look first at the index for 
the next block and decide if we need to read the next block?

[~danielpol], thanks for keeping us all honest (and thanks to [~stack], 
[~anoop.hbase], and [~ram_krish] too).
Yes, we have a copy of the first key of the next block; in theory we could use 
that to check without loading the actual block. I'd have to think about the 
mechanics, but we could potentially put this into the optimize method as well. 
Like so: if this is a Get and we're seeking to the next row, we compare the 
rows (not the keys this time!) of the current Cell and the next indexed key 
(this would be a different comparator); if they differ, we know we're done.
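
As a sketch (invented names, not an actual patch):
{code}
import java.util.Arrays;

// After a Get produces a SEEK_NEXT_ROW: if the row of the current Cell
// differs from the row part of the next indexed key, the row cannot continue
// into the next block, so the Get can finish without ever loading that block.
static boolean getIsDone(byte[] currentRow, byte[] nextIndexedKeyRow) {
  return !Arrays.equals(currentRow, nextIndexedKeyRow);
}
{code}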



[jira] [Comment Edited] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15189752#comment-15189752
 ] 

Lars Hofhansl edited comment on HBASE-15392 at 3/10/16 7:16 PM:


Yep. Sorry, I've been in this code so much that this seems obvious to me. I 
should have put a nice comment on the method.
Want me to make a new "looksee" patch with improved comments?




[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190505#comment-15190505
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Right, it's a heuristic. I'm open to better ideas as to how we can avoid the 
expensive SEEKs. For sure, if we end up on the next block we need to SEEK on 
that block anyway, so avoiding SEEKs before that has no benefit. But if we stay 
in the same block, SKIP is likely better.

We're still debating the rare case where the row is the last one in the block 
and we overshoot by one (which then, rarely I think, loads the next block).



[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190509#comment-15190509
 ] 

Lars Hofhansl commented on HBASE-15392:
---

So here's another idea:
# we leave the SEEK_NEXT_COL stuff the way it is (compare the SEEK key against 
the next indexed key)
# when in optimize we get a SEEK_NEXT_ROW, we instead compare only the _row_ 
part of the next indexed key and the current key (sketched below). If the rows 
are the same, for sure we want to SEEK (the row must end outside of the current 
block, since the next block starts with the same row); if the rows are not the 
same, we might want to SKIP, SKIP, ... since the row _must_ end in the current 
block. That needs a new compare method :( at least in 0.98.

On the other hand I do not want to over-complicate this more than it already is.

I also want to emphasize... Like stats in any traditional relational database, 
this is meant to get it right as often as possible without complete 
information. It's not meant to get every single SKIP vs. SEEK decision right - 
that is not possible without knowing exactly how many Cells the SEEK would let 
us skip past. Whether we'll stay within the current block or not is simply a 
good proxy metric. If there is a better one, I'd love to hear it.
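
A sketch of idea 2 above (invented names; 0.98 would need the new row-only 
compare, and the follow-up comments below question whether the row part of the 
fake next indexed key can be trusted):
{code}
import java.util.Arrays;

enum MatchCode { SKIP, SEEK_NEXT_ROW }

// Same row at the start of the next block: the row spills over, so SEEK.
// Different row: the current row ends in this block, so SKIPping is cheaper.
static MatchCode optimizeSeekNextRow(byte[] currentRow,
    byte[] nextIndexedKeyRow) {
  return Arrays.equals(currentRow, nextIndexedKeyRow)
      ? MatchCode.SEEK_NEXT_ROW
      : MatchCode.SKIP;
}
{code}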



[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190537#comment-15190537
 ] 

Lars Hofhansl commented on HBASE-15392:
---

bq. compare the row part only

Alas, that won't work either, since the next indexed key is not actually a real 
key either; it's just one that sorts into the right place.

What I need to know to investigate this further: do we overshoot by a block 
every time? For every Get, for every Scan, do we overshoot into the next block? 
Or is it only when we happen to look at the last row, or the last cell, in the 
block?


> at 
> org.apache.hadoop.

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190540#comment-15190540
 ] 

Lars Hofhansl commented on HBASE-15392:
---

It's because we're comparing against what would be the key we use for seeking. 
Those are not real, but constructed such that they sort at the right place. 
Check out KeyValueUtil.createLastOnRow for an example. There we create a KV 
that is guaranteed to sort behind all possible KVs of the passed row and 
before the next row. But it is not equal to a KV that can actually exist in an 
HFile because it was specially constructed (with type = minimum).
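
To make the sort trick concrete, here is a small self-contained sketch. The 
names and the comparator are mine and heavily simplified, not the real 
KVComparator; it only shows the special case that lets such a fake key sort 
behind every real Cell of its row:
{code}
class FakeKeySketch {
  static final byte TYPE_MINIMUM = 0; // smallest type code, marker for fake keys

  static int bytesCompare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) return d;
    }
    return a.length - b.length;
  }

  /** Compares (row, column, type); > 0 means left sorts after right. */
  static int compareKey(byte[] lRow, byte[] lCol, byte lType,
                        byte[] rRow, byte[] rCol, byte rType) {
    int d = bytesCompare(lRow, rRow);
    if (d != 0) return d;
    // The special case: an empty column with type Minimum counts as bigger
    // than anything else on the same row, so a "last on row" key lands
    // behind all real Cells of the row and before the next row.
    if (lCol.length == 0 && lType == TYPE_MINIMUM) return 1;
    if (rCol.length == 0 && rType == TYPE_MINIMUM) return -1;
    d = bytesCompare(lCol, rCol);
    return d != 0 ? d : (rType & 0xff) - (lType & 0xff); // higher type code sorts first
  }
}
{code}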


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 
> HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, 
> no_optimize.patch, reads.png, reads.png, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(Key

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191791#comment-15191791
 ] 

Lars Hofhansl commented on HBASE-15392:
---

Here's the comment that we can add to optimize:
{code}
   * This works together with ColumnTrackers and Filters. ColumnTrackers may 
issue seek hints,
   * such as seek to next column, next row, or an arbitrary seek key.
   * Optimize intercepts these and decides whether a seek is the most efficient 
_actual_ way
   * to get us to the requested cell.
   * It does this looking at the next indexed key of the current HFile as a 
heuristic, this key
   * is then compared with the seek key (this is optimized here to avoid 
actually creating the
   * seek key, it just does the compare as if we had a seek key). If the seek 
gets us onto the
   * next block we seek, otherwise we just INCLUDE or SKIP, and let the 
ColumnTrackers or Filters
   * go through the next Cell, and so on.
   *
   * The ColumnTrackers and Filters must behave correctly in all cases, i.e. if 
past the Cells
   * they care about they must issue a SKIP or SEEK. That's the right behavior 
anyway.
{code}

In the end it's very simple, really: when a ColumnTracker or Filter suggests 
that we _could_ do a seek, we compare the next indexed key (which may or may 
not be real; it comes from an HFile) with the seek key (which is generated 
from the current Cell, and is not real). If the seek key is larger, we seek, 
since we'll end up on another block for sure; if the seek key is smaller, we 
assume that skipping within the current block is cheaper.

In my previous email I meant that we create the _seek key_ this way, not the 
next indexed key. The seek key isn't real.
Check out StoreScanner.seekToNextRow.
{code}
  protected boolean seekToNextRow(KeyValue kv) throws IOException {
return reseek(matcher.getKeyForNextRow(kv));
  }
{code}
Look at how {{matcher.getKeyForNextRow(kv)}} creates a fake KV out of the 
current KV, such that it sorts after all Cells of the current row (empty 
column and family and type=minimum), and look at how Cells are compared when 
the column is empty and type=minimum.
That fake key is the key that is used for seeking. When we do the seek, that's 
where we land: right after the last Cell of the current row and before (but 
not on) the first Cell of the next row.
The compares in optimize() do the _exact same_ compare; for optimization the 
seek key is not actually created (that would be too expensive), but the result 
is _identical_ to comparing matcher.getKeyForNextRow(kv) with the next indexed 
key.
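
Reduced to row keys only, the decision looks like this; the names are 
stand-ins and this is a sketch of the heuristic, not the actual StoreScanner 
code (which works on full keys and, as said, never materializes the fake one):
{code}
class OptimizeSketch {
  enum MatchCode { SKIP, SEEK_NEXT_ROW }

  static MatchCode optimize(MatchCode seekHint, String currentRow,
                            String nextIndexedKeyRow) {
    if (seekHint != MatchCode.SEEK_NEXT_ROW || nextIndexedKeyRow == null) {
      return seekHint; // nothing to optimize
    }
    // Conceptually compare(getKeyForNextRow(current), nextIndexedKey):
    // "last on row" sorts behind every real Cell of its row, so the fake
    // seek key is smaller than the next indexed key exactly when the next
    // block already starts on a later row. The next row then begins inside
    // the current block and plain SKIPs are cheaper than a seek.
    if (nextIndexedKeyRow.compareTo(currentRow) > 0) {
      return MatchCode.SKIP;
    }
    return seekHint; // next block still holds this row: a real seek pays off
  }
}
{code}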


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 15392v5.patch, 
> HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, 
> no_optimize.patch, reads.png, reads.png, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191834#comment-15191834
 ] 

Lars Hofhansl commented on HBASE-15392:
---

-v5 looks good. The comment is too long. :) After a while it'll go stale.

I do not think we want another compare for the stopRow. All the compares are 
_killing_ us in terms of performance. For a large scan I'd much rather read 
another block than waste CPU cycles and RAM lookups on the compares.


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 15392v5.patch, 
> HBASE-15392_suggest.patch, gc.png, gc.png, io.png, no_optimize.patch, 
> no_optimize.patch, reads.png, reads.png, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> or

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191993#comment-15191993
 ] 

Lars Hofhansl commented on HBASE-15392:
---

bq. You are not referring to anything in the patch, right[...]?
Right :)

As to more data in the index: yeah, we must ensure that we can know we're at 
the last cell without doing a compare for each cell. I'd have to think about 
it some more.

Lemme look at -v6.

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 15392v5.patch, 
> 15392v6.patch, HBASE-15392_suggest.patch, gc.png, gc.png, io.png, 
> no_optimize.patch, no_optimize.patch, reads.png, reads.png, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-12 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192000#comment-15192000
 ] 

Lars Hofhansl commented on HBASE-15392:
---

+1 on -v6.

Minor spelling nit in comment: {{It does this looking at the next}}, missed a 
"by".


> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392-0.98-looksee.txt, 15392.wip.patch, 
> 15392v2.wip.patch, 15392v3.wip.patch, 15392v4.patch, 15392v5.patch, 
> 15392v6.patch, HBASE-15392_suggest.patch, gc.png, gc.png, io.png, 
> no_optimize.patch, no_optimize.patch, reads.png, reads.png, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:795)
>  

[jira] [Created] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-13 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-15452:
-

 Summary: Consider removing checkScanOrder from StoreScanner.next
 Key: HBASE-15452
 URL: https://issues.apache.org/jira/browse/HBASE-15452
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


In looking at why we spent so much time in StoreScanner.next when doing a 
simple Phoenix count\(*) query I came across checkScanOrder. Not only is this 
a function dispatch (that the JIT would eventually inline), it also requires 
setting the prevKV member for every Cell encountered.

Removing that logic yields a measurable end-to-end improvement of 5-20% (in 
0.98).
I will repeat this test on my work machine tomorrow.

I think we're stable enough to remove that check anyway.
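
For illustration, here is a sketch of the pattern in question; the shape and 
names are assumed rather than copied from the 0.98 source:
{code}
import java.util.Comparator;

// The assert compiles away without -ea, but the member write and the
// virtual dispatch still happen for every Cell the scanner returns.
class ScanOrderCheck<C> {
  private C prev;

  void checkScanOrder(C next, Comparator<C> comparator) {
    assert prev == null || comparator.compare(prev, next) <= 0
        : "out of order: " + prev + " then " + next;
    prev = next; // the per-Cell cost this issue is about
  }
}
{code}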






[jira] [Updated] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15452:
--
Attachment: 15452-0.98.txt

Patch... just to park it. It simply removes the cruft.
Note that if we come in with identical KVs, kvsScanned will be off by those.


> Consider removing checkScanOrder from StoreScanner.next
> ---
>
> Key: HBASE-15452
> URL: https://issues.apache.org/jira/browse/HBASE-15452
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15452-0.98.txt
>
>
> In looking at why we spent so much time in StoreScanner.next when doing a 
> simple Phoenix count\(*) query I came across checkScanOrder. Not only is this 
> a function dispatch (that the JIT would eventually inline), it also requires 
> setting the prevKV member for every Cell encountered.
> Removing that logic yields a measurable end-to-end improvement of 5-20% (in 
> 0.98).
> I will repeat this test on my work machine tomorrow.
> I think we're stable enough to remove that check anyway.





[jira] [Updated] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15452:
--
Attachment: (was: 15452-0.98.txt)

> Consider removing checkScanOrder from StoreScanner.next
> ---
>
> Key: HBASE-15452
> URL: https://issues.apache.org/jira/browse/HBASE-15452
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>
> In looking at why we spent so much time in StoreScanner.next when doing a 
> simple Phoenix count\(*) query I came across checkScanOrder. Not only is this 
> a function dispatch (that the JIT would eventually inline), it also requires 
> setting the prevKV member for every Cell encountered.
> Removing that logic yields a measurable end-to-end improvement of 5-20% (in 
> 0.98).
> I will repeat this test on my work machine tomorrow.
> I think we're stable enough to remove that check anyway.





[jira] [Created] (HBASE-15453) Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner

2016-03-13 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-15453:
-

 Summary: Considering reverting HBASE-10015 - reinstance 
synchronized in StoreScanner
 Key: HBASE-15453
 URL: https://issues.apache.org/jira/browse/HBASE-15453
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
StoreScanner are slower than explicit locks.

I was surprised by this. To make sure I added a simple perf test and many folks 
ran it on their machines. All found that explicit locks were faster.

Now... I just ran that test again. On the latest JDK8 I find that now the 
intrinsic locks are significantly faster:

Explicit locks:
10 runs  mean:2223.6 sigma:72.29412147609237

Intrinsic locks:
10 runs  mean:1865.3 sigma:32.63755505548784

I confirmed the same with timing some Phoenix scans. We can save a bunch of 
time by changing this back 

Arrghhh... So maybe it's time to revert this now...?

(Note that in trunk due to [~ram_krish]'s work, we do not lock in StoreScanner 
anymore)

I'll attach the perf test and a patch that changes lock to synchronized, if 
some folks could run this on 0.98, that'd be great.
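
For a rough feel outside the HBase harness, here is a minimal micro-benchmark 
sketch of the same comparison (this is not the attached test, and the usual 
micro-benchmark caveats about JIT warm-up and dead-code elimination apply):
{code}
import java.util.concurrent.locks.ReentrantLock;

public class LockBench {
  private static final ReentrantLock LOCK = new ReentrantLock();
  private static long counter;

  static void intrinsic() { synchronized (LockBench.class) { counter++; } }

  static void explicit() {
    LOCK.lock();
    try { counter++; } finally { LOCK.unlock(); }
  }

  public static void main(String[] args) {
    final int n = 50_000_000;
    for (int i = 0; i < n / 10; i++) { intrinsic(); explicit(); } // warm up
    long t = System.nanoTime();
    for (int i = 0; i < n; i++) intrinsic();
    System.out.println("intrinsic ms: " + (System.nanoTime() - t) / 1_000_000);
    t = System.nanoTime();
    for (int i = 0; i < n; i++) explicit();
    System.out.println("explicit ms:  " + (System.nanoTime() - t) / 1_000_000);
    System.out.println(counter); // keep the JIT from eliding the loops
  }
}
{code}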






[jira] [Updated] (HBASE-15453) Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner

2016-03-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15453:
--
Attachment: 15453-0.98.txt

Here's the patch with test.

[~apurtell], [~stack], and anybody else, if you guys have some time, could you 
apply this patch and run TestScanFilterPerformance?
Then undo the changes to StoreScanner and ReversedStoreScanner and run the test 
again.

The test does some warm up, then runs the perf test 10 times and calculates 
mean and standard deviation.
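
The mean/sigma part is straightforward; a sketch matching the output shape 
above (assuming a population sigma, i.e. dividing by n; the attached test may 
divide by n-1 instead):
{code}
class PerfReport {
  // Prints in the same shape as the "10 runs  mean:... sigma:..." lines.
  static void report(long[] runtimesMs) {
    double mean = 0;
    for (long r : runtimesMs) mean += r;
    mean /= runtimesMs.length;
    double var = 0;
    for (long r : runtimesMs) var += (r - mean) * (r - mean);
    double sigma = Math.sqrt(var / runtimesMs.length);
    System.out.println(runtimesMs.length + " runs  mean:" + mean
        + " sigma:" + sigma);
  }
}
{code}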


> Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner
> ---
>
> Key: HBASE-15453
> URL: https://issues.apache.org/jira/browse/HBASE-15453
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15453-0.98.txt
>
>
> In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
> StoreScanner are slower than explicit locks.
> I was surprised by this. To make sure I added a simple perf test and many 
> folks ran it on their machines. All found that explicit locks were faster.
> Now... I just ran that test again. On the latest JDK8 I find that now the 
> intrinsic locks are significantly faster:
> Explicit locks:
> 10 runs  mean:2223.6 sigma:72.29412147609237
> Intrinsic locks:
> 10 runs  mean:1865.3 sigma:32.63755505548784
> I confirmed the same with timing some Phoenix scans. We can save a bunch of 
> time by changing this back 
> Arrghhh... So maybe it's time to revert this now...?
> (Note that in trunk due to [~ram_krish]'s work, we do not lock in 
> StoreScanner anymore)
> I'll attach the perf test and a patch that changes lock to synchronized, if 
> some folks could run this on 0.98, that'd be great.





[jira] [Commented] (HBASE-15453) Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner

2016-03-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192562#comment-15192562
 ] 

Lars Hofhansl commented on HBASE-15453:
---

Another 4 runs:
Intrinsic:
10 runs  mean:1918.1 sigma:24.122396232547054
10 runs  mean:1898.5 sigma:22.531089631884207

Explicit:
10 runs  mean:2164.2 sigma:21.48394749574668
10 runs  mean:2145.9 sigma:22.322410264126948


> Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner
> ---
>
> Key: HBASE-15453
> URL: https://issues.apache.org/jira/browse/HBASE-15453
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15453-0.98.txt
>
>
> In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
> StoreScanner are slower than explicit locks.
> I was surprised by this. To make sure I added a simple perf test and many 
> folks ran it on their machines. All found that explicit locks were faster.
> Now... I just ran that test again. On the latest JDK8 I find that now the 
> intrinsic locks are significantly faster:
> Explicit locks:
> 10 runs  mean:2223.6 sigma:72.29412147609237
> Intrinsic locks:
> 10 runs  mean:1865.3 sigma:32.63755505548784
> I confirmed the same with timing some Phoenix scans. We can save a bunch of 
> time by changing this back 
> Arrghhh... So maybe it's time to revert this now...?
> (Note that in trunk due to [~ram_krish]'s work, we do not lock in 
> StoreScanner anymore)
> I'll attach the perf test and a patch that changes lock to synchronized, if 
> some folks could run this on 0.98, that'd be great.





[jira] [Updated] (HBASE-15453) Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner

2016-03-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15453:
--
Description: 
In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
StoreScanner are slower than explicit locks.

I was surprised by this. To make sure I added a simple perf test and many folks 
ran it on their machines. All found that explicit locks were faster.

Now... I just ran that test again. On the latest JDK8 I find that now the 
intrinsic locks are significantly faster:
(OpenJDK Runtime Environment (build 1.8.0_72-b15))

Explicit locks:
10 runs  mean:2223.6 sigma:72.29412147609237

Intrinsic locks:
10 runs  mean:1865.3 sigma:32.63755505548784

I confirmed the same with timing some Phoenix scans. We can save a bunch of 
time by changing this back 

Arrghhh... So maybe it's time to revert this now...?

(Note that in trunk due to [~ram_krish]'s work, we do not lock in StoreScanner 
anymore)

I'll attach the perf test and a patch that changes lock to synchronized, if 
some folks could run this on 0.98, that'd be great.


  was:
In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
StoreScanner are slower than explicit locks.

I was surprised by this. To make sure I added a simple perf test and many folks 
ran it on their machines. All found that explicit locks were faster.

Now... I just ran that test again. On the latest JDK8 I find that now the 
intrinsic locks are significantly faster:

Explicit locks:
10 runs  mean:2223.6 sigma:72.29412147609237

Intrinsic locks:
10 runs  mean:1865.3 sigma:32.63755505548784

I confirmed the same with timing some Phoenix scans. We can save a bunch of 
time by changing this back 

Arrghhh... So maybe it's time to revert this now...?

(Note that in trunk due to [~ram_krish]'s work, we do not lock in StoreScanner 
anymore)

I'll attach the perf test and a patch that changes lock to synchronized, if 
some folks could run this on 0.98, that'd be great.



> Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner
> ---
>
> Key: HBASE-15453
> URL: https://issues.apache.org/jira/browse/HBASE-15453
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15453-0.98.txt
>
>
> In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
> StoreScanner are slower than explicit locks.
> I was surprised by this. To make sure I added a simple perf test and many 
> folks ran it on their machines. All found that explicit locks were faster.
> Now... I just ran that test again. On the latest JDK8 I find that now the 
> intrinsic locks are significantly faster:
> (OpenJDK Runtime Environment (build 1.8.0_72-b15))
> Explicit locks:
> 10 runs  mean:2223.6 sigma:72.29412147609237
> Intrinsic locks:
> 10 runs  mean:1865.3 sigma:32.63755505548784
> I confirmed the same with timing some Phoenix scans. We can save a bunch of 
> time by changing this back 
> Arrghhh... So maybe it's time to revert this now...?
> (Note that in trunk due to [~ram_krish]'s work, we do not lock in 
> StoreScanner anymore)
> I'll attach the perf test and a patch that changes lock to synchronized, if 
> some folks could run this on 0.98, that'd be great.





[jira] [Commented] (HBASE-15453) Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner

2016-03-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192726#comment-15192726
 ] 

Lars Hofhansl commented on HBASE-15453:
---

Work machine (OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4))
Intrinsic:
10 runs  mean:1749.9 sigma:25.76606295109907
10 runs  mean:1782.8 sigma:54.50100916496868

Explicit:
10 runs  mean:1487.6 sigma:117.99169462296913
10 runs  mean:1479.6 sigma:41.01511916354749

So on JDK7 the explicit locks are (still inexplicably) faster.


> Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner
> ---
>
> Key: HBASE-15453
> URL: https://issues.apache.org/jira/browse/HBASE-15453
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15453-0.98.txt
>
>
> In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
> StoreScanner are slower than explicit locks.
> I was surprised by this. To make sure I added a simple perf test and many 
> folks ran it on their machines. All found that explicit locks were faster.
> Now... I just ran that test again. On the latest JDK8 I find that now the 
> intrinsic locks are significantly faster:
> (OpenJDK Runtime Environment (build 1.8.0_72-b15))
> Explicit locks:
> 10 runs  mean:2223.6 sigma:72.29412147609237
> Intrinsic locks:
> 10 runs  mean:1865.3 sigma:32.63755505548784
> I confirmed the same with timing some Phoenix scans. We can save a bunch of 
> time by changing this back 
> Arrghhh... So maybe it's time to revert this now...?
> (Note that in trunk due to [~ram_krish]'s work, we do not lock in 
> StoreScanner anymore)
> I'll attach the perf test and a patch that changes lock to synchronized, if 
> some folks could run this on 0.98, that'd be great.





[jira] [Updated] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-14 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-15452:
--
Attachment: 15452-0.98.txt

Oops. Here it is. Trivial.

> Consider removing checkScanOrder from StoreScanner.next
> ---
>
> Key: HBASE-15452
> URL: https://issues.apache.org/jira/browse/HBASE-15452
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15452-0.98.txt
>
>
> In looking at why we spent so much time in StoreScanner.next when doing a 
> simple Phoenix count\(*) query I came across checkScanOrder. Not only is this 
> a function dispatch (that the JIT would eventually inline), it also requires 
> setting the prevKV member for every Cell encountered.
> Removing that logic yields a measurable end-to-end improvement of 5-20% (in 
> 0.98).
> I will repeat this test on my work machine tomorrow.
> I think we're stable enough to remove that check anyway.





[jira] [Commented] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-14 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193690#comment-15193690
 ] 

Lars Hofhansl commented on HBASE-15452:
---

Two runs with patch applied (test from HBASE-15453)
10 runs  mean:1727.2 sigma:45.20353968440967
10 runs  mean:1769.9 sigma:16.53148511174964

Two runs without patch:
10 runs  mean:2194.8 sigma:56.31838065853811
10 runs  mean:2183.0 sigma:31.135189095298585

So this saved about 20% of the runtime! It shows again that these code paths 
are extremely hot; every instruction we can save in them will be noticed!
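
(Checking the numbers: with the patch the mean is (1727.2 + 1769.9) / 2 = 
1748.6 ms, without it (2194.8 + 2183.0) / 2 = 2188.9 ms; 1 - 1748.6 / 2188.9 
is roughly 20%.)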

> Consider removing checkScanOrder from StoreScanner.next
> ---
>
> Key: HBASE-15452
> URL: https://issues.apache.org/jira/browse/HBASE-15452
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15452-0.98.txt
>
>
> In looking at why we spent so much time in StoreScanner.next when doing a 
> simple Phoenix count\(*) query I came across checkScanOrder. Not only is this 
> a function dispatch (that the JIT would eventually inline), it also requires 
> setting the prevKV member for every Cell encountered.
> Removing that logic yields a measurable end-to-end improvement of 5-20% (in 
> 0.98).
> I will repeat this test on my work machine tomorrow.
> I think we're stable enough to remove that check anyway.





[jira] [Commented] (HBASE-15453) Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner

2016-03-14 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193701#comment-15193701
 ] 

Lars Hofhansl commented on HBASE-15453:
---

What's the best way to just commit the perf test? I want to easily be able to 
run it without compiling and installing HBase somewhere.

Right now it's a test that displays the runtime in its failure message. Any 
better way to do this? I suppose I could check it in but comment out the 
\@Test marker by default...?
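
One conventional option, sketched here (not the attached patch): keep the test 
compiled but excluded from normal runs via JUnit's @Ignore, so nothing has to 
be uncommented to run it locally:
{code}
import org.junit.Ignore;
import org.junit.Test;

public class TestScanFilterPerformance {
  @Ignore("manual perf test; enable locally and read the runtime from the output")
  @Test
  public void testScanPerformance() {
    // warm up, run the scan 10 times, print mean and sigma
  }
}
{code}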

> Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner
> ---
>
> Key: HBASE-15453
> URL: https://issues.apache.org/jira/browse/HBASE-15453
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15453-0.98.txt
>
>
> In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
> StoreScanner are slower than explicit locks.
> I was surprised by this. To make sure I added a simple perf test and many 
> folks ran it on their machines. All found that explicit locks were faster.
> Now... I just ran that test again. On the latest JDK8 I find that now the 
> intrinsic locks are significantly faster:
> (OpenJDK Runtime Environment (build 1.8.0_72-b15))
> Explicit locks:
> 10 runs  mean:2223.6 sigma:72.29412147609237
> Intrinsic locks:
> 10 runs  mean:1865.3 sigma:32.63755505548784
> I confirmed the same with timing some Phoenix scans. We can save a bunch of 
> time by changing this back 
> Arrghhh... So maybe it's time to revert this now...?
> (Note that in trunk due to [~ram_krish]'s work, we do not lock in 
> StoreScanner anymore)
> I'll attach the perf test and a patch that changes lock to synchronized, if 
> some folks could run this on 0.98, that'd be great.





[jira] [Commented] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-14 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193712#comment-15193712
 ] 

Lars Hofhansl commented on HBASE-15452:
---

This patch together with HBASE-15453 (JDK8)
10 runs  mean:1430.6 sigma:11.612062693595828

That saves over 1/3 of the runtime!!
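
(Against the ~2188.9 ms no-patch baseline above: 1 - 1430.6 / 2188.9 is about 
35%, slightly more than a third.)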


> Consider removing checkScanOrder from StoreScanner.next
> ---
>
> Key: HBASE-15452
> URL: https://issues.apache.org/jira/browse/HBASE-15452
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15452-0.98.txt
>
>
> In looking at why we spent so much time in StoreScanner.next when doing a 
> simple Phoenix count\(*) query I came across checkScanOrder. Not only is this 
> a function dispatch (that the JIT would eventually inline), it also requires 
> setting the prevKV member for every Cell encountered.
> Removing that logic yields a measurable end-to-end improvement of 5-20% (in 
> 0.98).
> I will repeat this test on my work machine tomorrow.
> I think we're stable enough to remove that check anyway.





[jira] [Commented] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-14 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193773#comment-15193773
 ] 

Lars Hofhansl commented on HBASE-15452:
---

Wait... we're running the tests with assertions enabled, right? So this would 
not behave like production.
In the description I measured this end-to-end in "production" with Phoenix.


> Consider removing checkScanOrder from StoreScanner.next
> ---
>
> Key: HBASE-15452
> URL: https://issues.apache.org/jira/browse/HBASE-15452
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15452-0.98.txt
>
>
> In looking why we spent so much time in StoreScanner.next when doing a simple 
> Phoenix count\(*) query I came across checkScanOrder. Not only is this a 
> function dispatch (that the JIT would eventually inline), it also requires 
> setting the prevKV member for every Cell encountered.
> Removing that logic a yields measurable end-to-end improvement of 5-20% (in 
> 0.98).
> I will repeat this test on my work machine tomorrow.
> I think we're stable enough to remove that check anyway.





[jira] [Commented] (HBASE-15453) [Performance] Considering reverting HBASE-10015 - reinstance synchronized in StoreScanner

2016-03-14 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194697#comment-15194697
 ] 

Lars Hofhansl commented on HBASE-15453:
---

Note that HBASE-10015 had many folks confirm the numbers back then; explicit 
locks were better.
Also note that with JDK7, tested at work, explicit locks were still better.
Only with JDK8 at home did I find that intrinsic locks (synchronized) were 
better.

It's not a slam dunk; it depends on the JDK.


> [Performance] Considering reverting HBASE-10015 - reinstance synchronized in 
> StoreScanner
> -
>
> Key: HBASE-15453
> URL: https://issues.apache.org/jira/browse/HBASE-15453
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Critical
> Attachments: 15453-0.98.txt
>
>
> In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
> StoreScanner are slower than explicit locks.
> I was surprised by this. To make sure I added a simple perf test and many 
> folks ran it on their machines. All found that explicit locks were faster.
> Now... I just ran that test again. On the latest JDK8 I find that now the 
> intrinsic locks are significantly faster:
> (OpenJDK Runtime Environment (build 1.8.0_72-b15))
> Explicit locks:
> 10 runs  mean:2223.6 sigma:72.29412147609237
> Intrinsic locks:
> 10 runs  mean:1865.3 sigma:32.63755505548784
> I confirmed the same with timing some Phoenix scans. We can save a bunch of 
> time by changing this back 
> Arrghhh... So maybe it's time to revert this now...?
> (Note that in trunk due to [~ram_krish]'s work, we do not lock in 
> StoreScanner anymore)
> I'll attach the perf test and a patch that changes lock to synchronized, if 
> some folks could run this on 0.98, that'd be great.





[jira] [Resolved] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-14 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-15452.
---
Resolution: Invalid

NM... I'm full of @#%^

> Consider removing checkScanOrder from StoreScanner.next
> ---
>
> Key: HBASE-15452
> URL: https://issues.apache.org/jira/browse/HBASE-15452
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 15452-0.98.txt
>
>
> In looking at why we spent so much time in StoreScanner.next when doing a 
> simple Phoenix count\(*) query I came across checkScanOrder. Not only is this 
> a function dispatch (that the JIT would eventually inline), it also requires 
> setting the prevKV member for every Cell encountered.
> Removing that logic yields a measurable end-to-end improvement of 5-20% (in 
> 0.98).
> I will repeat this test on my work machine tomorrow.
> I think we're stable enough to remove that check anyway.




