[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831719#comment-13831719
 ] 

stack commented on HBASE-10015:
---

On mac I see this w/ jvm1.8:

Failed tests:   
testScanFilterPerformance(org.apache.hadoop.hbase.regionserver.TestScanFilterPerformance):
 10 runs  mean:3032.9 sigma:60.501983438561744
Failed tests:   
testScanFilterPerformance(org.apache.hadoop.hbase.regionserver.TestScanFilterPerformance):
 10 runs  mean:2322.4 sigma:119.92430946226041

Let me try another box.  How can I be sure there is no regression for wide 
tables?

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831721#comment-13831721
 ] 

Andrew Purtell commented on HBASE-10015:


JDK 7, 16 vcores, SSD, xfs

Without: 5 runs, Mean: 14.4136 Sigma: 0.17013712116995514

With the 0.94 -lock patch: 5 runs, Mean: 10.614 Sigma: 0.18904179432072687

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831740#comment-13831740
 ] 

Andrew Purtell commented on HBASE-10015:


This is the same system with TestScanFilterPerformance:

Without: 10 runs  mean:2248.7 sigma:36.59795076230362

With the 0.94 -lock patch: 10 runs  mean:1860.3 sigma:29.397448868906977

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831742#comment-13831742
 ] 

Ted Yu commented on HBASE-10015:


On Linux with jdk 1.7 :

with patch:
Failed tests:   
testScanFilterPerformance(org.apache.hadoop.hbase.regionserver.TestScanFilterPerformance):
 10 runs  mean:2297.5 sigma:23.29484921608208

without:
Failed tests:   
testScanFilterPerformance(org.apache.hadoop.hbase.regionserver.TestScanFilterPerformance):
 10 runs  mean:3242.4 sigma:732.0478399667605

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831814#comment-13831814
 ] 

Lars Hofhansl commented on HBASE-10015:
---

So it seems this is universal.
Our performance dude has actually confirmed that this is an expected outcome. 
I'll gather some more CPU metrics as I find time today.

[~stack], these locks are taken in StoreScanner, which is row based (unlike 
StoreFileScanner and MemstoreScanner). So the frequent locks/unlocks there we'd 
mostly see per row. Wider rows won't be slower, but the speedup effect would be 
proportionally less. To be sure I'll validate with wider tables (maybe 20 CQs). 
I will also double check that biased locking is in effect.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13832029#comment-13832029
 ] 

Lars Hofhansl commented on HBASE-10015:
---

10 columns, 100 byte values, 1st and 4th selected.
5 runs, Mean: 13.756 Sigma: 0.04210938137755054, with lock patch
5 runs, Mean: 14.2028 Sigma: 0.08452313292821084, without

10 columns, all selected
5 runs, Mean: 8.577 Sigma: 0.09060463564299566
5 runs, Mean: 9.9846 Sigma: 0.1023749969474969, without

Per KV cost now is dominant. In no scenario have I observed this to be slower.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13832032#comment-13832032
 ] 

Ted Yu commented on HBASE-10015:


+1

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13832107#comment-13832107
 ] 

Lars Hofhansl commented on HBASE-10015:
---

I force enabled biased locking. No improvement with intrinsic locking; 
according to the docs it is enabled in JDK6+ anyway, was just making sure.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13832147#comment-13832147
 ] 

Enis Soztutar commented on HBASE-10015:
---

If we do ref counting, can't we still finish the compaction, but not archive 
the files as long as there are scanners against them. We can do it by 
decoupling the store file archiving from compaction. At the worst case on RS 
crash, we might end up already compacted files which won't affect semantics. 
Regardless, +1 on -lock patch just for the gains. 

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13832162#comment-13832162
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Yeah, that's the idea. We'd delay archiving any HFile until no scanners are 
referring to it any more.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831001#comment-13831001
 ] 

stack commented on HBASE-10015:
---

So you are thinking this cannot be completed until after we add delayed clean 
up of no-longer-used files?  Only then can we safely remove synchronizations?

How many threads we talking anyways?  It should be uncontended.  The only 
thread is the current handler asking to return scan results -- this changes as 
different handlers come in on each bulk next invocation -- and then an 
incidental update readers request..and that is it?

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1383#comment-1383
 ] 

Lars Hofhansl commented on HBASE-10015:
---

I think so (to your first point).

This is (almost) never contented, but StoreScanner.peek() is called *very* 
frequently (including the compares in KeyValueHeap) and the memory fences 
enforced by synchronized cause a slowdown.

I did notice that during flushed and compactions we do *not* register any 
listeners for changed readers, so my earlier idea of just synchronizing on the 
RegionScannerImpl should work after all.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831165#comment-13831165
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Was just looking at making a trunk patch.
In trunk StoreScanners for compaction *do* register a change observer? But not 
in 0.94?!

So - sigh - the patch I just made for 0.94 will not work in trunk. And maybe 
more importantly: Is there a lingering bug in 0.94?

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-new-sample.txt, 10015-0.94-v2.txt, 
 10015-0.94-v3.txt, 10015-0.94-v4.txt, 10015-0.94-withtest.txt, 
 10015-0.94.txt, 10015-trunk-v2.txt, 10015-trunk-v3.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831191#comment-13831191
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Continuing my monologue here... :)

To my surprise I get a 25-30% perf improvement when I just replace intrinsic 
locking (synchronized) with ReentrantLock - again for very tall tables only.
synchronized with biased locking should outperform the ReentrantLock in 
uncontended cases, but it does not (all my tests are against a real 
RegionServer, so the initial delay for BiasedLocking has not effect here). 

Something is going on.

This is on JDK7 on old'ish 2 core machine. Will try on some other machines as 
well. If this bears out on other machines as well it would be safe and quick 
win.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831202#comment-13831202
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Verified on different machine architecture with JDK6 (using the attached 
unittest).
4400ms vs 3200ms (JDK7, 2 core machine)
2600mx vs 2000ms (JDK6, 12 core machine)

Again, if somebody is still reading and could verify the latest patch with the 
earlier unittest on their hardware that would be greatly appreciated.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-lock.txt, 10015-0.94-new-sample.txt, 
 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830719#comment-13830719
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Something like this. 

On order for the compaction to make progress we could assign scanners to 
epochs, where everytime we change the readers for a store we go to a new epoch. 
If all scanners for an epoch are either done or have switched to the new epoch, 
we can retire the readers of that epoch.

In any case, just commit this change and keep working on it? As Vladimir points 
out there are other issues with more serious performance implications.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-23 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830735#comment-13830735
 ] 

Andrew Purtell commented on HBASE-10015:


bq. On order for the compaction to make progress we could assign scanners to 
epochs, where everytime we change the readers for a store we go to a new epoch. 
If all scanners for an epoch are either done or have switched to the new epoch, 
we can retire the readers of that epoch.

I haven't thought about this nearly as much as you have recently but that 
sounds promising.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830739#comment-13830739
 ] 

stack commented on HBASE-10015:
---

+1 on committing what is done already.

epoch sounds like refcounting?  Yeah, no hurry deleting the old stuff as long 
as it is done eventually.  Accounting would be easier if we could move/rename 
files under the scanner as long is it does not disrupt (maybe I can try this).

I like your idea of lockless scanning.  Would be good to put it up as a goal 
even if hard to attain, if only to orientate which way progress lies.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830786#comment-13830786
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Epoch would be like reference counting per distinct sweet of readers. With just 
a reference count of scanners I'd be worried that compaction would never make 
any progress.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830849#comment-13830849
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Actually this is not quite right in 0.94, because it does not have HBASE-6499. 
(seek does not call checkReseek). I'll make that change as well, also 
checkUpdatedReaders can be folded into checkReseek for better readability.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830857#comment-13830857
 ] 

Lars Hofhansl commented on HBASE-10015:
---

No longer sure that patch is good. The problem was that StoreScanner.peek() is 
synchronized.

With the patch peek() will use the old scanner stack, looking at 
MemstoreScanner and StoreFileScanner it *is* correct (both scanners just return 
the current KV and StoreFileScanner does not touch its reader during peek), but 
it is fragile, it also requires StoreScanner to hang on to all 
StoreFileScanners and the MemstoreScanner, until checkReseek is called or the 
scanner is closed. This in turn means that we keep references to the readers 
open, etc.

I keep doing that... Filing jiras with patches and then finding that the fix is 
a bad idea.
Well, at least I learned about modern CPU and that synchronized in tight loops 
is very expensive.

I will keep thinking about this. Unscheduling for now.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829824#comment-13829824
 ] 

Hadoop QA commented on HBASE-10015:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12615283/10015-trunk-v4.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7968//console

This message is automatically generated.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830174#comment-13830174
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Thanks Stack.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 10015-trunk.txt, 
 TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830243#comment-13830243
 ] 

Hadoop QA commented on HBASE-10015:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12615356/10015-trunk-v4.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7970//console

This message is automatically generated.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 10015-trunk.txt, 
 TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830296#comment-13830296
 ] 

Hadoop QA commented on HBASE-10015:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12615371/10015-trunk-v4.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7972//console

This message is automatically generated.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830309#comment-13830309
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Need to look at the find bugs thing

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830333#comment-13830333
 ] 

stack commented on HBASE-10015:
---

+1 on commit.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830464#comment-13830464
 ] 

Lars Hofhansl commented on HBASE-10015:
---

There's more stuff to do. I got a crash course on CPU design from Hussam Mousa, 
one of our performance experts.
We saw a lot of CPU frontend and backend stalls during scanning, so there is 
potential for a lot more improvements.

I'm still looking for ways to remove all synchronization from StoreScanner and 
RegionScannerImpl.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830467#comment-13830467
 ] 

Lars Hofhansl commented on HBASE-10015:
---

I'll look at the findbugs issue, do some more tests, and then commit.

An interesting metric we have to start to pay more attention is the scan cost 
per KV.
For example scanning through 1m rows with one CQ is *much* slower than scanning 
through 100k rows with 10 CQs, even though it touches the same number of KVs. 
This patch helps a bit to even that out.

As [~stack] and I said in the comments here, it should be possible to remove 
all synchronization form StoreScanner and RegionScannerImpl. It would require 
some refactoring.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830481#comment-13830481
 ] 

stack commented on HBASE-10015:
---

bq. It would require some refactoring.

Lets make a plan.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830495#comment-13830495
 ] 

Lars Hofhansl commented on HBASE-10015:
---

The findbugs warning is a dud, it complains about StoreScanner.heap being 
locked 79% of the cases.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830496#comment-13830496
 ] 

Vladimir Rodionov commented on HBASE-10015:
---

{quote}
 so there is potential for a lot more improvements
{quote}

KV creation (new) is one of the serious bottlenecks, but I have no idea how to 
not create new instances on *next*. I have done some HBase internal hacks to 
get Maximum possible performance from scan. It is the multi-threaded  
application (in my case - 8HT threads) and scans on StoreFileScanner directly 
(data is cached 100%).  The table was tall and narrow.

Stock HBase was able to reach 50M KV per sec
Stock with KV reuse (hack) - 90M KV per sec.




 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830498#comment-13830498
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Here's another idea:
* we already lock the RegionScannerImpl.next/seek/reseek, etc.
* what if we pass the RegionScannerImpl instance as a lock object to 
StoreScanner
* in updateReaders() we'd then lock on that RegionScannerImpl instance, or we'd 
call a special synchronized method in RegionScannerImpl and have it update the 
readers.

The lock scope would be broader, but it should be correct, and it would allow 
us to remove all locking from StoreScanner.
I'll experiment with that.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830499#comment-13830499
 ] 

Lars Hofhansl commented on HBASE-10015:
---

[~vrodionov], yeah, need to work on that too. Is that with block encoding? When 
using block encoding we need to copy the underlying byte[]. When no block 
encoding is used making a new KV is just a few dozen bytes.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830578#comment-13830578
 ] 

Lars Hofhansl commented on HBASE-10015:
---

bq. pass the RegionScannerImpl instance as a lock object to StoreScanner

Alas, that works fine for scanning, but not for compactions/flushes (which also 
use StoreScanner, and can actually override the StoreScanner via a coprocessor 
hook).


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830580#comment-13830580
 ] 

stack commented on HBASE-10015:
---

bq. but not for compactions/flushes 

Tell us more?  Am interested.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830588#comment-13830588
 ] 

Lars Hofhansl commented on HBASE-10015:
---

I thought, since we lock all operations at the RegioScannerImpl level anyway, 
we could exclude a Store's updateReaders at that level too. That works fine if 
we have a RegionScanner, but in the case of compactions and flushes we don't 
one, so I have no way to protect the StoreScanner reading on behalf of a flush 
or a compaction.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830597#comment-13830597
 ] 

stack commented on HBASE-10015:
---

And it would be silly creating a regionscannerimpl to do a storescan.  We need 
to do update readers differently.  Compactions and flushes are rare and 
background tasks.  They should defer to the ongoing scans.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830600#comment-13830600
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Yeah, I'm coming around to that. :)
We have to figure out how to do that without excessive synchronization. At the 
minimum we have to guarantee that nobody is currently using the readers in 
question before we retire them.
Alternatively we wait for all running scanners to finish. How do we do that?


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830605#comment-13830605
 ] 

Lars Hofhansl commented on HBASE-10015:
---

I can come up with various schemes for that, but nothing that is cheaper than 
just synchronizing all the methods.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830612#comment-13830612
 ] 

stack commented on HBASE-10015:
---

+ Scans run free for N seconds or nanoseconds since checking this should be 
cheap(?) and then they go to a checkpoint where they look to see if they should 
reset.  ChangedReaders blocks until checkpoint has been cleared.
+ Scans check for closing being set on each op (I suppose this check of a 
volatile would be just as bad as a synchronization).
+ We refcount outstanding scanners and only delete compacted files when 
refcount goes to zero.  Not sure how we'd switch in flushes unless we prevent 
the flush happening while ongoing scan (eek).



 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk-v4.txt, 
 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829403#comment-13829403
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Thanks [~vrodionov], we'll look at RegionScanner next. :)  My evil plans is to 
eventually get rid of all synchronization during scanning and provide exclusion 
by containment instead.

The overall improvement is real. Validated on various different machines. The 
effect of memory stalls is probably more pronounced in the running server.

In any case, I think we agree that this patch can't make things worse (provide 
it is correct, of course).
I also tried with count(*) queries in Phoenix on tall tables (to rule out some 
anomalies with Filters). I see a 90% improvement there - again only on tall 
tables.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, 
 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829302#comment-13829302
 ] 

Lars Hofhansl commented on HBASE-10015:
---

I'll make a trunk patch later today.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829205#comment-13829205
 ] 

Ted Yu commented on HBASE-10015:


I tried to run the unit test but got:
{code}
testScanFilterPerformance(org.apache.hadoop.hbase.regionserver.TestScanFilterPerformance)
  Time elapsed: 0.007 sec   ERROR!
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
/user/tyu/hbase/hbase.version could only be replicated to 0 nodes, instead of 1
  at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
{code}

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829416#comment-13829416
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Some more results:
10 runs  mean:1851.0 sigma:127.71374240855992
vs (without StoreScanner patch)
10 runs  mean:2795.8 sigma:57.52182194611015


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, 
 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829423#comment-13829423
 ] 

Lars Hofhansl commented on HBASE-10015:
---

{code}
0: jdbc:phoenix:localhost select count(*) from tableTall;
+--+
| COUNT(1) |
+--+
| 2000 |
+--+
1 row selected (3.946 seconds)
{code}
With patch:
{code}
0: jdbc:phoenix:localhost select count(*) from tableTall;
+--+
| COUNT(1) |
+--+
| 2000 |
+--+
1 row selected (2.915 seconds)
{code}

Phoenix is using FAST_DIFF encoding and the ExplicitColumnTracker, so the 
relative gain is only 25% there (one CQ per row)

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, 
 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829485#comment-13829485
 ] 

Hadoop QA commented on HBASE-10015:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12615205/10015-trunk.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestHRegion
  org.apache.hadoop.hbase.util.TestMergeTable
  org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7965//console

This message is automatically generated.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, 
 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by 

[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829212#comment-13829212
 ] 

Lars Hofhansl commented on HBASE-10015:
---

That would be related to your setup I think. Seems like it can't start the mini 
cluster.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829225#comment-13829225
 ] 

Andrew Purtell commented on HBASE-10015:


[~te...@apache.org] that's probably HBASE-5711, do 'umask 022' first.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829365#comment-13829365
 ] 

Vladimir Rodionov commented on HBASE-10015:
---

I ran RegionScanner and see  ~10% improvement on on a very narrow rows only (8 
store files in a region). This is 0.94.6. By the way the performance difference 
between StoreScanner and RegionScanner is huge (almost 3x times).

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829268#comment-13829268
 ] 

Ted Yu commented on HBASE-10015:


200 iterations of TestStoreScanner, TestAtomicOperation and TestAcidGuarantees 
passed based on 0.94 patch.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829118#comment-13829118
 ] 

Lars Hofhansl commented on HBASE-10015:
---

If somebody could run the attached unit test that would be greatly appreciated.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829223#comment-13829223
 ] 

Andrew Purtell commented on HBASE-10015:


Without: 
testScanFilterPerformance(org.apache.hadoop.hbase.regionserver.TestScanFilterPerformance):
 10 runs  mean:2224.3 sigma:24.178709642989634

With: 
testScanFilterPerformance(org.apache.hadoop.hbase.regionserver.TestScanFilterPerformance):
 10 runs  mean:1373.3 sigma:14.920120642943875

This is on an EC2 c3.2xlarge, which uses 16 of 20 available hardware threads of 
a two socket board with Xeon E5-2680s, backed by SSDs. 

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829547#comment-13829547
 ] 

Lars Hofhansl commented on HBASE-10015:
---

The failures are all NPEs. Two of them could be related, the last is during DFS 
cluster shutdown.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, 
 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829552#comment-13829552
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Good together with some of our hardware exports. We measured some of a perf 
counters and found that with the patch we see 50% less branch-misses, which is 
significant.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, 
 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829576#comment-13829576
 ] 

Lars Hofhansl commented on HBASE-10015:
---

10 runs  mean:2770.5 sigma:85.37827592543668, without
10 runs  mean:1791.2 sigma:50.81495842761264, with

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-withtest.txt, 
 10015-0.94.txt, 10015-trunk-v2.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829603#comment-13829603
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Still getting the NPE in trunk in TestHRegion with v2, though (0.94 is fine 
with v2)... Looking.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-withtest.txt, 
 10015-0.94.txt, 10015-trunk-v2.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829230#comment-13829230
 ] 

Ted Yu commented on HBASE-10015:


@Andy:
Thanks for the hint.
I ran 'umask 022' first but got the same result on Mac.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829624#comment-13829624
 ] 

Hadoop QA commented on HBASE-10015:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12615249/10015-trunk-v2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7966//console

This message is automatically generated.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-withtest.txt, 
 10015-0.94.txt, 10015-trunk-v2.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829645#comment-13829645
 ] 

stack commented on HBASE-10015:
---

The numbers are great.  Patch looks good given the premise.  Was going to 
suggest volatile instead of AtomicBoolean but looks like the getAndSet is 
fundamental so withdraw this nit.  Is test failure because of this patch?



 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-withtest.txt, 
 10015-0.94.txt, 10015-trunk-v2.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829649#comment-13829649
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Unfortunately I am no longer sure it actually works. I identified the problem:
In HStore.completeCompaction we call notifyChangedReadersObservers, which calls 
updateReaders on all StoreScanners. Before this patch, this method would block 
if there was a StoreScanner with a next/seek/reseek method. So before that 
patch one would guarantee that after notifyChangedReadersObservers returns it 
is safe to remove the compated files. That is no longer true.
I'll see if I can come up with something. It is a shame that we have to 
synchronize and get 10's of millions of branch mispredictions per second, just 
so we can compact a few times a day.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-withtest.txt, 
 10015-0.94.txt, 10015-trunk-v2.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829670#comment-13829670
 ] 

Lars Hofhansl commented on HBASE-10015:
---

A possible solution is keeping updateReaders() all methods that actually call 
checkReseek synchronized.
So peek() could still go without synchronization. Will test the performance 
with that.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-withtest.txt, 
 10015-0.94.txt, 10015-trunk-v2.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829673#comment-13829673
 ] 

Lars Hofhansl commented on HBASE-10015:
---

That seems to be correct. updateReaders now waits until all operations that are 
effected by the reader changes to finish. peek() does not need to be 
synchronized as long as we do not change scanner stack from under its feet, 
which we avoid by deferring to the scanner thread to do that.

The unittest now yields this:
10 runs  mean:4449.2 sigma:24.395901295094635, without patch
10 runs  mean:2609.2 sigma:67.94085663280968, with patch

So, still a significant win, albeit to quite what it was before.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829679#comment-13829679
 ] 

stack commented on HBASE-10015:
---

Can we have it so no changing of readers during a scan?

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829686#comment-13829686
 ] 

Lars Hofhansl commented on HBASE-10015:
---

I think it would be hard to
# push reader switching to the scanning thread
# guarantee that the readers will eventually be switched to the compaction can 
proceed and finished

What if a scanner isn't closed? Or nobody ever calls next() on it? Then we have 
to wait for the lease to expire before the compaction can finish.

next()/seek()/reseek() are not as critical as they are not called remotely as 
often as peek. peek() is also a very small method and (according to our perf 
expert here) small synchronized methods have the greatest potential to throw 
the branch prediction off.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829695#comment-13829695
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Phoenix with v3 (different machines than above, so do not compare absolute 
numbers).
Without patch:
{code}
0: jdbc:phoenix:localhost select count(*) from myNew1;
+--+
| COUNT(1) |
+--+
| 3000 |
+--+
1 row selected (18.417 seconds)
{code}

With patch:
{code}
0: jdbc:phoenix:localhost select count(*) from myNew1;
+--+
| COUNT(1) |
+--+
| 3000 |
+--+
1 row selected (13.319 seconds)
{code}


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829698#comment-13829698
 ] 

stack commented on HBASE-10015:
---

bq. Then we have to wait for the lease to expire before the compaction can 
finish.

I was thinking this might not be the end of the world.

What if we closed and reopened the current scanner when readers are changed. It 
is a rare event.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829700#comment-13829700
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Actually now it does not even need the AtomicBoolean anymore.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829701#comment-13829701
 ] 

Lars Hofhansl commented on HBASE-10015:
---

The closing and reopening is what needs the synchronization :)
A thread might still be using the scanner. If we can make it so that we do not 
have to synchronize all methods in StoreScanner, and still do it safely I'd be 
happy... But frankly ATM I do not see how without other expensive 
synchronization.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829732#comment-13829732
 ] 

stack commented on HBASE-10015:
---

bq. The closing and reopening is what needs the synchronization 

I'm not helping. 

This looks like an issue:

+if (shouldUpdateReaders) {
+  shouldUpdateReaders = false;

It is happening outside a sync block and shouldUpdateReaders is not volatile.



 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829736#comment-13829736
 ] 

Lars Hofhansl commented on HBASE-10015:
---

checkUpdateReaders is called from checkReseek, which is only called from 
synchronized methods. :)

You are helping. Looking at this from different angles is good.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829751#comment-13829751
 ] 

stack commented on HBASE-10015:
---

OK.  Would suggest running hadoopqa  a few times.  Tests around changing 
readers are usually pretty good at finding issues if any.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829754#comment-13829754
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Yep. Found issues with the first two versions of this.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829767#comment-13829767
 ] 

Hadoop QA commented on HBASE-10015:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12615278/10015-trunk-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.wal.TestLogRolling

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7967//console

This message is automatically generated.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.98.0, 0.96.1, 0.94.15

 Attachments: 10015-0.94-v2.txt, 10015-0.94-v3.txt, 10015-0.94-v4.txt, 
 10015-0.94-withtest.txt, 10015-0.94.txt, 10015-trunk-v2.txt, 
 10015-trunk-v3.txt, 10015-trunk-v4.txt, 10015-trunk.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA

[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828074#comment-13828074
 ] 

Lars Hofhansl commented on HBASE-10015:
---

The 3x improvement I see for tall tables. For wider tables the improvement is 
less pronounced (for 5 columns I see a 20% or so improvement).

Also a note as to why I think this should be correct: Even with the current 
synchronized, a next/peek/reseek that has already started, will finish with the 
old reader. That has not changed.

Need to look closer at races between close() and next/peek/reseek, as close 
could be called from another thread (I think), when the lease expired.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828082#comment-13828082
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Actually the main difference is between ScanWildcardColumnTracker and 
ExplicitColumnTracker. With ExplicitColumnTracker there are still many reseeks 
that outweigh the performance improvement seen here.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828088#comment-13828088
 ] 

Vladimir Rodionov commented on HBASE-10015:
---

It is interesting. Java synchronization w/o thread contention cost is close to 
zero. You would see the difference only when you run multiple threads accessing 
the same StoreScanner.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828103#comment-13828103
 ] 

Lars Hofhansl commented on HBASE-10015:
---

That is true as far as actual thread synchronization goes.

Every synchronize still places a read and write memory fence, though, and I 
think that is the effect we're seeing. I was a bit surprised about the 
magnitude of this myself.

I can try switching one of my cores off and measure this again, if the issues 
is memory fencing we should not see any improvement with one core only.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828131#comment-13828131
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Nope. I see the same effect with just one core.
Will on some other machines and different versions of the JDK.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828130#comment-13828130
 ] 

Vladimir Rodionov commented on HBASE-10015:
---

I just ran mys tests and found no difference at all, but they were 
single-thread StoreScanner. 

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828278#comment-13828278
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Close is fine, since it is only triggered via RegionScannerImpl, which is 
synchronized.
The overhead I measured *could* explain the numbers I've seen here:
https://issues.apache.org/jira/browse/HBASE-9440?focusedCommentId=13767047page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13767047

(Compared to the raw scan speed in an HFile I found medium overhead per column 
and a lot of overhead per rows, and StoreScanner.peek()/next(), etc, are mostly 
called per row).

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828289#comment-13828289
 ] 

Andrew Purtell commented on HBASE-10015:


The patch doesn't look wrong, but I'm surprised at the improvement. If not too 
onerous, perhaps you could attach a test case that reproduces it?

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828288#comment-13828288
 ] 

Lars Hofhansl commented on HBASE-10015:
---

bq. I just ran mys tests and found no difference at all, but they were 
single-thread StoreScanner. 

Hmm... All my tests are single threaded. Different JVM versions?
Are you testing with tall tables?

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828297#comment-13828297
 ] 

Lars Hofhansl commented on HBASE-10015:
---

It'll be hard to disentangle the test. Basically I am testing with a table with 
a single column family, and filtering all data at the server with a 
ValueFilter. Lemme have an extra look at the code.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828307#comment-13828307
 ] 

Nick Dimiduk commented on HBASE-10015:
--

Could the test be isolated in a variant of HFilePerfEval? I've had luck 
isolating the BlockCache in a similar variant (HBASE-9806).

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828332#comment-13828332
 ] 

Vladimir Rodionov commented on HBASE-10015:
---

{quote}
Different JVM versions?
{quote}
1.6_56 Mac OSX 10.7.5.  I concur with myself one more time: the cost of 
synchronized is very low when there is no thread contention.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828346#comment-13828346
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Strange... On my desktop machine (12 cores, JDK6) I also measure no difference.
My Laptop has 2 cores and OpenJDK7. I wonder whether I am seeing an OpenJDK bug.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828361#comment-13828361
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Confirmed on my laptop with OpenJDK, 2.5x improvement over 10 runs very low 
standard deviation.
Going to do some microbenchmarks.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828365#comment-13828365
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Wait... No, I do see the same improvement in the JDK6, 12 core box (I forgot to 
switch the server over).
I'll attach a quick and dirty benchmark.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828410#comment-13828410
 ] 

Lars Hofhansl commented on HBASE-10015:
---

bq.  I concur with myself one more time: the cost of synchronized is very low 
when there is no thread contention.

Well, you're wrong twice then :)

Just try it... Call a synchronized method in a loop a few 100 million times. 
Then remove the synchronized. Make sure the method returns something, such as a 
reference to a member, so it is not optimized immediately.

On my test machines (JDK6 and JDK7) the latter is at least 40x faster on some 
machines it's 63x slower. All just a single thread.

As I said before, synchronized does more than exclusion.
# it barres JVM from reordering instructions
# it places memory fences (both read and write), which -depending on exact hw- 
disallows instruction reordering of the CPU
# it may flush cache lines


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828418#comment-13828418
 ] 

Lars Hofhansl commented on HBASE-10015:
---

If somebody else could run the test I attached that'd be great. I ran against a 
local single node HBase on top of a single node HDFS cluster, with all data in 
the blockcache. Maybe a unittest against a mini cluster would be better.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828439#comment-13828439
 ] 

Vladimir Rodionov commented on HBASE-10015:
---

May be I am wrong (empty synchronized method call cost on my laptop is 25 ns) 
but my own tests on StoreScanner show 0 improvement. 

Code is simple:

create region, populate with data (make sure data is in a cache) , then

{code}
 LOG.info(Test store scanner);
  Scan scan = new Scan();
  scan.setStartRow(region.getStartKey());
  scan.setStopRow(region.getEndKey());
  Store store = region.getStore(CF);
  StoreScanner scanner = new StoreScanner(store,  store.getScanInfo(), 
scan,  null);
  long start = System.currentTimeMillis();
  int total = 0;
  ListKeyValue result = new ArrayListKeyValue();
  while(scanner.next(result)){
total++; result.clear();
  }
  
  LOG.info(Test store scanner finished. Found +total + in 
+(System.currentTimeMillis() - start)+ms);
{code}

This test shows exact the same time for both: default StoreScanner and 
*unsynchronized* StoreScanner. The scan is not very fast: 1-1.5M rows per sec 
(rows are relatively small: 1 CF + 5 CQ,  ~ 120 bytes )

 

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828466#comment-13828466
 ] 

Vladimir Rodionov commented on HBASE-10015:
---

For all microbenchmarks, please add the following command -line args:

{code}
-XX:+UseBiasedLocking -XX:BiasedLockingStartupDelay=0
{code}

Oracle JVM does not enable biased locking (single thread lock optimization) 
first 4s of program execution.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828472#comment-13828472
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Can you try at the RegionScanner level, which maintains a heap of StoreScanners 
and calls peek() quite frequently. In my sampling sampling profiler I saw 
StoreScanner.peek() come us as some of the top method where the time is spent 
(and does nothing but doing a compare on a local member then calls peek on its 
heap).

Also 25ns are more than 100 cycles on modern HW, in line what I would expect 
from memory stalls caused by the fences.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828473#comment-13828473
 ] 

Lars Hofhansl commented on HBASE-10015:
---

Will add those options. All the tests run at least 15s, though.

Oh, also rereading your comment above... I saw the most significant improvement 
per row (i.e. with 5 CQ you'd see less of an improvement). This very much looks 
like it is an interaction between KeyValueHeap and StoreScanner, where the cost 
seems to be mostly per row, rather than KeyValue.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828477#comment-13828477
 ] 

Vladimir Rodionov commented on HBASE-10015:
---

{quote}
Also 25ns are more than 100 cycles on modern HW, in line what I would expect 
from memory stalls caused by the fences.
{quote}
With  biased locking on, its ~ 2ns. I will try RegionScanner.

 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10015) Major performance improvement: Avoid synchronization in StoreScanner

2013-11-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13828547#comment-13828547
 ] 

Lars Hofhansl commented on HBASE-10015:
---

One last note before I call it quits today.
The win appears to reduce the per row overhead by 50-60%. With 5 CQs this 
shrinks proportionally to 10-15% or so.


 Major performance improvement: Avoid synchronization in StoreScanner
 

 Key: HBASE-10015
 URL: https://issues.apache.org/jira/browse/HBASE-10015
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Attachments: 10015-0.94-withtest.txt, 10015-0.94.txt, TestLoad.java


 Did some more profiling (this time with a sampling profiler) and 
 StoreScanner.peek() showed up a lot in the samples. At first that was 
 surprising, but peek is synchronized, so it seems a lot of the sync'ing cost 
 is eaten there.
 It seems the only reason we have to synchronize all these methods is because 
 a concurrent flush or compaction can change the scanner stack, other than 
 that only a single thread should access a StoreScanner at any given time.
 So replaced updateReaders() with some code that just indicates to the scanner 
 that the readers should be updated and then make it the using thread's 
 responsibility to do the work.
 The perf improvement from this is staggering. I am seeing somewhere around 3x 
 scan performance improvement across all scenarios.
 Now, the hard part is to reason about whether this is 100% correct. I ran 
 TestAtomicOperation and TestAcidGuarantees a few times in a loop, all still 
 pass.
 Will attach a sample patch.



--
This message was sent by Atlassian JIRA
(v6.1#6144)