[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2012-01-25 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-3855:
--

Fix Version/s: 0.90.7  (was: 0.90.6)

Not fixed in 0.90.6, hence moving it to 0.90.7.

> Performance degradation of memstore because reseek is linear
> 
>
> Key: HBASE-3855
> URL: https://issues.apache.org/jira/browse/HBASE-3855
> Project: HBase
> Issue Type: Improvement
> Reporter: dhruba borthakur
> Priority: Blocker
> Fix For: 0.90.7
>
> Attachments: memstoreReseek.txt, memstoreReseek2.txt
>
>
> The scanner uses reseek to find the next row (or next column) as part of a 
> scan. The reseek code iterates over a Set to position itself at the right 
> place. If many thousands of kvs need to be skipped over, the time cost is 
> very high. In this case, a seek would cost far less than a reseek.
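
To make the cost difference in the description concrete, here is a minimal sketch, not HBase code. It assumes the Set is an ordered skip-list set and uses plain strings in place of KeyValues; the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical illustration, not HBase code. Strings stand in for KeyValues.
class LinearReseekCost {
    // Reseek-style positioning: walk forward element by element until an
    // element >= target is reached. Returns how many elements were skipped.
    static int linearSkips(ConcurrentSkipListSet<String> set, String target) {
        int skipped = 0;
        for (String kv : set) {
            if (kv.compareTo(target) >= 0) {
                break;
            }
            skipped++;
        }
        return skipped;
    }

    // Seek-style positioning: one ordered lookup for the first element >= target.
    static String seek(ConcurrentSkipListSet<String> set, String target) {
        return set.ceiling(target);
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<String> set = new ConcurrentSkipListSet<>();
        for (int i = 0; i < 100_000; i++) {
            set.add(String.format("row%06d", i));
        }
        String target = "row090000";
        System.out.println("elements skipped linearly: " + linearSkips(set, target));
        System.out.println("seek lands on: " + seek(set, target));
    }
}
```

The linear walk touches every element before the target, while the ordered lookup does work that grows roughly with the logarithm of the set size.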

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2011-09-17 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3855:
-

Fix Version/s: 0.90.5  (was: 0.92.0)

OK. Moving to 0.90.5.  I did not apply 4195 to the branch BECAUSE it does not 
apply over on the branch (which means I must have been dreaming yesterday when 
I thought I was testing 4195 on 0.90 -- I must have been running it on TRUNK).  
Leaving this as open against 0.90.5 rather than against 0.92 since we don't 
seem to have the issue that caused the reopen in TRUNK (and 4195 improves on 
the original patch here anyways).

> Performance degradation of memstore because reseek is linear
> 
>
> Key: HBASE-3855
> URL: https://issues.apache.org/jira/browse/HBASE-3855
> Project: HBase
> Issue Type: Improvement
> Reporter: dhruba borthakur
> Priority: Blocker
> Fix For: 0.90.5
>
> Attachments: memstoreReseek.txt, memstoreReseek2.txt





[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2011-05-05 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-3855:
--

 Priority: Blocker  (was: Major)
Fix Version/s: 0.94.0

Marking as blocker for next major release. 

> Performance degradation of memstore because reseek is linear
> 
>
> Key: HBASE-3855
> URL: https://issues.apache.org/jira/browse/HBASE-3855
> Project: HBase
> Issue Type: Improvement
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Priority: Blocker
> Fix For: 0.94.0



[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2011-05-05 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-3855:


Status: Patch Available  (was: Open)

> Performance degradation of memstore because reseek is linear
> 
>
> Key: HBASE-3855
> URL: https://issues.apache.org/jira/browse/HBASE-3855
> Project: HBase
> Issue Type: Improvement
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: memstoreReseek.txt



[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2011-05-05 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-3855:


Attachment: memstoreReseek.txt

This patch changes a reseek to a seek if the number of kvs that it has already 
skipped over is larger than a configured number (default 32).
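
The idea in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: strings stand in for KeyValues, the backing set is assumed to be a skip-list set, and every name here (BoundedReseek, DEFAULT_MAX_SKIPS, seeksTaken, peek) is hypothetical. Only the default of 32 comes from the comment.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical sketch, not the attached patch. Strings stand in for KeyValues.
class BoundedReseek {
    static final int DEFAULT_MAX_SKIPS = 32; // default quoted in the comment above

    private final ConcurrentSkipListSet<String> kvset;
    private Iterator<String> it;
    private String current;  // element the scanner is positioned on, or null
    int seeksTaken = 0;      // counts fallbacks to seek, for illustration only

    BoundedReseek(ConcurrentSkipListSet<String> kvset) {
        this.kvset = kvset;
        this.it = kvset.iterator();
        this.current = it.hasNext() ? it.next() : null;
    }

    // Full seek: reposition with an ordered lookup instead of walking forward.
    boolean seek(String key) {
        seeksTaken++;
        it = kvset.tailSet(key, true).iterator();
        current = it.hasNext() ? it.next() : null;
        return current != null;
    }

    // Reseek: walk forward, but switch to seek once more than maxSkips
    // elements have been skipped without reaching the key.
    boolean reseek(String key, int maxSkips) {
        int skipped = 0;
        while (current != null && current.compareTo(key) < 0) {
            if (skipped > maxSkips) {
                return seek(key);
            }
            current = it.hasNext() ? it.next() : null;
            skipped++;
        }
        return current != null;
    }

    String peek() {
        return current;
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<String> set = new ConcurrentSkipListSet<>();
        for (int i = 0; i < 100; i++) {
            set.add(String.format("row%03d", i));
        }
        BoundedReseek scanner = new BoundedReseek(set);
        scanner.reseek("row005", DEFAULT_MAX_SKIPS); // short hop, stays linear
        scanner.reseek("row090", DEFAULT_MAX_SKIPS); // long hop, falls back to seek
        System.out.println(scanner.peek() + " after " + scanner.seeksTaken + " seek(s)");
    }
}
```

Short hops stay on the cheap iterator path, and only a long run of skips pays for the ordered lookup.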

> Performance degradation of memstore because reseek is linear
> 
>
> Key: HBASE-3855
> URL: https://issues.apache.org/jira/browse/HBASE-3855
> Project: HBase
> Issue Type: Improvement
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: memstoreReseek.txt



[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2011-05-06 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HBASE-3855:


Attachment: memstoreReseek2.txt

A minor improvement over the previous patch:
1. Invoke seek only if the getNext() call returned null and numIterReseek is 
zero.
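
One way to read this condition is sketched below. It is an assumption-laden illustration, not the contents of memstoreReseek2.txt: only the names getNext and numIterReseek come from the comment, while readPoint, maxIter, seeksTaken, the visibility rule, and the use of a negative budget to mean "unbounded" are hypothetical choices made for the example.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch. Keys stand in for KeyValues; values are write sequence numbers.
class BudgetedReseek {
    private final ConcurrentSkipListMap<String, Long> kvs;
    private final long readPoint; // entries written after this point are invisible
    private final int maxIter;    // iteration budget granted to each reseek
    private Iterator<Map.Entry<String, Long>> it;
    private String current;
    private int numIterReseek = -1; // negative means "no budget in force"
    int seeksTaken = 0;             // for illustration only

    BudgetedReseek(ConcurrentSkipListMap<String, Long> kvs, long readPoint, int maxIter) {
        this.kvs = kvs;
        this.readPoint = readPoint;
        this.maxIter = maxIter;
        this.it = kvs.entrySet().iterator();
        this.current = getNext();
    }

    // Returns the next visible key, or null if the iterator is exhausted
    // or the iteration budget has been used up.
    private String getNext() {
        while (it.hasNext()) {
            if (numIterReseek == 0) {
                return null; // budget spent; the caller decides whether to seek
            }
            Map.Entry<String, Long> e = it.next();
            if (numIterReseek > 0) {
                numIterReseek--;
            }
            if (e.getValue() <= readPoint) {
                return e.getKey();
            }
        }
        return null;
    }

    boolean seek(String key) {
        seeksTaken++;
        it = kvs.tailMap(key, true).entrySet().iterator();
        numIterReseek = -1; // a seek is not subject to the reseek budget
        current = getNext();
        return current != null;
    }

    boolean reseek(String key) {
        numIterReseek = maxIter;
        while (current != null && current.compareTo(key) < 0) {
            current = getNext();
            // Fall back to seek only when getNext() gave up because the
            // budget reached zero, not merely because an entry was skipped.
            if (current == null && numIterReseek == 0) {
                return seek(key);
            }
        }
        return current != null;
    }

    String peek() {
        return current;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, Long> kvs = new ConcurrentSkipListMap<>();
        for (int i = 0; i < 200; i++) {
            kvs.put(String.format("row%03d", i), 1L);
        }
        BudgetedReseek scanner = new BudgetedReseek(kvs, 5L, 32);
        scanner.reseek("row150");
        System.out.println(scanner.peek() + " after " + scanner.seeksTaken + " seek(s)");
    }
}
```

In this sketch a null from getNext() alone is ambiguous (end of data or budget spent), so the extra check on numIterReseek is what separates the two cases.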

> Performance degradation of memstore because reseek is linear
> 
>
> Key: HBASE-3855
> URL: https://issues.apache.org/jira/browse/HBASE-3855
> Project: HBase
> Issue Type: Improvement
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: memstoreReseek.txt, memstoreReseek2.txt



[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2011-05-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3855:
-

   Resolution: Fixed
Fix Version/s: 0.92.0  (was: 0.94.0)
 Hadoop Flags: [Reviewed]
       Status: Resolved  (was: Patch Available)

Committed to TRUNK.  Thanks for the patch Dhruba.

> Performance degradation of memstore because reseek is linear
> 
>
> Key: HBASE-3855
> URL: https://issues.apache.org/jira/browse/HBASE-3855
> Project: HBase
> Issue Type: Improvement
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: memstoreReseek.txt, memstoreReseek2.txt



[jira] [Updated] (HBASE-3855) Performance degradation of memstore because reseek is linear

2012-06-30 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-3855:
--

Priority: Critical  (was: Blocker)

> Performance degradation of memstore because reseek is linear
> 
>
> Key: HBASE-3855
> URL: https://issues.apache.org/jira/browse/HBASE-3855
> Project: HBase
> Issue Type: Improvement
> Reporter: dhruba borthakur
> Priority: Critical
> Fix For: 0.90.7
>
> Attachments: memstoreReseek.txt, memstoreReseek2.txt
