[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2018-08-01 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566424#comment-16566424
 ] 

Anoop Sam John commented on HBASE-13389:


I think so. It looks like we now tend to keep this mvcc for much longer, for 
one reason or another.
Can we keep the mvcc in long format rather than vlong? Because it is a vlong, 
every cell read has to decode it byte by byte, which costs performance. For a 
random read, the seek has to skip many cells.
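
As a rough illustration of the cost difference, here is a sketch only. The 7-bits-per-byte layout below is a simplified stand-in and not HBase's actual vlong encoding, and the class and method names are hypothetical. It contrasts a variable-length decode, which must inspect each byte to find where the value ends, with a fixed 8-byte long whose length is known up front.

```java
// Hypothetical sketch: variable-length vs fixed-width long decoding.
// The varint layout (7 payload bits per byte, high bit set = more bytes follow)
// is an assumption for illustration, not HBase's on-disk vlong format.
final class MvccDecodeSketch {

    // Writes a non-negative value as a varint; returns the number of bytes used.
    static int writeVarLong(byte[] buf, int offset, long value) {
        int pos = offset;
        while ((value & ~0x7FL) != 0) {
            buf[pos++] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        buf[pos++] = (byte) value;
        return pos - offset;
    }

    // Decoding has to branch on every byte to learn whether another one follows.
    static long readVarLong(byte[] buf, int offset) {
        long result = 0;
        int shift = 0;
        int pos = offset;
        while (true) {
            byte b = buf[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Even just skipping the field requires walking its bytes.
    static int varLongLength(byte[] buf, int offset) {
        int pos = offset;
        while ((buf[pos] & 0x80) != 0) {
            pos++;
        }
        return pos - offset + 1;
    }

    // Fixed-width big-endian long: always 8 bytes, so skipping is offset + 8.
    static void writeFixedLong(byte[] buf, int offset, long value) {
        for (int i = 0; i < 8; i++) {
            buf[offset + i] = (byte) (value >>> (56 - 8 * i));
        }
    }

    static long readFixedLong(byte[] buf, int offset) {
        long result = 0;
        for (int i = 0; i < 8; i++) {
            result = (result << 8) | (buf[offset + i] & 0xFF);
        }
        return result;
    }
}
```

The trade-off is space: the fixed form always spends 8 bytes per cell, while the variable form is shorter for small values but has to be parsed before it can be skipped.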

> [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
> -
>
> Key: HBASE-13389
> URL: https://issues.apache.org/jira/browse/HBASE-13389
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: stack
>Priority: Major
> Attachments: 13389.txt
>
>
> HBASE-12600 moved the edit sequenceid from tags to instead exploit the 
> mvcc/sequenceid slot in a key. Now Cells near-always have an associated 
> mvcc/sequenceid where previously it was rare or the mvcc was kept up at the 
> file level. This is sort of how it should be, many of us would argue, but as a 
> side-effect of this change, read-time optimizations that helped speed scans 
> were undone.
> In this issue, let's see if we can get the optimizations back -- or just 
> remove the optimizations altogether.
> The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291.
> The optimizations undone by this change are (to quote the optimizer himself, 
> Mr [~lhofhansl]):
> {quote}
> Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166.
> We're always storing the mvcc readpoints, and we never compare them against 
> the actual smallestReadpoint, and hence we're always performing all the 
> checks, tests, and comparisons that these jiras removed in addition to 
> actually storing the data - which with up to 8 bytes per Cell is not trivial.
> {quote}
> This is the 'breaking' change: 
> https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2018-07-31 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564192#comment-16564192
 ] 

Lars Hofhansl commented on HBASE-13389:
---

I was just pointed to this again... It's not just DLR, it's also 
replication, right [~anoop.hbase]?



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2016-01-25 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116696#comment-15116696
 ] 

Anoop Sam John commented on HBASE-13389:


It is HBASE-15020. We can remove all the code added for DLR. It is disabled 
anyway and still found to have bugs. Yes, I feel we may be able to undo 
HBASE-12600 right now (before removing all the DLR code).



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2016-01-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116678#comment-15116678
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Thanks [~anoop.hbase]. Which jira are you referring to? Can we undo HBASE-12600 
right now, or does it depend on this other jira being implemented first?




[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2016-01-25 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116641#comment-15116641
 ] 

Anoop Sam John commented on HBASE-13389:


There is already a jira to remove DLR and clean up the code. After that we can 
undo HBASE-12600 and get back the old mvcc parse optimization.



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-07-18 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632376#comment-14632376
 ] 

Lars Hofhansl commented on HBASE-13389:
---

So where are we with this?

To answer your question above [~stack], in the subtask I just put part of the 
optimization back, namely if all involved HFiles have a max timestamp of 0, 
then there is no need to write the timestamp into the new HFile (as all would 
be 0 anyway).
(Previously it did that if all timestamps were older than the oldest running 
scanner, but as discussed here, we can't do that any longer.)

So how do we proceed with this one?
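
A rough sketch of the check described above, reading "timestamp" as the mvcc/memstore timestamp (an assumption based on the rest of this thread). The class and method names are hypothetical and this is not the actual patch; it only shows that the decision can be made from the input files' metadata alone.

```java
// Hypothetical sketch; names are illustrative, not HBase's compaction API.
final class CompactionMvccSketch {

    // maxMvccPerInputFile[i] is the largest mvcc stamp recorded for input file i.
    // Returns true when the compacted output still needs a per-cell mvcc field.
    static boolean mustWriteMvcc(long[] maxMvccPerInputFile) {
        for (long maxMvcc : maxMvccPerInputFile) {
            if (maxMvcc > 0) {
                return true; // some cell in this file carries a non-zero stamp
            }
        }
        return false; // every stamp is 0, so writing the field adds nothing
    }
}
```

Note that nothing here consults the oldest running scanner, which is the part of the original optimization that cannot be restored.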



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-23 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510470#comment-14510470
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

{quote}
I don't see the WALEdit sequenceid being used when we replicate. Is this 
something to implement? (Sounds like a good idea... )
{quote}
[~saint@gmail.com] I thought we already used it, because intra-replication 
did. Otherwise I can give this a first try.



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509175#comment-14509175
 ] 

stack commented on HBASE-13389:
---

bq. After 6 days (currently) we are zero'ing mvcc during compactions.

[~lhofhansl] Pardon me, where is this enforced (all hfiles participating in a 
compaction must be 6 days old?) 6 days is arbitrary, right? It means that 
there cannot be a WAL outstanding that has an edit that has not yet been 
flushed, right? It also means that there cannot be an edit in a WAL that has 
not been replicated and flushed on the remote side? Is there a trigger we could 
use instead of an arbitrary timing?

[~jeffreyz] Thanks for chiming in. I don't see the WALEdit sequenceid being 
used when we replicate. Is this something to implement? (Sounds like a good 
idea... )



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508508#comment-14508508
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Feel free to +1 the patch on HBASE-13497 :)



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508507#comment-14508507
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Thanks for the explanation. After 6 days (currently) we _are_ zero'ing mvcc 
during compactions. In HBASE-13497 I allow a compaction to not store mvcc 
stuff when it's all 0 anyway (not looking at the current scanner, but only 
going by the HFile's data). So that's safe at least. I agree we cannot put the 
original optimization back.
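
To make the zero'ing step concrete, here is a minimal sketch. The exact rule is an assumption (a stamp is dropped only once the cell is older than the retention period and no open scanner can still need it), and every name below is hypothetical rather than taken from HBase.

```java
// Hypothetical sketch; the rule and all names are assumptions, not HBase's code.
final class MvccZeroingSketch {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Returns the mvcc stamp to write for a cell during compaction.
    //   cellMvcc           stamp currently stored on the cell
    //   cellAgeMillis      how long ago the cell was written
    //   keepDays           retention period for stamps (the thread mentions 6 days)
    //   smallestReadPoint  oldest read point any open scanner may still use
    static long mvccToWrite(long cellMvcc, long cellAgeMillis,
                            int keepDays, long smallestReadPoint) {
        boolean oldEnough = cellAgeMillis > keepDays * MILLIS_PER_DAY;
        boolean visibleToAllScanners = cellMvcc <= smallestReadPoint;
        return (oldEnough && visibleToAllScanners) ? 0L : cellMvcc;
    }
}
```

Once every cell in every input file has been zero'd this way, the all-zero check from HBASE-13497 can skip the field entirely.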



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-22 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508495#comment-14508495
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

[~saint@gmail.com] Well said and good examples! As of today, there are two 
cases where we could have out-of-order puts: DLR and replication, where the 
order of WAL files to be replayed isn't guaranteed.

For non-adjacent hfile compactions, it seems that we have to keep mvcc at the 
KV level. For example, take hfile1 (max mvcc=1), hfile2 (max mvcc=2) and hfile3 
(max mvcc=3). If we compact just hfile1 and hfile3, we can't set the newly 
compacted hfile's max mvcc=3, because hfile2 may have the same rows as either 
hfile1 or hfile3.
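
The example can be played through in a small sketch. It assumes a simplified model in which a cell with a zero'd mvcc is ordered by its file's max mvcc, and all names are hypothetical.

```java
// Hypothetical sketch of the hfile1/hfile2/hfile3 example; not HBase code.
final class NonAdjacentCompactionSketch {

    // A cell is ordered by its own mvcc if it still has one (non-zero),
    // otherwise by the max mvcc recorded for the file that holds it.
    static long effectiveSeqId(long cellMvcc, long fileMaxMvcc) {
        return cellMvcc != 0 ? cellMvcc : fileMaxMvcc;
    }

    // Of two versions of the same row, the higher sequence id wins.
    static String newest(String a, long seqA, String b, long seqB) {
        return seqA >= seqB ? a : b;
    }

    // Row r was written in hfile1 (mvcc 1) and again in hfile2 (mvcc 2).
    // hfile1 and hfile3 (max mvcc 3) are compacted together, skipping hfile2.
    static String readRow(boolean keepCellMvcc) {
        long compactedFileMax = 3;
        long oldCellMvcc = keepCellMvcc ? 1 : 0; // 0 = stamp dropped in compaction
        long oldSeq = effectiveSeqId(oldCellMvcc, compactedFileMax);
        long newSeq = effectiveSeqId(2, 2);      // hfile2 is untouched
        return newest("old value from hfile1", oldSeq,
                      "new value from hfile2", newSeq);
    }
}
```

With the per-cell mvcc kept, the read returns the value from hfile2; with it dropped, the stale value from hfile1 is promoted to 3 and shadows the newer one.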

Keeping mvcc will make the "haunting" out-of-order issue go away, leaving one 
less concern. Let me know which option we should go with and I can also help 
with the fix.

 



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504361#comment-14504361
 ] 

stack commented on HBASE-13389:
---

Thinking on it, during out-of-order DLR, there are a few ways in which we could 
lose data if we bring back the optimization that zeros all mvccs, promoting the 
highest mvcc seen to be the hfile's mvcc kept in the hfile metadata.

During recovery of a region during DLR, we may flush hfiles in a manner such 
that the older edits are in the most recently flushed file, or hfiles are made 
of edits that do not have a linearly increasing mvcc. This is a violation of 
tenets that hold when flushes always drop files that have mvcc/sequenceid in 
excess of files currently present in the filesystem (and whose edits have 
increasing mvccs).

We have to be careful compacting these files dropped during recovery. We need 
to compact them all up together first -- after the region comes online -- 
before we can mix them in with zero'd mvcc files (it has to be after the region 
comes online and not before because the region may crash during recovery having 
dropped one or more out-of-order hfiles).

Here is an illustration.

A region is recovering. It comes under memory pressure so flushes the edits it 
received so far. It so happens that it mostly received older edits but a few 
new ones came in too. It dumps out (Let the letters be keys and the numbers 
mvcc):

A 2
B 4
C 10

Recovery completes and it drops another hfile:

A 1
B 5
C 11

Now, if we compact the first file with a zero'd mvcc file with a sequenceid of 
8, the product will be a zero'd mvcc hfile whose seqid is 10.

If we then compact this '10' file with the second file flushed, we lose the 'B 
5' edit because it is < '10'. Even if we compacted all three files together -- 
the zero'd mvcc hfile and the two files dropped during recovery -- we could 
lose 'B 5' and 'A 2' since both have mvccs < '10'.
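
The first half of the illustration, for key B only, can be replayed in a small sketch. It assumes a simplified model in which a zero'd cell is ordered by the sequence id of the file holding it; the names are hypothetical.

```java
// Hypothetical sketch replaying the illustration for key B; not HBase code.
final class DlrOutOfOrderSketch {

    // Sequence id of a compacted file whose cells had their mvcc zero'd:
    // the largest sequence id among the input files.
    static long compactedSeqId(long... inputSeqIds) {
        long max = 0;
        for (long s : inputSeqIds) {
            max = Math.max(max, s);
        }
        return max;
    }

    // Of two versions of one key, the higher sequence id survives.
    static String survivor(String a, long seqA, String b, long seqB) {
        return seqA >= seqB ? a : b;
    }

    static String replayKeyB(boolean zeroMvcc) {
        long firstFlushSeqId = 10; // first recovery flush: A 2, B 4, C 10
        long zeroedFileSeqId = 8;  // existing file whose mvccs are already zero'd
        long product = compactedSeqId(firstFlushSeqId, zeroedFileSeqId);
        // After the first compaction, 'B 4' is ordered by the product's
        // sequence id if its mvcc was zero'd, otherwise by its own mvcc.
        long b4Seq = zeroMvcc ? product : 4;
        long b5Seq = 5;            // second recovery flush: A 1, B 5, C 11
        return survivor("B 4", b4Seq, "B 5", b5Seq);
    }
}
```

With zero'ing, 'B 4' is treated as 10 and beats 'B 5', so the newer edit is lost; with per-cell mvcc preserved, 'B 5' survives.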





[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503943#comment-14503943
 ] 

stack commented on HBASE-13389:
---

bq. This may be hard to achieve because out of order puts can be flushed at 
different time.

Do 'out of order' puts happen at DLR time only [~jeffreyz]? i.e. WALs can be 
replayed in any order since they are farmed out over the cluster. We also 
cannot guarantee when a region that is receiving DLR edits will flush hfiles; 
e.g. we could get row1/logSeqId=2 during DLR and flush because we had memory 
pressure, but then later row1/logSeqId=1 might arrive and be flushed into a 
newer hfile. The fix for this is to not let compactions happen when region is 
in recovery -- this is probably the case already (or let compactions go on but 
preserve mvcc while in recovery)?

So, the Lars fix would be to drop mvcc if no scanner outstanding with a span 
that includes mvcc in current hfile AND we are not in DLR recovery mode? 

Are there other places where we might have out-of-order puts? (Flushes are 
single threaded and edits go into FSHLog and MemStore in order, with the caveat 
of Elliott and Nate's recent find: 
https://issues.apache.org/jira/browse/HBASE-12751?focusedCommentId=14377157&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14377157).

bq. ...and only keep mvcc around during region recovery time so that we can 
still keep HBASE-12600 goal

Yes.

On keeping seqid in the KV in hfiles so we can do "...out of order in minor 
compactions. "

...don't we mean compacting non-adjacent files rather than out-of-order here?

So, yeah, if we preserved mvcc always, we could do any order and non-adjacent. 
Would be nice.

Otherwise, as I see it, if we want to do non-adjacent compactions (which as 
[~lhofhansl] says above, we do not currently have), then we could do it if all 
files under a Store have zero for mvcc and we just order the edits by the hfile 
meta data mvcc number. When there are files with an mvcc per KV, then we should 
probably merge those first...  Would have to think it through more.

It gets a little complicated though if the Store has some files with an hfile 
metadata mvcc number but other files have an mvcc per KV. We could not do a 
file that has an mvcc per KV with a non-adjacent 

But if files have zero'd mvccs, as with the Lars optimization, we could do 
non-adjacent compactions as long as we respected the hfile seqid order. It gets 
tricky if a file has mvcc in the KVs and all the rest do not. Files with mvcc 
in the KVs need to be compacted together ahead of 



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503806#comment-14503806
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Thanks [~jeffreyz], just discussed a bit with [~stack]... If we kept the 
in-order compactions, we wouldn't need MVCC stamps in the HFile beyond the oldest 
scanner, right?

I feel like I am missing something. Could you show an example of when we need 
MVCC stamps in the HFile beyond the oldest scanner when you have some time?
The issue has to do with Puts/Deletes happening in the same millisecond, right?


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-19 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502373#comment-14502373
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

That sounds good. We can shorten the time period to 2 or 3 days. In one case 
keeping mvcc longer can gain some performance, because it makes it possible 
to compact HFiles out of order in minor compactions.


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-19 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502201#comment-14502201
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Correctness > Performance :)  We need to have this correct in all cases.

OK. So lemme file a sub-issue to apply the patch I attached here. Not as good as 
the original patch, but better.

Here we need to have the discussion about how long to keep the Cells, it seems 
we want this less than the minimum time between major compactions (which is 3.5 
days currently - 1 week +- 1/2 week) for performance (but again correctness is 
more important).
Might also want to change the detection code for whether a major compaction is 
needed: if we can get rid of MVCC stamps, we should major compact.
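
To make the arithmetic above concrete, here is a minimal Java sketch. It assumes 
only the figures quoted in this comment (a 1 week period with +- 1/2 week of 
jitter); the class and method names are hypothetical and are not HBase code.

```java
/** Hypothetical sketch of the major compaction window discussed above; not HBase code. */
final class MajorCompactionWindowSketch {
    // Figures taken from the comment: 1 week period, +- 1/2 week jitter.
    static final double PERIOD_DAYS = 7.0;
    static final double JITTER = 0.5;

    /** Shortest possible gap between two major compactions, in days. */
    static double minDays() {
        return PERIOD_DAYS * (1 - JITTER);
    }

    /** Longest possible gap between two major compactions, in days. */
    static double maxDays() {
        return PERIOD_DAYS * (1 + JITTER);
    }

    /**
     * True when an mvcc retention period is shorter than the minimum gap, so every
     * major compaction would find the stamps from the previous one old enough to drop.
     */
    static boolean purgedByNextMajor(double keepDays) {
        return keepDays < minDays();
    }
}
```

Under these assumptions a 3 day retention fits inside the 3.5 day minimum, while 
a 6 day retention does not.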



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-19 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502173#comment-14502173
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

{quote}
All other cases we should be covering with metadata in the HFiles trailer, not 
on individual Cells.
{quote}
This may be hard to achieve because out-of-order puts can be flushed at 
different times. Let's say row1/logSeqId=2 is flushed earlier than 
row1/logSeqId=1. The mvcc ranges in the HFile trailer metadata will then overlap 
across multiple HFiles.  
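
A small Java sketch of that overlap, under the assumption that each HFile 
records only the minimum and maximum sequence id it contains. The class name is 
hypothetical and this is not how HBase models its trailer.

```java
/** Hypothetical model of a per-file sequence id range; not an HBase class. */
final class SeqIdRangeSketch {
    final long min;
    final long max;

    /** Builds the range from the sequence ids of the cells flushed into one file. */
    SeqIdRangeSketch(long... seqIds) {
        long lo = Long.MAX_VALUE;
        long hi = Long.MIN_VALUE;
        for (long s : seqIds) {
            lo = Math.min(lo, s);
            hi = Math.max(hi, s);
        }
        this.min = lo;
        this.max = hi;
    }

    /** True when the two ranges share at least one sequence id value. */
    boolean overlaps(SeqIdRangeSketch other) {
        return min <= other.max && other.min <= max;
    }
}
```

If one flush holds sequence ids 2 and 4 and a later flush holds 1 and 3, the 
ranges [2,4] and [1,3] overlap, so file-level metadata alone cannot order the 
edits for row1.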

One option is that we can reinstate your original code by checking against the 
oldest running scanner and only keep mvcc around during region recovery time so 
that we can still keep HBASE-12600 goal. 

If overall read performance does not degrade much (because this part may not be 
the bottleneck in the read path), I think it's better to keep the current way so 
that all cases work correctly for out-of-order puts. What do you guys think? Thanks.
 


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-17 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500883#comment-14500883
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Sure. After discussions we had ([~stack], [~apurtell], and I) I think we can 
reinstate my original code, where we check against the oldest running scanner 
and if all Cells in an HFile are older than that scanner (in terms of MVCC 
timestamp) we can set them to 0 and not store them upon compaction.

The observation being that we only need MVCC stamps in the HFile to cover 
flushes/compactions that happen while a current scanner is running.
All other cases we should be covering with metadata in the HFiles trailer, not 
on individual Cells.
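
A rough Java sketch of the check described above. It is not the attached patch 
and not HBase's API: the class, the method names, and the idea of passing the 
oldest scanner's read point as a plain long are assumptions for illustration.

```java
/** Hypothetical sketch of the "drop mvcc on compaction" check; not HBase code. */
final class MvccStripSketch {
    /**
     * Sequence id to persist for one cell during compaction. A cell at or below
     * the oldest running scanner's read point is visible to every scanner, so its
     * stamp no longer affects visibility and can be written as 0.
     */
    static long seqIdToWrite(long cellSeqId, long smallestReadPoint) {
        return cellSeqId <= smallestReadPoint ? 0L : cellSeqId;
    }

    /**
     * File-level variant: when even the newest cell in the file is at or below the
     * read point, no cell needs its stamp stored.
     */
    static boolean canSkipMvcc(long maxSeqIdInFile, long smallestReadPoint) {
        return maxSeqIdInFile <= smallestReadPoint;
    }
}
```

The file-level check is what lets a reader skip parsing the per-cell value 
altogether, which is the read-time saving this issue is about.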

[~jeffreyz], do you agree? HBASE-12600 changed that, and I bet you had a very 
good reason.



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492743#comment-14492743
 ] 

stack commented on HBASE-13389:
---

[~lhofhansl] Yes, in a subtask.  Let's figure out this 6 days vs 3 days vs a 
couple of hours and the other items raised here as other subtasks or issues.


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492722#comment-14492722
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Should we apply my patch?


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-08 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486820#comment-14486820
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Misunderstood the comparison order. All good :)



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482558#comment-14482558
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

Changing the comparison order to row -> column -> ts -> seqId -> type can make 
things more consistent and doesn't change HBase's current idempotence. For 
example, for puts with the same timestamp, the last put wins, while if we do 
put, delete, put or delete, put, put, the delete always wins.
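
The two orderings can be compared with a minimal Java sketch. Everything in it 
is hypothetical (SketchCell, the boolean delete flag, both compare methods); it 
only assumes, as this thread describes, that newer timestamps and newer seqIds 
sort first and that a delete currently sorts ahead of a put with the same ts.

```java
/** Hypothetical, simplified cell; not HBase's Cell or KeyValue. */
final class SketchCell {
    final String row;
    final String column;
    final long ts;
    final long seqId;
    final boolean delete;

    SketchCell(String row, String column, long ts, long seqId, boolean delete) {
        this.row = row;
        this.column = column;
        this.ts = ts;
        this.seqId = seqId;
        this.delete = delete;
    }
}

final class CellOrderSketch {
    /** Shared prefix of both orders: row, then column, then ts with newer first. */
    private static int compareKey(SketchCell a, SketchCell b) {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.column.compareTo(b.column);
        if (c != 0) return c;
        return Long.compare(b.ts, a.ts);
    }

    /** Order as described for today: type before seqId, so a delete sorts first. */
    static int compareTypeFirst(SketchCell a, SketchCell b) {
        int c = compareKey(a, b);
        if (c != 0) return c;
        c = Boolean.compare(b.delete, a.delete);
        if (c != 0) return c;
        return Long.compare(b.seqId, a.seqId);
    }

    /** Proposed order: seqId before type, so the most recent edit sorts first. */
    static int compareSeqIdFirst(SketchCell a, SketchCell b) {
        int c = compareKey(a, b);
        if (c != 0) return c;
        c = Long.compare(b.seqId, a.seqId);
        if (c != 0) return c;
        return Boolean.compare(b.delete, a.delete);
    }
}
```

For a delete with seqId 2 followed by a put with seqId 3 at the same timestamp, 
compareTypeFirst still puts the delete first, while compareSeqIdFirst puts the 
later put first.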

I think it's better that a delete be treated like a put so users can have the 
same expectations as for puts. Otherwise, on an OS with low time resolution, or 
when a put is missing, we often need to check whether a delete is overshadowing 
newer puts.

Yeah, keeping mvcc for 3 days is good enough. 



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482550#comment-14482550
 ] 

stack commented on HBASE-13389:
---

[~lhofhansl] 3 days is arbitrary. If it so happens that someone has failed to 
observe a backed-up replication and doesn't notice it till a week has gone by, 
will they lose data? Can we not have the optimization instead cut in when it is 
guaranteed the mvcc is no longer needed?

In-cluster, it seems a matter of hours will do it as long as we check that 
locally flushing is working. Below are a few 'statements' of why I think it 
should be fine.  For replication, as I read it, edits get a new seqid when 
applied to the sink cluster so source cluster seqid doesn't factor in at all. 
Maybe I misread.

As per Enis, I don't get how idempotency is affected.

Notes on in-cluster:

 * For log replay in-cluster, with Distributed Log Replay, if a Region gets an 
edit with a seqid that fits inside a range covered by an existing hfile, then 
we can just drop it because it is already persisted. This would be for the case 
where an old WAL is being replayed though the edits have been flushed out to 
hfiles already (and the optimization dropping mvcc has been run).
 * If a region can't flush, then we should not run the optimization (This is 
probably ok... compaction will likely just not succeed if we can't flush but 
optimization should check last flush time).
 * If no replication, optimization can run on any file as long as no 
outstanding scanners and read point is beyond the oldest edit in an hfile 
(optimization does this now I believe).
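
The first note above can be sketched in Java as follows. This is an 
illustration only: the class and method are hypothetical, and representing each 
hfile's coverage as a {min, max} pair of sequence ids is an assumption rather 
than how HBase stores it.

```java
import java.util.List;

/** Hypothetical sketch of dropping already-persisted edits during log replay. */
final class ReplayFilterSketch {
    /**
     * Returns true when the replayed edit's seqid falls inside the range covered
     * by some existing hfile, in which case the note above says it can be dropped.
     * Each array is assumed to hold {minSeqId, maxSeqId} for one hfile.
     */
    static boolean alreadyPersisted(long editSeqId, List<long[]> hfileSeqIdRanges) {
        for (long[] range : hfileSeqIdRanges) {
            if (editSeqId >= range[0] && editSeqId <= range[1]) {
                return true;
            }
        }
        return false;
    }
}
```

The check only needs file-level ranges, which is why it can still work after 
the per-cell mvcc has been purged, provided the ranges themselves are reliable.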


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482334#comment-14482334
 ] 

Enis Soztutar commented on HBASE-13389:
---

bq. I would actually be against it, since it breaks the fact that all mutations 
in HBase are idempotent - when the client encounters any problem with a batch 
of updates, it can just do those again, and the outcome would be identical 

I don't understand how this is related to idempotent updates. The sort order 
proposed will still keep ts before the type/seqId.  
3 days should be good enough for replication I say. 


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482322#comment-14482322
 ] 

Lars Hofhansl commented on HBASE-13389:
---

I think we had comment overlap. :)

bq. ...you are not against changing sort order so that seqid prevails over type 
are you...?

I would actually be against it, since it breaks the fact that all mutations in 
HBase are idempotent - when the client encounters any problem with a batch of 
updates, it can just do those again, and the outcome would be identical - 
within the limits of what HBase defines, i.e. with ms resolution. Now we would 
complicate that, and would have some explaining to do.

So with the discussion above in place, can we lower the default time to 3 days? 
So that we can be reasonably sure that major compactions would purge the mvcc 
cruft?


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482283#comment-14482283
 ] 

stack commented on HBASE-13389:
---

bq. So with all this I don't see any reason to keep these for more than a few 
hours.

It's not log rolling as per Enis. It is when the memstore is flushed.  Default is 
memstores are flushed at least once an hour (the value is in milliseconds):

 public static final int DEFAULT_CACHE_FLUSH_INTERVAL = 3600000;

So if an old edit comes in during distributed log replay, an edit that has 
already been flushed to an hfile, we need to be able to put it in the 
appropriate slot (as you say). This can happen if we are overplaying edits in 
case where Master does not have last flush sequenceid on a region. If HFiles 
have all their seqids, it is easy.  But if mvcc has been purged from hfiles 
(optimization) and we get an edit that falls into the hfile time range, we are 
going to be confused.  Somehow the optimization purging mvcc should not run 
until we are sure old WALs with seqids older than those in hfiles for all 
regions have been let go.

For replication, yeah, needs a few days.  The root of the lag may take a few 
days to fix.

On the put -> delete -> put, you are not against changing sort order so that 
seqid prevails over type are you [~lhofhansl]? Would be good change for 2.0.


[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482262#comment-14482262
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Yeah, not related to log roll, sorry. I meant the max time before we force a 
memstore flush (1 hour by default)... HBASE-5930.

I still have not heard a convincing reason why the time to keep the mvcc stuff 
around needs to be greater than an hour or two :)




[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482163#comment-14482163
 ] 

Enis Soztutar commented on HBASE-13389:
---

bq. when we replay data due to recovery we want it to fall into the right place 
w.r.t. existing data. Why do we need more than the maximum time to roll a log 
(1h)?
I think the minimum time to keep it is the maximum time an edit can live in the 
memstore without being flushed. This is not related to log roll (since we still 
replay edits from a previous log roll) but to how far back an edit can be 
replayed through dist log replay, I think. 
Case 3, as Jeff puts it, is an issue with the comparison order. We compare 
entries in {{row -> column -> ts -> type -> seqId}} order; however, we should 
compare entries in {{row -> column -> ts -> seqId -> type}} order so that Put, 
Delete, Put with the same TS works. If we get better resolution for ts's, this 
is not needed though. 
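
A small sketch of the two orderings may make the Put, Delete, Put case concrete. It is a simplified model, not HBase's comparator: the `Edit` class, the numeric type codes, and the assumption that a larger type code and a larger seqId sort first are all choices made for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SeqIdOrderSketch {
    // Illustrative type codes; the larger code (Delete) sorts first.
    static final int PUT = 4;
    static final int DELETE = 8;

    static final class Edit {
        final String row;
        final String column;
        final long ts;
        final int type;
        final long seqId;

        Edit(String row, String column, long ts, int type, long seqId) {
            this.row = row;
            this.column = column;
            this.ts = ts;
            this.type = type;
            this.seqId = seqId;
        }
    }

    // Shared prefix: row and column ascending, newest timestamp first.
    static int comparePrefix(Edit a, Edit b) {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = a.column.compareTo(b.column);
        if (c != 0) return c;
        return Long.compare(b.ts, a.ts);
    }

    // row -> column -> ts -> type -> seqId
    static final Comparator<Edit> TYPE_THEN_SEQID = (a, b) -> {
        int c = comparePrefix(a, b);
        if (c != 0) return c;
        c = Integer.compare(b.type, a.type);
        if (c != 0) return c;
        return Long.compare(b.seqId, a.seqId);
    };

    // row -> column -> ts -> seqId -> type
    static final Comparator<Edit> SEQID_THEN_TYPE = (a, b) -> {
        int c = comparePrefix(a, b);
        if (c != 0) return c;
        c = Long.compare(b.seqId, a.seqId);
        if (c != 0) return c;
        return Integer.compare(b.type, a.type);
    };

    // Put (seqId 1), Delete (seqId 2), Put (seqId 3), all on the same row, column and ts.
    static List<Edit> sample() {
        return Arrays.asList(
            new Edit("r", "c", 10L, PUT, 1L),
            new Edit("r", "c", 10L, DELETE, 2L),
            new Edit("r", "c", 10L, PUT, 3L));
    }

    // The entry a reader would meet first under the given ordering.
    static Edit first(List<Edit> edits, Comparator<Edit> order) {
        return Collections.min(edits, order);
    }

    public static void main(String[] args) {
        System.out.println(first(sample(), TYPE_THEN_SEQID).type == DELETE); // true
        System.out.println(first(sample(), SEQID_THEN_TYPE).seqId);          // 3
    }
}
```

Under the first ordering the Delete is met before either Put, so the later Put stays masked; under the second the Put with seqId 3 comes first.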



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482106#comment-14482106
 ] 

stack commented on HBASE-13389:
---

I have been trying to write up the life of a sequenceid: 
https://docs.google.com/document/d/16beczDie-KU1uSpJvd0GoUlQbPtQBL93rOOPqnE5Ma0/edit#
  Let me pick it up again. Will add in the above notes. Would be sweet if we could 
backfill tests that verify our expectations align with the story we are telling.



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481916#comment-14481916
 ] 

Lars Hofhansl commented on HBASE-13389:
---

We do need to revisit the 6 days, right? Would 3 days be enough? 

Lemme try to understand the cases:
# when we replay data due to recovery we want it to fall into the right place 
w.r.t. existing data. Why do we need more than the maximum time to roll a log 
(1h)?
# replication... Yeah, that's important. I'd say if you have a replication lag 
of more than a few hours you have a larger problem anyway.
# This too... Although I do not actually agree that this is an advantage. 
Mutations (including deletes) being idempotent in HBase is a feature and not a 
problem.

So with all this I do not see any reason to keep these for more than a few hours. 
It's very possible that I am missing something.



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481348#comment-14481348
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

{quote}
To what does the above statement apply? To all three of your 'cases' or just to 
the last case, case #3?
{quote}
Just for case #3. The other two cases need mvcc around for a little bit of time.



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396059#comment-14396059
 ] 

stack commented on HBASE-13389:
---

bq. Seems to me not needed(before I thought we need to keep mvcc around till a 
major compaction)

[~jeffreyz] Please say more. To what does the above statement apply? To all 
three of your 'cases' or just to the last case, case #3?




[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396012#comment-14396012
 ] 

stack commented on HBASE-13389:
---

@jeffrey zhong when you say at the end of the first comment above "seems to me not 
needed...", I do not follow. Can you say more?  Thanks



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395995#comment-14395995
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

There is another thought. If we can keep mvcc as part of the key byte 
array (logically it is, but not in key serialization & deserialization), then we 
could use a lazy read approach, because mvcc is hardly used during key comparison.
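
A minimal sketch of the lazy read idea, under stated assumptions: the `LazySeqIdCell` class and its layout (key bytes followed by the encoded seqId) are hypothetical, and the variable-length encoding used here is a plain 7-bits-per-byte varint chosen for brevity, not the vlong format HBase writes.

```java
final class LazySeqIdCell {
    private final byte[] buf;      // key bytes followed by the encoded seqId
    private final int seqIdOffset; // index of the first encoded seqId byte
    private long seqId;            // cached value, valid once decoded is true
    private boolean decoded;
    private int decodeCount;       // how often the byte-by-byte parse actually ran

    private LazySeqIdCell(byte[] buf, int seqIdOffset) {
        this.buf = buf;
        this.seqIdOffset = seqIdOffset;
    }

    // Builds a cell by appending the varint-encoded seqId to the key bytes.
    static LazySeqIdCell of(byte[] key, long seqId) {
        if (seqId < 0) {
            throw new IllegalArgumentException("seqId must be non-negative");
        }
        byte[] buf = new byte[key.length + 10];
        System.arraycopy(key, 0, buf, 0, key.length);
        int pos = key.length;
        long v = seqId;
        while ((v & ~0x7FL) != 0) {
            buf[pos++] = (byte) ((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        buf[pos] = (byte) v;
        return new LazySeqIdCell(buf, key.length);
    }

    // Key-only work (such as most comparisons) never touches the seqId bytes.
    int keyLength() {
        return seqIdOffset;
    }

    // Decodes the seqId on first use only, then serves the cached value.
    long getSequenceId() {
        if (!decoded) {
            long result = 0;
            int shift = 0;
            int pos = seqIdOffset;
            byte b;
            do {
                b = buf[pos++];
                result |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            seqId = result;
            decoded = true;
            decodeCount++;
        }
        return seqId;
    }

    int decodeCount() {
        return decodeCount;
    }
}
```

A scan that skips many cells on key comparison alone would, in this model, never pay for the parse on the cells it skips.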



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395994#comment-14395994
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

Thanks [~lhofhansl] for looking at this. I think your patch can help a bit.

{quote}
Do we need valid (non 0) mvcc readpoints for committed data (i.e. data that was 
flushed to an HFile and hence we'll never need to replay any HLogs for those)? 
Do we need these anywhere but in the memstore?
{quote}
There are three cases (that I could think of, and maybe more) where we need the 
logSeqId (mvcc) around to help us keep the put order.

Assuming all puts/deletes are of the same row & timestamp (version): 
case 1) region server recovery case
We need mvcc (logSeqId) only when the region is in recovery mode but not after 
recovery.

case 2) replication receiving side: we need logSeqId to maintain the order 
because a region move or recovery on the replication playing side causes puts 
to arrive out of order.
We need mvcc for a couple of days (to be safe) so that at least the data 
eventually on the receiving side is correct.

case 3) put, delete, put. Currently the delete overshadows the later put, but with 
logSeqId we can easily solve the issue because logSeqId is the real version of 
a put.
Seems to me not needed (before, I thought we needed to keep mvcc around till a 
major compaction)



 

 




[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395875#comment-14395875
 ] 

Lars Hofhansl commented on HBASE-13389:
---

Lastly... 6 days from HBASE-11315 is not right. We have major compaction set to 
be done every 7 days, with a jitter of 1/2 week.
I.e. data might be major compacted as early as after _3.5_ days. The retention 
of mvcc data should be less than that. Maybe 3 days or so.
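
The 3.5 day figure follows from the numbers above. A quick check, assuming the jitter is a fraction of the period applied symmetrically around it (the class and method names are made up for this sketch):

```java
final class MajorCompactionWindow {
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    // Earliest point at which a major compaction may run.
    static long earliestMs(long periodMs, double jitterFraction) {
        return periodMs - (long) (periodMs * jitterFraction);
    }

    // Latest point at which a major compaction may run.
    static long latestMs(long periodMs, double jitterFraction) {
        return periodMs + (long) (periodMs * jitterFraction);
    }

    public static void main(String[] args) {
        long period = 7 * DAY_MS; // major compaction every 7 days
        double jitter = 0.5;      // jitter of half the period, i.e. 1/2 week
        System.out.println(earliestMs(period, jitter) / (double) DAY_MS); // 3.5
        System.out.println(latestMs(period, jitter) / (double) DAY_MS);   // 10.5
    }
}
```

Any mvcc retention longer than the earliest value (3.5 days here) can outlive a major compaction, which is why something like 3 days is suggested.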




[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395851#comment-14395851
 ] 

Lars Hofhansl commented on HBASE-13389:
---

[~jeffreyz], [~stack], I have to admit I do not quite follow the reasoning 
behind HBASE-12600.
The main question I have:
Do we need valid (non 0) mvcc readpoints for committed data (i.e. data that was 
flushed to an HFile and hence we'll never need to replay any HLogs for those)? 
Do we need these anywhere but in the memstore?




[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395844#comment-14395844
 ] 

Lars Hofhansl commented on HBASE-13389:
---

It turns out that the optimization in HBASE-8151 and HBASE-9751 still works, 
but only after 6 days, when compactions allow setting mvcc readpoints to 0.

I think we can get the optimization for HBASE-8166 back and still have 
HBASE-12600 work correctly, if we replace this:
{code}
-    final boolean needMvcc = fd.maxMVCCReadpoint >= smallestReadPoint;
+
     final Compression.Algorithm compression = store.getFamily().getCompactionCompression();
     StripeMultiFileWriter.WriterFactory factory = new StripeMultiFileWriter.WriterFactory() {
       @Override
       public Writer createWriter() throws IOException {
         return store.createWriterInTmp(
-            fd.maxKeyCount, compression, true, needMvcc, fd.maxTagsLength > 0);
+            fd.maxKeyCount, compression, true, true, fd.maxTagsLength > 0);
       }
{code}

With this:

{code}
-    final boolean needMvcc = fd.maxMVCCReadpoint >= smallestReadPoint;
+    final boolean needMvcc = fd.maxMVCCReadpoint > 0;
     final Compression.Algorithm compression = store.getFamily().getCompactionCompression();
     StripeMultiFileWriter.WriterFactory factory = new StripeMultiFileWriter.WriterFactory() {
       @Override
       public Writer createWriter() throws IOException {
         return store.createWriterInTmp(
             fd.maxKeyCount, compression, true, needMvcc, fd.maxTagsLength > 0);
       }
{code}

So when all mvcc readpoints are 0, the next compaction can then still do the 
optimization for HBASE-8166 and not write the mvcc information at all. It just 
will be later... Before, we already did that when we did not have any scanner 
open with a readpoint older than any of the readpoints in the HFile; now we 
have to wait until compactions set them all to 0.

It's not all that bad. [~stack], if the data is older than 6 days I'd expect 
this to no longer show in the profiler.

Maybe we need to write some unittests for this, although I assume that won't be 
easy.
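
For comparison, the three variants of the decision discussed in this comment can be written side by side. This is a sketch with made-up method names, not the compactor code. The proposed check is written as strictly greater than zero, so that a file whose readpoints are all 0 skips the mvcc data, as the text above describes.

```java
final class NeedMvccSketch {
    private NeedMvccSketch() {}

    // Before HBASE-12600: write mvcc only if some open scanner could still need it.
    static boolean beforeHbase12600(long maxMvccReadpoint, long smallestReadPoint) {
        return maxMvccReadpoint >= smallestReadPoint;
    }

    // After HBASE-12600: mvcc is always written.
    static boolean afterHbase12600() {
        return true;
    }

    // Proposal: skip mvcc once compactions have set every readpoint in the inputs to 0.
    static boolean proposed(long maxMvccReadpoint) {
        return maxMvccReadpoint > 0;
    }

    public static void main(String[] args) {
        System.out.println(beforeHbase12600(5L, 10L)); // false: no scanner is that old
        System.out.println(afterHbase12600());         // true
        System.out.println(proposed(0L));              // false: all readpoints already 0
        System.out.println(proposed(17L));             // true: seqids still present
    }
}
```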




[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394693#comment-14394693
 ] 

stack commented on HBASE-13389:
---

bq. I'm surprised the extra mvcc value caused so much perf regression. 

Yeah, weirdly I see it costing us a bunch. Will report better over in 
HBASE-13291.

bq. Should we keep the time period configuration shorter or revert all related 
changes? 

We'll figure it out, [~jeffreyz]. Thanks for the input.




[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-02 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394114#comment-14394114
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

Should we keep the time period configuration shorter or revert all related 
changes? Thanks.



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-02 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394111#comment-14394111
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

[~stack] The performance regression arises because we now keep mvcc values 
longer (HBASE-11315), which is what led to this later change: 
https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96
 

I'm surprised the extra mvcc value caused so much perf regression. Here is the 
code in Compactor.java that calculates the minSeqId to keep in the file during 
compaction. 

{code}
// when isAllFiles is true, all files are compacted so we can calculate the smallest
// MVCC value to keep
if (fd.minSeqIdToKeep < file.getMaxMemstoreTS()) {
  fd.minSeqIdToKeep = file.getMaxMemstoreTS();
}

// output to writer:
for (Cell c : cells) {
  if (cleanSeqId && c.getSequenceId() <= smallestReadPoint) {
    CellUtil.setSequenceId(c, 0);
  }
{code}
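
To make the second half of that excerpt concrete, here is a minimal standalone sketch of the same check. It does not use the HBase classes: SketchCell and cleanSequenceIds are hypothetical names made up for illustration, and only the condition (cleanSeqId is set and the cell's sequence id is at or below smallestReadPoint) is taken from the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;

public class SeqIdCleanupSketch {

    /** Hypothetical stand-in for a Cell; only the sequence id matters here. */
    static final class SketchCell {
        final String row;
        long sequenceId;

        SketchCell(String row, long sequenceId) {
            this.row = row;
            this.sequenceId = sequenceId;
        }
    }

    /**
     * Mirrors the condition in the excerpt: when cleanSeqId is set, any cell
     * whose sequence id is at or below the smallest read point has its
     * sequence id reset to 0. Newer cells keep their sequence id.
     */
    static void cleanSequenceIds(List<SketchCell> cells, boolean cleanSeqId,
                                 long smallestReadPoint) {
        for (SketchCell c : cells) {
            if (cleanSeqId && c.sequenceId <= smallestReadPoint) {
                c.sequenceId = 0;
            }
        }
    }

    public static void main(String[] args) {
        List<SketchCell> cells = new ArrayList<>();
        cells.add(new SketchCell("r1", 5L));
        cells.add(new SketchCell("r2", 10L));
        cells.add(new SketchCell("r3", 12L));

        // Assumed read point for the example; not a value from the issue.
        cleanSequenceIds(cells, true, 10L);

        for (SketchCell c : cells) {
            System.out.println(c.row + " seqId=" + c.sequenceId);
        }
    }
}
```

The point of the sketch is only that the reset depends on cleanSeqId and on the read point at compaction time; if mvcc values are kept longer, fewer cells pass this check and more of them carry a non-zero sequence id into the output file.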
