[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-14 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208077#comment-13208077
 ] 

Phabricator commented on HBASE-5241:


stack has commented on the revision "HBASE-5241 [jira] Deletes should not mask 
Puts that come after it.".

  Not sure I grok completely what's going on.  Where is the extra cost we pay in 
seeking?

  Good stuff Amit.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:74 
This is ugly!
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:189 
This used to be compare to zero.  Was it wrong?
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:195 
Or, I suppose -1L now means what 0L used to?
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:277 
Why this change?
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java:845 We 
remove these assertions because delete behavior has changed?

REVISION DETAIL
  https://reviews.facebook.net/D1731


> Deletes should not mask Puts that come after it.
> 
>
> Key: HBASE-5241
> URL: https://issues.apache.org/jira/browse/HBASE-5241
> Project: HBase
> Issue Type: Improvement
> Reporter: Amitanand Aiyer
> Attachments: HBASE-5241.D1731.1.patch
>
>
> Suppose that we have a delete row, followed by a put. The delete row can
> mask the put, unless there was a major compaction in between.
> Now that we are flushing the memstoreTS to disk along with the KVs, we
> should be able to differentiate whether or not the Put happened after the
> Delete, and offer better delete semantics.
> Couldn't find a pre-existing JIRA that already discusses this, so creating
> one.
> Seems related to https://issues.apache.org/jira/browse/HBASE-2406, but is not
> quite the same.
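
To make the issue description concrete, here is a minimal sketch of the two masking rules being compared. The class and method names are hypothetical and do not mirror HBase's real KeyValue or ScanDeleteTracker code. It also assumes that "the Put happened after the Delete" can be read as "the Put has a higher memstoreTS than the Delete".

```java
// Sketch only: hypothetical types, not HBase's KeyValue or ScanDeleteTracker.
final class DeleteMaskSketch {
    /** A cell version: client-visible timestamp plus server-assigned memstoreTS. */
    static final class Cell {
        final long timestamp;
        final long memstoreTS;

        Cell(long timestamp, long memstoreTS) {
            this.timestamp = timestamp;
            this.memstoreTS = memstoreTS;
        }
    }

    /** Existing rule: a row delete masks any put whose timestamp is not newer. */
    static boolean maskedByTimestamp(Cell delete, Cell put) {
        return put.timestamp <= delete.timestamp;
    }

    /**
     * Proposed rule (assumption: strict memstoreTS comparison): the delete
     * must also have been applied after the put.
     */
    static boolean maskedWithMemstoreTS(Cell delete, Cell put) {
        return put.timestamp <= delete.timestamp
                && put.memstoreTS < delete.memstoreTS;
    }

    public static void main(String[] args) {
        Cell delete = new Cell(100L, 1L); // delete row at timestamp 100, applied first
        Cell put = new Cell(50L, 2L);     // put dated at timestamp 50, applied second
        System.out.println("timestamp only:  masked=" + maskedByTimestamp(delete, put));
        System.out.println("with memstoreTS: masked=" + maskedWithMemstoreTS(delete, put));
    }
}
```

Under the timestamp-only rule the later put stays hidden until a major compaction removes the delete marker; under the sketched memstoreTS rule it is visible immediately.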

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-14 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208117#comment-13208117
 ] 

Lars Hofhansl commented on HBASE-5241:
--

This entire approach seems wrong to me. Things in HBase have a timestamp, and 
one of the nicest parts about HBase is that the actual order in which 
operations are applied does not matter.

This will break replication, where operations can arrive out of order, and other 
code in HBase.

Unless somebody provides a very compelling use case I'm -1 on this general 
direction.






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Amitanand Aiyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208757#comment-13208757
 ] 

Amitanand Aiyer commented on HBASE-5241:


@Lars. Sure. I see there can be issues with replication.

Is that something that cannot be fixed?  

I am not really familiar with the replication code path. But, say, if we ship 
the memstoreTS along with the KVs during replication, would that not take care 
of the out-of-order issue?





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208765#comment-13208765
 ] 

Lars Hofhansl commented on HBASE-5241:
--

I fear we'd have two chronological dimensions now. One as indicated by the 
timestamps and another indicated by the order in which the changes are 
physically applied (memstoreTS).

This "problem" is really only a problem when Deletes are dated into the future 
or Puts are dated in the past. Any app doing this must be aware of the 
implications.
It just seems like a non-issue to me :)

Replication just happens to be a place where I can see problems. I'm sure 
there are more (multi actions, etc).
Is the memstoreTS written to the WAL? (Replication uses WAL shipping.)






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208768#comment-13208768
 ] 

Phabricator commented on HBASE-5241:


aaiyer has commented on the revision "HBASE-5241 [jira] Deletes should not mask 
Puts that come after it.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:74 
fixing this in the next version.

  will update, once I fix the tests as well.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:189 
I guess -1 or 0, both would work.

  It is initialized to -1L, but used to get set to 0 on reset. That 
didn't make sense.

  Chose one at random.
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java:202 
This is one perf penalty we pay for the better consistency semantics.

  We can only zero-out memstoreTS upon major compaction. Not when all readers 
get past the read point.
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java:845 that 
was the initial direction.

  I'm working on keeping things backward compatible, so this will get reverted.

REVISION DETAIL
  https://reviews.facebook.net/D1731






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Amitanand Aiyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208769#comment-13208769
 ] 

Amitanand Aiyer commented on HBASE-5241:


@stack. The potential performance slowdown on seek is due to this:

In ScanQueryMatcher, we used to return getNextRowOrNextColumn(bytes, offset, 
qualLength) for FAMILY_DELETED and COLUMN_DELETED, because once we see a KV 
that is deleted due to a family or a column delete, all the remaining KVs 
(with a lower timestamp) are guaranteed to be deleted.

Now, we return SKIP instead. This change is required because there might be a 
KV later in the file that has a lower timestamp but a higher memstoreTS 
(so that deleteFamily does not apply). In this case, we end up moving 1 KV at a 
time, instead of potentially skipping the entire column or row.
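
The cost difference described above can be illustrated with a toy model of one column's versions, sorted by descending timestamp. This is a sketch under stated assumptions: SeekCostSketch, KV, scanWithSeek and scanWithSkip are hypothetical names, and the real ScanQueryMatcher works on match codes and seeks rather than on an in-memory list.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: a toy model of one column, not HBase's ScanQueryMatcher.
final class SeekCostSketch {
    static final class KV {
        final long timestamp;
        final long memstoreTS;

        KV(long timestamp, long memstoreTS) {
            this.timestamp = timestamp;
            this.memstoreTS = memstoreTS;
        }
    }

    /** Old behaviour: the first deleted KV ends the column. Returns {examined, surviving}. */
    static int[] scanWithSeek(List<KV> versions, long deleteTs) {
        int examined = 0;
        int surviving = 0;
        for (KV kv : versions) {
            examined++;
            if (kv.timestamp <= deleteTs) {
                break; // all later KVs have lower timestamps, so jump to the next column
            }
            surviving++;
        }
        return new int[] {examined, surviving};
    }

    /** New behaviour: SKIP one KV at a time, since a later KV may have a higher memstoreTS. */
    static int[] scanWithSkip(List<KV> versions, long deleteTs, long deleteMemstoreTS) {
        int examined = 0;
        int surviving = 0;
        for (KV kv : versions) {
            examined++;
            boolean deleted = kv.timestamp <= deleteTs && kv.memstoreTS < deleteMemstoreTS;
            if (!deleted) {
                surviving++;
            }
        }
        return new int[] {examined, surviving};
    }

    public static void main(String[] args) {
        // Column delete at timestamp 100, applied with memstoreTS 5.
        List<KV> versions = Arrays.asList(
                new KV(120L, 6L), new KV(90L, 1L), new KV(80L, 2L),
                new KV(70L, 9L), // written after the delete, but with a lower timestamp
                new KV(60L, 3L));
        System.out.println("seek: " + Arrays.toString(scanWithSeek(versions, 100L)));
        System.out.println("skip: " + Arrays.toString(scanWithSkip(versions, 100L, 5L)));
    }
}
```

In this toy column the seek variant stops after two KVs, while the skip variant has to look at all five to find the one extra survivor. With many versions per column the gap grows accordingly.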









[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Amitanand Aiyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208773#comment-13208773
 ] 

Amitanand Aiyer commented on HBASE-5241:


After some discussion, we figured that skipping to the next column vs. skipping 
to the next KV shouldn't make that much of a difference. So, we headed in this 
direction (as opposed to trying to create indices or change the sort order).


Will need to see, once the patch is complete, whether the read performance gets 
affected significantly.





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Amitanand Aiyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208777#comment-13208777
 ] 

Amitanand Aiyer commented on HBASE-5241:


@Lars. I see your point. It's definitely debatable whether the timestamp is 
something that should be exposed to the client (to control) or something that 
should be considered an internal detail (so mess it up at your own risk).

We do have applications that control the timestamp, so we might need this 
(internally at least). Not sure if that is the only one in use, or if there are 
more.

Wrt the WALs: in the current codebase, I believe that we do not write the 
memstoreTS. But that can be fixed if needed.





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209068#comment-13209068
 ] 

Lars Hofhansl commented on HBASE-5241:
--

Skipping to the next column can be much more efficient than skipping to the next 
KV if there are many versions. In fact stack had filed a bug about it (can't 
find it) and you FB folks put in the fix to skip to the next column.
Sorry to be the party killer here, but let's please not do this. :(






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Amitanand Aiyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209079#comment-13209079
 ] 

Amitanand Aiyer commented on HBASE-5241:


@Lars: Was discussing the replication issue a little more with Kannan. It does 
seem like there may be corner cases in which the exact order does matter. Even 
if we leave things the way they are; and clients do not take control of the 
timestamp.

Say, for example, we have two Puts from the client side. If both hit the 
server in quick succession, they could both be issued the same millisecond 
timestamp. Which one of them wins will then be entirely determined by the 
order in which they are applied. If we apply them in a different order, we could 
end up with different values.

It seems to me that to ensure determinism in terms of what the clients see, it 
would be crucial to have an internal timestamp that orders every operation 
(using something like Lamport's logical clock, instead of a real clock).

I do agree with you that it would be nice if timestamp were considered an 
internal detail that clients don't take control of. But, we would still have to 
include memstoreTS or Log Seq Id to ensure deterministic replay.
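
The same-millisecond case can be sketched as follows. Everything here is illustrative: TieBreakSketch, Put and seqId are hypothetical names, and seqId only stands in for whatever internal ordering (memstoreTS, log sequence id, or a logical clock) would travel with each edit. Both methods assume a non-empty list.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: seqId is a hypothetical stand-in for memstoreTS or a log sequence id.
final class TieBreakSketch {
    static final class Put {
        final long timestamp;
        final long seqId;
        final String value;

        Put(long timestamp, long seqId, String value) {
            this.timestamp = timestamp;
            this.seqId = seqId;
            this.value = value;
        }
    }

    /** Timestamp only: on a tie, whichever put is applied last wins. */
    static String winnerByApplyOrder(List<Put> applied) {
        Put winner = null;
        for (Put p : applied) {
            if (winner == null || p.timestamp >= winner.timestamp) {
                winner = p;
            }
        }
        return winner.value;
    }

    /** Timestamp, then seqId: the winner no longer depends on apply order. */
    static String winnerBySeqId(List<Put> applied) {
        Put winner = null;
        for (Put p : applied) {
            if (winner == null
                    || p.timestamp > winner.timestamp
                    || (p.timestamp == winner.timestamp && p.seqId > winner.seqId)) {
                winner = p;
            }
        }
        return winner.value;
    }

    public static void main(String[] args) {
        Put a = new Put(1000L, 1L, "a");
        Put b = new Put(1000L, 2L, "b"); // same millisecond, issued second
        System.out.println(winnerByApplyOrder(Arrays.asList(a, b)));
        System.out.println(winnerByApplyOrder(Arrays.asList(b, a)));
        System.out.println(winnerBySeqId(Arrays.asList(a, b)));
        System.out.println(winnerBySeqId(Arrays.asList(b, a)));
    }
}
```

The tie-break only helps on replay if the same seqId is shipped with the edit. A replica that assigns its own ids on arrival would be back to apply-order semantics.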





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Amitanand Aiyer (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209083#comment-13209083
 ] 

Amitanand Aiyer commented on HBASE-5241:


Yeah, I'm worried about the performance slowdown as well. 





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209085#comment-13209085
 ] 

Lars Hofhansl commented on HBASE-5241:
--

If two things happen at the same time there *is* no right order. The fact that 
we have limited timer resolution is not that relevant here.

We just discussed another scenario internally here, where we have 
application-level replicas of a table. One way to do this is to have the client 
write to two HBase clusters and also have a catch-up background task which 
copies older cells (from before we started the dual-writing) to the replica. 
We will use this a lot to catch up standby clusters, etc. This would also not 
work any longer.

Anyway, if there are other committers that feel that we need this I won't veto 
it (but I am -0.5 on it). And it must be configurable without any performance 
detriment when disabled (i.e. delete still seeks to the next column). I'd also 
vote to default off.

Maybe some of the other committers would like to comment? @Stack, @Ted, @Todd?
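
The dual-write plus catch-up scenario above can be sketched with one put and one delete arriving in different orders on the primary and the replica. This is an illustration under assumptions, not HBase code: CatchUpSketch and Op are hypothetical, and the arrival index stands in for a locally assigned memstoreTS.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: the arrival index stands in for a locally assigned memstoreTS.
final class CatchUpSketch {
    static final class Op {
        final boolean isDelete;
        final long timestamp;

        Op(boolean isDelete, long timestamp) {
            this.isDelete = isDelete;
            this.timestamp = timestamp;
        }
    }

    /** Returns whether the (single) put is visible after all ops have arrived. */
    static boolean putVisible(List<Op> arrivalOrder, boolean honorArrivalOrder) {
        Op put = null;
        Op delete = null;
        int putSeq = -1;
        int deleteSeq = -1;
        for (int i = 0; i < arrivalOrder.size(); i++) {
            Op op = arrivalOrder.get(i);
            if (op.isDelete) {
                delete = op;
                deleteSeq = i;
            } else {
                put = op;
                putSeq = i;
            }
        }
        if (put == null) {
            return false;
        }
        if (delete == null) {
            return true;
        }
        boolean coveredByTimestamp = put.timestamp <= delete.timestamp;
        if (!honorArrivalOrder) {
            return !coveredByTimestamp; // timestamp-only rule
        }
        return !(coveredByTimestamp && putSeq < deleteSeq); // arrival-order rule
    }

    public static void main(String[] args) {
        Op put = new Op(false, 100L);
        Op delete = new Op(true, 200L);
        List<Op> primary = Arrays.asList(put, delete); // original write order
        List<Op> replica = Arrays.asList(delete, put); // delete dual-written, put copied later
        System.out.println("timestamp rule: " + putVisible(primary, false)
                + " / " + putVisible(replica, false));
        System.out.println("arrival rule:   " + putVisible(primary, true)
                + " / " + putVisible(replica, true));
    }
}
```

With the timestamp-only rule both clusters agree that the put is hidden. With the arrival-order rule the replica exposes a cell that the primary considers deleted, which is the divergence being warned about.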






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209092#comment-13209092
 ] 

Zhihong Yu commented on HBASE-5241:
---

I am in favor of internal timestamp so that we don't rely so much on real clock.
The initiative for this feature is to remove indeterminate behavior w.r.t. the 
timing of major compaction.

I am in support of this feature. We can turn it off by default.





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209114#comment-13209114
 ] 

Phabricator commented on HBASE-5241:


aaiyer has commented on the revision "HBASE-5241 [jira] Deletes should not mask 
Puts that come after it.".

INLINE COMMENTS
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:277 
Technically, this should have never passed with the old settings 
expectedResults = 2.

  It was passing due to a bug, which deleted version 0 whenever there was a 
delete for version 1. Changing familyStamp to -1 should fix this.

  But, I'm having trouble convincing myself expectedResults = 3 is undoubtedly 
correct. It seems debatable. Any thoughts?

REVISION DETAIL
  https://reviews.facebook.net/D1731






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-15 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209142#comment-13209142
 ] 

Phabricator commented on HBASE-5241:


tedyu has commented on the revision "HBASE-5241 [jira] Deletes should not mask 
Puts that come after it.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/HConstants.java:402 Put HBASE-5241 here.
  src/main/java/org/apache/hadoop/hbase/HConstants.java:408 Should this be 
turned off by default ?
  src/main/java/org/apache/hadoop/hbase/regionserver/DeleteTracker.java:45 Add 
@param for memstoreTS in these two methods.
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:1750 If there 
is no better way of handling, remove this line.
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:1763 This 
special constant should be defined and documented.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:49 
This member doesn't seem to be used anywhere.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:50 
This neither.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:66 
Would memstoreTSForDelete be a better name ?
  Add comment please.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:65 
Would timestampForDelete be a better name ?
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:89 
Add @param for memstoreTS.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:93 
Indentation for these two lines.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:60 
Add comments for these fields please
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:62 
How about naming this field memstoreTSForDeleteCol ?
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:118 
Should we consider checking these two timestamps separately ?

REVISION DETAIL
  https://reviews.facebook.net/D1731


> Deletes should not mask Puts that come after it.
> 
>
> Key: HBASE-5241
> URL: https://issues.apache.org/jira/browse/HBASE-5241
> Project: HBase
>  Issue Type: Improvement
>Reporter: Amitanand Aiyer
> Attachments: HBASE-5241.D1731.1.patch, HBASE-5241.D1731.2.patch
>
>
> Suppose that we have a delete row, and then followed by the put. The delete 
> row
> can mask the put, unless there was a major compaction in between.
> Now that we are flushing the memstoreTS to disk, along with the KVs, we 
> should be able
> to differentiate whether or not the Put happened after the Delete and offer 
> better 
> delete semantics.
> Couldn't find a pre-existing JIRA that already discusses this, so creating 
> one.
> Seems related to https://issues.apache.org/jira/browse/HBASE-2406, but is not 
> quite the same.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209533#comment-13209533
 ] 

Lars Hofhansl commented on HBASE-5241:
--

@Ted: Fair enough.
Here's another thing that will break: Master-Master replication. The 
memstoreTSs generated by the regionservers have no meaning w.r.t. each other.
Also, since the replication sink accesses the replicated cluster through the 
normal API we need to add (public?) APIs to pass the memstoreTS through.

And I can already see folks who want to manipulate the memstoreTS from the 
outside, bringing us back to where we are.

> Deletes should not mask Puts that come after it.
> 
>
> Key: HBASE-5241
> URL: https://issues.apache.org/jira/browse/HBASE-5241
> Project: HBase
>  Issue Type: Improvement
>Reporter: Amitanand Aiyer
> Attachments: HBASE-5241.D1731.1.patch, HBASE-5241.D1731.2.patch, 
> HBASE-5241.D1731.3.patch
>
>
> Suppose that we have a delete row, and then followed by the put. The delete 
> row
> can mask the put, unless there was a major compaction in between.
> Now that we are flushing the memstoreTS to disk, along with the KVs, we 
> should be able
> to differentiate whether or not the Put happened after the Delete and offer 
> better 
> delete semantics.
> Couldn't find a pre-existing JIRA that already discusses this, so creating 
> one.
> Seems related to https://issues.apache.org/jira/browse/HBASE-2406, but is not 
> quite the same.





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210076#comment-13210076
 ] 

Lars Hofhansl commented on HBASE-5241:
--

Need to think about HBASE-4536 as well. The main problem there was to work out 
when it is safe to delete the (family) delete markers - the solution was to 
store the smallest TS of any Put KV in the store file's metadata. I think that 
method should still work with your change.


> Deletes should not mask Puts that come after it.
> 
>
> Key: HBASE-5241
> URL: https://issues.apache.org/jira/browse/HBASE-5241
> Project: HBase
>  Issue Type: Improvement
>Reporter: Amitanand Aiyer
>Assignee: Amitanand Aiyer
> Attachments: HBASE-5241.D1731.1.patch, HBASE-5241.D1731.2.patch, 
> HBASE-5241.D1731.3.patch
>
>
> Suppose that we have a delete row, and then followed by the put. The delete 
> row
> can mask the put, unless there was a major compaction in between.
> Now that we are flushing the memstoreTS to disk, along with the KVs, we 
> should be able
> to differentiate whether or not the Put happened after the Delete and offer 
> better 
> delete semantics.
> Couldn't find a pre-existing JIRA that already discusses this, so creating 
> one.
> Seems related to https://issues.apache.org/jira/browse/HBASE-2406, but is not 
> quite the same.





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-16 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210097#comment-13210097
 ] 

Phabricator commented on HBASE-5241:


lhofhansl has commented on the revision "HBASE-5241 [jira] Deletes should not 
mask Puts that come after it.".

  I have voiced my concern amply in the jira :)

  Implementation-wise this looks reasonable enough. See a few questions and 
comments inline.


INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:1748 Is this 
right? Now we're always dating column or family way into the future.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java:223 
See HBASE-4926 on why this might be a performance problem.
  The seeking was just recently put in to address issues like this.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:155 
Ugh... Although there shouldn't be too many family delete markers.
  Are you going to do some performance tests (or is this in production at FB 
already?)
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:155 
This needs to be conditioned on ENFORCE_STRICTER_SEMANTICS, right?
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:166 
Same here
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:173 
and here

REVISION DETAIL
  https://reviews.facebook.net/D1731






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-17 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210444#comment-13210444
 ] 

Phabricator commented on HBASE-5241:


aaiyer has commented on the revision "HBASE-5241 [jira] Deletes should not mask 
Puts that come after it.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:1748 This 
will only happen for Deletes (Column and Family). The idea is that the Delete 
shall apply to all the puts, with a lower memstoreTS, regardless of their 
timestamp -- even if it is in "future".

  Subsequent Puts etc. will not get masked by the Delete, because they should 
have a memstoreTS that is larger.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:155 
This is not yet in production. But, if we decide to go down this route, we will 
definitely test it out for performance.

  Haven't optimised much here, since I don't expect there to be too many 
family delete markers.

  Will revisit if the assumption turns out to be false.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:155 
I'm not sure if we want to put this under ENFORCE_STRICTER_SEMANTICS 

  My understanding was that it would be better to have Puts not be masked by 
previous Deletes, regardless.

  Whether we are willing to pay the extra performance cost for it was the 
trade-off controlled by ENFORCE_STRICTER_SEMANTICS.

  If there is a good reason for clients to expect that the Put will be masked 
by previous Deletes, we can definitely guard this with the flag.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:173 
Perhaps, I might rename this class to something different, and we can add a 
flag in ScanQueryMatcher to instantiate the appropriate DeleteTracker.
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java:223 
Agree that this is going to be a performance issue here.

  But, this is just a V-1 to get the general idea out. I'm hopeful, we can 
optimise the codepath so that we incur the performance penalty only when there 
is really a later KV with a higher memstoreTS.

  We currently do not have a way to tell that. But it can be done: say, dump a 
flag while writing the HFile if there is a memstoreTS inversion, or something 
along those lines.

  Will try to optimise this, if needed, along those lines.
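
  To make the rule from the HRegion.java:1748 comment above concrete, here is 
a minimal sketch of the two masking predicates. It is not the code in the 
patch: the Kv class and both method names are hypothetical, and the 
"timestamp only" rule is a simplification of the existing DeleteFamily / 
DeleteColumn behaviour.

```java
// Hypothetical sketch; not the ScanDeleteTracker code from the patch.
final class DeleteMaskingSketch {

    /** Minimal stand-in for a KeyValue: cell timestamp plus memstoreTS. */
    static final class Kv {
        final long timestamp;   // user-supplied or server-assigned cell timestamp
        final long memstoreTS;  // write order assigned by the region server
        Kv(long timestamp, long memstoreTS) {
            this.timestamp = timestamp;
            this.memstoreTS = memstoreTS;
        }
    }

    /** Simplified existing rule: the delete covers every Put at or below its timestamp. */
    static boolean maskedByTimestampOnly(Kv delete, Kv put) {
        return put.timestamp <= delete.timestamp;
    }

    /**
     * Rule as described in the comment above (an assumption about the patch):
     * the delete covers every Put written before it, whatever its timestamp.
     */
    static boolean maskedByMemstoreTS(Kv delete, Kv put) {
        return put.memstoreTS < delete.memstoreTS;
    }
}
```

  Under the second predicate, a Put with a timestamp in the "future" that was 
written before the Delete is masked, while any Put written after the Delete 
is not, because its memstoreTS is larger.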

REVISION DETAIL
  https://reviews.facebook.net/D1731






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211190#comment-13211190
 ] 

Lars Hofhansl commented on HBASE-5241:
--

What about the fact that currently Deletes and Puts are idempotent? With this 
change a failed Put or Delete cannot just be redone, because the effect might 
be different.





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-02-20 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212214#comment-13212214
 ] 

Todd Lipcon commented on HBASE-5241:


{quote}
Now that we are flushing the memstoreTS to disk, along with the KVs, we should 
be able
to differentiate whether or not the Put happened after the Delete and offer 
better 
delete semantics.
{quote}

What are the "better semantics" that we would offer? ie, if I do:
- put value "a" at ts=1
- delete at ts=3
- put value "b" at ts=2

and I do a read with "current time" semantics, do you expect to see "b" or 
nothing? I'm not convinced that "b" is a "better semantic" here, except for the 
point that it makes major compaction more transparent. The transparency of 
compaction is sort of nice, but compaction is already not transparent because 
of time travel reads (except for the "always keep versions" stuff that we did 
recently)
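
The three operations above can be replayed in a small sketch to show what each 
semantic returns. Everything here is hypothetical (the Op class, the read 
method, and the memstoreTS values 1, 2, 3 assigned in arrival order); it 
models a single cell and is not HBase's read path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-cell model of the example above; not HBase code.
final class ReadSemanticsSketch {

    static final class Op {
        final boolean isDelete;
        final String value;     // null for deletes
        final long ts;          // cell timestamp
        final long memstoreTS;  // arrival order (assumed values)
        Op(boolean isDelete, String value, long ts, long memstoreTS) {
            this.isDelete = isDelete;
            this.value = value;
            this.ts = ts;
            this.memstoreTS = memstoreTS;
        }
    }

    /** Returns the newest visible value by timestamp, or null if all Puts are masked. */
    static String read(List<Op> ops, boolean useMemstoreTS) {
        Op newest = null;
        for (Op put : ops) {
            if (put.isDelete) continue;
            boolean masked = false;
            for (Op del : ops) {
                if (!del.isDelete) continue;
                boolean covers = useMemstoreTS
                        ? put.memstoreTS < del.memstoreTS  // proposed: only earlier writes
                        : put.ts <= del.ts;                // current: timestamps only
                if (covers) { masked = true; break; }
            }
            if (!masked && (newest == null || put.ts > newest.ts)) newest = put;
        }
        return newest == null ? null : newest.value;
    }

    static List<Op> exampleOps() {
        List<Op> ops = new ArrayList<>();
        ops.add(new Op(false, "a", 1L, 1L));  // put value "a" at ts=1
        ops.add(new Op(true, null, 3L, 2L));  // delete at ts=3
        ops.add(new Op(false, "b", 2L, 3L));  // put value "b" at ts=2
        return ops;
    }

    public static void main(String[] args) {
        System.out.println(read(exampleOps(), false)); // prints "null"
        System.out.println(read(exampleOps(), true));  // prints "b"
    }
}
```

With timestamps alone the read sees nothing; with the memstoreTS rule it sees 
"b", which is also what a read would return after a major compaction had 
run between the delete and the second put.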





[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-03-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221653#comment-13221653
 ] 

Phabricator commented on HBASE-5241:


aaiyer has commented on the revision "HBASE-5241 [jira] Deletes should not mask 
Puts that come after it.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:609 It seems 
like going for stronger read semantics would disable the space savings we get 
by storing the memstoreTS as a (1 byte) Zero value, instead of the actual value.

  Perhaps, one way to get similar savings would be to store the actual 
memstoreTS in the variable length encoding, after differential encoding.

   Here is what I propose.
  Keep track of a per-StoreFile startMemstoreTS value, that (approximately) 
keeps track of the smallest memstoreTS in the file.

 KVs will store deltas such that KV memstoreTS = 
StoreFile startMemstoreTS + KV delta.

 If the delta is small enough, we will only use 1 or 2 bytes for storing 
it, since we use Bytes.writeVLong. From the javadocs:

  if n in [-32, 127): encode in one byte with the actual value. Otherwise,
  if n in [-20*2^8, 20*2^8): encode in two bytes: byte[0] = n/256 - 52; 
byte[1]=n&0xff. Otherwise,
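
  As a rough illustration of the proposal, the sketch below computes a per-KV 
delta and encodes it using only the one- and two-byte cases quoted above. The 
class and method names are hypothetical, larger values are deliberately 
rejected because the excerpt does not describe them, and n/256 is taken to 
mean floor division (n >> 8), which is an assumption about the quoted formula.

```java
// Hypothetical sketch of the delta encoding idea; not HBase or Hadoop code.
final class MemstoreTsDeltaSketch {

    /** Delta stored per KV, assuming KV memstoreTS = startMemstoreTS + delta. */
    static long delta(long kvMemstoreTS, long startMemstoreTS) {
        return kvMemstoreTS - startMemstoreTS;
    }

    /** Encoded size for the two quoted cases; -1 for values the excerpt does not cover. */
    static int encodedLength(long n) {
        if (n >= -32 && n < 127) return 1;
        if (n >= -20 * 256 && n < 20 * 256) return 2;
        return -1;
    }

    /** Encodes n with the one- or two-byte form quoted above. */
    static byte[] encode(long n) {
        switch (encodedLength(n)) {
            case 1:
                return new byte[] { (byte) n };
            case 2:
                // n >> 8 is floor(n / 256); the low byte is n & 0xff.
                return new byte[] { (byte) ((n >> 8) - 52), (byte) (n & 0xff) };
            default:
                throw new IllegalArgumentException("outside the quoted ranges: " + n);
        }
    }

    public static void main(String[] args) {
        long start = 1_000_000L;  // assumed per-StoreFile startMemstoreTS
        System.out.println(encode(delta(1_000_100L, start)).length);  // delta 100
        System.out.println(encode(delta(1_003_000L, start)).length);  // delta 3000
    }
}
```

  A delta of 100 fits in one byte and a delta of 3000 in two, so a store file 
whose memstoreTS values are clustered near startMemstoreTS would keep most of 
the space savings.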


REVISION DETAIL
  https://reviews.facebook.net/D1731






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-03-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221710#comment-13221710
 ] 

Phabricator commented on HBASE-5241:


tedyu has commented on the revision "HBASE-5241 [jira] Deletes should not mask 
Puts that come after it.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:609 I think 
this suggestion is a good idea.

REVISION DETAIL
  https://reviews.facebook.net/D1731






[jira] [Commented] (HBASE-5241) Deletes should not mask Puts that come after it.

2012-03-03 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221817#comment-13221817
 ] 

Phabricator commented on HBASE-5241:


aaiyer has commented on the revision "HBASE-5241 [jira] Deletes should not mask 
Puts that come after it.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java:223 
mbautin, kannan: Just thinking aloud ...

If we are able to keep track, at flush time, of the Rows/Rows+Col where we 
see that a DeleteColumn/DeleteFamily is followed by a Put/KV with a higher 
memstoreTS, we might be able to skip ahead to getNextRowOrNextColumn as 
before, in almost all cases except ones where there actually was a back-fill.

   Would it be possible, given the hfilev2 structure, to be able to add more 
kinds of bloom blocks to keep track of this information?
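
   A minimal sketch of the bookkeeping described above, under stated 
assumptions: the Kv class and both methods are hypothetical, tracking is per 
row only (not Row+Col), and an exact HashSet stands in for the bloom block. A 
real bloom filter could report false positives, which would only send some 
extra rows down the slow path.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical flush-time tracker; not part of the patch or of HFile v2.
final class BackfillTrackerSketch {

    static final class Kv {
        final String row;
        final boolean isDelete;
        final long timestamp;
        final long memstoreTS;
        Kv(String row, boolean isDelete, long timestamp, long memstoreTS) {
            this.row = row;
            this.isDelete = isDelete;
            this.timestamp = timestamp;
            this.memstoreTS = memstoreTS;
        }
    }

    /** Rows where a Put sorts under a delete marker but was written after it. */
    static Set<String> rowsWithBackfill(List<Kv> flushed) {
        Set<String> rows = new HashSet<>();
        for (Kv del : flushed) {
            if (!del.isDelete) continue;
            for (Kv put : flushed) {
                if (!put.isDelete
                        && put.row.equals(del.row)
                        && put.timestamp <= del.timestamp
                        && put.memstoreTS > del.memstoreTS) {
                    rows.add(del.row);
                }
            }
        }
        return rows;
    }

    /** The scanner may seek past the delete only when the row has no back-fill. */
    static boolean canSkipAhead(Set<String> backfilledRows, String row) {
        return !backfilledRows.contains(row);
    }
}
```

   The quadratic scan is only for brevity; a flush already visits KVs in 
sorted order, so the same set could be built in a single pass.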


REVISION DETAIL
  https://reviews.facebook.net/D1731

