[jira] [Comment Edited] (HBASE-10885) Support visibility expressions on Deletes
[ https://issues.apache.org/jira/browse/HBASE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052112#comment-14052112 ]

ramkrishna.s.vasudevan edited comment on HBASE-10885 at 7/4/14 3:02 AM:

[~enis] Ping !! Can I commit this to branch-1?

was (Author: ram_krish):
[~enis] Ping !! Can I commit this to 0.98?

> Support visibility expressions on Deletes
> -----------------------------------------
>
> Key: HBASE-10885
> URL: https://issues.apache.org/jira/browse/HBASE-10885
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.98.1
> Reporter: Andrew Purtell
> Assignee: ramkrishna.s.vasudevan
> Priority: Blocker
> Fix For: 0.99.0, 1.0.0, 0.98.4
>
> Attachments:
> 10885-org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes-output.txt,
> HBASE-10885_0.98_1.patch, HBASE-10885_1.patch, HBASE-10885_2.patch,
> HBASE-10885_branch_1.patch, HBASE-10885_new_tag_type_1.patch,
> HBASE-10885_new_tag_type_2.patch, HBASE-10885_v1.patch,
> HBASE-10885_v12.patch, HBASE-10885_v12.patch, HBASE-10885_v13.patch,
> HBASE-10885_v15.patch, HBASE-10885_v17.patch, HBASE-10885_v2.patch,
> HBASE-10885_v2.patch, HBASE-10885_v2.patch, HBASE-10885_v3.patch,
> HBASE-10885_v4.patch, HBASE-10885_v5.patch, HBASE-10885_v7.patch,
> HBASE-10885_v8.patch, HBASE-10885_v9.patch
>
> Accumulo can specify visibility expressions for delete markers. During
> compaction the cells covered by the tombstone are determined in part by
> matching the visibility expression. This is useful for the use case of data
> set coalescing, where entries from multiple data sets carrying different
> labels are combined into one common large table. Later, a subset of entries
> can be conveniently removed using visibility expressions.
>
> Currently doing the same in HBase would only be possible with a custom
> coprocessor. Otherwise, a Delete will affect all cells covered by the
> tombstone regardless of any visibility expression scoping. This is correct
> behavior in that no data spill is possible, but certainly could be
> surprising, and is only meant to be transitional. We decided not to support
> visibility expressions on Deletes to control the complexity of the initial
> implementation.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HBASE-10885) Support visibility expressions on Deletes
[ https://issues.apache.org/jira/browse/HBASE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980520#comment-13980520 ]

Andrew Purtell edited comment on HBASE-10885 at 4/25/14 12:01 AM:

On sorting of terminals or not, a discussion that Ram, Anoop, and I had included this topic and it seems reasonable to change the serialization. I think we should start by splitting out the ad hoc visibility tag serialization in VisibilityController to a separate file. We could put magic bytes in front and test for those, falling back to an expensive comparison if we don't find the magic, otherwise use one optimized for sorted representation. While we are at it we could use protobuf for the new serialization and so the magic preamble would be 'PBUF' I suppose.

was (Author: apurtell):
On sorting of terminals or not, a discussion that Ram, Anoop, and I had included this topic and it seems reasonable to change the serialization. I think we should start by splitting out the custom visibility tag serialization in VisibilityController to a separate file. We could put magic bytes in front and test for those, falling back to an expensive comparison if we don't find the magic, otherwise use one optimized for sorted representation. While we are at it we could use protobuf for the new serialization and so the magic preamble would be 'PBUF' I suppose.

> Support visibility expressions on Deletes
> -----------------------------------------
>
> Key: HBASE-10885
> URL: https://issues.apache.org/jira/browse/HBASE-10885
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.98.1
> Reporter: Andrew Purtell
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 0.99.0, 0.98.2
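
The magic-preamble idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the serialization that was committed: the class name `VisibilityTagCompare`, the one-byte-per-label-ordinal payload, and the sort-based fallback are all hypothetical, and the comment proposes protobuf for the real payload.

```java
import java.util.Arrays;

public class VisibilityTagCompare {
    // Magic preamble suggested in the comment; everything after it is an assumed layout.
    static final byte[] MAGIC = {'P', 'B', 'U', 'F'};

    // True when the tag starts with the magic bytes, i.e. uses the new sorted format.
    static boolean hasMagic(byte[] tag) {
        if (tag.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (tag[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Label ordinals without the preamble (hypothetical: one byte per ordinal).
    static byte[] payload(byte[] tag) {
        return hasMagic(tag)
            ? Arrays.copyOfRange(tag, MAGIC.length, tag.length)
            : tag.clone();
    }

    static boolean sameExpression(byte[] a, byte[] b) {
        if (hasMagic(a) && hasMagic(b)) {
            // Fast path: both tags are in the sorted format, so bytes compare directly.
            return Arrays.equals(a, b);
        }
        // Fallback: at least one tag is in the old unsorted format, so sort copies first.
        byte[] pa = payload(a);
        byte[] pb = payload(b);
        Arrays.sort(pa);
        Arrays.sort(pb);
        return Arrays.equals(pa, pb);
    }
}
```

One limitation of this sketch is that an old-format tag whose first four bytes happen to spell 'PBUF' would be misread as the new format; a real design would need to rule that out.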
[jira] [Comment Edited] (HBASE-10885) Support visibility expressions on Deletes
[ https://issues.apache.org/jira/browse/HBASE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13980520#comment-13980520 ]

Andrew Purtell edited comment on HBASE-10885 at 4/25/14 12:00 AM:

On sorting of terminals or not, a discussion that Ram, Anoop, and I had included this topic and it seems reasonable to change the serialization. I think we should start by splitting out the custom visibility tag serialization in VisibilityController to a separate file. We could put magic bytes in front and test for those, falling back to an expensive comparison if we don't find the magic, otherwise use one optimized for sorted representation. While we are at it we could use protobuf for the new serialization and so the magic preamble would be 'PBUF' I suppose.

was (Author: apurtell):
On sorting of terminals or not, a discussion that Ram, Anoop, and I included this topic and it seems reasonable to change the serialization. I think we should start by splitting out the custom visibility tag serialization in VisibilityController to a separate file. We could put magic bytes in front and test for those, falling back to an expensive comparison if we don't find the magic, otherwise use one optimized for sorted representation. While we are at it we could use protobuf for the new serialization and so the magic preamble would be 'PBUF' I suppose.

> Support visibility expressions on Deletes
> -----------------------------------------
>
> Key: HBASE-10885
> URL: https://issues.apache.org/jira/browse/HBASE-10885
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.98.1
> Reporter: Andrew Purtell
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 0.99.0, 0.98.2
[jira] [Comment Edited] (HBASE-10885) Support visibility expressions on Deletes
[ https://issues.apache.org/jira/browse/HBASE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958648#comment-13958648 ]

Andrew Purtell edited comment on HBASE-10885 at 4/3/14 9:00 AM:

bq. Doing like what ACL does may be easier because we could see which subject issues the delete. If a super user/admin that makes the put does the delete then we can just allow the delete to happen.

Above I suggest splitting the authorization check and the actual delete handling. Do the authorization check in the preDelete hook because we have the user's effective label set in the RPC context. Do the delete handling in compaction because for the deleteColumn or deleteFamily cases if we convert that delete request to a set of per-cell deletes, this could produce an explosion of tombstones.

bq. Apart from this with the ACL delete handling case, some doubts regarding the handling of the deleteColumn() - which deletes only the latest version. But with the current implementation even though the current version allows the delete with valid permissions for the user, because there is an older version with lesser permission we deny the delete. Is that valid? same applies with deleteFamily() also.

Yes, the rule is all visible cells with an ACL must allow the delete, or the delete will be denied. However, we should respect the MAX_VERSIONS of column families as defined in the schema when determining the scope of visibility and so changes are needed for that (HBASE-10899).

was (Author: apurtell):
bq. Doing like what ACL does may be easier because we could see which subject issues the delete. If a super user/admin that makes the put does the delete then we can just allow the delete to happen.

Above I suggest splitting the authorization check and the actual delete handling. Do the authorization check in the preDelete hook because we have the user's effective label set in the RPC context. Do the delete handling in compaction because for the deleteColumn or deleteFamily cases if we convert that delete request to a set of per-cell deletes, this could produce an explosion of tombstones.

bq. Apart from this with the ACL delete handling case, some doubts regarding the handling of the deleteColumn() - which deletes only the latest version. But with the current implementation even though the current version allows the delete with valid permissions for the user, because there is an older version with lesser permission we deny the delete. Is that valid? same applies with deleteFamily() also.

Yes, the rule is all visible cells with an ACL must allow the delete, or the delete will be denied. However, we should respect the MAX_VERSION of the schema when determining the scope of visibility and so changes are needed for that (HBASE-10899).

> Support visibility expressions on Deletes
> -----------------------------------------
>
> Key: HBASE-10885
> URL: https://issues.apache.org/jira/browse/HBASE-10885
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.98.1
> Reporter: Andrew Purtell
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.99.0, 0.98.2
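
The ACL rule stated in the comment above (every visible cell must allow the delete, with visibility bounded by MAX_VERSIONS) might be pictured like this. It is a toy model, not the AccessController code: the class `AclDeleteCheck`, the boolean-per-version input, and the reading of "visible" as "the newest MAX_VERSIONS versions" are all assumptions made for illustration.

```java
import java.util.List;

public class AclDeleteCheck {
    /**
     * Hypothetical check: versionsNewestFirst holds one flag per stored version of a
     * cell, true when that version's ACL lets the user delete it. Only the newest
     * maxVersions entries are treated as visible (an assumption about the rule).
     */
    static boolean deleteAllowed(List<Boolean> versionsNewestFirst, int maxVersions) {
        int visible = Math.min(maxVersions, versionsNewestFirst.size());
        for (int i = 0; i < visible; i++) {
            if (!versionsNewestFirst.get(i)) {
                return false; // one visible version denies, so the whole delete is denied
            }
        }
        return true;
    }
}
```

With versions `[true, false]`, the delete is denied when two versions are in scope and allowed when only the newest one is, which is the behavioral difference the MAX_VERSIONS remark points at.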
[jira] [Comment Edited] (HBASE-10885) Support visibility expressions on Deletes
[ https://issues.apache.org/jira/browse/HBASE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956348#comment-13956348 ]

Andrew Purtell edited comment on HBASE-10885 at 4/1/14 10:42 AM:

bq. Delete.setCellVisibility() should be supported now.

Yes.

bq. And these labels passed here will be only a list of labels and not visibility expressions like A|B!C?

No. Deletes should support visibility expressions just like Put, etc. The supplied visibility expression is then associated with the delete marker(s).

Actually, scratch what I said above in the first comment. We can store the delete marker as a cell with a visibility expression tag and do the work later, by hooking the compaction scanner. We would check for visibility expressions in tags on delete markers at compaction time. If we find one, then we have to filter only the cells covered by the tombstone that have a matching expression.

If we are not storing visibility expression terminals (LeafExpressionNodes) in sorted order by ordinal we probably should consider it. (I don't think we are.) Because e.g. A|B == B|A. It would be most efficient if we can simply do byte comparison of serialized visibility expressions on the delete marker and any found while enumerating cells covered by it.

If a delete marker has a visibility expression, then we only apply it to cells with matching visibility expressions. If a cell has no visibility tag then it does not match. (A|B != nil)

Should we check that the supplied expression does not exceed the maximal authorization set for the user submitting the Delete in the preDelete hook? In other words, should we allow a user only granted authorization A to submit a delete with visibility expression A|B? We should not, in my opinion. Recommend we answer this question for other op types on another JIRA, should there be any.

was (Author: apurtell):
bq. Delete.setCellVisibility() should be supported now.

Yes.

bq. And these labels passed here will be only a list of labels and not visibility expressions like A|B!C?

No. Deletes should support visibility expressions just like Put, etc. The supplied visibility expression is then associated with the delete marker(s).

Actually, scratch what I said above in the first comment. We can store the delete marker as a cell with a visibility expression tag and do the work later, by hooking the compaction scanner. We would check for visibility expressions in tags on delete markers at compaction time. If we find one, then we have to filter only the cells covered by the tombstone that have a matching expression.

If we are not storing visibility expression terminals (LeafExpressionNodes) in sorted order by ordinal we probably should consider it. (I don't think we are.) Because e.g. A|B == B|A. It would be most efficient if we can simply do byte comparison of serialized visibility expressions on the delete marker and any found while enumerating cells covered by it.

If a delete marker has a visibility expression, then we only apply it to cells with matching visibility expressions. If a cell has no visibility tag then it does not match. (A|B != nil)

Should we check that the supplied expression does not exceed the maximal authorization set for the user submitting the Delete in the preDelete hook? In other words, should we allow a user only granted authorization A to submit a delete with visibility expression A|B? We should not, in my opinion. It is different for the delete case than others because delete is a destructive operation. Recommend we answer this question for other op types on another JIRA, should there be any.

> Support visibility expressions on Deletes
> -----------------------------------------
>
> Key: HBASE-10885
> URL: https://issues.apache.org/jira/browse/HBASE-10885
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.98.1
> Reporter: Andrew Purtell
> Fix For: 0.99.0, 0.98.2
[jira] [Comment Edited] (HBASE-10885) Support visibility expressions on Deletes
[ https://issues.apache.org/jira/browse/HBASE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956348#comment-13956348 ]

Andrew Purtell edited comment on HBASE-10885 at 4/1/14 10:40 AM:

bq. Delete.setCellVisibility() should be supported now.

Yes.

bq. And these labels passed here will be only a list of labels and not visibility expressions like A|B!C?

No. Deletes should support visibility expressions just like Put, etc. The supplied visibility expression is then associated with the delete marker(s).

Actually, scratch what I said above in the first comment. We can store the delete marker as a cell with a visibility expression tag and do the work later, by hooking the compaction scanner. We would check for visibility expressions in tags on delete markers at compaction time. If we find one, then we have to filter only the cells covered by the tombstone that have a matching expression.

If we are not storing visibility expression terminals (LeafExpressionNodes) in sorted order by ordinal we probably should consider it. (I don't think we are.) Because e.g. A|B == B|A. It would be most efficient if we can simply do byte comparison of serialized visibility expressions on the delete marker and any found while enumerating cells covered by it.

If a delete marker has a visibility expression, then we only apply it to cells with matching visibility expressions. If a cell has no visibility tag then it does not match. (A|B != nil)

Should we check that the supplied expression does not exceed the maximal authorization set for the user submitting the Delete in the preDelete hook? In other words, should we allow a user only granted authorization A to submit a delete with visibility expression A|B? We should not, in my opinion. It is different for the delete case than others because delete is a destructive operation. Recommend we answer this question for other op types on another JIRA, should there be any.

was (Author: apurtell):
bq. Delete.setCellVisibility() should be supported now.

Yes.

bq. And these labels passed here will be only a list of labels and not visibility expressions like A|B!C?

No. Deletes should support visibility expressions just like Put, etc. The supplied visibility expression is then associated with the delete marker(s).

Actually, scratch what I said above in the first comment. I think we can check that the supplied expression does not exceed the maximal authorization set for the user submitting the Delete in the preDelete hook and then store the delete marker as a cell with a visibility expression tag and do the rest of the work later, by hooking the compaction scanner. The big change then would be checking for visibility expressions in tags on delete markers at compaction time. If we find one, then we have to filter only the cells covered by the tombstone that have a matching expression.

If we are not storing visibility expression terminals (LeafExpressionNodes) in sorted order by ordinal we probably should consider it. (I don't think we are.) Because e.g. A|B == B|A. It would be most efficient if we can simply do byte comparison of serialized visibility expressions on the delete marker and any found while enumerating cells covered by it.

If a delete marker has a visibility expression, then we only apply it to cells with matching visibility expressions. If a cell has no visibility tag then it does not match. (A|B != nil)

> Support visibility expressions on Deletes
> -----------------------------------------
>
> Key: HBASE-10885
> URL: https://issues.apache.org/jira/browse/HBASE-10885
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.98.1
> Reporter: Andrew Purtell
> Fix For: 0.99.0, 0.98.2
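
The matching rule from the comment above (sorted terminals so that A|B and B|A serialize identically, plain byte comparison, and no match for an untagged cell) could look roughly like the sketch below. The class `DeleteMarkerMatch`, the four-bytes-per-ordinal layout, and the treatment of a delete marker without a tag (it keeps the existing cover-everything behavior) are assumptions for illustration, not the format HBase adopted.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class DeleteMarkerMatch {
    // Hypothetical canonical form: label ordinals sorted ascending, four bytes each.
    static byte[] serializeSorted(int... ordinals) {
        int[] sorted = ordinals.clone();
        Arrays.sort(sorted);
        ByteBuffer buf = ByteBuffer.allocate(4 * sorted.length);
        for (int ordinal : sorted) {
            buf.putInt(ordinal);
        }
        return buf.array();
    }

    // Decides whether a tombstone with deleteTag applies to a cell carrying cellTag.
    static boolean covers(byte[] deleteTag, byte[] cellTag) {
        if (deleteTag == null) {
            return true;  // assumption: an untagged delete behaves as it does today
        }
        if (cellTag == null) {
            return false; // a cell with no visibility tag never matches (A|B != nil)
        }
        return Arrays.equals(deleteTag, cellTag); // cheap because both sides are sorted
    }
}
```

Sorting once at write time is what makes the compaction-time check a single array comparison per covered cell.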
[jira] [Comment Edited] (HBASE-10885) Support visibility expressions on Deletes
[ https://issues.apache.org/jira/browse/HBASE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956348#comment-13956348 ]

Andrew Purtell edited comment on HBASE-10885 at 4/1/14 10:25 AM:

bq. Delete.setCellVisibility() should be supported now.

Yes.

bq. And these labels passed here will be only a list of labels and not visibility expressions like A|B!C?

No. Deletes should support visibility expressions just like Put, etc. The supplied visibility expression is then associated with the delete marker(s).

Actually, scratch what I said above in the first comment. I think we can check that the supplied expression does not exceed the maximal authorization set for the user submitting the Delete in the preDelete hook and then store the delete marker as a cell with a visibility expression tag and do the rest of the work later, by hooking the compaction scanner. The big change then would be checking for visibility expressions in tags on delete markers at compaction time. If we find one, then we have to filter only the cells covered by the tombstone that have a matching expression.

If we are not storing visibility expression terminals (LeafExpressionNodes) in sorted order by ordinal we probably should consider it. (I don't think we are.) Because e.g. A|B == B|A. It would be most efficient if we can simply do byte comparison of serialized visibility expressions on the delete marker and any found while enumerating cells covered by it.

If a delete marker has a visibility expression, then we only apply it to cells with matching visibility expressions. If a cell has no visibility tag then it does not match. (A|B != nil)

was (Author: apurtell):
bq. Delete.setCellVisibility() should be supported now.

Yes.

bq. And these labels passed here will be only a list of labels and not visibility expressions like A|B!C?

No. Deletes should support visibility expressions just like Put, etc. The supplied visibility expression is then associated with the delete marker(s).

Actually, scratch what I said above in the first comment. I think we can check that the supplied expression does not exceed the maximal authorization set for the user submitting the Delete in the preDelete hook and then store the delete marker as a cell with a visibility expression tag and do the rest of the work later, by hooking the compaction scanner. The big change then would be checking for visibility expressions in tags on delete markers at compaction time. If we find one, then we have to filter only the cells covered by the tombstone that have a matching expression.

If we are not storing visibility expression terminals (LeafExpressionNodes) in sorted order by ordinal we probably should consider it. (I don't think we are.) Because e.g. A|B == B|A. It would be most efficient if we can simply do byte comparison of serialized visibility expressions.

> Support visibility expressions on Deletes
> -----------------------------------------
>
> Key: HBASE-10885
> URL: https://issues.apache.org/jira/browse/HBASE-10885
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.98.1
> Reporter: Andrew Purtell
> Fix For: 0.99.0, 0.98.2
>
> Accumulo can specify visibility expressions for delete markers. During
> compaction the cells covered by the tombstone are determined in part by
> matching the visibility expression. This is useful for the use case of data
> set coalescing, where entries from multiple data sets carrying different
> labels are combined into one common large table. Later, a subset of entries
> can be conveniently removed using visibility expressions.
>
> Currently doing the same in HBase would only be possible with a custom
> coprocessor. Otherwise, a Delete will affect all cells covered by the
> tombstone regardless if they are visible to the user issuing the delete or
> not. This is correct behavior in that no data spill is possible, but
> certainly could be surprising, and is only meant to be transitional. We
> decided not to support visibility expressions on Deletes to control the
> complexity of the initial implementation.
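
The preDelete check proposed in the comment above (the supplied expression must not exceed the user's maximal authorization set) might be approximated as a subset test over the labels an expression mentions. This is a sketch under assumptions: `DeleteAuthCheck` is a hypothetical class, labels are assumed to be plain `[A-Za-z0-9_]+` tokens, and the expression is not parsed or evaluated, only scanned for label names.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DeleteAuthCheck {
    // Assumed label syntax; operators (&, |, !) and parentheses are simply skipped.
    private static final Pattern LABEL = Pattern.compile("[A-Za-z0-9_]+");

    // Collects every label name that appears anywhere in the expression.
    static Set<String> labelsIn(String expression) {
        Set<String> labels = new HashSet<>();
        Matcher m = LABEL.matcher(expression);
        while (m.find()) {
            labels.add(m.group());
        }
        return labels;
    }

    // True only when every label in the expression is among the user's authorizations.
    static boolean withinAuthorizations(String expression, Set<String> userAuths) {
        return userAuths.containsAll(labelsIn(expression));
    }
}
```

Under this reading, a user granted only A is refused for a delete carrying A|B, which is the outcome the comment argues for.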