[jira] [Commented] (HBASE-7897) Add support for tags to Cell Interface

2013-05-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649533#comment-13649533
 ] 

ramkrishna.s.vasudevan commented on HBASE-7897:
---

I raised https://issues.apache.org/jira/browse/HBASE-8496 for other details.
I feel getNumTags() is needed here.  
Removing hasTags() is fine with me.  
If we don't have getNumTags(), then I feel CellUtil.getNumTags() would have to 
be used every time, mainly in cases where there is more than one tag (which we 
may end up with in the future).

> Add support for tags to Cell Interface
> --
>
> Key: HBASE-7897
> URL: https://issues.apache.org/jira/browse/HBASE-7897
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.95.1
>
> Attachments: HBASE-7897.patch
>
>
> Cell Interface has support for mvcc.   The only thing we'd add to Cell in 
> the near future is support for tags it would seem.  Should be easy to add.  
> Should add it now.  See backing discussion here: 
> https://issues.apache.org/jira/browse/HBASE-7233?focusedCommentId=13573784&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13573784
> Matt outlines what the additions to Cell might look like here:
> https://issues.apache.org/jira/browse/HBASE-7233?focusedCommentId=13531619&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13531619
> Would be good to get these in now.
> Marking as 0.96.  Can move later.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7897) Add support for tags to Cell Interface

2013-05-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649527#comment-13649527
 ] 

stack commented on HBASE-7897:
--

Ok.  Tags are opaque byte arrays at this level.  Whether there is one tag or 
100 in the array of bytes and how to parse the bytes is for figuring elsewhere. 
 Below the Cell level, the tag representation can be compressed or encoded but 
by inspection of the raw bytes themselves, not by exploiting some Tag aspect.

Sounds like the additional methods would be the below only?

+  /**
+   * @return the tag byte array
+   */
+  byte[] getTagArray();
+
+  /**
+   * @return the first offset where the tags start in the Cell
+   */
+  int getTagsOffset();
+  
+  /**
+   * @return the total length of the tags in the Cell.
+   */
+  int getTagsLength();
+  
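
A minimal sketch of how a caller might use the three accessors above, assuming the tags sit in a contiguous range of a backing byte array. The names TaggedCell, SimpleTaggedCell and TagBytes are hypothetical and are not part of the patch; only the three method names come from the comment. The caller copies the tag range out without interpreting it, which matches the "opaque byte arrays at this level" position.

```java
import java.util.Arrays;

// Hypothetical, trimmed-down stand-in for the Cell interface; only the three
// tag accessors proposed in the comment are shown.
interface TaggedCell {
  byte[] getTagArray();   // array that holds the tag bytes
  int getTagsOffset();    // first offset of the tag bytes in that array
  int getTagsLength();    // total length of the tag bytes
}

// Illustrative implementation backed by a single buffer (an assumption).
final class SimpleTaggedCell implements TaggedCell {
  private final byte[] buf;
  private final int tagsOffset;
  private final int tagsLength;

  SimpleTaggedCell(byte[] buf, int tagsOffset, int tagsLength) {
    this.buf = buf;
    this.tagsOffset = tagsOffset;
    this.tagsLength = tagsLength;
  }

  @Override public byte[] getTagArray() { return buf; }
  @Override public int getTagsOffset() { return tagsOffset; }
  @Override public int getTagsLength() { return tagsLength; }
}

final class TagBytes {
  // Copies the opaque tag range out of the cell without parsing it.
  static byte[] copyTags(TaggedCell cell) {
    int start = cell.getTagsOffset();
    return Arrays.copyOfRange(cell.getTagArray(), start, start + cell.getTagsLength());
  }
}
```

How many tags the copied range holds, and how to split them, is left to whatever layer defines the tag format.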




[jira] [Created] (HBASE-8496) Implement tags and the internals of how a tag should look like

2013-05-05 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-8496:
-

 Summary: Implement tags and the internals of how a tag should look 
like
 Key: HBASE-8496
 URL: https://issues.apache.org/jira/browse/HBASE-8496
 Project: HBase
  Issue Type: New Feature
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.98.0


The intent of this JIRA comes from HBASE-7897.
This would help us decide on the structure and format of the tags. 



[jira] [Commented] (HBASE-7897) Add support for tags to Cell Interface

2013-05-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649507#comment-13649507
 ] 

ramkrishna.s.vasudevan commented on HBASE-7897:
---

HBASE-6222 deals with security at the cell level.
HBASE-7448 is already closed as a dup, and HBASE-7662 applies ACLs as tags.

From the work I have been doing, there are a lot of workarounds and different 
ideas for fitting these tags into the internals of the KeyValues and the 
related code that works on them.  So it is better to start a new JIRA and 
discuss them there.
I can add the Cell methods here, and we can close this JIRA once they are 
committed.



[jira] [Commented] (HBASE-7897) Add support for tags to Cell Interface

2013-05-05 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649504#comment-13649504
 ] 

Matt Corgan commented on HBASE-7897:


{quote}BTW what is the other JIRA where the internals are going to be 
discussed?{quote}I can't think of a specific jira - maybe one should be 
created.  Related issues include HBASE-6222, HBASE-7448, and HBASE-7662.

{quote}So Matt you suggest that i give an uploaded patch with the methods added 
to Cell alone and remove the CellUtil.iterators for now?{quote}I've not been 
working on this lately and don't mean to have too strong of a recommendation, 
but if you agree the Cell interface should only focus on allocating a range of 
bytes for the tags, then yes, I'd say we could have a separate issue/discussion 
for the format of the tags bytes.  I'm concerned that the logic behind the tags 
is too complex to be handled at the Cell level.  HBase already needs 
performance improvements with regards to iterating cells, and parsing and 
interpreting tags contents at the cell iteration level would be too expensive.  
At the lower levels they should be as opaque as the contents of the value 
byte[].  Others may disagree for sure - it's an important conversation.



[jira] [Commented] (HBASE-7897) Add support for tags to Cell Interface

2013-05-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649493#comment-13649493
 ] 

ramkrishna.s.vasudevan commented on HBASE-7897:
---

BTW what is the other JIRA where the internals are going to be discussed? If 
there isn't one, I can raise one.
I can provide more details on how the tag byte[] array can be represented. We 
can discuss the options there.



[jira] [Commented] (HBASE-7897) Add support for tags to Cell Interface

2013-05-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649492#comment-13649492
 ] 

ramkrishna.s.vasudevan commented on HBASE-7897:
---

bq. It's an important discussion no doubt, but I wonder if it matters here?
Agree.  We can focus on the methods needed for the Cell interface.
So Matt, you suggest that I upload a patch with only the methods added to 
Cell and remove the CellUtil.iterators for now? 



[jira] [Updated] (HBASE-8375) Streamline Table durability settings

2013-05-05 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-8375:
-

Assignee: Enis Soztutar

> Streamline Table durability settings
> 
>
> Key: HBASE-8375
> URL: https://issues.apache.org/jira/browse/HBASE-8375
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Enis Soztutar
> Fix For: 0.95.2
>
>
> HBASE-7801 introduces the notion of per mutation fine grained durability 
> settings.
> This issue is to consider and discuss the same for the per table settings 
> (i.e. what would be used if the mutation indicates USE_DEFAULT). I propose 
> the following setting per table:
> * SKIP_WAL (i.e. an unlogged table)
> * ASYNC_WAL (the current deferred log flush)
> * SYNC_WAL (the current default)
> * FSYNC_WAL (for future uses of HDFS' hsync())
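
A minimal sketch of how a per-mutation setting could fall back to the per-table setting, assuming the rule "the mutation wins unless it says USE_DEFAULT". DurabilityLevel and DurabilityResolver are hypothetical names for illustration, not HBase classes. Treating a table that is itself left at USE_DEFAULT as SYNC_WAL is also an assumption, based only on the proposal calling SYNC_WAL "the current default".

```java
// Hypothetical stand-in for the proposed settings; the constant names mirror
// the list in the issue description.
enum DurabilityLevel {
  USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL
}

final class DurabilityResolver {
  // Returns the durability that would apply to one mutation on one table.
  static DurabilityLevel effective(DurabilityLevel mutation, DurabilityLevel table) {
    if (mutation != DurabilityLevel.USE_DEFAULT) {
      return mutation;                 // an explicit per-mutation choice wins
    }
    if (table != DurabilityLevel.USE_DEFAULT) {
      return table;                    // otherwise use the per-table setting
    }
    return DurabilityLevel.SYNC_WAL;   // assumed global default
  }
}
```

With this rule, a mutation marked USE_DEFAULT on an ASYNC_WAL table resolves to ASYNC_WAL, while a mutation marked SKIP_WAL stays SKIP_WAL on any table.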



[jira] [Commented] (HBASE-8375) Streamline Table durability settings

2013-05-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649484#comment-13649484
 ] 

Lars Hofhansl commented on HBASE-8375:
--

I do not mind at all. :)



[jira] [Commented] (HBASE-8375) Streamline Table durability settings

2013-05-05 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649475#comment-13649475
 ] 

Enis Soztutar commented on HBASE-8375:
--

It seems that we should get this in by 0.96, marking it for 0.95.2. Lars, I can 
work on it if you would not mind. 



[jira] [Updated] (HBASE-8375) Streamline Table durability settings

2013-05-05 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-8375:
-

Fix Version/s: 0.95.2



[jira] [Commented] (HBASE-7897) Add support for tags to Cell Interface

2013-05-05 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649455#comment-13649455
 ] 

Matt Corgan commented on HBASE-7897:


{quote}I'd say no. What you thinking?{quote}I'd say no too.  I shouldn't have 
talked about it so much =).  Perhaps the content of the tags can be inferred 
from a column family setting rather than a per-cell attribute.

Thinking on it a little more, I'm trying to convey that the Cell is a very 
low-level interface whose purpose is only to delineate a group of byte[]'s at a 
moment in time.  It is up to the levels above Cell to interpret what is in the 
byte[]'s.  A comparison would be a multi-part row key with int-String-int.  The 
Cell only knows that we have a row key byte[], and the levels on top are free 
to use those bytes any way they like.  We don't want to expose an Iterator on 
the rowKey section of the Cell...
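
To illustrate the row key comparison, here is a minimal sketch of a layer above the Cell packing and unpacking an int-String-int key. The Cell would only ever see the resulting byte[]. The class name CompositeRowKey and the length-prefixed string encoding are assumptions made for this example, not something HBase prescribes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class CompositeRowKey {
  // Encodes: 4-byte int | 4-byte string length | UTF-8 string bytes | 4-byte int.
  static byte[] encode(int first, String middle, int last) {
    byte[] s = middle.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf = ByteBuffer.allocate(4 + 4 + s.length + 4);
    buf.putInt(first).putInt(s.length).put(s).putInt(last);
    return buf.array();
  }

  // Reads the string part back out of the opaque row key bytes.
  static String decodeMiddle(byte[] rowKey) {
    ByteBuffer buf = ByteBuffer.wrap(rowKey);
    buf.getInt();                      // skip the leading int
    byte[] s = new byte[buf.getInt()]; // length prefix written by encode()
    buf.get(s);
    return new String(s, StandardCharsets.UTF_8);
  }

  // Reads the trailing int, which occupies the last four bytes.
  static int decodeLast(byte[] rowKey) {
    return ByteBuffer.wrap(rowKey).getInt(rowKey.length - 4);
  }
}
```

Nothing at the Cell level needs to know this layout; the same separation is what is being argued for the tag bytes.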

There is another discussion about the use cases for tags and how those tags 
should be written into the byte[], but this particular jira is only focused on 
what to add to the Cell interface.  It's an important discussion no doubt, but 
I wonder if it matters here?





[jira] [Commented] (HBASE-7897) Add support for tags to Cell Interface

2013-05-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649454#comment-13649454
 ] 

stack commented on HBASE-7897:
--

bq. My first question would be, are we going with different Encoders for the 
Tags? Which means the normal keyvalue will be encoded differently and the tags 
will be encoded differently?

I'd say no.  What you thinking?

Maybe I am being extra thick and there would be encoding of tag bytes and that 
when you ask for tags, you'd get decoded byte array... (but then you have to 
parse it for case that tag byte array is carrying multiple tags).

On Iterator, sure, but to [~mcorgan]'s point, maybe talk of Iterators and Maps 
is way overengineering for something that we are supposed to be able to spin 
through quickly and cheaply.





[jira] [Commented] (HBASE-8420) Port HBASE-6874 Implement prefetching for scanners from 0.89-fb

2013-05-05 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649422#comment-13649422
 ] 

Jimmy Xiang commented on HBASE-8420:


I see.  Sounds like we should not touch 0.94 unless the behavior is the same as 
before if the feature is disabled. We can do that.

> Port  HBASE-6874  Implement prefetching for scanners from 0.89-fb
> -
>
> Key: HBASE-8420
> URL: https://issues.apache.org/jira/browse/HBASE-8420
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: 0.94-8420_v1.patch, trunk-8420_v1.patch
>
>
> This should help scanner performance.  We should have it in trunk.



[jira] [Commented] (HBASE-8420) Port HBASE-6874 Implement prefetching for scanners from 0.89-fb

2013-05-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649406#comment-13649406
 ] 

Lars Hofhansl commented on HBASE-8420:
--

Thanks Jimmy.

As for changing the caching default in 0.94, see discussion here: HBASE-7008.




[jira] [Commented] (HBASE-7820) Authenticating users from different realm without a trust relationship

2013-05-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649403#comment-13649403
 ] 

Ted Yu commented on HBASE-7820:
---

@Benoy:
Can you address Andy's comments ?

I can help produce patch for trunk if you're busy.

> Authenticating users from different realm without a trust relationship
> --
>
> Key: HBASE-7820
> URL: https://issues.apache.org/jira/browse/HBASE-7820
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Benoy Antony
> Attachments: HBASE-7820-0.94.patch
>
>
> HBase servers are part of the Hadoop domain, controlled by Hadoop Active 
> Directory. 
> The users belong to the CORP domain, controlled by the CORP Active Directory. 
> In the absence of a one way trust from HADOOP DOMAIN to CORP DOMAIN, how will 
> HBase servers authenticate CORP users ?
> This is the HBase equivalent of HADOOP-9296



[jira] [Commented] (HBASE-6192) Document ACL matrix in the book

2013-05-05 Thread Doug Meil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649400#comment-13649400
 ] 

Doug Meil commented on HBASE-6192:
--

I'd be happy to.

> Document ACL matrix in the book
> ---
>
> Key: HBASE-6192
> URL: https://issues.apache.org/jira/browse/HBASE-6192
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation, security
>Affects Versions: 0.94.1, 0.95.2
>Reporter: Enis Soztutar
>Assignee: Laxman
>  Labels: documentaion, security
> Attachments: HBase Security-ACL Matrix.pdf, HBase Security-ACL 
> Matrix.pdf, HBase Security-ACL Matrix.pdf, HBase Security-ACL Matrix.xls, 
> HBase Security-ACL Matrix.xls, HBase Security-ACL Matrix.xls
>
>
> We have an excellent matrix at 
> https://issues.apache.org/jira/secure/attachment/12531252/Security-ACL%20Matrix.pdf
>  for ACL. Once the changes are done, we can adapt that and put it in the 
> book, also add some more documentation about the new authorization features. 



[jira] [Commented] (HBASE-8483) HConnectionManager can leak ZooKeeper connections when using deleteStaleConnection

2013-05-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649383#comment-13649383
 ] 

Ted Yu commented on HBASE-8483:
---

@Eric:
Can you run 0.94 patch through 0.94 test suite ?

Good job.

> HConnectionManager can leak ZooKeeper connections when using 
> deleteStaleConnection
> --
>
> Key: HBASE-8483
> URL: https://issues.apache.org/jira/browse/HBASE-8483
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.94.4
>Reporter: Eric Yu
>Priority: Critical
> Fix For: 0.95.1
>
> Attachments: HBASE-8483-0.94.patch, HBASE-8483.patch, 
> LeakZKConnections.java
>
>
> If one thread calls deleteStaleConnection while other threads are using the 
> connection, ZK connections can leak.



[jira] [Commented] (HBASE-8483) HConnectionManager can leak ZooKeeper connections when using deleteStaleConnection

2013-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649365#comment-13649365
 ] 

Hadoop QA commented on HBASE-8483:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12581811/HBASE-8483.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5559//console

This message is automatically generated.



[jira] [Commented] (HBASE-7897) Add support for tags to Cell Interface

2013-05-05 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649361#comment-13649361
 ] 

ramkrishna.s.vasudevan commented on HBASE-7897:
---

I have some questions about the discussion above.
First, are we going with separate Encoders for the tags?  That is, will the 
normal KeyValue be encoded one way and the tags another way?
I can provide more details on this once I am done with some more internal work.
bq. If we added hasTags, Map getTags, and byte [] getTag(byte [] tagName), 
would this allow tags to be implemented differently and leave it up to the Cell 
implementation how tags are parsed?
First, I think we can have multiple tags per KV: one could be a Visibility tag 
and another an ACL tag.
So we may have to provide an iterator to get at the different tags, and the 
iterator would need to know what the tags look like. 
This is what I thought :).  Open for discussion on any of the above points.
bq. the API doesn't seem to lend itself to being able to ask for a particular 
tag
I had an API for this but did not attach it in the patch.  The idea was that 
the KeyValue will have something called Tag, which holds the Type and the 
actual tag byte array.  
So if I ask for the tag of Type Visibility, the iterator will iterate 
internally and return the byte array that corresponds to Visibility.
Also, tags cannot be overwritten; every tag goes with its individual KV/Cell.
I can provide some more details on what is planned and what is done after some 
internal discussions.
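
A minimal sketch of the Tag-plus-iterator idea described above, to make the lookup concrete. Everything here is an assumption for illustration: the class name TagSketch, the Tag fields, the VISIBILITY and ACL type codes, and the findTag helper are hypothetical and are not taken from the attached patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch only: none of these names come from the HBASE-7897 patch.
final class TagSketch {

    /** A tag is a one-byte type plus an opaque value. */
    static final class Tag {
        final byte type;
        private final byte[] value;

        Tag(byte type, byte[] value) {
            this.type = type;
            this.value = value.clone();
        }

        byte[] getValue() {
            return value.clone();
        }
    }

    // Assumed type codes, chosen here purely for illustration.
    static final byte VISIBILITY = 1;
    static final byte ACL = 2;

    /** Walks the tags and returns the value of the first tag of the wanted type, or null. */
    static byte[] findTag(Iterator<Tag> tags, byte wantedType) {
        while (tags.hasNext()) {
            Tag t = tags.next();
            if (t.type == wantedType) {
                return t.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Tag> cellTags = new ArrayList<>();
        cellTags.add(new Tag(ACL, "user1:RW".getBytes(StandardCharsets.UTF_8)));
        cellTags.add(new Tag(VISIBILITY, "secret".getBytes(StandardCharsets.UTF_8)));
        byte[] vis = findTag(cellTags.iterator(), VISIBILITY);
        System.out.println(new String(vis, StandardCharsets.UTF_8));
    }
}
```

With this shape a caller never parses the tag bytes itself; it only names the type it wants, which leaves the on-disk encoding up to whatever produces the iterator.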

> Add support for tags to Cell Interface
> --
>
> Key: HBASE-7897
> URL: https://issues.apache.org/jira/browse/HBASE-7897
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.95.1
>
> Attachments: HBASE-7897.patch
>
>
> Cell Interface has support for mvcc.   The only thing we'd add to Cell in 
> the near future is support for tags it would seem.  Should be easy to add.  
> Should add it now.  See backing discussion here: 
> https://issues.apache.org/jira/browse/HBASE-7233?focusedCommentId=13573784&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13573784
> Matt outlines what the additions to Cell might look like here:
> https://issues.apache.org/jira/browse/HBASE-7233?focusedCommentId=13531619&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13531619
> Would be good to get these in now.
> Marking as 0.96.  Can move later.



[jira] [Commented] (HBASE-8420) Port HBASE-6874 Implement prefetching for scanners from 0.89-fb

2013-05-05 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649353#comment-13649353
 ] 

Jimmy Xiang commented on HBASE-8420:


You are right. It is not exactly the same as before. The difference is mostly 
in the ClientScanner and related classes, as I mentioned in the RB. The main 
reason is how the ClientScanner moves to the next scanner: it asks for a number 
of rows and moves to the next scanner/region if it gets fewer. The existing 
logic is to cache exactly the "caching" number of rows.  This patch could 
cache one fewer or many more rows instead. Not sure if that is OK, but I think 
it makes the logic more efficient.
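
For readers following along, here is a rough model of the existing behaviour as described above: ask the current region for the rows still missing, and move to the next region when fewer come back, until the cache holds exactly "caching" rows. It is a simplified sketch and not HBase code; the class ScannerCacheSketch and its methods are hypothetical, and each inner list merely stands in for the rows of one region.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model, not HBase code: each inner list stands for the rows of one region.
final class ScannerCacheSketch {

    private final List<List<String>> regions;
    private int regionIdx = 0;   // region currently being scanned
    private int rowIdx = 0;      // next row to read within that region

    ScannerCacheSketch(List<List<String>> regions) {
        this.regions = regions;
    }

    /** Asks the current region for up to n rows; returns fewer if the region runs out. */
    private List<String> nextFromRegion(int n) {
        List<String> rows = regions.get(regionIdx);
        int end = Math.min(rows.size(), rowIdx + n);
        List<String> out = new ArrayList<>(rows.subList(rowIdx, end));
        rowIdx = end;
        return out;
    }

    /** Fills a cache with exactly `caching` rows unless the table is exhausted first. */
    List<String> fillCache(int caching) {
        List<String> cache = new ArrayList<>();
        while (cache.size() < caching && regionIdx < regions.size()) {
            int wanted = caching - cache.size();
            List<String> got = nextFromRegion(wanted);
            cache.addAll(got);
            if (got.size() < wanted) {
                // Fewer rows than requested: this region is done, move to the next one.
                regionIdx++;
                rowIdx = 0;
            }
        }
        return cache;
    }
}
```

In this model a region boundary in the middle of a batch costs an extra round trip just to top the cache up to the exact size, which is the kind of cost a looser cache size could avoid.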

bq. Lastly in 0.94, we could use scan attributes to indicate this per scanner 
in a backward compatible way.
Cool, we can use scan attributes to indicate prefetching instead of a global 
flag. I will update the 0.94 patch.

As to the caching size, for 0.94, should we leave it as before (1), or make it 
the same as trunk (100)?

> Port  HBASE-6874  Implement prefetching for scanners from 0.89-fb
> -
>
> Key: HBASE-8420
> URL: https://issues.apache.org/jira/browse/HBASE-8420
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: 0.94-8420_v1.patch, trunk-8420_v1.patch
>
>
> This should help scanner performance.  We should have it in trunk.



[jira] [Commented] (HBASE-8482) TestHBaseFsck#testCheckTableLocks broke; java.lang.AssertionError: expected:<[]> but was:<[EXPIRED_TABLE_LOCK]>

2013-05-05 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649341#comment-13649341
 ] 

Jonathan Hsieh commented on HBASE-8482:
---

bq. I fear that some time later we will just see that we are using 
System.currentTimeMillis() there and change it again to use EEM. Unfortunately I 
do not have a clever solution to change the chore + sleeper thing.

Just add a comment justifying it in the code.  Clever enough?

> TestHBaseFsck#testCheckTableLocks broke; java.lang.AssertionError: 
> expected:<[]> but was:<[EXPIRED_TABLE_LOCK]>
> ---
>
> Key: HBASE-8482
> URL: https://issues.apache.org/jira/browse/HBASE-8482
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 0.95.1
>
> Attachments: 8482.txt
>
>
> I've been looking into this test failure because I thought it particular to 
> my rpc hackery.
> What I see is like the subject:
> {code}
> java.lang.AssertionError: expected:<[]> but was:<[EXPIRED_TABLE_LOCK]>
> {code}
> and later in same unit test:
> {code}
> java.lang.AssertionError: expected:<[EXPIRED_TABLE_LOCK]> but 
> was:<[EXPIRED_TABLE_LOCK, EXPIRED_TABLE_LOCK]>
> {code}
> The test creates a write lock and then expires it.  In subject failure, we 
> are expiring the lock ahead of the time it should be.  Easier for me to 
> reproduce is that the second write lock we put in place is not allowed to 
> happen because of the presence of the first lock EVEN THOUGH IT HAS BEEN 
> JUDGED EXPIRED:
> {code}
> ERROR: Table lock acquire attempt found:[tableName=foo, 
> lockOwner=localhost,6,1, threadId=387, purpose=testCheckTableLocks, 
> isShared=false, createTime=129898749]
> 2013-05-02 00:34:42,715 INFO  [Thread-183] lock.ZKInterProcessLockBase(431): 
> Lock is held by: write-testing utility00
> ERROR: Table lock acquire attempt found:[tableName=foo, 
> lockOwner=localhost,6,1, threadId=349, purpose=testCheckTableLocks, 
> isShared=false, createTime=28506852]
> {code}
> Above, you see the expired lock and then our hbck lock visitor has it that 
> the second lock is expired because it is held by the first lock.
> I can keep looking at this but input would be appreciated.
> It failed in recent trunk build 
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/4090/testReport/junit/org.apache.hadoop.hbase.util/TestHBaseFsck/testCheckTableLocks/



[jira] [Commented] (HBASE-8478) HBASE-2231 breaks TestHRegion#testRecoveredEditsReplayCompaction under hadoop2 profile

2013-05-05 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649339#comment-13649339
 ] 

Jonathan Hsieh commented on HBASE-8478:
---

+1. lovely.

> HBASE-2231 breaks TestHRegion#testRecoveredEditsReplayCompaction under 
> hadoop2 profile
> --
>
> Key: HBASE-8478
> URL: https://issues.apache.org/jira/browse/HBASE-8478
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction, hadoop2, Protobufs
>Affects Versions: 0.98.0, 0.95.1
>Reporter: Jonathan Hsieh
>Assignee: Enis Soztutar
> Fix For: 0.98.0
>
> Attachments: hbase-8478_v1.patch, hbase-8478_v2.patch
>
>
> TestHRegion#testRecoveredEditsReplayCompaction and 
> TestHRegionBusyWait#testRecoveredEditsReplayCompaction fail against the 
> hadoop2 profile due to HBASE-2231.  
> If you check out trunk at that patch, the error trace looks like this:
> {code}
> java.io.FileNotFoundException: File 
> /home/jon/proj/hbase/hbase-server/target/test-data/05b5be10-bc88-40b0-a274-d1ebffe24e85/TestHRegiontestRecoveredEditsReplayCompaction/testRecoveredEditsReplayCompaction/f1de1d311572557ca13c4cb810ebfc0b/.tmp
>  does not exist
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:340)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1418)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1458)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:569)
> at 
> org.apache.hadoop.hbase.regionserver.TestHRegion.testRecoveredEditsReplayCompaction(TestHRegion.java:462)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:176)
> at junit.framework.TestCase.runBare(TestCase.java:141)
> at junit.framework.TestResult$1.protect(TestResult.java:122)
> at junit.framework.TestResult.runProtected(TestResult.java:142)
> at junit.framework.TestResult.run(TestResult.java:125)
> at junit.framework.TestCase.run(TestCase.java:129)
> at junit.framework.TestSuite.runTest(TestSuite.java:255)
> at junit.framework.TestSuite.run(TestSuite.java:250)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:226)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:133)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:114)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:188)
> at 
> org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:166)
> at 
> org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:86)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:101)
> {code}
> This was found via git bisect.
> {{git bisect run mvn clean install test -DskipITs -Dhadoop.profile=2.0 
> -Dtest=TestHRegion#testRecoveredEditsReplayCompaction}}
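
The trace above shows the test listing a ".tmp" directory that does not exist. One way a caller can tolerate that is to treat a missing directory as empty. The sketch below uses java.nio.file as a local stand-in for Hadoop's FileSystem.listStatus, and the helper name listIfExists is hypothetical; whether the attached patches take this approach is not stated here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Stand-in using java.nio.file; the failing test goes through Hadoop's FileSystem.listStatus.
final class ListIfExistsSketch {

    /** Returns the entries of dir, or an empty list when dir does not exist. */
    static List<Path> listIfExists(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.collect(Collectors.toList());
        } catch (NoSuchFileException e) {
            // A missing directory is treated the same as an empty one.
            return Collections.emptyList();
        } catch (IOException e) {
            // Any other I/O failure is still surfaced to the caller.
            throw new UncheckedIOException(e);
        }
    }
}
```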



[jira] [Commented] (HBASE-8405) Add more custom options to how ClusterManager runs commands

2013-05-05 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649317#comment-13649317
 ] 

Jean-Marc Spaggiari commented on HBASE-8405:


2 comments.

Some systems might not use bash but a different command-line interpreter. 
Should we make that configurable?  Or, since all the hbase scripts are written 
for bash too, is it OK to hard-code it? Same question for the SSH path.
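
To show what "configurable" could look like, here is a small sketch that reads the shell and the ssh binary from configuration and falls back to defaults. It is only an illustration: the property names, the default paths, and the class RemoteCommandSketch are invented here and are not taken from the HBASE-8405 patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the property names and defaults are invented for illustration.
final class RemoteCommandSketch {
    static final String SHELL_KEY = "example.clustermanager.shell";   // assumed key
    static final String SSH_KEY = "example.clustermanager.ssh.path";  // assumed key

    /** Builds the argv for running `command` on `host` through ssh and the chosen shell. */
    static List<String> buildCommand(Map<String, String> conf, String host, String command) {
        String shell = conf.getOrDefault(SHELL_KEY, "/bin/bash");
        String ssh = conf.getOrDefault(SSH_KEY, "/usr/bin/ssh");
        List<String> argv = new ArrayList<>();
        argv.add(ssh);
        argv.add(host);
        // The remote side receives a single string, so single-quote the command
        // and escape any single quotes it already contains.
        argv.add(shell + " -c '" + command.replace("'", "'\\''") + "'");
        return argv;
    }
}
```

With an empty configuration this produces the hard-coded bash and ssh behaviour, so existing setups would be unaffected while other interpreters become possible.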

> Add more custom options to how ClusterManager runs commands
> ---
>
> Key: HBASE-8405
> URL: https://issues.apache.org/jira/browse/HBASE-8405
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.94.8, 0.95.1
>
> Attachments: HBASE-8405-take2-v0.patch, HBASE-8405-v0.patch, 
> HBASE-8405-v1.patch
>
>
> You may want to run yet more custom commands (such as su as some local user) 
> depending on test setup.



[jira] [Commented] (HBASE-7667) Support stripe compaction

2013-05-05 Thread Raymond (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649312#comment-13649312
 ] 

Raymond commented on HBASE-7667:


Great. More regions lead to better load balance and more effective compaction; 
fewer regions lead to easier management and faster failover. I think a stripe 
(or sub-region) is a good trade-off. 
I also think stripe compaction is similar to level compaction with only L0 and L1. 
Another difficulty is configuration: a big HBase cluster hosts so many 
applications that building a suitable configuration for each one will be a 
huge challenge.

> Support stripe compaction
> -
>
> Key: HBASE-7667
> URL: https://issues.apache.org/jira/browse/HBASE-7667
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: Stripe compaction perf evaluation.pdf, Stripe compaction 
> perf evaluation.pdf, Stripe compaction perf evaluation.pdf, Stripe 
> compactions.pdf, Stripe compactions.pdf, Stripe compactions.pdf, Stripe 
> compactions.pdf, Using stripe compactions.pdf, Using stripe compactions.pdf, 
> Using stripe compactions.pdf
>
>
> So I was thinking about having many regions as the way to make compactions 
> more manageable, and writing the level db doc about how level db range 
> overlap and data mixing breaks seqNum sorting, and discussing it with Jimmy, 
> Matteo and Ted, and thinking about how to avoid Level DB I/O multiplication 
> factor.
> And I suggest the following idea, let's call it stripe compactions. It's a 
> mix between level db ideas and having many small regions.
> It allows us to have a subset of benefits of many regions (wrt reads and 
> compactions) without many of the drawbacks (managing and current 
> memstore/etc. limitation).
> It also doesn't break seqNum-based file sorting for any one key.
> It works like this.
> The region key space is separated into a configurable number of fixed-boundary 
> stripes (determined the first time we stripe the data, see below).
> All the data from memstores is written to normal files with all keys present 
> (not striped), similar to L0 in LevelDb, or current files.
> Compaction policy does 3 types of compactions.
> First is L0 compaction, which takes all L0 files and breaks them down by 
> stripe. It may be optimized by adding more small files from different 
> stripes, but the main logical outcome is that there are no more L0 files and 
> all data is striped.
> Second is just like the current compaction, but compacting one single 
> stripe. In future, nothing prevents us from applying compaction rules and 
> compacting part of the stripe (e.g. similar to current policy with ratios 
> and stuff, tiers, whatever), but for the first cut I'd argue let it "major 
> compact" the entire stripe. Or just have the ratio and no more complexity.
> Finally, the third addresses the concern of the fixed boundaries causing 
> stripes to be very unbalanced.
> It's exactly like the 2nd, except it takes 2+ adjacent stripes and writes the 
> results out with different boundaries.
> There's a tradeoff here - if we always take 2 adjacent stripes, compactions 
> will be smaller but rebalancing will take a ridiculous amount of I/O.
> If we take many stripes we are essentially getting into the 
> epic-major-compaction problem again. Some heuristics will have to be in place.
> In general, if we let L0 grow before determining the stripes, we will get 
> better boundaries.
> Also, unless the imbalance is really large we don't really need to rebalance.
> Obviously this scheme (as well as level) is not applicable for all scenarios, 
> e.g. if timestamp is your key it completely falls apart.
> The end result:
> - many small compactions that can be spread out in time.
> - reads still read from a small number of files (one stripe + L0).
> - region splits become marvelously simple (if we could move files between 
> regions, no references would be needed).
> Main advantage over Level (for HBase) is that default store can still open 
> the files and get correct results - there are no range overlap shenanigans.
> It also needs no metadata, although we may record some for convenience.
> It also would appear to not cause as much I/O.
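
The "L0 compaction" step in the description above (take the unstriped L0 data and break it down by fixed stripe boundaries) can be sketched roughly as follows. This is a simplified model working on plain string keys; the class StripeSketch, its method names, and the convention that a key equal to a boundary belongs to the stripe on its right are all assumptions made for the example, not details from any HBASE-7667 patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of the "L0 compaction" step; not taken from any HBASE-7667 patch.
final class StripeSketch {

    /**
     * Returns the index of the stripe that owns key. boundaries must be sorted;
     * stripe i holds keys in [boundaries[i-1], boundaries[i]), with open ends.
     */
    static int stripeFor(String key, String[] boundaries) {
        int pos = Arrays.binarySearch(boundaries, key);
        // An exact hit on boundary i belongs to stripe i + 1;
        // a miss returns -(insertion point) - 1.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    /** Splits the keys of the unstriped L0 files into one bucket per stripe, keeping order. */
    static List<List<String>> splitL0(List<String> sortedL0Keys, String[] boundaries) {
        List<List<String>> stripes = new ArrayList<>();
        for (int i = 0; i <= boundaries.length; i++) {
            stripes.add(new ArrayList<>());
        }
        for (String key : sortedL0Keys) {
            stripes.get(stripeFor(key, boundaries)).add(key);
        }
        return stripes;
    }

    public static void main(String[] args) {
        String[] boundaries = {"g", "p"};
        System.out.println(splitL0(Arrays.asList("a", "g", "h", "z"), boundaries));
    }
}
```

Because the boundaries are fixed and every key maps to exactly one stripe, a read for a given key only needs that stripe's files plus whatever is still in L0, which matches the "one stripe + L0" point in the description.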



[jira] [Created] (HBASE-8495) Change ownership of the directory to bulk load

2013-05-05 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-8495:
--

 Summary: Change ownership of the directory to bulk load
 Key: HBASE-8495
 URL: https://issues.apache.org/jira/browse/HBASE-8495
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Affects Versions: 0.95.0, 0.94.7
Reporter: Matteo Bertozzi
Priority: Trivial
 Fix For: 0.95.2


To bulk load something you need to change the ownership of the data directory 
to allow the hbase user to read and move the files. In the split case you must 
also run the LoadIncrementalHFiles tool as the hbase user, since internally 
some "_tmp" directories are created to hold the split reference files.

In a secure cluster, the SecureBulkLoadEndPoint will take care of this problem 
by doing a chmod 777 on the directory to bulk load.

NOTE that a chown is not possible, since you must be a superuser to change the 
ownership. Changing the group may be possible, but the user must be in the hbase 
group, and it would still require a chmod to allow the group to perform the 
move.

{code}
Caused by: org.apache.hadoop.security.AccessControlException: Permission 
denied: user=hbase, access=WRITE, inode="/test/cf":th30z:supergroup:drwxr-xr-x
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
Caused by: 
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): 
java.io.IOException: Exception in rename
at 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem.rename(HRegionFileSystem.java:928)
at 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem.commitStoreFile(HRegionFileSystem.java:340)
at 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem.bulkLoadStoreFile(HRegionFileSystem.java:414)
{code}
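
As a rough illustration of the "chmod 777 on the directory to bulk load" step mentioned above, the sketch below opens a directory to all users. It uses java.nio.file on a local POSIX filesystem as a stand-in; on HDFS the change would go through Hadoop's FileSystem API instead. The class BulkLoadPermSketch and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Local-filesystem stand-in; on HDFS the equivalent would use Hadoop's FileSystem API.
final class BulkLoadPermSketch {

    /** Opens dir to every user (rwxrwxrwx), the effect of `chmod 777 dir`. */
    static void openToAll(Path dir) {
        Set<PosixFilePermission> all = PosixFilePermissions.fromString("rwxrwxrwx");
        try {
            Files.setPosixFilePermissions(dir, all);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a scratch directory, opens it, and returns the resulting permission string. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("bulkload-sketch");
            openToAll(dir);
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(dir));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that this only changes the mode bits; as the NOTE above says, changing the owner needs superuser rights and is a separate matter.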



[jira] [Commented] (HBASE-8015) Support for Namespaces

2013-05-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649295#comment-13649295
 ] 

Ted Yu commented on HBASE-8015:
---

I volunteer to be the sponsor. 

> Support for Namespaces
> --
>
> Key: HBASE-8015
> URL: https://issues.apache.org/jira/browse/HBASE-8015
> Project: HBase
>  Issue Type: New Feature
>Reporter: Francis Liu
>Assignee: Francis Liu
> Attachments: HBASE-8015_draft_94.patch, Namespace Design.pdf
>
>

