[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads

2014-01-19 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875919#comment-13875919
 ] 

Anoop Sam John commented on HBASE-10322:


That change can be in another Jira [~stack]?
We need an RpcClient to talk with the peer anyway, right?
So we have 2 options.
1. Go with the current way, without any code changes. The RpcClient used by 
ReplicationSource looks at the config hbase.client.rpc.codec to know the 
codec name and uses that. This defaults to KVCodec. As long as the user doesn't 
deal with tags directly or indirectly (via usage of cell level ACLs / visibility 
labels) the current way works fine. If the tag case comes, the user must
   a. Change this config value at the HRS side to one of the codec-with-tags 
classes. (We plan to provide a KVCodecWithTag.)
   b. Make sure the RSs in the peer clusters are also upgraded, so that the new 
class added in 98 is available there as well.
2. Introduce a new config name, as in the latest patch, and change 
ReplicationSource to decorate the conf. In the attached patch a new codec, 
CellCodecV2, is used as the default. But I think there should not be any default 
value for this codec, for the reason below. (The default value should be the 
value of the old config, whose default is KVCodec.)
Suppose the src cluster user is upgrading to 98 (or later versions in future) 
but the peer is still on 96.
When the replication src writes using the new codec class, the destination will 
also need that codec class to be present. So this makes it necessary for the 
peer to be upgraded as well. What about rolling upgrade then?
So whether or not the new config is there, the default codec should not change.

Out of these 2 options, which one do you guys prefer? I can give a patch 
accordingly.
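A minimal sketch of how option 1(a) would be wired, using java.util.Properties as a stand-in for HBase's Configuration object; the KeyValueCodecWithTags class name follows the KVCodecWithTag naming proposed above and is an assumption, not a shipped class:

```java
import java.util.Properties;

// Stand-in demo for the hbase.client.rpc.codec lookup described above.
public class CodecConfigDemo {
    static final String RPC_CODEC_KEY = "hbase.client.rpc.codec";
    // Current default per the discussion: the tag-less KeyValue codec.
    static final String DEFAULT_CODEC = "org.apache.hadoop.hbase.codec.KeyValueCodec";

    // Resolve the codec the RpcClient would use: explicit value, else default.
    static String effectiveCodec(Properties conf) {
        return conf.getProperty(RPC_CODEC_KEY, DEFAULT_CODEC);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(effectiveCodec(conf)); // default, tag-less codec

        // A user working with cell ACLs / visibility labels would switch the
        // HRS-side config to a tag-carrying codec (hypothetical class name):
        conf.setProperty(RPC_CODEC_KEY,
            "org.apache.hadoop.hbase.codec.KeyValueCodecWithTags");
        System.out.println(effectiveCodec(conf));
    }
}
```

The sketch also shows why option 1(b) matters: whatever class name this key resolves to must exist on the peer cluster's classpath as well.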


 Strip tags from KV while sending back to client on reads
 

 Key: HBASE-10322
 URL: https://issues.apache.org/jira/browse/HBASE-10322
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Blocker
 Fix For: 0.98.0, 0.99.0

 Attachments: HBASE-10322.patch, HBASE-10322_V2.patch, 
 HBASE-10322_codec.patch


 Right now we have some inconsistency wrt sending back tags on read. We do 
 this in scans when using the Java client (Codec based cell block encoding). 
 But during a Get operation, or when a pure PB based Scan comes, we are not 
 sending back the tags. So we have to do one of the below fixes:
 1. Send back tags in the missing cases also. But sending back the visibility 
 expression / cell ACL is not correct.
 2. Don't send back tags in any case. This will be a problem when a tool like 
 ExportTool uses the scan to export the table data. We will miss exporting the 
 cell visibility/ACL.
 3. Send back tags based on some condition. It has to be on a per scan basis. 
 The simplest way is to pass some kind of attribute in the Scan which says 
 whether to send back tags or not. But trusting something the scan specifies 
 might not be correct IMO. Then comes the way of checking the user who is doing 
 the scan: only send back tags when an HBase super user is doing the scan. So 
 when a case like the Export Tool's comes, the execution should happen as a 
 super user.
 So IMO we should go with #3.
 Patch coming soon.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10294) Some synchronization on ServerManager#onlineServers can be removed

2014-01-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10294:
---

Fix Version/s: (was: 0.99.0)

 Some synchronization on ServerManager#onlineServers can be removed
 --

 Key: HBASE-10294
 URL: https://issues.apache.org/jira/browse/HBASE-10294
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: 10294-v1.txt


 ServerManager#onlineServers is a ConcurrentHashMap.
 Yet I found that some accesses to it are synchronized, which is unnecessary.
 Here is one example:
 {code}
   public Map<ServerName, ServerLoad> getOnlineServers() {
     // Presumption is that iterating the returned Map is OK.
     synchronized (this.onlineServers) {
       return Collections.unmodifiableMap(this.onlineServers);
     }
   }
 {code}
 Note: not all accesses to ServerManager#onlineServers are synchronized.
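A self-contained sketch of the simplified accessor after the synchronization is removed, with illustrative types standing in for ServerName/ServerLoad (this is not the actual ServerManager code):

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class OnlineServersDemo {
    private final Map<String, Integer> onlineServers = new ConcurrentHashMap<>();

    void serverOnline(String name, int load) {
        onlineServers.put(name, load);
    }

    // No synchronized block needed: ConcurrentHashMap already supports safe
    // concurrent access and weakly consistent iteration, and the unmodifiable
    // wrapper is just a view that keeps callers from mutating the map.
    public Map<String, Integer> getOnlineServers() {
        return Collections.unmodifiableMap(onlineServers);
    }

    public static void main(String[] args) {
        OnlineServersDemo demo = new OnlineServersDemo();
        demo.serverOnline("rs1.example.com,16020", 42);
        System.out.println(demo.getOnlineServers()); // prints {rs1.example.com,16020=42}
    }
}
```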





[jira] [Updated] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code

2014-01-19 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-10366:
--

  Resolution: Fixed
Release Note: Thanks for the reviews! I've integrated the v1 patch into 
trunk, 0.98 & 0.96 branches.
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 0.94 filterRow() may be skipped in 0.96(or onwards) code
 

 Key: HBASE-10366
 URL: https://issues.apache.org/jira/browse/HBASE-10366
 Project: HBase
  Issue Type: Bug
  Components: Filters
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 0.98.0, 0.96.2

 Attachments: hbase-10366-v1.patch, hbase-10366.patch


 HBASE-6429 combines both the filterRow() & filterRow(List<KeyValue> kvs) 
 functions in Filter. 
 0.94 code or older may not implement hasFilterRow as HBASE-6429 expects, 
 because the 0.94 (old) hasFilterRow only returns true when 
 filterRow(List<KeyValue> kvs) is overridden, not filterRow(). Therefore, 
 filterRow() will be skipped.
 Since we don't ask 0.94 users to update their existing filter code, the issue 
 will cause scans to return unexpected keyvalues and break backward 
 compatibility.
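A minimal self-contained reproduction of the mismatch described above; the class and method names mirror HBase's Filter contract but this is a sketch, not the real API:

```java
import java.util.List;

// 0.94-era contract: hasFilterRow() defaults to false and was meant to be
// true only when filterRow(List) is overridden.
abstract class FilterBase {
    public boolean hasFilterRow() { return false; }
    public void filterRow(List<String> kvs) { }
    public boolean filterRow() { return false; }
}

// A 0.94 user filter that overrides only filterRow(); it never touched
// hasFilterRow(), which therefore still returns false.
class RowCountFilter extends FilterBase {
    private int rows = 0;
    @Override
    public boolean filterRow() { return ++rows > 2; } // filter after 2 rows
}

public class FilterRowSkipDemo {
    public static void main(String[] args) {
        RowCountFilter f = new RowCountFilter();
        // 0.96+ scanner path: filterRow() is consulted only when
        // hasFilterRow() is true, so this user filter is silently skipped.
        boolean applied = f.hasFilterRow() && f.filterRow();
        System.out.println("filterRow applied: " + applied); // prints false
    }
}
```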





[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads

2014-01-19 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875990#comment-13875990
 ] 

Lars Hofhansl commented on HBASE-10322:
---

I am a bit late to this party. The visibility tags control what a *client* can 
see, right?

Then what's a client? A client is outside of the HBase cluster, outside of 
HBase's control. So HFile and HLog are not clients. Replication is also not a 
client. Export is a client, just like any other Java/Thrift/MR/etc client.
As Andy points out, the interesting part here is these real clients.

Are the tags themselves (i.e. who sees what) more sensitive than the data that 
can be accessed?
I.e. if I can see a certain KV, should I be able to see its visibility tags?
* If the answer is yes, this is an easy problem in principle and squarely in 
the hands of an HBase admin to set up access correctly. You just run Export/etc 
as a user with sufficient access and all problems just go away.
* If the answer is no, it gets murky quickly. Now all tools and access paths 
need to be considered individually.

Maybe we can even have a tag that controls the visibility of the tags? 
Generally, anything that we hardwire assumes something about desired behavior 
that might not be the same at every institution.




[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code

2014-01-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876012#comment-13876012
 ] 

Hudson commented on HBASE-10366:


SUCCESS: Integrated in HBase-TRUNK #4836 (See 
[https://builds.apache.org/job/HBase-TRUNK/4836/])
HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code 
(jeffreyz: rev 1559547)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java




[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code

2014-01-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876020#comment-13876020
 ] 

Hudson commented on HBASE-10366:


SUCCESS: Integrated in HBase-0.98 #95 (See 
[https://builds.apache.org/job/HBase-0.98/95/])
HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code 
(jeffreyz: rev 1559548)
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java




[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code

2014-01-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876024#comment-13876024
 ] 

Hudson commented on HBASE-10366:


FAILURE: Integrated in hbase-0.96 #263 (See 
[https://builds.apache.org/job/hbase-0.96/263/])
HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code 
(jeffreyz: rev 1559551)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java




[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code

2014-01-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876025#comment-13876025
 ] 

Hudson commented on HBASE-10366:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #87 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/87/])
HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code 
(jeffreyz: rev 1559548)
* 
/hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java




[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-19 Thread Ishan Chhabra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chhabra updated HBASE-10323:
--

Attachment: HBASE_10323-trunk-v4.patch
HBASE_10323-0.94.15-v5.patch

 Auto detect data block encoding in HFileOutputFormat
 

 Key: HBASE-10323
 URL: https://issues.apache.org/jira/browse/HBASE-10323
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: Ishan Chhabra
Assignee: Ishan Chhabra
 Fix For: 0.99.0

 Attachments: HBASE_10323-0.94.15-v1.patch, 
 HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
 HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, 
 HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, 
 HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch


 Currently, one has to specify the data block encoding of the table explicitly 
 using the config parameter 
 hbase.mapreduce.hfileoutputformat.datablock.encoding when doing a bulk load. 
 This option is easily missed, is not documented, and also works differently 
 than compression, block size and bloom filter type, which are auto detected. 
 The solution would be to add support to auto detect the data block encoding, 
 similar to the other parameters. 
 The current patch does the following:
 1. Automatically detects the data block encoding in HFileOutputFormat.
 2. Keeps the legacy option of manually specifying the data block encoding
 around as a way to override auto detection.
 3. Moves string conf parsing to the start of the program so that it fails
 fast during startup instead of failing during record writes. It also
 makes the internals of the program type safe.
 4. Adds missing doc strings and unit tests for the code serializing and
 deserializing config parameters for bloom filter type, block size and
 data block encoding.
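A sketch of the fail-fast parsing described in point 3: resolve the configured encoding to an enum at job setup so a typo fails at startup rather than during record writes. The enum below is a stand-in for HBase's DataBlockEncoding; the config key matches the one quoted above:

```java
public class EncodingConfigDemo {
    // Stand-in for org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.
    enum DataBlockEncoding { NONE, PREFIX, DIFF, FAST_DIFF }

    static final String KEY = "hbase.mapreduce.hfileoutputformat.datablock.encoding";

    static DataBlockEncoding parseEncoding(String value) {
        if (value == null) {
            return null; // unset => auto detect from the table descriptor
        }
        // valueOf throws IllegalArgumentException immediately on a bad name,
        // which is the fail-fast behavior the patch moves to startup.
        return DataBlockEncoding.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(parseEncoding("FAST_DIFF")); // prints FAST_DIFF
        System.out.println(parseEncoding(null));        // prints null
    }
}
```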





[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-19 Thread Ishan Chhabra (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876054#comment-13876054
 ] 

Ishan Chhabra commented on HBASE-10323:
---

Added the @VisibleForTesting annotations where needed and fixed the '{' on a 
new line. I didn't make the constants package-private since no other class needs 
them at the moment. When some other class in the package or a test needs them, 
they can be made package-private then. 



[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code

2014-01-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876062#comment-13876062
 ] 

Hudson commented on HBASE-10366:


FAILURE: Integrated in hbase-0.96-hadoop2 #181 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/181/])
HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code 
(jeffreyz: rev 1559551)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java




[jira] [Commented] (HBASE-10070) HBase read high-availability using eventually consistent region replicas

2014-01-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876066#comment-13876066
 ] 

Devaraj Das commented on HBASE-10070:
-

[~lhofhansl] The user specifies that he can tolerate stale reads via flags in 
the read API. The Results are also tagged as such, so he can inspect whether 
the result is stale or not. In other words, the user still has full control.
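A self-contained sketch of the contract described in this comment: the caller opts in to eventually consistent reads, and each Result is tagged so staleness can be inspected. Class and flag names are illustrative, not the real HBase client API:

```java
public class StaleReadDemo {
    // The flag the user sets on the read to opt in to stale reads.
    enum Consistency { STRONG, TIMELINE }

    static class Result {
        final String value;
        final boolean stale;
        Result(String value, boolean stale) { this.value = value; this.stale = stale; }
        boolean isStale() { return stale; }
    }

    // A TIMELINE read may be served by a secondary replica and marked stale;
    // a STRONG read always goes to the primary.
    static Result read(Consistency consistency) {
        boolean fromSecondary = (consistency == Consistency.TIMELINE);
        return new Result("v1", fromSecondary);
    }

    public static void main(String[] args) {
        Result r = read(Consistency.TIMELINE);
        // The user keeps full control: the flag was explicit, and the
        // staleness of the answer is visible on the Result.
        System.out.println("stale=" + r.isStale()); // prints stale=true
    }
}
```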

 HBase read high-availability using eventually consistent region replicas
 

 Key: HBASE-10070
 URL: https://issues.apache.org/jira/browse/HBASE-10070
 Project: HBase
  Issue Type: New Feature
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HighAvailabilityDesignforreadsApachedoc.pdf


 In the present HBase architecture, it is hard, probably impossible, to 
 satisfy constraints like "99th percentile of the reads will be served under 
 10 ms". One of the major factors that affects this is the MTTR for regions. 
 There are three phases in the MTTR process - detection, assignment, and 
 recovery. Of these, detection is usually the longest and is presently on the 
 order of 20-30 seconds. During this time, the clients are not able to read 
 the region data.
 However, some clients will be better served if regions are available for 
 eventually consistent reads during recovery. This will help satisfy low 
 latency guarantees for the class of applications which can work with stale 
 reads.
 For improving read availability, we propose a replicated read-only region 
 serving design, also referred to as secondary regions, or region shadows. 
 Extending the current model of a region being opened for reads and writes in 
 a single region server, the region will also be opened for reading in other 
 region servers. The region server which hosts the region for reads and writes 
 (as in the current case) will be declared PRIMARY, while 0 or more region 
 servers might be hosting the region as SECONDARY. There may be more than one 
 secondary (replica count > 2).
 Will attach a design doc shortly which contains most of the details and some 
 thoughts about development approaches. Reviews are more than welcome. 
 We also have a proof of concept patch, which includes the master and region 
 server side of changes. Client side changes will be coming soon as well. 





[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code

2014-01-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876067#comment-13876067
 ] 

Hudson commented on HBASE-10366:


SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #59 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/59/])
HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code 
(jeffreyz: rev 1559547)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java




[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876079#comment-13876079
 ] 

Hadoop QA commented on HBASE-10323:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12623875/HBASE_10323-trunk-v4.patch
  against trunk revision .
  ATTACHMENT ID: 12623875

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8470//console

This message is automatically generated.



[jira] [Commented] (HBASE-10070) HBase read high-availability using eventually consistent region replicas

2014-01-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876087#comment-13876087
 ] 

Devaraj Das commented on HBASE-10070:
-

[~stack], I agree with you that the notion of replicaID == 0 being a primary 
replica, etc. should be maintained in a layer outside HRegionInfo. HRegionInfo 
could come with an 'index', and the 'index' should be an inherent part of the 
HRI's identification (in the name, etc.). The layer outside could associate 
index == 0 with the primary replica, etc. Will submit a patch along these 
lines.
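A sketch of the layering described here: HRegionInfo carries only a replica index as part of its identity, and a separate layer interprets index 0 as the primary. Names are illustrative, not the committed design:

```java
public class ReplicaIndexDemo {
    // Minimal stand-in for HRegionInfo: the index is part of its identity.
    static class HRegionInfo {
        final String regionName;
        final int replicaIndex;
        HRegionInfo(String regionName, int replicaIndex) {
            this.regionName = regionName;
            this.replicaIndex = replicaIndex;
        }
    }

    // The layer outside HRegionInfo owns the primary/secondary notion.
    static boolean isPrimary(HRegionInfo hri) {
        return hri.replicaIndex == 0;
    }

    public static void main(String[] args) {
        HRegionInfo primary = new HRegionInfo("t1,,1390000000000", 0);
        HRegionInfo secondary = new HRegionInfo("t1,,1390000000000", 1);
        System.out.println(isPrimary(primary) + " " + isPrimary(secondary)); // prints true false
    }
}
```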

 HBase read high-availability using eventually consistent region replicas
 

 Key: HBASE-10070
 URL: https://issues.apache.org/jira/browse/HBASE-10070
 Project: HBase
  Issue Type: New Feature
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HighAvailabilityDesignforreadsApachedoc.pdf


 In the present HBase architecture, it is hard, probably impossible, to 
 satisfy constraints like serving the 99th percentile of reads under 10 
 ms. One of the major factors that affects this is the MTTR for regions. There 
 are three phases in the MTTR process - detection, assignment, and recovery. 
 Of these, detection is usually the longest and is presently on the order 
 of 20-30 seconds. During this time, clients are not able to read the 
 region data.
 However, some clients would be better served if regions were available for 
 eventually consistent reads during recovery. This would help satisfy low 
 latency guarantees for the class of applications that can work with stale 
 reads.
 For improving read availability, we propose a replicated read-only region 
 serving design, also referred to as secondary regions, or region shadows. 
 Extending the current model of a region being opened for reads and writes in a 
 single region server, the region will also be opened for reading in other 
 region servers. The region server which hosts the region for reads and writes 
 (as in the current case) will be declared PRIMARY, while 0 or more region 
 servers might be hosting the region as SECONDARY. There may be more than one 
 secondary (replica count > 2).
 Will attach a design doc shortly which contains most of the details and some 
 thoughts about development approaches. Reviews are more than welcome. 
 We also have a proof of concept patch, which includes the master and region 
 server side changes. Client side changes will be coming soon as well. 





[jira] [Updated] (HBASE-10371) Compaction creates empty hfile, then selects this file for compaction and creates empty hfile and over again

2014-01-19 Thread binlijin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

binlijin updated HBASE-10371:
-

Attachment: HBASE-10371-96.patch

 Compaction creates empty hfile, then selects this file for compaction and 
 creates empty hfile and over again
 

 Key: HBASE-10371
 URL: https://issues.apache.org/jira/browse/HBASE-10371
 Project: HBase
  Issue Type: Bug
Reporter: binlijin
Assignee: binlijin
 Fix For: 0.98.0, 0.96.2, 0.99.0, 0.94.17

 Attachments: 10371-trunk-3.patch, HBASE-10371-94.patch, 
 HBASE-10371-96.patch, HBASE-10371-trunk-2.patch, HBASE-10371-trunk.patch


 (1) Select HFile for compaction
 {code}
 2014-01-16 01:01:25,111 INFO 
 org.apache.hadoop.hbase.regionserver.compactions.CompactSelection: Deleting 
 the expired store file by compaction: 
 hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/f3e38d10d579420494079e17a2557f0b
  whose maxTimeStamp is -1 while the max expired timestamp is 1389632485111
 {code}
 (2) Compact
 {code}
 2014-01-16 01:01:26,042 DEBUG org.apache.hadoop.hbase.regionserver.Compactor: 
 Compacting 
 hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/f3e38d10d579420494079e17a2557f0b,
  keycount=0, bloomtype=NONE, size=534, encoding=NONE
 2014-01-16 01:01:26,045 DEBUG org.apache.hadoop.hbase.util.FSUtils: Creating 
 file=hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/.tmp/40de5d79f80e4fb197e409fb99ab0fd8
  with permission=rwxrwxrwx
 2014-01-16 01:01:26,076 INFO org.apache.hadoop.hbase.regionserver.Store: 
 Renaming compacted file at 
 hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/.tmp/40de5d79f80e4fb197e409fb99ab0fd8
  to 
 hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/40de5d79f80e4fb197e409fb99ab0fd8
 2014-01-16 01:01:26,142 INFO org.apache.hadoop.hbase.regionserver.Store: 
 Completed compaction of 1 file(s) in a of 
 storagetable,01:,1369377609136.7d8941661904fb99a41f79a1fce47767. into 
 40de5d79f80e4fb197e409fb99ab0fd8, size=534; total size for store is 399.0 M
 2014-01-16 01:01:26,142 INFO 
 org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed 
 compaction: 
 regionName=storagetable,01:,1369377609136.7d8941661904fb99a41f79a1fce47767., 
 storeName=a, fileCount=1, fileSize=534, priority=16, time=18280340606333745; 
 duration=0sec
 {code}
 (3) Select HFile for compaction
 {code}
 2014-01-16 03:48:05,120 INFO 
 org.apache.hadoop.hbase.regionserver.compactions.CompactSelection: Deleting 
 the expired store file by compaction: 
 hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/40de5d79f80e4fb197e409fb99ab0fd8
  whose maxTimeStamp is -1 while the max expired timestamp is 1389642485120
 {code}
 (4) Compact
 {code}
 2014-01-16 03:50:17,731 DEBUG org.apache.hadoop.hbase.regionserver.Compactor: 
 Compacting 
 hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/40de5d79f80e4fb197e409fb99ab0fd8,
  keycount=0, bloomtype=NONE, size=534, encoding=NONE
 2014-01-16 03:50:17,732 DEBUG org.apache.hadoop.hbase.util.FSUtils: Creating 
 file=hdfs://dump002002.cm6:9000/hbase-0.90
 {code}
 ... 
 This loops forever.
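The loop happens because the "delete expired" selection rewrites the already-empty hfile (maxTimeStamp == -1, zero entries) into another empty hfile, which is then selected again. A minimal sketch of one possible guard (hypothetical names, not the actual HBASE-10371 patch): only treat a file as compactable-expired if it actually holds entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: skip store files with no entries during expired-file
// selection, so compaction does not rewrite an empty hfile into another
// empty hfile forever.
class StoreFileInfo {
    final long maxTimestamp;
    final long entryCount;
    StoreFileInfo(long maxTimestamp, long entryCount) {
        this.maxTimestamp = maxTimestamp;
        this.entryCount = entryCount;
    }
}

class ExpiredSelection {
    static List<StoreFileInfo> selectExpired(List<StoreFileInfo> files,
                                             long maxExpiredTs) {
        List<StoreFileInfo> expired = new ArrayList<>();
        for (StoreFileInfo f : files) {
            // Guard: an empty file (entryCount == 0) is never re-selected.
            if (f.maxTimestamp < maxExpiredTs && f.entryCount > 0) {
                expired.add(f);
            }
        }
        return expired;
    }
}
```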





[jira] [Updated] (HBASE-10371) Compaction creates empty hfile, then selects this file for compaction and creates empty hfile and over again

2014-01-19 Thread binlijin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

binlijin updated HBASE-10371:
-

Attachment: HBASE-10371-94-2.patch



[jira] [Commented] (HBASE-10371) Compaction creates empty hfile, then selects this file for compaction and creates empty hfile and over again

2014-01-19 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876101#comment-13876101
 ] 

binlijin commented on HBASE-10371:
--

Added patches for the 96 and 94 branches.



[jira] [Created] (HBASE-10379) Turn the msg Request is a replay (34) - PROCESS_TGS from logging level ERROR to WARN

2014-01-19 Thread takeshi.miao (JIRA)
takeshi.miao created HBASE-10379:


 Summary: Turn the msg Request is a replay (34) - PROCESS_TGS 
from logging level ERROR to WARN
 Key: HBASE-10379
 URL: https://issues.apache.org/jira/browse/HBASE-10379
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.16
Reporter: takeshi.miao
Assignee: takeshi.miao
Priority: Minor


Hi All,

Recently we got the error msg "Request is a replay (34) - PROCESS_TGS" while 
using the HBase client API to put data into HBase-0.94.16 with krb5-1.6.1 
enabled. The related messages follow:
{code}
[2014-01-15 
09:40:38,452][hbase-tablepool-1-thread-3][ERROR][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1124)):
 PriviledgedActionException as:takeshi_miao@LAB 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Request is a 
replay (34) - PROCESS_TGS)]
[2014-01-15 
09:40:38,453][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation.logPriviledgedAction(UserGroupInformation.java:1143)):
 PriviledgedAction as:takeshi_miao@LAB 
from:sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  
[2014-01-15 
09:40:38,453][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.ipc.SecureClient](org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:213)):
 Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Request is a replay (34) - 
PROCESS_TGS)]
[2014-01-15 09:40:38,454][hbase-tablepool-1-thread-3][INFO 
][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation.reloginFromTicketCache(UserGroupInformation.java:657)):
 Initiating logout for takeshi_miao@LAB
[2014-01-15 
09:40:38,454][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule.logout(UserGroupInformation.java:154)):
 hadoop logout
[2014-01-15 09:40:38,454][hbase-tablepool-1-thread-3][INFO 
][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation.reloginFromTicketCache(UserGroupInformation.java:667)):
 Initiating re-login for takeshi_miao@LAB
[2014-01-15 
09:40:38,455][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule.login(UserGroupInformation.java:146)):
 hadoop login
[2014-01-15 
09:40:38,456][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule.commit(UserGroupInformation.java:95)):
 hadoop login commit
[2014-01-15 
09:40:38,456][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule.commit(UserGroupInformation.java:100)):
 using existing subject:[takeshi_miao@LAB, UnixPrincipal: takeshi_miao, 
UnixNumericUserPrincipal: 501, UnixNumericGroupPrincipal [Primary Group]: 501, 
UnixNumericGroupPrincipal [Supplementary Group]: 502, takeshi_miao@LAB]
{code}

At the beginning, we were worried about data loss when we found "Request is a 
replay (34) - PROCESS_TGS" (especially at the ERROR level) in the log, but 
after studying the code, this is basically *NOT* a data loss issue: the HBase 
client API retries 5 times internally (o.a.h.hbase.ipc.SecureClient, L#296, 
within one thread) and another 10 times externally 
(o.a.h.hbase.client.HConnectionManager, L#1661, for all failed threads). The 
HTable API also throws an IOException to client code if any thread still fails 
after these retries.

From an HBase user's viewpoint such as ours, we think it is better to change 
the logging level from 'ERROR' to 'WARN', since the 'ERROR' level had confused 
us for a while. But this change may need to touch both HBase code and Hadoop 
code as well, so I am wondering what the community thinks about this small 
thing that may be important to pure HBase users.

mailing list
http://mail-archives.apache.org/mod_mbox/hbase-user/201401.mbox/%3CCADcMMgGiEyho0HGwgbfOUS78ymDpCo5Q0PStWAPUk40W%3DPfcFQ%40mail.gmail.com%3E
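The retry behaviour described above can be sketched as follows (hypothetical names; in real HBase the inner retries live in the secure connection setup and the outer retries in the client's caller, with counts like the 5 and 10 mentioned above). The point is that a transient SASL replay error surfaces to the caller only if every attempt fails.

```java
// Hypothetical sketch: why a transient "Request is a replay (34)" failure
// rarely loses data. The connection layer retries a few times, the client
// layer retries the whole operation on top of that, and only when every
// attempt fails does the caller finally see an exception.
class ReplayRetrySketch {
    interface Op { void run(); }  // throws RuntimeException on failure

    static boolean callWithRetries(Op op, int outerRetries, int innerRetries) {
        for (int outer = 0; outer < outerRetries; outer++) {
            for (int inner = 0; inner < innerRetries; inner++) {
                try {
                    op.run();
                    return true;  // success: the caller never sees the error
                } catch (RuntimeException transientFailure) {
                    // e.g. "GSS initiate failed ... Request is a replay (34)"
                }
            }
        }
        return false;  // all (outer * inner) attempts failed: surface an error
    }
}
```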






[jira] [Commented] (HBASE-10379) Turn the msg Request is a replay (34) - PROCESS_TGS from logging level ERROR to WARN

2014-01-19 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876105#comment-13876105
 ] 

takeshi.miao commented on HBASE-10379:
--

Will apply a patch later


[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat

2014-01-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876114#comment-13876114
 ] 

Ted Yu commented on HBASE-10323:


[~apurtell]:
Do you want this in 0.98 ?

 Auto detect data block encoding in HFileOutputFormat
 

 Key: HBASE-10323
 URL: https://issues.apache.org/jira/browse/HBASE-10323
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: Ishan Chhabra
Assignee: Ishan Chhabra
 Fix For: 0.99.0

 Attachments: HBASE_10323-0.94.15-v1.patch, 
 HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, 
 HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, 
 HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, 
 HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch


 Currently, one has to specify the data block encoding of the table explicitly 
 using the config parameter 
 hbase.mapreduce.hfileoutputformat.datablock.encoding when doing a bulk load. 
 This option is easily missed, not documented, and works differently from 
 compression, block size and bloom filter type, which are auto detected. 
 The solution would be to add support to auto detect datablock encoding 
 similar to other parameters. 
 The current patch does the following:
 1. Automatically detects datablock encoding in HFileOutputFormat.
 2. Keeps the legacy option of manually specifying the datablock encoding
 around as a method to override auto detection.
 3. Moves string conf parsing to the start of the program so that it fails
 fast during startup instead of failing during record writes. It also
 makes the internals of the program type safe.
 4. Adds missing doc strings and unit tests for code serializing and
 deserializing config parameters for bloom filter type, block size and
 datablock encoding.
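The serialize/deserialize round trip in point 4 can be sketched like this (a hypothetical helper, not the patch itself): per-family settings are packed into a single configuration string and parsed back at job startup, so a malformed value fails fast instead of failing during record writes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: pack a family -> datablock-encoding map into one
// configuration string ("f1=FAST_DIFF&f2=NONE") and parse it back, failing
// fast on malformed input.
class EncodingConfSketch {
    static String serialize(Map<String, String> familyToEncoding) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : familyToEncoding.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    static Map<String, String> deserialize(String conf) {
        Map<String, String> out = new LinkedHashMap<>();
        if (conf.isEmpty()) return out;
        for (String pair : conf.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {  // fail fast at startup, not mid-write
                throw new IllegalArgumentException("bad entry: " + pair);
            }
            out.put(kv[0], kv[1]);
        }
        return out;
    }
}
```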





[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads

2014-01-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876141#comment-13876141
 ] 

ramkrishna.s.vasudevan commented on HBASE-10322:


bq.CellCodecV2 is used as default. But I think there should not be any 
default value for this codec because of the below reason. (Default value 
should be value of the old config with thats default as KVCodec)
Suppose the src cluster user is upgrading to 98 (or later versions in future) 
But the peer is still in 96 .
I agree. The patch's intention was to show how we could do those config 
settings. +1
[~lhofhansl]
bq.if I can see a certain KV, should I be able to see its visibility tags?
What you say is right in the sense that the admin sets up proper access 
control, say for User A, and User A would see only those KVs whose visibility 
labels A is associated with. But sometimes the labels can be a combination of 
visibility labels separated by &, | and !. In that case User A, on reading the 
visibility labels, would come to know about the existence of other labels. 
Added to that, the whole association of labels with users is done by an 
admin with super user privileges. So allowing all users to view the labels in 
the KV would break this: by reading the KV, User A would come to know what 
combination of labels he could pass to access KVs he is not 
authorised for.

 Strip tags from KV while sending back to client on reads
 

 Key: HBASE-10322
 URL: https://issues.apache.org/jira/browse/HBASE-10322
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Blocker
 Fix For: 0.98.0, 0.99.0

 Attachments: HBASE-10322.patch, HBASE-10322_V2.patch, 
 HBASE-10322_codec.patch


 Right now we have some inconsistency wrt sending back tags on read. We do 
 this in scan when using the Java client (codec based cell block encoding). But 
 during a Get operation, or when a pure PB based Scan comes, we are not sending 
 back the tags. So we have to do one of the fixes below:
 1. Send back tags in the missing cases also. But sending back the visibility 
 expression/cell ACL is not correct.
 2. Don't send back tags in any case. This will be a problem when a tool like 
 ExportTool uses the scan to export the table data. We will miss exporting the 
 cell visibility/ACL.
 3. Send back tags based on some condition. It has to be on a per scan basis. 
 The simplest way is to pass some kind of attribute in Scan which says whether 
 to send back tags or not. But trusting what the scan specifies might not be 
 correct IMO. Then comes the way of checking the user who is doing the scan: 
 send back tags only when an HBase super user is doing the scan. So for a case 
 like the Export Tool's, the execution should happen as a super user.
 So IMO we should go with #3.
 Patch coming soon.
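A minimal sketch of option 3 (hypothetical names, not the committed patch): tags ride along only for a super user; everyone else gets the serialized cell truncated before its tag section.

```java
import java.util.Arrays;

// Hypothetical sketch of option 3: strip the tag section of a serialized
// cell unless the reader is an HBase super user (e.g. ExportTool run as one).
class TagStripSketch {
    static byte[] prepareForClient(byte[] cellWithTags, int tagsOffset,
                                   boolean isSuperUser) {
        if (isSuperUser) {
            return cellWithTags;  // keep visibility/ACL tags for export tools
        }
        // Everyone else gets the cell truncated before the tag section.
        return Arrays.copyOf(cellWithTags, tagsOffset);
    }
}
```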





[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads

2014-01-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876146#comment-13876146
 ] 

stack commented on HBASE-10322:
---

bq. We need any way an RpcClient to talk with peer right?

Yes, but I thought that if the Service is the explicit Replication Service, 
then you could identify the context as replication and slot in replication 
suited codecs that preserve tags on setup of the replication connection -- if 
asked for (A codec for replication that is other than what we use for 'normal' 
client/server seems like something we'd want to have anyways).

If we break out a replication Service, it will break being able to replicate 
from a 0.98 to a 0.96 whether or not you are forwarding tags. That ain't 
good. If we leave the service as is, it sounds like we can have a 0.98 
replicate to a 0.96 when no tags are in the mix. It is only when you enable 
tags that you will have to update the sink cluster so it recognizes the 
tag-bearing codec.

Of your 1. and 2., 1. is preferable. Pity it has to be a config in the 
hbase-*.xml. Can it be a replication config (I suppose this is what your 2. 
does in part)? Can ship 0.98.0RC as soon as we dump in a codec that can do 
tags (what happens when you pass a KV with tags to the default KVCodec? It 
just dumps them?)

I like how [~lhofhansl] is telling it.  Does that help lads?
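Option 1 from the comment quoted at the top amounts to a configuration change on the source cluster's region servers, roughly as below. The codec class name follows the "KVCodecWithTag" naming proposed in the thread and is illustrative, not a shipped class; the config key `hbase.client.rpc.codec` is the one the discussion names.

```xml
<!-- hbase-site.xml on the source cluster's region servers: make the
     RpcClient used by ReplicationSource ship tags along with KeyValues.
     The peer cluster must have the same codec class on its classpath,
     so peers need upgrading before this is enabled. -->
<property>
  <name>hbase.client.rpc.codec</name>
  <value>org.apache.hadoop.hbase.codec.KVCodecWithTag</value>
</property>
```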







[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads

2014-01-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876147#comment-13876147
 ] 

stack commented on HBASE-10322:
---

Just to say that the 'codec' problem would be true of any sink cluster no 
matter what the version; you couldn't use some fancy compression codec unless 
you first updated the sink cluster so it recognized it when the source cluster 
set up the connection.



[jira] [Updated] (HBASE-10373) Add more details info for ACL group in HBase book

2014-01-19 Thread takeshi.miao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

takeshi.miao updated HBASE-10373:
-

Fix Version/s: 0.99.0
Affects Version/s: 0.99.0
   Status: Patch Available  (was: Open)

 Add more details info for ACL group in HBase book
 -

 Key: HBASE-10373
 URL: https://issues.apache.org/jira/browse/HBASE-10373
 Project: HBase
  Issue Type: Improvement
  Components: documentation, security
Affects Versions: 0.99.0
Reporter: takeshi.miao
Assignee: takeshi.miao
Priority: Minor
 Fix For: 0.99.0

 Attachments: HBASE-10373-trunk-v01.patch


 The current ACL section '8.3. Access Control' in the HBase book does not 
 instruct users how to grant an ACL to a group. I think it is good to make 
 this clear for users, since group ACLs are a great and important feature for 
 managing ACLs more easily.
 mailing list
 http://mail-archives.apache.org/mod_mbox/hbase-user/201401.mbox/%3CCA+RK=_b+umfzwiaeud9fsqjk8rs8l-vuo6arvos8k5sutog...@mail.gmail.com%3E

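Group principals in the grant/revoke shell syntax are distinguished from users by a leading '@' (groups resolve through the Hadoop group mapping service). A minimal illustrative sketch of that classification rule, not HBase's actual implementation:

```java
// Illustrative sketch: classifying an ACL principal string the way the
// HBase shell's grant/revoke syntax distinguishes users from groups.
// A leading '@' marks a group; anything else is a plain user name.
public class AclPrincipal {
    public enum Kind { USER, GROUP }

    public final Kind kind;
    public final String name; // name without the '@' prefix

    public AclPrincipal(String principal) {
        if (principal == null || principal.isEmpty()) {
            throw new IllegalArgumentException("empty principal");
        }
        if (principal.charAt(0) == '@') {
            this.kind = Kind.GROUP;
            this.name = principal.substring(1);
        } else {
            this.kind = Kind.USER;
            this.name = principal;
        }
    }
}
```

So in the shell, granting to `@admins` targets the Hadoop group `admins`, while granting to `admins` targets a user of that name.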




[jira] [Updated] (HBASE-10373) Add more details info for ACL group in HBase book

2014-01-19 Thread takeshi.miao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

takeshi.miao updated HBASE-10373:
-

Attachment: HBASE-10373-trunk-v01.patch

patch-v01 submitted.
Added some descriptions in section _'8.4.5. Shell Enhancements for Access 
Control'_ demonstrating group ACLs as well.






[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads

2014-01-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876159#comment-13876159
 ] 

ramkrishna.s.vasudevan commented on HBASE-10322:


bq. what happens when you pass a KV with tags to default KVCodec? It just dumps 
them
No, KVCodec by default will not dump tags, but when it works with the 
WALCellCodec it would dump them; so we would control it with a flag.
The reason is that KVCodec writes the entire length of the byte array.

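The flag idea above can be sketched as a writer that either includes or strips the tags portion of a cell. This is an illustrative sketch only, not HBase's real KVCodec/WALCellCodec wire format:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative sketch (not HBase's real codecs): serialize key + value always,
// and the tags portion only when includeTags is set, so the same cell can be
// written with tags for the WAL but stripped for client responses.
public class TagAwareWriter {
    public static byte[] encode(byte[] key, byte[] value, byte[] tags,
                                boolean includeTags) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeInt(key.length);
            out.write(key);
            out.writeInt(value.length);
            out.write(value);
            if (includeTags) {
                out.writeInt(tags.length);
                out.write(tags);
            } else {
                out.writeInt(0); // no tags on the wire
            }
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e); // cannot happen for in-memory streams
        }
    }
}
```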





[jira] [Commented] (HBASE-9004) Fix Documentation around Minor compaction and ttl

2014-01-19 Thread Dan Feng (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876165#comment-13876165
 ] 

Dan Feng commented on HBASE-9004:
-

I'm a little bit confused. Can somebody clarify whether minor compaction will 
delete expired cells or not?

 Fix Documentation around Minor compaction and ttl
 -

 Key: HBASE-9004
 URL: https://issues.apache.org/jira/browse/HBASE-9004
 Project: HBase
  Issue Type: Task
Reporter: Elliott Clark

 Minor compactions should be able to delete KeyValues outside of ttl.  The 
 docs currently suggest otherwise.  We should bring them in line.





[jira] [Commented] (HBASE-9004) Fix Documentation around Minor compaction and ttl

2014-01-19 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876169#comment-13876169
 ] 

Feng Honghua commented on HBASE-9004:
-

Minor compaction *will* delete expired cells, but only those within the input 
hfiles selected by the minor compaction. The expired cells in hfiles not 
selected by the compaction still exist in those hfiles, but they can't be read 
out by a read/scan, since ScanQueryMatcher guarantees to filter them out by the 
TTL rule when processing the read/scan; these expired cells will eventually be 
deleted in a subsequent compaction, if their hosting hfiles are selected by 
that compaction.

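The TTL rule described above reduces to a simple age check. An illustrative sketch (not HBase's actual ScanQueryMatcher) of how expired cells are filtered on read, and equivalently dropped when a compaction rewrites their hosting hfile:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the TTL rule (not HBase's ScanQueryMatcher): a cell
// is visible only while it is younger than the column family's TTL; expired
// cells are filtered on every read and physically removed only when a
// compaction rewrites the hfile that holds them.
public class TtlFilter {
    public static boolean isExpired(long cellTimestampMs, long nowMs, long ttlMs) {
        return nowMs - cellTimestampMs > ttlMs;
    }

    // What a read/scan (or a compaction over selected files) would keep.
    public static List<Long> keepLive(List<Long> cellTimestamps, long nowMs, long ttlMs) {
        List<Long> live = new ArrayList<>();
        for (long ts : cellTimestamps) {
            if (!isExpired(ts, nowMs, ttlMs)) {
                live.add(ts);
            }
        }
        return live;
    }
}
```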





[jira] [Commented] (HBASE-10378) Divide HLog interface into User and Implementor specific interfaces

2014-01-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876174#comment-13876174
 ] 

ramkrishna.s.vasudevan commented on HBASE-10378:


[~himan...@cloudera.com]
Had a glance at the patch. The WAL and WALService interfaces look good.
I have some basic questions. In the case of rollWriter and replication 
(including starting the syncer and writer threads), should we have an 
implementation of the WALService where, based on the number of HLogs, that 
many syncer and writer threads are started, along with the replication 
services for them? Currently the HRS just instantiates one HLog and starts 
them. What do you say?
I was having an idea of defining an interface called WALGrouper: every region 
server would use a type of this WALGrouper, and the grouper would know how 
many HLog instances it is creating; for every instance, those syncer and 
writer threads would be started.
And the API
{code}
public WALService getWAL() {
{code} 
How will this make sense when there is more than one HLog for that RS? I know 
that in your current implementation there are only 2 HLogs and one among them 
is going to be active. But what if my multi-WAL impl is not that way and I may 
have more than one active HLog?
I can discuss with you offline too about some concerns and questions that I 
had while doing HBASE-8610.

 Divide HLog interface into User and Implementor specific interfaces
 ---

 Key: HBASE-10378
 URL: https://issues.apache.org/jira/browse/HBASE-10378
 Project: HBase
  Issue Type: Sub-task
  Components: wal
Reporter: Himanshu Vashishtha
 Attachments: 10378-1.patch


 HBASE-5937 introduces the HLog interface as a first step to support multiple 
 WAL implementations. This interface is a good start, but has some 
 limitations/drawbacks in its current state, such as:
 1) There is no clear distinction b/w User and Implementor APIs, and it 
 provides APIs both for WAL users (append, sync, etc) and also WAL 
 implementors (Reader/Writer interfaces, etc). There are APIs which are very 
 much implementation specific (getFileNum, etc) and a user such as a 
 RegionServer shouldn't know about it.
 2) There are about 14 methods in FSHLog which are not present in HLog 
 interface but are used at several places in the unit test code. These tests 
 typecast HLog to FSHLog, which makes it very difficult to test multiple WAL 
 implementations without doing some ugly checks.
 I'd like to propose some changes in HLog interface that would ease the multi 
 WAL story:
 1) Have two interfaces WAL and WALService. WAL provides APIs for 
 implementors. WALService provides APIs for users (such as RegionServer).
 2) A skeleton implementation of the above two interface as the base class for 
 other WAL implementations (AbstractWAL). It provides required fields for all 
 subclasses (fs, conf, log dir, etc). Make a minimal set of test only methods 
 and add this set in AbstractWAL.
 3) HLogFactory returns a WALService reference when creating a WAL instance; 
 if a user needs to access impl-specific APIs (there are unit tests which get 
 the WAL from an HRegionServer and then call impl-specific APIs), use 
 AbstractWAL type casting.
 4) Make TestHLog abstract and let all implementors provide their respective 
 test class which extends TestHLog (TestFSHLog, for example).

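The proposed split can be sketched as two interfaces plus a skeleton base class. The names follow the proposal, but the method sets here are hypothetical, chosen only to illustrate the user/implementor separation:

```java
// Illustrative sketch of the proposed interface split (method sets are
// hypothetical): WALService is what a user such as a RegionServer calls,
// WAL adds implementor-facing concerns, and an abstract base class holds
// state shared by concrete implementations.
public class WalSketch {
    // User-facing API.
    interface WALService {
        long append(String regionName, byte[] entry);
        void sync();
    }

    // Implementor-facing API; a user should not need these methods.
    interface WAL extends WALService {
        void rollWriter();
        void close();
    }

    // Skeleton base class (AbstractWAL in the proposal) with common fields.
    static abstract class AbstractWAL implements WAL {
        protected final String logDir;
        protected long sequenceId = 0;

        AbstractWAL(String logDir) {
            this.logDir = logDir;
        }

        @Override public long append(String regionName, byte[] entry) {
            return ++sequenceId; // toy behavior: just hand out sequence ids
        }
        @Override public void sync() { /* no-op in the sketch */ }
        @Override public void rollWriter() { /* no-op in the sketch */ }
        @Override public void close() { /* no-op in the sketch */ }
    }

    static class InMemoryWAL extends AbstractWAL {
        InMemoryWAL() {
            super("/tmp/wal-sketch");
        }
    }
}
```

A caller holding only the WALService reference cannot reach rollWriter/close, which is the distinction the proposal is after.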




[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads

2014-01-19 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876186#comment-13876186
 ] 

Anoop Sam John commented on HBASE-10322:


bq. I.e. if I can see a certain KV, should I be able to see its visibility tags?
The answer to this is no. Ram explained the reason.

bq. Maybe we can even have a tag that controls the visibility of the tags?
We thought that who can see the tags along with the KVs should be decided by 
the user. Only an HBase super user will be able to get the tags along with 
the KVs.

So this is the overall idea. Yes, tags control what a client can see, but we 
would like to prevent normal clients from seeing the tags themselves.

The impl of this becomes very tricky, as we use the same Codec to write from 
client to server and back. We were giving options for the user to add tags to 
Mutation KVs; as of now we are thinking of removing those APIs. Over RPC, tags 
will not go (ie. client - server or the reverse).
To write to the WAL we use WALCellCodec, and that will be able to write and 
read tags.
Then the last problem that came up was replication, for which we propose 2 
possible solutions. [~stack], can we do this without big changes like a 
ReplicationServer? I think those we can try addressing in another issue.
bq. Of your 1., and 2., 1. is preferable. Pity it has to be a config. in the 
hbase-*xml. Can it be a replication config (I suppose this is what your 2. does 
in part)?
This config (hbase.client.rpc.codec) is used by the RpcClient, and the 
RpcClient is used by the ReplicationSource, so yes, it already refers to the 
config. As long as users don't deal with tags (existing data migrated to 98), 
no changes are required at all. When they have tags in use, they have to 
change this config at the HRS side to some codec with tags; we will ship some 
new codecs which can handle tags as well, along with this issue. Yes, option 2 
adds a new config and, as shown in the patch, it just writes the value of this 
new config as the old config's value (decorating the conf object used by 
ReplicationSource).
If you feel option 1 is fine, no extra code will be needed on the replication 
side.

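The conf decoration in option 2 can be sketched by modeling a Hadoop Configuration with a plain Map. The replication config key name below is hypothetical, chosen only to mirror the discussion; the sketch copies the new config's value over the old client-codec key so the RpcClient used by ReplicationSource picks it up unchanged:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of option 2: the ReplicationSource decorates a copy of
// its conf so that a replication-specific codec config (hypothetical key
// name) overrides the regular client codec key read by the RpcClient.
public class ReplicationConfDecorator {
    static final String CLIENT_CODEC_KEY = "hbase.client.rpc.codec";
    static final String REPLICATION_CODEC_KEY = "hbase.replication.rpc.codec"; // assumed name

    // Return a decorated copy; when no replication codec is configured, the
    // client codec (and so the default) is left untouched, which keeps
    // rolling upgrades against older peers working.
    public static Map<String, String> decorate(Map<String, String> conf) {
        Map<String, String> decorated = new HashMap<>(conf);
        String replCodec = conf.get(REPLICATION_CODEC_KEY);
        if (replCodec != null) {
            decorated.put(CLIENT_CODEC_KEY, replCodec);
        }
        return decorated;
    }
}
```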






[jira] [Commented] (HBASE-10373) Add more details info for ACL group in HBase book

2014-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876191#comment-13876191
 ] 

Hadoop QA commented on HBASE-10373:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12623897/HBASE-10373-trunk-v01.patch
  against trunk revision .
  ATTACHMENT ID: 12623897

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+grant <user|@group> <permissions> [ <table> [ <column family> [ <column 
qualifier> ] ] ]
+<code class="code"><user|@group></code> is a user or a group (a group starts 
with the character '@'). Groups are created and manipulated via the Hadoop 
group mapping service.
+revoke <user|@group> [ <table> [ <column family> [ <column qualifier> ] ] ]

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8471//console

This message is automatically generated.






[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads

2014-01-19 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876212#comment-13876212
 ] 

Lars Hofhansl commented on HBASE-10322:
---

Thanks for explaining, Ram and Anoop.

bq. So tags are becoming a server only thing as of now
Agree. Can tackle Export/Copytable/etc later, although I figure eventually 
these would have to be addressed if folks use them for backup.

bq. Only HBase super user will be able to get tags also along with KVs.
This seems to contradict the earlier point.




