[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads
[ https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13875919#comment-13875919 ] Anoop Sam John commented on HBASE-10322: Can that change go into another Jira, [~stack]? Either way we need an RpcClient to talk with the peer, so we have two options. 1. Go with the current way, without any code changes. The RpcClient used by ReplicationSource looks at the config hbase.client.rpc.codec for the codec name and uses that; it defaults to KVCodec. As long as the user doesn't deal with tags directly or indirectly (via cell-level ACLs / visibility labels), the current way works fine. If the tag case comes up, the user must: a. Change this config value on the HRS side to a codec class that carries tags (we plan to provide a KVCodecWithTag). b. Make sure the RSs in the peer clusters are also upgraded, so that the new class added in 98 is available there too. 2. Introduce a new config name, as in the latest patch, and change ReplicationSource to decorate the conf. The attached patch uses a new codec, CellCodecV2, as the default, but I think there should be no new default value for this codec (the default should be the value of the old config, whose own default is KVCodec), for the following reason: suppose the source-cluster user upgrades to 98 (or a later version in future) while the peer is still on 96. If the replication source writes using the new codec class, the destination needs that codec class present as well, which forces the peer to be upgraded too. What about rolling upgrades then? So whether or not the new config is there, the default codec should not change. Out of these two options, which one do you guys prefer? I can give a patch accordingly.
Strip tags from KV while sending back to client on reads Key: HBASE-10322 URL: https://issues.apache.org/jira/browse/HBASE-10322 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Priority: Blocker Fix For: 0.98.0, 0.99.0 Attachments: HBASE-10322.patch, HBASE-10322_V2.patch, HBASE-10322_codec.patch Right now we have some inconsistency wrt sending back tags on read. We do this in Scan when using the Java client (codec-based cell block encoding), but during a Get operation, or when a pure PB-based Scan comes in, we are not sending back the tags. So we have to do one of the following: 1. Send back tags in the missing cases too. But sending back visibility expressions / cell ACLs is not correct. 2. Don't send back tags in any case. This will be a problem when a tool like ExportTool uses a scan to export table data: we would miss exporting the cell visibility/ACLs. 3. Send back tags based on some condition, on a per-scan basis. The simplest way is to pass some kind of attribute in Scan that says whether to send back tags or not, but trusting whatever the scan specifies might not be correct IMO. The alternative is checking the user who is doing the scan: send back tags only when an HBase superuser is doing the scan. A case like the Export Tool's would then have to be executed as a superuser. So IMO we should go with #3. Patch coming soon. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
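The conf decoration in option 2 above (a replication-specific codec key whose effective default stays the old client codec, so a 0.96 peer keeps working through a rolling upgrade) can be sketched as follows. This is a minimal illustration using a plain Map in place of HBase's Configuration; the hbase.replication.rpc.codec key name and codec class names are assumptions for the sketch, not settled names from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of option 2: ReplicationSource decorates the conf handed to its
// RpcClient. A plain Map stands in for org.apache.hadoop.hbase.conf.Configuration.
class ReplicationCodecConf {
    static final String RPC_CODEC = "hbase.client.rpc.codec";
    // Hypothetical replication-specific override key.
    static final String REPLICATION_CODEC = "hbase.replication.rpc.codec";
    static final String DEFAULT_CODEC = "org.apache.hadoop.hbase.codec.KeyValueCodec";

    // Prefer the replication-specific codec if set, else fall back to the
    // client codec, else the old default. The default never changes, so a
    // peer cluster without the new codec class still decodes the stream.
    static Map<String, String> decorate(Map<String, String> conf) {
        Map<String, String> copy = new HashMap<>(conf);
        String codec = conf.getOrDefault(REPLICATION_CODEC,
            conf.getOrDefault(RPC_CODEC, DEFAULT_CODEC));
        copy.put(RPC_CODEC, codec);
        return copy;
    }
}
```

A user shipping tags would set the replication-specific key to a tag-carrying codec; everyone else keeps the old wire format untouched.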
[jira] [Updated] (HBASE-10294) Some synchronization on ServerManager#onlineServers can be removed
[ https://issues.apache.org/jira/browse/HBASE-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10294: --- Fix Version/s: (was: 0.99.0) Some synchronization on ServerManager#onlineServers can be removed -- Key: HBASE-10294 URL: https://issues.apache.org/jira/browse/HBASE-10294 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 10294-v1.txt ServerManager#onlineServers is a ConcurrentHashMap, yet some accesses to it are synchronized unnecessarily. Here is one example: {code} public Map<ServerName, ServerLoad> getOnlineServers() { // Presumption is that iterating the returned Map is OK. synchronized (this.onlineServers) { return Collections.unmodifiableMap(this.onlineServers); } } {code} Note: not all accesses to ServerManager#onlineServers are synchronized.
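The pattern in question can be illustrated with a self-contained stand-in (Strings and Integers in place of ServerName and ServerLoad): reads and iteration on a ConcurrentHashMap are thread-safe without external locking, and its iterators are weakly consistent, so the synchronized block around the unmodifiable wrapper adds nothing.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch, not the actual ServerManager class.
class OnlineServers {
    private final Map<String, Integer> onlineServers = new ConcurrentHashMap<>();

    void add(String server, int load) {
        onlineServers.put(server, load);
    }

    // No synchronization needed: the returned read-only view is backed by a
    // ConcurrentHashMap, which callers may iterate concurrently with updates.
    Map<String, Integer> getOnlineServers() {
        return Collections.unmodifiableMap(onlineServers);
    }
}
```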
[jira] [Updated] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code
[ https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-10366: -- Resolution: Fixed Release Note: Thanks for the reviews! I've integrated the v1 patch into the trunk, 0.98 and 0.96 branches. Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) 0.94 filterRow() may be skipped in 0.96(or onwards) code Key: HBASE-10366 URL: https://issues.apache.org/jira/browse/HBASE-10366 Project: HBase Issue Type: Bug Components: Filters Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Priority: Critical Fix For: 0.98.0, 0.96.2 Attachments: hbase-10366-v1.patch, hbase-10366.patch HBASE-6429 combines both the filterRow() and filterRow(List<KeyValue> kvs) functions in Filter. A filter written against 0.94 or older code may not implement hasFilterRow as HBASE-6429 expects, because the old 0.94 hasFilterRow only returns true when filterRow(List<KeyValue> kvs) is overridden, not filterRow(). Therefore filterRow() will be skipped. Since we don't ask 0.94 users to update their existing filter code, the issue causes scans to return unexpected KeyValues and breaks backward compatibility.
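A minimal, self-contained model of the skip described above (not the actual HBase classes): 0.96-style region code consults hasFilterRow() before invoking filterRow(), and an old-style filter that only overrides filterRow() answers false, so its row-level filtering silently never runs.

```java
import java.util.List;

class FilterRowSkip {
    abstract static class OldStyleFilter {
        // Models the 0.94 hasFilterRow contract: true only when the
        // List variant was overridden by the user filter.
        boolean overridesListVariant = false;
        boolean hasFilterRow() { return overridesListVariant; }
        void filterRowCells(List<String> kvs) {}
        boolean filterRow() { return false; }
    }

    // A 0.94-era user filter that overrides only filterRow().
    static class RowExcludingFilter extends OldStyleFilter {
        @Override boolean filterRow() { return true; } // wants the row dropped
    }

    // Models the 0.96 region-side dispatch that causes the bug.
    static boolean rowIsFiltered(OldStyleFilter f, List<String> kvs) {
        if (f.hasFilterRow()) {   // false for RowExcludingFilter...
            f.filterRowCells(kvs);
            return f.filterRow();
        }
        return false;             // ...so its filterRow() is skipped entirely
    }
}
```

The scan then returns a row the filter intended to drop, which is exactly the backward-compatibility break the issue describes.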
[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads
[ https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13875990#comment-13875990 ] Lars Hofhansl commented on HBASE-10322: --- I am a bit late to this party. The visibility tags control what a *client* can see, right? Then what's a client? A client is outside of the HBase cluster, outside of HBase's control. So HFile and HLog are not clients. Replication is also not a client. Export is a client, just like any other Java/Thrift/MR/etc. client. As Andy points out, the interesting part here is these real clients. Are the tags themselves (i.e. who sees what) more sensitive than the data that can be accessed? I.e., if I can see a certain KV, should I be able to see its visibility tags? * If the answer is yes, this is an easy problem in principle and squarely in the hands of an HBase admin to set up access correctly. You just run Export/etc. as a user with sufficient access and all problems go away. * If the answer is no, it gets murky quickly. Now all tools and access paths need to be considered individually. Maybe we can even have a tag that controls the visibility of the tags? Generally, anything that we hardwire assumes something about desired behavior that might not be the same at every institution.
[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code
[ https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876012#comment-13876012 ] Hudson commented on HBASE-10366: SUCCESS: Integrated in HBase-TRUNK #4836 (See [https://builds.apache.org/job/HBase-TRUNK/4836/]) HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code (jeffreyz: rev 1559547) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java
[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code
[ https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876020#comment-13876020 ] Hudson commented on HBASE-10366: SUCCESS: Integrated in HBase-0.98 #95 (See [https://builds.apache.org/job/HBase-0.98/95/]) HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code (jeffreyz: rev 1559548) * /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java
[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code
[ https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876024#comment-13876024 ] Hudson commented on HBASE-10366: FAILURE: Integrated in hbase-0.96 #263 (See [https://builds.apache.org/job/hbase-0.96/263/]) HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code (jeffreyz: rev 1559551) * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java
[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code
[ https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876025#comment-13876025 ] Hudson commented on HBASE-10366: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #87 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/87/]) HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code (jeffreyz: rev 1559548) * /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java
[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10323: -- Attachment: HBASE_10323-trunk-v4.patch HBASE_10323-0.94.15-v5.patch Auto detect data block encoding in HFileOutputFormat Key: HBASE-10323 URL: https://issues.apache.org/jira/browse/HBASE-10323 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Ishan Chhabra Assignee: Ishan Chhabra Fix For: 0.99.0 Attachments: HBASE_10323-0.94.15-v1.patch, HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch Currently, one has to specify the data block encoding of the table explicitly, using the config parameter hbase.mapreduce.hfileoutputformat.datablock.encoding, when doing a bulk load. This option is easily missed, is not documented, and also works differently from compression, block size and bloom filter type, which are auto detected. The solution is to add support for auto detecting the data block encoding, as for the other parameters. The current patch does the following: 1. Automatically detects the data block encoding in HFileOutputFormat. 2. Keeps the legacy option of manually specifying the data block encoding as a way to override auto detection. 3. Moves string conf parsing to the start of the program so that it fails fast during startup instead of failing during record writes. It also makes the internals of the program type safe. 4. Adds missing doc strings and unit tests for the code serializing and deserializing the config parameters for bloom filter type, block size and data block encoding.
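The serialize/parse round trip implied by points 3 and 4 might look like the sketch below. The `family=ENCODING&family=ENCODING` string format, the enum, and the names here are illustrative assumptions for the sketch, not the patch's actual Configuration encoding (the real code also URL-encodes family names).

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch: per-family data block encodings packed into one conf string.
class EncodingConf {
    // Stand-in for org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.
    enum DataBlockEncoding { NONE, PREFIX, DIFF, FAST_DIFF }

    // Serialize at job setup time, e.g. from the table descriptor's families.
    static String serialize(Map<String, DataBlockEncoding> perFamily) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, DataBlockEncoding> e : perFamily.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(e.getKey()).append('=').append(e.getValue().name());
        }
        return sb.toString();
    }

    // Parse once at startup so a bad value fails fast (IllegalArgumentException
    // from valueOf) before any record writes, and the rest of the job works
    // with the typed enum rather than raw strings.
    static Map<String, DataBlockEncoding> parse(String conf) {
        Map<String, DataBlockEncoding> out = new HashMap<>();
        if (conf.isEmpty()) return out;
        for (String pair : conf.split("&")) {
            String[] kv = pair.split("=", 2);
            out.put(kv[0], DataBlockEncoding.valueOf(kv[1]));
        }
        return out;
    }
}
```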
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876054#comment-13876054 ] Ishan Chhabra commented on HBASE-10323: --- Added the @VisibleForTesting annotations where needed and fixed the '{' in newline. I didn't make the constants package-private since no other class needs them at the moment. When some other class in the package or a test needs it, they could be made package private then.
[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code
[ https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876062#comment-13876062 ] Hudson commented on HBASE-10366: FAILURE: Integrated in hbase-0.96-hadoop2 #181 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/181/]) HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code (jeffreyz: rev 1559551) * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java
[jira] [Commented] (HBASE-10070) HBase read high-availability using eventually consistent region replicas
[ https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876066#comment-13876066 ] Devaraj Das commented on HBASE-10070: - [~lhofhansl] The user specifies that he can tolerate stale reads via flags in the read API. The Results are also tagged as such, so he can inspect whether a result is stale or not. In other words, the user still has full control. HBase read high-availability using eventually consistent region replicas Key: HBASE-10070 URL: https://issues.apache.org/jira/browse/HBASE-10070 Project: HBase Issue Type: New Feature Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: HighAvailabilityDesignforreadsApachedoc.pdf In the present HBase architecture, it is hard, probably impossible, to satisfy constraints like "the 99th percentile of reads will be served under 10 ms". One of the major factors affecting this is the MTTR for regions. There are three phases in the MTTR process - detection, assignment, and recovery. Of these, detection is usually the longest and is presently on the order of 20-30 seconds. During this time, clients are not able to read the region's data. However, some clients would be better served if regions were available for eventually consistent reads during recovery. This helps satisfy low-latency guarantees for the class of applications that can work with stale reads. For improving read availability, we propose a replicated read-only region serving design, also referred to as secondary regions, or region shadows. Extending the current model of a region being opened for reads and writes in a single region server, the region will also be opened for reading in other region servers. The region server which hosts the region for reads and writes (as in the current case) is declared PRIMARY, while 0 or more region servers might host the region as SECONDARY. There may be more than one secondary (replica count > 2). Will attach a design doc shortly which contains most of the details and some thoughts about development approaches. Reviews are more than welcome. We also have a proof-of-concept patch, which includes the master and region server side of the changes. Client-side changes will be coming soon as well.
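The opt-in contract Devaraj describes (the caller flags that stale data is acceptable, and each result is tagged so it can be inspected) can be sketched as below; the type and method names are illustrative stand-ins, not the eventual HBase client API.

```java
// Sketch of the proposed read contract: opt-in flag, stale-tagged result.
class TimelineRead {
    static class Result {
        final String value;
        private final boolean stale;
        Result(String value, boolean stale) { this.value = value; this.stale = stale; }
        // The caller can always tell whether it got a stale answer.
        boolean isStale() { return stale; }
    }

    // allowStale = caller tolerates eventually consistent data. When the
    // primary is unreachable and allowStale is set, a secondary serves the
    // read and the result is tagged stale; otherwise the read fails as today.
    static Result get(String key, boolean allowStale, boolean primaryUp) {
        if (primaryUp) return new Result("primary-value-of-" + key, false);
        if (allowStale) return new Result("secondary-value-of-" + key, true);
        throw new IllegalStateException("primary unavailable and stale reads not allowed");
    }
}
```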
[jira] [Commented] (HBASE-10366) 0.94 filterRow() may be skipped in 0.96(or onwards) code
[ https://issues.apache.org/jira/browse/HBASE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876067#comment-13876067 ] Hudson commented on HBASE-10366: SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #59 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/59/]) HBASE-10366: 0.94 filterRow() may be skipped in 0.96(or onwards) code (jeffreyz: rev 1559547) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876079#comment-13876079 ] Hadoop QA commented on HBASE-10323: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12623875/HBASE_10323-trunk-v4.patch against trunk revision . ATTACHMENT ID: 12623875
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:red}-1 site{color}. The patch appears to cause the mvn site goal to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8470//console
This message is automatically generated.
[jira] [Commented] (HBASE-10070) HBase read high-availability using eventually consistent region replicas
[ https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876087#comment-13876087 ] Devaraj Das commented on HBASE-10070: - [~stack], I agree with you that the notion of replicaID == 0 being a primary replica, etc. should be maintained in a layer outside HRegionInfo. HRegionInfo could come with an 'index', and the 'index' should be an inherent part of the HRI's identification (in the name, etc.). The layer outside could associate index == 0 with the primary replica, and so on. Will submit a patch along these lines.
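Devaraj's layering suggestion can be sketched with hypothetical names: the region info carries only a replica index that is part of its identity, and a small utility outside it owns the "index 0 is the primary" convention.

```java
// Sketch only; class and field names are illustrative, not the HBase API.
class ReplicaLayering {
    static final int DEFAULT_REPLICA_INDEX = 0;

    // The region info knows its index but attaches no meaning to it.
    static class RegionInfo {
        final String name;
        final int replicaIndex;
        RegionInfo(String baseName, int replicaIndex) {
            // The index is an inherent part of the identification (in the name).
            this.name = baseName + "_" + replicaIndex;
            this.replicaIndex = replicaIndex;
        }
    }

    // The primary/secondary notion lives here, outside RegionInfo itself.
    static boolean isPrimary(RegionInfo ri) {
        return ri.replicaIndex == DEFAULT_REPLICA_INDEX;
    }
}
```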
[jira] [Updated] (HBASE-10371) Compaction creates empty hfile, then selects this file for compaction and creates empty hfile and over again
[ https://issues.apache.org/jira/browse/HBASE-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] binlijin updated HBASE-10371: - Attachment: HBASE-10371-96.patch Compaction creates empty hfile, then selects this file for compaction and creates empty hfile and over again Key: HBASE-10371 URL: https://issues.apache.org/jira/browse/HBASE-10371 Project: HBase Issue Type: Bug Reporter: binlijin Assignee: binlijin Fix For: 0.98.0, 0.96.2, 0.99.0, 0.94.17 Attachments: 10371-trunk-3.patch, HBASE-10371-94.patch, HBASE-10371-96.patch, HBASE-10371-trunk-2.patch, HBASE-10371-trunk.patch (1) Select HFile for compaction {code} 2014-01-16 01:01:25,111 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactSelection: Deleting the expired store file by compaction: hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/f3e38d10d579420494079e17a2557f0b whose maxTimeStamp is -1 while the max expired timestamp is 1389632485111 {code} (2) Compact {code} 2014-01-16 01:01:26,042 DEBUG org.apache.hadoop.hbase.regionserver.Compactor: Compacting hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/f3e38d10d579420494079e17a2557f0b, keycount=0, bloomtype=NONE, size=534, encoding=NONE 2014-01-16 01:01:26,045 DEBUG org.apache.hadoop.hbase.util.FSUtils: Creating file=hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/.tmp/40de5d79f80e4fb197e409fb99ab0fd8 with permission=rwxrwxrwx 2014-01-16 01:01:26,076 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming compacted file at hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/.tmp/40de5d79f80e4fb197e409fb99ab0fd8 to hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/40de5d79f80e4fb197e409fb99ab0fd8 2014-01-16 01:01:26,142 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 1 file(s) in a of 
storagetable,01:,1369377609136.7d8941661904fb99a41f79a1fce47767. into 40de5d79f80e4fb197e409fb99ab0fd8, size=534; total size for store is 399.0 M 2014-01-16 01:01:26,142 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=storagetable,01:,1369377609136.7d8941661904fb99a41f79a1fce47767., storeName=a, fileCount=1, fileSize=534, priority=16, time=18280340606333745; duration=0sec {code} (3) Select HFile for compaction {code} 2014-01-16 03:48:05,120 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactSelection: Deleting the expired store file by compaction: hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/40de5d79f80e4fb197e409fb99ab0fd8 whose maxTimeStamp is -1 while the max expired timestamp is 1389642485120 {code} (4) Compact {code} 2014-01-16 03:50:17,731 DEBUG org.apache.hadoop.hbase.regionserver.Compactor: Compacting hdfs://dump002002.cm6:9000/hbase-0.90/storagetable/7d8941661904fb99a41f79a1fce47767/a/40de5d79f80e4fb197e409fb99ab0fd8, keycount=0, bloomtype=NONE, size=534, encoding=NONE 2014-01-16 03:50:17,732 DEBUG org.apache.hadoop.hbase.util.FSUtils: Creating file=hdfs://dump002002.cm6:9000/hbase-0.90 {code} ... and this loops forever. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
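The loop in the log excerpts happens because an empty compacted HFile carries a maxTimeStamp of -1, which is always below any TTL-expiry cutoff, so the expired-file selection picks the freshly written empty file up again on the next pass. A minimal sketch of that selection predicate and a guard that skips already-empty files (class, field, and method names here are illustrative stand-ins, not HBase's actual API):

```java
// Sketch of the runaway selection logic described in the log excerpts above.
// StoreFileInfo is a stand-in class, not the real HBase type.
class StoreFileInfo {
    final long maxTimeStamp; // -1 when the file holds no cells
    final long entryCount;   // "keycount" from the log lines

    StoreFileInfo(long maxTimeStamp, long entryCount) {
        this.maxTimeStamp = maxTimeStamp;
        this.entryCount = entryCount;
    }
}

public class ExpiredFileSelection {
    // Original check: every cell is older than the cutoff => whole file
    // expired. An empty file (maxTimeStamp == -1) always passes, so
    // compacting it into another empty file re-triggers selection forever.
    static boolean isExpiredBuggy(StoreFileInfo f, long expiredCutoff) {
        return f.maxTimeStamp < expiredCutoff;
    }

    // Guarded version: an already-empty file gains nothing from another
    // compaction; it should be deleted directly (or skipped), not re-selected.
    static boolean isExpiredFixed(StoreFileInfo f, long expiredCutoff) {
        return f.entryCount > 0 && f.maxTimeStamp < expiredCutoff;
    }
}
```

With the guard, the empty file produced in step (2) no longer satisfies the check in step (3), which breaks the cycle.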
[jira] [Updated] (HBASE-10371) Compaction creates empty hfile, then selects this file for compaction and creates empty hfile and over again
[ https://issues.apache.org/jira/browse/HBASE-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] binlijin updated HBASE-10371: - Attachment: HBASE-10371-94-2.patch
[jira] [Commented] (HBASE-10371) Compaction creates empty hfile, then selects this file for compaction and creates empty hfile and over again
[ https://issues.apache.org/jira/browse/HBASE-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876101#comment-13876101 ] binlijin commented on HBASE-10371: -- Added patches for the 0.96 and 0.94 branches.
[jira] [Created] (HBASE-10379) Turn the msg Request is a replay (34) - PROCESS_TGS from logging level ERROR to WARN
takeshi.miao created HBASE-10379: Summary: Turn the msg Request is a replay (34) - PROCESS_TGS from logging level ERROR to WARN Key: HBASE-10379 URL: https://issues.apache.org/jira/browse/HBASE-10379 Project: HBase Issue Type: Improvement Affects Versions: 0.94.16 Reporter: takeshi.miao Assignee: takeshi.miao Priority: Minor Hi All, Recently we got the error msg Request is a replay (34) - PROCESS_TGS while we are using the HBase client API to put data into HBase-0.94.16 with krb5-1.6.1 enabled. The related msg as follows... {code} [2014-01-15 09:40:38,452][hbase-tablepool-1-thread-3][ERROR][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1124)): PriviledgedActionException as:takeshi_miao@LAB cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Request is a replay (34) - PROCESS_TGS)] [2014-01-15 09:40:38,453][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation.logPriviledgedAction(UserGroupInformation.java:1143)): PriviledgedAction as:takeshi_miao@LAB from:sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [2014-01-15 09:40:38,453][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.ipc.SecureClient](org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:213)): Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Request is a replay (34) - PROCESS_TGS)] [2014-01-15 09:40:38,454][hbase-tablepool-1-thread-3][INFO ][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation.reloginFromTicketCache(UserGroupInformation.java:657)): Initiating logout for takeshi_miao@LAB [2014-01-15 
09:40:38,454][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule.logout(UserGroupInformation.java:154)): hadoop logout [2014-01-15 09:40:38,454][hbase-tablepool-1-thread-3][INFO ][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation.reloginFromTicketCache(UserGroupInformation.java:667)): Initiating re-login for takeshi_miao@LAB [2014-01-15 09:40:38,455][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule.login(UserGroupInformation.java:146)): hadoop login [2014-01-15 09:40:38,456][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule.commit(UserGroupInformation.java:95)): hadoop login commit [2014-01-15 09:40:38,456][hbase-tablepool-1-thread-3][DEBUG][org.apache.hadoop.security.UserGroupInformation](org.apache.hadoop.security.UserGroupInformation$HadoopLoginModule.commit(UserGroupInformation.java:100)): using existing subject:[takeshi_miao@LAB, UnixPrincipal: takeshi_miao, UnixNumericUserPrincipal: 501, UnixNumericGroupPrincipal [Primary Group]: 501, UnixNumericGroupPrincipal [Supplementary Group]: 502, takeshi_miao@LAB] {code} At first we worried about data loss when we found Request is a replay (34) - PROCESS_TGS (especially at the ERROR level) in the log, but after studying the code, this is basically *NOT* a data loss issue: the HBase client API retries 5 times internally (o.a.h.hbase.ipc.SecureClient, L#296, within one thread) and 10 more times externally (o.a.h.hbase.client.HConnectionManager, L#1661, for all failed threads), and the HTable API throws an IOException to client code if any thread still fails after these retries.
From an HBase user's viewpoint like ours, it would be better to change the logging level from 'ERROR' to 'WARN', since the 'ERROR' level confused us for a while... But this change may need to touch both HBase and Hadoop code, so I am wondering what the community thinks about this small thing, which may nevertheless be important to pure HBase users. mailing list http://mail-archives.apache.org/mod_mbox/hbase-user/201401.mbox/%3CCADcMMgGiEyho0HGwgbfOUS78ymDpCo5Q0PStWAPUk40W%3DPfcFQ%40mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.1.5#6160)
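The layered retries described above can be sketched as follows. This is a simplified model of the behavior, not the actual SecureClient/HConnectionManager code: a transient "replay" failure is absorbed as long as any of the 10 outer attempts succeeds within its 5 inner tries, and only full exhaustion surfaces to the caller as an IOException.

```java
import java.io.IOException;
import java.util.function.Supplier;

public class RetryModel {
    // Simplified model of the layered retries described in the report:
    // 5 inner attempts (SecureClient) wrapped by 10 outer attempts
    // (HConnectionManager). The numbers mirror the comment, not real config.
    static final int INNER_RETRIES = 5;
    static final int OUTER_RETRIES = 10;

    static void call(Supplier<Boolean> attempt) throws IOException {
        for (int outer = 0; outer < OUTER_RETRIES; outer++) {
            for (int inner = 0; inner < INNER_RETRIES; inner++) {
                if (attempt.get()) {
                    return; // one success absorbs earlier replay failures
                }
            }
        }
        // Only after all 50 attempts fail does the caller see an error, which
        // is why a single "Request is a replay" log line rarely means data loss.
        throw new IOException("all retries exhausted");
    }
}
```

This structure is why the ERROR-level log line is misleading: the failure it reports is usually transparently recovered a few attempts later.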
[jira] [Commented] (HBASE-10379) Turn the msg Request is a replay (34) - PROCESS_TGS from logging level ERROR to WARN
[ https://issues.apache.org/jira/browse/HBASE-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876105#comment-13876105 ] takeshi.miao commented on HBASE-10379: -- Will apply a patch later.
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876114#comment-13876114 ] Ted Yu commented on HBASE-10323: [~apurtell]: Do you want this in 0.98? Auto detect data block encoding in HFileOutputFormat Key: HBASE-10323 URL: https://issues.apache.org/jira/browse/HBASE-10323 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Ishan Chhabra Assignee: Ishan Chhabra Fix For: 0.99.0 Attachments: HBASE_10323-0.94.15-v1.patch, HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch Currently, one has to specify the data block encoding of the table explicitly using the config parameter hbase.mapreduce.hfileoutputformat.datablock.encoding when doing a bulk load. This option is easily missed and not documented, and it also works differently from compression, block size and bloom filter type, which are auto-detected. The solution is to add support for auto-detecting the data block encoding, similar to the other parameters. The current patch does the following: 1. Automatically detects the data block encoding in HFileOutputFormat. 2. Keeps the legacy option of manually specifying the data block encoding as a way to override auto-detection. 3. Moves string conf parsing to the start of the program so that it fails fast during startup instead of failing during record writes. It also makes the internals of the program type safe. 4. Adds missing doc strings and unit tests for the code serializing and deserializing config parameters for bloom filter type, block size and data block encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
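Point 3 of the patch description — parsing the serialized per-family settings eagerly so bad values fail at startup — can be sketched like this. The "family=ENCODING&amp;..." wire format and the class name are illustrative, not HBase's actual serialization:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingConfSerde {
    // Sketch of serializing per-family data block encodings into a single
    // configuration string and parsing it back eagerly at job setup, so a
    // malformed value fails fast at startup rather than during record writes.
    static String serialize(Map<String, String> familyToEncoding) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : familyToEncoding.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    static Map<String, String> parse(String conf) {
        Map<String, String> out = new LinkedHashMap<>();
        if (conf == null || conf.isEmpty()) return out;
        for (String pair : conf.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2 || kv[0].isEmpty() || kv[1].isEmpty()) {
                // Fail fast: surface a bad config entry at startup.
                throw new IllegalArgumentException("bad encoding entry: " + pair);
            }
            out.put(kv[0], kv[1]);
        }
        return out;
    }
}
```

Doing the parse once at setup also lets the rest of the job work with a typed map instead of re-parsing strings on every write.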
[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads
[ https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876141#comment-13876141 ] ramkrishna.s.vasudevan commented on HBASE-10322: bq.CellCodecV2 is used as default. But I think there should not be any default value for this codec because of the below reason. (Default value should be value of the old config with thats default as KVCodec) Suppose the src cluster user is upgrading to 98 (or later versions in future) But the peer is still in 96 . I agree. The patch's intention was to show how we could do those config settings. +1 [~lhofhansl] bq.if I can see a certain KV, should I be able to see its visibility tags? What you say is right in the sense that the admin sets up proper access control, say for User A, and User A would see only those KVs whose visibility labels A is associated with. But sometimes the labels can be combinations of visibility labels separated by &, | and !. In that case User A, on reading the visibility labels, would come to know about the existence of other labels. Added to that, the whole association of labels and users is done by an admin with superuser privileges. So allowing all users to view the labels in the KV would break this, because by reading the KV, User A would learn what combination of labels to pass in order to access KVs he is not authorized to see. Strip tags from KV while sending back to client on reads Key: HBASE-10322 URL: https://issues.apache.org/jira/browse/HBASE-10322 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Priority: Blocker Fix For: 0.98.0, 0.99.0 Attachments: HBASE-10322.patch, HBASE-10322_V2.patch, HBASE-10322_codec.patch Right now we have some inconsistency wrt sending back tags on read. We do this in a scan when using the Java client (Codec-based cell block encoding),
but during a Get operation, or when a pure PB-based Scan comes in, we are not sending back the tags. So we have to do one of the fixes below: 1. Send back tags in the missing cases also. But sending back the visibility expression / cell ACL is not correct. 2. Don't send back tags in any case. This will be a problem when a tool like ExportTool uses the scan to export table data: we would miss exporting the cell visibility/ACL. 3. Send back tags based on some condition. It has to be on a per-scan basis. The simplest way is to pass some kind of attribute in Scan which says whether to send back tags or not, but trusting what the scan specifies might not be correct IMO. Then comes the way of checking the user who is doing the scan: only send back tags when an HBase superuser is doing the scan. So when a case like the Export Tool's comes up, the execution should happen as a superuser. So IMO we should go with #3. Patch coming soon. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
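Option #3 above amounts to a gate applied when cells are encoded for the wire. A minimal sketch, with the superuser lookup and method names purely illustrative (the real check would go through HBase's User/ACL machinery):

```java
import java.util.Arrays;
import java.util.Set;

public class TagShipping {
    private final Set<String> superUsers;

    TagShipping(Set<String> superUsers) {
        this.superUsers = superUsers;
    }

    // Option #3: only a superuser (e.g. ExportTool run as one) gets tags
    // back, so ordinary users never see visibility/ACL expressions.
    boolean includeTagsFor(String requestUser) {
        return superUsers.contains(requestUser);
    }

    // Strip the tag portion before shipping when the caller may not see it.
    // A real codec works on serialized KeyValues; byte arrays stand in here.
    static byte[] cellForWire(byte[] cellNoTags, byte[] tags, boolean includeTags) {
        if (!includeTags || tags.length == 0) {
            return cellNoTags;
        }
        byte[] out = Arrays.copyOf(cellNoTags, cellNoTags.length + tags.length);
        System.arraycopy(tags, 0, out, cellNoTags.length, tags.length);
        return out;
    }
}
```

The design choice being argued for is that the gate keys off the authenticated user rather than a Scan attribute, since a client-supplied attribute could simply be forged.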
[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads
[ https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876146#comment-13876146 ] stack commented on HBASE-10322: --- bq. We need any way an RpcClient to talk with peer right? Yes, but I thought that if the Service is the explicit Replication Service, then you could identify the context as replication and slot in replication-suited codecs that preserve tags on setup of the replication connection -- if asked for (a codec for replication that is other than what we use for 'normal' client/server seems like something we'd want to have anyways). If we break out a replication Service, it will break being able to replicate from a 0.98 to a 0.96 whether or not you are forwarding tags. That ain't good. If we leave the service as is, it sounds like we can have a 0.98 replicate to a 0.96 when no tags are in the mix. It is only when you enable tags that you will have to update the sink cluster so it recognizes the tag-bearing codec. Of your 1. and 2., 1. is preferable. Pity it has to be a config in the hbase-*.xml. Can it be a replication config (I suppose this is what your 2. does in part)? Can ship 0.98.0RC as soon as we dump in a codec that can do tags (what happens when you pass a KV with tags to the default KVCodec? It just dumps them?) I like how [~lhofhansl] is telling it. Does that help lads?
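If option 1 were taken, enabling tag-preserving replication would come down to a codec setting on the source cluster's region servers, roughly as below. The codec class name follows the KVCodecWithTag proposal in this thread and is hypothetical here, including its package; and as the discussion notes, the peer cluster must carry the same class before this is enabled, or replication to an older (e.g. 0.96) peer breaks:

```xml
<!-- hbase-site.xml sketch on the source cluster's region servers.
     KVCodecWithTag is the codec class name proposed in this thread
     (hypothetical, including its package); the peer cluster must be
     upgraded to carry the same class before enabling this. -->
<property>
  <name>hbase.client.rpc.codec</name>
  <value>org.apache.hadoop.hbase.codec.KVCodecWithTag</value>
</property>
```

This is also why the thread leans toward no new default: leaving the default at the tag-less KVCodec keeps rolling upgrades against older peers working.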
[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads
[ https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876147#comment-13876147 ] stack commented on HBASE-10322: --- Just to say that the 'codec' problem would be true of any sink cluster no matter what the version; you couldn't do some fancy compression codec unless you first updated the sink cluster so it recognized it when the source cluster set up the connection.
[jira] [Updated] (HBASE-10373) Add more details info for ACL group in HBase book
[ https://issues.apache.org/jira/browse/HBASE-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] takeshi.miao updated HBASE-10373: - Fix Version/s: 0.99.0 Affects Version/s: 0.99.0 Status: Patch Available (was: Open) Add more details info for ACL group in HBase book - Key: HBASE-10373 URL: https://issues.apache.org/jira/browse/HBASE-10373 Project: HBase Issue Type: Improvement Components: documentation, security Affects Versions: 0.99.0 Reporter: takeshi.miao Assignee: takeshi.miao Priority: Minor Fix For: 0.99.0 Attachments: HBASE-10373-trunk-v01.patch The current ACL section, '8.3. Access Control', in the HBase book does not instruct users how to grant ACLs to a group. It would be good to make this clear, since group grants are a great and important feature that lets users manage their ACLs more easily. mailing list http://mail-archives.apache.org/mod_mbox/hbase-user/201401.mbox/%3CCA+RK=_b+umfzwiaeud9fsqjk8rs8l-vuo6arvos8k5sutog...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10373) Add more details info for ACL group in HBase book
[ https://issues.apache.org/jira/browse/HBASE-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] takeshi.miao updated HBASE-10373: - Attachment: HBASE-10373-trunk-v01.patch patch-v01 submitted; added some descriptions in section _'8.4.5. Shell Enhancements for Access Control'_ demonstrating group ACLs as well.
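For reference, the kind of group-grant usage such a doc section would demonstrate: in the HBase shell, a group is addressed by prefixing its name with '@' (the group and table names below are made up for illustration):

```
hbase> grant '@admins', 'RWXCA', 'mytable'    # grant full rights to the 'admins' group
hbase> user_permission 'mytable'              # verify the grant took effect
```

Every member of the group then inherits the grant, which is what makes group ACLs easier to manage than per-user grants.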
[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads
[ https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876159#comment-13876159 ] ramkrishna.s.vasudevan commented on HBASE-10322: bq.what happens when you pass a KV with tags to default KVCodec? It just dumps them No, KVCodec by default will not dump tags, but when it works with the WALCellCodec it would; so we would control it with a flag. The reason is that KVCodec writes the entire length of the byte array.
[jira] [Commented] (HBASE-9004) Fix Documentation around Minor compaction and ttl
[ https://issues.apache.org/jira/browse/HBASE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876165#comment-13876165 ] Dan Feng commented on HBASE-9004: - I'm a little bit confused. Can somebody clarify whether minor compaction will delete expired cells or not? Fix Documentation around Minor compaction and ttl - Key: HBASE-9004 URL: https://issues.apache.org/jira/browse/HBASE-9004 Project: HBase Issue Type: Task Reporter: Elliott Clark Minor compactions should be able to delete KeyValues outside of ttl. The docs currently suggest otherwise. We should bring them in line. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-9004) Fix Documentation around Minor compaction and ttl
[ https://issues.apache.org/jira/browse/HBASE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876169#comment-13876169 ] Feng Honghua commented on HBASE-9004:
Minor compaction *will* delete expired cells, but only those within the input hfiles selected by that minor compaction. Expired cells in hfiles not selected by the compaction still exist in those hfiles, but they can't be read out by a read/scan, since ScanQueryMatcher guarantees to filter them out by the TTL rule when processing reads/scans. These expired cells will eventually be deleted in a subsequent compaction, once their hosting hfiles are selected by that compaction.
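The two halves of the rule described above (reads filter every cell by TTL; compaction physically drops expired cells only from the files it rewrites) can be sketched with a toy model. This is a minimal illustration of the concept, not HBase's actual ScanQueryMatcher or compaction code; the class and method names are made up, and cells are modeled as bare timestamps.

```java
import java.util.*;

// Toy model: hfiles are named lists of cell timestamps. "Compacting" the
// selected files rewrites them without expired cells; unselected files keep
// their expired cells on disk, but reads never surface them.
public class TtlCompactionSketch {
    static boolean isExpired(long ts, long ttlMs, long now) {
        return now - ts > ttlMs;
    }

    // Minor compaction: only the selected files are rewritten without
    // expired cells; all other files are carried over untouched.
    static Map<String, List<Long>> compact(Map<String, List<Long>> files,
                                           Set<String> selected, long ttlMs, long now) {
        Map<String, List<Long>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> e : files.entrySet()) {
            if (!selected.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
                continue;
            }
            List<Long> kept = new ArrayList<>();
            for (long ts : e.getValue()) {
                if (!isExpired(ts, ttlMs, now)) kept.add(ts);
            }
            out.put(e.getKey(), kept);
        }
        return out;
    }

    // Read path: the TTL rule is applied to every cell regardless of which
    // file holds it, so lingering expired cells are invisible to clients.
    static long visibleCount(Map<String, List<Long>> files, long ttlMs, long now) {
        long n = 0;
        for (List<Long> cells : files.values()) {
            for (long ts : cells) {
                if (!isExpired(ts, ttlMs, now)) n++;
            }
        }
        return n;
    }
}
```

The invariant the comment relies on: `visibleCount` returns the same answer before and after any `compact`, because reads filter expired cells whether or not compaction has removed them yet.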
[jira] [Commented] (HBASE-10378) Divide HLog interface into User and Implementor specific interfaces
[ https://issues.apache.org/jira/browse/HBASE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876174#comment-13876174 ] ramkrishna.s.vasudevan commented on HBASE-10378:
[~himan...@cloudera.com] Had a glance at the patch. The WAL and WALService split looks good. I have some basic questions: in the case of rollWriter and replication (including starting the syncer and writer threads), should we have an implementation of the WALService where, based on the number of HLogs, that many syncer and writer threads are started, along with the replication services for them? Currently the HRS just instantiates one HLog and starts them. What do you say?
I was having an idea of defining an interface called WALGrouper; every region server would use a type of this WALGrouper, and the grouper would know how many HLog instances it is creating, and for every instance those syncer and writer threads would be started.
And the API
{code}
public WALService getWAL() {
{code}
How will this make sense when there is more than one HLog for that RS? I know that in your current implementation there are only 2 HLogs, and one of them is going to be active, but what if my multi-WAL impl is not that way and I may have more than one active HLog? I can discuss with you offline too on some of the concerns and questions that I had while doing HBASE-8610.
Divide HLog interface into User and Implementor specific interfaces
Key: HBASE-10378
URL: https://issues.apache.org/jira/browse/HBASE-10378
Project: HBase
Issue Type: Sub-task
Components: wal
Reporter: Himanshu Vashishtha
Attachments: 10378-1.patch
HBASE-5937 introduces the HLog interface as a first step to support multiple WAL implementations.
This interface is a good start, but it has some limitations/drawbacks in its current state, such as:
1) There is no clear distinction between User and Implementor APIs; it provides APIs both for WAL users (append, sync, etc.) and for WAL implementors (Reader/Writer interfaces, etc.). There are APIs which are very much implementation specific (getFileNum, etc.), and a user such as a RegionServer shouldn't know about them.
2) There are about 14 methods in FSHLog which are not present in the HLog interface but are used in several places in the unit test code. These tests typecast HLog to FSHLog, which makes it very difficult to test multiple WAL implementations without doing some ugly checks.
I'd like to propose some changes to the HLog interface that would ease the multi-WAL story:
1) Have two interfaces, WAL and WALService. WAL provides APIs for implementors. WALService provides APIs for users (such as a RegionServer).
2) A skeleton implementation of the above two interfaces as the base class for other WAL implementations (AbstractWAL). It provides the required fields for all subclasses (fs, conf, log dir, etc.). Make a minimal set of test-only methods and add this set in AbstractWAL.
3) HLogFactory returns a WALService reference when creating a WAL instance; if a user needs to access impl-specific APIs (there are unit tests which get a WAL from an HRegionServer and then call impl-specific APIs), use AbstractWAL type casting.
4) Make TestHLog abstract and let all implementors provide their respective test class which extends TestHLog (TestFSHLog, for example).
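The user/implementor split proposed in the ticket can be sketched with a toy in-memory WAL. The interface names follow the proposal, but the method sets and the `MemWal` implementation below are illustrative guesses, not the committed API: the point is only that the factory hands users the narrow `WALService` type, while implementor-only operations live on `WAL`.

```java
import java.util.*;

public class WalSplitSketch {
    // User-facing surface: what a RegionServer would call.
    interface WALService {
        void append(String entry);
        void sync();
    }

    // Implementor-facing surface: lifecycle details users shouldn't see.
    interface WAL extends WALService {
        String rollWriter();
        void close();
    }

    // Toy in-memory implementation (hypothetical): append buffers an edit,
    // sync makes buffered edits durable, rollWriter starts a new "file".
    static class MemWal implements WAL {
        final List<String> unsynced = new ArrayList<>();
        final List<String> durable = new ArrayList<>();
        int filenum = 0;
        public void append(String entry) { unsynced.add(entry); }
        public void sync() { durable.addAll(unsynced); unsynced.clear(); }
        public String rollWriter() { sync(); return "wal." + (++filenum); }
        public void close() { sync(); }
    }

    // The factory returns only the user type; tests needing impl APIs
    // would downcast, mirroring the AbstractWAL-casting idea in the ticket.
    static WALService createWal() { return new MemWal(); }
}
```

Ram's multi-WAL question translates here to: should `createWal()` (the sketch's stand-in for `getWAL()`) return one service fronting many `MemWal` instances, rather than a single instance?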
[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads
[ https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876186#comment-13876186 ] Anoop Sam John commented on HBASE-10322:
bq. I.e. if I can see a certain KV, should I be able to see its visibility tags?
The answer to this is no. Ram explained the reason.
bq. Maybe we can even have a tag that controls the visibility of the tags?
As for who can see the tags along with the KVs, we thought to let it be decided by the type of user: only an HBase super user will be able to get tags along with KVs. So this is the overall idea. Yes, tags control what a client can see, but we would like to prevent normal clients from seeing the tags. The impl of this becomes very tricky, as we use the same Codec to write from client to server and back. We were giving options for the user to add tags in Mutation KVs; as of now we are thinking of removing these APIs. Over RPC, tags will not go (i.e. client to server or the reverse). To write to the WAL we use WALCellCodec, and that will be able to write and read tags. The last problem that came up was replication, for which we propose 2 possible solutions. [~stack], can we do this without big changes like a ReplicationServer? I think we can try addressing those in another issue.
bq. Of your 1., and 2., 1. is preferable. Pity it has to be a config. in the hbase-*xml. Can it be a replication config (I suppose this is what your 2. does in part)?
This config (hbase.client.rpc.codec) is used by the RpcClient, and the RpcClient is used by the ReplicationSource, so yes, it already refers to the config. As long as users don't deal with tags (existing users migrated to 98), no changes are required at all. When they use tags, they have to change this config on the HRS side to some codec that handles tags; we will ship some new codecs which can handle tags along with this issue. Yes, option 2 adds a new config and, as shown in the patch, it just writes the value of this new config as the old config's value.
(Decorating the conf object used by ReplicationSource.) If you feel option 1 is fine, no extra code will be needed on the replication side.
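The "decorate the conf" step of option 2 can be sketched with plain `java.util.Properties` standing in for Hadoop's `Configuration`. The replication-specific key name and the default codec class name below mirror the discussion but are assumptions, not the committed config; the essential behavior is that the default stays the old KV codec so an un-upgraded peer can still decode the stream.

```java
import java.util.Properties;

// Sketch of decorating the conf handed to ReplicationSource's RpcClient:
// copy the replication-specific codec (if any) over the client codec key,
// falling back to the existing client codec, then the old default.
public class ReplicationCodecConf {
    static final String CLIENT_CODEC = "hbase.client.rpc.codec";
    // Hypothetical new key from option 2 of the discussion.
    static final String REPLICATION_CODEC = "hbase.replication.rpc.codec";
    // Default deliberately stays the tag-less codec for rolling upgrades.
    static final String DEFAULT_CODEC = "org.apache.hadoop.hbase.codec.KeyValueCodec";

    static Properties decorate(Properties conf) {
        Properties out = new Properties();
        out.putAll(conf);
        String codec = conf.getProperty(REPLICATION_CODEC,
                conf.getProperty(CLIENT_CODEC, DEFAULT_CODEC));
        out.setProperty(CLIENT_CODEC, codec);
        return out;
    }
}
```

With no replication key set, the decorated conf is indistinguishable from option 1, which is exactly the rolling-upgrade property the comment argues for.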
[jira] [Commented] (HBASE-10373) Add more details info for ACL group in HBase book
[ https://issues.apache.org/jira/browse/HBASE-10373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876191#comment-13876191 ] Hadoop QA commented on HBASE-10373:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12623897/HBASE-10373-trunk-v01.patch against trunk revision . ATTACHMENT ID: 12623897
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+grant <user|@group> <permissions> [ <table> [ <column family> [ <column qualifier> ] ] ]
+<user|@group> is a user or a group (groups start with the character '@'); groups are created and manipulated via the Hadoop group mapping service.
+revoke <user|@group> [ <table> [ <column family> [ <column qualifier> ] ] ]
{color:red}-1 site{color}. The patch appears to cause the mvn site goal to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8471//console
This message is automatically generated.
Add more details info for ACL group in HBase book
Key: HBASE-10373
URL: https://issues.apache.org/jira/browse/HBASE-10373
Project: HBase
Issue Type: Improvement
Components: documentation, security
Affects Versions: 0.99.0
Reporter: takeshi.miao
Assignee: takeshi.miao
Priority: Minor
Fix For: 0.99.0
Attachments: HBASE-10373-trunk-v01.patch
The current ACL section '8.3. Access Control' in the HBase book does not instruct users on how to grant ACLs for a group. I think it is good to make this clear, since this is a great and important feature for users to manage their ACLs more easily.
Mailing list: http://mail-archives.apache.org/mod_mbox/hbase-user/201401.mbox/%3CCA+RK=_b+umfzwiaeud9fsqjk8rs8l-vuo6arvos8k5sutog...@mail.gmail.com%3E
[jira] [Commented] (HBASE-10322) Strip tags from KV while sending back to client on reads
[ https://issues.apache.org/jira/browse/HBASE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876212#comment-13876212 ] Lars Hofhansl commented on HBASE-10322:
Thanks for explaining, Ram and Anoop.
bq. So tags are becoming a server only thing as of now
Agree. We can tackle Export/CopyTable/etc. later, although I figure these would eventually have to be addressed if folks use them for backup.
bq. Only HBase super user will be able to get tags also along with KVs.
This seems to contradict the earlier point.
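The super-user rule from option 3 of the issue description reduces to a one-line decision on the read response path. The sketch below is a hypothetical illustration of that rule only; the class, the `Cell` model, and the hard-coded super-user set are made up, not HBase's actual code path or user resolution.

```java
import java.util.*;

// Hypothetical sketch of option 3: on reads, keep tags only when the
// requesting user is an HBase super user; everyone else gets the cell
// with its tags stripped.
public class TagStripSketch {
    // Illustrative stand-in for HBase's configured super-user list.
    static final Set<String> SUPER_USERS =
            new HashSet<>(Collections.singletonList("hbase"));

    static class Cell {
        final String value;
        final List<String> tags;
        Cell(String value, List<String> tags) {
            this.value = value;
            this.tags = tags;
        }
    }

    // The per-scan decision: same value either way, tags gated by user.
    static Cell forClient(Cell cell, String user) {
        return SUPER_USERS.contains(user)
                ? cell
                : new Cell(cell.value, Collections.<String>emptyList());
    }
}
```

This is also why an Export-style tool must run as a super user under option 3: only then does `forClient` leave the visibility/ACL tags in place to be written out.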