[jira] [Created] (HBASE-17241) Avoid compacting already compacted mob files with _del files
huaxiang sun created HBASE-17241: Summary: Avoid compacting already compacted mob files with _del files Key: HBASE-17241 URL: https://issues.apache.org/jira/browse/HBASE-17241 Project: HBase Issue Type: Improvement Components: mob Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Today, if there is only one file in the partition and there are no _del files, the file is skipped. With a _del file, the current logic is to compact the already-compacted file with the _del file. Let's say there is one mob file regionA20161101***, which was already compacted. On 12/1/2016 there is a _del file regionB20161201**_del; mob compaction kicks in, regionA20161101*** is smaller than the threshold, and it is picked for compaction. Since there is a _del file, regionA20161101 and regionB20161201***_del are compacted into regionA20161101**_1. After that, regionB20161201**_del cannot be deleted since this is not an allFile compaction. In the next mob compaction, regionA20161101**_1 and regionB20161201**_del will be picked up again and compacted into regionA20161101***_2. So in this case it causes unnecessary extra IO. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
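The proposed skip rule can be sketched as below. This is a hypothetical illustration, not the actual PartitionedMobCompactor code: the "_&lt;n&gt;" generation-suffix check is an assumption based on the file names in the report, and `shouldCompactPartition` is an invented helper.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the selection rule proposed above (illustrative names, not HBase code).
public class MobSelectionDemo {

    // Assumption: an already-compacted mob file carries a generation suffix
    // such as "_1" or "_2", as in the names quoted in the report.
    static boolean alreadyCompacted(String mobFileName) {
        return mobFileName.matches(".*_\\d+$");
    }

    // Current rule: a single file with no _del files is skipped.
    // Proposed rule: a single, already-compacted file is skipped even when
    // _del files exist, since recompacting it with the same _del file only
    // repeats the work and generates unnecessary IO.
    static boolean shouldCompactPartition(List<String> files, boolean hasDelFiles) {
        if (files.size() == 1 && !hasDelFiles) {
            return false;
        }
        if (files.size() == 1 && alreadyCompacted(files.get(0))) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> one = Arrays.asList("regionA20161101abcd_1");
        // With the proposed rule this partition is skipped even with _del files.
        System.out.println(shouldCompactPartition(one, true)); // false
    }
}
```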
[jira] [Resolved] (HBASE-17196) deleted mob cell can come back after major compaction and minor mob compaction
[ https://issues.apache.org/jira/browse/HBASE-17196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huaxiang sun resolved HBASE-17196. -- Resolution: Invalid As explained in the comments. > deleted mob cell can come back after major compaction and minor mob compaction > -- > > Key: HBASE-17196 > URL: https://issues.apache.org/jira/browse/HBASE-17196 > Project: HBase > Issue Type: Bug > Components: mob >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > > In the following case, the deleted mob cell can come back. > {code} > 1) hbase(main):001:0> create 't1', {NAME => 'f1', IS_MOB => true, > MOB_THRESHOLD => 10} > 2) hbase(main):002:0> put 't1', 'r1', 'f1:q1', '' > 3) hbase(main):003:0> flush 't1' > 4) hbase(main):004:0> deleteall 't1', 'r1' > 5) hbase(main):005:0> scan 't1' > ROW COLUMN+CELL > > > 0 row(s) > 6) hbase(main):006:0> flush 't1' > 7) hbase(main):007:0> major_compact 't1' > After that, go to mobdir and remove the _del file; this simulates the case > that mob minor compaction does not pick up the _del file. At this point, the cell in > the normal region is gone after the major compaction. > 8) hbase(main):008:0> put 't1', 'r2', 'f1:q1', '' > > > 9) hbase(main):009:0> flush 't1' > 10) hbase(main):010:0> scan 't1' > ROW COLUMN+CELL > > > r2 column=f1:q1, > timestamp=1480451201393, value= > > 1 row(s) > 11) hbase(main):011:0> compact 't1', 'f1', 'MOB' > 12) hbase(main):012:0> scan 't1' > ROW COLUMN+CELL > > > r1 column=f1:q1, > timestamp=1480450987725, value= > > r2 column=f1:q1, > timestamp=1480451201393, value= > > 2 row(s) > The deleted "r1" comes back. The reason is that mob minor compaction does not > include _del files, so it generates references for the deleted cell. > {code}
[jira] [Resolved] (HBASE-17197) hfile does not work in 2.0
[ https://issues.apache.org/jira/browse/HBASE-17197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huaxiang sun resolved HBASE-17197. -- Resolution: Invalid The -f flag before the file path was missing; not sure why I forgot it. > hfile does not work in 2.0 > -- > > Key: HBASE-17197 > URL: https://issues.apache.org/jira/browse/HBASE-17197 > Project: HBase > Issue Type: Bug > Components: HFile >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun > > I tried to use the hfile tool in the master branch; it does not print out kv pairs or meta > as it is supposed to. > {code} > hsun-MBP:hbase-2.0.0-SNAPSHOT hsun$ hbase hfile > file:///Users/hsun/work/local-hbase-cluster/data/data/default/t1/755b5d7a44148492b7138c79c5e4f39f/f1/ > 53e9f9bc328f468b87831221de3a0c74 bdc6e1c4eea246a99e989e02d554cb03 > bf9275ac418d4d458904d81137e82683 > hsun-MBP:hbase-2.0.0-SNAPSHOT hsun$ hbase hfile > file:///Users/hsun/work/local-hbase-cluster/data/data/default/t1/755b5d7a44148492b7138c79c5e4f39f/f1/bf9275ac418d4d458904d81137e82683 > -m > 2016-11-29 12:25:22,019 WARN [main] util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > hsun-MBP:hbase-2.0.0-SNAPSHOT hsun$ hbase hfile > file:///Users/hsun/work/local-hbase-cluster/data/data/default/t1/755b5d7a44148492b7138c79c5e4f39f/f1/bf9275ac418d4d458904d81137e82683 > -p > 2016-11-29 12:25:27,333 WARN [main] util.NativeCodeLoader: Unable to load > native-hadoop library for your platform... using builtin-java classes where > applicable > Scanned kv count -> 0 > {code}
[jira] [Created] (HBASE-17197) hfile does not work in 2.0
huaxiang sun created HBASE-17197: Summary: hfile does not work in 2.0 Key: HBASE-17197 URL: https://issues.apache.org/jira/browse/HBASE-17197 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun I tried to use the hfile tool in the master branch; it does not print out kv pairs or meta as it is supposed to. {code} hsun-MBP:hbase-2.0.0-SNAPSHOT hsun$ hbase hfile file:///Users/hsun/work/local-hbase-cluster/data/data/default/t1/755b5d7a44148492b7138c79c5e4f39f/f1/ 53e9f9bc328f468b87831221de3a0c74 bdc6e1c4eea246a99e989e02d554cb03 bf9275ac418d4d458904d81137e82683 hsun-MBP:hbase-2.0.0-SNAPSHOT hsun$ hbase hfile file:///Users/hsun/work/local-hbase-cluster/data/data/default/t1/755b5d7a44148492b7138c79c5e4f39f/f1/bf9275ac418d4d458904d81137e82683 -m 2016-11-29 12:25:22,019 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable hsun-MBP:hbase-2.0.0-SNAPSHOT hsun$ hbase hfile file:///Users/hsun/work/local-hbase-cluster/data/data/default/t1/755b5d7a44148492b7138c79c5e4f39f/f1/bf9275ac418d4d458904d81137e82683 -p 2016-11-29 12:25:27,333 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Scanned kv count -> 0 {code}
[jira] [Created] (HBASE-17196) deleted mob cell can come back after major compaction and minor mob compaction
huaxiang sun created HBASE-17196: Summary: deleted mob cell can come back after major compaction and minor mob compaction Key: HBASE-17196 URL: https://issues.apache.org/jira/browse/HBASE-17196 Project: HBase Issue Type: Bug Components: mob Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun In the following case, the deleted mob cell can come back. {code} 1) hbase(main):001:0> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 10} 2) hbase(main):002:0> put 't1', 'r1', 'f1:q1', '' 3) hbase(main):003:0> flush 't1' 4) hbase(main):004:0> deleteall 't1', 'r1' 5) hbase(main):005:0> scan 't1' ROW COLUMN+CELL 0 row(s) 6) hbase(main):006:0> flush 't1' 7) hbase(main):007:0> major_compact 't1' After that, go to mobdir and remove the _del file; this simulates the case that mob minor compaction does not pick up the _del file. At this point, the cell in the normal region is gone after the major compaction. 8) hbase(main):008:0> put 't1', 'r2', 'f1:q1', '' 9) hbase(main):009:0> flush 't1' 10) hbase(main):010:0> scan 't1' ROW COLUMN+CELL r2 column=f1:q1, timestamp=1480451201393, value= 1 row(s) 11) hbase(main):011:0> compact 't1', 'f1', 'MOB' 12) hbase(main):012:0> scan 't1' ROW COLUMN+CELL r1 column=f1:q1, timestamp=1480450987725, value= r2 column=f1:q1, timestamp=1480451201393, value= 2 row(s) The deleted "r1" comes back. The reason is that mob minor compaction does not include _del files, so it generates references for the deleted cell. {code}
[jira] [Created] (HBASE-17172) Optimize major mob compaction with _del files
huaxiang sun created HBASE-17172: Summary: Optimize major mob compaction with _del files Key: HBASE-17172 URL: https://issues.apache.org/jira/browse/HBASE-17172 Project: HBase Issue Type: Improvement Components: mob Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Today, when there is a _del file in mobdir, major mob compaction recompacts every mob file; this causes lots of IO and slows down major mob compaction (it may take months to finish). This needs to be improved. A few ideas are: 1) Do not compact all _del files into one; instead, compact them into groups keyed by startKey. Then use the firstKey/startKey of each mob file to decide whether a _del file needs to be included for that partition. 2) Based on the time range of a _del file, compactions of files newer than that time range do not need to include the _del file.
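The two ideas can be sketched like this. `needsDelFile` and `keyRangesOverlap` are illustrative helpers under the assumptions above, not the actual PartitionedMobCompactor API.

```java
// Sketch of the two filtering ideas (illustrative names, not HBase code).
public class DelFileFilterDemo {

    // Idea 2): a mob file written entirely after the newest delete marker in a
    // _del file cannot contain any cell that the _del file deletes, so the
    // _del file can be skipped for it.
    static boolean needsDelFile(long mobFileMinTs, long delFileMaxTs) {
        return mobFileMinTs <= delFileMaxTs;
    }

    // Idea 1): with _del files grouped by start key, include a _del group for
    // a mob file only when their key ranges overlap.
    static boolean keyRangesOverlap(String mobFirstKey, String mobLastKey,
                                    String delFirstKey, String delLastKey) {
        return mobFirstKey.compareTo(delLastKey) <= 0
            && delFirstKey.compareTo(mobLastKey) <= 0;
    }

    public static void main(String[] args) {
        // A _del file covering deletes up to t=100 is not needed by a mob file
        // written from t=200 on, but is needed by one starting at t=50.
        System.out.println(needsDelFile(200L, 100L)); // false
        System.out.println(needsDelFile(50L, 100L));  // true
        System.out.println(keyRangesOverlap("m", "z", "a", "c")); // false
    }
}
```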
[jira] [Created] (HBASE-17043) parallelize select() work in mob compaction
huaxiang sun created HBASE-17043: Summary: parallelize select() work in mob compaction Key: HBASE-17043 URL: https://issues.apache.org/jira/browse/HBASE-17043 Project: HBase Issue Type: Improvement Components: mob Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor Today in https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/compactions/PartitionedMobCompactor.java#L141, select() is single-threaded. Given a large number of files, it takes several seconds to finish the job. We will look at how this work can be divided up to speed up the processing.
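One way the work could be divided is to classify chunks of the file list in a thread pool. This is a simplified sketch where a size-threshold check stands in for the real per-file partitioning work that select() does; none of the names below are from the actual HBase code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: partition the file list into chunks and run the per-file check in
// parallel, then merge the per-chunk results in order.
public class ParallelSelectDemo {

    static List<Long> selectParallel(List<Long> fileSizes, long threshold,
                                     int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = Math.max(1, fileSizes.size() / threads);
            List<Future<List<Long>>> futures = new ArrayList<>();
            for (int i = 0; i < fileSizes.size(); i += chunk) {
                List<Long> slice = fileSizes.subList(i, Math.min(i + chunk, fileSizes.size()));
                futures.add(pool.submit(() -> {
                    List<Long> selected = new ArrayList<>();
                    for (long size : slice) {
                        if (size < threshold) selected.add(size); // per-file work
                    }
                    return selected;
                }));
            }
            List<Long> result = new ArrayList<>();
            for (Future<List<Long>> f : futures) result.addAll(f.get());
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Long> sizes = Arrays.asList(10L, 500L, 20L, 900L, 30L);
        System.out.println(selectParallel(sizes, 100L, 2)); // [10, 20, 30]
    }
}
```

Collecting futures in submission order keeps the output deterministic even though the chunks run concurrently.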
[jira] [Created] (HBASE-16981) Expand Mob Compaction Partition policy from daily to weekly, monthly and beyond
huaxiang sun created HBASE-16981: Summary: Expand Mob Compaction Partition policy from daily to weekly, monthly and beyond Key: HBASE-16981 URL: https://issues.apache.org/jira/browse/HBASE-16981 Project: HBase Issue Type: New Feature Components: mob Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Today the mob region holds all mob files for all regions. With the daily partition mob compaction policy, after major mob compaction there is still one file per region per day. Given there are 365 days in a year, that is at least 365 files per region. Since HDFS has a limit on the number of files under one folder, this is not going to scale if there are lots of regions. To reduce the mob file count, we want to introduce other partition policies such as weekly and monthly, which compact the mob files within one week or month into one file. This jira is created to track this effort.
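The policy idea amounts to deriving a coarser partition key from the file's date, so that all files in the same week or month land in one compaction partition. The key format and the `Policy` enum below are invented for illustration; they are not the HBase implementation.

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;

// Sketch: compute a partition key whose granularity matches the policy, so
// files sharing a key compact into one file (illustrative, not HBase code).
public class MobPartitionKeyDemo {
    enum Policy { DAILY, WEEKLY, MONTHLY }

    static String partitionKey(LocalDate date, Policy policy) {
        switch (policy) {
            case WEEKLY:
                // Use the ISO week-based year so early-January days group with
                // the week they belong to, not the calendar year.
                WeekFields wf = WeekFields.ISO;
                return date.get(wf.weekBasedYear()) + "w" + date.get(wf.weekOfWeekBasedYear());
            case MONTHLY:
                return date.getYear() + "m" + date.getMonthValue();
            default:
                return date.toString(); // one partition per day
        }
    }

    public static void main(String[] args) {
        LocalDate d1 = LocalDate.of(2016, 11, 1);
        LocalDate d2 = LocalDate.of(2016, 11, 30);
        // Same month: same partition under MONTHLY, different under DAILY.
        System.out.println(partitionKey(d1, Policy.MONTHLY).equals(partitionKey(d2, Policy.MONTHLY)));
    }
}
```

Note the week-based-year subtlety: under a weekly policy, the last days of December can share a partition with the first days of the next January, which any real policy would need to document.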
[jira] [Created] (HBASE-16908) investigate flakey TestQuotaThrottle
huaxiang sun created HBASE-16908: Summary: investigate flakey TestQuotaThrottle Key: HBASE-16908 URL: https://issues.apache.org/jira/browse/HBASE-16908 Project: HBase Issue Type: Bug Components: hbase Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor Find out the root cause of the TestQuotaThrottle failures. https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html
[jira] [Created] (HBASE-16775) Flakey test with TestExportSnapshot#testExportRetry and TestMobExportSnapshot#testExportRetry
huaxiang sun created HBASE-16775: Summary: Flakey test with TestExportSnapshot#testExportRetry and TestMobExportSnapshot#testExportRetry Key: HBASE-16775 URL: https://issues.apache.org/jira/browse/HBASE-16775 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: huaxiang sun The root cause is that conf.setInt("mapreduce.map.maxattempts", 10) is not picked up by the mapper job, so the number of retries is actually 0. Debugging to see why this is the case.
[jira] [Created] (HBASE-16767) Mob compaction needs to clean up files in /hbase/mobdir/.tmp and /hbase/mobdir/.tmp/.bulkload when running into IO exceptions
huaxiang sun created HBASE-16767: Summary: Mob compaction needs to clean up files in /hbase/mobdir/.tmp and /hbase/mobdir/.tmp/.bulkload when running into IO exceptions Key: HBASE-16767 URL: https://issues.apache.org/jira/browse/HBASE-16767 Project: HBase Issue Type: Bug Components: mob Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun For the following line, when it runs into an IOException, it does not clean up the files under /hbase/mobdir/.tmp and /hbase/mobdir/.tmp/.bulkload. This can happen when the creation of a new mob file or a new reference file under the .bulkload directory fails, or when moving mob files from .tmp to the mob region directory fails. https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/compactions/PartitionedMobCompactor.java#L433
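A cleanup along the lines the report asks for could look like the following. This is a generic java.nio sketch with the compaction step and the .tmp path abstracted away; it is not the actual PartitionedMobCompactor code, and the real fix would delete specific partial files rather than a whole directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch: on failure, remove partial compaction output before rethrowing.
public class TmpCleanupDemo {

    // Delete a directory tree, children first; tolerate files already gone.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) return;
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try { Files.deleteIfExists(p); } catch (IOException ignored) { }
            });
        }
    }

    // Run one compaction step; if it fails, clean up partial output and rethrow
    // so the failure is still visible to the caller.
    static void compactWithCleanup(Runnable compactionStep, Path tmpDir) throws IOException {
        try {
            compactionStep.run();
        } catch (RuntimeException e) {
            deleteRecursively(tmpDir); // don't leave partial files behind
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("mobtmp");
        Files.createFile(tmp.resolve("partial-mob-file"));
        try {
            compactWithCleanup(() -> { throw new RuntimeException("simulated failure"); }, tmp);
        } catch (RuntimeException expected) {
            // The partial output is gone even though the failure propagated.
            System.out.println(Files.exists(tmp)); // false
        }
    }
}
```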
[jira] [Created] (HBASE-16699) Overflows in AverageIntervalRateLimiter's refill() and getWaitInterval()
huaxiang sun created HBASE-16699: Summary: Overflows in AverageIntervalRateLimiter's refill() and getWaitInterval() Key: HBASE-16699 URL: https://issues.apache.org/jira/browse/HBASE-16699 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.4.0 Reporter: huaxiang sun Assignee: huaxiang sun Please see the following two lines. Once the arithmetic overflows, it causes wrong behavior. For unconfigured RateLimiters, we should have simpler logic to bypass the check. https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/AverageIntervalRateLimiter.java#L37 https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/AverageIntervalRateLimiter.java#L51
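The overflow is easy to reproduce with a sketch of the refill arithmetic. The formula below is a simplified stand-in for the real AverageIntervalRateLimiter code, with `limit = Long.MAX_VALUE` modeling an unconfigured limiter, as suggested in the report.

```java
// Sketch of the overflow (simplified, not the actual HBase code).
public class RateLimiterOverflowDemo {
    static final long INTERVAL_MS = 1000L;

    // Overflow-prone: limit * elapsedMs wraps around for huge limits.
    static long refillBuggy(long limit, long elapsedMs) {
        return limit * elapsedMs / INTERVAL_MS;
    }

    // Safer: bypass the arithmetic entirely for the unconfigured case.
    static long refillSafe(long limit, long elapsedMs) {
        if (limit == Long.MAX_VALUE) {
            return limit; // unconfigured limiter: effectively no throttling
        }
        return limit * elapsedMs / INTERVAL_MS;
    }

    public static void main(String[] args) {
        long limit = Long.MAX_VALUE;
        // Long.MAX_VALUE * 2 wraps to -2, so the computed refill is 0 permits:
        // the limiter that should be unlimited suddenly throttles everything.
        System.out.println(refillBuggy(limit, 2)); // 0
        System.out.println(refillSafe(limit, 2) == Long.MAX_VALUE); // true
    }
}
```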
[jira] [Resolved] (HBASE-16590) TestRestoreSnapshotFromClientWithRegionReplicas needs to use scan with TIMELINE consistency to countRows
[ https://issues.apache.org/jira/browse/HBASE-16590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huaxiang sun resolved HBASE-16590. -- Resolution: Invalid According to Matteo, the original one is valid. > TestRestoreSnapshotFromClientWithRegionReplicas needs to use scan with > TIMELINE consistency to countRows > > > Key: HBASE-16590 > URL: https://issues.apache.org/jira/browse/HBASE-16590 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0 >Reporter: huaxiang sun >Assignee: huaxiang sun >Priority: Minor > Attachments: HBASE-16590-master-v001.patch > > > TestRestoreSnapshotFromClientWithRegionReplicas uses Consistency.STRONG when > doing a scan for counting rows. This is not right, as it will not send out > scan requests to replicas. It caused an issue when the read replica logic was > corrected in HBASE-16345.
[jira] [Created] (HBASE-16590) TestRestoreSnapshotFromClientWithRegionReplicas needs to use scan with TIMELINE consistency to countRows
huaxiang sun created HBASE-16590: Summary: TestRestoreSnapshotFromClientWithRegionReplicas needs to use scan with TIMELINE consistency to countRows Key: HBASE-16590 URL: https://issues.apache.org/jira/browse/HBASE-16590 Project: HBase Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor TestRestoreSnapshotFromClientWithRegionReplicas uses Consistency.STRONG when doing a scan for counting rows. This is not right, as it will not send out scan requests to replicas. It caused an issue when the read replica logic was corrected in HBASE-16345.
[jira] [Created] (HBASE-16578) Mob data loss after mob compaction and normal compaction
huaxiang sun created HBASE-16578: Summary: Mob data loss after mob compaction and normal compaction Key: HBASE-16578 URL: https://issues.apache.org/jira/browse/HBASE-16578 Project: HBase Issue Type: Bug Components: mob Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun I have a unit test case which can expose the mob data loss issue. The root cause is the following line: https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/compactions/PartitionedMobCompactor.java#L625 It makes the old mob reference cell win during compaction.
[jira] [Created] (HBASE-16345) RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions
huaxiang sun created HBASE-16345: Summary: RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer Exceptions Key: HBASE-16345 URL: https://issues.apache.org/jira/browse/HBASE-16345 Project: HBase Issue Type: Bug Components: Client Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun We run into the following exception during read replica testing. {code} Caused by: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server not running at org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:924) at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1766) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31439) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:326) at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:168) at 
org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:100) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126) ... 4 more Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.regionserver.RegionServerStoppedException): org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server not running at org.apache.hadoop.hbase.regionserver.RSRpcServices.checkOpen(RSRpcServices.java:924) at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1766) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31439) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200) at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216) at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:31865) at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas$ReplicaRegionServerCallable.call(RpcRetryingCallerWithReadReplicas.java:162) ... 6 more {code} Checking the code, we need to catch a few exceptions from the primary region server and continue with replicas. https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L211 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
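The fix direction can be sketched as follows. The real RpcRetryingCallerWithReadReplicas issues timeline-consistent calls with more machinery, so this sequential fallback is only an illustration; the exception class and interface below are invented stand-ins for the HBase types.

```java
// Sketch: when the primary replica fails with a "this server cannot answer
// right now" exception, continue with the remaining replicas instead of
// surfacing the error (illustrative names, not the actual HBase client code).
public class ReplicaFallbackDemo {

    // Stand-in for the HBase exception seen in the report.
    static class RegionServerStoppedException extends Exception {}

    interface Call<T> { T call(int replicaId) throws Exception; }

    static <T> T callWithReplicaFallback(Call<T> call, int replicaCount) {
        Exception last = null;
        for (int id = 0; id < replicaCount; id++) {
            try {
                return call.call(id); // id 0 is the primary
            } catch (RegionServerStoppedException e) {
                last = e; // retriable: this server is down, try the next replica
            } catch (Exception e) {
                throw new RuntimeException(e); // non-retriable: fail fast
            }
        }
        throw new RuntimeException("no replica could serve the read", last);
    }

    public static void main(String[] args) {
        // Primary throws; replica 1 answers.
        String v = callWithReplicaFallback(id -> {
            if (id == 0) throw new RegionServerStoppedException();
            return "value-from-replica-" + id;
        }, 3);
        System.out.println(v); // value-from-replica-1
    }
}
```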
[jira] [Created] (HBASE-16301) Trigger flush without waiting when compaction is disabled on a table
huaxiang sun created HBASE-16301: Summary: Trigger flush without waiting when compaction is disabled on a table Key: HBASE-16301 URL: https://issues.apache.org/jira/browse/HBASE-16301 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor When compaction is disabled on a table, a flush needs to wait MemStoreFlusher#blockingWaitTime (default value 90 seconds) before it goes ahead. This has the side effect that clients may be blocked due to RegionTooBusyException. Please see the mail sent to the dev list: http://mail-archives.apache.org/mod_mbox/hbase-dev/201607.mbox/%3c2d66b8ca-7c6f-40ea-a861-2de5482ec...@cloudera.com%3E I believe the right behavior is to flush without waiting if compaction is disabled on a table. Attached a patch.
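The proposed decision can be sketched as a single policy function. The names are invented and the real MemStoreFlusher logic is more involved; this only illustrates the rule being argued for.

```java
// Sketch of the proposed flush policy (illustrative, not MemStoreFlusher code).
public class FlushPolicyDemo {

    static long flushDelayMillis(boolean tooManyStoreFiles, boolean compactionEnabled,
                                 long blockingWaitTime) {
        if (tooManyStoreFiles && compactionEnabled) {
            // Waiting (default 90s) only makes sense when a compaction can
            // reduce the store file count in the meantime.
            return blockingWaitTime;
        }
        // Compaction disabled (or store file count fine): flush right away, so
        // clients are not blocked with RegionTooBusyException while waiting.
        return 0L;
    }

    public static void main(String[] args) {
        System.out.println(flushDelayMillis(true, true, 90_000L));  // 90000
        System.out.println(flushDelayMillis(true, false, 90_000L)); // 0
    }
}
```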
[jira] [Created] (HBASE-16293) TestSnapshotFromMaster#testSnapshotHFileArchiving flakey
huaxiang sun created HBASE-16293: Summary: TestSnapshotFromMaster#testSnapshotHFileArchiving flakey Key: HBASE-16293 URL: https://issues.apache.org/jira/browse/HBASE-16293 Project: HBase Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Got the following stack trace for this failure, not sure if it is related with HBASE-9072 --- T E S T S --- Running org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 336.042 sec <<< FAILURE! - in org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster testSnapshotHFileArchiving(org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster) Time elapsed: 303.771 sec <<< ERROR! org.junit.runners.model.TestTimedOutException: test timed out after 30 milliseconds at java.lang.Object.wait(Native Method) at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1810) at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1784) at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1860) at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:241) at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:191) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:979) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:576) at org.apache.hadoop.hbase.HBaseTestingUtility.loadTable(HBaseTestingUtility.java:2002) at org.apache.hadoop.hbase.HBaseTestingUtility.loadTable(HBaseTestingUtility.java:1979) at org.apache.hadoop.hbase.HBaseTestingUtility.loadTable(HBaseTestingUtility.java:1967) at org.apache.hadoop.hbase.HBaseTestingUtility.loadTable(HBaseTestingUtility.java:1945) at org.apache.hadoop.hbase.master.cleaner.TestSnapshotFromMaster.testSnapshotHFileArchiving(TestSnapshotFromMaster.java:297) Results : Tests in 
error: TestSnapshotFromMaster.testSnapshotHFileArchiving:297->Object.wait:-2 » TestTimedOut Tests run: 4, Failures: 0, Errors: 1, Skipped: 0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16275) Change ServerManager#onlineServers from ConcurrentHashMap to ConcurrentSkipListMap
huaxiang sun created HBASE-16275: Summary: Change ServerManager#onlineServers from ConcurrentHashMap to ConcurrentSkipListMap Key: HBASE-16275 URL: https://issues.apache.org/jira/browse/HBASE-16275 Project: HBase Issue Type: Improvement Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor In the ServerManager class, onlineServers is declared as a ConcurrentHashMap. In findServerWithSameHostnamePortWithLock(), it has to loop over all entries to find whether there is a ServerName with the same host:port pair. If it is replaced with a ConcurrentSkipListMap, findServerWithSameHostnamePortWithLock() can be replaced with an O(log(n)) implementation. I ran a performance comparison (of the function only); there is no difference with 1000 servers, but with more servers the ConcurrentSkipListMap implementation is going to win big.
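The O(log n) lookup works because a skip list supports ordered seeks. This sketch uses "host:port:startcode" strings as keys so one ceiling lookup lands on the first entry for a host:port pair; the real onlineServers map is keyed by ServerName with its own comparator, so this is only an illustration of the idea.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch of the proposed lookup (string keys are an assumption for the demo;
// note that string ordering of ports is lexicographic, unlike the real
// ServerName comparator).
public class OnlineServersDemo {

    // ConcurrentHashMap offers no ordered seek, so the existing code loops
    // over every entry; a ConcurrentSkipListMap jumps straight to the prefix.
    static String findServerWithSameHostnamePort(
            ConcurrentSkipListMap<String, Integer> onlineServers, String host, int port) {
        String prefix = host + ":" + port + ":";
        String candidate = onlineServers.ceilingKey(prefix); // O(log n) seek
        return (candidate != null && candidate.startsWith(prefix)) ? candidate : null;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, Integer> servers = new ConcurrentSkipListMap<>();
        servers.put("a.example.com:16020:111", 1);
        servers.put("b.example.com:16020:222", 2);
        System.out.println(findServerWithSameHostnamePort(servers, "a.example.com", 16020));
        System.out.println(findServerWithSameHostnamePort(servers, "c.example.com", 16020));
    }
}
```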
[jira] [Created] (HBASE-16272) Overflow in ServerName's compareTo method
huaxiang sun created HBASE-16272: Summary: Overflow in ServerName's compareTo method Key: HBASE-16272 URL: https://issues.apache.org/jira/browse/HBASE-16272 Project: HBase Issue Type: Bug Components: hbase Reporter: huaxiang sun Assignee: huaxiang sun Looking at ServerName's compareTo(), https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/ServerName.java#L303 It computes the return value by casting a long difference to an int, as in (int)(longValue), which can be incorrect when the difference overflows an int; it needs to be replaced with Long.compare(a, b). [~mbertozzi] found some others as well, such as https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java#L990
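The failure mode is easy to demonstrate: casting a long difference to int truncates the high bits and can flip the sign. This is a generic sketch of the pattern, not the ServerName code itself.

```java
// Demonstration of the (int)(longA - longB) comparison bug.
public class CompareOverflowDemo {

    // Buggy pattern: wrong whenever the difference does not fit in an int.
    static int compareBuggy(long a, long b) {
        return (int) (a - b);
    }

    // Correct pattern: Long.compare never overflows.
    static int compareSafe(long a, long b) {
        return Long.compare(a, b);
    }

    public static void main(String[] args) {
        long a = 3_000_000_000L; // e.g. a millisecond start code
        long b = 0L;
        // a > b, but 3_000_000_000 truncated to 32 bits is negative, so the
        // buggy comparator claims a < b.
        System.out.println(compareBuggy(a, b)); // negative (wrong sign)
        System.out.println(compareSafe(a, b));  // positive (correct)
    }
}
```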
[jira] [Created] (HBASE-16137) Fix findbugs warning introduced by hbase-14730
huaxiang sun created HBASE-16137: Summary: Fix findbugs warning introduced by hbase-14730 Key: HBASE-16137 URL: https://issues.apache.org/jira/browse/HBASE-16137 Project: HBase Issue Type: Bug Affects Versions: 1.2.0, 1.3.0 Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor From stack: "Lads. This patch makes for a new findbugs warning: https://builds.apache.org/job/PreCommit-HBASE-Build/2390/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html If you are good w/ the code, i can fix the findbugs warning... just say."
[jira] [Resolved] (HBASE-15980) TestRegionServerMetrics#testMobMetrics failed with the master
[ https://issues.apache.org/jira/browse/HBASE-15980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huaxiang sun resolved HBASE-15980. -- Resolution: Duplicate Duplicate of HBASE-15959. > TestRegionServerMetrics#testMobMetrics failed with the master > - > > Key: HBASE-15980 > URL: https://issues.apache.org/jira/browse/HBASE-15980 > Project: HBase > Issue Type: Bug >Reporter: huaxiang sun >Assignee: huaxiang sun > > Ran the test locally and found the following failure. > --- > T E S T S > --- > Running org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics > Tests run: 16, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 82.076 sec > <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics > testMobMetrics(org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics) > Time elapsed: 11.162 sec <<< FAILURE! > java.lang.AssertionError: Metrics Counters should be equal expected:<10> but > was:<8> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at > org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185) > at > org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.assertCounter(TestRegionServerMetrics.java:146) > at > org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:460) > Results :
[jira] [Created] (HBASE-15980) TestRegionServerMetrics#testMobMetrics failed with the master
huaxiang sun created HBASE-15980: Summary: TestRegionServerMetrics#testMobMetrics failed with the master Key: HBASE-15980 URL: https://issues.apache.org/jira/browse/HBASE-15980 Project: HBase Issue Type: Bug Reporter: huaxiang sun Ran the test locally and found the following failure. --- T E S T S --- Running org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics Tests run: 16, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 82.076 sec <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics testMobMetrics(org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics) Time elapsed: 11.162 sec <<< FAILURE! java.lang.AssertionError: Metrics Counters should be equal expected:<10> but was:<8> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.apache.hadoop.hbase.test.MetricsAssertHelperImpl.assertCounter(MetricsAssertHelperImpl.java:185) at org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.assertCounter(TestRegionServerMetrics.java:146) at org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics.testMobMetrics(TestRegionServerMetrics.java:460) Results :
[jira] [Created] (HBASE-15975) logic in TestHTableDescriptor#testAddCoprocessorWithSpecStr is wrong
huaxiang sun created HBASE-15975: Summary: logic in TestHTableDescriptor#testAddCoprocessorWithSpecStr is wrong Key: HBASE-15975 URL: https://issues.apache.org/jira/browse/HBASE-15975 Project: HBase Issue Type: Bug Components: test Affects Versions: master Reporter: huaxiang sun Assignee: huaxiang sun Priority: Trivial While working on a unit test case for HBASE-14644, I came across testAddCoprocessorWithSpecStr(). {code}
HTableDescriptor htd = new HTableDescriptor(TableName.META_TABLE_NAME);
String cpName = "a.b.c.d";
boolean expected = false;
try {
  htd.addCoprocessorWithSpec(cpName);
} catch (IllegalArgumentException iae) {
  expected = true;
}
if (!expected) fail();
// Try minimal spec.
try {
  htd.addCoprocessorWithSpec("file:///some/path" + "|" + cpName);
} catch (IllegalArgumentException iae) {
  expected = false;
}
if (expected) fail();
// Try more spec.
String spec = "hdfs:///foo.jar|com.foo.FooRegionObserver|1001|arg1=1,arg2=2";
try {
  htd.addCoprocessorWithSpec(spec);
} catch (IllegalArgumentException iae) {
  expected = false; // It should be true, as this call is expected to succeed.
}
if (expected) fail();
// Try double add of same coprocessor
try {
  htd.addCoprocessorWithSpec(spec);
} catch (IOException ioe) {
  expected = true;
}
if (!expected) fail();
{code}
[jira] [Created] (HBASE-15707) ImportTSV bulk output does not support tags with hfile.format.version=3
huaxiang sun created HBASE-15707: Summary: ImportTSV bulk output does not support tags with hfile.format.version=3 Key: HBASE-15707 URL: https://issues.apache.org/jira/browse/HBASE-15707 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.1.5, 1.2.2, 1.0.5 Reporter: huaxiang sun Running the following command: {code}
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dhfile.format.version=3 \
  -Dmapreduce.map.combine.minspills=1 \
  -Dimporttsv.separator=, \
  -Dimporttsv.skip.bad.lines=false \
  -Dimporttsv.columns="HBASE_ROW_KEY,cf1:a,HBASE_CELL_TTL" \
  -Dimporttsv.bulk.output=/tmp/testttl/output/1 \
  testttl \
  /tmp/testttl/input
{code}
The content of the input is like:
{code}
row1,data1,0060
row2,data2,0660
row3,data3,0060
row4,data4,0660
{code}
When running the hfile tool on the output hfile, there is no TTL tag.
[jira] [Created] (HBASE-15706) HFilePrettyPrinter should print out nicely formatted tags
huaxiang sun created HBASE-15706: Summary: HFilePrettyPrinter should print out nicely formatted tags Key: HBASE-15706 URL: https://issues.apache.org/jira/browse/HBASE-15706 Project: HBase Issue Type: Improvement Components: HFile Affects Versions: 2.0.0 Reporter: huaxiang sun Priority: Minor When I was using the HFile tool to print out rows with tags, the output was like: hsun-MBP:hbase-2.0.0-SNAPSHOT hsun$ hbase org.apache.hadoop.hbase.io.hfile.HFile -f /tmp/71afa45b1cb94ea1858a99f31197274f -p 2016-04-25 11:40:40,409 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2016-04-25 11:40:40,580 INFO [main] hfile.CacheConfig: CacheConfig:disabled K: b/b:b/1461608231279/Maximum/vlen=0/seqid=0 V: K: b/b:b/1461608231278/Put/vlen=1/seqid=0 V: b T[0]: � Scanned kv count -> 2 With the attached patch, the output now looks like: 2016-04-25 11:57:05,849 INFO [main] hfile.CacheConfig: CacheConfig:disabled K: b/b:b/1461609876838/Maximum/vlen=0/seqid=0 V: K: b/b:b/1461609876837/Put/vlen=1/seqid=0 V: b T[0]: [Tag type : 8, value : \x00\x0E\xEE\xEE] Scanned kv count -> 2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15632) Undo the checking of lastStoreFlushTimeMap.isEmpty() introduced in HBASE-13145
huaxiang sun created HBASE-15632: Summary: Undo the checking of lastStoreFlushTimeMap.isEmpty() introduced in HBASE-13145 Key: HBASE-15632 URL: https://issues.apache.org/jira/browse/HBASE-15632 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor HBASE-13145 introduced the following check:
{code}
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
index 215069c..8f73af5 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -1574,7 +1574,8 @@ public class HRegion implements HeapSize, PropagatingConfigurationObserver {
   // */
   @VisibleForTesting
   public long getEarliestFlushTimeForAllStores() {
-    return Collections.min(lastStoreFlushTimeMap.values());
+    return lastStoreFlushTimeMap.isEmpty() ? Long.MAX_VALUE : Collections.min(lastStoreFlushTimeMap
+        .values());
   }
{code}
I think the reason for the check is that table creation without a family was allowed before HBASE-15456; with HBASE-15456, table creation without a family is no longer allowed. One user claimed that they ran into the same HRegionServer$PeriodicMemstoreFlusher exception even though the table was created with a family. The log was not kept, so we could not find more info there, and by checking the code it seems impossible. Can we undo this check so the real issue is not hidden in case there is one, [~Apache9]? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
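The guarded expression is easy to demonstrate in isolation: Collections.min throws NoSuchElementException on an empty collection, so the check falls back to Long.MAX_VALUE when no store has a flush time. A standalone sketch with a plain map instead of the HRegion field:

```java
import java.util.Collections;
import java.util.Map;

public class EarliestFlushTimeDemo {
    // Mirrors the shape of the patched getEarliestFlushTimeForAllStores():
    // guard the empty case, otherwise take the minimum per-store flush time.
    static long earliestFlushTime(Map<String, Long> lastStoreFlushTimeMap) {
        return lastStoreFlushTimeMap.isEmpty()
            ? Long.MAX_VALUE
            : Collections.min(lastStoreFlushTimeMap.values());
    }

    public static void main(String[] args) {
        // No stores registered: without the guard this would throw
        // java.util.NoSuchElementException from Collections.min.
        System.out.println(earliestFlushTime(Map.of()));
        System.out.println(earliestFlushTime(Map.of("f1", 100L, "f2", 50L))); // 50
    }
}
```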
[jira] [Created] (HBASE-15627) Missing space and closing quote in AccessController#checkSystemOrSuperUser
huaxiang sun created HBASE-15627: Summary: Missing space and closing quote in AccessController#checkSystemOrSuperUser Key: HBASE-15627 URL: https://issues.apache.org/jira/browse/HBASE-15627 Project: HBase Issue Type: Bug Components: security Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor A space and a closing quote are missing in AccessController#checkSystemOrSuperUser:
{code}
private void checkSystemOrSuperUser() throws IOException {
  // No need to check if we're not going to throw
  if (!authorizationEnabled) {
    return;
  }
  User activeUser = getActiveUser();
  if (!Superusers.isSuperUser(activeUser)) {
    // missing closing ' and space in the AccessDeniedException string
    throw new AccessDeniedException("User '" + (activeUser != null ? activeUser.getShortName() : "null")
        + "is not system or super user.");
  }
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
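The fixed message building can be sketched with a hypothetical helper (the real code inlines this in AccessController):

```java
public class AccessDeniedMessageDemo {
    // Builds the denial message with the closing quote and space restored:
    // "User 'alice' is not system or super user." instead of the current
    // "User 'aliceis not system or super user."
    static String deniedMessage(String shortName) {
        String name = (shortName != null) ? shortName : "null";
        return "User '" + name + "' is not system or super user.";
    }

    public static void main(String[] args) {
        System.out.println(deniedMessage("alice"));
        System.out.println(deniedMessage(null));
    }
}
```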
[jira] [Created] (HBASE-15621) Suppress HBase SnapshotHFileCleaner error messages when a snapshot is going on
huaxiang sun created HBASE-15621: Summary: Suppress HBase SnapshotHFileCleaner error messages when a snapshot is going on Key: HBASE-15621 URL: https://issues.apache.org/jira/browse/HBASE-15621 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor We ran into the following exception while a snapshot was going on. A partial region-manifest or data.manifest file can be read and parsed by the cleaner, which results in an InvalidProtocolBufferException; this exception needs to be ignored for an in-progress snapshot. 2016-04-01 00:31:50,200 ERROR org.apache.hadoop.hbase.master.snapshot.SnapshotHFileCleaner: Exception while checking if: *** was valid, keeping it just in case. com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either than the input has been truncated or that an embedded message misreported its own length. at com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:70) at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:746) at com.google.protobuf.CodedInputStream.readRawByte(CodedInputStream.java:769) at com.google.protobuf.CodedInputStream.readRawVarint64(CodedInputStream.java:462) at com.google.protobuf.CodedInputStream.readUInt64(CodedInputStream.java:188) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile.(SnapshotProtos.java:1331) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile.(SnapshotProtos.java:1263) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile$1.parsePartialFrom(SnapshotProtos.java:1364) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile$1.parsePartialFrom(SnapshotProtos.java:1359) at 
com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles.(SnapshotProtos.java:2161) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles.(SnapshotProtos.java:2103) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles$1.parsePartialFrom(SnapshotProtos.java:2197) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles$1.parsePartialFrom(SnapshotProtos.java:2192) at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.(SnapshotProtos.java:1165) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.(SnapshotProtos.java:1094) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1201) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1196) at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.(SnapshotProtos.java:3858) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.(SnapshotProtos.java:3792) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3894) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3889) at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200) at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217) at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223) at 
com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49) at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.parseFrom(SnapshotProtos.java:4094) at org.apache.hadoop.hbase.snapshot.SnapshotManifest.readDataManifest(SnapshotManifest.java:433) at org.apache.hadoop.hbase.snapshot.SnapshotManifest.load(SnapshotManifest.java:273) at org.apache.hadoop.hbase.snapshot.SnapshotManifest.open(SnapshotManifest.java:119) at org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.visitTableStoreFiles(SnapshotReferenceUtil.java:125) at
[jira] [Created] (HBASE-15456) CreateTableProcedure/ModifyTableProcedure needs to fail when there is no family in descriptor
huaxiang sun created HBASE-15456: Summary: CreateTableProcedure/ModifyTableProcedure needs to fail when there is no family in descriptor Key: HBASE-15456 URL: https://issues.apache.org/jira/browse/HBASE-15456 Project: HBase Issue Type: Improvement Components: master Affects Versions: 2.0.0 Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor If there is only one family in the table, DeleteColumnFamilyProcedure will fail. Currently, when hbase.table.sanity.checks is set to false, the hbase master logs a warning and CreateTableProcedure/ModifyTableProcedure will succeed. This behavior is not consistent with DeleteColumnFamilyProcedure's. Another point: before HBASE-13145, PeriodicMemstoreFlusher would run into the following exception when a table had no family, because lastStoreFlushTimeMap is populated per family, so a table with no family has no entry in the map. 16/02/01 11:14:26 ERROR regionserver.HRegionServer$PeriodicMemstoreFlusher: Caught exception java.util.NoSuchElementException at java.util.concurrent.ConcurrentHashMap$HashIterator.nextEntry(ConcurrentHashMap.java:1354) at java.util.concurrent.ConcurrentHashMap$ValueIterator.next(ConcurrentHashMap.java:1384) at java.util.Collections.min(Collections.java:628) at org.apache.hadoop.hbase.regionserver.HRegion.getEarliestFlushTimeForAllStores(HRegion.java:1572) at org.apache.hadoop.hbase.regionserver.HRegion.shouldFlush(HRegion.java:1904) at org.apache.hadoop.hbase.regionserver.HRegionServer$PeriodicMemstoreFlusher.chore(HRegionServer.java:1509) at org.apache.hadoop.hbase.Chore.run(Chore.java:87) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
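The kind of up-front check the procedures could make is simple to sketch (hypothetical descriptor shape; the real fix would live in CreateTableProcedure/ModifyTableProcedure):

```java
import java.util.List;

public class FamilySanityCheckDemo {
    // Hypothetical check: reject a descriptor with zero column families
    // before the procedure starts, mirroring DeleteColumnFamilyProcedure's
    // refusal to remove the last family.
    static void checkHasColumnFamilies(List<String> families) {
        if (families.isEmpty()) {
            throw new IllegalArgumentException(
                "Table should have at least one column family.");
        }
    }

    static boolean rejectsEmptyDescriptor() {
        try {
            checkHasColumnFamilies(List.of());
            return false; // should not get here
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        checkHasColumnFamilies(List.of("f1")); // a valid descriptor passes
        System.out.println(rejectsEmptyDescriptor());
    }
}
```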
[jira] [Created] (HBASE-15261) Make Throwable t in DaughterOpener volatile
huaxiang sun created HBASE-15261: Summary: Make Throwable t in DaughterOpener volatile Key: HBASE-15261 URL: https://issues.apache.org/jira/browse/HBASE-15261 Project: HBase Issue Type: Bug Components: regionserver Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor In the region split process, daughter regions are opened in different threads; Throwable t is set in those threads and checked in the calling thread. It needs to be made volatile so the check does not miss any exceptions from opening the daughter regions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
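The pattern can be sketched outside HBase: a worker thread records a Throwable in a field that the spawning thread later inspects, and marking the field volatile guarantees the write is visible to the checking thread. The Opener class below is a hypothetical stand-in for DaughterOpener:

```java
public class VolatileThrowableDemo {
    static class Opener implements Runnable {
        // volatile publishes the write to any thread that reads the field;
        // without it, a check made before join() could read a stale null.
        volatile Throwable t;

        @Override
        public void run() {
            try {
                throw new IllegalStateException("daughter open failed");
            } catch (Throwable e) {
                t = e; // recorded for the calling thread
            }
        }
    }

    static Throwable openAndCheck() {
        Opener opener = new Opener();
        Thread worker = new Thread(opener);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return opener.t; // non-null: the failure was not missed
    }

    public static void main(String[] args) {
        System.out.println(openAndCheck().getMessage());
    }
}
```

Note that join() itself also establishes a happens-before edge; volatile matters for any check made while the opener threads are still running.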
[jira] [Created] (HBASE-15206) Flakey testSplitDaughtersNotInMeta test
huaxiang sun created HBASE-15206: Summary: Flakey testSplitDaughtersNotInMeta test Key: HBASE-15206 URL: https://issues.apache.org/jira/browse/HBASE-15206 Project: HBase Issue Type: Bug Components: flakey Reporter: huaxiang sun Assignee: huaxiang sun Priority: Minor Ran into the following failure with HBase 1.0.0. Stacktrace java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertNotNull(Assert.java:712) at org.junit.Assert.assertNotNull(Assert.java:722) at org.apache.hadoop.hbase.util.TestHBaseFsck.testSplitDaughtersNotInMeta(TestHBaseFsck.java:1723) From the log, an ntp issue caused clock skew and woke up the CatalogJanitor early; the CatalogJanitor then cleaned up the parent region. This could happen with the master branch as well. The fix is to disable the CatalogJanitor to make sure this does not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15104) IntegrationTestAcidGuarantees occasionally fails when trying to cleanup
huaxiang sun created HBASE-15104: Summary: IntegrationTestAcidGuarantees occasionally fails when trying to cleanup Key: HBASE-15104 URL: https://issues.apache.org/jira/browse/HBASE-15104 Project: HBase Issue Type: Bug Components: integration tests Affects Versions: 2.0.0, 1.2.0 Reporter: huaxiang sun IntegrationTestAcidGuarantees fails when trying to clean up, with NotServingRegionException giving up after 36 attempts. 5/11/09 09:19:24 INFO client.AsyncProcess: #33, waiting for some tasks to finish. Expected max=0, tasksInProgress=9 15/11/09 09:19:33 INFO client.AsyncProcess: #45, table=TestAcidGuarantees, attempt=10/35 failed=1ops, last exception: org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region TestAcidGuarantees,test_row_1,1447089367019.032439ef4f3353cb894d20337ba043bc. is not online on node-4.internal,22101,1447089152259 at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2786) at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:922) at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:1893) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32213) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) ... 
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions: Mon Nov 09 09:19:53 PST 2015, null, java.net.SocketTimeoutException: callTimeout=6, callDuration=68104: row 'test_row_1' Looked at the RS log, the following exception is found: 2015-11-10 10:07:49,091 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=TestAcidGuarantees,,1447177733243.f1be6b850fe3958c5c9b5e330b5dfb00., starting to roll back the global memstore size. org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzoCodec at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:102) at org.apache.hadoop.hbase.regionserver.HRegion.checkCompressionCodecs(HRegion.java:6011) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5995) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5967) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5938) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5894) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:5845) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:356) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:126) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-15032) hbase shell scan filter string assumes UTF-8 encoding
huaxiang sun created HBASE-15032: Summary: hbase shell scan filter string assumes UTF-8 encoding Key: HBASE-15032 URL: https://issues.apache.org/jira/browse/HBASE-15032 Project: HBase Issue Type: Bug Components: shell Reporter: huaxiang sun Assignee: huaxiang sun Currently the hbase shell scan filter string is assumed to be UTF-8 encoded, which makes the following scan not work. hbase(main):011:0> scan 't1' ROW COLUMN+CELL r4 column=cf1:q1, timestamp=1450812398741, value=\x82 hbase(main):003:0> scan 't1', {FILTER => "SingleColumnValueFilter ('cf1', 'q1', >=, 'binary:\x80', true, true)"} ROW COLUMN+CELL 0 row(s) in 0.0130 seconds -- This message was sent by Atlassian JIRA (v6.3.4#6332)
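The reason UTF-8 decoding breaks binary filter bytes: a lone byte at or above 0x80 is not a complete UTF-8 sequence, so treating the filter string as UTF-8 text changes the bytes the comparator sees. A small demonstration in plain Java, independent of the shell:

```java
import java.nio.charset.StandardCharsets;

public class BinaryFilterEncodingDemo {
    // The single raw byte 0x80 the filter is meant to compare against.
    static byte[] intendedBytes() {
        return new byte[] { (byte) 0x80 };
    }

    // If the filter string is interpreted as text, the character U+0080
    // round-trips through UTF-8 as TWO bytes: 0xC2 0x80.
    static byte[] utf8Bytes() {
        return "\u0080".getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(intendedBytes().length); // 1
        System.out.println(utf8Bytes().length);     // 2
        System.out.println(Integer.toHexString(utf8Bytes()[0] & 0xFF)); // c2
    }
}
```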
[jira] [Created] (HBASE-14995) Optimize setting tagsPresent in DefaultMemStore.java
huaxiang sun created HBASE-14995: Summary: Optimize setting tagsPresent in DefaultMemStore.java Key: HBASE-14995 URL: https://issues.apache.org/jira/browse/HBASE-14995 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 2.0.0, 1.2.0 Reporter: huaxiang sun Priority: Minor The current implementation calls e.getTagsLength() for each cell. Once tagsPresent is set, e.getTagsLength() can be avoided.
{code}
private boolean addToCellSet(Cell e) {
  boolean b = this.cellSet.add(e);
  // In no tags case this NoTagsKeyValue.getTagsLength() is a cheap call.
  // When we use ACL CP or Visibility CP which deals with Tags during
  // mutation, the TagRewriteCell.getTagsLength() is a cheaper call. We do not
  // parse the byte[] to identify the tags length.
  if (e.getTagsLength() > 0) {
    tagsPresent = true;
  }
  setOldestEditTimeToNow();
  return b;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
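The proposed optimization is just short-circuit evaluation: check the flag first so getTagsLength() stops being called once a tagged cell has been seen. A minimal sketch with a call counter (a hypothetical stand-in for Cell, not the DefaultMemStore code):

```java
public class TagsPresentDemo {
    boolean tagsPresent = false;
    int tagsLengthCalls = 0;

    // Hypothetical stand-in for Cell.getTagsLength(); the counter records
    // how often it is invoked.
    int getTagsLength(boolean cellHasTags) {
        tagsLengthCalls++;
        return cellHasTags ? 4 : 0;
    }

    // The proposed shape: !tagsPresent is evaluated first, so the && skips
    // getTagsLength() entirely once the flag is set.
    void addToCellSet(boolean cellHasTags) {
        if (!tagsPresent && getTagsLength(cellHasTags) > 0) {
            tagsPresent = true;
        }
    }

    public static void main(String[] args) {
        TagsPresentDemo memstore = new TagsPresentDemo();
        memstore.addToCellSet(false); // getTagsLength called
        memstore.addToCellSet(true);  // getTagsLength called, flag set
        memstore.addToCellSet(true);  // skipped: flag already set
        memstore.addToCellSet(false); // skipped
        System.out.println(memstore.tagsLengthCalls); // 2
    }
}
```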
[jira] [Created] (HBASE-14766) In WALEntryFilter implementations, cell.getFamily() needs to be replaced with the new low-cost alternative
huaxiang sun created HBASE-14766: Summary: In WALEntryFilter implementations, cell.getFamily() needs to be replaced with the new low-cost alternative Key: HBASE-14766 URL: https://issues.apache.org/jira/browse/HBASE-14766 Project: HBase Issue Type: Improvement Reporter: huaxiang sun Cell's getFamily() returns an array copy of the cell's family, while the filter function only needs to peek into the family and do a compare. Replace Bytes.toString(cell.getFamily()) with Bytes.toString(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength()). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
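The copy-versus-in-place difference can be shown with a flattened backing array. The layout below is hypothetical and much simpler than a real Cell, but the access pattern is the same:

```java
import java.nio.charset.StandardCharsets;

public class FamilyCompareDemo {
    // Hypothetical flattened key: [row "row1"][family "cf1"][qualifier "q1"],
    // mimicking a Cell whose components share one backing array.
    static final byte[] BACKING = "row1cf1q1".getBytes(StandardCharsets.UTF_8);
    static final int FAMILY_OFFSET = 4;
    static final int FAMILY_LENGTH = 3;

    // getFamily()-style access: allocates and copies a fresh array per call.
    static byte[] copyFamily() {
        byte[] family = new byte[FAMILY_LENGTH];
        System.arraycopy(BACKING, FAMILY_OFFSET, family, 0, FAMILY_LENGTH);
        return family;
    }

    // getFamilyArray()/getFamilyOffset()/getFamilyLength()-style access:
    // compares in place with no allocation, which is all a filter needs.
    static boolean familyEquals(byte[] expected) {
        if (expected.length != FAMILY_LENGTH) {
            return false;
        }
        for (int i = 0; i < FAMILY_LENGTH; i++) {
            if (BACKING[FAMILY_OFFSET + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] cf1 = "cf1".getBytes(StandardCharsets.UTF_8);
        System.out.println(java.util.Arrays.equals(copyFamily(), cf1)); // true, but copies
        System.out.println(familyEquals(cf1));                          // true, no copy
    }
}
```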
[jira] [Created] (HBASE-14730) region server needs to log warnings when there are attributes configured for cells with hfile v2
huaxiang sun created HBASE-14730: Summary: region server needs to log warnings when there are attributes configured for cells with hfile v2 Key: HBASE-14730 URL: https://issues.apache.org/jira/browse/HBASE-14730 Project: HBase Issue Type: Improvement Components: regionserver Reporter: huaxiang sun Assignee: huaxiang sun Users can configure cell attributes with hfile.format.version 2, but when cells are flushed from the memstore to an HFile, the attributes are not saved. Warnings need to be logged. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14554) Investigate the server side alternative to enable dynamic jar for security purpose
huaxiang sun created HBASE-14554: Summary: Investigate the server side alternative to enable dynamic jar for security purpose Key: HBASE-14554 URL: https://issues.apache.org/jira/browse/HBASE-14554 Project: HBase Issue Type: Bug Reporter: huaxiang sun Assignee: huaxiang sun Fix For: 2.0.0 From [~mbertozzi]: "for 2.x we probably want to do some changes. The DynamicLoader seems to not be needed on the client side, so we should force that to "not enabled", but on the server side we probably want that still on, to allow user filters and so on. Do we have any alternative to copying locally instead of forcing that "not enabled" with security reasons as motivation? How is one supposed to use custom filters in a "secure" environment otherwise?" "Esteban Gutierrez: in theory we are already supposed to do the "remote" load; the problem is the code that copies those "remote" jars locally. I think that was done because it was the easy way to load the class from remote, since the friendly API loads the class using addUrl(), where the url is expected to be something that Java understands, and hdfs is not. Looking at the ClassLoader API, there is a defineClass() that takes an array of bytes. In theory we can leverage that to open the hdfs stream (the jar we want to load) and add the class to our class loader, avoiding the copy-to-local step. That way we can even get rid of the tmp dir. https://docs.oracle.com/javase/7/docs/api/java/security/SecureClassLoader.html#defineClass(java.lang.String,%20byte[],%20int,%20int,%20java.security.CodeSource) I'll let huaxiang sun look into whether that is possible or not." -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14492) Increase REST server http header size from 8k to 64k
huaxiang sun created HBASE-14492: Summary: Increase REST server http header size from 8k to 64k Key: HBASE-14492 URL: https://issues.apache.org/jira/browse/HBASE-14492 Project: HBase Issue Type: Bug Components: REST Reporter: huaxiang sun Assignee: huaxiang sun HBASE-13608 increased the REST server HTTP header size to 8k. We have seen the HTTP header size exceed 7k with Kerberos authentication. Increase it to 64k to avoid possible 413 errors in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14471) Thrift - HTTP Error 413 full HEAD if using kerberos authentication
huaxiang sun created HBASE-14471: Summary: Thrift - HTTP Error 413 full HEAD if using kerberos authentication Key: HBASE-14471 URL: https://issues.apache.org/jira/browse/HBASE-14471 Project: HBase Issue Type: Bug Components: Thrift Reporter: huaxiang sun Assignee: huaxiang sun When trying to access a Thrift server that is kerberized, an HTTP 413 full HEAD error is received. In that case, tcpdump shows the HTTP header size exceeded 4k. This seems related to the issue outlined in HADOOP-8816. The default header size limit is 4k; following the fix for HADOOP-8816, we propose increasing the header size limit to 64k. -- This message was sent by Atlassian JIRA (v6.3.4#6332)