[jira] [Created] (HBASE-23615) Use a dedicated thread for executing WorkerMonitor in ProcedureExecutor.

2019-12-24 Thread Lijin Bin (Jira)
Lijin Bin created HBASE-23615:
-

 Summary: Use a dedicated thread for executing WorkerMonitor in 
ProcedureExecutor.
 Key: HBASE-23615
 URL: https://issues.apache.org/jira/browse/HBASE-23615
 Project: HBase
  Issue Type: Improvement
  Components: amv2
Affects Versions: 2.2.2
Reporter: Lijin Bin
Assignee: Lijin Bin


See the discussion in https://issues.apache.org/jira/browse/HBASE-23597. It is 
better to have a dedicated thread for executing WorkerMonitor, so that it will 
not be blocked by other tasks.
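
The effect described above can be illustrated with a minimal sketch. It is not the ProcedureExecutor code: the class and method names are hypothetical, and a single-threaded scheduled executor stands in for whatever shared thread currently runs the monitor. A periodic monitor that shares one thread with a long-running task never gets to run, while the same monitor on its own thread keeps ticking.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; none of these names come from HBase.
final class DedicatedMonitorSketch {

  /** Counts how often the monitor runs while another task occupies the shared thread. */
  static int monitorRunsWhileBlocked(boolean dedicated) {
    ScheduledExecutorService shared = Executors.newSingleThreadScheduledExecutor();
    ScheduledExecutorService monitorPool =
        dedicated ? Executors.newSingleThreadScheduledExecutor() : shared;
    CountDownLatch blockerStarted = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    AtomicInteger runs = new AtomicInteger();
    try {
      // A long-running task that holds the shared thread until released.
      shared.execute(() -> {
        blockerStarted.countDown();
        try {
          release.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      blockerStarted.await();
      // The periodic monitor, scheduled once the blocker is already running.
      monitorPool.scheduleAtFixedRate(runs::incrementAndGet, 0, 10, TimeUnit.MILLISECONDS);
      Thread.sleep(200);
      return runs.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while measuring", e);
    } finally {
      release.countDown();
      shared.shutdownNow();
      monitorPool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("runs on shared thread:    " + monitorRunsWhileBlocked(false));
    System.out.println("runs on dedicated thread: " + monitorRunsWhileBlocked(true));
  }
}
```

On the shared thread the monitor cannot run at all until the blocking task finishes; with its own thread it runs on schedule regardless of what the other tasks do.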



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23581) Creating table gets stuck when specifying an invalid split policy as METADATA

2019-12-24 Thread Lijin Bin (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijin Bin resolved HBASE-23581.
---
Fix Version/s: 3.0.0
   Resolution: Fixed

> Creating table gets stuck when specifying an invalid split policy as METADATA
> -
>
> Key: HBASE-23581
> URL: https://issues.apache.org/jira/browse/HBASE-23581
> Project: HBase
>  Issue Type: Bug
> Environment: HDP-3.1.0
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.0.0
>
>
> We can reproduce this issue as follows:
> {code}
> create 'test', "f", {METADATA => {'hbase.regionserver.region.split.policy' => 
> 'UNDEFINED'}}
> {code}
> After running this, table creation gets stuck. It looks like this is because 
> opening the region fails with a ClassNotFoundException:
> {code}
> 2019-12-16 06:45:03,671 ERROR 
> [RS_OPEN_REGION-regionserver/c126-node2:16020-17] handler.OpenRegionHandler: 
> Failed open of region=test,,1576477039045.7435965ddb2229c62d926b3ee963dcf3.
> java.io.IOException: Unable to load configured region split policy 
> 'UNDEFINED' for table 'test'
> at 
> org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPolicyClass(RegionSplitPolicy.java:132)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkClassLoading(HRegion.java:7162)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7083)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7043)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7015)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6973)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6924)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:283)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
> at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: UNDEFINED
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:264)
> at 
> org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPolicyClass(RegionSplitPolicy.java:127)
> ... 12 more
> {code}
> We should have sanity checks for the properties specified in METADATA.
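
A sanity check of the kind suggested above could look roughly like the following sketch. It is not the committed fix: the class and method names are hypothetical, and table METADATA is modelled as a plain `Map<String, String>`. Only the property key `hbase.regionserver.region.split.policy` is taken from the report. The idea is to try loading the configured class before the table is created, so the request fails fast instead of getting stuck on region open.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical pre-flight check; not the actual HBase change.
final class SplitPolicySanityCheck {
  static final String SPLIT_POLICY_KEY = "hbase.regionserver.region.split.policy";

  /** Throws IllegalArgumentException if the configured split policy class cannot be loaded. */
  static void checkSplitPolicy(Map<String, String> tableMetadata) {
    String className = tableMetadata.get(SPLIT_POLICY_KEY);
    if (className == null) {
      return; // nothing configured, so the default policy applies
    }
    try {
      // Load without initializing; we only care whether the class exists.
      Class.forName(className, false, SplitPolicySanityCheck.class.getClassLoader());
    } catch (ClassNotFoundException e) {
      throw new IllegalArgumentException(
          "Unable to load configured region split policy '" + className + "'", e);
    }
  }

  public static void main(String[] args) {
    Map<String, String> metadata = new LinkedHashMap<>();
    metadata.put(SPLIT_POLICY_KEY, "UNDEFINED");
    try {
      checkSplitPolicy(metadata);
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```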





[jira] [Resolved] (HBASE-23589) FlushDescriptor contains non-matching family/output combinations

2019-12-24 Thread Lijin Bin (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijin Bin resolved HBASE-23589.
---
Fix Version/s: 2.1.9
   2.2.3
   2.3.0
   3.0.0
   Resolution: Fixed

> FlushDescriptor contains non-matching family/output combinations
> 
>
> Key: HBASE-23589
> URL: https://issues.apache.org/jira/browse/HBASE-23589
> Project: HBase
>  Issue Type: Bug
>  Components: read replicas
>Affects Versions: 2.2.2
>Reporter: Szabolcs Bukros
>Assignee: Szabolcs Bukros
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.2.3, 2.1.9
>
>
> Flushing the active region creates the following files:
> {code:java}
> 2019-12-13 08:00:20,866 INFO org.apache.hadoop.hbase.regionserver.HStore: 
> Added 
> hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/dab4d1cc01e44773bad7bdb5d2e33b6c,
>  entries=49128, sequenceid=70688, filesize=41.4 M
> 2019-12-13 08:00:20,897 INFO org.apache.hadoop.hbase.regionserver.HStore: 
> Added 
> hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f3/ecc50f33085042f7bd2397253b896a3a,
>  entries=5, sequenceid=70688, filesize=42.3 M
> {code}
> On the read replica region when we try to replay the flush we see the 
> following:
> {code:java}
> 2019-12-13 08:00:21,279 WARN org.apache.hadoop.hbase.regionserver.HRegion: 
> bfa9cdb0ab13d60b389df6621ab316d1 : At least one of the store files in flush: 
> action: COMMIT_FLUSH table_name: "IntegrationTestRegionReplicaReplication" 
> encoded_region_name: "20af2eb8929408f26d0b3b81e6b86d47" 
> flush_sequence_number: 70688 store_flushes { family_name: "f2" 
> store_home_dir: "f2" flush_output: "ecc50f33085042f7bd2397253b896a3a" } 
> store_flushes { family_name: "f3" store_home_dir: "f3" flush_output: 
> "dab4d1cc01e44773bad7bdb5d2e33b6c" } region_name: 
> "IntegrationTestRegionReplicaReplication,,1576252065847.20af2eb8929408f26d0b3b81e6b86d47."
>  doesn't exist any more. Skip loading the file(s)
> java.io.FileNotFoundException: HFileLink 
> locations=[hdfs://replica-1:8020/hbase/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/ecc50f33085042f7bd2397253b896a3a,
>  
> hdfs://replica-1:8020/hbase/.tmp/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/ecc50f33085042f7bd2397253b896a3a,
>  
> hdfs://replica-1:8020/hbase/mobdir/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/ecc50f33085042f7bd2397253b896a3a,
>  
> hdfs://replica-1:8020/hbase/archive/data/default/IntegrationTestRegionReplicaReplication/20af2eb8929408f26d0b3b81e6b86d47/f2/ecc50f33085042f7bd2397253b896a3a]
> at 
> org.apache.hadoop.hbase.io.FileLink.getFileStatus(FileLink.java:415)
> at 
> org.apache.hadoop.hbase.util.ServerRegionReplicaUtil.getStoreFileInfo(ServerRegionReplicaUtil.java:135)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionFileSystem.getStoreFileInfo(HRegionFileSystem.java:311)
> at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.replayFlush(HStore.java:2414)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayFlushInStores(HRegion.java:5310)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushCommitMarker(HRegion.java:5184)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayWALFlushMarker(HRegion.java:5018)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doReplayBatchOp(RSRpcServices.java:1143)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:2229)
> at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29754)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
> at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> {code}
> As we can see the flush_outputs got mixed up. 
>  
> The issue is caused by HRegion.internalFlushCacheAndCommit. The code assumes 
> "stores.values() and storeFlushCtxs have same order", 
> which no longer seems to be true.
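
The failure mode can be shown with a small sketch. It is not the HRegion code: the class and method names are hypothetical, and families and flush outputs are modelled as plain strings, using the file names from the log above. Pairing two collections by position silently swaps the outputs as soon as their iteration orders differ; looking each output up by family name does not depend on order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not taken from HRegion.
final class FlushOutputPairing {

  /** Fragile: pairs the i-th family with the i-th output, so it breaks if the orders differ. */
  static Map<String, String> pairByPosition(List<String> families, List<String> outputs) {
    Map<String, String> result = new LinkedHashMap<>();
    for (int i = 0; i < families.size(); i++) {
      result.put(families.get(i), outputs.get(i));
    }
    return result;
  }

  /** Order-independent: looks each output up by the family it was written for. */
  static Map<String, String> pairByFamily(List<String> families,
      Map<String, String> outputByFamily) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String family : families) {
      String output = outputByFamily.get(family);
      if (output == null) {
        throw new IllegalStateException("No flush output for family " + family);
      }
      result.put(family, output);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> families = List.of("f2", "f3");
    // The outputs arrive in a different order than the families.
    List<String> outputsInOtherOrder = List.of(
        "ecc50f33085042f7bd2397253b896a3a", "dab4d1cc01e44773bad7bdb5d2e33b6c");
    Map<String, String> outputByFamily = Map.of(
        "f2", "dab4d1cc01e44773bad7bdb5d2e33b6c",
        "f3", "ecc50f33085042f7bd2397253b896a3a");
    System.out.println("by position: " + pairByPosition(families, outputsInOtherOrder));
    System.out.println("by family:   " + pairByFamily(families, outputByFamily));
  }
}
```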





[jira] [Created] (HBASE-23616) IndexScrutinyTool is very slow on view-indexes

2019-12-24 Thread Swaroopa Kadam (Jira)
Swaroopa Kadam created HBASE-23616:
--

 Summary: IndexScrutinyTool is very slow on view-indexes
 Key: HBASE-23616
 URL: https://issues.apache.org/jira/browse/HBASE-23616
 Project: HBase
  Issue Type: Improvement
Reporter: Swaroopa Kadam


From view-index to view, it scrutinizes about 7 rows per minute with batch 
size 1. 

From view to view-index, it is about 1000 rows per minute with batch size 1, 
which is also very slow. 





[jira] [Resolved] (HBASE-23616) IndexScrutinyTool is very slow on view-indexes

2019-12-24 Thread Swaroopa Kadam (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swaroopa Kadam resolved HBASE-23616.

Resolution: Invalid

> IndexScrutinyTool is very slow on view-indexes
> --
>
> Key: HBASE-23616
> URL: https://issues.apache.org/jira/browse/HBASE-23616
> Project: HBase
>  Issue Type: Improvement
>Reporter: Swaroopa Kadam
>Priority: Major
>
> From view-index to view, it scrutinizes about 7 rows per minute with batch 
> size 1. 
> From view to view-index, it is about 1000 rows per minute with batch size 1, 
> which is also very slow. 





[jira] [Resolved] (HBASE-14509) Configurable sparse indexes?

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14509.
---
Resolution: Won't Fix

> Configurable sparse indexes?
> 
>
> Key: HBASE-14509
> URL: https://issues.apache.org/jira/browse/HBASE-14509
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> This idea just popped up today and I wanted to record it for discussion:
> What if we kept sparse column indexes per region or HFile or per configurable 
> range?
> I.e. For any given CQ we record the lowest and highest value for a particular 
> range (HFile, Region, or a custom range like the Phoenix guide post).
> By tweaking the size of these ranges we can control the size of the index, vs 
> its selectivity.
> For example if we kept it by HFile we can almost instantly decide whether we 
> need to scan a particular HFile at all to find a particular value in a Cell.
> We can also collect min/max values for each n MB of data, for example when we 
> scan the region the first time. Assuming ranges are large enough we can always 
> keep the index in memory together with the region.
> Kind of a sparse local index. Might be much easier than the buddy region stuff 
> we've been discussing.
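
A minimal sketch of the min/max idea, under stated assumptions: values are modelled as `long`, a "range" is identified by a name such as an HFile name, and all class and method names are hypothetical. The index stores one (min, max) pair per range for a given column qualifier, and a lookup skips every range whose interval cannot contain the value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a sparse min/max index; not an HBase API.
final class SparseMinMaxIndex {

  /** Lowest and highest value seen for one column qualifier within one range. */
  static final class RangeStats {
    final String rangeName;
    final long min;
    final long max;

    RangeStats(String rangeName, long min, long max) {
      this.rangeName = rangeName;
      this.min = min;
      this.max = max;
    }

    /** False means the range definitely does not hold the value; true means it might. */
    boolean mayContain(long value) {
      return value >= min && value <= max;
    }
  }

  /** Returns the names of the ranges that still have to be scanned for the value. */
  static List<String> rangesToScan(List<RangeStats> index, long value) {
    List<String> result = new ArrayList<>();
    for (RangeStats stats : index) {
      if (stats.mayContain(value)) {
        result.add(stats.rangeName);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<RangeStats> index = List.of(
        new RangeStats("hfile-1", 0, 99),
        new RangeStats("hfile-2", 100, 199),
        new RangeStats("hfile-3", 50, 150));
    System.out.println(rangesToScan(index, 120));
  }
}
```

Wider ranges make the index smaller but less selective, which is the size versus selectivity trade-off mentioned above.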





[jira] [Resolved] (HBASE-14014) Explore row-by-row grouping options

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14014.
---
Resolution: Won't Fix

> Explore row-by-row grouping options
> ---
>
> Key: HBASE-14014
> URL: https://issues.apache.org/jira/browse/HBASE-14014
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Reporter: Lars Hofhansl
>Priority: Major
>
> See discussion in parent.
> We need to consider the following attributes of WALKey:
> * The cluster ids
> * Table Name
> * write time (here we could use the latest of any batch)
> * seqNum
> As long as we preserve these we can rearrange the cells between WALEdits. 
> Since seqNum is unique this will be a challenge. Currently it is not used, 
> but we shouldn't design anything that prevents us from providing better 
> ordering guarantees using seqNum.





[jira] [Resolved] (HBASE-13751) Refactoring replication WAL reading logic as WAL Iterator

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-13751.
---
Resolution: Won't Fix

> Refactoring replication WAL reading logic as WAL Iterator
> -
>
> Key: HBASE-13751
> URL: https://issues.apache.org/jira/browse/HBASE-13751
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> The current replication code is all over the place.
> A simple refactoring that we could consider is to factor out the part that 
> reads from the WALs. Could be a simple iterator interface with one additional 
> wrinkle: The iterator needs to be able to provide the position (file and 
> offset) of the last read edit.
> Once we have this, we can use it as a building block for many other changes in 
> the replication code.
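
The proposed interface might be sketched as follows. Everything here is hypothetical: the type names, the in-memory implementation, and the use of string length as a stand-in for a byte offset are assumptions for illustration, not replication code. The point is the one extra method on top of `Iterator`, which reports the file and offset of the last edit read.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of a WAL iterator that exposes its read position.
final class WalIteratorSketch {

  /** File and offset reached after the last edit that was read. */
  static final class Position {
    final String file;
    final long offset;

    Position(String file, long offset) {
      this.file = file;
      this.offset = offset;
    }
  }

  /** A plain iterator plus the "one additional wrinkle": the position of the last read edit. */
  interface PositionedIterator<T> extends Iterator<T> {
    Position lastReadPosition();
  }

  /** Toy implementation over an in-memory list; edit length stands in for bytes consumed. */
  static final class InMemoryWalIterator implements PositionedIterator<String> {
    private final String file;
    private final Iterator<String> edits;
    private long offset;

    InMemoryWalIterator(String file, List<String> edits) {
      this.file = file;
      this.edits = edits.iterator();
    }

    @Override
    public boolean hasNext() {
      return edits.hasNext();
    }

    @Override
    public String next() {
      String edit = edits.next(); // throws NoSuchElementException when exhausted
      offset += edit.length();
      return edit;
    }

    @Override
    public Position lastReadPosition() {
      return new Position(file, offset);
    }
  }

  public static void main(String[] args) {
    InMemoryWalIterator it = new InMemoryWalIterator("wal-1", List.of("ab", "cde", "f"));
    while (it.hasNext()) {
      String edit = it.next();
      Position p = it.lastReadPosition();
      System.out.println(edit + " read, position " + p.file + "@" + p.offset);
    }
  }
}
```

A caller that persists `lastReadPosition()` after shipping each batch could resume from that point after a restart.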





[jira] [Resolved] (HBASE-9272) A parallel, unordered scanner

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-9272.
--
Resolution: Won't Fix

> A parallel, unordered scanner
> -
>
> Key: HBASE-9272
> URL: https://issues.apache.org/jira/browse/HBASE-9272
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Priority: Minor
> Attachments: 9272-0.94-v2.txt, 9272-0.94-v3.txt, 9272-0.94-v4.txt, 
> 9272-0.94.txt, 9272-trunk-v2.txt, 9272-trunk-v3.txt, 9272-trunk-v3.txt, 
> 9272-trunk-v4.txt, 9272-trunk.txt, ParallelClientScanner.java, 
> ParallelClientScanner.java
>
>
> The contract of ClientScanner is to return rows in sort order. That limits 
> the order in which regions can be scanned.
> I propose a simple ParallelScanner that does not have this requirement and 
> queries regions in parallel, returning whatever gets returned first.
> This is generally useful for scans that filter a lot of data on the server, 
> or in cases where the client can very quickly react to the returned data.
> I have a simple prototype (doesn't do error handling right, and might be a 
> bit heavy on the synchronization side - it uses a BlockingQueue to hand data 
> between the client using the scanner and the threads doing the scanning; it 
> also could potentially starve some scanners long enough to time out at the 
> server).
> On the plus side, it's only 130 lines of code. :)
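
The hand-off described above can be sketched in a few lines. This is not the attached ParallelClientScanner.java: the class name, the sentinel, and the modelling of each region as an in-memory list of row keys are assumptions for illustration. One worker per region pushes rows into a shared `BlockingQueue`, and the caller takes rows in whatever order they arrive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; regions are plain lists of row keys.
final class ParallelUnorderedScanSketch {
  // Unique end-of-region marker, compared by identity so it can never equal a real row.
  private static final String END = new String("END-OF-REGION");

  /** Scans all regions in parallel and returns the rows in arrival order, not sort order. */
  static List<String> scanAll(List<List<String>> regions) {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, regions.size()));
    try {
      for (List<String> region : regions) {
        pool.execute(() -> {
          try {
            for (String row : region) {
              queue.add(row);
            }
          } finally {
            queue.add(END); // always signal completion so the consumer cannot hang
          }
        });
      }
      List<String> rows = new ArrayList<>();
      int finished = 0;
      while (finished < regions.size()) {
        String next = queue.take();
        if (next == END) {
          finished++;
        } else {
          rows.add(next);
        }
      }
      return rows;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while scanning", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<List<String>> regions = List.of(
        List.of("a1", "a2"), List.of("b1"), List.of("c1", "c2"));
    System.out.println(scanAll(regions)); // order varies from run to run
  }
}
```

The unbounded queue keeps the sketch short; a bounded queue would add back-pressure but could also stall a worker while the client is slow, which is one way the scanner starvation mentioned above could come about.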





[jira] [Resolved] (HBASE-6970) hbase-deamon.sh creates/updates pid file even when that start failed.

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-6970.
--
Resolution: Won't Fix

> hbase-deamon.sh creates/updates pid file even when that start failed.
> -
>
> Key: HBASE-6970
> URL: https://issues.apache.org/jira/browse/HBASE-6970
> Project: HBase
>  Issue Type: Bug
>  Components: Usability
>Reporter: Lars Hofhansl
>Priority: Major
>
> We just ran into a strange issue where we could neither start nor stop 
> services with hbase-daemon.sh.
> The problem is this:
> {code}
> nohup nice -n $HBASE_NICENESS "$HBASE_HOME"/bin/hbase \
> --config "${HBASE_CONF_DIR}" \
> $command "$@" $startStop > "$logout" 2>&1 < /dev/null &
> echo $! > $pid
> {code}
> So the pid file is created or updated even when the start of the service 
> failed. The next stop command will then fail, because the pid file has the 
> wrong pid in it.
> Edit: Spelling and more spelling errors.





[jira] [Resolved] (HBASE-23326) Implement a ProcedureStore which stores procedures in a HRegion

2019-12-24 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-23326.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed to master and branch-2.

Thanks all for reviewing.

Will fill in the release note soon.

> Implement a ProcedureStore which stores procedures in a HRegion
> ---
>
> Key: HBASE-23326
> URL: https://issues.apache.org/jira/browse/HBASE-23326
> Project: HBase
>  Issue Type: Improvement
>  Components: proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.3.0
>
>
> So we can reuse the code in HRegion for persisting the procedures, and also 
> the optimized WAL implementation for better performance.
> This requires we merge the hbase-procedure module to hbase-server, which is 
> an anti-pattern as we make the hbase-server module more overloaded. But I 
> think later we can first try to move the WAL stuff out.


