[jira] [Created] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
Sergey Shelukhin created HBASE-22432: Summary: HRegionServer rssStub handling is incorrect and inconsistent Key: HBASE-22432 URL: https://issues.apache.org/jira/browse/HBASE-22432 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Some methods refresh the stub when it is null; others assume (incorrectly) that a null stub means the server is shutting down. The latter can cause server reports to not be sent until one of the former methods executes and happens to restore the stub. The stub is also reset sometimes with a check and sometimes without one. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
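A minimal sketch of the consistent policy the report argues for: every caller treats a null stub as "refresh it", never as "the server is shutting down", and reset goes through one path. The holder class, `Supplier`-based factory, and method names are illustrative assumptions, not the actual HRegionServer code.

```java
import java.util.function.Supplier;

// Hypothetical stub holder: centralizes the null-means-refresh policy so no
// caller can interpret a null rssStub as a shutdown signal.
final class RssStubHolder<T> {
    private volatile T stub;
    private final Supplier<T> factory;

    RssStubHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    T getOrRefresh() {
        T s = stub;
        if (s == null) {
            s = factory.get(); // refresh instead of assuming shutdown
            stub = s;
        }
        return s;
    }

    void reset() {
        stub = null; // single reset path: no "sometimes checked" variants
    }
}
```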
[jira] [Created] (HBASE-22428) better client-side throttling for dropped calls
Sergey Shelukhin created HBASE-22428: Summary: better client-side throttling for dropped calls Key: HBASE-22428 URL: https://issues.apache.org/jira/browse/HBASE-22428 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Not sure yet how to implement this better. Either when we get a CallTimeoutException on the client, or by making the server-side timeout less than the RPC timeout so the server can actually respond to the client, we could do a better job of throttling retries. Right now, if multiple clients are overloading a server and calls start to be dropped, they all just retry and keep the server overloaded. The server might have to track when requests from a client timed out, to fail more aggressively when processing time is high.
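One plausible shape for the client side of this, sketched under the assumption that retries after a dropped/timed-out call should wait progressively longer instead of hammering the overloaded server. The function and its values are illustrative, not HBase's actual retry configuration.

```java
// Exponential backoff with a cap, the usual remedy for retry storms:
// wait base * 2^attempt milliseconds, never more than capMillis.
final class DroppedCallBackoff {
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        // clamp the shift so the intermediate value cannot overflow a long
        long delay = baseMillis << Math.min(attempt, 30);
        return Math.min(delay, capMillis);
    }
}
```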
[jira] [Created] (HBASE-22410) add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets
Sergey Shelukhin created HBASE-22410: Summary: add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets Key: HBASE-22410 URL: https://issues.apache.org/jira/browse/HBASE-22410 Project: HBase Issue Type: Improvement Components: Operability Reporter: Sergey Shelukhin Dead servers appear to only be cleaned up when a server comes up on the same host and port; however, if HBase is running on something like YARN with many more hosts than RSes, an RS may come up on a different host and the dead entry will never be cleaned up. The metric should be improved to account for that. It will potentially require configuring master with the expected number of region servers, so that the metric can be computed against it. The dead server list should also expire entries based on timestamp in such cases.
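The proposed metric can be sketched as a simple comparison against a configured expected count (the config knob itself is hypothetical here): report how many servers are missing, rather than an ever-growing dead-server list.

```java
// Hypothetical "missing servers" metric for non-fixed server sets: compares
// the configured expected count to the currently live count.
final class MissingServersMetric {
    static int missingServers(int expectedServers, int liveServers) {
        // never negative: extra live servers are not "missing"
        return Math.max(0, expectedServers - liveServers);
    }
}
```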
[jira] [Created] (HBASE-22408) add a metric for regions OPEN on non-live servers
Sergey Shelukhin created HBASE-22408: Summary: add a metric for regions OPEN on non-live servers Key: HBASE-22408 URL: https://issues.apache.org/jira/browse/HBASE-22408 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin This serves 2 purposes for monitoring: 1) Catching regions that sit on dead servers due to long WAL splitting or other delays in SCP; at that time the regions are not listed as RITs, and we'd like to be able to alert in such cases. 2) Catching various bugs in assignment, procWAL corruption, etc. that leave a region "OPEN" on a server that no longer exists, again to alert the administrator via a metric. Later, it might be possible to add more logic to distinguish 1 from 2, mitigate 2 automatically, and set a metric that tells the administrator to investigate.
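The metric itself reduces to a set comparison, sketched below with simplified placeholder types (region and server names as strings); the real implementation would walk the assignment manager's region states.

```java
import java.util.Map;
import java.util.Set;

// Sketch of the metric: count regions recorded as OPEN on a server that is
// not in the live-server set.
final class OpenOnDeadServers {
    static long count(Map<String, String> regionToServer, Set<String> liveServers) {
        return regionToServer.values().stream()
            .filter(server -> !liveServers.contains(server)) // server no longer live
            .count();
    }
}
```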
[jira] [Created] (HBASE-22407) add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics)
Sergey Shelukhin created HBASE-22407: Summary: add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics) Key: HBASE-22407 URL: https://issues.apache.org/jira/browse/HBASE-22407 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Currently, table metrics are output using custom metric names that clutter various metrics lists and are impossible to (sanely) aggregate. We can instead use Hadoop MetricsTag to tag a single metric (per logical metric), allowing both per-table display and cross-table aggregation on the other end. In this JIRA (patch coming) I'd like to add the ability to do that: 1) Actual tagging in the multiple paths that output table metrics. 2) The ugliest part: preventing server-level metrics from being output in the tagged case, to avoid duplicate metrics. A large refactor of the metrics seems to be in order (not included). 3) Fixes for some issues where wrong metrics are output, metrics are not output at all, exceptions like a null Optional cause table metrics to not be output forever, etc. 4) Renaming several table-level latency metrics to be consistent with server-level latency metrics.
[jira] [Created] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
Sergey Shelukhin created HBASE-22376: Summary: master can fail to start w/NPE if lastflushedseqids file is empty Key: HBASE-22376 URL: https://issues.apache.org/jira/browse/HBASE-22376 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin
[jira] [Created] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it
Sergey Shelukhin created HBASE-22354: Summary: master never sets abortRequested, and thus abort timeout doesn't work for it Key: HBASE-22354 URL: https://issues.apache.org/jira/browse/HBASE-22354 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Discovered with the HBASE-22353 netty deadlock. The property is never set, so the abort timer is not started.
[jira] [Created] (HBASE-22353) update non-shaded netty for Hadoop 2 to a more recent version of 3.6
Sergey Shelukhin created HBASE-22353: Summary: update non-shaded netty for Hadoop 2 to a more recent version of 3.6 Key: HBASE-22353 URL: https://issues.apache.org/jira/browse/HBASE-22353 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin When using the Netty socket for ZK, we got this deadlock. Appears to be https://github.com/netty/netty/issues/1181 (or one of the similar tickets before that). We are using Netty 3.6.2 for Hadoop 2; it seems like it should be safe to upgrade to 3.6.10, assuming it's purely a bugfix release for 3.6.2 and they are compatible?
{noformat}
Java stack information for the threads listed above:
===
"main-SendThread(...)":
        at org.jboss.netty.handler.ssl.SslHandler.wrap(SslHandler.java:958)
        - waiting to lock <0xc91d8848> (a java.lang.Object)
        - locked <0xcdcc7740> (a java.util.LinkedList)
        at org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:627)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:587)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:578)
        at org.jboss.netty.channel.Channels.write(Channels.java:704)
        at org.jboss.netty.channel.Channels.write(Channels.java:671)
        at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
        at org.apache.zookeeper.ClientCnxnSocketNetty.sendPkt(ClientCnxnSocketNetty.java:268)
        at org.apache.zookeeper.ClientCnxnSocketNetty.doWrite(ClientCnxnSocketNetty.java:291)
        at org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:249)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
"New I/O worker #3":
        at org.jboss.netty.handler.ssl.SslHandler.channelClosed(SslHandler.java:1554)
        - waiting to lock <0xcdcc7740> (a java.util.LinkedList)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
        at org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:351)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:254)
        - locked <0xc91d8770> (a java.lang.Object)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:145)
        at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:83)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:775)
        at org.jboss.netty.channel.Channels.write(Channels.java:725)
        at org.jboss.netty.channel.Channels.write(Channels.java:686)
        at org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1140)
        at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1229)
        - locked <0xc91d8848> (a java.lang.Object)
        at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:910)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.c
{noformat}
[jira] [Created] (HBASE-22348) allow one to actually disable replication svc
Sergey Shelukhin created HBASE-22348: Summary: allow one to actually disable replication svc Key: HBASE-22348 URL: https://issues.apache.org/jira/browse/HBASE-22348 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Minor, but it creates extra ZK traffic for no reason, and there appears to be no way to disable it.
[jira] [Created] (HBASE-22347) try to archive WALs when closing a region or when shutting down RS
Sergey Shelukhin created HBASE-22347: Summary: try to archive WALs when closing a region or when shutting down RS Key: HBASE-22347 URL: https://issues.apache.org/jira/browse/HBASE-22347 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin When an RS shuts down in an orderly manner due to an upgrade or decommission, even if it has 0 regions (discovered when testing HBASE-22254), it still dies with some active WALs. The WALs are then split by master, and in the 0-region case the recovered edits are not used for anything. RS shutdown should archive WALs if possible after flushing/closing regions; given that the latter can fail, perhaps try once before and once after. Closing a region via an RPC should also try to archive the WAL.
[jira] [Created] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
Sergey Shelukhin created HBASE-22346: Summary: scanner priorities/deadline units are invalid for non-huge scanners Key: HBASE-22346 URL: https://issues.apache.org/jira/browse/HBASE-22346 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin I was looking at using the priority (deadline) queue for scanner requests. What I see is that AnnotationReadingPriorityFunction, the only available implementation of the deadline function, implements getDeadline as the square root of the number of next() calls, from HBASE-10993. However, CallPriorityComparator.compare, its only caller, adds that "deadline" value to callA.getReceiveTime(), which is in milliseconds. The result is a value that, I assume, only tells apart broad classes of scanners by coincidence; in practice, the next-call count must be in the thousands before it becomes meaningful against even small differences in receive time. When there is contention across many scanners, e.g. small scanners for meta, or users creating tons of scanners to the point where requests queue up, the actual deadline is not accounted for and the priority function itself is meaningless. In fact, as queueing increases, it becomes worse, because receive-time differences grow.
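The unit mismatch described above can be shown numerically. This is purely a demonstration of the comparator's arithmetic (receive time in milliseconds plus the square root of the next-call count), not HBase code:

```java
// Demonstration of the mismatched units: sqrt(next-call count) is added
// directly to a millisecond receive time, so thousands of next() calls are
// needed before the "deadline" outweighs a tiny receive-time difference.
final class DeadlineUnits {
    static double effectiveDeadline(long receiveTimeMillis, long nextCallCount) {
        return receiveTimeMillis + Math.sqrt(nextCallCount);
    }
}
```

For example, 10,000 next() calls contribute only sqrt(10000) = 100 to the sum, so a scanner received 101 ms later still sorts behind a brand-new one.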
[jira] [Created] (HBASE-22334) handle blocking RPC threads better (time out calls? )
Sergey Shelukhin created HBASE-22334: Summary: handle blocking RPC threads better (time out calls? ) Key: HBASE-22334 URL: https://issues.apache.org/jira/browse/HBASE-22334 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Combined with HBASE-22333, we had a case where a user sent lots of create-table requests with pre-split for the same table (the tasks of some job would try to create the table opportunistically if it didn't exist, and there were many such tasks); these requests took up all the RPC threads and caused a large call queue to form; then the first call got stuck because the RS calls to report an opened region were stuck in the queue. All the other calls were stuck here:
{noformat}
submitProcedure(
    new CreateTableProcedure(procedureExecutor.getEnvironment(), desc, newRegions, latch));
latch.await();
{noformat}
The procedures in this case were stuck for hours; even with the other issue resolved, assigning thousands of regions can take a long time and cause a lot of delay before it unblocks the other procedures and allows them to release the latch. In general, waiting on an RPC thread is not a good idea. I wonder if it would make sense to fail client requests that hold an RPC thread based on a timeout, or when they are not making progress (e.g. in this case, the procedure is not getting updated; this might need to be handled on a case-by-case basis).
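One of the mitigations floated above, bounding the wait instead of blocking the handler indefinitely, can be sketched as follows. The timeout value and the idea of failing the call back to the client are assumptions for illustration, not HBase's current behavior:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical bounded wait for an RPC handler thread: returns false when
// the procedure did not finish in time, so the caller can fail the RPC
// (letting the client retry later) instead of pinning the handler forever.
final class BoundedLatchWait {
    static boolean awaitOrFail(CountDownLatch latch, long timeoutMillis) {
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
    }
}
```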
[jira] [Created] (HBASE-22333) move certain internal RPCs to high priority level
Sergey Shelukhin created HBASE-22333: Summary: move certain internal RPCs to high priority level Key: HBASE-22333 URL: https://issues.apache.org/jira/browse/HBASE-22333 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin User calls can inadvertently DDoS master (and potentially an RS), causing issues (e.g. CallQueueTooBig) for important system calls like reportRegionStateTransition. These calls should be moved to a high priority level... I wonder if all the low-volume internal calls (i.e. everything except heartbeats and maybe the WAL-splitting calls) should have a higher priority (e.g. the 20 QoS level in HConstants).
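The routing being proposed can be sketched as a priority function over RPC method names. The method names mirror real HBase internal RPCs, but the dispatch function and priority constants here are hypothetical, not HConstants' actual QoS values:

```java
// Hypothetical priority routing: low-volume internal calls get a higher QoS
// level so user traffic cannot starve them out of the call queue.
final class InternalCallQos {
    static final int NORMAL_QOS = 0;
    static final int INTERNAL_QOS = 20; // the "20 QoS" level mentioned above

    static int priorityOf(String rpcMethod) {
        switch (rpcMethod) {
            case "ReportRegionStateTransition": // internal, low-volume
            case "RegionServerStartup":
                return INTERNAL_QOS;
            default:
                return NORMAL_QOS; // user traffic stays at normal priority
        }
    }
}
```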
[jira] [Created] (HBASE-22290) add an option to replay WALs without splitting
Sergey Shelukhin created HBASE-22290: Summary: add an option to replay WALs without splitting Key: HBASE-22290 URL: https://issues.apache.org/jira/browse/HBASE-22290 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin As is, on some filesystems it may be faster to have every RS concerned read the WALs directly instead of splitting the WALs first and then re-reading them. Additionally, given that a WAL can often be held up by a single slow-flushing/low-volume region, storing an additional structure next to the WAL (the same map the RS uses to archive WALs, which allows one to determine which region needs which WALs based on its last flush) can accelerate this even further for the common case.
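The lookup structure described above can be sketched as follows: for each WAL, the highest edit sequence id it contains per region; a region needs a WAL only if that id exceeds the region's last flushed sequence id. All names and types here are illustrative simplifications:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "which WALs does this region need" based on the
// per-WAL, per-region max sequence id map and the region's last flush.
final class WalsNeeded {
    static List<String> walsFor(String region, long lastFlushedSeqId,
                                Map<String, Map<String, Long>> walToRegionMaxSeqId) {
        List<String> needed = new ArrayList<>();
        for (Map.Entry<String, Map<String, Long>> e : walToRegionMaxSeqId.entrySet()) {
            Long maxSeqId = e.getValue().get(region);
            if (maxSeqId != null && maxSeqId > lastFlushedSeqId) {
                needed.add(e.getKey()); // this WAL still has unflushed edits
            }
        }
        return needed;
    }
}
```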
[jira] [Created] (HBASE-22289) WAL-based log splitting resubmit threshold results in a task being stuck forever
Sergey Shelukhin created HBASE-22289: Summary: WAL-based log splitting resubmit threshold results in a task being stuck forever Key: HBASE-22289 URL: https://issues.apache.org/jira/browse/HBASE-22289 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Not sure if this is handled better in procedure-based WAL splitting; in any case it affects the versions before that. The problem is not in ZK as such but in internal state tracking in master, it seems. Master:
{noformat}
2019-04-21 01:49:49,584 INFO [master/:17000.splitLogManager..Chore.1] coordination.SplitLogManagerCoordination: Resubmitting task .1555831286638
{noformat}
Worker RS, split fails:
{noformat}
2019-04-21 02:05:31,774 INFO [RS_LOG_REPLAY_OPS-regionserver/:17020-1] wal.WALSplitter: Processed 24 edits across 2 regions; edits skipped=457; log file=.1555831286638, length=2156363702, corrupted=false, progress failed=true
{noformat}
Master (not sure about the delay of the acquired message; at any rate, it seems to detect the failure fine from this server):
{noformat}
2019-04-21 02:11:14,928 INFO [main-EventThread] coordination.SplitLogManagerCoordination: Task .1555831286638 acquired by ,17020,139815097
2019-04-21 02:19:41,264 INFO [master/:17000.splitLogManager..Chore.1] coordination.SplitLogManagerCoordination: Skipping resubmissions of task .1555831286638 because threshold 3 reached
{noformat}
After that, this task is stuck in limbo forever with the old worker and is never resubmitted; the RS never logs anything else for this task. Killing the RS on the worker unblocked the task, and some other server did the split very quickly, so it seems like master doesn't clear the worker name in its internal state when hitting the threshold. The master was never restarted, so restarting the master might have also cleared it. This is extracted from SplitLogManager log messages; note the times:
{noformat}
2019-04-21 02:2 1555831286638=last_update = 1555837874928 last_version = 11 cur_worker_name = ,17020,139815097 status = in_progress incarnation = 3 resubmits = 3 batch = installed = 24 done = 3 error = 20,
2019-04-22 11:1 1555831286638=last_update = 1555837874928 last_version = 11 cur_worker_name = ,17020,139815097 status = in_progress incarnation = 3 resubmits = 3 batch = installed = 24 done = 3 error = 20}
{noformat}
[jira] [Created] (HBASE-22288) add an option to kill RS that are not heartbeating, via ZK
Sergey Shelukhin created HBASE-22288: Summary: add an option to kill RS that are not heartbeating, via ZK Key: HBASE-22288 URL: https://issues.apache.org/jira/browse/HBASE-22288 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin To handle network-partitioning cases and bad-state RSes, especially in container scenarios where many machines are available to "move" an RS to, it would be helpful to kill RSes that fail to heartbeat but are maintaining their znode. Since they can connect to ZK, it would likely be possible to affect them via that: by deleting their znode if ZK allows that, or via a separate path, and/or by fencing off the WAL on HDFS.
[jira] [Created] (HBASE-22287) infinite retries on failed server in RSProcedureDispatcher
Sergey Shelukhin created HBASE-22287: Summary: infinite retries on failed server in RSProcedureDispatcher Key: HBASE-22287 URL: https://issues.apache.org/jira/browse/HBASE-22287 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin We observed this recently on a cluster; I'm still investigating the root cause, but it seems like the retries should have special handling for this exception, and separately, probably a cap on the number of retries. {noformat} 2019-04-20 04:24:27,093 WARN [RSProcedureDispatcher-pool4-t1285] procedure.RSProcedureDispatcher: request to server ,17020,1555742560432 failed due to java.io.IOException: Call to :17020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: :17020, try=26603, retrying... {noformat} The corresponding worker is stuck.
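The suggested cap can be sketched as a small retry policy: when the target is already on the failed-server list, stop after a handful of attempts instead of reaching try=26603 with a stuck worker. The cap values are illustrative assumptions:

```java
// Hypothetical retry policy for the dispatcher: failed-server targets get
// only a few retries before the failure is surfaced to the procedure.
final class DispatchRetryPolicy {
    static boolean shouldRetry(boolean targetOnFailedServerList, int attempt) {
        int cap = targetOnFailedServerList ? 3 : 100; // illustrative caps
        return attempt < cap;
    }
}
```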
[jira] [Created] (HBASE-22254) HBase - refactor and improve decommissioning logic
Sergey Shelukhin created HBASE-22254: Summary: HBase - refactor and improve decommissioning logic Key: HBASE-22254 URL: https://issues.apache.org/jira/browse/HBASE-22254 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Making some changes needed to support better decommissioning on large clusters and in container mode; to test those and add clarity, I moved parts of the decommissioning logic from HMaster, the draining tracker, and ServerManager into a separate class. Features added/improvements: 1) More resilient off-loading; right now off-loading fails for a subset of regions in case of a single region failure, is never resumed on master restart, etc. 2) An option to kill the RS after off-loading (good for container-mode HBase, e.g. on YARN). 3) An option to specify machine names only to decommission, so the API is usable by an external system that doesn't care about HBase server names, or e.g. with multiple RSes in containers on the same node. 4) An option to replace the existing decommissioning list instead of adding to it (same motivation: to avoid additionally remembering what was previously sent to HBase). 5) Tests, comments ;)
[jira] [Created] (HBASE-22168) proc WALs with non-corrupted-but-"corrupted" block WAL archiving forever
Sergey Shelukhin created HBASE-22168: Summary: proc WALs with non-corrupted-but-"corrupted" block WAL archiving forever Key: HBASE-22168 URL: https://issues.apache.org/jira/browse/HBASE-22168 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin I've reported the bug before where we get these messages when loading a proc WAL {noformat} 2019-04-04 14:43:00,424 ERROR [master/...:becomeActiveMaster] wal.WALProcedureTree: Missing stack id 43459, max stack id is 43460, root procedure is Procedure(pid=43645, ppid=-1, class=org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure) {noformat} resulting in {noformat} 2019-04-04 14:43:16,176 ERROR [...:17000:becomeActiveMaster] procedure2.ProcedureExecutor: Corrupt pid=43645, state=WAITING:SERVER_CRASH_FINISH, hasLock=false; ServerCrashProcedure server=..., splitWal=true, meta=false {noformat} There is no actual corruption in the file, so it never gets moved to the corrupted files. However, there's no accounting for this kind of procedure in the tracker as far as I can tell (I didn't spend a lot of time looking at the code, though), so as a result we get hundreds of proc WALs that are retained forever because of some ancient file with these procedures; that causes master startup to take a long time.
[jira] [Created] (HBASE-22145) windows hbase-env causes hbase cli/etc to ignore HBASE_OPTS
Sergey Shelukhin created HBASE-22145: Summary: windows hbase-env causes hbase cli/etc to ignore HBASE_OPTS Key: HBASE-22145 URL: https://issues.apache.org/jira/browse/HBASE-22145 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin
[jira] [Created] (HBASE-22120) remove HTrace
Sergey Shelukhin created HBASE-22120: Summary: remove HTrace Key: HBASE-22120 URL: https://issues.apache.org/jira/browse/HBASE-22120 Project: HBase Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sergey Shelukhin Suggested in HBASE-22115
[jira] [Created] (HBASE-22115) HBase RPC aspires to grow an infinite tree of trace scopes
Sergey Shelukhin created HBASE-22115: Summary: HBase RPC aspires to grow an infinite tree of trace scopes Key: HBASE-22115 URL: https://issues.apache.org/jira/browse/HBASE-22115 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: Clipboard01.jpg All of those are ClientServices.Multi in this case.
[jira] [Created] (HBASE-22107) make dead server metric work for HBase in a compute fabric (e.g. YARN) use case
Sergey Shelukhin created HBASE-22107: Summary: make dead server metric work for HBase in a compute fabric (e.g. YARN) use case Key: HBASE-22107 URL: https://issues.apache.org/jira/browse/HBASE-22107 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Dead servers appear to only be cleaned up when a server comes up on the same host and port; however, if HBase is running on something like YARN with many more hosts than RSes, an RS may come up on a different host and the dead entry will never be cleaned up. The metric should be improved to account for that. It will potentially require configuring master with the expected number of region servers, so that the metric can be computed against it. The dead server list should also expire entries based on timestamp in such cases.
[jira] [Created] (HBASE-22081) master shutdown: close RpcServer first thing, close procWAL as soon as viable, and delete znode the last thing
Sergey Shelukhin created HBASE-22081: Summary: master shutdown: close RpcServer first thing, close procWAL as soon as viable, and delete znode the last thing Key: HBASE-22081 URL: https://issues.apache.org/jira/browse/HBASE-22081 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin I had a master get stuck due to HBASE-22079, and it was logging RS abort messages during shutdown. [~bahramch] found some issues where messages are processed by the old master during shutdown due to a race condition in the RS cache (it could also happen due to a network race). Previously I found a bug where an SCP created during master shutdown had incorrect state (because some structures had already been cleaned up). I think before master fencing is implemented, we can at least make these issues much less likely by thinking about shutdown order: 1) First stop the RPC server, so we don't receive any more messages. 2) Then do whatever cleanup we think is needed that requires the proc WAL. 3) Then close the proc WAL, so no errant threads can create more procs. 4) Then do whatever other cleanup. 5) Finally, delete the znode. Right now the znode is deleted somewhat early, I think, and the RpcServer is closed very late.
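The five steps above can be written as a runnable outline; each stub here just records its name so the ordering is explicit. This is a sketch of the proposal, not the actual HMaster shutdown code:

```java
import java.util.ArrayList;
import java.util.List;

// Outline of the proposed master shutdown order, as a list of named steps.
final class MasterShutdownOrder {
    static List<String> shutdownSteps() {
        List<String> steps = new ArrayList<>();
        steps.add("stop RpcServer");            // 1) stop receiving messages
        steps.add("cleanup requiring procWAL"); // 2) work that still writes procedures
        steps.add("close proc WAL");            // 3) no errant thread can add procs
        steps.add("remaining cleanup");         // 4) everything else
        steps.add("delete master znode");       // 5) last: announce the master is gone
        return steps;
    }
}
```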
[jira] [Created] (HBASE-22080) hbase-shaded-client doesn't include netty socket for ZK
Sergey Shelukhin created HBASE-22080: Summary: hbase-shaded-client doesn't include netty socket for ZK Key: HBASE-22080 URL: https://issues.apache.org/jira/browse/HBASE-22080 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin That makes it impossible to use with secure ZK as far as I can tell: the standard non-shaded socket class cannot be cast to the shaded version of the base class.
[jira] [Created] (HBASE-22079) master leaks ZK on shutdown and gets stuck because of netty threads if netty socket is used
Sergey Shelukhin created HBASE-22079: Summary: master leaks ZK on shutdown and gets stuck because of netty threads if netty socket is used Key: HBASE-22079 URL: https://issues.apache.org/jira/browse/HBASE-22079 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin
{noformat}
"master/...:17000:becomeActiveMaster-SendThread(...1)" #311 daemon prio=5 os_prio=0 tid=0x58c61800 nid=0x2dd0 waiting on condition [0x000c477fe000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0xc4a5b3c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522)
        at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684)
        at org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:232)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
{noformat}
It looks like this also causes a bunch of netty threads to leak, and those are not daemon threads (by design, apparently).
[jira] [Created] (HBASE-22078) corrupted procs in proc WAL
Sergey Shelukhin created HBASE-22078: Summary: corrupted procs in proc WAL Key: HBASE-22078 URL: https://issues.apache.org/jira/browse/HBASE-22078 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Not sure what the root cause is... there are ~500 proc WAL files (I actually wonder if cleanup is also blocked by this, since I see these lines on master restart; do WALs with abandoned procedures like that get deleted?).
{noformat}
2019-03-21 12:47:17,116 ERROR [master/...:17000:becomeActiveMaster] wal.WALProcedureTree: Missing stack id 7571, max stack id is 7754, root procedure is Procedure(pid=66829, ppid=-1, class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
2019-03-21 12:47:17,116 ERROR [master/...:17000:becomeActiveMaster] wal.WALProcedureTree: Missing stack id 7600, max stack id is 7754, root procedure is Procedure(pid=66829, ppid=-1, class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
2019-03-21 12:47:17,116 ERROR [master/...:17000:becomeActiveMaster] wal.WALProcedureTree: Missing stack id 7610, max stack id is 7754, root procedure is Procedure(pid=66829, ppid=-1, class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
2019-03-21 12:47:17,116 ERROR [master/...:17000:becomeActiveMaster] wal.WALProcedureTree: Missing stack id 7631, max stack id is 7754, root procedure is Procedure(pid=66829, ppid=-1, class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
2019-03-21 12:47:17,116 ERROR [master/...:17000:becomeActiveMaster] wal.WALProcedureTree: Missing stack id 7650, max stack id is 7754, root procedure is Procedure(pid=66829, ppid=-1, class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
2019-03-21 12:47:17,116 ERROR [master/...:17000:becomeActiveMaster] wal.WALProcedureTree: Missing stack id 7651, max stack id is 7754, root procedure is Procedure(pid=66829, ppid=-1, class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
2019-03-21 12:47:17,116 ERROR [master/...:17000:becomeActiveMaster] wal.WALProcedureTree: Missing stack id 7657, max stack id is 7754, root procedure is Procedure(pid=66829, ppid=-1, class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
2019-03-21 12:47:17,116 ERROR [master/...:17000:becomeActiveMaster] wal.WALProcedureTree: Missing stack id 7683, max stack id is 7754, root procedure is Procedure(pid=66829, ppid=-1, class=org.apache.hadoop.hbase.master.procedure.DisableTableProcedure)
{noformat}
Followed by
{noformat}
2019-03-20 07:37:53,751 ERROR [master/...:17000:becomeActiveMaster] procedure2.ProcedureExecutor: Corrupt pid=66829, state=WAITING:DISABLE_TABLE_ADD_REPLICATION_BARRIER, hasLock=false; DisableTableProcedure table=...
{noformat}
And 1000s of child and grandchild procedures of this procedure. I think this area needs a general review... we should have a record of the procedure durably persisted before we create any child procedures, so I'm not sure how this could happen. Actually, I also wonder why we even have a separate proc WAL when HBase already has a working WAL that's more or less time-tested...
[jira] [Created] (HBASE-22068) improve client connection ctor error handling
Sergey Shelukhin created HBASE-22068: Summary: improve client connection ctor error handling Key: HBASE-22068 URL: https://issues.apache.org/jira/browse/HBASE-22068 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Close is called from the ctor and can thus be invoked on a partially initialized object. It should null-check all the fields. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
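A minimal sketch of the fix being proposed, assuming illustrative field names (these are not HBase's actual fields): close() null-checks everything it releases, so the ctor's error path can call it on a half-built object without tripping an NPE.

```java
// Illustrative sketch only: a close() that tolerates a partially constructed
// object. Runnable closers stand in for the real resources.
class ConnectionSketch {
    private Runnable rpcClientCloser; // may still be null if the ctor failed early
    private Runnable registryCloser;  // likewise

    ConnectionSketch(Runnable rpcClientCloser, Runnable registryCloser) {
        this.rpcClientCloser = rpcClientCloser;
        this.registryCloser = registryCloser;
    }

    void close() {
        // Null-check each field: construction may have failed before it was set.
        if (registryCloser != null) registryCloser.run();
        if (rpcClientCloser != null) rpcClientCloser.run();
    }
}
```

Without the null checks, closing a partially initialized instance would throw a NullPointerException from the very cleanup path meant to handle the failure.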
[jira] [Reopened] (HBASE-21975) allow ZK properties to be configured via hbase-site.xml
[ https://issues.apache.org/jira/browse/HBASE-21975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reopened HBASE-21975: -- Actually, it would still help to add this logic during init. As far as I can see, it's not used everywhere. > allow ZK properties to be configured via hbase-site.xml > --- > > Key: HBASE-21975 > URL: https://issues.apache.org/jira/browse/HBASE-21975 > Project: HBase > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-21975.patch > > > ...as opposed to configuring it through the command line or env variables -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21975) allow ZK properties to be configured via hbase-site.xml
Sergey Shelukhin created HBASE-21975: Summary: allow ZK properties to be configured via hbase-site.xml Key: HBASE-21975 URL: https://issues.apache.org/jira/browse/HBASE-21975 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin ...as opposed to configuring it through the command line or env variables -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21625) a runnable procedure v2 does not run
[ https://issues.apache.org/jira/browse/HBASE-21625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-21625. -- Resolution: Cannot Reproduce Probably a dup of OpenRegionProcedure issues > a runnable procedure v2 does not run > > > Key: HBASE-21625 > URL: https://issues.apache.org/jira/browse/HBASE-21625 > Project: HBase > Issue Type: Bug > Components: amv2, proc-v2 >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Priority: Critical > > This is on master snapshot as of a few weeks ago. > Haven't looked at the code much yet, but it seems rather fundamental. The > procedure comes from meta replica assignment (HBASE-21624), in case it > matters w.r.t. the engine initialization; however, the master is functional > and other procedures run fine. I can also see lots of other open region > procedures with a similar patterns that were initialized before this one and > have run fine. > Currently, there are no other runnable procedures on master - a lot of > succeeded procedures since then, the parent blocked on this procedure, and > one unrelated RIT procedure waiting with timeout and being updated > periodically. > The procedure itself is > {noformat} > 157156157155 RUNNABLEhadoop > org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure Wed Dec 19 > 17:20:27 PST 2018Wed Dec 19 17:20:28 PST 2018[ { region => { > regionId => '1', tableName => { ... }, startKey => '', endKey => '', offline > => 'false', split => 'false', replicaId => '1' }, targetServer => { hostName > => 'server1', port => '17020', startCode => '1545266805778' } }, {} ] > {noformat} > This is in PST so it's been like that for ~19 hours. 
> The only line involving this PID in the log is {noformat} > 2018-12-19 17:20:27,974 INFO [PEWorker-4] procedure2.ProcedureExecutor: > Initialized subprocedures=[{pid=157156, ppid=157155, state=RUNNABLE, > hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure}] > {noformat} > There are no other useful logs for either this PID, parent PID, or region in > question since. This PEWorker (4) is also alive and did some work since then, > so it's not like the thread errored out somewhere. > All the PEWorker-s are waiting for work: > {noformat} > Thread 158 (PEWorker-16): > State: TIMED_WAITING > Blocked count: 1340 > Waited count: 5064 > Stack: > sun.misc.Unsafe.park(Native Method) > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) > > org.apache.hadoop.hbase.procedure2.AbstractProcedureScheduler.poll(AbstractProcedureScheduler.java:171) > > org.apache.hadoop.hbase.procedure2.AbstractProcedureScheduler.poll(AbstractProcedureScheduler.java:153) > > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1949) > {noformat} > The main assignment procedure for this region is blocked on it: > {noformat} > 157155WAITING hadoop TransitRegionStateProcedure > table=hbase:meta, region=534574363, ASSIGN Wed Dec 19 17:20:27 PST 2018 > Wed Dec 19 17:20:27 PST 2018[ { state => [ '1', '2', '3' ] }, { > regionId => '1', tableName => { ... 
}, startKey => '', endKey => '', offline > => 'false', split => 'false', replicaId => '1' }, { initialState => > 'REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE', lastState => > 'REGION_STATE_TRANSITION_CONFIRM_OPENED', assignCandidate => { hostName => > 'server1', port => '17020', startCode => '1545266805778' }, forceNewPlan => > 'false' } ] > 2018-12-19 17:20:27,673 INFO [PEWorker-9] > procedure.MasterProcedureScheduler: Took xlock for pid=157155, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; > TransitRegionStateProcedure table=hbase:meta, region=..., ASSIGN > 2018-12-19 17:20:27,809 INFO [PEWorker-9] > assignment.TransitRegionStateProcedure: Starting pid=157155, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; > TransitRegionStateProcedure table=hbase:meta, region=..., ASSIGN; > rit=OFFLINE, location=server1,17020,1545266805778; forceNewPlan=false, > retain=false > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21873) IPCUtil.wrapException should keep the original exception types for all the connection exceptions
[ https://issues.apache.org/jira/browse/HBASE-21873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-21873. -- Resolution: Fixed > IPCUtil.wrapException should keep the original exception types for all the > connection exceptions > > > Key: HBASE-21873 > URL: https://issues.apache.org/jira/browse/HBASE-21873 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Duo Zhang >Priority: Blocker > Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5, 2.3.0 > > Attachments: HBASE-21862-forUT.patch, HBASE-21862-v1.patch, > HBASE-21862-v2.patch, HBASE-21862.patch > > > It's a classic bug, sort of... the call times out to open the region, but RS > actually processes it alright. It could also happen if the response didn't > make it back due to a network issue. > As a result region is opened on two servers. > There are some mitigations possible to narrow down the race window. > 1) Don't process expired open calls, fail them. Won't help for network issues. > 2) Don't ignore invalid RS state, kill it (YouAreDead exception) - but that > will require fixing other network races where master kills RS, which would > require adding state versioning to the protocol. > The fundamental fix though would require either > 1) an unknown failure from open to ascertain the state of the region from the > server. Again, this would probably require protocol changes to make sure we > ascertain the region is not opened, and also that the > already-failed-on-master open is NOT going to be processed if it's some queue > or even in transit on the network (via a nonce-like mechanism)? > 2) some form of a distributed lock per region, e.g. in ZK > 3) some form of 2PC? but the participant list cannot be determined in a > manner that's both scalable and guaranteed correct. Theoretically it could be > all RSes. 
> {noformat} > 2019-02-08 03:21:31,715 INFO [PEWorker-7] > procedure.MasterProcedureScheduler: Took xlock for pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; > TransitRegionStateProcedure table=table, > region=d0214809147e43dc6870005742d5d204, ASSIGN > 2019-02-08 03:21:31,758 INFO [PEWorker-7] > assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; > TransitRegionStateProcedure table=table, > region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPEN, > location=server1,17020,1549567999303; forceNewPlan=false, retain=true > 2019-02-08 03:21:31,984 INFO [PEWorker-13] assignment.RegionStateStore: > pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, > regionState=OPENING, regionLocation=server1,17020,1549623714617 > 2019-02-08 03:22:32,552 WARN [RSProcedureDispatcher-pool4-t3451] > assignment.RegionRemoteProcedureBase: The remote operation pid=260637, > ppid=260626, state=RUNNABLE, hasLock=false; > org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region ... > to server server1,17020,1549623714617 failed > java.io.IOException: Call to server1/...:17020 failed on local exception: > org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, > waitTime=60145, rpcTimeout=6^M > at > org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:185)^M > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)^M > ... > Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, > waitTime=60145, rpcTimeout=6^M > at > org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:200)^M > ... 4 more^M > {noformat} > RS: > {noformat} > hbase-regionserver.log:2019-02-08 03:22:41,131 INFO > [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: > Open ...d0214809147e43dc6870005742d5d204. > ... 
> hbase-regionserver.log:2019-02-08 03:25:44,751 INFO > [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: > Opened ...d0214809147e43dc6870005742d5d204. > {noformat} > Retry: > {noformat} > 2019-02-08 03:22:32,967 INFO [PEWorker-6] > assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; > pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; > TransitRegionStateProcedure table=table, > region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, > location=server1,17020,1549623714617 > 2019-02-08 03:22:33,084 INFO [PEWorker-6] > assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; > Transit
[jira] [Created] (HBASE-21873) IPCUtil.wrapException should keep the original exception types for all the connection exceptions
Sergey Shelukhin created HBASE-21873: Summary: IPCUtil.wrapException should keep the original exception types for all the connection exceptions Key: HBASE-21873 URL: https://issues.apache.org/jira/browse/HBASE-21873 Project: HBase Issue Type: Bug Affects Versions: 3.0.0, 2.2.0 Reporter: Sergey Shelukhin Assignee: Duo Zhang Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5, 2.3.0 Attachments: HBASE-21862-forUT.patch, HBASE-21862-v1.patch, HBASE-21862-v2.patch, HBASE-21862.patch It's a classic bug, sort of... the call to open the region times out, but the RS actually processes it fine. It could also happen if the response didn't make it back due to a network issue. As a result, the region is opened on two servers. There are some mitigations possible to narrow down the race window. 1) Don't process expired open calls, fail them. Won't help for network issues. 2) Don't ignore invalid RS state, kill it (YouAreDead exception) - but that will require fixing other network races where the master kills the RS, which would require adding state versioning to the protocol. The fundamental fix, though, would require either 1) on an unknown failure from open, ascertaining the state of the region from the server. Again, this would probably require protocol changes to make sure we ascertain the region is not opened, and also that the already-failed-on-master open is NOT going to be processed if it's in some queue or even in transit on the network (via a nonce-like mechanism)? 2) some form of a distributed lock per region, e.g. in ZK 3) some form of 2PC? but the participant list cannot be determined in a manner that's both scalable and guaranteed correct. Theoretically it could be all RSes. 
{noformat} 2019-02-08 03:21:31,715 INFO [PEWorker-7] procedure.MasterProcedureScheduler: Took xlock for pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN 2019-02-08 03:21:31,758 INFO [PEWorker-7] assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPEN, location=server1,17020,1549567999303; forceNewPlan=false, retain=true 2019-02-08 03:21:31,984 INFO [PEWorker-13] assignment.RegionStateStore: pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, regionState=OPENING, regionLocation=server1,17020,1549623714617 2019-02-08 03:22:32,552 WARN [RSProcedureDispatcher-pool4-t3451] assignment.RegionRemoteProcedureBase: The remote operation pid=260637, ppid=260626, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region ... to server server1,17020,1549623714617 failed java.io.IOException: Call to server1/...:17020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, waitTime=60145, rpcTimeout=6^M at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:185)^M at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)^M ... Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, waitTime=60145, rpcTimeout=6^M at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:200)^M ... 4 more^M {noformat} RS: {noformat} hbase-regionserver.log:2019-02-08 03:22:41,131 INFO [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: Open ...d0214809147e43dc6870005742d5d204. ... 
hbase-regionserver.log:2019-02-08 03:25:44,751 INFO [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: Opened ...d0214809147e43dc6870005742d5d204. {noformat} Retry: {noformat} 2019-02-08 03:22:32,967 INFO [PEWorker-6] assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, location=server1,17020,1549623714617 2019-02-08 03:22:33,084 INFO [PEWorker-6] assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, location=null; forceNewPlan=true, retain=false 2019-02-08 03:22:33,238 INFO [PEWorker-7] assignment.RegionStateStore: pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, regionState=OPENING, regionLocation=server2,17020,1549569075319 {noformat} The ignore-message {noformat} 2019-02-08 03:25:44,754 WARN [RpcServe
[jira] [Reopened] (HBASE-21862) IPCUtil.wrapException should keep the original exception types for all the connection exceptions
[ https://issues.apache.org/jira/browse/HBASE-21862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reopened HBASE-21862: -- Assignee: Sergey Shelukhin (was: Duo Zhang) > IPCUtil.wrapException should keep the original exception types for all the > connection exceptions > > > Key: HBASE-21862 > URL: https://issues.apache.org/jira/browse/HBASE-21862 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Blocker > Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5, 2.3.0 > > Attachments: HBASE-21862-forUT.patch, HBASE-21862-v1.patch, > HBASE-21862-v2.patch, HBASE-21862.patch > > > It's a classic bug, sort of... the call times out to open the region, but RS > actually processes it alright. It could also happen if the response didn't > make it back due to a network issue. > As a result region is opened on two servers. > There are some mitigations possible to narrow down the race window. > 1) Don't process expired open calls, fail them. Won't help for network issues. > 2) Don't ignore invalid RS state, kill it (YouAreDead exception) - but that > will require fixing other network races where master kills RS, which would > require adding state versioning to the protocol. > The fundamental fix though would require either > 1) an unknown failure from open to ascertain the state of the region from the > server. Again, this would probably require protocol changes to make sure we > ascertain the region is not opened, and also that the > already-failed-on-master open is NOT going to be processed if it's some queue > or even in transit on the network (via a nonce-like mechanism)? > 2) some form of a distributed lock per region, e.g. in ZK > 3) some form of 2PC? but the participant list cannot be determined in a > manner that's both scalable and guaranteed correct. Theoretically it could be > all RSes. 
> {noformat} > 2019-02-08 03:21:31,715 INFO [PEWorker-7] > procedure.MasterProcedureScheduler: Took xlock for pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; > TransitRegionStateProcedure table=table, > region=d0214809147e43dc6870005742d5d204, ASSIGN > 2019-02-08 03:21:31,758 INFO [PEWorker-7] > assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; > TransitRegionStateProcedure table=table, > region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPEN, > location=server1,17020,1549567999303; forceNewPlan=false, retain=true > 2019-02-08 03:21:31,984 INFO [PEWorker-13] assignment.RegionStateStore: > pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, > regionState=OPENING, regionLocation=server1,17020,1549623714617 > 2019-02-08 03:22:32,552 WARN [RSProcedureDispatcher-pool4-t3451] > assignment.RegionRemoteProcedureBase: The remote operation pid=260637, > ppid=260626, state=RUNNABLE, hasLock=false; > org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region ... > to server server1,17020,1549623714617 failed > java.io.IOException: Call to server1/...:17020 failed on local exception: > org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, > waitTime=60145, rpcTimeout=6^M > at > org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:185)^M > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)^M > ... > Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, > waitTime=60145, rpcTimeout=6^M > at > org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:200)^M > ... 4 more^M > {noformat} > RS: > {noformat} > hbase-regionserver.log:2019-02-08 03:22:41,131 INFO > [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: > Open ...d0214809147e43dc6870005742d5d204. > ... 
> hbase-regionserver.log:2019-02-08 03:25:44,751 INFO > [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: > Opened ...d0214809147e43dc6870005742d5d204. > {noformat} > Retry: > {noformat} > 2019-02-08 03:22:32,967 INFO [PEWorker-6] > assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; > pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; > TransitRegionStateProcedure table=table, > region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, > location=server1,17020,1549623714617 > 2019-02-08 03:22:33,084 INFO [PEWorker-6] > assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, > state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN
[jira] [Created] (HBASE-21864) add region state version and reinstate YouAreDead exception in region report
Sergey Shelukhin created HBASE-21864: Summary: add region state version and reinstate YouAreDead exception in region report Key: HBASE-21864 URL: https://issues.apache.org/jira/browse/HBASE-21864 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin The state version will ensure we don't have network-related races (e.g. the one I reported in some other bug - RS: "report {R1} ..." M: "close R1" RS: "I closed R1" M: ... "receive report {R1}" M: "you shouldn't have R1, die"). Then we can revert the change that removed the YouAreDead exception... an RS in an incorrect state should be either brought into the correct state or killed, because it indicates a bug; right now, if a double assignment happens (I found 2 different cases just this week ;)) the master lets the RS with the incorrect assignment keep it forever. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
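A minimal sketch of the state-versioning idea (class and method names are hypothetical, not HBase's actual API): the master bumps a per-region version whenever it initiates a transition, and a report carrying an older version is recognized as stale instead of being trusted.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: track the latest state version the master has issued
// per region, so delayed reports from the old state can be identified.
class RegionStateVersions {
    private final Map<String, Long> latest = new HashMap<>();

    // The master bumps the version whenever it initiates a transition
    // (e.g. "close R1"), and sends the new version with the request.
    long bump(String region) {
        return latest.merge(region, 1L, Long::sum);
    }

    // A report is stale if it predates the master's latest transition:
    // the "receive report {R1}" in the race above would be dropped here
    // instead of triggering a kill.
    boolean isStale(String region, long reportedVersion) {
        return reportedVersion < latest.getOrDefault(region, 0L);
    }
}
```

With this in place, the YouAreDead path only fires on reports that are both current and inconsistent, which is the genuinely buggy case.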
[jira] [Created] (HBASE-21863) narrow down the double-assignment race window
Sergey Shelukhin created HBASE-21863: Summary: narrow down the double-assignment race window Key: HBASE-21863 URL: https://issues.apache.org/jira/browse/HBASE-21863 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin See HBASE-21862. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21862) region can be assigned to 2 servers due to a timed-out call
Sergey Shelukhin created HBASE-21862: Summary: region can be assigned to 2 servers due to a timed-out call Key: HBASE-21862 URL: https://issues.apache.org/jira/browse/HBASE-21862 Project: HBase Issue Type: Bug Affects Versions: 3.0.0, 2.2.0 Reporter: Sergey Shelukhin It's a classic bug, sort of... the call to open the region times out, but the RS actually processes it fine. It could also happen if the response didn't make it back due to a network issue. As a result, the region is opened on two servers. There are some mitigations possible to narrow down the race window. 1) Don't process expired open calls, fail them. 2) Don't ignore invalid RS state, kill it (YouAreDead exception) - but that will require fixing other network races where the master kills the RS, which would require adding state versioning to the protocol. The fundamental fix, though, would require either 1) on an unknown failure from open, ascertaining the state of the region from the server. Again, this would probably require protocol changes to make sure we ascertain the region is not opened, and also that the already-failed-on-master open is NOT going to be processed if it's in some queue or even in transit on the network (via a nonce-like mechanism)? 2) some form of a distributed lock per region, e.g. in ZK 3) some form of 2PC? but the participant list cannot be determined in a manner that's both scalable and guaranteed correct. Theoretically it could be all RSes. 
{noformat} 2019-02-08 03:21:31,715 INFO [PEWorker-7] procedure.MasterProcedureScheduler: Took xlock for pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN 2019-02-08 03:21:31,758 INFO [PEWorker-7] assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPEN, location=server1,17020,1549567999303; forceNewPlan=false, retain=true 2019-02-08 03:21:31,984 INFO [PEWorker-13] assignment.RegionStateStore: pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, regionState=OPENING, regionLocation=server1,17020,1549623714617 2019-02-08 03:22:32,552 WARN [RSProcedureDispatcher-pool4-t3451] assignment.RegionRemoteProcedureBase: The remote operation pid=260637, ppid=260626, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region ... to server server1,17020,1549623714617 failed java.io.IOException: Call to server1/...:17020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, waitTime=60145, rpcTimeout=6^M at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:185)^M at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:391)^M ... Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27191, waitTime=60145, rpcTimeout=6^M at org.apache.hadoop.hbase.ipc.RpcConnection$1.run(RpcConnection.java:200)^M ... 4 more^M {noformat} RS: {noformat} hbase-regionserver.log:2019-02-08 03:22:41,131 INFO [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: Open ...d0214809147e43dc6870005742d5d204. ... 
hbase-regionserver.log:2019-02-08 03:25:44,751 INFO [RS_OPEN_REGION-regionserver/server1:17020-2] handler.AssignRegionHandler: Opened ...d0214809147e43dc6870005742d5d204. {noformat} Retry: {noformat} 2019-02-08 03:22:32,967 INFO [PEWorker-6] assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, location=server1,17020,1549623714617 2019-02-08 03:22:33,084 INFO [PEWorker-6] assignment.TransitRegionStateProcedure: Starting pid=260626, ppid=260595, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=table, region=d0214809147e43dc6870005742d5d204, ASSIGN; rit=OPENING, location=null; forceNewPlan=true, retain=false 2019-02-08 03:22:33,238 INFO [PEWorker-7] assignment.RegionStateStore: pid=260626 updating hbase:meta row=d0214809147e43dc6870005742d5d204, regionState=OPENING, regionLocation=server2,17020,1549569075319 {noformat} The ignore-message {noformat} 2019-02-08 03:25:44,754 WARN [RpcServer.default.FPBQ.Fifo.handler=34,queue=4,port=17000] assignment.TransitRegionStateProcedure: Received report OPENED transition from server1,17020,1549623714617 for rit=OPENING, location=server2,17020,1549569075319, table=table, region=d0214809147e43dc6870005742d5
[jira] [Created] (HBASE-21858) report more metrics for exceptions; DFS error categories; RS abort reasons
Sergey Shelukhin created HBASE-21858: Summary: report more metrics for exceptions; DFS error categories; RS abort reasons Key: HBASE-21858 URL: https://issues.apache.org/jira/browse/HBASE-21858 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin These are useful for debugging without having to read the logs as much, esp. when there are many machines involved. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21850) shell commands don't recognize meta region's full name
[ https://issues.apache.org/jira/browse/HBASE-21850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-21850. -- Resolution: Not A Problem > shell commands don't recognize meta region's full name > -- > > Key: HBASE-21850 > URL: https://issues.apache.org/jira/browse/HBASE-21850 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Priority: Minor > > {noformat} > hbase(main):001:0> move "hbase:meta,,1.1588230740", "server1" > ERROR: Unknown region hbase:meta,,1.1588230740! > For usage try 'help "move"' > Took 1.1780 seconds > hbase(main):003:0> move "1588230740", "server1" > Took 84.3050 seconds > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21851) slow meta can cause master to sort of deadlock and bring down the cluster
Sergey Shelukhin created HBASE-21851: Summary: slow meta can cause master to sort of deadlock and bring down the cluster Key: HBASE-21851 URL: https://issues.apache.org/jira/browse/HBASE-21851 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Due to many threads sync-retrying to update meta for a really long time, the master doesn't appear to have enough threads to process requests. The meta server died but its SCP is not processed; I'm not sure if it's because the threads are full, or some other reason (the ZK issue we've seen earlier in our cluster?) {noformat} 2019-02-05 13:20:39,225 INFO [KeepAlivePEWorker-32] assignment.RegionStateStore: pid=805758 updating hbase:meta row=7130dac84857699b8cd0061298b6fe9c, regionState=OPENING, regionLocation=server,17020,1549400274239 ... 2019-02-05 13:39:42,521 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck KeepAlivePEWorker-32(pid=805758), run time 19mins, 3.296sec {noformat} It starts dropping timed-out calls: {noformat} 2019-02-05 13:39:45,877 WARN [RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] ipc.RpcServer: Dropping timed out call: callId: 7 service: RegionServerStatusService methodName: RegionServerReport size: 102 connection: ...:35743 deadline: 1549401663387 ... RS: 2019-02-05 13:39:45,521 INFO [RS_OPEN_REGION-regionserver/..:17020-4] regionserver.HRegionServer: Failed report transition server ... org.apache.hadoop.hbase.CallQueueTooBigException: Call queue is full on ..., too many items queued ? {noformat} I think this eventually causes RSes to kill themselves, further increasing load on the master. I wonder if the meta retry should be async? That way other calls could be processed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21850) shell commands don't recognize meta region's full name
Sergey Shelukhin created HBASE-21850: Summary: shell commands don't recognize meta region's full name Key: HBASE-21850 URL: https://issues.apache.org/jira/browse/HBASE-21850 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin {noformat} hbase(main):001:0> move "hbase:meta,,1.1588230740", "server1" ERROR: Unknown region hbase:meta,,1.1588230740! For usage try 'help "move"' Took 1.1780 seconds hbase(main):003:0> move "1588230740", "server1" Took 84.3050 seconds {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21849) master serving regions displays a version warning about itself in the UI
Sergey Shelukhin created HBASE-21849: Summary: master serving regions displays a version warning about itself in the UI Key: HBASE-21849 URL: https://issues.apache.org/jira/browse/HBASE-21849 Project: HBase Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sergey Shelukhin {noformat} ... master-node,17000,1549396264134 Tue Feb 05 11:51:04 PST 20192 s 0.0.0 0 0 rs-node,17020,1549396262825 Tue Feb 05 11:51:02 PST 20192 s 3.0.4-SNAPSHOT 0 50 Total:NN1 nodes with inconsistent version {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21825) proxy support for cross-machine links in master/RS UI
Sergey Shelukhin created HBASE-21825: Summary: proxy support for cross-machine links in master/RS UI Key: HBASE-21825 URL: https://issues.apache.org/jira/browse/HBASE-21825 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Follow-up to HBASE-21824. MasterStatusServlet needs to pass the current request URL to the template. Other than having to explicitly specify the proxy URL format (what if there are several proxies?), the best way to approach this seems to be for the above (optionally, if enabled) to check whether the URL authority includes the current server name; if not, do a string search for the server name and port in the URL. If found, create a URL format string to be used for all the links to other masters and RSes. E.g. if the current URL is MASTERMACHINE:12355/master-status, nothing needs to be done. If it's 127.0.0.1:12355/master-status, there's similarly nothing to replace. If it's e.g. myproxy/foo/server/MASTERMACHINE/port/12355/master-status, the RS link might be myproxy/foo/server/RSMACHINE/port/12356/rs-status. This can be disabled by default to avoid false positives. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
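A sketch of the detection logic described in the message, under the assumption that the proxy embeds the host and port in the path exactly as in the example URL (the "/port/" segment); the class and method are hypothetical, not existing HBase code:

```java
// Hypothetical sketch: derive a cross-machine link template from the request
// URL, or return null when relative links already work.
class ProxyLinkTemplate {
    /** Returns a format string with %s/%d placeholders for host/port, or null. */
    static String derive(String requestUrl, String host, int port) {
        // Authority = everything between the scheme (if any) and the first '/'.
        String authority = requestUrl.replaceFirst("^[a-z]+://", "").split("/", 2)[0];
        if (authority.startsWith(host) || authority.startsWith("127.0.0.1")
                || authority.startsWith("localhost")) {
            return null; // direct access: nothing to rewrite
        }
        // Proxy-style URL: look for "<host>/port/<port>" in the path.
        String needle = host + "/port/" + port;
        int at = requestUrl.indexOf(needle);
        if (at < 0) {
            return null; // server name not found: don't guess (avoids false positives)
        }
        return requestUrl.substring(0, at) + "%s/port/%d"
                + requestUrl.substring(at + needle.length());
    }
}
```

To link to another server, the template would be formatted with the target host and port (and the trailing page, e.g. master-status vs rs-status, swapped separately).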
[jira] [Created] (HBASE-21824) change master and RS UI links to be relative
Sergey Shelukhin created HBASE-21824: Summary: change master and RS UI links to be relative Key: HBASE-21824 URL: https://issues.apache.org/jira/browse/HBASE-21824 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-21824.patch When HBase services are accessed through a proxy, e.g. proxy/foo/bar/machine:port/master-status, the current links on the page lead to e.g. proxy/procedures.jsp, because they start with a slash. There seems to be no reason for them to start with a slash, since all the pages are at the same level. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
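Why dropping the leading slash fixes this follows directly from standard relative-URL resolution; the stock JDK resolver illustrates it (this is just a demo, not HBase code):

```java
import java.net.URI;

// Demonstrates the difference the leading slash makes when the UI page is
// served behind a path-based proxy. A root-relative link ("/procedures.jsp")
// resolves against the proxy host's root, losing the proxy prefix; a plain
// relative link resolves against the page's own directory, preserving it.
public class RelativeLinkDemo {
    public static String resolve(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link).toString();
    }
}
```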
[jira] [Created] (HBASE-21823) only report RS abort once
Sergey Shelukhin created HBASE-21823: Summary: only report RS abort once Key: HBASE-21823 URL: https://issues.apache.org/jira/browse/HBASE-21823 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin When an RS is aborted due to an error, it starts shutting down various things (e.g. WAL, files, etc.), causing various other threads to hit fatal errors and in turn call abort, dumping more logs and reporting to the master again. This pollutes RS logs and makes the master's log dump confusing w.r.t. the aborted RS, with several messages for the same server carrying bogus errors. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
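A minimal sketch of the "report once" idea, assuming a simple compare-and-set guard; this is illustrative only, not the actual HRegionServer abort path:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hedged sketch: one way to make abort idempotent. The first caller wins;
// later callers (threads hitting secondary errors while the WAL, files, etc.
// shut down) take the cheap path instead of dumping state and reporting to
// the master again.
public class AbortOnce {
    private final AtomicBoolean aborting = new AtomicBoolean(false);
    public int fullAborts = 0;  // expensive path: full dump + report to master
    public int suppressed = 0;  // cheap path: short one-line log

    public void abort(String reason, Throwable cause) {
        if (aborting.compareAndSet(false, true)) {
            fullAborts++;  // only the first abort reports and dumps
        } else {
            suppressed++;  // already aborting; don't spam logs or the master
        }
    }
}
```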
[jira] [Created] (HBASE-21817) skip records with corrupted cells in WAL splitting
Sergey Shelukhin created HBASE-21817: Summary: skip records with corrupted cells in WAL splitting Key: HBASE-21817 URL: https://issues.apache.org/jira/browse/HBASE-21817 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin {noformat} 2018-12-13 17:01:12,208 ERROR [RS_LOG_REPLAY_OPS-regionserver/...] executor.EventHandler: Caught throwable while processing event RS_LOG_REPLAY java.lang.RuntimeException: java.lang.NegativeArraySizeException at org.apache.hadoop.hbase.wal.WALSplitter$PipelineController.checkForErrors(WALSplitter.java:846) at org.apache.hadoop.hbase.wal.WALSplitter$OutputSink.finishWriting(WALSplitter.java:1203) at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.finishWritingAndClose(WALSplitter.java:1267) at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:349) at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:196) at org.apache.hadoop.hbase.regionserver.SplitLogWorker.splitLog(SplitLogWorker.java:178) at org.apache.hadoop.hbase.regionserver.SplitLogWorker.lambda$new$0(SplitLogWorker.java:90) at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:70) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NegativeArraySizeException at org.apache.hadoop.hbase.CellUtil.cloneFamily(CellUtil.java:113) at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.filterCellByStore(WALSplitter.java:1542) at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.appendBuffer(WALSplitter.java:1586) at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.append(WALSplitter.java:1560) at 
org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1085) at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1077) at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1047) {noformat} Unfortunately I cannot share the file. The issue appears to be straightforward: for whatever reason, the family length is negative. I'm not sure how such a cell got created; I suspect the file was corrupted. {code} byte[] output = new byte[cell.getFamilyLength()]; {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
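The allocation above trusts the serialized family length. A hedged sketch of the kind of bounds check that would let splitting skip such a record instead of failing the whole split with NegativeArraySizeException (class and method names are illustrative, not the actual CellUtil/WALSplitter code):

```java
public class CellSanity {
    // Illustrative guard: validate the serialized lengths before cloning.
    // Returns null (caller skips/records the corrupt cell) instead of letting
    // a negative or out-of-range length blow up in the array allocation.
    public static byte[] cloneFamily(byte[] buf, int famOffset, byte famLength) {
        if (famLength < 0 || famOffset < 0 || famOffset + famLength > buf.length) {
            return null; // corrupt cell: skip it rather than throw
        }
        byte[] out = new byte[famLength];
        System.arraycopy(buf, famOffset, out, 0, famLength);
        return out;
    }
}
```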
[jira] [Created] (HBASE-21813) ServerNotRunningYet exception should include machine-readable server name
Sergey Shelukhin created HBASE-21813: Summary: ServerNotRunningYet exception should include machine-readable server name Key: HBASE-21813 URL: https://issues.apache.org/jira/browse/HBASE-21813 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin As far as I can see, this exception is thrown before the start code of the destination can be verified from the request on the RS side; the code that handles it (e.g. retries from the open region procedure) uses it to retry later on the same server. However, if the start code of the server that is not running differs from the start code the operation intended, those retries are a waste of time. The exception should include a server name with a start code (which it already includes as part of the message in some cases), so that the caller can check that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
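For illustration, a start-code-aware check could look like this, using the host,port,startcode form that already appears in the logs; ParsedServerName is a made-up stand-in for HBase's real ServerName class:

```java
public class ParsedServerName {
    public final String host;
    public final int port;
    public final long startCode;

    // Parses the "host,port,startcode" form seen in HBase logs.
    public ParsedServerName(String name) {
        String[] parts = name.split(",");
        host = parts[0];
        port = Integer.parseInt(parts[1]);
        startCode = Long.parseLong(parts[2]);
    }

    // Same process iff host, port AND start code all match. A matching
    // host:port with a different start code is a restarted server, so
    // retrying the old operation against it is pointless.
    public boolean sameProcess(ParsedServerName other) {
        return host.equals(other.host) && port == other.port
            && startCode == other.startCode;
    }
}
```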
[jira] [Created] (HBASE-21811) region can be opened on two servers due to race condition with procedures and server reports
Sergey Shelukhin created HBASE-21811: Summary: region can be opened on two servers due to race condition with procedures and server reports Key: HBASE-21811 URL: https://issues.apache.org/jira/browse/HBASE-21811 Project: HBase Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sergey Shelukhin Looks like the region server responses are being processed incorrectly in places, allowing the region to be opened on two servers. * The region server report handling in procedures should check which server is reporting. * Also, although I didn't check (and it isn't implicated in this bug), the RS must check in OPEN that it is actually the RS the master sent the open to (w.r.t. start timestamp). This was previously "mitigated" by the master killing the RS with incorrect reports, but due to race conditions with reports and assignment the kill was replaced with a warning, so now this condition persists. Regardless, the kill approach is not a good fix because there's still a window in which a region can be opened on two servers. A region is being opened by server_48c. The server dies, and we process the retry correctly (retry=3 because 2 previous similar open failures were processed correctly). We start opening it on server_1aa now. 
{noformat} 2019-01-28 18:12:09,862 INFO [KeepAlivePEWorker-104] assignment.RegionStateStore: pid=4915 updating hbase:meta row=8be2a423b16471b9417f0f7de04281c6, regionState=ABNORMALLY_CLOSED 2019-01-28 18:12:09,862 INFO [KeepAlivePEWorker-104] procedure.ServerCrashProcedure: pid=11944, state=RUNNABLE:SERVER_CRASH_ASSIGN, hasLock=true; ServerCrashProcedure server=server_48c,17020,1548726406632, splitWal=true, meta=false found RIT pid=4915, ppid=7, state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=table, region=8be2a423b16471b9417f0f7de04281c6, ASSIGN; rit=OPENING, location=server_48c,17020,1548726406632, table=table, region=8be2a423b16471b9417f0f7de04281c6 2019-01-28 18:12:10,778 INFO [KeepAlivePEWorker-80] assignment.TransitRegionStateProcedure: Retry=3 of max=2147483647; pid=4915, ppid=7, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=table, region=8be2a423b16471b9417f0f7de04281c6, ASSIGN; rit=ABNORMALLY_CLOSED, location=null ... 2019-01-28 18:12:10,902 INFO [KeepAlivePEWorker-80] assignment.TransitRegionStateProcedure: Starting pid=4915, ppid=7, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=table, region=8be2a423b16471b9417f0f7de04281c6, ASSIGN; rit=ABNORMALLY_CLOSED, location=null; forceNewPlan=true, retain=false 2019-01-28 18:12:11,114 INFO [PEWorker-7] assignment.RegionStateStore: pid=4915 updating hbase:meta row=8be2a423b16471b9417f0f7de04281c6, regionState=OPENING, regionLocation=server_1aa,17020,1548727658713 {noformat} However, we get the remote procedure failure from 48c after we've already started that. It actually tried to open on the restarted RS, which makes me wonder if this is safe also w.r.t. other races - what if RS already initialized and didn't error out? Need to check if we verify the start code expected by master on RS when opening. 
{noformat} 2019-01-28 18:12:12,179 WARN [RSProcedureDispatcher-pool4-t362] assignment.RegionRemoteProcedureBase: The remote operation pid=11050, ppid=4915, state=SUCCESS, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region {ENCODED => 8be2a423b16471b9417f0f7de04281c6 ... to server server_48c,17020,1548726406632 failed org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server server_48c,17020,1548727752747 is not running yet {noformat} Without any other reason (at least logged), the RIT immediately retries again and chooses a new candidate. {noformat} 2019-01-28 18:12:12,289 INFO [KeepAlivePEWorker-100] assignment.TransitRegionStateProcedure: Retry=4 of max=2147483647; pid=4915, ppid=7, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=table, region=8be2a423b16471b9417f0f7de04281c6, ASSIGN; rit=OPENING, location=server_1aa,17020,1548727658713 ... 2019-01-28 18:12:12,668 INFO [PEWorker-9] assignment.RegionStateStore: pid=4915 updating hbase:meta row=8be2a423b16471b9417f0f7de04281c6, regionState=OPENING, regionLocation=server_fd3,17020,1548727536972 {noformat} It then retries again and goes to the new 48c, but that's unrelated. {noformat} 2019-01-28 18:26:29,480 INFO [KeepAlivePEWorker-154] assignment.RegionStateStore: pid=4915 updating hbase:meta row=8be2a423b16471b9417f0f7de04281c6, regionState=OPENING, regionLocation=server_48c,17020,1548727752747 {noformat} What does happen though is that 1aa, that never responded when the RIT erroneously retri
[jira] [Created] (HBASE-21806) add an option to roll WAL on very slow syncs
Sergey Shelukhin created HBASE-21806: Summary: add an option to roll WAL on very slow syncs Key: HBASE-21806 URL: https://issues.apache.org/jira/browse/HBASE-21806 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin In large heterogeneous clusters, a slow datanode can sometimes cause WAL syncs to be very slow. In this case, before the bad datanode recovers, or is discovered and repaired, it would be helpful to roll the WAL on a very slow sync to get a new pipeline. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
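A minimal sketch of the proposed policy, assuming a single configurable duration threshold; all names here are made up for illustration, not the actual WAL code:

```java
public class SlowSyncRollPolicy {
    // Hedged sketch: after each WAL sync completes, compare its measured
    // duration against a configured ceiling and request a roll, which opens
    // a new pipeline that will hopefully exclude the slow datanode.
    private final long rollOnSyncNs;
    public boolean rollRequested = false;

    public SlowSyncRollPolicy(long rollOnSyncMs) {
        this.rollOnSyncNs = rollOnSyncMs * 1_000_000L;
    }

    // Called with the measured duration of one completed WAL sync.
    public void onSyncCompleted(long durationNs) {
        if (durationNs > rollOnSyncNs) {
            rollRequested = true; // signal the log roller to open a new WAL
        }
    }
}
```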
[jira] [Created] (HBASE-21797) more resilient master startup for bad cluster state
Sergey Shelukhin created HBASE-21797: Summary: more resilient master startup for bad cluster state Key: HBASE-21797 URL: https://issues.apache.org/jira/browse/HBASE-21797 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin See HBASE-21743 for broader context. Upon restart after a failure, the master should already be able to handle having failed to persist the state of some procedures (because by definition the cluster is much more likely to be in a bad state if the master restarted due to some issue), so it should also be able to abandon old recovery procedures (SCP & RIT and their children) as if they were not saved, and create new ones during startup. This should be off by default. The idea is (some steps can be done in parallel as they are now, e.g. loading the server list and meta):
1) During proc WAL recovery, do not recover SCPs and open/close-related procs.
2) Load the server list as usual (dead and alive).
3) Recover meta via either a new SCP (or perhaps just a separate meta recovery proc without the extra SCP steps, leaving the SCP for step 5), if it's on a dead server.
4) Load the region list as usual.
5) Create SCPs for dead servers.
6) Reassign any regions on non-existent servers (we've seen some issues with this after an SCP finishes but there are lots of HDFS errors and/or manual intervention, so the master "forgets" the server ever existed and the region stays "open" there forever).
7) ? Look for other simple inconsistencies that don't require HBCK-level changes.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21788) OpenRegionProcedure (after recovery?) is unreliable and needs to be improved
Sergey Shelukhin created HBASE-21788: Summary: OpenRegionProcedure (after recovery?) is unreliable and needs to be improved Key: HBASE-21788 URL: https://issues.apache.org/jira/browse/HBASE-21788 Project: HBase Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sergey Shelukhin Not much for this one yet. I repeatedly see cases where a region is stuck in OPENING; after master restart the RIT is recovered and stays WAITING, while its OpenRegionProcedure (also recovered) is stuck in Runnable and never does anything for hours. I cannot find logs on the target server indicating that it ever tried to do anything after the master restart. This procedure needs, at the very least, logging of what it's trying to do, and maybe a timeout so it unconditionally fails after a configurable period (1 hour?). I may also investigate why it doesn't do anything and file a separate bug. I wonder if it's somehow related to the region status check, but this is just a hunch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21787) proc WAL replaces a RIT that holds a lock with a RIT that doesn't
Sergey Shelukhin created HBASE-21787: Summary: proc WAL replaces a RIT that holds a lock with a RIT that doesn't Key: HBASE-21787 URL: https://issues.apache.org/jira/browse/HBASE-21787 Project: HBase Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sergey Shelukhin This is not the same as the other bug I just filed, but related: after master restart, 2 RITs for the region are both in the proc WAL. According to the comment where the RIT is restored, this is expected. However, what happens is that the master takes the lock for the older RIT, and then replaces the older RIT with the newer RIT on the region. You can see two "to restore RIT" log lines. Both RITs are still active in the procedures view (and stuck due to yet another bug that I will file later). However, it seems wrong that the lock is held by one RIT while the region points to the other RIT as the correct one. {noformat} 2019-01-25 11:26:54,616 INFO [master/master:17000:becomeActiveMaster] procedure.MasterProcedureScheduler: Took xlock for pid=1738, ppid=3, state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=false; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN 2019-01-25 11:26:54,834 INFO [master/master:17000:becomeActiveMaster] assignment.AssignmentManager: Attach pid=1738, ppid=3, state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=false; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN to rit=OFFLINE, location=null, table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e to restore RIT 2019-01-25 11:26:54,853 INFO [master/master:17000:becomeActiveMaster] assignment.AssignmentManager: Attach pid=4351, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN to rit=OFFLINE, location=null, table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e to restore RIT 2019-01-25 11:27:02,460 INFO [master/master:17000:becomeActiveMaster] 
assignment.RegionStateStore: Load hbase:meta entry region=27f7ab2a05d9d730b2ab2339d1531b8e, regionState=OPENING, lastHost=server1,17020,1548290445704, regionLocation=server2,17020,1548442571056, openSeqNum=120108 2019-01-25 11:27:10,184 INFO [PEWorker-11] procedure.MasterProcedureScheduler: Waiting on xlock for pid=4351, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN held by pid=1738 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21786) RIT for a region without a lock can mess up the RIT that has the lock
Sergey Shelukhin created HBASE-21786: Summary: RIT for a region without a lock can mess up the RIT that has the lock Key: HBASE-21786 URL: https://issues.apache.org/jira/browse/HBASE-21786 Project: HBase Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sergey Shelukhin I cannot find in the log where the 2nd RIT is coming from; the first line I see for it is the one waiting for the lock. It has no parent procedure. One RIT, restored from WAL, manages after a retry to get the region assigned to some server. {noformat} 2019-01-25 10:56:21,878 INFO [master/master:17000:becomeActiveMaster] procedure.MasterProcedureScheduler: Took xlock for pid=1738, ppid=3, state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=false; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN 2019-01-25 10:56:22,055 INFO [master/master:17000:becomeActiveMaster] assignment.AssignmentManager: Attach pid=1738, ppid=3, state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=false; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN to rit=OFFLINE, location=null, table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e to restore RIT 2019-01-25 10:56:51,362 INFO [master/master:17000:becomeActiveMaster] assignment.RegionStateStore: Load hbase:meta entry region=27f7ab2a05d9d730b2ab2339d1531b8e, regionState=OFFLINE, lastHost=server2,17020,1548290445704, regionLocation=server1,17020,1548442302645, openSeqNum=120108 2019-01-25 10:57:26,842 INFO [PEWorker-7] procedure2.ProcedureExecutor: Finished subprocedure(s) of pid=1738, ppid=3, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=false; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN; resume parent processing. 
2019-01-25 10:57:26,842 INFO [PEWorker-12] assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; pid=1738, ppid=3, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN; rit=OFFLINE, location=server1,17020,1548442302645 2019-01-25 10:57:26,902 INFO [PEWorker-12] assignment.TransitRegionStateProcedure: Starting pid=1738, ppid=3, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN; rit=OFFLINE, location=null; forceNewPlan=true, retain=false 2019-01-25 10:57:33,817 INFO [PEWorker-7] assignment.RegionStateStore: pid=1738 updating hbase:meta row=27f7ab2a05d9d730b2ab2339d1531b8e, regionState=OPENING, regionLocation=server3,17020,1548442571056 {noformat} The other RIT appears out of nowhere; there's no "to restore RIT" line for it. I wonder if it could be a side effect of the region being offline, or of the retry above? Regardless, it cannot get the lock. {noformat} 2019-01-25 10:57:46,255 INFO [PEWorker-15] procedure.MasterProcedureScheduler: Waiting on xlock for pid=4351, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, ASSIGN held by pid=1738 {noformat} However, when the server responds that the region is opened, the new RIT 4351 takes the notification and discards it. {noformat} 2019-01-25 10:58:23,263 WARN [RpcServer.default.FPBQ.Fifo.handler=19,queue=4,port=17000] assignment.TransitRegionStateProcedure: Received report OPENED transition from server3,17020,1548442571056 for rit=OPENING, location=server3,17020,1548442571056, table=table, region=27f7ab2a05d9d730b2ab2339d1531b8e, pid=4351, but the TRSP is not in REGION_STATE_TRANSITION_CONFIRM_OPENED state, should be a retry, ignore {noformat} Region is stuck in OPENING forever. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21785) master reports open regions as RITs and also messes up rit age metric
Sergey Shelukhin created HBASE-21785: Summary: master reports open regions as RITs and also messes up rit age metric Key: HBASE-21785 URL: https://issues.apache.org/jira/browse/HBASE-21785 Project: HBase Issue Type: Bug Affects Versions: 3.0.0, 2.2.0 Reporter: Sergey Shelukhin {noformat} Region State RIT time (ms) Retries dba183f0dadfcc9dc8ae0a6dd59c84e6dba183f0dadfcc9dc8ae0a6dd59c84e6. state=OPEN, ts=Wed Dec 31 16:00:00 PST 1969 (1548453918s ago), server=server,17020,1548452922054 1548453918735 0 {noformat} RIT age metric also gets set to a bogus value. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
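The numbers in the UI line up with simple arithmetic: a transition timestamp that was never set (0, i.e. the 1969/1970 epoch shown above) makes "age = now - ts" collapse to now itself, which is exactly the bogus 1548453918735 ms value displayed. A hedged sketch of a guard (names made up, not the actual metric code):

```java
public class RitAge {
    // Illustrative guard: a region whose transition timestamp was never set
    // (ts = 0) should report age 0 rather than a bogus epoch-sized age.
    public static long ageMs(long nowMs, long tsMs) {
        return tsMs <= 0 ? 0 : nowMs - tsMs;
    }
}
```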
[jira] [Created] (HBASE-21774) do not use currentMillis to measure intervals
Sergey Shelukhin created HBASE-21774: Summary: do not use currentMillis to measure intervals Key: HBASE-21774 URL: https://issues.apache.org/jira/browse/HBASE-21774 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin I've noticed this in a few places in the code... currentTimeMillis can go backwards and has other artifacts. nanoTime should be used for intervals, unless both the calls are frequent enough that nanoTime would cause a perf overhead, and artifacts such as negative intervals are relatively harmless or can be worked around in the code. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
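The monotonic pattern the ticket asks for looks like this: System.nanoTime() values are meaningful only as a difference between two calls, but that difference never goes backwards, unlike wall-clock System.currentTimeMillis() under NTP steps or manual clock changes.

```java
public class Interval {
    // Measures elapsed time with the monotonic clock. The start value itself
    // has no absolute meaning; only the difference between two calls does.
    public static long elapsedMs(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }
}
```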
[jira] [Created] (HBASE-21767) findRegionsToForceFlush could be improved
Sergey Shelukhin created HBASE-21767: Summary: findRegionsToForceFlush could be improved Key: HBASE-21767 URL: https://issues.apache.org/jira/browse/HBASE-21767 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Attachments: image-2019-01-23-16-07-42-166.png We see the "Too many WALs" message logged a lot, and it seems like the forced flush usually fails to bring the WAL count back down for a long time, sometimes hours. This results in a large steady-state WAL volume on some region servers, and slower recovery. Based on the functionality in HBASE-21626, it should be possible to add an option to be aggressive in this method, and to determine the minimum set of regions whose flush will actually bring the WAL count close to the limit in one operation. An example of too-many-WALs log statements reporting the WAL count over ~2.5 hours, with a limit of 79: you can see when the count is coming down; it often doesn't come anywhere close to the limit, so another flush is requested, etc. for a long time. !image-2019-01-23-16-07-42-166.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
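The "minimum set" could be computed directly from which regions still pin which WALs. A hedged sketch under the simplifying assumption that we know, per WAL (oldest first), the regions with unflushed edits in it; this is illustrative, not the actual findRegionsToForceFlush:

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class MinFlushSet {
    // To get from N WALs down to the limit in one operation, the oldest
    // N - limit WALs must all become archivable, so the regions to flush are
    // exactly the union of regions that still hold unflushed edits in them.
    // walsOldestFirst.get(i) = regions with unflushed edits in WAL i.
    public static Set<String> regionsToFlush(List<Set<String>> walsOldestFirst, int limit) {
        Set<String> toFlush = new TreeSet<>();
        int mustArchive = walsOldestFirst.size() - limit;
        for (int i = 0; i < mustArchive; i++) {
            toFlush.addAll(walsOldestFirst.get(i));
        }
        return toFlush;
    }
}
```

Flushing a region also unpins any newer WALs it had edits in, so this union is both sufficient and minimal: any region with edits in one of the oldest WALs must flush regardless.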
[jira] [Resolved] (HBASE-21576) master should proactively reassign meta when killing a RS with it
[ https://issues.apache.org/jira/browse/HBASE-21576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-21576. -- Resolution: Not A Problem > master should proactively reassign meta when killing a RS with it > - > > Key: HBASE-21576 > URL: https://issues.apache.org/jira/browse/HBASE-21576 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > > Master has killed an RS that was hosting meta due to some HDFS issue (most > likely; I've lost the RS logs due to HBASE-21575). > RS took a very long time to die (again, might be a separate bug, I'll file if > I see repro), and a long time to restart; meanwhile master never tried to > reassign meta, and eventually killed itself not being able to update it. > It seems like a RS on a bad machine would be especially prone to slow > abort/startup, as well as to issues causing master to kill it, so it would > make sense for master to immediately relocate meta once meta-hosting RS is > dead after a kill; or even when killing the RS. In the former case (if the RS > needs to die for meta to be reassigned safely), perhaps the RS hosting meta > in particular should try to die fast in such circumstances, and not do any > cleanup. > {noformat} > 2018-12-08 04:52:55,144 WARN > [RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] > master.MasterRpcServices: ,17020,1544264858183 reported a fatal > error: > * ABORTING region server ,17020,1544264858183: Replay of WAL > required. Forcing server shutdown * > [aborting for ~7 minutes] > 2018-12-08 04:53:44,190 INFO [PEWorker-7] client.RpcRetryingCallerImpl: Call > exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, > msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server > ,17020,1544264858183 aborting, details=row '...' on table > 'hbase:meta' at region=hbase:meta,,1.1588230740, > hostname=,17020,1544264858183, seqNum=-1 > ... 
[starting for ~5] > 2018-12-08 04:59:58,574 INFO > [RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] > client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, > started=392702 ms ago, cancelled=false, msg=Call to failed on > connection exception: > org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: > connection timed out: , details=row '...' on table 'hbase:meta' at > region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, > seqNum=-1 > ... [re-initializing for at least ~7] > 2018-12-08 05:04:17,271 INFO [hconnection-0x4d58bcd4-shared-pool3-t1877] > client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, > started=41137 ms ago, cancelled=false, > msg=org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server > ,17020,1544274145387 is not running yet > ... > 2018-12-08 05:11:18,470 ERROR > [RpcServer.default.FPBQ.Fifo.handler=38,queue=3,port=17000] master.HMaster: > * ABORTING master ...,17000,1544230401860: FAILED persisting region=... > state=OPEN *^M > {noformat} > There are no signs of meta assignment activity at all in master logs -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21626) log the regions blocking WAL from being archived
[ https://issues.apache.org/jira/browse/HBASE-21626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-21626. -- Resolution: Fixed Committed to master. Thanks for the review! > log the regions blocking WAL from being archived > > > Key: HBASE-21626 > URL: https://issues.apache.org/jira/browse/HBASE-21626 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-21626.01.patch, HBASE-21626.02.patch, > HBASE-21626.ADDENDUM.patch, HBASE-21626.patch > > > The WALs not being archived for a long time can result in a long recovery > later. It's useful to know what regions are blocking the WALs from being > archived, to be able to debug flush logic and tune configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21759) master can balance regions onto a known dead server
Sergey Shelukhin created HBASE-21759: Summary: master can balance regions onto a known dead server Key: HBASE-21759 URL: https://issues.apache.org/jira/browse/HBASE-21759 Project: HBase Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sergey Shelukhin {noformat} 2019-01-18 09:42:45,664 INFO [PEWorker-1] procedure.ServerCrashProcedure: Start pid=104009, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure server=server,17020,1547824687684, splitWal=true, meta=false 2019-01-18 09:42:46,063 INFO [PEWorker-6] master.SplitLogManager: dead splitlog workers [server,17020,1547824687684] ... 2019-01-18 09:42:57,003 INFO [PEWorker-6] master.SplitLogManager: Finished splitting (more than or equal to) 82490392538 bytes in 77 log files in [.../WALs/server,17020,1547824687684-splitting] in 10937ms [2 minutes later] 2019-01-18 09:44:01,967 INFO [master/BN01APADCE87FE2:17000.Chore.1] master.HMaster: balance hri=77387653ba3dc0988342bfbdc0c6901c, source=...,17020,1547827296275, destination=server,17020,1547824687684 2019-01-18 09:44:26,119 INFO [master/BN01APADCE87FE2:17000.Chore.1] master.HMaster: balance hri=c2f68f73fa55ab4afe8348e4c1b14cad, source=...,17020,1547824649171, destination=server,17020,1547824687684 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21757) retrying to close a region incorrectly resets its RIT age metric
Sergey Shelukhin created HBASE-21757: Summary: retrying to close a region incorrectly resets its RIT age metric Key: HBASE-21757 URL: https://issues.apache.org/jira/browse/HBASE-21757 Project: HBase Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sergey Shelukhin We have a region stuck in RIT forever due to some other bug that I will file later. Every 10 minutes it does the typical split-brain retry; I noticed that this retry resets the region's RIT age, so the "oldest RIT" metric never becomes larger than ~10mins even though the region has been stuck for days. {noformat} 2019-01-22 10:40:52,993 INFO [PEWorker-10] assignment.RegionStateStore: pid=1865 updating hbase:meta row=region, regionState=CLOSING, regionLocation=server,17020,1547824687684 2019-01-22 10:40:53,025 WARN [PEWorker-10] assignment.RegionRemoteProcedureBase: Can not add remote operation pid=29297, ppid=1865, state=RUNNABLE, hasLock=true; org.apache.hadoop.hbase.master.assignment.CloseRegionProcedure for region {ENCODED => region, ...} to server server,17020,1547824687684, this usually because the server is alread dead, give up and mark the procedure as complete, the parent procedure will take care of this. 2019-01-22 10:40:53,040 INFO [PEWorker-10] procedure2.ProcedureExecutor: Finished subprocedure(s) of pid=1865, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_CLOSED, hasLock=true; TransitRegionStateProcedure table=table, region=region, REOPEN/MOVE; resume parent processing. 
2019-01-22 10:40:53,040 WARN [PEWorker-7] assignment.TransitRegionStateProcedure: Failed transition, suspend 600secs pid=1865, state=RUNNABLE:REGION_STATE_TRANSITION_CLOSE, hasLock=true; TransitRegionStateProcedure table=table, region=region, REOPEN/MOVE; rit=CLOSING, location=server,17020,1547824687684; waiting on rectified condition fixed by other Procedure or operator intervention 2019-01-22 10:40:53,040 INFO [PEWorker-7] procedure2.TimeoutExecutorThread: ADDED pid=1865, state=WAITING_TIMEOUT:REGION_STATE_TRANSITION_CLOSE, hasLock=true; TransitRegionStateProcedure table=table, region=region, REOPEN/MOVE; timeout=60, timestamp=1548183053040 {noformat} !image-2019-01-22-11-00-39-030.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21744) timeout for server list refresh calls
Sergey Shelukhin created HBASE-21744: Summary: timeout for server list refresh calls Key: HBASE-21744 URL: https://issues.apache.org/jira/browse/HBASE-21744 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Not sure why yet, but when the cluster is in an overall bad state, we are seeing a case where, after an RS dies and deletes its znode, the notification appears to be lost, so the master doesn't detect the failure. ZK itself appears to be healthy and doesn't report anything special. After some other change is made to the server list, the master rescans the list and picks up the stale notification. It might make sense to add a config that would trigger a refresh if one hasn't happened for a while (e.g. 1 minute). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
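Only the staleness decision is shown in this sketch of the proposed safety net; the class, method names, and the quiet-interval config are made up for illustration:

```java
public class RefreshPolicy {
    // Hedged sketch: if no ZK notification has been seen for a configured
    // interval, force a rescan of the server list so a lost watcher event
    // cannot hide a dead RS indefinitely. Times use the monotonic clock.
    private final long maxQuietNs;
    private long lastEventNs;

    public RefreshPolicy(long maxQuietMs, long nowNs) {
        this.maxQuietNs = maxQuietMs * 1_000_000L;
        this.lastEventNs = nowNs;
    }

    public void onNotification(long nowNs) {
        lastEventNs = nowNs; // a real event arrived; reset the quiet period
    }

    // True when no watcher event has been seen for maxQuietMs.
    public boolean shouldRescan(long nowNs) {
        return nowNs - lastEventNs > maxQuietNs;
    }
}
```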
[jira] [Created] (HBASE-21743) stateless assignment
Sergey Shelukhin created HBASE-21743: Summary: stateless assignment Key: HBASE-21743 URL: https://issues.apache.org/jira/browse/HBASE-21743 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Running HBase for only a few weeks, we found dozen(s?) of bugs with assignment that all seem to have the same nature: split brain between 2 procedures; or between a procedure and master startup (meta replica bugs); or a procedure and master shutdown (HBASE-21742); or a procedure and something else (when SCP had an incorrect region list persisted; don't recall the bug#). To me, it starts to look like a pattern: as in AMv1, where concurrent interactions were unclear and hard to reason about, despite the cleaner individual pieces in AMv2 the problem of unclear concurrent interactions has been preserved, and in fact increased because of the operation state persistence and isolation. Procedures are great for multi-step operations that need rollback and stuff like that, e.g. creating a table or snapshot, or even region splitting. However, I'm not so sure about assignment. We have the persisted information: region state in meta (incl. transition states like opening or closing), and the server list as the WAL directory list. Procedure state is not any more reliable than those (we can argue that a meta update can fail, but so can a procv2 WAL flush, so we have to handle cases of out-of-date information regardless). So, we don't need any extra state to decide on assignment, whether for recovery or balancing. In fact, as mentioned in some bugs, deleting the procv2 WAL is often the best way to recover the cluster, because the master can already figure out what to do without additional state. I think there should be an option for stateless assignment that does that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21742) master can create bad procedures during abort, making entire cluster unusable
Sergey Shelukhin created HBASE-21742: Summary: master can create bad procedures during abort, making entire cluster unusable Key: HBASE-21742 URL: https://issues.apache.org/jira/browse/HBASE-21742 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Some small HDFS hiccup causes master and meta RS to fail together. Master goes first: {noformat} 2019-01-18 08:09:46,790 INFO [KeepAlivePEWorker-311] zookeeper.MetaTableLocator: Setting hbase:meta (replicaId=0) location in ZooKeeper as meta-rs,17020,1547824792484 ... 2019-01-18 10:01:16,904 ERROR [PEWorker-11] master.HMaster: * ABORTING master master,17000,1547604554447: FAILED [blah] * ... 2019-01-18 10:01:17,087 INFO [master/master:17000] assignment.AssignmentManager: Stopping assignment manager {noformat} Bunch of stuff keeps happening, including procedure retries, which is also suspect, but not the point here: {noformat} 2019-01-18 10:01:21,598 INFO [PEWorker-3] procedure2.TimeoutExecutorThread: ADDED pid=104031, state=WAITING_TIMEOUT:REGION_STATE_TRANSITION_CLOSE, ... {noformat} Then the meta RS decides it's time to go: {noformat} 2019-01-18 10:01:25,319 INFO [RegionServerTracker-0] master.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [meta-rs,17020,1547824792484] ... 2019-01-18 10:01:25,463 INFO [RegionServerTracker-0] assignment.AssignmentManager: Added meta-rs,17020,1547824792484 to dead servers which carryingMeta=false, submitted ServerCrashProcedure pid=104313 {noformat} This SCP gets persisted, so when the next master starts, it waits forever for meta to be onlined, while there's no SCP with meta=true to online it. The only way around this is to delete the procv2 WAL - master has all the information here, as it often does in the bugs I've found recently, but some split-brain procedures cause it to get stuck one way or another. I will file a separate bug about that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (HBASE-21626) log the regions blocking WAL from being archived
[ https://issues.apache.org/jira/browse/HBASE-21626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reopened HBASE-21626: -- There's an issue in this patch that causes the message to never be logged. > log the regions blocking WAL from being archived > > > Key: HBASE-21626 > URL: https://issues.apache.org/jira/browse/HBASE-21626 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-21626.01.patch, HBASE-21626.02.patch, > HBASE-21626.patch > > > The WALs not being archived for a long time can result in a long recovery > later. It's useful to know what regions are blocking the WALs from being > archived, to be able to debug flush logic and tune configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21627) race condition between a recovered RIT for meta replica, and master startup
Sergey Shelukhin created HBASE-21627: Summary: race condition between a recovered RIT for meta replica, and master startup Key: HBASE-21627 URL: https://issues.apache.org/jira/browse/HBASE-21627 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Master recovers RIT for a meta replica {noformat} 2018-12-14 23:16:12,008 INFO [master/...:17000:becomeActiveMaster] assignment.AssignmentManager: Attach pid=83796, ppid=83788, state=RUNNABLE:REGION_STATE_TRANSITION_OPEN, hasLock=false; TransitRegionStateProcedure table=hbase:meta, region=(region), ASSIGN to rit=OFFLINE, location=null, table=hbase:meta, region=(region) to restore RIT 2018-12-14 23:16:16,475 WARN [PEWorker-8] assignment.TransitRegionStateProcedure: No location specified for {ENCODED => (region), NAME => 'hbase:meta,,1_0001', STARTKEY => '', ENDKEY => '', REPLICA_ID => 1}, jump back to state REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE to get one ... 2018-12-14 23:16:30,010 INFO [PEWorker-16] procedure2.ProcedureExecutor: Finished pid=83796, ppid=83788, state=SUCCESS, hasLock=false; TransitRegionStateProcedure table=hbase:meta, region=(region), ASSIGN in 8mins, 23.39sec {noformat} Then tries to assign replicas.. {noformat} 2018-12-14 23:16:36,091 ERROR [master/...:17000:becomeActiveMaster] master.HMaster: Failed to become active master org.apache.hadoop.hbase.client.DoNotRetryRegionException: Unexpected state for rit=OPEN, location=server,17020,1544858156805, table=hbase:meta, region=(region) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:548) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.assign(AssignmentManager.java:563) at org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMetaReplicas(MasterMetaBootstrap.java:84) at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1146) {noformat} Unfortunately I misplaced the log from this after copy-pasting a grep result so that's all I have for this. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21626) log the regions blocking WAL from being archived
Sergey Shelukhin created HBASE-21626: Summary: log the regions blocking WAL from being archived Key: HBASE-21626 URL: https://issues.apache.org/jira/browse/HBASE-21626 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin The WALs not being archived for a long time can result in a long recovery later. It's useful to know what regions are blocking the WALs from being archived, to be able to debug flush logic and tune configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21625) a runnable procedure v2 does not run
Sergey Shelukhin created HBASE-21625: Summary: a runnable procedure v2 does not run Key: HBASE-21625 URL: https://issues.apache.org/jira/browse/HBASE-21625 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Haven't looked at the code much yet, but it seems rather fundamental. The procedure comes from meta replica assignment (HBASE-21624). There are no other runnable procedures on master - a lot of succeeded procedures since then, and one unrelated RIT procedure waiting with timeout and being updated periodically. The procedure itself is {noformat} 157156 157155 RUNNABLE hadoop org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure Wed Dec 19 17:20:27 PST 2018 Wed Dec 19 17:20:28 PST 2018 [ { region => { regionId => '1', tableName => { ... }, startKey => '', endKey => '', offline => 'false', split => 'false', replicaId => '1' }, targetServer => { hostName => 'server1', port => '17020', startCode => '1545266805778' } }, {} ] {noformat} This is in PST, so it's been like that for ~19 hours. The only line involving this PID in the log is {noformat} 2018-12-19 17:20:27,974 INFO [PEWorker-4] procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=157156, ppid=157155, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure}] {noformat} There are no other useful logs for either this PID, the parent PID, or the region in question since then. 
All the PEWorker-s are waiting for work: {noformat} Thread 158 (PEWorker-16): State: TIMED_WAITING Blocked count: 1340 Waited count: 5064 Stack: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) org.apache.hadoop.hbase.procedure2.AbstractProcedureScheduler.poll(AbstractProcedureScheduler.java:171) org.apache.hadoop.hbase.procedure2.AbstractProcedureScheduler.poll(AbstractProcedureScheduler.java:153) org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1949) {noformat} The main assignment procedure for this region is blocked on it: {noformat} 157155 WAITING hadoop TransitRegionStateProcedure table=hbase:meta, region=534574363, ASSIGN Wed Dec 19 17:20:27 PST 2018 Wed Dec 19 17:20:27 PST 2018 [ { state => [ '1', '2', '3' ] }, { regionId => '1', tableName => { ... }, startKey => '', endKey => '', offline => 'false', split => 'false', replicaId => '1' }, { initialState => 'REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE', lastState => 'REGION_STATE_TRANSITION_CONFIRM_OPENED', assignCandidate => { hostName => 'server1', port => '17020', startCode => '1545266805778' }, forceNewPlan => 'false' } ] 2018-12-19 17:20:27,673 INFO [PEWorker-9] procedure.MasterProcedureScheduler: Took xlock for pid=157155, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=false; TransitRegionStateProcedure table=hbase:meta, region=..., ASSIGN 2018-12-19 17:20:27,809 INFO [PEWorker-9] assignment.TransitRegionStateProcedure: Starting pid=157155, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=hbase:meta, region=..., ASSIGN; rit=OFFLINE, location=server1,17020,1545266805778; forceNewPlan=false, retain=false {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21624) master startup should not sleep on assigning meta replicas
Sergey Shelukhin created HBASE-21624: Summary: master startup should not sleep on assigning meta replicas Key: HBASE-21624 URL: https://issues.apache.org/jira/browse/HBASE-21624 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Due to some other bug, a meta replica is stuck in transition forever. Master is running fine without it; however, the initializer thread hasn't finished initialization for ~19 hours now and is stuck in the state below. It doesn't seem necessary to wait for the replicas - assignment could be fire-and-forget, and normal region handling would take care of them after that. {noformat} Thread 118 (master/...:17000:becomeActiveMaster): State: TIMED_WAITING Blocked count: 281 Waited count: 67059 Stack: java.lang.Thread.sleep(Native Method) org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:209) org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:192) org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToComplete(ProcedureSyncWait.java:151) org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToCompleteIOE(ProcedureSyncWait.java:140) org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.submitAndWaitProcedure(ProcedureSyncWait.java:133) org.apache.hadoop.hbase.master.assignment.AssignmentManager.assign(AssignmentManager.java:569) org.apache.hadoop.hbase.master.MasterMetaBootstrap.assignMetaReplicas(MasterMetaBootstrap.java:84) org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1146) org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2342) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
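The fire-and-forget alternative can be sketched as follows: submit one assignment procedure per meta replica and return the pids immediately, instead of blocking master initialization in ProcedureSyncWait until every replica is online. The submitter function here is an invented stand-in for the real procedure executor, not the actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical sketch: submit replica-assignment procedures without waiting.
class MetaReplicaFireAndForget {
    static List<Long> assignReplicas(int replicaCount, ToLongFunction<Integer> submit) {
        List<Long> pids = new ArrayList<>();
        // replicaId 0 is the primary, handled by the normal meta bootstrap
        for (int replicaId = 1; replicaId < replicaCount; replicaId++) {
            // No ProcedureSyncWait here: normal region handling (retries, SCP)
            // takes over once the procedure actually runs.
            pids.add(submit.applyAsLong(replicaId));
        }
        return pids;
    }
}
```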
[jira] [Created] (HBASE-21623) ServerCrashProcedure can stomp on a RIT for the wrong server
Sergey Shelukhin created HBASE-21623: Summary: ServerCrashProcedure can stomp on a RIT for the wrong server Key: HBASE-21623 URL: https://issues.apache.org/jira/browse/HBASE-21623 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin A server died while some region was being opened on it; eventually the open failed, and the RIT procedure started retrying on a different server. However, by then SCP for the dying server has already obtained the region from the list of regions on the server, and overwrote whatever the RIT was doing with a new server. {noformat} 2018-12-18 23:06:03,160 INFO [PEWorker-14] procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=151404, ppid=151104, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure}] ... 2018-12-18 23:06:38,208 INFO [PEWorker-10] procedure.ServerCrashProcedure: Start pid=151632, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure server=oldServer,17020,1545202098577, splitWal=true, meta=false ... 2018-12-18 23:06:41,953 WARN [RSProcedureDispatcher-pool4-t115] assignment.RegionRemoteProcedureBase: The remote operation pid=151404, ppid=151104, state=RUNNABLE, hasLock=false; org.apache.hadoop.hbase.master.assignment.OpenRegionProcedure for region {ENCODED => region1, ... } to server oldServer,17020,1545202098577 failed org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server oldServer,17020,1545202098577 aborting 2018-12-18 23:06:42,485 INFO [PEWorker-5] procedure2.ProcedureExecutor: Finished subprocedure(s) of pid=151104, ppid=150875, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=t1, region=region1, ASSIGN; resume parent processing. 
2018-12-18 23:06:42,485 INFO [PEWorker-13] assignment.TransitRegionStateProcedure: Retry=1 of max=2147483647; pid=151104, ppid=150875, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=t1, region=region1, ASSIGN; rit=OPENING, location=oldServer,17020,1545202098577 2018-12-18 23:06:42,500 INFO [PEWorker-13] assignment.TransitRegionStateProcedure: Starting pid=151104, ppid=150875, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE, hasLock=true; TransitRegionStateProcedure table=t1, region=region1, ASSIGN; rit=OPENING, location=null; forceNewPlan=true, retain=false 2018-12-18 23:06:42,657 INFO [PEWorker-2] assignment.RegionStateStore: pid=151104 updating hbase:meta row=region1, regionState=OPENING, regionLocation=newServer,17020,1545202111238 ... 2018-12-18 23:06:43,094 INFO [PEWorker-4] procedure.ServerCrashProcedure: pid=151632, state=RUNNABLE:SERVER_CRASH_ASSIGN, hasLock=true; ServerCrashProcedure server=oldServer,17020,1545202098577, splitWal=true, meta=false found RIT pid=151104, ppid=150875, state=WAITING:REGION_STATE_TRANSITION_CONFIRM_OPENED, hasLock=true; TransitRegionStateProcedure table=t1, region=region1, ASSIGN; rit=OPENING, location=newServer,17020,1545202111238, table=t1, region=region1 2018-12-18 23:06:43,094 INFO [PEWorker-4] assignment.RegionStateStore: pid=151104 updating hbase:meta row=region1, regionState=ABNORMALLY_CLOSED {noformat} Later the RIT overwrote the state again, it seems, and then the region got stuck in the OPENING state forever, but I'm not sure yet if that's just due to this bug or if there was another bug after that. For now, this race can be addressed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
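One possible direction for the fix is a guard of this shape: before a ServerCrashProcedure rewrites a region's state to ABNORMALLY_CLOSED, verify that the region's in-transition location still points at the crashed server; if the RIT has already retried on a new server (as pid=151104 did above), the SCP should leave it alone. This is a hypothetical sketch with an invented method, not the real SCP code path.

```java
// Hypothetical guard: an SCP may only stomp a RIT whose current location
// is still the server that crashed.
class ScpRitGuard {
    static boolean scpMayInterrupt(String crashedServer, String ritLocation) {
        return ritLocation != null && ritLocation.equals(crashedServer);
    }
}
```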
[jira] [Created] (HBASE-21614) RIT recovery with ServerCrashProcedure is broken in multiple ways
Sergey Shelukhin created HBASE-21614: Summary: RIT recovery with ServerCrashProcedure is broken in multiple ways Key: HBASE-21614 URL: https://issues.apache.org/jira/browse/HBASE-21614 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Master is restarting after a previous master crashed while recovering some regions from a dead server. Master recovers RIT for the region, however the RIT has no location (logged, at least) in CONFIRM_CLOSE state. That is a potential problem #1 - confirm where? But that should be covered by meta, so not a big deal, right. {noformat} 2018-12-17 14:51:14,606 INFO [master/:17000:becomeActiveMaster] assignment.AssignmentManager: Attach pid=38015, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_CLOSED, hasLock=false; TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE to rit=OFFLINE, location=null, table=t1, region=region1 to restore RIT {noformat} However, in this case ServerCrashProcedure for the server kicks off BEFORE meta is loaded. That seems to be a problem #2 - it immediately gets regions to later recover, so in this case it gets nothing. I've grepped our logs for successful cases of SCP interacting with TRANSITION_CONFIRM_CLOSED, and in all cases the meta was loaded before SCP. Seems like a race condition. {noformat} 2018-12-17 14:51:14,625 INFO [master/:17000:becomeActiveMaster] master.RegionServerTracker: Starting RegionServerTracker; 0 have existing ServerCrashProcedures, 103 possibly 'live' servers, and 1 'splitting'. 
2018-12-17 14:51:20,770 INFO [master/:17000:becomeActiveMaster] master.ServerManager: Processing expiration of server1,17020,1544636616174 on ,17000,1545087053243 2018-12-17 14:51:20,921 INFO [master/:17000:becomeActiveMaster] assignment.AssignmentManager: Added server1,17020,1544636616174 to dead servers which carryingMeta=false, submitted ServerCrashProcedure pid=111298 2018-12-17 14:51:30,728 INFO [PEWorker-13] procedure.ServerCrashProcedure: Start pid=111298, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure server=server1,17020,1544636616174, splitWal=true, meta=false {noformat} Meta is only loaded 11-12 seconds later. If one looks at meta-loading code however, there is one more problem - the region is in CLOSING state, so the {{addRegionToServer}} is not going to be called - it's only called for OPENED regions. Expanding on the above, I've only seen SCP unblock stuck TRANSITION_CONFIRM_CLOSED when region started out in meta as OPEN. {noformat} 2018-12-17 14:51:42,403 INFO [master/:17000:becomeActiveMaster] assignment.RegionStateStore: Load hbase:meta entry region=region1, regionState=CLOSING, lastHost=server1,17020,1544636616174, regionLocation=server1,17020,1544636616174, openSeqNum=629131 {noformat} SCP predictably finishes without doing anything; no other logs for this pid {noformat} 2018-12-17 14:52:19,046 INFO [PEWorker-2] procedure2.ProcedureExecutor: Finished pid=111298, state=SUCCESS, hasLock=false; ServerCrashProcedure server=server1,17020,1544636616174, splitWal=true, meta=false in 58.0010sec {noformat} After that, region is still stuck trying to be closed in TransitRegionStateProcedure; it's in the same state for hours including across master restarts. 
{noformat} 2018-12-17 15:09:35,216 WARN [PEWorker-14] assignment.TransitRegionStateProcedure: Failed transition, suspend 604secs pid=38015, state=RUNNABLE:REGION_STATE_TRANSITION_CLOSE, hasLock=true; TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE; rit=CLOSING, location=server1,17020,1544636616174; waiting on rectified condition fixed by other Procedure or operator intervention {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21611) REGION_STATE_TRANSITION_CONFIRM_CLOSED should interact better with crash procedure.
Sergey Shelukhin created HBASE-21611: Summary: REGION_STATE_TRANSITION_CONFIRM_CLOSED should interact better with crash procedure. Key: HBASE-21611 URL: https://issues.apache.org/jira/browse/HBASE-21611 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin 1) Not a bug per se, since HDFS is not supposed to lose files, just a bit fragile. When a dead server's WAL directory is deleted (due to a manual intervention, or some issue with HDFS) while some regions are in CLOSING state on that server, they get stuck forever in the REGION_STATE_TRANSITION_CONFIRM_CLOSED - REGION_STATE_TRANSITION_CLOSE - "give up and mark the procedure as complete, the parent procedure will take care of this" loop. There's no crash procedure for the server, so nobody ever takes care of that. 2) Under normal circumstances, when a large WAL is being split, this same loop keeps spamming the logs and wasting resources for no reason until the crash procedure completes. There's no reason for it to retry - it should just wait for the crash procedure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
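The two cases above suggest a small decision table: in CONFIRM_CLOSED, the transition procedure could pick its next step from the server's known state instead of looping CLOSE -> CONFIRM_CLOSED unconditionally. This is a hypothetical sketch of the policy, with invented names; the real procedure states and scheduling are more involved.

```java
// Hypothetical policy for the CONFIRM_CLOSED loop described above.
class ConfirmClosedPolicy {
    enum Action { WAIT_FOR_SCP, SCHEDULE_SCP, RETRY_CLOSE }

    static Action nextAction(boolean serverKnownDead, boolean scpExists) {
        if (scpExists) return Action.WAIT_FOR_SCP;       // case 2: don't spin while the WAL splits
        if (serverKnownDead) return Action.SCHEDULE_SCP; // case 1: nobody else will ever recover it
        return Action.RETRY_CLOSE;                       // server may still answer the close
    }
}
```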
[jira] [Created] (HBASE-21601) corrupted WAL is not handled in all places
Sergey Shelukhin created HBASE-21601: Summary: corrupted WAL is not handled in all places Key: HBASE-21601 URL: https://issues.apache.org/jira/browse/HBASE-21601 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin {noformat} 2018-12-13 17:01:12,208 ERROR [RS_LOG_REPLAY_OPS-regionserver/...] executor.EventHandler: Caught throwable while processing event RS_LOG_REPLAY java.lang.RuntimeException: java.lang.NegativeArraySizeException at org.apache.hadoop.hbase.wal.WALSplitter$PipelineController.checkForErrors(WALSplitter.java:846) at org.apache.hadoop.hbase.wal.WALSplitter$OutputSink.finishWriting(WALSplitter.java:1203) at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.finishWritingAndClose(WALSplitter.java:1267) at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:349) at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:196) at org.apache.hadoop.hbase.regionserver.SplitLogWorker.splitLog(SplitLogWorker.java:178) at org.apache.hadoop.hbase.regionserver.SplitLogWorker.lambda$new$0(SplitLogWorker.java:90) at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:70) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NegativeArraySizeException at org.apache.hadoop.hbase.CellUtil.cloneFamily(CellUtil.java:113) at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.filterCellByStore(WALSplitter.java:1542) at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.appendBuffer(WALSplitter.java:1586) at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink.append(WALSplitter.java:1560) at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.writeBuffer(WALSplitter.java:1085) at 
org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.doRun(WALSplitter.java:1077) at org.apache.hadoop.hbase.wal.WALSplitter$WriterThread.run(WALSplitter.java:1047) {noformat} Unfortunately I cannot share the file. The issue appears to be straightforward - for whatever reason the family length is negative. Not sure how such a cell got created; I suspect the file was corrupted. {code} byte[] output = new byte[cell.getFamilyLength()]; {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
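A defensive check of this shape would turn the failure above into a recognizable corruption error. This is a hypothetical sketch, not the actual CellUtil code: it validates the cell's field lengths before allocating the clone buffer, so a corrupt WAL cell surfaces as an IOException instead of a NegativeArraySizeException deep inside the writer thread.

```java
import java.io.IOException;

// Hypothetical sketch: validate lengths before cloning a cell field range.
class SafeCellClone {
    static byte[] cloneRange(byte[] backing, int offset, int length) throws IOException {
        if (length < 0 || offset < 0 || (long) offset + length > backing.length) {
            throw new IOException("Corrupt cell: offset=" + offset + ", length=" + length
                + ", backing array size=" + backing.length);
        }
        byte[] out = new byte[length];
        System.arraycopy(backing, offset, out, 0, length);
        return out;
    }
}
```

The split logic could then route such an IOException through the existing corrupted-WAL handling (skip-errors or sidelining) rather than aborting the whole RS_LOG_REPLAY task.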
[jira] [Resolved] (HBASE-21575) memstore above high watermark message is logged too much
[ https://issues.apache.org/jira/browse/HBASE-21575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-21575. -- Resolution: Fixed Fixed the commit. Thanks for noticing... > memstore above high watermark message is logged too much > > > Key: HBASE-21575 > URL: https://issues.apache.org/jira/browse/HBASE-21575 > Project: HBase > Issue Type: Bug > Components: logging, regionserver >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Minor > Fix For: 3.0.0 > > Attachments: HBASE-21575.01.patch, HBASE-21575.patch > > > 100s of Mb of logs like this, in a tight loop: > {noformat} > 2018-12-08 10:27:00,462 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3646ms > 2018-12-08 10:27:00,463 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3647ms > 2018-12-08 10:27:00,463 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3647ms > 2018-12-08 10:27:00,464 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3648ms > 2018-12-08 10:27:00,464 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3648ms > 2018-12-08 10:27:00,465 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3649ms > 2018-12-08 10:27:00,465 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3649ms > 2018-12-08 10:27:00,466 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > 
regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3650ms > 2018-12-08 10:27:00,466 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3650ms > 2018-12-08 10:27:00,467 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3651ms > 2018-12-08 10:27:00,469 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3653ms > 2018-12-08 10:27:00,470 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3654ms > 2018-12-08 10:27:00,470 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3654ms > 2018-12-08 10:27:00,471 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3655ms > 2018-12-08 10:27:00,471 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3655ms > 2018-12-08 10:27:00,472 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3656ms > 2018-12-08 10:27:00,472 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3656ms > 2018-12-08 10:27:00,473 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3657ms > 2018-12-08 10:27:00,474 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high 
water mark and block > 3658ms > 2018-12-08 10:27:00,475 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3659ms > 2018-12-08 10:27:00,476 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3660ms > 2018-12-08 10:27:00,476 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above high water mark and block > 3660ms > 2018-12-08 10:27:00,477 WARN > [RpcServer.default.FPBQ.Fifo.handler=2,queue=2,port=17020] > regionserver.MemStoreFlusher: Memstore is above h
[jira] [Created] (HBASE-21577) do not close regions when RS is dying due to a broken WAL
Sergey Shelukhin created HBASE-21577: Summary: do not close regions when RS is dying due to a broken WAL Key: HBASE-21577 URL: https://issues.apache.org/jira/browse/HBASE-21577 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin See HBASE-21576. DroppedSnapshot can be an FS failure; also, when the WAL is broken, some regions whose flushes are already in flight keep retrying, resulting in minutes-long shutdown times. Since the WAL will be replayed anyway, flushing regions doesn't provide much benefit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21576) master should proactively reassign meta when killing a RS with it
Sergey Shelukhin created HBASE-21576: Summary: master should proactively reassign meta when killing a RS with it Key: HBASE-21576 URL: https://issues.apache.org/jira/browse/HBASE-21576 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Master has killed an RS that was hosting meta due to some internal error (still need to see if it's a separate bug or just a machine/HDFS issue, I've lost the RS logs due to HBASE-21575). RS took a very long time to die (again, might be a separate bug, I'll file if I see repro), and a long time to restart; meanwhile master never tried to reassign meta, and eventually killed itself not being able to update it. It seems like a RS on a bad machine would be especially prone to slow abort/startup, as well as to issues causing master to kill it, so it would make sense for master to immediately relocate meta once meta-hosting RS is dead; or even when killing the RS. In the former case (if the RS needs to die for meta to be reassigned safely), perhaps the RS hosting meta in particular should try to die fast in such circumstances, and not do any cleanup. {noformat} 2018-12-08 04:52:55,144 WARN [RpcServer.default.FPBQ.Fifo.handler=39,queue=4,port=17000] master.MasterRpcServices: ,17020,1544264858183 reported a fatal error: * ABORTING region server ,17020,1544264858183: Replay of WAL required. Forcing server shutdown * [aborting for ~7 minutes] 2018-12-08 04:53:44,190 INFO [PEWorker-7] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41190 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.regionserver.RegionServerAbortedException: Server ,17020,1544264858183 aborting, details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... 
[starting for ~5] 2018-12-08 04:59:58,574 INFO [RpcServer.default.FPBQ.Fifo.handler=45,queue=0,port=17000] client.RpcRetryingCallerImpl: Call exception, tries=10, retries=61, started=392702 ms ago, cancelled=false, msg=Call to failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out: , details=row '...' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=,17020,1544264858183, seqNum=-1 ... [re-initializing for at least ~7] 2018-12-08 05:04:17,271 INFO [hconnection-0x4d58bcd4-shared-pool3-t1877] client.RpcRetryingCallerImpl: Call exception, tries=6, retries=61, started=41137 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server ,17020,1544274145387 is not running yet ... 2018-12-08 05:11:18,470 ERROR [RpcServer.default.FPBQ.Fifo.handler=38,queue=3,port=17000] master.HMaster: * ABORTING master ...,17000,1544230401860: FAILED persisting region=... state=OPEN *^M {noformat} There are no signs of meta assignment activity at all in master logs -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21575) memstore above high watermark message is logged too much
Sergey Shelukhin created HBASE-21575: Summary: memstore above high watermark message is logged too much Key: HBASE-21575 URL: https://issues.apache.org/jira/browse/HBASE-21575 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin 100s of Mb of logs like this: {noformat} 2018-12-08 10:29:07,603 WARN [RpcServer.default.FPBQ.Fifo.handler=12,queue=2,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 103076ms 2018-12-08 10:29:07,603 WARN [RpcServer.default.FPBQ.Fifo.handler=44,queue=4,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 150781ms 2018-12-08 10:29:07,603 WARN [RpcServer.default.FPBQ.Fifo.handler=14,queue=4,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 150792ms 2018-12-08 10:29:07,603 WARN [RpcServer.default.FPBQ.Fifo.handler=23,queue=3,port=17020] regionserver.MemStoreFlusher: Memstore is above high water mark and block 150780ms {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
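One common shape for fixing this kind of spam is a log throttle: emit the "above high water mark" warning at most once per interval and report how many occurrences were suppressed in between. This is a hypothetical sketch (time is passed in explicitly to keep it testable), not the fix that was actually committed.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical throttle for a hot-loop warning message.
class BlockingWarnThrottle {
    private final long intervalNanos;
    private long lastEmitNanos = Long.MIN_VALUE;
    private long suppressed = 0;

    BlockingWarnThrottle(long intervalMillis) {
        this.intervalNanos = TimeUnit.MILLISECONDS.toNanos(intervalMillis);
    }

    /**
     * Returns the number of suppressed messages (>= 0) when the caller should
     * log now, or -1 when this occurrence should be dropped.
     */
    synchronized long shouldLog(long nowNanos) {
        if (lastEmitNanos == Long.MIN_VALUE || nowNanos - lastEmitNanos >= intervalNanos) {
            long skipped = suppressed;
            suppressed = 0;
            lastEmitNanos = nowNanos;
            return skipped;
        }
        suppressed++;
        return -1;
    }
}
```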
[jira] [Created] (HBASE-21564) race condition in WAL rolling
Sergey Shelukhin created HBASE-21564: Summary: race condition in WAL rolling Key: HBASE-21564 URL: https://issues.apache.org/jira/browse/HBASE-21564 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Manifests at least with AsyncFsWriter. There's a window after LogRoller replaces the writer in the WAL, but before it sets the rollLog boolean to false in the finally, where the WAL class can request another log roll (it can happen in particular when the logs are getting archived in the LogRoller thread, and there's high write volume causing the logs to roll quickly). LogRoller will blindly reset the rollLog flag in finally and "forget" about this request. AsyncWAL in turn never requests it again because its own rollRequested field is set and it expects a callback. Logs don't get rolled until a periodic roll is triggered after that. The acknowledgment of roll requests by LogRoller should be atomic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
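The atomic acknowledgment can be sketched with an AtomicBoolean: the roller consumes the roll-request flag *before* performing the roll, instead of blindly resetting it in a finally block afterwards. A request that arrives while the roll is in progress re-sets the flag and is seen on the next loop iteration, so it cannot be "forgotten". This is a minimal illustration of the race-free shape, not the actual LogRoller code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: consume-before-roll, no blind reset in finally.
class RollRequestFlag {
    private final AtomicBoolean rollRequested = new AtomicBoolean(false);

    /** Called by the WAL whenever it wants a roll; idempotent. */
    void requestRoll() {
        rollRequested.set(true);
    }

    /** One roller-loop iteration; returns true if a roll was performed. */
    boolean runOnce(Runnable doRoll) {
        // Atomically consume the pending request up front.
        if (!rollRequested.getAndSet(false)) {
            return false;
        }
        // A requestRoll() happening during this call survives to the next loop,
        // which closes the window described in the ticket.
        doRoll.run();
        return true;
    }
}
```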
[jira] [Resolved] (HBASE-21531) race between region report and region move causes master to kill RS
[ https://issues.apache.org/jira/browse/HBASE-21531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-21531. -- Resolution: Duplicate > race between region report and region move causes master to kill RS > --- > > Key: HBASE-21531 > URL: https://issues.apache.org/jira/browse/HBASE-21531 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > > In this case the delay between the ack from RS and the region report was > 1.5s, so I'm not sure what caused the race (network hiccup? unreported retry > by protobuf transport?) but in any case I don't see anything that prevents > this from happening in a normal case, with a narrowed time window. Any delay > (e.g. a GC pause on RS right after the report is built, and ack is sent for > the close) or retries expands the window. > Master starts moving the region and the source RS acks by 21:51,206 > {noformat} > Master: > 2018-11-21 21:21:49,024 INFO [master/6:17000.Chore.1] master.HMaster: > balance hri=, source=,17020,1542754626176, destination= 2>,17020,1542863268158 > ... > Server: > 2018-11-21 21:21:49,394 INFO [RS_CLOSE_REGION-regionserver/:17020-1] > handler.UnassignRegionHandler: Close > ... > 2018-11-21 21:21:51,095 INFO [RS_CLOSE_REGION-regionserver/:17020-1] > handler.UnassignRegionHandler: Closed > {noformat} > By then the region is removed from onlineRegions, so the master proceeds. > {noformat} > 2018-11-21 21:21:51,206 INFO [PEWorker-4] procedure2.ProcedureExecutor: > Finished subprocedure(s) of pid=667, > state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_CLOSED, hasLock=true; > TransitRegionStateProcedure table=, region=, REOPEN/MOVE; > resume parent processing. 
> 2018-11-21 21:21:51,386 INFO [PEWorker-13] assignment.RegionStateStore: > pid=667 updating hbase:meta row=, regionState=OPENING, > regionLocation=,17020,1542863268158 > {noformat} > There are no obvious errors/delays that I see in RS log, and it doesn't log > starting to send the report. > However, at 21:52.853 the report is processed that still contains this region. > {noformat} > 2018-11-21 21:21:52,853 WARN > [RpcServer.default.FPBQ.Fifo.handler=48,queue=3,port=17000] > assignment.AssignmentManager: Killing ,17020,1542754626176: > rit=OPENING, location=,17020,1542863268158, table=, > region= reported OPEN on server=,17020,1542754626176 but > state has otherwise. > * ABORTING region server ,17020,1542754626176: > org.apache.hadoop.hbase.YouAreDeadException: rit=OPENING, location= 2>,17020,1542863268158, table=, region= reported OPEN on > server=,17020,1542754626176 but state has otherwise. > {noformat} > RS shuts down in an orderly manner and it can be seen from the log that this > region is actually not present (there's no line indicating it's being closed, > unlike for other regions). > I think there needs to be some sort of versioning for region operations > and/or in RS reports to allow master to account for concurrent operations and > avoid races. Probably per region with either a grace period or an additional > global version, so that master could avoid killing RS based on stale reports, > but still kill RS if it did retain an old version of the region state due to > some bug after acking a new version. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21531) race between region report and region move causes master to kill RS
Sergey Shelukhin created HBASE-21531: Summary: race between region report and region move causes master to kill RS Key: HBASE-21531 URL: https://issues.apache.org/jira/browse/HBASE-21531 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin In this case the delay between the ack from RS and the region report was 1.5s, so I'm not sure what caused the race (network hiccup? unreported retry by protobuf transport?) but in any case I don't see anything that prevents this from happening in a normal case, with a narrowed time window. Any delay (e.g. a GC pause on RS right after the report is built, and ack is sent for the close) or retries expands the window. Master starts moving the region and the source RS acks by 21:51,206 {noformat} Master: 2018-11-21 21:21:49,024 INFO [master/6:17000.Chore.1] master.HMaster: balance hri=, source=,17020,1542754626176, destination=,17020,1542863268158 ... Server: 2018-11-21 21:21:49,394 INFO [RS_CLOSE_REGION-regionserver/:17020-1] handler.UnassignRegionHandler: Close ... 2018-11-21 21:21:51,095 INFO [RS_CLOSE_REGION-regionserver/:17020-1] handler.UnassignRegionHandler: Closed {noformat} By then the region is removed from onlineRegions, so the master proceeds. {noformat} 2018-11-21 21:21:51,206 INFO [PEWorker-4] procedure2.ProcedureExecutor: Finished subprocedure(s) of pid=667, state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_CLOSED, hasLock=true; TransitRegionStateProcedure table=dummy_table, region=, REOPEN/MOVE; resume parent processing. 2018-11-21 21:21:51,386 INFO [PEWorker-13] assignment.RegionStateStore: pid=667 updating hbase:meta row=, regionState=OPENING, regionLocation=,17020,1542863268158 {noformat} There are no obvious errors/delays that I see in RS log, and it doesn't log starting to send the report. However, at 21:52.853 the report is processed that still contains this region. 
{noformat} 2018-11-21 21:21:52,853 WARN [RpcServer.default.FPBQ.Fifo.handler=48,queue=3,port=17000] assignment.AssignmentManager: Killing ,17020,1542754626176: rit=OPENING, location=,17020,1542863268158, table=dummy_table, region= reported OPEN on server=,17020,1542754626176 but state has otherwise. * ABORTING region server ,17020,1542754626176: org.apache.hadoop.hbase.YouAreDeadException: rit=OPENING, location=,17020,1542863268158, table=dummy_table, region= reported OPEN on server=,17020,1542754626176 but state has otherwise. {noformat} RS shuts down in an orderly manner and it can be seen from the log that this region is actually not present (there's no line indicating it's being closed, unlike for other regions). I think there needs to be some sort of versioning in RS reports to allow master to account for concurrent operations and avoid races. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
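One possible shape for the versioning proposed above, sketched with hypothetical names (RegionStateVersions, bumpOnMove): the master bumps a per-region version when it starts an operation, and a report built before that bump is treated as stale rather than grounds for killing the RS:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: per-region operation versions let the master distinguish a stale
// report (built before the move) from a server that truly disagrees.
class RegionStateVersions {
    private final Map<String, Long> current = new HashMap<>();

    /** Master calls this when it starts moving/reassigning a region. */
    long bumpOnMove(String region) {
        return current.merge(region, 1L, Long::sum);
    }

    /** True if a report at reportVersion may be acted on; older reports are dropped. */
    boolean isReportCurrent(String region, long reportVersion) {
        return reportVersion >= current.getOrDefault(region, 0L);
    }
}
```

A grace period or a global report sequence number, as the issue suggests, would still be needed to eventually kill an RS that keeps reporting an old version due to a real bug.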
[jira] [Created] (HBASE-21522) meta replicas appear to cause master restart to kill regionservers
Sergey Shelukhin created HBASE-21522: Summary: meta replicas appear to cause master restart to kill regionservers Key: HBASE-21522 URL: https://issues.apache.org/jira/browse/HBASE-21522 Project: HBase Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sergey Shelukhin On master restart, AM.start adds FIRST_META_REGIONINFO to regionStates; that has replica ID of 0. Before the meta is loaded, AssignmentManager.checkOnlineRegionsReportForMeta is called for RS reports, and that also only checks for the 0th replica of meta and loads it once discovered. Once the meta is loaded, RS reports are processed normally; however, nobody appears to add meta replicas to regionStates. So, when an RS hosting one reports in, it gets killed: {noformat} * ABORTING region server : org.apache.hadoop.hbase.YouAreDeadException: Not online: hbase:meta,,1_0001 * ABORTING region server : org.apache.hadoop.hbase.YouAreDeadException: Not online: hbase:meta,,1_0002 {noformat} This exception is thrown when regionStates has no record for the region. The RSes in question shut down in an orderly manner, and they do have the corresponding regions, which the master then assigns to someone else within a few minutes. Still, this seems less than ideal. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
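A minimal sketch of the missing step, with illustrative names (the real fix would live in AssignmentManager/RegionStates): register the meta replicas alongside replica 0 so that reports for hbase:meta,,1_0001 and _0002 are recognized:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: track replica regions of meta, not just replica 0, so an RS report
// mentioning a replica is not treated as "Not online".
class KnownRegions {
    private final Set<String> online = new HashSet<>();

    /** Register meta and all its replicas using the _000N encoded-name suffix. */
    void addMetaWithReplicas(int replicaCount) {
        for (int id = 0; id < replicaCount; id++) {
            online.add("hbase:meta,,1" + (id == 0 ? "" : "_000" + id));
        }
    }

    boolean isKnown(String encodedName) { return online.contains(encodedName); }
}
```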
[jira] [Created] (HBASE-17715) expose a sane API to package a standalone client jar
Sergey Shelukhin created HBASE-17715: Summary: expose a sane API to package a standalone client jar Key: HBASE-17715 URL: https://issues.apache.org/jira/browse/HBASE-17715 Project: HBase Issue Type: Task Reporter: Sergey Shelukhin Assignee: Enis Soztutar TableMapReduceUtil currently exposes a method that takes some info from job object iirc, and then makes a standalone jar and adds it to classpath. It would be nice to have an API that one can call with minimum necessary arguments (not dependent on job stuff, "tmpjars" and all that) that would make a standalone client jar at a given path and let the caller manage it after that. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-15585) RegionServer coprocessors are not flexible enough
Sergey Shelukhin created HBASE-15585: Summary: RegionServer coprocessors are not flexible enough Key: HBASE-15585 URL: https://issues.apache.org/jira/browse/HBASE-15585 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin While you can do all kinds of things with coprocessors, like arbitrarily discard memstore data or replace files randomly during compaction, I believe the ultimate power and flexibility is not there. The patch aims to address this shortcoming. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-10468) refactor AsyncProcess pt 2
[ https://issues.apache.org/jira/browse/HBASE-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-10468. -- Resolution: Won't Fix > refactor AsyncProcess pt 2 > -- > > Key: HBASE-10468 > URL: https://issues.apache.org/jira/browse/HBASE-10468 > Project: HBase > Issue Type: Improvement >Reporter: Sergey Shelukhin > > Followup for HBASE-10277. > Further work can be done, as discussed in comments, such as moving "global" > error management for streaming use case from AsyncProcess to HTable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-9933) checkAndMutate doesn't do checks for increments and appends
[ https://issues.apache.org/jira/browse/HBASE-9933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-9933. - Resolution: Not a Problem > checkAndMutate doesn't do checks for increments and appends > --- > > Key: HBASE-9933 > URL: https://issues.apache.org/jira/browse/HBASE-9933 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin > > See HRegionServer::mutate switch statement. For puts/deletes it checks > condition, for i/a it just does the operation. Discovered while doing stuff > for HBASE-3787 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HBASE-10794) multi-get should handle replica location missing from cache
[ https://issues.apache.org/jira/browse/HBASE-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-10794. -- Resolution: Fixed committed to branch > multi-get should handle replica location missing from cache > --- > > Key: HBASE-10794 > URL: https://issues.apache.org/jira/browse/HBASE-10794 > Project: HBase > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: hbase-10070 > > Attachments: HBASE-10794.01.patch, HBASE-10794.02.addendum.patch, > HBASE-10794.02.patch, HBASE-10794.03.patch, HBASE-10794.patch, > HBASE-10794.patch > > > Currently the way cache works is that the meta row is stored together for all > replicas of a region, so if some replicas are in recovery, getting locations > for a region will still go to cache only and return null locations for these. > Multi-get currently ignores such replicas. It should instead try to get > location again from meta if any replica is null. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HBASE-10634) Multiget doesn't fully work
[ https://issues.apache.org/jira/browse/HBASE-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-10634. -- Resolution: Fixed committed to branch > Multiget doesn't fully work > --- > > Key: HBASE-10634 > URL: https://issues.apache.org/jira/browse/HBASE-10634 > Project: HBase > Issue Type: Sub-task >Reporter: Devaraj Das >Assignee: Sergey Shelukhin > Fix For: hbase-10070 > > Attachments: 10634-1.1.txt, 10634-1.txt, HBASE-10634.02.patch, > HBASE-10634.patch, HBASE-10634.patch, multi.out, no-multi.out > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-10942) support parallel request cancellation for multi-get
Sergey Shelukhin created HBASE-10942: Summary: support parallel request cancellation for multi-get Key: HBASE-10942 URL: https://issues.apache.org/jira/browse/HBASE-10942 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-10877) HBase non-retriable exception list should be expanded
Sergey Shelukhin created HBASE-10877: Summary: HBase non-retrieable exception list should be expanded Key: HBASE-10877 URL: https://issues.apache.org/jira/browse/HBASE-10877 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Priority: Minor Example where retries do not make sense: {noformat} 2014-03-31 20:54:27,765 WARN [InputInitializer [Map 1] #0] org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Encountered problems when prefetch hbase:meta table: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions: Mon Mar 31 20:45:17 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: class com.google.protobuf.HBaseZeroCopyByteString cannot access its superclass com.google.protobuf.LiteralByteString Mon Mar 31 20:45:17 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:45:17 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:45:18 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:45:20 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:45:24 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:45:34 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:45:45 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:45:55 UTC 2014, 
org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:46:05 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:46:25 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:46:45 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:47:05 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:47:25 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:47:45 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:48:05 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:48:25 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:48:46 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:49:06 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:49:26 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:49:46 UTC 2014, 
org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:50:06 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:50:26 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:50:46 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:51:06 UTC 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@343d511e, java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString Mon Mar 31 20:51:26 UTC 2014, org.apache.hadoop.hbase.client.RpcR
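The excerpt shows dozens of retries of a linkage error that can never succeed. A hedged sketch of the proposed classification, using a hypothetical RetryClassifier (not an HBase class): treat LinkageError subclasses such as IllegalAccessError as non-retriable and fail fast:

```java
// Sketch: classify throwables before retrying. Linkage problems such as
// IllegalAccessError reflect a broken classpath and cannot be fixed by
// retrying, so the caller should fail immediately instead of exhausting
// 35 attempts.
class RetryClassifier {
    static boolean isRetriable(Throwable t) {
        if (t instanceof LinkageError) {
            return false;  // e.g. IllegalAccessError, NoClassDefFoundError
        }
        return true;  // transient failures (IOException etc.) may be retried
    }
}
```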
[jira] [Created] (HBASE-10794) multi-get should handle missing replica location from cache
Sergey Shelukhin created HBASE-10794: Summary: multi-get should handle missing replica location from cache Key: HBASE-10794 URL: https://issues.apache.org/jira/browse/HBASE-10794 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: hbase-10070 Currently the way cache works is that the meta row is stored together for all replicas of a region, so if some replicas are in recovery, getting locations for a region will still go to cache only and return null locations for these. Multi-get currently ignores such replicas. It should instead try to get location again from meta if any replica is null. -- This message was sent by Atlassian JIRA (v6.2#6252)
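The last sentence can be sketched as follows; ReplicaLookup and MetaFetch are hypothetical stand-ins for the client code, with locations reduced to strings: if any cached replica location is null, go back to meta once instead of silently skipping the replica:

```java
// Sketch: cached locations are indexed by replica id; a null entry means the
// replica's location is unknown (e.g. it is in recovery). Rather than ignoring
// such replicas, re-read the meta row.
class ReplicaLookup {
    interface MetaFetch { String[] fetch(); }

    static String[] locate(String[] cached, MetaFetch meta) {
        for (String loc : cached) {
            if (loc == null) {
                return meta.fetch();  // some replica missing: go back to meta
            }
        }
        return cached;  // all replica locations known: serve from cache
    }
}
```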
[jira] [Created] (HBASE-10491) RegionLocations::getRegionLocation can return unexpected replica
Sergey Shelukhin created HBASE-10491: Summary: RegionLocations::getRegionLocation can return unexpected replica Key: HBASE-10491 URL: https://issues.apache.org/jira/browse/HBASE-10491 Project: HBase Issue Type: Bug Affects Versions: hbase-10070 Reporter: Sergey Shelukhin The method returns the first non-null replica. If the first replica is assumed to always be non-null (discussed with Enis), then this code is not necessary; it should return the 0th one, and maybe assert it's not null. If that is not the case, then the code may be incorrect and may return a non-primary replica to some code (the locateRegion overload) that doesn't expect it. Perhaps the method should be called getAnyRegionReplica or something like that; and get(Primary?)RegionLocation should return the first. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
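The proposed naming can be illustrated with a simplified stand-in (strings instead of HRegionLocation; the method names follow the issue's suggestion): the primary accessor returns replica 0 even if it is null, while "any replica" is a separate, explicit method:

```java
// Simplified sketch of the split the issue proposes; not the real
// RegionLocations class.
class RegionLocationsSketch {
    private final String[] locations;  // index == replica id; 0 is the primary

    RegionLocationsSketch(String... locations) { this.locations = locations; }

    /** Callers expecting the primary never silently get another replica. */
    String getPrimaryRegionLocation() {
        return locations[0];
    }

    /** Explicitly named: any available replica, primary or not. */
    String getAnyRegionReplica() {
        for (String l : locations) {
            if (l != null) return l;
        }
        return null;
    }
}
```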
[jira] [Created] (HBASE-10479) HConnection interface is public but is used internally, and contains a bunch of methods
Sergey Shelukhin created HBASE-10479: Summary: HConnection interface is public but is used internally, and contains a bunch of methods Key: HBASE-10479 URL: https://issues.apache.org/jira/browse/HBASE-10479 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin HConnection has too many methods for a public interface, and some of them should not be public. It is used extensively for internal purposes, so we keep adding methods to it that may not make sense for a public interface. The idea is to create a separate internal interface inheriting HConnection, copy some methods to it, and deprecate them on HConnection. New methods for internal use would be added to the new interface; the deprecated methods would eventually be removed from the public interface. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
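A hedged sketch of the proposed split (all names and signatures simplified; ClusterConnection is one possible name for the internal interface): the public interface keeps the method as deprecated for compatibility, while internal code moves to the subinterface:

```java
// Sketch only: the public interface is narrowed over time, with internal-only
// methods living on a subinterface that inherits it.
interface HConnectionSketch {
    boolean isClosed();

    /** @deprecated internal use only; moved to the internal subinterface */
    @Deprecated
    String locateRegion(String table, String row);
}

interface ClusterConnectionSketch extends HConnectionSketch {
    String locateRegion(String table, String row);  // stays here, undeprecated
    void clearRegionCache();                        // new internal-only method
}

class ConnectionImpl implements ClusterConnectionSketch {
    public boolean isClosed() { return false; }
    public String locateRegion(String table, String row) { return "rs1"; }
    public void clearRegionCache() { }
}
```

Internal callers would hold the subinterface type, so new internal methods never widen the public surface.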
[jira] [Created] (HBASE-10468) refactor AsyncProcess pt 2
Sergey Shelukhin created HBASE-10468: Summary: refactor AsyncProcess pt 2 Key: HBASE-10468 URL: https://issues.apache.org/jira/browse/HBASE-10468 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Followup for HBASE-10277. Further work can be done, as discussed in comments, such as moving "global" error management for streaming use case from AsyncProcess to HTable. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10440) integration tests fail due to nonce collisions
Sergey Shelukhin created HBASE-10440: Summary: integration tests fail due to nonce collisions Key: HBASE-10440 URL: https://issues.apache.org/jira/browse/HBASE-10440 Project: HBase Issue Type: Test Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Before HBASE-9899 is implemented, and after HBASE-3787, HBase throws OperationConflictException when a client retries an already-successful non-idempotent request because the response didn't reach the client. Integration tests run into this when CM kills servers hard during the relatively-recently-added appends and increments. They need to handle this; read verification would make sure the results are still correct. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
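The handling the tests need can be sketched like this (hypothetical names; the real OperationConflictException lives in HBase's exception hierarchy): a conflict on retry means the first attempt already went through, so the client counts it as success and leaves correctness to read verification:

```java
// Sketch: a nonce collision on a retried increment/append means the original
// attempt succeeded; the integration test should not fail on it.
class NonceAwareClient {
    static class OperationConflictException extends RuntimeException { }

    interface Op { void run() throws OperationConflictException; }

    /** Returns true if the operation is known to have been applied. */
    static boolean runWithRetry(Op op) {
        try {
            op.run();
            return true;
        } catch (OperationConflictException e) {
            return true;  // earlier attempt went through; verify via reads later
        }
    }
}
```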
[jira] [Created] (HBASE-10427) clean up HRegionLocation/ServerName usage
Sergey Shelukhin created HBASE-10427: Summary: clean up HRegionLocation/ServerName usage Key: HBASE-10427 URL: https://issues.apache.org/jira/browse/HBASE-10427 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor I noticed that AsyncProcess updates the cached location on failures using a single HRL object that is the key of a map that is intended to be keyed by server, even though the entry contains requests for multiple regions (i.e. a MultiAction contains requests for regions A, B, C, and sits in a map by HRL with the HRL from server A as the key; in case of a failure for e.g. the request to B, or the entire multiaction, the location from the map key will be passed to the updateCache... methods, even though it's not for the correct region). It may cause some subtle mistakes in cache updates. I think it'd be good to clean up HRL usage around AP and other classes - if we intend to have a server name, then we should use ServerName, not HRL. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10418) give blocks of smaller store files priority in cache
Sergey Shelukhin created HBASE-10418: Summary: give blocks of smaller store files priority in cache Key: HBASE-10418 URL: https://issues.apache.org/jira/browse/HBASE-10418 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Sergey Shelukhin That's just an idea at this point; I don't have a patch, nor a plan to make one in the near future. It's especially good for datasets that don't fit in memory, and when scans are involved. Scans (and gets, in the absence of bloom filters' help) have to read from all store files. A short-range request will hit one block in every file. If small files are more likely to be entirely available in memory, on average requests will hit fewer blocks from the FS. For scans that read a lot of data, it's better to read blocks in sequence from a big file and blocks of small files from cache, rather than a mix of FS and cached blocks from different files, because the (HBase) blocks of a big file would be sequential in one HDFS block. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
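A toy illustration of the eviction preference (not the BlockCache API; CachedBlock and evictionVictim are made up): between two otherwise-equal candidates, evict the block from the larger store file, so small files tend to stay fully cached:

```java
// Sketch: attach the owning store file's size to each cached block and use it
// as a tiebreaker when choosing eviction victims.
class CachedBlock {
    final String file;
    final long storeFileSize;

    CachedBlock(String file, long storeFileSize) {
        this.file = file;
        this.storeFileSize = storeFileSize;
    }

    /** Of two otherwise-equal blocks, evict the one from the bigger file first. */
    static CachedBlock evictionVictim(CachedBlock a, CachedBlock b) {
        return a.storeFileSize >= b.storeFileSize ? a : b;
    }
}
```

In a real cache this would be one factor among others (recency, frequency, priority tier), not the sole criterion.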
[jira] [Created] (HBASE-10390) expose checkAndPut/Delete custom comparators thru HTable
Sergey Shelukhin created HBASE-10390: Summary: expose checkAndPut/Delete custom comparators thru HTable Key: HBASE-10390 URL: https://issues.apache.org/jira/browse/HBASE-10390 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin checkAndPut/Delete appear to support custom comparators. However, thru HTable there's no way to pass one; it always creates a BinaryComparator from the value. It would be good to expose the custom ones in the API. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
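The proposed API shape might look like the following sketch (hypothetical types; real HBase comparators extend ByteArrayComparable and the real method lives on HTable): the check accepts an arbitrary comparator instead of always building a BinaryComparator from the value:

```java
// Sketch: an overload that takes a caller-supplied comparator for the stored
// value, rather than hardcoding byte-equality.
interface ValueComparator { boolean matches(byte[] stored); }

class CheckAndMutateSketch {
    /** Apply the put only if the comparator accepts the currently stored value. */
    static boolean checkAndPut(byte[] stored, ValueComparator cmp, Runnable put) {
        if (cmp.matches(stored)) {
            put.run();
            return true;
        }
        return false;
    }
}
```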
[jira] [Resolved] (HBASE-10339) Mutation::getFamilyMap method was lost in 98
[ https://issues.apache.org/jira/browse/HBASE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HBASE-10339. -- Resolution: Fixed committed to 98 and trunk > Mutation::getFamilyMap method was lost in 98 > > > Key: HBASE-10339 > URL: https://issues.apache.org/jira/browse/HBASE-10339 > Project: HBase > Issue Type: Bug >Affects Versions: 0.98.0, 0.99.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 0.98.0, 0.99.0 > > Attachments: HBASE-10339.patch > > > When backward compat work was done in several jiras, this method was missed. > First the return type was changed, then the method was renamed to not break > the callers via the new return type, but the legacy method was never re-added as > far as I see -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10339) Mutation::getFamilyMap method was lost in 96
Sergey Shelukhin created HBASE-10339: Summary: Mutation::getFamilyMap method was lost in 96 Key: HBASE-10339 URL: https://issues.apache.org/jira/browse/HBASE-10339 Project: HBase Issue Type: Bug Affects Versions: 0.96.0, 0.98.0, 0.99.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin When backward compat work was done in several jiras, this method was missed. First the return type was changed, then the method was renamed to not break the callers via the new return type, but the legacy method was never re-added as far as I see. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10288) make mvcc an (optional) part of KV serialization
Sergey Shelukhin created HBASE-10288: Summary: make mvcc an (optional) part of KV serialization Key: HBASE-10288 URL: https://issues.apache.org/jira/browse/HBASE-10288 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin This has been suggested in HBASE-10241. Mvcc can currently be serialized in HFile, but the mechanism is... magical. We might want to make it a part of proper serialization of the KV. It can be done using tags, but we may not want the overhead given that it will be in many KVs, so it might require HFileFormat vN+1. Regardless, the external mechanism would need to be removed while also preserving backward compat. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10277) refactor AsyncProcess
Sergey Shelukhin created HBASE-10277: Summary: refactor AsyncProcess Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on "original index" of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the "global" part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. I am not sure yet, but maybe sending of the original index to server in "ClientProtos.MultiAction" can also be avoided. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
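The proposed separation can be sketched with two hypothetical classes: a GlobalTracker that keeps the per-server load accounting shared across submits, and a SubmitContext so that state like hasError is no longer ambiguous between concurrent submit calls:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: the "global" part tracks outstanding actions per server, shared by
// all submit calls; everything call-specific moves into a per-submit context.
class GlobalTracker {
    private final Map<String, Integer> tasksPerServer = new HashMap<>();

    void start(String server) { tasksPerServer.merge(server, 1, Integer::sum); }
    void done(String server) { tasksPerServer.merge(server, -1, Integer::sum); }
    int outstanding(String server) { return tasksPerServer.getOrDefault(server, 0); }
}

class SubmitContext {
    private boolean hasError;  // no longer shared between concurrent submit calls

    void markError() { hasError = true; }
    boolean hasError() { return hasError; }
    // In a fuller sketch this would also hold the initial action list and the
    // original-index mapping for retries and callbacks.
}
```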