Any progress Sahil? There was an issue fixed where we'd write the clusterid with server zk client but then would have trouble picking it up with the new zk read-only client seen in tests and fixed subsequent to beta-1. This looks like it.
Thanks for trying the beta. S On Thu, Feb 22, 2018 at 11:39 PM, sahil aggarwal <[email protected]> wrote: > It doesn't have hbase-shaded-client in classpath. I realized my build was > run with -Dzookeeper.version=3.4.6 but in pom we have 3.4.10. I am gonna > try rebuilding it with 3.4.10. > > On 23 February 2018 at 00:29, Josh Elser <[email protected]> wrote: > > > This sounds like something I've seen in the past but was unable to get > > past. I think I was seeing it when the hbase-shaded-client was on the > > classpath. Could you see if the presence of that artifact makes a > > difference one way or another? > > > > > > On 2/22/18 12:52 PM, sahil aggarwal wrote: > > > >> Yes, it is a clean setup. > >> > >> Here are logs on region startup > >> > >> 2018-02-22 22:17:22,259 DEBUG [main] zookeeper.ClientCnxn: > >> zookeeper.disableAutoWatchReset is false > >> 2018-02-22 22:17:22,401 INFO [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Opening socket connection to server perf-zk-1/ > >> 10.33.225.67:2181. Will not attempt to authenticate using SASL (unknown > >> error) > >> 2018-02-22 22:17:22,407 INFO [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Socket connection established to perf-zk-1/ > >> 10.33.225.67:2181, initiating session > >> 2018-02-22 22:17:22,409 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Session establishment request sent on perf-zk-1/ > >> 10.33.225.67:2181 > >> 2018-02-22 22:17:22,415 INFO [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Session establishment complete on server > perf-zk-1/ > >> 10.33.225.67:2181, sessionid = 0x36146d5de4467de, negotiated timeout = > >> 20000 > >> 2018-02-22 22:17:22,423 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Reading reply sessionid:0x36146d5de4467de, > packet:: > >> clientPath:null serverPath:null finished:false header:: 1,3 > replyHeader:: > >> 1,111751355931,0 request:: '/hbase-unsecure2/master,T response:: > >> s{111750564873,111750564873,1519309851875,1519309851875,0,0, > >> 0,171496145189271239,74,0,111750564873} > >> 2018-02-22 22:17:22,426 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Reading reply sessionid:0x36146d5de4467de, > packet:: > >> clientPath:null serverPath:null finished:false header:: 2,4 > replyHeader:: > >> 2,111751355931,0 request:: '/hbase-unsecure2/master,T response:: > >> #ffffffff000146d61737465723a36303030304c11fffffff11646ffffff > >> e12effffffd450425546a25a18706572662d636f736d6f732d686e6e2d61 > >> 2d33363433363010ffffffe0ffffffd4318ffffff9fffffff88ffffffb2f > >> fffffefffffff9b2c10018ffffffeaffffffd43,s{111750564873,11175 > >> 0564873,1519309851875,1519309851875,0,0,0,171496145189271239 > >> ,74,0,111750564873} > >> 2018-02-22 22:17:22,428 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Reading reply sessionid:0x36146d5de4467de, > packet:: > >> clientPath:null serverPath:null finished:false header:: 3,3 > replyHeader:: > >> 3,111751355931,0 request:: '/hbase-unsecure2/running,T response:: > >> s{111750565002,111750565002,1519309853317,1519309853317,0,0, > >> 0,0,59,0,111750565002} > >> 2018-02-22 22:17:22,430 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Reading reply sessionid:0x36146d5de4467de, > packet:: > >> clientPath:null serverPath:null finished:false header:: 4,4 > replyHeader:: > >> 4,111751355931,0 request:: '/hbase-unsecure2/running,T response:: > >> #ffffffff000146d61737465723a363030303021ffffffea7f3effffff8a > >> 28576450425546a1c546875204665622032322032303a30303a353320495 > >> 3542032303138,s{111750565002,111750565002,1519309853317,1519 > >> 309853317,0,0,0,0,59,0,111750565002} > >> 2018-02-22 22:17:22,459 DEBUG [main] ipc.RpcExecutor: Started 0 > >> default.FPBQ.Fifo handlers, qsize=10 on port=16020 > >> 2018-02-22 22:17:22,475 DEBUG [main] ipc.RpcExecutor: Started 0 > >> priority.FPBQ.Fifo handlers, qsize=2 on port=16020 > >> 2018-02-22 22:17:22,478 DEBUG [main] ipc.RpcExecutor: Started 0 > >> replication.FPBQ.Fifo handlers, qsize=1 on port=16020 > >> 2018-02-22 22:17:22,524 INFO [main] util.log: Logging initialized > @3325ms > >> 2018-02-22 22:17:22,625 INFO [main] http.HttpRequestLog: Http request > log > >> for http.requests.regionserver is not defined > >> 2018-02-22 22:17:22,651 INFO [main] http.HttpServer: Added global > filter > >> 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputF > >> ilter) > >> 2018-02-22 22:17:22,651 INFO [main] http.HttpServer: Added global > filter > >> 'clickjackingprevention' > >> (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter) > >> 2018-02-22 22:17:22,654 INFO [main] http.HttpServer: Added filter > >> static_user_filter > >> (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$ > >> StaticUserFilter) > >> to context regionserver > >> 2018-02-22 22:17:22,654 INFO [main] http.HttpServer: Added filter > >> static_user_filter > >> (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$ > >> StaticUserFilter) > >> to context static > >> 2018-02-22 22:17:22,654 INFO [main] http.HttpServer: Added filter > >> static_user_filter > >> (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$ > >> StaticUserFilter) > >> to context logs > >> 2018-02-22 22:17:22,691 INFO [main] http.HttpServer: Jetty bound to > port > >> 60030 > >> 2018-02-22 22:17:22,693 INFO [main] server.Server: > jetty-9.3.19.v20170502 > >> 2018-02-22 22:17:22,765 INFO [main] handler.ContextHandler: Started > >> o.e.j.s.ServletContextHandler@7435a578 > >> {/logs,file:///var/log/hbase/,AVAILABLE} > >> 2018-02-22 22:17:22,765 INFO [main] handler.ContextHandler: Started > >> o.e.j.s.ServletContextHandler@13047d7d > >> {/static,file:///usr/lib/hbase/hbase-webapps/static/,AVAILABLE} > >> 2018-02-22 22:17:22,912 INFO [main] handler.ContextHandler: Started > >> o.e.j.w.WebAppContext@7428de63 > >> {/,file:///usr/lib/hbase/hbase-webapps/regionserver/,AVAILAB > >> LE}{file:/usr/lib/hbase/hbase-webapps/regionserver} > >> 2018-02-22 22:17:22,917 INFO [main] server.AbstractConnector: Started > >> ServerConnector@35636217{HTTP/1.1,[http/1.1]}{0.0.0.0:60030} > >> 2018-02-22 22:17:22,918 INFO [main] server.Server: Started @3720ms > >> 2018-02-22 22:17:22,946 INFO [regionserver/perf-hdn-a-364355/ > >> 10.32.73.176:16020] zookeeper.ReadOnlyZKClient: Start read only > zookeeper > >> connection 0x1e50cf7f to > >> perf-zk-a-364363:2181,perf-zk-2:2181,perf-zk-1:2181, session timeout > >> 120000 > >> ms, retries 30, retry interval 1000 ms, keep alive 60000 ms > >> 2018-02-22 22:17:22,951 INFO [ReadOnlyZKClient] zookeeper.ZooKeeper: > >> Initiating client connection, > >> connectString=perf-zk-a-364363:2181,perf-zk-2:2181,perf-zk-1:2181 > >> sessionTimeout=120000 > >> watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$ > >> Lambda$27/147794321@4dcda6ec > >> 2018-02-22 22:17:22,953 INFO [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Opening socket connection to server perf-zk-2/ > >> 10.33.101.49:2181. Will not attempt to authenticate using SASL (unknown > >> error) > >> 2018-02-22 22:17:22,954 INFO [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Socket connection established to perf-zk-2/ > >> 10.33.101.49:2181, initiating session > >> 2018-02-22 22:17:22,954 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Session establishment request sent on perf-zk-2/ > >> 10.33.101.49:2181 > >> 2018-02-22 22:17:22,956 INFO [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Session establishment complete on server > perf-zk-2/ > >> 10.33.101.49:2181, sessionid = 0x26146d5de5c7181, negotiated timeout = > >> 20000 > >> 2018-02-22 22:17:22,959 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Reading reply sessionid:0x26146d5de5c7181, > packet:: > >> clientPath:/hbase-unsecure2/hbaseid serverPath:/hbase-unsecure2/hbaseid > >> finished:false header:: 1,4 replyHeader:: 1,111751356003,0 request:: > >> '/hbase-unsecure2/hbaseid,F response:: > >> #ffffffff000146d61737465723a3630303030395e49ffffffefffffff98 > >> ffffffd0262150425546a2430653037386566362d363931362d343665332 > >> d386335652d653237666264303135326337,s{111750564985,111750564 > >> 985,1519309853186,1519309853186,0,0,0,0,67,0,111750564985} > >> 2018-02-22 22:17:29,103 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:17:29,630 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x26146d5de5c7181 > >> after 0ms > >> 2018-02-22 22:17:35,777 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:17:36,304 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x26146d5de5c7181 > >> after 0ms > >> 2018-02-22 22:17:42,449 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:17:42,978 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x26146d5de5c7181 > >> after 0ms > >> 2018-02-22 22:17:49,123 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:17:49,650 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x26146d5de5c7181 > >> after 0ms > >> 2018-02-22 22:17:55,797 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 1ms > >> 2018-02-22 22:17:56,322 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x26146d5de5c7181 > >> after 0ms > >> 2018-02-22 22:18:02,470 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:18:02,994 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x26146d5de5c7181 > >> after 0ms > >> 2018-02-22 22:18:09,142 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:18:09,668 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x26146d5de5c7181 > >> after 0ms > >> 2018-02-22 22:18:15,814 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:18:16,340 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x26146d5de5c7181 > >> after 0ms > >> 2018-02-22 22:18:22,487 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:18:23,007 INFO [ReadOnlyZKClient] > >> zookeeper.ReadOnlyZKClient: 0x1e50cf7f no activities for 60000 ms, close > >> active connection. Will reconnect next time when there are new requests. > >> 2018-02-22 22:18:23,007 DEBUG [ReadOnlyZKClient] zookeeper.ZooKeeper: > >> Closing session: 0x26146d5de5c7181 > >> 2018-02-22 22:18:23,007 DEBUG [ReadOnlyZKClient] zookeeper.ClientCnxn: > >> Closing client for session: 0x26146d5de5c7181 > >> 2018-02-22 22:18:23,009 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: Reading reply sessionid:0x26146d5de5c7181, > packet:: > >> clientPath:null serverPath:null finished:false header:: 2,-11 > >> replyHeader:: 2,111751363806,0 request:: null response:: null > >> 2018-02-22 22:18:23,009 DEBUG [ReadOnlyZKClient] zookeeper.ClientCnxn: > >> Disconnecting client for session: 0x26146d5de5c7181 > >> 2018-02-22 22:18:23,009 INFO [ReadOnlyZKClient] zookeeper.ZooKeeper: > >> Session: 0x26146d5de5c7181 closed > >> 2018-02-22 22:18:23,009 INFO [ReadOnlyZKClient-EventThread] > >> zookeeper.ClientCnxn: EventThread shut down > >> 2018-02-22 22:18:23,009 DEBUG [ReadOnlyZKClient-SendThread(p > >> erf-zk-2:2181)] > >> zookeeper.ClientCnxn: An exception was thrown while closing send thread > >> for > >> session 0x26146d5de5c7181 : Unable to read additional data from server > >> sessionid 0x26146d5de5c7181, likely server has closed socket > >> 2018-02-22 22:18:29,158 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:18:35,830 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:18:42,503 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:18:49,175 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:18:55,849 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:19:02,519 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:19:09,192 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:19:15,866 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> 2018-02-22 22:19:22,538 DEBUG [main-SendThread(perf-zk-1:2181)] > >> zookeeper.ClientCnxn: Got ping response for sessionid: 0x36146d5de4467de > >> after 0ms > >> > >> On 22 February 2018 at 23:13, Ted Yu <[email protected]> wrote: > >> > >> Can you show more of the region server log ? > >>> Was the cluster started clean (without any data) ? > >>> > >>> There have been a lot of changes since 2.0.0-beta-1 was released (both > in > >>> terms of correctness and performance). > >>> > >>> If possible, please deploy 2.0 SNAPSHOT for further testing. > >>> > >>> Cheers > >>> > >>> On Thu, Feb 22, 2018 at 9:39 AM, sahil aggarwal <[email protected] > > > >>> wrote: > >>> > >>> Hi, > >>>> > >>>> I am trying to get 2.0.0-beta-1 cluster up to do some perf test but > not > >>>> able to get region servers up. Its stuck in initializing state. Looks > >>>> > >>> like > >>> > >>>> it is stuck in getting the hbaseId from zk: > >>>> > >>>> jstack says: > >>>> > >>>> "regionserver/perf-rs-1/10.32.73.176:16020" #24 prio=5 os_prio=0 > >>>> tid=0x00007f19e45cc000 nid=0x1b13 waiting on condition > >>>> > >>> [0x00007f19de6a7000] > >>> > >>>> java.lang.Thread.State: WAITING (parking) > >>>> at sun.misc.Unsafe.park(Native Method) > >>>> - parking to wait for <0x00000001846d8b98> (a > >>>> java.util.concurrent.CompletableFuture$WaitNode) > >>>> at java.util.concurrent.locks.LockSupport.park(LockSupport. > >>>> java:175) > >>>> at > >>>> java.util.concurrent.CompletableFuture$WaitNode. > >>>> block(CompletableFuture.java:271) > >>>> at > >>>> java.util.concurrent.ForkJoinPool.managedBlock( > ForkJoinPool.java:3226) > >>>> at > >>>> java.util.concurrent.CompletableFuture.waitingGet( > >>>> CompletableFuture.java:319) > >>>> at > >>>> java.util.concurrent.CompletableFuture.get( > CompletableFuture.java:2224) > >>>> at > >>>> org.apache.hadoop.hbase.client.ConnectionImplementation. > >>>> > >>> retrieveClusterId( > >>> > >>>> ConnectionImplementation.java:526) > >>>> at > >>>> org.apache.hadoop.hbase.client.ConnectionImplementation.<init>( > >>>> ConnectionImplementation.java:286) > >>>> at > >>>> org.apache.hadoop.hbase.client.ConnectionUtils$ > >>>> ShortCircuitingClusterConnection.<init>(ConnectionUtils.java:141) > >>>> at > >>>> org.apache.hadoop.hbase.client.ConnectionUtils$ > >>>> ShortCircuitingClusterConnection.<init>(ConnectionUtils.java:132) > >>>> at > >>>> org.apache.hadoop.hbase.client.ConnectionUtils. > >>>> createShortCircuitConnection(ConnectionUtils.java:185) > >>>> at > >>>> org.apache.hadoop.hbase.regionserver.HRegionServer. > >>>> createClusterConnection(HRegionServer.java:770) > >>>> at > >>>> org.apache.hadoop.hbase.regionserver.HRegionServer. > >>>> > >>> setupClusterConnection( > >>> > >>>> HRegionServer.java:801) > >>>> - locked <0x000000019ccd1bc8> (a > >>>> org.apache.hadoop.hbase.regionserver.HRegionServer) > >>>> at > >>>> org.apache.hadoop.hbase.regionserver.HRegionServer. > >>>> preRegistrationInitialization(HRegionServer.java:816) > >>>> at > >>>> org.apache.hadoop.hbase.regionserver.HRegionServer. > >>>> run(HRegionServer.java:925) > >>>> at java.lang.Thread.run(Thread.java:745) > >>>> > >>>> > >>>> Even though it seem to got the response from zk: > >>>> > >>>> 2018-02-22 22:17:22,959 DEBUG [ReadOnlyZKClient-SendThread( > >>>> perf-zk-1:2181)] > >>>> zookeeper.ClientCnxn: Reading reply sessionid:0x26146d5de5c7181, > >>>> packet:: > >>>> clientPath:/hbase-unsecure2/hbaseid serverPath:/hbase-unsecure2/ > hbaseid > >>>> finished:false header:: 1,4 replyHeader:: 1,111751356003,0 request:: > >>>> '/hbase-unsecure2/hbaseid,F response:: > >>>> #ffffffff000146d61737465723a3630303030395e49ffffffefffffff98f > >>>> fffffd0262150425546a2430653037386566362d363931362d343665332d > >>>> 386335652d653237666264303135326337,s{111750564985, > >>>> 111750564985,1519309853186,1519309853186,0,0,0,0,67,0,111750564985} > >>>> > >>>> Any pointers? > >>>> > >>>> > >>>> Thanks, > >>>> Sahil > >>>> > >>>> > >>> > >> >
