Thanks, Sergey. Sounds like you're on to it. We could try configuring those
tests with a non-zero thread pool size so they don't
use SynchronousQueue. Want to file a JIRA with this info so we don't lose
track of it?
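
Something along these lines is what I have in mind. It's just a rough, untested
sketch: it assumes the ITs can hand HTable their own executor via the
HTable(Configuration, TableName, ExecutorService) constructor, and the
BoundedScanPool helper name and the pool size of 10 are only placeholders:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HTable;

    public class BoundedScanPool {
        // A small fixed-size pool backed by a LinkedBlockingQueue: extra scan
        // submissions wait in the queue instead of forcing a new thread each
        // time, which is what the SynchronousQueue-based default does.
        static ExecutorService newBoundedPool(int threads) {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
            pool.allowCoreThreadTimeOut(true); // let idle threads exit between tests
            return pool;
        }

        // Passing our own ExecutorService means HTable skips its default
        // SynchronousQueue-backed executor for this table's operations.
        static HTable openTable(Configuration conf, String name) throws Exception {
            return new HTable(conf, TableName.valueOf(name), newBoundedPool(10));
        }
    }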

    James

On Tue, May 17, 2016 at 11:21 PM, Sergey Soldatov <[email protected]>
wrote:

> Getting back to the failures with OOM / "unable to create new native thread":
> those files each have around 100 tests that run on top of Phoenix. In total
> they generate over 2500 scans (system.catalog, sequences, and regular scans
> over the table). The problem is that on the HBase side all scans go through
> the ThreadPoolExecutor created in HTable, which uses SynchronousQueue as its
> work queue. From the javadoc for ThreadPoolExecutor:
>
> *Direct handoffs. A good default choice for a work queue is a
> SynchronousQueue that hands off tasks to threads without otherwise holding
> them. Here, an attempt to queue a task will fail if no threads are
> immediately available to run it, so a new thread will be constructed. This
> policy avoids lockups when handling sets of requests that might have
> internal dependencies. Direct handoffs generally require unbounded
> maximumPoolSizes to avoid rejection of new submitted tasks. This in turn
> admits the possibility of unbounded thread growth when commands continue to
> arrive on average faster than they can be processed.*
>
> And we hit exactly that last case. But there is still a question: since all
> those tests pass and the scans complete during execution (I checked that),
> it's not clear why all those threads are still alive. If someone has a
> suggestion as to why this could happen, it would be interesting to hear;
> otherwise I will dig deeper a bit later. It may also be worth changing the
> queue in HBase to something less aggressive in terms of thread creation.
>
> Thanks,
> Sergey
>
>
> On Thu, May 5, 2016 at 8:24 AM, James Taylor <[email protected]>
> wrote:
>
> > Looks like all Jenkins builds are failing, but it seems environmental? Do
> > we need to exclude some particular kind of host(s)?
> >
> > On Wed, May 4, 2016 at 5:25 PM, James Taylor <[email protected]>
> > wrote:
> >
> > > Thanks, Sergey!
> > >
> > > On Wed, May 4, 2016 at 5:22 PM, Sergey Soldatov <
> > [email protected]>
> > > wrote:
> > >
> > >> James,
> > >> Ah, didn't notice that timeouts are not shown in the final report as
> > >> failures. It seems that the build is using JDK 1.7 and the tests run
> > >> out of PermGen space. Fixed in PHOENIX-2879.
> > >>
> > >> Thanks,
> > >> Sergey
> > >>
> > >> On Wed, May 4, 2016 at 1:48 PM, James Taylor <[email protected]>
> > >> wrote:
> > >> > Sergey, on master branch (which is HBase 1.2):
> > >> > https://builds.apache.org/job/Phoenix-master/1214/console
> > >> >
> > >> > On Wed, May 4, 2016 at 1:31 PM, Sergey Soldatov <
> > >> [email protected]>
> > >> > wrote:
> > >> >>
> > >> >> James,
> > >> >> Regarding HivePhoenixStoreIT. Are you talking about
> > >> >> Phoenix-4.x-HBase-1.0  job? Last build passed it successfully.
> > >> >>
> > >> >>
> > >> >> On Wed, May 4, 2016 at 10:15 AM, James Taylor <
> > [email protected]>
> > >> >> wrote:
> > >> >> > Our Jenkins builds have improved, but we're seeing some issues:
> > >> >> > - timeouts with the new org.apache.phoenix.hive.HivePhoenixStoreIT test.
> > >> >> > - consistent failure with the 4.x-HBase-1.1 build. I suspect that
> > >> >> > Jenkins build is out-of-date, as we haven't had a 4.x-HBase-1.1 branch
> > >> >> > for quite a while. There are likely some changes that were made to the
> > >> >> > other Jenkins build scripts that weren't made to this one.
> > >> >> > - flapping of the
> > >> >> > org.apache.phoenix.end2end.index.ReadOnlyIndexFailureIT.testWriteFailureReadOnlyIndex
> > >> >> > test in 0.98 and 1.0
> > >> >> > - no email sent for 0.98 build (as far as I can tell)
> > >> >> >
> > >> >> > If folks have time to look into these, that'd be much
> appreciated.
> > >> >> >
> > >> >> >     James
> > >> >> >
> > >> >> >
> > >> >> >
> > >> >> > On Sat, Apr 30, 2016 at 11:55 AM, James Taylor <
> > >> [email protected]>
> > >> >> > wrote:
> > >> >> >
> > >> >> >> The defaults when tests are running are much lower than the standard
> > >> >> >> Phoenix defaults (see QueryServicesTestImpl and
> > >> >> >> BaseTest.setUpConfigForMiniCluster()). It's unclear to me why the
> > >> >> >> HashJoinIT and SortMergeJoinIT tests (I think these are the culprits)
> > >> >> >> do not seem to adhere to these (or maybe override them?). They fail
> > >> >> >> for me on my Mac, but they do pass on a Linux box. Would be awesome
> > >> >> >> if someone could investigate and submit a patch to fix these.
> > >> >> >>
> > >> >> >> Thanks,
> > >> >> >> James
> > >> >> >>
> > >> >> >> On Sat, Apr 30, 2016 at 11:47 AM, Nick Dimiduk <
> > [email protected]>
> > >> >> >> wrote:
> > >> >> >>
> > >> >> >>> The default thread pool sizes for HDFS, HBase, ZK, and the Phoenix
> > >> >> >>> client are all contributing to this huge thread count.
> > >> >> >>>
> > >> >> >>> A good starting point would be to take a jstack of the IT process
> > >> >> >>> and count, grouping threads with similar names. Reconfigure to
> > >> >> >>> reduce all those groups to something like 10 each, and see if the
> > >> >> >>> test still runs reliably on local hardware.
> > >> >> >>>
> > >> >> >>> On Friday, April 29, 2016, Sergey Soldatov <
> > >> [email protected]>
> > >> >> >>> wrote:
> > >> >> >>>
> > >> >> >>> > By the way, we need to do something about those OOMs and "unable
> > >> >> >>> > to create new native thread" errors in the ITs. It's quite strange
> > >> >> >>> > to see that kind of failure in a 10-line test, especially when
> > >> >> >>> > queries against a table with fewer than 10 rows generate over 2500
> > >> >> >>> > threads. Does anybody know whether it's a ZK-related issue?
> > >> >> >>> >
> > >> >> >>> > On Fri, Apr 29, 2016 at 7:51 AM, James Taylor
> > >> >> >>> > > <[email protected]> wrote:
> > >> >> >>> > > A patch would be much appreciated, Sergey.
> > >> >> >>> > >
> > >> >> >>> > > On Fri, Apr 29, 2016 at 3:26 AM, Sergey Soldatov <
> > >> >> >>> > [email protected]>
> > >> >> >>> > > wrote:
> > >> >> >>> > >
> > >> >> >>> > >> As for flume module - flume-ng is coming with commons-io
> 2.1
> > >> >> >>> > >> while
> > >> >> >>> > >> hadoop & hbase require org.apache.commons.io.Charsets
> which
> > >> was
> > >> >> >>> > >> introduced in 2.3. Easy way is to move dependency on
> > flume-ng
> > >> >> >>> > >> after
> > >> >> >>> > >> the dependencies on hbase/hadoop.
> > >> >> >>> > >>
> > >> >> >>> > >> The last thing about ConcurrentHashMap - it definitely
> means
> > >> that
> > >> >> >>> > >> the
> > >> >> >>> > >> code was compiled with 1.8 since 1.7 returns a simple Set
> > >> while
> > >> >> >>> > >> 1.8
> > >> >> >>> > >> returns KeySetView
> > >> >> >>> > >>
> > >> >> >>> > >>
> > >> >> >>> > >>
> > >> >> >>> > >> > On Thu, Apr 28, 2016 at 4:08 PM, Josh Elser <[email protected]> wrote:
> > >> >> >>> > >> > *tl;dr*
> > >> >> >>> > >> >
> > >> >> >>> > >> > * I'm removing ubuntu-us1 from all pools
> > >> >> >>> > >> > * Phoenix-Flume ITs look busted
> > >> >> >>> > >> > * UpsertValuesIT looks busted
> > >> >> >>> > >> > * Something is weirdly wrong with Phoenix-4.x-HBase-1.1 in its entirety.
> > >> >> >>> > >> >
> > >> >> >>> > >> > Details below...
> > >> >> >>> > >> >
> > >> >> >>> > >> > It looks like we have a bunch of different reasons for the
> > >> >> >>> > >> > failures. Starting with Phoenix-master:
> > >> >> >>> > >> >
> > >> >> >>> > >> >>>>
> > >> >> >>> > >> > org.apache.phoenix.schema.NewerTableAlreadyExistsException: ERROR 1013
> > >> >> >>> > >> > (42M04): Table already exists. tableName=T
> > >> >> >>> > >> >         at org.apache.phoenix.end2end.UpsertValuesIT.testBatchedUpsert(UpsertValuesIT.java:476)
> > >> >> >>> > >> > <<<
> > >> >> >>> > >> >
> > >> >> >>> > >> > I've seen this coming out of a few different tests (I think
> > >> >> >>> > >> > I've also run into it on my own, but that's another thing).
> > >> >> >>> > >> >
> > >> >> >>> > >> > Some of them look like the Jenkins build host is just
> > >> >> >>> > >> > over-taxed:
> > >> >> >>> > >> >
> > >> >> >>> > >> >>>>
> > >> >> >>> > >> > Java HotSpot(TM) 64-Bit Server VM warning: INFO:
> > >> >> >>> > >> > os::commit_memory(0x00000007e7600000, 331350016, 0) failed;
> > >> >> >>> > >> > error='Cannot allocate memory' (errno=12)
> > >> >> >>> > >> > #
> > >> >> >>> > >> > # There is insufficient memory for the Java Runtime Environment to continue.
> > >> >> >>> > >> > # Native memory allocation (malloc) failed to allocate 331350016 bytes
> > >> >> >>> > >> > for committing reserved memory.
> > >> >> >>> > >> > # An error report file with more information is saved as:
> > >> >> >>> > >> > # /home/jenkins/jenkins-slave/workspace/Phoenix-master/phoenix-core/hs_err_pid26454.log
> > >> >> >>> > >> > Java HotSpot(TM) 64-Bit Server VM warning: INFO:
> > >> >> >>> > >> > os::commit_memory(0x00000007ea600000, 273678336, 0) failed;
> > >> >> >>> > >> > error='Cannot allocate memory' (errno=12)
> > >> >> >>> > >> > #
> > >> >> >>> > >> > <<<
> > >> >> >>> > >> >
> > >> >> >>> > >> > and
> > >> >> >>> > >> >
> > >> >> >>> > >> >>>>
> > >> >> >>> > >> > -------------------------------------------------------
> > >> >> >>> > >> >  T E S T S
> > >> >> >>> > >> > -------------------------------------------------------
> > >> >> >>> > >> > Build step 'Invoke top-level Maven targets' marked build as failure
> > >> >> >>> > >> > <<<
> > >> >> >>> > >> >
> > >> >> >>> > >> > Both of these issues are limited to the host "ubuntu-us1".
> > >> >> >>> > >> > Let me just remove it from the pool (on Phoenix-master) and
> > >> >> >>> > >> > see if that helps at all.
> > >> >> >>> > >> >
> > >> >> >>> > >> > I also see some sporadic failures of some Flume tests
> > >> >> >>> > >> >
> > >> >> >>> > >> >>>>
> > >> >> >>> > >> > Running org.apache.phoenix.flume.PhoenixSinkIT
> > >> >> >>> > >> > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.004 sec
> > >> >> >>> > >> > <<< FAILURE! - in org.apache.phoenix.flume.PhoenixSinkIT
> > >> >> >>> > >> > org.apache.phoenix.flume.PhoenixSinkIT  Time elapsed: 0.004 sec  <<< ERROR!
> > >> >> >>> > >> > java.lang.RuntimeException: java.io.IOException: Failed to save in any
> > >> >> >>> > >> > storage directories while saving namespace.
> > >> >> >>> > >> > Caused by: java.io.IOException: Failed to save in any storage directories
> > >> >> >>> > >> > while saving namespace.
> > >> >> >>> > >> >
> > >> >> >>> > >> > Running org.apache.phoenix.flume.RegexEventSerializerIT
> > >> >> >>> > >> > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.005 sec
> > >> >> >>> > >> > <<< FAILURE! - in org.apache.phoenix.flume.RegexEventSerializerIT
> > >> >> >>> > >> > org.apache.phoenix.flume.RegexEventSerializerIT  Time elapsed: 0.004 sec
> > >> >> >>> > >> > <<< ERROR!
> > >> >> >>> > >> > java.lang.RuntimeException: java.io.IOException: Failed to save in any
> > >> >> >>> > >> > storage directories while saving namespace.
> > >> >> >>> > >> > Caused by: java.io.IOException: Failed to save in any storage directories
> > >> >> >>> > >> > while saving namespace.
> > >> >> >>> > >> > <<<
> > >> >> >>> > >> >
> > >> >> >>> > >> > I'm not sure what the error message means at a glance.
> > >> >> >>> > >> >
> > >> >> >>> > >> > For Phoenix-HBase-1.1:
> > >> >> >>> > >> >
> > >> >> >>> > >> >>>>
> > >> >> >>> > >> > org.apache.hadoop.hbase.DoNotRetryIOException:
> > >> >> >>> > >> java.lang.NoSuchMethodError:
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
> > >> >> >>> > >> >         at
> > >> >> >>> > >>
> > >> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2156)
> > >> >> >>> > >> >         at
> > >> >> >>> > >>
> > >> org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >> >
> > >> org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> > >> >> >>> > >> >         at java.lang.Thread.run(Thread.java:745)
> > >> >> >>> > >> > Caused by: java.lang.NoSuchMethodError:
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.ServerManager.findServerWithSameHostnamePortWithLock(ServerManager.java:432)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.ServerManager.checkAndRecordNewServer(ServerManager.java:346)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:264)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:318)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
> > >> >> >>> > >> >         at
> > >> >> >>> > >>
> > >> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2117)
> > >> >> >>> > >> >         ... 4 more
> > >> >> >>> > >> > 2016-04-28 22:54:35,497 WARN  [RS:0;hemera:41302]
> > >> >> >>> > >> >
> org.apache.hadoop.hbase.regionserver.HRegionServer(2279):
> > >> error
> > >> >> >>> > telling
> > >> >> >>> > >> > master we are up
> > >> >> >>> > >> > com.google.protobuf.ServiceException:
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException):
> > >> >> >>> > >> > org.apache.hadoop.hbase.DoNotRetryIOException:
> > >> >> >>> > >> java.lang.NoSuchMethodError:
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
> > >> >> >>> > >> >         at
> > >> >> >>> > >>
> > >> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2156)
> > >> >> >>> > >> >         at
> > >> >> >>> > >>
> > >> org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >> >
> > >> org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> > >> >> >>> > >> >         at java.lang.Thread.run(Thread.java:745)
> > >> >> >>> > >> > Caused by: java.lang.NoSuchMethodError:
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.ServerManager.findServerWithSameHostnamePortWithLock(ServerManager.java:432)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.ServerManager.checkAndRecordNewServer(ServerManager.java:346)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:264)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:318)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
> > >> >> >>> > >> >         at
> > >> >> >>> > >>
> > >> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2117)
> > >> >> >>> > >> >         ... 4 more
> > >> >> >>> > >> >
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:318)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.regionServerStartup(RegionServerStatusProtos.java:8982)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2269)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:893)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:156)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:108)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:140)
> > >> >> >>> > >> >         at
> > >> java.security.AccessController.doPrivileged(Native
> > >> >> >>> Method)
> > >> >> >>> > >> >         at
> > >> javax.security.auth.Subject.doAs(Subject.java:356)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:307)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:138)
> > >> >> >>> > >> >         at java.lang.Thread.run(Thread.java:745)
> > >> >> >>> > >> > Caused by:
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException):
> > >> >> >>> > >> > org.apache.hadoop.hbase.DoNotRetryIOException:
> > >> >> >>> > >> java.lang.NoSuchMethodError:
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
> > >> >> >>> > >> >         at
> > >> >> >>> > >>
> > >> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2156)
> > >> >> >>> > >> >         at
> > >> >> >>> > >>
> > >> org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >> >
> > >> org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> > >> >> >>> > >> >         at java.lang.Thread.run(Thread.java:745)
> > >> >> >>> > >> > Caused by: java.lang.NoSuchMethodError:
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.ServerManager.findServerWithSameHostnamePortWithLock(ServerManager.java:432)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.ServerManager.checkAndRecordNewServer(ServerManager.java:346)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:264)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:318)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
> > >> >> >>> > >> >         at
> > >> >> >>> > >>
> > >> org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2117)
> > >> >> >>> > >> >         ... 4 more
> > >> >> >>> > >> >
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> >
> > >> >> >>> >
> > >>
> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1235)
> > >> >> >>> > >> >         at
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>>
> > >>
> >
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:217)
> > >> >> >>> > >> >         ... 13 more
> > >> >> >>> > >> > <<<
> > >> >> >>> > >> >
> > >> >> >>> > >> > This error message is hit-or-miss, and it keeps hbase:namespace
> > >> >> >>> > >> > from being assigned (as the RSs can never report in to the
> > >> >> >>> > >> > hmaster). This is happening across a couple of the nodes
> > >> >> >>> > >> > (ubuntu-[3,4,6]). I had tried to look into this one over the
> > >> >> >>> > >> > weekend (and was led to a JDK8-built jar running on JDK7),
> > >> >> >>> > >> > but if I look at META-INF/MANIFEST.mf in the
> > >> >> >>> > >> > hbase-server-1.1.3.jar from central, I see it was built with
> > >> >> >>> > >> > 1.7.0_80 (which I think means the JDK8 thought is a red
> > >> >> >>> > >> > herring). I'm really confused by this one, actually.
> > >> >> >>> > >> > Something must be amiss here.
> > >> >> >>> > >> >
> > >> >> >>> > >> > For Phoenix-HBase-1.0:
> > >> >> >>> > >> >
> > >> >> >>> > >> > We see the same Phoenix-Flume failures, UpsertValuesIT failure,
> > >> >> >>> > >> > and timeouts on ubuntu-us1. There is one crash on H10, but that
> > >> >> >>> > >> > might just be bad luck.
> > >> >> >>> > >> >
> > >> >> >>> > >> > For Phoenix-HBase-0.98:
> > >> >> >>> > >> >
> > >> >> >>> > >> > Same UpsertValuesIT failure and failures on ubuntu-us1.
> > >> >> >>> > >> >
> > >> >> >>> > >> >
> > >> >> >>> > >> > James Taylor wrote:
> > >> >> >>> > >> >>
> > >> >> >>> > >> >> Anyone know why our Jenkins builds keep failing? Is it
> > >> >> >>> environmental
> > >> >> >>> > and
> > >> >> >>> > >> >> is
> > >> >> >>> > >> >> there anything we can do about it?
> > >> >> >>> > >> >>
> > >> >> >>> > >> >> Thanks,
> > >> >> >>> > >> >> James
> > >> >> >>> > >> >>
> > >> >> >>> > >> >
> > >> >> >>> > >>
> > >> >> >>> >
> > >> >> >>>
> > >> >> >>
> > >> >> >>
> > >> >
> > >> >
> > >>
> > >
> > >
> >
>
