[ 
https://issues.apache.org/jira/browse/KYLIN-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangxiaojing updated KYLIN-4017:
--------------------------------
    Summary: Build engine get zk(zookeeper) lock failed when building job, it 
causes the whole build engine doesn't work.  (was: Build engine get 
zk(zookeeper) lock failed when building job, it can't build job ,the whole 
build engine doesn't work.)

> Build engine get zk(zookeeper) lock failed when building job, it causes the 
> whole build engine doesn't work.
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-4017
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4017
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine, Tools, Build and Test
>    Affects Versions: Future, v3.0.0, v3.0.0-alpha
>            Reporter: wangxiaojing
>            Priority: Critical
>              Labels: build
>             Fix For: Future, v3.0.0-alpha
>
>         Attachments: zkinstancestart.png
>
>
> Kylin fails to acquire the ZooKeeper (ZK) lock when it builds a job. Only a 
> restart resolves the problem; until then no job can be built and the whole 
> build engine stops working. The problem recurs one day after a restart. The 
> log looks like this:
> {code:java}
> 2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:59 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} prepare to schedule and its priority is 20
> 2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] threadpool.FetcherRunner:63 : CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - es_report_respnse_rate_cube - 20190513000000_20190514000000 - GMT+08:00 2019-05-15 11:03:15, state=READY} scheduled
> 2019-05-15 11:09:43,209 DEBUG [Scheduler 719764581 Job 878974c4-4c65-88a4-a912-b238fcc33bdc-132] zookeeper.ZookeeperDistributedLock:92 : 18...@bigdata-kylin-build01.gz01.diditaxi.com trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
> 2019-05-15 11:09:43,212 ERROR [pool-12-thread-10] threadpool.DistributedScheduler:115 : unknown error execute job:878974c4-4c65-88a4-a912-b238fcc33bdc in server: 18...@bigdata-kylin-build01.gz01.diditaxi.com
> java.lang.IllegalStateException: Error while 18...@bigdata-kylin-build01.gz01.diditaxi.com trying to lock /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
>  at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:99)
>  at org.apache.kylin.job.lock.zookeeper.ZookeeperJobLock.lock(ZookeeperJobLock.java:41)
>  at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:105)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: instance must be started before calling this method
>  at org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:176)
>  at org.apache.curator.framework.imps.CuratorFrameworkImpl.create(CuratorFrameworkImpl.java:351)
>  at org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:95)
>  ... 5 more
> {code}
>  
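The "Caused by" frame in the trace shows that the lock attempt reached a client that was not in the started state. The report does not establish why (the client may have been closed or never started), so the following is only a sketch of one possible guard, not Kylin's code or its eventual fix. It uses plain-Java stand-ins: `FakeClient`, `ClientHolder`, and `StartedClientGuard` are hypothetical names, and the `State` values are merely named after Curator's `CuratorFrameworkState` (`LATENT`, `STARTED`, `STOPPED`). The idea is to check the client state before each lock attempt and to build a fresh client when the current one has been stopped.

```java
import java.util.function.Supplier;

public class StartedClientGuard {

    /** Lifecycle states, named after Curator's CuratorFrameworkState values. */
    enum State { LATENT, STARTED, STOPPED }

    /** Hypothetical stand-in for a ZooKeeper client; this is not a Curator class. */
    static class FakeClient {
        private State state = State.LATENT;

        State getState() { return state; }

        void start() {
            if (state != State.LATENT) {
                throw new IllegalStateException("client can only be started once");
            }
            state = State.STARTED;
        }

        void close() { state = State.STOPPED; }

        /** Mirrors the precondition failure reported in the stack trace. */
        String create(String path) {
            if (state != State.STARTED) {
                throw new IllegalStateException(
                        "instance must be started before calling this method");
            }
            return path;
        }
    }

    /** Holds the current client and replaces it when it is no longer usable. */
    static class ClientHolder {
        private final Supplier<FakeClient> factory;
        private FakeClient client;

        ClientHolder(Supplier<FakeClient> factory) {
            this.factory = factory;
            this.client = factory.get();
        }

        synchronized FakeClient usableClient() {
            // A stopped client cannot be restarted, so build a new one.
            if (client.getState() == State.STOPPED) {
                client = factory.get();
            }
            // A freshly built client still has to be started before use.
            if (client.getState() == State.LATENT) {
                client.start();
            }
            return client;
        }
    }

    /** Creates the lock node through a client that is guaranteed to be started. */
    static String lock(ClientHolder holder, String lockPath) {
        return holder.usableClient().create(lockPath);
    }

    public static void main(String[] args) {
        ClientHolder holder = new ClientHolder(FakeClient::new);
        String path = "/job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc";

        System.out.println(lock(holder, path));

        // Simulate the client being closed behind the scheduler's back.
        holder.usableClient().close();

        // The guard swaps in a fresh, started client instead of failing.
        System.out.println(lock(holder, path));
    }
}
```

With a real Curator client the equivalent state check would go through `CuratorFramework.getState()`. Whether rebuilding the client is appropriate depends on why it left the started state, which the log above does not show.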



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
