zhaotao created KYLIN-4595:
------------------------------
Summary: build with global dictionary error at step: Build N-Dimension Cuboid
Key: KYLIN-4595
URL: https://issues.apache.org/jira/browse/KYLIN-4595
Project: Kylin
Issue Type: Bug
Affects Versions: v2.6.0
Environment: hadoop2.6.0-cdh5.13.2
hive1.1.0-cdh5.13.2
hbase1.2.0-cdh5.13.2
Reporter: zhaotao
When I build the cube, the build fails with the following error message:
{code:java}
2020-06-22 07:30:23,172 DEBUG [Scheduler 1723420609 Job
ba1beb03-8bd1-76f6-4460-0607dbf9d6f7-5150] util.ZookeeperDistributedLock:226 :
5150-10831@ZZZZ trying to unlock /kylin/kylin_metadata/dict/KYLIN.XXXX_YYYY/lock
2020-06-22 07:30:23,175 INFO [Scheduler 1723420609 Job
ba1beb03-8bd1-76f6-4460-0607dbf9d6f7-5150] util.ZookeeperDistributedLock:237 :
5150-10831@pZZZZ released lock at
/kylin/kylin_metadata/dict/KYLIN.XXXX_YYYY/lock
2020-06-22 07:30:23,176 ERROR [Scheduler 1723420609 Job
ba1beb03-8bd1-76f6-4460-0607dbf9d6f7-5150] common.HadoopShellExecutable:65 :
error execute HadoopShellExecutable{id=ba1beb03-8bd1-76f6-4460-0607dbf9d6f7-03,
name=Build Dimension Dictionary, state=SUCCEED}
java.lang.RuntimeException: Failed to create dictionary on
KYLIN.DW_TELE_SALES_ORDER_V2_D.TRADE_NO
at
org.apache.kylin.dict.DictionaryManager.buildDictFromReadableTable(DictionaryManager.java:304)
at
org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:288)
at
org.apache.kylin.cube.CubeManager$DictionaryAssist.buildDictionary(CubeManager.java:1105)
at
org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:1074)
at
org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:77)
at
org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:55)
at
org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:73)
at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:93)
at
org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:165)
at
org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:70)
at
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:165)
at
org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Failed to create global dictionary on
KYLIN.XXXX.YYYY
at
org.apache.kylin.dict.GlobalDictionaryBuilder.addValue(GlobalDictionaryBuilder.java:89)
at
org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:82)
at
org.apache.kylin.dict.DictionaryManager.buildDictFromReadableTable(DictionaryManager.java:301)
... 15 more
Caused by: java.io.FileNotFoundException: File does not exist:
/kylin/kylin_metadata/resources/GlobalDict/dict/KYLIN.XXXX/YYYY/working/cached_1592695454554_1396573148
at
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
at
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2094)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2064)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1977)
at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:575)
at
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:92)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:376)
at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at
org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1303)
at
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1288)
at
org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1276)
at
org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:307)
at
org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:273)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:265)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1607)
at
org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:338)
at
org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:334)
at
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:334)
at
org.apache.kylin.dict.global.GlobalDictHDFSStore.readSlice(GlobalDictHDFSStore.java:187)
at
org.apache.kylin.dict.global.AppendTrieDictionaryBuilder.addValue(AppendTrieDictionaryBuilder.java:101)
at
org.apache.kylin.dict.GlobalDictionaryBuilder.addValue(GlobalDictionaryBuilder.java:85)
... 17 more
{code}
After a failed build, a rebuild attempt succeeds.
This happens occasionally. I tried to work around it by setting
kylin.job.retry to 3, hoping the failed sub-job would be retried. The retry
mechanism is triggered on failure, but judging from the logs it does not seem
to work.
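For reference, the retry setting mentioned above would look roughly like this.
This is a minimal sketch: the file name and location (conf/kylin.properties
under the Kylin installation) are assumed here and are not taken from the
environment in this report.
{code}
# conf/kylin.properties (assumed location)
# Number of automatic retries for a failed job step, as set in this report
kylin.job.retry=3
{code}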
--
This message was sent by Atlassian Jira
(v8.3.4#803005)