Hello, we are testing the 2.6 RC and we are facing a systematic issue when building cubes with the Spark engine (even with the sample cube), whereas the MapReduce engine succeeds. The build fails at step #8, "Convert Cuboid Data to HFile", with the following error (the full log is attached and reproduced below):

ClassNotFoundException: org.apache.hadoop.hbase.metrics.MetricRegistry

We run Kylin on AWS EMR 5.13 (it also failed with 5.17). Do you have any idea why this happens?

Hubert
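P.S. The missing class, org.apache.hadoop.hbase.metrics.MetricRegistry, appears to live in HBase's metrics API module (hbase-metrics-api in HBase 1.4.x, if we read the stack trace right), and no hbase-metrics jar shows up in the --jars list of the spark-submit command at the end of the log below. To check which local jar actually provides the class, a minimal sketch (assuming the stock EMR layout under /usr/lib/hbase/lib):

# print every HBase lib jar that contains the class the executor cannot load
for j in /usr/lib/hbase/lib/*.jar; do
  if unzip -l "$j" 2>/dev/null | grep -q 'org/apache/hadoop/hbase/metrics/MetricRegistry.class'; then
    echo "$j"
  fi
done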
OS command error exit with return code: 1, error message:
2019-01-11 08:39:24 WARN SparkConf:66 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
SparkEntry args:-className org.apache.kylin.storage.hbase.steps.SparkCubeHFile -partitions hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/rowkey_stats/part-r-00000_hfile -counterOutput hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/counter -cubename kylin_sales_cube -output hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/hfile -input hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/cuboid/ -segmentId f944e1a8-506a-7f5e-4d6a-389a3ce53489 -metaUrl kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata -hbaseConfPath hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/hbase-conf.xml
Running org.apache.kylin.storage.hbase.steps.SparkCubeHFile -partitions hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/rowkey_stats/part-r-00000_hfile -counterOutput hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/counter -cubename kylin_sales_cube -output hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/hfile -input hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/cuboid/ -segmentId f944e1a8-506a-7f5e-4d6a-389a3ce53489 -metaUrl kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata -hbaseConfPath hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/hbase-conf.xml
2019-01-11 08:39:25 WARN SparkConf:66 - The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
2019-01-11 08:39:25 INFO SparkContext:54 - Running Spark version 2.3.2
2019-01-11 08:39:25 INFO SparkContext:54 - Submitted application: Converting HFile for:kylin_sales_cube segment f944e1a8-506a-7f5e-4d6a-389a3ce53489
2019-01-11 08:39:25 INFO SecurityManager:54 - Changing view acls to: hadoop
2019-01-11 08:39:25 INFO SecurityManager:54 - Changing modify acls to: hadoop
2019-01-11 08:39:25 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-11 08:39:25 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-11 08:39:25 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
2019-01-11 08:39:26 INFO Utils:54 - Successfully started service 'sparkDriver' on port 46289.
2019-01-11 08:39:26 INFO SparkEnv:54 - Registering MapOutputTracker
2019-01-11 08:39:26 INFO SparkEnv:54 - Registering BlockManagerMaster
2019-01-11 08:39:26 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-01-11 08:39:26 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2019-01-11 08:39:26 INFO DiskBlockManager:54 - Created local directory at /mnt/tmp/blockmgr-985a20b9-acaf-4178-a6a8-93e18739038d
2019-01-11 08:39:26 INFO MemoryStore:54 - MemoryStore started with capacity 912.3 MB
2019-01-11 08:39:26 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2019-01-11 08:39:26 INFO log:192 - Logging initialized @2530ms
2019-01-11 08:39:26 INFO Server:351 - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-01-11 08:39:26 INFO Server:419 - Started @2639ms
2019-01-11 08:39:26 WARN Utils:66 - Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
2019-01-11 08:39:26 INFO AbstractConnector:278 - Started ServerConnector@1f12e153{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
2019-01-11 08:39:26 INFO Utils:54 - Successfully started service 'SparkUI' on port 4041.
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@41477a6d{/jobs,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@585ac855{/jobs/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5bb8f9e2{/jobs/job,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f78de22{/jobs/job/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@516ebdf8{/stages,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4d8539de{/stages/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3eba57a7{/stages/stage,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@67207d8a{/stages/stage/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@bcb09a6{/stages/pool,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7c2a69b4{/stages/pool/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@a619c2{/storage,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@648ee871{/storage/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@375b5b7f{/storage/rdd,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1813f3e9{/storage/rdd/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@28cb9120{/environment,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3b152928{/environment/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@56781d96{/executors,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5173200b{/executors/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@25c5e994{/executors/threadDump,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@378bd86d{/executors/threadDump/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2189e7a7{/static,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1ee29c84{/,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7c8326a4{/api,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@c9d82f9{/jobs/job/kill,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6f012914{/stages/stage/kill,null,AVAILABLE,@Spark}
2019-01-11 08:39:26 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://ip-172-31-33-160.eu-west-1.compute.internal:4041
2019-01-11 08:39:26 INFO SparkContext:54 - Added JAR file:/opt/kylin/apache-kylin-2.6.0/lib/kylin-job-2.6.0.jar at spark://ip-172-31-33-160.eu-west-1.compute.internal:46289/jars/kylin-job-2.6.0.jar with timestamp 1547195966698
2019-01-11 08:39:27 INFO RMProxy:98 - Connecting to ResourceManager at ip-172-31-33-160.eu-west-1.compute.internal/172.31.33.160:8032
2019-01-11 08:39:27 INFO Client:54 - Requesting a new application from cluster with 3 NodeManagers
2019-01-11 08:39:27 INFO Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (5760 MB per container)
2019-01-11 08:39:27 INFO Client:54 - Will allocate AM container, with 896 MB memory including 384 MB overhead
2019-01-11 08:39:27 INFO Client:54 - Setting up container launch context for our AM
2019-01-11 08:39:27 INFO Client:54 - Setting up the launch environment for our AM container
2019-01-11 08:39:27 INFO Client:54 - Preparing resources for our AM container
2019-01-11 08:39:29 WARN Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2019-01-11 08:39:32 INFO Client:54 - Uploading resource file:/mnt/tmp/spark-1257f53e-11f2-48d4-9ee7-65a1f9ea878b/__spark_libs__4991678983370181637.zip -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/__spark_libs__4991678983370181637.zip
2019-01-11 08:39:33 INFO Client:54 - Uploading resource file:/usr/lib/hbase/lib/hbase-common-1.4.2.jar -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-common-1.4.2.jar
2019-01-11 08:39:33 INFO Client:54 - Uploading resource file:/usr/lib/hbase/lib/hbase-server-1.4.2.jar -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-server-1.4.2.jar
2019-01-11 08:39:33 INFO Client:54 - Uploading resource file:/usr/lib/hbase/lib/hbase-client-1.4.2.jar -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-client-1.4.2.jar
2019-01-11 08:39:33 INFO Client:54 - Uploading resource file:/usr/lib/hbase/lib/hbase-protocol-1.4.2.jar -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-protocol-1.4.2.jar
2019-01-11 08:39:34 INFO Client:54 - Uploading resource file:/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-hadoop-compat-1.4.2.jar
2019-01-11 08:39:34 INFO Client:54 - Uploading resource file:/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/htrace-core-3.1.0-incubating.jar
2019-01-11 08:39:34 INFO Client:54 - Uploading resource file:/usr/lib/hbase/lib/metrics-core-2.2.0.jar -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/metrics-core-2.2.0.jar
2019-01-11 08:39:34 WARN Client:66 - Same path resource file:///usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar added multiple times to distributed cache.
2019-01-11 08:39:34 INFO Client:54 - Uploading resource file:/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.2.jar -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/hbase-hadoop2-compat-1.4.2.jar
2019-01-11 08:39:35 INFO Client:54 - Uploading resource file:/mnt/tmp/spark-1257f53e-11f2-48d4-9ee7-65a1f9ea878b/__spark_conf__1228777866751535130.zip -> hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/user/hadoop/.sparkStaging/application_1547193611202_0020/__spark_conf__.zip
2019-01-11 08:39:35 INFO SecurityManager:54 - Changing view acls to: hadoop
2019-01-11 08:39:35 INFO SecurityManager:54 - Changing modify acls to: hadoop
2019-01-11 08:39:35 INFO SecurityManager:54 - Changing view acls groups to:
2019-01-11 08:39:35 INFO SecurityManager:54 - Changing modify acls groups to:
2019-01-11 08:39:35 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
2019-01-11 08:39:35 INFO Client:54 - Submitting application application_1547193611202_0020 to ResourceManager
2019-01-11 08:39:35 INFO YarnClientImpl:273 - Submitted application application_1547193611202_0020
2019-01-11 08:39:35 INFO SchedulerExtensionServices:54 - Starting Yarn extension services with app application_1547193611202_0020 and attemptId None
2019-01-11 08:39:36 INFO Client:54 - Application report for application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:36 INFO Client:54 -
     client token: N/A
     diagnostics: AM container is launched, waiting for AM container to Register with RM
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1547195975352
     final status: UNDEFINED
     tracking URL: http://ip-172-31-33-160.eu-west-1.compute.internal:20888/proxy/application_1547193611202_0020/
     user: hadoop
2019-01-11 08:39:37 INFO Client:54 - Application report for application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:38 INFO Client:54 - Application report for application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:39 INFO Client:54 - Application report for application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:40 INFO Client:54 - Application report for application_1547193611202_0020 (state: ACCEPTED)
2019-01-11 08:39:40 INFO YarnClientSchedulerBackend:54 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> ip-172-31-33-160.eu-west-1.compute.internal, PROXY_URI_BASES -> http://ip-172-31-33-160.eu-west-1.compute.internal:20888/proxy/application_1547193611202_0020), /proxy/application_1547193611202_0020
2019-01-11 08:39:40 INFO JettyUtils:54 - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
2019-01-11 08:39:40 INFO YarnSchedulerBackend$YarnSchedulerEndpoint:54 - ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
2019-01-11 08:39:41 INFO Client:54 - Application report for application_1547193611202_0020 (state: RUNNING)
2019-01-11 08:39:41 INFO Client:54 -
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 172.31.44.79
     ApplicationMaster RPC port: 0
     queue: default
     start time: 1547195975352
     final status: UNDEFINED
     tracking URL: http://ip-172-31-33-160.eu-west-1.compute.internal:20888/proxy/application_1547193611202_0020/
     user: hadoop
2019-01-11 08:39:41 INFO YarnClientSchedulerBackend:54 - Application application_1547193611202_0020 has started running.
2019-01-11 08:39:41 INFO Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35297.
2019-01-11 08:39:41 INFO NettyBlockTransferService:54 - Server created on ip-172-31-33-160.eu-west-1.compute.internal:35297
2019-01-11 08:39:41 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2019-01-11 08:39:41 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, ip-172-31-33-160.eu-west-1.compute.internal, 35297, None)
2019-01-11 08:39:41 INFO BlockManagerMasterEndpoint:54 - Registering block manager ip-172-31-33-160.eu-west-1.compute.internal:35297 with 912.3 MB RAM, BlockManagerId(driver, ip-172-31-33-160.eu-west-1.compute.internal, 35297, None)
2019-01-11 08:39:41 INFO BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, ip-172-31-33-160.eu-west-1.compute.internal, 35297, None)
2019-01-11 08:39:41 INFO BlockManager:54 - external shuffle service port = 7337
2019-01-11 08:39:41 INFO BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, ip-172-31-33-160.eu-west-1.compute.internal, 35297, None)
2019-01-11 08:39:41 INFO JettyUtils:54 - Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
2019-01-11 08:39:41 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5657967b{/metrics/json,null,AVAILABLE,@Spark}
2019-01-11 08:39:41 INFO EventLoggingListener:54 - Logging events to hdfs:/kylin/spark-history/application_1547193611202_0020
2019-01-11 08:39:46 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client://Executor) (172.31.35.36:54814) with ID 1
2019-01-11 08:39:46 INFO BlockManagerMasterEndpoint:54 - Registering block manager ip-172-31-35-36.eu-west-1.compute.internal:40827 with 2004.6 MB RAM, BlockManagerId(1, ip-172-31-35-36.eu-west-1.compute.internal, 40827, None)
2019-01-11 08:39:56 INFO YarnClientSchedulerBackend:54 - SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
2019-01-11 08:39:56 INFO AbstractHadoopJob:515 - Ready to load KylinConfig from uri: kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:56 INFO KylinConfig:455 - Creating new manager instance of class org.apache.kylin.cube.CubeManager
2019-01-11 08:39:57 INFO CubeManager:133 - Initializing CubeManager with config kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:57 INFO ResourceStore:88 - Using metadata url kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata for resource store
2019-01-11 08:39:57 INFO HDFSResourceStore:74 - hdfs meta path : hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of class org.apache.kylin.cube.CubeDescManager
2019-01-11 08:39:57 INFO CubeDescManager:91 - Initializing CubeDescManager with config kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of class org.apache.kylin.metadata.project.ProjectManager
2019-01-11 08:39:57 INFO ProjectManager:81 - Initializing ProjectManager with metadata url kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of class org.apache.kylin.metadata.cachesync.Broadcaster
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of class org.apache.kylin.metadata.model.DataModelManager
2019-01-11 08:39:57 INFO KylinConfig:455 - Creating new manager instance of class org.apache.kylin.metadata.TableMetadataManager
2019-01-11 08:39:57 INFO MeasureTypeFactory:117 - Checking custom measure types from kylin config
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering COUNT_DISTINCT(hllc), class org.apache.kylin.measure.hllc.HLLCMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering COUNT_DISTINCT(bitmap), class org.apache.kylin.measure.bitmap.BitmapMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering TOP_N(topn), class org.apache.kylin.measure.topn.TopNMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering RAW(raw), class org.apache.kylin.measure.raw.RawMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering EXTENDED_COLUMN(extendedcolumn), class org.apache.kylin.measure.extendedcolumn.ExtendedColumnMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering PERCENTILE_APPROX(percentile), class org.apache.kylin.measure.percentile.PercentileMeasureType$Factory
2019-01-11 08:39:57 INFO MeasureTypeFactory:146 - registering COUNT_DISTINCT(dim_dc), class org.apache.kylin.measure.dim.DimCountDistinctMeasureType$Factory
2019-01-11 08:39:57 INFO DataModelManager:185 - Model kylin_sales_model is missing or unloaded yet
2019-01-11 08:39:57 INFO DataModelManager:185 - Model kylin_streaming_model is missing or unloaded yet
2019-01-11 08:39:57 INFO SparkCubeHFile:165 - Input path: hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/cuboid/
2019-01-11 08:39:57 INFO SparkCubeHFile:166 - Output path: hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/hfile
2019-01-11 08:39:57 INFO ZlibFactory:49 - Successfully loaded & initialized native-zlib library
2019-01-11 08:39:57 INFO CodecPool:181 - Got brand-new decompressor [.deflate]
2019-01-11 08:39:57 INFO SparkCubeHFile:174 - ------- split key: \x00\x0A\x00\x00\x00\x00\x00\x00\x00\x00\x7F\xFF\x00\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF
2019-01-11 08:39:57 INFO SparkCubeHFile:179 - There are 1 split keys, totally 2 hfiles
2019-01-11 08:39:57 INFO SparkCubeHFile:182 - Loading HBase configuration from:hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/hbase-conf.xml
2019-01-11 08:39:57 WARN Configuration:2670 - org.apache.hadoop.hdfs.client.HdfsDataInputStream@2f1f9515:an attempt to override final parameter: fs.s3.buffer.dir; Ignoring.
2019-01-11 08:39:57 WARN Configuration:2670 - org.apache.hadoop.hdfs.client.HdfsDataInputStream@2f1f9515:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2019-01-11 08:39:57 WARN Configuration:2670 - org.apache.hadoop.hdfs.client.HdfsDataInputStream@2f1f9515:an attempt to override final parameter: yarn.nodemanager.local-dirs; Ignoring.
2019-01-11 08:39:57 WARN Configuration:2670 - org.apache.hadoop.hdfs.client.HdfsDataInputStream@2f1f9515:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2019-01-11 08:39:58 INFO MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 310.3 KB, free 912.0 MB)
2019-01-11 08:39:58 INFO MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 26.4 KB, free 912.0 MB)
2019-01-11 08:39:58 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on ip-172-31-33-160.eu-west-1.compute.internal:35297 (size: 26.4 KB, free: 912.3 MB)
2019-01-11 08:39:58 INFO SparkContext:54 - Created broadcast 0 from sequenceFile at SparkUtil.java:106
2019-01-11 08:39:58 INFO FileOutputCommitter:108 - File Output Committer Algorithm version is 1
2019-01-11 08:39:58 INFO SparkContext:54 - Starting job: runJob at SparkHadoopWriter.scala:78
2019-01-11 08:39:58 INFO FileInputFormat:249 - Total input paths to process : 11
2019-01-11 08:39:58 INFO DAGScheduler:54 - Registering RDD 1 (flatMapToPair at SparkCubeHFile.java:208)
2019-01-11 08:39:58 INFO DAGScheduler:54 - Got job 0 (runJob at SparkHadoopWriter.scala:78) with 2 output partitions
2019-01-11 08:39:58 INFO DAGScheduler:54 - Final stage: ResultStage 1 (runJob at SparkHadoopWriter.scala:78)
2019-01-11 08:39:58 INFO DAGScheduler:54 - Parents of final stage: List(ShuffleMapStage 0)
2019-01-11 08:39:58 INFO DAGScheduler:54 - Missing parents: List(ShuffleMapStage 0)
2019-01-11 08:39:58 INFO DAGScheduler:54 - Submitting ShuffleMapStage 0 (MapPartitionsRDD[1] at flatMapToPair at SparkCubeHFile.java:208), which has no missing parents
2019-01-11 08:39:59 INFO MemoryStore:54 - Block broadcast_1 stored as values in memory (estimated size 48.0 KB, free 911.9 MB)
2019-01-11 08:39:59 INFO MemoryStore:54 - Block broadcast_1_piece0 stored as bytes in memory (estimated size 22.4 KB, free 911.9 MB)
2019-01-11 08:39:59 INFO BlockManagerInfo:54 - Added broadcast_1_piece0 in memory on ip-172-31-33-160.eu-west-1.compute.internal:35297 (size: 22.4 KB, free: 912.3 MB)
2019-01-11 08:39:59 INFO SparkContext:54 - Created broadcast 1 from broadcast at DAGScheduler.scala:1039
2019-01-11 08:39:59 INFO DAGScheduler:54 - Submitting 11 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[1] at flatMapToPair at SparkCubeHFile.java:208) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
2019-01-11 08:39:59 INFO YarnScheduler:54 - Adding task set 0.0 with 11 tasks
2019-01-11 08:39:59 INFO TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0, NODE_LOCAL, 8023 bytes)
2019-01-11 08:39:59 INFO BlockManagerInfo:54 - Added broadcast_1_piece0 in memory on ip-172-31-35-36.eu-west-1.compute.internal:40827 (size: 22.4 KB, free: 2004.6 MB)
2019-01-11 08:40:00 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on ip-172-31-35-36.eu-west-1.compute.internal:40827 (size: 26.4 KB, free: 2004.6 MB)
2019-01-11 08:40:01 INFO TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1, NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:01 INFO TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 2882 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (1/11)
2019-01-11 08:40:02 INFO TaskSetManager:54 - Starting task 2.0 in stage 0.0 (TID 2, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 2, NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:02 INFO TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 552 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (2/11)
2019-01-11 08:40:02 INFO TaskSetManager:54 - Starting task 3.0 in stage 0.0 (TID 3, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 3, NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:02 INFO TaskSetManager:54 - Finished task 2.0 in stage 0.0 (TID 2) in 503 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (3/11)
2019-01-11 08:40:03 INFO TaskSetManager:54 - Starting task 4.0 in stage 0.0 (TID 4, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 4, NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:03 INFO TaskSetManager:54 - Finished task 3.0 in stage 0.0 (TID 3) in 839 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (4/11)
2019-01-11 08:40:05 INFO TaskSetManager:54 - Starting task 5.0 in stage 0.0 (TID 5, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 5, NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:05 INFO TaskSetManager:54 - Finished task 4.0 in stage 0.0 (TID 4) in 1268 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (5/11)
2019-01-11 08:40:06 INFO TaskSetManager:54 - Starting task 6.0 in stage 0.0 (TID 6, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 6, NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:06 INFO TaskSetManager:54 - Finished task 5.0 in stage 0.0 (TID 5) in 1286 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (6/11)
2019-01-11 08:40:07 INFO TaskSetManager:54 - Starting task 7.0 in stage 0.0 (TID 7, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 7, NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:07 INFO TaskSetManager:54 - Finished task 6.0 in stage 0.0 (TID 6) in 1319 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (7/11)
2019-01-11 08:40:08 INFO TaskSetManager:54 - Starting task 8.0 in stage 0.0 (TID 8, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 8, NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:08 INFO TaskSetManager:54 - Finished task 7.0 in stage 0.0 (TID 7) in 1031 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (8/11)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Starting task 9.0 in stage 0.0 (TID 9, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 9, NODE_LOCAL, 8022 bytes)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Finished task 8.0 in stage 0.0 (TID 8) in 610 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (9/11)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Starting task 10.0 in stage 0.0 (TID 10, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 10, NODE_LOCAL, 8025 bytes)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Finished task 9.0 in stage 0.0 (TID 9) in 182 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (10/11)
2019-01-11 08:40:09 INFO TaskSetManager:54 - Finished task 10.0 in stage 0.0 (TID 10) in 92 ms on ip-172-31-35-36.eu-west-1.compute.internal (executor 1) (11/11)
2019-01-11 08:40:09 INFO YarnScheduler:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool
2019-01-11 08:40:09 INFO DAGScheduler:54 - ShuffleMapStage 0 (flatMapToPair at SparkCubeHFile.java:208) finished in 10.538 s
2019-01-11 08:40:09 INFO DAGScheduler:54 - looking for newly runnable stages
2019-01-11 08:40:09 INFO DAGScheduler:54 - running: Set()
2019-01-11 08:40:09 INFO DAGScheduler:54 - waiting: Set(ResultStage 1)
2019-01-11 08:40:09 INFO DAGScheduler:54 - failed: Set()
2019-01-11 08:40:09 INFO DAGScheduler:54 - Submitting ResultStage 1 (MapPartitionsRDD[3] at mapToPair at SparkCubeHFile.java:231), which has no missing parents
2019-01-11 08:40:09 INFO MemoryStore:54 - Block broadcast_2 stored as values in memory (estimated size 197.9 KB, free 911.7 MB)
2019-01-11 08:40:09 INFO MemoryStore:54 - Block broadcast_2_piece0 stored as bytes in memory (estimated size 44.8 KB, free 911.7 MB)
2019-01-11 08:40:09 INFO BlockManagerInfo:54 - Added broadcast_2_piece0 in memory on ip-172-31-33-160.eu-west-1.compute.internal:35297 (size: 44.8 KB, free: 912.2 MB)
2019-01-11 08:40:09 INFO SparkContext:54 - Created broadcast 2 from broadcast at DAGScheduler.scala:1039
2019-01-11 08:40:09 INFO DAGScheduler:54 - Submitting 2 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at mapToPair at SparkCubeHFile.java:231) (first 15 tasks are for partitions Vector(0, 1))
2019-01-11 08:40:09 INFO YarnScheduler:54 - Adding task set 1.0 with 2 tasks
2019-01-11 08:40:09 INFO TaskSetManager:54 - Starting task 0.0 in stage 1.0 (TID 11, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0, NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:09 INFO BlockManagerInfo:54 - Added broadcast_2_piece0 in memory on ip-172-31-35-36.eu-west-1.compute.internal:40827 (size: 44.8 KB, free: 2004.5 MB)
2019-01-11 08:40:09 INFO MapOutputTrackerMasterEndpoint:54 - Asked to send map output locations for shuffle 0 to 172.31.35.36:54814
2019-01-11 08:40:19 INFO TaskSetManager:54 - Starting task 1.0 in stage 1.0 (TID 12, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1, NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:19 WARN TaskSetManager:66 - Lost task 0.0 in stage 1.0 (TID 11, ip-172-31-35-36.eu-west-1.compute.internal, executor 1): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Lorg/apache/hadoop/hbase/metrics/MetricRegistry;
    at java.lang.Class.getDeclaredFields0(Native Method)
    at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
    at java.lang.Class.getDeclaredFields(Class.java:1916)
    at org.apache.hadoop.util.ReflectionUtils.getDeclaredFieldsIncludingInherited(ReflectionUtils.java:323)
    at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.initRegistry(MetricsSourceBuilder.java:92)
    at org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.<init>(MetricsSourceBuilder.java:56)
    at org.apache.hadoop.metrics2.lib.MetricsAnnotations.newSourceBuilder(MetricsAnnotations.java:43)
    at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:224)
    at org.apache.hadoop.hbase.metrics.BaseSourceImpl.<init>(BaseSourceImpl.java:115)
    at org.apache.hadoop.hbase.io.MetricsIOSourceImpl.<init>(MetricsIOSourceImpl.java:44)
    at org.apache.hadoop.hbase.io.MetricsIOSourceImpl.<init>(MetricsIOSourceImpl.java:36)
    at org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceFactoryImpl.createIO(MetricsRegionServerSourceFactoryImpl.java:73)
    at org.apache.hadoop.hbase.io.MetricsIO.<init>(MetricsIO.java:32)
    at org.apache.hadoop.hbase.io.hfile.HFile.<clinit>(HFile.java:191)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 8 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.metrics.MetricRegistry
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 30 more
2019-01-11 08:40:19 INFO TaskSetManager:54 - Starting task 0.1 in stage 1.0 (TID 13, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0, NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:19 INFO TaskSetManager:54 - Lost task 1.0 in stage 1.0 (TID 12) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 1]
2019-01-11 08:40:24 INFO TaskSetManager:54 - Starting task 1.1 in stage 1.0 (TID 14, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1, NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:24 INFO TaskSetManager:54 - Lost task 0.1 in stage 1.0 (TID 13) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 2]
2019-01-11 08:40:24 INFO TaskSetManager:54 - Starting task 0.2 in stage 1.0 (TID 15, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0, NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:24 INFO TaskSetManager:54 - Lost task 1.1 in stage 1.0 (TID 14) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 3]
2019-01-11 08:40:32 INFO TaskSetManager:54 - Starting task 1.2 in stage 1.0 (TID 16, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1, NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:32 WARN TaskSetManager:66 - Lost task 0.2 in stage 1.0 (TID 15, ip-172-31-35-36.eu-west-1.compute.internal, executor 1): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 8 more
2019-01-11 08:40:32 INFO TaskSetManager:54 - Starting task 0.3 in stage 1.0 (TID 17, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 0, NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:32 INFO TaskSetManager:54 - Lost task 1.2 in stage 1.0 (TID 16) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 1]
2019-01-11 08:40:40 INFO TaskSetManager:54 - Starting task 1.3 in stage 1.0 (TID 18, ip-172-31-35-36.eu-west-1.compute.internal, executor 1, partition 1, NODE_LOCAL, 7660 bytes)
2019-01-11 08:40:40 INFO TaskSetManager:54 - Lost task 0.3 in stage 1.0 (TID 17) on ip-172-31-35-36.eu-west-1.compute.internal, executor 1: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 2]
2019-01-11 08:40:40 ERROR TaskSetManager:70 - Task 0 in stage 1.0 failed 4 times; aborting job
2019-01-11 08:40:40 INFO YarnScheduler:54 - Cancelling stage 1
2019-01-11 08:40:40 INFO YarnScheduler:54 - Stage 1 was cancelled
2019-01-11 08:40:40 INFO DAGScheduler:54 - ResultStage 1 (runJob at SparkHadoopWriter.scala:78) failed in 31.175 s due to Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 17, ip-172-31-35-36.eu-west-1.compute.internal, executor 1): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 8 more
Driver stacktrace:
2019-01-11 08:40:40 INFO DAGScheduler:54 - Job 0 failed: runJob at SparkHadoopWriter.scala:78, took 41.849529 s
2019-01-11 08:40:40 ERROR SparkHadoopWriter:91 - Aborting job job_20190111083958_0003.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 17, ip-172-31-35-36.eu-west-1.compute.internal, executor 1): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 8 more
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1651)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1639)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1638)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1638)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1872)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1821)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1810)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
    at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081)
    at org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
    at org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:238)
    at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
    at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 8 more
2019-01-11 08:40:40 WARN TaskSetManager:66 - Lost task 1.3 in stage 1.0 (TID 18, ip-172-31-35-36.eu-west-1.compute.internal, executor 1): TaskKilled (Stage cancelled)
2019-01-11 08:40:40 INFO YarnScheduler:54 - Removed TaskSet 1.0, whose tasks have all completed, from pool
2019-01-11 08:40:40 INFO AbstractConnector:318 - Stopped Spark@1f12e153{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
2019-01-11 08:40:40 INFO SparkUI:54 - Stopped Spark web UI at http://ip-172-31-33-160.eu-west-1.compute.internal:4041
2019-01-11 08:40:41 INFO YarnClientSchedulerBackend:54 - Interrupting monitor thread
2019-01-11 08:40:41 INFO YarnClientSchedulerBackend:54 - Shutting down all executors
2019-01-11 08:40:41 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Asking each executor to shut down
2019-01-11 08:40:41 INFO SchedulerExtensionServices:54 - Stopping SchedulerExtensionServices (serviceOption=None, services=List(), started=false)
2019-01-11 08:40:41 INFO YarnClientSchedulerBackend:54 - Stopped
2019-01-11 08:40:41 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2019-01-11 08:40:41 INFO MemoryStore:54 - MemoryStore cleared
2019-01-11 08:40:41 INFO BlockManager:54 - BlockManager stopped
2019-01-11 08:40:41 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2019-01-11 08:40:41 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2019-01-11 08:40:41 INFO SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Job aborted.
    at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
    at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081)
    at org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
    at org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:238)
    at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
    ... 11 more
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 17, ip-172-31-35-36.eu-west-1.compute.internal, executor 1): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 8 more
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1651)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1639)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1638)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1638)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1872)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1821)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1810)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
    at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78)
    ... 21 more
Caused by: org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
    at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
    ... 8 more
2019-01-11 08:40:41 INFO ShutdownHookManager:54 - Shutdown hook called
2019-01-11 08:40:41 INFO ShutdownHookManager:54 - Deleting directory /mnt/tmp/spark-b5366df4-8778-4643-8f72-c661ea2298e9
2019-01-11 08:40:41 INFO ShutdownHookManager:54 - Deleting directory /mnt/tmp/spark-1257f53e-11f2-48d4-9ee7-65a1f9ea878b
The command is:
export HADOOP_CONF_DIR=/etc/hadoop/conf && /opt/kylin/apache-kylin-2.6.0/spark/bin/spark-submit --class org.apache.kylin.common.util.SparkEntry --conf spark.executor.instances=40 --conf spark.yarn.queue=default --conf spark.history.fs.logDirectory=hdfs:///kylin/spark-history --conf spark.master=yarn --conf spark.hadoop.yarn.timeline-service.enabled=false --conf spark.executor.memory=4G --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs:///kylin/spark-history --conf spark.yarn.executor.memoryOverhead=1024 --conf spark.driver.memory=2G --conf spark.shuffle.service.enabled=true --jars /usr/lib/hbase/lib/hbase-common-1.4.2.jar,/usr/lib/hbase/lib/hbase-server-1.4.2.jar,/usr/lib/hbase/lib/hbase-client-1.4.2.jar,/usr/lib/hbase/lib/hbase-protocol-1.4.2.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar,/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,/usr/lib/hbase/lib/metrics-core-2.2.0.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar,/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.2.jar, /opt/kylin/apache-kylin-2.6.0/lib/kylin-job-2.6.0.jar -className org.apache.kylin.storage.hbase.steps.SparkCubeHFile -partitions hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/rowkey_stats/part-r-00000_hfile -counterOutput hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/counter -cubename kylin_sales_cube -output hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/hfile -input hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/cuboid/ -segmentId f944e1a8-506a-7f5e-4d6a-389a3ce53489 -metaUrl kylin_metadata@hdfs,path=hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/kylin_sales_cube/metadata -hbaseConfPath hdfs://ip-172-31-33-160.eu-west-1.compute.internal:8020/kylin/kylin_metadata/kylin-7a0c8f27-c78c-530b-0c95-31258f0b0842/hbase-conf.xml
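For completeness, an untested workaround sketch: since no hbase-metrics jar is in the --jars list above, re-running the same spark-submit command with the metrics jars appended should tell us whether this is purely a classpath gap (jar names below assume the EMR HBase 1.4.2 layout; substitute whatever the check in the P.S. reports):

  --jars <existing list above>,/usr/lib/hbase/lib/hbase-metrics-1.4.2.jar,/usr/lib/hbase/lib/hbase-metrics-api-1.4.2.jar

If that works, the same jars could presumably be wired in permanently through Kylin's Spark engine configuration (for example via the kylin.engine.spark.additional-jars property in kylin.properties, if the 2.6 build supports it) instead of editing the command by hand.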