Carlos Ignacio Molina López created KYLIN-4522:
--------------------------------------------------
Summary: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile (Kylin 2.6.6, EMR 5.19)
Key: KYLIN-4522
URL: https://issues.apache.org/jira/browse/KYLIN-4522
Project: Kylin
Issue Type: Bug
Components: Environment, Job Engine, Others
Affects Versions: v2.6.6
Environment: Release label: emr-5.19.0
Hadoop distribution: Amazon 2.8.5
Applications: Hive 2.3.3, HBase 1.4.7, Spark 2.3.2, Livy 0.5.0, ZooKeeper
3.4.13, Sqoop 1.4.7, Oozie 5.0.0, Pig 0.17.0, HCatalog 2.3.3
Reporter: Carlos Ignacio Molina López
Attachments: base_2020_05_25_14_29_52.zip
Hi,
I tried to build the sample kylin_sales_cube with the Spark engine on an Amazon EMR cluster. I saw issue KYLIN-3931, where the suggestion is to use the 2.6.6 build for Hadoop 3. On EMR, however, Hadoop 3 is only available starting with EMR 6.0, which is very recent; I tried setting up the Hadoop 3 builds of both 2.6.6 and 3.0.2, but in both cases the Kylin web UI did not come up (Error 404 - Not Found). So I fell back to EMR 5.19, which ships the same Spark version (2.3.2) that Kylin 2.6.6 uses.
I am getting a "java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile" error.
I had already copied the following jars to Spark's jars folder, per the documentation and what I've read in the kylin-issues mailing list archives (a sketch of the copy commands follows the list):
/usr/lib/hbase/hbase-hadoop-compat-1.4.7.jar
/usr/lib/hbase/hbase-hadoop2-compat-1.4.7.jar
/usr/lib/hbase/lib/hbase-common-1.4.7-tests.jar
/usr/lib/hbase/lib/hbase-common-1.4.7.jar
/usr/lib/hbase/hbase-client.jar
/usr/lib/hbase/hbase-client-1.4.7.jar
/usr/lib/hbase/hbase-server-1.4.7.jar
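For reference, this is roughly how the copy was done; a minimal sketch assuming EMR's default layout, where the target directory /usr/lib/spark/jars is an assumption based on the /usr/lib/spark/bin/spark-submit path in the log below:
{code}
# Sketch of the jar copy (the target dir /usr/lib/spark/jars is assumed from the
# EMR layout; adjust if Spark is installed elsewhere on the cluster).
SPARK_JARS_DIR=/usr/lib/spark/jars

sudo cp /usr/lib/hbase/hbase-hadoop-compat-1.4.7.jar    "$SPARK_JARS_DIR/"
sudo cp /usr/lib/hbase/hbase-hadoop2-compat-1.4.7.jar   "$SPARK_JARS_DIR/"
sudo cp /usr/lib/hbase/lib/hbase-common-1.4.7-tests.jar "$SPARK_JARS_DIR/"
sudo cp /usr/lib/hbase/lib/hbase-common-1.4.7.jar       "$SPARK_JARS_DIR/"
sudo cp /usr/lib/hbase/hbase-client.jar                 "$SPARK_JARS_DIR/"
sudo cp /usr/lib/hbase/hbase-client-1.4.7.jar           "$SPARK_JARS_DIR/"
sudo cp /usr/lib/hbase/hbase-server-1.4.7.jar           "$SPARK_JARS_DIR/"
{code}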
This is the output shown for the failing step:
{code}
org.apache.kylin.engine.spark.exception.SparkException: OS command error exit with return code: 1, error message:
20/05/25 14:03:46 WARN SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
org.apache.kylin.engine.spark.exception.SparkException: OS command error exit with return code: 1, error message:
20/05/25 14:03:46 WARN SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
20/05/25 14:03:47 INFO RMProxy: Connecting to ResourceManager at ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal/XXX.XXX.XXX.XXX:8032
20/05/25 14:03:49 INFO Client: Requesting a new application from cluster with 4 NodeManagers
20/05/25 14:03:49 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (6144 MB per container)
20/05/25 14:03:49 INFO Client: Will allocate AM container, with 5632 MB memory including 512 MB overhead
20/05/25 14:03:49 INFO Client: Setting up container launch context for our AM
20/05/25 14:03:49 INFO Client: Setting up the launch environment for our AM container
20/05/25 14:03:49 INFO Client: Preparing resources for our AM container
20/05/25 14:03:51 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
20/05/25 14:03:54 INFO Client: Uploading resource file:/mnt/tmp/spark-d26c4f1f-1b8a-4cf8-a05b-842294ce017d/__spark_libs__4034657074333893156.zip -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/__spark_libs__4034657074333893156.zip
20/05/25 14:03:54 INFO Client: Uploading resource file:/usr/local/kylin/apache-kylin-2.6.6-bin-hbase1x/lib/kylin-job-2.6.6.jar -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/kylin-job-2.6.6.jar
20/05/25 14:03:55 INFO Client: Uploading resource file:/usr/lib/hbase/lib/hbase-common-1.4.7.jar -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-common-1.4.7.jar
20/05/25 14:03:55 INFO Client: Uploading resource file:/usr/lib/hbase/lib/hbase-server-1.4.7.jar -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-server-1.4.7.jar
20/05/25 14:03:55 INFO Client: Uploading resource file:/usr/lib/hbase/lib/hbase-client-1.4.7.jar -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-client-1.4.7.jar
20/05/25 14:03:55 INFO Client: Uploading resource file:/usr/lib/hbase/lib/hbase-protocol-1.4.7.jar -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-protocol-1.4.7.jar
20/05/25 14:03:55 INFO Client: Uploading resource file:/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.7.jar -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-hadoop-compat-1.4.7.jar
20/05/25 14:03:56 INFO Client: Uploading resource file:/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/htrace-core-3.1.0-incubating.jar
20/05/25 14:03:56 INFO Client: Uploading resource file:/usr/lib/hbase/lib/metrics-core-2.2.0.jar -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/metrics-core-2.2.0.jar
20/05/25 14:03:56 WARN Client: Same path resource file:///usr/lib/hbase/lib/hbase-hadoop-compat-1.4.7.jar added multiple times to distributed cache.
20/05/25 14:03:56 INFO Client: Uploading resource file:/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.7.jar -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hbase-hadoop2-compat-1.4.7.jar
20/05/25 14:03:56 INFO Client: Uploading resource file:/etc/spark/conf/hive-site.xml -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/hive-site.xml
20/05/25 14:03:56 INFO Client: Uploading resource file:/mnt/tmp/spark-d26c4f1f-1b8a-4cf8-a05b-842294ce017d/__spark_conf__1997289269037988671.zip -> hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1590337422418_0043/__spark_conf__.zip
20/05/25 14:03:56 INFO SecurityManager: Changing view acls to: hadoop
20/05/25 14:03:56 INFO SecurityManager: Changing modify acls to: hadoop
20/05/25 14:03:56 INFO SecurityManager: Changing view acls groups to: 
20/05/25 14:03:56 INFO SecurityManager: Changing modify acls groups to: 
20/05/25 14:03:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
20/05/25 14:03:56 INFO Client: Submitting application application_1590337422418_0043 to ResourceManager
20/05/25 14:03:56 INFO YarnClientImpl: Submitted application application_1590337422418_0043
20/05/25 14:03:57 INFO Client: Application report for application_1590337422418_0043 (state: ACCEPTED)
20/05/25 14:03:57 INFO Client:
	 client token: N/A
	 diagnostics: AM container is launched, waiting for AM container to Register with RM
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1590415436952
	 final status: UNDEFINED
	 tracking URL: http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
	 user: hadoop
20/05/25 14:03:58 INFO Client: Application report for application_1590337422418_0043 (state: ACCEPTED)
[... identical ACCEPTED reports every second through 14:04:01 ...]
20/05/25 14:04:02 INFO Client: Application report for application_1590337422418_0043 (state: RUNNING)
20/05/25 14:04:02 INFO Client:
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: XXX.XXX.XXX.XXX
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1590415436952
	 final status: UNDEFINED
	 tracking URL: http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
	 user: hadoop
[... identical RUNNING reports every second through 14:04:43 ...]
20/05/25 14:04:44 INFO Client: Application report for application_1590337422418_0043 (state: ACCEPTED)
20/05/25 14:04:44 INFO Client:
	 client token: N/A
	 diagnostics: AM container is launched, waiting for AM container to Register with RM
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1590415436952
	 final status: UNDEFINED
	 tracking URL: http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
	 user: hadoop
[... identical ACCEPTED reports through 14:04:48 ...]
20/05/25 14:04:49 INFO Client: Application report for application_1590337422418_0043 (state: RUNNING)
20/05/25 14:04:49 INFO Client:
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: XXX.XXX.XXX.XXX
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1590415436952
	 final status: UNDEFINED
	 tracking URL: http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
	 user: hadoop
[... identical RUNNING reports every second through 14:05:26 ...]
20/05/25 14:05:27 INFO Client: Application report for application_1590337422418_0043 (state: FINISHED)
20/05/25 14:05:27 INFO Client:
	 client token: N/A
	 diagnostics: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Job aborted.
	at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
	at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:721)
Caused by: org.apache.spark.SparkException: Job aborted.
	at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:100)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1083)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1081)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1081)
	at org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
	at org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:238)
	at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
	... 6 more
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 1.0 failed 4 times, most recent failure: Lost task 1.3 in stage 1.0 (TID 15, ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal, executor 3): org.apache.spark.SparkException: Task failed while writing rows
	at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:880)
	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:805)
	at org.apache.hadoop.hbase.regionserver.StoreFile$WriterBuilder.build(StoreFile.java:739)
	at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.getNewWriter(HFileOutputFormat3.java:224)
	at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:181)
	at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:153)
	at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
	at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
	... 8 more
{code}
{code}
Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1803)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1791)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1790)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1790)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:871)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:871)
	at scala.Option.foreach(Option.scala:257)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:871)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2024)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1973)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1962)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:682)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
	at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78)
	... 16 more
Caused by: org.apache.spark.SparkException: Task failed while writing rows
	at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.io.hfile.HFile
	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:880)
	at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:805)
	at org.apache.hadoop.hbase.regionserver.StoreFile$WriterBuilder.build(StoreFile.java:739)
	at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.getNewWriter(HFileOutputFormat3.java:224)
	at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:181)
	at org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:153)
	at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
	at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
	at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
	... 8 more
{code}
{code}
	 ApplicationMaster host: XXX.XXX.XXX.XXX
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1590415436952
	 final status: FAILED
	 tracking URL: http://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:20888/proxy/application_1590337422418_0043/
	 user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1590337422418_0043 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1165)
	at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1520)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
20/05/25 14:05:27 INFO ShutdownHookManager: Shutdown hook called
20/05/25 14:05:27 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-04e9eed4-d16e-406c-9fb0-972cf355db09
20/05/25 14:05:27 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-d26c4f1f-1b8a-4cf8-a05b-842294ce017d
The command is: export HADOOP_CONF_DIR=/etc/hadoop/conf && /usr/lib/spark/bin/spark-submit \
	--class org.apache.kylin.common.util.SparkEntry \
	--conf spark.executor.instances=40 \
	--conf spark.yarn.queue=default \
	--conf spark.history.fs.logDirectory=hdfs:///kylin/spark-history \
	--conf spark.master=yarn \
	--conf spark.hadoop.yarn.timeline-service.enabled=false \
	--conf spark.executor.memory=5G \
	--conf spark.eventLog.enabled=true \
	--conf spark.eventLog.dir=hdfs:///kylin/spark-history \
	--conf spark.yarn.executor.memoryOverhead=1024 \
	--conf spark.driver.memory=5G \
	--conf spark.submit.deployMode=cluster \
	--conf spark.shuffle.service.enabled=true \
	--jars /usr/lib/hbase/lib/hbase-common-1.4.7.jar,/usr/lib/hbase/lib/hbase-server-1.4.7.jar,/usr/lib/hbase/lib/hbase-client-1.4.7.jar,/usr/lib/hbase/lib/hbase-protocol-1.4.7.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.7.jar,/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,/usr/lib/hbase/lib/metrics-core-2.2.0.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.7.jar,/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.7.jar, /usr/local/kylin/apache-kylin-2.6.6-bin-hbase1x/lib/kylin-job-2.6.6.jar \
	-className org.apache.kylin.storage.hbase.steps.SparkCubeHFile \
	-partitions hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/rowkey_stats/part-r-00000_hfile \
	-counterOutput hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/counter \
	-cubename kylin_sales_cube \
	-output hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/hfile \
	-input hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/cuboid/ \
	-segmentId 0d22a9ac-5256-02cd-a5b9-44de5247871f \
	-metaUrl kylin_metadata@hdfs,path=hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/kylin_sales_cube/metadata \
	-hbaseConfPath hdfs://ip-XXX-XXX-XXX-XXX.us-west-2.compute.internal:8020/kylin/kylin_metadata/kylin-b75c7f69-2ebf-c5c3-4a6e-b01f177d911f/hbase-conf.xml
	at org.apache.kylin.engine.spark.SparkExecutable.doWork(SparkExecutable.java:347)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
	at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
	at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
	at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}
Please suggest how I can troubleshoot this issue.
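In case it helps: as I understand it, "Could not initialize class" usually means the class's static initializer already failed once, often because a transitive dependency is missing from the classpath. A minimal check I could run, sketched under the assumption that spark-shell reads statements from stdin and using the same --jars list as the failing step, would be:
{code}
# Hedged sketch, not a verified fix: force-load HFile in a Spark shell with the
# same --jars list as the failing step. If static initialization fails, this
# should surface the original ExceptionInInitializerError (and its cause)
# instead of the later, less informative NoClassDefFoundError.
echo 'Class.forName("org.apache.hadoop.hbase.io.hfile.HFile")' | \
  /usr/lib/spark/bin/spark-shell \
    --jars /usr/lib/hbase/lib/hbase-common-1.4.7.jar,/usr/lib/hbase/lib/hbase-server-1.4.7.jar,/usr/lib/hbase/lib/hbase-client-1.4.7.jar,/usr/lib/hbase/lib/hbase-protocol-1.4.7.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.7.jar,/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.7.jar,/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,/usr/lib/hbase/lib/metrics-core-2.2.0.jar
{code}
This only exercises the driver-side classpath, while the failure above happens in executors, so it is an approximation; if nothing shows up locally, pointing spark.executor.extraClassPath at /usr/lib/hbase/lib/* (instead of copying individual jars) is another thing I could try.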
Thank you and kind regards,
Carlos Molina