Re: zeppelin (or spark-shell) with HBase fails on executor level
Interesting. I will be watching your PR.

On Wed, Nov 18, 2015 at 7:51 AM, 임정택 wrote:

> Ted,
>
> I suspect I hit the issue https://issues.apache.org/jira/browse/SPARK-11818
> Could you refer to the issue and verify that it makes sense?
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> 2015-11-18 20:32 GMT+09:00 Ted Yu:
>
>> Here is the related code:
>>
>> private static void checkDefaultsVersion(Configuration conf) {
>>   if (conf.getBoolean("hbase.defaults.for.version.skip", Boolean.FALSE))
>>     return;
>>   String defaultsVersion = conf.get("hbase.defaults.for.version");
>>   String thisVersion = VersionInfo.getVersion();
>>   if (!thisVersion.equals(defaultsVersion)) {
>>     throw new RuntimeException(
>>       "hbase-default.xml file seems to be for an older version of HBase (" +
>>       defaultsVersion + "), this version is " + thisVersion);
>>   }
>> }
>>
>> null means that "hbase.defaults.for.version" was not set in the other hbase-default.xml.
>>
>> Can you retrieve the classpath of the Spark task so that we can have more clues?
>>
>> Cheers
>>
>> On Tue, Nov 17, 2015 at 10:06 PM, 임정택 wrote:
>>
>>> Ted,
>>>
>>> Thanks for the reply.
>>>
>>> My fat jar's only Spark-related dependency is spark-core, marked as "provided".
>>> It seems Spark only adds hbase-common 0.98.7-hadoop2, in the spark-example module.
>>>
>>> And if there are two hbase-default.xml files in the classpath, shouldn't one of them be loaded, instead of showing (null)?
>>>
>>> Best,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>> 2015-11-18 13:50 GMT+09:00 Ted Yu:
>>>
>>>> Looks like there are two hbase-default.xml files in the classpath: one for 0.98.6 and another for 0.98.7-hadoop2 (used by Spark).
>>>>
>>>> You can specify hbase.defaults.for.version.skip as true in your hbase-site.xml.
>>>>
>>>> Cheers
>>>>
>>>> On Tue, Nov 17, 2015 at 1:01 AM, 임정택 wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm evaluating Zeppelin to run a driver which interacts with HBase.
>>>>> I use a fat jar to include the HBase dependencies, and see failures at the executor level.
>>>>> I thought it was a Zeppelin issue, but it fails on spark-shell, too.
>>>>>
>>>>> I loaded the fat jar via the --jars option:
>>>>>
>>>>>   ./bin/spark-shell --jars hbase-included-assembled.jar
>>>>>
>>>>> I ran the driver code using the provided SparkContext instance, and see failures in the spark-shell console and the executor logs.
>>>>>
>>>>> Below are the stack traces:
>>>>>
>>>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 55 in stage 0.0 failed 4 times, most recent failure: Lost task 55.3 in stage 0.0 (TID 281, ): java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.client.HConnectionManager
>>>>>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:197)
>>>>>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:159)
>>>>>   at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:101)
>>>>>   at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:128)
>>>>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:104)
>>>>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:66)
>>>>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>>>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
>>>>>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>>>>   at org.apache.spark.scheduler.Task.run(Task.scala:70)
>>>>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>   at java.lang.Thread.run(Thread.java:745)
>>>>>
>>>>> Driver stacktrace:
>>>>>   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
>>>>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
>>>>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
>>>>>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>>>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
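One way to answer the "classpath of the Spark task" question raised in the thread above is a small dump run inside a task closure or the REPL. This is a sketch, not code from the thread: the class name is made up, and the `URLClassLoader` walk only yields URLs on Java 7/8-era runtimes (as used here), where most application classloaders extend `URLClassLoader`.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Dump the JVM classpath plus the URLs of every classloader in the chain.
public class ClasspathDump {
    public static String jvmClasspath() {
        return System.getProperty("java.class.path");
    }

    public static void main(String[] args) {
        System.out.println("java.class.path = " + jvmClasspath());
        for (ClassLoader cl = Thread.currentThread().getContextClassLoader();
             cl != null; cl = cl.getParent()) {
            System.out.println("loader: " + cl);
            if (cl instanceof URLClassLoader) {       // true for most Java 8 loaders
                for (URL u : ((URLClassLoader) cl).getURLs()) {
                    System.out.println("  " + u);     // jars visible to this loader
                }
            }
        }
    }
}
```

Running the same dump on an executor (for example inside an `rdd.foreachPartition` body) would show which HBase jars the failing task actually sees.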
Re: zeppelin (or spark-shell) with HBase fails on executor level
Ted,

I suspect I hit the issue https://issues.apache.org/jira/browse/SPARK-11818
Could you refer to the issue and verify that it makes sense?

Thanks,
Jungtaek Lim (HeartSaVioR)
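Whether two copies of hbase-default.xml really sit on the classpath can be verified directly with `ClassLoader.getResources`. A minimal sketch (the helper class is hypothetical; in this thread's setting one would pass "hbase-default.xml"):

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// List every copy of a named resource that the current classloader can see.
public class ResourceCopies {
    public static List<URL> find(String name) throws IOException {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        return Collections.list(cl.getResources(name)); // one URL per copy on the classpath
    }

    public static void main(String[] args) throws IOException {
        // Two or more URLs printed here would confirm the suspected conflict.
        for (URL u : find("hbase-default.xml")) {
            System.out.println(u);
        }
    }
}
```

If two URLs come back (one from the fat jar, one from the Spark assembly), which copy `Configuration` actually parses depends on classloader ordering, which fits the symptom discussed in this thread.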
Re: zeppelin (or spark-shell) with HBase fails on executor level
Here is the related code:

private static void checkDefaultsVersion(Configuration conf) {
  if (conf.getBoolean("hbase.defaults.for.version.skip", Boolean.FALSE))
    return;
  String defaultsVersion = conf.get("hbase.defaults.for.version");
  String thisVersion = VersionInfo.getVersion();
  if (!thisVersion.equals(defaultsVersion)) {
    throw new RuntimeException(
      "hbase-default.xml file seems to be for an older version of HBase (" +
      defaultsVersion + "), this version is " + thisVersion);
  }
}

null means that "hbase.defaults.for.version" was not set in the other hbase-default.xml.

Can you retrieve the classpath of the Spark task so that we can have more clues?

Cheers
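The behavior of `checkDefaultsVersion` can be reproduced with plain collections; this sketch substitutes a `Map` for Hadoop's `Configuration` (the class and method names are illustrative, not HBase's), and shows why a property left unset in the winning hbase-default.xml yields the "(null)" message, as well as how the skip flag bypasses the check:

```java
import java.util.HashMap;
import java.util.Map;

// Re-implementation of the version check quoted above, using a Map instead of
// Hadoop's Configuration, to show why an unset property yields "(null)".
public class DefaultsVersionCheck {
    static void check(Map<String, String> conf, String thisVersion) {
        if (Boolean.parseBoolean(conf.getOrDefault("hbase.defaults.for.version.skip", "false")))
            return; // the skip flag bypasses the whole check
        String defaultsVersion = conf.get("hbase.defaults.for.version"); // null when unset
        if (!thisVersion.equals(defaultsVersion)) {
            throw new RuntimeException(
                "hbase-default.xml file seems to be for an older version of HBase ("
                + defaultsVersion + "), this version is " + thisVersion);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        try {
            check(conf, "0.98.6-hadoop2"); // property unset -> throws with "(null)"
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
        conf.put("hbase.defaults.for.version.skip", "true");
        check(conf, "0.98.6-hadoop2"); // skip flag set -> no exception
    }
}
```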
Re: zeppelin (or spark-shell) with HBase fails on executor level
I am a bit curious:
HBase depends on HDFS.
Has HDFS support for Mesos been fully implemented?

Last time I checked, there was still work to be done.

Thanks

> On Nov 17, 2015, at 1:06 AM, 임정택 wrote:
>
> Oh, one thing I missed is, I built the Spark 1.4.1 cluster on a 6-node Mesos 0.22.1 H/A (via ZK) cluster.
>
> 2015-11-17 18:01 GMT+09:00 임정택:
>
>> Hi all,
>>
>> I'm evaluating Zeppelin to run a driver which interacts with HBase.
>> I use a fat jar to include the HBase dependencies, and see failures at the executor level.
>> I thought it was a Zeppelin issue, but it fails on spark-shell, too.
>>
>> I loaded the fat jar via the --jars option:
>>
>>   ./bin/spark-shell --jars hbase-included-assembled.jar
>>
>> I ran the driver code using the provided SparkContext instance, and see failures in the spark-shell console and the executor logs.
>>
>> Below are the stack traces:
>>
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 55 in stage 0.0 failed 4 times, most recent failure: Lost task 55.3 in stage 0.0 (TID 281, ): java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.client.HConnectionManager
>>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:197)
>>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:159)
>>   at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:101)
>>   at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:128)
>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:104)
>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:66)
>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
>>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>   at org.apache.spark.scheduler.Task.run(Task.scala:70)
>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:745)
>>
>> Driver stacktrace:
>>   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
>>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>   at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>>   at scala.Option.foreach(Option.scala:236)
>>   at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
>>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
>>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
>>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>
>> 15/11/16 18:59:57 ERROR Executor: Exception in task 14.0 in stage 0.0 (TID 14)
>> java.lang.ExceptionInInitializerError
>>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:197)
>>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:159)
>>   at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:101)
>>   at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:128)
>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:104)
>>   at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:66)
>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
Re: zeppelin (or spark-shell) with HBase fails on executor level
Ted,

Could you elaborate, please?

I maintain a separate HBase cluster and Mesos cluster for some reasons, and I can make it work via spark-submit, or via spark-shell / Zeppelin with a newly initialized SparkContext.

Thanks,
Jungtaek Lim (HeartSaVioR)
Re: zeppelin (or spark-shell) with HBase fails on executor level
I see - your HBase cluster is separate from the Mesos cluster.
I somehow got the (incorrect) impression that the HBase cluster runs on Mesos.
Re: zeppelin (or spark-shell) with HBase fails on executor level
Ted,

Thanks for the reply.

My fat jar's only Spark-related dependency is spark-core, marked as "provided".
It seems Spark only adds hbase-common 0.98.7-hadoop2, in the spark-example module.

And if there are two hbase-default.xml files in the classpath, shouldn't one of them be loaded, instead of showing (null)?

Best,
Jungtaek Lim (HeartSaVioR)
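The hbase.defaults.for.version.skip override suggested earlier in the thread would go into the hbase-site.xml packaged with the application. A sketch in the standard Hadoop configuration XML format (only the property name and value come from this thread; the description text is illustrative):

```xml
<configuration>
  <property>
    <name>hbase.defaults.for.version.skip</name>
    <value>true</value>
    <description>Skip the hbase-default.xml version sanity check
      (works around conflicting hbase-default.xml copies on the classpath).</description>
  </property>
</configuration>
```

Note that this silences the sanity check rather than removing the conflicting jar, so the HBase client classes actually loaded may still differ from the version the cluster runs.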
Re: zeppelin (or spark-shell) with HBase fails on executor level
Oh, one thing I missed: I built the Spark 1.4.1 cluster on 6 nodes of Mesos 0.22.1, running H/A via ZK.

2015-11-17 18:01 GMT+09:00 임정택:

> Hi all,
>
> I'm evaluating zeppelin to run driver which interacts with HBase.
> I use fat jar to include HBase dependencies, and see failures on executor
> level.
> I thought it is zeppelin's issue, but it fails on spark-shell, too.
>
> I loaded fat jar via --jars option,
>
> > ./bin/spark-shell --jars hbase-included-assembled.jar
>
> and run driver code using provided SparkContext instance, and see failures
> from spark-shell console and executor logs.
Re: zeppelin (or spark-shell) with HBase fails on executor level
I just made it work from both sides (zeppelin, spark-shell) by initializing another SparkContext and running against that. But since this feels like a workaround, I'd love to hear the proper way (or a more beautiful workaround) to resolve this.

Please let me know if you have any suggestions.

Best,
Jungtaek Lim (HeartSaVioR)

2015-11-17 18:06 GMT+09:00 임정택:

> Oh, one thing I missed is, I built Spark 1.4.1 Cluster with 6 nodes of
> Mesos 0.22.1 H/A (via ZK) cluster.
>
> 2015-11-17 18:01 GMT+09:00 임정택 :
>
>> Hi all,
>>
>> I'm evaluating zeppelin to run driver which interacts with HBase.
>> I use fat jar to include HBase dependencies, and see failures on executor
>> level.
>> I thought it is zeppelin's issue, but it fails on spark-shell, too.
>>
>> I loaded fat jar via --jars option,
>>
>> > ./bin/spark-shell --jars hbase-included-assembled.jar
>>
>> and run driver code using provided SparkContext instance, and see
>> failures from spark-shell console and executor logs.
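[Editorial sketch] The workaround described above, initializing another SparkContext, might look roughly like the following from inside spark-shell or a zeppelin paragraph. The thread does not show the actual code; the app name, jar path, and the use of spark.executor.extraClassPath here are assumptions, not details from the original messages:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Stop the SparkContext provided by spark-shell / zeppelin first; only one
// active context is allowed per JVM in Spark 1.4.
sc.stop()

val conf = new SparkConf()
  .setAppName("hbase-eval")  // hypothetical name
  // Ship the fat jar to the executors, and also prepend it to the executor
  // classpath so its hbase-default.xml is the one that gets loaded, instead
  // of the 0.98.7-hadoop2 copy bundled with Spark.
  .setJars(Seq("/path/to/hbase-included-assembled.jar"))       // hypothetical path
  .set("spark.executor.extraClassPath", "hbase-included-assembled.jar")

// Master URL is inherited from spark-defaults / the shell's launch arguments
// unless set explicitly here (a Mesos URL, in this thread's setup).
val newSc = new SparkContext(conf)
```

Setting hbase.defaults.for.version.skip=true in hbase-site.xml, as suggested earlier in the thread, is the complementary fix: it tells HBase not to fail when the hbase-default.xml it finds belongs to a different minor version.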