Hi, this is exactly the same as my issue; it seems to be a bug in 1.5.x (see my thread for details).
2015-11-19 11:20 GMT+01:00 Jeff Zhang <zjf...@gmail.com>:

> Seems your jdbc url is not correct. It should be jdbc:mysql://192.168.41.229:3306
>
> On Thu, Nov 19, 2015 at 6:03 PM, <luohui20...@sina.com> wrote:
>
>> hi guys,
>>
>> I also found that --driver-class-path and spark.driver.extraClassPath
>> are not working when I'm accessing the mysql driver in my spark APP.
>>
>> The attachment is my config for my APP.
>>
>> Here are my command and the logs of the failure I encountered:
>>
>> [root@h170 spark]# bin/spark-shell --master spark://h170:7077 \
>>     --driver-class-path lib/mysql-connector-java-5.1.32-bin.jar \
>>     --jars lib/mysql-connector-java-5.1.32-bin.jar
>> log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
>> log4j:WARN Please initialize the log4j system properly.
>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
>> Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
>> To adjust logging level use sc.setLogLevel("INFO")
>> Welcome to
>>       ____              __
>>      / __/__  ___ _____/ /__
>>     _\ \/ _ \/ _ `/ __/ '_/
>>    /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
>>       /_/
>>
>> Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_79)
>> Type in expressions to have them evaluated.
>> Type :help for more information.
>> 15/11/19 17:51:33 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
>> Spark context available as sc.
>> 15/11/19 17:51:35 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>> 15/11/19 17:51:35 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>> 15/11/19 17:51:48 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
>> 15/11/19 17:51:48 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
>> 15/11/19 17:51:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> 15/11/19 17:51:50 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>> 15/11/19 17:51:50 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
>> SQL context available as sqlContext.
>> scala> val jdbcDF = sqlContext.read.format("jdbc").options(Map("url" -> "mysql://192.168.41.229:3306", "dbtable" -> "sqoop.SQOOP_ROOT")).load()
>> java.sql.SQLException: No suitable driver found for mysql://192.168.41.229:3306
>>   at java.sql.DriverManager.getConnection(DriverManager.java:596)
>>   at java.sql.DriverManager.getConnection(DriverManager.java:187)
>>   at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anonfun$getConnector$1.apply(JDBCRDD.scala:188)
>>   at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anonfun$getConnector$1.apply(JDBCRDD.scala:181)
>>   at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:121)
>>   at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
>>   at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:60)
>>   at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:125)
>>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
>>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:19)
>>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
>>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
>>   at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:28)
>>   at $iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
>>   at $iwC$$iwC$$iwC.<init>(<console>:32)
>>   at $iwC$$iwC.<init>(<console>:34)
>>   at $iwC.<init>(<console>:36)
>>   at <init>(<console>:38)
>>   at .<init>(<console>:42)
>>   at .<clinit>(<console>)
>>   at .<init>(<console>:7)
>>   at .<clinit>(<console>)
>>   at $print(<console>)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>   at java.lang.reflect.Method.invoke(Method.java:606)
>>   at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>>   at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1340)
>>   at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>>   at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>>   at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>>   at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>>   at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>>   at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>>   at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>>   at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>>   at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>   at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>   at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>>   at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>>   at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
>>   at org.apache.spark.repl.Main$.main(Main.scala:31)
>>   at org.apache.spark.repl.Main.main(Main.scala)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>   at java.lang.reflect.Method.invoke(Method.java:606)
>>   at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
>>   at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
>>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> --------------------------------
>> Thanks & Best regards!
>> San.Luo
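Note that besides the classpath problem there is a second, independent error in the session above: java.sql.DriverManager only matches URLs that start with the jdbc: scheme, so "mysql://192.168.41.229:3306" can never resolve a driver, which is exactly Jeff's point. A corrected read would look roughly like this (a sketch reusing the host and table from the failing command; the "driver" option is optional when the connector jar is already visible to the driver, but it forces the MySQL driver class to be loaded and registered):

    val jdbcDF = sqlContext.read
      .format("jdbc")
      .options(Map(
        "url"     -> "jdbc:mysql://192.168.41.229:3306",  // the jdbc: scheme is required
        "dbtable" -> "sqoop.SQOOP_ROOT",
        "driver"  -> "com.mysql.jdbc.Driver"              // assumed: Connector/J 5.x driver class
      ))
      .load()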
>> ----- Original Message -----
>> From: Xiao Li <gatorsm...@gmail.com>
>> To: Akhil Das <ak...@sigmoidanalytics.com>
>> Cc: Hurshal Patel <hpatel...@gmail.com>, "user@spark.apache.org" <user@spark.apache.org>
>> Subject: Re: driver ClassNotFoundException when MySQL JDBC exceptions are thrown on executor
>> Date: 2015-10-22 22:10
>>
>> A few months ago, I used the DB2 jdbc drivers. I hit a couple of issues
>> when using --driver-class-path. In the end, I used the following command
>> to bypass most of the issues:
>>
>> ./bin/spark-submit --jars /Users/smile/db2driver/db2jcc.jar,/Users/smile/db2driver/db2jcc_license_cisuz.jar --master local[*] --class com.sparkEngine.tttt /Users/smile/spark-1.3.1-bin-hadoop2.3/projects/SparkApps-master/spark-load-from-db/target/nnnn-1.0.jar
>>
>> Hopefully, it works for you.
>>
>> Xiao Li
>>
>> 2015-10-22 4:56 GMT-07:00 Akhil Das <ak...@sigmoidanalytics.com>:
>>
>> Did you try passing the mysql connector jar through --driver-class-path?
>>
>> Thanks
>> Best Regards
>>
>> On Sat, Oct 17, 2015 at 6:33 AM, Hurshal Patel <hpatel...@gmail.com> wrote:
>>
>> Hi all,
>>
>> I've been struggling with a particularly puzzling issue after upgrading
>> to Spark 1.5.1 from Spark 1.4.1.
>>
>> When I use the MySQL JDBC connector and an exception (e.g.
>> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException) is thrown on
>> the executor, I get a ClassNotFoundException on the driver, which results
>> in this error (logs are abbreviated):
>>
>> 15/10/16 17:20:59 INFO SparkContext: Starting job: collect at repro.scala:73
>> ...
>> 15/10/16 17:20:59 INFO Executor: Running task 0.0 in stage 3.0 (TID 3)
>> 15/10/16 17:20:59 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 3)
>> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException
>>   at repro.Repro$$anonfun$main$3.apply$mcZI$sp(repro.scala:69)
>> ...
>> 15/10/16 17:20:59 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
>> java.lang.ClassNotFoundException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException
>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>> ...
>> 15/10/16 17:20:59 ERROR TaskResultGetter: Could not deserialize TaskEndReason: ClassNotFound with classloader org.apache.spark.util.MutableURLClassLoader@7f08a6b1
>> 15/10/16 17:20:59 WARN TaskSetManager: Lost task 0.0 in stage 3.0 (TID 3, localhost): UnknownReason
>> 15/10/16 17:20:59 ERROR TaskSetManager: Task 0 in stage 3.0 failed 1 times; aborting job
>> 15/10/16 17:20:59 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool
>> 15/10/16 17:20:59 INFO TaskSchedulerImpl: Cancelling stage 3
>> 15/10/16 17:20:59 INFO DAGScheduler: ResultStage 3 (collect at repro.scala:73) failed in 0.012 s
>> 15/10/16 17:20:59 INFO DAGScheduler: Job 3 failed: collect at repro.scala:73, took 0.018694 s
>>
>> In Spark 1.4.1, I get the following (logs are abbreviated):
>>
>> 15/10/16 17:42:41 INFO SparkContext: Starting job: collect at repro.scala:53
>> ...
>> 15/10/16 17:42:41 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
>> 15/10/16 17:42:41 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
>> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException
>>   at repro.Repro$$anonfun$main$2.apply$mcZI$sp(repro.scala:49)
>> ...
>> 15/10/16 17:42:41 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, localhost): com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException
>>   at repro.Repro$$anonfun$main$2.apply$mcZI$sp(repro.scala:49)
>> ...
>> 15/10/16 17:42:41 ERROR TaskSetManager: Task 0 in stage 2.0 failed 1 times; aborting job
>> 15/10/16 17:42:41 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
>> 15/10/16 17:42:41 INFO TaskSchedulerImpl: Cancelling stage 2
>> 15/10/16 17:42:41 INFO DAGScheduler: ResultStage 2 (collect at repro.scala:53) failed in 0.016 s
>> 15/10/16 17:42:41 INFO DAGScheduler: Job 2 failed: collect at repro.scala:53, took 0.024584 s
>>
>> Either I have seriously screwed up somewhere, or this is a change in
>> behavior that I have not been able to find in the documentation. For
>> those who are interested, a full repro and logs follow.
>>
>> Hurshal
>>
>> ---
>>
>> I am running this on Spark 1.5.1 + Hadoop 2.6. I have tried this in various combinations of
>> * local/standalone mode
>> * putting mysql on the classpath with --jars / building a fat jar with mysql in it / manually running sc.addJar on the mysql jar
>> * --deploy-mode client / --deploy-mode cluster
>> but nothing seems to change.
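One blunt way to keep the real error text visible on the driver while this is broken: catch connector-specific exceptions inside the task and rethrow them as a JDK type every classloader can resolve. A minimal sketch follows; it is a workaround, not a fix for the underlying classloader problem, and shieldDriver is a hypothetical helper, not part of any Spark API:

    // Hypothetical helper: converts connector-specific failures into a plain
    // RuntimeException before Spark serializes the TaskEndReason, so the
    // driver-side deserializer never has to load the MySQL exception class.
    def shieldDriver[T](body: => T): T =
      try body
      catch {
        case e: java.sql.SQLException =>
          throw new RuntimeException(e.toString)  // original type is lost, message survives
      }

    val safeRdd = sc.parallelize(List(1)).map { x =>
      shieldDriver {
        // ... JDBC work that may throw MySQLSyntaxErrorException ...
        true
      }
    }

This trades the exception type for a readable message, instead of the UnknownReason masking shown in the 1.5.1 logs above.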
>> Here is an example invocation, and the accompanying source code:
>>
>> $ ./bin/spark-submit --master local --deploy-mode client --class repro.Repro /home/nix/repro/target/scala-2.10/repro-assembly-0.0.1.jar
>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>> 15/10/16 17:40:53 INFO SparkContext: Running Spark version 1.5.1
>> 15/10/16 17:40:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> 15/10/16 17:40:53 WARN Utils: Your hostname, choochootrain resolves to a loopback address: 127.0.1.1; using 10.0.1.97 instead (on interface wlan0)
>> 15/10/16 17:40:53 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
>> 15/10/16 17:40:53 INFO SecurityManager: Changing view acls to: root
>> 15/10/16 17:40:53 INFO SecurityManager: Changing modify acls to: root
>> 15/10/16 17:40:53 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
>> 15/10/16 17:40:54 INFO Slf4jLogger: Slf4jLogger started
>> 15/10/16 17:40:54 INFO Remoting: Starting remoting
>> 15/10/16 17:40:54 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.0.1.97:48116]
>> 15/10/16 17:40:54 INFO Utils: Successfully started service 'sparkDriver' on port 48116.
>> 15/10/16 17:40:54 INFO SparkEnv: Registering MapOutputTracker
>> 15/10/16 17:40:54 INFO SparkEnv: Registering BlockManagerMaster
>> 15/10/16 17:40:54 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-7e7cf2b0-397e-4c44-97e9-508f5c6ec5ab
>> 15/10/16 17:40:54 INFO MemoryStore: MemoryStore started with capacity 530.3 MB
>> 15/10/16 17:40:54 INFO HttpFileServer: HTTP File server directory is /tmp/spark-ae2e1831-eb79-4431-bb49-81afbce963d7/httpd-51f4d72e-c080-4d58-8500-8a9174564478
>> 15/10/16 17:40:54 INFO HttpServer: Starting HTTP Server
>> 15/10/16 17:40:54 INFO Utils: Successfully started service 'HTTP file server' on port 57598.
>> 15/10/16 17:40:54 INFO SparkEnv: Registering OutputCommitCoordinator
>> 15/10/16 17:40:54 INFO Utils: Successfully started service 'SparkUI' on port 4040.
>> 15/10/16 17:40:54 INFO SparkUI: Started SparkUI at http://10.0.1.97:4040
>> 15/10/16 17:40:54 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
>> 15/10/16 17:40:54 INFO Executor: Starting executor ID driver on host localhost
>> 15/10/16 17:40:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 53260.
>> 15/10/16 17:40:54 INFO NettyBlockTransferService: Server created on 53260
>> 15/10/16 17:40:54 INFO BlockManagerMaster: Trying to register BlockManager
>> 15/10/16 17:40:54 INFO BlockManagerMasterEndpoint: Registering block manager localhost:53260 with 530.3 MB RAM, BlockManagerId(driver, localhost, 53260)
>> 15/10/16 17:40:54 INFO BlockManagerMaster: Registered BlockManager
>>
>> ##### Does the executor have MySQLSyntaxErrorException?
>> 15/10/16 17:40:55 INFO SparkContext: Starting job: collect at repro.scala:24
>> 15/10/16 17:40:55 INFO DAGScheduler: Got job 0 (collect at repro.scala:24) with 1 output partitions
>> 15/10/16 17:40:55 INFO DAGScheduler: Final stage: ResultStage 0(collect at repro.scala:24)
>> 15/10/16 17:40:55 INFO DAGScheduler: Parents of final stage: List()
>> 15/10/16 17:40:55 INFO DAGScheduler: Missing parents: List()
>> 15/10/16 17:40:55 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at repro.scala:16), which has no missing parents
>> 15/10/16 17:40:55 INFO MemoryStore: ensureFreeSpace(1976) called with curMem=0, maxMem=556038881
>> 15/10/16 17:40:55 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1976.0 B, free 530.3 MB)
>> 15/10/16 17:40:55 INFO MemoryStore: ensureFreeSpace(1243) called with curMem=1976, maxMem=556038881
>> 15/10/16 17:40:55 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1243.0 B, free 530.3 MB)
>> 15/10/16 17:40:55 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:53260 (size: 1243.0 B, free: 530.3 MB)
>> 15/10/16 17:40:55 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:861
>> 15/10/16 17:40:55 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at repro.scala:16)
>> 15/10/16 17:40:55 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
>> 15/10/16 17:40:55 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 2160 bytes)
>> 15/10/16 17:40:55 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
>> 15/10/16 17:40:55 INFO Executor: Fetching http://10.0.1.97:57598/jars/mysql-connector-java-5.1.34.jar with timestamp 1445042454812
>> 15/10/16 17:40:55 INFO Utils: Fetching http://10.0.1.97:57598/jars/mysql-connector-java-5.1.34.jar to /tmp/spark-ae2e1831-eb79-4431-bb49-81afbce963d7/userFiles-af3a471b-c102-4d5a-87b4-8f538bef09ec/fetchFileTemp2850274256530295645.tmp
>> 15/10/16 17:40:55 INFO Executor: Adding file:/tmp/spark-ae2e1831-eb79-4431-bb49-81afbce963d7/userFiles-af3a471b-c102-4d5a-87b4-8f538bef09ec/mysql-connector-java-5.1.34.jar to class loader
>> 15/10/16 17:40:55 INFO Executor: Fetching http://10.0.1.97:57598/jars/repro-assembly-0.0.1.jar with timestamp 1445042454573
>> 15/10/16 17:40:55 INFO Utils: Fetching http://10.0.1.97:57598/jars/repro-assembly-0.0.1.jar to /tmp/spark-ae2e1831-eb79-4431-bb49-81afbce963d7/userFiles-af3a471b-c102-4d5a-87b4-8f538bef09ec/fetchFileTemp4393235523023490143.tmp
>> 15/10/16 17:40:55 INFO Executor: Adding file:/tmp/spark-ae2e1831-eb79-4431-bb49-81afbce963d7/userFiles-af3a471b-c102-4d5a-87b4-8f538bef09ec/repro-assembly-0.0.1.jar to class loader
>> 15/10/16 17:40:55 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 899 bytes result sent to driver
>> 15/10/16 17:40:55 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 108 ms on localhost (1/1)
>> 15/10/16 17:40:55 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
>> 15/10/16 17:40:55 INFO DAGScheduler: ResultStage 0 (collect at repro.scala:24) finished in 0.116 s
>> 15/10/16 17:40:55 INFO DAGScheduler: Job 0 finished: collect at repro.scala:24, took 0.228888 s
>> ##### Yes.
>>
>> ##### Can the executor catch MySQLSyntaxErrorException?
>> 15/10/16 17:40:55 INFO SparkContext: Starting job: collect at repro.scala:40
>> 15/10/16 17:40:55 INFO DAGScheduler: Got job 1 (collect at repro.scala:40) with 1 output partitions
>> 15/10/16 17:40:55 INFO DAGScheduler: Final stage: ResultStage 1(collect at repro.scala:40)
>> 15/10/16 17:40:55 INFO DAGScheduler: Parents of final stage: List()
>> 15/10/16 17:40:55 INFO DAGScheduler: Missing parents: List()
>> 15/10/16 17:40:55 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[3] at map at repro.scala:32), which has no missing parents
>> 15/10/16 17:40:55 INFO MemoryStore: ensureFreeSpace(1984) called with curMem=3219, maxMem=556038881
>> 15/10/16 17:40:55 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 1984.0 B, free 530.3 MB)
>> 15/10/16 17:40:55 INFO MemoryStore: ensureFreeSpace(1242) called with curMem=5203, maxMem=556038881
>> 15/10/16 17:40:55 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1242.0 B, free 530.3 MB)
>> 15/10/16 17:40:55 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:53260 (size: 1242.0 B, free: 530.3 MB)
>> 15/10/16 17:40:55 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:861
>> 15/10/16 17:40:55 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at map at repro.scala:32)
>> 15/10/16 17:40:55 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
>> 15/10/16 17:40:55 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, PROCESS_LOCAL, 2160 bytes)
>> 15/10/16 17:40:55 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
>> 15/10/16 17:40:55 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 899 bytes result sent to driver
>> 15/10/16 17:40:55 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 32 ms on localhost (1/1)
>> 15/10/16 17:40:55 INFO DAGScheduler: ResultStage 1 (collect at repro.scala:40) finished in 0.032 s
>> 15/10/16 17:40:55 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
>> 15/10/16 17:40:55 INFO DAGScheduler: Job 1 finished: collect at repro.scala:40, took 0.039600 s
>> ##### Yes.
>>
>> ##### Can the driver recover from MySQLSyntaxErrorException?
>> 15/10/16 17:40:55 INFO SparkContext: Starting job: collect at repro.scala:53
>> 15/10/16 17:40:55 INFO DAGScheduler: Got job 2 (collect at repro.scala:53) with 1 output partitions
>> 15/10/16 17:40:55 INFO DAGScheduler: Final stage: ResultStage 2(collect at repro.scala:53)
>> 15/10/16 17:40:55 INFO DAGScheduler: Parents of final stage: List()
>> 15/10/16 17:40:55 INFO DAGScheduler: Missing parents: List()
>> 15/10/16 17:40:55 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[5] at map at repro.scala:48), which has no missing parents
>> 15/10/16 17:40:55 INFO MemoryStore: ensureFreeSpace(1984) called with curMem=6445, maxMem=556038881
>> 15/10/16 17:40:55 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 1984.0 B, free 530.3 MB)
>> 15/10/16 17:40:55 INFO MemoryStore: ensureFreeSpace(1244) called with curMem=8429, maxMem=556038881
>> 15/10/16 17:40:55 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1244.0 B, free 530.3 MB)
>> 15/10/16 17:40:55 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:53260 (size: 1244.0 B, free: 530.3 MB)
>> 15/10/16 17:40:55 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:861
>> 15/10/16 17:40:55 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[5] at map at repro.scala:48)
>> 15/10/16 17:40:55 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
>> 15/10/16 17:40:55 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, localhost, PROCESS_LOCAL, 2160 bytes)
>> 15/10/16 17:40:55 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
>> 15/10/16 17:40:55 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
>> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException
>>   at repro.Repro$$anonfun$main$2.apply$mcZI$sp(repro.scala:49)
>>   at repro.Repro$$anonfun$main$2.apply(repro.scala:48)
>>   at repro.Repro$$anonfun$main$2.apply(repro.scala:48)
>>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>   at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>>   at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>>   at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>>   at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>>   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>>   at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>>   at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>>   at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:905)
>>   at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:905)
>>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1848)
>>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1848)
>>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>   at org.apache.spark.scheduler.Task.run(Task.scala:88)
>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:745)
>> 15/10/16 17:40:55 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
>> java.lang.ClassNotFoundException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException
>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>   at java.lang.Class.forName0(Native Method)
>>   at java.lang.Class.forName(Class.java:274)
>>   at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>>   at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>>   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>   at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:167)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>   at java.lang.reflect.Method.invoke(Method.java:606)
>>   at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1897)
>>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1997)
>>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1921)
>>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>   at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
>>   at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
>>   at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
>>   at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>>   at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
>>   at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>>   at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:745)
>> 15/10/16 17:40:55 ERROR TaskResultGetter: Could not deserialize TaskEndReason: ClassNotFound with classloader org.apache.spark.util.MutableURLClassLoader@7f08a6b1
>> 15/10/16 17:40:55 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, localhost): UnknownReason
>> 15/10/16 17:40:55 ERROR TaskSetManager: Task 0 in stage 2.0 failed 1 times; aborting job
>> 15/10/16 17:40:55 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
>> 15/10/16 17:40:55 INFO TaskSchedulerImpl: Cancelling stage 2
>> 15/10/16 17:40:55 INFO DAGScheduler: ResultStage 2 (collect at repro.scala:53) failed in 0.022 s
>> 15/10/16 17:40:55 INFO DAGScheduler: Job 2 failed: collect at repro.scala:53, took 0.030159 s
>> ##### NO.
>>
>> 15/10/16 17:40:55 INFO SparkContext: Invoking stop() from shutdown hook
>> 15/10/16 17:40:55 INFO SparkUI: Stopped Spark web UI at http://10.0.1.97:4040
>> 15/10/16 17:40:55 INFO DAGScheduler: Stopping DAGScheduler
>> 15/10/16 17:40:55 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
>> 15/10/16 17:40:55 INFO MemoryStore: MemoryStore cleared
>> 15/10/16 17:40:55 INFO BlockManager: BlockManager stopped
>> 15/10/16 17:40:55 INFO BlockManagerMaster: BlockManagerMaster stopped
>> 15/10/16 17:40:55 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
>> 15/10/16 17:40:55 INFO SparkContext: Successfully stopped SparkContext
>> 15/10/16 17:40:55 INFO ShutdownHookManager: Shutdown hook called
>> 15/10/16 17:40:55 INFO ShutdownHookManager: Deleting directory /tmp/spark-ae2e1831-eb79-4431-bb49-81afbce963d7
>>
>> The source code:
>>
>> package repro
>>
>> import java.sql.DriverManager
>> import org.apache.spark.SparkContext
>> import org.apache.spark.SparkConf
>> import org.apache.spark.SparkException
>>
>> object Repro {
>>   def main(args: Array[String]) {
>>     val conf = new SparkConf().setAppName("MySQL ClassNotFound Repro")
>>     val sc = new SparkContext(conf)
>>
>>     println("\n\n\n")
>>     println("##### Does the executor have MySQLSyntaxErrorException?")
>>     var rdd = sc.parallelize(List(1)).map { x =>
>>       try {
>>         Class.forName("com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException")
>>         true
>>       } catch {
>>         case e: ClassNotFoundException => false
>>       }
>>     }
>>     if (rdd.collect().toList == List(true)) {
>>       println("##### Yes.")
>>     } else {
>>       println("##### NO.")
>>     }
>>
>>     println("\n\n\n")
>>     println("##### Can the executor catch MySQLSyntaxErrorException?")
>>     rdd = sc.parallelize(List(1)).map { x =>
>>       try {
>>         throw new com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException()
>>       } catch {
>>         case e: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException => false
>>       }
>>       true
>>     }
>>     if (rdd.collect().toList == List(true)) {
>>       println("##### Yes.")
>>     } else {
>>       println("##### NO.")
>>     }
>>
>>     println("\n\n\n")
>>     println("##### Can the driver recover from MySQLSyntaxErrorException?")
>>     rdd = sc.parallelize(List(1)).map { x =>
>>       throw new com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException()
>>       true
>>     }
>>     try {
>>       rdd.collect()
>>     } catch {
>>       case e: SparkException => {
>>         if (e.getMessage.contains("UnknownReason")) {
>>           println("##### NO.")
>>         } else {
>>           println("##### Yes.")
>>         }
>>       }
>>     }
>>
>>     println("\n\n\n")
>>   }
>> }
>>
>> And the build.sbt:
>>
>> lazy val root = (project in file(".")).
>>   settings(
>>     organization := "com.repro",
>>     version := "0.0.1",
>>     scalaVersion := "2.10.5",
>>     name := "repro",
>>     libraryDependencies ++= Seq(
>>       "org.apache.spark" %% "spark-core" % "1.5.1" % "provided",
>>       "org.apache.spark" %% "spark-sql" % "1.5.1" % "provided",
>>       "org.apache.spark" %% "spark-streaming" % "1.5.1" % "provided",
>>       "mysql" % "mysql-connector-java" % "5.1.34"
>>     )
>>   )
>> And here are the results when run against Spark 1.4.1 (build.sbt has been updated accordingly):
>>
>> $ ./bin/spark-submit --master local --deploy-mode client --class repro.Repro /home/nix/repro/target/scala-2.10/repro-assembly-0.0.1.jar
>> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
>> 15/10/16 17:42:40 INFO SparkContext: Running Spark version 1.4.1
>> 15/10/16 17:42:40 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> 15/10/16 17:42:40 WARN Utils: Your hostname, choochootrain resolves to a loopback address: 127.0.1.1; using 10.0.1.97 instead (on interface wlan0)
>> 15/10/16 17:42:40 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
>> 15/10/16 17:42:40 INFO SecurityManager: Changing view acls to: root
>> 15/10/16 17:42:40 INFO SecurityManager: Changing modify acls to: root
>> 15/10/16 17:42:40 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
>> 15/10/16 17:42:40 INFO Slf4jLogger: Slf4jLogger started
>> 15/10/16 17:42:40 INFO Remoting: Starting remoting
>> 15/10/16 17:42:40 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.0.1.97:48333]
>> 15/10/16 17:42:41 INFO Utils: Successfully started service 'sparkDriver' on port 48333.
>> 15/10/16 17:42:41 INFO SparkEnv: Registering MapOutputTracker
>> 15/10/16 17:42:41 INFO SparkEnv: Registering BlockManagerMaster
>> 15/10/16 17:42:41 INFO DiskBlockManager: Created local directory at /tmp/spark-e2946d3d-6630-456f-b6a7-bc1583c1362a/blockmgr-9f9e70fc-62f7-4308-8b7d-c1b4127241f3
>> 15/10/16 17:42:41 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
>> 15/10/16 17:42:41 INFO HttpFileServer: HTTP File server directory is /tmp/spark-e2946d3d-6630-456f-b6a7-bc1583c1362a/httpd-fa89fece-1832-4a31-bdb5-8c9b7761b4e6
>> 15/10/16 17:42:41 INFO HttpServer: Starting HTTP Server
>> 15/10/16 17:42:41 INFO Utils: Successfully started service 'HTTP file server' on port 49551.
>> 15/10/16 17:42:41 INFO SparkEnv: Registering OutputCommitCoordinator
>> 15/10/16 17:42:41 INFO Utils: Successfully started service 'SparkUI' on port 4040.
>> 15/10/16 17:42:41 INFO SparkUI: Started SparkUI at http://10.0.1.97:4040
>> 15/10/16 17:42:41 INFO SparkContext: Added JAR file:/home/nix/repro/target/scala-2.10/repro-assembly-0.0.1.jar at http://10.0.1.97:49551/jars/repro-assembly-0.0.1.jar with timestamp 1445042561260
>> 15/10/16 17:42:41 INFO Executor: Starting executor ID driver on host localhost
>> 15/10/16 17:42:41 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42805.
>> 15/10/16 17:42:41 INFO NettyBlockTransferService: Server created on 42805
>> 15/10/16 17:42:41 INFO BlockManagerMaster: Trying to register BlockManager
>> 15/10/16 17:42:41 INFO BlockManagerMasterEndpoint: Registering block manager localhost:42805 with 265.4 MB RAM, BlockManagerId(driver, localhost, 42805)
>> 15/10/16 17:42:41 INFO BlockManagerMaster: Registered BlockManager
>>
>> ##### Does the executor have MySQLSyntaxErrorException?
>> 15/10/16 17:42:41 INFO SparkContext: Starting job: collect at repro.scala:24
>> 15/10/16 17:42:41 INFO DAGScheduler: Got job 0 (collect at repro.scala:24) with 1 output partitions (allowLocal=false)
>> 15/10/16 17:42:41 INFO DAGScheduler: Final stage: ResultStage 0(collect at repro.scala:24)
>> 15/10/16 17:42:41 INFO DAGScheduler: Parents of final stage: List()
>> 15/10/16 17:42:41 INFO DAGScheduler: Missing parents: List()
>> 15/10/16 17:42:41 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at repro.scala:16), which has no missing parents
>> 15/10/16 17:42:41 INFO MemoryStore: ensureFreeSpace(1976) called with curMem=0, maxMem=278302556
>> 15/10/16 17:42:41 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1976.0 B, free 265.4 MB)
>> 15/10/16 17:42:41 INFO MemoryStore: ensureFreeSpace(1243) called with curMem=1976, maxMem=278302556
>> 15/10/16 17:42:41 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1243.0 B, free 265.4 MB)
>> 15/10/16 17:42:41 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:42805 (size: 1243.0 B, free: 265.4 MB)
>> 15/10/16 17:42:41 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:874
>> 15/10/16 17:42:41 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at repro.scala:16)
>> 15/10/16 17:42:41 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
>> 15/10/16 17:42:41 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1444 bytes)
>> 15/10/16 17:42:41 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
>> 15/10/16 17:42:41 INFO Executor: Fetching http://10.0.1.97:49551/jars/repro-assembly-0.0.1.jar with timestamp 1445042561260
>> 15/10/16 17:42:41 INFO Utils: Fetching http://10.0.1.97:49551/jars/repro-assembly-0.0.1.jar to /tmp/spark-e2946d3d-6630-456f-b6a7-bc1583c1362a/userFiles-9d259a88-ed7a-40c3-a4bc-8e8fa489e8c9/fetchFileTemp4944656807834781951.tmp
>> 15/10/16 17:42:41 INFO Executor: Adding file:/tmp/spark-e2946d3d-6630-456f-b6a7-bc1583c1362a/userFiles-9d259a88-ed7a-40c3-a4bc-8e8fa489e8c9/repro-assembly-0.0.1.jar to class loader
>> 15/10/16 17:42:41 INFO Executor: Fetching http://10.0.1.97:49551/jars/mysql-connector-java-5.1.34.jar with timestamp 1445042561466
>> 15/10/16 17:42:41 INFO Utils: Fetching http://10.0.1.97:49551/jars/mysql-connector-java-5.1.34.jar to /tmp/spark-e2946d3d-6630-456f-b6a7-bc1583c1362a/userFiles-9d259a88-ed7a-40c3-a4bc-8e8fa489e8c9/fetchFileTemp7552804998517503805.tmp
>> 15/10/16 17:42:41 INFO Executor: Adding file:/tmp/spark-e2946d3d-6630-456f-b6a7-bc1583c1362a/userFiles-9d259a88-ed7a-40c3-a4bc-8e8fa489e8c9/mysql-connector-java-5.1.34.jar to class loader
>> 15/10/16 17:42:41 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 604 bytes result sent to driver
>> 15/10/16 17:42:41 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 116 ms on localhost (1/1)
>> 15/10/16 17:42:41 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
>> 15/10/16 17:42:41 INFO DAGScheduler: ResultStage 0 (collect at repro.scala:24) finished in 0.123 s
>> 15/10/16 17:42:41 INFO DAGScheduler: Job 0 finished: collect at repro.scala:24, took 0.231604 s
>> ##### Yes.
>>
>> ##### Can the executor catch MySQLSyntaxErrorException?
>> 15/10/16 17:42:41 INFO SparkContext: Starting job: collect at repro.scala:40
>> 15/10/16 17:42:41 INFO DAGScheduler: Got job 1 (collect at repro.scala:40) with 1 output partitions (allowLocal=false)
>> 15/10/16 17:42:41 INFO DAGScheduler: Final stage: ResultStage 1(collect at repro.scala:40)
>> 15/10/16 17:42:41 INFO DAGScheduler: Parents of final stage: List()
>> 15/10/16 17:42:41 INFO DAGScheduler: Missing parents: List()
>> 15/10/16 17:42:41 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[3] at map at repro.scala:32), which has no missing parents
>> 15/10/16 17:42:41 INFO MemoryStore: ensureFreeSpace(1984) called with curMem=3219, maxMem=278302556
>> 15/10/16 17:42:41 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 1984.0 B, free 265.4 MB)
>> 15/10/16 17:42:41 INFO MemoryStore: ensureFreeSpace(1242) called with curMem=5203, maxMem=278302556
>> 15/10/16 17:42:41 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1242.0 B, free 265.4 MB)
>> 15/10/16 17:42:41 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:42805 (size: 1242.0 B, free: 265.4 MB)
>> 15/10/16 17:42:41 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:874
>> 15/10/16 17:42:41 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at map at repro.scala:32)
>> 15/10/16 17:42:41 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
>> 15/10/16 17:42:41 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, PROCESS_LOCAL, 1444 bytes)
>> 15/10/16 17:42:41 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
>> 15/10/16 17:42:41 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 604 bytes result sent to driver
>> 15/10/16 17:42:41 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 20 ms on localhost (1/1)
>> 15/10/16 17:42:41 INFO DAGScheduler: ResultStage 1 (collect at repro.scala:40) finished in 0.020 s
>> 15/10/16 17:42:41 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
>> 15/10/16 17:42:41 INFO DAGScheduler: Job 1 finished: collect at repro.scala:40, took 0.029509 s
>> ##### Yes.
>>
>> ##### Can the driver recover from MySQLSyntaxErrorException?
>> 15/10/16 17:42:41 INFO SparkContext: Starting job: collect at repro.scala:53
>> 15/10/16 17:42:41 INFO DAGScheduler: Got job 2 (collect at repro.scala:53) with 1 output partitions (allowLocal=false)
>> 15/10/16 17:42:41 INFO DAGScheduler: Final stage: ResultStage 2(collect at repro.scala:53)
>> 15/10/16 17:42:41 INFO DAGScheduler: Parents of final stage: List()
>> 15/10/16 17:42:41 INFO DAGScheduler: Missing parents: List()
>> 15/10/16 17:42:41 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[5] at map at repro.scala:48), which has no missing parents
>> 15/10/16 17:42:41 INFO MemoryStore: ensureFreeSpace(1984) called with curMem=6445, maxMem=278302556
>> 15/10/16 17:42:41 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 1984.0 B, free 265.4 MB)
>> 15/10/16 17:42:41 INFO MemoryStore: ensureFreeSpace(1244) called with curMem=8429, maxMem=278302556
>> 15/10/16 17:42:41 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1244.0 B, free 265.4 MB)
>> 15/10/16 17:42:41 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:42805 (size: 1244.0 B, free: 265.4 MB)
>> 15/10/16 17:42:41 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:874
>> 15/10/16 17:42:41 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[5] at map at repro.scala:48)
>> 15/10/16 17:42:41 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
>> 15/10/16 17:42:41 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, localhost, PROCESS_LOCAL, 1444 bytes)
>> 15/10/16 17:42:41 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
>> 15/10/16 17:42:41 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
>> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException
>>   at repro.Repro$$anonfun$main$2.apply$mcZI$sp(repro.scala:49)
>>   at repro.Repro$$anonfun$main$2.apply(repro.scala:48)
>>   at repro.Repro$$anonfun$main$2.apply(repro.scala:48)
>>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>   at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>>   at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>>   at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>>   at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>>   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>>   at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>>   at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>>   at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
>>   at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
>>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
>>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
>>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
>>   at org.apache.spark.scheduler.Task.run(Task.scala:70)
>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:745)
>> 15/10/16 17:42:41 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, localhost): com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException
>>   at repro.Repro$$anonfun$main$2.apply$mcZI$sp(repro.scala:49)
>>   at repro.Repro$$anonfun$main$2.apply(repro.scala:48)
>>   at repro.Repro$$anonfun$main$2.apply(repro.scala:48)
>>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>   at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>>   at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>>   at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>>   at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>>   at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>>   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>>   at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>>   at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>>   at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
>>   at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
>>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
>>   at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1767)
>>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
>>   at org.apache.spark.scheduler.Task.run(Task.scala:70)
>>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:745)
>>
>> 15/10/16 17:42:41 ERROR TaskSetManager: Task 0 in stage 2.0 failed 1 times; aborting job
>> 15/10/16 17:42:41 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
>> 15/10/16 17:42:41 INFO TaskSchedulerImpl: Cancelling stage 2
>> 15/10/16 17:42:41 INFO DAGScheduler: ResultStage 2 (collect at repro.scala:53) failed in 0.016 s
>> 15/10/16 17:42:41 INFO DAGScheduler: Job 2 failed: collect at repro.scala:53, took 0.024584 s
>> ##### Yes.
>>
>> 15/10/16 17:42:41 INFO SparkContext: Invoking stop() from shutdown hook
>> 15/10/16 17:42:42 INFO SparkUI: Stopped Spark web UI at http://10.0.1.97:4040
>> 15/10/16 17:42:42 INFO DAGScheduler: Stopping DAGScheduler
>> 15/10/16 17:42:42 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
>> 15/10/16 17:42:42 INFO Utils: path = /tmp/spark-e2946d3d-6630-456f-b6a7-bc1583c1362a/blockmgr-9f9e70fc-62f7-4308-8b7d-c1b4127241f3, already present as root for deletion.
>> 15/10/16 17:42:42 INFO MemoryStore: MemoryStore cleared
>> 15/10/16 17:42:42 INFO BlockManager: BlockManager stopped
>> 15/10/16 17:42:42 INFO BlockManagerMaster: BlockManagerMaster stopped
>> 15/10/16 17:42:42 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
>> 15/10/16 17:42:42 INFO SparkContext: Successfully stopped SparkContext
>> 15/10/16 17:42:42 INFO Utils: Shutdown hook called
>> 15/10/16 17:42:42 INFO Utils: Deleting directory /tmp/spark-e2946d3d-6630-456f-b6a7-bc1583c1362a
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>
> --
> Best Regards
>
> Jeff Zhang