[ https://issues.apache.org/jira/browse/SPARK-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327992#comment-15327992 ]
Pete Robbins commented on SPARK-15822:
--------------------------------------

So this does seem to cause the NPE or SEGV intermittently, i.e. I get some clean runs. However, I added some tracing to detect when the UnsafeRow looks corrupt (baseObject = null, offset = massive), and I see these in every run, so I suspect there is always corruption but it doesn't always lead to a visible failure. The app usually gives the appearance of success because Spark resubmits the lost tasks and restarts the failing executors.

Here is what I think is the plan associated with one of the failing jobs:

== Parsed Logical Plan == 'Project [unresolvedalias('Origin, None), unresolvedalias('UniqueCarrier, None), 'round((('count * 100) / 'total), 2) AS rank#927] +- Project [Origin#16, UniqueCarrier#8, count#888L, total#851L] +- Join Inner, ((Origin#16 = Origin#909) && (UniqueCarrier#8 = UniqueCarrier#901)) :- Aggregate [Origin#16, UniqueCarrier#8], [Origin#16, UniqueCarrier#8, count(1) AS count#888L] : +- Filter (NOT (Cancelled#21 = 0) && (CancellationCode#22 = A)) : +- Filter (Dest#17 = ORD) : +- Relation[Year#0,Month#1,DayofMonth#2,DayOfWeek#3,DepTime#4,CRSDepTime#5,ArrTime#6,CRSArrTime#7,UniqueCarrier#8,FlightNum#9,TailNum#10,ActualElapsedTime#11,CRSElapsedTime#12,AirTime#13,ArrDelay#14,DepDelay#15,Origin#16,Dest#17,Distance#18,TaxiIn#19,TaxiOut#20,Cancelled#21,CancellationCode#22,Diverted#23,CarrierDelay#24,WeatherDelay#25,NASDelay#26,SecurityDelay#27,LateAircraftDelay#28] csv +- Project [Origin#909, UniqueCarrier#901, count#846L AS total#851L] +- Aggregate [Origin#909, UniqueCarrier#901], [Origin#909, UniqueCarrier#901, count(1) AS count#846L] +- Filter (Dest#910 = ORD) +- 
Relation[Year#893,Month#894,DayofMonth#895,DayOfWeek#896,DepTime#897,CRSDepTime#898,ArrTime#899,CRSArrTime#900,UniqueCarrier#901,FlightNum#902,TailNum#903,ActualElapsedTime#904,CRSElapsedTime#905,AirTime#906,ArrDelay#907,DepDelay#908,Origin#909,Dest#910,Distance#911,TaxiIn#912,TaxiOut#913,Cancelled#914,CancellationCode#915,Diverted#916,CarrierDelay#917,WeatherDelay#918,NASDelay#919,SecurityDelay#920,LateAircraftDelay#921] csv == Analyzed Logical Plan == Origin: string, UniqueCarrier: string, rank: double Project [Origin#16, UniqueCarrier#8, round((cast((count#888L * cast(100 as bigint)) as double) / cast(total#851L as double)), 2) AS rank#927] +- Project [Origin#16, UniqueCarrier#8, count#888L, total#851L] +- Join Inner, ((Origin#16 = Origin#909) && (UniqueCarrier#8 = UniqueCarrier#901)) :- Aggregate [Origin#16, UniqueCarrier#8], [Origin#16, UniqueCarrier#8, count(1) AS count#888L] : +- Filter (NOT (Cancelled#21 = 0) && (CancellationCode#22 = A)) : +- Filter (Dest#17 = ORD) : +- Relation[Year#0,Month#1,DayofMonth#2,DayOfWeek#3,DepTime#4,CRSDepTime#5,ArrTime#6,CRSArrTime#7,UniqueCarrier#8,FlightNum#9,TailNum#10,ActualElapsedTime#11,CRSElapsedTime#12,AirTime#13,ArrDelay#14,DepDelay#15,Origin#16,Dest#17,Distance#18,TaxiIn#19,TaxiOut#20,Cancelled#21,CancellationCode#22,Diverted#23,CarrierDelay#24,WeatherDelay#25,NASDelay#26,SecurityDelay#27,LateAircraftDelay#28] csv +- Project [Origin#909, UniqueCarrier#901, count#846L AS total#851L] +- Aggregate [Origin#909, UniqueCarrier#901], [Origin#909, UniqueCarrier#901, count(1) AS count#846L] +- Filter (Dest#910 = ORD) +- 
Relation[Year#893,Month#894,DayofMonth#895,DayOfWeek#896,DepTime#897,CRSDepTime#898,ArrTime#899,CRSArrTime#900,UniqueCarrier#901,FlightNum#902,TailNum#903,ActualElapsedTime#904,CRSElapsedTime#905,AirTime#906,ArrDelay#907,DepDelay#908,Origin#909,Dest#910,Distance#911,TaxiIn#912,TaxiOut#913,Cancelled#914,CancellationCode#915,Diverted#916,CarrierDelay#917,WeatherDelay#918,NASDelay#919,SecurityDelay#920,LateAircraftDelay#921] csv == Optimized Logical Plan == Project [Origin#16, UniqueCarrier#8, round((cast((count#888L * 100) as double) / cast(total#851L as double)), 2) AS rank#927] +- Join Inner, ((Origin#16 = Origin#909) && (UniqueCarrier#8 = UniqueCarrier#901)) :- Aggregate [Origin#16, UniqueCarrier#8], [Origin#16, UniqueCarrier#8, count(1) AS count#888L] : +- Project [UniqueCarrier#8, Origin#16] : +- Filter (((((isnotnull(UniqueCarrier#8) && isnotnull(Origin#16)) && isnotnull(Cancelled#21)) && isnotnull(CancellationCode#22)) && NOT (Cancelled#21 = 0)) && (CancellationCode#22 = A)) : +- InMemoryRelation [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, Diverted#23, CarrierDelay#24, WeatherDelay#25, NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas) : : +- Filter (isnotnull(Dest#17) && (Dest#17 = ORD)) : : +- InMemoryTableScan [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, Diverted#23, CarrierDelay#24, WeatherDelay#25, NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], [isnotnull(Dest#17), 
(Dest#17 = ORD)] : : : +- InMemoryRelation [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, Diverted#23, CarrierDelay#24, WeatherDelay#25, NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas) : : : : +- Scan csv [Year#0,Month#1,DayofMonth#2,DayOfWeek#3,DepTime#4,CRSDepTime#5,ArrTime#6,CRSArrTime#7,UniqueCarrier#8,FlightNum#9,TailNum#10,ActualElapsedTime#11,CRSElapsedTime#12,AirTime#13,ArrDelay#14,DepDelay#15,Origin#16,Dest#17,Distance#18,TaxiIn#19,TaxiOut#20,Cancelled#21,CancellationCode#22,Diverted#23,CarrierDelay#24,WeatherDelay#25,NASDelay#26,SecurityDelay#27,LateAircraftDelay#28] Format: CSV, InputPaths: file:/home/robbins/brandberry/2008.csv, PushedFilters: [], ReadSchema: struct<Year:int,Month:int,DayofMonth:int,DayOfWeek:int,DepTime:string,CRSDepTime:int,ArrTime:stri... 
+- Aggregate [Origin#909, UniqueCarrier#901], [Origin#909, UniqueCarrier#901, count(1) AS total#851L] +- Project [UniqueCarrier#901, Origin#909] +- Filter (isnotnull(Origin#909) && isnotnull(UniqueCarrier#901)) +- InMemoryRelation [Year#893, Month#894, DayofMonth#895, DayOfWeek#896, DepTime#897, CRSDepTime#898, ArrTime#899, CRSArrTime#900, UniqueCarrier#901, FlightNum#902, TailNum#903, ActualElapsedTime#904, CRSElapsedTime#905, AirTime#906, ArrDelay#907, DepDelay#908, Origin#909, Dest#910, Distance#911, TaxiIn#912, TaxiOut#913, Cancelled#914, CancellationCode#915, Diverted#916, CarrierDelay#917, WeatherDelay#918, NASDelay#919, SecurityDelay#920, LateAircraftDelay#921], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas) : +- Filter (isnotnull(Dest#17) && (Dest#17 = ORD)) : +- InMemoryTableScan [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, Diverted#23, CarrierDelay#24, WeatherDelay#25, NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], [isnotnull(Dest#17), (Dest#17 = ORD)] : : +- InMemoryRelation [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, Diverted#23, CarrierDelay#24, WeatherDelay#25, NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas) : : : +- Scan csv 
[Year#0,Month#1,DayofMonth#2,DayOfWeek#3,DepTime#4,CRSDepTime#5,ArrTime#6,CRSArrTime#7,UniqueCarrier#8,FlightNum#9,TailNum#10,ActualElapsedTime#11,CRSElapsedTime#12,AirTime#13,ArrDelay#14,DepDelay#15,Origin#16,Dest#17,Distance#18,TaxiIn#19,TaxiOut#20,Cancelled#21,CancellationCode#22,Diverted#23,CarrierDelay#24,WeatherDelay#25,NASDelay#26,SecurityDelay#27,LateAircraftDelay#28] Format: CSV, InputPaths: file:/home/robbins/brandberry/2008.csv, PushedFilters: [], ReadSchema: struct<Year:int,Month:int,DayofMonth:int,DayOfWeek:int,DepTime:string,CRSDepTime:int,ArrTime:stri... == Physical Plan == Project [Origin#16, UniqueCarrier#8, round((cast((count#888L * 100) as double) / cast(total#851L as double)), 2) AS rank#927] +- BroadcastHashJoin [Origin#16, UniqueCarrier#8], [Origin#909, UniqueCarrier#901], Inner, BuildRight :- HashAggregate(key=[Origin#16,UniqueCarrier#8], functions=[count(1)], output=[Origin#16,UniqueCarrier#8,count#888L]) : +- Exchange hashpartitioning(Origin#16, UniqueCarrier#8, 200) : +- HashAggregate(key=[Origin#16,UniqueCarrier#8], functions=[partial_count(1)], output=[Origin#16,UniqueCarrier#8,count#1342L]) : +- Project [UniqueCarrier#8, Origin#16] : +- Filter (((((isnotnull(UniqueCarrier#8) && isnotnull(Origin#16)) && isnotnull(Cancelled#21)) && isnotnull(CancellationCode#22)) && NOT (Cancelled#21 = 0)) && (CancellationCode#22 = A)) : +- InMemoryTableScan [UniqueCarrier#8, Origin#16, Cancelled#21, CancellationCode#22], [isnotnull(UniqueCarrier#8), isnotnull(Origin#16), isnotnull(Cancelled#21), isnotnull(CancellationCode#22), NOT (Cancelled#21 = 0), (CancellationCode#22 = A)] : : +- InMemoryRelation [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, Diverted#23, CarrierDelay#24, WeatherDelay#25, 
NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas) : : : +- Filter (isnotnull(Dest#17) && (Dest#17 = ORD)) : : : +- InMemoryTableScan [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, Diverted#23, CarrierDelay#24, WeatherDelay#25, NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], [isnotnull(Dest#17), (Dest#17 = ORD)] : : : : +- InMemoryRelation [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, Diverted#23, CarrierDelay#24, WeatherDelay#25, NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas) : : : : : +- Scan csv [Year#0,Month#1,DayofMonth#2,DayOfWeek#3,DepTime#4,CRSDepTime#5,ArrTime#6,CRSArrTime#7,UniqueCarrier#8,FlightNum#9,TailNum#10,ActualElapsedTime#11,CRSElapsedTime#12,AirTime#13,ArrDelay#14,DepDelay#15,Origin#16,Dest#17,Distance#18,TaxiIn#19,TaxiOut#20,Cancelled#21,CancellationCode#22,Diverted#23,CarrierDelay#24,WeatherDelay#25,NASDelay#26,SecurityDelay#27,LateAircraftDelay#28] Format: CSV, InputPaths: file:/home/robbins/brandberry/2008.csv, PushedFilters: [], ReadSchema: struct<Year:int,Month:int,DayofMonth:int,DayOfWeek:int,DepTime:string,CRSDepTime:int,ArrTime:stri... 
+- BroadcastExchange HashedRelationBroadcastMode(List(input[0, string, true], input[1, string, true])) +- HashAggregate(key=[Origin#909,UniqueCarrier#901], functions=[count(1)], output=[Origin#909,UniqueCarrier#901,total#851L]) +- Exchange hashpartitioning(Origin#909, UniqueCarrier#901, 200) +- HashAggregate(key=[Origin#909,UniqueCarrier#901], functions=[partial_count(1)], output=[Origin#909,UniqueCarrier#901,count#1344L]) +- Filter (isnotnull(Origin#909) && isnotnull(UniqueCarrier#901)) +- InMemoryTableScan [UniqueCarrier#901, Origin#909], [isnotnull(Origin#909), isnotnull(UniqueCarrier#901)] : +- InMemoryRelation [Year#893, Month#894, DayofMonth#895, DayOfWeek#896, DepTime#897, CRSDepTime#898, ArrTime#899, CRSArrTime#900, UniqueCarrier#901, FlightNum#902, TailNum#903, ActualElapsedTime#904, CRSElapsedTime#905, AirTime#906, ArrDelay#907, DepDelay#908, Origin#909, Dest#910, Distance#911, TaxiIn#912, TaxiOut#913, Cancelled#914, CancellationCode#915, Diverted#916, CarrierDelay#917, WeatherDelay#918, NASDelay#919, SecurityDelay#920, LateAircraftDelay#921], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas) : : +- Filter (isnotnull(Dest#17) && (Dest#17 = ORD)) : : +- InMemoryTableScan [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, Diverted#23, CarrierDelay#24, WeatherDelay#25, NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], [isnotnull(Dest#17), (Dest#17 = ORD)] : : : +- InMemoryRelation [Year#0, Month#1, DayofMonth#2, DayOfWeek#3, DepTime#4, CRSDepTime#5, ArrTime#6, CRSArrTime#7, UniqueCarrier#8, FlightNum#9, TailNum#10, ActualElapsedTime#11, CRSElapsedTime#12, AirTime#13, ArrDelay#14, DepDelay#15, Origin#16, Dest#17, Distance#18, TaxiIn#19, TaxiOut#20, Cancelled#21, CancellationCode#22, 
Diverted#23, CarrierDelay#24, WeatherDelay#25, NASDelay#26, SecurityDelay#27, LateAircraftDelay#28], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas) : : : : +- Scan csv [Year#0,Month#1,DayofMonth#2,DayOfWeek#3,DepTime#4,CRSDepTime#5,ArrTime#6,CRSArrTime#7,UniqueCarrier#8,FlightNum#9,TailNum#10,ActualElapsedTime#11,CRSElapsedTime#12,AirTime#13,ArrDelay#14,DepDelay#15,Origin#16,Dest#17,Distance#18,TaxiIn#19,TaxiOut#20,Cancelled#21,CancellationCode#22,Diverted#23,CarrierDelay#24,WeatherDelay#25,NASDelay#26,SecurityDelay#27,LateAircraftDelay#28] Format: CSV, InputPaths: file:/home/robbins/brandberry/2008.csv, PushedFilters: [], ReadSchema: struct<Year:int,Month:int,DayofMonth:int,DayOfWeek:int,DepTime:string,CRSDepTime:int,ArrTime:stri... > segmentation violation in o.a.s.unsafe.types.UTF8String > -------------------------------------------------------- > > Key: SPARK-15822 > URL: https://issues.apache.org/jira/browse/SPARK-15822 > Project: Spark > Issue Type: Bug > Affects Versions: 2.0.0 > Environment: linux amd64 > openjdk version "1.8.0_91" > OpenJDK Runtime Environment (build 1.8.0_91-b14) > OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode) > Reporter: Pete Robbins > Assignee: Herman van Hovell > Priority: Blocker > > Executors fail with segmentation violation while running application with > spark.memory.offHeap.enabled true > spark.memory.offHeap.size 512m > Also now reproduced with > spark.memory.offHeap.enabled false > {noformat} > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x00007f4559b4d4bd, pid=14182, tid=139935319750400 > # > # JRE version: OpenJDK Runtime Environment (8.0_91-b14) (build 1.8.0_91-b14) > # Java VM: OpenJDK 64-Bit Server VM (25.91-b14 mixed mode linux-amd64 > compressed oops) > # Problematic frame: > # J 4816 C2 > org.apache.spark.unsafe.types.UTF8String.compareTo(Lorg/apache/spark/unsafe/types/UTF8String;)I > (64 bytes) @ 0x00007f4559b4d4bd 
[0x00007f4559b4d460+0x5d]
> {noformat}
> We initially saw this on IBM java on PowerPC box but is recreatable on linux
> with OpenJDK. On linux with IBM Java 8 we see a null pointer exception at the
> same code point:
> {noformat}
> 16/06/08 11:14:58 ERROR Executor: Exception in task 1.0 in stage 5.0 (TID 48)
> java.lang.NullPointerException
>     at org.apache.spark.unsafe.types.UTF8String.compareTo(UTF8String.java:831)
>     at org.apache.spark.unsafe.types.UTF8String.compare(UTF8String.java:844)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.findNextInnerJoinRows$(Unknown Source)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$doExecute$2$$anon$2.hasNext(WholeStageCodegenExec.scala:377)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30)
>     at org.spark_project.guava.collect.Ordering.leastOf(Ordering.java:664)
>     at org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37)
>     at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$30.apply(RDD.scala:1365)
>     at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$30.apply(RDD.scala:1362)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:757)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:757)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>     at org.apache.spark.scheduler.Task.run(Task.scala:85)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.lang.Thread.run(Thread.java:785)
> {noformat}
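
For anyone wanting to reproduce the tracing mentioned in the comment (flagging a row whose base object is null and whose offset is implausibly large), a check along the following lines might be a starting point. This is only a sketch and not the tracing code that was actually used: the class `RowSanityCheck`, the method `looksCorrupt`, and the threshold `MAX_PLAUSIBLE_OFFSET` are all hypothetical names chosen for illustration, and the sketch deliberately takes plain values rather than calling any Spark API. It assumes that with off-heap memory enabled a null base object can be legitimate, because the offset is then an absolute address.

```java
final class RowSanityCheck {

    // Assumed upper bound for an on-heap offset; an arbitrary illustrative value,
    // not a constant taken from Spark.
    static final long MAX_PLAUSIBLE_OFFSET = 1L << 31;

    /**
     * Returns true when a (base object, offset) pair looks suspicious.
     * The parameters stand in for the values a row reports about its backing memory.
     */
    static boolean looksCorrupt(Object baseObject, long baseOffset, boolean offHeapEnabled) {
        if (baseObject == null) {
            // Assumption: a null base object is only valid for off-heap rows,
            // where the offset is an absolute (non-zero) address.
            return !offHeapEnabled || baseOffset == 0;
        }
        // On-heap rows should point a small, non-negative distance into their backing array.
        return baseOffset < 0 || baseOffset > MAX_PLAUSIBLE_OFFSET;
    }

    public static void main(String[] args) {
        byte[] backing = new byte[64];
        // A row backed by an array with a small offset is treated as healthy.
        System.out.println(looksCorrupt(backing, 16L, false));
        // A null base object with a huge offset, while off-heap is disabled, is flagged.
        System.out.println(looksCorrupt(null, 1L << 46, false));
    }
}
```

Calling such a check just before the generated join code compares keys, and logging the task and stage IDs when it fires, would show whether corrupt rows appear even in runs that finish without a visible NPE or SEGV.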