[ https://issues.apache.org/jira/browse/SPARK-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331370#comment-15331370 ]

Pete Robbins commented on SPARK-15822:
--------------------------------------

Chatting with [~hvanhovell], here is the current state: I can reproduce a segv 
using local[8] on an 8-core machine. It is intermittent, but many, many runs 
with e.g. local[2] produce no issues.
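
To make the shape of the failing workload concrete, here is a minimal sketch of the kind of job that drives this code path (a string-keyed inner join followed by an ordered take, matching the frames in the dump below). The table names, keys, and sizes are illustrative assumptions, not the actual application:

{code:scala}
import org.apache.spark.sql.SparkSession

object Spark15822ReproSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[8]") // intermittent segv here; local[2] runs clean
      .config("spark.memory.offHeap.enabled", "true")
      .config("spark.memory.offHeap.size", "512m")
      .appName("SPARK-15822 repro sketch")
      .getOrCreate()
    import spark.implicits._

    // Two string-keyed tables; sizes are arbitrary, just large enough to get
    // a sort-merge join with several matching rows per key.
    val left  = (1 to 1000000).map(i => ("key-" + i % 10000, i)).toDF("k", "v")
    val right = (1 to 1000000).map(i => ("key-" + i % 10000, i)).toDF("k", "w")

    // String-keyed inner join followed by an ordered take: this exercises the
    // findNextInnerJoinRows$ -> UTF8String.compare -> Platform.getByte path,
    // with Utils.takeOrdered above it, as in the stack below.
    left.join(right, "k").orderBy("k").take(10).foreach(println)

    spark.stop()
  }
}
{code}

The segv info is: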

{noformat}
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fe8c118ca58, pid=3558, tid=140633451779840
#
# JRE version: OpenJDK Runtime Environment (8.0_91-b14) (build 1.8.0_91-b14)
# Java VM: OpenJDK 64-Bit Server VM (25.91-b14 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J 7467 C1 org.apache.spark.unsafe.Platform.getByte(Ljava/lang/Object;J)B (9 bytes) @ 0x00007fe8c118ca58 [0x00007fe8c118ca20+0x38]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0x00007fe858018800):  JavaThread "Executor task launch worker-3" daemon [_thread_in_Java, id=3698, stack(0x00007fe7c6dfd000,0x00007fe7c6efe000)]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000a09cf4

Registers:
RAX=0x00007fe884ce5828, RBX=0x00007fe884ce5828, RCX=0x00007fe81e0a5360, RDX=0x0000000000a09cf4
RSP=0x00007fe7c6efb9e0, RBP=0x00007fe7c6efba80, RSI=0x0000000000000000, RDI=0x0000000000003848
R8 =0x00000000200b94c8, R9 =0x00000000eef66bf0, R10=0x00007fe8d87a2f00, R11=0x00007fe8c118ca20
R12=0x0000000000000000, R13=0x00007fe7c6efba28, R14=0x00007fe7c6efba98, R15=0x00007fe858018800
RIP=0x00007fe8c118ca58, EFLAGS=0x0000000000010206, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
  TRAPNO=0x000000000000000e

Top of Stack: (sp=0x00007fe7c6efb9e0)
0x00007fe7c6efb9e0:   00007fe7c56941e8 0000000000000000
0x00007fe7c6efb9f0:   00007fe7c6efbab0 00007fe8c140c38c
0x00007fe7c6efba00:   00007fe8c1007d80 00000000eef66bc8
0x00007fe7c6efba10:   00007fe7c6efba80 00007fe8c1007700
0x00007fe7c6efba20:   00007fe8c1007700 0000000000a09cf4
0x00007fe7c6efba30:   0000000000000030 0000000000000000
0x00007fe7c6efba40:   00007fe7c6efba40 00007fe81e0a1f9b
0x00007fe7c6efba50:   00007fe7c6efba98 00007fe81e0a5360
0x00007fe7c6efba60:   0000000000000000 00007fe81e0a1fc0
0x00007fe7c6efba70:   00007fe7c6efba28 00007fe7c6efba90
0x00007fe7c6efba80:   00007fe7c6efbae8 00007fe8c1007700
0x00007fe7c6efba90:   0000000000000000 00000000ee4f4898
0x00007fe7c6efbaa0:   000000000000004d 00007fe7c6efbaa8
0x00007fe7c6efbab0:   00007fe81e0a42be 00007fe7c6efbb18
0x00007fe7c6efbac0:   00007fe81e0a5360 0000000000000000
0x00007fe7c6efbad0:   00007fe81e0a4338 00007fe7c6efba90
0x00007fe7c6efbae0:   00007fe7c6efbb10 00007fe7c6efbb60
0x00007fe7c6efbaf0:   00007fe8c1007a40 0000000000000000
0x00007fe7c6efbb00:   0000000000000000 0000000000000003
0x00007fe7c6efbb10:   00000000ee4f4898 00000000eef67950
0x00007fe7c6efbb20:   00007fe7c6efbb20 00007fe81e0a43f2
0x00007fe7c6efbb30:   00007fe7c6efbb78 00007fe81e0a5360
0x00007fe7c6efbb40:   0000000000000000 00007fe81e0a4418
0x00007fe7c6efbb50:   00007fe7c6efbb10 00007fe7c6efbb70
0x00007fe7c6efbb60:   00007fe7c6efbbc0 00007fe8c1007a40
0x00007fe7c6efbb70:   00000000ee4f4898 00000000eef67950
0x00007fe7c6efbb80:   00007fe7c6efbb80 00007fe7c56844e5
0x00007fe7c6efbb90:   00007fe7c6efbc28 00007fe7c5684950
0x00007fe7c6efbba0:   0000000000000000 00007fe7c5684618
0x00007fe7c6efbbb0:   00007fe7c6efbb70 00007fe7c6efbc18
0x00007fe7c6efbbc0:   00007fe7c6efbc70 00007fe8c10077d0
0x00007fe7c6efbbd0:   0000000000000000 0000000000000000 

Instructions: (pc=0x00007fe8c118ca58)
0x00007fe8c118ca38:   08 83 c7 08 89 78 08 48 b8 28 58 ce 84 e8 7f 00
0x00007fe8c118ca48:   00 81 e7 f8 3f 00 00 83 ff 00 0f 84 16 00 00 00
0x00007fe8c118ca58:   0f be 04 16 c1 e0 18 c1 f8 18 48 83 c4 30 5d 85
0x00007fe8c118ca68:   05 93 c6 85 17 c3 48 89 44 24 08 48 c7 04 24 ff 

Register to memory mapping:

RAX={method} {0x00007fe884ce5828} 'getByte' '(Ljava/lang/Object;J)B' in 'org/apache/spark/unsafe/Platform'
RBX={method} {0x00007fe884ce5828} 'getByte' '(Ljava/lang/Object;J)B' in 'org/apache/spark/unsafe/Platform'
RCX=0x00007fe81e0a5360 is pointing into metadata
RDX=0x0000000000a09cf4 is an unknown value
RSP=0x00007fe7c6efb9e0 is pointing into the stack for thread: 0x00007fe858018800
RBP=0x00007fe7c6efba80 is pointing into the stack for thread: 0x00007fe858018800
RSI=0x0000000000000000 is an unknown value
RDI=0x0000000000003848 is an unknown value
R8 =0x00000000200b94c8 is an unknown value
R9 =0x00000000eef66bf0 is an oop
[B 
 - klass: {type array byte}
 - length: 48
R10=0x00007fe8d87a2f00: <offset 0xf07f00> in /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-0.b14.el6_7.x86_64/jre/lib/amd64/server/libjvm.so at 0x00007fe8d789b000
R11=0x00007fe8c118ca20 is at entry_point+0 in (nmethod*)0x00007fe8c118c8d0
R12=0x0000000000000000 is an unknown value
R13=0x00007fe7c6efba28 is pointing into the stack for thread: 0x00007fe858018800
R14=0x00007fe7c6efba98 is pointing into the stack for thread: 0x00007fe858018800
R15=0x00007fe858018800 is a thread


Stack: [0x00007fe7c6dfd000,0x00007fe7c6efe000],  sp=0x00007fe7c6efb9e0,  free space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 7467 C1 org.apache.spark.unsafe.Platform.getByte(Ljava/lang/Object;J)B (9 bytes) @ 0x00007fe8c118ca58 [0x00007fe8c118ca20+0x38]
j  org.apache.spark.unsafe.types.UTF8String.getByte(I)B+11
j  org.apache.spark.unsafe.types.UTF8String.compareTo(Lorg/apache/spark/unsafe/types/UTF8String;)I+30
j  org.apache.spark.unsafe.types.UTF8String.compare(Lorg/apache/spark/unsafe/types/UTF8String;)I+2
j  org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.findNextInnerJoinRows$(Lorg/apache/spark/sql/catalyst/expressions/GeneratedClass$GeneratedIterator;Lscala/collection/Iterator;Lscala/collection/Iterator;)Z+141
j  org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext()V+410
J 7729 C1 org.apache.spark.sql.execution.BufferedRowIterator.hasNext()Z (30 bytes) @ 0x00007fe8c1ad80d4 [0x00007fe8c1ad7e60+0x274]
j  org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$doExecute$3$$anon$2.hasNext()Z+4
J 8582 C2 scala.collection.Iterator$$anon$11.hasNext()Z (10 bytes) @ 0x00007fe8c2506bd8 [0x00007fe8c2506760+0x478]
j  scala.collection.convert.Wrappers$IteratorWrapper.hasNext()Z+4
j  org.spark_project.guava.collect.Ordering.leastOf(Ljava/util/Iterator;I)Ljava/util/List;+132
j  org.apache.spark.util.collection.Utils$.takeOrdered(Lscala/collection/Iterator;ILscala/math/Ordering;)Lscala/collection/Iterator;+29
j  org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$30.apply(Lscala/collection/Iterator;)Lscala/collection/Iterator;+46
j  org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$30.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
j  org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(Lorg/apache/spark/TaskContext;ILscala/collection/Iterator;)Lscala/collection/Iterator;+5
j  org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+13
j  org.apache.spark.rdd.MapPartitionsRDD.compute(Lorg/apache/spark/Partition;Lorg/apache/spark/TaskContext;)Lscala/collection/Iterator;+27
j  org.apache.spark.rdd.RDD.computeOrReadCheckpoint(Lorg/apache/spark/Partition;Lorg/apache/spark/TaskContext;)Lscala/collection/Iterator;+26
j  org.apache.spark.rdd.RDD.iterator(Lorg/apache/spark/Partition;Lorg/apache/spark/TaskContext;)Lscala/collection/Iterator;+33
j  org.apache.spark.scheduler.ResultTask.runTask(Lorg/apache/spark/TaskContext;)Ljava/lang/Object;+136
j  org.apache.spark.scheduler.Task.run(JILorg/apache/spark/metrics/MetricsSystem;)Ljava/lang/Object;+82
j  org.apache.spark.executor.Executor$TaskRunner.run()V+374
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
{noformat}
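
For the record, the faulting instruction at the pc (0f be 04 16) decodes to movsx eax, byte ptr [rsi+rdx]; with RSI=0x0000000000000000 and RDX=0x0000000000a09cf4, the effective address is exactly the reported si_addr. So Platform.getByte was entered with a null base object and its long argument was used as an absolute address into an unmapped page (SEGV_MAPERR), i.e. it looks like a dangling off-heap pointer rather than a bad on-heap offset. Platform.getByte is a thin wrapper over sun.misc.Unsafe, so the failing pattern is essentially the following sketch (illustrative code, not Spark's source):

{code:scala}
import sun.misc.Unsafe

// Illustrative sketch of the Platform.getByte pattern (not Spark's exact
// source). With a null base object, Unsafe treats the long argument as an
// absolute native address, so a stale or bogus off-heap pointer faults
// exactly like the dump above.
object UnsafeGetByteSketch {
  private val unsafe: Unsafe = {
    val f = classOf[Unsafe].getDeclaredField("theUnsafe")
    f.setAccessible(true)
    f.get(null).asInstanceOf[Unsafe]
  }

  // Same shape as org.apache.spark.unsafe.Platform.getByte(Object, long).
  def getByte(base: AnyRef, offset: Long): Byte = unsafe.getByte(base, offset)

  def main(args: Array[String]): Unit = {
    val addr = unsafe.allocateMemory(48) // size arbitrary for the sketch
    unsafe.putByte(addr, 42.toByte)
    println(getByte(null, addr)) // fine: null base + valid absolute address
    unsafe.freeMemory(addr)
    // Calling getByte(null, addr) after the free is undefined behaviour: it
    // can return garbage or take the whole JVM down with SIGSEGV/SEGV_MAPERR.
  }
}
{code}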




> segmentation violation in o.a.s.unsafe.types.UTF8String 
> --------------------------------------------------------
>
>                 Key: SPARK-15822
>                 URL: https://issues.apache.org/jira/browse/SPARK-15822
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>         Environment: linux amd64
> openjdk version "1.8.0_91"
> OpenJDK Runtime Environment (build 1.8.0_91-b14)
> OpenJDK 64-Bit Server VM (build 25.91-b14, mixed mode)
>            Reporter: Pete Robbins
>            Assignee: Herman van Hovell
>            Priority: Blocker
>
> Executors fail with a segmentation violation while running an application with
> spark.memory.offHeap.enabled true
> spark.memory.offHeap.size 512m
> Also now reproduced with 
> spark.memory.offHeap.enabled false
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007f4559b4d4bd, pid=14182, tid=139935319750400
> #
> # JRE version: OpenJDK Runtime Environment (8.0_91-b14) (build 1.8.0_91-b14)
> # Java VM: OpenJDK 64-Bit Server VM (25.91-b14 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # J 4816 C2 org.apache.spark.unsafe.types.UTF8String.compareTo(Lorg/apache/spark/unsafe/types/UTF8String;)I (64 bytes) @ 0x00007f4559b4d4bd [0x00007f4559b4d460+0x5d]
> {noformat}
> We initially saw this on IBM Java on a PowerPC box, but it is recreatable on Linux with OpenJDK. On Linux with IBM Java 8 we see a null pointer exception at the same code point:
> {noformat}
> 16/06/08 11:14:58 ERROR Executor: Exception in task 1.0 in stage 5.0 (TID 48)
> java.lang.NullPointerException
>       at org.apache.spark.unsafe.types.UTF8String.compareTo(UTF8String.java:831)
>       at org.apache.spark.unsafe.types.UTF8String.compare(UTF8String.java:844)
>       at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.findNextInnerJoinRows$(Unknown Source)
>       at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>       at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>       at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$doExecute$2$$anon$2.hasNext(WholeStageCodegenExec.scala:377)
>       at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>       at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:30)
>       at org.spark_project.guava.collect.Ordering.leastOf(Ordering.java:664)
>       at org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37)
>       at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$30.apply(RDD.scala:1365)
>       at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1$$anonfun$30.apply(RDD.scala:1362)
>       at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:757)
>       at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:757)
>       at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:318)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:282)
>       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>       at org.apache.spark.scheduler.Task.run(Task.scala:85)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>       at java.lang.Thread.run(Thread.java:785)
> {noformat}



