Thanks Gyula,

It kind of helped. I removed some KryoSerializers here and there and it 
started working, but I don't fully understand why yet. I will try to understand 
and reproduce it as soon as I have some spare time.

> On 17 Dec 2017, at 17:52, Gyula Fóra <gyula.f...@gmail.com> wrote:
> 
> Hi,
> I have seen similar errors when accidentally trying to serialize Kryo type 
> serializers with Flink type infos.
> 
> Maybe that helps :)
> 
> Gyula
> 
> 
> On Sun, Dec 17, 2017, 15:52 Dawid Wysakowicz <wysakowicz.da...@gmail.com> 
> wrote:
> Just as a follow-up: I tried disabling mmap with sun.zip.disableMemoryMapping, 
> but it did not help. This time I got only the Java stack:
> 
> Stack: [0x00007f9060757000,0x00007f9060858000],  sp=0x00007f9060856350,  free 
> space=1020k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> 
> [error occurred during error reporting (printing native stack), id 0xb]
> 
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  java.util.zip.Inflater.end(J)V+0
> j  java.util.zip.Inflater.end()V+29
> j  java.util.zip.ZipFile.close()V+169
> j  sun.net.www.protocol.jar.URLJarFile.close()V+18
> j  sun.net.www.protocol.jar.URLJarFile.finalize()V+1
> J 10563% C2 java.lang.ref.Finalizer$FinalizerThread.run()V (55 bytes) @ 
> 0x00007f9075be90b4 [0x00007f9075be8e00+0x2b4]
> v  ~StubRoutines::call_stub
> 
> > On 17 Dec 2017, at 15:03, Dawid Wysakowicz <wysakowicz.da...@gmail.com> 
> > wrote:
> >
> > Hi,
> >
> > Recently we have been observing regular TaskManager JVM crashes about a 
> > minute after our Flink job starts. We run Flink 1.3.2 on YARN (2.6.2.0-205). 
> > Java version:
> >
> > JRE version: Java(TM) SE Runtime Environment (8.0_112-b15) (build 
> > 1.8.0_112-b15)
> > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.112-b15 mixed mode 
> > linux-amd64 compressed oops)
> >
> > Any help with this problem would be appreciated. If you need any more info 
> > I will be happy to provide it.
> > The JVM crashes with SIGSEGV. Please see the top of the stack trace below:
> >
> > Stack: [0x00007f301a1d9000,0x00007f301a2da000],  sp=0x00007f301a2d6090,  
> > free space=1012k
> > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> > code)
> > V  [libjvm.so+0x8dfada]  Monitor::jvm_raw_lock()+0xa
> > V  [libjvm.so+0x70fe17]  JVM_RawMonitorEnter+0x27
> > C  [libzip.so+0x120f1]  ZIP_GetEntry2+0x61
> > C  [libzip.so+0x3ec0]  Java_java_util_zip_ZipFile_getEntry+0xf0
> > J 136  java.util.zip.ZipFile.getEntry(J[BZ)J (0 bytes) @ 0x00007f303c314c0e 
> > [0x00007f303c314b40+0xce]
> > J 1579 C2 
> > java.util.jar.JarFile.getJarEntry(Ljava/lang/String;)Ljava/util/jar/JarEntry;
> >  (9 bytes) @ 0x00007f303c735db8 [0x00007f303c735a40+0x378]
> > J 2321 C2 java.net.URLClassLoader$1.run()Ljava/lang/Object; (5 bytes) @ 
> > 0x00007f303ca5965c [0x00007f303ca59080+0x5dc]
> > v  ~StubRoutines::call_stub
> > V  [libjvm.so+0x690c66]  JavaCalls::call_helper(JavaValue*, methodHandle*, 
> > JavaCallArguments*, Thread*)+0x1056
> > V  [libjvm.so+0x729f2c]  JVM_DoPrivileged+0x27c
> > J 308  
> > java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;
> >  (0 bytes) @ 0x00007f303c38dd15 [0x00007f303c38dc40+0xd5]
> > J 2991 C2 
> > java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class; (47 
> > bytes) @ 0x00007f303c30f430 [0x00007f303c30f3a0+0x90]
> > J 4911 C2 
> > java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; (122 
> > bytes) @ 0x00007f303cd178f8 [0x00007f303cd16600+0x12f8]
> > j  
> > com.esotericsoftware.reflectasm.AccessClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+48
> > J 2321 C2 java.net.URLClassLoader$1.run()Ljava/lang/Object; (5 bytes) @ 
> > 0x00007f303ca5965c [0x00007f303ca59080+0x5dc]
> > v  ~StubRoutines::call_stub
> > V  [libjvm.so+0x690c66]  JavaCalls::call_helper(JavaValue*, methodHandle*, 
> > JavaCallArguments*, Thread*)+0x1056
> > V  [libjvm.so+0x729f2c]  JVM_DoPrivileged+0x27c
> > J 308  
> > java.security.AccessController.doPrivileged(Ljava/security/PrivilegedExceptionAction;Ljava/security/AccessControlContext;)Ljava/lang/Object;
> >  (0 bytes) @ 0x00007f303c38dd15 [0x00007f303c38dc40+0xd5]
> > J 2991 C2 
> > java.net.URLClassLoader.findClass(Ljava/lang/String;)Ljava/lang/Class; (47 
> > bytes) @ 0x00007f303c30f430 [0x00007f303c30f3a0+0x90]
> > J 4911 C2 
> > java.lang.ClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; (122 
> > bytes) @ 0x00007f303cd178f8 [0x00007f303cd16600+0x12f8]
> > j  
> > com.esotericsoftware.reflectasm.AccessClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class;+48
> > J 2318 C2 
> > java.lang.ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class; (7 
> > bytes) @ 0x00007f303c96db80 [0x00007f303c96d9c0+0x1c0]
> > j  
> > com.esotericsoftware.reflectasm.ConstructorAccess.get(Ljava/lang/Class;)Lcom/esotericsoftware/reflectasm/ConstructorAccess;+109
> > j  
> > com.twitter.chill.Instantiators$.reflectAsm(Ljava/lang/Class;)Lscala/util/Either;+1
> > j  
> > com.twitter.chill.KryoBase$$anonfun$newInstantiator$2.apply(Ljava/lang/Class;)Lscala/util/Either;+4
> > j  
> > com.twitter.chill.KryoBase$$anonfun$newInstantiator$2.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
> > j  
> > com.twitter.chill.Instantiators$$anonfun$newOrElse$1.apply(Lscala/Function1;)Lscala/Option;+5
> > j  
> > com.twitter.chill.Instantiators$$anonfun$newOrElse$1.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
> > j  scala.collection.Iterator$$anon$11.next()Ljava/lang/Object;+13
> > j  
> > scala.collection.Iterator$class.find(Lscala/collection/Iterator;Lscala/Function1;)Lscala/Option;+21
> > j  scala.collection.AbstractIterator.find(Lscala/Function1;)Lscala/Option;+2
> > j  
> > com.twitter.chill.Instantiators$.newOrElse(Ljava/lang/Class;Lscala/collection/TraversableOnce;Lscala/Function0;)Lorg/objenesis/instantiator/ObjectInstantiator;+25
> > j  
> > com.twitter.chill.KryoBase.newInstantiator(Ljava/lang/Class;)Lorg/objenesis/instantiator/ObjectInstantiator;+54
> > J 11193 C2 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.copy(Lcom/esotericsoftware/kryo/Kryo;Ljava/lang/Object;)Ljava/lang/Object;
> >  (91 bytes) @ 0x00007f303db5656c [0x00007f303db56160+0x40c]
> > J 6855 C2 
> > com.esotericsoftware.kryo.Kryo.copy(Ljava/lang/Object;)Ljava/lang/Object; 
> > (211 bytes) @ 0x00007f303d456494 [0x00007f303d4560c0+0x3d4]
> > j  
> > com.esotericsoftware.kryo.serializers.UnsafeCacheFields$UnsafeObjectField.copy(Ljava/lang/Object;Ljava/lang/Object;)V+34
> > J 11193 C2 
> > com.esotericsoftware.kryo.serializers.FieldSerializer.copy(Lcom/esotericsoftware/kryo/Kryo;Ljava/lang/Object;)Ljava/lang/Object;
> >  (91 bytes) @ 0x00007f303db566c8 [0x00007f303db56160+0x568]
> > J 6855 C2 
> > com.esotericsoftware.kryo.Kryo.copy(Ljava/lang/Object;)Ljava/lang/Object; 
> > (211 bytes) @ 0x00007f303d456494 [0x00007f303d4560c0+0x3d4]
> > j  
> > org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.copy(Ljava/lang/Object;)Ljava/lang/Object;+15
> > j  
> > org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(Lorg/apache/flink/api/java/tuple/Tuple;)Lorg/apache/flink/api/java/tuple/Tuple;+26
> > j  
> > org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(Ljava/lang/Object;)Ljava/lang/Object;+5
> > j  
> > org.apache.flink.runtime.state.ArrayListSerializer.copy(Ljava/util/ArrayList;)Ljava/util/ArrayList;+51
> > j  
> > org.apache.flink.runtime.state.DefaultOperatorStateBackend$PartitionableListState.<init>(Lorg/apache/flink/runtime/state/DefaultOperatorStateBackend$PartitionableListState;)V+13
> > j 
> > org.apache.flink.runtime.state.DefaultOperatorStateBackend$PartitionableListState.deepCopy()Lorg/apache/flink/runtime/state/DefaultOperatorStateBackend$PartitionableListState;+5
> > j 
> > org.apache.flink.runtime.state.DefaultOperatorStateBackend.snapshot(JJLorg/apache/flink/runtime/state/CheckpointStreamFactory;Lorg/apache/flink/runtime/checkpoint/CheckpointOptions;)Ljava/util/concurrent/RunnableFuture;+115
> > j 
> > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(JJLorg/apache/flink/runtime/checkpoint/CheckpointOptions;)Lorg/apache/flink/streaming/api/operators/OperatorSnapshotResult;+111
> > j 
> > org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(Lorg/apache/flink/streaming/api/operators/StreamOperator;)V+58
> > j  
> > org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing()V+35
> > j 
> > org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(Lorg/apache/flink/runtime/checkpoint/CheckpointMetaData;Lorg/apache/flink/runtime/checkpoint/CheckpointOptions;Lorg/apache/flink/runtime/checkpoint/CheckpointMetrics;)V+15
> > j 
> > org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(Lorg/apache/flink/runtime/checkpoint/CheckpointMetaData;Lorg/apache/flink/runtime/checkpoint/CheckpointOptions;Lorg/apache/flink/runtime/checkpoint/CheckpointMetrics;)Z+74
> > j 
> > org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(Lorg/apache/flink/runtime/checkpoint/CheckpointMetaData;Lorg/apache/flink/runtime/checkpoint/CheckpointOptions;Lorg/apache/flink/runtime/checkpoint/CheckpointMetrics;)V+4
> > j  
> > org.apache.flink.streaming.runtime.io.BarrierBuffer.notifyCheckpoint(Lorg/apache/flink/runtime/io/network/api/CheckpointBarrier;)V+73
> > j  
> > org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(Lorg/apache/flink/runtime/io/network/api/CheckpointBarrier;I)V+193
> > J 11295 C2 
> > org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput()Z
> >  (602 bytes) @ 0x00007f303de7beb4 [0x00007f303de79c00+0x22b4]
> > J 10764% C2 
> > org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run()V (23 
> > bytes) @ 0x00007f303cb3812c [0x00007f303cb38080+0xac]
> > j  org.apache.flink.streaming.runtime.tasks.StreamTask.invoke()V+221
> > j  org.apache.flink.runtime.taskmanager.Task.run()V+813
> > j  java.lang.Thread.run()V+11
> 
