[jira] [Commented] (SPARK-15489) Dataset kryo encoder fails on Collections$UnmodifiableCollection
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304893#comment-15304893 ]

Amit Sela commented on SPARK-15489:
-----------------------------------

The issue here is that setting the SparkConf does not propagate to the KryoSerializer used by the encoder. I managed to make this work by using Java system properties instead of SparkConf#set, since the SparkConf constructor takes them into account, but it's a hack. For now I'll update the issue description and propose this as a temporary workaround.

> Dataset kryo encoder fails on Collections$UnmodifiableCollection
> ----------------------------------------------------------------
>
>                 Key: SPARK-15489
>                 URL: https://issues.apache.org/jira/browse/SPARK-15489
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.1
>            Reporter: Amit Sela
>
> When using Encoders with kryo to encode generically typed Objects in the
> following manner:
>
>     public static <T> Encoder<T> encoder() {
>         return Encoders.kryo((Class<T>) Object.class);
>     }
>
> I get a decoding exception when trying to decode
> `java.util.Collections$UnmodifiableCollection`, which probably comes from
> Guava's `ImmutableList`.
> This happens when running with master = local[1]. The same code had no
> problems with the RDD API.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
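The system-property workaround described in this comment can be sketched as follows. This is a minimal, hedged illustration, not Spark's documented fix: the registrator class name `com.example.ImmutablesRegistrator` is hypothetical. The point it demonstrates is only that `new SparkConf()` (with `loadDefaults = true`, the default) reads every `spark.*` JVM system property, so even a SparkConf constructed internally by the encoder code path would pick the value up, unlike a value set via `SparkConf#set` on the driver's own conf:

```java
// Sketch of the workaround: set the registrator as a JVM system property
// *before* any SparkConf is constructed anywhere in the JVM.
public class RegistratorWorkaround {

    // Hypothetical class name; substitute your own KryoRegistrator.
    static final String REGISTRATOR = "com.example.ImmutablesRegistrator";

    static String configureRegistrator() {
        System.setProperty("spark.serializer",
                "org.apache.spark.serializer.KryoSerializer");
        System.setProperty("spark.kryo.registrator", REGISTRATOR);
        // Any SparkConf constructed after this point sees both settings,
        // because the SparkConf constructor loads "spark.*" system properties.
        return System.getProperty("spark.kryo.registrator");
    }

    public static void main(String[] args) {
        System.out.println(configureRegistrator());
    }
}
```

The same effect can be achieved by passing `-Dspark.kryo.registrator=...` on the JVM command line, as discussed earlier in the thread.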
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304886#comment-15304886 ]

Amit Sela commented on SPARK-15489:
-----------------------------------

Got it! So I wasn't using the custom registrator correctly; it works better like this:

```java
public class ImmutablesRegistrator implements KryoRegistrator {

    @Override
    public void registerClasses(Kryo kryo) {
        UnmodifiableCollectionsSerializer.registerSerializers(kryo);
        // Guava
        ImmutableListSerializer.registerSerializers(kryo);
        ImmutableSetSerializer.registerSerializers(kryo);
        ImmutableMapSerializer.registerSerializers(kryo);
        ImmutableMultimapSerializer.registerSerializers(kryo);
    }
}
```
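For context, a registrator like the one above is wired up through Spark configuration. A minimal configuration sketch, assuming the registrator class is on both the driver and executor classpath (the app name is just a placeholder):

```java
SparkConf conf = new SparkConf()
    .setAppName("kryo-registrator-example") // hypothetical app name
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .set("spark.kryo.registrator",
         ImmutablesRegistrator.class.getCanonicalName());
```

As the rest of this thread shows, in 1.6.x this conf-based registration reached the RDD code path but not the Kryo instance created by the Dataset encoder, which is the bug being discussed.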
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304222#comment-15304222 ]

Amit Sela commented on SPARK-15489:
-----------------------------------

I would have expected this to be related to the KryoSerializer not registering the user-provided registrator, so I added a print at https://github.com/apache/spark/blob/07c36a2f07fcf5da6fb395f830ebbfc10eb27dcc/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala#L123 to check whether any new Kryo instance is created without the provided registrator, but it seems that all instances do have a user-provided registrator (if one is provided).
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298883#comment-15298883 ]

Michael Armbrust commented on SPARK-15489:
------------------------------------------

It should run in the same JVM when running in local mode; otherwise it'll run in an executor. I think that when we construct an encoder, we should probably be passing this kind of information in.
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298847#comment-15298847 ]

Amit Sela commented on SPARK-15489:
-----------------------------------

Well, it looks like the codegen creates a `new SparkConf()` instead of deserializing a "broadcasted" one. I've tried adding the registrator configuration as a system property (-D), but it didn't take effect. Is the generated code executed in the JVM? In the same one if running standalone?
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297095#comment-15297095 ]

Michael Armbrust commented on SPARK-15489:
------------------------------------------

Wild guess... https://github.com/apache/spark/blob/07c36a2f07fcf5da6fb395f830ebbfc10eb27dcc/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala#L554

That is probably not using Spark's config correctly.
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297087#comment-15297087 ]

Amit Sela commented on SPARK-15489:
-----------------------------------

This is my registrator:

```java
public class ImmutableRegistrator implements KryoRegistrator {

    @Override
    public void registerClasses(Kryo kryo) {
        try {
            kryo.register(Class.forName("java.util.Collections$UnmodifiableCollection"),
                new UnmodifiableCollectionsSerializer());
            kryo.register(ImmutableList.class, new ImmutableListSerializer());
        } catch (ClassNotFoundException e) {
            //
        }
    }
}
```

And I register it with:

```java
conf.set("spark.kryo.registrator", ImmutableRegistrator.class.getCanonicalName());
```

When KryoSerializer.deserializeStream is called I see my registrator in the KryoSerializerInstance, but when KryoSerializer.deserialize[T: ClassTag](bytes: ByteBuffer) is called I'm not so sure; if the instance is this.ks, then no, I don't see my registrator.
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296989#comment-15296989 ]

Amit Sela commented on SPARK-15489:
-----------------------------------

In 2.0.0-SNAPSHOT it manifests as:

```
java.lang.AssertionError: assertion failed: copyAndReset must return a zero value copy
	at scala.Predef$.assert(Predef.scala:179)
	at org.apache.spark.util.AccumulatorV2.writeReplace(AccumulatorV2.scala:157)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.io.ObjectStreamClass.invokeWriteReplace(ObjectStreamClass.java:1075)
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
	...
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
	at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:240)
	at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:287)
	at org.apache.spark.sql.Dataset$$anonfun$collectAsList$1$$anonfun$apply$12.apply(Dataset.scala:2115)
	at org.apache.spark.sql.Dataset$$anonfun$collectAsList$1$$anonfun$apply$12.apply(Dataset.scala:2114)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
	at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2436)
	at org.apache.spark.sql.Dataset$$anonfun$collectAsList$1.apply(Dataset.scala:2114)
	at org.apache.spark.sql.Dataset$$anonfun$collectAsList$1.apply(Dataset.scala:2113)
	at org.apache.spark.sql.Dataset.withCallback(Dataset.scala:2449)
	at org.apache.spark.sql.Dataset.collectAsList(Dataset.scala:2113)
	...
```
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296964#comment-15296964 ]

Michael Armbrust commented on SPARK-15489:
------------------------------------------

Also, does this problem exist in the 2.0 preview?
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296963#comment-15296963 ]

Michael Armbrust commented on SPARK-15489:
------------------------------------------

Is your registration making it into the instance of kryo that we use for encoders? It's possible we aren't propagating things correctly.
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296950#comment-15296950 ]

Amit Sela commented on SPARK-15489:
-----------------------------------

I've tried registering `UnmodifiableCollectionsSerializer` and `ImmutableListSerializer` from https://github.com/magro/kryo-serializers, but no luck there.
[ https://issues.apache.org/jira/browse/SPARK-15489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296951#comment-15296951 ]

Amit Sela commented on SPARK-15489:
-----------------------------------

```
Serialization trace:
tupleTags (org.apache.beam.sdk.values.TupleTagList)
tupleTagList (org.apache.beam.sdk.transforms.join.CoGbkResultSchema)
schema (org.apache.beam.sdk.transforms.join.CoGbkResult)
value (org.apache.beam.sdk.values.KV)
value (org.apache.beam.sdk.util.WindowedValue$TimestampedValueInGlobalWindow)
	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:626)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
	at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:311)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown Source)
	at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.fromRow(ExpressionEncoder.scala:230)
	... 44 more
Caused by: java.lang.UnsupportedOperationException
	at java.util.Collections$UnmodifiableCollection.add(Collections.java:1075)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:109)
	at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
	... 61 more
```
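The root cause shown in the `Caused by` section of this trace — Kryo's default CollectionSerializer creating a collection instance and then populating it via `add()` — can be reproduced with the JDK alone. The following is only an illustration of that failure mode, not Spark or Kryo code:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

// JDK-only illustration of the failure in the trace above: Kryo's default
// CollectionSerializer populates a freshly created collection via add(), but
// Collections$UnmodifiableCollection rejects all mutators.
public class UnmodifiableAddDemo {

    static String tryAdd() {
        Collection<String> c =
            Collections.unmodifiableCollection(new ArrayList<>());
        try {
            c.add("x"); // what CollectionSerializer.read effectively attempts
            return "added";
        } catch (UnsupportedOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAdd());
    }
}
```

This is why a type-aware serializer such as kryo-serializers' UnmodifiableCollectionsSerializer is needed: it rebuilds the backing collection first and re-wraps it, instead of mutating the unmodifiable view.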