Github user michalsenkyr commented on the issue:

https://github.com/apache/spark/pull/16240

Added support for arbitrary sequences. Queues, ArrayBuffers and similar collections can now be used in Datasets (all are serialized into ArrayType).

To do this I had to alter the existing implicit encoders in `SQLImplicits` and add new ones. The new encoders cover types that are both a `Seq` and a `Product` (essentially only `List`), which disambiguates between the `Seq` and `Product` encoders.

However, I ran into a problem with implicits. When constructing a complex Dataset using `Seq.toDS` that includes a `Product` (like a case class) and a sequence, the encoder does not seem to be created. Constructing the Dataset with `spark.createDataset`, or transforming an existing Dataset, works without problems.

I added a workaround by defining a specific implicit just for `Seq`s. This makes the problem go away for existing usages; however, other collections cannot be constructed by `Seq.toDS` unless `newProductSeqEncoder[A, T]` is created with the correct type parameters. If anybody knows how to fix this, let me know.
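For anyone who wants to experiment with the `Seq` vs `Product` overlap outside of Spark, here is a minimal plain-Scala sketch. It is not the code from this PR and uses a different technique (a low-priority implicit trait) rather than a dedicated `Seq`-with-`Product` encoder. The names `Enc`, `LowPriorityEncs`, `productEnc`, `seqEnc`, `Point` and `EncoderPriorityDemo` are hypothetical stand-ins, not Spark APIs.

```scala
import scala.collection.mutable

// Hypothetical toy type class standing in for an encoder; not Spark's Encoder.
trait Enc[T] { def name: String }

// Implicits inherited from a base trait lose to those defined directly in the
// companion object when both are applicable for the same type.
trait LowPriorityEncs {
  // Stand-in for a generic encoder for Product types (e.g. case classes).
  implicit def productEnc[T <: Product]: Enc[T] =
    new Enc[T] { val name = "product" }
}

object Enc extends LowPriorityEncs {
  // Stand-in for a generic encoder for any sequence type.
  implicit def seqEnc[T <: scala.collection.Seq[_]]: Enc[T] =
    new Enc[T] { val name = "seq" }
}

final case class Point(x: Int, y: Int)

object EncoderPriorityDemo {
  // Reports which implicit encoder the compiler selected for T.
  def encoderFor[T](value: T)(implicit enc: Enc[T]): String = enc.name

  def main(args: Array[String]): Unit = {
    println(encoderFor(Point(1, 2)))            // product
    println(encoderFor(mutable.Queue(1, 2, 3))) // seq
    // On Scala 2.11/2.12, List is statically both a Seq and a Product, so
    // both implicits apply; the one in the companion object wins.
    println(encoderFor(List(1, 2, 3)))          // seq
  }
}
```

With this layout a `List` resolves to the sequence encoder without a dedicated overload. Whether the same approach would also fix the `Seq.toDS` case described above is an open question; the sketch only illustrates the ambiguity itself.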