I actually already made a pull request adding support for arbitrary
sequence types.
https://github.com/apache/spark/pull/16240
There is still a small problem: Seq.toDS does not work for those types
(I couldn't get implicits with multiple type parameters to resolve
correctly), but createDataset works fine.
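For illustration, a rough sketch of the difference, assuming a build that
includes the patch (the element type here is just an example):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Data whose element type is one of the newly supported sequence types.
    val data = Seq(List(1, 2, 3), List(4, 5))

    // Works: the encoder is resolved at the createDataset call site.
    val ds = spark.createDataset(data)

    // Does not compile yet: the implicit encoder (which involves multiple
    // type parameters) is not resolved when going through toDS.
    // val ds2 = data.toDS()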
I would be glad if you could bring some attention to it. It's my first
code-related pull request and no one has responded to it yet, so I'm
wondering whether I'm doing something wrong on that front.
Michal Senkyr
On 16.12.2016 12:04, Jakub Dubovsky wrote:
I will give that a try. Thanks!
On Fri, Dec 16, 2016 at 12:45 AM, Michael Armbrust
<mich...@databricks.com> wrote:
I would have sworn there was a ticket, but I can't find it. So
here you go: https://issues.apache.org/jira/browse/SPARK-18891
A workaround until that is fixed would be for you to manually
specify the kryo encoder
<http://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/Encoders.html#kryo%28scala.reflect.ClassTag%29>.
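For completeness, a minimal sketch of that workaround; Record is a made-up
case class, and the rest assumes a standard SparkSession:

    import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

    // Hypothetical case class containing an immutable List field.
    case class Record(id: Long, values: List[Int])

    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // Kryo serializes each object into a single binary column, so the nested
    // List never goes through the Catalyst ArrayType conversion.
    implicit val recordEncoder: Encoder[Record] = Encoders.kryo[Record]

    val ds = spark.createDataset(Seq(Record(1L, List(1, 2, 3))))
    ds.count()

The trade-off is that the data is stored as an opaque binary blob, so the
individual fields are not queryable as columns.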
On Thu, Dec 15, 2016 at 8:18 AM, Jakub Dubovsky
<spark.dubovsky.ja...@gmail.com> wrote:
Hey,
I want to ask whether there is any roadmap/plan for adding
Encoders for further types in upcoming releases of Spark. Here is
a list
<http://spark.apache.org/docs/latest/sql-programming-guide.html#data-types>
of the currently supported types. We would like to use Datasets with
our internally defined case classes containing
scala.collection.immutable.List(s). This does not work at the moment
because these lists are converted to ArrayType (Seq), and the
constructor lookup then fails with a Seq-is-not-a-List
error...
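For context, a rough sketch of the kind of definition that runs into this
(the case class is made up for illustration):

    import org.apache.spark.sql.SparkSession

    // Hypothetical case class; the List field is the problematic part.
    case class Order(id: Long, itemIds: scala.collection.immutable.List[Long])

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    val ds = spark.createDataset(Seq(Order(1L, List(10L, 20L))))

    // The schema maps itemIds to ArrayType; when deserializing, Spark hands
    // the constructor a Seq, which does not match the declared List and fails.
    ds.collect()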
This means that for now we are stuck with using RDDs.
Thanks for any insights!
Jakub Dubovsky