[jira] [Commented] (SPARK-12787) Dataset to support custom encoder

Chris Bannister (JIRA) Wed, 31 Aug 2016 04:26:47 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-12787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451964#comment-15451964
 ]


Chris Bannister commented on SPARK-12787:
-----------------------------------------

We would like to use the schema mapping provided by spark-avro to read Avro 
files as Datasets, which would make working with Avro files very easy. 
Providing an Avro Encoder appears to be simple, but this is currently 
explicitly forbidden in [0]. Would it be possible to relax the API around 
Encoders to allow experimental implementations?

[0] 
https://github.com/apache/spark/blob/12fd0cd615683cd4c3e9094ce71a1e6fc33b8d6a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/package.scala#L33
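
For context, here is a minimal sketch of the closest thing the public API 
seems to offer today, assuming Spark 2.x with a local SparkSession. 
SensorReading is a hypothetical stand-in for an Avro-generated class and is 
not part of spark-avro. Encoders.kryo is an existing public factory, but it 
stores each object as one opaque binary column, so the Avro schema is not 
visible to Spark SQL. That is the gap a custom Encoder would close.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical record type standing in for an Avro-generated class.
class SensorReading(val id: String, val value: Double) extends Serializable

object KryoEncoderSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("kryo-encoder-sketch")
      .getOrCreate()

    // Public, supported way to get an Encoder for an arbitrary class.
    // Each object is serialized with Kryo into a single binary column.
    implicit val readingEncoder: Encoder[SensorReading] =
      Encoders.kryo[SensorReading]

    val ds: Dataset[SensorReading] = spark.createDataset(
      Seq(new SensorReading("a", 1.0), new SensorReading("b", 2.5)))

    // The schema shows one binary field rather than the id/value columns
    // that a schema-aware Avro Encoder could expose.
    ds.printSchema()

    // Typed operations still work because objects are deserialized first.
    println(ds.filter(_.value > 2.0).count())

    spark.stop()
  }
}
```

This keeps type safety but gives up column pruning and SQL access to 
individual fields, which is why a schema-aware Encoder would be preferable.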
 

> Dataset to support custom encoder
> ---------------------------------
>
>                 Key: SPARK-12787
>                 URL: https://issues.apache.org/jira/browse/SPARK-12787
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Muthu Jayakumar
>
> The current Dataset API allows a Dataset to be loaded using a case class, 
> but requires the attribute names and types to match precisely.
> It would be nicer if a partial function could be provided as a parameter to 
> transform the DataFrame-like schema into a Dataset. 
> Something like...
> test_dataframe.as[TestCaseClass](partial_function)
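
The as[T](partial_function) overload requested above does not exist. As a 
rough sketch of how the same effect can be approximated with existing calls, 
assuming Spark 2.x: the column names user_name and raw_score, the fields of 
TestCaseClass, and the sample rows are all made up for illustration.

```scala
import org.apache.spark.sql.{Dataset, Row, SparkSession}

// Hypothetical target type; its field names and types deliberately differ
// from the DataFrame columns below.
case class TestCaseClass(name: String, score: Double)

object RowMappingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("row-mapping-sketch")
      .getOrCreate()
    import spark.implicits._

    // A DataFrame whose schema does not line up with TestCaseClass.
    val testDataframe = Seq(("alice", 1), ("bob", 2))
      .toDF("user_name", "raw_score")

    // The partial function plays the role of the requested parameter:
    // it converts rows it recognizes and is undefined for the rest.
    val toCaseClass: PartialFunction[Row, TestCaseClass] = {
      case Row(n: String, s: Int) => TestCaseClass(n, s.toDouble)
    }

    // RDD.collect(PartialFunction) keeps only the rows where the function
    // is defined; toDS() then uses the built-in case class encoder.
    val ds: Dataset[TestCaseClass] =
      testDataframe.rdd.collect(toCaseClass).toDS()

    ds.show()
    spark.stop()
  }
}
```

Going through the RDD gives up Catalyst optimizations for that step. When 
only renames and casts are needed, select(...) with aliases followed by 
as[TestCaseClass] stays within the DataFrame API.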



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
