[ 
https://issues.apache.org/jira/browse/SPARK-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282978#comment-14282978
 ] 

Hamel Ajay Kothari commented on SPARK-5097:
-------------------------------------------

Thanks for the response, [~rxin]. One more question: how do we plan to support, 
in this new API, the breadth of operations that expressions currently enable? 
For example, if I want to do a join where {{rdd1.colA == rdd2.colB}} but I 
want to cast {{rdd2.colB}} to String first, how would I do that? 

With the expressions API I could write {{new EqualTo(colAExpression, 
Cast(colBExpression, DataType.StringType))}}, where colAExpression and 
colBExpression are resolved NamedExpressions. What would the equivalent look 
like in the new API?

I'm happy to take these questions elsewhere if there is a better place to ask. 
Thanks for your help!

> Adding data frame APIs to SchemaRDD
> -----------------------------------
>
>                 Key: SPARK-5097
>                 URL: https://issues.apache.org/jira/browse/SPARK-5097
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Reynold Xin
>            Priority: Critical
>         Attachments: DesignDocAddingDataFrameAPIstoSchemaRDD.pdf
>
>
> SchemaRDD, through its DSL, already provides common data frame 
> functionalities. However, the DSL was originally created for constructing 
> test cases without much end-user usability and API stability consideration. 
> This design doc proposes a set of API changes for Scala and Python to make 
> the SchemaRDD DSL API more usable and stable.


