[ https://issues.apache.org/jira/browse/SPARK-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282978#comment-14282978 ]
Hamel Ajay Kothari commented on SPARK-5097:
-------------------------------------------

Thanks for the response [~rxin]. One more question: how are we planning to support, in this new API, the breadth of things that expressions enabled?

For example, if I want to do a join where {{rdd1.colA == rdd2.colB}}, but I want to cast rdd2.colB to String first, how would I do that? In the expressions API I could do {{new EqualTo(colAExpression, Cast(colBExpression, DataType.StringType))}}, where colAExpression and colBExpression are resolved NamedExpressions. How would this look in the new API?

I'm happy to take these questions elsewhere if there is a better place to ask. Thanks for your help!

> Adding data frame APIs to SchemaRDD
> -----------------------------------
>
> Key: SPARK-5097
> URL: https://issues.apache.org/jira/browse/SPARK-5097
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Reynold Xin
> Assignee: Reynold Xin
> Priority: Critical
> Attachments: DesignDocAddingDataFrameAPIstoSchemaRDD.pdf
>
>
> SchemaRDD, through its DSL, already provides common data frame
> functionalities. However, the DSL was originally created for constructing
> test cases without much end-user usability and API stability consideration.
> This design doc proposes a set of API changes for Scala and Python to make
> the SchemaRDD DSL API more usable and stable.
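
For readers following the cast-then-join case raised in the comment above, here is a minimal sketch of how it can be written against the column-based DataFrame API that Spark SQL later shipped. It is not taken from the design doc attached to this issue. It assumes a Spark version that provides SparkSession (2.x or later) on the classpath. The object name CastJoinSketch, the helper castJoin, and the sample data are hypothetical. Only the column names colA and colB come from the comment.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical sketch: names and sample data are illustrative only.
object CastJoinSketch {

  // Inner join where left.colA equals right.colB after casting colB to string.
  // Column.cast and Column.=== play the role of the Cast and EqualTo
  // expressions mentioned in the comment.
  def castJoin(left: DataFrame, right: DataFrame): DataFrame =
    left.join(right, left("colA") === right("colB").cast("string"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cast-join-sketch")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._

    // Assumed inputs: colA holds strings, colB holds integers.
    val df1 = Seq("1", "2", "3").toDF("colA")
    val df2 = Seq(2, 3, 4).toDF("colB")

    // Keeps only the rows whose colA matches the string form of colB.
    castJoin(df1, df2).show()

    spark.stop()
  }
}
```

The join condition is an ordinary Column value, so it can be built up and passed around much like the Catalyst expression tree in the original example, without resolving NamedExpressions by hand.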