[jira] [Commented] (SPARK-33920) We cannot pass schema to a createDataFrame function in scala, however we can do this in python.
[ https://issues.apache.org/jira/browse/SPARK-33920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255681#comment-17255681 ]

L. C. Hsieh commented on SPARK-33920:
-------------------------------------

Could you elaborate on the data and the data type you want to assign explicitly? Unlike Python, the Scala API might not allow you to assign an arbitrary data type to the input data. For example, for DecimalType, the input data must be one of the Decimal-related JVM types: BigDecimal, Decimal, Java's BigDecimal, or Java's BigInteger. If you assign a Decimal data type to a float input, the converter cannot convert it.

> We cannot pass schema to a createDataFrame function in scala, however we can do this in python.
> ---
>
>                 Key: SPARK-33920
>                 URL: https://issues.apache.org/jira/browse/SPARK-33920
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.1
>            Reporter: Abdul Rafay Abdul Rafay
>            Priority: Major
>         Attachments: Screenshot 2020-12-28 at 2.23.13 PM.png
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> ~spark.createDataFrame(data, schema)~
> ~I am able to pass schema as a parameter to a function createDataFrame in python but cannot pass this in scala for static data.~

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
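The point about DecimalType inputs can be illustrated with a short sketch: static Double values are converted to scala.math.BigDecimal before the rows are built, so the values match the Decimal-related JVM types listed in the comment. The object name, column names, sample values, and the precision and scale of 10 and 2 are all hypothetical, and the sketch assumes a local SparkSession is acceptable.

```scala
import scala.collection.JavaConverters._

import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{DecimalType, StringType, StructField, StructType}

object DecimalInputConversion {
  // Hypothetical schema: the "amount" column is declared as DecimalType.
  val schema: StructType = StructType(Seq(
    StructField("label", StringType, nullable = false),
    StructField("amount", DecimalType(10, 2), nullable = false)
  ))

  // Hypothetical static input that holds plain Double values.
  val raw: Seq[(String, Double)] = Seq(("a", 12.5), ("b", 0.1))

  // Convert each Double to scala.math.BigDecimal so the value type
  // matches what the DecimalType column expects.
  def toRows(data: Seq[(String, Double)]): Seq[Row] =
    data.map { case (label, amount) => Row(label, BigDecimal(amount)) }

  // Uses the createDataFrame(java.util.List[Row], StructType) overload.
  def build(spark: SparkSession): DataFrame =
    spark.createDataFrame(toRows(raw).asJava, schema)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("decimal-input-sketch")
      .getOrCreate()
    try {
      val df = build(spark)
      df.printSchema()
      df.show()
    } finally {
      spark.stop()
    }
  }
}
```

Passing the raw Double values directly in `Row(label, amount)` under the same schema is the mismatch the comment describes, so the conversion step is what keeps the input and the declared type consistent.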
[jira] [Commented] (SPARK-33920) We cannot pass schema to a createDataFrame function in scala, however we can do this in python.
[ https://issues.apache.org/jira/browse/SPARK-33920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255481#comment-17255481 ]

Abdul Rafay Abdul Rafay commented on SPARK-33920:
-------------------------------------------------

[~viirya] You're right, and I know the Scala API uses Scala reflection to infer the schema of the given Product. But when I want to explicitly assign a data type such as DecimalType or FloatType to a DataFrame created from a static sequence of rows, that is a problem. In PySpark it is not.

> We cannot pass schema to a createDataFrame function in scala, however we can do this in python.
> ---
>
>                 Key: SPARK-33920
>                 URL: https://issues.apache.org/jira/browse/SPARK-33920
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.1
>            Reporter: Abdul Rafay Abdul Rafay
>            Priority: Major
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> ~spark.createDataFrame(data, schema)~
> ~I am able to pass schema as a parameter to a function createDataFrame in python but cannot pass this in scala for static data.~
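For the case described here, a static sequence of rows with explicitly assigned FloatType and DecimalType columns, one possible approach is the `createDataFrame(rows: java.util.List[Row], schema: StructType)` overload of SparkSession. The following is only a sketch under assumptions: the object name, column names, and sample values are hypothetical, and each value is written with a JVM type that matches its declared column type.

```scala
import scala.collection.JavaConverters._

import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{DecimalType, FloatType, StringType, StructField, StructType}

object StaticRowsWithSchema {
  // Hypothetical schema with explicitly chosen FloatType and DecimalType columns.
  val schema: StructType = StructType(Seq(
    StructField("item", StringType, nullable = false),
    StructField("weight", FloatType, nullable = false),
    StructField("price", DecimalType(10, 2), nullable = false)
  ))

  // Static rows: Float literals for FloatType, BigDecimal for DecimalType.
  val rows: Seq[Row] = Seq(
    Row("apple", 0.25f, BigDecimal("1.20")),
    Row("melon", 1.80f, BigDecimal("3.75"))
  )

  // Convert the Scala Seq to a java.util.List so the List[Row] overload applies.
  def build(spark: SparkSession): DataFrame =
    spark.createDataFrame(rows.asJava, schema)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("static-rows-sketch")
      .getOrCreate()
    try {
      build(spark).printSchema()
    } finally {
      spark.stop()
    }
  }
}
```

The same rows could instead be wrapped with `spark.sparkContext.parallelize(rows)` and passed to the RDD[Row] overload; the List variant is used here only because the data is static and small.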
[jira] [Commented] (SPARK-33920) We cannot pass schema to a createDataFrame function in scala, however we can do this in python.
[ https://issues.apache.org/jira/browse/SPARK-33920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255321#comment-17255321 ]

L. C. Hsieh commented on SPARK-33920:
-------------------------------------

There is {{def createDataFrame(rowRDD: RDD[Row], schema: StructType)}} in the Scala API. If you mean {{def createDataFrame[A <: Product : TypeTag](data: Seq[A])}}, the Scala API uses Scala reflection to infer the schema of the given Product. Why do you need the {{schema}} parameter here?

> We cannot pass schema to a createDataFrame function in scala, however we can do this in python.
> ---
>
>                 Key: SPARK-33920
>                 URL: https://issues.apache.org/jira/browse/SPARK-33920
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build, SQL
>    Affects Versions: 3.0.1
>            Reporter: Abdul Rafay Abdul Rafay
>            Priority: Critical
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> ~spark.createDataFrame(data, schema)~
> ~I am able to pass schema as a parameter to a function createDataFrame in python but cannot pass this in scala for static data.~
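The two overloads named in this comment can be compared side by side. This is a minimal sketch assuming a local SparkSession; the `Person` case class, the object name, the column names, and the sample values are hypothetical and do not come from the issue.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical case class, used only for the reflection-based overload.
case class Person(name: String, age: Int)

object CreateDataFrameOverloads {
  // Explicit schema supplied by the caller for the RDD[Row] overload.
  val schema: StructType = StructType(Seq(
    StructField("name", StringType, nullable = false),
    StructField("age", IntegerType, nullable = false)
  ))

  // createDataFrame(rowRDD: RDD[Row], schema: StructType)
  def fromRows(spark: SparkSession): DataFrame = {
    val rowRDD = spark.sparkContext.parallelize(Seq(Row("alice", 30), Row("bob", 25)))
    spark.createDataFrame(rowRDD, schema)
  }

  // createDataFrame[A <: Product : TypeTag](data: Seq[A]):
  // no schema argument, the schema is inferred from Person by reflection.
  def fromProducts(spark: SparkSession): DataFrame =
    spark.createDataFrame(Seq(Person("alice", 30), Person("bob", 25)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("overloads-sketch")
      .getOrCreate()
    try {
      fromRows(spark).printSchema()
      fromProducts(spark).printSchema()
    } finally {
      spark.stop()
    }
  }
}
```

With the Product-based overload the column types follow from the case class fields, so choosing a different type there means changing the field types of the case class rather than passing a schema.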