[jira] [Assigned] (SPARK-39494) Support `createDataFrame` from a list of scalars when schema is not provided
[ https://issues.apache.org/jira/browse/SPARK-39494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-39494:

    Assignee: Apache Spark

> Support `createDataFrame` from a list of scalars when schema is not provided
>
> Key: SPARK-39494
> URL: https://issues.apache.org/jira/browse/SPARK-39494
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Xinrong Meng
> Assignee: Apache Spark
> Priority: Major
>
> Currently, DataFrame creation from a list of native Python scalars is
> unsupported in PySpark, for example,
> {{>>> spark.createDataFrame([1, 2]).collect()}}
> {{Traceback (most recent call last):}}
> {{...}}
> {{TypeError: Can not infer schema for type: }}
> However, Spark DataFrame Scala API supports that:
> {{scala> Seq(1, 2).toDF().collect()}}
> {{res6: Array[org.apache.spark.sql.Row] = Array([1], [2])}}
> To maintain API consistency, we propose to support DataFrame creation from a
> list of scalars.
> See more
> [here]([https://docs.google.com/document/d/1Rd20PVbVxNrLfOmDtetVRxkgJQhgAAtJp6XAAZfGQgc/edit?usp=sharing]).

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-39494) Support `createDataFrame` from a list of scalars when schema is not provided
[ https://issues.apache.org/jira/browse/SPARK-39494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-39494:

    Assignee: (was: Apache Spark)

> Support `createDataFrame` from a list of scalars when schema is not provided
>
> Key: SPARK-39494
> URL: https://issues.apache.org/jira/browse/SPARK-39494
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 3.4.0
> Reporter: Xinrong Meng
> Priority: Major
>
> Currently, DataFrame creation from a list of native Python scalars is
> unsupported in PySpark, for example,
> {{>>> spark.createDataFrame([1, 2]).collect()}}
> {{Traceback (most recent call last):}}
> {{...}}
> {{TypeError: Can not infer schema for type: }}
> However, Spark DataFrame Scala API supports that:
> {{scala> Seq(1, 2).toDF().collect()}}
> {{res6: Array[org.apache.spark.sql.Row] = Array([1], [2])}}
> To maintain API consistency, we propose to support DataFrame creation from a
> list of scalars.
> See more
> [here]([https://docs.google.com/document/d/1Rd20PVbVxNrLfOmDtetVRxkgJQhgAAtJp6XAAZfGQgc/edit?usp=sharing]).
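
For readers hitting the TypeError quoted in the issue before the proposed change is available, one possible workaround is to wrap each scalar in a one-element tuple so that every value becomes a one-column row. The sketch below is only an illustration and not the change proposed in the ticket: the helper names `wrap_scalars` and `scalars_to_dataframe` are hypothetical, and the default column name `value` is an assumption chosen here, not something the issue specifies.

```python
from typing import Any, Iterable, List, Tuple


def wrap_scalars(values: Iterable[Any]) -> List[Tuple[Any]]:
    """Turn [1, 2] into [(1,), (2,)] so each scalar is a one-column row.

    Hypothetical helper for illustration; not part of PySpark.
    """
    return [(v,) for v in values]


def scalars_to_dataframe(spark, values: Iterable[Any], column: str = "value"):
    """Build a single-column DataFrame from plain Python scalars.

    `spark` is an existing SparkSession. The column name "value" is an
    assumption made for this sketch.
    """
    # createDataFrame accepts a list of tuples plus a list of column names.
    return spark.createDataFrame(wrap_scalars(values), [column])
```

With a running session, `scalars_to_dataframe(spark, [1, 2]).collect()` would take the place of the failing `spark.createDataFrame([1, 2]).collect()` call from the description.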