[ https://issues.apache.org/jira/browse/SPARK-25213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590406#comment-16590406 ]
Ryan Blue commented on SPARK-25213:
-----------------------------------

[~cloud_fan], that PR adds a Project node on top of the v2 scan to ensure the rows are converted to unsafe rows. We should be able to look at the physical plan to see whether it is there. If it isn't, we should find out why it is missing. If it is there, we should find out why it isn't producing unsafe rows.

> DataSourceV2 doesn't seem to produce unsafe rows
> -------------------------------------------------
>
>                 Key: SPARK-25213
>                 URL: https://issues.apache.org/jira/browse/SPARK-25213
>             Project: Spark
>          Issue Type: Task
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Li Jin
>            Priority: Major
>
> Reproduce (the test classes need to be compiled first):
> bin/pyspark --driver-class-path sql/core/target/scala-2.11/test-classes
> {code:java}
> datasource_v2_df = spark.read \
>     .format("org.apache.spark.sql.sources.v2.SimpleDataSourceV2") \
>     .load()
> result = datasource_v2_df.withColumn('x', udf(lambda x: x, 'int')(datasource_v2_df['i']))
> result.show()
> {code}
> The above code fails with:
> {code:java}
> Caused by: java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericInternalRow cannot be cast to org.apache.spark.sql.catalyst.expressions.UnsafeRow
> 	at org.apache.spark.sql.execution.python.EvalPythonExec$$anonfun$doExecute$1$$anonfun$5.apply(EvalPythonExec.scala:127)
> 	at org.apache.spark.sql.execution.python.EvalPythonExec$$anonfun$doExecute$1$$anonfun$5.apply(EvalPythonExec.scala:126)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
> {code}
> It seems Data Source V2 doesn't produce UnsafeRows here.
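For anyone following up on Ryan's suggestion, here is a minimal sketch of how the physical plan could be inspected from the same pyspark repro session above. Note that the second check goes through the internal Java DataFrame handle, which is not a stable API, so treat it as a debugging aid only:

{code:python}
# Continuing the pyspark repro above: `result` is the DataFrame with the UDF column.
# explain(True) prints the parsed, analyzed, optimized, and physical plans;
# in the "== Physical Plan ==" section, check whether a Project node sits
# directly above the DataSourceV2 scan.
result.explain(True)

# For a programmatic check, the executed plan can be pulled out through the
# internal Java handle (internal API, debugging only):
plan = result._jdf.queryExecution().executedPlan().toString()
print("Project" in plan, "DataSourceV2" in plan)
{code}

If the Project node is missing from the physical plan, the conversion to unsafe rows never happens before EvalPythonExec, which would explain the ClassCastException above.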