[ https://issues.apache.org/jira/browse/SPARK-49686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alvaro Berdonces updated SPARK-49686:
-------------------------------------
Description:
Under the scenario below, `sorted.rdd` hangs indefinitely instead of throwing the expected exception, even though the schema does not match the data types.
{code:java}
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types._
import scala.util.Try

val data: Seq[Row] = Seq(
  Row(1, "a"),
  Row(2, "b"),
  Row(3, "c")
)

// The schema declares "id" as StringType, but the rows hold Ints.
val schema = StructType(Seq(
  StructField("id", StringType),
  StructField("value", StringType)
))

val df = spark.createDataFrame(
  spark.sparkContext.parallelize(data), schema
)

val sorted = df.orderBy(col("value"))
Try(sorted.rdd)
sorted.rdd
{code}
A less simplified version of this error affects us when using Holden Karau's Spark Testing Base. As a workaround we force an action before the assert, but I assume this is not the expected behaviour.
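The workaround mentioned above (forcing an action before the assert) can be sketched as follows. This is a minimal sketch, not the reporter's exact test code; it assumes an active SparkSession named `spark` and the `df`/`sorted` values from the reproduction snippet.

{code:java}
// Hedged sketch of the described workaround: trigger evaluation with an
// action so the schema/data mismatch surfaces as an exception eagerly,
// instead of hanging later when sorted.rdd is evaluated.
val sorted = df.orderBy(col("value"))

// Any eager action (count, collect, etc.) forces the plan to run and
// lets the type-mismatch error be raised here.
sorted.count()

// Only after the action has validated the data do we touch the RDD,
// e.g. inside a test assertion.
val rdd = sorted.rdd
{code}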
> Spark gets stuck while evaluating the rdd of a sorted, mistyped dataframe
> -------------------------------------------------------------------------
>
>                 Key: SPARK-49686
>                 URL: https://issues.apache.org/jira/browse/SPARK-49686
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.5.0, 3.5.1, 3.5.2
>            Reporter: Alvaro Berdonces
>            Priority: Minor

--
This message was sent by Atlassian Jira
(v8.20.10#820010)