ericm-db commented on code in PR #47445:
URL: https://github.com/apache/spark/pull/47445#discussion_r1690054569


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/state/metadata/StateMetadataSource.scala:
##########
@@ -188,29 +191,54 @@ class StateMetadataPartitionReader(
     } else Array.empty
   }
 
-  private def allOperatorStateMetadata: Array[OperatorStateMetadata] = {
+  private[sql] def allOperatorStateMetadata: Array[OperatorStateMetadata] = {
     val stateDir = new Path(checkpointLocation, "state")
     val opIds = fileManager
       .list(stateDir, pathNameCanBeParsedAsLongFilter).map(f => pathToLong(f.getPath)).sorted
-    opIds.map { opId =>
-      new OperatorStateMetadataReader(new Path(stateDir, opId.toString), hadoopConf).read()
+    opIds.flatMap { opId =>

Review Comment:
I see. Is this the case where an operator was added after the first batch of the query? If so, what do we expect to happen? I'm not sure what the behavior should be in this case. I think just "dropping" the operator makes sense, but I can also conform to the previous interface and fail the query if the operator does not have any metadata. What do you think, @anishshri-db?
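
To make the two options concrete, here is a minimal, self-contained sketch. It does not use the real Spark classes: `OperatorMeta`, `readMetadata`, the in-memory `stored` map, and the operator names are all hypothetical stand-ins, and it assumes the reader reports missing metadata as `None`. It only contrasts the "drop" behavior (`flatMap` over an `Option`) with the "fail" behavior (throwing when metadata is missing).

```scala
// Hypothetical stand-in for operator metadata; not the real Spark class.
final case class OperatorMeta(operatorId: Long, operatorName: String)

object MissingMetadataSketch {
  // Simulated metadata store: operator 1 has a state directory but no metadata.
  // The operator names are illustrative only.
  private val stored: Map[Long, OperatorMeta] = Map(
    0L -> OperatorMeta(0L, "operatorA"),
    2L -> OperatorMeta(2L, "operatorB")
  )

  // Hypothetical reader: returns None when the operator has no metadata.
  def readMetadata(opId: Long): Option[OperatorMeta] = stored.get(opId)

  // Option A: silently drop operators that have no metadata.
  def dropMissing(opIds: Seq[Long]): Seq[OperatorMeta] =
    opIds.flatMap(opId => readMetadata(opId))

  // Option B: fail as soon as an operator without metadata is seen.
  def failOnMissing(opIds: Seq[Long]): Seq[OperatorMeta] =
    opIds.map { opId =>
      readMetadata(opId).getOrElse(
        throw new IllegalStateException(s"No metadata found for operator $opId"))
    }
}
```

With operator IDs `Seq(0L, 1L, 2L)`, `dropMissing` returns only the entries for operators 0 and 2, while `failOnMissing` throws an `IllegalStateException` for operator 1. Dropping keeps the metadata source readable but hides the gap from the caller; failing makes the gap visible at the cost of breaking reads on such checkpoints.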