jingz-db commented on code in PR #47933:
URL: https://github.com/apache/spark/pull/47933#discussion_r1746072835


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/python/TransformWithStateInPandasStateServer.scala:
##########
@@ -62,6 +74,22 @@ class TransformWithStateInPandasStateServer(
     new mutable.HashMap[String, (ValueState[Row], StructType,
       ExpressionEncoder.Deserializer[Row])]()
   }
+  // A map to store the list state name -> (list state, schema, list state row deserializer,
+  // list state row serializer) mapping.
+  private val listStates = if (listStatesMapForTest != null) {
+    listStatesMapForTest
+  } else {
+    new mutable.HashMap[String, (ListState[Row], StructType,
+      ExpressionEncoder.Deserializer[Row], ExpressionEncoder.Serializer[Row])]()
+  }
+  // A map to store the list state name -> iterator mapping. This is to keep track of the
+  // current iterator position for each list state in a grouping key in case user tries to fetch
+  // another list state before the current iterator is exhausted.
+  private var listStateIterators = if (listStateIteratorMapForTest != null) {
+    listStateIteratorMapForTest
+  } else {
+    new mutable.HashMap[String, Iterator[Row]]()

Review Comment:
   Just thinking out loud: should we consider using a thread-safe map here? I
remember we use a thread-safe map in `RocksDBStateStoreProvider`:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreProvider.scala#L462
   Off the top of my head, though, I think we are fine for
`TransformWithStateInPandasStateServer`, since each state store should have its
own server instance, so there shouldn't be any race condition.
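
   To make the trade-off concrete, here is a rough sketch of the two options, not the PR's actual code. `ListStateIteratorMaps`, `nextSingleThreaded`, and `nextThreadSafe` are hypothetical names, `Iterator[Int]` stands in for `Iterator[Row]`, and `java.util.concurrent.ConcurrentHashMap` is only my assumption for what a thread-safe variant could look like:

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.mutable

// Hypothetical stand-in for the per-list-state iterator bookkeeping in the diff.
object ListStateIteratorMaps {
  // Option 1: plain mutable map. Sufficient if only one thread ever touches
  // a given server instance.
  val singleThreaded = new mutable.HashMap[String, Iterator[Int]]()

  // Option 2: thread-safe map. Only needed if several threads could share
  // the same instance.
  val threadSafe = new ConcurrentHashMap[String, Iterator[Int]]()

  // Returns the next element for `stateName`, creating the iterator on first
  // use and dropping it from the map once it is exhausted.
  def nextSingleThreaded(stateName: String, fetch: () => Iterator[Int]): Option[Int] = {
    val it = singleThreaded.getOrElseUpdate(stateName, fetch())
    if (it.hasNext) {
      Some(it.next())
    } else {
      singleThreaded.remove(stateName)
      None
    }
  }

  // Same logic on the concurrent map. The map operations are atomic, but the
  // iterator itself is still not safe to advance from multiple threads.
  def nextThreadSafe(stateName: String, fetch: () => Iterator[Int]): Option[Int] = {
    val it = threadSafe.computeIfAbsent(stateName, (_: String) => fetch())
    if (it.hasNext) {
      Some(it.next())
    } else {
      threadSafe.remove(stateName)
      None
    }
  }
}
```

   Note that even with the concurrent map, the stored iterators would need their own synchronization if they were shared across threads, so swapping the map type alone would not buy much. That is another reason the plain map seems fine if each server instance is single-threaded.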



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

