[ https://issues.apache.org/jira/browse/SPARK-38329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507241#comment-17507241 ]
Neven Jovic edited comment on SPARK-38329 at 3/15/22, 9:40 PM:
---------------------------------------------------------------

[~hyukjin.kwon] I updated Spark to 3.2.1, and the I/O wait is still there. Using a Structured Streaming monitoring tool, I found that my aggregated state in memory was growing continuously. I added a watermark, and that probably solved the issue with the State Store Provider (I haven't seen that WARN message since). As for the high I/O wait, I assume it comes from writing to EFS. Here is a screenshot of CPU utilization with the updated Spark and the same load.

> High I/O wait when Spark Structured Streaming checkpoint changed to EFS
> -----------------------------------------------------------------------
>
>                 Key: SPARK-38329
>                 URL: https://issues.apache.org/jira/browse/SPARK-38329
>             Project: Spark
>          Issue Type: Question
>          Components: EC2, Input/Output, PySpark, Structured Streaming
>    Affects Versions: 2.4.6
>            Reporter: Neven Jovic
>            Priority: Major
>         Attachments: 100k_zbx_21.png
>
>
> I'm currently running a Spark Structured Streaming application written in
> Python (PySpark), where the source is a Kafka topic and the sink is MongoDB.
> I changed my checkpoint to Amazon EFS, which is distributed on all Spark
> workers, and after that the I/O wait increased, averaging 8%.
>
> !Screenshot from 2022-02-25 14-16-11.png!
> Currently I have 6000 messages arriving in Kafka every second, and every
> once in a while I get a WARN message:
> {quote}22/02/25 13:12:31 WARN HDFSBackedStateStoreProvider: Error cleaning up
> files for HDFSStateStoreProvider[id = (op=0,part=90),dir =
> file:/mnt/efs_max_io/spark/state/0/90] java.lang.NumberFormatException: For
> input string: ""
> {quote}
> I'm not sure whether that message has anything to do with the high I/O wait.
> Is this behavior expected, or is it something to be concerned about?
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
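
The comment above reports that aggregated state kept growing until a watermark was added. As a rough illustration of why a watermark bounds state, here is a toy, pure-Python model of a windowed count. It is not Spark's implementation: the function name `run_windowed_count`, the event layout, and the per-event watermark update are all assumptions made for this sketch (Spark advances the watermark per micro-batch). In PySpark itself, the corresponding call is `DataFrame.withWatermark(eventTime, delayThreshold)`, applied before the aggregation.

```python
def run_windowed_count(events, window_s, watermark_delay_s=None):
    """Toy model of a windowed count with optional watermark eviction.

    events: iterable of (event_time_seconds, key) in arrival order.
    Returns (finalized, open_state, peak_state_size).
    Hypothetical helper for illustration only, not a Spark API.
    """
    state = {}          # (window_start, key) -> running count
    finalized = {}      # windows evicted from state once the watermark passed them
    max_event_time = None
    peak = 0

    for event_time, key in events:
        window_start = event_time - event_time % window_s

        # Drop events whose window already lies entirely behind the watermark.
        if watermark_delay_s is not None and max_event_time is not None:
            watermark = max_event_time - watermark_delay_s
            if window_start + window_s <= watermark:
                continue

        state_key = (window_start, key)
        state[state_key] = state.get(state_key, 0) + 1

        if max_event_time is None or event_time > max_event_time:
            max_event_time = event_time

        # With a watermark, closed windows leave the state store.
        if watermark_delay_s is not None:
            watermark = max_event_time - watermark_delay_s
            closed = [k for k in state if k[0] + window_s <= watermark]
            for k in closed:
                finalized[k] = state.pop(k)

        peak = max(peak, len(state))

    return finalized, state, peak


if __name__ == "__main__":
    # One event per second for ten minutes, a single key, 60-second windows.
    events = [(t, "k") for t in range(600)]
    _, _, peak_without = run_windowed_count(events, 60)
    _, _, peak_with = run_windowed_count(events, 60, watermark_delay_s=30)
    print("peak state entries without watermark:", peak_without)
    print("peak state entries with watermark:", peak_with)
```

Without a watermark, every window ever seen stays in state, so the state (and whatever is written to the checkpoint directory for it) grows with the length of the stream. With a watermark, only windows that can still receive data are retained.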