[ https://issues.apache.org/jira/browse/SPARK-46639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eren Avsarogullari updated SPARK-46639: --------------------------------------- Description: Currently, WindowExec Physical Operator has only spillSize SQLMetric. This jira aims to add following SQLMetrics to provide more information from WindowExec usage during query execution: {code:java} numOfOutputRows: Number of total output rows. numOfPartitions: Number of processed input partitions. numOfWindowPartitions: Number of generated window partitions. spilledRows: Number of total spilled rows. spillSizeOnDisk: Total spilled data size on disk.{code} As an example use-case, WindowExec spilling behavior depends on multiple factors and it can sometime cause {{SparkOutOfMemoryError}} instead of spilling to disk so it is hard to track without SQL Metrics such as: *1-* WindowExec creates ExternalAppendOnlyUnsafeRowArray (internal ArrayBuffer) per task (a.k.a child RDD partition) *2-* When ExternalAppendOnlyUnsafeRowArray size exceeds spark.sql.windowExec.buffer.in.memory.threshold=4096, ExternalAppendOnlyUnsafeRowArray switches to UnsafeExternalSorter as spillableArray by moving its all buffered rows into UnsafeExternalSorter and ExternalAppendOnlyUnsafeRowArray (internal ArrayBuffer) is cleared. In this case, WindowExec starts to write UnsafeExternalSorter' s buffer (a.k.a UnsafeInMemorySorter). *3-* UnsafeExternalSorter is being created per window partition. When UnsafeExternalSorter' buffer size exceeds spark.sql.windowExec.buffer.spill.threshold=Integer.MAX_VALUE, it starts to write to disk and get cleared all buffer (a.k.a UnsafeInMemorySorter) content. In this case, UnsafeExternalSorter will continue to buffer next records until exceeding spark.sql.windowExec.buffer.spill.threshold. *New WindowExec SQLMetrics Sample Screenshot:* !WindowExec SQLMetrics.png|width=257,height=152! was: Currently, WindowExec Physical Operator has only spillSize SQLMetric. This jira aims to add following SQLMetrics to provide more information from WindowExec usage during query execution: {code:java} numOfOutputRows: Number of total output rows. numOfPartitions: Number of processed input partitions. numOfWindowPartitions: Number of generated window partitions. spilledRows: Number of total spilled rows. spillSizeOnDisk: Total spilled data size on disk.{code} As an example use-case, WindowExec spilling behavior depends on multiple factors and it can sometime cause {{SparkOutOfMemoryError}} instead of spilling to disk so it is hard to track without SQL Metrics such as: *1-* WindowExec creates ExternalAppendOnlyUnsafeRowArray (internal ArrayBuffer) per task (a.k.a child RDD partition) *2-* When ExternalAppendOnlyUnsafeRowArray size exceeds spark.sql.windowExec.buffer.in.memory.threshold=4096, ExternalAppendOnlyUnsafeRowArray switches to UnsafeExternalSorter as spillableArray by moving its all buffered rows into UnsafeExternalSorter and ExternalAppendOnlyUnsafeRowArray (internal ArrayBuffer) is cleared. In this case, WindowExec starts to write UnsafeExternalSorter' s buffer (a.k.a UnsafeInMemorySorter). *3-* UnsafeExternalSorter is being created per window partition. When UnsafeExternalSorter' buffer size exceeds spark.sql.windowExec.buffer.spill.threshold=Integer.MAX_VALUE, it starts to write to disk and get cleared all buffer (a.k.a UnsafeInMemorySorter) content. In this case, UnsafeExternalSorter will continue to buffer next records until exceeding spark.sql.windowExec.buffer.spill.threshold. Sample UI Screenshot: !WindowExec SQLMetrics.png|width=257,height=152! > Add WindowExec SQLMetrics > ------------------------- > > Key: SPARK-46639 > URL: https://issues.apache.org/jira/browse/SPARK-46639 > Project: Spark > Issue Type: Task > Components: SQL > Affects Versions: 4.0.0 > Reporter: Eren Avsarogullari > Priority: Major > Labels: pull-request-available > Attachments: WindowExec SQLMetrics.png > > > Currently, WindowExec Physical Operator has only spillSize SQLMetric. This > jira aims to add following SQLMetrics to provide more information from > WindowExec usage during query execution: > {code:java} > numOfOutputRows: Number of total output rows. > numOfPartitions: Number of processed input partitions. > numOfWindowPartitions: Number of generated window partitions. > spilledRows: Number of total spilled rows. > spillSizeOnDisk: Total spilled data size on disk.{code} > As an example use-case, WindowExec spilling behavior depends on multiple > factors and it can sometime cause {{SparkOutOfMemoryError}} instead of > spilling to disk so it is hard to track without SQL Metrics such as: > *1-* WindowExec creates ExternalAppendOnlyUnsafeRowArray (internal > ArrayBuffer) per task (a.k.a child RDD partition) > *2-* When ExternalAppendOnlyUnsafeRowArray size exceeds > spark.sql.windowExec.buffer.in.memory.threshold=4096, > ExternalAppendOnlyUnsafeRowArray switches to UnsafeExternalSorter as > spillableArray by moving its all buffered rows into UnsafeExternalSorter and > ExternalAppendOnlyUnsafeRowArray (internal ArrayBuffer) is cleared. In this > case, WindowExec starts to write UnsafeExternalSorter' s buffer (a.k.a > UnsafeInMemorySorter). > *3-* UnsafeExternalSorter is being created per window partition. When > UnsafeExternalSorter' buffer size exceeds > spark.sql.windowExec.buffer.spill.threshold=Integer.MAX_VALUE, it starts to > write to disk and get cleared all buffer (a.k.a UnsafeInMemorySorter) > content. In this case, UnsafeExternalSorter will continue to buffer next > records until exceeding spark.sql.windowExec.buffer.spill.threshold. > *New WindowExec SQLMetrics Sample Screenshot:* > !WindowExec SQLMetrics.png|width=257,height=152! -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org