Jungtaek Lim created SPARK-56536:
------------------------------------
Summary: Further optimize skipping on writing secondary index in
stream-stream join v4
Key: SPARK-56536
URL: https://issues.apache.org/jira/browse/SPARK-56536
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 4.2.0
Reporter: Jungtaek Lim
// When there is no event time column in the value and no watermark ordinal in the key,
// the secondary index (TsWithKey) will never be used for eviction. Skip writing to it
// to avoid unnecessary RocksDB merge overhead.
// TODO: This could be further optimized by also considering whether the state watermark
// predicate is defined. Even when an event time column exists, the secondary index is
// unused if eviction is not possible (e.g., only one side defines a watermark in a time
// interval join). That would require propagating the predicate information here.
This ticket tracks the work described in the TODO above.
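
To make the proposed change concrete, the two decision rules can be sketched as below. This is only an illustration of the logic in the comment, not Spark's actual code: the `JoinSideInfo` class, its field names, and both functions are hypothetical, and Python is used purely for brevity. It assumes "eviction is possible" can be reduced to a single flag saying whether a state watermark predicate is defined for that side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinSideInfo:
    """Hypothetical summary of one side of a stream-stream join."""
    has_event_time_in_value: bool        # value row carries an event time column
    has_watermark_ordinal_in_key: bool   # join key carries a watermark ordinal
    has_state_watermark_predicate: bool  # assumption: eviction is possible for this side


def writes_secondary_index_current(side: JoinSideInfo) -> bool:
    # Current rule from the quoted comment: skip the secondary index only
    # when there is neither an event time column nor a watermark ordinal.
    return side.has_event_time_in_value or side.has_watermark_ordinal_in_key


def writes_secondary_index_proposed(side: JoinSideInfo) -> bool:
    # Rule suggested by the TODO: additionally skip when no state watermark
    # predicate is defined, because the index would never drive eviction.
    return writes_secondary_index_current(side) and side.has_state_watermark_predicate


# Example from the TODO: an event time column exists, but eviction is not
# possible (e.g. only one side of a time interval join defines a watermark).
no_eviction_side = JoinSideInfo(
    has_event_time_in_value=True,
    has_watermark_ordinal_in_key=False,
    has_state_watermark_predicate=False,
)
```

Under these assumptions, `no_eviction_side` still writes the secondary index with the current rule but skips it with the proposed one, which is the RocksDB merge overhead this ticket aims to remove.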