Jungtaek Lim created SPARK-56536:
------------------------------------

             Summary: Further optimize skipping on writing secondary index in 
stream-stream join v4
                 Key: SPARK-56536
                 URL: https://issues.apache.org/jira/browse/SPARK-56536
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.2.0
            Reporter: Jungtaek Lim


// When there is no event time column in the value and no watermark ordinal in the key,
// the secondary index (TsWithKey) will never be used for eviction. Skip writing to it
// to avoid unnecessary RocksDB merge overhead.
// TODO: This could be further optimized by also considering whether the state watermark
//   predicate is defined. Even when an event time column exists, the secondary index is
//   unused if eviction is not possible (e.g., only one side defines a watermark in a time
//   interval join). That would require propagating the predicate information here.

This ticket tracks the work for the TODO above.
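
To make the intended condition concrete, here is a minimal sketch of the current check and the refinement the TODO suggests. Every name in it (SecondaryIndexSkipSketch, JoinSideInfo, and both methods) is hypothetical and chosen for illustration only; none of it is taken from the Spark code base, and the real change would also need to propagate the predicate information to the place where the index is written.

```scala
// Illustrative sketch only: these names are assumptions, not Spark APIs.
object SecondaryIndexSkipSketch {

  /** Per-side facts the decision would depend on (hypothetical model). */
  final case class JoinSideInfo(
      eventTimeOrdinalInValue: Option[Int],    // position of the event time column in the value, if any
      watermarkOrdinalInKey: Option[Int],      // position of the watermark column in the key, if any
      stateWatermarkPredicateDefined: Boolean  // true when eviction is actually possible for this side
  )

  /** Check described by the quoted comment: write the index if either column exists. */
  def shouldWriteSecondaryIndexCurrent(info: JoinSideInfo): Boolean =
    info.eventTimeOrdinalInValue.isDefined || info.watermarkOrdinalInKey.isDefined

  /** Refinement from the TODO: additionally require that an eviction predicate is defined. */
  def shouldWriteSecondaryIndexProposed(info: JoinSideInfo): Boolean =
    shouldWriteSecondaryIndexCurrent(info) && info.stateWatermarkPredicateDefined
}
```

For the case named in the TODO (an event time column exists but no eviction is possible), JoinSideInfo(Some(1), None, stateWatermarkPredicateDefined = false) still triggers the write under the current check, while the proposed check skips it.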



--
This message was sent by Atlassian Jira
(v8.20.10#820010)