Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21733

I think reducing the state memory size is worth pursuing: as described in the commit above, HDFSBackedStateStoreProvider was already optimized to reduce state store disk size (as well as network transfer) by not storing 4 bytes per row (for both key and value). This approach would normally save more than that earlier optimization on the value row, since the key contains window information with two values: start and end.

The main concern with this approach, for me, is the possible performance impact on workloads. The workload I've covered fortunately shows even a slight performance improvement, but I'm not sure about other workloads yet. Once I have solid overall numbers, I might suggest changing the default behavior, but either way I agree that a decision from committer(s) is necessary. Would it be better to start a thread on the dev mailing list?
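As a rough illustration only (not the actual Spark row format, and field names here are hypothetical), the two optimizations can be sketched by counting bytes per state entry: the earlier change dropped the 4-byte length prefix on each stored row, while this PR's idea additionally drops the key columns duplicated inside the value row:

```python
import struct

# Hypothetical fixed-width state row layout: each field is an 8-byte long.
# Key   = (group_id, window_start, window_end)      -> 3 fields
# Value = key fields + (count,) in the old layout   -> 4 fields

def encode(fields, with_length_prefix):
    payload = b"".join(struct.pack(">q", f) for f in fields)
    if with_length_prefix:
        # 4-byte size header per row, as in the pre-optimization format
        return struct.pack(">i", len(payload)) + payload
    return payload

key = (42, 1_000, 2_000)
old_value = key + (7,)   # value row duplicates the key columns
new_value = (7,)         # this PR's idea: store only the non-key columns

# Baseline: length prefix on both key and value rows
baseline = len(encode(key, True)) + len(encode(old_value, True))
# Earlier optimization: drop the two 4-byte prefixes (8 bytes per entry)
no_prefix = len(encode(key, False)) + len(encode(old_value, False))
# This PR: additionally drop the duplicated key columns from the value
dedup = len(encode(key, False)) + len(encode(new_value, False))

print(baseline, no_prefix, dedup)  # 64 56 32
```

With a windowed key (start and end timestamps plus a grouping column), the duplicated key columns dominate the value row, which is why removing them saves more than the earlier prefix optimization did.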