HeartSaVioR commented on pull request #30427: URL: https://github.com/apache/spark/pull/30427#issuecomment-730855993
> If we process history data or some simulation data, the event time could be far different to processing time. For example, if we process some data from 2010 to 2019, now the gap is current time - 2010-xx-xx...?

You understand it correctly, though that's just one of the use cases. Given that they are running a "streaming workload", one of the main goals is to capture recent outputs (e.g. trends). Watermark would still work for such use cases as well, but what to plot to provide value in that situation remains an open question. (What would be the "ideal" timestamp to calculate the gap against in this case?)