There was a similar question (but another approach) and I've explained the
current status a bit.

https://lists.apache.org/thread.html/r89a61a10df71ccac132ce5d50b8fe405635753db7fa2aeb79f82fb77%40%3Cuser.spark.apache.org%3E

I guess this would also answer your question as well. At least for now,
Spark doesn't expose the current watermark in specific micro-batch to the
user level. It's abstracted away. I'm not sure knowing the exact global
watermark "outside" of the query would be able to affect the running query.

If there's a strong demand, we could probably consider adding some function
which provides the current watermark. I guess producing dropped events via
side-output is something we are in favor of (if that is not quite hard to
do), more than exposing the current watermark and letting users do that
instead.


On Wed, Mar 17, 2021 at 1:20 AM dwichman <dwich...@alumni.calpoly.edu>
wrote:

> Hi Spark Developers,
>
> Is it possible to reliably determine the current global watermark that is
> being used in a streaming query via StreamingQueryProgress.onQueryProgress
> eventTime watermark String?
>
>
> https://spark.apache.org/docs/2.2.1/api/java/org/apache/spark/sql/streaming/StreamingQueryProgress.html
>
> The intention would be to precede the query watermark function with
> something like a map function and compare event times with the assumed
> global watermark to determine if the event will be dropped (i.e. too late).
>
> If StreamingQueryProgress.onQueryProgress eventTime watermark does not
> accurately reflect the current global watermark, is there another way to
> reliably determine it?
>
> Thanks for your help.
>
> -Derek
>
>
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>

Reply via email to