Hi Thomas,

Thanks a lot for explaining.

Best,

Shen

On Thu, Jan 26, 2017 at 12:48 PM, Thomas Groh <[email protected]> wrote:

> The default timestamp should be BoundedWindow.TIMESTAMP_MIN_VALUE, which is
> equivalent to -2**63 microseconds. We also occasionally refer to this
> timestamp as "negative infinity".
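
[Editorial sketch, not part of the original message.] The "-2**63 microseconds" figure can be checked with plain Java. The snippet assumes the SDK keeps timestamps at millisecond precision, so the microsecond minimum is truncated when converted; the class and method names are hypothetical and nothing here calls Beam itself.

```java
import java.util.concurrent.TimeUnit;

public class MinTimestampSketch {
    // -2^63: the smallest value a signed 64-bit count of microseconds can hold.
    static final long MIN_MICROS = Long.MIN_VALUE;

    // The same instant expressed in milliseconds (assumed storage precision).
    // TimeUnit divides by 1000 and truncates toward zero.
    static long minMillis() {
        return TimeUnit.MICROSECONDS.toMillis(MIN_MICROS);
    }

    public static void main(String[] args) {
        System.out.println("min micros: " + MIN_MICROS);
        System.out.println("min millis: " + minMillis());
    }
}
```

Any real element timestamp compares greater than this value, which is why it can stand in for "negative infinity".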
>
> The default watermark policy for a bounded source should be negative
> infinity until all of the data is read, then positive infinity. There isn't
> really a default watermark policy for an unbounded source - this is
> dependent on the data that hasn't been read from that source, so it's
> dependent on where you're reading from.
>
> Currently, modifying the timestamp of an element from within a DoFn does
> not modify the watermark. Moving a timestamp forwards in time is generally
> "safe", as it can't cause data to move behind the watermark. Moving
> elements backwards in time is not, which is why it requires setting
> "withAllowedTimestampSkew". That setting also doesn't modify the watermark,
> so elements that are moved backwards in time can become late and be
> dropped by a runner. I don't think we currently have any changes in-flight
> to make this configurable.
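
[Editorial sketch, not part of the original message.] The rule described above can be modelled in a few lines of plain Java. This is a toy model under stated assumptions, not Beam's implementation: timestamps and the watermark are plain millisecond longs, and `isAllowed` and `isLate` are hypothetical helpers invented for illustration.

```java
public class TimestampSkewSketch {
    // An output timestamp is accepted if it is no earlier than the input
    // timestamp minus the allowed backward skew. Moving forwards always passes.
    static boolean isAllowed(long inputTs, long outputTs, long allowedSkew) {
        return outputTs >= inputTs - allowedSkew;
    }

    // The watermark is not adjusted when a timestamp changes, so an element
    // whose new timestamp falls behind the watermark counts as late.
    static boolean isLate(long outputTs, long watermark) {
        return outputTs < watermark;
    }

    public static void main(String[] args) {
        long watermark = 1_000L;
        long inputTs = 1_000L;

        // Forwards by 100 ms: accepted with zero skew, and never late.
        System.out.println(isAllowed(inputTs, 1_100L, 0L) + " " + isLate(1_100L, watermark));

        // Backwards by 100 ms: rejected with zero skew.
        System.out.println(isAllowed(inputTs, 900L, 0L));

        // Backwards by 100 ms with 200 ms of skew: accepted, but now late.
        System.out.println(isAllowed(inputTs, 900L, 200L) + " " + isLate(900L, watermark));
    }
}
```

The last case is the one the message warns about: allowing skew lets the element through the check, but it does nothing to keep the element from landing behind the watermark.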
>
> On Wed, Jan 25, 2017 at 9:24 PM, Shen Li <[email protected]> wrote:
>
> > Hi,
> >
> > When reading from a source with no timestamp specified on elements, what
> > should be the default timestamp? I presume that it should be 0 as I saw
> > PAssertTest trying to set timestamps to very small values with 0 allowed
> > timestamp skew. Is that right?
> >
> > What about the default watermark policy?
> >
> > If a ParDo modifies the timestamp using
> > DoFnProcessContext.outputWithTimestamp, how should that affect the output
> > watermark? Say the ParDo adds 100 seconds to the timestamp of each element
> > in processElement, how could the runner know it should also add 100 seconds
> > to output timestamps?
> >
> > Thanks,
> >
> > Shen
> >
>
