+1
- Ran full Clojure ports of all mobile gaming demos ([1],[2]): user score,
hourly team score, leaderboard, game stats, and stateful team score using
thurber[3]
...using 2.20.0 + JDK8 + Dataflow
- Ran thurber test suite (DirectRunner, org.apache.beam.sdk.testing) which
exercises
@Tomo Suzuki Thanks for looking at this and your thorough analysis// let me
know if I can help with any regression testing, CRs, or more context
anything else.
---
@Elliotte I was a little confused at first re: vendoring vs shading but
here is how I understand it (happy to be corrected if
thub.com/apache/beam/blob/master/vendor/README.md
> >>> >
> >>> >
> >>> > Tomo Suzuki 于2020年1月16日周四 下午1:18写道:
> >>> >>
> >>> >> I've been upgrading dependencies around gRPC. This Avro-problem is
> >>
I meant to mention that we must use Avro 1.9.x as we rely on some schema
resolution fixes not present in 1.8.x - so am indeed blocked.
On Wed, Jan 15, 2020 at 8:50 PM Aaron Dixon wrote:
> It looks like Avro version dependency from Beam has come up in the past
> [1, 2].
>
> I'm curre
It looks like Avro version dependency from Beam has come up in the past [1,
2].
I'm currently on Beam 2.16.0, which has been compatible with my usage of
Avro 1.9.x.
But upgrading to Beam 2.17.0 is not possible for us now that 2.17.0 has
some dependencies on Avro classes only available in 1.8.x.
terintuitive in
> accumulating mode, where the empty pane is not the identity element.
>
> Kenn
>
> *I don't recall if this is the default or not, and also because on phone
> it is slow to look up. From your experience I think not default.
>
> On Mon, Jan 13, 2020, 15:03 Aa
are "optional"... ?
On Mon, Jan 13, 2020 at 4:18 PM Aaron Dixon wrote:
> Yes. Using calendar day-based windows and watermark is completely caught
> up to today ... calendar window ends several days ago. I got EARLY panes
> for each element but never ON_TIME pane.
>
> On
dow?
>
> On Mon, Jan 13, 2020 at 2:02 PM Aaron Dixon wrote:
>
>> The window is not empty fwiw; it has elements; I get an early firing pane
>> for the window but well after the watermark passes there is no ON_TIME
>> pane. Would this be a bug in Dataflow? Seems fundamenta
, 2020 at 3:58 PM Luke Cwik wrote:
> I would have expected an empty on time pane since the default on time
> behavior is FIRE_ALWAYS.
>
> On Mon, Jan 13, 2020 at 1:54 PM Aaron Dixon wrote:
>
>> Can anyone confirm?
>>
>> This is intermittent. Some (it seems, sparse
rantee to fire.
>
>
>
> -Rui
>
> On Mon, Jan 13, 2020 at 1:43 PM Aaron Dixon wrote:
>
>> I have the following trigger:
>>
>> .apply(Window
>> .configure()
>> .triggering(AfterWatermark
>>.pastEndOfWindow(
I have the following trigger:
.apply(Window
.configure()
.triggering(AfterWatermark
.pastEndOfWindow()
.withEarlyFirings(AfterPane
.elementCountAtLeast(1)))
.accumulatingFiredPanes()
.withAllowedLateness(Duration.ZERO)
But in Dataflow
osing some
> of these (if the skew ever happens to be larger than the static value you
> set, you'll not be about to output the session). However now you know
> you're limiting the skewed output to only those specific long sessions
> you've chosen, which is much better than emitting all
ing properly, and might cause data to be dropped. Have
> you considered instead using either triggers or timers to trigger your
> aggregations? That way you don't need to wait for the watermark to advance
> to the end of the window to trigger the aggregation, but the end-of-window
&g
think this use
case is interesting because it ...seems... to be a rather simple/distilled
justification for being able to output data behind the watermark
mid-stream.)
[1] https://beam.apache.org/blog/2016/10/20/test-stream.html
On Sat, Jan 11, 2020 at 10:10 PM Aaron Dixon wrote:
> Oh nice—that will be
ou set a
> timer using withOutputTimetstamp(t), the watermark will be held to t.
>
> On Sat, Jan 11, 2020 at 4:15 PM Aaron Dixon wrote:
>
>> Hi Reuven thanks for your quick reply
>>
>> I've tried that but the drag it puts on the watermark was too intrusive.
>> For ex
ent), so you can .call outputWithTimestamp using
> the CLICK GREEN timestamp without needing to set the allowed-lateness skew.
>
> Reuven
>
> On Sat, Jan 11, 2020 at 1:50 PM Aaron Dixon wrote:
>
>> I've just built a pipeline in Beam and after exploring several options
>&
I've just built a pipeline in Beam and after exploring several options for
my use case, I've ended up relying on the deprecated
.outputWithTimestamp() + DoFn#getAllowedTimestampSkew in what seems to me a
quite valid use case. So I suppose this is a vote for un-deprecating this
API (or a teachable
interested in seeing how the windowing merge semantics get
nailed down / evolve, and to explore what kind of innovative stuff could be
ultimately done with them..
On Fri, Jan 10, 2020 at 4:38 PM Aaron Dixon wrote:
> Once again this is a great help, thank you Kenneth
>
> On Wed, Jan 8, 2020
bug or a mismatch based on the assumptions. Note that this code/logic is
> shared by all runners. I do think you can write a WindowFn that induces it.
>
> Kenn
>
> *this was intended to be a performance optimization, but eagerly copying
> the data turned out faster so now it is
ate feature on this pipeline? Also, do
> you have the code for your WindowFn?
>
> On Tue, Jan 7, 2020 at 12:05 PM Aaron Dixon wrote:
>
>> Dataflow. (See stacktrace)
>>
>> On Tue, Jan 7, 2020 at 1:50 PM Reuven Lax wrote:
>>
>>> Which runner are you using?
>
Dataflow. (See stacktrace)
On Tue, Jan 7, 2020 at 1:50 PM Reuven Lax wrote:
> Which runner are you using?
>
> On Tue, Jan 7, 2020, 11:17 AM Aaron Dixon wrote:
>
>> I get an IllegalStateException " is in more than one state
>> address window set" (stacktrace
I get an IllegalStateException " is in more than one state address
window set" (stacktrace below).
What does this mean? What invariant of custom window implementation
& merging am I violating?
Thank you for any advise.
```
java.lang.IllegalStateException:
have?)
On Sat, Dec 7, 2019 at 10:54 PM Aaron Dixon wrote:
> > In your proposed solution, it probably could be expressed as a new
> merging WindowFn. You would assign each Green element to two tagged windows
> that were GreenFromOrange and GreenToBlue type, and have a se
eth Knowles wrote:
> On Wed, Nov 13, 2019 at 7:39 PM Aaron Dixon wrote:
>
>> This is a great help. Thank you. I like the custom window solution
>> pattern as a way to hold the watermark and merge down to keep the watermark
>> where it is needed. Perhaps there is some inte
/org/apache/beam/sdk/transforms/DoFn.java#L557
>>>
>>>
>>> On Mon, Nov 25, 2019 at 4:06 PM Pablo Estrada
>>> wrote:
>>>
>>>> The blog posts on stateful and timely computation with Beam should help
>>>> clarify a lot about how to use state and timers t
; > b) use a buffer to buffer elements inside the outputting ParDo and
> > only output them when watermark passes (using a timer).
> >
> > There is actually an ongoing discussion about how to make option b)
> > user-friendly and part of Beam itself, but currently there
Suppose I trigger a Combine per-element (in a high-volume stream) and use a
ParDo as a sink.
I assume there is no guarantee about the order that my ParDo will see these
triggers, especially as it processes in parallel, anyway.
That said, my sink writes to a db or cache and I would not like the
This should be doable with one.
>
> *SessionWindow plus EARLIEST holding up the watermark/pipeline was an
> early complaint. That is part of why we switched the default to
> end-of-window (also it is easier to understand and more efficient to
> compute)
>
> Kenn
>
> On
, 2019 at 10:55 AM Reuven Lax wrote:
>
>
> On Tue, Nov 5, 2019 at 8:07 AM Aaron Dixon wrote:
>
>> I noticed that if I use TimestampCombiner/EARLIEST for session windows
>> that the watermark appears to get held up for sessions that never "close"
>> (or that
I noticed that if I use TimestampCombiner/EARLIEST for session windows that
the watermark appears to get held up for sessions that never "close" (or
that extend for a long time).
But if I use default (TimestampCombiner/END_OF_WINDOW) the watermark
doesn't get held.
Does this mean that the
igger 'into' downstream aggregations, why is the API
so friendly to doing just this?
On Wed, Oct 30, 2019 at 5:37 PM Robert Bradshaw wrote:
> On Tue, Oct 29, 2019 at 7:01 PM Aaron Dixon wrote:
> >
> > Thank you, Luke and Robert. Sorry for hitting dev@, I criss-crossed and
> meant
alue once even if its in many windows. If that
> > doesn't perform well, then come back to dev@ and look to optimize.
> >
> > On Tue, Oct 29, 2019 at 1:22 PM Aaron Dixon wrote:
> >>
> >> Hi I am new to Beam.
> >>
> >> I would like to accumulat
Hi I am new to Beam.
I would like to accumulate data over 30 day period and perform a running
aggregation over this data, say every 10 minutes.
I could use a sliding window of 30 days every 10 minutes (triggering at end
of window) but this seems grossly inefficient (both in terms of # of
windows
33 matches
Mail list logo