Re: Beam Website Feedback

2023-02-06 Thread Reza Ardeshir Rokni
Hi Shlomi,

If you intend to make use of GPUs for machine learning inference, the
following resources may also be of interest to you:

RunInference transform information:
https://beam.apache.org/documentation/sdks/python-machine-learning/

You may also want to have a look at:
https://cloud.google.com/dataflow/docs/machine-learning

Cheers

Reza

On Mon, 6 Feb 2023 at 13:24, Bruno Volpato via dev 
wrote:

> Hi Shlomi,
>
> Unfortunately, those cited references are about as much as we have
> available. I acknowledge that they are not very comprehensive -- so I'll
> try to share some insight.
>
> Related to your sample, I believe there are relevant missing pieces, as I
> am not sure what the input looks like (bounded / unbounded, what the
> triggering looks like if unbounded) or how KVs became Rows.
> But regarding ResourceHints, they are applicable to any PTransform, so in
> your example, you can apply it directly when composing
> AvroIO.parseFilesGenericRecords:
>
> .apply("Match file names", FileIO.*matchAll*())
> .apply("Read Avro files", FileIO.*readMatches*())
> *.apply**(**"Parse Avro files into GenericRecord"**, **AvroIO**.*
> *parseFilesGenericRecords**(**new **CustomerTransformFn**()*
> *)**.withCoder**(**KvCoder**.**of**(**Customer**.**keyCoder**()*
> *, **Customer**.**valueCoder**()**)**)*
>
> .setResourceHints(ResourceHints.create().withMinRam("50GB")*)*
>
> .apply("Chunk customer", GroupIntoBatches.*ofSize*(size)
> .withMaxBufferingDuration(Duration.*standardSeconds*(duration)))
>
>
> Accelerators are mostly related to usage of GPUs (
> https://cloud.google.com/dataflow/docs/guides/using-gpus), which may
> outperform CPUs in certain scenarios (such as graphics or ML workloads that
> require high parallelization/vectorization), but I don't think the
> transforms mentioned here are ready to leverage them.
>
> Besides providing good resource hints so the workers are sized
> accordingly, I'd suggest analyzing which steps are being fused together
> (please check
> https://cloud.google.com/dataflow/docs/guides/right-fitting#right_fitting_and_fusion),
> as it may be the case that you could separate file discovery / matching
> (again, without analyzing the missing parts of the graph, it may be hard to
> make good suggestions).
>
>
> Best,
> Bruno
>
> On Mon, Feb 6, 2023 at 2:50 PM Ahmet Altay  wrote:
>
>> Adding @John Casey  @Bruno Volpato
>>  - who might be able to point to relevant docs.
>>
>> On Sat, Feb 4, 2023 at 11:59 AM Shlomi Elbaz 
>> wrote:
>>
>>> Hello All,
>>>
>>>
>>>
>>> We developed a service with Apache Beam that reads Avro files located in
>>> a GCP bucket.
>>>
>>> During load and benchmark tests, the pipeline hit a bottleneck and
>>> *out-of-memory* issues in the stage where the service accesses the Avro
>>> files via AvroIO.*parseFilesGenericRecords*
>>>
>>>
>>>
>>> The issue happens in the highlighted part:
>>>
>>> .apply("Match file names", FileIO.matchAll())
>>> .apply("Read Avro files", FileIO.readMatches())
>>> .apply("Parse Avro files into GenericRecord",
>>>     AvroIO.parseFilesGenericRecords(new CustomerTransformFn())
>>>         .withCoder(KvCoder.of(Customer.keyCoder(), Customer.valueCoder())))
>>> .apply("Chunk customer", GroupIntoBatches.ofSize(size)
>>>     .withMaxBufferingDuration(Duration.standardSeconds(duration)))
>>>
>>>
>>>
>>> We saw a tutorial regarding resource hints on the Apache Beam website,
>>> but there are no examples or information on how to use them with
>>> AvroIO.*parseFilesGenericRecords*.
>>>
>>> https://beam.apache.org/documentation/runtime/resource-hints/
>>>
>>>
>>>
>>> Is there more information or examples where we can read about
>>> ResourceHints and accelerators?
>>>
>>>
>>>
>>> Also, would you please recommend optimal settings for using
>>> ResourceHints?
>>>
>>>
>>>
>>> The additional tutorials that we relied on:
>>>
>>> https://www.youtube.com/watch?v=9fc2MNQHQ2s
>>>
>>> https://cloud.google.com/dataflow/docs/guides/right-fitting
>>>
>>>
>>> https://cloud.google.com/blog/products/data-analytics/introducing-vertical-autoscaling-in-dataflow-prime
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Shlomi Elbaz,
>>>
>>>
>>>
>>>
>>>

Re: Beam Java starter project template

2022-02-15 Thread Reza Ardeshir Rokni
Hi,

This is great!

What do folks think about also having a less minimal set of starters? For
Java I am thinking about protobuf / autovalue. For Python maybe an
opinionated setup with tox etc... Again, these would just contain 'hello
world' samples to get folks going.

Regards
Reza

On Wed, 9 Feb 2022 at 13:56, Robert Burke  wrote:

> SGTM.
>
> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles  wrote:
>
>> Based on discussion on https://issues.apache.org/jira/browse/LEGAL-601 I
>> think it will be simplest to license it under ASL2 and include a NOTICE
>> file. The user will be free to "clone and go".
>>
>> I would bring these points back to the dev list:
>>
>>  - ASL2 is what people expect from an ASF project, so it is "least
>> surprise"
>>  - Dual-licensing is possible (but I think not worthwhile due to its
>> impact on contributor license agreements)
>>  - ASL2 says "You must cause any modified files to carry prominent
>> notices stating that You changed the files" which won't apply to the user's
>> code and I would guess they simply won't bother with for files in the
>> template. Or maybe there is a clever way to phrase the header so it is
>> already good to go.
>>  - ASL2 says if the work includes a NOTICE file, you have to include the
>> attributions from it. The NOTICE file is required by ASF policy. We can
>> easily set it up to be a noop for the user.
>>
>> So my overall take is that we should go ahead with ASL2 and a simple
>> NOTICE file. Check the Jira for details.
>>
>> Kenn
>>
>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles  wrote:
>>
>>> And I've created the repos just now.
>>>
>>> Kenn
>>>
>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles  wrote:
>>>
 Legal question asked at https://issues.apache.org/jira/browse/LEGAL-601

 Kenn

 On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
 dannymccorm...@google.com> wrote:

> Sure - I'm happy to help out with the Actions setup (and/or with the
> Go template). I will say though, the Actions config should be pretty darn
> simple for these examples -
> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
> seems right, for each language configuration we're targeting we basically
> just want a job with:
>
>- checkout
>- setup-
>- inlined script to run tests
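A minimal workflow matching those three steps might look roughly like this (a hypothetical sketch, not the actual config; the action versions, Java version, and `./gradlew test` command are all assumptions):

```yaml
# Hypothetical minimal CI workflow for a Java starter repo.
name: Test
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2        # checkout
      - uses: actions/setup-java@v2      # setup-<language>
        with:
          distribution: temurin
          java-version: '11'
      - run: ./gradlew test              # inlined script to run tests
```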
>
> Always happy to help with or consult on any actions issues 
>
> Thanks,
> Danny
>
> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark 
> wrote:
>
>> Danny has extensive experience with GitHub actions, and may be able
>> to help out.
>> Kerry
>>
>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles 
>> wrote:
>>
>>> I'm convinced on all points. My main motivation was to keep it
>>> simple. But of course we should keep it simple for users, not us :-)
>>>
>>> I can take on the task of asking about MIT license and requesting
>>> the repos be created. Not sure if it needs my level of privileges but 
>>> I'm
>>> happy to do it anyhow.
>>>
>>> Kenn
>>>
>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw 
>>> wrote:
>>>
 On Wed, Feb 2, 2022 at 10:12 AM David Cavazos 
 wrote:
 >
 > MIT is much more permissive, but I also don't have any problems
 changing it to Apache license. In any case, how about we create the
 following repos?

 For these starter projects, we don't want to encumber any users of
 these templates with any particular licensing requirements (right?)
 and we don't even care about attribution. We want these to be pretty
 much as close to public domain as possible. That's not what the Apache
 license does. (If it's even relevant, a good argument could likely be
 made for de minimis or fair use, but I think it's best to be explicit
 about this. Perhaps this'd be a good question for Apache legal?)

 > apache/beam-starter-java
 > apache/beam-starter-python
 > apache/beam-starter-go
 > apache/beam-starter-kotlin
 > apache/beam-starter-scala
 >
 > We'll start by populating the Java one which is the most pressing
 one and the one that is ready, but the rest should be simpler.
 >
 > +David Huntsperger, tldr; these are minimal starter projects for
 every language. Once we have Java, Python and Go, it might be a good 
 idea
 to change the quickstarts to use these instead of the word count. 
 There is
 already a dedicated word count walkthrough so I think that is already
 covered.
 >
 > If we all agree on the repo names, who can help us create them?
 >
 > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
 rober...@google.com> wrote:
 >>
 >> On Tue, Jan 18, 2022 at 6:17 AM 

Re: Unbounded pipeline: how to trigger a computation on empty windows

2021-01-19 Thread Reza Ardeshir Rokni
You may want to explore the use of Looping Timers.

https://beam.apache.org/blog/looping-timers/
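To sketch the core idea behind looping timers (a plain-Python simulation of the semantics, not Beam API code; the function and names below are invented):

```python
# Simulation of a per-key looping timer: once armed, the timer re-sets
# itself every `interval` in its own OnTimer callback, so a tick is
# emitted even for intervals that contain no elements.
def looping_ticks(element_timestamps, interval, watermark_end):
    first = min(element_timestamps)
    # Align the first firing to the start of the interval containing the
    # first element (as the looping-timers blog post does).
    timer = first - (first % interval)
    ticks = []
    while timer <= watermark_end:
        ticks.append(timer)   # OnTimer: emit a value for this interval...
        timer += interval     # ...and immediately re-arm the timer
    return ticks

# Elements only at t=0 and t=35, yet every 10-unit interval up to the
# watermark gets a tick:
print(looping_ticks([0, 35], 10, 50))  # prints [0, 10, 20, 30, 40, 50]
```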

Please note that there are no ordering guarantees in OnProcess; some ideas
on solutions to this are explored in:

https://www.youtube.com/watch?v=Y_HoNNU6b3I

Cheers

On Wed, 20 Jan 2021 at 05:07, Reuven Lax  wrote:

> What sort of windows are you using?
>
> On Tue, Jan 19, 2021 at 1:05 PM Andrei Arion 
> wrote:
>
>> Hello,
>>
>> I have tried to use Event Time Timers cf
>> https://beam.apache.org/blog/timely-processing/ but i cannot seem to
>> find a way to set the timer (when there are no elements the process()
>> method is never called...)
>>
>> The closest answer that I could find on StackOverflow (
>> https://stackoverflow.com/questions/5718/apache-beam-how-to-trigger-empty-windows)
>> seem to suggest that there is no alternative than inserting dummy/fake data
>> ?
>>
>>
>> Is this true or there are better ways to achieve that.
>>
>> Thank you,
>> Andrei
>>
>


Re: PTransform Annotations Proposal

2020-11-16 Thread Reza Ardeshir Rokni
+1 having a NeedsRam(x) annotation would be incredibly helpful.

On Fri, 13 Nov 2020 at 05:57, Robert Burke  wrote:

> (Disclaimer, Mirac and their team did approach me about this beforehand as
> their interest is in the Go SDK.)
>
> +1 I think it's a good idea. As you've pointed out, there are many
> opportunities for optional pipeline analysis here as well.
>
> A strawman counterpoint would be to re-use the static DisplayData for
> this kind of thing, but I think that's not necessarily the same thing. It's
> very hard to get something that's purely intended for human consumption to
> also be suitable for machine consumption, without various adapters and
> such, and it would be an awful hack. Having something specifically for
> machines to understand is valuable in and of itself.
>
> I appreciate the versatility of simply using known URNs and their defined
> formats, and especially keeping the proposal to optional annotations that
> don't affect correctness. This will work well with most DoFns that need
> specialized hardware. They can usually be emulated on ordinary CPUs, which
> is good for testing, but can perform much better if the hardware is
> available. This also allows the runners to move execution of specific DoFns
> to the machines with the specialized hardware, for better scheduling of
> resources.
>
> I look forward to the PR, and before then, all the discussion the
> community has about this new field in the model proto.
>
>
>
>
>
> On Thu, 12 Nov 2020 at 09:41, Mirac Vuslat Basaran 
> wrote:
>
>> Hi all,
>>
>> We would like to propose adding functionality to add annotations to Beam
>> transforms. These annotations would be readable by the runner, and the
>> runner could then act on this information; for example by doing some
>> special resource allocation. There have been discussions around annotations
>> (or hints as they are sometimes called) in the past (
>> https://lists.apache.org/thread.html/rdf247cfa3a509f80578f03b2454ea1e50474ee3576a059486d58fdf4%40%3Cdev.beam.apache.org%3E,
>>
>> https://lists.apache.org/thread.html/fc090d8acd96c4cf2d23071b5d99f538165d3ff7fbe6f65297655309%40%3Cdev.beam.apache.org%3E).
>> This proposal aims to come up with an accepted lightweight solution with a
>> follow-up Pull Request to implement it in Go.
>>
>> By annotations, we refer to optional information / hints provided to the
>> runner. This proposal explicitly excludes “required” annotations that could
>> cause incorrect output. A runner that does not understand the annotations
>> and ignores them must still produce correct output, with perhaps a
>> degradation in performance or other nonfunctional requirements. Supporting
>> only “optional” annotations allows for compatibility with runners that do
>> not recognize those annotations.
>>
>> A good example of an optional annotation is marking a transform to be run
>> on GPU or TPU or that it needs a certain amount of RAM. If the runner knows
>> about this annotation, it can then allocate the requested resources for
>> that transform only to improve performance and avoid using these scarce
>> resources for other transforms.
>>
>> Another example of an optional annotation is marking a transform to run
>> on secure hardware, or to give hints to profiling/dynamic analysis tools.
>>
>> In all these cases, the runner can run the pipeline with or without the
>> annotation, and in both cases the same output would be produced. There
>> would be differences in nonfunctional requirements (performance, security,
>> ease of profiling), hence the optional part.
>>
>> A counter-example that this proposal explicitly excludes is marking a
>> transform as requiring sorted input. For example, on a transform that
>> expects time-sorted input in order to produce the correct output. If the
>> runner ignores this requirement, it would risk producing an incorrect
>> output. In order to avoid this, we exclude these required annotations.
>>
>> Implementation-wise, we propose to add a field:
>>  - map<string, bytes> annotations = 8;
>> to PTransform proto (
>> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L127).
>> The key would be a URN that uniquely identifies the type of annotation. The
>> value is an opaque byte array (e.g., a serialized protocol buffer) to allow
>> for maximum flexibility to the implementation of that specific type of
>> annotation.
>>
>> We have a specific interest in adding this to the Go SDK. In Go, the user
>> would specify the annotations to a structural ParDo as follows, by defining
>> a field:
>>  - Annotations map[string][]byte
>> and filling it out. For simplicity, we will only support structural doFns
>> in Go for the time being.
>>
>> The runners could then read the annotations from the PTransform proto and
>> support the annotations that they would like to in the way they want.
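A rough sketch of the proposed contract (plain Python standing in for the proto and SDK types; the URNs and payloads below are invented for illustration):

```python
# The proposed annotations field is a map from a URN (uniquely identifying
# the annotation type) to an opaque byte payload. Because annotations are
# optional and never affect correctness, a runner acts on the URNs it
# understands and must silently ignore the rest.
transform_annotations = {
    "beam:annotation:min_ram:v1": b"8GB",      # invented URN and payload
    "beam:annotation:accelerator:v1": b"gpu",  # invented URN and payload
}

SUPPORTED_URNS = {"beam:annotation:min_ram:v1"}

def runner_view(annotations):
    # Keep only the annotations this runner knows how to act on; the
    # pipeline's output is identical either way.
    return {urn: v for urn, v in annotations.items() if urn in SUPPORTED_URNS}

print(runner_view(transform_annotations))
```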
>>
>> Please let me know what you think, and what would be the best way to
>> proceed, e.g., we can share a small design doc or, in case 

Re: Question about saving data to use across runner's instances

2020-11-16 Thread Reza Ardeshir Rokni
Hi,

Do you have an upper bound on how large the file will become? If it's
small enough to fit into a side input, you may be able to make use of the
slow-update side input pattern:
https://beam.apache.org/documentation/patterns/side-inputs/
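The slow-update pattern boils down to a periodic impulse that re-reads the small source and republishes the latest snapshot as a side input. A toy, single-process sketch of that idea (not Beam code; a timestamp check stands in for the periodic impulse, and all names are invented):

```python
import time

# Toy version of the slow-updating side input pattern: periodically
# re-read a small source and publish the latest snapshot, which the main
# path consults for every element.
class SlowUpdatingSideInput:
    def __init__(self, load_fn, refresh_every):
        self.load_fn = load_fn            # e.g. re-read the file from a bucket
        self.refresh_every = refresh_every
        self.snapshot = load_fn()
        self.last_refresh = time.monotonic()

    def get(self):
        # Refresh at most once per `refresh_every` seconds.
        now = time.monotonic()
        if now - self.last_refresh >= self.refresh_every:
            self.snapshot = self.load_fn()
            self.last_refresh = now
        return self.snapshot

versions = iter([{"v": 1}, {"v": 2}])
side = SlowUpdatingSideInput(lambda: next(versions), refresh_every=0.0)
print(side.get())  # prints {'v': 2}
```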

If not, then a StatefulDoFn would be a good choice, but note a stateful
DoFn is per key/window. Is there a natural key in the data that you can
use? If yes, something like the streaming-joins-in-a-recommendation-system
pattern may be useful for your use case.

In terms of persisting the file, you may want to create a branch in the
pipeline and every time you update the file data, write the file out to an
object store, which you can read from if the pipeline needs to be restarted
or crashes.

Cheers
Reza

On Mon, 16 Nov 2020 at 04:48, Artur Khanin  wrote:

> Hi all,
>
> I am designing a Dataflow pipeline in Java that has to:
>
>- Read a file (it may be pretty large) during initialization and then
>store it in some sort of shared memory
>- Periodically update this file
>- Make this file available to read across all runner's instances
>- Persist this file in cases of restarts/crashes/scale-up/scale down
>
>
> I found some information about stateful processing in Beam using Stateful
> DoFn. Is it an appropriate way to handle such functionality, or is there a
> better approach for it?
>
> Any help or information is very appreciated!
>
> Thanks,
> Artur Khanin
> Akvelon, Inc.
>
>


Re: [Proposal] Add a new Beam example to ingest data from Kafka to Pub/Sub

2020-10-14 Thread Reza Ardeshir Rokni
Just a thought, but what if in the future there were templates for other
runners?

Then having a template folder would fit nicely no? We could even have a
runner specific subfolder and maybe even a shared area for things that
could be used by all templates for all runners?

On Thu, 15 Oct 2020 at 11:47, Kenneth Knowles  wrote:

> Hi Ilya,
>
> I have added you to the "Contributors" role on Jira so you can be assigned
> tickets, and given you the ticket you filed since you are already solving
> it. Thanks!
>
> I have a very high level thought: Since Dataflow's "Flex Templates"
> feature is just any pipeline, perhaps the main pipeline can be more of an
> "example" and fit into the `examples/` folder? Then the containerization
> and Google-specific* JSON could be alongside. In this way, users of other
> runners could possibly use or learn from it even if they are not interested
> in GCP. I understand this is not your primary goal, considering
> the contribution. I just want to open this for discussion.
>
> Kenn
>
> *In fact, the JSON is very generic. It is not really "Google specific" in
> concept, just in practice.
>
> On Wed, Oct 14, 2020 at 12:14 PM Ilya Kozyrev 
> wrote:
>
>> Hi Beam Community,
>>
>> There was no feedback on the proposal, and I would like to submit PR for
>> this proposal.
>>
>> I created a JIRA improvement to track this proposal and am now submitting
>> a PR in the Beam repository related to the proposal that I sent before. We
>> suggest adding a /template folder to the repository root to help
>> developers discover templates. This will provide structure for future
>> template development for Beam.
>>
>> Could someone kindly help with reviewing the PR?
>>
>> Thank you,
>> Ilya
>>
>> On 7 Oct 2020, at 21:23, Ilya Kozyrev  wrote:
>>
>> Hi Beam Community,
>>
>> I have a proposal to add Apache Beam example that is a template to ingest
>> data from Apache Kafka to Google Cloud Pub/Sub. More detailed information
>> about the proposed template can be found in the README file, and a
>> prototype was built with
>> a team. I'd like to ask for your feedback before moving forward with
>> finishing the template.
>>
>> I did not see a folder that provides easily discoverable templates to a
>> developer.  I would like to propose adding a "templates" folder where other
>> Apache Beam templates may be added in the future. E.g.,
>> beam/templates/java/kafka-to-pubsub could be used for the Kafka to Pub/Sub
>> template.
>>
>> Please share your feedback/comments about this proposal in the thread.
>>
>> Thank you,
>> Ilya
>>
>>
>>


Re: Stateful Pardo Question

2020-08-09 Thread Reza Ardeshir Rokni
+1 on having the behavior clearly documented; it would also be great to try
and add more state and timer patterns to the Beam docs patterns page:
https://beam.apache.org/documentation/patterns/overview/.

I think it might be worth describing these kinds of patterns with an
emphasis on the OnTimer being where the work happens. One thing that would
make all of this a lot easier, by reducing the boilerplate code that would
need to be written, is a sorted map state (a topic of discussion on a few
threads).

On Mon, 10 Aug 2020 at 01:16, Reuven Lax  wrote:

> Timers in Beam are considered "eligible to fire" once the watermark has
> advanced. This is not the same as saying that they will fire immediately.
> You should not assume ordering between the elements and the timers.
>
> This is one reason (among many) that Beam does not provide a "read
> watermark" primitive, as it leads to confusions such as this. Since there
> is no read-watermark operator, the only way for a user's ParDo to view that
> the watermark has been set is to set a timer and wait for it to expire.
> Watermarks on their own can act in very non-intuitive ways (due to
> asynchronous advancement), so generally we encourage people to reason about
> timers and windowing in their code instead.
>
> Reuven
>
> On Sun, Aug 9, 2020 at 9:39 AM jmac...@godaddy.com 
> wrote:
>
>> I understand that watermarks are concurrently advanced, and that they are
>> estimates and not precise, but I’m not sure this is relevant in this case.
>> In this repro code we are in processElement() and the watermark HAS
>> advanced, but the timer has not been called even though we asked the
>> runtime to do that. In this case we are in a per-key stateful operating
>> mode and our timer should not be shared with any other runners (is that
>> correct?), so it seems to me that we should be able to operate in a manner
>> that is locally consistent from the point of view of the DoFn we are
>> writing. That is to say, _*before*_ we enter processElement we check any
>> local timers first. I would argue that this would be far more sensible
>> from the author's perspective.
>>
>>
>>
>> *From: *Reuven Lax 
>> *Reply-To: *"dev@beam.apache.org" 
>> *Date: *Thursday, August 6, 2020 at 11:57 PM
>> *To: *dev 
>> *Subject: *Re: Stateful Pardo Question
>>
>>
>>
>> Notice: This email is from an external sender.
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Aug 4, 2020 at 1:08 PM jmac...@godaddy.com 
>> wrote:
>>
>> So, after some additional digging, it appears that Beam does not
>> consistently check for timer expiry before calling process. The result is
>> that it may be the case that the watermark has moved beyond your timer
>> expiry, and if you're counting on the timer callback happening at the time
>> you set it for, that simply may NOT have happened when you are in
>> DoFn.process(). You can “fix” the behavior by simply checking the watermark
>> manually in process() and doing what you would normally do for timer
>> expiry before proceeding. See my latest updated code reproducing the issue
>> and showing the fix at https://github.com/randomsamples/pardo_repro.
>>
>>
>>
>> I would argue that users of this API will naturally expect that timer
>> callback semantics will guarantee that when they are in process(), if the
>> current watermark is past a timers expiry that the timer callback in
>> question will have been called. Is there any reason why this isn’t
>> happening? Am I misunderstanding something?
>>
>>
>>
>> Timers do not expire synchronously with the watermark advancing. So if
>> you have a timer set for 12pm and the watermark advances past 12pm, that
>> timer is now eligible to fire, but might not fire immediately. Some other
>> elements may process before that timer fires.
>>
>>
>>
>> There are multiple reasons for this, but one is that Beam does not
>> guarantee that watermark advancement is synchronous with element
>> processing. The watermark might advance suddenly while in the middle
>> processing an element, or at any other time. This makes it impossible (or
>> at least, exceedingly difficult) to really provide the guarantee you
>> expected.
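The eligibility gap Reuven describes, and the manual-check workaround from the quoted email, can be pictured with a toy event loop (pure Python with invented names; this is an illustration of the semantics, not Beam API code):

```python
# Toy runner: the watermark can pass a timer's expiry while elements are
# still queued, so the timer is only *eligible* -- elements may be
# processed before it actually fires.
def naive(events, timer_at):
    # Timers fire only when the runner gets around to it (here: at the end).
    log = [("element", ts) for ts in events]
    log.append(("timer", timer_at))
    return log

def defensive(events, timer_at):
    # process() checks the watermark itself and flushes before the element.
    log, watermark = [], 0
    for ts in events:
        watermark = max(watermark, ts)
        if timer_at is not None and watermark >= timer_at:
            log.append(("timer", timer_at))
            timer_at = None
        log.append(("element", ts))
    return log

events = [5, 40]              # a timer was set for t=20
print(naive(events, 20))      # timer trails the t=40 element
print(defensive(events, 20))  # timer handled before the t=40 element
```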
>>
>>
>>
>> Reuven
>>
>>
>>
>> *From: *"jmac...@godaddy.com" 
>> *Reply-To: *"dev@beam.apache.org" 
>> *Date: *Monday, August 3, 2020 at 10:51 AM
>> *To: *"dev@beam.apache.org" 
>> *Subject: *Re: Stateful Pardo Question
>>
>>
>>
>> Notice: This email is from an external sender.
>>
>>
>>
>> Yeah, unless I am misunderstanding something. The output from my repro
>> code shows event timestamp and the context timestamp every time we process
>> an event.
>>
>> Receiving event at: 2000-01-01T00:00:00.000Z
>>
>> Resetting timer to : 2000-01-01T00:15:00.000Z
>>
>> Receiving event at: 2000-01-01T00:05:00.000Z
>>
>> Resetting timer to : 2000-01-01T00:20:00.000Z <-- Shouldn’t the timer
>> have fired before we processed the next event?
>>
>> Receiving event at: 2000-01-01T00:40:00.000Z
>>
>> Why didn't the timer fire?
>>
>> Resetting timer to : 

Re: Stateful Pardo Question

2020-08-09 Thread Reza Ardeshir Rokni
I think the difference is that you could try doing the timer resets within
the OnTimer code (after the initial start) rather than in onProcess(). This
way it doesn't matter if more events arrive before the timer fires, as you
would sort them when the timer actually does go off. You would need to
store your elements as Timestamped, of course. Assuming I have understood
the use case correctly.

Sorry I won't have time to try it out myself this week, but it's a
worthwhile pattern to explore and publish on the patterns page.

Cheers
Rez

On Mon, 10 Aug 2020, 00:30 jmac...@godaddy.com,  wrote:

> This is pretty much what the repro code does. The problem is that it
> doesn’t work the way we would expect it should because the timer isn’t
> called before processevent.
>
>
>
> *From: *Reza Ardeshir Rokni 
> *Reply-To: *"dev@beam.apache.org" 
> *Date: *Friday, August 7, 2020 at 5:34 AM
> *To: *dev 
> *Subject: *Re: Stateful Pardo Question
>
>
>
> Notice: This email is from an external sender.
>
>
>
> Hi,
>
>
>
> One possible approach ( have not tried it out, so might be missing
> cases..) but you can reset the timer from within the OnTimer code.
>
>
>
> So maybe you start the timer in onProcess to go off at
> current+requiredGap. Then in OnTimer, you check the list of elements and
> output a session if nothing new arrived. Then reset the timer to go off
> either at latestTimestampValue+requiredGap if there were new elements, or
> at currentEventTime+requiredGap. If a timer fires and there are no elements
> in the bag, then you don't reset.
>
>
>
> You will need to keep state to know you have a timer firing so as not to
> set it again in OnProcess as there is no read() for timers.
>
>
>
> Also we don't have a sorted map state, so you will take a performance hit
> as you will need to keep sorting the events every OnTimer...
>
>
>
> Cheers
>
> Reza
>
>
>
>
>
> On Fri, 7 Aug 2020 at 14:57, Reuven Lax  wrote:
>
>
>
>
>
> On Tue, Aug 4, 2020 at 1:08 PM jmac...@godaddy.com 
> wrote:
>
> So, after some additional digging, it appears that Beam does not
> consistently check for timer expiry before calling process. The result is
> that it may be the case that the watermark has moved beyond your timer
> expiry, and if you're counting on the timer callback happening at the time
> you set it for, that simply may NOT have happened when you are in
> DoFn.process(). You can “fix” the behavior by simply checking the watermark
> manually in process() and doing what you would normally do for timer
> expiry before proceeding. See my latest updated code reproducing the issue
> and showing the fix at https://github.com/randomsamples/pardo_repro.
>
>
>
> I would argue that users of this API will naturally expect that timer
> callback semantics will guarantee that when they are in process(), if the
> current watermark is past a timers expiry that the timer callback in
> question will have been called. Is there any reason why this isn’t
> happening? Am I misunderstanding something?
>
>
>
> Timers do not expire synchronously with the watermark advancing. So if you
> have a timer set for 12pm and the watermark advances past 12pm, that timer
> is now eligible to fire, but might not fire immediately. Some other
> elements may process before that timer fires.
>
>
>
> There are multiple reasons for this, but one is that Beam does not
> guarantee that watermark advancement is synchronous with element
> processing. The watermark might advance suddenly while in the middle
> processing an element, or at any other time. This makes it impossible (or
> at least, exceedingly difficult) to really provide the guarantee you
> expected.
>
>
>
> Reuven
>
>
>
> *From: *"jmac...@godaddy.com" 
> *Reply-To: *"dev@beam.apache.org" 
> *Date: *Monday, August 3, 2020 at 10:51 AM
> *To: *"dev@beam.apache.org" 
> *Subject: *Re: Stateful Pardo Question
>
>
>
> Notice: This email is from an external sender.
>
>
>
> Yeah, unless I am misunderstanding something. The output from my repro
> code shows event timestamp and the context timestamp every time we process
> an event.
>
> Receiving event at: 2000-01-01T00:00:00.000Z
>
> Resetting timer to : 2000-01-01T00:15:00.000Z
>
> Receiving event at: 2000-01-01T00:05:00.000Z
>
> Resetting timer to : 2000-01-01T00:20:00.000Z <-- Shouldn’t the timer have
> fired before we processed the next event?
>
> Receiving event at: 2000-01-01T00:40:00.000Z
>
> Why didn't the timer fire?
>
> Resetting timer to : 2000-01-01T00:55:00.000Z
>
> Receiving event at: 2000-01-01T00:45:00.000Z
>
> Resetting timer to : 2

Re: Stateful Pardo Question

2020-08-07 Thread Reza Ardeshir Rokni
Hi,

One possible approach ( have not tried it out, so might be missing cases..)
but you can reset the timer from within the OnTimer code.

So maybe you start the timer in onProcess to go off at
current+requiredGap. Then in OnTimer, you check the list of elements and
output a session if nothing new arrived. Then reset the timer to go off
either at latestTimestampValue+requiredGap if there were new elements, or
at currentEventTime+requiredGap. If a timer fires and there are no elements
in the bag, then you don't reset.

You will need to keep state to know you have a timer firing so as not to
set it again in OnProcess as there is no read() for timers.

Also we don't have a sorted map state, so you will take a performance hit
as you will need to keep sorting the events every OnTimer...
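An idealized simulation of that flow (plain Python; unlike a real runner, timers here fire exactly on time and in order, and all names are invented):

```python
# Idealized simulation of the gap-session pattern: buffer timestamped
# elements, and let "onTimer" emit a session after a full gap of
# inactivity, re-arming the timer whenever newer elements exist.
def sessions(events, gap):
    out, cur, timer = [], [], None
    for ts in sorted(events):        # no ordering guarantee -> sort first
        if timer is not None and ts >= timer:
            out.append(cur)          # timer fired: a full gap passed quietly
            cur = []
        cur.append(ts)
        timer = ts + gap             # push the expiry out past the new element
    if cur:
        out.append(cur)              # final timer fires at end of input
    return out

# Timestamps (in minutes) like those in the repro output, with a
# 15-minute gap:
print(sessions([0, 5, 40, 45, 50], 15))  # prints [[0, 5], [40, 45, 50]]
```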

Cheers
Reza


On Fri, 7 Aug 2020 at 14:57, Reuven Lax  wrote:

>
>
> On Tue, Aug 4, 2020 at 1:08 PM jmac...@godaddy.com 
> wrote:
>
>> So, after some additional digging, it appears that Beam does not
>> consistently check for timer expiry before calling process. The result is
>> that it may be the case that the watermark has moved beyond your timer
>> expiry, and if you're counting on the timer callback happening at the time
>> you set it for, that simply may NOT have happened when you are in
>> DoFn.process(). You can “fix” the behavior by simply checking the watermark
>> manually in process() and doing what you would normally do for timer
>> expiry before proceeding. See my latest updated code reproducing the issue
>> and showing the fix at https://github.com/randomsamples/pardo_repro.
>>
>>
>>
>> I would argue that users of this API will naturally expect that timer
>> callback semantics will guarantee that when they are in process(), if the
>> current watermark is past a timers expiry that the timer callback in
>> question will have been called. Is there any reason why this isn’t
>> happening? Am I misunderstanding something?
>>
>
> Timers do not expire synchronously with the watermark advancing. So if you
> have a timer set for 12pm and the watermark advances past 12pm, that timer
> is now eligible to fire, but might not fire immediately. Some other
> elements may process before that timer fires.
>
> There are multiple reasons for this, but one is that Beam does not
> guarantee that watermark advancement is synchronous with element
> processing. The watermark might advance suddenly while in the middle
> processing an element, or at any other time. This makes it impossible (or
> at least, exceedingly difficult) to really provide the guarantee you
> expected.
>
> Reuven
>
>>
>>
>> *From: *"jmac...@godaddy.com" 
>> *Reply-To: *"dev@beam.apache.org" 
>> *Date: *Monday, August 3, 2020 at 10:51 AM
>> *To: *"dev@beam.apache.org" 
>> *Subject: *Re: Stateful Pardo Question
>>
>>
>>
>> Notice: This email is from an external sender.
>>
>>
>>
>> Yeah, unless I am misunderstanding something. The output from my repro
>> code shows the event timestamp and the context timestamp every time we
>> process an event.
>>
>> Receiving event at: 2000-01-01T00:00:00.000Z
>>
>> Resetting timer to : 2000-01-01T00:15:00.000Z
>>
>> Receiving event at: 2000-01-01T00:05:00.000Z
>>
>> Resetting timer to : 2000-01-01T00:20:00.000Z  <- Shouldn't the timer have
>> fired before we processed the next event?
>>
>> Receiving event at: 2000-01-01T00:40:00.000Z
>>
>> Why didn't the timer fire?
>>
>> Resetting timer to : 2000-01-01T00:55:00.000Z
>>
>> Receiving event at: 2000-01-01T00:45:00.000Z
>>
>> Resetting timer to : 2000-01-01T01:00:00.000Z
>>
>> Receiving event at: 2000-01-01T00:50:00.000Z
>>
>> Resetting timer to : 2000-01-01T01:05:00.000Z
>>
>> Timer firing at: 2000-01-01T01:05:00.000Z
>>
>>
>>
>> *From: *Reuven Lax 
>> *Reply-To: *"dev@beam.apache.org" 
>> *Date: *Monday, August 3, 2020 at 10:02 AM
>> *To: *dev 
>> *Subject: *Re: Stateful Pardo Question
>>
>>
>>
>>
>>
>>
>> Are you sure that there is a 15 minute gap in your data?
>>
>>
>>
>> On Mon, Aug 3, 2020 at 6:20 AM jmac...@godaddy.com 
>> wrote:
>>
>> I am confused about the behavior of timers on a simple stateful pardo. I
>> have put together a little repro here:
>> https://github.com/randomsamples/pardo_repro
>>
>>
>>
>> I basically want to build something like a session window, accumulating
>> events until the stream for a given key has been quiet for a gap time, then
>> output results. But it appears that the timer is not firing when the
>> watermark passes its expiration time, so the event stream is not being
>> split as I would have expected. Would love some help getting this to work;
>> the behavior is needed for a project I'm working on.
>>
>>


Re: [ANNOUNCE] New committer: Robin Qiu

2020-06-06 Thread Reza Ardeshir Rokni
Congratulations!

On Wed, 20 May 2020 at 23:57, Austin Bennett 
wrote:

> Congrats!
>
> On Tue, May 19, 2020, 8:32 PM Chamikara Jayalath 
> wrote:
>
>> Congrats Robin!
>>
>> On Tue, May 19, 2020 at 2:39 PM Rui Wang  wrote:
>>
>>> Nice! Congrats!
>>>
>>>
>>>
>>> -Rui
>>>
>>> On Tue, May 19, 2020 at 11:13 AM Pablo Estrada 
>>> wrote:
>>>
 yoohoo : )

 On Tue, May 19, 2020 at 11:03 AM Yifan Zou  wrote:

> Congratulations, Robin!
>
> On Tue, May 19, 2020 at 10:53 AM Udi Meiri  wrote:
>
>> Congratulations Robin!
>>
>> On Tue, May 19, 2020, 10:15 Valentyn Tymofieiev 
>> wrote:
>>
>>> Congratulations, Robin!
>>>
>>> On Tue, May 19, 2020 at 9:10 AM Yichi Zhang 
>>> wrote:
>>>
 Congrats Robin!

 On Tue, May 19, 2020 at 8:56 AM Kamil Wasilewski <
 kamil.wasilew...@polidea.com> wrote:

> Congrats!
>
> On Tue, May 19, 2020 at 5:33 PM Jan Lukavský 
> wrote:
>
>> Congrats Robin!
>> On 5/19/20 5:01 PM, Tyson Hamilton wrote:
>>
>> Congratulations!
>>
>> On Tue, May 19, 2020 at 6:10 AM Omar Ismail <
>> omarism...@google.com> wrote:
>>
>>> Congrats!
>>>
>>> On Tue, May 19, 2020 at 5:00 AM Gleb Kanterov 
>>> wrote:
>>>
 Congratulations!

 On Tue, May 19, 2020 at 7:31 AM Aizhamal Nurmamat kyzy <
 aizha...@apache.org> wrote:

> Congratulations, Robin! Thank you for your contributions!
>
> On Mon, May 18, 2020, 7:18 PM Boyuan Zhang 
> wrote:
>
>> Congrats~~
>>
>> On Mon, May 18, 2020 at 7:17 PM Reza Rokni 
>> wrote:
>>
>>> Congratulations!
>>>
>>> On Tue, May 19, 2020 at 10:06 AM Ahmet Altay <
>>> al...@google.com> wrote:
>>>
 Hi everyone,

 Please join me and the rest of the Beam PMC in welcoming a
 new committer: Robin Qiu .

 Robin has been active in the community for close to 2
 years, worked on HyperLogLog++ [1], SQL [2], improved 
 documentation, and
 helped with releases(*).

 In consideration of his contributions, the Beam PMC trusts
 him with the responsibilities of a Beam committer [3].

 Thank you for your contributions Robin!

 -Ahmet, on behalf of the Apache Beam PMC

 [1]
 https://www.meetup.com/Zurich-Apache-Beam-Meetup/events/265529665/
 [2]
 https://www.meetup.com/Belgium-Apache-Beam-Meetup/events/264933301/
 [3] https://beam.apache.org/contribute/become-a-committer
 /#an-apache-beam-committer
 (*) And maybe he will be a release manager soon :)

 --
>>>
>>> Omar Ismail |  Technical Solutions Engineer |
>>> omarism...@google.com |
>>>
>>


Re: Non-trivial joins examples

2020-05-03 Thread Reza Ardeshir Rokni
A couple of things that are really nice here,

1- Domain specific (CTR in your example). We may find that eventually it's
not possible / practical to build out generic joins for all situations. But
with the primitives available in Beam and good 'patterns', domain-specific
joins could be added for different industries.

2- Pros / Cons section. This is very nice, and as Kenn mentioned it would be
great for there to be a collection of joins that users can choose from
based on the pros / cons.

I got pulled onto other work before I could complete this PR (LINK
) for example, but I hope to go
back to it; it's specific to a time-series use case from a specific
industry, with pros and cons based on throughput etc.

Maybe we should consider adding something with links etc to Beam
patterns...

https://beam.apache.org/documentation/patterns/overview/

Perhaps a Joins section, where we do something that has not been done before
and add an Industry / Domain flavour...

Cheers

Reza

On Sat, 2 May 2020 at 14:45, Marcin Kuthan  wrote:

> @Kenneth - thanks for your response; for sure I was inspired a lot by
> earlier discussions on the group and the latest documentation updates about
> Timers:
> https://beam.apache.org/documentation/programming-guide/#timers
>
> In the limitations I forgot to mention SideInputs: they work quite
> well for scenarios where one side of the join is updated slowly, very
> slowly. But for scenarios where the main stream gets 50k+ events per
> second and the joined stream ~100 events per second, it simply does not
> work. Especially if there is no support for updates in the Map side input
> and the side input has to be updated/broadcast as a whole.
>
> @Jan - very interesting. As I understand it, the joins are already
> implemented (plenty of them in Scio, classic ones, sparse versions, etc.);
> the problem is with limited window semantics, the triggering policy, and
> the timestamps of emitted events.
>
> Please look at LookupCacheDoFn; it looks like a left outer join - but it
> isn't. Only the latest Lookup value (right side of the join) is cached. And
> the left side of the join is cached only until the first matching lookup is
> observed. Not so generic, but quite efficient.
>
>
> https://github.com/mkuthan/beam-examples/blob/master/src/main/scala/org/mkuthan/beam/examples/LookupCacheDoFn.scala
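This is not the actual LookupCacheDoFn, just a plain-Python sketch of the caching behavior described above (the function name, event shape, and ordering assumptions are mine): only the latest lookup value per key is kept, and main-side events are buffered only until the first matching lookup arrives.

```python
def lookup_cache_join(stream):
    """stream: time-ordered tuples of (side, key, value), where side is
    "L" for main-stream events and "R" for lookup updates."""
    latest_lookup, pending, out = {}, {}, []
    for side, key, value in stream:
        if side == "R":
            latest_lookup[key] = value             # keep only the latest value
            for buffered in pending.pop(key, []):  # release buffered left side
                out.append((key, buffered, value))
        else:
            if key in latest_lookup:
                out.append((key, value, latest_lookup[key]))
            else:
                pending.setdefault(key, []).append(value)  # wait for a lookup
    return out

out = lookup_cache_join([("L", "a", 1), ("R", "a", 9), ("L", "a", 2),
                         ("R", "a", 8), ("L", "a", 3)])
print(out)  # [('a', 1, 9), ('a', 2, 9), ('a', 3, 8)]
```

Note how the early event `("L", "a", 1)` is held only until the first lookup for `"a"` arrives, and later events join against whatever lookup value is newest.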
>
> Marcin
>
> On Fri, 1 May 2020 at 22:22, Jan Lukavský  wrote:
>
>> Interestingly, I'm currently also working on a proposal for generic join
>> semantics. I plan to send a proposal for review, but unfortunately, there
>> are still other things keeping me busy. I take this opportunity to review
>> high-level thoughts, maybe someone can give some points.
>>
>> The general idea is to define a join that can incorporate all other types
>> as special cases, where the generic implementation can be simplified or
>> optimized, but the semantics remain the same. As I plan to put this down to
>> a full design document I will just very roughly outline ideas:
>>
>>  a) the generic semantics should be equivalent to running a relational
>> join against the set of tables _after each individual modification of the
>> relation_ and producing results with the timestamp of the last modification
>>
>>  b) windows "scope" state of each "table" - i.e. when time reaches
>> window.maxTimestamp() corresponding "table" is cleared
>>
>>  c) it should be possible to derive other types of joins from this
>> definition by certain manipulations (e.g. buffering multiple updates in a
>> single window and assigning all elements the timestamp of
>> window.maxTimestamp() will yield the classical "windowed join", with the
>> requirement to have the same windows on both (all) sides, as otherwise the
>> result will be empty) - the goal of these manipulations is typically
>> enabling some optimization (e.g. the fully generic implementation must
>> include time sorting - either implicitly or explicitly; optimized variants
>> can drop this requirement).
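The semantics in (a) can be sketched as a toy, non-Beam simulation (function name, tuple shapes, and the choice of a single-valued inner join per key are my assumptions): replay every update to either "table" in time order, re-run the join after each individual modification, and stamp each result with that modification's timestamp.

```python
def generic_join(left_updates, right_updates):
    """left_updates / right_updates: lists of (timestamp, key, value).
    Re-runs an inner join after each individual modification and emits
    the full join result stamped with that modification's timestamp."""
    left, right, results = {}, {}, []
    # Merge both update streams and process them in time order
    # (the "time sorting" the generic implementation must include).
    for ts, side, key, value in sorted(
            [(t, "L", k, v) for t, k, v in left_updates] +
            [(t, "R", k, v) for t, k, v in right_updates]):
        (left if side == "L" else right)[key] = value
        snapshot = [(k, left[k], right[k]) for k in left if k in right]
        results.append((ts, snapshot))
    return results

out = generic_join(left_updates=[(1, "a", 10)],
                   right_updates=[(2, "a", 20), (3, "a", 21)])
print(out)  # [(1, []), (2, [('a', 10, 20)]), (3, [('a', 10, 21)])]
```

Point (b) would correspond to clearing `left` and `right` when time reaches window.maxTimestamp(), and the window-buffering manipulation in (c) would collapse several of these per-modification emissions into one.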
>>
>> It would be great if someone has any comments on this bottom-up approach.
>>
>> Jan
>> On 5/1/20 5:30 PM, Kenneth Knowles wrote:
>>
>> +dev @beam and some people who I talk about joins
>> with
>>
>> Interesting! It is a lot to take in and fully grok the code, so calling
>> in reinforcements...
>>
>> Generally, I think there's agreement that for a lot of real use cases,
>> you have to roll your own join using the lower level Beam primitives. So I
>> think it would be great to get some of these other approaches to joins into
>> Beam, perhaps as an extension of the Java SDK or even in the core (since
>> schema joins are in the core). In particular:
>>
>>  - "join in fixed window with repeater" sounds similar (but not
>> identical) to work by Mikhail
>>  - "join in global window with cache" sounds similar (but not identical)
>> to work and discussions w/ Reza and Tyson
>>
>> I want to be clear that I am *not* saying there's any duplication. I'm
>> guessing these all fit into a collection of 

Re: [docs] Python State & Timers

2019-04-24 Thread Reza Ardeshir Rokni
Pablo, Kenneth and I have a new blog ready for publication which covers how
to create a "looping timer": it allows default values to be created in a
window when no incoming elements exist. We just need to clear a few bits
before publication, but it would be great to have that also include a Python
example; I wrote it in Java...
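The looping-timer idea can be sketched outside Beam as follows (a simplified toy; the window size, the default value, and the per-window sum are all my assumptions, not the blog's actual code): each time the timer fires at a window boundary it emits the window's value, or a default if the window saw no elements, then re-sets itself for the next window.

```python
WINDOW = 10  # window size (illustrative units)

def looping_timer(events, start, end, default=0):
    """events: list of (timestamp, value). Emits one (window_start, value)
    pair per window in [start, end); empty windows get the default."""
    out = []
    timer = start + WINDOW  # first firing time, at the first window boundary
    while timer <= end:
        in_window = [v for t, v in events if timer - WINDOW <= t < timer]
        out.append((timer - WINDOW, sum(in_window) if in_window else default))
        timer += WINDOW     # the timer re-sets itself: hence "looping"
    return out

result = looping_timer([(3, 5), (25, 7)], start=0, end=30)
print(result)  # [(0, 5), (10, 0), (20, 7)]
```

The middle window had no incoming elements, yet still produced a (default) value, which is exactly the gap that ordinary event-driven timers leave open.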

Cheers

Reza

On Thu, 25 Apr 2019 at 04:34, Reuven Lax  wrote:

> Well state is still not implemented for merging windows even for Java
> (though I believe the idea was to disallow ValueState there).
>
> On Wed, Apr 24, 2019 at 1:11 PM Robert Bradshaw 
> wrote:
>
>> It was unclear what the semantics were for ValueState with merging
>> windows. (It's also a bit weird, as it's inherently a race condition
>> wrt element ordering, unlike Bag and CombineState, though you can
>> always implement it as a CombineState that always returns the latest
>> value, which is a bit more explicit about the dangers here.)
>>
>> On Wed, Apr 24, 2019 at 10:08 PM Brian Hulette 
>> wrote:
>> >
>> > That's a great idea! I thought about this too after those posts came up
>> on the list recently. I started to look into it, but I noticed that there's
>> actually no implementation of ValueState in userstate. Is there a reason
>> for that? I started to work on a patch to add it but I was just curious if
>> there was some reason it was omitted that I should be aware of.
>> >
>> > We could certainly replicate the example without ValueState by using
>> BagState and clearing it before each write, but it would be nice if we
>> could draw a direct parallel.
>> >
>> > Brian
>> >
>> > On Fri, Apr 12, 2019 at 7:05 AM Maximilian Michels 
>> wrote:
>> >>
>> >> > It would probably be pretty easy to add the corresponding code
>> snippets to the docs as well.
>> >>
>> >> It's probably a bit more work because there is no section dedicated to
>> >> state/timer yet in the documentation. Tracked here:
>> >> https://jira.apache.org/jira/browse/BEAM-2472
>> >>
>> >> > I've been going over this topic a bit. I'll add the snippets next
>> week, if that's fine by y'all.
>> >>
>> >> That would be great. The blog posts are a great way to get started with
>> >> state/timers.
>> >>
>> >> Thanks,
>> >> Max
>> >>
>> >> On 11.04.19 20:21, Pablo Estrada wrote:
>> >> > I've been going over this topic a bit. I'll add the snippets next
>> week,
>> >> > if that's fine by y'all.
>> >> > Best
>> >> > -P.
>> >> >
>> >> > On Thu, Apr 11, 2019 at 5:27 AM Robert Bradshaw > >> > > wrote:
>> >> >
>> >> > That's a great idea! It would probably be pretty easy to add the
>> >> > corresponding code snippets to the docs as well.
>> >> >
>> >> > On Thu, Apr 11, 2019 at 2:00 PM Maximilian Michels <
>> m...@apache.org
>> >> > > wrote:
>> >> >  >
>> >> >  > Hi everyone,
>> >> >  >
>> >> >  > The Python SDK still lacks documentation on state and timers.
>> >> >  >
>> >> >  > As a first step, what do you think about updating these two
>> blog
>> >> > posts
>> >> >  > with the corresponding Python code?
>> >> >  >
>> >> >  >
>> https://beam.apache.org/blog/2017/02/13/stateful-processing.html
>> >> >  >
>> https://beam.apache.org/blog/2017/08/28/timely-processing.html
>> >> >  >
>> >> >  > Thanks,
>> >> >  > Max
>> >> >
>>
>


Re: hi from DevRel land

2019-03-12 Thread Reza Ardeshir Rokni
Thanx folks!

Oops on the link, here it is again:
https://stackoverflow.com/questions/54422510/how-to-solve-duplicate-values-exception-when-i-create-pcollectionviewmapstring/54623618#54623618

On Tue, 12 Mar 2019 at 23:32, Teja MVSR  wrote:

> Hi Reza,
>
> I am also interested to contribute towards documentation. Please let me
> know if I can be of any help.
>
> Thanks and Regards,
> Teja.
>
> On Tue, Mar 12, 2019, 11:30 AM Kenneth Knowles  wrote:
>
>> This is great news.
>>
>> For the benefit of the list, I want to say how nice it has been when I
>> have had a chance to work with you. I've learned a great deal real and
>> complex use cases through those opportunities. I'm really excited that
>> you'll be helping out Beam in this new role.
>>
>> Kenn
>>
>> On Tue, Mar 12, 2019 at 7:21 AM Valentyn Tymofieiev 
>> wrote:
>>
>>> Hi Reza!
>>>
>>> Welcome to Beam. Very nice to have you onboard. Btw, the link seems
>>> broken.
>>>
>>> Thanks,
>>> Valentyn
>>>
>>> On Tue, Mar 12, 2019 at 6:04 AM Reza Ardeshir Rokni 
>>> wrote:
>>>
>>>> Hi Folks,
>>>>
>>>> Just wanted to say hi to the good folks in the Beam community in my new
>>>> capacity as a Developer advocate for Beam/Dataflow @ Google. :-)
>>>>
>>>> At the moment I am working on a couple of blogs around the Timer and
>>>> State API as well as some work on general patterns that I hope to
>>>> contribute as documentation to the Beam site. An example of the patterns
>>>> can be seen here:  LINK
>>>>
>>>> Hope to be adding many more in 2019 and really looking forward to being
>>>> able to contribute to Beam in any way that I can!
>>>>
>>>> Cheers
>>>> Reza
>>>>
>>>>


hi from DevRel land

2019-03-12 Thread Reza Ardeshir Rokni
Hi Folks,

Just wanted to say hi to the good folks in the Beam community in my new
capacity as a Developer advocate for Beam/Dataflow @ Google. :-)

At the moment I am working on a couple of blogs around the Timer and State
API as well as some work on general patterns that I hope to contribute as
documentation to the Beam site. An example of the patterns can be seen
here:  LINK

Hope to be adding many more in 2019 and really looking forward to being
able to contribute to Beam in any way that I can!

Cheers
Reza


Re: Another another new contributor! :)

2019-02-06 Thread Reza Ardeshir Rokni
Welcome!

On Tue, 5 Feb 2019 at 23:34, Kenneth Knowles  wrote:

> Welcome Kyle!
>
> On Tue, Feb 5, 2019 at 4:34 AM Maximilian Michels  wrote:
>
>> Welcome Kyle! Excited to see the Spark Runner moving towards portability!
>>
>> On 05.02.19 01:14, Connell O'Callaghan wrote:
>> > Welcome Kyle!
>> >
>> > On Mon, Feb 4, 2019 at 3:18 PM Ahmet Altay > > > wrote:
>> >
>> > Welcome!
>> >
>> > On Mon, Feb 4, 2019 at 3:13 PM Rui Wang > > > wrote:
>> >
>> > Welcome!
>> >
>> > -Rui
>> >
>> > On Mon, Feb 4, 2019 at 2:50 PM Kyle Weaver > > > wrote:
>> >
>> > Hello Beam developers,
>> >
>> > My name is Kyle Weaver (alias "ibzib" on Github/Slack). Like
>> > Brian, I recently switched roles at Google (I previously
>> > worked on Prow, Kubernetes' CI system). My goal in the
>> > coming weeks is to help begin implementing portability
>> > support for the Spark runner. I look forward to
>> > collaborating with all of you!
>> >
>> > Kyle
>> >
>> > Kyle Weaver |  Software Engineer |
>> kcwea...@google.com
>> >  | +1650203
>> >
>> >
>>
>


Re: New contributor: Michał Walenia

2019-01-31 Thread Reza Ardeshir Rokni
Welcome!

On Thu, 31 Jan 2019 at 15:48, Michał Walenia 
wrote:

> HI all,
> thanks for a warm welcome :)
>
> Michał
>
> On 30.01.2019, at 21:32, Ahmet Altay  wrote:
>
> Welcome Michał!
>
> On Wed, Jan 30, 2019 at 11:38 AM Kenneth Knowles  wrote:
>
>> Welcome Michał!
>>
>> Kenn
>>
>> *And maybe your system uses a compose key. Ubuntu:
>> https://help.ubuntu.com/community/ComposeKey. It is composition of L and
>> / just like it looks. (unless I can't see it clearly)
>>
>>
>> On Wed, Jan 30, 2019 at 10:20 AM Rui Wang  wrote:
>>
>>> Welcome! Welcome!
>>>
>>> -Rui
>>>
>>> On Wed, Jan 30, 2019 at 9:22 AM Łukasz Gajowy 
>>> wrote:
>>>
 Impressive, so many ways! I didn't know the mac trick though, thanks
 Ankur. :D

 On Wed, 30 Jan 2019 at 17:24, Ismaël Mejía  wrote:

> Welcome Michał!
>
> For more foreign languages copy/pastables characters:
> http://polish.typeit.org/
>
> Yay for more people with crazy accents, (yes I know I can be biased :P)
>
> Ismaël
>
> On Wed, Jan 30, 2019 at 3:30 PM Ankur Goenka 
> wrote:
> >
> > Welcome Michał!
> >
> > long press "l" on mac to type "ł" :)
> >
> > On Wed, Jan 30, 2019 at 7:57 PM Maximilian Michels 
> wrote:
> >>
> >> Welcome Michał!
> >>
> >> I do have to find out how to type ł without copy/pasting it every
> time ;)
> >>
> >> On 30.01.19 15:22, Łukasz Gajowy wrote:
> >> > Hi all,
> >> >
> >> > a new fellow joined Kasia Kucharczyk and me to work on
> integration and load
> >> > testing areas. Welcome, Michał!
> >> >
> >> > Łukasz
> >> >
>

>


Re: [ANNOUNCE] New PMC member: Etienne Chauchot

2019-01-27 Thread Reza Ardeshir Rokni
Congratulations Etienne!

On Sat, 26 Jan 2019 at 14:16, Ismaël Mejía  wrote:

> Congratulations Etienne!
>
> On Sat, 26 Jan 2019 at 06:42, Reuven Lax  wrote:
>
>> Welcome!
>>
>> On Fri, Jan 25, 2019 at 9:30 PM Pablo Estrada  wrote:
>>
>>> Congrats Etienne :)
>>>
>>> On Fri, Jan 25, 2019, 9:24 PM Trần Thành Đạt >> wrote:
>>>
 Congratulations Etienne!

 On Sat, Jan 26, 2019 at 12:08 PM Thomas Weise  wrote:

> Congrats, félicitations!
>
>
> On Fri, Jan 25, 2019 at 3:06 PM Scott Wegner  wrote:
>
>> Congrats Etienne!
>>
>> On Fri, Jan 25, 2019 at 2:34 PM Tim 
>> wrote:
>>
>>> Congratulations Etienne!
>>>
>>> Tim
>>>
>>> > On 25 Jan 2019, at 23:00, Kenneth Knowles  wrote:
>>> >
>>> > Hi all,
>>> >
>>> > Please join me and the rest of the Beam PMC in welcoming Etienne
>>> Chauchot to join the PMC.
>>> >
>>> > Etienne introduced himself to dev@ in September of 2017 and over
>>> the years has contributed to Beam in many ways - connectors, 
>>> performance,
>>> design discussion, talks, code reviews, and I'm sure I cannot list them
>>> all. He already has a major impact on the direction of Beam.
>>> >
>>> > Thanks for being a part of Beam, Etienne!
>>> >
>>> > Kenn
>>>
>>
>>
>> --
>>
>>
>>
>>
>> Got feedback? tinyurl.com/swegner-feedback
>>
>