Getting Started With Implementing a Runner

2023-06-22 Thread Joey Tran
Hi Beam community! I'm interested in trying to implement a runner with my company's execution environment but I'm struggling to get started. I've read the docs page on implementing a runner but it's quite high level. Anyone hav

Re: Best patterns for a polling transform

2023-06-22 Thread Chad Dombrova
I’m also interested in the answer to this. This is essential for reading from many types of data sources. On Tue, Jun 20, 2023 at 2:57 PM Sam Bourne wrote: > +dev to see if anyone has any suggestions. > > On Fri, Jun 16, 2023 at 5:46 PM Sam Bourne wrote: > >> Hello beam community! >> >> I’m h

Re: Getting Started With Implementing a Runner

2023-06-22 Thread Jack McCluskey via user
Hey Joey, The best resource to look at, at the moment, is likely Robert Burke's Prism runner that he is implementing (https://github.com/apache/beam/tree/master/sdks/go/pkg/beam/runners/prism). Runners are pretty complicated and there are a number of primitives that you need to have implemented o

Re: Getting Started With Implementing a Runner

2023-06-22 Thread Joey Tran
Thanks Jack! I've tried that Slack link but it requires an account with an @apache email. On Thu, Jun 22, 2023 at 10:08 AM Jack McCluskey via user <user@beam.apache.org> wrote: > Hey Joey, > > The best resource to look at, at the moment, is likely Robert Burke's > Prism runner that he is impleme

Re: Best patterns for a polling transform

2023-06-22 Thread Valentyn Tymofieiev via user
> The below code runs fine with a single worker but with multiple workers there are duplicate values. > I’m using TimeDomain.WATERMARK here due to it simply not working when using REAL_TIME. The docs seem to suggest REAL_TIME would be the way to do this, however there seems to be no guarantee that

Re: Best patterns for a polling transform

2023-06-22 Thread Sam Bourne
> The streaming support in Python direct runner is currently rather limited
In this experiment I was running a batch pipeline instead of a streaming one. Are there any known issues using timers with a batch pipeline? It sounds like we should identify whether this is a problem in the SDK or in the D