Re: Beam portable runner setup for Flink + Python on Kubernetes

2024-02-23 Thread Sam Bourne
link. In your example, the environment type is > set to DOCKER and it requires a docker container running together with the > task manager. Would you think it is acceptable in a production environment? > > Cheers, > Jaehyeon > > On Fri, 23 Feb 2024 at 13:57, Sam Bourne wrote: >
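For reference, the environment type discussed in this message is chosen through pipeline options on the Python side. Below is a minimal sketch, assuming the standard portable-runner flags (`--runner`, `--job_endpoint`, `--environment_type`, `--environment_config`). The helper name `portable_runner_args`, the endpoint `localhost:8099`, and the image name are hypothetical placeholders, not values taken from this thread. Newer SDK releases may prefer `--environment_options` over `--environment_config`.

```python
from typing import List, Optional


def portable_runner_args(
    job_endpoint: str,
    environment_type: str = "DOCKER",
    environment_config: Optional[str] = None,
) -> List[str]:
    """Build command-line style options for a Beam portable pipeline.

    Hypothetical helper for illustration only; it just assembles flag strings.
    """
    args = [
        "--runner=PortableRunner",
        f"--job_endpoint={job_endpoint}",
        f"--environment_type={environment_type}",
    ]
    # For DOCKER, the config value is assumed to be the SDK container image.
    if environment_config is not None:
        args.append(f"--environment_config={environment_config}")
    return args


# DOCKER environment: per the message above, this needs a docker container
# running together with the task manager. All values are placeholders.
docker_args = portable_runner_args(
    "localhost:8099",
    environment_type="DOCKER",
    environment_config="example.com/my-beam-python-sdk:latest",
)
```

The resulting list could then be handed to `PipelineOptions(docker_args)` from `apache_beam.options.pipeline_options`.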

Re: Beam portable runner setup for Flink + Python on Kubernetes

2024-02-22 Thread Sam Bourne
I made this a few years ago to help people like yourself. https://github.com/sambvfx/beam-flink-k8s Hopefully it's insightful and I'm happy to accept any MRs to update any outdated information or to flesh it out more. On Thu, Feb 22, 2024 at 3:48 PM Jaehyeon Kim wrote: > Hello, > > I'm

Re: Seeking Assistance to Resolve Issues/bug with Flink Runner on Kubernetes

2023-08-14 Thread Sam Bourne
Hey Kapil, I grappled with a similar deployment and created this repo [1] to attempt to provide others with some nuggets of useful information. We were running cross-language pipelines on Flink connecting PubsubIO

Re: Best patterns for a polling transform

2023-06-22 Thread Sam Bourne
ior should > be consistent across SDK, so different behavior between Python and Java > SDK would indicate an SDK bug. > > > On Thu, Jun 22, 2023 at 10:00 AM Chad Dombrova wrote: > >> I’m also interested in the answer to this. This is essential for reading >> f

Re: Best patterns for a polling transform

2023-06-20 Thread Sam Bourne
+dev to see if anyone has any suggestions. On Fri, Jun 16, 2023 at 5:46 PM Sam Bourne wrote: > Hello beam community! > > I’m having trouble coming up with the best pattern to *eagerly* poll. By > eagerly, I mean that elements should be consumed and yielded as soon a

Best patterns for a polling transform

2023-06-16 Thread Sam Bourne
Hello beam community! I’m having trouble coming up with the best pattern to *eagerly* poll. By eagerly, I mean that elements should be consumed and yielded as soon as possible. There are a handful of experiments that I’ve tried and my latest attempt using the timer API seems quite promising, but
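To make "eagerly" concrete outside of Beam, here is a minimal plain-Python sketch. It is not the timer-API approach mentioned in the message, and `poll_eagerly`, `fetch`, `idle_sleep`, and `max_polls` are hypothetical names chosen for illustration. It yields each element as soon as a poll returns it and only sleeps when a poll comes back empty.

```python
import time
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def poll_eagerly(
    fetch: Callable[[], List[T]],
    idle_sleep: float = 0.1,
    max_polls: int = 10,
) -> Iterator[T]:
    """Poll `fetch` up to `max_polls` times, yielding results immediately."""
    for _ in range(max_polls):
        batch = fetch()
        if batch:
            # Hand elements downstream right away instead of buffering them.
            yield from batch
        else:
            # Nothing available: back off briefly before polling again.
            time.sleep(idle_sleep)


# A fake source standing in for a real client (assumption for the demo).
_pending = [["a", "b"], [], ["c"]]


def fake_fetch() -> List[str]:
    return _pending.pop(0) if _pending else []


results = list(poll_eagerly(fake_fetch, idle_sleep=0.0, max_polls=5))
```

The `max_polls` bound exists only so the demo terminates; a real unbounded source would keep polling.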

Re: How to run Beam pipeline in Flink [Python]?

2022-06-20 Thread Sam Bourne
Hi Mike, I’m not an expert, but I have some experience running Beam pipelines in Flink that require access to something on disk. When the Flink taskmanager executes the WriteToText transform, it spins up a Beam Python SDK Docker container to perform the work*. At the moment there is not a way to

Re: Python SDF for unbound source

2022-03-11 Thread Sam Bourne
= argparse.ArgumentParser() > known_args, pipeline_args = parser.parse_known_args(argv) > > pipeline_options = PipelineOptions(pipeline_args) > > with beam.Pipeline(options=pipeline_options) as pipe: > ( > pipe > | beam.Impulse() > | beam.ParD

Re: Python SDF for unbound source

2022-03-09 Thread Sam Bourne
eam/sdk/transforms/DoFn.java#L695 > > As this will eventually run on a cluster, I would recommend going via the > SDF route so that Watermarks and checkpoints can happen appropriately not > just for this transform but downstream transforms. > > On Wed, Mar 9, 2022 at 10:54 AM Sam Bour

Python SDF for unbound source

2022-03-09 Thread Sam Bourne
Hello! I’m looking for some help writing a custom transform that reads from an unbound source. A basic version of this would look something like this: import apache_beam as beam import myeventlibrary class _ReadEventsFn(beam.DoFn): def process(self, unused_element): subscriber =

Re: [Question] Beam+Python+Flink

2021-10-28 Thread Sam Bourne
Hey Chiara, I went through a lot of the same struggles a while back and made this repo to showcase how I accomplished something similar. https://github.com/sambvfx/beam-flink-k8s It shouldn't be hard to convert to a docker-compose setup (I actually had it like this originally while testing

Re: using python sdk+kafka under k8s

2021-02-24 Thread Sam Bourne
Hi Yilun! I made a quick proof of concept repo showcasing how to run a beam pipeline in flink on k8s. It may be useful for you as reference. https://github.com/sambvfx/beam-flink-k8s On Wed, Feb 24, 2021, 8:13 AM yilun zhang wrote: > Hey, > > Our team is trying to use beam with connector

Re: Deploying with Flink and Beam Python Worker

2021-01-14 Thread Sam Bourne
ying to run the word count example and I'm trying to understand what can >> be the cause of it? >> >> On Thu, Jan 14, 2021 at 7:29 PM Sam Bourne wrote: >> >>> Hi Nir, >>> >>> I have a simple repo where I have a proof of concept deployment setup >

Re: Deploying with Flink and Beam Python Worker

2021-01-14 Thread Sam Bourne
Hi Nir, I have a simple repo where I have a proof of concept deployment setup for doing this. https://github.com/sambvfx/beam-flink-k8s Depending on the type of runner you're using there are a few explanations. That repo should hopefully point you in the right direction. On Thu, Jan 14, 2021

Re: Issues with python's external ReadFromPubSub

2020-10-28 Thread Sam Bourne
, 2020 at 10:47 AM Kyle Weaver wrote: > Are you able to run streaming word count on the same setup? > > On Tue, Oct 27, 2020 at 5:39 PM Sam Bourne wrote: > >> We updated from beam 2.18.0 to 2.24.0 and have been having issues using >> the python ReadFromPubSub external

Issues with python's external ReadFromPubSub

2020-10-27 Thread Sam Bourne
We updated from beam 2.18.0 to 2.24.0 and have been having issues using the python ReadFromPubSub external transform in flink 1.10. It seems like it starts up just fine, but it doesn’t consume any messages. I tried to reduce it to a simple example and tested back to beam 2.22.0 but have gotten

Re: Flink JobService on k8s

2020-09-24 Thread Sam Bourne
generally recommend using a > distributed filesystem for this purpose if possible). > > General note: there is never any direct communication between the job > server and the SDK harness. Usually it goes Beam job server -> Flink job > manager -> Flink task manager -> Beam

Re: Flink JobService on k8s

2020-09-22 Thread Sam Bourne
> /tmp/staged/pickled_main_session > > Are you sure that's due to a networking issue, and not a problem with the > filesystem / volume mounting? > > On Tue, Sep 22, 2020 at 3:55 PM Sam Bourne wrote: > >> I would not be surprised if there was something weird going on with

Re: Flink JobService on k8s

2020-09-22 Thread Sam Bourne
flink. > > More information about this failure mode would be helpful as well. > > [1] > https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/blob/master/examples/beam/with_job_server/beam_job_server.yaml > > > On Tue, Sep 22, 2020 at 10:36 AM Sam Bourne wrote:

Re: Getting Beam(Python)-on-Flink-on-k8s to work

2020-08-30 Thread Sam Bourne
On Sat, Aug 29, 2020 at 10:59 AM Eugene Kirpichov wrote: > > On Fri, Aug 28, 2020 at 6:52 PM Sam Bourne wrote: > >> Hi Eugene, >> >> Glad that helped you out and thanks for the PR tweaking it for GCP. >> >> To fetch the containers from GCR, I had to lo

Re: Getting Beam(Python)-on-Flink-on-k8s to work

2020-08-27 Thread Sam Bourne
Hi Eugene! I’m struggling to find complete documentation on how to do this. There seems to be lots of conflicting or incomplete information: several ways to deploy Flink, several ways to get Beam working with it, bizarre StackOverflow questions, and no documentation explaining a complete working