I'm not convinced the Scio approach is hacky :-) and anyhow, at its core, it is the same approach as this one, isn't it? The difference is just making run() spin up and read the temporary location instantly, which this demo achieves by working in memory on one machine. No?
Kenn

On Thu, Jun 21, 2018 at 1:16 PM Harsh Vardhan <[email protected]> wrote:

> This is targeting Python SDK + DirectRunner to start with. We can explore
> ways to make this applicable to other SDKs and Runners.
>
> On Fri, Jun 15, 2018 at 1:00 PM Neville Li <[email protected]> wrote:
>
>> Is this targeting mainly Python SDK and DirectRunner for now?
>>
>> Our interactivity solution is hacky, basically calling `pipeline.run()`,
>> saving PCollections to temporary locations and loading it to memory.
>>
>> I think the most requested feature is the ability to inspect PCollection
>> content during execution and conditionally alter DAG, similar to what the
>> spark driver does. But this would require significant change to the
>> DataflowRunner execution model?
>>
>> On Fri, Jun 15, 2018 at 9:56 AM, Kenneth Knowles <[email protected]> wrote:
>>
>>> Nice! As-is, this already looks useful for making Beam accessible.
>>>
>>> Commented a bit on doc to highlight where SQL is different than
>>> Scio/Python style. I think notebooks are the perfect target. Specifically,
>>> Python and SQL on the same notebook would be amazing.
>>>
>>> Kenn
>>>
>>> On Thu, Jun 14, 2018 at 2:04 PM Sindy Li <[email protected]> wrote:
>>>
>>>> Thanks Ahmet,
>>>>
>>>> We know quite a few teams in Google are interested to run interactive
>>>> Beam pipelines, especially in Python for Machine Learning -- some are
>>>> already using it interactively in their own way. So instead of for the
>>>> those teams to develop their own version of interactive solution, we want
>>>> one repository that people can contribute to. We could also provide better
>>>> features like fast re-execution as is shown in the demo.
>>>>
>>>> Thanks,
>>>> Sindy
>>>>
>>>> On Wed, Jun 13, 2018 at 5:48 PM, Ahmet Altay <[email protected]> wrote:
>>>>
>>>>> Thank you Sindy.
>>>>>
>>>>> I like the demo; it looks great. This would be interesting to a lot of
>>>>> users. What are your plans for moving this forward? What kind of an input
>>>>> you are looking for?
>>>>>
>>>>> Ahmet
>>>>>
>>>>> On Wed, Jun 13, 2018 at 2:32 PM, Eugene Kirpichov <[email protected]> wrote:
>>>>>
>>>>>> This is awesome, thanks Sindy! I hope that the questions related to
>>>>>> portability will get resolved in a way that will allow to reuse some of
>>>>>> the work for other interactive Beam experiences, including SQL as Andrew
>>>>>> says, and providing a REPL e.g. for users of Scala or other JVM-based
>>>>>> languages.
>>>>>>
>>>>>> +Neville Li <[email protected]> Do I remember correctly that you
>>>>>> guys had some sort of interactivity going in Scio but were looking
>>>>>> forward to Beam developing a native solution?
>>>>>>
>>>>>> On Wed, Jun 13, 2018 at 2:22 PM Sindy Li <[email protected]> wrote:
>>>>>>
>>>>>>> *Thanks, Andrew!*
>>>>>>>
>>>>>>> *Here is a link to the demo on Youtube for people interested:*
>>>>>>> *https://www.youtube.com/watch?v=c5CjA1e3Cqw&feature=youtu.be
>>>>>>> <https://www.youtube.com/watch?v=c5CjA1e3Cqw&feature=youtu.be>*
>>>>>>>
>>>>>>> On Wed, Jun 13, 2018 at 1:23 PM, Andrew Pilloud <[email protected]> wrote:
>>>>>>>
>>>>>>>> This sounds really interesting, thanks for sharing! We've just
>>>>>>>> begun to explore making Beam SQL interactive. The Interactive Runner
>>>>>>>> you've proposed sounds like it would solve a bunch of the problems SQL
>>>>>>>> faces as well. SQL is written in Java right now, so we can't
>>>>>>>> immediately reuse any code.
>>>>>>>>
>>>>>>>> Andrew
>>>>>>>>
>>>>>>>> On Wed, Jun 13, 2018 at 11:48 AM Sindy Li <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Resending after subscribing to dev list.
>>>>>>>>>
>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>> From: Sindy Li <[email protected]>
>>>>>>>>> Date: Fri, Jun 8, 2018 at 5:57 PM
>>>>>>>>> Subject: Proposing interactive beam runner
>>>>>>>>> To: [email protected]
>>>>>>>>> Cc: Harsh Vardhan <[email protected]>, Chamikara Jayalath <[email protected]>,
>>>>>>>>> Anand Iyer <[email protected]>, Robert Bradshaw <[email protected]>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> We were exploring ways to provide an interactive notebook
>>>>>>>>> experience for writing Beam Python pipelines. The design doc
>>>>>>>>> <https://docs.google.com/document/d/10bTc97GN5Wk-nhwncqNq9_XkJFVVy0WLT4gPFqP6Kmw/edit?usp=sharing>
>>>>>>>>> provides an overview/vision of what we would like to achieve. Pull request
>>>>>>>>> <https://github.com/apache/beam/pull/5595> provides a prototype
>>>>>>>>> for the same. The document also provides demo screen shots and
>>>>>>>>> instructions for running a demo in Jupyter. Please take a look. We
>>>>>>>>> believe this would be a useful addition to Beam.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
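
Illustrative note on the comparison made at the top of this thread: the pattern Neville describes (run the pipeline, save results to a temporary location, load them into memory) and the in-memory variant Kenn compares it to can be sketched in plain Python. This is only a rough sketch under stated assumptions. It uses neither Beam nor Scio, and every function name here (`run_and_materialize`, `load_into_memory`, `run_in_memory`) is hypothetical; a list stands in for a PCollection and a temporary file stands in for the temporary location.

```python
import json
import os
import tempfile


def run_and_materialize(elements, transform, tmp_dir):
    """Hypothetical stand-in for running a pipeline and saving its
    output to a temporary location (one JSON value per line)."""
    path = os.path.join(tmp_dir, "materialized.jsonl")
    with open(path, "w") as f:
        for element in elements:
            f.write(json.dumps(transform(element)) + "\n")
    return path


def load_into_memory(path):
    """Read the materialized output back so it can be inspected."""
    with open(path) as f:
        return [json.loads(line) for line in f]


def run_in_memory(elements, transform):
    """Same computation, but the result never leaves the process."""
    return [transform(element) for element in elements]


words = ["a", "bb", "ccc"]

# Variant 1: run, write to a temporary location, then load the result.
with tempfile.TemporaryDirectory() as tmp_dir:
    output_path = run_and_materialize(words, len, tmp_dir)
    via_temp_location = load_into_memory(output_path)

# Variant 2: keep everything in memory on one machine.
in_memory = run_in_memory(words, len)

# Both variants expose the same inspectable result.
print(via_temp_location == in_memory)
```

In this toy version the two variants differ only in where the intermediate result lives, which is the point of the question above; it says nothing about how the actual prototype or Scio implement it.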
