+1 for dsls/scio for the already listed reasons

On Fri, Jun 24, 2016 at 11:21 AM, Rafal Wojdyla <r...@spotify.com.invalid> wrote:
> Hello. When it comes to SDK vs DSL - I fully agree with Frances. About
> dsls/java/scio or dsls/scio - dsls/java/scio may cause confusion, scio is a
> scala DSL but lives under java directory (?) - that makes sense only once
> you get that scio is using java SDK under the hood. Thus, +1 to dsls/scio.
> - Rafal
>
> On Fri, Jun 24, 2016 at 2:01 PM, Kenneth Knowles <k...@google.com.invalid>
> wrote:
>
> > My +1 goes to dsls/scio. It already has a cool name, so let's use it. And
> > there might be other Scala-based DSLs.
> >
> > On Fri, Jun 24, 2016 at 8:39 AM, Ismaël Mejía <ieme...@gmail.com> wrote:
> >
> > > Hello everyone,
> > >
> > > Neville, thanks a lot for your contribution. Your work is amazing and I am
> > > really happy that this scala integration is finally happening.
> > > Congratulations to you and your team.
> > >
> > > I *strongly* disagree about the DSL classification for scio for one reason,
> > > if you go to the root of the term, Domain Specific Languages are about a
> > > domain, and the domain in this case is writing Beam pipelines, which is a
> > > really broad domain.
> > >
> > > I agree with Frances’ argument that scio is not an SDK e.g. it reuses the
> > > existing Beam java SDK. My proposition is that scio will be called the
> > > Scala API because in the end this is what it is. I think the confusion
> > > comes from the common definition of SDK which is normally an API + a
> > > Runtime. In this case scio will share the runtime with what we call the
> > > Beam Java SDK.
> > >
> > > One additional point of using the term API is that it sends the clear
> > > message that Beam has a Scala API too (which is good for visibility as JB
> > > mentioned).
> > >
> > > Regards,
> > > Ismaël
> > >
> > >
> > > On Fri, Jun 24, 2016 at 5:08 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> > > wrote:
> > >
> > > > Hi Dan,
> > > >
> > > > fair enough.
> > > >
> > > > As I'm also working on new DSLs (XML, JSON), I already created the dsls
> > > > module.
> > > >
> > > > So, I would say dsls/scala.
> > > >
> > > > WDYT ?
> > > >
> > > > Regards
> > > > JB
> > > >
> > > >
> > > > On 06/24/2016 05:07 PM, Dan Halperin wrote:
> > > >
> > > >> I don't think that sdks/scala is the right place -- scio is not a Beam
> > > >> Scala SDK; it wraps the existing Java SDK.
> > > >>
> > > >> Some options:
> > > >> * sdks/java/extensions (Scio builds on the Java SDK) -- mentally vetoed
> > > >>   since Scio isn't an extension for the Java SDK, but rather a wrapper
> > > >> * dsls/java/scio (Scio is a Beam DSL that uses the Java SDK)
> > > >> * dsls/scio (Scio is a Beam DSL that could eventually use multiple SDKs)
> > > >> * extensions/java/scio (Scio is an extension of Beam that uses the Java
> > > >>   SDK)
> > > >> * extensions/scio (Scio is an extension of Beam that is not limited to
> > > >>   one SDK)
> > > >>
> > > >> I lean towards either dsls/java/scio or extensions/java/scio, since I
> > > >> don't think there are plans for Scio to handle multiple different SDKs
> > > >> (in different languages). The question between these two is whether we
> > > >> think DSLs are "big enough" to be a top level concept.
> > > >>
> > > >> On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> > > >> wrote:
> > > >>
> > > >>> Good point about new Fn and the fact it's based on the Java SDK.
> > > >>>
> > > >>> It's just that in terms of "marketing", it's a good message to provide a
> > > >>> Scala SDK even if technically it's more a DSL.
> > > >>>
> > > >>> For instance, a valid "marketing" DSL would be a Java fluent DSL on top
> > > >>> of the Java SDK, or a declarative XML DSL.
> > > >>>
> > > >>> However, from a technical perspective, it can go into dsl module.
> > > >>>
> > > >>> My $0.02 ;)
> > > >>>
> > > >>> Regards
> > > >>> JB
> > > >>>
> > > >>>
> > > >>> On 06/24/2016 06:51 AM, Frances Perry wrote:
> > > >>>
> > > >>>> +Rafal & Andrew again
> > > >>>>
> > > >>>> I am leaning DSL for two reasons: (1) scio uses the existing java
> > > >>>> execution environment (and won't have a language-specific fn harness
> > > >>>> of its own), and (2) it changes the abstractions that users interact
> > > >>>> with.
> > > >>>>
> > > >>>> I recently saw a scio repl demo from Reuven -- there's some really cool
> > > >>>> stuff in there. I'd love to dive into it a bit more and see what can be
> > > >>>> generalized beyond scio. The repl-like interactive graph construction is
> > > >>>> very similar to what we've seen with ipython, in that it doesn't always
> > > >>>> play nicely with the graph construction / graph execution distinction. I
> > > >>>> wonder what changes to Beam might more generally support this. The
> > > >>>> materialize stuff looks similar to some functionality in FlumeJava we
> > > >>>> used to support multi-segment pipelines with some shared intermediate
> > > >>>> PCollections.
> > > >>>>
> > > >>>> On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Hi Neville,
> > > >>>>>
> > > >>>>> thanks for the update !
> > > >>>>>
> > > >>>>> As it's another language support, and to clearly identify the purpose,
> > > >>>>> I would say sdks/scala.
> > > >>>>>
> > > >>>>> Regards
> > > >>>>> JB
> > > >>>>>
> > > >>>>>
> > > >>>>> On 06/23/2016 11:56 PM, Neville Li wrote:
> > > >>>>>
> > > >>>>>> +folks in my team
> > > >>>>>>
> > > >>>>>> On Thu, Jun 23, 2016 at 5:57 PM Neville Li <neville....@gmail.com>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Hi all,
> > > >>>>>>>
> > > >>>>>>> I'm the co-author of Scio <https://github.com/spotify/scio> and am in
> > > >>>>>>> the process of moving code to Beam (BEAM-302
> > > >>>>>>> <https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if
> > > >>>>>>> sdks/scala is the right place for this code or if something like
> > > >>>>>>> dsls/scio is a better choice? What do you think?
> > > >>>>>>>
> > > >>>>>>> A little background: Scio was built as a high-level Scala API for
> > > >>>>>>> Google Cloud Dataflow (now also Apache Beam) and is heavily influenced
> > > >>>>>>> by Spark and Scalding. It wraps around the Dataflow/Beam Java SDK while
> > > >>>>>>> also providing features comparable to other Scala data frameworks. We
> > > >>>>>>> use Scio on Dataflow for production extensively inside Spotify.
> > > >>>>>>>
> > > >>>>>>> Cheers,
> > > >>>>>>> Neville
> > > >>>>>
> > > >>>>> --
> > > >>>>> Jean-Baptiste Onofré
> > > >>>>> jbono...@apache.org
> > > >>>>> http://blog.nanthrax.net
> > > >>>>> Talend - http://www.talend.com
> > > >>>
> > > >>> --
> > > >>> Jean-Baptiste Onofré
> > > >>> jbono...@apache.org
> > > >>> http://blog.nanthrax.net
> > > >>> Talend - http://www.talend.com
> > > >
> > > > --
> > > > Jean-Baptiste Onofré
> > > > jbono...@apache.org
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com