Hello.

When it comes to SDK vs DSL, I fully agree with Frances.

About dsls/java/scio or dsls/scio: dsls/java/scio may cause confusion, since scio is a Scala DSL but would live under a java directory (?). That makes sense only once you get that scio is using the Java SDK under the hood. Thus, +1 to dsls/scio.

- Rafal

On Fri, Jun 24, 2016 at 2:01 PM, Kenneth Knowles <k...@google.com.invalid> wrote:

> My +1 goes to dsls/scio. It already has a cool name, so let's use it. And
> there might be other Scala-based DSLs.
>
> On Fri, Jun 24, 2016 at 8:39 AM, Ismaël Mejía <ieme...@gmail.com> wrote:
>
> > Hello everyone,
> >
> > Neville, thanks a lot for your contribution. Your work is amazing and I am
> > really happy that this scala integration is finally happening.
> > Congratulations to you and your team.
> >
> > I *strongly* disagree about the DSL classification for scio for one reason,
> > if you go to the root of the term, Domain Specific Languages are about a
> > domain, and the domain in this case is writing Beam pipelines, which is a
> > really broad domain.
> >
> > I agree with Frances’ argument that scio is not an SDK e.g. it reuses the
> > existing Beam java SDK. My proposition is that scio will be called the
> > Scala API because in the end this is what it is. I think the confusion
> > comes from the common definition of SDK which is normally an API + a
> > Runtime. In this case scio will share the runtime with what we call the
> > Beam Java SDK.
> >
> > One additional point of using the term API is that it sends the clear
> > message that Beam has a Scala API too (which is good for visibility as JB
> > mentioned).
> >
> > Regards,
> > Ismaël
> >
> > On Fri, Jun 24, 2016 at 5:08 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> > wrote:
> >
> > > Hi Dan,
> > >
> > > fair enough.
> > >
> > > As I'm also working on new DSLs (XML, JSON), I already created the dsls
> > > module.
> > >
> > > So, I would say dsls/scala.
> > >
> > > WDYT ?
> > >
> > > Regards
> > > JB
> > >
> > > On 06/24/2016 05:07 PM, Dan Halperin wrote:
> > >
> > >> I don't think that sdks/scala is the right place -- scio is not a Beam
> > >> Scala SDK; it wraps the existing Java SDK.
> > >>
> > >> Some options:
> > >> * sdks/java/extensions (Scio builds on the Java SDK) -- mentally vetoed
> > >>   since Scio isn't an extension for the Java SDK, but rather a wrapper
> > >>
> > >> * dsls/java/scio (Scio is a Beam DSL that uses the Java SDK)
> > >> * dsls/scio (Scio is a Beam DSL that could eventually use multiple SDKs)
> > >> * extensions/java/scio (Scio is an extension of Beam that uses the Java
> > >>   SDK)
> > >> * extensions/scio (Scio is an extension of Beam that is not limited to
> > >>   one SDK)
> > >>
> > >> I lean towards either dsls/java/scio or extensions/java/scio, since I
> > >> don't think there are plans for Scio to handle multiple different SDKs
> > >> (in different languages). The question between these two is whether we
> > >> think DSLs are "big enough" to be a top level concept.
> > >>
> > >> On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré
> > >> <j...@nanthrax.net> wrote:
> > >>
> > >>> Good point about new Fn and the fact it's based on the Java SDK.
> > >>>
> > >>> It's just that in term of "marketing", it's a good message to provide a
> > >>> Scala SDK even if technically it's more a DSL.
> > >>>
> > >>> For instance, a valid "marketing" DSL would be a Java fluent DSL on top
> > >>> of the Java SDK, or a declarative XML DSL.
> > >>>
> > >>> However, from a technical perspective, it can go into dsl module.
> > >>>
> > >>> My $0.02 ;)
> > >>>
> > >>> Regards
> > >>> JB
> > >>>
> > >>> On 06/24/2016 06:51 AM, Frances Perry wrote:
> > >>>
> > >>>> +Rafal & Andrew again
> > >>>>
> > >>>> I am leaning DSL for two reasons: (1) scio uses the existing java
> > >>>> execution environment (and won't have a language-specific fn harness
> > >>>> of its own), and (2) it changes the abstractions that users interact
> > >>>> with.
> > >>>>
> > >>>> I recently saw a scio repl demo from Reuven -- there's some really cool
> > >>>> stuff in there. I'd love to dive into it a bit more and see what can be
> > >>>> generalized beyond scio. The repl-like interactive graph construction
> > >>>> is very similar to what we've seen with ipython, in that it doesn't
> > >>>> always play nicely with the graph construction / graph execution
> > >>>> distinction. I wonder what changes to Beam might more generally
> > >>>> support this. The materialize stuff looks similar to some
> > >>>> functionality in FlumeJava we used to support multi-segment pipelines
> > >>>> with some shared intermediate PCollections.
> > >>>>
> > >>>> On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré
> > >>>> <j...@nanthrax.net> wrote:
> > >>>>
> > >>>>> Hi Neville,
> > >>>>>
> > >>>>> thanks for the update !
> > >>>>>
> > >>>>> As it's another language support, and to clearly identify the
> > >>>>> purpose, I would say sdks/scala.
> > >>>>>
> > >>>>> Regards
> > >>>>> JB
> > >>>>>
> > >>>>> On 06/23/2016 11:56 PM, Neville Li wrote:
> > >>>>>
> > >>>>>> +folks in my team
> > >>>>>>
> > >>>>>> On Thu, Jun 23, 2016 at 5:57 PM Neville Li <neville....@gmail.com>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> I'm the co-author of Scio <https://github.com/spotify/scio> and am
> > >>>>>>> in the progress of moving code to Beam (BEAM-302
> > >>>>>>> <https://issues.apache.org/jira/browse/BEAM-302>). Just wondering
> > >>>>>>> if sdks/scala is the right place for this code or if something like
> > >>>>>>> dsls/scio is a better choice? What do you think?
> > >>>>>>>
> > >>>>>>> A little background: Scio was built as a high-level Scala API for
> > >>>>>>> Google Cloud Dataflow (now also Apache Beam) and is heavily
> > >>>>>>> influenced by Spark and Scalding. It wraps around the Dataflow/Beam
> > >>>>>>> Java SDK while also providing features comparable to other Scala
> > >>>>>>> data frameworks. We use Scio on Dataflow for production extensively
> > >>>>>>> inside Spotify.
> > >>>>>>>
> > >>>>>>> Cheers,
> > >>>>>>> Neville
> > >>>>>
> > >>>>> --
> > >>>>> Jean-Baptiste Onofré
> > >>>>> jbono...@apache.org
> > >>>>> http://blog.nanthrax.net
> > >>>>> Talend - http://www.talend.com
> > >>>
> > >>> --
> > >>> Jean-Baptiste Onofré
> > >>> jbono...@apache.org
> > >>> http://blog.nanthrax.net
> > >>> Talend - http://www.talend.com
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com