Hi Dan,

fair enough.

As I'm also working on new DSLs (XML, JSON), I already created the dsls module.

So, I would say dsls/scala.

WDYT ?

Regards
JB

On 06/24/2016 05:07 PM, Dan Halperin wrote:
I don't think that sdks/scala is the right place -- scio is not a Beam
Scala SDK; it wraps the existing Java SDK.

Some options:
* sdks/java/extensions  (Scio builds on the Java SDK) -- mentally vetoed
since Scio isn't an extension for the Java SDK, but rather a wrapper

* dsls/java/scio (Scio is a Beam DSL that uses the Java SDK)
* dsls/scio  (Scio is a Beam DSL that could eventually use multiple SDKs)
* extensions/java/scio  (Scio is an extension of Beam that uses the Java
SDK)
* extensions/scio  (Scio is an extension of Beam that is not limited to one
SDK)

I lean towards either dsls/java/scio or extensions/java/scio, since I don't
think there are plans for Scio to handle multiple different SDKs (in
different languages). The question between these two is whether we think
DSLs are "big enough" to be a top level concept.

On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

Good point about new Fn and the fact it's based on the Java SDK.

It's just that in terms of "marketing", it's a good message to provide a
Scala SDK even if technically it's more of a DSL.

For instance, a valid "marketing" DSL would be a Java fluent DSL on top of
the Java SDK, or a declarative XML DSL.

However, from a technical perspective, it can go into a DSL module.

My $0.02 ;)

Regards
JB


On 06/24/2016 06:51 AM, Frances Perry wrote:

+Rafal & Andrew again

I am leaning DSL for two reasons: (1) scio uses the existing java execution
environment (and won't have a language-specific fn harness of its own), and
(2) it changes the abstractions that users interact with.

I recently saw a scio repl demo from Reuven -- there's some really cool
stuff in there. I'd love to dive into it a bit more and see what can be
generalized beyond scio. The repl-like interactive graph construction is
very similar to what we've seen with ipython, in that it doesn't always
play nicely with the graph construction / graph execution distinction. I
wonder what changes to Beam might more generally support this. The
materialize stuff looks similar to some functionality in FlumeJava we used
to support multi-segment pipelines with some shared intermediate
PCollections.

On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

Hi Neville,

thanks for the update !

As it adds support for another language, and to clearly identify the
purpose, I would say sdks/scala.

Regards
JB


On 06/23/2016 11:56 PM, Neville Li wrote:

+folks in my team

On Thu, Jun 23, 2016 at 5:57 PM Neville Li <neville....@gmail.com>
wrote:

Hi all,


I'm the co-author of Scio <https://github.com/spotify/scio> and am in the
process of moving code to Beam (BEAM-302
<https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if
sdks/scala is the right place for this code or if something like dsls/scio
is a better choice? What do you think?

A little background: Scio was built as a high-level Scala API for Google
Cloud Dataflow (now also Apache Beam) and is heavily influenced by Spark
and Scalding. It wraps around the Dataflow/Beam Java SDK while also
providing features comparable to other Scala data frameworks. We use Scio
on Dataflow for production extensively inside Spotify.

Cheers,
Neville



--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com



