Hi Dan,
Fair enough.
As I'm also working on new DSLs (XML, JSON), I have already created the dsls
module.
So, I would say dsls/scala.
WDYT?
Regards
JB
On 06/24/2016 05:07 PM, Dan Halperin wrote:
I don't think that sdks/scala is the right place -- scio is not a Beam
Scala SDK; it wraps the existing Java SDK.
Some options:
* sdks/java/extensions (Scio builds on the Java SDK) -- mentally vetoed
since Scio isn't an extension for the Java SDK, but rather a wrapper
* dsls/java/scio (Scio is a Beam DSL that uses the Java SDK)
* dsls/scio (Scio is a Beam DSL that could eventually use multiple SDKs)
* extensions/java/scio (Scio is an extension of Beam that uses the Java
SDK)
* extensions/scio (Scio is an extension of Beam that is not limited to one
SDK)
I lean towards either dsls/java/scio or extensions/java/scio, since I don't
think there are plans for Scio to handle multiple different SDKs (in
different languages). The question between these two is whether we think
DSLs are "big enough" to be a top level concept.
On Thu, Jun 23, 2016 at 11:05 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:
Good point about new Fn and the fact it's based on the Java SDK.
It's just that in terms of "marketing", it's a good message to provide a
Scala SDK even if technically it's more of a DSL.
For instance, a valid "marketing" DSL would be a Java fluent DSL on top of
the Java SDK, or a declarative XML DSL.
However, from a technical perspective, it can go into the dsls module.
My $0.02 ;)
Regards
JB
On 06/24/2016 06:51 AM, Frances Perry wrote:
+Rafal & Andrew again
I am leaning DSL for two reasons: (1) scio uses the existing java execution
environment (and won't have a language-specific fn harness of its own), and
(2) it changes the abstractions that users interact with.
I recently saw a scio repl demo from Reuven -- there's some really cool
stuff in there. I'd love to dive into it a bit more and see what can be
generalized beyond scio. The repl-like interactive graph construction is
very similar to what we've seen with ipython, in that it doesn't always
play nicely with the graph construction / graph execution distinction. I
wonder what changes to Beam might more generally support this. The
materialize stuff looks similar to some functionality in FlumeJava we used
to support multi-segment pipelines with some shared intermediate
PCollections.
On Thu, Jun 23, 2016 at 9:22 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:
Hi Neville,
Thanks for the update!
As it adds support for another language, and to clearly identify the purpose, I
would say sdks/scala.
Regards
JB
On 06/23/2016 11:56 PM, Neville Li wrote:
+folks in my team
On Thu, Jun 23, 2016 at 5:57 PM Neville Li <neville....@gmail.com>
wrote:
Hi all,
I'm the co-author of Scio <https://github.com/spotify/scio> and am in the
process of moving code to Beam (BEAM-302
<https://issues.apache.org/jira/browse/BEAM-302>). Just wondering if
sdks/scala is the right place for this code or if something like dsls/scio
is a better choice? What do you think?
A little background: Scio was built as a high-level Scala API for Google
Cloud Dataflow (now also Apache Beam) and is heavily influenced by Spark
and Scalding. It wraps around the Dataflow/Beam Java SDK while also
providing features comparable to other Scala data frameworks. We use Scio
on Dataflow for production extensively inside Spotify.
Cheers,
Neville
--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com