Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
Yes exactly JB, I just want to ensure the sdk/core API is clear and well defined and that any failure to respect it is treated as a runner bug. What I don't want is a buggy impl leaking into the SDK/core definition. Romain Manni-Bucau @rmannibucau <https://twitter.com/rmannibucau>

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
2. Bundle size can be highly influenced / configured by the user. Both are needed to be able to propose a strong API compared to competitors, and to avoid users seeing only disadvantages in going portable. Let's just do it, no? Regards JB On 02/18/2018 11:05 AM, Romain Manni-Bucau wrote: > > >

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
On 18 Feb 2018 00:23, "Kenneth Knowles" wrote: On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau wrote: > > If you give an example of a high-level need (e.g. "I'm trying to write an > IO for system $x and it requires the following initialization and the >

Re: @TearDown guarantees

2018-02-17 Thread Romain Manni-Bucau
On 17 Feb 2018 22:31, "Eugene Kirpichov" wrote: On Sat, Feb 17, 2018 at 1:10 PM Romain Manni-Bucau wrote: > You phrased it right Eugene - thanks for that. > > However the solution is not functional I think - hope I missed something. > With distribution etc you can't

Re: @TearDown guarantees

2018-02-17 Thread Romain Manni-Bucau
? Anything important I missed? On 17 Feb 2018 21:11, "Jean-Baptiste Onofré" wrote: > I agree, it's a decent assumption. > > Regards > JB > > On 02/17/2018 05:59 PM, Romain Manni-Bucau wrote: > > Assuming a Pipeline.run(); the corresponding sequence:

Re: @TearDown guarantees

2018-02-17 Thread Romain Manni-Bucau
What do you mean by execution? > > On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau < > rmannibu...@gmail.com> wrote: > >> >> >> On 16 Feb 2018 22:41, "Reuven Lax" wrote: >> >> Kenn is correct. Allowing Fn reuse across bundles was a m

Re: @TearDown guarantees

2018-02-17 Thread Romain Manni-Bucau
at 1:33 PM, Kenneth Knowles wrote: > On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau > wrote: >> >> The serialization of fn being once per bundle, the perf impact is only >> huge if there is a bug somewhere else, even java serialization is >> negligible on

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
On 16 Feb 2018 19:28, "Kenneth Knowles" wrote: On Fri, Feb 16, 2018 at 9:39 AM, Romain Manni-Bucau wrote: > > 2018-02-16 18:18 GMT+01:00 Kenneth Knowles : > >> Which runner's bundling are you concerned with? It sounds like the Flink >> runner? >>

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
2018-02-16 18:18 GMT+01:00 Kenneth Knowles : > Which runner's bundling are you concerned with? It sounds like the Flink > runner? > Flink, Spark, DirectRunner, DataFlow at least (others would be good but are out of scope) > > Kenn > > > On Fri, Feb 16, 2018 at 9:04

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
g to establish the basics to start over and get this on track to > solving the real problem. > Concretely I need a well-defined lifecycle for any DoFn executed in beam; today there is no such thing, which makes it impossible to develop transforms/fn correctly on the user side. > > Kenn

multi-env var representation for pipeline options

2018-02-16 Thread Romain Manni-Bucau
beam.appName=myapp any opinion on that?

Re: PipelineOptions fromSystemProps?

2018-02-16 Thread Romain Manni-Bucau
will create it now, thanks

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
, you want it at the end, so in teardown. So yes, teardown must be reliable somehow.

Re: PipelineOptions fromSystemProps?

2018-02-16 Thread Romain Manni-Bucau
Oh, so the point was that env would be under the portability umbrella whereas system properties are not? Kind of makes sense phrased this way for me. Do we want another thread for that?

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
erf 2. keep teardown as a final release object (which is not that useful because of the end of the sentence) and add clean cache lifecycle management. Tempted to say 1 is saner in the short term, in particular because beam is 2.x and users already use it this way.

@TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
ut requires the runner to do so. This way the non-portable behavior is where it belongs, in the vendor-specific code. It leads to a reliable API for the end user and lets runners document that they don't respect - yet - the API when relevant. wdyt?

Re: PipelineOptions fromSystemProps?

2018-02-15 Thread Romain Manni-Bucau
2018-02-15 20:00 GMT+01:00 Kenneth Knowles : > On Thu, Feb 15, 2018 at 12:03 AM, Romain Manni-Bucau < > rmannibu...@gmail.com> wrote: >> >> 2. default properties = env + system properties: this is what all config >> libs do (spring config, tamaya, deltaspike, micropr

Re: PipelineOptions fromSystemProps?

2018-02-15 Thread Romain Manni-Bucau
o it, doesn't cost anything for us and enables more use cases so it is a clear win-win. Hope it makes sense

Re: PipelineOptions fromSystemProps?

2018-02-14 Thread Romain Manni-Bucau
onfig/EnvironmentPropertyConfigSource.java#L55 or a fancier - but real world - one like key -> key.substring(prefix.length()).toUpperCase(ROOT).replace('.', '_')
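The mapping quoted in this message can be sketched as a small standalone function. This is only an illustration: the class and method names are hypothetical, `ROOT` is assumed to be `java.util.Locale.ROOT`, and nothing here is taken from the Beam codebase or from the linked config source.

```java
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Beam: wraps the key mapping quoted in the message.
final class EnvKeyMapping {
    private EnvKeyMapping() {}

    // Returns a function that drops the given prefix, upper-cases the rest
    // with a locale-independent rule, and replaces dots with underscores.
    static UnaryOperator<String> forPrefix(String prefix) {
        return key -> key.substring(prefix.length())
                .toUpperCase(Locale.ROOT)
                .replace('.', '_');
    }
}
```

With the assumed prefix `beam.`, the key `beam.app.name` would map to `APP_NAME`. The function assumes every key it receives really starts with the prefix; a caller would filter on `startsWith(prefix)` first.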

Re: PipelineOptions fromSystemProps?

2018-02-14 Thread Romain Manni-Bucau
FYI created https://github.com/apache/beam/pull/4683 about it

Re: PipelineOptions fromSystemProps?

2018-02-13 Thread Romain Manni-Bucau
18 at 9:51 AM, Romain Manni-Bucau > wrote: > >> I like your proposal Kenneth. Perfectly fits my use case and deployment >> one as well - when ops configure the env without modifying the code. >> >> How do we move forward on that? Should I send a PR or do you want to

Re: PipelineOptions fromSystemProps?

2018-02-13 Thread Romain Manni-Bucau
I like your proposal Kenneth. Perfectly fits my use case and deployment one as well - when ops configure the env without modifying the code. How do we move forward on that? Should I send a PR or do you want to import what was in dataflow?

classloader fixes for pipeline options

2018-02-13 Thread Romain Manni-Bucau

Re: PipelineOptions fromSystemProps?

2018-02-13 Thread Romain Manni-Bucau
makes sense, do we want beam.foo.bar -> --foo-bar conversion too?
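One way to read the `beam.foo.bar -> --foo-bar` conversion asked about here is sketched below. The class name, the `beam.` prefix handling and the `--name=value` output shape are all assumptions made for illustration; the message does not say which form Beam's option parsing would accept, and this is not the code from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper, not part of Beam: turns prefixed properties into CLI-style args.
final class SystemPropertyArgs {
    private SystemPropertyArgs() {}

    // Keeps only keys starting with the prefix, strips it, replaces dots with
    // dashes and emits "--name=value". Keys are sorted so the output is stable.
    static String[] toArgs(Properties props, String prefix) {
        List<String> args = new ArrayList<>();
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            if (!key.startsWith(prefix)) {
                continue;
            }
            String name = key.substring(prefix.length()).replace('.', '-');
            args.add("--" + name + "=" + props.getProperty(key));
        }
        return args.toArray(new String[0]);
    }
}
```

Calling `toArgs(System.getProperties(), "beam.")` with `-Dbeam.foo.bar=1` set would yield a single entry, `--foo-bar=1`, and ignore every unrelated system property.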

PipelineOptions fromSystemProps?

2018-02-13 Thread Romain Manni-Bucau
is to enable users to wrap the pipeline API but still expose the pipeline options to end users for advanced cases. Any discussion on this kind of usage already? What do you think of this proposal? Side note: we can think about a fromEnv() too.

Re: [VOTE] Release 2.3.0, release candidate #3

2018-02-12 Thread Romain Manni-Bucau
you can't once you have closed the staging repo

Re: [VOTE] Release 2.3.0, release candidate #3

2018-02-12 Thread Romain Manni-Bucau
oops sorry, read too fast (thanks to artifactId and folder names not being aligned ;)): deploy#skip=true in the module :)

Re: [VOTE] Release 2.3.0, release candidate #3

2018-02-12 Thread Romain Manni-Bucau
it is not in the parent modules so completely skipped from the reactor

Re: [VOTE] Release 2.3.0, release candidate #3

2018-02-12 Thread Romain Manni-Bucau
Ok, checked custom jobs on spark and direct runners + -parameters is usable + some advanced sdk-core integration usages (outside runners) - not sure where it fits the spreadsheet though.

Re: [VOTE] Release 2.3.0, release candidate #3

2018-02-11 Thread Romain Manni-Bucau
+1

Re: [INFO] Gradle build is flaky on Jenkins

2018-02-09 Thread Romain Manni-Bucau
and never hardcoded IMHO)

Re: [VOTE] Release 2.3.0, release candidate #2

2018-02-08 Thread Romain Manni-Bucau
since it breaks only examples, not sure it is worth yet another reroll (which already means a 2 weeks delay on the plan). Users will be affected the same anyway - and in an expected way until beam handles classloaders per transform. A note on the side is probably fine.

Re: [VOTE] Release 2.3.0, release candidate #2

2018-02-08 Thread Romain Manni-Bucau
; wrote: > > > > +1 > > > > I verified python quick start, mobile gaming examples, > streaming > > on Direct and Dataflow runners. Thank you JB! > > > > On Thu, Feb 8, 2018 at

Re: dependencies.txt in META-INF?

2018-02-08 Thread Romain Manni-Bucau
It was abused too much by libs and is not supported everywhere :( On 8 Feb 2018 22:39, "Lukasz Cwik" wrote: > It is unfortunate that setting Class-Path is so broken. > > On Wed, Feb 7, 2018 at 10:55 PM, Romain Manni-Bucau > wrote: > >> Not really: >> 1. I ne

Re: A 15x speed-up in local Python DirectRunner execution

2018-02-08 Thread Romain Manni-Bucau
Very interesting! Sounds like a sane path for beam's future and I'm very happy it is consistent with the current Java experience: no need to interlace runners at the end, it makes design, code and user experience way better than trying to put everything in the direct runner :). On 8 Feb 2018 19:20,

Re: [VOTE] Release 2.3.0, release candidate #2

2018-02-08 Thread Romain Manni-Bucau
+1 (non-binding), thanks JB for the effort!

Re: dependencies.txt in META-INF?

2018-02-07 Thread Romain Manni-Bucau
entry is the name of all dependencies resolved in > the shadow configuration for the project. > > dependencies { > shadow 'junit:junit:3.8.2' > } > > Inspecting the META-INF/MANIFEST.MF entry in the JAR file will reveal the > following attribute: > > Class-Pat

Re: dependencies.txt in META-INF?

2018-02-07 Thread Romain Manni-Bucau
ng already running a server from this kind of file typically. > > On Wed, Feb 7, 2018 at 7:23 AM, Romain Manni-Bucau > wrote: > >> Hi guys, >> >> I have a use case where I would resolve beam classpath programmatically. >> I wonder if it would be possible

dependencies.txt in META-INF?

2018-02-07 Thread Romain Manni-Bucau
these metadata. Is it something the project could be interested in?

Re: Schema-Aware PCollections revisited

2018-02-05 Thread Romain Manni-Bucau
> On 5 Feb 2018 21:43, "Reuven Lax" wrote: > > Which json library are you thinking of? At least in Java, there's always > been a problem of no good standard Json library. > > > > On Mon, Feb 5, 2018 at 12:03 PM, Romain Manni-Bucau > wrote: >

Re: Schema-Aware PCollections revisited

2018-02-05 Thread Romain Manni-Bucau
hinking of? At least in Java, there's always been a problem of no good standard Json library. On Mon, Feb 5, 2018 at 12:03 PM, Romain Manni-Bucau wrote: > > > On 5 Feb 2018 19:54, "Reuven Lax" wrote: > > multiplying by 1.0 doesn't really solve the right p

Re: coder evolutions?

2018-02-05 Thread Romain Manni-Bucau
ll). >> Instead, I'd keep things the way they are, but offer a new Coder >> subclass that users can subclass if they want to write an "easy" Coder >> that does the delimiting for them (on encode and decode). We would >> point users to this for writing custom c

Re: Schema-Aware PCollections revisited

2018-02-05 Thread Romain Manni-Bucau
on friendly so you are back on json + metadata so jsonschema+extension entry is strictly equivalent and as typed Reuven On Sun, Feb 4, 2018 at 9:31 AM, Romain Manni-Bucau wrote: > You can handle integers using multipleOf: 1.0 IIRC. > Yes limitations are still here but it is a good sta

Re: coder evolutions?

2018-02-05 Thread Romain Manni-Bucau
Would this work for everyone (I can update the PR if so)? If the coder is not built in: prefix with byte size. Else: current behavior. On 5 Feb 2018 19:21, "Romain Manni-Bucau" wrote: > Answered inlined but I want to highlight beam is a portable API on top of > well k
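The "prefix with byte size" branch of this proposal can be illustrated with plain `java.io` streams. This is a minimal sketch of length prefixing in general, not the code from the PR: the `Encoder` and `Decoder` interfaces and the 4-byte int prefix are assumptions chosen to keep the example self-contained, and Beam's own coder API and wire format are not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch, not Beam code: wraps an arbitrary encoder/decoder pair
// so that each value is written as <int length><payload bytes>.
final class LengthPrefixed {
    private LengthPrefixed() {}

    interface Encoder<T> { void encode(T value, OutputStream out) throws IOException; }
    interface Decoder<T> { T decode(InputStream in) throws IOException; }

    static <T> void encode(T value, Encoder<T> delegate, OutputStream out) throws IOException {
        // Buffer the payload first so its size is known before writing.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        delegate.encode(value, buffer);
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(buffer.size());
        buffer.writeTo(data);
        data.flush();
    }

    static <T> T decode(Decoder<T> delegate, InputStream in) throws IOException {
        // Read exactly one payload and hand the delegate a bounded stream,
        // so a delegate that reads until EOF cannot consume the next element.
        DataInputStream data = new DataInputStream(in);
        byte[] payload = new byte[data.readInt()];
        data.readFully(payload);
        return delegate.decode(new ByteArrayInputStream(payload));
    }
}
```

The point of the design is that the delegate may read until end of stream, or close the stream it is given, without affecting the elements that follow; the cost is one extra buffer copy per element.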

Re: coder evolutions?

2018-02-05 Thread Romain Manni-Bucau
also not usable by most existing codecs out there. Even some jaxb or plain xml flavors don't work with it :(. On 5 Feb 2018 18:46, "Robert Bradshaw" wrote: On Sun, Feb 4, 2018 at 6:44 AM, Romain Manni-Bucau wrote: > Hi guys, > > I submitted a PR on coders to enhance 1. th

Re: coder evolutions?

2018-02-05 Thread Romain Manni-Bucau
Thanks, created https://issues.apache.org/jira/browse/BEAM-3616

Re: [DISCUSS] State of the project: Culture and governance

2018-02-04 Thread Romain Manni-Bucau
ilding and the Apache Way. PTAL I think > >> > this is really good and I don't see why others could disagree: > >> > > >> > https://flink.apache.org/how-to-contribute.html#how-to-become-a-committer > >> > > >> > > >> > Roma

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Romain Manni-Bucau
> > > BTW in the past when I looked, Json schemas seemed to have some odd > limitations > > inherited from Javascript (e.g. no distinction between integer and > > floating-point types). Is that still true? > > > > Reuven > > > > On Sun, Feb 4, 2018 at 9:12 A

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
d >> > > be great if in the next major version we were better prepared for >> evolution of >> > > coders, e.g. by having coders support a version marker or >> something like that, >> > > with an API for detecting the version of data on wire and read

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Romain Manni-Bucau
enrich the model with a beam object which would allow completing the metadata as required when needed (never?).

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
yep sadly :( how should we track it properly to not forget it for v3? (I don't trust jira much but if we don't have anything better...) when do we start beam 3? next week? :)

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
;). With this: no runner impact at all :).

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Romain Manni-Bucau
2018-02-04 17:53 GMT+01:00 Reuven Lax : > > > On Sun, Feb 4, 2018 at 8:42 AM, Romain Manni-Bucau > wrote: > >> >> 2018-02-04 17:37 GMT+01:00 Reuven Lax : >> >>> I'm not sure where proto comes from here. Proto is one example of a type >>> th

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
this thread. We can postpone it but it would then break later, and so probably for more users.

Re: coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
2018-02-04 17:34 GMT+01:00 Reuven Lax : > One question - does this change the actual byte encoding of elements? > We've tried hard not to do that so far for reasons of compatibility. > > Reuven > >

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Romain Manni-Bucau
long story short the main work of this schema track is not only on using schema in runners and other ways but also starting to make beam consistent with itself which is probably the most important outcome since it is the user facing side of this work. > > On Sun, Feb 4, 2018 at 12:22 AM

coder evolutions?

2018-02-04 Thread Romain Manni-Bucau
makes sense, if not, don't hesitate to ask questions. Happy end of week-end.

Re: Schema-Aware PCollections revisited

2018-02-04 Thread Romain Manni-Bucau
al for some time, so the API will not be fixed in stone. >> >> Any more comments on this approach before we start implementing a >> prototype? >> >> Reuven >> >> On Wed, Jan 31, 2018 at 1:12 PM, Romain Manni-Bucau < >> rmannibu...@gmail.com> wrot

Re: rename: BeamRecord -> Row

2018-02-03 Thread Romain Manni-Bucau
This is as true as the renaming is not needed so I guess the PR owner will decide ;). Thanks for the clarification.

Re: rename: BeamRecord -> Row

2018-02-03 Thread Romain Manni-Bucau
of both will be beneficial to beam in any case so better to ensure all parts of the projects move in the same direction instead of requiring yet another layer of conversion, no?

Re: rename: BeamRecord -> Row

2018-02-02 Thread Romain Manni-Bucau
Hi, shouldn't the discussion on schema, which has a direct impact on this generic container, be closed before any action on this? On 3 Feb 2018 01:09, "Ankur Chauhan" wrote: > ++ > > On Fri, Feb 2, 2018 at 1:33 PM Rafael Fernandez > wrote: > >> Very strong +1 >> >> >> On Fri, Feb 2, 2018 at

Re: [DISCUSS] [Java] Private shaded dependency uber jars

2018-02-02 Thread Romain Manni-Bucau
well we can disagree on the code - it is fine ;), but the needed part of it by beam is not huge and in any case it can be forked without requiring 10 classes - if so we'll use another impl than the guava one ;). This is the whole point.

Re: [DISCUSS] [Java] Private shaded dependency uber jars

2018-02-02 Thread Romain Manni-Bucau
(I'm thinking of coders here) but this can be another topic.

Re: [DISCUSS] [Java] Private shaded dependency uber jars

2018-02-02 Thread Romain Manni-Bucau
ver a good idea, in particular because the shade can be broken and requires setting up clirr or things like that, and when it breaks you are done and need to fork it anyway. Limiting the dependencies for an API - as beam is - is always saner even if it requires a small fork of code.

Re: [DISCUSS] [Java] Private shaded dependency uber jars

2018-02-02 Thread Romain Manni-Bucau
>>> for 'sdks/java/core'. We can then attack other areas afterwards. >>> >>> Other important idea would be to get rid of Protobuf in public APIs >>> like GCPIO and to better shade it from leaking into the runners. An >>> unexpected sid

Re: [PROPOSAL] Switch from Guava futures vs Java 8 futures

2018-02-01 Thread Romain Manni-Bucau
+1 indeed On 1 Feb 2018 21:34, "Eugene Kirpichov" wrote: > Reducing dependency on Guava in favor of something Java-standard sounds > great, +1. > > On Thu, Feb 1, 2018 at 11:53 AM Reuven Lax wrote: > >> Unless there's something that doesn't work in Java 8 future, +1 to >> migrating. >> >>

Re: drop scala....version from artifact ;)

2018-02-01 Thread Romain Manni-Bucau
Flink, Gearpump, Spark, and GCE provisioning are affected by this "issue". Dropping it if we never manage 2 versions is nicer for end users IMHO but I'm fine keeping it. Just would like to ensure it is uniform across the whole project.

Re: [DISCUSS] [Java] Private shaded dependency uber jars

2018-02-01 Thread Romain Manni-Bucau
a map of list is fine and not a challenge we'll face long I hope ;)

Re: why org.apache.beam.sdk.util.UnownedInputStream fails on close instead of ignoring it

2018-01-31 Thread Romain Manni-Bucau
k" wrote: > I'm not sure what you mean by it closes the door since as the caller of > the library you can create a wrapper filter input stream that ignores close > calls effectively overriding what happens in the UnownedInputStream. > > On Wed, Jan 31, 2018 a
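The wrapper suggested in the quoted reply, a filter input stream that ignores close calls, might look like the sketch below. The class name is hypothetical and the example does not depend on Beam; any stream whose `close()` throws can stand in for `UnownedInputStream` when trying it out.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical wrapper, not part of Beam: delegates every read to the wrapped
// stream but turns close() into a no-op, leaving the caller's stream untouched.
final class CloseIgnoringInputStream extends FilterInputStream {
    CloseIgnoringInputStream(InputStream delegate) {
        super(delegate);
    }

    @Override
    public void close() {
        // Intentionally empty: the underlying stream is owned by someone else.
    }
}
```

A coder would pass `new CloseIgnoringInputStream(inStream)` to a parsing library that closes its input, so the library's `close()` never reaches the stream it does not own.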

Re: why org.apache.beam.sdk.util.UnownedInputStream fails on close instead of ignoring it

2018-01-31 Thread Romain Manni-Bucau
ayer and making it work smoothly for most mappers. Anyway I can live with it but I'm a bit sad it closes the door to easily writing extensions. On Wed, Jan 31, 2018 at 12:04 PM, Romain Manni-Bucau wrote: > Hmm, here we are the ones owning the call since it is in a coder, no? Do > w

Re: [VOTE] Release 2.3.0, release candidate #1

2018-01-31 Thread Romain Manni-Bucau
@ismael: any vote can be changed from -1 to +1 (or +-0) without additional delay On 1 Feb 2018 03:15, "Lukasz Cwik" wrote: > Note that a user reported TextIO being broken on Flink. > Thread is here: https://lists.apache.org/thread.html/ > 47b16c94032392782505415e010970fd2a9480891c55c2f7b5de

Re: [DISCUSS] [Java] Private shaded dependency uber jars

2018-01-31 Thread Romain Manni-Bucau
Why not drop guava from the whole beam codebase? With java 8 it is quite easy to do it and avoid a bunch of conflicts. Did it in 2 projects with quite a good result. On 1 Feb 2018 06:50, "Lukasz Cwik" wrote: > Make sure to include the guava version in the artifact name so that we can > have mu

Re: Tracking Sickbayed tests in Jira

2018-01-31 Thread Romain Manni-Bucau
find . -name '*.java' | xargs grep @Ignore | sed 's#src/.*##' | sort -u
# to count ignored tests
$ find . -name '*.java' | xargs grep @Ignore | wc -l
The last one, combined with a loop and git, lets you follow the evolution and check whether it grows or decreases.

Re: Schema-Aware PCollections revisited

2018-01-31 Thread Romain Manni-Bucau
If you need help on the json part I'm happy to help. To give a few hints on what is very doable: we can add an avro module to johnzon (asf json{p,b} impl) to back jsonp by avro (guess it will be one of the first to be asked) for instance.

Re: Should we have a predictable test run order?

2018-01-31 Thread Romain Manni-Bucau
s replaying a test run > >> > because > >> > you can specify the order in which it should execute. > >> > > >> > I don't like having a strict order since it hides poorly written tests > >> > and > >> > people have a ten

Re: Schema-Aware PCollections revisited

2018-01-31 Thread Romain Manni-Bucau
. Can you detail it please?

Re: why org.apache.beam.sdk.util.UnownedInputStream fails on close instead of ignoring it

2018-01-31 Thread Romain Manni-Bucau
Hmm, here we are the ones owning the call since it is in a coder, no? Do we assume people will badly implement coders? In this particular case we can assume close() will be called by a framework I think. What about swallowing one close() and failing on the second?

Re: Schema-Aware PCollections revisited

2018-01-31 Thread Romain Manni-Bucau
usage. > > Once 2.3.0 release is out, I will start to update the document with those > ideas, > and PoC. > > Thanks ! > Regards > JB > > On 01/30/2018 08:42 AM, Romain Manni-Bucau wrote: > > > > > > On 30 Jan 2018 01:09, "Reuven Lax" >

drop scala....version from artifact ;)

2018-01-31 Thread Romain Manni-Bucau
Hi guys, since beam supports a single version of runners, why not drop the scala version from the artifactId? ATM upgrades are painful because you upgrade the beam version + runner artifactIds. wdyt?

Re: [VOTE] Release 2.3.0, release candidate #1

2018-01-31 Thread Romain Manni-Bucau
+1 (non-binding), upgraded to spark 2 in several test suites and integrations and works very well. Good job guys!

Re: IO plans?

2018-01-31 Thread Romain Manni-Bucau
Thanks JB, this is great news since they are widely used IOs in the industry and really awaited now.

IO plans?

2018-01-31 Thread Romain Manni-Bucau
Hi guys, is there a plan for future IO and some tracking somewhere? I particularly wonder if there are plans for an HTTP IO and common server IO like SFTP, SSH, etc...

Re: [DISCUSSION] Add hint/option on PCollection

2018-01-31 Thread Romain Manni-Bucau
icious and insane in terms of application code and maintenance IMHO. In that context, hints are a cheap and acceptable trade-off which enable without breaking users. Am I missing something?

Re: untyped pipeline API?

2018-01-30 Thread Romain Manni-Bucau
he SDK has no good > way of knowing what those decisions will be, so needs to conservatively > assume it could happen anywhere. > > On Tue, Jan 30, 2018 at 1:31 PM, Romain Manni-Bucau > wrote: > >> Hmm starts to smell like the old question "how to enforce runner >> con

Re: untyped pipeline API?

2018-01-30 Thread Romain Manni-Bucau
ints are depends on the runner. Runners are free to > split at any point (and often do to prevent cycles from appearing in the > graph). > > On Tue, Jan 30, 2018 at 1:27 PM, Romain Manni-Bucau > wrote: > >> I kind of agree on all of that and brings me to the interest

Re: [DISCUSSION] Add hint/option on PCollection

2018-01-30 Thread Romain Manni-Bucau
dy today with PCollection.setCoder, and that has caused some >>>>>> problems. Hints can be set on PTransforms though, and propagate to >>>>>> that >>>>>> PTransform's output PCollections. This is nearly as easy to use >

Re: untyped pipeline API?

2018-01-30 Thread Romain Manni-Bucau
is that the thing you want to happen is >> already done. There are some corner cases when you get to the portability >> framework but I am pretty sure it already works this way. If you show what >> is a PTransform and PCollection in your example it might show where we can >>

Re: untyped pipeline API?

2018-01-30 Thread Romain Manni-Bucau
xample. In other words if a coder's output is readable as another coder's input, the java strong typing doesn't know about it and can enforce some fake steps.

Re: [DISCUSSION] Add hint/option on PCollection

2018-01-30 Thread Romain Manni-Bucau
Maybe I should have started the discussion on the user mailing list: >>> it would be great to have user feedback on this, even if I got your >>> points. >>> >>> Sometime, I have the feeling that whatever we are proposing and >>>

Re: [DISCUSSION] Add hint/option on PCollection

2018-01-30 Thread Romain Manni-Bucau
Hmm, can work for pipeline hints but for transform hints we would need: p.apply(AddHint.of(.).wrap(originalTransform)) Would work for me too.

untyped pipeline API?

2018-01-30 Thread Romain Manni-Bucau
y and force the user to use an intermediate state to be typed. Is there already a way to avoid these useless round trips? Said otherwise: how to handle coder transitivity?

Re: why org.apache.beam.sdk.util.UnownedInputStream fails on close instead of ignoring it

2018-01-30 Thread Romain Manni-Bucau
I get the issue but I don't get the last part. Concretely we can support any lib by just removing the exception in the close, no? What would be the issue? No additional wrapper, no lib integration issue.

Re: [DISCUSSION] Add hint/option on PCollection

2018-01-30 Thread Romain Manni-Bucau
free-form hints that we > haven't already covered. > There are several cases, even in the direct runner to be able to industrialize it: - use that particular executor instance - debug these infos for that transform etc... As a high level design I think it is good to bring hints to beam to

why org.apache.beam.sdk.util.UnownedInputStream fails on close instead of ignoring it

2018-01-30 Thread Romain Manni-Bucau
Hi guys, All is in the subject ;) The rationale is to support any I/O library and not fail when the close is encapsulated. Any blocker to swallowing this close call?

Re: [DISCUSSION] Add hint/option on PCollection

2018-01-30 Thread Romain Manni-Bucau
. It should really be about tuning a runner execution (like the schema in spark).

Re: Should we have a predictable test run order?

2018-01-30 Thread Romain Manni-Bucau
nd you can configure this sequence in idea. Not perfect but better than hiding the issue probably. Also running "clean" forces inodes to change and increases the probability of reproducing it on linux.

Re: Should we have a predictable test run order?

2018-01-30 Thread Romain Manni-Bucau
Hi Daniel, as a quick fix it sounds good, but doesn't it hide a leak or issue (in test setup or in main code)? Long story short: using a random order can help find bugs faster instead of hiding them and only discovering them by chance when adding a new test. That said, good point to have it configurable with

Re: Schema-Aware PCollections revisited

2018-01-29 Thread Romain Manni-Bucau
On 30 Jan 2018 01:09, "Reuven Lax" wrote: On Mon, Jan 29, 2018 at 12:17 PM, Romain Manni-Bucau wrote: > Hi > > I have some questions on this: how hierarchic schemas would work? Seems it > is not really supported by the ecosystem (out of custom stuff) :(. Ho

Re: Detecting resources to stage

2018-01-29 Thread Romain Manni-Bucau
Scanning from the classloader, without assuming it is a URLClassLoader or even that it respects that contract, makes it more portable (EE, plain tomcat which doesn't respect this contract, OSGi soon, custom classloaders etc...). Sadly it is trivial to have failing cases using URLClassLoader. That said

Re: Schema-Aware PCollections revisited

2018-01-29 Thread Romain Manni-Bucau
Hi, I have some questions on this: how would hierarchical schemas work? It seems they are not really supported by the ecosystem (outside of custom stuff) :(. How would it integrate smoothly with other generic record types - N bridges? Concretely I wonder if using a json API couldn't be beneficial: json-p is a ni
