Yes, we moves a few steps forward - was waiting before speaking of it but
seems it led to it.
Idea is to prepend a prefix per IO in the key. Currently the easiest
working solution is to sanitize the transform/fn name (since we validate
the unicity by default) and we are done. If not znough we can
2018-01-12 19:12 GMT+01:00 Lukasz Cwik :
>
>
> On Fri, Jan 12, 2018 at 10:01 AM, Romain Manni-Bucau <
> rmannibu...@gmail.com> wrote:
>
>>
>> 2018-01-12 18:54 GMT+01:00 Lukasz Cwik :
>>
>>> Some concerns would be:
>>> * How does a user discover what options can
On Fri, Jan 12, 2018 at 10:01 AM, Romain Manni-Bucau
wrote:
>
> 2018-01-12 18:54 GMT+01:00 Lukasz Cwik :
>
>> Some concerns would be:
>> * How does a user discover what options can be set and what values they
>> take?
>>
>
> Tempted to say "same as
2018-01-12 18:54 GMT+01:00 Lukasz Cwik :
> Some concerns would be:
> * How does a user discover what options can be set and what values they
> take?
>
Tempted to say "same as today", what's the regression you would see? A
--verbose can be nice thought to dump the DAG at startup
Some concerns would be:
* How does a user discover what options can be set and what values they
take?
* System properties are global so how would you prevent system properties
from conflicting within Apache Beam and with other non Apache Beam
libraries that may rely on system properties?
* There
Hmm, here is my thought: allowing main options settings is super important
and one of the most important user experience point. I d even say any IO
config shouldnt be an auto value but an option and through the transform
name you should be able to prefux the option name and override a particular
Robert Bradshaw had the idea of migrating away from using main(String[]
args) and just refactoring the code to test the WordCount PTransform
allowing one to write a traditional JUnit test that didn't call
main(String[] args). This would change what the contents of our examples
are and make them
1) TestPipeline#convertToArgs is meant to convert TestPipelineOptions into
a String[] containing --arg=value for integration tests that only have a
main(String[] arg) entry point like WordCount. There is this PR[1] that is
outstanding that is attempting to clean this up and simplify it so that we
Some inputs - maybe comments? - would be appreciated on:
1. Why using json in
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestPipeline.java#L480
and not iterate over the options?
2. Also this is likely used to do a fromArgs and create
Give links to the code segments that you want background on.
On Wed, Jan 10, 2018 at 12:44 AM, Romain Manni-Bucau
wrote:
> updated my junit 5 PR to show that: https://github.com/
> apache/beam/pull/4360/files#diff-578d1770f8b47ebbc1e74a2c19de9a6aR28
>
> It doesn't remove
updated my junit 5 PR to show that:
https://github.com/apache/beam/pull/4360/files#diff-578d1770f8b47ebbc1e74a2c19de9a6aR28
It doesn't remove jackson yet but exposes a nicer user interface for the
config.
I'm not fully clear on all the jackson usage yet, there are some round
trips (PO -> json ->
Removing large dependencies is great but I can only assume it will be a lot
of work so if you can get it to work in a backwards compatible way great.
On Tue, Jan 9, 2018 at 1:52 PM, Romain Manni-Bucau
wrote:
> It conflicts easily between libs and containers. Shade is not
It conflicts easily between libs and containers. Shade is not a good option
too - see the thread on this topic :(.
At the end i see using the cli solution closer to user - vs framework dev
for json - and hurting less in terms of classpath so pby sthg to test no?
Le 9 janv. 2018 22:47, "Lukasz
Romain, how has Jackson been a classpath breaker?
On Tue, Jan 9, 2018 at 1:20 PM, Romain Manni-Bucau
wrote:
> Hmm, beam already owns the cli parsing - that is what I meant - it only
> misses the arg delimiter (ie quoting) and adding it is easy no?
>
> Le 9 janv. 2018
Hmm, beam already owns the cli parsing - that is what I meant - it only
misses the arg delimiter (ie quoting) and adding it is easy no?
Le 9 janv. 2018 21:19, "Robert Bradshaw" a écrit :
> On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau
> wrote:
On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau
wrote:
>
> Le 9 janv. 2018 19:50, "Robert Bradshaw" a écrit :
>
> From what I understand:
>
> 1) The command line argument values (both simple and more complex) are
> all JSON-representable.
>
> And
Le 9 janv. 2018 19:50, "Robert Bradshaw" a écrit :
>From what I understand:
1) The command line argument values (both simple and more complex) are
all JSON-representable.
And must be all CLI representable
2) The command line is a mapping of keys to these values.
Which
>From what I understand:
1) The command line argument values (both simple and more complex) are
all JSON-representable.
2) The command line is a mapping of keys to these values.
As such, it seems quite natural to represent the whole set as a single
JSON map, rather than using a different, custom
Ah, you don't want JSON within JSON. I see, if thats the case just migrate
all of them to string tokenization and drop the Jackson usage for string ->
string[] conversion.
On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau
wrote:
> Lukasz, the use case is to znsure the
Lukasz, the use case is to znsure the config used can still map the CLI and
that options dont start to abuse json so it is exactly the opposite of "we
can be fancy" and is closer to "we can be robust". Also the default should
be easy and not json (i just want to set the runner, --runner=xxx is
Now that I think about this more. I looked at some of the examples in the
pom.xml and they don't seem to be tricky to write in JSON.
I also looked at the Jenkins job configurations (specifically the
performance tests) and they pass around maps which they convert to the
required format without
Good point for \t and ,.
Any objection to use jackson as a fallback for that purpose - for backwqrd
compat only - and make it optional then? Will create the ticket if not.
Le 8 janv. 2018 20:32, "Robert Bradshaw" a écrit :
> Part of the motivation to use JSON for more
Part of the motivation to use JSON for more complex options was that
it avoids having to define (and document, test, have users learn, ...)
yet another format for expressing lists, maps, etc.
On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik wrote:
> Ken, this is specifically about
On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik wrote:
> Ken, this is specifically about running integration tests and not a users
> main().
>
Yea, I was just addressing the bit about advanced users having a headache.
Hopefully no end users are really writing options as JSON
Ken, this is specifically about running integration tests and not a users
main().
Note, that PipelineOptions JSON format was used because it was a convenient
serialized form that is easy to explain to people what is required.
Using a different string tokenizer and calling
We do have a plain command line syntax, and whoever writes the
main(String[]) function is responsible for invoking the parser. It isn't
quite as nice as standard arg parse libraries, but it isn't too bad. It
would be great to improve, though.
Jackson is for machine-to-machine communication or
Le 7 janv. 2018 12:53, "Jean-Baptiste Onofré" a écrit :
Hi Romain,
I guess you are assuming that pipeline options are flat command line like
argument, right ?
Actually, theoretically, pipeline options can be represented as json,
that's why we use jackson.
The pipeline
Hi Romain,
I guess you are assuming that pipeline options are flat command line like
argument, right ?
Actually, theoretically, pipeline options can be represented as json, that's
why we use jackson.
The pipeline options can be serialized/deserialized as json.
The purpose is to remove the
Hi guys,
not sure i fully get why jackson is used to parse pipeline options in the
testing integration
why not using a token parser like [1] which allows to map 1-1 with the user
interface (command line) the options?
[1]
29 matches
Mail list logo