Some inputs - maybe comments? - would be appreciated on:

1. Why using json in
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestPipeline.java#L480
and not iterate over the options?
2. Also this is likely used to do a fromArgs and create another pipeline
options later no? Is this api that useful here or does it need an options
composition - fully object?

Le 11 janv. 2018 19:09, "Lukasz Cwik" <[email protected]> a écrit :

> Give links to the code segments that you want background on.
>
> On Wed, Jan 10, 2018 at 12:44 AM, Romain Manni-Bucau <
> [email protected]> wrote:
>
>> updated my junit 5 PR to show that: https://github.com/apach
>> e/beam/pull/4360/files#diff-578d1770f8b47ebbc1e74a2c19de9a6aR28
>>
>> It doesn't remove jackson yet but exposes a nicer user interface for the
>> config.
>>
>> I'm not fully clear on all the jackson usage yet, there are some round
>> trips (PO -> json -> PO) which are weird without more knowledge.
>>
>>
>> Romain Manni-Bucau
>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>> <https://rmannibucau.metawerx.net/> | Old Blog
>> <http://rmannibucau.wordpress.com> | Github
>> <https://github.com/rmannibucau> | LinkedIn
>> <https://www.linkedin.com/in/rmannibucau>
>>
>> 2018-01-09 22:57 GMT+01:00 Lukasz Cwik <[email protected]>:
>>
>>> Removing large dependencies is great but I can only assume it will be a
>>> lot of work so if you can get it to work in a backwards compatible way
>>> great.
>>>
>>> On Tue, Jan 9, 2018 at 1:52 PM, Romain Manni-Bucau <
>>> [email protected]> wrote:
>>>
>>>> It conflicts easily between libs and containers. Shade is not a good
>>>> option too - see the thread on this topic :(.
>>>>
>>>> At the end i see using the cli solution closer to user - vs framework
>>>> dev for json - and hurting less in terms of classpath so pby sthg to test
>>>> no?
>>>>
>>>> Le 9 janv. 2018 22:47, "Lukasz Cwik" <[email protected]> a écrit :
>>>>
>>>>> Romain, how has Jackson been a classpath breaker?
>>>>>
>>>>>
>>>>> On Tue, Jan 9, 2018 at 1:20 PM, Romain Manni-Bucau <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hmm, beam already owns the cli parsing - that is what I meant - it
>>>>>> only misses the arg delimiter (ie quoting) and adding it is easy no?
>>>>>>
>>>>>> Le 9 janv. 2018 21:19, "Robert Bradshaw" <[email protected]> a
>>>>>> écrit :
>>>>>>
>>>>>>> On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau
>>>>>>> <[email protected]> wrote:
>>>>>>> >
>>>>>>> > Le 9 janv. 2018 19:50, "Robert Bradshaw" <[email protected]> a
>>>>>>> écrit :
>>>>>>> >
>>>>>>> > From what I understand:
>>>>>>> >
>>>>>>> > 1) The command line argument values (both simple and more complex)
>>>>>>> are
>>>>>>> > all JSON-representable.
>>>>>>> >
>>>>>>> > And must be all CLI representable
>>>>>>>
>>>>>>> Sorry, I should have said "all pipeline options." In any case, one
>>>>>>> can
>>>>>>> always do JSON -> string and the resulting string is usable as a
>>>>>>> command line argument.
>>>>>>>
>>>>>>> > 2) The command line is a mapping of keys to these values.
>>>>>>> >
>>>>>>> > Which makes your 1 not true since json supports more ;)
>>>>>>> >
>>>>>>> > As such, it seems quite natural to represent the whole set as a
>>>>>>> single
>>>>>>> > JSON map, rather than using a different, custom encoding for the
>>>>>>> top
>>>>>>> > level (whose custom escaping would have to be carried into the
>>>>>>> inner
>>>>>>> > JSON values). Note that JSON has the advantage that one never
>>>>>>> needs to
>>>>>>> > explain or define it, and parsers/serializers already exists for
>>>>>>> all
>>>>>>> > languages (e.g. if one has a driver script in another language for
>>>>>>> > launching a java pipeline, it's easy to communicate all the args).
>>>>>>> >
>>>>>>> > Same reasonning applies to CLI AFAIK
>>>>>>>
>>>>>>> The spec of what a valid command line argument list is is
>>>>>>> surprisingly
>>>>>>> inconsistent across platforms, languages, and programs. And your
>>>>>>> proposal seems to be getting into what the delimiter is, and how to
>>>>>>> escape it, and possibly then how to escape the escape character. All
>>>>>>> of this is sidestepped by pointing at an existing spec.
>>>>>>>
>>>>>>> > We can't get rid of Jackson in the core because of (1) so there's
>>>>>>> > little value in adding complexity to remove it from (2). The fact
>>>>>>> that
>>>>>>> > Java doesn't ship anything in its expansive standard library for
>>>>>>> this
>>>>>>> > is unfortuante, so we have to take a dependency on something.
>>>>>>> >
>>>>>>> > We actually can as shown before
>>>>>>>
>>>>>>> How, if JSON is integral to the parsing of the argument values
>>>>>>> themselves? (Or is the argument that it's not, but each extender of
>>>>>>> PipelineOptions is responsible for choosing a JSON parser and doing
>>>>>>> it
>>>>>>> themselves.)
>>>>>>>
>>>>>>> > and jackson is not a simple negligible lib
>>>>>>> > but a classpath breaker :(
>>>>>>>
>>>>>>> Perhaps there are better libraries we could use instead? Is the
>>>>>>> ecosystem so bad that it encourages projects to roll their own of
>>>>>>> everything rather than share code?
>>>>>>>
>>>>>>> > On Tue, Jan 9, 2018 at 10:00 AM, Lukasz Cwik <[email protected]>
>>>>>>> wrote:
>>>>>>> >> Ah, you don't want JSON within JSON. I see, if thats the case
>>>>>>> just migrate
>>>>>>> >> all of them to string tokenization and drop the Jackson usage for
>>>>>>> string
>>>>>>> >> ->
>>>>>>> >> string[] conversion.
>>>>>>> >>
>>>>>>> >> On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau
>>>>>>> >> <[email protected]>
>>>>>>> >> wrote:
>>>>>>> >>>
>>>>>>> >>> Lukasz, the use case is to znsure the config used can still map
>>>>>>> the CLI
>>>>>>> >>> and that options dont start to abuse json so it is exactly the
>>>>>>> opposite
>>>>>>> >>> of
>>>>>>> >>> "we can be fancy" and is closer to "we can be robust". Also the
>>>>>>> default
>>>>>>> >>> should be easy and not json (i just want to set the runner,
>>>>>>> --runner=xxx
>>>>>>> >>> is
>>>>>>> >>> easier than a json version). If the value doesnt start with a [
>>>>>>> we can
>>>>>>> >>> use
>>>>>>> >>> the tokenizer else jackson, wdyt?
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> Le 8 janv. 2018 22:59, "Lukasz Cwik" <[email protected]> a écrit
>>>>>>> :
>>>>>>> >>>
>>>>>>> >>> Now that I think about this more. I looked at some of the
>>>>>>> examples in the
>>>>>>> >>> pom.xml and they don't seem to be tricky to write in JSON.
>>>>>>> >>> I also looked at the Jenkins job configurations (specifically the
>>>>>>> >>> performance tests) and they pass around maps which they convert
>>>>>>> to the
>>>>>>> >>> required format without needing a user to write it themselves.
>>>>>>> >>> Using gradle, we will be able to trivially to do the same thing
>>>>>>> (convert
>>>>>>> >>> maps to json without needing the person to write it by hand)
>>>>>>> like groovy
>>>>>>> >>> does.
>>>>>>> >>> Since we are migrating away from Maven it doesn't seem
>>>>>>> worthwhile to
>>>>>>> >>> spend
>>>>>>> >>> time on to make it easier to write the args in the Maven poms.
>>>>>>> >>>
>>>>>>> >>> Is there another use case that is being missed?
>>>>>>> >>>
>>>>>>> >>> On Mon, Jan 8, 2018 at 1:38 PM, Romain Manni-Bucau
>>>>>>> >>> <[email protected]>
>>>>>>> >>> wrote:
>>>>>>> >>>>
>>>>>>> >>>> Good point for \t and ,.
>>>>>>> >>>>
>>>>>>> >>>> Any objection to use jackson as a fallback for that purpose -
>>>>>>> for
>>>>>>> >>>> backwqrd compat only - and make it optional then? Will create
>>>>>>> the ticket
>>>>>>> >>>> if
>>>>>>> >>>> not.
>>>>>>> >>>>
>>>>>>> >>>> Le 8 janv. 2018 20:32, "Robert Bradshaw" <[email protected]>
>>>>>>> a écrit :
>>>>>>> >>>>>
>>>>>>> >>>>> Part of the motivation to use JSON for more complex options
>>>>>>> was that
>>>>>>> >>>>> it avoids having to define (and document, test, have users
>>>>>>> learn, ...)
>>>>>>> >>>>> yet another format for expressing lists, maps, etc.
>>>>>>> >>>>>
>>>>>>> >>>>> On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik <[email protected]>
>>>>>>> wrote:
>>>>>>> >>>>> > Ken, this is specifically about running integration tests
>>>>>>> and not a
>>>>>>> >>>>> > users
>>>>>>> >>>>> > main().
>>>>>>> >>>>> >
>>>>>>> >>>>> > Note, that PipelineOptions JSON format was used because it
>>>>>>> was a
>>>>>>> >>>>> > convenient
>>>>>>> >>>>> > serialized form that is easy to explain to people what is
>>>>>>> required.
>>>>>>> >>>>> > Using a different string tokenizer and calling
>>>>>>> >>>>> > PipelineOptionsFactory.fromArgs() with the parsed strings
>>>>>>> seems like
>>>>>>> >>>>> > it
>>>>>>> >>>>> > would be better.
>>>>>>> >>>>> >
>>>>>>> >>>>> > These are the supported formats for fromArgs():
>>>>>>> >>>>> >    *   --project=MyProject (simple property, will set the
>>>>>>> "project"
>>>>>>> >>>>> > property
>>>>>>> >>>>> > to "MyProject")
>>>>>>> >>>>> >    *   --readOnly=true (for boolean properties, will set the
>>>>>>> >>>>> > "readOnly"
>>>>>>> >>>>> > property to "true")
>>>>>>> >>>>> >    *   --readOnly (shorthand for boolean properties, will
>>>>>>> set the
>>>>>>> >>>>> > "readOnly"
>>>>>>> >>>>> > property to "true")
>>>>>>> >>>>> >    *   --x=1 --x=2 --x=3 (list style simple property, will
>>>>>>> set the
>>>>>>> >>>>> > "x"
>>>>>>> >>>>> > property to [1, 2, 3])
>>>>>>> >>>>> >    *   --x=1,2,3 (shorthand list style simple property, will
>>>>>>> set the
>>>>>>> >>>>> > "x"
>>>>>>> >>>>> > property to [1, 2, 3])
>>>>>>> >>>>> >    *   --complexObject='{"key1":"value1",...} (JSON format
>>>>>>> for all
>>>>>>> >>>>> > other
>>>>>>> >>>>> > complex types)
>>>>>>> >>>>> >
>>>>>>> >>>>> > Using a string tokenizer that minimizes the number of
>>>>>>> required escape
>>>>>>> >>>>> > characters would be good so we could use newline characters
>>>>>>> as our
>>>>>>> >>>>> > only
>>>>>>> >>>>> > token. I would avoid ',\t ' as tokens since they are more
>>>>>>> likely to
>>>>>>> >>>>> > appear.
>>>>>>> >>>>> >
>>>>>>> >>>>> > On Mon, Jan 8, 2018 at 10:33 AM, Kenneth Knowles <
>>>>>>> [email protected]>
>>>>>>> >>>>> > wrote:
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> We do have a plain command line syntax, and whoever writes
>>>>>>> the
>>>>>>> >>>>> >> main(String[]) function is responsible for invoking the
>>>>>>> parser. It
>>>>>>> >>>>> >> isn't
>>>>>>> >>>>> >> quite as nice as standard arg parse libraries, but it isn't
>>>>>>> too bad.
>>>>>>> >>>>> >> It
>>>>>>> >>>>> >> would be great to improve, though.
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> Jackson is for machine-to-machine communication or other
>>>>>>> situations
>>>>>>> >>>>> >> where
>>>>>>> >>>>> >> command line parsing doesn't work so well.
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> Are we using these some other way?
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> Kenn
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> On Sun, Jan 7, 2018 at 7:21 AM, Romain Manni-Bucau
>>>>>>> >>>>> >> <[email protected]>
>>>>>>> >>>>> >> wrote:
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> Le 7 janv. 2018 12:53, "Jean-Baptiste Onofré" <
>>>>>>> [email protected]> a
>>>>>>> >>>>> >>> écrit :
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> Hi Romain,
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> I guess you are assuming that pipeline options are flat
>>>>>>> command
>>>>>>> >>>>> >>> line
>>>>>>> >>>>> >>> like
>>>>>>> >>>>> >>> argument, right ?
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> Actually, theoretically,  pipeline options can be
>>>>>>> represented as
>>>>>>> >>>>> >>> json,
>>>>>>> >>>>> >>> that's why we use jackson.
>>>>>>> >>>>> >>> The pipeline options can be serialized/deserialized as
>>>>>>> json.
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> Yes but if users (or dev ;)) start to use that then it is
>>>>>>> trivial
>>>>>>> >>>>> >>> to
>>>>>>> >>>>> >>> break the cli handling and fromArgs parsing or, if not,
>>>>>>> break the
>>>>>>> >>>>> >>> user
>>>>>>> >>>>> >>> experience. So at the end it is a kind of "yes but no",
>>>>>>> right?
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> PS: already see some advanced users having a headache
>>>>>>> trying to
>>>>>>> >>>>> >>> pass
>>>>>>> >>>>> >>> pipeline options in json so using the plain command line
>>>>>>> syntax can
>>>>>>> >>>>> >>> be more
>>>>>>> >>>>> >>> friendly too.
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> The purpose is to remove the jackson dependencies ?
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> Later yes, seeing core dep tree I identify a lot of dep
>>>>>>> which can
>>>>>>> >>>>> >>> conflict in some env and are not really needed or bring
>>>>>>> much being
>>>>>>> >>>>> >>> in the
>>>>>>> >>>>> >>> core - like avro as mentionned in another thread. Can need
>>>>>>> a
>>>>>>> >>>>> >>> sanitizing
>>>>>>> >>>>> >>> round. Short term it was really a "why is it that
>>>>>>> complicated" ;).
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> Regards
>>>>>>> >>>>> >>> JB
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> On 01/07/2018 11:38 AM, Romain Manni-Bucau wrote:
>>>>>>> >>>>> >>>>
>>>>>>> >>>>> >>>> Hi guys,
>>>>>>> >>>>> >>>>
>>>>>>> >>>>> >>>> not sure i fully get why jackson is used to parse
>>>>>>> pipeline options
>>>>>>> >>>>> >>>> in
>>>>>>> >>>>> >>>> the testing integration
>>>>>>> >>>>> >>>>
>>>>>>> >>>>> >>>> why not using a token parser like [1] which allows to map
>>>>>>> 1-1 with
>>>>>>> >>>>> >>>> the
>>>>>>> >>>>> >>>> user interface (command line) the options?
>>>>>>> >>>>> >>>>
>>>>>>> >>>>> >>>> [1]
>>>>>>> >>>>> >>>>
>>>>>>> >>>>> >>>>
>>>>>>> >>>>> >>>> https://github.com/Talend/comp
>>>>>>> onent-runtime/blob/master/component-server/src/main/java/org
>>>>>>> /talend/sdk/component/server/lang/StringPropertiesTokenizer.java
>>>>>>> >>>>> >>>>
>>>>>>> >>>>> >>>> Romain Manni-Bucau
>>>>>>> >>>>> >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog
>>>>>>> >>>>> >>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>> >>>>> >>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>> >>>>> >>>> <https://github.com/rmannibucau>
>>>>>>> >>>>> >>>> | LinkedIn <https://www.linkedin.com/in/rmannibucau>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>> --
>>>>>>> >>>>> >>> Jean-Baptiste Onofré
>>>>>>> >>>>> >>> [email protected]
>>>>>>> >>>>> >>> http://blog.nanthrax.net
>>>>>>> >>>>> >>> Talend - http://www.talend.com
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>>
>>>>>>> >>>>> >>
>>>>>>> >>>>> >
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>

Reply via email to