Hmm, beam already owns the cli parsing - that is what I meant - it only
misses the arg delimiter (ie quoting) and adding it is easy no?

Le 9 janv. 2018 21:19, "Robert Bradshaw" <[email protected]> a écrit :

> On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau
> <[email protected]> wrote:
> >
> > Le 9 janv. 2018 19:50, "Robert Bradshaw" <[email protected]> a écrit :
> >
> > From what I understand:
> >
> > 1) The command line argument values (both simple and more complex) are
> > all JSON-representable.
> >
> > And must be all CLI representable
>
> Sorry, I should have said "all pipeline options." In any case, one can
> always do JSON -> string and the resulting string is usable as a
> command line argument.
>
> > 2) The command line is a mapping of keys to these values.
> >
> > Which makes your 1 not true since json supports more ;)
> >
> > As such, it seems quite natural to represent the whole set as a single
> > JSON map, rather than using a different, custom encoding for the top
> > level (whose custom escaping would have to be carried into the inner
> > JSON values). Note that JSON has the advantage that one never needs to
> > explain or define it, and parsers/serializers already exists for all
> > languages (e.g. if one has a driver script in another language for
> > launching a java pipeline, it's easy to communicate all the args).
> >
> > Same reasonning applies to CLI AFAIK
>
> The spec of what a valid command line argument list is is surprisingly
> inconsistent across platforms, languages, and programs. And your
> proposal seems to be getting into what the delimiter is, and how to
> escape it, and possibly then how to escape the escape character. All
> of this is sidestepped by pointing at an existing spec.
>
> > We can't get rid of Jackson in the core because of (1) so there's
> > little value in adding complexity to remove it from (2). The fact that
> > Java doesn't ship anything in its expansive standard library for this
> > is unfortuante, so we have to take a dependency on something.
> >
> > We actually can as shown before
>
> How, if JSON is integral to the parsing of the argument values
> themselves? (Or is the argument that it's not, but each extender of
> PipelineOptions is responsible for choosing a JSON parser and doing it
> themselves.)
>
> > and jackson is not a simple negligible lib
> > but a classpath breaker :(
>
> Perhaps there are better libraries we could use instead? Is the
> ecosystem so bad that it encourages projects to roll their own of
> everything rather than share code?
>
> > On Tue, Jan 9, 2018 at 10:00 AM, Lukasz Cwik <[email protected]> wrote:
> >> Ah, you don't want JSON within JSON. I see, if thats the case just
> migrate
> >> all of them to string tokenization and drop the Jackson usage for string
> >> ->
> >> string[] conversion.
> >>
> >> On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau
> >> <[email protected]>
> >> wrote:
> >>>
> >>> Lukasz, the use case is to znsure the config used can still map the CLI
> >>> and that options dont start to abuse json so it is exactly the opposite
> >>> of
> >>> "we can be fancy" and is closer to "we can be robust". Also the default
> >>> should be easy and not json (i just want to set the runner,
> --runner=xxx
> >>> is
> >>> easier than a json version). If the value doesnt start with a [ we can
> >>> use
> >>> the tokenizer else jackson, wdyt?
> >>>
> >>>
> >>> Le 8 janv. 2018 22:59, "Lukasz Cwik" <[email protected]> a écrit :
> >>>
> >>> Now that I think about this more. I looked at some of the examples in
> the
> >>> pom.xml and they don't seem to be tricky to write in JSON.
> >>> I also looked at the Jenkins job configurations (specifically the
> >>> performance tests) and they pass around maps which they convert to the
> >>> required format without needing a user to write it themselves.
> >>> Using gradle, we will be able to trivially to do the same thing
> (convert
> >>> maps to json without needing the person to write it by hand) like
> groovy
> >>> does.
> >>> Since we are migrating away from Maven it doesn't seem worthwhile to
> >>> spend
> >>> time on to make it easier to write the args in the Maven poms.
> >>>
> >>> Is there another use case that is being missed?
> >>>
> >>> On Mon, Jan 8, 2018 at 1:38 PM, Romain Manni-Bucau
> >>> <[email protected]>
> >>> wrote:
> >>>>
> >>>> Good point for \t and ,.
> >>>>
> >>>> Any objection to use jackson as a fallback for that purpose - for
> >>>> backwqrd compat only - and make it optional then? Will create the
> ticket
> >>>> if
> >>>> not.
> >>>>
> >>>> Le 8 janv. 2018 20:32, "Robert Bradshaw" <[email protected]> a
> écrit :
> >>>>>
> >>>>> Part of the motivation to use JSON for more complex options was that
> >>>>> it avoids having to define (and document, test, have users learn,
> ...)
> >>>>> yet another format for expressing lists, maps, etc.
> >>>>>
> >>>>> On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik <[email protected]>
> wrote:
> >>>>> > Ken, this is specifically about running integration tests and not a
> >>>>> > users
> >>>>> > main().
> >>>>> >
> >>>>> > Note, that PipelineOptions JSON format was used because it was a
> >>>>> > convenient
> >>>>> > serialized form that is easy to explain to people what is required.
> >>>>> > Using a different string tokenizer and calling
> >>>>> > PipelineOptionsFactory.fromArgs() with the parsed strings seems
> like
> >>>>> > it
> >>>>> > would be better.
> >>>>> >
> >>>>> > These are the supported formats for fromArgs():
> >>>>> >    *   --project=MyProject (simple property, will set the "project"
> >>>>> > property
> >>>>> > to "MyProject")
> >>>>> >    *   --readOnly=true (for boolean properties, will set the
> >>>>> > "readOnly"
> >>>>> > property to "true")
> >>>>> >    *   --readOnly (shorthand for boolean properties, will set the
> >>>>> > "readOnly"
> >>>>> > property to "true")
> >>>>> >    *   --x=1 --x=2 --x=3 (list style simple property, will set the
> >>>>> > "x"
> >>>>> > property to [1, 2, 3])
> >>>>> >    *   --x=1,2,3 (shorthand list style simple property, will set
> the
> >>>>> > "x"
> >>>>> > property to [1, 2, 3])
> >>>>> >    *   --complexObject='{"key1":"value1",...} (JSON format for all
> >>>>> > other
> >>>>> > complex types)
> >>>>> >
> >>>>> > Using a string tokenizer that minimizes the number of required
> escape
> >>>>> > characters would be good so we could use newline characters as our
> >>>>> > only
> >>>>> > token. I would avoid ',\t ' as tokens since they are more likely to
> >>>>> > appear.
> >>>>> >
> >>>>> > On Mon, Jan 8, 2018 at 10:33 AM, Kenneth Knowles <[email protected]>
> >>>>> > wrote:
> >>>>> >>
> >>>>> >> We do have a plain command line syntax, and whoever writes the
> >>>>> >> main(String[]) function is responsible for invoking the parser. It
> >>>>> >> isn't
> >>>>> >> quite as nice as standard arg parse libraries, but it isn't too
> bad.
> >>>>> >> It
> >>>>> >> would be great to improve, though.
> >>>>> >>
> >>>>> >> Jackson is for machine-to-machine communication or other
> situations
> >>>>> >> where
> >>>>> >> command line parsing doesn't work so well.
> >>>>> >>
> >>>>> >> Are we using these some other way?
> >>>>> >>
> >>>>> >> Kenn
> >>>>> >>
> >>>>> >> On Sun, Jan 7, 2018 at 7:21 AM, Romain Manni-Bucau
> >>>>> >> <[email protected]>
> >>>>> >> wrote:
> >>>>> >>>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> Le 7 janv. 2018 12:53, "Jean-Baptiste Onofré" <[email protected]>
> a
> >>>>> >>> écrit :
> >>>>> >>>
> >>>>> >>> Hi Romain,
> >>>>> >>>
> >>>>> >>> I guess you are assuming that pipeline options are flat command
> >>>>> >>> line
> >>>>> >>> like
> >>>>> >>> argument, right ?
> >>>>> >>>
> >>>>> >>> Actually, theoretically,  pipeline options can be represented as
> >>>>> >>> json,
> >>>>> >>> that's why we use jackson.
> >>>>> >>> The pipeline options can be serialized/deserialized as json.
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> Yes but if users (or dev ;)) start to use that then it is trivial
> >>>>> >>> to
> >>>>> >>> break the cli handling and fromArgs parsing or, if not, break the
> >>>>> >>> user
> >>>>> >>> experience. So at the end it is a kind of "yes but no", right?
> >>>>> >>>
> >>>>> >>> PS: already see some advanced users having a headache trying to
> >>>>> >>> pass
> >>>>> >>> pipeline options in json so using the plain command line syntax
> can
> >>>>> >>> be more
> >>>>> >>> friendly too.
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> The purpose is to remove the jackson dependencies ?
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> Later yes, seeing core dep tree I identify a lot of dep which can
> >>>>> >>> conflict in some env and are not really needed or bring much
> being
> >>>>> >>> in the
> >>>>> >>> core - like avro as mentionned in another thread. Can need a
> >>>>> >>> sanitizing
> >>>>> >>> round. Short term it was really a "why is it that complicated"
> ;).
> >>>>> >>>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> Regards
> >>>>> >>> JB
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> On 01/07/2018 11:38 AM, Romain Manni-Bucau wrote:
> >>>>> >>>>
> >>>>> >>>> Hi guys,
> >>>>> >>>>
> >>>>> >>>> not sure i fully get why jackson is used to parse pipeline
> options
> >>>>> >>>> in
> >>>>> >>>> the testing integration
> >>>>> >>>>
> >>>>> >>>> why not using a token parser like [1] which allows to map 1-1
> with
> >>>>> >>>> the
> >>>>> >>>> user interface (command line) the options?
> >>>>> >>>>
> >>>>> >>>> [1]
> >>>>> >>>>
> >>>>> >>>>
> >>>>> >>>> https://github.com/Talend/component-runtime/blob/master/
> component-server/src/main/java/org/talend/sdk/component/server/lang/
> StringPropertiesTokenizer.java
> >>>>> >>>>
> >>>>> >>>> Romain Manni-Bucau
> >>>>> >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog
> >>>>> >>>> <https://rmannibucau.metawerx.net/> | Old Blog
> >>>>> >>>> <http://rmannibucau.wordpress.com> | Github
> >>>>> >>>> <https://github.com/rmannibucau>
> >>>>> >>>> | LinkedIn <https://www.linkedin.com/in/rmannibucau>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> --
> >>>>> >>> Jean-Baptiste Onofré
> >>>>> >>> [email protected]
> >>>>> >>> http://blog.nanthrax.net
> >>>>> >>> Talend - http://www.talend.com
> >>>>> >>>
> >>>>> >>>
> >>>>> >>
> >>>>> >
> >>>
> >>>
> >>>
> >>
> >
> >
>

Reply via email to