>From what I understand:

1) The command line argument values (both simple and more complex) are
all JSON-representable.
2) The command line is a mapping of keys to these values.

As such, it seems quite natural to represent the whole set as a single
JSON map, rather than using a different, custom encoding for the top
level (whose custom escaping would have to be carried into the inner
JSON values). Note that JSON has the advantage that one never needs to
explain or define it, and parsers/serializers already exists for all
languages (e.g. if one has a driver script in another language for
launching a java pipeline, it's easy to communicate all the args).

We can't get rid of Jackson in the core because of (1) so there's
little value in adding complexity to remove it from (2). The fact that
Java doesn't ship anything in its expansive standard library for this
is unfortuante, so we have to take a dependency on something.


On Tue, Jan 9, 2018 at 10:00 AM, Lukasz Cwik <lc...@google.com> wrote:
> Ah, you don't want JSON within JSON. I see, if thats the case just migrate
> all of them to string tokenization and drop the Jackson usage for string ->
> string[] conversion.
>
> On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>>
>> Lukasz, the use case is to znsure the config used can still map the CLI
>> and that options dont start to abuse json so it is exactly the opposite of
>> "we can be fancy" and is closer to "we can be robust". Also the default
>> should be easy and not json (i just want to set the runner, --runner=xxx is
>> easier than a json version). If the value doesnt start with a [ we can use
>> the tokenizer else jackson, wdyt?
>>
>>
>> Le 8 janv. 2018 22:59, "Lukasz Cwik" <lc...@google.com> a écrit :
>>
>> Now that I think about this more. I looked at some of the examples in the
>> pom.xml and they don't seem to be tricky to write in JSON.
>> I also looked at the Jenkins job configurations (specifically the
>> performance tests) and they pass around maps which they convert to the
>> required format without needing a user to write it themselves.
>> Using gradle, we will be able to trivially to do the same thing (convert
>> maps to json without needing the person to write it by hand) like groovy
>> does.
>> Since we are migrating away from Maven it doesn't seem worthwhile to spend
>> time on to make it easier to write the args in the Maven poms.
>>
>> Is there another use case that is being missed?
>>
>> On Mon, Jan 8, 2018 at 1:38 PM, Romain Manni-Bucau <rmannibu...@gmail.com>
>> wrote:
>>>
>>> Good point for \t and ,.
>>>
>>> Any objection to use jackson as a fallback for that purpose - for
>>> backwqrd compat only - and make it optional then? Will create the ticket if
>>> not.
>>>
>>> Le 8 janv. 2018 20:32, "Robert Bradshaw" <rober...@google.com> a écrit :
>>>>
>>>> Part of the motivation to use JSON for more complex options was that
>>>> it avoids having to define (and document, test, have users learn, ...)
>>>> yet another format for expressing lists, maps, etc.
>>>>
>>>> On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik <lc...@google.com> wrote:
>>>> > Ken, this is specifically about running integration tests and not a
>>>> > users
>>>> > main().
>>>> >
>>>> > Note, that PipelineOptions JSON format was used because it was a
>>>> > convenient
>>>> > serialized form that is easy to explain to people what is required.
>>>> > Using a different string tokenizer and calling
>>>> > PipelineOptionsFactory.fromArgs() with the parsed strings seems like
>>>> > it
>>>> > would be better.
>>>> >
>>>> > These are the supported formats for fromArgs():
>>>> >    *   --project=MyProject (simple property, will set the "project"
>>>> > property
>>>> > to "MyProject")
>>>> >    *   --readOnly=true (for boolean properties, will set the
>>>> > "readOnly"
>>>> > property to "true")
>>>> >    *   --readOnly (shorthand for boolean properties, will set the
>>>> > "readOnly"
>>>> > property to "true")
>>>> >    *   --x=1 --x=2 --x=3 (list style simple property, will set the "x"
>>>> > property to [1, 2, 3])
>>>> >    *   --x=1,2,3 (shorthand list style simple property, will set the
>>>> > "x"
>>>> > property to [1, 2, 3])
>>>> >    *   --complexObject='{"key1":"value1",...} (JSON format for all
>>>> > other
>>>> > complex types)
>>>> >
>>>> > Using a string tokenizer that minimizes the number of required escape
>>>> > characters would be good so we could use newline characters as our
>>>> > only
>>>> > token. I would avoid ',\t ' as tokens since they are more likely to
>>>> > appear.
>>>> >
>>>> > On Mon, Jan 8, 2018 at 10:33 AM, Kenneth Knowles <k...@google.com>
>>>> > wrote:
>>>> >>
>>>> >> We do have a plain command line syntax, and whoever writes the
>>>> >> main(String[]) function is responsible for invoking the parser. It
>>>> >> isn't
>>>> >> quite as nice as standard arg parse libraries, but it isn't too bad.
>>>> >> It
>>>> >> would be great to improve, though.
>>>> >>
>>>> >> Jackson is for machine-to-machine communication or other situations
>>>> >> where
>>>> >> command line parsing doesn't work so well.
>>>> >>
>>>> >> Are we using these some other way?
>>>> >>
>>>> >> Kenn
>>>> >>
>>>> >> On Sun, Jan 7, 2018 at 7:21 AM, Romain Manni-Bucau
>>>> >> <rmannibu...@gmail.com>
>>>> >> wrote:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> Le 7 janv. 2018 12:53, "Jean-Baptiste Onofré" <j...@nanthrax.net> a
>>>> >>> écrit :
>>>> >>>
>>>> >>> Hi Romain,
>>>> >>>
>>>> >>> I guess you are assuming that pipeline options are flat command line
>>>> >>> like
>>>> >>> argument, right ?
>>>> >>>
>>>> >>> Actually, theoretically,  pipeline options can be represented as
>>>> >>> json,
>>>> >>> that's why we use jackson.
>>>> >>> The pipeline options can be serialized/deserialized as json.
>>>> >>>
>>>> >>>
>>>> >>> Yes but if users (or dev ;)) start to use that then it is trivial to
>>>> >>> break the cli handling and fromArgs parsing or, if not, break the
>>>> >>> user
>>>> >>> experience. So at the end it is a kind of "yes but no", right?
>>>> >>>
>>>> >>> PS: already see some advanced users having a headache trying to pass
>>>> >>> pipeline options in json so using the plain command line syntax can
>>>> >>> be more
>>>> >>> friendly too.
>>>> >>>
>>>> >>>
>>>> >>> The purpose is to remove the jackson dependencies ?
>>>> >>>
>>>> >>>
>>>> >>> Later yes, seeing core dep tree I identify a lot of dep which can
>>>> >>> conflict in some env and are not really needed or bring much being
>>>> >>> in the
>>>> >>> core - like avro as mentionned in another thread. Can need a
>>>> >>> sanitizing
>>>> >>> round. Short term it was really a "why is it that complicated" ;).
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> Regards
>>>> >>> JB
>>>> >>>
>>>> >>>
>>>> >>> On 01/07/2018 11:38 AM, Romain Manni-Bucau wrote:
>>>> >>>>
>>>> >>>> Hi guys,
>>>> >>>>
>>>> >>>> not sure i fully get why jackson is used to parse pipeline options
>>>> >>>> in
>>>> >>>> the testing integration
>>>> >>>>
>>>> >>>> why not using a token parser like [1] which allows to map 1-1 with
>>>> >>>> the
>>>> >>>> user interface (command line) the options?
>>>> >>>>
>>>> >>>> [1]
>>>> >>>>
>>>> >>>> https://github.com/Talend/component-runtime/blob/master/component-server/src/main/java/org/talend/sdk/component/server/lang/StringPropertiesTokenizer.java
>>>> >>>>
>>>> >>>> Romain Manni-Bucau
>>>> >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog
>>>> >>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>> >>>> <http://rmannibucau.wordpress.com> | Github
>>>> >>>> <https://github.com/rmannibucau>
>>>> >>>> | LinkedIn <https://www.linkedin.com/in/rmannibucau>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> Jean-Baptiste Onofré
>>>> >>> jbono...@apache.org
>>>> >>> http://blog.nanthrax.net
>>>> >>> Talend - http://www.talend.com
>>>> >>>
>>>> >>>
>>>> >>
>>>> >
>>
>>
>>
>

Reply via email to