Removing large dependencies is great but I can only assume it will be a lot
of work so if you can get it to work in a backwards compatible way great.

On Tue, Jan 9, 2018 at 1:52 PM, Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:

> It conflicts easily between libs and containers. Shade is not a good
> option too - see the thread on this topic :(.
>
> At the end i see using the cli solution closer to user - vs framework dev
> for json - and hurting less in terms of classpath so pby sthg to test no?
>
> Le 9 janv. 2018 22:47, "Lukasz Cwik" <lc...@google.com> a écrit :
>
>> Romain, how has Jackson been a classpath breaker?
>>
>>
>> On Tue, Jan 9, 2018 at 1:20 PM, Romain Manni-Bucau <rmannibu...@gmail.com
>> > wrote:
>>
>>> Hmm, beam already owns the cli parsing - that is what I meant - it only
>>> misses the arg delimiter (ie quoting) and adding it is easy no?
>>>
>>> Le 9 janv. 2018 21:19, "Robert Bradshaw" <rober...@google.com> a écrit :
>>>
>>>> On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau
>>>> <rmannibu...@gmail.com> wrote:
>>>> >
>>>> > Le 9 janv. 2018 19:50, "Robert Bradshaw" <rober...@google.com> a
>>>> écrit :
>>>> >
>>>> > From what I understand:
>>>> >
>>>> > 1) The command line argument values (both simple and more complex) are
>>>> > all JSON-representable.
>>>> >
>>>> > And must be all CLI representable
>>>>
>>>> Sorry, I should have said "all pipeline options." In any case, one can
>>>> always do JSON -> string and the resulting string is usable as a
>>>> command line argument.
>>>>
>>>> > 2) The command line is a mapping of keys to these values.
>>>> >
>>>> > Which makes your 1 not true since json supports more ;)
>>>> >
>>>> > As such, it seems quite natural to represent the whole set as a single
>>>> > JSON map, rather than using a different, custom encoding for the top
>>>> > level (whose custom escaping would have to be carried into the inner
>>>> > JSON values). Note that JSON has the advantage that one never needs to
>>>> > explain or define it, and parsers/serializers already exists for all
>>>> > languages (e.g. if one has a driver script in another language for
>>>> > launching a java pipeline, it's easy to communicate all the args).
>>>> >
>>>> > Same reasonning applies to CLI AFAIK
>>>>
>>>> The spec of what a valid command line argument list is is surprisingly
>>>> inconsistent across platforms, languages, and programs. And your
>>>> proposal seems to be getting into what the delimiter is, and how to
>>>> escape it, and possibly then how to escape the escape character. All
>>>> of this is sidestepped by pointing at an existing spec.
>>>>
>>>> > We can't get rid of Jackson in the core because of (1) so there's
>>>> > little value in adding complexity to remove it from (2). The fact that
>>>> > Java doesn't ship anything in its expansive standard library for this
>>>> > is unfortuante, so we have to take a dependency on something.
>>>> >
>>>> > We actually can as shown before
>>>>
>>>> How, if JSON is integral to the parsing of the argument values
>>>> themselves? (Or is the argument that it's not, but each extender of
>>>> PipelineOptions is responsible for choosing a JSON parser and doing it
>>>> themselves.)
>>>>
>>>> > and jackson is not a simple negligible lib
>>>> > but a classpath breaker :(
>>>>
>>>> Perhaps there are better libraries we could use instead? Is the
>>>> ecosystem so bad that it encourages projects to roll their own of
>>>> everything rather than share code?
>>>>
>>>> > On Tue, Jan 9, 2018 at 10:00 AM, Lukasz Cwik <lc...@google.com>
>>>> wrote:
>>>> >> Ah, you don't want JSON within JSON. I see, if thats the case just
>>>> migrate
>>>> >> all of them to string tokenization and drop the Jackson usage for
>>>> string
>>>> >> ->
>>>> >> string[] conversion.
>>>> >>
>>>> >> On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau
>>>> >> <rmannibu...@gmail.com>
>>>> >> wrote:
>>>> >>>
>>>> >>> Lukasz, the use case is to znsure the config used can still map the
>>>> CLI
>>>> >>> and that options dont start to abuse json so it is exactly the
>>>> opposite
>>>> >>> of
>>>> >>> "we can be fancy" and is closer to "we can be robust". Also the
>>>> default
>>>> >>> should be easy and not json (i just want to set the runner,
>>>> --runner=xxx
>>>> >>> is
>>>> >>> easier than a json version). If the value doesnt start with a [ we
>>>> can
>>>> >>> use
>>>> >>> the tokenizer else jackson, wdyt?
>>>> >>>
>>>> >>>
>>>> >>> Le 8 janv. 2018 22:59, "Lukasz Cwik" <lc...@google.com> a écrit :
>>>> >>>
>>>> >>> Now that I think about this more. I looked at some of the examples
>>>> in the
>>>> >>> pom.xml and they don't seem to be tricky to write in JSON.
>>>> >>> I also looked at the Jenkins job configurations (specifically the
>>>> >>> performance tests) and they pass around maps which they convert to
>>>> the
>>>> >>> required format without needing a user to write it themselves.
>>>> >>> Using gradle, we will be able to trivially to do the same thing
>>>> (convert
>>>> >>> maps to json without needing the person to write it by hand) like
>>>> groovy
>>>> >>> does.
>>>> >>> Since we are migrating away from Maven it doesn't seem worthwhile to
>>>> >>> spend
>>>> >>> time on to make it easier to write the args in the Maven poms.
>>>> >>>
>>>> >>> Is there another use case that is being missed?
>>>> >>>
>>>> >>> On Mon, Jan 8, 2018 at 1:38 PM, Romain Manni-Bucau
>>>> >>> <rmannibu...@gmail.com>
>>>> >>> wrote:
>>>> >>>>
>>>> >>>> Good point for \t and ,.
>>>> >>>>
>>>> >>>> Any objection to use jackson as a fallback for that purpose - for
>>>> >>>> backwqrd compat only - and make it optional then? Will create the
>>>> ticket
>>>> >>>> if
>>>> >>>> not.
>>>> >>>>
>>>> >>>> Le 8 janv. 2018 20:32, "Robert Bradshaw" <rober...@google.com> a
>>>> écrit :
>>>> >>>>>
>>>> >>>>> Part of the motivation to use JSON for more complex options was
>>>> that
>>>> >>>>> it avoids having to define (and document, test, have users learn,
>>>> ...)
>>>> >>>>> yet another format for expressing lists, maps, etc.
>>>> >>>>>
>>>> >>>>> On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik <lc...@google.com>
>>>> wrote:
>>>> >>>>> > Ken, this is specifically about running integration tests and
>>>> not a
>>>> >>>>> > users
>>>> >>>>> > main().
>>>> >>>>> >
>>>> >>>>> > Note, that PipelineOptions JSON format was used because it was a
>>>> >>>>> > convenient
>>>> >>>>> > serialized form that is easy to explain to people what is
>>>> required.
>>>> >>>>> > Using a different string tokenizer and calling
>>>> >>>>> > PipelineOptionsFactory.fromArgs() with the parsed strings
>>>> seems like
>>>> >>>>> > it
>>>> >>>>> > would be better.
>>>> >>>>> >
>>>> >>>>> > These are the supported formats for fromArgs():
>>>> >>>>> >    *   --project=MyProject (simple property, will set the
>>>> "project"
>>>> >>>>> > property
>>>> >>>>> > to "MyProject")
>>>> >>>>> >    *   --readOnly=true (for boolean properties, will set the
>>>> >>>>> > "readOnly"
>>>> >>>>> > property to "true")
>>>> >>>>> >    *   --readOnly (shorthand for boolean properties, will set
>>>> the
>>>> >>>>> > "readOnly"
>>>> >>>>> > property to "true")
>>>> >>>>> >    *   --x=1 --x=2 --x=3 (list style simple property, will set
>>>> the
>>>> >>>>> > "x"
>>>> >>>>> > property to [1, 2, 3])
>>>> >>>>> >    *   --x=1,2,3 (shorthand list style simple property, will
>>>> set the
>>>> >>>>> > "x"
>>>> >>>>> > property to [1, 2, 3])
>>>> >>>>> >    *   --complexObject='{"key1":"value1",...} (JSON format for
>>>> all
>>>> >>>>> > other
>>>> >>>>> > complex types)
>>>> >>>>> >
>>>> >>>>> > Using a string tokenizer that minimizes the number of required
>>>> escape
>>>> >>>>> > characters would be good so we could use newline characters as
>>>> our
>>>> >>>>> > only
>>>> >>>>> > token. I would avoid ',\t ' as tokens since they are more
>>>> likely to
>>>> >>>>> > appear.
>>>> >>>>> >
>>>> >>>>> > On Mon, Jan 8, 2018 at 10:33 AM, Kenneth Knowles <
>>>> k...@google.com>
>>>> >>>>> > wrote:
>>>> >>>>> >>
>>>> >>>>> >> We do have a plain command line syntax, and whoever writes the
>>>> >>>>> >> main(String[]) function is responsible for invoking the
>>>> parser. It
>>>> >>>>> >> isn't
>>>> >>>>> >> quite as nice as standard arg parse libraries, but it isn't
>>>> too bad.
>>>> >>>>> >> It
>>>> >>>>> >> would be great to improve, though.
>>>> >>>>> >>
>>>> >>>>> >> Jackson is for machine-to-machine communication or other
>>>> situations
>>>> >>>>> >> where
>>>> >>>>> >> command line parsing doesn't work so well.
>>>> >>>>> >>
>>>> >>>>> >> Are we using these some other way?
>>>> >>>>> >>
>>>> >>>>> >> Kenn
>>>> >>>>> >>
>>>> >>>>> >> On Sun, Jan 7, 2018 at 7:21 AM, Romain Manni-Bucau
>>>> >>>>> >> <rmannibu...@gmail.com>
>>>> >>>>> >> wrote:
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>> Le 7 janv. 2018 12:53, "Jean-Baptiste Onofré" <
>>>> j...@nanthrax.net> a
>>>> >>>>> >>> écrit :
>>>> >>>>> >>>
>>>> >>>>> >>> Hi Romain,
>>>> >>>>> >>>
>>>> >>>>> >>> I guess you are assuming that pipeline options are flat
>>>> command
>>>> >>>>> >>> line
>>>> >>>>> >>> like
>>>> >>>>> >>> argument, right ?
>>>> >>>>> >>>
>>>> >>>>> >>> Actually, theoretically,  pipeline options can be represented
>>>> as
>>>> >>>>> >>> json,
>>>> >>>>> >>> that's why we use jackson.
>>>> >>>>> >>> The pipeline options can be serialized/deserialized as json.
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>> Yes but if users (or dev ;)) start to use that then it is
>>>> trivial
>>>> >>>>> >>> to
>>>> >>>>> >>> break the cli handling and fromArgs parsing or, if not, break
>>>> the
>>>> >>>>> >>> user
>>>> >>>>> >>> experience. So at the end it is a kind of "yes but no", right?
>>>> >>>>> >>>
>>>> >>>>> >>> PS: already see some advanced users having a headache trying
>>>> to
>>>> >>>>> >>> pass
>>>> >>>>> >>> pipeline options in json so using the plain command line
>>>> syntax can
>>>> >>>>> >>> be more
>>>> >>>>> >>> friendly too.
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>> The purpose is to remove the jackson dependencies ?
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>> Later yes, seeing core dep tree I identify a lot of dep which
>>>> can
>>>> >>>>> >>> conflict in some env and are not really needed or bring much
>>>> being
>>>> >>>>> >>> in the
>>>> >>>>> >>> core - like avro as mentionned in another thread. Can need a
>>>> >>>>> >>> sanitizing
>>>> >>>>> >>> round. Short term it was really a "why is it that
>>>> complicated" ;).
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>> Regards
>>>> >>>>> >>> JB
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>> On 01/07/2018 11:38 AM, Romain Manni-Bucau wrote:
>>>> >>>>> >>>>
>>>> >>>>> >>>> Hi guys,
>>>> >>>>> >>>>
>>>> >>>>> >>>> not sure i fully get why jackson is used to parse pipeline
>>>> options
>>>> >>>>> >>>> in
>>>> >>>>> >>>> the testing integration
>>>> >>>>> >>>>
>>>> >>>>> >>>> why not using a token parser like [1] which allows to map
>>>> 1-1 with
>>>> >>>>> >>>> the
>>>> >>>>> >>>> user interface (command line) the options?
>>>> >>>>> >>>>
>>>> >>>>> >>>> [1]
>>>> >>>>> >>>>
>>>> >>>>> >>>>
>>>> >>>>> >>>> https://github.com/Talend/component-runtime/blob/master/comp
>>>> onent-server/src/main/java/org/talend/sdk/component/server/l
>>>> ang/StringPropertiesTokenizer.java
>>>> >>>>> >>>>
>>>> >>>>> >>>> Romain Manni-Bucau
>>>> >>>>> >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog
>>>> >>>>> >>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>> >>>>> >>>> <http://rmannibucau.wordpress.com> | Github
>>>> >>>>> >>>> <https://github.com/rmannibucau>
>>>> >>>>> >>>> | LinkedIn <https://www.linkedin.com/in/rmannibucau>
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>> --
>>>> >>>>> >>> Jean-Baptiste Onofré
>>>> >>>>> >>> jbono...@apache.org
>>>> >>>>> >>> http://blog.nanthrax.net
>>>> >>>>> >>> Talend - http://www.talend.com
>>>> >>>>> >>>
>>>> >>>>> >>>
>>>> >>>>> >>
>>>> >>>>> >
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>
>>>> >
>>>> >
>>>>
>>>
>>

Reply via email to