It conflicts easily between libs and containers. Shading is not a good option
either - see the thread on this topic :(.

In the end I see the CLI solution as closer to the user - vs. the framework dev
for JSON - and hurting less in terms of classpath, so probably something to
test, no?
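
For illustration, here is a minimal sketch of the kind of quote-aware
tokenization discussed in this thread: it splits one string into CLI-style
arguments on whitespace and lets single or double quotes protect spaces inside a
value. ArgTokenizer and tokenize are hypothetical names, not Beam API, and the
sketch assumes no escape character is supported inside quotes. The resulting
strings are what one would then hand to PipelineOptionsFactory.fromArgs().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Beam): splits one string into CLI-style args. */
final class ArgTokenizer {
    private ArgTokenizer() {}

    static List<String> tokenize(String input) {
        List<String> args = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0;          // active quote character, or 0 when outside quotes
        boolean inToken = false; // true once a token has started (keeps empty "" values)
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;             // closing quote is dropped from the output
                } else {
                    current.append(c);     // everything inside quotes is kept verbatim
                }
            } else if (c == '"' || c == '\'') {
                quote = c;                 // opening quote is dropped from the output
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) {             // unquoted whitespace ends the current arg
                    args.add(current.toString());
                    current.setLength(0);
                    inToken = false;
                }
            } else {
                current.append(c);
                inToken = true;
            }
        }
        if (quote != 0) {
            throw new IllegalArgumentException("Unterminated quote in: " + input);
        }
        if (inToken) {
            args.add(current.toString());
        }
        return args;
    }

    public static void main(String[] unused) {
        String line = "--runner=DirectRunner --x=1,2,3 --complexObject='{\"key1\": \"value1\"}'";
        for (String arg : tokenize(line)) {
            System.out.println(arg);
        }
    }
}
```

Simple values need no quoting at all, and a JSON value only needs to be wrapped
in quotes of the other kind than the ones it contains.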

On Jan 9, 2018 at 22:47, "Lukasz Cwik" <lc...@google.com> wrote:

> Romain, how has Jackson been a classpath breaker?
>
>
> On Tue, Jan 9, 2018 at 1:20 PM, Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>
>> Hmm, Beam already owns the CLI parsing - that is what I meant - it is only
>> missing the arg delimiter (i.e. quoting), and adding it is easy, no?
>>
>> On Jan 9, 2018 at 21:19, "Robert Bradshaw" <rober...@google.com> wrote:
>>
>>> On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau
>>> <rmannibu...@gmail.com> wrote:
>>> >
>>> > On Jan 9, 2018 at 19:50, "Robert Bradshaw" <rober...@google.com> wrote:
>>> >
>>> > From what I understand:
>>> >
>>> > 1) The command line argument values (both simple and more complex) are
>>> > all JSON-representable.
>>> >
>>> > And must be all CLI representable
>>>
>>> Sorry, I should have said "all pipeline options." In any case, one can
>>> always do JSON -> string and the resulting string is usable as a
>>> command line argument.
>>>
>>> > 2) The command line is a mapping of keys to these values.
>>> >
>>> > Which makes your 1 not true since json supports more ;)
>>> >
>>> > As such, it seems quite natural to represent the whole set as a single
>>> > JSON map, rather than using a different, custom encoding for the top
>>> > level (whose custom escaping would have to be carried into the inner
>>> > JSON values). Note that JSON has the advantage that one never needs to
>>> > explain or define it, and parsers/serializers already exist for all
>>> > languages (e.g. if one has a driver script in another language for
>>> > launching a java pipeline, it's easy to communicate all the args).
>>> >
>>> > Same reasoning applies to the CLI AFAIK
>>>
>>> The spec of what a valid command line argument list is is surprisingly
>>> inconsistent across platforms, languages, and programs. And your
>>> proposal seems to be getting into what the delimiter is, and how to
>>> escape it, and possibly then how to escape the escape character. All
>>> of this is sidestepped by pointing at an existing spec.
>>>
>>> > We can't get rid of Jackson in the core because of (1) so there's
>>> > little value in adding complexity to remove it from (2). The fact that
>>> > Java doesn't ship anything in its expansive standard library for this
>>> > is unfortunate, so we have to take a dependency on something.
>>> >
>>> > We actually can as shown before
>>>
>>> How, if JSON is integral to the parsing of the argument values
>>> themselves? (Or is the argument that it's not, but each extender of
>>> PipelineOptions is responsible for choosing a JSON parser and doing it
>>> themselves.)
>>>
>>> > and jackson is not a simple negligible lib
>>> > but a classpath breaker :(
>>>
>>> Perhaps there are better libraries we could use instead? Is the
>>> ecosystem so bad that it encourages projects to roll their own of
>>> everything rather than share code?
>>>
>>> > On Tue, Jan 9, 2018 at 10:00 AM, Lukasz Cwik <lc...@google.com> wrote:
>>> >> Ah, you don't want JSON within JSON. I see; if that's the case, just
>>> >> migrate all of them to string tokenization and drop the Jackson usage
>>> >> for string -> string[] conversion.
>>> >>
>>> >> On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau
>>> >> <rmannibu...@gmail.com>
>>> >> wrote:
>>> >>>
>>> >>> Lukasz, the use case is to ensure the config used can still map to the
>>> >>> CLI and that options don't start to abuse JSON, so it is exactly the
>>> >>> opposite of "we can be fancy" and is closer to "we can be robust". Also,
>>> >>> the default should be easy and not JSON (I just want to set the runner;
>>> >>> --runner=xxx is easier than a JSON version). If the value doesn't start
>>> >>> with a [ we can use the tokenizer, else Jackson, wdyt?
>>> >>>
>>> >>>
>>> >>> On Jan 8, 2018 at 22:59, "Lukasz Cwik" <lc...@google.com> wrote:
>>> >>>
>>> >>> Now that I think about this more, I looked at some of the examples in
>>> >>> the pom.xml and they don't seem to be tricky to write in JSON.
>>> >>> I also looked at the Jenkins job configurations (specifically the
>>> >>> performance tests) and they pass around maps which they convert to the
>>> >>> required format without needing a user to write it themselves.
>>> >>> Using Gradle, we will be able to trivially do the same thing (convert
>>> >>> maps to JSON without needing the person to write it by hand), like
>>> >>> Groovy does.
>>> >>> Since we are migrating away from Maven, it doesn't seem worthwhile to
>>> >>> spend time on making it easier to write the args in the Maven poms.
>>> >>>
>>> >>> Is there another use case that is being missed?
>>> >>>
>>> >>> On Mon, Jan 8, 2018 at 1:38 PM, Romain Manni-Bucau
>>> >>> <rmannibu...@gmail.com>
>>> >>> wrote:
>>> >>>>
>>> >>>> Good point for \t and ,.
>>> >>>>
>>> >>>> Any objection to using Jackson as a fallback for that purpose - for
>>> >>>> backward compat only - and making it optional then? Will create the
>>> >>>> ticket if not.
>>> >>>>
>>> >>>> On Jan 8, 2018 at 20:32, "Robert Bradshaw" <rober...@google.com> wrote:
>>> >>>>>
>>> >>>>> Part of the motivation to use JSON for more complex options was that
>>> >>>>> it avoids having to define (and document, test, have users learn, ...)
>>> >>>>> yet another format for expressing lists, maps, etc.
>>> >>>>>
>>> >>>>> On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik <lc...@google.com> wrote:
>>> >>>>> > Ken, this is specifically about running integration tests and not a
>>> >>>>> > user's main().
>>> >>>>> >
>>> >>>>> > Note that the PipelineOptions JSON format was used because it was a
>>> >>>>> > convenient serialized form that makes it easy to explain to people
>>> >>>>> > what is required. Using a different string tokenizer and calling
>>> >>>>> > PipelineOptionsFactory.fromArgs() with the parsed strings seems like
>>> >>>>> > it would be better.
>>> >>>>> >
>>> >>>>> > These are the supported formats for fromArgs():
>>> >>>>> >    *   --project=MyProject (simple property, will set the "project"
>>> >>>>> >        property to "MyProject")
>>> >>>>> >    *   --readOnly=true (for boolean properties, will set the
>>> >>>>> >        "readOnly" property to "true")
>>> >>>>> >    *   --readOnly (shorthand for boolean properties, will set the
>>> >>>>> >        "readOnly" property to "true")
>>> >>>>> >    *   --x=1 --x=2 --x=3 (list style simple property, will set the
>>> >>>>> >        "x" property to [1, 2, 3])
>>> >>>>> >    *   --x=1,2,3 (shorthand list style simple property, will set the
>>> >>>>> >        "x" property to [1, 2, 3])
>>> >>>>> >    *   --complexObject='{"key1":"value1",...}' (JSON format for all
>>> >>>>> >        other complex types)
>>> >>>>> >
>>> >>>>> > Using a string tokenizer that minimizes the number of required escape
>>> >>>>> > characters would be good, so we could use newline characters as our
>>> >>>>> > only token. I would avoid ',\t ' as tokens since they are more likely
>>> >>>>> > to appear.
>>> >>>>> >
>>> >>>>> > On Mon, Jan 8, 2018 at 10:33 AM, Kenneth Knowles <k...@google.com>
>>> >>>>> > wrote:
>>> >>>>> >>
>>> >>>>> >> We do have a plain command line syntax, and whoever writes the
>>> >>>>> >> main(String[]) function is responsible for invoking the parser. It
>>> >>>>> >> isn't quite as nice as standard arg parse libraries, but it isn't
>>> >>>>> >> too bad. It would be great to improve, though.
>>> >>>>> >>
>>> >>>>> >> Jackson is for machine-to-machine communication or other situations
>>> >>>>> >> where command line parsing doesn't work so well.
>>> >>>>> >>
>>> >>>>> >> Are we using these some other way?
>>> >>>>> >>
>>> >>>>> >> Kenn
>>> >>>>> >>
>>> >>>>> >> On Sun, Jan 7, 2018 at 7:21 AM, Romain Manni-Bucau
>>> >>>>> >> <rmannibu...@gmail.com>
>>> >>>>> >> wrote:
>>> >>>>> >>>
>>> >>>>> >>>
>>> >>>>> >>>
>>> >>>>> >>> On Jan 7, 2018 at 12:53, "Jean-Baptiste Onofré" <j...@nanthrax.net>
>>> >>>>> >>> wrote:
>>> >>>>> >>>
>>> >>>>> >>> Hi Romain,
>>> >>>>> >>>
>>> >>>>> >>> I guess you are assuming that pipeline options are flat, command
>>> >>>>> >>> line like arguments, right?
>>> >>>>> >>>
>>> >>>>> >>> Actually, theoretically, pipeline options can be represented as
>>> >>>>> >>> JSON; that's why we use Jackson.
>>> >>>>> >>> The pipeline options can be serialized/deserialized as JSON.
>>> >>>>> >>>
>>> >>>>> >>>
>>> >>>>> >>> Yes, but if users (or devs ;)) start to use that then it is trivial
>>> >>>>> >>> to break the CLI handling and fromArgs parsing or, if not, break
>>> >>>>> >>> the user experience. So in the end it is a kind of "yes but no",
>>> >>>>> >>> right?
>>> >>>>> >>>
>>> >>>>> >>> PS: I have already seen some advanced users having a headache
>>> >>>>> >>> trying to pass pipeline options in JSON, so using the plain command
>>> >>>>> >>> line syntax can be more friendly too.
>>> >>>>> >>>
>>> >>>>> >>>
>>> >>>>> >>> Is the purpose to remove the Jackson dependencies?
>>> >>>>> >>>
>>> >>>>> >>>
>>> >>>>> >>> Later yes. Looking at the core dep tree, I identify a lot of deps
>>> >>>>> >>> which can conflict in some envs and are not really needed in the
>>> >>>>> >>> core or don't bring much by being there - like Avro, as mentioned
>>> >>>>> >>> in another thread. It may need a sanitizing round. Short term it
>>> >>>>> >>> was really a "why is it that complicated" ;).
>>> >>>>> >>>
>>> >>>>> >>>
>>> >>>>> >>>
>>> >>>>> >>> Regards
>>> >>>>> >>> JB
>>> >>>>> >>>
>>> >>>>> >>>
>>> >>>>> >>> On 01/07/2018 11:38 AM, Romain Manni-Bucau wrote:
>>> >>>>> >>>>
>>> >>>>> >>>> Hi guys,
>>> >>>>> >>>>
>>> >>>>> >>>> not sure I fully get why Jackson is used to parse pipeline options
>>> >>>>> >>>> in the testing integration
>>> >>>>> >>>>
>>> >>>>> >>>> why not use a token parser like [1], which allows mapping the
>>> >>>>> >>>> options 1-1 with the user interface (command line)?
>>> >>>>> >>>>
>>> >>>>> >>>> [1]
>>> >>>>> >>>>
>>> >>>>> >>>>
>>> >>>>> >>>> https://github.com/Talend/component-runtime/blob/master/component-server/src/main/java/org/talend/sdk/component/server/lang/StringPropertiesTokenizer.java
>>> >>>>> >>>>
>>> >>>>> >>>> Romain Manni-Bucau
>>> >>>>> >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog
>>> >>>>> >>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> >>>>> >>>> <http://rmannibucau.wordpress.com> | Github
>>> >>>>> >>>> <https://github.com/rmannibucau>
>>> >>>>> >>>> | LinkedIn <https://www.linkedin.com/in/rmannibucau>
>>> >>>>> >>>
>>> >>>>> >>>
>>> >>>>> >>> --
>>> >>>>> >>> Jean-Baptiste Onofré
>>> >>>>> >>> jbono...@apache.org
>>> >>>>> >>> http://blog.nanthrax.net
>>> >>>>> >>> Talend - http://www.talend.com
>>> >>>>> >>>
>>> >>>>> >>>