It conflicts easily between libs and containers. Shade is not a good option too - see the thread on this topic :(.
At the end i see using the cli solution closer to user - vs framework dev for json - and hurting less in terms of classpath so pby sthg to test no? Le 9 janv. 2018 22:47, "Lukasz Cwik" <lc...@google.com> a écrit : > Romain, how has Jackson been a classpath breaker? > > > On Tue, Jan 9, 2018 at 1:20 PM, Romain Manni-Bucau <rmannibu...@gmail.com> > wrote: > >> Hmm, beam already owns the cli parsing - that is what I meant - it only >> misses the arg delimiter (ie quoting) and adding it is easy no? >> >> Le 9 janv. 2018 21:19, "Robert Bradshaw" <rober...@google.com> a écrit : >> >>> On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau >>> <rmannibu...@gmail.com> wrote: >>> > >>> > Le 9 janv. 2018 19:50, "Robert Bradshaw" <rober...@google.com> a >>> écrit : >>> > >>> > From what I understand: >>> > >>> > 1) The command line argument values (both simple and more complex) are >>> > all JSON-representable. >>> > >>> > And must be all CLI representable >>> >>> Sorry, I should have said "all pipeline options." In any case, one can >>> always do JSON -> string and the resulting string is usable as a >>> command line argument. >>> >>> > 2) The command line is a mapping of keys to these values. >>> > >>> > Which makes your 1 not true since json supports more ;) >>> > >>> > As such, it seems quite natural to represent the whole set as a single >>> > JSON map, rather than using a different, custom encoding for the top >>> > level (whose custom escaping would have to be carried into the inner >>> > JSON values). Note that JSON has the advantage that one never needs to >>> > explain or define it, and parsers/serializers already exists for all >>> > languages (e.g. if one has a driver script in another language for >>> > launching a java pipeline, it's easy to communicate all the args). >>> > >>> > Same reasonning applies to CLI AFAIK >>> >>> The spec of what a valid command line argument list is is surprisingly >>> inconsistent across platforms, languages, and programs. And your >>> proposal seems to be getting into what the delimiter is, and how to >>> escape it, and possibly then how to escape the escape character. All >>> of this is sidestepped by pointing at an existing spec. >>> >>> > We can't get rid of Jackson in the core because of (1) so there's >>> > little value in adding complexity to remove it from (2). The fact that >>> > Java doesn't ship anything in its expansive standard library for this >>> > is unfortuante, so we have to take a dependency on something. >>> > >>> > We actually can as shown before >>> >>> How, if JSON is integral to the parsing of the argument values >>> themselves? (Or is the argument that it's not, but each extender of >>> PipelineOptions is responsible for choosing a JSON parser and doing it >>> themselves.) >>> >>> > and jackson is not a simple negligible lib >>> > but a classpath breaker :( >>> >>> Perhaps there are better libraries we could use instead? Is the >>> ecosystem so bad that it encourages projects to roll their own of >>> everything rather than share code? >>> >>> > On Tue, Jan 9, 2018 at 10:00 AM, Lukasz Cwik <lc...@google.com> wrote: >>> >> Ah, you don't want JSON within JSON. I see, if thats the case just >>> migrate >>> >> all of them to string tokenization and drop the Jackson usage for >>> string >>> >> -> >>> >> string[] conversion. >>> >> >>> >> On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau >>> >> <rmannibu...@gmail.com> >>> >> wrote: >>> >>> >>> >>> Lukasz, the use case is to znsure the config used can still map the >>> CLI >>> >>> and that options dont start to abuse json so it is exactly the >>> opposite >>> >>> of >>> >>> "we can be fancy" and is closer to "we can be robust". Also the >>> default >>> >>> should be easy and not json (i just want to set the runner, >>> --runner=xxx >>> >>> is >>> >>> easier than a json version). If the value doesnt start with a [ we >>> can >>> >>> use >>> >>> the tokenizer else jackson, wdyt? >>> >>> >>> >>> >>> >>> Le 8 janv. 2018 22:59, "Lukasz Cwik" <lc...@google.com> a écrit : >>> >>> >>> >>> Now that I think about this more. I looked at some of the examples >>> in the >>> >>> pom.xml and they don't seem to be tricky to write in JSON. >>> >>> I also looked at the Jenkins job configurations (specifically the >>> >>> performance tests) and they pass around maps which they convert to >>> the >>> >>> required format without needing a user to write it themselves. >>> >>> Using gradle, we will be able to trivially to do the same thing >>> (convert >>> >>> maps to json without needing the person to write it by hand) like >>> groovy >>> >>> does. >>> >>> Since we are migrating away from Maven it doesn't seem worthwhile to >>> >>> spend >>> >>> time on to make it easier to write the args in the Maven poms. >>> >>> >>> >>> Is there another use case that is being missed? >>> >>> >>> >>> On Mon, Jan 8, 2018 at 1:38 PM, Romain Manni-Bucau >>> >>> <rmannibu...@gmail.com> >>> >>> wrote: >>> >>>> >>> >>>> Good point for \t and ,. >>> >>>> >>> >>>> Any objection to use jackson as a fallback for that purpose - for >>> >>>> backwqrd compat only - and make it optional then? Will create the >>> ticket >>> >>>> if >>> >>>> not. >>> >>>> >>> >>>> Le 8 janv. 2018 20:32, "Robert Bradshaw" <rober...@google.com> a >>> écrit : >>> >>>>> >>> >>>>> Part of the motivation to use JSON for more complex options was >>> that >>> >>>>> it avoids having to define (and document, test, have users learn, >>> ...) >>> >>>>> yet another format for expressing lists, maps, etc. >>> >>>>> >>> >>>>> On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik <lc...@google.com> >>> wrote: >>> >>>>> > Ken, this is specifically about running integration tests and >>> not a >>> >>>>> > users >>> >>>>> > main(). >>> >>>>> > >>> >>>>> > Note, that PipelineOptions JSON format was used because it was a >>> >>>>> > convenient >>> >>>>> > serialized form that is easy to explain to people what is >>> required. >>> >>>>> > Using a different string tokenizer and calling >>> >>>>> > PipelineOptionsFactory.fromArgs() with the parsed strings seems >>> like >>> >>>>> > it >>> >>>>> > would be better. >>> >>>>> > >>> >>>>> > These are the supported formats for fromArgs(): >>> >>>>> > * --project=MyProject (simple property, will set the >>> "project" >>> >>>>> > property >>> >>>>> > to "MyProject") >>> >>>>> > * --readOnly=true (for boolean properties, will set the >>> >>>>> > "readOnly" >>> >>>>> > property to "true") >>> >>>>> > * --readOnly (shorthand for boolean properties, will set the >>> >>>>> > "readOnly" >>> >>>>> > property to "true") >>> >>>>> > * --x=1 --x=2 --x=3 (list style simple property, will set >>> the >>> >>>>> > "x" >>> >>>>> > property to [1, 2, 3]) >>> >>>>> > * --x=1,2,3 (shorthand list style simple property, will set >>> the >>> >>>>> > "x" >>> >>>>> > property to [1, 2, 3]) >>> >>>>> > * --complexObject='{"key1":"value1",...} (JSON format for >>> all >>> >>>>> > other >>> >>>>> > complex types) >>> >>>>> > >>> >>>>> > Using a string tokenizer that minimizes the number of required >>> escape >>> >>>>> > characters would be good so we could use newline characters as >>> our >>> >>>>> > only >>> >>>>> > token. I would avoid ',\t ' as tokens since they are more likely >>> to >>> >>>>> > appear. >>> >>>>> > >>> >>>>> > On Mon, Jan 8, 2018 at 10:33 AM, Kenneth Knowles <k...@google.com >>> > >>> >>>>> > wrote: >>> >>>>> >> >>> >>>>> >> We do have a plain command line syntax, and whoever writes the >>> >>>>> >> main(String[]) function is responsible for invoking the parser. >>> It >>> >>>>> >> isn't >>> >>>>> >> quite as nice as standard arg parse libraries, but it isn't too >>> bad. >>> >>>>> >> It >>> >>>>> >> would be great to improve, though. >>> >>>>> >> >>> >>>>> >> Jackson is for machine-to-machine communication or other >>> situations >>> >>>>> >> where >>> >>>>> >> command line parsing doesn't work so well. >>> >>>>> >> >>> >>>>> >> Are we using these some other way? >>> >>>>> >> >>> >>>>> >> Kenn >>> >>>>> >> >>> >>>>> >> On Sun, Jan 7, 2018 at 7:21 AM, Romain Manni-Bucau >>> >>>>> >> <rmannibu...@gmail.com> >>> >>>>> >> wrote: >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >>> Le 7 janv. 2018 12:53, "Jean-Baptiste Onofré" <j...@nanthrax.net> >>> a >>> >>>>> >>> écrit : >>> >>>>> >>> >>> >>>>> >>> Hi Romain, >>> >>>>> >>> >>> >>>>> >>> I guess you are assuming that pipeline options are flat command >>> >>>>> >>> line >>> >>>>> >>> like >>> >>>>> >>> argument, right ? >>> >>>>> >>> >>> >>>>> >>> Actually, theoretically, pipeline options can be represented >>> as >>> >>>>> >>> json, >>> >>>>> >>> that's why we use jackson. >>> >>>>> >>> The pipeline options can be serialized/deserialized as json. >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >>> Yes but if users (or dev ;)) start to use that then it is >>> trivial >>> >>>>> >>> to >>> >>>>> >>> break the cli handling and fromArgs parsing or, if not, break >>> the >>> >>>>> >>> user >>> >>>>> >>> experience. So at the end it is a kind of "yes but no", right? >>> >>>>> >>> >>> >>>>> >>> PS: already see some advanced users having a headache trying to >>> >>>>> >>> pass >>> >>>>> >>> pipeline options in json so using the plain command line >>> syntax can >>> >>>>> >>> be more >>> >>>>> >>> friendly too. >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >>> The purpose is to remove the jackson dependencies ? >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >>> Later yes, seeing core dep tree I identify a lot of dep which >>> can >>> >>>>> >>> conflict in some env and are not really needed or bring much >>> being >>> >>>>> >>> in the >>> >>>>> >>> core - like avro as mentionned in another thread. Can need a >>> >>>>> >>> sanitizing >>> >>>>> >>> round. Short term it was really a "why is it that complicated" >>> ;). >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >>> Regards >>> >>>>> >>> JB >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >>> On 01/07/2018 11:38 AM, Romain Manni-Bucau wrote: >>> >>>>> >>>> >>> >>>>> >>>> Hi guys, >>> >>>>> >>>> >>> >>>>> >>>> not sure i fully get why jackson is used to parse pipeline >>> options >>> >>>>> >>>> in >>> >>>>> >>>> the testing integration >>> >>>>> >>>> >>> >>>>> >>>> why not using a token parser like [1] which allows to map 1-1 >>> with >>> >>>>> >>>> the >>> >>>>> >>>> user interface (command line) the options? >>> >>>>> >>>> >>> >>>>> >>>> [1] >>> >>>>> >>>> >>> >>>>> >>>> >>> >>>>> >>>> https://github.com/Talend/component-runtime/blob/master/comp >>> onent-server/src/main/java/org/talend/sdk/component/server/ >>> lang/StringPropertiesTokenizer.java >>> >>>>> >>>> >>> >>>>> >>>> Romain Manni-Bucau >>> >>>>> >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>> >>>>> >>>> <https://rmannibucau.metawerx.net/> | Old Blog >>> >>>>> >>>> <http://rmannibucau.wordpress.com> | Github >>> >>>>> >>>> <https://github.com/rmannibucau> >>> >>>>> >>>> | LinkedIn <https://www.linkedin.com/in/rmannibucau> >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >>> -- >>> >>>>> >>> Jean-Baptiste Onofré >>> >>>>> >>> jbono...@apache.org >>> >>>>> >>> http://blog.nanthrax.net >>> >>>>> >>> Talend - http://www.talend.com >>> >>>>> >>> >>> >>>>> >>> >>> >>>>> >> >>> >>>>> > >>> >>> >>> >>> >>> >>> >>> >> >>> > >>> > >>> >> >