Removing large dependencies is great but I can only assume it will be a lot of work so if you can get it to work in a backwards compatible way great.
On Tue, Jan 9, 2018 at 1:52 PM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote: > It conflicts easily between libs and containers. Shade is not a good > option too - see the thread on this topic :(. > > At the end i see using the cli solution closer to user - vs framework dev > for json - and hurting less in terms of classpath so pby sthg to test no? > > Le 9 janv. 2018 22:47, "Lukasz Cwik" <lc...@google.com> a écrit : > >> Romain, how has Jackson been a classpath breaker? >> >> >> On Tue, Jan 9, 2018 at 1:20 PM, Romain Manni-Bucau <rmannibu...@gmail.com >> > wrote: >> >>> Hmm, beam already owns the cli parsing - that is what I meant - it only >>> misses the arg delimiter (ie quoting) and adding it is easy no? >>> >>> Le 9 janv. 2018 21:19, "Robert Bradshaw" <rober...@google.com> a écrit : >>> >>>> On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau >>>> <rmannibu...@gmail.com> wrote: >>>> > >>>> > Le 9 janv. 2018 19:50, "Robert Bradshaw" <rober...@google.com> a >>>> écrit : >>>> > >>>> > From what I understand: >>>> > >>>> > 1) The command line argument values (both simple and more complex) are >>>> > all JSON-representable. >>>> > >>>> > And must be all CLI representable >>>> >>>> Sorry, I should have said "all pipeline options." In any case, one can >>>> always do JSON -> string and the resulting string is usable as a >>>> command line argument. >>>> >>>> > 2) The command line is a mapping of keys to these values. >>>> > >>>> > Which makes your 1 not true since json supports more ;) >>>> > >>>> > As such, it seems quite natural to represent the whole set as a single >>>> > JSON map, rather than using a different, custom encoding for the top >>>> > level (whose custom escaping would have to be carried into the inner >>>> > JSON values). Note that JSON has the advantage that one never needs to >>>> > explain or define it, and parsers/serializers already exists for all >>>> > languages (e.g. if one has a driver script in another language for >>>> > launching a java pipeline, it's easy to communicate all the args). >>>> > >>>> > Same reasonning applies to CLI AFAIK >>>> >>>> The spec of what a valid command line argument list is is surprisingly >>>> inconsistent across platforms, languages, and programs. And your >>>> proposal seems to be getting into what the delimiter is, and how to >>>> escape it, and possibly then how to escape the escape character. All >>>> of this is sidestepped by pointing at an existing spec. >>>> >>>> > We can't get rid of Jackson in the core because of (1) so there's >>>> > little value in adding complexity to remove it from (2). The fact that >>>> > Java doesn't ship anything in its expansive standard library for this >>>> > is unfortuante, so we have to take a dependency on something. >>>> > >>>> > We actually can as shown before >>>> >>>> How, if JSON is integral to the parsing of the argument values >>>> themselves? (Or is the argument that it's not, but each extender of >>>> PipelineOptions is responsible for choosing a JSON parser and doing it >>>> themselves.) >>>> >>>> > and jackson is not a simple negligible lib >>>> > but a classpath breaker :( >>>> >>>> Perhaps there are better libraries we could use instead? Is the >>>> ecosystem so bad that it encourages projects to roll their own of >>>> everything rather than share code? >>>> >>>> > On Tue, Jan 9, 2018 at 10:00 AM, Lukasz Cwik <lc...@google.com> >>>> wrote: >>>> >> Ah, you don't want JSON within JSON. I see, if thats the case just >>>> migrate >>>> >> all of them to string tokenization and drop the Jackson usage for >>>> string >>>> >> -> >>>> >> string[] conversion. >>>> >> >>>> >> On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau >>>> >> <rmannibu...@gmail.com> >>>> >> wrote: >>>> >>> >>>> >>> Lukasz, the use case is to znsure the config used can still map the >>>> CLI >>>> >>> and that options dont start to abuse json so it is exactly the >>>> opposite >>>> >>> of >>>> >>> "we can be fancy" and is closer to "we can be robust". Also the >>>> default >>>> >>> should be easy and not json (i just want to set the runner, >>>> --runner=xxx >>>> >>> is >>>> >>> easier than a json version). If the value doesnt start with a [ we >>>> can >>>> >>> use >>>> >>> the tokenizer else jackson, wdyt? >>>> >>> >>>> >>> >>>> >>> Le 8 janv. 2018 22:59, "Lukasz Cwik" <lc...@google.com> a écrit : >>>> >>> >>>> >>> Now that I think about this more. I looked at some of the examples >>>> in the >>>> >>> pom.xml and they don't seem to be tricky to write in JSON. >>>> >>> I also looked at the Jenkins job configurations (specifically the >>>> >>> performance tests) and they pass around maps which they convert to >>>> the >>>> >>> required format without needing a user to write it themselves. >>>> >>> Using gradle, we will be able to trivially to do the same thing >>>> (convert >>>> >>> maps to json without needing the person to write it by hand) like >>>> groovy >>>> >>> does. >>>> >>> Since we are migrating away from Maven it doesn't seem worthwhile to >>>> >>> spend >>>> >>> time on to make it easier to write the args in the Maven poms. >>>> >>> >>>> >>> Is there another use case that is being missed? >>>> >>> >>>> >>> On Mon, Jan 8, 2018 at 1:38 PM, Romain Manni-Bucau >>>> >>> <rmannibu...@gmail.com> >>>> >>> wrote: >>>> >>>> >>>> >>>> Good point for \t and ,. >>>> >>>> >>>> >>>> Any objection to use jackson as a fallback for that purpose - for >>>> >>>> backwqrd compat only - and make it optional then? Will create the >>>> ticket >>>> >>>> if >>>> >>>> not. >>>> >>>> >>>> >>>> Le 8 janv. 2018 20:32, "Robert Bradshaw" <rober...@google.com> a >>>> écrit : >>>> >>>>> >>>> >>>>> Part of the motivation to use JSON for more complex options was >>>> that >>>> >>>>> it avoids having to define (and document, test, have users learn, >>>> ...) >>>> >>>>> yet another format for expressing lists, maps, etc. >>>> >>>>> >>>> >>>>> On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik <lc...@google.com> >>>> wrote: >>>> >>>>> > Ken, this is specifically about running integration tests and >>>> not a >>>> >>>>> > users >>>> >>>>> > main(). >>>> >>>>> > >>>> >>>>> > Note, that PipelineOptions JSON format was used because it was a >>>> >>>>> > convenient >>>> >>>>> > serialized form that is easy to explain to people what is >>>> required. >>>> >>>>> > Using a different string tokenizer and calling >>>> >>>>> > PipelineOptionsFactory.fromArgs() with the parsed strings >>>> seems like >>>> >>>>> > it >>>> >>>>> > would be better. >>>> >>>>> > >>>> >>>>> > These are the supported formats for fromArgs(): >>>> >>>>> > * --project=MyProject (simple property, will set the >>>> "project" >>>> >>>>> > property >>>> >>>>> > to "MyProject") >>>> >>>>> > * --readOnly=true (for boolean properties, will set the >>>> >>>>> > "readOnly" >>>> >>>>> > property to "true") >>>> >>>>> > * --readOnly (shorthand for boolean properties, will set >>>> the >>>> >>>>> > "readOnly" >>>> >>>>> > property to "true") >>>> >>>>> > * --x=1 --x=2 --x=3 (list style simple property, will set >>>> the >>>> >>>>> > "x" >>>> >>>>> > property to [1, 2, 3]) >>>> >>>>> > * --x=1,2,3 (shorthand list style simple property, will >>>> set the >>>> >>>>> > "x" >>>> >>>>> > property to [1, 2, 3]) >>>> >>>>> > * --complexObject='{"key1":"value1",...} (JSON format for >>>> all >>>> >>>>> > other >>>> >>>>> > complex types) >>>> >>>>> > >>>> >>>>> > Using a string tokenizer that minimizes the number of required >>>> escape >>>> >>>>> > characters would be good so we could use newline characters as >>>> our >>>> >>>>> > only >>>> >>>>> > token. I would avoid ',\t ' as tokens since they are more >>>> likely to >>>> >>>>> > appear. >>>> >>>>> > >>>> >>>>> > On Mon, Jan 8, 2018 at 10:33 AM, Kenneth Knowles < >>>> k...@google.com> >>>> >>>>> > wrote: >>>> >>>>> >> >>>> >>>>> >> We do have a plain command line syntax, and whoever writes the >>>> >>>>> >> main(String[]) function is responsible for invoking the >>>> parser. It >>>> >>>>> >> isn't >>>> >>>>> >> quite as nice as standard arg parse libraries, but it isn't >>>> too bad. >>>> >>>>> >> It >>>> >>>>> >> would be great to improve, though. >>>> >>>>> >> >>>> >>>>> >> Jackson is for machine-to-machine communication or other >>>> situations >>>> >>>>> >> where >>>> >>>>> >> command line parsing doesn't work so well. >>>> >>>>> >> >>>> >>>>> >> Are we using these some other way? >>>> >>>>> >> >>>> >>>>> >> Kenn >>>> >>>>> >> >>>> >>>>> >> On Sun, Jan 7, 2018 at 7:21 AM, Romain Manni-Bucau >>>> >>>>> >> <rmannibu...@gmail.com> >>>> >>>>> >> wrote: >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >>> Le 7 janv. 2018 12:53, "Jean-Baptiste Onofré" < >>>> j...@nanthrax.net> a >>>> >>>>> >>> écrit : >>>> >>>>> >>> >>>> >>>>> >>> Hi Romain, >>>> >>>>> >>> >>>> >>>>> >>> I guess you are assuming that pipeline options are flat >>>> command >>>> >>>>> >>> line >>>> >>>>> >>> like >>>> >>>>> >>> argument, right ? >>>> >>>>> >>> >>>> >>>>> >>> Actually, theoretically, pipeline options can be represented >>>> as >>>> >>>>> >>> json, >>>> >>>>> >>> that's why we use jackson. >>>> >>>>> >>> The pipeline options can be serialized/deserialized as json. >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >>> Yes but if users (or dev ;)) start to use that then it is >>>> trivial >>>> >>>>> >>> to >>>> >>>>> >>> break the cli handling and fromArgs parsing or, if not, break >>>> the >>>> >>>>> >>> user >>>> >>>>> >>> experience. So at the end it is a kind of "yes but no", right? >>>> >>>>> >>> >>>> >>>>> >>> PS: already see some advanced users having a headache trying >>>> to >>>> >>>>> >>> pass >>>> >>>>> >>> pipeline options in json so using the plain command line >>>> syntax can >>>> >>>>> >>> be more >>>> >>>>> >>> friendly too. >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >>> The purpose is to remove the jackson dependencies ? >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >>> Later yes, seeing core dep tree I identify a lot of dep which >>>> can >>>> >>>>> >>> conflict in some env and are not really needed or bring much >>>> being >>>> >>>>> >>> in the >>>> >>>>> >>> core - like avro as mentionned in another thread. Can need a >>>> >>>>> >>> sanitizing >>>> >>>>> >>> round. Short term it was really a "why is it that >>>> complicated" ;). >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >>> Regards >>>> >>>>> >>> JB >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >>> On 01/07/2018 11:38 AM, Romain Manni-Bucau wrote: >>>> >>>>> >>>> >>>> >>>>> >>>> Hi guys, >>>> >>>>> >>>> >>>> >>>>> >>>> not sure i fully get why jackson is used to parse pipeline >>>> options >>>> >>>>> >>>> in >>>> >>>>> >>>> the testing integration >>>> >>>>> >>>> >>>> >>>>> >>>> why not using a token parser like [1] which allows to map >>>> 1-1 with >>>> >>>>> >>>> the >>>> >>>>> >>>> user interface (command line) the options? >>>> >>>>> >>>> >>>> >>>>> >>>> [1] >>>> >>>>> >>>> >>>> >>>>> >>>> >>>> >>>>> >>>> https://github.com/Talend/component-runtime/blob/master/comp >>>> onent-server/src/main/java/org/talend/sdk/component/server/l >>>> ang/StringPropertiesTokenizer.java >>>> >>>>> >>>> >>>> >>>>> >>>> Romain Manni-Bucau >>>> >>>>> >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>>> >>>>> >>>> <https://rmannibucau.metawerx.net/> | Old Blog >>>> >>>>> >>>> <http://rmannibucau.wordpress.com> | Github >>>> >>>>> >>>> <https://github.com/rmannibucau> >>>> >>>>> >>>> | LinkedIn <https://www.linkedin.com/in/rmannibucau> >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >>> -- >>>> >>>>> >>> Jean-Baptiste Onofré >>>> >>>>> >>> jbono...@apache.org >>>> >>>>> >>> http://blog.nanthrax.net >>>> >>>>> >>> Talend - http://www.talend.com >>>> >>>>> >>> >>>> >>>>> >>> >>>> >>>>> >> >>>> >>>>> > >>>> >>> >>>> >>> >>>> >>> >>>> >> >>>> > >>>> > >>>> >>> >>