[ https://issues.apache.org/jira/browse/STORM-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141016#comment-15141016 ]
ASF GitHub Bot commented on STORM-1532: --------------------------------------- Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/1094#discussion_r52473785 --- Diff: storm-core/src/jvm/org/apache/storm/utils/Utils.java --- @@ -330,7 +330,11 @@ public static Map readCommandLineOpts() { Map ret = new HashMap(); String commandOptions = System.getProperty("storm.options"); if (commandOptions != null) { - String[] configs = commandOptions.split(","); + /* + below regex uses negative lookahead to not split in the middle of + json objects '{}' or json arrays '[]' + */ + String[] configs = commandOptions.split(",(?![^\\[\\]{}]*(]|}))"); --- End diff -- So first of all this is a really complicated REGEX and even if it worked correctly I would be nervous moving from something simple to something so complex. To break down the REGEX (Just so I am sure I understand what it is doing) `,(?![^\\[\\]{}]*(]|}))` will split on a `,` so long as `(?!` it is not followed by anything that matches `[^\\[\\]{}]*(]|})` which matches any character that is not one of `[`, `]`, `{`, or `}` zero or more times followed by either a `]` or `}` character. In my testing that does not maintain backwards compatibility with what we had before nor what we currently have. The python code will URL encode the key/value so no `,` appears in it. That lets us provide backwards compatibility where I could type odd values that didn't parse to JSON, or other odd things like having a string in the JSON that had JSON like characters in it. This code does nothing to handle quoting of strings, and in some cases with bad JSON it will split things very differently from before, which is at lest an incompatibility between the windows code and the python code. On Windows `-c 'foo=["A", "B", "C"' -c bar=B` I forgot to close the array for foo ``` (clojure.string/split "foo=[\"A\", \"B\", \"C\",bar=B" #",(?![^\\[\\]{}]*(]|}))") ["foo=[\"A\"" " \"B\"" " \"C\"" "bar=B"] ``` Because we silently skip things that don't make since to us (in keeping with what happened before but is probably not ideal. we would end up with foo set to `[\"A\"` and bar set to `B`. They python code works just fine here, but if for some reason I wanted bar to be an array or a map It would combine the two options together. `-c 'foo=["A", "B", "C"' -c 'bar=["B"]'` I forgot to close the array for foo again ``` (clojure.string/split "foo=[\"A\"%2C \"B\" %2C \"C\",bar=[\"B\"]" #",(?![^\\[\\]{}]*(]|}))") ["foo=[\"A\"%2C \"B\" %2C \"C\",bar=[\"B\"]"] ``` Now they are all just one parameter and foo will be set to a really ugly complex string value, and bar will just disappear. This code cannot go in as is, and making the REGEX more complicated is not going to solve it. Not to mention the case where for some reason I may want a value that is URL encoded, and if windows does not URL encode it before sending it over, we will URL decode it, and ... > Fix readCommandLineOpts to parse JSON correctly > ----------------------------------------------- > > Key: STORM-1532 > URL: https://issues.apache.org/jira/browse/STORM-1532 > Project: Apache Storm > Issue Type: Bug > Reporter: Arun Mahadevan > Assignee: Arun Mahadevan > > Utils.readCommandLineOpts does a split on "," so it does not correctly parse > values passed as json object or json arrays. > Tested by passing > -c drpc.servers=[\"host1\", \"host2\"] in storm jar command and it fails > with an exception, > Exception in thread "main" java.lang.IllegalArgumentException: Field > DRPC_SERVERS must be an Iterable but was a class java.lang.String -- This message was sent by Atlassian JIRA (v6.3.4#6332)