Hi Aljoscha,

Thanks for your reply! Before bringing up this discussion I did some research 
on commonly used separators for options that take multiple values. I 
considered ",", ":" and "#", and finally chose "#" as the separator for 
"--pyRequirements".

"," is the most widely used separator. Many projects use it to separate 
values at the same level, e.g. "-Dexcludes" in Maven, "--files" in 
Spark and "-pyFiles" in Flink. But the second parameter of "--pyRequirements", 
the requirements cache directory, is not at the same level as its first 
parameter (the requirements file). It is secondary and is only needed when the 
packages in the requirements file cannot be downloaded from the package index 
server.

":" is used as a path separator in most cases, e.g. in the main arguments of 
scp (secure copy), "--volume" in Docker and "-cp" in Java. But since we accept 
a URI as the file path, and a URI contains ":" in most cases, ":" cannot be 
used as the separator for "--pyRequirements".
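
To illustrate the ambiguity, here is a minimal Python sketch. The HDFS URI and 
the directory name "cached_dir" are made-up examples, not values taken from 
the design doc:

```python
# Hypothetical option value: a requirements file given as a URI, followed by
# a cache directory, joined with ":" (the separator being ruled out here).
value = "hdfs://namenode:9000/deps/requirements.txt:cached_dir"

# A naive split cannot tell the separator apart from the colons that belong
# to the URI scheme ("hdfs:") and the port (":9000").
parts = value.split(":")
print(parts)
# ['hdfs', '//namenode', '9000/deps/requirements.txt', 'cached_dir']
```

The split yields four pieces instead of two, so a parser would need extra 
rules to recover the file path and the directory.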

"#" is rarely used as a separator for multiple values. The only example I 
found is Spark, which uses "#" in the "--files" and "--archives" options to 
separate the file path from the target file/directory name. After some 
research I found that this usage comes from the URI fragment: a secondary 
resource can be appended as the fragment of the URI after a number sign ("#") 
character. Since we treat user file paths as URIs when parsing the command 
line, using "#" as the separator for "--pyRequirements" makes sense to me: the 
second parameter is the fragment of the first parameter. The definition of the 
URI fragment can be found here [1].
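
As a rough illustration of the idea (not the actual Flink parsing code, which 
is not shown in this thread), the sketch below uses Python's standard 
urllib.parse.urldefrag to split a value into the URI and its fragment. The 
function name split_requirements_option and the example paths are 
hypothetical:

```python
from urllib.parse import urldefrag


def split_requirements_option(value):
    """Split 'file#dir' into (requirements_file, cache_dir or None).

    Hypothetical helper: treats the part after "#" as the URI fragment.
    """
    url, fragment = urldefrag(value)
    # An absent fragment means no cache directory was given.
    return url, (fragment or None)


# With a cache directory appended as the fragment.
with_dir = split_requirements_option(
    "hdfs://namenode:9000/deps/requirements.txt#cached_dir"
)
# Without a cache directory: the value is returned unchanged.
without_dir = split_requirements_option("file:///tmp/requirements.txt")

print(with_dir)
print(without_dir)
```

Because the colons of the scheme and port stay inside the URI part, the 
split is unambiguous as long as the file path itself contains no "#".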

The reason for using "#" in "--pyArchives" as the separator between the file 
path and the target directory name is the same as above.

Best,
Wei

[1] https://tools.ietf.org/html/rfc3986#section-3.5

> On Dec 3, 2019, at 22:02, Aljoscha Krettek <aljos...@apache.org> wrote:
> 
> Hi,
> 
> Yes, I think it’s a good idea to make the options uniform. Using ‘#’ as a 
> separator for options that take two values seems a bit strange to me. Did you 
> research whether any other CLI tools have this convention?
> 
> Side note: I don’t like that our options use camel-case, I think that’s very 
> non-standard. But that’s how it is now…
> 
> Best,
> Aljoscha
> 
>> On 3. Dec 2019, at 10:14, jincheng sun <sunjincheng...@gmail.com> wrote:
>> 
> >> Thanks for bringing up this discussion, Wei!
> >> I think this is very important for Flink users; we should include these
> >> changes in Flink 1.10.
> >> +1 for the optimization from the perspective of user convenience and the
> >> unified use of Flink command line parameters.
>> 
>> Best,
>> Jincheng
>> 
> >> Wei Zhong <weizhong0...@gmail.com> wrote on Mon, Dec 2, 2019 at 3:26 PM:
>> 
>>> Hi everyone,
>>> 
> >>> I wanted to bring up the discussion of improving the PyFlink command line
> >>> options.
> >>> 
> >>> A few command line options have been introduced in FLIP-78 [1], e.g.
> >>> "python-executable-path", "python-requirements", "python-archive", etc.
> >>> There are a few problems with these options, e.g. the naming style,
> >>> variable argument options, etc.
> >>> 
> >>> We want to make some adjustments to FLIP-78 to improve the newly introduced
> >>> command line options. Here is the design doc:
>>> 
>>> https://docs.google.com/document/d/1R8CaDa3908V1SnTxBkTBzeisWqBF40NAYYjfRl680eg/edit?usp=sharing
>>> <
>>> https://docs.google.com/document/d/1R8CaDa3908V1SnTxBkTBzeisWqBF40NAYYjfRl680eg/edit?usp=sharing
>>>> 
>>> Looking forward to your feedback!
>>> 
>>> Best,
>>> Wei
>>> 
>>> [1]
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-78%3A+Flink+Python+UDF+Environment+and+Dependency+Management
>>> <
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-78:+Flink+Python+UDF+Environment+and+Dependency+Management
>>>> 
>>> 
>>> 
> 
