[ 
https://issues.apache.org/jira/browse/SPARK-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260123#comment-14260123
 ] 

Eric O. LEBIGOT (EOL) edited comment on SPARK-4992 at 12/29/14 2:24 PM:
------------------------------------------------------------------------

Now I see that the problem with Python is more pervasive: the Python 
introduction at http://spark.apache.org/docs/latest/quick-start.html does not 
respect the PEP 8 style guidelines (for instance, it uses camel case for 
simple variables), and it shows some unusual conventions.

At this point, I think that not shadowing "file" on the main web page is 
important.
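For illustration (a hypothetical snippet, not the actual front-page code — in Python 2, "file" is the built-in class of open file objects):

```python
# Bad: rebinding a built-in name. In Python 2, `file` is the built-in
# class returned by open(), so this hides it for the rest of the scope.
file = "README.md"        # hypothetical path, for illustration only

# Good: a descriptive, non-built-in name.
text_file = "README.md"

# The same hazard exists in Python 3 with other built-ins, e.g. `list`:
list = [1, 2, 3]          # now list("abc") would raise a TypeError
del list                  # deleting the shadow restores the built-in
print(list("abc"))        # ['a', 'b', 'c']
```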

Another important instance is the builtin "max" function, which is shadowed by 
an example at http://spark.apache.org/docs/latest/quick-start.html (look for 
"def max"). The function could be called "larger", for instance. I have not 
used PySpark yet, but there may be no need to redefine a max function at all, 
since there is a builtin (maybe its signature is incompatible with PySpark?).
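A minimal plain-Python sketch of the point (no PySpark needed; the word counts here are made-up data standing in for the quick-start's per-line counts): the builtin already works as a two-argument reducer, so redefining it is unnecessary.

```python
from functools import reduce

# Hypothetical data: number of words per line of some file.
word_counts = [3, 7, 2, 9, 4]

# The quick-start defines its own `def max(a, b)` before calling
# reduce(max). The builtin max is already a valid binary reducer,
# so the same call works without any redefinition or shadowing.
longest = reduce(max, word_counts)
print(longest)  # 9
```

In PySpark, the same builtin could presumably be passed straight to `RDD.reduce()`.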

More generally, it would be better advertisement for Spark if the Python 
examples showed a higher mastery of the language (the errors I see are 
beginner's mistakes): they make me wonder about the quality of the Python 
bindings. Now, my fears may be just that, and I will still investigate Spark. 
I must also say that I am happy that the bindings exist, and I appreciate all 
the volunteer work that went into them and into Spark. :)


was (Author: lebigot):
Now I see that the problem with Python is more pervasive: the Python 
introduction at http://spark.apache.org/docs/latest/quick-start.html does not 
respect the style guidelines of PEP 8, for instance, and shows some unusual 
convention.

At this point, I think that not shadowing "file" is important.

I also noticed that the builtin "max" function is also shadowed by an example 
at http://spark.apache.org/docs/latest/quick-start.html (look for "def max"). 
The function could be called "larger", for instance. I have not used PySpark 
yet, but there might be no need to redefine a max function since there is a 
builtin (maybe its signature is incompatible with PySpark?).

More generally, it would be better advertisement for Spark if the Python 
examples showed a higher mastery of the language (the errors I see are 
beginner's mistakes): they let me wonder about the quality of the Python 
bindings. Now, my fears might be just that; I am happy that the bindings exist, 
and I do appreciate all the volunteer work that went into them and into Spark. 
:)

> Prominent Python example has bad, beginner style
> ------------------------------------------------
>
>                 Key: SPARK-4992
>                 URL: https://issues.apache.org/jira/browse/SPARK-4992
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation
>            Reporter: Eric O. LEBIGOT (EOL)
>            Priority: Trivial
>              Labels: documentation
>         Attachments: SPARK-4992.patch
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> In the main webpage http://spark.apache.org/, the Python example uses a 
> variable named "file": this is a well-known bad practice (because file is the 
> name of a built-in class).
> Such a prominently visible example should show a better mastery of Python, if 
> Spark is to give confidence in the quality of its Python API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
