The corr() and cov() methods of the PySpark DataFrame accept column names
only as instances of str, so unicode names are rejected:

https://github.com/apache/spark/blob/master/python/pyspark/sql/dataframe.py#L1356

although instances of basestring appear to work for addressing columns:

https://github.com/apache/spark/blob/master/python/pyspark/sql/dataframe.py#L708

Humble request: could we replace the "isinstance(col1, str)" tests with
"isinstance(col1, basestring)"?
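
A minimal sketch of the kind of check I have in mind, written so that it runs on both Python 2 and Python 3. The names string_types and check_column_name are hypothetical and are not taken from dataframe.py; the exception type is also just my choice for illustration.

```python
import sys

# On Python 2, basestring is the common parent of str and unicode.
# On Python 3 there is no basestring, and str already holds unicode text.
if sys.version_info[0] >= 3:
    string_types = (str,)
else:
    string_types = (basestring,)  # noqa: F821 (only evaluated on Python 2)


def check_column_name(col):
    """Hypothetical helper: accept any string-like column name, reject the rest."""
    if not isinstance(col, string_types):
        raise ValueError("column name should be a string, got %s" % type(col).__name__)
    return col


# Both plain and unicode literals pass the check on either Python version.
check_column_name("price")
check_column_name(u"price")
```

With a check like this, df.corr(u"a", u"b") would be treated the same as df.corr("a", "b") under Python 2.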

Less humble request: why test types at all? Why not just do one of {raise
KeyError, coerce to string}?
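
To illustrate the "raise KeyError" option, here is a rough sketch that skips the type test and simply checks the name against the known columns. The function lookup_column and the plain list of column names are assumptions for illustration only; the real methods hand the name to the JVM, so this is not a drop-in replacement.

```python
def lookup_column(columns, col):
    """Hypothetical helper: no isinstance test, just a membership check."""
    # Any object that compares equal to a known column name is accepted;
    # anything else fails the same way a missing dict key would.
    if col not in columns:
        raise KeyError(col)
    return col


known_columns = ["a", "b"]  # stand-in for the DataFrame's column names
lookup_column(known_columns, u"a")
```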

Cheers,
Sam


