Hi Sam,
I think I have some answers for two questions.
> Humble request: could we replace the "isinstance(col1, str)" tests with
"isinstance(col1, basestring)"?
IMHO, yes, I believe this should be basestring. Otherwise, some functions
would not accept unicode as arguments for columns in Python
The corr() and cov() methods of DataFrame require an instance of str for
column names:
.
https://github.com/apache/spark/blob/master/python/pyspark/sql/dataframe.py#L1356
although instances of basestring appear to work for addressing columns:
.