Joseph K. Bradley created SPARK-14605:
-----------------------------------------

    Summary: Python spark.ml classes should use unicode uid
    Key: SPARK-14605
    URL: https://issues.apache.org/jira/browse/SPARK-14605
    Project: Spark
    Issue Type: Improvement
    Components: ML, PySpark
    Reporter: Joseph K. Bradley
    Assignee: Joseph K. Bradley
    Priority: Minor

Python spark.ml Identifiable classes use UIDs of type {{str}}, but they should use {{unicode}} (in Python 2.x) to match Java. This could be a problem if someone created an instance in Java whose UID contains non-ASCII Unicode characters, saved it, and loaded it in Python.

This is also inconsistent: the following Python code produces an instance {{lr}} with a {{str}} uid and an instance {{lr2}} with a {{unicode}} uid:
{code}
from pyspark.ml.regression import LinearRegression

lr = LinearRegression(maxIter=1)       # uid is created in Python as str
lr_path = "lr-TEMP"
lr.write().overwrite().save(lr_path)   # write() is a method in PySpark
lr2 = LinearRegression.load(lr_path)   # uid is read back from storage as unicode
{code}

Proposal: Use unicode everywhere in Python.
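
As a rough illustration of the proposal (not the actual patch), a UID could be coerced to a text type at the point where it is created or assigned. The names {{text_type}}, {{to_unicode}}, and {{random_uid}} below are hypothetical and not part of PySpark, and UTF-8 is an assumed encoding for byte-string input:

```python
import sys
import uuid

# Hypothetical helper names; none of these are PySpark APIs.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (only evaluated on Python 2)


def to_unicode(uid):
    """Return uid as text (unicode on Python 2, str on Python 3)."""
    if isinstance(uid, bytes) and not isinstance(uid, text_type):
        # Byte string: decode it, assuming UTF-8.
        return uid.decode("utf-8")
    return text_type(uid)


def random_uid(prefix):
    """Build a text UID of the form '<prefix>_<random hex suffix>'."""
    return to_unicode(prefix) + u"_" + to_unicode(uuid.uuid4().hex[12:])


uid = random_uid("LinearRegression")
print(type(uid).__name__, uid)
```

With a helper like this, a UID generated in Python and a UID loaded from storage would have the same type, so comparisons between them would not depend on implicit ASCII coercion.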