[ https://issues.apache.org/jira/browse/SPARK-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211768#comment-14211768 ]
Xiangrui Meng commented on SPARK-4348:
--------------------------------------

Note that after this fix, it is very likely that the stale bytecode file `random.pyc` still sits under `pyspark/mllib`. We need to remove it manually to prevent "import random" from picking up that file.

> pyspark.mllib.random conflicts with random module
> -------------------------------------------------
>
>                 Key: SPARK-4348
>                 URL: https://issues.apache.org/jira/browse/SPARK-4348
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, PySpark
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Davies Liu
>            Assignee: Davies Liu
>            Priority: Blocker
>             Fix For: 1.2.0
>
>
> There are conflicts in two cases:
> 1. The random module is used by pyspark.mllib.feature; if the first entry of
> sys.path is not '', then the hack in pyspark/__init__.py will fail to fix the
> conflict.
> 2. When running tests in mllib/xxx.py, the '' entry should be popped from
> sys.path before importing anything, or the tests will fail.
> The first one is not fully fixed for users; it will introduce problems in some
> cases, such as:
> {code}
> >>> import sys
> >>> sys.path.insert(0, PATH_OF_MODULE)
> >>> import pyspark
> >>> # using Word2Vec will fail
> {code}
> I'd like to rename mllib/random.py to mllib/_random.py, then in
> mllib/__init__.py:
> {code}
> import pyspark.mllib._random as random
> {code}
> cc [~mengxr] [~dorx]
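
The manual cleanup mentioned in the comment can be sketched roughly as follows. This is only an illustration, not part of the actual fix: the helper name `remove_stale_bytecode` is hypothetical, and it assumes Python 2-style bytecode files (`.pyc`/`.pyo`) stored next to the source, which is the case where an orphaned `random.pyc` can still be imported after `random.py` has been renamed.

```python
import os


def remove_stale_bytecode(mllib_dir, module_name="random"):
    """Delete leftover bytecode for a module whose .py source is gone.

    Hypothetical helper: returns the list of files that were removed.
    """
    removed = []
    source = os.path.join(mllib_dir, module_name + ".py")
    if os.path.exists(source):
        # The source file still exists, so its bytecode is not stale.
        return removed
    for suffix in (".pyc", ".pyo"):
        path = os.path.join(mllib_dir, module_name + suffix)
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

It would be called with the path to the `pyspark/mllib` directory of the local checkout or installation, for example `remove_stale_bytecode("/path/to/spark/python/pyspark/mllib")`, where the path shown is a placeholder.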