[jira] [Commented] (SPARK-3869) ./bin/spark-class miss Java version with _JAVA_OPTIONS set
[ https://issues.apache.org/jira/browse/SPARK-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169329#comment-14169329 ]

cocoatomo commented on SPARK-3869:
--

Hi [~pwendell], thank you for informing me. Is it OK to use the abbreviated last name (e.g. "Barack O.")?

> ./bin/spark-class miss Java version with _JAVA_OPTIONS set
> --
>
> Key: SPARK-3869
> URL: https://issues.apache.org/jira/browse/SPARK-3869
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20
> Reporter: cocoatomo
>
> When the _JAVA_OPTIONS environment variable is set, the command "java -version" outputs a message like "Picked up _JAVA_OPTIONS: -Dfile.encoding=UTF-8".
> ./bin/spark-class reads the Java version from the first line of the "java -version" output, so it misreads the version when _JAVA_OPTIONS is set.
> commit: a85f24accd3266e0f97ee04d03c22b593d99c062

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
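The actual fix belongs in the bash script ./bin/spark-class; purely as an illustration of the parsing problem, here is a minimal Python sketch (the function name is hypothetical) that skips the _JAVA_OPTIONS notice instead of blindly trusting the first line of the output:

```python
import re


def parse_java_version(version_output):
    """Return the version string from `java -version` output.

    With _JAVA_OPTIONS (or JAVA_TOOL_OPTIONS) set, the JVM prepends a
    line such as 'Picked up _JAVA_OPTIONS: -Dfile.encoding=UTF-8', so
    taking the first line misreads the version.  Skip any 'Picked up'
    notice lines before matching.
    """
    for line in version_output.splitlines():
        if line.startswith("Picked up "):
            continue  # JVM notice, not the version line
        m = re.search(r'version "([^"]+)"', line)
        if m:
            return m.group(1)
    return None


# Output as produced with _JAVA_OPTIONS set:
sample = (
    'Picked up _JAVA_OPTIONS: -Dfile.encoding=UTF-8\n'
    'java version "1.8.0_20"\n'
    'Java(TM) SE Runtime Environment (build 1.8.0_20-b26)\n'
)
print(parse_java_version(sample))  # -> 1.8.0_20
```

A shell fix along the same lines would filter the output (e.g. discard "Picked up" lines) before extracting the version.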
[jira] [Commented] (SPARK-3910) ./python/pyspark/mllib/classification.py doctests fails with module name pollution
[ https://issues.apache.org/jira/browse/SPARK-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168786#comment-14168786 ]

cocoatomo commented on SPARK-3910:
--

Thank you for the comment. I am trying it at $SPARK_HOME. (Executing the "./bin/run-tests" command shows this.)

In addition, it is strange that the command
{noformat}
./bin/pyspark python/pyspark/mllib/classification.py
{noformat}
fails with "numpy ImportError". So my environment has some trouble (sys.path is suspicious), and at least there is some difference between the environments where PySpark runs.

I set up my environment using virtualenvwrapper with Python 2.6.8 (the default python executable on Mac OS X 10.9.5). The ImportError mentioned in this issue occurred in this environment. For comparison, I tried testing in another environment whose Python version is 2.7.8, and got the same error. Is there some difference between our environments?

> ./python/pyspark/mllib/classification.py doctests fails with module name pollution
> --
>
> Key: SPARK-3910
> URL: https://issues.apache.org/jira/browse/SPARK-3910
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20, Jinja2==2.7.3, MarkupSafe==0.23, Pygments==1.6, Sphinx==1.2.3, argparse==1.2.1, docutils==0.12, flake8==2.2.3, mccabe==0.2.1, numpy==1.9.0, pep8==1.5.7, psutil==2.1.3, pyflake8==0.1.9, pyflakes==0.8.1, unittest2==0.5.1, wsgiref==0.1.2
> Reporter: cocoatomo
> Labels: pyspark, testing
>
> In the ./python/run-tests script, we run the doctests in ./pyspark/mllib/classification.py.
> The output is as follows:
> {noformat}
> $ ./python/run-tests
> ...
> Running test: pyspark/mllib/classification.py
> Traceback (most recent call last):
>   File "pyspark/mllib/classification.py", line 20, in <module>
>     import numpy
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/__init__.py", line 170, in <module>
>     from . import add_newdocs
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/add_newdocs.py", line 13, in <module>
>     from numpy.lib import add_newdoc
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/lib/__init__.py", line 8, in <module>
>     from .type_check import *
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/lib/type_check.py", line 11, in <module>
>     import numpy.core.numeric as _nx
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/core/__init__.py", line 46, in <module>
>     from numpy.testing import Tester
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/testing/__init__.py", line 13, in <module>
>     from .utils import *
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/testing/utils.py", line 15, in <module>
>     from tempfile import mkdtemp
>   File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/tempfile.py", line 34, in <module>
>     from random import Random as _Random
>   File "/Users/tomohiko/MyRepos/Scala/spark/python/pyspark/mllib/random.py", line 24, in <module>
>     from pyspark.rdd import RDD
>   File "/Users/tomohiko/MyRepos/Scala/spark/python/pyspark/__init__.py", line 51, in <module>
>     from pyspark.context import SparkContext
>   File "/Users/tomohiko/MyRepos/Scala/spark/python/pyspark/context.py", line 22, in <module>
>     from tempfile import NamedTemporaryFile
> ImportError: cannot import name NamedTemporaryFile
> 0.07 real 0.04 user 0.02 sys
> Had test failures; see logs.
> {noformat}
> The problem is a cyclic import of the tempfile module.
> The cause is that the pyspark.mllib.random module lives in the same directory as the pyspark.mllib.classification module.
> The classification module imports the numpy module, and the numpy module in turn imports the tempfile module internally.
> Now the first entry of sys.path is the directory "./python/pyspark/mllib" (where the executed file "classification.py" exists), so the tempfile module imports the pyspark.mllib.random module (not the standard-library "random" module).
> Finally, the import chain reaches tempfile again, and a cyclic import is formed.
> Summary: classification → numpy → tempfile → pyspark.mllib.random → tempfile → (cyclic import!!)
> Furthermore, stat is a standard-library module name, and a pyspark.mllib.stat module exists. This may also be troublesome.
> commit: 0e8203f4fb721158fb27897680da476174d24c4b
> A fundamental solution is to avoid using module names taken by standard libraries (currently "random" and "stat").
> A difficulty of this solution is that pyspark.mllib.random and pyspark.mllib.stat, which would have to be renamed, may already be in use.
[jira] [Updated] (SPARK-3910) ./python/pyspark/mllib/classification.py doctests fails with module name pollution
[ https://issues.apache.org/jira/browse/SPARK-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3910:
-
Labels: pyspark testing  (was: )

> ./python/pyspark/mllib/classification.py doctests fails with module name pollution
> --
>
> Key: SPARK-3910
> URL: https://issues.apache.org/jira/browse/SPARK-3910
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20, Jinja2==2.7.3, MarkupSafe==0.23, Pygments==1.6, Sphinx==1.2.3, argparse==1.2.1, docutils==0.12, flake8==2.2.3, mccabe==0.2.1, numpy==1.9.0, pep8==1.5.7, psutil==2.1.3, pyflake8==0.1.9, pyflakes==0.8.1, unittest2==0.5.1, wsgiref==0.1.2
> Reporter: cocoatomo
> Labels: pyspark, testing
>
> In the ./python/run-tests script, we run the doctests in ./pyspark/mllib/classification.py.
> The output is as follows:
> {noformat}
> $ ./python/run-tests
> ...
> Running test: pyspark/mllib/classification.py
> Traceback (most recent call last):
>   File "pyspark/mllib/classification.py", line 20, in <module>
>     import numpy
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/__init__.py", line 170, in <module>
>     from . import add_newdocs
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/add_newdocs.py", line 13, in <module>
>     from numpy.lib import add_newdoc
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/lib/__init__.py", line 8, in <module>
>     from .type_check import *
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/lib/type_check.py", line 11, in <module>
>     import numpy.core.numeric as _nx
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/core/__init__.py", line 46, in <module>
>     from numpy.testing import Tester
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/testing/__init__.py", line 13, in <module>
>     from .utils import *
>   File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/testing/utils.py", line 15, in <module>
>     from tempfile import mkdtemp
>   File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/tempfile.py", line 34, in <module>
>     from random import Random as _Random
>   File "/Users/tomohiko/MyRepos/Scala/spark/python/pyspark/mllib/random.py", line 24, in <module>
>     from pyspark.rdd import RDD
>   File "/Users/tomohiko/MyRepos/Scala/spark/python/pyspark/__init__.py", line 51, in <module>
>     from pyspark.context import SparkContext
>   File "/Users/tomohiko/MyRepos/Scala/spark/python/pyspark/context.py", line 22, in <module>
>     from tempfile import NamedTemporaryFile
> ImportError: cannot import name NamedTemporaryFile
> 0.07 real 0.04 user 0.02 sys
> Had test failures; see logs.
> {noformat}
> The problem is a cyclic import of the tempfile module.
> The cause is that the pyspark.mllib.random module lives in the same directory as the pyspark.mllib.classification module.
> The classification module imports the numpy module, and the numpy module in turn imports the tempfile module internally.
> Now the first entry of sys.path is the directory "./python/pyspark/mllib" (where the executed file "classification.py" exists), so the tempfile module imports the pyspark.mllib.random module (not the standard-library "random" module).
> Finally, the import chain reaches tempfile again, and a cyclic import is formed.
> Summary: classification → numpy → tempfile → pyspark.mllib.random → tempfile → (cyclic import!!)
> Furthermore, stat is a standard-library module name, and a pyspark.mllib.stat module exists. This may also be troublesome.
> commit: 0e8203f4fb721158fb27897680da476174d24c4b
> A fundamental solution is to avoid using module names taken by standard libraries (currently "random" and "stat").
> A difficulty of this solution is that pyspark.mllib.random and pyspark.mllib.stat, which would have to be renamed, may already be in use.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (SPARK-3910) ./python/pyspark/mllib/classification.py doctests fails with module name pollution
cocoatomo created SPARK-3910:
-

Summary: ./python/pyspark/mllib/classification.py doctests fails with module name pollution
Key: SPARK-3910
URL: https://issues.apache.org/jira/browse/SPARK-3910
Project: Spark
Issue Type: Sub-task
Components: PySpark
Affects Versions: 1.2.0
Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20, Jinja2==2.7.3, MarkupSafe==0.23, Pygments==1.6, Sphinx==1.2.3, argparse==1.2.1, docutils==0.12, flake8==2.2.3, mccabe==0.2.1, numpy==1.9.0, pep8==1.5.7, psutil==2.1.3, pyflake8==0.1.9, pyflakes==0.8.1, unittest2==0.5.1, wsgiref==0.1.2
Reporter: cocoatomo

In the ./python/run-tests script, we run the doctests in ./pyspark/mllib/classification.py.
The output is as follows:
{noformat}
$ ./python/run-tests
...
Running test: pyspark/mllib/classification.py
Traceback (most recent call last):
  File "pyspark/mllib/classification.py", line 20, in <module>
    import numpy
  File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/__init__.py", line 170, in <module>
    from . import add_newdocs
  File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/core/__init__.py", line 46, in <module>
    from numpy.testing import Tester
  File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/testing/__init__.py", line 13, in <module>
    from .utils import *
  File "/Users/tomohiko/.virtualenvs/pyspark_py26/lib/python2.6/site-packages/numpy/testing/utils.py", line 15, in <module>
    from tempfile import mkdtemp
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/tempfile.py", line 34, in <module>
    from random import Random as _Random
  File "/Users/tomohiko/MyRepos/Scala/spark/python/pyspark/mllib/random.py", line 24, in <module>
    from pyspark.rdd import RDD
  File "/Users/tomohiko/MyRepos/Scala/spark/python/pyspark/__init__.py", line 51, in <module>
    from pyspark.context import SparkContext
  File "/Users/tomohiko/MyRepos/Scala/spark/python/pyspark/context.py", line 22, in <module>
    from tempfile import NamedTemporaryFile
ImportError: cannot import name NamedTemporaryFile
0.07 real 0.04 user 0.02 sys
Had test failures; see logs.
{noformat}
The problem is a cyclic import of the tempfile module.
The cause is that the pyspark.mllib.random module lives in the same directory as the pyspark.mllib.classification module.
The classification module imports the numpy module, and the numpy module in turn imports the tempfile module internally.
Now the first entry of sys.path is the directory "./python/pyspark/mllib" (where the executed file "classification.py" exists), so the tempfile module imports the pyspark.mllib.random module (not the standard-library "random" module).
Finally, the import chain reaches tempfile again, and a cyclic import is formed.
Summary: classification → numpy → tempfile → pyspark.mllib.random → tempfile → (cyclic import!!)
Furthermore, stat is a standard-library module name, and a pyspark.mllib.stat module exists. This may also be troublesome.
commit: 0e8203f4fb721158fb27897680da476174d24c4b
A fundamental solution is to avoid using module names taken by standard libraries (currently "random" and "stat").
A difficulty of this solution is that pyspark.mllib.random and pyspark.mllib.stat, which would have to be renamed, may already be in use.

This message was sent by Atlassian JIRA (v6.3.4#6332)
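The shadowing described above can be reproduced in a few lines without Spark at all. The snippet below (a self-contained sketch, not Spark code) shows the standard-library "random" module being hijacked by a local file of the same name, exactly the situation created by ./python/pyspark/mllib/random.py:

```python
import os
import sys
import tempfile

# A file named random.py shadows the standard-library "random" module
# whenever its directory is the first entry of sys.path, which is the
# situation when running a script inside ./python/pyspark/mllib.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "random.py"), "w") as f:
    f.write("SHADOWED = True\n")

sys.path.insert(0, workdir)
sys.modules.pop("random", None)  # force a fresh import lookup

import random

shadowed = getattr(random, "SHADOWED", False)
print(shadowed)  # -> True: the local file won, not the stdlib module

# Undo the pollution so later imports see the real stdlib again.
sys.path.remove(workdir)
sys.modules.pop("random", None)
```

This is why renaming the colliding modules (or keeping the script directory off the front of sys.path) resolves the cyclic import: tempfile's `from random import Random` would then find the standard library.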
[jira] [Created] (SPARK-3909) A corrupted format in Sphinx documents and building warnings
cocoatomo created SPARK-3909:
-

Summary: A corrupted format in Sphinx documents and building warnings
Key: SPARK-3909
URL: https://issues.apache.org/jira/browse/SPARK-3909
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.2.0
Environment: Mac OS X 10.9.5, Python 2.7.8, Jinja2==2.7.3, MarkupSafe==0.23, Pygments==1.6, Sphinx==1.2.3, docutils==0.12, numpy==1.9.0, wsgiref==0.1.2
Reporter: cocoatomo
Priority: Minor

The Sphinx documents contain corrupted reST formatting and produce some warnings.
The purpose of this issue is the same as https://issues.apache.org/jira/browse/SPARK-3773.

commit: 0e8203f4fb721158fb27897680da476174d24c4b

output
{noformat}
$ cd ./python/docs
$ make clean html
rm -rf _build/*
sphinx-build -b html -d _build/doctrees . _build/html
Making output directory...
Running Sphinx v1.2.3
loading pickled environment... not yet created
building [html]: targets for 4 source files that are out of date
updating environment: 4 added, 0 changed, 0 removed
reading sources... [100%] pyspark.sql
/Users//MyRepos/Scala/spark/python/pyspark/mllib/feature.py:docstring of pyspark.mllib.feature.Word2VecModel.findSynonyms:4: WARNING: Field list ends without a blank line; unexpected unindent.
/Users//MyRepos/Scala/spark/python/pyspark/mllib/feature.py:docstring of pyspark.mllib.feature.Word2VecModel.transform:3: WARNING: Field list ends without a blank line; unexpected unindent.
/Users//MyRepos/Scala/spark/python/pyspark/sql.py:docstring of pyspark.sql:4: WARNING: Bullet list ends without a blank line; unexpected unindent.
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] pyspark.sql
writing additional files... (12 module code pages) _modules/index search
copying static files... WARNING: html_static_path entry u'/Users//MyRepos/Scala/spark/python/docs/_static' does not exist
done
copying extra files... done
dumping search index... done
dumping object inventory... done
build succeeded, 4 warnings.

Build finished. The HTML pages are in _build/html.
{noformat}

This message was sent by Atlassian JIRA (v6.3.4#6332)
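The "Field list ends without a blank line; unexpected unindent" warnings come from docstrings where prose follows a reST field list with no separating blank line. The sketch below is illustrative only (the function names are hypothetical, not the actual Word2VecModel code); the fix is simply the blank line after the last `:param:` field:

```python
def find_synonyms_broken(word, num):
    """Find synonyms of a word.

    :param word: a word to look up
    :param num: number of synonyms to return
    Note: local vectors only.
    """
    # Sphinx warns "Field list ends without a blank line; unexpected
    # unindent." because prose follows the field list directly.


def find_synonyms_fixed(word, num):
    """Find synonyms of a word.

    :param word: a word to look up
    :param num: number of synonyms to return

    Note: local vectors only.
    """
    # The blank line after the last field terminates the field list,
    # so docutils parses the trailing prose as a new paragraph.
```

The "Bullet list ends without a blank line" warning in pyspark/sql.py has the same shape: a bullet list must be followed by a blank line before ordinary prose resumes.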
[jira] [Updated] (SPARK-3867) ./python/run-tests failed when it run with Python 2.6 and unittest2 is not installed
[ https://issues.apache.org/jira/browse/SPARK-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3867:
-
Description:
./python/run-tests searches for a Python 2.6 executable on PATH and uses it if available.
When using Python 2.6, it imports the unittest2 module, which is *not* part of the standard library in Python 2.6, so the tests fail with an ImportError.
commit: 1d72a30874a88bdbab75217f001cf2af409016e7

was:
./python/run-tests searches for a Python 2.6 executable on PATH and uses it if available.
When using Python 2.6, it imports the unittest2 module, which is *not* part of the standard library in Python 2.6, so the tests fail with an ImportError.

> ./python/run-tests failed when it run with Python 2.6 and unittest2 is not installed
> --
>
> Key: SPARK-3867
> URL: https://issues.apache.org/jira/browse/SPARK-3867
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20
> Reporter: cocoatomo
> Labels: pyspark, testing
>
> ./python/run-tests searches for a Python 2.6 executable on PATH and uses it if available.
> When using Python 2.6, it imports the unittest2 module, which is *not* part of the standard library in Python 2.6, so the tests fail with an ImportError.
> commit: 1d72a30874a88bdbab75217f001cf2af409016e7

This message was sent by Atlassian JIRA (v6.3.4#6332)
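The usual guard for this situation falls back to the stdlib on interpreters where unittest2 is unnecessary. This is a sketch of the common pattern, not necessarily the exact change Spark adopted:

```python
import sys

# Python 2.6's stdlib unittest lacks features the tests rely on, so
# the unittest2 backport is required there; on newer interpreters the
# stdlib module is sufficient and unittest2 need not be installed.
if sys.version_info[:2] <= (2, 6):
    try:
        import unittest2 as unittest
    except ImportError:
        sys.exit("Please install unittest2 to run the tests on Python 2.6")
else:
    import unittest


class SmokeTest(unittest.TestCase):
    def test_truth(self):
        self.assertTrue(True)
```

With this guard the ImportError becomes an actionable message on Python 2.6 instead of a crash, and newer Pythons are unaffected.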
[jira] [Updated] (SPARK-3869) ./bin/spark-class miss Java version with _JAVA_OPTIONS set
[ https://issues.apache.org/jira/browse/SPARK-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3869:
-
Description:
When the _JAVA_OPTIONS environment variable is set, the command "java -version" outputs a message like "Picked up _JAVA_OPTIONS: -Dfile.encoding=UTF-8".
./bin/spark-class reads the Java version from the first line of the "java -version" output, so it misreads the version when _JAVA_OPTIONS is set.
commit: a85f24accd3266e0f97ee04d03c22b593d99c062

was:
When the _JAVA_OPTIONS environment variable is set, the command "java -version" outputs a message like "Picked up _JAVA_OPTIONS: -Dfile.encoding=UTF-8".
./bin/spark-class reads the Java version from the first line of the "java -version" output, so it misreads the version when _JAVA_OPTIONS is set.

> ./bin/spark-class miss Java version with _JAVA_OPTIONS set
> --
>
> Key: SPARK-3869
> URL: https://issues.apache.org/jira/browse/SPARK-3869
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20
> Reporter: cocoatomo
>
> When the _JAVA_OPTIONS environment variable is set, the command "java -version" outputs a message like "Picked up _JAVA_OPTIONS: -Dfile.encoding=UTF-8".
> ./bin/spark-class reads the Java version from the first line of the "java -version" output, so it misreads the version when _JAVA_OPTIONS is set.
> commit: a85f24accd3266e0f97ee04d03c22b593d99c062

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (SPARK-3869) ./bin/spark-class miss Java version with _JAVA_OPTIONS set
cocoatomo created SPARK-3869:
-

Summary: ./bin/spark-class miss Java version with _JAVA_OPTIONS set
Key: SPARK-3869
URL: https://issues.apache.org/jira/browse/SPARK-3869
Project: Spark
Issue Type: Bug
Components: Spark Shell
Affects Versions: 1.2.0
Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20
Reporter: cocoatomo

When the _JAVA_OPTIONS environment variable is set, the command "java -version" outputs a message like "Picked up _JAVA_OPTIONS: -Dfile.encoding=UTF-8".
./bin/spark-class reads the Java version from the first line of the "java -version" output, so it misreads the version when _JAVA_OPTIONS is set.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3867) ./python/run-tests failed when it run with Python 2.6 and unittest2 is not installed
[ https://issues.apache.org/jira/browse/SPARK-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3867:
-
Issue Type: Sub-task  (was: Bug)
Parent: SPARK-3866

> ./python/run-tests failed when it run with Python 2.6 and unittest2 is not installed
> --
>
> Key: SPARK-3867
> URL: https://issues.apache.org/jira/browse/SPARK-3867
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20
> Reporter: cocoatomo
> Labels: pyspark, testing
>
> ./python/run-tests searches for a Python 2.6 executable on PATH and uses it if available.
> When using Python 2.6, it imports the unittest2 module, which is *not* part of the standard library in Python 2.6, so the tests fail with an ImportError.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3868) Hard to recognize which module is tested from unit-tests.log
[ https://issues.apache.org/jira/browse/SPARK-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3868:
-
Issue Type: Sub-task  (was: Bug)
Parent: SPARK-3866

> Hard to recognize which module is tested from unit-tests.log
> --
>
> Key: SPARK-3868
> URL: https://issues.apache.org/jira/browse/SPARK-3868
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20
> Reporter: cocoatomo
> Labels: pyspark, testing
>
> The ./python/run-tests script displays messages about which test it is currently running on stdout, but does not write them to unit-tests.log.
> This makes it harder to recognize which test programs were executed and which test failed.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (SPARK-3868) Hard to recognize which module is tested from unit-tests.log
cocoatomo created SPARK-3868:
-

Summary: Hard to recognize which module is tested from unit-tests.log
Key: SPARK-3868
URL: https://issues.apache.org/jira/browse/SPARK-3868
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.2.0
Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20
Reporter: cocoatomo

The ./python/run-tests script displays messages about which test it is currently running on stdout, but does not write them to unit-tests.log.
This makes it harder to recognize which test programs were executed and which test failed.

This message was sent by Atlassian JIRA (v6.3.4#6332)
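One way to record the progress messages in both places is a small tee-style writer. This is only a sketch of the idea (run-tests itself is a shell script, where the equivalent is piping through `tee`); the class name and log path here are illustrative:

```python
import sys


class TeeLog(object):
    """Write progress messages both to stdout and to a log file, so
    that the log also records which test module was running."""

    def __init__(self, log_path):
        self.log_path = log_path

    def write(self, message):
        sys.stdout.write(message)
        # Append so one log accumulates output from every test module.
        with open(self.log_path, "a") as log:
            log.write(message)


out = TeeLog("unit-tests.log")
out.write("Running test: pyspark/mllib/classification.py\n")
```

In the shell script, the same effect is typically achieved with `some_test_command 2>&1 | tee -a unit-tests.log`.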
[jira] [Created] (SPARK-3867) ./python/run-tests failed when it run with Python 2.6 and unittest2 is not installed
cocoatomo created SPARK-3867:
-

Summary: ./python/run-tests failed when it run with Python 2.6 and unittest2 is not installed
Key: SPARK-3867
URL: https://issues.apache.org/jira/browse/SPARK-3867
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.2.0
Environment: Mac OS X 10.9.5, Python 2.6.8, Java 1.8.0_20
Reporter: cocoatomo

./python/run-tests searches for a Python 2.6 executable on PATH and uses it if available.
When using Python 2.6, it imports the unittest2 module, which is *not* part of the standard library in Python 2.6, so the tests fail with an ImportError.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3866) Clean up python/run-tests problems
[ https://issues.apache.org/jira/browse/SPARK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3866:
-
Attachment: unit-tests.log

An output from ./python/run-tests

> Clean up python/run-tests problems
> --
>
> Key: SPARK-3866
> URL: https://issues.apache.org/jira/browse/SPARK-3866
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.7.8, Java 1.8.0_20
> Reporter: cocoatomo
> Labels: pyspark, testing
> Attachments: unit-tests.log
>
> This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
> It will have sub-tasks for several kinds of issues.
> A test output is contained in the attached file.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3866) Clean up python/run-tests problems
[ https://issues.apache.org/jira/browse/SPARK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3866:
-
Attachment: (was: unit-tests.log)

> Clean up python/run-tests problems
> --
>
> Key: SPARK-3866
> URL: https://issues.apache.org/jira/browse/SPARK-3866
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.7.8, Java 1.8.0_20
> Reporter: cocoatomo
> Labels: pyspark, testing
>
> This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
> It will have sub-tasks for several kinds of issues.
> A test output is contained in the attached file.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3866) Clean up python/run-tests problems
[ https://issues.apache.org/jira/browse/SPARK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3866:
-
Environment: Mac OS X 10.9.5, Python 2.7.8, Java 1.8.0_20  (was: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0, Java 1.8.0_20)

> Clean up python/run-tests problems
> --
>
> Key: SPARK-3866
> URL: https://issues.apache.org/jira/browse/SPARK-3866
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.7.8, Java 1.8.0_20
> Reporter: cocoatomo
> Labels: pyspark, testing
> Attachments: unit-tests.log
>
> This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
> It will have sub-tasks for several kinds of issues.
> A test output is contained in the attached file.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3866) Clean up python/run-tests problems
[ https://issues.apache.org/jira/browse/SPARK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3866:
-
Attachment: unit-tests.log

An output from ./python/run-tests

> Clean up python/run-tests problems
> --
>
> Key: SPARK-3866
> URL: https://issues.apache.org/jira/browse/SPARK-3866
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0, Java 1.8.0_20
> Reporter: cocoatomo
> Labels: pyspark, testing
> Attachments: unit-tests.log
>
> This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
> It will have sub-tasks for several kinds of issues.
> A test output is contained in the attached file.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3866) Clean up python/run-tests problems
[ https://issues.apache.org/jira/browse/SPARK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3866:
-
Description:
This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
It will have sub-tasks for several kinds of issues.
A test output is contained in the attached file.

was:
This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
It will have sub-tasks for several kinds of issues.
Contents of unit-tests.log:
{noformat}
{noformat}

> Clean up python/run-tests problems
> --
>
> Key: SPARK-3866
> URL: https://issues.apache.org/jira/browse/SPARK-3866
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0, Java 1.8.0_20
> Reporter: cocoatomo
> Labels: pyspark, testing
>
> This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
> It will have sub-tasks for several kinds of issues.
> A test output is contained in the attached file.

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3866) Clean up python/run-tests problems
[ https://issues.apache.org/jira/browse/SPARK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3866:
-
Description:
This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
It will have sub-tasks for several kinds of issues.
Contents of unit-tests.log:
{noformat}
{noformat}

was:
This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
It will have sub-tasks for several kinds of issues.
Contents of unit-tests.log:
{noformat}
{noformat}

> Clean up python/run-tests problems
> --
>
> Key: SPARK-3866
> URL: https://issues.apache.org/jira/browse/SPARK-3866
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0, Java 1.8.0_20
> Reporter: cocoatomo
> Labels: pyspark, testing
>
> This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
> It will have sub-tasks for several kinds of issues.
> Contents of unit-tests.log:
> {noformat}
> {noformat}

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3866) Clean up python/run-tests problems
[ https://issues.apache.org/jira/browse/SPARK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cocoatomo updated SPARK-3866:
-
Description:
This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
It will have sub-tasks for several kinds of issues.
Contents of unit-tests.log:
{noformat}
{noformat}

was:
This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
It will have sub-tasks for several kinds of issues.

> Clean up python/run-tests problems
> --
>
> Key: SPARK-3866
> URL: https://issues.apache.org/jira/browse/SPARK-3866
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0
> Reporter: cocoatomo
> Labels: pyspark, testing
>
> This is an overhaul issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062.
> It will have sub-tasks for several kinds of issues.
> Contents of unit-tests.log:
> {noformat}
> {noformat}

This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (SPARK-3866) Clean up python/run-tests problems
[ https://issues.apache.org/jira/browse/SPARK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cocoatomo updated SPARK-3866: - Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0, Java 1.8.0_20 (was: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0) > Clean up python/run-tests problems > -- > > Key: SPARK-3866 > URL: https://issues.apache.org/jira/browse/SPARK-3866 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 1.2.0 > Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0, Java > 1.8.0_20 >Reporter: cocoatomo > Labels: pyspark, testing > > This is an umbrella issue for removing the problems encountered when running > ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062. > It will have sub-tasks for the various kinds of issues. > Contents of unit-tests.log: > {noformat} > {noformat}
[jira] [Created] (SPARK-3866) Clean up python/run-tests problems
cocoatomo created SPARK-3866: Summary: Clean up python/run-tests problems Key: SPARK-3866 URL: https://issues.apache.org/jira/browse/SPARK-3866 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 1.2.0 Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0 Reporter: cocoatomo This is an umbrella issue for removing the problems encountered when running ./python/run-tests at commit a85f24accd3266e0f97ee04d03c22b593d99c062. It will have sub-tasks for the various kinds of issues.
[jira] [Commented] (SPARK-3794) Building spark core fails with specific hadoop version
[ https://issues.apache.org/jira/browse/SPARK-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159384#comment-14159384 ] cocoatomo commented on SPARK-3794: -- Thank you for the comment. Building from the root directory results in the same error. {noformat} $ mvn -Dhadoop.version=1.1.0 -DskipTests clean compile ... [ERROR] /Users//MyRepos/Scala/spark/core/src/main/scala/org/apache/spark/util/Utils.scala:720: value listFilesAndDirs is not a member of object org.apache.commons.io.FileUtils [ERROR] val files = FileUtils.listFilesAndDirs(dir, TrueFileFilter.TRUE, TrueFileFilter.TRUE) [ERROR] ^ [ERROR] one error found [INFO] [INFO] Reactor Summary: [INFO] [INFO] Spark Project Parent POM ... SUCCESS [ 2.147 s] [INFO] Spark Project Core . FAILURE [ 42.550 s] [INFO] Spark Project Bagel SKIPPED [INFO] Spark Project GraphX ... SKIPPED [INFO] Spark Project Streaming SKIPPED [INFO] Spark Project ML Library ... SKIPPED [INFO] Spark Project Tools SKIPPED [INFO] Spark Project Catalyst . SKIPPED [INFO] Spark Project SQL .. SKIPPED [INFO] Spark Project Hive . SKIPPED [INFO] Spark Project REPL . SKIPPED [INFO] Spark Project Assembly . SKIPPED [INFO] Spark Project External Twitter . SKIPPED [INFO] Spark Project External Kafka ... SKIPPED [INFO] Spark Project External Flume Sink .. SKIPPED [INFO] Spark Project External Flume ... SKIPPED [INFO] Spark Project External ZeroMQ .. SKIPPED [INFO] Spark Project External MQTT SKIPPED [INFO] Spark Project Examples . SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 45.365 s [INFO] Finished at: 2014-10-05T10:29:48+09:00 [INFO] Final Memory: 34M/1017M [INFO] [ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.0:compile (scala-compile-first) on project spark-core_2.10: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.0:compile failed. 
CompileFailed -> [Help 1] {noformat} > Building spark core fails with specific hadoop version > -- > > Key: SPARK-3794 > URL: https://issues.apache.org/jira/browse/SPARK-3794 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 1.2.0 > Environment: Mac OS X 10.9.5 >Reporter: cocoatomo > Labels: spark > > At commit cf1d32e3e1071829b152d4b597bf0a0d7a5629a2, building spark core > results in a compilation error when we specify certain hadoop versions. > To reproduce this issue, execute the following command with > <version> set to 1.1.0, 1.1.1, 1.1.2, 1.2.0, 1.2.1, or 2.2.0. > {noformat} > $ cd ./core > $ mvn -Dhadoop.version=<version> -DskipTests clean compile > ... > [ERROR] > /Users/tomohiko/MyRepos/Scala/spark/core/src/main/scala/org/apache/spark/util/Utils.scala:720: > value listFilesAndDirs is not a member of object > org.apache.commons.io.FileUtils > [ERROR] val files = FileUtils.listFilesAndDirs(dir, > TrueFileFilter.TRUE, TrueFileFilter.TRUE) > [ERROR] ^ > {noformat} > Because that compilation uses commons-io version 2.1, and the > FileUtils#listFilesAndDirs method was only added in commons-io version 2.2, this > compilation always fails. > FileUtils#listFilesAndDirs → > http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html#listFilesAndDirs%28java.io.File,%20org.apache.commons.io.filefilter.IOFileFilter,%20org.apache.commons.io.filefilter.IOFileFilter%29 > Because hadoop-client in those problematic versions depends on commons-io > 2.1, not 2.4, we should assume that commons-io is at version 2.1.
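The quoted error is mechanical: FileUtils.listFilesAndDirs only exists from commons-io 2.2 onward, while the hadoop-client versions listed above pin commons-io 2.1, so a version-independent replacement needs nothing beyond a plain directory walk. A minimal sketch of the behaviour the missing method provides (Python standing in for the Scala code in Utils.scala, purely as an illustration):

```python
import os

def list_files_and_dirs(root):
    # Rough equivalent of commons-io 2.2's FileUtils.listFilesAndDirs with
    # TrueFileFilter.TRUE for both filters: the root directory itself plus
    # every file and sub-directory below it.
    results = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            results.append(os.path.join(dirpath, name))
    return results
```

Rewriting the one call site in Utils.scala along these lines (a recursive walk instead of the 2.2-only helper) keeps the build working against whichever commons-io version hadoop-client pulls in.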
[jira] [Updated] (SPARK-3794) Building spark core fails with specific hadoop version
[ https://issues.apache.org/jira/browse/SPARK-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cocoatomo updated SPARK-3794: - Description: At commit cf1d32e3e1071829b152d4b597bf0a0d7a5629a2, building spark core results in a compilation error when we specify certain hadoop versions. To reproduce this issue, execute the following command with <version> set to 1.1.0, 1.1.1, 1.1.2, 1.2.0, 1.2.1, or 2.2.0. {noformat} $ cd ./core $ mvn -Dhadoop.version=<version> -DskipTests clean compile ... [ERROR] /Users/tomohiko/MyRepos/Scala/spark/core/src/main/scala/org/apache/spark/util/Utils.scala:720: value listFilesAndDirs is not a member of object org.apache.commons.io.FileUtils [ERROR] val files = FileUtils.listFilesAndDirs(dir, TrueFileFilter.TRUE, TrueFileFilter.TRUE) [ERROR] ^ {noformat} Because that compilation uses commons-io version 2.1, and the FileUtils#listFilesAndDirs method was only added in commons-io version 2.2, this compilation always fails. FileUtils#listFilesAndDirs → http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html#listFilesAndDirs%28java.io.File,%20org.apache.commons.io.filefilter.IOFileFilter,%20org.apache.commons.io.filefilter.IOFileFilter%29 Because hadoop-client in those problematic versions depends on commons-io 2.1, not 2.4, we should assume that commons-io is at version 2.1. was: At commit cf1d32e3e1071829b152d4b597bf0a0d7a5629a2, building spark core results in a compilation error when we specify certain hadoop versions. To reproduce this issue, execute the following command with <version> set to 1.1.0, 1.1.1, 1.1.2, 1.2.0, 1.2.1, or 2.2.0. {noformat} $ cd ./core $ mvn -Dhadoop.version=<version> -DskipTests clean compile ... 
[ERROR] /Users/tomohiko/MyRepos/Scala/spark/core/src/main/scala/org/apache/spark/util/Utils.scala:720: value listFilesAndDirs is not a member of object org.apache.commons.io.FileUtils [ERROR] val files = FileUtils.listFilesAndDirs(dir, TrueFileFilter.TRUE, TrueFileFilter.TRUE) [ERROR] ^ {noformat} Because that compilation uses commons-io version 2.1, and the FileUtils#listFilesAndDirs method was only added in commons-io version 2.2, this compilation always fails. FileUtils#listFilesAndDirs → http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html#listFilesAndDirs%28java.io.File,%20org.apache.commons.io.filefilter.IOFileFilter,%20org.apache.commons.io.filefilter.IOFileFilter%29 Because hadoop-client in those problematic versions depends on commons-io 2.1, not 2.4, we should assume that commons-io is at version 2.1. > Building spark core fails with specific hadoop version > -- > > Key: SPARK-3794 > URL: https://issues.apache.org/jira/browse/SPARK-3794 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 1.2.0 > Environment: Mac OS X 10.9.5 >Reporter: cocoatomo > Labels: spark > Fix For: 1.2.0 > > > At commit cf1d32e3e1071829b152d4b597bf0a0d7a5629a2, building spark core > results in a compilation error when we specify certain hadoop versions. > To reproduce this issue, execute the following command with > <version> set to 1.1.0, 1.1.1, 1.1.2, 1.2.0, 1.2.1, or 2.2.0. > {noformat} > $ cd ./core > $ mvn -Dhadoop.version=<version> -DskipTests clean compile > ... > [ERROR] > /Users/tomohiko/MyRepos/Scala/spark/core/src/main/scala/org/apache/spark/util/Utils.scala:720: > value listFilesAndDirs is not a member of object > org.apache.commons.io.FileUtils > [ERROR] val files = FileUtils.listFilesAndDirs(dir, > TrueFileFilter.TRUE, TrueFileFilter.TRUE) > [ERROR] ^ > {noformat} > Because that compilation uses commons-io version 2.1, and the > FileUtils#listFilesAndDirs method was only added in commons-io version 2.2, this > compilation always fails. 
> FileUtils#listFilesAndDirs → > http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html#listFilesAndDirs%28java.io.File,%20org.apache.commons.io.filefilter.IOFileFilter,%20org.apache.commons.io.filefilter.IOFileFilter%29 > Because hadoop-client in those problematic versions depends on commons-io > 2.1, not 2.4, we should assume that commons-io is at version 2.1.
[jira] [Created] (SPARK-3794) Building spark core fails with specific hadoop version
cocoatomo created SPARK-3794: Summary: Building spark core fails with specific hadoop version Key: SPARK-3794 URL: https://issues.apache.org/jira/browse/SPARK-3794 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Environment: Mac OS X 10.9.5 Reporter: cocoatomo Fix For: 1.2.0 At commit cf1d32e3e1071829b152d4b597bf0a0d7a5629a2, building spark core results in a compilation error when we specify certain hadoop versions. To reproduce this issue, execute the following command with <version> set to 1.1.0, 1.1.1, 1.1.2, 1.2.0, 1.2.1, or 2.2.0. {noformat} $ cd ./core $ mvn -Dhadoop.version=<version> -DskipTests clean compile ... [ERROR] /Users/tomohiko/MyRepos/Scala/spark/core/src/main/scala/org/apache/spark/util/Utils.scala:720: value listFilesAndDirs is not a member of object org.apache.commons.io.FileUtils [ERROR] val files = FileUtils.listFilesAndDirs(dir, TrueFileFilter.TRUE, TrueFileFilter.TRUE) [ERROR] ^ {noformat} Because that compilation uses commons-io version 2.1, and the FileUtils#listFilesAndDirs method was only added in commons-io version 2.2, this compilation always fails. FileUtils#listFilesAndDirs → http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html#listFilesAndDirs%28java.io.File,%20org.apache.commons.io.filefilter.IOFileFilter,%20org.apache.commons.io.filefilter.IOFileFilter%29 Because hadoop-client in those problematic versions depends on commons-io 2.1, not 2.4, we should assume that commons-io is at version 2.1.
[jira] [Commented] (SPARK-3773) Sphinx build warnings
[ https://issues.apache.org/jira/browse/SPARK-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157590#comment-14157590 ] cocoatomo commented on SPARK-3773: -- Using Sphinx to generate API docs for PySpark > Sphinx build warnings > - > > Key: SPARK-3773 > URL: https://issues.apache.org/jira/browse/SPARK-3773 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 1.2.0 > Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0, > Jinja2==2.7.3, MarkupSafe==0.23, Pygments==1.6, Sphinx==1.2.3, > docutils==0.12, numpy==1.9.0 >Reporter: cocoatomo >Priority: Minor > Labels: docs, docstrings, pyspark > > When building the Sphinx documents for PySpark, we get 12 warnings. > Most of them are caused by docstrings in broken reST format. > To reproduce this issue, run the following commands at commit > 6e27cb630de69fa5acb510b4e2f6b980742b1957. > {quote} > $ cd ./python/docs > $ make clean html > ... > /Users//MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of > pyspark.SparkContext.sequenceFile:4: ERROR: Unexpected indentation. > /Users//MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of > pyspark.RDD.saveAsSequenceFile:4: ERROR: Unexpected indentation. > /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring > of pyspark.mllib.classification.LogisticRegressionWithSGD.train:14: ERROR: > Unexpected indentation. > /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring > of pyspark.mllib.classification.LogisticRegressionWithSGD.train:16: WARNING: > Definition list ends without a blank line; unexpected unindent. > /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring > of pyspark.mllib.classification.LogisticRegressionWithSGD.train:17: WARNING: > Block quote ends without a blank line; unexpected unindent. 
> /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring > of pyspark.mllib.classification.SVMWithSGD.train:14: ERROR: Unexpected > indentation. > /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring > of pyspark.mllib.classification.SVMWithSGD.train:16: WARNING: Definition > list ends without a blank line; unexpected unindent. > /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring > of pyspark.mllib.classification.SVMWithSGD.train:17: WARNING: Block quote > ends without a blank line; unexpected unindent. > /Users//MyRepos/Scala/spark/python/docs/pyspark.mllib.rst:50: WARNING: > missing attribute mentioned in :members: or __all__: module > pyspark.mllib.regression, attribute > RidgeRegressionModelLinearRegressionWithSGD > /Users//MyRepos/Scala/spark/python/pyspark/mllib/tree.py:docstring of > pyspark.mllib.tree.DecisionTreeModel.predict:3: ERROR: Unexpected indentation. > ... > checking consistency... > /Users//MyRepos/Scala/spark/python/docs/modules.rst:: WARNING: document > isn't included in any toctree > ... > copying static files... WARNING: html_static_path entry > u'/Users//MyRepos/Scala/spark/python/docs/_static' does not exist > ... > build succeeded, 12 warnings. > {quote}
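Most of the "Unexpected indentation" and "ends without a blank line; unexpected unindent" messages above come from one common reST mistake: an indented block, such as a parameter description, starting directly under the summary line with no blank line between them. A minimal illustration with hypothetical function names (not the actual MLlib code):

```python
def train_broken(data, iterations=100):
    """Train a model on the given data.
    :param data: the training set, given as
        an RDD of LabeledPoint.
    """
    # The field list above starts directly under the summary line, so
    # docutils treats its indented continuation as a stray indented block
    # and reports errors like "Unexpected indentation".


def train_fixed(data, iterations=100):
    """Train a model on the given data.

    :param data: the training set, given as
        an RDD of LabeledPoint.
    """
    # A single blank line after the summary makes the field list its own
    # block, which silences the warning.
```

Fixing the listed docstrings this way (plus the stray `__all__` entry and the missing `_static` directory) should bring the build to zero warnings.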
[jira] [Updated] (SPARK-3773) Sphinx build warnings
[ https://issues.apache.org/jira/browse/SPARK-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cocoatomo updated SPARK-3773: - Description: When building the Sphinx documents for PySpark, we get 12 warnings. Most of them are caused by docstrings in broken reST format. To reproduce this issue, run the following commands at commit 6e27cb630de69fa5acb510b4e2f6b980742b1957. {quote} $ cd ./python/docs $ make clean html ... /Users//MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of pyspark.SparkContext.sequenceFile:4: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of pyspark.RDD.saveAsSequenceFile:4: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:14: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:16: WARNING: Definition list ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:17: WARNING: Block quote ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:14: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:16: WARNING: Definition list ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:17: WARNING: Block quote ends without a blank line; unexpected unindent. 
/Users//MyRepos/Scala/spark/python/docs/pyspark.mllib.rst:50: WARNING: missing attribute mentioned in :members: or __all__: module pyspark.mllib.regression, attribute RidgeRegressionModelLinearRegressionWithSGD /Users//MyRepos/Scala/spark/python/pyspark/mllib/tree.py:docstring of pyspark.mllib.tree.DecisionTreeModel.predict:3: ERROR: Unexpected indentation. ... checking consistency... /Users//MyRepos/Scala/spark/python/docs/modules.rst:: WARNING: document isn't included in any toctree ... copying static files... WARNING: html_static_path entry u'/Users//MyRepos/Scala/spark/python/docs/_static' does not exist ... build succeeded, 12 warnings. {quote} was: When building the Sphinx documents for PySpark, we get 12 warnings. Most of them are caused by docstrings in broken reST format. To reproduce this issue, run the following commands. {quote} $ cd ./python/docs $ make clean html ... /Users//MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of pyspark.SparkContext.sequenceFile:4: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of pyspark.RDD.saveAsSequenceFile:4: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:14: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:16: WARNING: Definition list ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:17: WARNING: Block quote ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:14: ERROR: Unexpected indentation. 
/Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:16: WARNING: Definition list ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:17: WARNING: Block quote ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/docs/pyspark.mllib.rst:50: WARNING: missing attribute mentioned in :members: or __all__: module pyspark.mllib.regression, attribute RidgeRegressionModelLinearRegressionWithSGD /Users//MyRepos/Scala/spark/python/pyspark/mllib/tree.py:docstring of pyspark.mllib.tree.DecisionTreeModel.predict:3: ERROR: Unexpected indentation. ... checking consistency... /Users//MyRepos/Scala/spark/python/docs/modules.rst:: WARNING: document isn't included in any toctree ... copying static files... WARNING: html_static_path entry u'/Users//MyRepos/Scala/spark/python/docs/_static' does not exist ... build succeeded, 12 warnings. {quote} > Sphinx build warning
[jira] [Commented] (SPARK-3772) RDD operation on IPython REPL failed with an illegal port number
[ https://issues.apache.org/jira/browse/SPARK-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157586#comment-14157586 ] cocoatomo commented on SPARK-3772: -- Thank you for the advice. I added the commit hash to the description. > RDD operation on IPython REPL failed with an illegal port number > > > Key: SPARK-3772 > URL: https://issues.apache.org/jira/browse/SPARK-3772 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 1.2.0 > Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0 >Reporter: cocoatomo > Labels: pyspark > > To reproduce this issue, execute the following commands at commit > 6e27cb630de69fa5acb510b4e2f6b980742b1957. > {quote} > $ PYSPARK_PYTHON=ipython ./bin/pyspark > ... > In [1]: file = sc.textFile('README.md') > In [2]: file.first() > ... > 14/10/03 08:50:13 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... using builtin-java classes where applicable > 14/10/03 08:50:13 WARN LoadSnappy: Snappy native library not loaded > 14/10/03 08:50:13 INFO FileInputFormat: Total input paths to process : 1 > 14/10/03 08:50:13 INFO SparkContext: Starting job: runJob at > PythonRDD.scala:334 > 14/10/03 08:50:13 INFO DAGScheduler: Got job 0 (runJob at > PythonRDD.scala:334) with 1 output partitions (allowLocal=true) > 14/10/03 08:50:13 INFO DAGScheduler: Final stage: Stage 0(runJob at > PythonRDD.scala:334) > 14/10/03 08:50:13 INFO DAGScheduler: Parents of final stage: List() > 14/10/03 08:50:13 INFO DAGScheduler: Missing parents: List() > 14/10/03 08:50:13 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[2] at RDD > at PythonRDD.scala:44), which has no missing parents > 14/10/03 08:50:13 INFO MemoryStore: ensureFreeSpace(4456) called with > curMem=57388, maxMem=278019440 > 14/10/03 08:50:13 INFO MemoryStore: Block broadcast_1 stored as values in > memory (estimated size 4.4 KB, free 265.1 MB) > 14/10/03 08:50:13 INFO DAGScheduler: Submitting 1 
missing tasks from Stage 0 > (PythonRDD[2] at RDD at PythonRDD.scala:44) > 14/10/03 08:50:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks > 14/10/03 08:50:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, > localhost, PROCESS_LOCAL, 1207 bytes) > 14/10/03 08:50:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0) > 14/10/03 08:50:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0) > java.lang.IllegalArgumentException: port out of range:1027423549 > at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143) > at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188) > at java.net.Socket.<init>(Socket.java:244) > at > org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75) > at > org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90) > at > org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89) > at > org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62) > at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:100) > at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:71) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) > at org.apache.spark.scheduler.Task.run(Task.scala:56) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:744) > {quote}
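The suspicious port 1027423549 is itself a clue: in hex it is 0x3D3D3D3D, i.e. the four ASCII bytes "====" read as a big-endian integer. The worker factory reads the daemon's port as a raw 4-byte int from the daemon's stdout, so any banner text that IPython (or anything else) prints to stdout first gets decoded as the "port". A sketch of that failure mode, with Python standing in for the Scala readInt call and assuming the 4-byte big-endian framing:

```python
import struct

def read_port(stdout_bytes):
    # The daemon is expected to write its listening port as a 4-byte
    # big-endian integer; whatever bytes arrive first are decoded as
    # that port, valid or not.
    (port,) = struct.unpack(">i", stdout_bytes[:4])
    return port

# Well-behaved daemon: the port round-trips.
assert read_port(struct.pack(">i", 50007)) == 50007

# Banner text on stdout: "====" decodes to the bogus port from the log.
assert read_port(b"==== banner ====") == 1027423549
```

This is why the fix direction for issues like this is to keep the daemon's stdout clean (or move chatter to stderr) rather than to widen the port parsing.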
[jira] [Updated] (SPARK-3772) RDD operation on IPython REPL failed with an illegal port number
[ https://issues.apache.org/jira/browse/SPARK-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cocoatomo updated SPARK-3772: - Description: To reproduce this issue, execute the following commands at commit 6e27cb630de69fa5acb510b4e2f6b980742b1957. {quote} $ PYSPARK_PYTHON=ipython ./bin/pyspark ... In [1]: file = sc.textFile('README.md') In [2]: file.first() ... 14/10/03 08:50:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 14/10/03 08:50:13 WARN LoadSnappy: Snappy native library not loaded 14/10/03 08:50:13 INFO FileInputFormat: Total input paths to process : 1 14/10/03 08:50:13 INFO SparkContext: Starting job: runJob at PythonRDD.scala:334 14/10/03 08:50:13 INFO DAGScheduler: Got job 0 (runJob at PythonRDD.scala:334) with 1 output partitions (allowLocal=true) 14/10/03 08:50:13 INFO DAGScheduler: Final stage: Stage 0(runJob at PythonRDD.scala:334) 14/10/03 08:50:13 INFO DAGScheduler: Parents of final stage: List() 14/10/03 08:50:13 INFO DAGScheduler: Missing parents: List() 14/10/03 08:50:13 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44), which has no missing parents 14/10/03 08:50:13 INFO MemoryStore: ensureFreeSpace(4456) called with curMem=57388, maxMem=278019440 14/10/03 08:50:13 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.4 KB, free 265.1 MB) 14/10/03 08:50:13 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44) 14/10/03 08:50:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks 14/10/03 08:50:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1207 bytes) 14/10/03 08:50:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0) 14/10/03 08:50:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0) java.lang.IllegalArgumentException: port out of range:1027423549 at 
java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143) at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188) at java.net.Socket.<init>(Socket.java:244) at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75) at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90) at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89) at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62) at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:100) at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:71) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) at org.apache.spark.scheduler.Task.run(Task.scala:56) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:744) {quote} was: To reproduce this issue, execute the following commands. {quote} $ PYSPARK_PYTHON=ipython ./bin/pyspark ... In [1]: file = sc.textFile('README.md') In [2]: file.first() ... 14/10/03 08:50:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... 
using builtin-java classes where applicable 14/10/03 08:50:13 WARN LoadSnappy: Snappy native library not loaded 14/10/03 08:50:13 INFO FileInputFormat: Total input paths to process : 1 14/10/03 08:50:13 INFO SparkContext: Starting job: runJob at PythonRDD.scala:334 14/10/03 08:50:13 INFO DAGScheduler: Got job 0 (runJob at PythonRDD.scala:334) with 1 output partitions (allowLocal=true) 14/10/03 08:50:13 INFO DAGScheduler: Final stage: Stage 0(runJob at PythonRDD.scala:334) 14/10/03 08:50:13 INFO DAGScheduler: Parents of final stage: List() 14/10/03 08:50:13 INFO DAGScheduler: Missing parents: List() 14/10/03 08:50:13 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44), which has no missing parents 14/10/03 08:50:13 INFO MemoryStore: ensureFreeSpace(4456) called with curMem=57388, maxMem=278019440 14/10/03 08:50:13 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.4 KB, free 265.1 MB) 14/10/03 08:50:13 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44) 14/10/03 08:50:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks 14/10/03 08:50:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1207 bytes) 14/10/03 08:50:13 INFO Executor: Running task 0.0
[jira] [Created] (SPARK-3773) Sphinx build warnings
cocoatomo created SPARK-3773: Summary: Sphinx build warnings Key: SPARK-3773 URL: https://issues.apache.org/jira/browse/SPARK-3773 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 1.2.0 Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0, Jinja2==2.7.3, MarkupSafe==0.23, Pygments==1.6, Sphinx==1.2.3, docutils==0.12, numpy==1.9.0 Reporter: cocoatomo Priority: Minor When building the Sphinx documents for PySpark, we get 12 warnings. Most of them are caused by docstrings in broken reST format. To reproduce this issue, run the following commands. {quote} $ cd ./python/docs $ make clean html ... /Users//MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of pyspark.SparkContext.sequenceFile:4: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/__init__.py:docstring of pyspark.RDD.saveAsSequenceFile:4: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:14: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:16: WARNING: Definition list ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:17: WARNING: Block quote ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:14: ERROR: Unexpected indentation. /Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:16: WARNING: Definition list ends without a blank line; unexpected unindent. 
/Users//MyRepos/Scala/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:17: WARNING: Block quote ends without a blank line; unexpected unindent. /Users//MyRepos/Scala/spark/python/docs/pyspark.mllib.rst:50: WARNING: missing attribute mentioned in :members: or __all__: module pyspark.mllib.regression, attribute RidgeRegressionModelLinearRegressionWithSGD /Users//MyRepos/Scala/spark/python/pyspark/mllib/tree.py:docstring of pyspark.mllib.tree.DecisionTreeModel.predict:3: ERROR: Unexpected indentation. ... checking consistency... /Users//MyRepos/Scala/spark/python/docs/modules.rst:: WARNING: document isn't included in any toctree ... copying static files... WARNING: html_static_path entry u'/Users//MyRepos/Scala/spark/python/docs/_static' does not exist ... build succeeded, 12 warnings. {quote}
[jira] [Created] (SPARK-3772) RDD operation on IPython REPL failed with an illegal port number
cocoatomo created SPARK-3772:
Summary: RDD operation on IPython REPL failed with an illegal port number
Key: SPARK-3772
URL: https://issues.apache.org/jira/browse/SPARK-3772
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.2.0
Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0
Reporter: cocoatomo

To reproduce this issue, execute the following commands.
{quote}
$ PYSPARK_PYTHON=ipython ./bin/pyspark
...
In [1]: file = sc.textFile('README.md')
In [2]: file.first()
...
14/10/03 08:50:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/10/03 08:50:13 WARN LoadSnappy: Snappy native library not loaded
14/10/03 08:50:13 INFO FileInputFormat: Total input paths to process : 1
14/10/03 08:50:13 INFO SparkContext: Starting job: runJob at PythonRDD.scala:334
14/10/03 08:50:13 INFO DAGScheduler: Got job 0 (runJob at PythonRDD.scala:334) with 1 output partitions (allowLocal=true)
14/10/03 08:50:13 INFO DAGScheduler: Final stage: Stage 0(runJob at PythonRDD.scala:334)
14/10/03 08:50:13 INFO DAGScheduler: Parents of final stage: List()
14/10/03 08:50:13 INFO DAGScheduler: Missing parents: List()
14/10/03 08:50:13 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44), which has no missing parents
14/10/03 08:50:13 INFO MemoryStore: ensureFreeSpace(4456) called with curMem=57388, maxMem=278019440
14/10/03 08:50:13 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.4 KB, free 265.1 MB)
14/10/03 08:50:13 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (PythonRDD[2] at RDD at PythonRDD.scala:44)
14/10/03 08:50:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
14/10/03 08:50:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1207 bytes)
14/10/03 08:50:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
14/10/03 08:50:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: port out of range:1027423549
	at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
	at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188)
	at java.net.Socket.<init>(Socket.java:244)
	at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75)
	at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90)
	at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
	at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
	at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:100)
	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:71)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
	at org.apache.spark.scheduler.Task.run(Task.scala:56)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:744)
{quote}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
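The bogus port value in the trace is itself a clue. `PythonWorkerFactory.createThroughDaemon` reads the daemon's port number from the daemon's stdout, so any stray text printed there first (an IPython banner, a warning line) gets decoded as the port. A minimal sketch of the failure mode, assuming the JVM side reads a 4-byte big-endian int (as `DataInputStream.readInt` does); `read_daemon_port` is a hypothetical stand-in for that read, not actual Spark code:

```python
import io
import struct

def read_daemon_port(stream):
    # Hypothetical stand-in for the JVM-side read in
    # PythonWorkerFactory.createThroughDaemon: the daemon is expected to
    # write its port to stdout as a 4-byte big-endian int.
    return struct.unpack(">i", stream.read(4))[0]

# If anything else reaches stdout before the port, those bytes are
# decoded as the "port". A run of '=' characters, for example, decodes
# to exactly the illegal value in the stack trace:
bogus = read_daemon_port(io.BytesIO(b"==== banner text ====\n"))
print(bogus)  # 1027423549, i.e. 0x3D3D3D3D, the bytes of b'===='
```

This is why the crash appears only under IPython (or with `_JAVA_OPTIONS` set, as in SPARK-3869): both add extra output ahead of the expected 4 bytes.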
[jira] [Commented] (SPARK-3706) Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset
[ https://issues.apache.org/jira/browse/SPARK-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156776#comment-14156776 ] cocoatomo commented on SPARK-3706:
--
Thank you for the comment and modification, [~joshrosen]. At a quick look, this regression was introduced by commit [f38fab97c7970168f1bd81d4dc202e36322c95e3|https://github.com/apache/spark/commit/f38fab97c7970168f1bd81d4dc202e36322c95e3#diff-5dbcb82caf8131d60c73e82cf8d12d8aR107] on the master branch. Pushing "ipython" aside into a mere default value forces us to set PYSPARK_PYTHON to "ipython" explicitly, since PYSPARK_PYTHON defaults to "python" at the top of the ./bin/pyspark script. This issue is a regression between 1.1.0 and 1.2.0 and therefore affects only 1.2.0.
> Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset
>
> Key: SPARK-3706
> URL: https://issues.apache.org/jira/browse/SPARK-3706
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0
> Reporter: cocoatomo
> Labels: pyspark
>
> h3. Problem
> The section "Using the shell" in the Spark Programming Guide
> (https://spark.apache.org/docs/latest/programming-guide.html#using-the-shell)
> says that we can run the pyspark REPL through IPython.
> But the following command does not run IPython but the default Python executable.
> {quote}
> $ IPYTHON=1 ./bin/pyspark
> Python 2.7.8 (default, Jul 2 2014, 10:14:46)
> ...
> {quote}
> The spark/bin/pyspark script at commit
> b235e013638685758885842dc3268e9800af3678 decides which executable and options
> it uses in the following way.
> # if PYSPARK_PYTHON unset
> #* → defaults to "python"
> # if IPYTHON_OPTS set
> #* → set IPYTHON to "1"
> # some Python script passed to ./bin/pyspark → run it with ./bin/spark-submit
> #* out of this issue's scope
> # if IPYTHON set to "1"
> #* → execute $PYSPARK_PYTHON (default: ipython) with arguments $IPYTHON_OPTS
> #* otherwise execute $PYSPARK_PYTHON
> Therefore, when PYSPARK_PYTHON is unset, python is executed even though IPYTHON is "1".
> In other words, when PYSPARK_PYTHON is unset, IPYTHON_OPTS and IPYTHON have no
> effect on deciding which command to use.
> ||PYSPARK_PYTHON||IPYTHON_OPTS||IPYTHON||resulting command||expected command||
> |(unset → defaults to python)|(unset)|(unset)|python|(same)|
> |(unset → defaults to python)|(unset)|1|python|ipython|
> |(unset → defaults to python)|an_option|(unset → set to 1)|python an_option|ipython an_option|
> |(unset → defaults to python)|an_option|1|python an_option|ipython an_option|
> |ipython|(unset)|(unset)|ipython|(same)|
> |ipython|(unset)|1|ipython|(same)|
> |ipython|an_option|(unset → set to 1)|ipython an_option|(same)|
> |ipython|an_option|1|ipython an_option|(same)|
> h3. Suggestion
> The pyspark script should first determine whether the user wants to run
> IPython or another executable.
> # if IPYTHON_OPTS set
> #* set IPYTHON to "1"
> # if IPYTHON has the value "1"
> #* PYSPARK_PYTHON defaults to "ipython" if not set
> # PYSPARK_PYTHON defaults to "python" if not set
> See the pull request for the detailed modification.
[jira] [Commented] (SPARK-3420) Using Sphinx to generate API docs for PySpark
[ https://issues.apache.org/jira/browse/SPARK-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151073#comment-14151073 ] cocoatomo commented on SPARK-3420:
--
Thank you for the comment. Yes, I am interested in this work. As a question for confirmation: does "improve the docs" mean only removing the ReST errors, or does it also include making the documents more complete and useful?
> Using Sphinx to generate API docs for PySpark
>
> Key: SPARK-3420
> URL: https://issues.apache.org/jira/browse/SPARK-3420
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Reporter: Davies Liu
> Assignee: Davies Liu
>
> Sphinx can generate better documents than epydoc, so let's move on to Sphinx.
[jira] [Commented] (SPARK-3420) Using Sphinx to generate API docs for PySpark
[ https://issues.apache.org/jira/browse/SPARK-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150592#comment-14150592 ] cocoatomo commented on SPARK-3420:
--
Do you mean generating the API documents with sphinx-apidoc? When I tried building the documents with sphinx-apidoc, I got some import errors and ReST format errors. I prefer Sphinx to Epydoc, so I want to fix those errors.
> Using Sphinx to generate API docs for PySpark
>
> Key: SPARK-3420
> URL: https://issues.apache.org/jira/browse/SPARK-3420
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Reporter: Davies Liu
> Assignee: Davies Liu
>
> Sphinx can generate better documents than epydoc, so let's move on to Sphinx.
[jira] [Created] (SPARK-3706) Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset
cocoatomo created SPARK-3706:
Summary: Cannot run IPython REPL with IPYTHON set to "1" and PYSPARK_PYTHON unset
Key: SPARK-3706
URL: https://issues.apache.org/jira/browse/SPARK-3706
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.1.0
Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0
Reporter: cocoatomo

h3. Problem
The section "Using the shell" in the Spark Programming Guide (https://spark.apache.org/docs/latest/programming-guide.html#using-the-shell) says that we can run the pyspark REPL through IPython.
But the following command does not run IPython but the default Python executable.
{quote}
$ IPYTHON=1 ./bin/pyspark
Python 2.7.8 (default, Jul 2 2014, 10:14:46)
...
{quote}
The spark/bin/pyspark script at commit b235e013638685758885842dc3268e9800af3678 decides which executable and options it uses in the following way.
# if PYSPARK_PYTHON unset
#* → defaults to "python"
# if IPYTHON_OPTS set
#* → set IPYTHON to "1"
# some Python script passed to ./bin/pyspark → run it with ./bin/spark-submit
#* out of this issue's scope
# if IPYTHON set to "1"
#* → execute $PYSPARK_PYTHON (default: ipython) with arguments $IPYTHON_OPTS
#* otherwise execute $PYSPARK_PYTHON
Therefore, when PYSPARK_PYTHON is unset, python is executed even though IPYTHON is "1".
In other words, when PYSPARK_PYTHON is unset, IPYTHON_OPTS and IPYTHON have no effect on deciding which command to use.
||PYSPARK_PYTHON||IPYTHON_OPTS||IPYTHON||resulting command||expected command||
|(unset → defaults to python)|(unset)|(unset)|python|(same)|
|(unset → defaults to python)|(unset)|1|python|ipython|
|(unset → defaults to python)|an_option|(unset → set to 1)|python an_option|ipython an_option|
|(unset → defaults to python)|an_option|1|python an_option|ipython an_option|
|ipython|(unset)|(unset)|ipython|(same)|
|ipython|(unset)|1|ipython|(same)|
|ipython|an_option|(unset → set to 1)|ipython an_option|(same)|
|ipython|an_option|1|ipython an_option|(same)|
h3. Suggestion
The pyspark script should first determine whether the user wants to run IPython or another executable.
# if IPYTHON_OPTS set
#* set IPYTHON to "1"
# if IPYTHON has the value "1"
#* PYSPARK_PYTHON defaults to "ipython" if not set
# PYSPARK_PYTHON defaults to "python" if not set
See the pull request for the detailed modification.
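The suggested decision order can be modeled as a small function. This is a sketch in Python rather than the actual ./bin/pyspark shell logic; the function name is illustrative:

```python
def resolve_python_command(env):
    """Model of the suggested ./bin/pyspark decision order (illustrative,
    not the real shell script): IPYTHON_OPTS implies IPYTHON=1, IPYTHON=1
    makes PYSPARK_PYTHON default to "ipython", and "python" is only the
    final fallback."""
    ipython = env.get("IPYTHON", "")
    if "IPYTHON_OPTS" in env:              # step 1: IPYTHON_OPTS set -> IPYTHON=1
        ipython = "1"
    python = env.get("PYSPARK_PYTHON")
    if python is None and ipython == "1":  # step 2: default to ipython first
        python = "ipython"
    if python is None:                     # step 3: final fallback
        python = "python"
    opts = env.get("IPYTHON_OPTS", "")
    return f"{python} {opts}".strip()

# Reproduces the "expected command" column of the table above:
print(resolve_python_command({}))                             # python
print(resolve_python_command({"IPYTHON": "1"}))               # ipython
print(resolve_python_command({"IPYTHON_OPTS": "an_option"}))  # ipython an_option
print(resolve_python_command({"PYSPARK_PYTHON": "ipython"}))  # ipython
```

The key design point is ordering: the defaulting of PYSPARK_PYTHON to "python" must happen *after* the IPython checks, whereas the buggy script applied it first, making IPYTHON and IPYTHON_OPTS dead settings whenever PYSPARK_PYTHON was unset.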