[jira] [Updated] (SPARK-3772) RDD operation on IPython REPL failed with an illegal port number

cocoatomo (JIRA) Thu, 02 Oct 2014 19:01:45 -0700

     [ 
https://issues.apache.org/jira/browse/SPARK-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


cocoatomo updated SPARK-3772:
-----------------------------
    Description: 
To reproduce this issue, we should execute following commands on the commit: 
6e27cb630de69fa5acb510b4e2f6b980742b1957.

{quote}
$ PYSPARK_PYTHON=ipython ./bin/pyspark
...
In [1]: file = sc.textFile('README.md')
In [2]: file.first()
...
14/10/03 08:50:13 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
14/10/03 08:50:13 WARN LoadSnappy: Snappy native library not loaded
14/10/03 08:50:13 INFO FileInputFormat: Total input paths to process : 1
14/10/03 08:50:13 INFO SparkContext: Starting job: runJob at PythonRDD.scala:334
14/10/03 08:50:13 INFO DAGScheduler: Got job 0 (runJob at PythonRDD.scala:334) 
with 1 output partitions (allowLocal=true)
14/10/03 08:50:13 INFO DAGScheduler: Final stage: Stage 0(runJob at 
PythonRDD.scala:334)
14/10/03 08:50:13 INFO DAGScheduler: Parents of final stage: List()
14/10/03 08:50:13 INFO DAGScheduler: Missing parents: List()
14/10/03 08:50:13 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[2] at RDD at 
PythonRDD.scala:44), which has no missing parents
14/10/03 08:50:13 INFO MemoryStore: ensureFreeSpace(4456) called with 
curMem=57388, maxMem=278019440
14/10/03 08:50:13 INFO MemoryStore: Block broadcast_1 stored as values in 
memory (estimated size 4.4 KB, free 265.1 MB)
14/10/03 08:50:13 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 
(PythonRDD[2] at RDD at PythonRDD.scala:44)
14/10/03 08:50:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
14/10/03 08:50:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 
localhost, PROCESS_LOCAL, 1207 bytes)
14/10/03 08:50:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
14/10/03 08:50:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: port out of range:1027423549
        at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
        at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188)
        at java.net.Socket.<init>(Socket.java:244)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75)
        at 
org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:100)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:71)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:744)
{quote}

  was:
To reproduce this issue, we should execute following commands.

{quote}
$ PYSPARK_PYTHON=ipython ./bin/pyspark
...
In [1]: file = sc.textFile('README.md')
In [2]: file.first()
...
14/10/03 08:50:13 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
14/10/03 08:50:13 WARN LoadSnappy: Snappy native library not loaded
14/10/03 08:50:13 INFO FileInputFormat: Total input paths to process : 1
14/10/03 08:50:13 INFO SparkContext: Starting job: runJob at PythonRDD.scala:334
14/10/03 08:50:13 INFO DAGScheduler: Got job 0 (runJob at PythonRDD.scala:334) 
with 1 output partitions (allowLocal=true)
14/10/03 08:50:13 INFO DAGScheduler: Final stage: Stage 0(runJob at 
PythonRDD.scala:334)
14/10/03 08:50:13 INFO DAGScheduler: Parents of final stage: List()
14/10/03 08:50:13 INFO DAGScheduler: Missing parents: List()
14/10/03 08:50:13 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[2] at RDD at 
PythonRDD.scala:44), which has no missing parents
14/10/03 08:50:13 INFO MemoryStore: ensureFreeSpace(4456) called with 
curMem=57388, maxMem=278019440
14/10/03 08:50:13 INFO MemoryStore: Block broadcast_1 stored as values in 
memory (estimated size 4.4 KB, free 265.1 MB)
14/10/03 08:50:13 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 
(PythonRDD[2] at RDD at PythonRDD.scala:44)
14/10/03 08:50:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
14/10/03 08:50:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 
localhost, PROCESS_LOCAL, 1207 bytes)
14/10/03 08:50:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
14/10/03 08:50:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: port out of range:1027423549
        at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
        at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188)
        at java.net.Socket.<init>(Socket.java:244)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75)
        at 
org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90)
        at 
org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
        at 
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
        at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:100)
        at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:71)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:744)
{quote}


> RDD operation on IPython REPL failed with an illegal port number
> ----------------------------------------------------------------
>
>                 Key: SPARK-3772
>                 URL: https://issues.apache.org/jira/browse/SPARK-3772
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.2.0
>         Environment: Mac OS X 10.9.5, Python 2.7.8, IPython 2.2.0
>            Reporter: cocoatomo
>              Labels: pyspark
>
> To reproduce this issue, we should execute following commands on the commit: 
> 6e27cb630de69fa5acb510b4e2f6b980742b1957.
> {quote}
> $ PYSPARK_PYTHON=ipython ./bin/pyspark
> ...
> In [1]: file = sc.textFile('README.md')
> In [2]: file.first()
> ...
> 14/10/03 08:50:13 WARN NativeCodeLoader: Unable to load native-hadoop library 
> for your platform... using builtin-java classes where applicable
> 14/10/03 08:50:13 WARN LoadSnappy: Snappy native library not loaded
> 14/10/03 08:50:13 INFO FileInputFormat: Total input paths to process : 1
> 14/10/03 08:50:13 INFO SparkContext: Starting job: runJob at 
> PythonRDD.scala:334
> 14/10/03 08:50:13 INFO DAGScheduler: Got job 0 (runJob at 
> PythonRDD.scala:334) with 1 output partitions (allowLocal=true)
> 14/10/03 08:50:13 INFO DAGScheduler: Final stage: Stage 0(runJob at 
> PythonRDD.scala:334)
> 14/10/03 08:50:13 INFO DAGScheduler: Parents of final stage: List()
> 14/10/03 08:50:13 INFO DAGScheduler: Missing parents: List()
> 14/10/03 08:50:13 INFO DAGScheduler: Submitting Stage 0 (PythonRDD[2] at RDD 
> at PythonRDD.scala:44), which has no missing parents
> 14/10/03 08:50:13 INFO MemoryStore: ensureFreeSpace(4456) called with 
> curMem=57388, maxMem=278019440
> 14/10/03 08:50:13 INFO MemoryStore: Block broadcast_1 stored as values in 
> memory (estimated size 4.4 KB, free 265.1 MB)
> 14/10/03 08:50:13 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 
> (PythonRDD[2] at RDD at PythonRDD.scala:44)
> 14/10/03 08:50:13 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
> 14/10/03 08:50:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 
> localhost, PROCESS_LOCAL, 1207 bytes)
> 14/10/03 08:50:13 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
> 14/10/03 08:50:14 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.lang.IllegalArgumentException: port out of range:1027423549
>       at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
>       at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188)
>       at java.net.Socket.<init>(Socket.java:244)
>       at 
> org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75)
>       at 
> org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90)
>       at 
> org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
>       at 
> org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
>       at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:100)
>       at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:71)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>       at org.apache.spark.scheduler.Task.run(Task.scala:56)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:744)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Updated] (SPARK-3772) RDD operation on IPython REPL failed with an illegal port number

Reply via email to