Diana Carroll created SPARK-8795:
------------------------------------

             Summary: pySpark wholeTextFiles error when mapping string
                 Key: SPARK-8795
                 URL: https://issues.apache.org/jira/browse/SPARK-8795
             Project: Spark
          Issue Type: Bug
          Components: PySpark
         Environment: CentOS 6.6, Python 2.7, CDH 5.4.1
            Reporter: Diana Carroll


I created a test directory with two tiny text files.

This call works:
{code}sc.wholeTextFiles("testdata").map(lambda (fname,x): len(x)).collect(){code}

This call does not:
{code}sc.wholeTextFiles("testdata").map(lambda (fname,x): x.islower()).collect(){code}

In fact, any attempt to call a string method on x, or to pass x to a function 
that requires a string, fails the same way.
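
For comparison, here is a rough plain-Python sketch of what the two calls would be expected to return, with no Spark involved. The helper whole_text_files, the file names, and the file contents are all hypothetical stand-ins for the "testdata" directory; the helper only imitates the (filename, content) pairs that wholeTextFiles produces.

```python
import io
import os
import tempfile


def whole_text_files(path):
    """Hypothetical local stand-in: return (filename, content) pairs, one per file."""
    pairs = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        with io.open(full, encoding="utf-8") as f:
            pairs.append((full, f.read()))
    return pairs


# Build a throwaway directory with two tiny text files (assumed contents).
testdata = tempfile.mkdtemp()
for name, text in [("a.txt", u"hello\n"), ("b.txt", u"World\n")]:
    with io.open(os.path.join(testdata, name), "w", encoding="utf-8") as f:
        f.write(text)

pairs = whole_text_files(testdata)

# Same two operations as in the report, applied to the content of each pair.
lengths = [len(x) for _, x in pairs]
lower_flags = [x.islower() for _, x in pairs]

print(lengths)      # [6, 6]
print(lower_flags)  # [True, False]
```

If the content really is an ordinary string, both operations succeed locally, so a list of booleans is the result one would expect from the second call.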

The main error is:
{code}  File "/usr/lib/spark/python/pyspark/worker.py", line 101, in main
    process()
  File "/usr/lib/spark/python/pyspark/worker.py", line 96, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/usr/lib/spark/python/pyspark/serializers.py", line 236, in dump_stream
    vs = list(itertools.islice(iterator, batch))
  File "<ipython-input-107-5192d18d0e4c>", line 1, in <lambda>
TypeError: 'bool' object is not callable
{code}
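
As a reading aid for the message itself: Python raises this TypeError whenever the object being called turns out to be a bool rather than a function or method. The sketch below only reproduces the message in isolation; the class Shadowed is hypothetical, and nothing here is meant to identify the actual cause in this report.

```python
def call_islower(x):
    # Same expression as in the failing lambda.
    return x.islower()


class Shadowed(object):
    # Hypothetical stand-in: 'islower' is a plain bool attribute, not a method.
    islower = True


try:
    call_islower(Shadowed())
    message = None
except TypeError as exc:
    message = str(exc)

print(message)  # 'bool' object is not callable
```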

Will attach full log.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
