Hi all, I am trying to compare the Scala and Python APIs on my local machine. I tried to import a local matrix (1000 by 10, created in R) stored in a text file via textFile in pyspark. When I run data.first(), it fails to return the line and gives error messages, including the following:
I then changed nothing except the number of rows, reducing it to 500, and imported the file again; this time data.first() ran correctly. I also tried the same steps in Scala using spark-shell, which ran correctly for both cases and for larger matrices. Could somebody help me with this problem? I couldn't find an answer on the internet. It looks like pyspark has a problem with even this simplest step?

Best,
Congrui Yi
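For anyone who wants to try reproducing this, here is a minimal sketch of the steps described above. It makes several assumptions that are not in the original report: the matrix is generated in Python rather than R, rows are space-separated, and the helper names (write_matrix, first_line, run_repro), file names, and app name are made up for illustration. Only SparkContext, textFile, and first() are actual pyspark calls.

```python
import os
import random
import tempfile


def write_matrix(path, n_rows, n_cols=10, seed=0):
    """Write an n_rows x n_cols matrix of floats, one space-separated row per line.

    Stand-in for the R-generated file; the real file's format is an assumption.
    """
    rng = random.Random(seed)
    with open(path, "w") as f:
        for _ in range(n_rows):
            row = " ".join("%.6f" % rng.gauss(0, 1) for _ in range(n_cols))
            f.write(row + "\n")


def first_line(sc, path):
    """Load a text file as an RDD of lines and return the first line."""
    return sc.textFile(path).first()


def run_repro(sizes=(500, 1000)):
    """Write one file per row count and call first() on each with a local Spark."""
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext

    tmp_dir = tempfile.mkdtemp()
    sc = SparkContext("local", "first-repro")  # hypothetical app name
    try:
        for n in sizes:
            path = os.path.join(tmp_dir, "matrix_%d.txt" % n)
            write_matrix(path, n)
            print(n, first_line(sc, path))
    finally:
        sc.stop()
```

Calling run_repro() in a plain Python session with pyspark on the path should show whether first() behaves differently for the 500-row and 1000-row files on a given setup; it is not guaranteed to trigger the same failure.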