more memory during shuffle operations (like groupBy()), which will improve performance.
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/spark-is-running-extremely-slow-with-larger-data-set-like-2G-tp17152p17231.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
What is the difference between them?
*spark.python.worker.memory
spark.executor.memory
spark.driver.memory*
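For reference, all three can be passed when submitting a PySpark job. Briefly: spark.driver.memory and spark.executor.memory size the JVM heaps of the driver and the executors, while spark.python.worker.memory caps how much memory each Python worker uses during aggregation before spilling to disk. The values and the script name below are illustrative only, not recommendations:

```shell
# Illustrative sketch; tune the sizes for your own cluster.
spark-submit \
  --driver-memory 2g \
  --executor-memory 4g \
  --conf spark.python.worker.memory=1g \
  my_job.py
```

Note that in client mode spark.driver.memory must be set on the command line (or in spark-defaults.conf), since the driver JVM has already started by the time a SparkConf set in code is read.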
"\t" + b)
#print(records.count())
#records = records.sortByKey()
records = records.map(lambda line: line[0] + "\t" + line[1])
records.saveAsTextFile("file:///home/xzhang/data/result")
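The snippet above is truncated, but the per-record logic it applies can be pulled out as plain Python. This is a sketch of what the two lambdas appear to do (the function names and sample values are assumptions): the reduce-side lambda concatenates two values with a tab, and the final map flattens a (key, value) pair into one tab-separated output line.

```python
def combine(a, b):
    # Reduce function from the fragment: join two values with a tab.
    return a + "\t" + b

def to_line(record):
    # Final map from the fragment: record is a (key, value) pair,
    # emitted as a single tab-separated line.
    return record[0] + "\t" + record[1]

# Hypothetical sample: key "k" with two values "v1" and "v2".
print(to_line(("k", combine("v1", "v2"))))  # prints "k\tv1\tv2"
```

Note that to_line indexes the record with [0] and [1], so it only behaves as intended when each record really is a (key, value) pair; applied to a plain string it would silently take the first two characters instead.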
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/spa
The task won't finish even after several hours.
I tried reading the input from both NFS and HDFS.
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n17152/48.png>
What could be the problem?