Maybe you could try implementing your own Partitioner. As I remember,
Spark uses HashPartitioner by default.
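
For what it's worth, here is a minimal sketch of what a custom partitioner could look like. The class name FirstCharPartitioner and its routing rule (first character of a String key) are made up for illustration; only org.apache.spark.Partitioner and its two members, numPartitions and getPartition, come from Spark itself.

```scala
import org.apache.spark.Partitioner

// Hypothetical example: route String keys by their first character and
// fall back to a non-negative hash modulo for any other key type.
class FirstCharPartitioner(override val numPartitions: Int) extends Partitioner {
  require(numPartitions > 0, "numPartitions must be positive")

  override def getPartition(key: Any): Int = key match {
    case s: String if s.nonEmpty => s.charAt(0).toInt % numPartitions
    case null                    => 0
    case other =>
      // hashCode may be negative, so shift the remainder into [0, numPartitions)
      val mod = other.hashCode % numPartitions
      if (mod < 0) mod + numPartitions else mod
  }

  // Equality lets Spark recognise that two RDDs are partitioned the same way.
  override def equals(other: Any): Boolean = other match {
    case p: FirstCharPartitioner => p.numPartitions == numPartitions
    case _                       => false
  }

  override def hashCode: Int = numPartitions
}
```

Assuming `pairs` is an RDD of key-value pairs, it would be applied with `pairs.partitionBy(new FirstCharPartitioner(4))`.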
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Can-you-specify-partitions-tp23156p23187.html
Sent from the Apache Spark User List mailing list
I run a Spark application on a Spark standalone cluster in client deploy mode.
I want to check the logs of my finished application, but I always get a
page telling me "Application history not found - Application xxx is still in
process."
I am pretty sure that the application has indeed completed.
I have encountered the all-pairs similarity problem in my recommendation
system. Thanks to this Databricks blog, it seems RowMatrix may help.
However, RowMatrix is a matrix type without meaningful row indices, so
I don't know how to retrieve the similarity result after invoking
Hi, would you please explain how to checkpoint the training set RDD, since
everything is done inside the ALS.train method?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/StackOverflow-Error-when-run-ALS-with-100-iterations-tp4296p22619.html
Sent from the Apache Spark User
I would like to retrieve a column value from a Spark SQL query result by name,
but currently it seems that Spark SQL only supports retrieving by index:
val results = sqlContext.sql("SELECT name FROM people")
results.map(t => "Name: " + t(0)).collect().foreach(println)
I think it will be much more convenient if