Re: Can you specify partitions?

2015-06-05 Thread amghost
Maybe you could try implementing your own Partitioner. As I remember, Spark uses HashPartitioner by default. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-you-specify-partitions-tp23156p23187.html Sent from the Apache Spark User List mailing list
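For illustration, here is a minimal sketch of what a custom Partitioner might look like. It is not taken from the thread: the class name FirstLetterPartitioner and the "route by first character" rule are hypothetical, chosen only to show the two members that org.apache.spark.Partitioner requires (numPartitions and getPartition).

```scala
import org.apache.spark.Partitioner

// Hypothetical example: String keys are routed by their first character;
// any other key falls back to a non-negative hash modulo.
class FirstLetterPartitioner(partitions: Int) extends Partitioner {
  require(partitions > 0, "partitions must be positive")

  override def numPartitions: Int = partitions

  override def getPartition(key: Any): Int = key match {
    case s: String if s.nonEmpty =>
      // Char codes are never negative, so the modulo is already in range.
      s.charAt(0).toLower.toInt % partitions
    case null => 0
    case other =>
      // hashCode may be negative; shift the remainder into [0, partitions).
      val mod = other.hashCode % partitions
      if (mod < 0) mod + partitions else mod
  }

  // Spark compares partitioners to decide whether a shuffle can be skipped,
  // so equal configurations should compare equal.
  override def equals(other: Any): Boolean = other match {
    case p: FirstLetterPartitioner => p.numPartitions == numPartitions
    case _ => false
  }

  override def hashCode: Int = partitions
}
```

Assuming pairs is an RDD of (String, value) tuples, it could then be applied with pairs.partitionBy(new FirstLetterPartitioner(4)). On older Spark 1.x releases the pair-RDD methods may also need import org.apache.spark.SparkContext._ to be in scope.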

Application is always in process when I check out logs of completed application

2015-06-02 Thread amghost
I run a Spark application on a Spark standalone cluster in client deploy mode. I want to check the logs of my finished application, but I always get a page saying "Application history not found - Application xxx is still in process". I am pretty sure that the application has indeed completed

How can I retrieve item-pair after calculating similarity using RowMatrix

2015-04-25 Thread amghost
I have encountered the all-pairs similarity problem in my recommendation system. Thanks to this Databricks blog post, it seems RowMatrix may help. However, RowMatrix is a matrix type without meaningful row indices, so I don't know how to retrieve the similarity result after invoking

Re: StackOverflow Error when run ALS with 100 iterations

2015-04-22 Thread amghost
Hi, would you please explain how to checkpoint the training set RDD, since everything is done inside the ALS.train method? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/StackOverflow-Error-when-run-ALS-with-100-iterations-tp4296p22619.html Sent from the Apache Spark User

Should Spark SQL support retrieve column value from Row by column name?

2015-03-22 Thread amghost
I would like to retrieve a column value from a Spark SQL query result, but currently it seems that Spark SQL only supports retrieving by index:

    val results = sqlContext.sql("SELECT name FROM people")
    results.map(t => "Name: " + t(0)).collect().foreach(println)

I think it will be much more convenient if