Re: Scala: Perform Unit Testing in spark

2016-04-06 Thread Shishir Anshuman
g/mod_mbox/spark-user/201603.mbox/browser > > Let me know if you have follow up questions or want assistance. > > Regards, > > > Lars Albertsson > Data engineering consultant > www.mapflat.com > +46 70 7687109 > > > On Fri, Apr 1, 2016 at 10:31 PM, Shishir Anshum

Re: Scala: Perform Unit Testing in spark

2016-04-01 Thread Shishir Anshuman
t > 1447 Thu Mar 03 09:53:54 PST 2016 > org/apache/spark/mllib/util/MLlibTestSparkContext.class > 1704 Thu Mar 03 09:53:54 PST 2016 > org/apache/spark/mllib/util/MLlibTestSparkContext$class.class > > On Fri, Apr 1, 2016 at 3:07 PM, Shishir Anshuman < > shishiranshu...@gmail.com> wrote: > >

Scala: Perform Unit Testing in spark

2016-04-01 Thread Shishir Anshuman
Hello, I have code written in Scala using MLlib, and I want to unit test it. I can't decide between JUnit 4 and ScalaTest. I am new to Spark. Please guide me on how to proceed with the testing. Thank you.
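A common answer to this question is that the test framework matters less than how the code is factored: if the transformation logic lives in pure functions, most of it can be tested with ScalaTest (or plain assertions) without ever starting a SparkContext. A minimal sketch, where `FeaturePrep` and `normalize` are hypothetical names, not anything from the original thread:

```scala
// Hypothetical example: the logic under test is a pure function,
// so it can be verified without a SparkContext.
object FeaturePrep {
  // Normalize a sequence of doubles to unit sum; an all-zero input maps to zeros.
  def normalize(xs: Seq[Double]): Seq[Double] = {
    val total = xs.sum
    if (total == 0.0) xs.map(_ => 0.0) else xs.map(_ / total)
  }
}

// In a ScalaTest suite this would look like:
//   class FeaturePrepSuite extends FunSuite {
//     test("normalize sums to 1") {
//       assert(FeaturePrep.normalize(Seq(1.0, 3.0)).sum == 1.0)
//     }
//   }
// Inside Spark, the same function is reused via rdd.map(FeaturePrep.normalize),
// so only a few integration tests need a local SparkContext.
```

The design point: tests that need a `SparkContext` (e.g. with `local[2]` as master) are slow, so keeping them to a handful of end-to-end checks while unit-testing the pure parts is the usual compromise.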

Output is being stored on the cluster (slaves).

2016-03-24 Thread Shishir Anshuman
I am using two slaves to run the ALS algorithm. I am saving the predictions in a text file using *saveAsTextFile(path)*. The predictions are getting stored on the slaves, but I want them to be saved on the master. Any suggestion on how to achieve this? I am using Standalone
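This is expected behavior: `saveAsTextFile` writes one part-file per partition on whichever node holds that partition. To get a single file on the driver (master), the usual options are to `collect()` the results (only safe if they fit in driver memory) and write with ordinary Java IO, or to `coalesce(1)` before saving. A minimal sketch, where `writeLocal` is a hypothetical helper name:

```scala
import java.nio.file.{Files, Paths}
import java.nio.charset.StandardCharsets

// Hypothetical helper: write already-collected lines to one local file on
// the driver. With Spark this would be called as:
//   writeLocal(predictions.map(_.toString).collect(), "predictions.txt")
// Caution: collect() pulls everything into driver memory.
def writeLocal(lines: Seq[String], path: String): Unit =
  Files.write(Paths.get(path), lines.mkString("\n").getBytes(StandardCharsets.UTF_8))

// Alternative that stays distributed but produces a single part-file:
//   predictions.coalesce(1).saveAsTextFile(path)
// (the file still lands on whichever node runs that one task, unless the
// path is on a shared filesystem such as HDFS or NFS).
```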

Re: Error using collectAsMap() in scala

2016-03-20 Thread Shishir Anshuman
collection of a single value type, String. >> But `collectAsMap` is only defined for PairRDDs that have key-value pairs >> for their data elements. Both a key and a value are needed to collect into >> a Map[K, V]. >> >> On Sun, Mar 20, 2016 at 8:19 PM, Shishir Anshuman < >> s

Error using collectAsMap() in scala

2016-03-19 Thread Shishir Anshuman
I am using the following code snippet in Scala: *val dict: RDD[String] = sc.textFile("path/to/csv/file")* *val dict_broadcast=sc.broadcast(dict.collectAsMap())* On compiling, it generates this error: *scala:42: value collectAsMap is not a member of org.apache.spark.rdd.RDD[String]* *val
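As the reply above notes, `collectAsMap()` is only defined for pair RDDs (`RDD[(K, V)]`, via `PairRDDFunctions`), so an `RDD[String]` must first be mapped into key-value pairs. A minimal sketch, assuming each CSV line has a `key,value` layout (the actual file format is not given in the thread); `toPair` is a hypothetical helper:

```scala
// collectAsMap() exists only on RDDs of pairs, so split each line into a
// (key, value) tuple first. In Spark this would be:
//   val dict = sc.textFile("path/to/csv/file").map(toPair)
//   val dict_broadcast = sc.broadcast(dict.collectAsMap())
def toPair(line: String): (String, String) = {
  val cols = line.split(",", 2) // assumed layout: key,value
  (cols(0), cols(1))
}
```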

How to convert Parquet file to a text file.

2016-03-15 Thread Shishir Anshuman
I need to convert the Parquet file generated by Spark to a text file (preferably CSV). I want to use the data model outside Spark. Any suggestion on how to proceed?
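In the Spark 1.x versions current at the time of this thread, one approach is to read the Parquet file into a DataFrame and render each row as a comma-separated line; the spark-csv package is the other common route. A sketch under those assumptions, with `toCsvLine` a hypothetical helper (the quoting here is deliberately naive):

```scala
// Hypothetical sketch for Spark 1.x:
//   val df = sqlContext.read.parquet("path/to/file.parquet")
//   df.rdd.map(row => toCsvLine(row.toSeq)).saveAsTextFile("path/to/out")
// Alternative via the spark-csv package:
//   df.write.format("com.databricks.spark.csv").save("path/to/out")
def toCsvLine(fields: Seq[Any]): String =
  fields.map {
    case null                         => ""                 // nulls become empty cells
    case s: String if s.contains(",") => "\"" + s + "\""    // naive quoting only
    case other                        => other.toString
  }.mkString(",")
```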

Save the model produced after training with ALS.

2016-03-13 Thread Shishir Anshuman
Hello, I am using the sample code for the ALS algorithm implementation. I want to save the model produced after training in a separate file. The 'modelPath' in model.save() stores some metadata. I am new to Apache Spark, please

Get output of the ALS algorithm.

2016-03-10 Thread Shishir Anshuman
Hello, I am new to Apache Spark and would like to get the recommendation output of the ALS algorithm in a file. Please suggest a solution. Thank you.
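The predictions returned by an ALS model are `Rating(user, product, rating)` objects, so writing them to a file typically means formatting each one as a line and calling `saveAsTextFile`. A minimal sketch; `formatRating` is a hypothetical helper, and the file path is illustrative:

```scala
// Hypothetical formatter, applied in Spark to the RDD of Rating objects:
//   model.predict(userProducts)
//        .map(r => formatRating(r.user, r.product, r.rating))
//        .saveAsTextFile("path/to/recommendations")
def formatRating(user: Int, product: Int, rating: Double): String =
  s"$user,$product,$rating"
```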

how to implement ALS with csv file? getting error while calling Rating class

2016-03-06 Thread Shishir Anshuman
I am new to Apache Spark, and I want to implement the Alternating Least Squares algorithm. The data set is stored in a CSV file in the format *Name,Value1,Value2*. When I read the CSV file, I get a *java.lang.NumberFormatException.forInputString* error because the Rating class needs the parameters
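MLlib's `Rating(user: Int, product: Int, rating: Double)` requires integer IDs, so parsing the `Name` string as a number throws `NumberFormatException`. The usual fix is to assign each distinct name an integer index first (e.g. with `zipWithIndex`) and translate names through that dictionary. A sketch assuming `Value1` is the product ID and `Value2` the rating (the thread does not say); `parseLine` and `nameToId` are hypothetical names:

```scala
// Hypothetical sketch. In Spark the name -> Int index is typically built with:
//   val nameToId = lines.map(_.split(",")(0)).distinct()
//                       .zipWithIndex().mapValues(_.toInt).collectAsMap()
// Then each "Name,Value1,Value2" line becomes the numeric fields for a Rating.
def parseLine(line: String, nameToId: Map[String, Int]): (Int, Int, Double) = {
  val Array(name, v1, v2) = line.split(",")
  (nameToId(name), v1.toInt, v2.toDouble) // feed these into Rating(...)
}
```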