Greetings.
I have been following some of the online tutorials for Spark k-means
clustering. I would like to dump all the cluster assignments and their
centroids to a text file so I can explore the data. I have the clusters
as follows:

val clusters = KMeans.train(parsedData, numClusters, numIterations)
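Concretely, this is roughly what I'm after (a minimal sketch; the output
path is a placeholder, and parsedData is assumed to be the RDD[Vector]
the model was trained on):

// Centroids live on the driver as a small Array[Vector]; print them directly.
clusters.clusterCenters.zipWithIndex.foreach { case (center, id) =>
  println(s"centroid $id: $center")
}

// Pair each point with its assigned cluster id and dump everything to text.
parsedData
  .map(point => s"${clusters.predict(point)}\t$point")
  .saveAsTextFile("/tmp/kmeans-assignments")  // placeholder path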
I'm trying to debug query results inside spark-shell, but I'm finding it
cumbersome to save them to a file and then use file system utils to explore
the results: .foreach(print) tends to interleave the results among the
myriad log messages, and the shell truncates what take() and collect()
return. Is there a simple way to present the full results cleanly in the
shell?
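The closest I've come is quieting the logging and printing a collected
result as a single block; a minimal sketch, assuming the result RDD is
small enough to collect and is named results:

import org.apache.log4j.{Level, Logger}

// Silence the chatty INFO logging so printed rows aren't interleaved with it.
Logger.getLogger("org").setLevel(Level.WARN)
Logger.getLogger("akka").setLevel(Level.WARN)

// collect() pulls everything to the driver; printing one joined string keeps
// the rows together instead of scattering them among log lines.
println(results.collect().mkString("\n"))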
We have some data on Hadoop that needs to be augmented with data only
available to us via a REST service. We're using Spark to search for, and
correct, missing data. Even though there are a lot of records to scour for
missing data, the total number of calls to the service is expected to be
low, so the load we put on the service shouldn't be a concern. What is a
reasonable way to call out to the REST service from within a Spark job?
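Something like the following is what I have in mind; a rough sketch in
which Record, records (an RDD[Record]), the lookup URL, and the
fetchFromRest helper are all made-up stand-ins:

import scala.io.Source

case class Record(id: String, value: Option[String])

// Hypothetical helper: fetch the missing value for one id from the service.
def fetchFromRest(id: String): String =
  Source.fromURL("http://example.com/lookup/" + id).mkString.trim

// mapPartitions keeps any per-partition setup cheap, and only records that
// are actually missing a value trigger a call to the service.
val repaired = records.mapPartitions { iter =>
  iter.map {
    case Record(id, None) => Record(id, Some(fetchFromRest(id)))
    case rec              => rec
  }
}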
How does one consume parameters passed to a Scala script via spark-shell
-i?
1. If I use an object with a main() method, the println outputs nothing,
as if main were never called:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object Test {
  def main(args: Array[String]) {
    println("args: " + args.mkString(", "))
  }
}
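The only workaround I've found so far is to invoke main explicitly at the
end of the script, since -i appears to just load the file into the REPL
without calling anything; the MY_ARG environment variable below is a
made-up name:

// Launched as:  MY_ARG=hello spark-shell -i script.scala
// Nothing invokes main for us, so call it ourselves and read the
// "argument" from an environment variable.
Test.main(Array(sys.env.getOrElse("MY_ARG", "default")))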
I'm following an online tutorial written in Python and trying to convert a
Spark SQL table object to an RDD in Scala.
The Spark SQL code just loads a simple table from a CSV file, and the
tutorial says to convert the table to an RDD.
The Python is:

products_rdd = sqlContext.table("products").map(lambda row: ...)
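In Scala I'm guessing the equivalent is roughly the following; a sketch
assuming Spark 1.3+, where table() returns a DataFrame, and assuming the
table is registered as "products" with the column types I pattern-match on
(id/name/price are placeholders):

import org.apache.spark.sql.Row

// .rdd drops from the DataFrame down to a plain RDD[Row]; matching on Row
// extracts typed columns.
val productsRdd = sqlContext.table("products").rdd.map {
  case Row(id: Int, name: String, price: Double) => (id, name, price)
}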