Spark GroupBy Save to different files

2017-09-01 Thread asethia
Hi, I have a list of person records in the following format: case class Person(fName:String, city:String) val l=List(Person("A","City1"),Person("B","City2"),Person("C","City1")) val rdd:RDD[Person]=sc.parallelize(l) val groupBy:RDD[(String, Iterable[Person])]=rdd.groupBy(_.city) I would like to
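The message is cut off, but the subject line suggests the goal is one output location per city. A minimal sketch of one common way to get there, assuming Spark 2.x: it swaps the RDD groupBy for DataFrameWriter.partitionBy, which writes one subdirectory per distinct city value. The object name SaveByCity, the JSON format, and the path /tmp/people-by-city are illustrative assumptions, not taken from the thread.

```scala
import org.apache.spark.sql.SparkSession

object SaveByCity {
  // Same record shape as in the message above.
  case class Person(fName: String, city: String)

  // Hypothetical output location, chosen only for this sketch.
  val outputPath = "/tmp/people-by-city"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("save-by-city")
      .master("local[*]") // local mode, for illustration only
      .getOrCreate()
    import spark.implicits._

    val l = List(Person("A", "City1"), Person("B", "City2"), Person("C", "City1"))
    val rdd = spark.sparkContext.parallelize(l)

    // partitionBy("city") creates one subdirectory per city value,
    // e.g. city=City1/ and city=City2/, so no explicit groupBy is needed.
    rdd.toDF()
      .write
      .mode("overwrite") // replace output from a previous run
      .partitionBy("city")
      .json(outputPath)

    spark.stop()
  }
}
```

With this layout the city column is encoded in the directory name rather than in the files themselves, and Spark restores it as a column when the parent directory is read back.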

Market Basket Analysis by deploying FP Growth algorithm

2017-04-05 Thread asethia
Hi, We are currently working on Market Basket Analysis, deploying the FP-Growth algorithm on Spark to generate association rules for product recommendation. We are running it on close to 24 million invoices over an assortment of more than 100k products. However, whenever we relax the support

GenericRowWithSchema to case class

2016-04-03 Thread asethia
Hi, My Cassandra table has a custom user-defined type, for example: CREATE TYPE address ( addressline1 text, addressline2 text, city text, state text, country text, pincode text ) create table person ( id text, name text, addresses set<frozen<address>>, PRIMARY KEY (id)); val

transformation - spark vs cassandra

2016-03-31 Thread asethia
Hi, I am working with Cassandra and Spark and would like to know which performs better: filtering in Cassandra on the primary key and clustering key, or using Spark DataFrame transformations/filters. For example, in Spark: val rdd = sqlContext.read.format("org.apache.spark.sql.cassandra")

Re: DataFrame vs RDD

2016-03-22 Thread asethia
Creating an RDD is done via the SparkContext, whereas creating a DataFrame is done via the SQLContext... so the DataFrame is part of Spark SQL, whereas the RDD is part of Spark Core -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/DataFrame-vs-RDD-tp26570p26573.html Sent from the Apache
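The distinction drawn in this reply can be sketched as follows, assuming the Spark 1.x API that was current when the thread was written. The Person case class, the sample data, and the object name RddVsDataFrame are illustrative assumptions and do not come from the message.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}

object RddVsDataFrame {
  // Illustrative record type, not part of the original message.
  case class Person(fName: String, city: String)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("rdd-vs-dataframe").setMaster("local[*]")
    val sc = new SparkContext(conf)      // Spark Core entry point
    val sqlContext = new SQLContext(sc)  // Spark SQL entry point, built on the SparkContext
    import sqlContext.implicits._

    // An RDD is created from the SparkContext.
    val rdd: RDD[Person] = sc.parallelize(Seq(Person("A", "City1"), Person("B", "City2")))

    // A DataFrame is created from the SQLContext, here from the existing RDD.
    val df: DataFrame = sqlContext.createDataFrame(rdd)

    // The same filter written against each API.
    rdd.filter(_.city == "City1").collect().foreach(println)
    df.filter($"city" === "City1").show()

    sc.stop()
  }
}
```

In Spark 2.x and later, SparkSession serves as the single entry point and exposes the underlying SparkContext, so the two contexts no longer need to be created separately.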

DataFrame vs RDD

2016-03-22 Thread asethia
Hi, I am new to Spark and would like to know of any guidelines on when to use a DataFrame vs. an RDD. Thanks, As -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/DataFrame-vs-RDD-tp26570.html Sent from the Apache Spark User List mailing list archive at Nabble.com.