Re: Latest Release of Receiver based Kafka Consumer for Spark Streaming.

2016-08-25 Thread mdkhajaasmath
Hi Dibyendu, It looks like it is available in 2.0, but we are using an older version of Spark, 1.5. Could you please let me know how to use this with older versions? Thanks, Asmath. Sent from my iPhone > On Aug 25, 2016, at 6:33 AM, Dibyendu Bhattacharya wrote:

suggestion needed on FileInput Path- Spark Streaming

2016-08-10 Thread mdkhajaasmath
What is the best practice for processing files from an S3 bucket in Spark file streaming? Files keep arriving in the S3 path and I have to process them in batches, but other files might arrive while a batch is being processed. In this streaming job, do I have to move files after the end of our streaming batch

Re: [ASK]:Dataframe number of column limit in Saprk 1.5.2

2016-04-12 Thread mdkhajaasmath
I am also looking for the same information. In my case I need to create 190 columns. Sent from my iPhone > On Apr 12, 2016, at 9:49 PM, Divya Gehlot wrote: > > Hi, > I would like to know: does the Spark DataFrame API have a limit on the number of columns that can be created?

Data frames or Spark sql for partitioned tables

2016-04-06 Thread mdkhajaasmath
Hi, I am new to Spark and am trying to implement the solution without using Hive. We are migrating to a new environment where Hive is not present; instead I need to use Spark to output files. I looked at case classes, and the maximum number of columns I can use is 22, but I have 180 columns. In this scenario

Fwd: Question on how to access tuple values in spark

2016-02-06 Thread mdkhajaasmath
> Hi, > > My requirement is to find the max value of revenue per customer, so I am using the query below. I got this solution from a tutorial found on Google but am not able to understand how it returns the max in this scenario. Can anyone help? > > revenuePerDayPerCustomerMap.reduceByKey((x, y) => (if(x._2 >=
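
The query in the preview above is cut off, so its exact body is unknown. As a rough sketch of the general idea only: a pairwise "keep the larger" function handed to reduceByKey ends up returning the maximum per key, because the function is applied repeatedly to pairs of values that share a key until one value remains. The example below uses plain Scala collections as a stand-in for an RDD so that it runs without Spark. The record layout (customer id mapped to a (date, revenue) pair) and the names `MaxRevenueSketch`, `keepLarger`, and `reduceByKeyLocal` are assumptions for illustration, not taken from the original message.

```scala
// Plain Scala collections stand in for an RDD, so no Spark dependency is needed.
// The record layout below is an assumption; the original query is truncated.
object MaxRevenueSketch {
  type Record = (String, Double) // (date, revenue); hypothetical layout

  // Pairwise combiner: keep whichever record has the larger revenue (the _2 field).
  def keepLarger(x: Record, y: Record): Record = if (x._2 >= y._2) x else y

  // Local stand-in for reduceByKey: group rows by key, then combine each
  // group's values two at a time with keepLarger until one value is left.
  def reduceByKeyLocal(rows: Seq[(String, Record)]): Map[String, Record] =
    rows.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).reduce(keepLarger) }

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      ("c1", ("2016-02-01", 10.0)),
      ("c1", ("2016-02-02", 25.0)),
      ("c2", ("2016-02-01", 7.5))
    )
    // Each customer ends up with the single record carrying its highest revenue.
    reduceByKeyLocal(rows).foreach(println)
  }
}
```

With the sample rows, customer c1 keeps the 25.0 record and c2 keeps its only record. Since taking the larger of two values is associative and commutative, the order in which pairs are combined does not change the result, which is the property reduceByKey relies on when it merges values across partitions.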

Re: Question on how to access tuple values in spark

2016-02-06 Thread mdkhajaasmath
Sent from my iPhone > On Feb 6, 2016, at 4:41 PM, KhajaAsmath Mohammed wrote: > > Hi, > > My requirement is to find the max value of revenue per customer, so I am using the query below. I got this solution from a tutorial found on Google but am not able to understand how it