Spark Streaming

2015-01-17 Thread Rohit Pujari
) }
val unifiedStream = ssc.union(streams)
val sparkProcessingParallelism = 1
unifiedStream.repartition(sparkProcessingParallelism)
}
//print(kafkaStream)
ssc.start()
ssc.awaitTermination()
-- Rohit Pujari -- CONFIDENTIALITY NOTICE NOTICE: This message

Re: Spark Streaming

2015-01-17 Thread Rohit Pujari
: Saturday, January 17, 2015 at 4:10 AM To: Rohit Pujari rpuj...@hortonworks.com Subject: Re: Spark Streaming Streams are lazy. Their computation is triggered by an output operator, which is apparently missing from your code. See the programming guide: https

Re: Spark Streaming

2015-01-17 Thread Rohit Pujari
operation on the stream. On Sat, Jan 17, 2015 at 10:17 AM, Rohit Pujari rpuj...@hortonworks.com wrote: Hi Francois: I tried using "print(kafkaStream)" as an output operator but no luck. It throws the same error. Any other thoughts? Thanks, Rohit From: francois.garil...@typesafe.com
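Following the advice in this thread, a minimal sketch of the fix: keep the result of repartition() in a val and attach an output operator so the lazy DStream pipeline is actually executed. The context setup and the `streams` sequence are assumptions standing in for the Kafka streams built in the original code:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("KafkaUnionExample")
val ssc = new StreamingContext(conf, Seconds(2))

// `streams` stands in for the Seq of Kafka receiver DStreams created earlier:
// val streams = (1 to numReceivers).map(_ => KafkaUtils.createStream(...))
val unifiedStream = ssc.union(streams)

// repartition() returns a new DStream; the original code discarded the result.
val repartitioned = unifiedStream.repartition(1)

// print() is an output operator -- without one, no computation is triggered.
repartitioned.print()

ssc.start()
ssc.awaitTermination()
```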

Re: Market Basket Analysis

2014-12-05 Thread Rohit Pujari
algos when they really mean they want to compute item similarity or make recommendations. What's your use case? On Thu, Dec 4, 2014 at 8:23 PM, Rohit Pujari rpuj...@hortonworks.com wrote: Sure, I’m looking to perform frequent item set analysis on POS data set. Apriori is a classic algorithm

Market Basket Analysis

2014-12-04 Thread Rohit Pujari
Hello Folks: I'd like to do market basket analysis using Spark, what're my options? Thanks, Rohit Pujari Solutions Architect, Hortonworks

Re: Market Basket Analysis

2014-12-04 Thread Rohit Pujari
to perform a similar task? If there's no spoon to spoon substitute, spoon to fork will suffice too. Hopefully this provides some clarification. Thanks, Rohit From: Tobias Pfeiffer t...@preferred.jp Date: Thursday, December 4, 2014 at 7:20 PM To: Rohit Pujari rpuj
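For the frequent-itemset use case discussed in this thread, MLlib later shipped a parallel FP-Growth implementation (in Spark 1.3, which postdates these mails). A hedged sketch, with an illustrative support threshold and toy transactions in place of real POS data:

```scala
import org.apache.spark.mllib.fpm.FPGrowth

// Toy POS-style baskets; real data would come from HDFS.
val transactions = sc.parallelize(Seq(
  Array("bread", "milk"),
  Array("bread", "butter"),
  Array("bread", "milk", "butter")
))

val model = new FPGrowth()
  .setMinSupport(0.5)   // keep itemsets appearing in at least half the baskets
  .run(transactions)

// Each result carries the itemset and its absolute frequency.
model.freqItemsets.collect().foreach { itemset =>
  println(itemset.items.mkString("[", ",", "]") + " -> " + itemset.freq)
}
```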

Python Scientific Libraries in Spark

2014-11-24 Thread Rohit Pujari
possible today and some of the active development in the community that's on the horizon. Thanks, Rohit Pujari Solutions Architect, Hortonworks

Re: Spark job doesn't clean after itself

2014-10-12 Thread Rohit Pujari
Reviving this .. any thoughts, experts? On Thu, Oct 9, 2014 at 3:47 PM, Rohit Pujari rpuj...@hortonworks.com wrote: Hello Folks: I'm running a Spark job on YARN. After execution, I would expect the Spark job to clean the staging area, but it seems every run creates a new staging directory

Spark job doesn't clean after itself

2014-10-09 Thread Rohit Pujari
Hello Folks: I'm running a Spark job on YARN. After execution, I would expect the Spark job to clean the staging area, but it seems every run creates a new staging directory. Is there a way to force a Spark job to clean up after itself? Thanks, Rohit
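A hedged sketch for the staging question: in YARN mode Spark stages files under the submitting user's `.sparkStaging` directory and, per the YARN docs, deletes them on normal exit; `spark.yarn.preserve.staging.files` controls this (false is the documented default). Leftovers typically come from failed or killed runs. The class name and jar below are illustrative:

```shell
# Make the cleanup behavior explicit (false is the default).
spark-submit \
  --master yarn \
  --conf spark.yarn.preserve.staging.files=false \
  --class com.example.MyJob \
  myjob.jar

# Remove directories left behind by failed/killed applications by hand:
hadoop fs -rm -r '/user/<user>/.sparkStaging/application_*'
```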

Debug Spark in Cluster Mode

2014-10-09 Thread Rohit Pujari
Hello Folks: What're some best practices to debug Spark in cluster mode? Thanks, Rohit

Re: Can Spark stack scale to petabyte scale without performance degradation?

2014-07-16 Thread Rohit Pujari
, 2014 at 9:17 AM, Rohit Pujari rpuj...@hortonworks.com wrote: Hello Folks: There is a lot of buzz in the Hadoop community around Spark's inability to scale beyond 1 TB datasets (or 10-20 nodes). It is being regarded as great tech for CPU-intensive workloads on smaller data (less than 1 TB

Can Spark stack scale to petabyte scale without performance degradation?

2014-07-15 Thread Rohit Pujari
boundaries of the tech and recommend the right solution for the right problem. Thanks, Rohit Pujari Solutions Engineer, Hortonworks rpuj...@hortonworks.com 716-430-6899

KMeansModel Constructor error

2014-07-14 Thread Rohit Pujari
Hello Folks: I have written a simple program to read the already-saved model from HDFS and score it. But when I try to read the saved model, I get the following error. Any clues what might be going wrong here?
val x = sc.objectFile[Vector]("/data/model").collect()
val y = new
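A hedged sketch of what the truncated snippet appears to be doing: collecting saved cluster centers from an object file and rebuilding an MLlib KMeansModel from them. The HDFS path comes from the snippet; the assumption that the file holds the centers as Vectors, and the sample point, are illustrative:

```scala
import org.apache.spark.mllib.clustering.KMeansModel
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// objectFile takes a quoted path string; the original snippet was missing the quotes.
val centers: Array[Vector] = sc.objectFile[Vector]("/data/model").collect()

// KMeansModel's constructor takes the cluster centers directly.
val model = new KMeansModel(centers)

// Score a point by assigning it to the nearest center.
val cluster = model.predict(Vectors.dense(1.0, 2.0, 3.0))
```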