)
}
val unifiedStream = ssc.union(streams)
val sparkProcessingParallelism = 1
unifiedStream.repartition(sparkProcessingParallelism)
}
//print(kafkaStream)
ssc.start()
ssc.awaitTermination()
--
Rohit Pujari
Date: Saturday, January 17, 2015 at 4:10 AM
To: Rohit Pujari rpuj...@hortonworks.com
Subject: Re: Spark Streaming
Streams are lazy. Their computation is triggered by an output operator, which
is apparently missing from your code. See the programming guide:
https
operation on the stream.
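For reference, a minimal sketch of a complete job with an output operation, assuming the Spark 1.x spark-streaming-kafka receiver API; the app name, ZooKeeper quorum, consumer group and topic below are placeholders, not taken from the original post:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Placeholder configuration: substitute your own ZooKeeper quorum, group and topics.
val conf = new SparkConf().setAppName("KafkaUnionExample")
val ssc = new StreamingContext(conf, Seconds(2))
val topicMap = Map("my-topic" -> 1)
val streams = (1 to 3).map { _ =>
  KafkaUtils.createStream(ssc, "zk-host:2181", "my-consumer-group", topicMap)
}
val unifiedStream = ssc.union(streams)

// print() registers an output operation; without one, the DStream lineage is never executed.
unifiedStream.print()

ssc.start()
ssc.awaitTermination()

Note that kafkaStream.print() (the DStream method) is not the same as Scala's print(kafkaStream), which only prints the DStream object's toString and does not register an output operation.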
On Sat, Jan 17, 2015 at 10:17 AM, Rohit Pujari rpuj...@hortonworks.com
wrote:
Hi Francois:
I tried using "print(kafkaStream)" as an output operator, but no luck. It
throws the same error. Any other thoughts?
Thanks,
Rohit
From: francois.garil...@typesafe.com
algos when they really mean they want to compute item similarity
or make recommendations. What's your use case?
On Thu, Dec 4, 2014 at 8:23 PM, Rohit Pujari rpuj...@hortonworks.com
wrote:
Sure, I'm looking to perform frequent item set analysis on a POS data set.
Apriori is a classic algorithm
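For what it's worth, a minimal sketch of frequent itemset mining with MLlib's FPGrowth (added in Spark 1.3, shortly after this thread); the input path, delimiter and support threshold are illustrative placeholders, and sc is an existing SparkContext:

import org.apache.spark.mllib.fpm.FPGrowth

// Hypothetical input: one basket per line, items separated by whitespace,
// e.g. "bread milk eggs". Path and thresholds are placeholders.
val transactions = sc.textFile("hdfs:///data/pos-transactions")
  .map(_.trim.split("\\s+"))

val model = new FPGrowth()
  .setMinSupport(0.01)      // keep itemsets present in at least 1% of baskets
  .setNumPartitions(10)
  .run(transactions)

model.freqItemsets.collect().foreach { fi =>
  println(fi.items.mkString("[", ",", "]") + " : " + fi.freq)
}

FP-Growth finds the same frequent itemsets Apriori would, but avoids Apriori's repeated candidate-generation passes, which makes it a better fit for a distributed setting.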
Hello Folks:
I'd like to do market basket analysis using Spark. What are my options?
Thanks,
Rohit Pujari
Solutions Architect, Hortonworks
to perform a similar task? If there's no spoon-to-spoon substitute,
spoon-to-fork will suffice too.
Hopefully this provides some clarification.
Thanks,
Rohit
From: Tobias Pfeiffer t...@preferred.jp
Date: Thursday, December 4, 2014 at 7:20 PM
To: Rohit Pujari rpuj
possible today and some of the active development in the community that's
on the horizon.
Thanks,
Rohit Pujari
Solutions Architect, Hortonworks
Reviving this .. any thoughts, experts?
On Thu, Oct 9, 2014 at 3:47 PM, Rohit Pujari rpuj...@hortonworks.com
wrote:
Hello Folks:
I'm running a Spark job on YARN. After the execution, I would expect the
Spark job to clean up the staging area, but it seems every run creates a new
staging directory. Is there a way to force the Spark job to clean up after itself?
Thanks,
Rohit
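As a hedged aside, spark.yarn.preserve.staging.files is the setting that controls this: when it is left at its default of false, the application's .sparkStaging directory should be deleted once the job finishes. A minimal sketch, with an illustrative app name:

import org.apache.spark.SparkConf

// Sketch only: with spark.yarn.preserve.staging.files at its default ("false"),
// the per-application .sparkStaging directory should be removed when the job ends.
val conf = new SparkConf()
  .setAppName("staging-cleanup-check")   // illustrative name
  .set("spark.yarn.preserve.staging.files", "false")

Directories left behind by runs that were killed or failed part-way usually have to be removed from HDFS by hand.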
Hello Folks:
What are some best practices for debugging Spark in cluster mode?
Thanks,
Rohit
, 2014 at 9:17 AM, Rohit Pujari rpuj...@hortonworks.com
wrote:
Hello Folks:
There is a lot of buzz in the Hadoop community around Spark's inability to
scale beyond 1 TB datasets (or 10-20 nodes). It is being regarded as
great tech for CPU-intensive workloads on smaller data (less than a TB
boundaries of the tech and recommend the right solution for the
right problem.
Thanks,
Rohit Pujari
Solutions Engineer, Hortonworks
rpuj...@hortonworks.com
716-430-6899
Hello Folks:
I have written a simple program to read a previously saved model from HDFS
and score it. But when I try to read the saved model, I get the
following error. Any clues what might be going wrong here ..
val x = sc.objectFile[Vector]("/data/model").collect()
val y = new
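For comparison, a minimal round-trip sketch of the objectFile pattern, assuming the model was written with saveAsObjectFile, that Vector here is org.apache.spark.mllib.linalg.Vector, and that the path is a placeholder:

import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Hypothetical round trip: write a couple of vectors, then read them back.
val vectors = sc.parallelize(Seq(
  Vectors.dense(1.0, 2.0, 3.0),
  Vectors.dense(4.0, 5.0, 6.0)
))
vectors.saveAsObjectFile("/data/model")   // placeholder path

val restored = sc.objectFile[Vector]("/data/model").collect()
restored.foreach(println)

If the import for org.apache.spark.mllib.linalg.Vector is missing, Vector resolves to scala.collection.immutable.Vector instead, which is one possible source of errors with this pattern.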