Interesting. What were the Hive settings? Specifically, it would be useful to
know if this was Hive on Tez.
- Steve
From: Sanjay Subramanian
Reply-To: Sanjay Subramanian
Date: Thursday, June 18, 2015 at 11:08
To: user@spark.apache.org
Subject: Spark-sql versus Impala
jl...@mc10inc.com
Date: Sunday, January 25, 2015 at 17:17
To: Steve Nunez snu...@hortonworks.com,
user@spark.apache.org
Subject: Re: Pairwise Processing of a List
So you've got a point
Spark Experts,
I've got a list of points, List[(Float, Float)], that represent (x, y)
coordinate pairs, and I need to sum the distance. It's easy enough to compute the
distance:
case class Point(x: Float, y: Float) {
def distance(other: Point): Float =
sqrt(pow(x - other.x, 2) + pow(y -
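For readers following the question above, here is a minimal sketch of one way to sum the distances between consecutive points. It is not the poster's code (the snippet above is cut off). It assumes the points are already in path order and fit in a local List, and the names PathLength and totalDistance are hypothetical. It also swaps in math.hypot for the sqrt/pow form.

```scala
// Sketch only: assumes points are in path order and held in a local List.
case class Point(x: Float, y: Float) {
  // Euclidean distance to another point (math.hypot instead of sqrt/pow).
  def distance(other: Point): Float =
    math.hypot(x - other.x, y - other.y).toFloat
}

// Hypothetical helper object, not part of Spark or the original post.
object PathLength {
  // Pair each point with its successor, then sum the pairwise distances.
  // Empty and single-element lists yield 0.
  def totalDistance(points: List[Point]): Float =
    points.zip(points.drop(1)).map { case (a, b) => a.distance(b) }.sum

  def main(args: Array[String]): Unit = {
    val path = List(Point(0f, 0f), Point(3f, 4f), Point(3f, 0f))
    println(totalDistance(path))
  }
}
```

Zipping the list with its own tail keeps each point next to its successor, so no explicit index is needed. On an RDD the ordering would have to be made explicit first, which this local sketch does not address.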
Hello Users,
I've got a real-world use case that seems common enough that its pattern would
be documented somewhere, but I can't find any references to a simple solution.
The challenge is that data is getting dumped into a directory structure, and
that directory structure itself contains
Great stuff. Wonderful to see such progress in so short a time.
How about some links to code and instructions so that these benchmarks can
be reproduced?
Regards,
- Steve
From: Debasish Das debasish.da...@gmail.com
Date: Friday, October 10, 2014 at 8:17
To: Matei Zaharia
Anyone? No customers using streaming at scale?
From: Steve Nunez snu...@hortonworks.com
Date: Wednesday, August 27, 2014 at 9:08
To: user@spark.apache.org
Subject: Reference Accounts Large Node Deployments
All,
Does anyone have specific references to customers
All,
Does anyone have specific references to customers, use cases and large-scale
deployments of Spark Streaming? By 'large scale' I mean both throughput and
number of nodes. I'm attempting an objective comparison of Streaming and
Storm and while this data is known for Storm, there appears to
I don’t think there is an hwx profile, but there probably should be.
- Steve
From: Patrick Wendell pwend...@gmail.com
Date: Monday, August 4, 2014 at 10:08
To: Ron's Yahoo! zlgonza...@yahoo.com
Cc: Ron's Yahoo! zlgonza...@yahoo.com.invalid, Steve Nunez
snu...@hortonworks.com, user
purist but just that I am not sure
these are things that the project can meaningfully bother with.
It makes sense to set vendor repos in the pom for convenience, and
makes sense to run smoke tests in Jenkins against particular versions.
$0.02
Sean
On Mon, Aug 4, 2014 at 6:21 PM, Steve Nunez snu
).distinct.count
Cheers,
- Steve Nunez
--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
Anyone out there have a good configuration for emacs? Scala-mode sort of
works, but I'd love to see a fully-supported spark-mode with an inferior
shell. Searching didn't turn up much of anything.
Any emacs users out there? What setup are you using?
Cheers,
- SteveN
I'm also in early stages of setting up long running Spark jobs. Easiest way
I've found is to set up a cluster and submit the job via YARN. Then I can
come back and check in on progress when I need to. Seems the trick is tuning
the queue priority and YARN preemption to get the job to run in a