Hi
I am new to Spark and I encountered this error when I try to map an RDD[A]
to an RDD[Array[Double]] and then collect the results.
A is a custom class that extends Serializable. (Actually it's just a wrapper
class that wraps a few variables, all of which are serializable.)
I also tried KryoSerializer according
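For anyone landing here later, a minimal sketch of wiring up Kryo in the 0.9-era API. The wrapper class and registrator names below are invented for illustration, not from the original post:

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Hypothetical stand-in for the wrapper class "A" from the question.
class DoubleWrapper(val values: Array[Double]) extends Serializable

// Register the custom class so Kryo can (de)serialize it on the executors.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.register(classOf[DoubleWrapper])
  }
}

// Switch serializers before the SparkContext is created.
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.kryo.registrator", "MyRegistrator")
```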
Hi Sonal,
There are no custom objects in saveRDD, it is of type RDD[(String, String)].
Thanks,
Pradeep
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/SequenceFileRDDFunctions-cannot-be-used-output-of-spark-package-tp250p3508.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
I am facing different kinds of java.lang.ClassNotFoundException when trying to
run Spark on Mesos. One error has to do with
org.apache.spark.executor.MesosExecutorBackend. Another has to do with
org.apache.spark.serializer.JavaSerializer. I see other people complaining
about similar issues.
I
What versions are you running?
There is a known protobuf 2.5 mismatch, depending on your versions.
Cheers,
Tim
- Original Message -
From: Bharath Bhushan manku.ti...@outlook.com
To: user@spark.apache.org
Sent: Monday, March 31, 2014 8:16:19 AM
Subject:
Hi,
I've just tested Spark in YARN mode, but something confused me.
When I *delete* the yarn.application.classpath configuration in
yarn-site.xml, the following command works well.
*bin/spark-class org.apache.spark.deploy.yarn.Client --jar
I tried 0.9.0 and the latest git tree of spark. For mesos, I tried 0.17.0 and
the latest git tree.
Thanks
On 31-Mar-2014, at 7:24 pm, Tim St Clair tstcl...@redhat.com wrote:
What versions are you running?
There is a known protobuf 2.5 mismatch, depending on your versions.
Cheers,
Howdy-doody,
I have a single, very large file sitting in S3 that I want to read in with
sc.textFile(). What are the best practices for reading in this file as
quickly as possible? How do I parallelize the read as much as possible?
Similarly, say I have a single, very large RDD sitting in memory
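A sketch of the usual starting point, assuming the 0.9-era API; the bucket path and split counts below are made up:

```scala
// textFile takes a minimum number of splits as a second argument;
// more splits means more concurrent read tasks.
val lines = sc.textFile("s3n://my-bucket/big-file.txt", 100)

// For an RDD already sitting in memory, repartition() redistributes
// it across the cluster (at the cost of a shuffle).
val spread = lines.repartition(100)
```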
* unionAll preserves duplicates vs. union, which does not
This is true; if you want to eliminate duplicate items you should follow
the union with a distinct()
* SQL union and unionAll result in the same output format, i.e. another SQL
relation, vs. different RDD types here.
* Understand the existing union
This is similar to how SQL works, items in the GROUP BY clause are not
included in the output by default. You will need to include 'a in the
second parameter list (which is similar to the SELECT clause) as well if
you want it included in the output.
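The duplicate-handling difference on plain RDDs, as a small sketch with made-up data:

```scala
val a = sc.parallelize(Seq(1, 2, 3))
val b = sc.parallelize(Seq(3, 4))

// RDD union is a bag union: duplicates survive.
val bag = a.union(b)            // elements: 1, 2, 3, 3, 4
// Follow with distinct() for set semantics.
val set = a.union(b).distinct() // elements: 1, 2, 3, 4
```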
On Sun, Mar 30, 2014 at 9:52 PM, Manoj Samel
The line "val people: RDD[Person] // An RDD of case class objects, from the
first example." is just a placeholder to avoid cluttering up each example
with the same code for creating an RDD. The ": RDD[Person]" is just there to
let you know the expected type of the variable 'people'. Perhaps there is
a
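In other words, the placeholder stands for something like the following; the names and data here are invented for illustration:

```scala
import org.apache.spark.rdd.RDD

case class Person(name: String, age: Int)

// The ": RDD[Person]" annotation only documents the expected type;
// the right-hand side is what the examples leave out.
val people: RDD[Person] =
  sc.parallelize(Seq(Person("Alice", 29), Person("Bob", 35)))
```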
Hi Michael,
Thanks for the clarification. My question is about the error above
("error: class $iwC needs to be abstract") and what the RDD brings, since
I can do the DSL without the people: org.apache.spark.rdd.RDD[Person]
Thanks,
On Mon, Mar 31, 2014 at 9:13 AM, Michael Armbrust
Note that you may have minSplits set to more than the number of cores in
the cluster, and Spark will just run as many as possible at a time. This is
better if certain nodes may be slow, for instance.
In general, it is not necessarily the case that doubling the number of
cores doing IO will double
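To make the oversubscription point concrete (the numbers are illustrative):

```scala
// With 16 cores and 64 splits, Spark runs 16 tasks at a time and
// queues the rest, so a slow node simply ends up with fewer splits.
val numCores = 16
val lines = sc.textFile("s3n://my-bucket/big-file.txt", numCores * 4)
```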
OK sweet. Thanks for walking me through that.
I wish this were StackOverflow so I could bestow some nice rep on all you
helpful people.
On Mon, Mar 31, 2014 at 1:06 PM, Aaron Davidson ilike...@gmail.com wrote:
Note that you may have minSplits set to more than the number of cores in
the
How about London?
--
Martin Goodson | VP Data Science
(0)20 3397 1240
On Mon, Mar 31, 2014 at 6:28 PM, Andy Konwinski andykonwin...@gmail.com wrote:
Hi folks,
We have seen a lot of community growth outside of the Bay Area and we are
looking to help spur even
Not sure what data you are sending in. You could try calling
lines.print() instead, which should just output everything that comes in
on the stream, just to test that your socket is receiving what you think
you are sending.
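A minimal debugging sketch along those lines, assuming an existing SparkContext and a made-up host/port:

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999)

// Dump the first elements of every batch to stdout, just to confirm
// the socket is delivering what you think you are sending.
lines.print()

ssc.start()
ssc.awaitTermination()
```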
On Mon, Mar 31, 2014 at 12:18 PM, eric perler
Responses about London, Montreal/Toronto, DC, Chicago. Great coverage so
far, and keep 'em coming! (still looking for an NYC connection)
I'll reply to each of you off-list to coordinate next-steps for setting up
a Spark meetup in your home area.
Thanks again, this is super exciting.
Andy
On
We'd love to see a Spark user group in Los Angeles and connect with others
working with it here.
Ping me if you're in the LA area and use Spark at your company (
ch...@retentionscience.com ).
Chris
Retention Science
call: 734.272.3099
visit: Site | like: Facebook | follow: Twitter
On Mar
It sounds like the protobuf issue.
So FWIW, you might want to try updating 0.9.0 with pom mods for the Mesos
protobuf:
Mesos 0.17.0, protobuf 2.5
Cheers,
Tim
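For what it's worth, the pom.xml change being described amounts to bumping the version properties, roughly like this (exact property names vary across Spark trees, so treat this as a guess):

```xml
<properties>
  <mesos.version>0.17.0</mesos.version>
  <protobuf.version>2.5.0</protobuf.version>
</properties>
```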
- Original Message -
From: Bharath Bhushan manku.ti...@outlook.com
To: user@spark.apache.org
Sent: Monday, March 31, 2014
Dear list,
I was wondering how Spark handles congestion when the upstream is
generating DStreams faster than the downstream workers can handle?
Thanks
-Mo
2014-03-31 13:16 GMT-05:00 Evgeny Shishkin itparan...@gmail.com:
On 31 Mar 2014, at 21:05, Dong Mo monted...@gmail.com wrote:
Dear list,
I was wondering how Spark handles congestion when the upstream is
generating DStreams faster than the downstream workers can handle?
It
Nicholas, I'm in Boston and would be interested in a Spark group. Not
sure if you know this -- there was a meetup that never got off the
ground. Anyway, I'd be +1 for attending. Not sure what is involved in
organizing. Seems a shame that a city like Boston doesn't have one.
On Mon, Mar 31, 2014
My fellow Bostonians and New Englanders,
We cannot allow New York to beat us to having a banging Spark meetup.
Respond to me (and I guess also Andy?) if you are interested.
Yana,
I'm not sure either what is involved in organizing, but we can figure it
out. I didn't know about the meetup that
I would offer to host one in Cape Town but we're almost certainly the only
Spark users in the country apart from perhaps one in Johannesburg :)
--
Sent from Mailbox for iPhone
On Mon, Mar 31, 2014 at 8:53 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
My fellow Bostonians and New
Happy to help with an NYC meet up (just emailed Andy). I recently moved to VA,
but am back in NYC quite often, and have been turning several computational
people at Columbia / NYU / Simons Foundation onto Spark; there'd definitely be
interest in those communities.
-- Jeremy
Also in NYC, definitely interested in a spark meetup!
Sent from my iPhone
On Mar 31, 2014, at 3:07 PM, Jeremy Freeman freeman.jer...@gmail.com wrote:
Happy to help with an NYC meet up (just emailed Andy). I recently moved to
VA, but am back in NYC quite often, and have been turning several
If you have any questions on helping to get a Spark Meetup off the ground,
please do not hesitate to ping me (denny.g@gmail.com). I helped jump start
the one here in Seattle (and tangentially have been helping the Vancouver and
Denver ones as well). HTH!
On March 31, 2014 at 12:35:38
Your suggestion took me past the ClassNotFoundException. I then hit
akka.actor.ActorNotFound exception. I patched in PR 568 into my 0.9.0 spark
codebase and everything worked.
So thanks a lot, Tim. Is there a JIRA/PR for the protobuf issue? Why is it not
fixed in the latest git tree?
Thanks.
Spark now shades its own protobuf dependency, so protobuf 2.4.1 shouldn't be
getting pulled in unless you are directly using akka yourself. Are you?
Does your project have other dependencies that might be indirectly pulling
in protobuf 2.4.1? It would be helpful if you could list all of your
In the spirit of everything being bigger and better in TX ;) -- if
anyone is in Austin and interested in meeting up over Spark, contact
me! There seems to be a Spark meetup group in Austin that has never met,
and my initial email to organize the first gathering was never acknowledged.
Ognen
On
@eric-
I saw this exact issue recently while working on the KinesisWordCount.
Are you passing local[2] to your example as the MASTER arg, versus just
local or local[1]?
You need at least 2. It's documented as n>1 in the Scala source docs,
which is easy to mistake for n=1.
I just ran the
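Concretely, the fix being suggested looks like this (the app name is invented):

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}

// "local" / "local[1]" gives the receiver the only core, leaving none
// for processing, so nothing ever shows up; local[2] is the minimum.
val ssc = new StreamingContext("local[2]", "SocketTest", Seconds(1))
```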
I was talking about the protobuf version issue as not fixed. I could not find
any reference to the problem or the fix.
Reg. SPARK-1052, I could pull in the fix into my 0.9.0 tree (from the tar ball
on the website) and I see the fix in the latest git.
Thanks
On 01-Apr-2014, at 3:28 am, deric
Hi Andy,
I would be interested in setting up a meetup in Delhi/NCR, India. Can you
please let me know how to go about organizing it?
Best Regards,
Sonal
Nube Technologies http://www.nubetech.co
http://in.linkedin.com/in/sonalgoyal
On Tue, Apr 1, 2014 at 10:04 AM, giive chen
Another problem I noticed is that the current 1.0.0 git tree still gives me the
ClassNotFoundException. I see that the SPARK-1052 is already fixed there. I
then modified the pom.xml for mesos and protobuf and that still gave the
ClassNotFoundException. I also tried modifying pom.xml only for