Hello there,
I have a Spark job running on a 20-node cluster. The job is logically simple,
just a mapPartitions and then a sum. The return value of the mapPartitions is
an integer for each partition. The tasks hit some random failures (which
could be caused by 3rd-party key-value store connections). The
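As an aside, the pattern described above (mapPartitions returning one integer per partition, then a sum) can be sketched in plain Python without Spark; the partition contents and the `count_partition` helper below are illustrative assumptions, not the poster's actual code:

```python
# Plain-Python sketch of the mapPartitions-then-sum pattern described
# above (no Spark involved; partitions are modeled as plain lists).

def count_partition(partition):
    """Return a single integer per partition, e.g. a record count."""
    return len(partition)

# Hypothetical data already split into partitions.
partitions = [["a", "b"], ["c"], ["d", "e", "f"]]

# mapPartitions: one integer per partition; then sum the results.
per_partition = [count_partition(p) for p in partitions]
total = sum(per_partition)
```

In real Spark the same shape would be `rdd.mapPartitions(...).sum()`, with the per-partition function run once per partition on the executors.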
Hi TD,
I have sent more information now, using 8 workers. The gap is 27 seconds now.
Have you seen it?
Thanks
BR
--
Privacy notice: http://www.unibs.it/node/8155
Ok Andrew,
Thanks
I sent the information from the test with 8 workers, and the gap has grown.
On May 4, 2014, at 2:31, Andrew Ash wrote:
>>> From the logs, I see that the print() starts printing stuff 10 seconds
>>> after the context is started. And that 10 seconds is taken by the initial
>>> empty
Hi Eduardo,
Yep, those machines look pretty well synchronized at this point. Just
wanted to throw that out there and eliminate it as a possible source of
confusion.
Good luck on continuing the debugging!
Andrew
On Sat, May 3, 2014 at 11:59 AM, Eduardo Costa Alfaia <
e.costaalf...@unibs.it> wrote:
Hi TD, Thanks for reply
I did this last experiment with one computer, in local mode, but I think the
time gap grows when I add more computers. I will run it again now with 8 workers
and 1 word source and see what's going on. I will check the clocks too, as
suggested by Andrew.
On May 3, 2014, a
Hi TD, I am GMT+8 from you. Tomorrow I will get the information that you
have asked me for.
Thanks
- Original message -
From: "Tathagata Das"
Sent: 30/04/2014 00.57
To: "user@spark.apache.org"
Subject: Re: Spark's behavior
Strange! Can you just do lines.print() to print the raw data instead of
doing the word count? Beyond that, we can do two things.
1. Can you check the Spark stage UI to see whether there are stages running
during the 30-second period you referred to?
2. If you upgrade to using the Spark master branch (or Spark 1.
Hi TD,
We are not using the streaming context with master "local"; we have 1 Master,
8 Workers, and 1 word source. The command line that we are using is:
bin/run-example org.apache.spark.streaming.examples.JavaNetworkWordCount
spark://192.168.0.13:7077
On Apr 30, 2014, at 0:09, Tathagata Das wrote:
Is your batch size 30 seconds by any chance?
Assuming not, please check whether you are creating the streaming context
with master "local[n]" where n > 2. With "local" or "local[1]", the system
only has one processing slot, which is occupied by the receiver, leaving no
room for processing the receiv
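TD's point above is the classic receiver-slot pitfall: a receiver-based stream permanently occupies one slot, so the master must provide at least two. A minimal sketch of the fix, assuming the PySpark streaming API (the thread itself uses the Java example; the app name and batch interval here are illustrative):

```python
# Sketch only: requires a Spark installation, shown as a config fragment.
# With "local" or "local[1]" the lone slot is taken by the receiver and
# no batches are ever processed; "local[2]" or more leaves room for both
# the receiver and the batch processing.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[2]", "NetworkWordCount")  # >= 2 slots
ssc = StreamingContext(sc, batchDuration=1)        # 1-second batches
```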
Hi TD,
In my tests with Spark Streaming, I'm using the JavaNetworkWordCount (modified)
code and a program that I wrote that sends words to the Spark worker; I use TCP
as the transport. I verified that after starting Spark, it connects to my source,
which actually starts sending, but the first word count