I'm happy to announce the availability of Spark 1.2.0! Spark 1.2.0 is
the third release on the API-compatible 1.X line. It is Spark's
largest release ever, with contributions from 172 developers and more
than 1,000 commits!
This release brings operational and performance improvements in Spark
Hi,
Thanks to Jerry for mentioning the Kafka Spout for Trident. Storm
Trident achieves its exactly-once guarantee by processing tuples in a
batch and assigning the same transaction-id to a given batch. A replay of
a given batch with a transaction-id will have the exact same set of tuples, and
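The transaction-id idea above can be sketched in a few lines. This is a hedged illustration of the concept only, not Trident's actual API; the class and method names are hypothetical:

```python
# Sketch of the transaction-id idea behind Trident's exactly-once updates.
# Illustrative only; names and storage are hypothetical, not Trident's API.

class TransactionalStore:
    """Stores each value together with the txid that last wrote it."""
    def __init__(self):
        self.state = {}  # key -> (txid, value)

    def apply_batch(self, txid, updates):
        for key, delta in updates.items():
            last_txid, value = self.state.get(key, (None, 0))
            if last_txid == txid:
                # This batch was already applied; a replay carries the exact
                # same tuples, so skipping it keeps the state correct.
                continue
            self.state[key] = (txid, value + delta)

store = TransactionalStore()
store.apply_batch(1, {"a": 5})
store.apply_batch(1, {"a": 5})  # replay of batch 1: ignored
store.apply_batch(2, {"a": 3})
print(store.state["a"])  # (2, 8)
```

Because the replayed batch carries the same tuples and the same txid, the store can detect and skip it; the guarantee is about state updates, not about each tuple being processed only once.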
Congrats!
A little question about this release: Which commit is this release based
on? v1.2.0 and v1.2.0-rc2 are pointed to different commits in
https://github.com/apache/spark/releases
Best Regards,
Shixiong Zhu
2014-12-19 16:52 GMT+08:00 Patrick Wendell pwend...@gmail.com:
Tag 1.2.0 is older than 1.2.0-rc2. I wonder if it just didn't get
updated. I assume it's going to be 1.2.0-rc2 plus a few commits
related to the release process.
On Fri, Dec 19, 2014 at 9:50 AM, Shixiong Zhu zsxw...@gmail.com wrote:
On the http://spark.apache.org/downloads.html page, we can't download the
newest Spark release.
At 2014-12-19 17:55:29,Sean Owen so...@cloudera.com wrote:
I can download it. Make sure you refresh the page, maybe, so that it
shows the 1.2.0 download as an option.
On Fri, Dec 19, 2014 at 11:16 AM, wyphao.2007 wyphao.2...@163.com wrote:
Hi all,
Thanks for your work on spark! I am trying to locate spark-yarn jars
for the new 1.2.0 release. The jars for spark-core, etc, are on maven
central, but the spark-yarn jars are missing.
Confusingly and perhaps relatedly, I also can't seem to get the
spark-yarn artifact to install
I believe spark-yarn does not exist from 1.2 onwards. Have a look at
spark-network-yarn for where some of that went.
On Fri, Dec 19, 2014 at 5:09 PM, David McWhorter mcwhor...@ccri.com wrote:
Thanks for pointing out the tag issue. I've updated all links to point
to the correct tag (from the vote thread):
a428c446e23e628b746e0626cc02b7b3cadf588e
On Fri, Dec 19, 2014 at 1:55 AM, Sean Owen so...@cloudera.com wrote:
any comments?
Hi,
I just filed a bug, SPARK-4906 (https://issues.apache.org/jira/browse/SPARK-4906), regarding Spark
master OOMs. If I understand correctly, the UI state for all running
applications is kept in memory, retained by JobProgressListener, and when there
are a lot of exception stack traces, this UI
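One common way to keep a listener's memory bounded is to cap how many completed records (and their stack traces) it retains, evicting the oldest first. Spark exposes settings such as spark.ui.retainedStages for this; the sketch below only illustrates the eviction idea, with hypothetical names:

```python
from collections import OrderedDict

# Hedged sketch: bound listener memory by capping how many completed-stage
# records (with their failure traces) are retained, dropping the oldest.
# Names are illustrative, not Spark's actual JobProgressListener internals.

class BoundedStageListener:
    def __init__(self, max_retained=1000):
        self.max_retained = max_retained
        self.completed = OrderedDict()  # stage_id -> failure trace (or None)

    def on_stage_completed(self, stage_id, failure_trace=None):
        self.completed[stage_id] = failure_trace
        while len(self.completed) > self.max_retained:
            self.completed.popitem(last=False)  # evict oldest entry

listener = BoundedStageListener(max_retained=2)
for sid in range(5):
    listener.on_stage_completed(sid, failure_trace="trace-%d" % sid)
print(sorted(listener.completed))  # [3, 4]
```

With a cap in place, a long-running application with many failed stages cannot grow the retained UI state without bound.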
Hi Dibyendu,
Thanks for the details on the implementation. But I still do not believe
that it guarantees no duplicates. What they achieve is that the same batch is
processed exactly the same way every time (though it may be processed more
than once), so it depends on the operation being idempotent. I
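The dependence on idempotence can be shown concretely. A hedged, purely illustrative sketch: an append-style store double-counts a replayed batch, while a keyed upsert reaches the same final state no matter how many times the batch is applied:

```python
# Sketch: why an idempotent store makes at-least-once replays safe.
# Purely illustrative; the stores and batch contents are hypothetical.

def append_store(log, records):
    log.extend(records)            # non-idempotent: replay => duplicates

def upsert_store(table, records):
    for key, value in records:
        table[key] = value         # idempotent: replay => same final state

batch = [("order-1", 100), ("order-2", 250)]

log = []
append_store(log, batch)
append_store(log, batch)           # replayed batch
print(len(log))                    # 4: double-counted

table = {}
upsert_store(table, batch)
upsert_store(table, batch)         # replayed batch
print(len(table))                  # 2: unchanged
```

If the downstream operation is a counter increment or an append, replaying "the exact same batch" still changes the result; only idempotent writes make the replay invisible.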
I am interested in contributing to Spark.
Hi Harikrishna,
A good place to start is taking a look at the wiki page on contributing:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
-Sandy
On Fri, Dec 19, 2014 at 2:43 PM, Harikrishna Kamepalli
harikrishna.kamepa...@gmail.com wrote:
Please feel free to correct me if I'm wrong, but I think the exactly-once Spark
Streaming semantics can easily be achieved using updateStateByKey. Make the key
going into updateStateByKey a hash of the event, or pluck off some uuid from
the message. The updateFunc would only emit the message
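The dedup idea behind that suggestion can be sketched as follows. This is a hedged plain-Python illustration of the per-key state logic, not the actual Spark Streaming API; the function names and event shape are hypothetical:

```python
# Sketch of dedup via per-key state: key each event by its uuid (or hash),
# and have the update function emit the event only the first time that key
# is seen. This mimics updateStateByKey's state; it is not the Spark API.

def update_func(new_events, seen_before):
    # seen_before is the per-key state from previous batches (None at first)
    if seen_before or not new_events:
        return seen_before, []          # duplicate: emit nothing
    return True, [new_events[0]]        # first sighting: emit once

def process_batch(state, batch, emitted):
    for event in batch:
        key = event["uuid"]
        new_state, out = update_func([event], state.get(key))
        state[key] = new_state
        emitted.extend(out)

state, emitted = {}, []
process_batch(state, [{"uuid": "a", "v": 1}, {"uuid": "b", "v": 2}], emitted)
process_batch(state, [{"uuid": "a", "v": 1}], emitted)  # redelivered event
print([e["uuid"] for e in emitted])  # ['a', 'b']
```

One caveat with this design: the keyed state grows with every distinct event ever seen, so in practice the state would need expiry or truncation.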
Yesterday, I changed the domain name in the mailing list archive settings
to remove .incubator so maybe it'll work now.
However, I also sent two emails about this through the Nabble interface (in
this same thread) yesterday, and they don't appear to have made it through,
so I'm not sure if it actually
Andy:
I saw two emails from you from yesterday.
See this thread: http://search-hadoop.com/m/JW1q5opRsY1
Cheers
On Fri, Dec 19, 2014 at 12:51 PM, Andy Konwinski andykonwin...@gmail.com
wrote:
Yesterday, I changed the domain name in the mailing list archive settings
to remove .incubator so
The problems you guys are discussing come from trying to store state in
Spark, so don't do that. Spark isn't a distributed database.
Just map Kafka partitions directly to RDDs, let user code specify the
range of offsets explicitly, and let them be in charge of committing
offsets.
Using the
Can you explain your basic algorithm for once-only delivery? It is quite a
bit of very Kafka-specific code, and it would take more time to read than I can
currently afford. If you can explain your algorithm a bit, it might help.
Thanks, Hari
On Fri, Dec 19, 2014 at 1:48 PM, Cody Koeninger
That KafkaRDD code is dead simple.
Given a user specified map
(topic1, partition0) -> (startingOffset, endingOffset)
(topic1, partition1) -> (startingOffset, endingOffset)
...
turn each one of those entries into a partition of an rdd, using the simple
consumer.
That's it. No recovery logic, no
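The mapping described above can be sketched in a few lines. A hedged illustration of the idea only, not the actual KafkaRDD code; the fetch function is a stub standing in for the Kafka simple consumer:

```python
# Sketch of the KafkaRDD idea: each user-specified
# (topic, partition) -> (startingOffset, endingOffset) entry becomes one
# deterministic partition; a replay fetches exactly the same offsets.
# Illustrative only; not the actual Spark or Kafka API.

from collections import namedtuple

OffsetRange = namedtuple("OffsetRange", "topic partition start end")

def to_partitions(offset_map):
    return [OffsetRange(t, p, start, end)
            for (t, p), (start, end) in sorted(offset_map.items())]

def compute_partition(rng, fetch):
    # fetch(topic, partition, offset) would use the simple consumer;
    # here it is a stub supplied by the caller.
    return [fetch(rng.topic, rng.partition, o)
            for o in range(rng.start, rng.end)]

offsets = {("topic1", 0): (10, 13), ("topic1", 1): (5, 7)}
parts = to_partitions(offsets)
fake_fetch = lambda t, p, o: "%s/%d@%d" % (t, p, o)
print(compute_partition(parts[0], fake_fetch))
# ['topic1/0@10', 'topic1/0@11', 'topic1/0@12']
```

Because each partition is fully determined by its offset range, recomputing it after a failure yields exactly the same records, with no recovery state kept in Spark.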
Yup, we at Tresata do the idempotent store the same way. Very simple
approach.
On Fri, Dec 19, 2014 at 5:32 PM, Cody Koeninger c...@koeninger.org wrote:
Hi Spark, I'm trying to understand the Akka debug messages when networking
doesn't work properly. Any hints on this would be great.
SIMPLE TESTS I RAN
- I tried a ping; it works.
- I tried a telnet to port 7077 on the master, from a slave; that also works.
LOGS
1) On the master I see this WARN log