Re: Spark history server running on Mongo

2017-07-19 Thread Ivan Sadikov
Yes, you are absolutely right, though the UI does not change often, and it potentially allows faster iteration, IMHO, which is why I started working on this. To me, it felt like this functionality could easily be outsourced to a separate project. And, as you pointed out, I did add some small fixes to

Re: Spark history server running on Mongo

2017-07-19 Thread Marcelo Vanzin
On Tue, Jul 18, 2017 at 7:21 PM, Ivan Sadikov wrote: > Repository that I linked to does not require rebuilding Spark and could be > used with current distribution, which is preferable in my case. Fair enough, although that means that you're re-implementing the Spark UI, which makes that project h

Re: Spark history server running on Mongo

2017-07-18 Thread Ivan Sadikov
Hi Marcelo, Thanks for the reference, again. I looked at your code - really great work! I had to replace the Spark distribution to use it, though, as I could not figure out how to build it separately. The repository that I linked to does not require rebuilding Spark and could be used with the current distribution

Re: Spark history server running on Mongo

2017-07-18 Thread Ivan Sadikov
Thanks for the JIRA ticket reference! Frankly, I was aware of this work, but didn't know that there was an API for the storage implementation. Will try exploring that as well, thanks! On Wed, 19 Jul 2017 at 4:18 AM, Marcelo Vanzin wrote: > See SPARK-18085. That has much of the same goals re: SHS resourc

Re: Spark history server running on Mongo

2017-07-18 Thread Marcelo Vanzin
See SPARK-18085. That has many of the same goals re: SHS resource usage, and also provides a (currently non-public) API where you could just create a MongoDB implementation if you want. On Tue, Jul 18, 2017 at 12:56 AM, Ivan Sadikov wrote: > Hello everyone! > > I have been working on Spark histor

Spark history server running on Mongo

2017-07-18 Thread Ivan Sadikov
Hello everyone! I have been working on a Spark history server that uses MongoDB as a datastore for processed events, iterating on the idea that the Spree project uses for the Spark UI. The project was originally designed to improve on the standalone history server with a reduced memory footprint. The project lives here: htt