Amazing! I'll fund $1/2 million for such an interesting initiative. Oh, wait... I have only $4 in my pocket.
Cheers :)

On 1 April 2016 at 11:40, Takeshi Yamamuro <linguin....@gmail.com> wrote:

> Oh, the annual event...
>
> On Fri, Apr 1, 2016 at 4:37 PM, Xiao Li <gatorsm...@gmail.com> wrote:
>
>> April 1st... : )
>>
>> 2016-04-01 0:33 GMT-07:00 Michael Malak <michaelma...@yahoo.com.invalid>:
>>
>>> I see you've been burning the midnight oil.
>>>
>>> ------------------------------
>>> *From:* Reynold Xin <r...@databricks.com>
>>> *To:* "dev@spark.apache.org" <dev@spark.apache.org>
>>> *Sent:* Friday, April 1, 2016 1:15 AM
>>> *Subject:* [discuss] using deep learning to improve Spark
>>>
>>> Hi all,
>>>
>>> Hope you all enjoyed the Tesla 3 unveiling earlier tonight.
>>>
>>> I'd like to bring your attention to a project called DeepSpark that we
>>> have been working on for the past three years. We realized that scaling
>>> software development was challenging. A large fraction of software
>>> engineering has been manual and mundane: writing test cases, fixing bugs,
>>> implementing features according to specs, and reviewing pull requests. So
>>> we started this project to see how much we could automate.
>>>
>>> After three years of development and one year of testing, we now have
>>> enough confidence that this could work well in practice. For example, Matei
>>> confessed to me today: "It looks like DeepSpark has a better understanding
>>> of Spark internals than I ever will. It updated several pieces of code I
>>> wrote long ago that even I no longer understood."
>>>
>>> I think it's time to discuss as a community about how we want to
>>> continue this project to ensure Spark is stable, secure, and easy to use
>>> yet able to progress as fast as possible. I'm still working on a more
>>> formal design doc, and it might take a little bit more time since I haven't
>>> been able to fully grasp DeepSpark's capabilities yet. Based on my
>>> understanding right now, I've written a blog post about DeepSpark here:
>>> https://databricks.com/blog/2016/04/01/unreasonable-effectiveness-of-deep-learning-on-spark.html
>>>
>>> Please take a look and share your thoughts. Obviously, this is an
>>> ambitious project and could take many years to fully implement. One major
>>> challenge is cost. The current Spark Jenkins infrastructure provided by the
>>> AMPLab has only 8 machines, but DeepSpark uses 12000 machines. I'm not sure
>>> whether AMPLab or Databricks can fund DeepSpark's operation for a long
>>> period of time. Perhaps AWS can help out here. Let me know if you have
>>> other ideas.
>>>
>>
>
> --
> ---
> Takeshi Yamamuro