On Fri, Mar 3, 2017 at 12:34 PM, Panagiotis Liakos <p.lia...@di.uoa.gr> wrote:
> Dear Mark,
>
> Thank you! I agree that deploying Giraph easily can be extremely
> beneficial for researchers and software engineers as it will enable
> them to focus on developing algorithms instead of wasting time to
> set-up their infrastructure.

Completely agree. We should be working as a community to share and distill best practices of _operating_ software into repeatable, reusable components. This is what we are aiming for with Juju and Charms. In the Big Data space of charms there are core Hadoop, Spark, and Kafka bundles that enable developers to easily extend the core pieces with software like Giraph, so folks can focus on the science.

> And I believe that Giraph is more capable
> of handling production-scale workloads than Spark's GraphX (I am not
> the only one:
> https://code.facebook.com/posts/319004238457019/a-comparison-of-state-of-the-art-graph-processing-systems/
> ) so it is a very nice addition to juju solutions.
>
> I am relatively new with juju so I am not really sure how to answer
> your question. I understand that there is a trade-off between ease of
> use and efficiency, right?

No, I wouldn't say that the trade-off is between ease of use and efficiency. In fact, because charms capture best practices and tuning, a Juju user can easily deploy a performant, efficient cluster. Moreover, you only know how fast you are once you measure it. Thus, the Spark and Hadoop bundles come with benchmarks you can run to compare performance on different clouds, environments, or bare metal. If you have a hadoop-processing bundle running (ref: https://jujucharms.com/hadoop-processing), just issue:

    juju actions resourcemanager

and you'll see a list of benchmarks you can run. An interesting paper might compare Spark's GraphX with Giraph using Charm benchmarks, analyze the data in a repeatable fashion, and let users give it a try for themselves.

> Now, the giraph charm heavily depends on
> the hadoop-processing bundle. Therefore, I think that such a paper
> would make sense if one could show that this trade-off is good, i.e.,
> if the hadoop-processing bundle is efficient. Is it?

It is, and it is in a form that can continually get better as community members discover best practices for operating Hadoop clusters.

> If so, giraph
> could serve the purpose of providing interesting use-cases and I think
> it would be interesting to give it a go. I hope this makes sense :)

If we could show value or performance improvements over, say, Spark's GraphX, along with a recommended, straightforward workflow for users to gain insights and mine their data, I think that would be interesting. Especially if we can workshop it and spend most of the time discussing how to mine data using Giraph+Hadoop.

>
> Other than that, I think that it would be very interesting if you guys
> organized a tutorial that you could give in computer science
> conferences. For example, I attended the following tutorial a few
> months ago:
> http://cikm2016-sparktutorial.droppages.com/
> It attracted a large audience, and a lot of the attendees had come
> prepared (they had installed what was needed to get a hands-on
> experience). If they did a tutorial for Spark why not doing one for
> Juju as well? I have watched two talks about juju and I've seen people
> stand amazed when realizing how easy and intuitive it is to handle
> complex deployments with it. With so many charms I am sure that there
> are plenty of things to showcase, and doing it in a tutorial instead
> of a talk, would result in increased user engagement.

We have been at DevOpsDays, Strata, and Linux Foundation events, but not the one you specifically pointed at. I think it would be interesting to present Big Data solutions and machine learning workshops, in a hands-on context, at conferences that are more oriented to operators and users wanting to focus on the science.

I wanted to thank you again for your contribution, and for having the insight to see how you could leverage Juju and Charms to extend the core pieces, provide the solution you needed, and share it with the community.
Hopefully we'll see you soon at a conference presenting on Giraph with your new charm. :-)

-Antonio

>
> --Panagiotis
>
> 2017-03-03 13:53 GMT+02:00 Mark Shuttleworth <m...@ubuntu.com>:
>> Hi Panagiotis
>>
>> Congratulations on getting giraf into the curated set! I think it's an
>> exciting new capability for people working with webs of linked data, so
>> making it really easy to deploy and operate is a very valuable
>> contribution. It took a long time for giraf to mature but it's now one
>> of those things that a lot of large data sets would benefit from. Is it
>> worth presenting a paper on operating it efficiently with charms?
>>
>> Mark
>>
>> On 03/03/17 09:25, Panagiotis Liakos wrote:
>>> Yesterday the giraph charm I've been working on the last weeks was
>>> promulgated to:
>>> https://jujucharms.com/giraph/
>>>
>>> I would just like to express my gratitude towards Konstantinos, Kevin,
>>> and Merlijn for their extremely valuable help!
>>>
>>> I hope that the charm is useful to many juju enthusiasts and I will do
>>> my best to make it better with time!
>>>
>>> Thank you!!!
>>
>>
>
> --
> Juju mailing list
> Juju@lists.ubuntu.com
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/juju

--
Antonio Rosales
Ecosystem Engineering
Canonical