On Fri, Mar 3, 2017 at 12:34 PM, Panagiotis Liakos <p.lia...@di.uoa.gr> wrote:
> Dear Mark,
>
> Thank you! I agree that deploying Giraph easily can be extremely
> beneficial for researchers and software engineers as it will enable
> them to focus on developing algorithms instead of wasting time
> setting up their infrastructure.

Completely agree. We should be working as a community to share and
distill best practices for _operating_ software into repeatable,
reusable components. This is what we are aiming for with Juju and
Charms. In the Big Data space of charms there are core Hadoop, Spark,
and Kafka bundles that let developers easily extend the core pieces
with software like Giraph, so folks can focus on the science.

> And I believe that Giraph is more capable
> of handling production-scale workloads than Spark's GraphX (I am not
> the only one: 
> https://code.facebook.com/posts/319004238457019/a-comparison-of-state-of-the-art-graph-processing-systems/
> ) so it is a very nice addition to juju solutions.
>
> I am relatively new with juju so I am not really sure how to answer
> your question. I understand that there is a trade-off between ease of
> use and efficiency, right?

No, I wouldn't say the trade-off is between ease of use and
efficiency. In fact, because charms capture best practices and
tuning, a Juju user can easily deploy a performant, efficient
cluster. Moreover, you only know how fast you are once you measure
it. Thus, the Spark and Hadoop bundles come with benchmarks to run so
you can compare performance on different clouds, environments, or bare
metal.

If you have a hadoop-processing bundle running, just issue:
    juju actions resourcemanager
(ref = https://jujucharms.com/hadoop-processing)
and you'll see a list of benchmarks you can run.
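
As a rough sketch of what running one of those benchmarks might look
like (the unit name resourcemanager/0 and the action name terasort
are assumptions here, as is the use of Juju 2.x command names; pick
whatever `juju actions` actually lists on your deployment):

```shell
#!/bin/sh
# Sketch only: list the benchmark actions, then queue one of them.
# Assumptions (not from the original message): unit resourcemanager/0,
# an action called "terasort", and the Juju 2.x CLI.
APP="resourcemanager"
UNIT="${APP}/0"
ACTION="terasort"   # hypothetical choice; use a name from `juju actions`

# DRY_RUN=1 (the default) only prints each command, so the sketch is
# safe to run on a machine without a Juju controller.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Show the actions (including benchmarks) the application exposes.
run juju actions "$APP"

# Queue the chosen benchmark on one unit.
run juju run-action "$UNIT" "$ACTION"

# `juju run-action` prints an id for the queued action; the result can
# then be read with:
#   juju show-action-output <id>
```

Run it with DRY_RUN=0 once the names match your deployment.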

An interesting paper might compare Spark's GraphX with Giraph using
charm benchmarks, analyze the data in a repeatable fashion, and
invite users to give it a try for themselves.

> Now, the giraph charm heavily depends on
> the hadoop-processing bundle. Therefore, I think that such a paper
> would make sense if one could show that this trade-off is good, i.e.,
> if the hadoop-processing bundle is efficient. Is it?

It is, and it is in a form that can continually get better as
community members discover best practices for operating Hadoop
clusters.

> If so, giraph
> could serve the purpose of providing interesting use-cases and I think
> it would be interesting to give it a go. I hope this makes sense :)

If we could show value/performance improvements over, say, Spark's
GraphX, along with a recommended, straightforward workflow for users
to gain insights and mine their data, I think that would be
interesting. Especially if we can workshop it and spend most of the
time discussing how to mine data using Giraph+Hadoop.

>
> Other than that, I think that it would be very interesting if you guys
> organized a tutorial that you could give in computer science
> conferences. For example, I attended the following tutorial a few
> months ago:
> http://cikm2016-sparktutorial.droppages.com/
> It attracted a large audience, and a lot of the attendees had come
> prepared (they had installed what was needed to get a hands-on
> experience). If they did a tutorial for Spark, why not do one for
> Juju as well? I have watched two talks about juju and I've seen people
> stand amazed when realizing how easy and intuitive it is to handle
> complex deployments with it. With so many charms I am sure that there
> are plenty of things to showcase, and doing it in a tutorial instead
> of a talk, would result in increased user engagement.

We have been at DevOps Days, Strata, and Linux Foundation events, but
not the one you specifically pointed at. I think it would be
interesting to present Big Data solutions and machine learning
workshops at conferences that are more oriented to operators and
users wanting to focus on the science in a workshop/hands-on context.

I wanted to thank you again for your contribution, and for having the
insight to see how you could leverage Juju and Charms to extend the
core pieces, build the solution you needed, and share it with the
community.

Hopefully we'll see you soon at a conference presenting on Giraph with
your new charm. :-)

-Antonio

>
> --Panagiotis
>
> 2017-03-03 13:53 GMT+02:00 Mark Shuttleworth <m...@ubuntu.com>:
>> Hi Panagiotis
>>
>> Congratulations on getting Giraph into the curated set! I think it's an
>> exciting new capability for people working with webs of linked data, so
>> making it really easy to deploy and operate is a very valuable
>> contribution. It took a long time for Giraph to mature but it's now one
>> of those things that a lot of large data sets would benefit from. Is it
>> worth presenting a paper on operating it efficiently with charms?
>>
>> Mark
>>
>> On 03/03/17 09:25, Panagiotis Liakos wrote:
>>> Yesterday the giraph charm I've been working on the last weeks was
>>> promulgated to:
>>> https://jujucharms.com/giraph/
>>>
>>> I would just like to express my gratitude towards Konstantinos, Kevin,
>>> and Merlijn for their extremely valuable help!
>>>
>>> I hope that the charm is useful to many juju enthusiasts and I will do
>>> my best to make it better with time!
>>>
>>> Thank you!!!
>>
>>
>
> --
> Juju mailing list
> Juju@lists.ubuntu.com
> Modify settings or unsubscribe at: 
> https://lists.ubuntu.com/mailman/listinfo/juju



-- 
Antonio Rosales
Ecosystem Engineering
Canonical
