Hi Harald,

It looks very interesting.
Don't hesitate to ping me if you need any help.

Regards
JB

On 10/07/2014 10:59 AM, Harald Kraemer wrote:
Hi,

we have been allowed to open-source one of our company-internal projects,
currently called Bifroest. Bifroest is a storage backend for graphite-web,
based on Apache Cassandra. I'm quite happy about this, and I'm now in the
process of finding the best way to go about it. This mail isn't a complete
proposal yet, but I will try to cover at least the major points of one.

What does Bifroest do, and where does it come from?

At GoodgameStudios, we used Munin for most of our monitoring, with a lot
of custom plugins for our servers, pushing 500 - 700 hosts around.
That's ambitious with Munin, and by now the munin-master can no longer
take the stress.
So we started to evaluate Graphite, since Graphite is the state-of-the-art
solution for larger-scale monitoring. To start the evaluation, we
deployed Graphite with a Carbon backend on a virtual machine. Our senior
monitoring admin (whom we didn't have back then) probably just had to
giggle a bit without knowing why: things didn't perform that well on a
virtual machine. It could handle the important data, but the system didn't
seem to scale well.
An admin would naturally have tossed hardware at this, SSD RAIDs and all
that. But we are software engineers, not admins, so we tossed
software at it (until we required hardware) :)

Our intention was to run Graphite with its data stored in a distributed
database. A distributed database scales both in storage space and in
the load the system can handle. And it's all behind a well-defined
interface. That seemed like a nifty feature for a scalable monitoring
system.
Hence, we tried Cyanide, since Cyanide was just that. We tossed a lot of
data into Apache Cassandra, clicked on the metric tree and... well. Nothing
happened, since Cyanide figured that a "select *" across several 100k rows
is a grand idea. After that, we looked at InfluxDB, but at the time we
started developing this, InfluxDB didn't support data aggregation and
seemed to be at a very, very early stage of development.

Thus, the first idea for Bifroest was born: why don't we take the good
parts of Cyanide, a solid distributed database such as Apache Cassandra,
and the good parts of Carbon, and toss them into a big stew?

That's what we did, and that's what we are currently deploying as our
production monitoring system: Graphite on Bifroest as a frontend for Apache
Cassandra.

Fun features of this system include:
  - Existing Graphite and most Carbon APIs:
  -- Full support of the Graphite REST API, since we are just a backend.
  -- Support for the plaintext protocol of Carbon
  -- Planned: an AMQP interface to handle globally distributed networks
  - Neat things which Graphite could do as well:
  -- A fast key cache
  -- A fast value cache, which is fed by the data collection to hit the
database as little as possible
  - New things Graphite + Carbon + Whisper cannot do:
  -- Retention levels that are adjustable on the fly. You don't have the
space to keep 6 weeks of 1m data? Just reduce it. Or increase it. Our
system can do that on the fly.
  -- Currently in progress: adding new retention levels on the fly.
Have an emergency and need data in greater resolution? Just add a retention
level with 1 datapoint / 5s, keep the full data history, tell your data
collection to collect more data, and delete the level again later on
without losing data.
  -- High fault tolerance. We rely on Cassandra for persistent
storage, and a properly deployed Cassandra cluster with redundancy just
doesn't care if a node drops out. Add a new machine, tell everything to
rebuild the cluster, and the frontend won't even notice the outage.
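
For readers who haven't used Carbon's plaintext protocol mentioned in the
list above: each datapoint is a single line of the form
"<metric path> <value> <unix timestamp>" terminated by a newline, usually
sent over TCP (stock Carbon listens on port 2003 by default). The following
is only a minimal sketch of a client for that protocol, not Bifroest code.
The metric name, and the host and port you would pass to send_metric, are
assumptions for illustration; this mail doesn't say which port Bifroest
listens on.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    # One Carbon plaintext line: metric path, value and Unix timestamp,
    # separated by single spaces and terminated by a newline.
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(host, port, path, value, timestamp=None):
    # Open a TCP connection to a plaintext listener and send one datapoint.
    # host and port are whatever your collector is configured to use.
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Hypothetical metric name; this only prints the line and sends nothing.
    print(format_metric("servers.web01.load.shortterm", 0.42, 1412672340), end="")
```

Because the sender only needs a socket and a formatted string, existing
collectors that already speak this protocol shouldn't need changes to feed
a backend that accepts it.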

So, after this wall of text, there are two questions from me:

a) Is this project interesting enough for everyone? :)
b) Are there people who would volunteer to coach me and my team through the
proposal and the incubator?

Regards,
Harald.


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org
