Seeking interest and a champion for bifroest - a backend for graphite-web, on Apache Cassandra

2014-10-07 Thread Harald Kraemer
Hi,

we have been allowed to open-source one of our company internal projects -
currently called Bifroest.  Bifroest is a storage backend for graphite-web,
based on Apache Cassandra. I'm quite happy about this, and now I'm in the
process of finding the best options and means to do so. This mail isn't an
entire proposal yet, but I will try to stick at least to the major points
in a proposal.

What does Bifroest do, and where does it come from?

At Goodgame Studios, we used Munin for most of our monitoring, using a lot
of custom plugins for our servers and pushing 500 - 700 hosts around.
That's ambitious with Munin, and by now the munin-master is not able to
take the stress anymore.
As such, we started to evaluate Graphite, since Graphite is the
state-of-the-art monitoring solution at larger scale. To start evaluating
it, we deployed Graphite with a carbon backend on a virtual machine. Our
senior monitoring admin (whom we didn't have back then) probably just had
to giggle a bit without knowing why - things didn't perform that well on a
virtual machine. It could handle the important data, but the system didn't
seem to scale that well.
An admin would have tossed hardware at this, SSD RAIDs and all that,
naturally. But we are software engineers, not admins, thus we tossed
software at it (until we required hardware) :)

Our intention was to have a Graphite with its data stored in a distributed
database. A distributed database would scale both in storage space and in
the load the system can deal with. And it's all behind a well-defined
interface. That seemed like a nifty feature for a scalable monitoring
system.
Hence, we tried Cyanide, since Cyanide was just that. We tossed a lot of
data into Apache Cassandra, clicked on the metric tree and... well. Nothing
happened, since Cyanide figured that a select * across several 100k rows
is a grand idea. After that, we looked at InfluxDB, but at the time we
started developing this, InfluxDB didn't support data aggregation and
seemed to be in a very, very early stage of development.

Thus, the first thought of bifroest was born: Why don't we take the good
parts of Cyanide, a solid distributed database, such as Apache Cassandra,
and the good parts of carbon and toss them in a big stew?

That's what we did, and that's what we are currently deploying as our
production monitoring system: Graphite on Bifroest as a frontend for Apache
Cassandra.

Fun features of this system include:
 - Existing graphite and most carbon APIs:
 -- Full support of the graphite REST API, since we are just a backend.
 -- Support for the Plaintext Protocol of Carbon
 -- Planned: An AMQP interface to handle globally distributed networks
 - Neat things, which graphite could do as well:
 -- A fast key cache
 -- A fast value-cache, which is fed by the data collection to hit the
database as little as possible
 - New things, Graphite+carbon+whisper cannot do:
 -- On the fly adjustable retention levels. You don't have the space to
keep 6 weeks of 1m data? Just reduce it. Or increase it. Our system can do
that on the fly.
 -- Currently in progress: On the fly addition of new retention levels.
Have an emergency and need data in greater resolution? Just add a retention
level with 1 datapoint / 5s, keep the full data history, tell your data
collection to collect more data, and delete it again later on without
losing data.
 -- High fault tolerance. We are relying on Cassandra for persistent
storage, and a properly deployed Cassandra cluster with redundancy just
doesn't care. Add a new machine, tell everything to rebuild the cluster, and
the frontend doesn't even notice the outage.
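
To make two items from the list above concrete, here is a minimal Python
sketch. It is not Bifroest code. The first two functions show the Carbon
plaintext protocol, where each datapoint is one line of the form
"path value timestamp". The third shows the arithmetic behind a retention
level such as "6 weeks of 1m data". All function names and the metric name
are made up for illustration.

```python
import time

WEEK_SECONDS = 7 * 24 * 3600


def format_plaintext_line(path, value, timestamp=None):
    """Render one datapoint as a Carbon plaintext-protocol line.

    The line format is "<metric path> <value> <unix timestamp>\n".
    """
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def parse_plaintext_line(line):
    """Split a plaintext-protocol line back into (path, value, timestamp)."""
    path, value, timestamp = line.split()
    return path, float(value), int(timestamp)


def datapoints_kept(seconds_per_point, retention_seconds):
    """Number of datapoints one metric holds at a given retention level."""
    return retention_seconds // seconds_per_point


# Hypothetical metric name, fixed timestamp so the output is reproducible.
line = format_plaintext_line("servers.web01.load", 0.42, 1412672340)
parsed = parse_plaintext_line(line)

# 6 weeks at one point per minute versus one point per 5 seconds.
per_minute = datapoints_kept(60, 6 * WEEK_SECONDS)
per_5s = datapoints_kept(5, 6 * WEEK_SECONDS)
```

With these numbers, a 1m level over 6 weeks holds 60480 points per metric,
and a 5s level over the same period holds twelve times as many, which is why
being able to add and drop such a level on demand is attractive.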

So, after this wall of text, there are two questions from me:

a) Is this project interesting enough for everyone? :)
b) Are there people who would volunteer to coach me and my team through the
proposal and the incubator?

Regards,
Harald.
-- 

*Harald Krämer*
Server Developer (Profiling first)
*hkrae...@goodgamestudios.com*

Goodgame Studios
Theodorstr. 42-90, House 9
22761 Hamburg, Germany
Phone: +49 (0)40 219 880 -0
*http://www.goodgamestudios.com*

Goodgame Studios is a branch of Altigi GmbH
Altigi GmbH, District court Hamburg, HRB 99869
Board of directors: Dr. Kai Wawrzinek, Dr. Christian Wawrzinek, Fabian
Ritter


Re: Seeking interest and a champion for bifroest - a backend for graphite-web, on Apache Cassandra

2014-10-07 Thread Harald Kraemer
Hi,

Am I looking at the right pull requests with graphite-project/carbon#210
and #216?

Quite interesting. Sadly, I don't think I could provide exactly that Python
API with our existing storage input frontend.

Just look out for the issues we fixed in
https://github.com/graphite-project/graphite-web/pull/698 :)

- Harald

2014-10-07 14:35 GMT+02:00 Jake Farrell jfarr...@apache.org:

 Hi Harald
 I have been working on a similar project which enables carbon to have a
 pluggable backend storage system that leverages Apache Cassandra for
 storage. I opened pull requests in both carbon and graphite for the
 pluggable backend portion; the Cassandra backend is still in the works.
 I am very familiar with your project's topic and with all the related
 technologies mentioned. I would be happy to help as either a
 champion or a mentor for this project.

 -Jake


