Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Todd Lipcon
On Sun, Nov 21, 2010 at 6:25 PM, Jonathan Ellis  wrote:

> On Sun, Nov 21, 2010 at 6:16 PM, Todd Lipcon  wrote:
> > [only jumping in because info was requested - those who know me know that I
> > think Cassandra is a very interesting architecture and a better fit for many
> > applications than HBase]
>
> Hey Todd!  Good to see you de-lurk!
>

Howdy :)

> > In the other mode of operation (default in recent versions of HBase) we do
> > not acknowledge a write until it has been pushed to the OS buffer on the
> > entire pipeline of log replicas.
>
> You mean "to the disk," I assume?
>
>
Actually, just to the OS buffers. So if the entire cluster loses power
simultaneously you will lose data. If you lose just one rack, though, you'll
be OK, since the pipeline always spans two racks in a multi-rack installation.
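(In HDFS terms this is the append pipeline's sync()/hflush() call: it
guarantees the data has reached every datanode in the pipeline, but each
datanode has only written it to OS buffers, with no fsync to disk.)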

There are plans to add an API that fully fsync()s the replicas, but so far
there has been little demand: in memory on three replicas is "safe enough"
for most use cases, and of course a lot more performant.

-Todd


Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Jonathan Ellis
On Sun, Nov 21, 2010 at 6:16 PM, Todd Lipcon  wrote:
> [only jumping in because info was requested - those who know me know that I
> think Cassandra is a very interesting architecture and a better fit for many
> applications than HBase]

Hey Todd!  Good to see you de-lurk!

> In the other mode of operation (default in recent versions of HBase) we do
> not acknowledge a write until it has been pushed to the OS buffer on the
> entire pipeline of log replicas.

You mean "to the disk," I assume?

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Dave Viner
I don't know the details of HBase's operation, so I can't speak on that
point.  But I do know that Facebook hired Jonathan Gray, former CTO of
Streamy, who is a huge HBase contributor. Streamy ended in March 2010,
although I'm not sure when he went to work for Facebook.

He presented on HBase at Hadoop World in NYC in October:
http://mpouttuclarke.wordpress.com/2010/10/18/notes-from-hadoop-world-2010-nyc/

Again, I don't know the chronology (whether he was hired before the decision
to use HBase or after). But I know that Jonathan is a fantastically smart
(and extremely nice) guy, and I'm sure he could make HBase bend to his will
at any point.

Dave Viner

On Sun, Nov 21, 2010 at 4:16 PM, Todd Lipcon  wrote:

> On Sun, Nov 21, 2010 at 2:06 PM, Edward Ribeiro 
> wrote:
>
>>
>>> Also, I believe saying HBase is consistent is not true. This can happen:
>>> write to region server -> region server acknowledges client -> write
>>> to WAL -> region server fails = write lost
>>>
>>> I wonder how Facebook will reconcile that. :)
>>>
>>
>> Are you sure about that? Does the client write to the WAL before acking
>> the user?
>>
>> According to these posts [1][2], "if writing the record to the WAL fails,
>> the whole operation must be considered a failure," so it would be nonsense
>> to acknowledge clients before writing to the log. I hope a Cloudera guy can
>> explain this...
>>
>>
> [only jumping in because info was requested - those who know me know that I
> think Cassandra is a very interesting architecture and a better fit for many
> applications than HBase]
>
> You can operate the commit log in two different modes in HBase. One mode is
> "deferred log flush", where the region server appends to the commit log on
> every write but only sync()s it to HDFS on a periodic basis (e.g. once a
> second). This is similar to the innodb_flush_log_at_trx_commit=2 option in
> MySQL, for example. This obviously has slightly better performance, since
> the writer doesn't need to wait on the commit, but as you noted there's a
> window where a write may be acknowledged and then lost. This is an issue of
> *durability* more so than consistency.
>
> In the other mode of operation (the default in recent versions of HBase) we
> do not acknowledge a write until it has been pushed to the OS buffer on the
> entire pipeline of log replicas. Obviously this is slower, but it results in
> "no lost data" regardless of any machine failures. Additionally, concurrent
> readers do not see written data until these same properties have been
> satisfied, so this mode is 100% consistent and 100% durable. In practice,
> this affects latency significantly, since it adds two extra round trips to
> each write, but system throughput is only reduced by 20-30% since the
> commits are pipelined (see HDFS-895 for the gory details).
>
> I believe Cassandra has similar tuning options about whether to sync every
> commit to the log or only do so periodically.
>
> If you're interested in learning more, feel free to reference this
> documentation:
> http://hbase.apache.org/docs/r0.89.20100726/acid-semantics.html
>
>
>
>> Besides that, you know that the WAL is written to HDFS, which takes care of
>> replication and fault tolerance, right? Of course, even so, there's a
>> "window of inconsistency" before the HLog is flushed to disk, but I don't
>> think you can dismiss this as not consistent. At most, you may classify it
>> as "eventually consistent". :)
>>
>> [1] http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
>> [2]
>> http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
>>
>> E. Ribeiro
>>
>>
>


Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Todd Lipcon
On Sun, Nov 21, 2010 at 2:06 PM, Edward Ribeiro wrote:

>
>> Also, I believe saying HBase is consistent is not true. This can happen:
>> write to region server -> region server acknowledges client -> write
>> to WAL -> region server fails = write lost
>>
>> I wonder how Facebook will reconcile that. :)
>>
>
> Are you sure about that? Does the client write to the WAL before acking
> the user?
>
> According to these posts [1][2], "if writing the record to the WAL fails,
> the whole operation must be considered a failure," so it would be nonsense
> to acknowledge clients before writing to the log. I hope a Cloudera guy can
> explain this...
>
>
[only jumping in because info was requested - those who know me know that I
think Cassandra is a very interesting architecture and a better fit for many
applications than HBase]

You can operate the commit log in two different modes in HBase. One mode is
"deferred log flush", where the region server appends to the commit log on
every write but only sync()s it to HDFS on a periodic basis (e.g. once a
second). This is similar to the innodb_flush_log_at_trx_commit=2 option in
MySQL, for example. This obviously has slightly better performance, since
the writer doesn't need to wait on the commit, but as you noted there's a
window where a write may be acknowledged and then lost. This is an issue of
*durability* more so than consistency.
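
For the curious, here's roughly how you'd opt a table into deferred log
flush with the client API of this era (method names from memory, so treat
them as illustrative and double-check against your version):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.client.HBaseAdmin;

  public class DeferredFlushExample {
      public static void main(String[] args) throws Exception {
          Configuration conf = HBaseConfiguration.create();
          HBaseAdmin admin = new HBaseAdmin(conf);

          HTableDescriptor desc = new HTableDescriptor("messages"); // hypothetical table
          desc.addFamily(new HColumnDescriptor("body"));
          // Trade durability for latency: edits are synced to the HDFS
          // pipeline on a timer instead of on every write, analogous to
          // innodb_flush_log_at_trx_commit=2.
          desc.setDeferredLogFlush(true);

          admin.createTable(desc);
      }
  }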

In the other mode of operation (the default in recent versions of HBase) we
do not acknowledge a write until it has been pushed to the OS buffer on the
entire pipeline of log replicas. Obviously this is slower, but it results in
"no lost data" regardless of any machine failures. Additionally, concurrent
readers do not see written data until these same properties have been
satisfied, so this mode is 100% consistent and 100% durable. In practice,
this affects latency significantly, since it adds two extra round trips to
each write, but system throughput is only reduced by 20-30% since the
commits are pipelined (see HDFS-895 for the gory details).

I believe Cassandra has similar tuning options for whether to sync every
commit to the log or only do so periodically.
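
For reference, the corresponding knobs in Cassandra 0.7's cassandra.yaml
are, from memory (double-check the names and defaults against your version):

  commitlog_sync: periodic              # fsync the log every so often
  commitlog_sync_period_in_ms: 10000

  # ...or, to group-commit and fsync before acknowledging each write:
  commitlog_sync: batch
  commitlog_sync_batch_window_in_ms: 50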

If you're interested in learning more, feel free to reference this
documentation:
http://hbase.apache.org/docs/r0.89.20100726/acid-semantics.html



> Besides that, you know that the WAL is written to HDFS, which takes care of
> replication and fault tolerance, right? Of course, even so, there's a
> "window of inconsistency" before the HLog is flushed to disk, but I don't
> think you can dismiss this as not consistent. At most, you may classify it
> as "eventually consistent". :)
>
> [1] http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
> [2]
> http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html
>
> E. Ribeiro
>
>


Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Edward Ribeiro
> Also, I believe saying HBase is consistent is not true. This can happen:
> write to region server -> region server acknowledges client -> write
> to WAL -> region server fails = write lost
>
> I wonder how Facebook will reconcile that. :)
>

Are you sure about that? Does the client write to the WAL before acking the
user?

According to these posts [1][2], "if writing the record to the WAL fails,
the whole operation must be considered a failure," so it would be nonsense
to acknowledge clients before writing to the log. I hope a Cloudera guy can
explain this...

Besides that, you know that the WAL is written to HDFS, which takes care of
replication and fault tolerance, right? Of course, even so, there's a
"window of inconsistency" before the HLog is flushed to disk, but I don't
think you can dismiss this as not consistent. At most, you may classify it
as "eventually consistent". :)

[1] http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
[2]
http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html

E. Ribeiro


Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Jake Luciani
+1 Ed 

 

On Nov 21, 2010, at 12:13 PM, Edward Capriolo  wrote:

> On Sun, Nov 21, 2010 at 12:10 PM, André Fiedler
>  wrote:
>> Facebook Messaging – HBase Comes of Age
>> 
>> http://facility9.com/2010/11/18/facebook-messaging-hbase-comes-of-age
>> 
>> 2010/11/21 David Boxenhorn 
>>> 
>>> Eventual consistency is not good enough for instant messaging.
>>> 
>>> On Sun, Nov 21, 2010 at 6:32 PM, Simon Reavely 
>>> wrote:
 
 (Posting this to both user + dev lists)
 
 I was reviewing the blog post on the facebook engineering blog from nov
 15th
 http://www.facebook.com/note.php?note_id=454991608919#
 
 The Underlying Technology of Messages
 by Kannan Muthukkaruppan 
 
 
 
 As a cassandra user I think the key sentence for this community is:
 "We found Cassandra's eventual consistency model to be a difficult
 pattern
 to reconcile for our new Messages infrastructure."
 
 I think it would be useful to find out more about this statement from
 Kannan
 and the facebook team. Does anyone have any contacts in the Facebook
 team?
 
 My goal here is to understand usage patterns and whether or not the
 Cassandra community can learn from this decision; maybe even understand
 whether the Cassandra roadmap should be influenced by this decision to
 address a target user base. Of course we might also conclude that it's
 just
 "not a Cassandra use-case"!
 
 Cheers,
 Simon
 --
 Simon Reavely
 simon.reav...@gmail.com
>>> 
>> 
>> 
> 
> 
> 
> On Sun, Nov 21, 2010 at 11:40 AM, David Boxenhorn  wrote:
>> Eventual consistency is not good enough for instant messaging.
>> 
>> On Sun, Nov 21, 2010 at 6:32 PM, Simon Reavely 
>> wrote:
>>> 
>>> (Posting this to both user + dev lists)
>>> 
>>> I was reviewing the blog post on the facebook engineering blog from nov
>>> 15th
>>> http://www.facebook.com/note.php?note_id=454991608919#
>>> 
>>> The Underlying Technology of Messages
>>> by Kannan Muthukkaruppan 
>>> 
>>> 
>>> 
>>> As a cassandra user I think the key sentence for this community is:
>>> "We found Cassandra's eventual consistency model to be a difficult pattern
>>> to reconcile for our new Messages infrastructure."
>>> 
>>> I think it would be useful to find out more about this statement from
>>> Kannan
>>> and the facebook team. Does anyone have any contacts in the Facebook team?
>>> 
>>> My goal here is to understand usage patterns and whether or not the
>>> Cassandra community can learn from this decision; maybe even understand
>>> whether the Cassandra roadmap should be influenced by this decision to
>>> address a target user base. Of course we might also conclude that it's just
>>> "not a Cassandra use-case"!
>>> 
>>> Cheers,
>>> Simon
>>> --
>>> Simon Reavely
>>> simon.reav...@gmail.com
>> 
>> 
> 
> Jonathan Ellis pointed out a term that I like better: "tunable
> consistency". It seems that "eventual consistency" confuses everyone,
> or else it is an easy target of an anti-Cassandra public relations
> campaign. If you want consistency, use:
> 
> WRITE.ALL + READ.ONE (hinted handoff off)
> WRITE.QUORUM + READ.QUORUM
> WRITE.ONE + READ.ALL
> 
> Also, I believe saying HBase is consistent is not true. This can happen:
> write to region server -> region server acknowledges client -> write
> to WAL -> region server fails = write lost
>
> I wonder how Facebook will reconcile that. :)
> 
> Not trying to be nitpicky; at Hadoop World in NYC I got to sit with
> lots of the HBase guys and we all had a great time talking about the
> mutual issues and happiness both of our communities share.
> 
> We cannot speak for Facebook, but they likely chose HBase because they
> have several Hadoop core developers and a large Hadoop deployment. I
> would say the decision was probably based on several things. The current
> Cassandra release does not do online schema updates; I am sure Facebook
> does not want to restart 10,000 Cassandra servers for a schema change.
> The current release does not have memtable tuning per column family. The
> upcoming Cassandra release has support for both of these things and
> many, many more awesome things.
> 
> Facebook is on the high end of how much data they have to manage and
> how many servers they have. Most people do not share that use case. We
> can learn that Facebook chose software that was good for them based on
> their use case and the experience they have in house. Something
> everyone should do.


Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Eric Evans
On Sun, 2010-11-21 at 11:32 -0500, Simon Reavely wrote:
> As a cassandra user I think the key sentence for this community is:
> "We found Cassandra's eventual consistency model to be a difficult
> pattern to reconcile for our new Messages infrastructure."

In my experience, "we needed strong consistency" in conversations like
these amounts to hand-waving.  It's the fastest way to shut down that
part of the discussion without having said anything at all.

> I think it would be useful to find out more about this statement from
> Kannan and the facebook team. Does anyone have any contacts in the
> Facebook team?

Good luck.  Facebook is notoriously tight-lipped about such things.

> My goal here is to understand usage patterns and whether or not the
> Cassandra community can learn from this decision; maybe even
> understand whether the Cassandra roadmap should be influenced by this
> decision to address a target user base. Of course we might also
> conclude that it's just "not a Cassandra use-case"!

Understanding is a laudable goal; just try to avoid drawing conclusions
(and call out others who are).


This is usually the point where a frenzy kicks in and folks assume that
the Smart Guys at Facebook know something they don't, something that
would invalidate their decision if they'd only known.

I seriously doubt they've uncovered some Truth that would fundamentally
alter the reasoning behind *my* decision to use Cassandra, and so I plan
to continue as I always have: following relevant research and
development, collecting experience (my own and others'), and applying it
to the problems I face.


-- 
Eric Evans
eev...@rackspace.com



Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Aaron Morton
I've been using "Tunable Consistency" when explaining Cassandra to people as well. I've also started pointing out the different consistency models of the standard transaction isolation levels of an ACID-compliant DB, as well as "eventually consistent" async replication like MySQL / CouchDB.

I've also been saying that consistency needs to be thought of with regard to scope and time. For example, one replica may be inconsistent with the rest of the replicas and you can still get a consistent read out. You can have inconsistent data in the system that may become consistent when read (via read repair). But the entire cluster (or perhaps it's better to say all replicas of the same data) will eventually store a consistent view of the truth.

But like you say, you *can* be consistent with regard to your reads and writes, when using the correct CL. The different combinations of read and write CL are really interesting.

Cheers,
Aaron

On 22 Nov, 2010, at 06:13 AM, Edward Capriolo wrote:

On Sun, Nov 21, 2010 at 12:10 PM, André Fiedler
 wrote:
> Facebook Messaging – HBase Comes of Age
>
> http://facility9.com/2010/11/18/facebook-messaging-hbase-comes-of-age
>
> 2010/11/21 David Boxenhorn 
>>
>> Eventual consistency is not good enough for instant messaging.
>>
>> On Sun, Nov 21, 2010 at 6:32 PM, Simon Reavely 
>> wrote:
>>>
>>> (Posting this to both user + dev lists)
>>>
>>> I was reviewing the blog post on the facebook engineering blog from nov
>>> 15th
>>> http://www.facebook.com/note.php?note_id=454991608919#
>>> 
>>> The Underlying Technology of Messages
>>> by Kannan Muthukkaruppan 
>>>
>>>
>>>
>>> As a cassandra user I think the key sentence for this community is:
>>> "We found Cassandra's eventual consistency model to be a difficult
>>> pattern
>>> to reconcile for our new Messages infrastructure."
>>>
>>> I think it would be useful to find out more about this statement from
>>> Kannan
>>> and the facebook team. Does anyone have any contacts in the Facebook
>>> team?
>>>
>>> My goal here is to understand usage patterns and whether or not the
>>> Cassandra community can learn from this decision; maybe even understand
>>> whether the Cassandra roadmap should be influenced by this decision to
 address a target user base. Of course we might also conclude that it's
>>> just
>>> "not a Cassandra use-case"!
>>>
>>> Cheers,
>>> Simon
>>> --
>>> Simon Reavely
>>> simon.reav...@gmail.com
>>
>
>



On Sun, Nov 21, 2010 at 11:40 AM, David Boxenhorn  wrote:
> Eventual consistency is not good enough for instant messaging.
>
> On Sun, Nov 21, 2010 at 6:32 PM, Simon Reavely 
> wrote:
>>
>> (Posting this to both user + dev lists)
>>
>> I was reviewing the blog post on the facebook engineering blog from nov
>> 15th
>> http://www.facebook.com/note.php?note_id=454991608919#
>> 
>> The Underlying Technology of Messages
>> by Kannan Muthukkaruppan 
>>
>>
>>
>> As a cassandra user I think the key sentence for this community is:
>> "We found Cassandra's eventual consistency model to be a difficult pattern
>> to reconcile for our new Messages infrastructure."
>>
>> I think it would be useful to find out more about this statement from
>> Kannan
>> and the facebook team. Does anyone have any contacts in the Facebook team?
>>
>> My goal here is to understand usage patterns and whether or not the
>> Cassandra community can learn from this decision; maybe even understand
>> whether the Cassandra roadmap should be influenced by this decision to
>> address a target user base. Of course we might also conclude that it's just
>> "not a Cassandra use-case"!
>>
>> Cheers,
>> Simon
>> --
>> Simon Reavely
>> simon.reav...@gmail.com
>
>

Jonathan Ellis pointed out a term that I like better: "tunable
consistency". It seems that "eventual consistency" confuses everyone,
or else it is an easy target of an anti-Cassandra public relations
campaign. If you want consistency, use:

WRITE.ALL + READ.ONE (hinted handoff off)
WRITE.QUORUM + READ.QUORUM
WRITE.ONE + READ.ALL

Also, I believe saying HBase is consistent is not true. This can happen:
write to region server -> region server acknowledges client -> write
to WAL -> region server fails = write lost

I wonder how Facebook will reconcile that. :)

Not trying to be nitpicky; at Hadoop World in NYC I got to sit with
lots of the HBase guys and we all had a great time talking about the
mutual issues and happiness both of our communities share.

We cannot speak for Facebook, but they likely chose HBase because they
have several Hadoop core developers and a large Hadoop deployment. I
would say the decision was probably based on several things. The current
Cassandra release does not do online schema updates; I am sure Facebook
does not want to restart 10,000 Cassandra servers for a schema change.
The current release does not have memtable tuning per column family. The
upcoming Cassandra release has support for both of these things and
many, many more awesome things.

Facebook is on the high end of how much data they have to manage and
how many servers they have. Most people do not share that use case. We
can learn that Facebook chose software that was good for them based on
their use case and the experience they have in house. Something
everyone should do.

(newbie) ColumnFamilyOutputFormat only writes one column (per key)

2010-11-21 Thread mck
(I'm new here so forgive any mistakes or mis-presumptions...)

I've set up a cassandra-0.7.0-beta3 cluster and populated it with
thrift-serialised objects via a scribe server. This seems a great way to
get thrift beans out of the application asap and have them sitting in
cassandra for later processing.

I then went to write an m/r job that deserialises the thrift objects and
aggregates the data accordingly into a new column family. But what I've
found is that ColumnFamilyOutputFormat will only write out one column
per key.

Alex Burkoff also reported this nearly two months ago, but nobody ever
replied...
 http://article.gmane.org/gmane.comp.db.cassandra.user/9325

Has anyone any ideas?
Should it be possible to write multiple columns out?

This is very easy to reproduce. Use the contrib/wordcount example with
OUTPUT_REDUCER=cassandra, and in WordCount.java add at line 132:

>  results.add(getMutation(key, sum));
> +results.add(getMutation(new Text("doubled"), sum*2));

Only the last mutation for any key seems to be written.
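
For context, here's roughly what the modified reducer looks like (a sketch
from memory of the 0.7 contrib example; getMutation(Text, int) is the helper
already in WordCount.java that wraps a Thrift Column in a Mutation, and the
exact surrounding types are illustrative, not verified):

  // A sketch of the modified reduce(); it drops into the reducer class in
  // contrib/word_count's WordCount.java, which has the needed imports.
  public void reduce(Text word, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException
  {
      int sum = 0;
      for (IntWritable val : values)
          sum += val.get();

      List<Mutation> results = new ArrayList<Mutation>();
      results.add(getMutation(word, sum));                    // column <word> = sum
      results.add(getMutation(new Text("doubled"), sum * 2)); // column "doubled" = 2*sum

      // Both mutations share one row key, so both columns should land in the
      // same row -- yet only the last one shows up.
      context.write(ByteBuffer.wrap(word.getBytes(), 0, word.getLength()), results);
  }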


~mck

-- 
echo '[q]sa[ln0=aln256%
Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc 

| www.semb.wever.org | www.sesat.no 
| www.finn.no| http://xss-http-filter.sf.net





Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Edward Capriolo
On Sun, Nov 21, 2010 at 12:10 PM, André Fiedler
 wrote:
> Facebook Messaging – HBase Comes of Age
>
> http://facility9.com/2010/11/18/facebook-messaging-hbase-comes-of-age
>
> 2010/11/21 David Boxenhorn 
>>
>> Eventual consistency is not good enough for instant messaging.
>>
>> On Sun, Nov 21, 2010 at 6:32 PM, Simon Reavely 
>> wrote:
>>>
>>> (Posting this to both user + dev lists)
>>>
>>> I was reviewing the blog post on the facebook engineering blog from nov
>>> 15th
>>> http://www.facebook.com/note.php?note_id=454991608919#
>>> 
>>> The Underlying Technology of Messages
>>> by Kannan Muthukkaruppan 
>>>
>>>
>>>
>>> As a cassandra user I think the key sentence for this community is:
>>> "We found Cassandra's eventual consistency model to be a difficult
>>> pattern
>>> to reconcile for our new Messages infrastructure."
>>>
>>> I think it would be useful to find out more about this statement from
>>> Kannan
>>> and the facebook team. Does anyone have any contacts in the Facebook
>>> team?
>>>
>>> My goal here is to understand usage patterns and whether or not the
>>> Cassandra community can learn from this decision; maybe even understand
>>> whether the Cassandra roadmap should be influenced by this decision to
>>> address a target user base. Of course we might also conclude that it's
>>> just
>>> "not a Cassandra use-case"!
>>>
>>> Cheers,
>>> Simon
>>> --
>>> Simon Reavely
>>> simon.reav...@gmail.com
>>
>
>



On Sun, Nov 21, 2010 at 11:40 AM, David Boxenhorn  wrote:
> Eventual consistency is not good enough for instant messaging.
>
> On Sun, Nov 21, 2010 at 6:32 PM, Simon Reavely 
> wrote:
>>
>> (Posting this to both user + dev lists)
>>
>> I was reviewing the blog post on the facebook engineering blog from nov
>> 15th
>> http://www.facebook.com/note.php?note_id=454991608919#
>> 
>> The Underlying Technology of Messages
>> by Kannan Muthukkaruppan 
>>
>>
>>
>> As a cassandra user I think the key sentence for this community is:
>> "We found Cassandra's eventual consistency model to be a difficult pattern
>> to reconcile for our new Messages infrastructure."
>>
>> I think it would be useful to find out more about this statement from
>> Kannan
>> and the facebook team. Does anyone have any contacts in the Facebook team?
>>
>> My goal here is to understand usage patterns and whether or not the
>> Cassandra community can learn from this decision; maybe even understand
>> whether the Cassandra roadmap should be influenced by this decision to
>> address a target user base. Of course we might also conclude that it's just
>> "not a Cassandra use-case"!
>>
>> Cheers,
>> Simon
>> --
>> Simon Reavely
>> simon.reav...@gmail.com
>
>

Jonathan Ellis pointed out a term that I like better: "tunable
consistency". It seems that "eventual consistency" confuses everyone,
or else it is an easy target of an anti-Cassandra public relations
campaign. If you want consistency, use:

WRITE.ALL + READ.ONE (hinted handoff off)
WRITE.QUORUM + READ.QUORUM
WRITE.ONE + READ.ALL
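
Concretely, with RF=3, QUORUM is 2, so a QUORUM write and a QUORUM read
each touch at least two replicas; since 2 + 2 > 3 the two sets must overlap,
and the read sees the write. A minimal sketch against the 0.7 Thrift API,
using the stock Keyspace1/Standard1 sample schema (error handling omitted;
treat the exact generated signatures as illustrative):

  import java.nio.ByteBuffer;
  import org.apache.cassandra.thrift.*;
  import org.apache.thrift.protocol.TBinaryProtocol;
  import org.apache.thrift.transport.TFramedTransport;
  import org.apache.thrift.transport.TSocket;

  public class QuorumExample {
      public static void main(String[] args) throws Exception {
          TFramedTransport transport =
              new TFramedTransport(new TSocket("localhost", 9160));
          Cassandra.Client client =
              new Cassandra.Client(new TBinaryProtocol(transport));
          transport.open();
          client.set_keyspace("Keyspace1");

          ByteBuffer key = ByteBuffer.wrap("user1".getBytes("UTF-8"));
          Column col = new Column(
              ByteBuffer.wrap("status".getBytes("UTF-8")),
              ByteBuffer.wrap("online".getBytes("UTF-8")),
              System.currentTimeMillis() * 1000);

          // W=QUORUM: the write is acked only once a majority of replicas
          // have it.
          client.insert(key, new ColumnParent("Standard1"), col,
                        ConsistencyLevel.QUORUM);

          // R=QUORUM: the read consults a majority, so it must overlap the
          // write set and return the value just written.
          ColumnPath path = new ColumnPath("Standard1");
          path.setColumn("status".getBytes("UTF-8"));
          ColumnOrSuperColumn result =
              client.get(key, path, ConsistencyLevel.QUORUM);
          System.out.println(new String(result.column.getValue(), "UTF-8"));

          transport.close();
      }
  }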

Also, I believe saying HBase is consistent is not true. This can happen:
write to region server -> region server acknowledges client -> write
to WAL -> region server fails = write lost

I wonder how Facebook will reconcile that. :)

Not trying to be nitpicky; at Hadoop World in NYC I got to sit with
lots of the HBase guys and we all had a great time talking about the
mutual issues and happiness both of our communities share.

We cannot speak for Facebook, but they likely chose HBase because they
have several Hadoop core developers and a large Hadoop deployment. I
would say the decision was probably based on several things. The current
Cassandra release does not do online schema updates; I am sure Facebook
does not want to restart 10,000 Cassandra servers for a schema change.
The current release does not have memtable tuning per column family. The
upcoming Cassandra release has support for both of these things and
many, many more awesome things.

Facebook is on the high end of how much data they have to manage and
how many servers they have. Most people do not share that use case. We
can learn that Facebook chose software that was good for them based on
their use case and the experience they have in house. Something
everyone should do.


Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread André Fiedler
Facebook Messaging – HBase Comes of Age
http://facility9.com/2010/11/18/facebook-messaging-hbase-comes-of-age


2010/11/21 David Boxenhorn 

> Eventual consistency is not good enough for instant messaging.
>
> On Sun, Nov 21, 2010 at 6:32 PM, Simon Reavely wrote:
>
>> (Posting this to both user + dev lists)
>>
>> I was reviewing the blog post on the facebook engineering blog from nov
>> 15th
>> http://www.facebook.com/note.php?note_id=454991608919#
>> 
>> The Underlying Technology of Messages
>> by Kannan Muthukkaruppan 
>>
>>
>>
>>
>> As a cassandra user I think the key sentence for this community is:
>> "We found Cassandra's eventual consistency model to be a difficult pattern
>> to reconcile for our new Messages infrastructure."
>>
>> I think it would be useful to find out more about this statement from
>> Kannan
>> and the facebook team. Does anyone have any contacts in the Facebook team?
>>
>> My goal here is to understand usage patterns and whether or not the
>> Cassandra community can learn from this decision; maybe even understand
>> whether the Cassandra roadmap should be influenced by this decision to
>> address a target user base. Of course we might also conclude that it's just
>> "not a Cassandra use-case"!
>>
>> Cheers,
>> Simon
>> --
>> Simon Reavely
>> simon.reav...@gmail.com
>>
>
>


Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread David Boxenhorn
Eventual consistency is not good enough for instant messaging.

On Sun, Nov 21, 2010 at 6:32 PM, Simon Reavely wrote:

> (Posting this to both user + dev lists)
>
> I was reviewing the blog post on the facebook engineering blog from nov
> 15th
> http://www.facebook.com/note.php?note_id=454991608919#
> 
> The Underlying Technology of Messages
> by Kannan Muthukkaruppan 
>
>
>
> As a cassandra user I think the key sentence for this community is:
> "We found Cassandra's eventual consistency model to be a difficult pattern
> to reconcile for our new Messages infrastructure."
>
> I think it would be useful to find out more about this statement from
> Kannan
> and the facebook team. Does anyone have any contacts in the Facebook team?
>
> My goal here is to understand usage patterns and whether or not the
> Cassandra community can learn from this decision; maybe even understand
> whether the Cassandra roadmap should be influenced by this decision to
> address a target user base. Of course we might also conclude that it's just
> "not a Cassandra use-case"!
>
> Cheers,
> Simon
> --
> Simon Reavely
> simon.reav...@gmail.com
>


Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-21 Thread Simon Reavely
(Posting this to both user + dev lists)

I was reviewing the blog post on the facebook engineering blog from nov
15th
http://www.facebook.com/note.php?note_id=454991608919#

The Underlying Technology of Messages
by Kannan Muthukkaruppan 



As a cassandra user I think the key sentence for this community is:
"We found Cassandra's eventual consistency model to be a difficult pattern
to reconcile for our new Messages infrastructure."

I think it would be useful to find out more about this statement from Kannan
and the facebook team. Does anyone have any contacts in the Facebook team?

My goal here is to understand usage patterns and whether or not the
Cassandra community can learn from this decision; maybe even understand
whether the Cassandra roadmap should be influenced by this decision to
address a target user base. Of course we might also conclude that it's just
"not a Cassandra use-case"!

Cheers,
Simon
-- 
Simon Reavely
simon.reav...@gmail.com