Re: ORM in Cassandra?

2010-04-22 Thread dir dir
>So maybe it's weird to combine ORM and Cassandra, right? Is there
>anything we can take from ORM?

Honestly, I do not understand your question. It is clear that
you cannot combine an ORM such as Hibernate or iBATIS with Cassandra.
Cassandra itself is not an RDBMS, so you will not be mapping tables
onto objects.

Dir.

On Fri, Apr 23, 2010 at 12:12 PM, aXqd  wrote:

> Hi, all:
>
> I know many people regard O/R Mapping as rubbish. However it is
> undeniable that ORM is quite easy to use in most simple cases,
> Meanwhile Cassandra is well known as No-SQL solution, a.k.a.
> No-Relational solution.
> So maybe it's weird to combine ORM and Cassandra, right? Is there
> anything we can take from ORM?
> I just hate to write CRUD functions/Data layer for each object in even
> a disposable prototype program.
>
> Regards.
> -Tian
>


Re: ORM in Cassandra?

2010-04-22 Thread Michael Pearson
For PHP there's Pandra http://github.com/mjpearson/Pandra . As much as
I dislike PHP and ORMs generally (ironic, yes), PHP's array/iterator
interfaces make building a domain model on top of Cassandra a fairly
intuitive process.

-michael

On Fri, Apr 23, 2010 at 3:12 PM, aXqd  wrote:
> Hi, all:
>
> I know many people regard O/R Mapping as rubbish. However it is
> undeniable that ORM is quite easy to use in most simple cases,
> Meanwhile Cassandra is well known as No-SQL solution, a.k.a.
> No-Relational solution.
> So maybe it's weird to combine ORM and Cassandra, right? Is there
> anything we can take from ORM?
> I just hate to write CRUD functions/Data layer for each object in even
> a disposable prototype program.
>
> Regards.
> -Tian
>


Re: ORM in Cassandra?

2010-04-22 Thread Jeremy Dunck
See what you think of tragedy:
http://github.com/enki/tragedy


On Fri, Apr 23, 2010 at 12:12 AM, aXqd  wrote:
> Hi, all:
>
> I know many people regard O/R Mapping as rubbish. However it is
> undeniable that ORM is quite easy to use in most simple cases,
> Meanwhile Cassandra is well known as No-SQL solution, a.k.a.
> No-Relational solution.
> So maybe it's weird to combine ORM and Cassandra, right? Is there
> anything we can take from ORM?
> I just hate to write CRUD functions/Data layer for each object in even
> a disposable prototype program.
>
> Regards.
> -Tian
>


ORM in Cassandra?

2010-04-22 Thread aXqd
Hi, all:

I know many people regard O/R mapping as rubbish. However, it is
undeniable that ORM is quite easy to use in most simple cases.
Meanwhile, Cassandra is well known as a NoSQL solution, i.e. a
non-relational solution.
So maybe it's weird to combine ORM and Cassandra, right? Is there
anything we can take from ORM?
I just hate writing CRUD functions / a data layer for each object, even
in a disposable prototype program.

Regards.
-Tian
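One thing that can be taken from ORM without a relational model is a thin, generic mapping layer, so the CRUD code is written once per base class instead of once per object. The sketch below is purely illustrative: `FakeColumnFamily` is a hypothetical in-memory stand-in for a real client wrapper, not any particular library's API.

```python
# Illustrative sketch only: a minimal "ORM-lite" layer that maps an object's
# attributes to Cassandra-style columns, so CRUD code is written once instead
# of per class. FakeColumnFamily stands in for a real Thrift client wrapper;
# its insert/get interface is an assumption for the example.

class FakeColumnFamily:
    def __init__(self):
        self.rows = {}

    def insert(self, key, columns):
        # Merge columns into the row, like a Cassandra insert
        self.rows.setdefault(key, {}).update(columns)

    def get(self, key):
        return self.rows[key]

class Model:
    # Subclasses declare `fields`; save/load then work for any subclass.
    fields = ()

    def __init__(self, key, **values):
        self.key = key
        for f in self.fields:
            setattr(self, f, values.get(f))

    def save(self, cf):
        cf.insert(self.key, {f: getattr(self, f) for f in self.fields})

    @classmethod
    def load(cls, cf, key):
        return cls(key, **cf.get(key))

class User(Model):
    fields = ("name", "email")

cf = FakeColumnFamily()
User("user1", name="Tian", email="t@example.com").save(cf)
print(cf.rows["user1"])   # {'name': 'Tian', 'email': 't@example.com'}
```

The point is that the per-object boilerplate disappears even though nothing relational (joins, foreign keys) is being mapped.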


Re: Row deletion and get_range_slices (cassandra 0.6.1)

2010-04-22 Thread David Harrison
Do those tombstoned keys ever get purged completely?  I've tried
shortening GCGraceSeconds right down, but they still don't get
cleaned up.

On 23 April 2010 08:57, Jonathan Ellis  wrote:
> http://wiki.apache.org/cassandra/FAQ#range_ghosts
>
> On Thu, Apr 22, 2010 at 5:29 PM, Carlos Sanchez
>  wrote:
>> I have a curious question..
>>
>> I am doing some testing where I insert 500 rows to a super column family and 
>> then delete one row, I make sure the row was indeed deleted 
>> (NotFoundException in the get call) and then I ran a get_range_slices and 
> > the row indeed returned. I then shut down Cassandra and restarted it. I repeated 
> > the test (with inserting the rows) and even though I get the 
> > NotFoundException for that row, get_range_slices still returns it.  Is 
>> this the expected behavior? How long should I wait before I don't see the 
>> row in the get_range_slices? Do I have to force a flush or change 
>> consistency level?
>>
>> Thanks,
>>
>> Carlos
>>
>> This email message and any attachments are for the sole use of the intended 
>> recipients and may contain proprietary and/or confidential information which 
>> may be privileged or otherwise protected from disclosure. Any unauthorized 
>> review, use, disclosure or distribution is prohibited. If you are not an 
>> intended recipient, please contact the sender by reply email and destroy the 
>> original message and any copies of the message as well as any attachments to 
>> the original message.
>>
>
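The FAQ entry linked above says clients should filter these "range ghosts" themselves: until the tombstone passes GCGraceSeconds and is removed by compaction, get_range_slices returns the deleted key with an empty column list. A minimal sketch of that client-side filter (the slice data is fabricated for illustration):

```python
# Sketch of the client-side workaround from the FAQ: deleted rows show up in
# get_range_slices as "range ghosts" -- keys with zero columns -- until their
# tombstones are collected. The raw_slices data below is made up.

raw_slices = [
    ("row1", {"col": "a"}),
    ("row2", {}),            # ghost: deleted, tombstone not yet collected
    ("row3", {"col": "c"}),
]

def live_rows(slices):
    """Skip range ghosts: rows whose column list is empty."""
    return [(key, cols) for key, cols in slices if cols]

print(live_rows(raw_slices))  # [('row1', {'col': 'a'}), ('row3', {'col': 'c'})]
```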


Re: getting cassandra setup on windows 7

2010-04-22 Thread S Ahmed
I was just reading that, thanks.

What does he mean when he says:

"This appears to be related to data storage paths I set, because if I switch
the paths back to the default UNIX paths. Everything runs fine"

On Thu, Apr 22, 2010 at 11:07 PM, Jonathan Ellis  wrote:

> https://issues.apache.org/jira/browse/CASSANDRA-948
>
> On Thu, Apr 22, 2010 at 10:03 PM, S Ahmed  wrote:
> > Ok so I found the config section:
> >
> E:\java\cassandra\apache-cassandra-0.6.1-bin\apache-cassandra-0.6.1\commitlog
> >   
> >
> >
>  
> E:\java\cassandra\apache-cassandra-0.6.1-bin\apache-cassandra-0.6.1\data
> >   
> >
> > Now when I run:
> > bin/cassandra
> > I get:
> > Starting cassandra server
> > listening for transport dt_socket at address:
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/apache/cassandra/thrift/CassandraDaemon
> > Could not find the main class:
> > org.apache.cassandra.thrift.CassandraDaemon...
> >
> >
> >
> >
> >
> > On Thu, Apr 22, 2010 at 10:53 PM, S Ahmed  wrote:
> >>
> >> So I uncompressed the .tar, in the readme it says:
> >> * tar -zxvf cassandra-$VERSION.tgz
> >>   * cd cassandra-$VERSION
> >>   * sudo mkdir -p /var/log/cassandra
> >>   * sudo chown -R `whoami` /var/log/cassandra
> >>   * sudo mkdir -p /var/lib/cassandra
> >>   * sudo chown -R `whoami` /var/lib/cassandra
> >>
> >> My cassandra is at:
> >> c:\java\cassandra\apache-cassandra-0.6.1/
> >> So I have to create 2 folders log and lib?
> >> Is there a setting in a config file that I edit?
> >
>


Re: MapReduce, Timeouts and Range Batch Size

2010-04-22 Thread Jonathan Ellis
That would be an easy win, sure.

On Thu, Apr 22, 2010 at 9:27 PM, Joost Ouwerkerk  wrote:
> I was getting client timeouts in ColumnFamilyRecordReader.maybeInit() when
> MapReducing.  So I've reduced the Range Batch Size to 256 (from 4096) and
> this seems to have fixed my problem, although it has slowed things down a
> bit -- presumably because there are 16x more calls to get_range_slices.
> While I was in that code I noticed that a new client was being created for
> each batch get.  By decreasing the batch size, I've increased this
> overhead.  I'm thinking of re-writing ColumnFamilyRecordReader to do some
> connection pooling.  Anyone have any thoughts on that?
> joost.
>


Re: getting cassandra setup on windows 7

2010-04-22 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-948

On Thu, Apr 22, 2010 at 10:03 PM, S Ahmed  wrote:
> Ok so I found the config section:
> E:\java\cassandra\apache-cassandra-0.6.1-bin\apache-cassandra-0.6.1\commitlog
>   
>
>  E:\java\cassandra\apache-cassandra-0.6.1-bin\apache-cassandra-0.6.1\data
>   
>
> Now when I run:
> bin/cassandra
> I get:
> Starting cassandra server
> listening for transport dt_socket at address:
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/cassandra/thrift/CassandraDaemon
> Could not find the main class:
> org.apache.cassandra.thrift.CassandraDaemon...
>
>
>
>
>
> On Thu, Apr 22, 2010 at 10:53 PM, S Ahmed  wrote:
>>
>> So I uncompressed the .tar, in the readme it says:
>> * tar -zxvf cassandra-$VERSION.tgz
>>   * cd cassandra-$VERSION
>>   * sudo mkdir -p /var/log/cassandra
>>   * sudo chown -R `whoami` /var/log/cassandra
>>   * sudo mkdir -p /var/lib/cassandra
>>   * sudo chown -R `whoami` /var/lib/cassandra
>>
>> My cassandra is at:
>> c:\java\cassandra\apache-cassandra-0.6.1/
>> So I have to create 2 folders log and lib?
>> Is there a setting in a config file that I edit?
>


Re: getting cassandra setup on windows 7

2010-04-22 Thread S Ahmed
Ok so I found the config section:

<CommitLogDirectory>E:\java\cassandra\apache-cassandra-0.6.1-bin\apache-cassandra-0.6.1\commitlog</CommitLogDirectory>

<DataFileDirectories>
  <DataFileDirectory>E:\java\cassandra\apache-cassandra-0.6.1-bin\apache-cassandra-0.6.1\data</DataFileDirectory>
</DataFileDirectories>


Now when I run:

bin/cassandra

I get:

Starting cassandra server
listening for transport dt_socket at address:
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/cassandra/thrift/CassandraDaemon

Could not find the main class:
org.apache.cassandra.thrift.CassandraDaemon...






On Thu, Apr 22, 2010 at 10:53 PM, S Ahmed  wrote:

> So I uncompressed the .tar, in the readme it says:
>
> * tar -zxvf cassandra-$VERSION.tgz
>   * cd cassandra-$VERSION
>   * sudo mkdir -p /var/log/cassandra
>   * sudo chown -R `whoami` /var/log/cassandra
>   * sudo mkdir -p /var/lib/cassandra
>   * sudo chown -R `whoami` /var/lib/cassandra
>
>
> My cassandra is at:
>
> c:\java\cassandra\apache-cassandra-0.6.1/
>
> So I have to create 2 folders log and lib?
> Is there a setting in a config file that I edit?
>


Re: getting cassandra setup on windows 7

2010-04-22 Thread Shinpei Ohtani
Hi,

You should do at least the following:
1. Open conf/storage-conf.xml and set the commitlog/data directory settings.
2. Open conf/log4j.properties and set the log directory to wherever you want.
3. I recommend setting c:\java\cassandra\apache-cassandra-0.6.1/ as
%CASSANDRA_HOME% in your environment.
4. I also recommend changing the JMX port, which is set in
bin/cassandra.bat, from 8080 to something else; it can be a problem
when you are also running Tomcat.

A note about 1 and 2: these files are read-only, so you should make them
writable before editing.
(I think these conf files should not be read-only...)

Hope this helps,
===
Shinpei

On Fri, Apr 23, 2010 at 11:53 AM, S Ahmed  wrote:
> So I uncompressed the .tar, in the readme it says:
> * tar -zxvf cassandra-$VERSION.tgz
>   * cd cassandra-$VERSION
>   * sudo mkdir -p /var/log/cassandra
>   * sudo chown -R `whoami` /var/log/cassandra
>   * sudo mkdir -p /var/lib/cassandra
>   * sudo chown -R `whoami` /var/lib/cassandra
>
> My cassandra is at:
> c:\java\cassandra\apache-cassandra-0.6.1/
> So I have to create 2 folders log and lib?
> Is there a setting in a config file that I edit?



-- 
=
Shinpei Ohtani
mail: shinpei.oht...@gmail.com
blog: http://d.hatena.ne.jp/shot6/
twitter : http://twitter.com/shot6
=
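For step 2 above, the relevant line in conf/log4j.properties looks something like the fragment below. The property name matches the 0.6-era default config as I recall it (adjust if yours differs), and the path is only an example; note that in .properties files forward slashes are the safe way to write Windows paths:

```properties
# Example only: point the rolling file appender at a writable Windows path
log4j.appender.R.File=C:/java/cassandra/logs/system.log
```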


getting cassandra setup on windows 7

2010-04-22 Thread S Ahmed
So I uncompressed the .tar, in the readme it says:

* tar -zxvf cassandra-$VERSION.tgz
  * cd cassandra-$VERSION
  * sudo mkdir -p /var/log/cassandra
  * sudo chown -R `whoami` /var/log/cassandra
  * sudo mkdir -p /var/lib/cassandra
  * sudo chown -R `whoami` /var/lib/cassandra


My cassandra is at:

c:\java\cassandra\apache-cassandra-0.6.1/

So I have to create 2 folders log and lib?
Is there a setting in a config file that I edit?


MapReduce, Timeouts and Range Batch Size

2010-04-22 Thread Joost Ouwerkerk
I was getting client timeouts in ColumnFamilyRecordReader.maybeInit() when
MapReducing.  So I've reduced the Range Batch Size to 256 (from 4096) and
this seems to have fixed my problem, although it has slowed things down a
bit -- presumably because there are 16x more calls to get_range_slices.
While I was in that code I noticed that a new client was being created for
each batch get.  By decreasing the batch size, I've increased this
overhead.  I'm thinking of re-writing ColumnFamilyRecordReader to do some
connection pooling.  Anyone have any thoughts on that?
joost.
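The pooling idea above is straightforward: keep a small set of reusable client connections and check them out per batch, instead of opening a new one per get_range_slices call. A language-agnostic sketch (the `connect` factory is a stand-in for real Thrift socket setup, not the actual ColumnFamilyRecordReader code):

```python
# Sketch of connection pooling: only `size` connections are ever created,
# no matter how many batch gets are issued. `connect` is a hypothetical
# stand-in for opening a Thrift client.
import queue

class ConnectionPool:
    def __init__(self, factory, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())

    def checkout(self):
        return self._pool.get()   # blocks until a connection is free

    def checkin(self, conn):
        self._pool.put(conn)

made = []
def connect():
    conn = object()       # placeholder for a real Thrift client
    made.append(conn)
    return conn

pool = ConnectionPool(connect, size=2)

# Many batches, but the pool caps connection creation at `size`.
for _ in range(100):
    conn = pool.checkout()
    try:
        pass  # issue the get_range_slices batch here
    finally:
        pool.checkin(conn)

print(len(made))  # 2
```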


Re: Is that normal to have some percent of reads/writes time out?

2010-04-22 Thread Ken Sandney
By the way, my testing cluster consists of 4 normal PCs (Intel Celeron E3200
2.40 GHz CPUs) with 2 GB of RAM assigned to the JVM. How many concurrent
reads/writes should be reasonable? And how much memory/CPU usage would be
healthy for this kind of test cluster?


Re: Is that normal to have some percent of reads/writes time out?

2010-04-22 Thread Ken Sandney
Yes, I've tried the patch on
https://issues.apache.org/jira/browse/THRIFT-347, but it does not seem to
work for me. I suspect I am hitting another issue with Thrift: if my column
value size is more than 8 KB (with the Thrift PHP extension enabled), my
client is more likely to get a "timed out" error. I am still working on this
issue to figure out what happened there.

I am new to this, and any clues/advice are welcome.


Re: Cassandra Ruby Library's batch method example?

2010-04-22 Thread Jonathan Ellis
Nope, there is no guarantee of that.  If the server fails
mid-operation you have to retry it.

On Thu, Apr 22, 2010 at 7:23 PM, Lucas Di Pentima
 wrote:
>
> El 22/04/2010, a las 19:57, Ryan King escribió:
>
>> The batch method in the cassandra gem is still a little crippled (it
>> doesn't actually batch together everything it can), but you can use it
>> like this:
>>
>> http://github.com/fauna/cassandra/blob/master/test/cassandra_test.rb#L299
>
> Thanks Ryan! One question about this feature: Ideally it should execute all 
> batched operations or none, is that right? In case one batched operation 
> raise some exception, the previous ops are rolled back?
>
> --
> Lucas Di Pentima - Santa Fe, Argentina
> Jabber: lu...@di-pentima.com.ar
> MSN: ldipent...@hotmail.com
>
>
>
>
>
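Since batches are not atomic, the usual pattern is to simply re-send the whole batch on failure: Cassandra writes are idempotent for a given (column, timestamp), so replaying mutations that already applied is harmless. A hedged sketch of that retry loop (`flaky_batch_insert` is a fabricated stand-in that fails once):

```python
# Sketch of retry-on-failure for a non-atomic batch: replaying already
# applied mutations is safe because writes are idempotent per timestamp.
# flaky_batch_insert simulates a server failing mid-operation once.
attempts = {"n": 0}

def flaky_batch_insert(rows):
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise IOError("server failed mid-operation")
    return len(rows)

def with_retries(op, rows, max_tries=3):
    for i in range(max_tries):
        try:
            return op(rows)
        except IOError:
            if i == max_tries - 1:
                raise  # out of retries: surface the failure

print(with_retries(flaky_batch_insert, ["r1", "r2"]))  # 2
```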


Re: Cassandra Ruby Library's batch method example?

2010-04-22 Thread Lucas Di Pentima

El 22/04/2010, a las 19:57, Ryan King escribió:

> The batch method in the cassandra gem is still a little crippled (it
> doesn't actually batch together everything it can), but you can use it
> like this:
> 
> http://github.com/fauna/cassandra/blob/master/test/cassandra_test.rb#L299

Thanks Ryan! One question about this feature: ideally it should execute all 
batched operations or none, is that right? If one batched operation raises 
an exception, are the previous ops rolled back?

--
Lucas Di Pentima - Santa Fe, Argentina
Jabber: lu...@di-pentima.com.ar
MSN: ldipent...@hotmail.com






Re: Cassandra Ruby Library's batch method example?

2010-04-22 Thread Ryan King
On Thu, Apr 22, 2010 at 1:06 PM, Lucas Di Pentima
 wrote:
> Hi,
>
> I would like to see example code about the batch() method, I searched for it 
> on Google, but I couldn't find any. Reading the inline comments, this 
> operation could be useful for example to insert some record and update the 
> indexes all at once, am I right?

The batch method in the cassandra gem is still a little crippled (it
doesn't actually batch together everything it can), but you can use it
like this:

http://github.com/fauna/cassandra/blob/master/test/cassandra_test.rb#L299

-ryan


Re: Row deletion and get_range_slices (cassandra 0.6.1)

2010-04-22 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/FAQ#range_ghosts

On Thu, Apr 22, 2010 at 5:29 PM, Carlos Sanchez
 wrote:
> I have a curious question..
>
> I am doing some testing where I insert 500 rows to a super column family and 
> then delete one row, I make sure the row was indeed deleted 
> (NotFoundException in the get call) and then I ran a get_range_slices and the 
> row indeed returned. I then shut down Cassandra and restarted it. I repeated the 
> test (with inserting the rows) and even though I get the NotFoundException 
> for that row, get_range_slices still returns it.  Is this the expected 
> behavior? How long should I wait before I don't see the row in the 
> get_range_slices? Do I have to force a flush or change consistency level?
>
> Thanks,
>
> Carlos
>
>


Row deletion and get_range_slices (cassandra 0.6.1)

2010-04-22 Thread Carlos Sanchez
I have a curious question.

I am doing some testing where I insert 500 rows into a super column family and 
then delete one row. I verify the row was indeed deleted (NotFoundException 
on the get call), and then I run get_range_slices, and the row is indeed 
returned. I then shut down Cassandra and restarted it. I repeated the test 
(including inserting the rows), and even though I get the NotFoundException 
for that row, get_range_slices still returns it.  Is this the expected 
behavior? How long should I wait before I no longer see the row in 
get_range_slices? Do I have to force a flush or change the consistency level?

Thanks,

Carlos



Cassandra Ruby Library's batch method example?

2010-04-22 Thread Lucas Di Pentima
Hi,

I would like to see example code for the batch() method; I searched for it on 
Google, but I couldn't find any. Reading the inline comments, this operation 
could be useful, for example, to insert a record and update the indexes all at 
once. Am I right?

Best regards
--
Lucas Di Pentima - Santa Fe, Argentina
Jabber: lu...@di-pentima.com.ar
MSN: ldipent...@hotmail.com






Re: cassandra instability

2010-04-22 Thread Chris Goffinet
We don't use PHP to talk to Cassandra directly, but we do have the front-end 
communicate with our backend services, which are over Thrift. We've used framed 
and buffered transports; both required some tweaks. We use the PHP C extension 
from the Thrift repo. I have to admit it's pretty crappy; we had to make some 
modifications in Thrift.

I opened this ticket; I need to submit some of my patches so we can close it 
out (resolving the timeout issues):
https://issues.apache.org/jira/browse/THRIFT-638

-Chris

On Apr 22, 2010, at 9:03 AM, S Ahmed wrote:

> If digg uses PHP with cassandra, can the library really be that old?
> 
> Or they are using their own custom php cassandra client? (probably, but just 
> making sure).
> 
> On Fri, Apr 16, 2010 at 2:13 PM, Jonathan Ellis  wrote:
> On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker  wrote:
> > Each time I start it up, it will
> > work fine for about 1 hour and then it will crash the servers.  The error
> > message on the servers is usually an out of memory error.
> 
> Sounds like 
> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> to me.
> 
> > I will get
> > several time out errors on the clients
> 
> Symptomatic of running out of memory.
> 
> > and occasionally get an error telling
> > me that i was missing the timestamp.
> 
> This is an entirely different problem.  Your client is sending
> garbage, plain and simple.  Why that is, I don't know.  The PHP Thrift
> binding is virtually unmaintained, so it could be a bug there, but
> Digg uses PHP against Cassandra extensively and hasn't hit this to my
> knowledge.  As I said in another thread, I wouldn't rule out bad
> hardware.
> 
> > The timestamp error is accompanied by
> > a server crashing if I use framed transport instead of buffered.
> 
> Thrift is fragile when the client sends it garbage.
> (https://issues.apache.org/jira/browse/THRIFT-601)
> 
> > One of the reasons we
> > were trying cassandra was to scale out with smaller nodes rather than having
> > to run larger instances for mysql.
> 
> 2 x 1GB isn't a whole lot to do a bulk load with.  You may have to
> throttle your clients to fix the OOM completely.
> 
> -Jonathan
> 



Re: At what point does the cluster get faster than the individual nodes?

2010-04-22 Thread Jonathan Ellis
fyi,

https://issues.apache.org/jira/browse/CASSANDRA-930
https://issues.apache.org/jira/browse/CASSANDRA-982

On Thu, Apr 22, 2010 at 11:11 AM, Mike Malone  wrote:
> On Wed, Apr 21, 2010 at 9:50 AM, Mark Greene  wrote:
>>
>> Right, it's a similar concept to DB sharding, where you spread the write
>> load around to different DB servers but won't necessarily increase the
>> throughput of any one DB server, but rather the collective throughput.
>
> Except with Cassandra, read-repair causes every read to go to every replica
> for a piece of data.
> Mike


Implementing Tags

2010-04-22 Thread Mark Jones
If I wanted to store tags in Cassandra, on a per user basis, what would be the 
best way to do that?

ColumnFamily:Tags
  Key:UserID
 SuperColumn: Tag names
 Columns: keys to records using this Tag

And in each of the items, have a comma-separated list of its tags?


Or some other way?
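The layout described above can be sketched with nested dicts standing in for the super column family, plus a reverse index per item. This is only an illustration of the data model under the poster's assumptions (row key = user, super column = tag, column names = tagged record keys); one design note is that keeping the reverse index as a set/column list avoids the read-modify-write a comma-separated string would require:

```python
# Sketch of the proposed Tags model: tags_cf mimics a super column family
# (user -> tag name -> {record key: ""}), and item_tags is the per-item
# reverse index. All names here are illustrative, not an actual API.
from collections import defaultdict

tags_cf = defaultdict(lambda: defaultdict(dict))   # user -> tag -> {item: ""}
item_tags = defaultdict(set)                       # item -> set of tags

def tag(user, item, tag_name):
    tags_cf[user][tag_name][item] = ""   # value unused; column names carry data
    item_tags[item].add(tag_name)

tag("user42", "doc1", "cassandra")
tag("user42", "doc2", "cassandra")
tag("user42", "doc1", "nosql")

print(sorted(tags_cf["user42"]["cassandra"]))  # ['doc1', 'doc2']
print(sorted(item_tags["doc1"]))               # ['cassandra', 'nosql']
```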


Concurrent SuperColumn update question

2010-04-22 Thread tsuraan
Suppose I have a SuperColumn CF where one of the SuperColumns in each
row is being treated as a list (e.g. keys only, values are just
empty).  In this list, values will only ever be added; deletion never
occurs.  If I have two processes simultaneously add values to this
list (on different nodes, whatever), is that guaranteed to be safe
from race conditions?  The timestamp is on the actual entries in the
SuperColumn, not on the row that contains those entries, right?  Or am
I wrong, and the last update to the row will overwrite a previous
update to the row?

Also, in a scheme like this, is there a limit on the number of entries
I can have in my "list"?  I know that compaction normally needs to
read an entire row into RAM in order to compact it.  Does this also
apply to SuperColumn columns?
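To the first question: reconciliation in Cassandra happens per column (name, value, timestamp), not per row, so two writers adding *different* column names to the same row never clobber each other; timestamps only break ties when the same column name is written twice. A toy merge function illustrating that semantics (a simplification, not the server's actual code):

```python
# Illustration of per-column reconciliation: each column carries its own
# (value, timestamp); merging two row versions keeps every distinct column
# name, and highest timestamp wins only on a same-name collision.
def merge(replica_a, replica_b):
    """Merge two versions of a row; per column, highest timestamp wins."""
    merged = dict(replica_a)
    for name, (value, ts) in replica_b.items():
        if name not in merged or ts > merged[name][1]:
            merged[name] = (value, ts)
    return merged

# Two processes each append an entry to the same "list" row concurrently.
writer1 = {"item-a": ("", 100)}
writer2 = {"item-b": ("", 101)}

print(sorted(merge(writer1, writer2)))  # ['item-a', 'item-b'] -- both survive
```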


Re: At what point does the cluster get faster than the individual nodes?

2010-04-22 Thread Mike Malone
On Wed, Apr 21, 2010 at 9:50 AM, Mark Greene  wrote:

> Right, it's a similar concept to DB sharding, where you spread the write load
> around to different DB servers but won't necessarily increase the throughput
> of any one DB server, but rather the collective throughput.


Except with Cassandra, read-repair causes every read to go to every replica
for a piece of data.

Mike


Re: cassandra instability

2010-04-22 Thread S Ahmed
If Digg uses PHP with Cassandra, can the library really be that old?

Or are they using their own custom PHP Cassandra client? (Probably, but just
making sure.)

On Fri, Apr 16, 2010 at 2:13 PM, Jonathan Ellis  wrote:

> On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker  wrote:
> > Each time I start it up, it will
> > work fine for about 1 hour and then it will crash the servers.  The error
> > message on the servers is usually an out of memory error.
>
> Sounds like
> http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> to me.
>
> > I will get
> > several time out errors on the clients
>
> Symptomatic of running out of memory.
>
> > and occasionally get an error telling
> > me that i was missing the timestamp.
>
> This is an entirely different problem.  Your client is sending
> garbage, plain and simple.  Why that is, I don't know.  The PHP Thrift
> binding is virtually unmaintained, so it could be a bug there, but
> Digg uses PHP against Cassandra extensively and hasn't hit this to my
> knowledge.  As I said in another thread, I wouldn't rule out bad
> hardware.
>
> > The timestamp error is accompanied by
> > a server crashing if I use framed transport instead of buffered.
>
> Thrift is fragile when the client sends it garbage.
> (https://issues.apache.org/jira/browse/THRIFT-601)
>
> > One of the reasons we
> > were trying cassandra was to scale out with smaller nodes rather than
> having
> > to run larger instances for mysql.
>
> 2 x 1GB isn't a whole lot to do a bulk load with.  You may have to
> throttle your clients to fix the OOM completely.
>
> -Jonathan
>


Re: questions about consistency

2010-04-22 Thread Paul Prescod
2010/4/22 Даниел Симеонов :
> Hi Paul,
>     Thank you for your answer. About the first question, I wondered if it is
> possible to work around this issue by relaxing some consistency. As I
> understand you, it should be possible to implement this compareAndSet
> operation with the presence of vector clocks; the client is then going to
> reconcile the data.

I believe that the proposed implementation of vector clocks in
Cassandra allows the servers to do the reconciliation through
"plugins". So you'd do a "compareAndSet" "plugin", or Cassandra might
ship with one out of the box (there are several obvious ones that
should probably be right in the box).

> Regarding the second question, I understood that without vector clocks and
> client reconciliation there is currently this causality problem in
> Cassandra.

In general, Cassandra 0.6 has little protection against overlapping
and conflicting writes.

> About the third question, isn't it the same as if the writes and reads both
> use QUORUMs?

I think that you can use Consistency.ALL on write, and Consistency.ONE
on read, to optimize for read-speed, and the opposite to optimize for
write speed.

> What about implementation of counters, currently it seems it is not
> implementable in 'Cassandra', will the vector clocks help here? Do you have
> experiences with counters in Cassandra?

Counters are the "classic" example of why you need vector clocks.

The description for CASSANDRA-580 is "Allow a ColumnFamily to be
versioned via vector clocks, instead of long timestamps. Purpose:
enable incr/decr; flexible conflict resolution."

https://issues.apache.org/jira/browse/CASSANDRA-580

 Paul Prescod
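To make the counter example concrete: the sketch below is *not* Cassandra's implementation (CASSANDRA-580 was still open at the time), just an illustration of why per-replica versioning enables increment/decrement where a single "latest value by timestamp" cannot. Each replica tracks its own increment count, merging takes the per-replica maximum, and the counter's value is the sum, so concurrent increments are never lost:

```python
# Illustration (not Cassandra's actual code) of per-replica counter state:
# merge takes the elementwise max, value is the sum. A timestamp-based merge
# would keep only one writer's total and lose the other's increments.
def increment(state, replica, amount=1):
    state = dict(state)
    state[replica] = state.get(replica, 0) + amount
    return state

def merge(a, b):
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}

def value(state):
    return sum(state.values())

# Two replicas increment concurrently from the same ancestor state.
base = {}
at_a = increment(increment(base, "A"), "A")   # replica A saw two increments
at_b = increment(base, "B")                   # replica B saw one

print(value(merge(at_a, at_b)))  # 3 -- last-write-wins would report 1 or 2
```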


Re: Does anybody work about transaction on cassandra ?

2010-04-22 Thread Mason Hale
You might also consider a Software Transactional Memory[1] approach. I
haven't personally tried it, but there is a Scala/Java framework named
Akka[2] that provides both STM features and Cassandra support. It should be
worth a look. Here's a nice write-up from someone who has already done some
exploring: http://codemonkeyism.com/cassandra-scala-akka/

 [1] http://en.wikipedia.org/wiki/Software_transactional_memory
 [2] http://doc.akkasource.org/

Mason Hale
http://www.onespot.com


On Thu, Apr 22, 2010 at 9:14 AM, Miguel Verde wrote:

> No, as far as I know no one is working on transaction support in
> Cassandra.  Transactions are orthogonal to the design of Cassandra[1][2],
> although a system could be designed incorporating Cassandra and other
> elements a la Google's MegaStore[3] to support transactions.  Google uses
> Paxos, one might be able to use Zookeeper[4] to design such a system, but it
> would be a daunting task.
>
> [1] http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
> [2] http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
> [3] http://perspectives.mvdirona.com/2008/07/10/GoogleMegastore.aspx
> [4] http://hadoop.apache.org/zookeeper/
>
> On Thu, Apr 22, 2010 at 2:56 AM, Jeff Zhang  wrote:
>
>> Hi all,
>>
>> I need transaction support on cassandra, so wondering is anybody work on
>> it ?
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>


Re: Is that normal to have some percent of reads/writes time out?

2010-04-22 Thread Jonathan Ellis
Timeouts are usually a signal that you need to add capacity to handle
the load you are giving the cluster.

On Thu, Apr 22, 2010 at 8:22 AM, Ken Sandney  wrote:
> Hi
>
> I am doing some load test with 4 nodes cluster. My client is PHP. I found
> some reads/writes were time out no matter how I tuned the parameters. These
> time-outs could be caught by client code. My question is: are these
> time-outs normal even in production environment? Should they be treated as
> normal cases or kind of error?
>


Re: Is that normal to have some percent of reads/writes time out?

2010-04-22 Thread Miguel Verde
I see that you are aware of https://issues.apache.org/jira/browse/THRIFT-347
Have you applied the patch there?  It worked for the Digg guys (probably the
largest PHP user of Cassandra) and others in that JIRA issue.
Timeouts are typical with unusually heavy load, node failure, and/or
un-tuned parameters, but that doesn't sound like your situation.
On Thu, Apr 22, 2010 at 8:22 AM, Ken Sandney  wrote:

> Hi
>
> I am doing some load test with 4 nodes cluster. My client is PHP. I found
> some reads/writes were time out no matter how I tuned the parameters. These
> time-outs could be caught by client code. My question is: are these
> time-outs normal even in production environment? Should they be treated as
> normal cases or kind of error?
>


Re: Does anybody work about transaction on cassandra ?

2010-04-22 Thread Miguel Verde
No, as far as I know no one is working on transaction support in Cassandra.
Transactions are orthogonal to the design of Cassandra[1][2], although a
system could be designed incorporating Cassandra and other elements a la
Google's MegaStore[3] to support transactions.  Google uses Paxos, one might
be able to use Zookeeper[4] to design such a system, but it would be a
daunting task.

[1] http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
[2] http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
[3] http://perspectives.mvdirona.com/2008/07/10/GoogleMegastore.aspx
[4] http://hadoop.apache.org/zookeeper/

On Thu, Apr 22, 2010 at 2:56 AM, Jeff Zhang  wrote:

> Hi all,
>
> I need transaction support on cassandra, so wondering is anybody work on it
> ?
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: New user asking for advice on database design

2010-04-22 Thread Zhiguo Zhang
Have you read the article "WTF is a SuperColumn? An Intro to the Cassandra
Data Model"?
Link: http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model

It is a good article on the data model.


On Thu, Apr 22, 2010 at 10:38 AM, Yésica Rey  wrote:

> Hi David,
>
> I think your architecture is right. I'm also new to Cassandra, and I've
> designed my database similarly to yours.
> I also think that separating data and indexes is more efficient for
> queries.
>
> I had not considered your question about putting them in separate
> keyspaces, but I would also appreciate any suggestions.
>
> Yess
>
David Boxenhorn wrote:
>
 Hi guys! I'm brand new to Cassandra, and I'm working on a database
>> design. I don't necessarily know all the advantages/limitations of
>> Cassandra, so I'm not sure that I'm doing it right...
>>  It seems to me that I can divide my database into two parts:
>>  1. The (mostly) normal data, where every piece of data appears only once
>> (I say "mostly" because I think I need reverse indexes for delete... and
>> once it's there, other things).
>>  2. The indexes, which I use for queries.
>>  Questions:
>>  1. Is the above a good architecture?
>> 2. Would there be an advantage to putting the two parts of the database in
>> different keyspaces? I expect the indexes to change every once in a while as
>> my querying needs progress, but the normal database won't change unless I
>> made a mistake.
>>  Any other advice?
>>
>


Is that normal to have some percent of reads/writes time out?

2010-04-22 Thread Ken Sandney
Hi

I am doing some load testing with a 4-node cluster. My client is PHP. I found
that some reads/writes time out no matter how I tune the parameters. These
time-outs can be caught by client code. My question is: are these time-outs
normal even in a production environment? Should they be treated as normal
cases or as a kind of error?


Re: Cassandra tuning for running test on a desktop

2010-04-22 Thread Nicolas Labrot
Yes, I think so. I have read that wiki entry and the JIRA. I will use
different row keys until it is fixed.

Thanks,

Nicolas


On Thu, Apr 22, 2010 at 4:47 AM, Stu Hood  wrote:

> Nicolas,
>
> Were all of those super column writes going to the same row?
> http://wiki.apache.org/cassandra/CassandraLimitations
>
> Thanks,
> Stu
>
> -Original Message-
> From: "Nicolas Labrot" 
> Sent: Wednesday, April 21, 2010 11:54am
> To: user@cassandra.apache.org
> Subject: Re: Cassandra tuning for running test on a desktop
>
> I do not have a website ;)
>
> I'm testing the viability of Cassandra for storing XML documents and making
> fast search queries. 4000 XML files (80 MB of XML) create, with my data model
> (one SC per XML node), 100 SC, which makes Cassandra go OOM with Xmx 1GB. On
> the contrary, an XML DB like eXist handles 4000 XML docs without any problem,
> with an acceptable amount of memory.
>
> What I like about Cassandra is its simplicity and its scalability. eXist is
> not able to scale with data; the only viable alternative is MarkLogic, which
> costs an arm and a leg... :)
>
> I will install Linux and buy some memory to continue my tests.
>
> Could a Cassandra developer give me the technical reason for this OOM?
>
>
>
>
>
> On Wed, Apr 21, 2010 at 5:13 PM, Mark Greene  wrote:
>
> > Maybe, maybe not. Presumably if you are running an RDBMS with any
> > reasonable amount of traffic nowadays, it's sitting on a machine with
> > 4-8GB of memory at least.
> >
> >
> > On Wed, Apr 21, 2010 at 10:48 AM, Nicolas Labrot  >wrote:
> >
> >> Thanks Mark.
> >>
> >> Cassandra is maybe too much for my need ;)
> >>
> >>
> >>
> >> On Wed, Apr 21, 2010 at 4:45 PM, Mark Greene 
> wrote:
> >>
> >>> Hit send too early
> >>>
> >>> That being said a lot of people running Cassandra in production are
> using
> >>> 4-6GB max heaps on 8GB machines, don't know if that helps but hopefully
> >>> gives you some perspective.
> >>>
> >>>
> >>> On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene  >wrote:
> >>>
>  RAM doesn't necessarily need to be proportional but I would say the
>  number of nodes does. You can't just throw a bazillion inserts at one
> node.
>  This is the main benefit of Cassandra: if you start hitting your
>  capacity, you add more machines and distribute the keys across them.
> 
> 
>  On Wed, Apr 21, 2010 at 9:07 AM, Nicolas Labrot  >wrote:
> 
> > So does it mean the RAM needed is proportional to the data handled?
> >
> > Or does Cassandra need a minimum amount of RAM when the dataset is big?
> >
> > I must confess this OOM behaviour is strange.
> >
> >
> > On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones  >wrote:
> >
> >>  On my 4GB machine I’m giving it 3GB and having no trouble with 60+
> >> million 500 byte columns
> >>
> >>
> >>
> >> *From:* Nicolas Labrot [mailto:nith...@gmail.com]
> >> *Sent:* Wednesday, April 21, 2010 7:47 AM
> >> *To:* user@cassandra.apache.org
> >> *Subject:* Re: Cassandra tuning for running test on a desktop
> >>
> >>
> >>
> >> I have tried 1400M, and Cassandra OOMs too.
> >>
> >> Is there another solution? My data isn't very big.
> >>
> >> It seems that it is the merge of the db
> >>
> >>  On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene 
> >> wrote:
> >>
> >> Try increasing Xmx. 1G is probably not enough for the amount of
> >> inserts you are doing.
> >>
> >>
> >>
> >> On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot 
> >> wrote:
> >>
> >> Hello,
> >>
> >> For my first message I will first thank the Cassandra contributors for
> >> their great work.
> >>
> >> I have a parameter issue with Cassandra (I hope it's just a parameter
> >> issue). I'm using Cassandra 0.6.1 with the Hector client on my desktop.
> >> It's a simple dual core with 4GB of RAM on WinXP. I have kept the default
> >> JVM options inside cassandra.bat (Xmx1G).
> >>
> >> I'm trying to insert 3 million SCs with 6 columns each inside 1 CF
> >> (named Super1). The insertion gets to 1 million SCs (without slowdown) and
> >> Cassandra crashes because of an OOM. (I store an average of 100 bytes
> >> per SC with a max of 10kB.)
> >> I have aggressively decreased all the memory parameters without any
> >> regard for consistency (my config is here [1]); the cache is turned off
> >> but Cassandra still OOMs. I have included the last lines of the
> >> Cassandra log [2].
> >>
> >> What can I do to fix my issue? Is there another solution besides
> >> increasing the Xmx?
> >>
> >> Thanks for your help,
> >>
> >> Nicolas
> >>
> >>
> >>
> >>
> >>
> >> [1]
> >>   
> >> 
> >>>> ColumnType="Super"
> >> CompareWith="BytesType"
> >>
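The limitation Stu links to is that, in this era of Cassandra, a row had to fit in memory during compaction, so piling millions of supercolumns into a single row eventually OOMs. The workaround Nicolas describes — using different row keys — can be sketched as simple bucketing; the bucket count, key scheme, and dict-based column-family model below are made up purely for illustration:

```python
NUM_BUCKETS = 128  # illustrative: bound the size of any single row

def row_key_for(doc_id):
    """Derive a bucketed row key so no single row grows without limit."""
    return "docs_%03d" % (hash(doc_id) % NUM_BUCKETS)

# Model a column family as a dict of dicts: {row_key: {supercolumn: value}}
cf = {}
def insert(doc_id, supercolumn, value):
    cf.setdefault(row_key_for(doc_id), {})[supercolumn] = value

for i in range(10000):
    insert("doc-%d" % i, "node-%d" % i, "payload")

# Entries spread across many small rows instead of one unbounded row.
largest = max(len(cols) for cols in cf.values())
print(len(cf), largest)
```

The same total data is stored, but each row stays comfortably within what compaction can hold in memory.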

Re: New user asking for advice on database design

2010-04-22 Thread Yésica Rey

Hi David,

I think your architecture is right. I'm also new to Cassandra, and I've
designed my database similarly to yours.
I also think that the division between data and indexes is more efficient
for queries.


I had not considered your question about putting them in separate keyspaces,
but I would also appreciate any suggestions.


Yess

David Boxenhorn wrote:
Hi guys! I'm brand new to Cassandra, and I'm working on a database
design. I don't necessarily know all the advantages/limitations of 
Cassandra, so I'm not sure that I'm doing it right...
 
It seems to me that I can divide my database into two parts:
 
1. The (mostly) normal data, where every piece of data appears only 
once (I say "mostly" because I think I need reverse indexes for 
delete... and once it's there, other things).
 
2. The indexes, which I use for queries.
 
Questions:
 
1. Is the above a good architecture?
2. Would there be an advantage to putting the two parts of the 
database in different keyspaces? I expect the indexes to change every 
once in a while as my querying needs progress, but the normal database 
won't change unless I made a mistake.
 
Any other advice?
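David's split maps naturally onto the common pattern of this era: maintain your own index column families alongside the data, write both on every update, and use the reverse index to find what to delete. A language-neutral sketch, modeling the two column families as dicts — the names `users` and `users_by_city` and the schema are invented for illustration, not taken from any real system:

```python
# Model two column families: the data CF and a reverse-index CF.
users = {}           # data: user row key -> {column: value}
users_by_city = {}   # index: city -> {user row key: ""}

def save_user(user_id, name, city):
    old = users.get(user_id)
    if old and old["city"] != city:
        # keep the reverse index consistent when the indexed value changes
        users_by_city[old["city"]].pop(user_id, None)
    users[user_id] = {"name": name, "city": city}
    users_by_city.setdefault(city, {})[user_id] = ""

def user_ids_in(city):
    """Query via the index CF instead of scanning the data CF."""
    return sorted(users_by_city.get(city, {}))

save_user("u1", "Ada", "London")
save_user("u2", "Alan", "London")
save_user("u1", "Ada", "Paris")  # moves u1 out of the London index
print(user_ids_in("London"))  # ['u2']
```

Whether the index CFs live in a second keyspace or next to the data is mostly an operational choice; the write path has to update both either way.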




Re: Will cassandra block client ?

2010-04-22 Thread Ran Tavory
It reuses connections, yes, but it wouldn't hurt to verify ;)
You may also want to check the haproxy connections.

On Thu, Apr 22, 2010 at 11:26 AM, Jeff Zhang  wrote:

> I use the hector java client, I think it reuse the connection, or
> maybe I should check the source code.
>
>
> On Thu, Apr 22, 2010 at 4:10 PM, Ran Tavory  wrote:
> > are you reusing your connections? If not, you may be running out of tcp
> > ports on the bombing client. check netstat -na | grep TIME_WAIT
> >
> > On Thu, Apr 22, 2010 at 10:52 AM, Jeff Zhang  wrote:
> >>
> >> Hi all,
> >>
> >> I made too many requests to cassandra , and then after a while, I can
> >> not connect to it. But I can still connect it from another machine ?
> >> So does it mean cassandra will block client in some situation ?
> >>
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jeff Zhang
> >
> >
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: Will cassandra block client ?

2010-04-22 Thread Jeff Zhang
I use the hector java client, I think it reuse the connection, or
maybe I should check the source code.


On Thu, Apr 22, 2010 at 4:10 PM, Ran Tavory  wrote:
> are you reusing your connections? If not, you may be running out of tcp
> ports on the bombing client. check netstat -na | grep TIME_WAIT
>
> On Thu, Apr 22, 2010 at 10:52 AM, Jeff Zhang  wrote:
>>
>> Hi all,
>>
>> I made too many requests to cassandra , and then after a while, I can
>> not connect to it. But I can still connect it from another machine ?
>> So does it mean cassandra will block client in some situation ?
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>
>



-- 
Best Regards

Jeff Zhang


RE: Periodically hiccups

2010-04-22 Thread Dr . Martin Grabmüller
Hello Alex,

unfortunately I cannot help with your problem, just one hint:

> - RecentReadLatencyMicros and RecentWriteLatencyMicros are super high 
> for StorageProxy, as well as every column family in JMX: up 
> to 43 s and 
> 9s (see screenshot). However, in cfstats, they are quite small.

Remember that the latency values in JMX are in microseconds, whereas
cfstats reports in ms.  This was changed between 0.5 and 0.6, IIRC.

Greetings,
  Martin
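In other words, the JMX figures quoted in the original post are a thousand times smaller than they look. A quick check, converting the numbers from that post out of microseconds:

```python
def micros_to_millis(us):
    """JMX latencies are reported in microseconds; cfstats reports ms."""
    return us / 1000.0

# The "super high" JMX readings from the original post, in microseconds:
read_latency_us = 152676.92
write_latency_us = 6950.0

print(micros_to_millis(read_latency_us))   # ~152.7 ms, not 152 s
print(micros_to_millis(write_latency_us))  # ~7 ms
```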


Periodically hiccups

2010-04-22 Thread Alex Li

Hello,

We recently deployed a cluster of 5 Cassandra nodes into production, and
ran into big problems with periodic hiccups (an individual node goes
down, high CPU, client connection timeouts). It was terrible with 0.5
(one hiccup every 5-10 minutes); today we upgraded to 0.6.1, and it happens
less frequently now (roughly once every 30 minutes). But it is
still quite frustrating.


We used ReplicationFactor=3 for all column families. 5 nodes are behind 
haproxy. Java client goes through haproxy. The most obvious behavior is: 
as soon as one node goes down, the connections between haproxy and 
Cassandra nodes just shoot up to 1000 (in normal case it is stable at 
40, which should be really trivial for Cassandra), and the connections 
don't go down for quite a while. Meanwhile Java clients just get all 
kind of TimeoutException, then kept on retrying. Eventually we have to 
restart haproxy, then things go back to normal.


Each node has 5GB max heap, powerful enough CPU (quad-core), software 
RAID mirror. We are definitely NOT putting lots of load yet, mostly 
20-50 concurrent requests to Cassandra, but it is not holding up! Please 
help, we are on the verge of giving up Cassandra after 5 days of 
periodic "outage".


Couple observations:

- cfstats shows significant read latency on "system" keyspace, almost 5s 
(see below)


- RecentReadLatencyMicros and RecentWriteLatencyMicros are super high 
for StorageProxy, as well as every column family in JMX: up to 152676.92 
and 6950 (they are in ms, right?). However, in cfstats, they are quite 
small.


- Every second we see 5-10 DigestMismatchException in the log:

INFO [pool-1-thread-15857] 2010-04-22 00:37:37,887 StorageProxy.java 
(line 499) DigestMismatchException: Mismatch for key 1068022523 
(d41d8cd98f00b204e9800998ecf8427e vs 0dd4cdaeeb1a334ae133c6955e109629)


Please advise. Thank you!

snippets of storage-conf, cfstats, tpstats are listed below:



  false
  
  
 
  
org.apache.cassandra.locator.RackUnawareStrategy 


  3
  
org.apache.cassandra.locator.EndPointSnitch 


  
  


  1
  128

  auto
  512
  64
  32
  8
  64
  64
  256
  0.3
 60

  8
  32

  periodic
  1



  864000

r...@cdb-006:/glass/sfw/cassandra# bin/nodetool -h localhost cfstats
Keyspace: system
  Read Count: 878
  Read Latency: 5752.042634396355 ms.
  Write Count: 2260398
  Write Latency: 0.014567047926957996 ms.
  Pending Tasks: 0
  Column Family: LocationInfo
  SSTable count: 2
  Space used (live): 3569
  Space used (total): 3569
  Memtable Columns Count: 0
  Memtable Data Size: 0
  Memtable Switch Count: 1
  Read Count: 1
  Read Latency: NaN ms.
  Write Count: 6
  Write Latency: NaN ms.
  Pending Tasks: 0
  Key cache capacity: 2
  Key cache size: 1
  Key cache hit rate: NaN
  Row cache: disabled
  Compacted row minimum size: 0
  Compacted row maximum size: 0
  Compacted row mean size: 0

  Column Family: HintsColumnFamily
  SSTable count: 2
  Space used (live): 70272035
  Space used (total): 70272035
  Memtable Columns Count: 56264
  Memtable Data Size: 486854
  Memtable Switch Count: 21
  Read Count: 877
  Read Latency: 13614.412 ms.
  Write Count: 2260392
  Write Latency: 0.142 ms.
  Pending Tasks: 0
  Key cache capacity: 2
  Key cache size: 2
  Key cache hit rate: 0.25
  Row cache: disabled
  Compacted row minimum size: 78567
  Compacted row maximum size: 39561901
  Compacted row mean size: 27878603


Keyspace: Titan
  Read Count: 8948702
  Read Latency: 7.949136100185256 ms.
  Write Count: 3393490
  Write Latency: 0.19255415398306758 ms.
  Pending Tasks: 0
  Column Family: FbUser
  SSTable count: 6
  Space used (live): 3675014807
  Space used (total): 3675014807
  Memtable Columns Count: 250055
  Memtable Data Size: 9146339
  Memtable Switch Count: 9
  Read Count: 6591406
  Read Latency: 8.361 ms.
  Write Count: 343030
  Write Latency: 0.078 ms.
  Pending Tasks: 0
  Key cache capacity: 20
  Key cache size: 20
  Key cache hit rate: 0.6341912478864628
  Row cache: disabled
  Compacted row minimum size: 320
  Compacted row maximum size: 586
  Compacted row mean size: 479

  Column Family: Payment
  SSTable count: 3
  Space used (live): 2473
  Space used (total): 2473
  Memtable Columns Count: 0
  Memtable Data Size: 0
  Memtable Switch Count: 0
  Read Count: 27728
  Read Latency: 0.059 ms.
  Write Count: 0
  Write Latency: NaN ms.
  Pending Tasks: 0
  Key cache capacity: 20
  Key cache size: 0
  Key cache hit rate: NaN
  Row cache: disabled
  Compacted row minimum size: 0
  Compacted row maximum size: 0
  Compacted row mean size: 0

  Column Family: Club
  SSTable count: 6
  Space u

Re: PHP client crashed if a column value > 8192 bytes

2010-04-22 Thread Zhiguo Zhang
Maybe you should also send this message to the Thrift user mailing list?


On Thu, Apr 22, 2010 at 6:34 AM, Ken Sandney  wrote:

> After many attempts I found this error only occurred when using the PHP
> thrift_protocol extension. I don't know if there are some parameters I
> could adjust for this issue. By the way, without the extension the speed is
> noticeably slower.
>
>
> On Thu, Apr 22, 2010 at 12:01 PM, Ken Sandney  wrote:
>
>> I am using PHP as a client to talk to the Cassandra server, but I found that
>> if any column value is > 8192 bytes, the client crashes with the following error:
>>
>> PHP Fatal error:  Uncaught exception 'TException' with message 'TSocket:
>>> timed out reading 1024 bytes from 10.0.0.177:9160' in
>>> /home/phpcassa/include/thrift/transport/TSocket.php:264
>>> Stack trace:
>>> #0 /home/phpcassa/include/thrift/transport/TBufferedTransport.php(126):
>>> TSocket->read(1024)
>>> #1 [internal function]: TBufferedTransport->read(8192)
>>> #2 /home/phpcassa/include/thrift/packages/cassandra/Cassandra.php(642):
>>> thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated),
>>> 'cassandra_Cassa...', false)
>>> #3 /home/phpcassa/include/thrift/packages/cassandra/Cassandra.php(615):
>>> CassandraClient->recv_batch_insert()
>>> #4 /home/phpcassa/include/phpcassa.php(197):
>>> CassandraClient->batch_insert('Keyspace1', '38246', Array, 1)
>>> #5 /home/phpcassa/test1.php(51): CassandraCF->insert('38246', Array)
>>> #6 {main}
>>>   thrown in /home/phpcassa/include/thrift/transport/TSocket.php on line
>>> 264
>>>
>>
>> Any idea about this?
>>
>
>


Re: Will cassandra block client ?

2010-04-22 Thread Ran Tavory
are you reusing your connections? If not, you may be running out of tcp
ports on the bombing client. check netstat -na | grep TIME_WAIT

On Thu, Apr 22, 2010 at 10:52 AM, Jeff Zhang  wrote:

> Hi all,
>
> I made too many requests to cassandra , and then after a while, I can
> not connect to it. But I can still connect it from another machine ?
> So does it mean cassandra will block client in some situation ?
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
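Opening a fresh TCP connection per request leaves sockets in TIME_WAIT and can exhaust ephemeral ports on the client, which looks exactly like the server "blocking" you while other machines connect fine. A toy sketch of the difference between per-request connections and a reused pool — pure Python, no Thrift, with counts chosen purely for illustration:

```python
class FakeConnection:
    opened = 0  # class-wide count of connections ever created

    def __init__(self):
        FakeConnection.opened += 1

    def request(self, payload):
        return "ok:" + payload

def naive_client(n_requests):
    """One connection per request: n_requests sockets churned."""
    for i in range(n_requests):
        conn = FakeConnection()
        conn.request("req%d" % i)

def pooled_client(n_requests, pool_size=4):
    """Round-robin over a small fixed pool: sockets stay warm."""
    pool = [FakeConnection() for _ in range(pool_size)]
    for i in range(n_requests):
        pool[i % len(pool)].request("req%d" % i)

FakeConnection.opened = 0
naive_client(1000)
naive_opened = FakeConnection.opened

FakeConnection.opened = 0
pooled_client(1000)
pooled_opened = FakeConnection.opened

print(naive_opened, pooled_opened)  # 1000 vs 4
```

With a real client the pool would hold open Thrift connections; the point is only that connection churn, not the server, is usually what runs out first.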


Does anybody work on transactions in Cassandra?

2010-04-22 Thread Jeff Zhang
Hi all,

I need transaction support in Cassandra, so I am wondering: is anybody working on it?


-- 
Best Regards

Jeff Zhang


Will cassandra block client ?

2010-04-22 Thread Jeff Zhang
Hi all,

I made too many requests to Cassandra, and then after a while I could
not connect to it. But I can still connect to it from another machine.
So does it mean Cassandra will block a client in some situations?



-- 
Best Regards

Jeff Zhang


Re: questions about consistency

2010-04-22 Thread Даниел Симеонов
Hi Paul,
Thank you for your answer. About the first question, I wondered whether it is
possible to work around this issue by relaxing some consistency; as I
understand you, it should be possible to implement this compareAndSet
operation once vector clocks are present, with the client then
reconciling the data.
Regarding the second question, I understood that without vector clocks
and client reconciliation there is currently this causality problem
in Cassandra.
About the third question, isn't it the same as having both writes and reads
use QUORUM?
What about the implementation of counters? Currently it seems they are not
implementable in Cassandra; will vector clocks help here? Do you have
experience with counters in Cassandra?

Best regards, Daniel.

2010/4/21 Paul Prescod 

> I'm not an expert, so take what I say with a grain of salt.
>
> 2010/4/21 Даниел Симеонов :
> > Hello,
> >I am pretty new to Cassandra and I have some questions, they may seem
> > trivial, but still I am pretty new to the subject. First is about the
> lack
> > of a compareAndSet() operation, as I understood it is not supported
> > currently in Cassandra. Do you know of use cases which really require
> > such operations, and how do these use cases currently work around this?
>
> I think your question is paradoxical. If the use case really requires
> the operation then there is no workaround by definition. The existence
> of the workaround implies that the use case really did not require the
> operation.
>
> Anyhow, vector clocks are probably relevant to this question and your next
> one.
>
> > Second topic I'd like to discuss a little bit more is about the read
> repair,
> > as I understand is that it is being done by the timestamps supplied by
> the
> > client application servers. Since computer clocks (which requires
> > synchronization algorithms working regularly) diverge there should be a
> time
> > frame during which the order of the client request written to the
> database
> > is not guaranteed, do you have real world experiences with this? Is this
> > similar to causal consistency (
> > http://en.wikipedia.org/wiki/Causal_consistency ) .What happens if two
> > application servers try to update the same data and supply one and the
> same
> > timestamp (it could happen although rarely), what if they try to update
> > several columns in batch operation this way, is there a chance that the
> > column value could be intermixed between the two update requests?
>
> All of this is changing with vector clocks in Cassandra 0.7.
>
> https://issues.apache.org/jira/browse/CASSANDRA-580
>
> > I have one last question about the consistency level ALL, do you know of
> > real use cases where it is required (instead of QUORUM) and why (both
> read
> > and write)?
>
> It would be required when your business rules do not allow any client
> to read the old value. For example if it would be illegal to provide
> an obsolete stock value.
>
> > Thank you very much for your help to better understand 'Cassandra'!
> > Best regards, Daniel.
> >
>
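For reference, the vector-clock reconciliation proposed in CASSANDRA-580 works by comparing per-node counters: one version dominates another if its counter is greater than or equal on every entry, and two versions that each exceed the other somewhere are concurrent and must be reconciled by the client. A minimal sketch of that comparison — not Cassandra's actual implementation:

```python
def compare(a, b):
    """Compare two vector clocks (dicts of node -> counter).

    Returns "before", "after", "equal", or "concurrent"."""
    nodes = set(a) | set(b)
    a_ge = all(a.get(n, 0) >= b.get(n, 0) for n in nodes)
    b_ge = all(b.get(n, 0) >= a.get(n, 0) for n in nodes)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "after"
    if b_ge:
        return "before"
    return "concurrent"  # neither dominates: client must reconcile

v1 = {"node_a": 2, "node_b": 1}
v2 = {"node_a": 2, "node_b": 2}   # v2 saw everything v1 did, plus more
v3 = {"node_a": 3, "node_b": 1}   # concurrent with v2
print(compare(v1, v2), compare(v2, v3))  # before concurrent
```

This is also why vector clocks answer Daniel's timestamp-tie worry: causality is tracked explicitly per node instead of inferred from client wall clocks.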