New user asking for advice on database design

2010-04-21 Thread David Boxenhorn
Hi guys! I'm brand new to Cassandra, and I'm working on a database design. I don't necessarily know all the advantages/limitations of Cassandra, so I'm not sure that I'm doing it right... It seems to me that I can divide my database into two parts: 1. The (mostly) normal data, where every piece

Re: Cassandra's bad behavior on disk failure

2010-04-21 Thread Oleg Anastasjev
> > Ideally I think we'd like to leave the node up to serve reads, if a > disk is erroring out on writes but still read-able. In my experience > this is very common when a disk first begins to fail, as well as in > the "disk is full" case where there is nothing actually wrong with the > disk per

Re: PHP client crashed if a column value > 8192 bytes

2010-04-21 Thread Ken Sandney
After many attempts I found this error only occurred when using PHP thrift_protocol extension. I don't know if there are some parameters that I could adjust for this issue. By the way, without the ext the speed is obviously slow. On Thu, Apr 22, 2010 at 12:01 PM, Ken Sandney wrote: > I am using

PHP client crashed if a column value > 8192 bytes

2010-04-21 Thread Ken Sandney
I am using PHP as client to talk to Cassandra server but I found out if any column value > 8192 bytes, the client crashed with the following error: PHP Fatal error: Uncaught exception 'TException' with message 'TSocket: > timed out reading 1024 bytes from 10.0.0.177:9160' in > /home/phpcassa/incl

RE: security, firewall level only?

2010-04-21 Thread Stu Hood
It isn't very well documented apparently, but if you are using 0.6, you can look at the 'Authenticator' property in the default config for an explanation of how to authenticate users. With the SimpleAuthenticator implementation, there are properties files that define your users and passwords, a

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Ingram Chen
Argh! That's right. I checked OutboundTcpConnection and it only does closeSocket() after something went wrong. I will add more logging to OutboundTcpConnection to see what actually happens. Thanks for your help. On Thu, Apr 22, 2010 at 10:03, Jonathan Ellis wrote: > But those connections aren't supposed to

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Stu Hood
Nicolas, Were all of those super column writes going to the same row? http://wiki.apache.org/cassandra/CassandraLimitations Thanks, Stu -Original Message- From: "Nicolas Labrot" Sent: Wednesday, April 21, 2010 11:54am To: user@cassandra.apache.org Subject: Re: Cassandra tuning for runn

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Jonathan Ellis
But those connections aren't supposed to ever terminate unless a node dies or is partitioned. So if we "fix" it by adding a socket.close I worry that we're covering up something more important. On Wed, Apr 21, 2010 at 8:53 PM, Ingram Chen wrote: > I agree your point. I patch the code and log mor

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Ingram Chen
I agree with your point. I patched the code to log more information and find out the real cause. Here is the code snippet I think may be the cause: IncomingTcpConnection: public void run() { while (true) { try { MessagingService.validateMagi

Re: RandomPartitioner doubts

2010-04-21 Thread Jonathan Ellis
For each "page" of results, start with the key that was last in the previous iteration, and you will get all the keys back. The order is random but consistent. On Wed, Apr 21, 2010 at 7:55 PM, Lucas Di Pentima wrote: > Hello, > > I'm using Cassandra 0.6.1 and ruby's library. I did some tests on
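The paging advice above can be sketched as follows. This is a minimal illustration, not the client library's API: `fetch_page` is a hypothetical stand-in for a range call such as get_range_slices, and the in-memory ring built by `make_fetch_page` only mimics the "random but consistent" order by sorting keys on an MD5-derived token. It assumes the start key is inclusive, so every page after the first repeats the previous page's last key.

```python
import hashlib


def token(key):
    # Order keys by an MD5-derived token, not by the key itself
    # (an approximation of how RandomPartitioner places rows).
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def make_fetch_page(keys):
    """Build a hypothetical, in-memory stand-in for a range query."""
    ordered = sorted(keys, key=token)

    def fetch_page(start_key, count):
        # An empty start key means "from the beginning of the ring".
        start = 0 if start_key == "" else ordered.index(start_key)
        return ordered[start:start + count]

    return fetch_page


def scan_all_keys(fetch_page, page_size):
    """Collect every key by restarting each page at the previous last key."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    seen = []
    start_key = ""
    while True:
        page = fetch_page(start_key, page_size)
        # Drop the repeated first key on every page after the first.
        new_keys = page[1:] if start_key != "" else page
        seen.extend(new_keys)
        if len(page) < page_size:
            break  # a short page means the end of the ring was reached
        start_key = page[-1]
    return seen
```

The returned keys come back in token order rather than key order, so a caller should not rely on the last key of a page being the "greatest" one; it is only the correct place to resume.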

Re: TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160

2010-04-21 Thread Ken Sandney
I've tried the patch on https://issues.apache.org/jira/browse/THRIFT-347 , but still got this error: PHP Fatal error: Uncaught exception 'TException' with message 'TSocket: > timed out reading 1024 bytes from 10.0.0.169:9160' in > /home/phpcassa/include/thrift/transport/TSocket.php:266 > Stack tr

RandomPartitioner doubts

2010-04-21 Thread Lucas Di Pentima
Hello, I'm using Cassandra 0.6.1 and ruby's library. I did some tests on my one-node development installation about using get_range method to scan the whole CF. What I want to prove is if a CF with RandomPartitioner can be used with get_range getting a fixed number of keys at a time, until all

Re: questions about consistency

2010-04-21 Thread Masood Mortazavi
Hi Daniel, For a general theoretical understanding, try reading some of the papers on eventual consistency by Werner Vogels. Reading the SOSP'07, Dynamo paper would also help with some of the theoretical foundations and academic references. To get even further into it, try reading Replication T

Re: CassandraLimitations

2010-04-21 Thread Bill de hOra
Sweet. Bill Jonathan Ellis wrote: No. On Wed, Apr 21, 2010 at 2:58 PM, Bill de hOra wrote: http://wiki.apache.org/cassandra/CassandraLimitations has good coverage on the limits around columns. Are there any design (or practical) limits to the number of rows a keyspace can have? Bill

Re: CassandraLimitations

2010-04-21 Thread Bill de hOra
> Are you asking if there are limits in the context > of a single node or a > ring of nodes? A ring, but across a few (3+) datacenters. Bill Mark Greene wrote: Hey Bill, Are you asking if there are limits in the context of a single node or a ring of nodes? On Wed, Apr 21, 2010 at 3:58 PM,

April Seattle Hadoop/Scalability/NoSQL Meetup: Cassandra, Science, More!

2010-04-21 Thread Bradford Stephens
Hey there! Wanted to let you all know about our next meetup, April 28th. We've got a killer new venue thanks to Amazon. Check out the details at the link: http://www.meetup.com/Seattle-Hadoop-HBase-NoSQL-Meetup/calendar/13072272/ Our Speakers this month: 1. Nick Dimiduk, Drawn to Scale: Intro to

Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
Gotcha. No i don't see anything particularly interesting in the log. Do i need to turn on higher logging in log4j? here it is after i killed the client: INFO [main] 2010-04-21 14:25:52,166 DatabaseDescriptor.java (line 229) Auto DiskAccessMode determined to be standard INFO [main] 2010-04-2

Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
what i mean by as data is processed is that the column size will grow in cassandra, but my client isn't ever writing large column size under a given row... Any idea whats going on here? On Wed, Apr 21, 2010 at 3:05 PM, Sonny Heer wrote: > What does OOM stand for? > > for a given insert the size

Re: Import using cassandra 0.6.1

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 5:05 PM, Sonny Heer wrote: > What does OOM stand for? out of memory > for a given insert the size is small (meaning a single insert > operation only has about a sentence of data) although as the insert > process continues, the columns under a given row key could pote

Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
What does OOM stand for? for a given insert the size is small (meaning a single insert operation only has about a sentence of data) although as the insert process continues, the columns under a given row key could potentially grow to be large. Is that what you mean? An operation entails: Re

Re: Import using cassandra 0.6.1

2010-04-21 Thread Jonathan Ellis
then that's not the problem. are you writing large rows that OOM during compaction? On Wed, Apr 21, 2010 at 4:34 PM, Sonny Heer wrote: > They are showing up as completed?  Is this correct: > > > Pool Name                    Active   Pending      Completed > STREAM-STAGE                      0  

Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
They are showing up as completed? Is this correct:
    Pool Name                    Active   Pending      Completed
    STREAM-STAGE                      0         0              0
    RESPONSE-STAGE                    0         0              0
    ROW-READ-STAGE                    0         0         517446
    L

security, firewall level only?

2010-04-21 Thread S Ahmed
Is security in terms of remote clients connecting to a cassandra node done purely at the hardware/firewall level? i.e. there is no username/pwd like in mysql/sqlserver correct? Or permissions at the column family level per user ?

Re: unsubscribe

2010-04-21 Thread Jeremy Dunck
You have a typo: user-unsubscr...@cassandra.apache.org, not user-unsubcr...@cassandra.apache.org. :-) On Wed, Apr 21, 2010 at 3:55 PM, Jennifer Huynh wrote: > Anyone know how to unsubscribe to the mailing list? I tried emailing the > server, user-unsubcr...@cassandra.apache.org, and had no luck.

Re: Import using cassandra 0.6.1

2010-04-21 Thread Jonathan Ellis
you need to figure out where the memory is going. check tpstats, if the pending ops are large somewhere that means you're just generating insert ops faster than it can handle. On Wed, Apr 21, 2010 at 4:07 PM, Sonny Heer wrote: > note: I'm using the Thrift API to insert.  The commitLog directory

Re: Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
note: I'm using the Thrift API to insert. The commitLog directory continues to grow. The heap size continues to grow as well. I decreased MemtableSizeInMB size, but noticed no changes. Any idea what is causing this, and/or what property i need to tweak to alleviate this? What is the "insert th

unsubscribe

2010-04-21 Thread Jennifer Huynh
Anyone know how to unsubscribe to the mailing list? I tried emailing the server, user-unsubcr...@cassandra.apache.org, and had no luck. Thanks in advance!!!

Re: CassandraLimitations

2010-04-21 Thread Mark Greene
Hey Bill, Are you asking if there are limits in the context of a single node or a ring of nodes? On Wed, Apr 21, 2010 at 3:58 PM, Bill de hOra wrote: > http://wiki.apache.org/cassandra/CassandraLimitations has good coverage on > the limits around columns. > > Are there any design (or practical)

Re: CassandraLimitations

2010-04-21 Thread Jonathan Ellis
No. On Wed, Apr 21, 2010 at 2:58 PM, Bill de hOra wrote: > http://wiki.apache.org/cassandra/CassandraLimitations has good coverage on > the limits around columns. > > Are there any design (or practical) limits to the number of rows a keyspace > can have? > > Bill >

CassandraLimitations

2010-04-21 Thread Bill de hOra
http://wiki.apache.org/cassandra/CassandraLimitations has good coverage on the limits around columns. Are there any design (or practical) limits to the number of rows a keyspace can have? Bill

Re: Problem using get_range_slices

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 2:19 PM, Guilherme Kaster wrote: > I've encountered a problem on cassandra 0.6 while using get_range_slices. > I use RP and when I use get_range_slices the keys are not returned in an > "ordered" manner, that means the last key on the list is not always the > "greater" key in

Problem using get_range_slices

2010-04-21 Thread Guilherme Kaster
I've encountered a problem on cassandra 0.6 while using get_range_slices. I use RP and when I use get_range_slices the keys are not returned in an "ordered" manner, that means the last key on the list is not always the "greater" key in the list, so I started getting repetitions and ONCE entered in an

Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Anthony Molinaro
On Wed, Apr 21, 2010 at 01:24:45PM -0500, Jonathan Ellis wrote: > On Wed, Apr 21, 2010 at 1:11 PM, Anthony Molinaro > wrote: > > Interesting, in the config I see > > > >   > >  5000 > > > > So I thought that timeout was for inter-node communication not the thrift > > API, but I see how you probabl

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Anthony Molinaro
On Wed, Apr 21, 2010 at 12:05:07PM -0500, Jonathan Ellis wrote: > On Wed, Apr 21, 2010 at 11:31 AM, Anthony Molinaro > wrote: > > > > On Wed, Apr 21, 2010 at 11:08:19AM -0500, Jonathan Ellis wrote: > >> Yes, that looks right, where "token really close" means "slightly less > >> than" (more than w

Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 1:11 PM, Anthony Molinaro wrote: > Interesting, in the config I see > >   >  5000 > > So I thought that timeout was for inter-node communication not the thrift > API, but I see how you probably consider both inter-node traffic and > thrift traffic as clients.  Does this RPC

Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Anthony Molinaro
On Wed, Apr 21, 2010 at 12:52:32PM -0500, Jonathan Ellis wrote: > On Wed, Apr 21, 2010 at 12:45 PM, Anthony Molinaro > wrote: > >> as for why it backs up in the first place before the restart, you can > >> either (a) throttle writes [set your timeout lower, make your clients > >> back off temporar

Re: Should I use Cassandra for general purpose DB?

2010-04-21 Thread Miguel Verde
On Wed, Apr 21, 2010 at 12:56 PM, Soichi Hayashi wrote: > So, I am interested in using Cassandra not because of a large amount of data, > but because of the following reasons. > > 1) It's easy to administer and handle fail-over (and scale, of course) > 2) Easy to write an application that makes sense

Re: Cassandra data model for financial data

2010-04-21 Thread Miguel Verde
On Wed, Apr 21, 2010 at 12:17 PM, Steve Lihn wrote: > [...] > Design 1: Each attribute is a super column. Therefore each date is a > column. So we have: > > AAPL -> closingPrice -> { '2010-04-13' : 242, '2010-04-14': 245 } > AAPL -> volume -> { '2010-04-13' : 10.9m, '2010-04-14': 14.4m } > etc

Should I use Cassandra for general purpose DB?

2010-04-21 Thread Soichi Hayashi
Hi. So, I am interested in using Cassandra not because of a large amount of data, but because of the following reasons. 1) It's easy to administer and handle fail-over (and scale, of course) 2) Easy to write an application that makes sense to developers (developers are fully in control of how data is or

Re: Import using cassandra 0.6.1

2010-04-21 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts On Wed, Apr 21, 2010 at 12:02 PM, Sonny Heer wrote: > Currently running on a single node with intensive write operations. > > > After running for a while... > > Client starts outputting: > > TimedOutException() >        at > org

Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 12:45 PM, Anthony Molinaro wrote: >> as for why it backs up in the first place before the restart, you can >> either (a) throttle writes [set your timeout lower, make your clients >> back off temporarily when it gets a timeoutexception] > > What timeout is this?  Something

Re: Just to be clear, cassandra is web framework agnostic b/c of Thrift?

2010-04-21 Thread Mark Greene
I'll try to test this out tonight. On Wed, Apr 21, 2010 at 1:07 PM, Jonathan Ellis wrote: > There is a patch attached to > https://issues.apache.org/jira/browse/CASSANDRA-948 that needs > volunteers to test. > > On Sun, Apr 18, 2010 at 11:13 PM, Mark Greene wrote: > > With the 0.6.0 release, th

Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Anthony Molinaro
On Wed, Apr 21, 2010 at 12:21:31PM -0500, Jonathan Ellis wrote: > [moving to u...@] > > 0.6 fixes replaying faster than it can flush. Yeah, I noticed some of those fixes, and will probably take the leap into 0.6 if I can keep my cluster running (it's not doing too bad, I do about 400K reads and

Re: Cassandra data model for financial data

2010-04-21 Thread JKnight JKnight
I know Cassandra is very flexible. a. Because a super column cannot contain a large number of columns, you should not use design 1. b. Maybe, for each query, you will have to use a separate ColumnFamily. On Wed, Apr 21, 2010 at 1:17 PM, Steve Lihn wrote: > Hi, > I am new to Cassandra. I would like to

Re: Cassandra's bad behavior on disk failure

2010-04-21 Thread Jonathan Ellis
We have a ticket open for this: https://issues.apache.org/jira/browse/CASSANDRA-809 Ideally I think we'd like to leave the node up to serve reads, if a disk is erroring out on writes but still read-able. In my experience this is very common when a disk first begins to fail, as well as in the "dis

Re: Cassandra 0.5.1 restarts slow

2010-04-21 Thread Jonathan Ellis
[moving to u...@] 0.6 fixes replaying faster than it can flush. as for why it backs up in the first place before the restart, you can either (a) throttle writes [set your timeout lower, make your clients back off temporarily when it gets a timeoutexception] or (b) add capacity. (b) is recommende
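Option (a), having clients back off temporarily when they get a timeout exception, might look like the sketch below. Everything named here is an assumption for illustration: `TimedOutError` stands in for whatever timeout exception the client library raises, `insert_fn` is any zero-argument callable that performs one write, and the doubling delay is one common back-off policy rather than something the thread prescribes.

```python
import time


class TimedOutError(Exception):
    """Hypothetical stand-in for the client library's timeout exception."""


def insert_with_backoff(insert_fn, max_attempts=5, base_delay=0.1,
                        sleep=time.sleep):
    """Call insert_fn, pausing for a growing delay after each timeout."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return insert_fn()
        except TimedOutError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the timeout
            sleep(delay)   # back off so the server can drain its queue
            delay *= 2     # wait twice as long before the next retry
```

The `sleep` parameter is injectable only so the behaviour can be exercised without real waiting. Backing off reduces pressure on a saturated node but does not add capacity, which is why the message still recommends option (b).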

Re: Modelling assets and user permissions

2010-04-21 Thread Jonathan Ellis
if you want to look up "what permissions does user X have on asset Y" then i would model that as a row keyed by userid, containing supercolumns named by asset ids, and containing subcolumns of the permissions granted. On Mon, Apr 19, 2010 at 12:03 PM, tsuraan wrote: > Suppose I have a CF that hol
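The layout described above can be pictured as nested maps. The sketch below uses plain Python dicts as a stand-in for a super column family; the user ids, asset ids, permission names, and helper functions are all made up for illustration and are not part of any Cassandra client API.

```python
# row key (user id) -> supercolumn name (asset id) -> subcolumns (permissions)
# Subcolumn values are left empty: the column name alone carries the meaning.
permissions = {
    "user42": {
        "asset-a": {"read": "", "write": ""},
        "asset-b": {"read": ""},
    },
}


def grant(store, user_id, asset_id, permission):
    """Add one permission subcolumn under the user's row and asset supercolumn."""
    store.setdefault(user_id, {}).setdefault(asset_id, {})[permission] = ""


def permissions_for(store, user_id, asset_id):
    """Answer 'what permissions does user X have on asset Y' with one row lookup."""
    return set(store.get(user_id, {}).get(asset_id, {}))
```

Because the row is keyed by user id, the question "what permissions does user X have on asset Y" touches a single row. The reverse question (which users can see asset Y) would need a second, asset-keyed structure.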

Cassandra data model for financial data

2010-04-21 Thread Steve Lihn
Hi, I am new to Cassandra. I would like to use Cassandra to store financial data (time series). I have a question about the data model design. The example here is the daily stock data. This would be a column family called dailyStockData. The row key is the stock ticker. Every day there are attributes like clo

Re: Delete row

2010-04-21 Thread Jonathan Ellis
You can serialize any RowMutation for BMT but if all you're doing is deleting rows why bother with BMT? It is not significantly more efficient than Thrift for that. On Tue, Apr 20, 2010 at 12:47 PM, Sonny Heer wrote: > How do i delete a row using BMT method? > > Do I simply do a mutate with colu

Re: tcp CLOSE_WAIT bug

2010-04-21 Thread Jonathan Ellis
I'd like to get something besides "I'm seeing close wait but i have no idea why" for a bug report, since most people aren't seeing that. On Tue, Apr 20, 2010 at 9:33 AM, Ingram Chen wrote: > I trace IncomingStreamReader source and found that incoming socket comes > from MessagingService$SocketThr

Re: restore with snapshot

2010-04-21 Thread Jonathan Ellis
On Mon, Apr 19, 2010 at 2:03 PM, Lee Parker wrote: > I am working on finalizing our backup and restore procedures for a cassandra > cluster running on EC2. I understand based on the wiki that in order to > replace a single node, I don't actually need to put data on that node.  I > just need to boo

Re: Just to be clear, cassandra is web framework agnostic b/c of Thrift?

2010-04-21 Thread Jonathan Ellis
There is a patch attached to https://issues.apache.org/jira/browse/CASSANDRA-948 that needs volunteers to test. On Sun, Apr 18, 2010 at 11:13 PM, Mark Greene wrote: > With the 0.6.0 release, the windows cassandra.bat file errors out. There's a > bug filed for this already. There's a README or som

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Jonathan Ellis
On Wed, Apr 21, 2010 at 11:31 AM, Anthony Molinaro wrote: > > On Wed, Apr 21, 2010 at 11:08:19AM -0500, Jonathan Ellis wrote: >> Yes, that looks right, where "token really close" means "slightly less >> than" (more than would move it into a different node's range). > > Is it better to go slightly

Import using cassandra 0.6.1

2010-04-21 Thread Sonny Heer
Currently running on a single node with intensive write operations. After running for a while... Client starts outputting: TimedOutException() at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12232) at org.apache.cassandra.thrift.Cassandra$Client.recv

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
I do not have a website ;) I'm testing the viability of Cassandra to store XML documents and make fast search queries. 4000 XML files (80MB of XML) create with my datamodel (one SC per XML node) 100 SC which make Cassandra go OOM with Xmx 1GB. On the contrary an xml DB like eXist handles 4000

Re: At what point does the cluster get faster than the individual nodes?

2010-04-21 Thread Mark Greene
Right, it's a similar concept to DB sharding, where you spread the write load around to different DB servers: you won't necessarily increase the throughput of any one DB server, but rather of the servers collectively. On Wed, Apr 21, 2010 at 12:16 PM, Mike Gallamore < mike.e.gallam...@googlemail.com> wrote: > Some

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Anthony Molinaro
On Wed, Apr 21, 2010 at 11:08:19AM -0500, Jonathan Ellis wrote: > Yes, that looks right, where "token really close" means "slightly less > than" (more than would move it into a different node's range). Is it better to go slightly less than (say Token - 1), or slightly more than the beginning of t

Re: At what point does the cluster get faster than the individual nodes?

2010-04-21 Thread Mike Gallamore
Some people might be able to answer this better than me. However: with quorum consistency you have to communicate with n/2 + 1 nodes, where n is the replication factor. So unless you are disk bound your real expense is going to be all those extra network latencies. I'd expect that you'll see a r
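The n/2 + 1 figure uses integer division. A small worked sketch, with a helper name invented for illustration:

```python
def quorum_size(replication_factor):
    """Replicas that must respond for a quorum: floor(n / 2) + 1."""
    return replication_factor // 2 + 1


# Print the quorum for a few replication factors.
for n in (1, 2, 3, 5):
    print("replication factor", n, "-> quorum of", quorum_size(n))
```

With a replication factor of 2, as in the cluster that started this thread, a quorum is 2, so every quorum operation waits on both replicas. A replication factor of 3 is the smallest where a quorum can be met with one replica not responding.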

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Jonathan Ellis
Yes, that looks right, where "token really close" means "slightly less than" (more than would move it into a different node's range). You can't really migrate via scp since only one node with a given token can exist in the cluster at a time. -Jonathan On Wed, Apr 21, 2010 at 11:02 AM, Anthony Mo

Re: Clarification on Ring operations in Cassandra 0.5.1

2010-04-21 Thread Anthony Molinaro
Hi, I'm still curious if I got the data movement right in this email from before? Anyone? Also, anyone know if I can scp the data directory from a node I want to replace to a new machine? The cassandra streaming seems much slower than scp. -Anthony On Mon, Apr 19, 2010 at 04:48:23PM -0700,

Re: At what point does the cluster get faster than the individual nodes?

2010-04-21 Thread Jim R. Wilson
Hi Mark, I'm a relative newcomer to Cassandra, but I believe the common experience is that you start seeing gains after 5 nodes in a column-oriented data store. It may also depend on your usage pattern. Others may know better - hope this helps! -- Jim R. Wilson (jimbojw) On Wed, Apr 21, 2010 a

At what point does the cluster get faster than the individual nodes?

2010-04-21 Thread Mark Jones
I'm seeing a cluster of 4 (replication factor=2) to be about as slow overall as, or barely faster than, the slowest node in the group. When I run the 4 nodes individually, I see: For inserts: Two nodes @ 12000/second 1 node @ 9000/second 1 node @ 7000/second For reads: Abysmal, less than 1000/s

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Greene
Maybe, maybe not. Presumably if you are running an RDBMS with any reasonable amount of traffic nowadays, it's sitting on a machine with 4-8G of memory at least. On Wed, Apr 21, 2010 at 10:48 AM, Nicolas Labrot wrote: > Thanks Mark. > > Cassandra is maybe too much for my needs ;) > > > > On Wed, A

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
Thanks Mark. Cassandra is maybe too much for my needs ;) On Wed, Apr 21, 2010 at 4:45 PM, Mark Greene wrote: > Hit send too early > > That being said a lot of people running Cassandra in production are using > 4-6GB max heaps on 8GB machines, don't know if that helps but hopefully > gives yo

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Greene
Hit send too early. That being said, a lot of people running Cassandra in production are using 4-6GB max heaps on 8GB machines. I don't know if that helps, but hopefully it gives you some perspective. On Wed, Apr 21, 2010 at 10:39 AM, Mark Greene wrote: > RAM doesn't necessarily need to be proportio

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Greene
RAM doesn't necessarily need to be proportional but I would say the number of nodes does. You can't just throw a bazillion inserts at one node. The main benefit of Cassandra is that if you start hitting your capacity, you add more machines and distribute the keys across more machines. On W

Re: Big Data Workshop 4/23 was Re: Cassandra Hackathon in SF @ Digg - 04/22 6:30pm

2010-04-21 Thread Eric Evans
On Tue, 2010-04-20 at 17:28 -0700, Joseph Boyle wrote: > We will have people from the Cassandra (including Stu Hood and Matt > Pfeil) and other NoSQL communities as well as with broader Big Data > interests, all available for discussion, and you can propose a session > to learn about anything. Ga

Cassandra's bad behavior on disk failure

2010-04-21 Thread Oleg Anastasjev
Hello, I am testing how cassandra behaves on single node disk failures to know what to expect when things go bad. I had a cluster of 4 cassandra nodes, stress loaded it with client and made 2 tests: 1. emulated disk failure of /data volume on read only stress test 2. emulated disk failure of /comm

Re: problem with get_key_range in cassandra 0.4.1

2010-04-21 Thread Jonathan Ellis
first, upgrade to 0.6.1. second, the easiest way to wipe everything is at the fs level like Mark said. On Wed, Apr 21, 2010 at 5:20 AM, ROGER PUIG GANZA wrote: > Hi all. > > I’m benchmarking several  nosql datastores and I’m going nuts with > Cassandra. > > The version of Cassandra we are using

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
So does it mean the RAM needed is proportional to the data handled? Or does Cassandra need a minimum amount of RAM when the dataset is big? I must confess this OOM behaviour is strange. On Wed, Apr 21, 2010 at 2:54 PM, Mark Jones wrote: > On my 4GB machine I’m giving it 3GB and having no trouble

RE: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Jones
On my 4GB machine I'm giving it 3GB and having no trouble with 60+ million 500 byte columns From: Nicolas Labrot [mailto:nith...@gmail.com] Sent: Wednesday, April 21, 2010 7:47 AM To: user@cassandra.apache.org Subject: Re: Cassandra tuning for running test on a desktop I have tried 1400M, and Cass

RE: problem with get_key_range in cassandra 0.4.1

2010-04-21 Thread Mark Jones
Stop the program, wipe the data dir and commit logs, start the program, it's what I'm doing. I even made a script that will do it so it's just a one line command. From: ROGER PUIG GANZA [mailto:rp...@tid.es] Sent: Wednesday, April 21, 2010 5:20 AM To: cassandra-u...@incubator.apache.org Subject:

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
I have tried 1400M, and Cassandra goes OOM too. Is there another solution? My data isn't very big. It seems that it is the merge of the db On Wed, Apr 21, 2010 at 2:14 PM, Mark Greene wrote: > Try increasing Xmx. 1G is probably not enough for the amount of inserts > you are doing. > > > On Wed, Apr

Re: Cassandra tuning for running test on a desktop

2010-04-21 Thread Mark Greene
Try increasing Xmx. 1G is probably not enough for the amount of inserts you are doing. On Wed, Apr 21, 2010 at 8:10 AM, Nicolas Labrot wrote: > Hello, > > For my first message I would first like to thank the Cassandra contributors for their > great work. > > I have a parameter issue with Cassandra (I ho

Cassandra tuning for running test on a desktop

2010-04-21 Thread Nicolas Labrot
Hello, For my first message I would first like to thank the Cassandra contributors for their great work. I have a parameter issue with Cassandra (I hope it's just a parameter issue). I'm using Cassandra 6.0.1 with Hector client on my desktop. It's a simple dual core with 4GB of RAM on WinXP. I have kept the

Re: questions about consistency

2010-04-21 Thread Даниел Симеонов
Hi Paul, about the last answer I still need some more clarification: as I understand it, if QUORUM is used, then reads don't get old values either? Or am I wrong? Thank you very much! Best regards, Daniel. 2010/4/21 Paul Prescod > I'm not an expert, so take what I say with a grain of salt.

Re: questions about consistency

2010-04-21 Thread Paul Prescod
I'm not an expert, so take what I say with a grain of salt. 2010/4/21 Даниел Симеонов : > Hello, >    I am pretty new to Cassandra and I have some questions, they may seem > trivial, but still I am pretty new to the subject. First is about the lack > of a compareAndSet() operation, as I understood

problem with get_key_range in cassandra 0.4.1

2010-04-21 Thread ROGER PUIG GANZA
Hi all. I'm benchmarking several nosql datastores and I'm going nuts with Cassandra. The version of Cassandra we are using is 0.4.1 I know 0.4.1 is a bit outdated but my implementation is done with that version. The thing is that every time the test runs, I need to reset the data inside the da

questions about consistency

2010-04-21 Thread Даниел Симеонов
Hello, I am pretty new to Cassandra and I have some questions, they may seem trivial, but still I am pretty new to the subject. First is about the lack of a compareAndSet() operation, as I understood it is not supported currently in Cassandra, do you know of use cases which really require such o