Re: Trove maps

2010-04-25 Thread Tatu Saloranta
On Sat, Apr 24, 2010 at 6:27 AM, Carlos Sanchez carlos.sanc...@riskmetrics.com wrote: There are forEach methods in  that would allow you to travel the keys/values/entries w/o creating the extra object (entries) Ok. So if the change was made, it'd make sense to ensure those were used for traversal.

Re: Cassandra - Thread leak when high concurrent load

2010-04-25 Thread Mark Robson
On 25 April 2010 10:48, JKnight JKnight beukni...@gmail.com wrote: Dear all, My Cassandra server had a thread leak under high concurrent load. I used jconsole and saw many, many threads appear. Just because there are a lot of threads does not imply a thread leak. Cassandra uses a lot of

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Mark Robson
For me an important difference is that Cassandra is operationally much more straightforward - there is only one type of node, and it is fully redundant (depending on what consistency level you're using). This seems to be an advantage in Cassandra vs most other distributed storage systems, which

Re: tcp CLOSE_WAIT bug

2010-04-25 Thread yangfeng
I encountered the same problem! Hope to get some help. Thanks. 2010/4/22 Ingram Chen ingramc...@gmail.com arh! That's right. I checked OutboundTcpConnection and it only does closeSocket() after something went wrong. I will log more in OutboundTcpConnection to see what actually happens. Thank

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Joe Stump
On Apr 25, 2010, at 11:40 AM, Mark Robson wrote: For me an important difference is that Cassandra is operationally much more straightforward - there is only one type of node, and it is fully redundant (depending on what consistency level you're using). This seems to be an advantage in

Re: getting cassandra setup on windows 7

2010-04-25 Thread S Ahmed
Great, that worked, thanks! On Fri, Apr 23, 2010 at 2:28 PM, Mark Greene green...@gmail.com wrote: Try the cassandra-with-fixes.bat file (https://issues.apache.org/jira/secure/attachment/12442349/cassandra-with-fixes.bat) attached to the issue. I had the same issue and that bat file got

Re: Cassandra-cli tutorials

2010-04-25 Thread Roger Schildmeijer
On 25 apr 2010, at 15.15em, S Ahmed wrote: Ok excited I got it up and running on windows 7, yah! Curious, are there any tutorials or examples of using the cassandra-cli? http://wiki.apache.org/cassandra/CassandraCli BTW, the cassandra-cli is pretty cool, even comes with tab-complete, is

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Lenin Gali
I second Joe. Lenin Sent from my BlackBerry® wireless handheld -Original Message- From: Joe Stump j...@joestump.net Date: Sun, 25 Apr 2010 13:04:50 To: user@cassandra.apache.org Subject: Re: The Difference Between Cassandra and HBase On Apr 25, 2010, at 11:40 AM, Mark Robson wrote:

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Eric Hauser
Out of curiosity, are you planning on copying the data you store in HBase/Hive into a separate Hadoop cluster in a different data center or backing up HDFS in some other manner? Redundancy isn't an issue within the cluster; it's more a concern of storing all your HDFS data in one physical location.

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Jeff Hodges
HBase is awesome when you need high throughput and don't care so much about latency. Cassandra is generally the opposite. They are wonderfully complementary. -- Jeff On Sun, Apr 25, 2010 at 8:19 AM, Lenin Gali galile...@gmail.com wrote: I second Joe. Lenin Sent from my BlackBerry® wireless

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Joe Stump
On Apr 25, 2010, at 5:18 PM, Eric Hauser wrote: Out of curiosity, are you planning on copying the data you store in HBase/Hive into a separate Hadoop cluster in a different data center or backing up HDFS in some other manner? Redundancy isn't an issue within the cluster; it's more a

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Joseph Stein
It is kind of the classic distinction between OLTP and OLAP. Cassandra is to OLTP as HBase is to OLAP (for those SAT nutz). Both are useful and valuable in their own right, agreed. On Sun, Apr 25, 2010 at 12:20 PM, Jeff Hodges jhod...@twitter.com wrote: HBase is awesome when you need high

newbie question on how columns names are indexed/lucene limitations?

2010-04-25 Thread TuX RaceR
Hello Cassandra Users, When using the RandomPartitioner and a simple ColumnFamily/Columns (i.e. no SuperColumns), my understanding is that a single row can store millions of columns. If I look at http://wiki.apache.org/cassandra/API, I understand that I can get a subset of the millions

Re: Cassandra - Thread leak when high concurrent load

2010-04-25 Thread JKnight JKnight
Thanks Robson, The number of threads gradually increases to 7000, and the server hangs. I know a thread pool is used to prevent creating a large number of threads. So why does Cassandra create a large number of threads under high concurrent load? On Sun, Apr 25, 2010 at 5:38 PM, Mark Robson mar...@gmail.com

How do you construct an index and use it, especially in Ruby

2010-04-25 Thread Bob Hutchison
Hi, I'm new to Cassandra and trying to work out how to do something that I've implemented any number of times (e.g. TokyoCabinet, Perst, even the filesystem using grep :-) I've managed to get some of this working in Cassandra but not all. So here's the core of the situation. I have this

Re: Cassandra - Thread leak when high concurrent load

2010-04-25 Thread Brandon Williams
On Sun, Apr 25, 2010 at 12:09 PM, JKnight JKnight beukni...@gmail.com wrote: Thanks Robson, The number of threads gradually increases to 7000, and the server hangs. I know a thread pool is used to prevent creating a large number of threads. So why does Cassandra create a large number of threads under high

RE: newbie question on how columns names are indexed/lucene limitations?

2010-04-25 Thread Stu Hood
The indexes within rows are _not_ implemented with Lucene: there is a custom index structure that allows for random access within a row. But, you should probably read http://wiki.apache.org/cassandra/CassandraLimitations to understand the current limitations of the file format, some of which

value size, is there a suggested limit?

2010-04-25 Thread S Ahmed
Is there a suggested maximum size for the value of a given key? E.g. could I convert a document to bytes and store it as the value for a key? If yes, which I presume is the case, what if the file is 10 MB? Or 100 MB?

Re: value size, is there a suggested limit?

2010-04-25 Thread Mark Greene
http://wiki.apache.org/cassandra/CassandraLimitations On Sun, Apr 25, 2010 at 4:19 PM, S Ahmed sahmed1...@gmail.com wrote: Is there a suggested maximum size for the value of a given key? E.g. could I convert a document to bytes and store it as the value for a key? If yes, which

range get over subcolumns on supercolumn family

2010-04-25 Thread Rafael Ribeiro
Hi all! I am trying to do a paginated query on the subcolumns of a supercolumn family, but honestly I am a little bit confused. I have already been able to do a range query, but only over the keys of a regular column family. For the keys case I've been able to do so using the code below:

Re: Question about TimeUUIDType

2010-04-25 Thread Tatu Saloranta
On Sat, Apr 24, 2010 at 2:08 AM, Sylvain Lebresne sylv...@yakaz.com wrote: On Sat, Apr 24, 2010 at 12:53 AM, Jesse McConnell jesse.mcconn...@gmail.com wrote: try LexicalUUIDType, that will distinguish the secs correctly imo based on the existing impl (last I checked at least) TimeUUIDType

Re: Question about TimeUUIDType

2010-04-25 Thread Jonathan Ellis
On Sun, Apr 25, 2010 at 5:40 PM, Tatu Saloranta tsalora...@gmail.com wrote: Now with TimeUUIDType, if two UUIDs have the same timestamp, they are ordered by byte order. Naively for the whole UUID? That would not be good, given that timestamp within UUID is not stored in expected lexical
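
The byte-order concern raised here can be illustrated with Python's standard `uuid` module. This is only a sketch, not Cassandra's comparator: a version 1 UUID stores the low 32 bits of its timestamp first, so comparing raw bytes can disagree with comparing timestamps. The helper `make_v1` and the fixed clock sequence and node values are hypothetical, chosen purely for illustration.

```python
import uuid


def make_v1(timestamp, clock_seq=0, node=0x123456789ABC):
    """Build a version 1 UUID from a 60-bit timestamp (illustrative helper)."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # set version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


# Two timestamps one tick apart, straddling the 32-bit time_low boundary.
earlier = make_v1(0x0FFFFFFFF)
later = make_v1(0x100000000)

# Sorting by raw bytes and sorting by embedded timestamp give different orders.
by_bytes = sorted([earlier, later], key=lambda u: u.bytes)
by_time = sorted([later, earlier], key=lambda u: u.time)
```

Here `by_time` puts `earlier` first while `by_bytes` puts `later` first, which is why a comparator for time-based UUIDs would need to compare the timestamp fields before falling back to the remaining bytes.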

Re: how to store file in the cassandra?

2010-04-25 Thread Shuge Lee
In Python: keyspace.columnfamily[key][column] = value files.video[uuid.uuid4()]['name'] = 'foo.flv' files.video[uuid.uuid4()]['path'] = '/var/files/foo.flv' create a mapping files.video = { uuid.uuid4() : { 'name' : 'foo.flv', 'path' : '/var/files/foo.flv', } } if most

when i use the OrderPreservingPartition, the load is very imbalance

2010-04-25 Thread 刘兵兵
I do some INSERTs. Because I will do some scan operations, I use the OrderPreservingPartitioner. The state of the cluster is shown below. As I predicted, the load is very imbalanced, and some of the nodes are down (on some nodes the Cassandra processes died, and on others the processes are alive

Re: how to store file in the cassandra?

2010-04-25 Thread Shuge Lee
Yes. Cassandra does save raw string data only, not a file, and shouldn't save a file. 2010/4/26 刘兵兵 rucb...@gmail.com Sorry, I'm not very familiar with Python. Do you mean that the files are stored in the file system of the OS, and Cassandra then just stores the path to access the

Re: how to store file in the cassandra?

2010-04-25 Thread Jonathan Ellis
Cassandra stores byte arrays. You can certainly store file data in it, although if it is larger than a few MB you should chunk it into multiple columns. On Sun, Apr 25, 2010 at 8:21 PM, Shuge Lee shuge@gmail.com wrote: Yes. Cassandra does save raw string data only, not a file, and
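
The chunking advice above could look roughly like the following. This is a sketch under stated assumptions, not a Cassandra client API: `chunk_columns`, `join_columns`, the 1 MB default chunk size, and the zero-padded column naming scheme are all hypothetical, and the actual insert call depends on the client library in use.

```python
def chunk_columns(data, chunk_size=1024 * 1024):
    """Split a byte string into a {column_name: bytes} mapping of chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    columns = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # Zero-padded names sort in chunk order under a byte/ASCII comparator.
        columns["chunk-%08d" % index] = data[offset:offset + chunk_size]
    return columns


def join_columns(columns):
    """Reassemble the original bytes by concatenating chunks in name order."""
    return b"".join(columns[name] for name in sorted(columns))
```

Each entry of the returned mapping would be written as one column under the file's row key, and reading the file back means fetching the columns in name order and concatenating them.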