We have a SuperCF which may have up to 1000 super columns, with 5
columns in each super column. The read latency may go up to 50 ms
(or even higher), which I think is a long time to respond. How should we tune the
storage config to optimize performance? I read the wiki;
ColumnIndexSizeInKB may help to
As written in the third point of
http://wiki.apache.org/cassandra/CassandraLimitations,
super columns are currently not indexed and are fully deserialized when you access
them. Put another way, you'll want to use super columns with
only a relatively small number of columns in them.
Because
I notice that there are more than 100 CLOSE_WAIT incoming connections
on storage port 7000.
On my two Cassandra nodes:
126 of 146 storage connections are CLOSE_WAIT
196 of 217 storage connections are CLOSE_WAIT
Is it normal?
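A count like the one above can be reproduced from netstat-style output. The following is only a minimal sketch: it assumes the usual six-column layout of `netstat -tan` (protocol, two queue sizes, local address, remote address, state), and the sample addresses and the name `count_states` are made up for illustration.

```python
from collections import Counter

STORAGE_PORT = 7000  # storage port mentioned in the message


def count_states(netstat_output, port=STORAGE_PORT):
    """Count TCP states for connections whose local address uses `port`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto, recv-q, send-q, local address, remote address, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_address = fields[3]
        if local_address.rsplit(":", 1)[-1] == str(port):
            states[fields[5]] += 1
    return states


# Hypothetical sample lines, not taken from a real node.
SAMPLE = """\
tcp        0      0 10.0.0.1:7000      10.0.0.2:51234     CLOSE_WAIT
tcp        0      0 10.0.0.1:7000      10.0.0.3:40022     ESTABLISHED
tcp        1      0 10.0.0.1:7000      10.0.0.2:51240     CLOSE_WAIT
tcp        0      0 10.0.0.1:9160      10.0.0.9:33000     ESTABLISHED
"""

if __name__ == "__main__":
    counts = count_states(SAMPLE)
    print(counts["CLOSE_WAIT"], "of", sum(counts.values()),
          "storage connections are CLOSE_WAIT")
```

Running the same function over the output captured on each node gives numbers that can be compared over time.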
--
From: Chris
Hi,
I get a fatal exception with my Cassandra cluster:
java.lang.NoClassDefFoundError: org/apache/cassandra/db/CompactionManager$4
at org.apache.cassandra.db.CompactionManager.submitMajor(CompactionManager.java:156)
at
I take a thread dump on each Cassandra node and count the threads whose call
stack contains
at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
in thread-xxx,
then I find an
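The counting step described above could be scripted roughly as follows. This is a sketch under the assumption that the dump separates threads with blank lines (as `kill -3` output normally does); the thread names and header fields in the sample are invented.

```python
import re

# Frame taken from the message; everything else here is illustrative.
FRAME = "org.apache.cassandra.streaming.IncomingStreamReader.read"


def count_threads_with_frame(thread_dump, frame=FRAME):
    """Count thread stanzas (separated by blank lines) that contain `frame`."""
    stanzas = [s for s in re.split(r"\n\s*\n", thread_dump) if s.strip()]
    return sum(1 for stanza in stanzas if frame in stanza)


# Hypothetical dump fragment with three threads, two of them in the reader.
SAMPLE_DUMP = """\
"Thread-12" prio=10 runnable
\tat org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
\tat org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)

"Thread-13" prio=10 runnable
\tat org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
\tat org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)

"main" prio=10 waiting
\tat java.lang.Object.wait(Native Method)
"""

if __name__ == "__main__":
    print(count_threads_with_frame(SAMPLE_DUMP))
```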
hi,
I have not used nodetool repair or nodetool compact. So how is
MajorCompaction triggered?
--
casablinca126.com
2010-06-04
-
From: casablinca126.com
Sent: 2010-06-04 18:05:11
Nice!
Would it be possible to give more than 2 weeks' notice for the following
events? Preferably a month; it's not that easy to get off work, etc.
On Fri, Jun 4, 2010 at 4:22 AM, Oleg Anastasjev olega...@gmail.com wrote:
Jonathan Ellis jbellis at gmail.com writes:
This will be Riptano's
Chris,
Can you get me a stack dump of one of the busy nodes (kill -3)?
Gary
On Thu, Jun 3, 2010 at 22:50, Chris Goffinet goffi...@digg.com wrote:
We're seeing this as well. We were testing with a 40+ node cluster on the
latest 0.6 branch from a few days ago.
-Chris
On Jun 3, 2010, at 9:55
get_slice reads a single row. do you mean there are 23,000 columns,
or are you running get_slice in a loop 23000 times?
On Fri, Jun 4, 2010 at 4:59 AM, Per Olesen p...@trifork.com wrote:
Is 6 to 8 seconds to read 23,000 small rows as it should be?
I have a quick question on what I think is
Anybody looked at VoltDB? I haven't dug into it, but curious about it.
dwh
It works for Random Partitioner only if you want to get all keys.
2010/6/4 Shuai Yuan yuansh...@supertool.net.cn
It's documented that get_range_slice() supports all partitioners in 0.6
Kevin
Original message
From: Olivier Mallassi omalla...@octo.com
To: user@cassandra.apache.org
How many subcolumns are in each supercolumn and how large are the
values? Your example shows 8 subcolumns, but I didn't know if that was
the actual number. I've been able to read columns out of Cassandra at
an order of magnitude higher than what you're seeing here but there
are too many variables
Hi.
I have looked at Cassandra before and now I'm revisiting the project :-) On
the project I am working on we need fast storage for blobs and Lucene
indexes that is available on each node in the cluster. Cassandra seems to
fit very well for the blob storage and cassandra/lucandra for the
Story continued, in hopes this experience is useful to someone...
I shut down the node, removed the huge file, restarted the node, and told
everybody to repair. Two days later, AE stages are still running.
Ian
On Thu, Jun 3, 2010 at 2:21 AM, Jonathan Ellis jbel...@gmail.com wrote:
this is
Here's the scenario: would like R = N where N is the number of nodes. Let's say
8.
1. Create first node, modify storage-conf.xml and change the Seed/ to be the
ip of the node. Change replication factor to 8 for CF of interest. Start the
puppy up.
2. Create 2nd node, modify storage-conf.xml
On Fri, Jun 4, 2010 at 10:36 AM, Philip Stanhope pstanh...@wimba.com wrote:
Here's the scenario: would like R = N where N is the number of nodes. Let's
say 8.
1. Create first node, modify storage-conf.xml and change the Seed/ to be
the ip of the node. Change replication factor to 8 for CF
Thanks for the correction about Keyspace versus ColumnFamily ... I knew that,
just mistyped.
I guess it should be stated (to be obvious) ... that when you are auto
bootstrapping a node ... the seed had better be alive. The scenario I'm dealing
with is that it might not be (reasons for that are
On Fri, Jun 4, 2010 at 11:04 AM, Philip Stanhope pstanh...@wimba.com wrote:
I am contemplating a situation where there may be 2N servers ... but only N
online at any one time. But, for operational purposes, N+n (where n is 1 or
2), N may be occasionally greater than R.
Then Cassandra is
I guess I'm thick ...
What would be the right choice? Our data demands have already been proven to
scale beyond what RDB can handle for our purposes. We are quite pleased with
Cassandra read/write/scale out. Just trying to understand the operational
considerations.
On Jun 4, 2010, at 2:11
On Jun 4, 2010, at 5:19 PM, Ben Browning wrote:
How many subcolumns are in each supercolumn and how large are the
values? Your example shows 8 subcolumns, but I didn't know if that was
the actual number. I've been able to read columns out of Cassandra at
an order of magnitude higher than
On Fri, Jun 4, 2010 at 11:14 AM, Philip Stanhope pstanh...@wimba.com wrote:
I guess I'm thick ...
What would be the right choice? Our data demands have already been proven to
scale beyond what RDB can handle for our purposes. We are quite pleased with
Cassandra read/write/scale out. Just
2010/6/4 Ran Tavory ran...@gmail.com
Cassandra expects a config file and does not expose an alternative API for
this file, that's correct.
I think it's not hard to add such an API, but so far there has been no demand
for it.
I see that making a config API is not that hard. Will probably take a
Yes, I know. And I might end up doing this in the end. I do, though, have
pretty hard upper limits on how many rows I will end up with for each key,
but it might be a good idea nonetheless. Thanks for the advice on
that one.
You set count to Integer.MAX. Did you try with say 3?
On Fri, Jun 04, 2010 at 12:35:51PM -0700, Gary Dusbabek wrote:
Most of the streaming messages are DEBUG, so you'll have to amp up logging.
I've upped logging on the bootstrapping node, and I realize
that it's trying to assume load from two nodes. The other node
(ie the one not mentioned in the
if you have a relatively small, static set of subcolumns, that you
read as a group, then using supercolumns is reasonable
On Tue, Jun 1, 2010 at 7:33 PM, Peter Hsu pe...@motivecast.com wrote:
I have a pretty simple data modeling question. I don't know whether or not
to use a CF or SCF in one
Is there a mechanism to select a time range within a row range query? Is
this planned? For example, return to me the last 10 posts starting at 7:00pm
yesterday?
Nick
Hi,
I am not sure how to implement multiget or slice_range based on a
conditional predicate. For example, what if I want to get only keys
containing certain columns? Thanks.
--
Lev
Hi Fellows,
I have the following design for a system which holds basically key-value pairs
(aka Columns) for each user (SuperColumn Key) in different namespaces
(SuperColumnFamily row key).
Like this:
Namespace-user-column_name = column_value;
keyspaces:
- name: NKVP
On Fri, Jun 04, 2010 at 12:35:51PM -0700, Gary Dusbabek wrote:
Most of the streaming messages are DEBUG, so you'll have to amp up logging.
I upped the logging to DEBUG on the bootstrapping node and the
nodes being bootstrapped from, and the bootstrap completed fine, so I'm
not sure what was
That's entirely up to you. If you make row keys that are time ordered
and include the time as a prefix in the key, you just use get_range()
as usual, start now, end 7pm yesterday, count of 10.
On Fri, Jun 4, 2010 at 2:23 PM, Nicholas Sun nick@raytheon.com wrote:
Is there a mechanism to
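To make the time-prefixed key idea concrete, here is a small sketch that does not call Cassandra at all: it simulates a key range scan over a sorted list, which is only meaningful if the cluster keeps row keys in sorted order (for example under an order-preserving partitioner). The key layout and the names `make_key` and `range_scan` are assumptions for illustration, and the scan runs in ascending order from 7pm, which is just one reading of the question.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def make_key(when, post_id):
    """Row key with a fixed-width timestamp prefix so keys sort by time."""
    return when.strftime("%Y%m%d%H%M%S") + ":" + post_id


def range_scan(sorted_keys, start_key, end_key, count):
    """Return up to `count` keys in [start_key, end_key], in key order."""
    lo = bisect_left(sorted_keys, start_key)
    hi = bisect_right(sorted_keys, end_key)
    return sorted_keys[lo:hi][:count]


# Hypothetical data: one post per hour starting 2010-06-03 15:00.
first = datetime(2010, 6, 3, 15, 0, 0)
KEYS = sorted(make_key(first + timedelta(hours=i), "post%d" % i)
              for i in range(30))

if __name__ == "__main__":
    start = make_key(datetime(2010, 6, 3, 19, 0, 0), "")       # 7pm yesterday
    end = make_key(datetime(2010, 6, 4, 14, 0, 0), "\uffff")   # "now"
    for key in range_scan(KEYS, start, end, 10):
        print(key)
```

The fixed-width prefix matters: without zero padding, string order and time order would disagree.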
https://issues.apache.org/jira/browse/CASSANDRA-16
Can someone (Jonathan?) help me understand the performance characteristics
of this patch?
Specifically: If I have an open ended CF, and I keep inserting with ever
increasing column names (for example current Time), will things generally
work out
If I may ask, why the need for frequent topology changes?
On Fri, Jun 4, 2010 at 1:21 PM, Benjamin Black b...@b3k.us wrote:
On Fri, Jun 4, 2010 at 11:14 AM, Philip Stanhope pstanh...@wimba.com wrote:
I guess I'm thick ...
What would be the right choice? Our data demands have already been
Thanks, Jonathan
On Wed, Jun 2, 2010 at 11:17 PM, Jonathan Ellis jbel...@gmail.com wrote:
you're overcomplicating things.
just connect to *a* node, and if it happens to be down, try a different
one.
nodes being down should be a rare event, not a normal condition. no
need to optimize for
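The "connect to a node, and if it is down try a different one" advice can be sketched in a few lines. This is not a Cassandra client, only a plain TCP connection attempt; the host list, the port value 9160 and the name `connect_any` are assumptions made for the example.

```python
import socket

PORT = 9160  # assumed client port; adjust to your own setup


def connect_any(hosts, port=PORT, connect=socket.create_connection, timeout=2.0):
    """Try each host in turn; return (host, connection) for the first that answers."""
    last_error = None
    for host in hosts:
        try:
            return host, connect((host, port), timeout)
        except OSError as exc:
            last_error = exc  # node appears down; try the next one
    raise ConnectionError("no node reachable out of %r" % (hosts,)) from last_error


if __name__ == "__main__":
    # Demo with a stand-in connect function so no network access is needed.
    def fake_connect(address, timeout):
        if address[0] == "10.0.0.1":
            raise OSError("connection refused")
        return "connection to " + address[0]

    print(connect_any(["10.0.0.1", "10.0.0.2"], connect=fake_connect))
```

Because a down node is expected to be rare, the loop stays simple: no health checks or retries beyond moving on to the next host.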