http://wiki.apache.org/cassandra/NodeProbe
On Fri, Mar 12, 2010 at 12:40 PM, Weijun Li weiju...@gmail.com wrote:
Suppose I insert a lot of new items but also delete a lot of items
daily; it would be ideal if I could force GC to happen at midnight (when
traffic is low). Is there any way to manually force GC to be executed? That
way I could add a cron job to trigger GC at midnight. I tried nodetool
and
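For what it's worth, a manually triggered JVM GC does not reclaim deleted rows in Cassandra: deletes write tombstones, and tombstones are only purged during compaction, and only once the grace period (GCGraceSeconds in the storage config) has elapsed, so the nightly cron job would typically call nodetool compact rather than force a JVM GC. A minimal sketch of the purge-eligibility rule; the function name is mine, and the 10-day default is an assumption about the version in use:

```python
import time

GC_GRACE_SECONDS = 10 * 24 * 3600  # assumed default grace period (10 days)

def purgeable(tombstone_created_at, now=None, gc_grace=GC_GRACE_SECONDS):
    """A tombstone may be dropped during compaction only after the grace
    period has passed, so it has had time to reach any replicas that were
    down when the delete happened."""
    if now is None:
        now = time.time()
    return now - tombstone_created_at > gc_grace

# A tombstone written 11 days ago is eligible; one from yesterday is not.
now = 1_000_000_000
assert purgeable(now - 11 * 24 * 3600, now=now)
assert not purgeable(now - 1 * 24 * 3600, now=now)
```

This is also why a nightly major compaction mostly helps once deletes are older than the grace period.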
-----Original Message-----
From: Sylvain Lebresne [mailto:sylv...@yakaz.com]
Sent: Thursday, February 25, 2010 2:23 AM
To: Weijun Li
Cc: cassandra-user@incubator.apache.org
Subject: Re: Strategy to delete/expire keys in cassandra
Hi,
Should I just run a command (in the Cassandra 0.5 source folder
Never mind. Figured out I forgot to compile thrift :)
Thanks,
-Weijun
On Wed, Mar 10, 2010 at 1:43 PM, Weijun Li weiju...@gmail.com wrote:
Hi Sylvain,
I applied your patch to 0.5, but it seems that it's not compilable:
1) column.getTtl() is not defined in RowMutation.java
public static
From: Sylvain Lebresne [mailto:sylv...@yakaz.com]
Sent: Thursday, February 25, 2010 2:23 AM
To: Weijun Li
Cc: cassandra-user@incubator.apache.org
Subject: Re: Strategy to delete/expire keys in cassandra
Hi,
Should I just run a command (in the Cassandra 0.5 source folder?) like:
patch -p1 -i 0001-Add-new
in your ticket?
Also, what's your opinion on extending ExpiringColumn to expire a key
completely? Otherwise it will be difficult to track which rows in Cassandra
are expired or old.
Thanks,
-Weijun
From: Weijun Li [mailto:weiju...@gmail.com]
Sent: Tuesday, February 23, 2010 6:18 PM
It seems that we are mostly talking about writing and reading keys into/from
a Cassandra cluster. I'm wondering how you successfully deal with
deleting/expiring keys in Cassandra. A typical example: you want to delete
keys that haven't been modified in a certain time period (i.e., old keys).
Thanks for the answer. A dumb question: how did you apply the patch file to
the 0.5 source? The link you gave doesn't mention that the patch is for 0.5.
Also, this ExpiringColumn feature doesn't seem to expire the key/row, meaning
the number of keys will keep growing (even if you drop all of their columns).
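The ExpiringColumn patch under discussion attaches a TTL to individual columns; one way to model "expire the key completely" is to treat a row as gone once every column's TTL has elapsed. A toy sketch under that assumption; the data layout and function names are my own, not Cassandra's:

```python
import time

def live_columns(row, now=None):
    """Return the columns whose TTL has not yet elapsed.
    `row` maps column name -> (value, written_at, ttl_seconds or None)."""
    if now is None:
        now = time.time()
    return {
        name: (value, written_at, ttl)
        for name, (value, written_at, ttl) in row.items()
        if ttl is None or written_at + ttl > now
    }

def row_expired(row, now=None):
    """Treat the key as expired once every one of its columns has expired."""
    return not live_columns(row, now=now)

now = 1000.0
row = {"a": ("x", 0.0, 500), "b": ("y", 900.0, 500)}
assert set(live_columns(row, now=now)) == {"b"}   # "a" expired at t=500
assert not row_expired(row, now=now)              # "b" keeps the row alive
assert row_expired({"a": ("x", 0.0, 500)}, now=now)
```

Under this model the key disappears from reads as soon as its last column expires, but the on-disk entry would still only be reclaimed by compaction, as with tombstones.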
cached, we would let
the OS block cache handle that without adding an extra layer. (0.6
uses mmap'd I/O by default on 64-bit JVMs, so this is very efficient.)
On Fri, Feb 19, 2010 at 3:29 AM, Weijun Li weiju...@gmail.com wrote:
The memory overhead issue is not directly related to GC because when
I set up two Cassandra clusters with 2 nodes each. Both use the random
partitioner. It's strange that in each cluster, one node has much shorter
read latency than the other one.
This is the info of one of the cluster:
Node A: read count 77302, data file 41GB, read latency 58180, io saturation
100%
at 8:37 PM, Weijun Li weiju...@gmail.com wrote:
Just tried to make a quick change to enable it, but it didn't work out :-(
ColumnFamily cachedRow = cfs.getRawCachedRow(mutation.key());
// What I modified
if (cachedRow == null
Dumped 50 mil records into my 2-node cluster overnight, and made sure that
there aren't many data files (around 30 only) per Martin's suggestion. The
size of the data directory is 63GB. Now when I read records from the cluster,
the read latency is still ~44ms; there's no write happening during the
, Feb 16, 2010 at 9:50 AM, Weijun Li weiju...@gmail.com wrote:
the read latency?
-Weijun
On Tue, Feb 16, 2010 at 10:01 AM, Brandon Williams dri...@gmail.com wrote:
On Tue, Feb 16, 2010 at 11:56 AM, Weijun Li weiju...@gmail.com wrote:
One more thought about Martin's suggestion: is it possible to put the
data files into multiple directories that are located
cache rows. You don't want to use up all of the memory on your
box for those caches though: you'll want to leave at least 50% for your OS's
disk cache, which will store the full row content.
-Original Message-
From: Weijun Li weiju...@gmail.com
Sent: Tuesday, February 16, 2010 12:16pm
Just started to play with the row cache feature in trunk: it seems to be
working fine so far, except that for the RowsCached parameter you need to
specify the number of rows rather than a percentage (e.g., 20% doesn't work).
Thanks for this great feature that improves read latency dramatically, so that disk
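Since RowsCached takes an absolute row count here, one way to pick the number is from the memory you are willing to spend and your average row size, keeping roughly half of RAM for the OS page cache per the advice elsewhere in this thread. The sizing numbers below are illustrative assumptions, not measurements from this cluster:

```python
def rows_cached(total_ram_bytes, avg_row_bytes, row_cache_fraction=0.5):
    """Spend at most `row_cache_fraction` of RAM on the row cache,
    leaving the rest to the OS disk cache."""
    budget = total_ram_bytes * row_cache_fraction
    return int(budget // avg_row_bytes)

# 8 GB box, ~2 KB average rows, half the RAM left for the OS disk cache:
n = rows_cached(8 * 1024**3, 2 * 1024)
assert n == 2 * 1024**2  # about 2 million rows
```

In practice the JVM heap has to hold those rows plus overhead, so a value well below this ceiling is the safer starting point.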
:15 PM, Weijun Li weiju...@gmail.com wrote:
Still have high read latency with 50 mil records in the 2-node cluster
(replica 2). I restarted both nodes, but read latency is still above 60ms
and disk I/O saturation is high. Tried compact and repair, but they don't
help much.
When I reduced
at 5:20 PM, Jonathan Ellis jbel...@gmail.com wrote:
On Tue, Feb 16, 2010 at 7:17 PM, Jonathan Ellis jbel...@gmail.com wrote:
On Tue, Feb 16, 2010 at 7:11 PM, Weijun Li weiju...@gmail.com wrote:
Just started to play with the row cache feature in trunk: it seems to be
working fine so far except
what does iostat tell you?
http://spyced.blogspot.com/2010/01/linux-performance-basics.html
do you have a lot of pending compactions? (tpstats will tell you)
have you increased KeysCachedFraction?
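For the iostat part of that checklist, the %util column (last field of `iostat -x` device lines) is the quickest saturation signal. Column layouts differ between sysstat versions, and the sample line below is illustrative, not real output from the cluster above:

```python
def device_util(iostat_line):
    """Pull %util (the last column of `iostat -x` output) from a device line."""
    return float(iostat_line.split()[-1])

# Illustrative `iostat -x` device line (fields vary by sysstat version):
line = "sda 0.00 12.00 810.00 3.00 45000.00 120.00 55.4 9.80 12.1 1.22 99.6"
assert device_util(line) == 99.6  # ~100% busy: reads are disk-bound
```

A device pinned near 100% while CPU is idle points at the disk, which matches the "io saturation 100%" numbers reported earlier in the thread.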
On Sun, Feb 14, 2010 at 8:18 PM, Weijun Li weiju...@gmail.com wrote:
Hello,
I saw some Cassandra benchmark reports mentioning read latencies of less
than 50ms or even 30ms, but my benchmark with 0.5 doesn't seem to support that.
Here are my settings:
Nodes: 2 machines, each 2 x 2.5 GHz Xeon quad-core (thus 8 cores), 8GB RAM
ReplicationFactor=2
Hello,
I have a testing cluster with: A (dc1), B (dc1), C (dc2), D (dc2). The
replication factor is 2, so I assume each DC will have a complete copy of the
data. Also, I'm using PropertyFileEndPointSnitch with rack.properties for the
DC and rack settings.
So, what are the steps to add another
Hello,
I tried to run nodeprobe flush, but it displays the usage info without doing
anything. What is the list of supported commands for nodeprobe?
Thanks,
-Weijun
When you add a new node, Cassandra will pick the node that has the most data
and split its token. In this case the data distribution among the nodes
becomes uneven. What is the right strategy/steps to rebalance the node load
after adding new nodes? Here's one example: I have a cluster of nodes A, B,
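With RandomPartitioner the token space runs from 0 to 2**127, so a balanced layout just spaces tokens evenly around the ring. A small sketch of computing the target tokens; the ring size, and the idea of then assigning them with nodetool move, are assumptions about the version in use:

```python
def balanced_tokens(node_count, ring_size=2**127):
    """Evenly spaced RandomPartitioner tokens: with these assigned,
    each node owns an equal slice of the token ring."""
    return [i * ring_size // node_count for i in range(node_count)]

tokens = balanced_tokens(4)
assert tokens[0] == 0
assert tokens[2] == 2**126
# Each node then owns an equal 1/4 slice of the 2**127 token space.
```

After adding a node, recomputing this list for the new node count and moving each node to its target token is the usual way to restore even ownership, instead of relying on the automatic most-loaded-node split.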
Hello, got one more issue when I was trying to run nodeprobe to connect to a
remote Cassandra node: it froze for a while, then showed the following
error. The JMX remote port 8080 is open, and I tried changing the port, but
it doesn't help. This command works properly if I run it on the same
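A freeze followed by an error while the JMX port itself is open is often the RMI second-connection issue: after the initial handshake on 8080, JMX redirects the client to a second, dynamically chosen port that firewalls between you and the node may block. A quick TCP reachability check for the first hop (this only proves the configured port accepts connections, not that JMX will work end to end):

```python
import socket

def port_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within `timeout`.
    JMX can still fail afterwards, because RMI redirects the client to a
    second, randomly chosen port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Self-test against a local listener instead of a live Cassandra node:
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
assert port_open("127.0.0.1", srv.getsockname()[1])
srv.close()
```

If the configured port is reachable but nodeprobe still hangs, the blocked RMI callback port is the usual suspect.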