so i flipped the composite around to:
create column family StockHistory
with comparator = 'CompositeType(UTF8Type,LongType)'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'UTF8Type';
and now i'm getting what i expected the first time. i can get a range of
typ
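A minimal sketch of that kind of slice with the Hector Java client of the era (the keyspace wiring, class name, row key, and page size are assumptions, not from the thread; the addComponent equality trick is the usual pattern for bounding a composite range on its first component):
import me.prettyprint.cassandra.serializers.CompositeSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.AbstractComposite.ComponentEquality;
import me.prettyprint.hector.api.beans.ColumnSlice;
import me.prettyprint.hector.api.beans.Composite;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.SliceQuery;

public class TicksSlice {
    // fetch every column in one row whose first composite component == "ticks"
    static ColumnSlice<Composite, String> ticksFor(Keyspace ks, String rowKey) {
        Composite start = new Composite();
        start.addComponent("ticks", StringSerializer.get(), "UTF8Type", ComponentEquality.EQUAL);
        Composite finish = new Composite();
        // GREATER_THAN_EQUAL widens the end bound so the slice covers every
        // LongType second component under the same first component
        finish.addComponent("ticks", StringSerializer.get(), "UTF8Type", ComponentEquality.GREATER_THAN_EQUAL);
        SliceQuery<String, Composite, String> q = HFactory.createSliceQuery(
                ks, StringSerializer.get(), CompositeSerializer.get(), StringSerializer.get());
        q.setColumnFamily("StockHistory");
        q.setKey(rowKey);
        q.setRange(start, finish, false, 10000); // not reversed, page size 10000
        return q.execute().get();
    }
}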
Cassandra has no way of knowing that all the data is in the most recent
sstable, and will have to check the others too, and this brings a lot of
difficulty to data compaction.
I have a question: if I want high-performance data compaction, how can I
ensure that all the columns are
Hi Jonathan,
>
> > For some reason 16637958 (the keys cached) has become a golden number
> and I
> > don't see key cache increasing beyond that.
>
> 16637958 is your configured cache capacity according to the cfstats you
> pasted.
this is another weird part. If you look at the schema[1] (pasted
Yes, Cassandra has no way of knowing that all the data is in the most recent
sstable, and will have to check the others too, and this brings a lot of
difficulty to data compaction.
I have a question: if I want high-performance data compaction, how can I
ensure that all the columns are
Thanks, Jonathan. I got it.
2012-02-17
zhangcheng
From: Jonathan Ellis
Sent: 2012-02-17 10:15:05
To: user
Cc:
Subject: Re: Key cache hit rate issue
Look for this code in SSTableReader.getPosition:
Pair<Descriptor, DecoratedKey> unifiedKey =
    new Pair<Descriptor, DecoratedKey>(descriptor, decoratedKey);
Long cachedPosition = getCachedPosition(unifiedKey, true);
according to the read process, the key of the keycache should be the row key.
2012-02-17
zhangcheng
From: Todd Burruss
Sent: 2012-02-17 06:23:47
To: user@cassandra.apache.org
Cc:
Subject: Re: Key cache hit rate issue
jonathan, you said the key to the cache is key + sstable? looking at
CASSANDRA-3496, fixed in 1.0.4+
On Thu, Feb 16, 2012 at 8:27 AM, Bill Au wrote:
> I am running 1.0.2 with the default tiered compaction. After running a
> "nodetool compact", I noticed that on about half of the machines in my
> cluster, both "nodetool ring" and "nodetool info" report that the lo
Look for this code in SSTableReader.getPosition:
Pair<Descriptor, DecoratedKey> unifiedKey =
    new Pair<Descriptor, DecoratedKey>(descriptor, decoratedKey);
Long cachedPosition = getCachedPosition(unifiedKey, true);
On Thu, Feb 16, 2012 at 4:23 PM, Todd Burruss wrote:
> jonathan, you said the key to the cache is key + sstabl
On 2/16/2012 12:46 AM, aaron morton wrote:
split this CF into two?
Or change the order of the column components as suggested.
as suggested - where?
are you saying if i flip the composite i'll be able to ask for a range
by type? and cassandra is going to order the columns like:
ticks:1
t
On Thu, Feb 16, 2012 at 3:52 PM, Eran Chinthaka Withana
wrote:
> Thanks for the reply. Yes there is a possibility that the keys can be
> distributed in multiple SSTables, but my data access patterns are such that
> I always read/write the whole row. So I expect all the data to be in the
> same SST
Hi Aaron Morton and R. Verlangen,
Thanks for the quick answer. It's good to know Thrift's limit on the amount
of data it will accept / send.
I know the hard limit is 2 billion columns per row. My question is at what
size it will slow down read/write performance and maintenance. The blog I
refer
No, I am not using compression.
Bill
On Thu, Feb 16, 2012 at 2:05 PM, aaron morton wrote:
> Are you using compression ?
>
> I remember some issues with compression and reported load, cannot remember
> the details.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmort
Hi Todd,
Thanks for the reply. But I don't think the settings you mentioned are
playing any role here, as those are set to 0.85 and 0.6 in my cassandra.yaml
and the ratio between the space I see used and the amount I configured is
much less than those numbers.
Thanks,
Eran Chinthaka Wit
jonathan, you said the key to the cache is key + sstable? looking at the
code it looks like a DecoratedKey is the "row key". how does sstable come
into play?
On 2/16/12 1:20 PM, "Jonathan Ellis" wrote:
>So, you have roughly 1/6 of your (physical) row keys cached and about
>1/4 cache hit rate,
there are settings in the yaml file that help relieve memory pressure by
reducing the key and row caches. they kick in based on the percent of memory
used by the JVM. the settings are reduce_cache_sizes_at and
reduce_cache_capacity_to. see how much free memory you have and if the
numbers suggest that you have
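For reference, the corresponding knobs and their stock 1.0.x defaults in cassandra.yaml (check your own file):
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6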
On 17/02/2012 8:53 AM, "Eran Chinthaka Withana"
wrote:
>
> Hi Jonathan,
>
> Thanks for the reply. Yes there is a possibility that the keys can be
distributed in multiple SSTables, but my data access patterns are such that
I always read/write the whole row. So I expect all the data to be in the
sam
Hi Jonathan,
Thanks for the reply. Yes there is a possibility that the keys can be
distributed in multiple SSTables, but my data access patterns are such that
I always read/write the whole row. So I expect all the data to be in the
same SSTable (please correct me if I'm wrong).
For some reason 16
So, you have roughly 1/6 of your (physical) row keys cached and about
1/4 cache hit rate, which doesn't sound unreasonable to me. Remember,
each logical key may be spread across multiple physical sstables --
each (key, sstable) pair is one entry in the key cache.
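A conceptual sketch of that bookkeeping (assumed wiring only; the real lookup is the SSTableReader.getPosition snippet quoted elsewhere in this thread):
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.cassandra.db.DecoratedKey;
import org.apache.cassandra.io.sstable.Descriptor;
import org.apache.cassandra.utils.Pair;

class KeyCacheSketch {
    // conceptual only: the cache key is (sstable descriptor, decorated row
    // key), not the row key alone
    static final Map<Pair<Descriptor, DecoratedKey>, Long> keyCache =
        new ConcurrentHashMap<Pair<Descriptor, DecoratedKey>, Long>();

    static Long lookup(Descriptor sstable, DecoratedKey rowKey) {
        // a read that touches three sstables does three lookups like this,
        // so one logical row can occupy (or miss on) three separate entries
        return keyCache.get(new Pair<Descriptor, DecoratedKey>(sstable, rowKey));
    }
}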
On Thu, Feb 16, 2012 at 1:48 PM,
Hi Aaron,
Here it is.
Keyspace:
Read Count: 1123637972
Read Latency: 5.757938114343114 ms.
Write Count: 128201833
Write Latency: 0.0682576607387509 ms.
Pending Tasks: 0
Column Family: YY
SSTable count: 18
Space used (live): 103318720685
Space used (total): 103318720685
Number of Keys (est
yes.
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 16/02/2012, at 10:15 PM, R. Verlangen wrote:
> Hmm ok. This means if I want to have a CF with RF = 3 and another CF with RF
> = 1 (e.g. some debug logging) I will have to create 2 keyspaces?
>
I'm trying to figure out the best way to store items for query based on
multiple dimensions. I've got a large volume (many 100s of millions per day)
of time-ordered objects with 10+ properties each that I need to support
arbitrary query expressions on. So I may need to support a query based on
Try here http://www.datastax.com/support-forums/forum/opscenter
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 17/02/2012, at 7:01 AM, Radim Kolar wrote:
> Is there a way in the OpsCenter GUI to make a node with an agent join a
> specified cluste
Are you using compression ?
I remember some issues with compression and reported load, cannot remember the
details.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 17/02/2012, at 3:27 AM, Bill Au wrote:
> I am running 1.0.2 with the def
Hello,
We are trying to read data from cassandra via pig. The version of cassandra is
1.0.7 and pig is 0.9.0.
We get the following error when we try to load the data from the cassandra
keyspace and columnfamily.
[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate
Is there a way in the OpsCenter GUI to make a node with an agent join a
specified cluster?
I am thinking about: click on a node, select "join cluster", type the IP
address of an existing cluster member, and data will be replicated onto the
new node.
[moving to users list]
See http://wiki.apache.org/cassandra/CassandraLimitations
2012/2/15 晓峰 :
> I want to insert more and more columns into the super column,is there any
> problem?
>
>
>
>
> 晓峰
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for profess
I am running 1.0.2 with the default tiered compaction. After running a
"nodetool compact", I noticed that on about half of the machines in my
cluster, both "nodetool ring" and "nodetool info" report that the load is
actually higher than before when I expect it to be lower. It is almost
twice as m
Hi,
We're organising Europe's first Apache Cassandra conference - Cassandra
Europe. It takes place on March 28th in London. We'll be having some of the
top PMCs, committers and users attending. I'm trying to get people's input
and was wondering if you had any suggestions for what you'd like to see
Hmm ok. This means if I want to have a CF with RF = 3 and another CF with
RF = 1 (e.g. some debug logging) I will have to create 2 keyspaces?
2012/2/16 aaron morton
> Multiple CF mutations for a row are treated atomically in the commit log,
> and they are sent together to the replicas. Replicati
Multiple CF mutations for a row are treated atomically in the commit log, and
they are sent together to the replicas. Replication occurs at the row level,
not the row+cf level.
If each CF had its own RF, odd things may happen. Like sending a batch
mutation for one row and two CFs that fails
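So RF stays per keyspace; a minimal cassandra-cli sketch of the two-keyspace workaround (the keyspace names here are made up):
create keyspace AppData
with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
and strategy_options = {replication_factor:3};
create keyspace DebugLogs
with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
and strategy_options = {replication_factor:1};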
> 1). The "IN" operator is not working
> SELECT * FROM TestCF WHERE status IN ('Failed', 'Success')
IN is only valid for filtering on the row KEY
http://www.datastax.com/docs/1.0/references/cql/SELECT
e.g. it generates this error using cqlsh
cqlsh:dev> SELECT * FROM TestCF WHERE status IN ('Fa
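For contrast, a sketch of where IN is accepted under that rule, per the SELECT docs linked above (the row keys are made up):
cqlsh:dev> SELECT * FROM TestCF WHERE KEY IN ('key1', 'key2');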
> Is anyone using it with Cassandra?
Yes, we use it with cassandra 0.6. Had to implement a Tanuki-style service
wrapper for cassandra myself to make it shut down correctly.
Hi there,
As the subject states: "Is it possible to set a replication factor per
column family?"
Could not find anything in recent releases. I'm running Cassandra 1.0.7 and
I think it should be possible on a per CF basis instead of the whole
keyspace.
With kind regards,
Robin
> but it still seems a bit strange coming from years and years of sql.
Think of the composite column name as a composite key. You want to write an
efficient query that uses a seek and partial scan of the index b-tree, rather
than a full scan.
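An illustrative ordering (not from the thread): with CompositeType(UTF8Type, LongType), a row's columns sort as
ticks:1
ticks:2
trades:1
trades:2
so a slice bounded on the first component ("ticks") is one seek plus a contiguous scan.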
> split this CF into two?
Or change the order of t
> Based on this blog of Basic Time Series with Cassandra data modeling,
> http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
I've not read that one but it sounds right. Matt Dennis knows his stuff
http://www.slideshare.net/mattdennis/cassandra-nyc-2011-data-modeling
> There
Things you should know:
- Thrift has a limit on the amount of data it will accept / send; you can
configure this in Cassandra: 64 MB should still work fine (1)
- Rows should not become huge: this will make "perfect" load balancing
impossible in your cluster
- A single row should fit on a disk
- T
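The 1.0-era cassandra.yaml settings this most likely refers to, with their stock defaults (raising them is what the 64 MB remark is about):
thrift_framed_transport_size_in_mb: 15
thrift_max_message_length_in_mb: 16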
> It's in the order of 261 to 8000 and the ratio is 0.00. But I guess 8000 is
> a bit high. Is there a way to fix/improve it?
Sorry I don't understand what you mean. But if the ratio is 0.0 all is good.
Could you include the full output from cfstats for the CF you are looking at ?
Cheers
I'm not sure about your first 2 questions. The third might be an exception:
check your Cassandra logs.
About the "like"-thing: there's no such query possibility in Cassandra / CQL.
You can take a look at Hadoop / Hive to tackle those problems.
2012/2/16 Roshan
> Hi
>
> I am using Cassandra 1.0.
thanks for the reply. i understand why, but it still seems a bit strange
coming from years and years of sql. so if i want to avoid the extra
load from fetching way more than i need, would i be best off splitting this
CF into two?
thanks,
deno
On 2/13/2012 10:41 AM, aaron morton wrote:
My unde
> 1. Changing consistency level configurations from Write.ALL + Read.ONE
> to Write.ALL + Read.ALL increases write latency (expected) and
> decreases read latency (unexpected).
When you tested at CL.ONE, was read repair turned on?
The two ways I can think of right now, by which read latency might