Don't forget you can write your own sorting algorithm.
Here is a nice tutorial for that.
http://www.sodeso.nl/?p=421
Justus
From: Schubert Zhang [mailto:zson...@gmail.com]
Sent: 15 July 2010 04:20
To: user@cassandra.apache.org
Subject: Re: key types and grouping related rows together
for
It could be that your Cassandra nodes haven't fully compacted yet.
On Thu, Jul 15, 2010 at 5:55 AM, Hendro Kaskus hen...@kaskusnetworks.com wrote:
Hi everyone,
I'm a newbie to Cassandra :D. I'm trying to insert data from MySQL into Cassandra.
The data dump from MySQL is about 11 MB (64716 records). But
Short answer: yes, this is normal.
Longer answer: this was discussed at length on this list a few days
ago, check the archives.
On Wed, Jul 14, 2010 at 10:55 PM, Hendro Kaskus
hen...@kaskusnetworks.com wrote:
Hi everyone,
I'm a newbie to Cassandra :D. I'm trying to insert data from MySQL to
Did you add a new node to the cluster at the time you restarted it?
If not, I would think that each node already had a token that would
make such a collision impossible, unless we have a new bug to
troubleshoot.
Gary.
On Wed, Jul 14, 2010 at 20:46, Mubarak Seyed mubarak.se...@gmail.com wrote:
Well I'm not talking about a specific column family here, as ALL my column
families will have content that is specific to a certain website, so I need
a strategy that I will use on almost all my column families.
On Wed, Jul 14, 2010 at 9:20 PM, Schubert Zhang zson...@gmail.com wrote:
for your
Hi,
I am using OrderPreservingPartitioner, and my keys are integers which are
stored as strings. I want to manually assign token values equal to my key
values so that data is evenly distributed.
So for this to work, I want to convert the token and key strings to integers
before doing compareTo
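To illustrate why string-encoded integers need special handling, here is a minimal sketch in plain Java. It does not use any Cassandra partitioner API; the class and method names (NumericStringOrder, NUMERIC, pad) are hypothetical, and the zero-padding variant assumes non-negative keys of bounded width.

```java
import java.math.BigInteger;
import java.util.Comparator;

final class NumericStringOrder {
    // Compares decimal strings by numeric value instead of character order.
    static final Comparator<String> NUMERIC =
            (a, b) -> new BigInteger(a).compareTo(new BigInteger(b));

    // Alternative: left-pad with zeros so plain string order matches numeric order.
    // Assumes v is non-negative and fits in `width` digits.
    static String pad(long v, int width) {
        return String.format("%0" + width + "d", v);
    }

    public static void main(String[] args) {
        System.out.println("10".compareTo("9") < 0);          // string order puts "10" first
        System.out.println(NUMERIC.compare("10", "9") > 0);   // numeric order puts 9 first
        System.out.println(pad(9, 4).compareTo(pad(10, 4)) < 0);
    }
}
```

Zero-padding keeps the keys as strings, so it may avoid changing how comparison is done at all; parsing on every comparison is the more direct reading of the question above.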
Do you think a composite key using a key type of Bytes would work?
How many bytes can it be?
public static byte[] createRowKey(int websiteid, long stamp)
        throws Exception {
    byte[] websiteidBytes = Bytes.toBytes(websiteid);
    byte[] stampBytes = Bytes.toBytes(stamp);
    return
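The excerpt above is cut off, so as a rough sketch only: one way to lay out such a composite key with the standard java.nio.ByteBuffer rather than the Bytes helper. The class name CompositeKey and the two decoding helpers are hypothetical, and the 4-byte id followed by 8-byte timestamp layout is an assumption. Whether a given Cassandra version accepts or sorts row keys as raw bytes is a separate question this sketch does not answer.

```java
import java.nio.ByteBuffer;

final class CompositeKey {
    // Layout (assumed): 4-byte website id, then 8-byte timestamp,
    // both big-endian, which is ByteBuffer's default byte order.
    static byte[] createRowKey(int websiteId, long stamp) {
        return ByteBuffer.allocate(Integer.BYTES + Long.BYTES)
                .putInt(websiteId)
                .putLong(stamp)
                .array();
    }

    // Reads the website id back from the first 4 bytes.
    static int websiteId(byte[] key) {
        return ByteBuffer.wrap(key).getInt();
    }

    // Reads the timestamp back from the 8 bytes after the id.
    static long stamp(byte[] key) {
        return ByteBuffer.wrap(key).getLong(Integer.BYTES);
    }
}
```

The resulting key is always 12 bytes. Note that negative ids or timestamps would not sort in numeric order under an unsigned byte-wise comparison.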
Can someone please explain the mmap issue?
mmap is the default for all storage files on 64-bit machines.
According to this issue, https://issues.apache.org/jira/browse/CASSANDRA-1214,
it might not be a good thing.
Is it right to say that you should use mmap only if your MAX expected data
is smaller
I found that in a long-running random-read test on a large dataset, the
performance with mmap is very bad.
See the attached chart in
https://issues.apache.org/jira/browse/CASSANDRA-1214.
On Fri, Jul 16, 2010 at 12:41 AM, Peter Schuller
peter.schul...@infidyne.com wrote:
Can someone please explain the
Yes, I think the current HintedHandOff implementation in 0.6.x cannot support
large hints; it is a risk in a production system.
On Tue, Jun 29, 2010 at 12:31 AM, albert_e dongz...@gmail.com wrote:
In 0.6.2, HH sends MUTATION messages over the same OutboundTcpConnection
as READ messages. When
I am using Random Partitioner. The other 2 nodes are working fine. There are no
Errors in the log files for the 2 good nodes.
There were no log messages in the 30 minutes before the exception occurred. Here
is the last log statement before the exception occurred.
INFO [COMPACTION-POOL:1]
if i have N=3 and run nodetool repair on node X. i assume that merkle
trees (at a minimum) are calculated on nodes X, X+1, and X+2 (since
N=3). when the repair is finished are nodes X, X+1, and X+2 all in sync
with respect to node X's data? or does X have the latest data and X+1
and X+2 still
If you could, can you please share the command line function (to load TSV)?
There is no command line function ... you have to write code for this.
And can you please help me with the part about storing storage-conf.xml on HDFS?
As I said, maybe you'd better start with a simpler scenario and leave
out HDFS
I'm convinced. :) See comments on
https://issues.apache.org/jira/browse/CASSANDRA-1214
Noted :) To be clear, I only mentioned it as an acknowledgement that
not everyone necessarily agreed with what I was saying.
The main problem is not the syscall so much as Java insisting on
zeroing out
On Thu, Jul 15, 2010 at 1:54 PM, B. Todd Burruss bburr...@real.com wrote:
if i have N=3 and run nodetool repair on node X. i assume that merkle
trees (at a minimum) are calculated on nodes X, X+1, and X+2 (since
N=3). when the repair is finished are nodes X, X+1, and X+2 all in sync
with
Keys are always sorted (in 0.6) as UTF8 strings. The CompareWith
applies to _columns_ within rows, _not_ to row keys.
On Wed, Jul 14, 2010 at 1:44 PM, S Ahmed sahmed1...@gmail.com wrote:
Where is the link that describes the various key types and their impact on
sorting? (I believe I read it
This is a cluster which is horribly imbalanced because I didn't assign
initial tokens, so I'm adding 6 nodes with tokens according to the operations
page (i.e., i * (2^127/N) with N = 6).
So here's what the ring will look like when bootstrap finishes
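For reference, the spacing formula quoted above, i * (2^127/N), can be sketched as follows. This is only arithmetic: it assumes a token range of 0 to 2^127 as the formula implies, the names InitialTokens and tokens are hypothetical, and nothing here talks to a cluster.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class InitialTokens {
    // Returns n evenly spaced tokens: i * (2^127 / n) for i = 0 .. n-1.
    static List<BigInteger> tokens(int n) {
        BigInteger step = BigInteger.ONE.shiftLeft(127).divide(BigInteger.valueOf(n));
        List<BigInteger> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add(step.multiply(BigInteger.valueOf(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Print one token per node for a 6-node ring.
        for (BigInteger token : tokens(6)) {
            System.out.println(token);
        }
    }
}
```

BigInteger is used because 2^127 does not fit in a long. The division is integer division, so the last gap on the ring can be slightly larger than the others when N does not divide 2^127 exactly.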
Oh, and looking at the load on the new machines it appears that
New 2 and New 6 have gotten some data (although neither is in the ring
yet). Not sure if that clears anything up though.
-Anthony
On Thu, Jul 15, 2010 at 01:28:06PM -0700, Anthony Molinaro wrote:
This is a cluster which is
On Thu, Jul 15, 2010 at 3:28 PM, Anthony Molinaro
antho...@alumni.caltech.edu wrote:
Is the fact that 2 new nodes are in the range messing it up?
Probably.
And if so
how do I recover (I'm thinking, shut down new nodes 2,3,4,5, then bring
up nodes 2,4, wait for them to finish, then
On Thu, Jul 15, 2010 at 2:01 PM, Jonathan Ellis jbel...@gmail.com wrote:
The main problem is not the syscall so much as Java insisting on
zeroing out any buffer you create, which is a big hit to performance
when you're allocating buffers for file i/o on each request instead of
just mmaping
Benjamin,
Ah, thanks for clarifying that.
Key sorting is changing in 0.7, I believe, to support a binary array?
On Thu, Jul 15, 2010 at 3:26 PM, Benjamin Black b...@b3k.us wrote:
Keys are always sorted (in 0.6) as UTF8 strings. The CompareWith
applies to _columns_ within rows, _not_ to row
Given a CF like:
Articles : {
    key1 : { title: some title, body: this is my article body... },
    key2 : { title: some title, body: this is my article body... }
}
Now these articles could be for different websites e.g. www.website1.com,
www.website2.com
If I want to get the latest
On Thu, Jul 15, 2010 at 3:56 PM, Carlos Alvarez cbalva...@gmail.com wrote:
On Thu, Jul 15, 2010 at 2:01 PM, Jonathan Ellis jbel...@gmail.com wrote:
The main problem is not the syscall so much as Java insisting on
zeroing out any buffer you create, which is a big hit to performance
when you're
Just want to verify with the group that what I am doing with respect to RF is correct.
1. Nodes were running with RF=2
2. Stopped all the nodes, changed the RF to 4
3. Started all the nodes, verified the cluster ring using nodetool; all the
nodes are part of the cluster
4. Ran nodetool repair on all the nodes
5. Ran
On Thu, Jul 15, 2010 at 5:29 PM, Mubarak Seyed mubarak.se...@gmail.com wrote:
Just want to verify with the group that what I am doing with respect to RF is correct.
1. Nodes were running with RF=2
2. Stopped all the nodes, changed the RF to 4
3. Started all the nodes, verified the cluster ring using nodetool,
On Jul 15, 2010, at 2:52 PM, Jonathan Ellis wrote:
On Thu, Jul 15, 2010 at 3:56 PM, Carlos Alvarez cbalva...@gmail.com wrote:
On Thu, Jul 15, 2010 at 2:01 PM, Jonathan Ellis jbel...@gmail.com wrote:
The main problem is not the syscall so much as Java insisting on
zeroing out any buffer you
On Thu, Jul 15, 2010 at 5:46 PM, Clint Byrum cl...@ubuntu.com wrote:
One other approach that works on Linux is to use HugeTLB. This post details
the process for doing so with a jvm:
http://andrigoss.blogspot.com/2008/02/jvm-performance-tuning.html
Basically when mmapping using HUGETLB you
Hi,
I am writing a scalability chapter in a book and I need to mention
Apache Cassandra although it's just a mention. Still I would not like
to be sloppy and would like to get verification whether my summary is
accurate. Cassandra stores four or five dimension associated arrays.
The first
You could build a secondary index, e.g.
CFArticles : {
    article_id1 : {}
    article_id2 : {}
}
CFWebsiteArticle : {
    website_id1 : { time_uuid : article_id1, time_uuid2 : article_id2 }
}
When you want to get the last 10 for a website, get_slice from the WebsiteArticle CF then multi get from Articles. Am
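The two-step lookup described above can be modelled in memory as follows. This is a sketch of the data layout only, not Cassandra client code: plain Java maps stand in for the two column families, a long timestamp stands in for the TimeUUID column name, and the names WebsiteArticleIndex, put, and latest are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WebsiteArticleIndex {
    // Stand-in for the Articles CF: article id -> article body.
    private final Map<String, String> articles = new HashMap<>();
    // Stand-in for the WebsiteArticle CF: website id -> (timestamp -> article id), sorted by time.
    private final Map<String, TreeMap<Long, String>> websiteArticles = new HashMap<>();

    // Writes the article and its index entry together.
    void put(String websiteId, long timestamp, String articleId, String body) {
        articles.put(articleId, body);
        websiteArticles.computeIfAbsent(websiteId, k -> new TreeMap<>()).put(timestamp, articleId);
    }

    // Step 1: take the newest n ids from the index row. Step 2: fetch those articles.
    List<String> latest(String websiteId, int n) {
        List<String> out = new ArrayList<>();
        TreeMap<Long, String> index = websiteArticles.getOrDefault(websiteId, new TreeMap<>());
        for (String articleId : index.descendingMap().values()) {
            if (out.size() == n) break;
            out.add(articles.get(articleId));
        }
        return out;
    }
}
```

In this model two articles for the same website with the same timestamp would overwrite each other in the index, which is one reason the original suggestion uses time UUIDs as column names.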
I saw this in the kernel log: jsvc uses 32-bit capabilities. Is this right?
our server is
Linux 2.6.32-23-generic #37-Ubuntu SMP Fri Jun 11 08:03:28 UTC 2010 x86_64
GNU/Linux
On Jul 15, 2010, at 11:04 AM, Claire Chang wrote:
I am using Random Partitioner. The other 2 nodes are working fine.
I am no expert... but parts seem accurate, parts not.
Cassandra stores four or five dimension associated arrays
not sure what you're counting as a dimension of the associated array, but
here are the 2 associative array-like syntaxes:
ColumnFamily[row-key][column-name] = value1
The column names are arbitrary strings, so it's not obvious what the
next value should be at any step. So, I just set the start of the
next page to the end of the last page and eliminate the duplicate
value when joining the 2 pages together.
The paging direction does not matter in my case, as I
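The paging approach described above (start the next page at the last column of the previous page, then drop the duplicate) can be sketched like this. A TreeSet stands in for the sorted column names of one row, and slice is a hypothetical stand-in for a range query with an inclusive start; none of this is the Cassandra API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class ColumnPager {
    // Stand-in for a slice query: up to `count` names >= start, in sorted order.
    static List<String> slice(TreeSet<String> columns, String start, int count) {
        List<String> page = new ArrayList<>();
        for (String name : columns.tailSet(start, true)) {
            if (page.size() == count) break;
            page.add(name);
        }
        return page;
    }

    // Reads every column name page by page, dropping the overlapping first entry of later pages.
    static List<String> readAll(TreeSet<String> columns, int pageSize) {
        if (pageSize < 2) {
            throw new IllegalArgumentException("pageSize must be at least 2 to make progress");
        }
        List<String> all = new ArrayList<>();
        String start = ""; // the empty string sorts before every other string
        boolean first = true;
        while (true) {
            List<String> page = slice(columns, start, pageSize);
            // On later pages the first entry repeats the previous page's last entry.
            int from = first ? 0 : Math.min(1, page.size());
            all.addAll(page.subList(from, page.size()));
            if (page.size() < pageSize) return all; // short page means the row is exhausted
            start = page.get(page.size() - 1);
            first = false;
        }
    }
}
```

Because each page after the first spends one slot on the duplicate, a page size of N yields N - 1 new columns per request, which is why a page size of 1 is rejected.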
On 2010-07-16 01:57, Dave Viner wrote:
I am no expert... but parts seem accurate, parts not.
Cassandra stores four or five dimension associated arrays
not sure what you're counting as a dimension of the associated array,
but here are the 2 associative array-like syntaxes:
Yes, you need to maintain the secondary index yourself. Send a
batch_mutation and write the article and website article columns at the
same time. I think you're safe up to a large number of cols, say
1M. Not sure, may try to track the info down one day.
A
On 16 Jul, 2010, at 03:39 PM, S Ahmed
Okay, so things were pretty messed up. I shut down all the new nodes,
then the old nodes started doing the "half the ring is down" garbage, which
pretty much requires a full restart of everything. So I had to shut
everything down, then bring the seed back, then the rest of the nodes,
so they