Hi,
In a 3-node setup (nodes named A, B, C) with replication factor 3 and quorum
read/write:
suppose a new value of data X is written to A and B but not to C for some reason,
then A went down and I started D with the data of C or with empty data, so that
in either case X is not present on D.
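The quorum arithmetic behind this scenario can be sketched in a few lines. This is a toy model rather than Cassandra code; the helper names `quorum` and `overlap_guaranteed` are made up for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM operation."""
    return replication_factor // 2 + 1


def overlap_guaranteed(n: int, r: int, w: int) -> bool:
    """True when every read set of size r must share a replica
    with every write set of size w among n replicas."""
    return r + w > n


rf = 3
q = quorum(rf)
print(q)                             # 2
print(overlap_guaranteed(rf, q, q))  # True
```

In this simplified model the overlap only helps while the replicas that acknowledged the write still hold it. If A is replaced by a node D that lacks X, only B holds X, so a quorum read answered by C and D alone would not see it.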
Is it possible to launch Cassandra as an unprivileged user?
On Tuesday 15 of February 2011, ruslan usifov wrote:
Is it possible to launch Cassandra as an unprivileged user?
On linux - yes.
--
Mateusz Korniak
so long as the ports you want it to bind to are above 1024
http://wiki.apache.org/cassandra/FAQ#ports
On Tue, Feb 15, 2011 at 12:07 PM, Mateusz Korniak
mateusz-li...@ant.gliwice.pl wrote:
On Tuesday 15 of February 2011, ruslan usifov wrote:
Is it possible to launch cassandra from
hi everyone,
is anyone using cassandra as a backend repository for storing and serving
online chat information? are you able to share your design thoughts? have
you encountered problems with the data structure you've implemented? i was
playing with some ideas and each time i come back to
I never did it. But I suppose you can use the chatroom name as the key and store
each message and nick as a JSON column value, with the timestamp as the column name.
The schema design depends on the number of chatrooms, users and messages. E.g. you can
have one CF, where key is chatroom, column name is username, column value is
the message and message time is the same as column timestamp.
You can add day-timestamp to the chatroom name to avoid large rows.
Augi
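The day-bucketed row key suggested above could look roughly like the sketch below. It is plain Python with no Cassandra client involved; the function names, the `room:YYYYMMDD` key layout and the microsecond column name are assumptions for illustration, not something from the thread.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chat_row_key(room: str, when: datetime) -> str:
    """Row key '<room>:<YYYYMMDD>' (UTC), so each day gets its own row."""
    return room + ":" + when.astimezone(timezone.utc).strftime("%Y%m%d")


def column_name(when: datetime) -> int:
    """Microseconds since the epoch; needs a timezone-aware datetime."""
    return (when - EPOCH) // timedelta(microseconds=1)


sent = datetime(2011, 2, 15, 16, 39, 18, tzinfo=timezone.utc)
print(chat_row_key("lobby", sent))  # lobby:20110215
print(column_name(sent))
```

With an integer timestamp as the column name and a numeric comparator (assumed here), the columns within one row come back in chronological order.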
Hello,
I would like to rename some column families but I discovered that the
system_rename_column_family disappeared in 0.7. How to rename the column
family now? I tried system_update_column_family method but it doesn't work
for renaming :(
Thank you!
thanks for the response. thinking about this, this would not allow for the
sorting of messages into a chronological order for end user display. i had
thought about having each message as its own column against the room or the
user, but i have had some inconsistencies in retrieving the data.
Hello Sasha.
In this sort of real time application the way you insert (QUORUM, ONE,
etc..) and the way you retrieve is extremely important because your data
may not have had the time to propagate to all your nodes. Be sure to use
adequate policies to do that : insert to a certain number of nodes
Hi,
i am a little puzzled on creation of secondary indexes and the docs in that
area are still very sparse.
What I am trying to do is - in a columnfamily with TimeUUID comparator, I want
the special timeuuid --1000-- to be indexed. The
value being some UTF8 string
Hi,
if you download Cassandra and look into conf/cassandra.yaml then you can
see this:
this keyspace definition is for demonstration purposes only. Cassandra will
not load these definitions during startup. See
http://wiki.apache.org/cassandra/FAQ#no_keyspaces for an explanation.
So you should
Yeah i know about that, but the definition i have is for a cluster that is
started/stopped from a unit test with hector embeddedServerHelper, which takes
definitions from the yaml.
So i'd still like to define the index in the yaml file (it should very well be
possible I guess)
From: Michal
Ah, ok. I checked that in source and the problem is that you wrote
validation_class but it should be validator_class.
Augi
2011/2/15 Roland Gude roland.g...@yoochoose.com
Yeah i know about that, but the definition i have is for a cluster that is
started/stopped from a unit test with hector
Hi all,
While testing the new 0.7.1 release I got the following exception:
ERROR [ReadStage:11] 2011-02-15 16:39:18,105
DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.io.IOError: java.io.EOFException
at
Hi,
I am new to Cassandra and am evaluating it.
Following diagram is how my setup will be: http://bit.ly/gJZlhw
Here each oval represents one data center. I want to keep N=4. i.e. four
copies of every Column Family. I want one copy in each data-center. In
other words, COMPLETE database must
Some good discussion here:
http://www.mail-archive.com/user@cassandra.apache.org/msg09020.html
On Sun, Feb 13, 2011 at 5:25 PM, mcasandra mohitanch...@gmail.com wrote:
I just now watched some videos about performance tuning. And it looks like
most of the bottleneck could be on reads. Also, it
Renames are not yet supported (see
https://issues.apache.org/jira/browse/CASSANDRA-1585)
On Tue, Feb 15, 2011 at 7:45 AM, Michal Augustýn
augustyn.mic...@gmail.com wrote:
Hello,
I would like to rename some column families but I discovered that the
system_rename_column_family disappeared in
I can reproduce with your script. Thanks!
2011/2/15 Jonas Borgström jonas.borgst...@trioptima.com:
Hi all,
While testing the new 0.7.1 release I got the following exception:
ERROR [ReadStage:11] 2011-02-15 16:39:18,105
DebuggableThreadPoolExecutor.java (line 103) Error in
Have you made any changes to the cassandra config?
2011/2/15 Jonas Borgström jonas.borgst...@trioptima.com
Hi all,
While testing the new 0.7.1 release I got the following exception:
ERROR [ReadStage:11] 2011-02-15 16:39:18,105
DebuggableThreadPoolExecutor.java (line 103) Error in
Hello
Is it possible to store binary objects (images, pdfs, videos etc) in
Cassandra? My images are less than 100MB in size.
If so, how do I insert and retrieve a few files from Cassandra?
Would prefer if someone can give examples using pycassa.
Thanks !
AJ
http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage
Retrieval should be the same as the examples in the pycassa
tutorial: http://pycassa.github.com/pycassa/tutorial.html
--
Tyler Hobbs
Software Engineer, DataStax http://datastax.com/
Maintainer of the pycassa
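One way to keep individual column values small is to split a file into fixed-size chunks and store each chunk as its own column. The sketch below only shows the splitting and reassembly in plain Python; the function names and the 1 MB default are assumptions, and the returned dict merely stands in for the columns of one row.

```python
def split_into_chunks(data: bytes, chunk_size: int = 1024 * 1024) -> dict:
    """Map a zero-padded chunk index to its bytes, so names sort in file order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        f"{offset // chunk_size:08d}": data[offset:offset + chunk_size]
        for offset in range(0, len(data), chunk_size)
    }


def join_chunks(columns: dict) -> bytes:
    """Reassemble the original bytes from the chunk columns."""
    return b"".join(columns[name] for name in sorted(columns))


payload = bytes(range(256)) * 10          # 2560 bytes of sample data
columns = split_into_chunks(payload, 1000)
print(sorted(columns))                    # ['00000000', '00000001', '00000002']
print(join_chunks(columns) == payload)    # True
```

Writing the resulting columns with a client library is left out on purpose, since the exact calls depend on the client and version in use.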
On Tue, Feb 15, 2011 at 3:59 AM, Serdar Irmak sir...@protel.com.tr wrote:
Hi,
In a 3-node setup (nodes named A, B, C) with replication factor 3 and quorum
read/write:
suppose a new value of data X is written to A and B but not to C for some
reason, then A went down and I started D with
It would be great if a patch appeared very quickly.
2011/2/15 Jonathan Ellis jbel...@gmail.com
I can reproduce with your script. Thanks!
2011/2/15 Jonas Borgström jonas.borgst...@trioptima.com:
Hi all,
While testing the new 0.7.1 release I got the following exception:
ERROR [ReadStage:11]
On Tue, Feb 15, 2011 at 7:10 PM, ruslan usifov ruslan.usi...@gmail.com wrote:
It would be great if a patch appeared very quickly.
patch attached here: https://issues.apache.org/jira/browse/CASSANDRA-2165
Hoping this is quick enough.
2011/2/15 Jonathan Ellis jbel...@gmail.com
I can reproduce with
2011/2/15 Sylvain Lebresne sylv...@datastax.com
On Tue, Feb 15, 2011 at 7:10 PM, ruslan usifov ruslan.usi...@gmail.com wrote:
It would be great if a patch appeared very quickly.
patch attached here: https://issues.apache.org/jira/browse/CASSANDRA-2165
Does this patch appear in binary release, or
Say I set write consistency level to ALL and all but one node are down. What
happens to writes ? Does it rollback from the live node before returning
failure to client ?
Thanks.
Your write will fail. But if the write has reached at least one node,
it will eventually reach all the other nodes as well. So it won't
rollback.
On Tue, Feb 15, 2011 at 7:38 PM, A J s5a...@gmail.com wrote:
Say I set write consistency level to ALL and all but one node are down. What
happens
I have been having plenty of problems (on 0.7.0,
http://www.mail-archive.com/user@cassandra.apache.org/msg09341.html,
http://www.mail-archive.com/user@cassandra.apache.org/msg09230.html,
http://www.mail-archive.com/user@cassandra.apache.org/msg09122.html,
the following exception seems to be about loading saved caches, but i
don't really care about the cache so maybe isn't a big deal. anyway,
this is with patched 0.7.1
(0001-Fix-bad-signed-conversion-from-byte-to-int.patch)
WARN 11:07:59,800 error reading saved cache
I worked on that ticket, will try to chase it up.
Aaron
On 15/02/2011, at 2:01 PM, Gregory Szorc gregory.sz...@gmail.com wrote:
The latest official 0.6.x releases, 0.6.10 and 0.6.11, have a very serious
bug/regression when performing some quorum reads (CASSANDRA-2081), which is
fixed in
Hello,
we are acquiring new hardware for our cluster and will be installing it
soon. It's likely that I won't need to rely on secondary index
functionality, as data will be write-once read-many and I can get away with
inverse index creation at load time, plus I have some more complex indexing
in
Hi there,
we are currently benchmarking a Cassandra 0.6.5 cluster with 3
High-Mem Quadruple Extra Large EC2 nodes
(http://aws.amazon.com/ec2/#instance) using Yahoo's YCSB tool
(replication factor is 3, random partitioner). We assigned 32 GB RAM
to the JVM and left 32 GB RAM for the Ubuntu Linux
Jaspersoft.com make reporting tools that claim to work with Cassandra. Have not
used them myself.
It will depend on what the reports are and how big your data is, though Pig may
be the best bet.
A
On 15/02/2011, at 8:18 PM, Michal Augustýn augustyn.mic...@gmail.com wrote:
Hi,
it depends
There was a guy here last year who did something similar and did a nice
write-up. Cannot find it right now, some googling may help.
Aaron
On 16/02/2011, at 2:56 AM, Victor Kabdebon victor.kabde...@gmail.com wrote:
Hello Sasha.
In this sort of real time application the way you insert
You can use the Network Topology Strategy, see http://wiki.apache.org/cassandra/Operations?highlight=(topology)|(network)#Network_topology and NetworkTopologyStrategy in the conf/cassandra.yaml file. You can control the number of replicas in each DC. Also look at conf/cassandra-topology.properties for
Hi Aaron,
I did come across this:
http://www.juhonkoti.net/2010/09/25/example-how-to-model-your-data-into-nosql-with-cassandra
Was this what you were referring to? I found this one interesting, and keep
Cassandra is very CPU hungry so you might be hitting a CPU bottleneck.
What's your CPU usage during these tests?
On Tue, Feb 15, 2011 at 8:45 PM, Markus Klems mar...@klems.eu wrote:
Hi there,
we are currently benchmarking a Cassandra 0.6.5 cluster with 3
High-Mem Quadruple Extra Large EC2
0.7.1 is what I would go with right now. It's likely you'll eventually have
to upgrade that as well, but moving to other 0.7.x releases should be fairly
painless. Most development is happening on the 0.7 releases, which already
have lots of fixes over the 0.6 series (not to mention performance
Thank you! It's just that 7.1 seems the bleeding edge now (a serious bug
fixed today). Would you still trust it as a production-level service? I'm
just slightly concerned. I don't want to create a perception among our IT
that the product is not ready for prime time.
--
View this message in
The write will not start if there are insufficient nodes up. In this case (All
cl) you would get an error and nothing would be committed to disk. You would
get an Unavailable exception.
Aaron
On 16/02/2011, at 7:46 AM, Thibaut Britz thibaut.br...@trendiction.com wrote:
Your write will fail.
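The up-front check described in this reply can be modelled roughly as follows. This is a toy sketch, not Cassandra's implementation; `UnavailableError`, `required_replicas` and `check_availability` are hypothetical names, and only three consistency levels are modelled.

```python
class UnavailableError(Exception):
    """Stand-in for the Unavailable exception mentioned in the thread."""


def required_replicas(consistency_level: str, replication_factor: int) -> int:
    """Replicas that must be alive for the given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def check_availability(consistency_level: str, replication_factor: int,
                       live_replicas: int) -> None:
    """Raise before any write is attempted if too few replicas are up."""
    needed = required_replicas(consistency_level, replication_factor)
    if live_replicas < needed:
        raise UnavailableError(
            f"{consistency_level} needs {needed} live replicas, "
            f"only {live_replicas} are up"
        )


try:
    check_availability("ALL", replication_factor=3, live_replicas=1)
except UnavailableError as exc:
    print(exc)  # ALL needs 3 live replicas, only 1 are up
```

The sketch covers only the check made before the write is sent; it says nothing about what happens when replicas fail or time out after the write has started.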
We have been running 0.6.3 with some custom features for more than 1 month
and it has been running fine. We are planning on moving to 0.7.1 in about 1
month from now if it passes our stress tests.
If you are really going from scratch to a production environment, I would
definitely go with 0.7.1
Initial thoughts are that you are overloading the cluster; are there any log lines
about dropping messages?
What is the schema, what settings do you have in Cassandra yaml and what are
CF stats telling you? E.g. Are you switching Memtables too quickly? What are
the write latency numbers?
Also 0.7
On Tue, Feb 15, 2011 at 3:03 PM, buddhasystem potek...@bnl.gov wrote:
Thank you! It's just that 7.1 seems the bleeding edge now (a serious bug
fixed today). Would you still trust it as a production-level service? I'm
just slightly concerned. I don't want to create a perception among our IT
But you can not depend on such behavior. If you do a write and you get an
unavailable exception, the only thing you know is at that time it was not
able to be placed on all the nodes required to meet your CL. It may
eventually end up on all those nodes, it may not be on any of the nodes or
at
If I update a column (i.e. change the value contents for a given name in a
given key), is the physical disk operation equivalent to a delete followed by
an insert?
Or is it just an insert, somehow marking the last value as stale?
In the Definitive Guide, it says the following about SSTable:
*All
This may help clear things up for you:
http://wiki.apache.org/cassandra/MemtableSSTable
--
Tyler Hobbs
Software Engineer, DataStax http://datastax.com/
Maintainer of the pycassa http://github.com/pycassa/pycassa Cassandra
Python client library
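In short, an update is written as a new version of the column, and a read that finds several versions keeps the one with the highest timestamp. A minimal sketch of that reconciliation, assuming a simplified `Column` tuple and ignoring tie-breaking between equal timestamps (both are illustrative, not Cassandra's classes):

```python
from typing import Iterable, NamedTuple


class Column(NamedTuple):
    name: str
    value: bytes
    timestamp: int


def reconcile(versions: Iterable[Column]) -> Column:
    """Return the version with the highest timestamp (last write wins)."""
    return max(versions, key=lambda column: column.timestamp)


old = Column("email", b"old@example.com", timestamp=1)
new = Column("email", b"new@example.com", timestamp=2)
print(reconcile([old, new]).value)  # b'new@example.com'
```

The older version is not rewritten in place; in this model it simply loses the comparison, and can be dropped later when files are merged.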
This bug was not in 0.7.0, but it's certainly possible that other
ByteBuffer-related bugs were.
On Tue, Feb 15, 2011 at 1:00 PM, Dan Hendry dan.hendry.j...@gmail.com wrote:
I have been having plenty of problems (on 0.7.0,
http://www.mail-archive.com/user@cassandra.apache.org/msg09341.html,
Here's my understanding: The request will not start if CL nodes are not up from the point of view of the coordinator (when considering a single mutation). In the case described, where the CL is ALL, the write would not start and UnavailableException would be thrown. This comes from
Is this reproducible or just I happened to kill the server while it
was in the middle of writing out the cache keys?
On Tue, Feb 15, 2011 at 1:10 PM, B. Todd Burruss bburr...@real.com wrote:
the following exception seems to be about loading saved caches, but i don't
really care about the cache
Is this reproducible or just I happened to kill the server while it
was in the middle of writing out the c
On Tue, Feb 15, 2011 at 1:10 PM, B. Todd Burruss bburr...@real.com wrote:
the following exception seems to be about loading saved caches, but i don't
really care about the cache so maybe
Note that this is a read-time bug, there is no data loss involved.
Patch is committed with a new test to prevent future regressions.
I've asked Hudson (https://hudson.apache.org/hudson/job/Cassandra) to
create a new binary build with the patch included but the backlog is
long enough that I don't
It doesn't write anything to the coordinator node, it just forwards it to
nodes in the replica set for that row key.
write goes to some node (coordinator, i.e. whatever node you connected to).
coordinator looks at key, determines which nodes are responsible for it.
in parallel it forwards the
Makes sense ! Thanks.
Just a quick follow-up:
Now I understand the write is not made to coordinator (unless it is part of
the replica for that key). But does the write column traffic 'flow' through
the coordinator node? For a 2G column write, will I see 2G network traffic
on the coordinator node
0.6.8 is stable and production ready, the later versions of the 0.6
branch have issues. No offense, but the 0.7 branch is fairly unstable
from my experience. I have reproduced all the open bugs with a
production dataset, even when tried to rebuild it from scratch after a
complete loss.
If you have
We are using haproxy in TCP mode for round-robin with great success.
It's a bit unorthodox but has some real added value, like logging.
Here is the relevant config for haproxy:
#
global
log 127.0.0.1 local0
log 127.0.0.1 local1 notice
maxconn 4096
user haproxy
group haproxy
Thank you Attila!
We will indeed have a few months of breaking in. I suppose I'll
keep my fingers crossed and see that 0.7.X is very stable. So I'll
deploy 0.7.1 -- I will need to apply all the patches, there is no
cumulative download, is that correct?
Attila Babo wrote:
0.6.8 is stable and
You have a single HAProxy node in front of the cluster or you have a HAProxy
node on each machine that is a client of Cassandra that points at all the
nodes in the cluster?
The former has a SPOF and bottleneck (the HAProxy instance), the latter does
not (and is somewhat common, especially for
There is a single point of failure for sure as there is a single proxy
in front but that pays off as the load is even between nodes. Another
plus is when a machine is out of the cluster for maintenance the proxy
handles that automatically. Originally I started it as an experiment,
there is a large
Assuming you aren't changing the RC, the normal bootstrap process takes care
of all the problems like that, making sure things work correctly.
Most importantly, if something fails (either the new node or any of the
existing nodes) you can recover from it.
Just don't connect clients directly to
Thanks! Would Hector take care of not load balancing to the new node until
it's ready?
Also, when repair is occurring in the background, is there a status that I can
look at to see that repair is occurring for key ABC?
--
View this message in context:
On Tue, Feb 15, 2011 at 3:05 PM, mcasandra mohitanch...@gmail.com wrote:
Is there a way to let the new node join cluster in the background and make it
live to clients only after it has finished with node repair, syncing data
etc. and in the end sync keys or trees that's needed before it's come
Hi, a node in my cassandra cluster will not accept keyspace additions applied
to other nodes. In its logs, it says:
DEBUG [MigrationStage:1] 2011-02-15 15:39:57,995
DefinitionsUpdateResponseVerbHandler.java (line 71) Applying AddKeyspace from
{X}
DEBUG [MigrationStage:1] 2011-02-15
Hi,
I have general questions on writing enterprise applications on cassandra. I
come from a background which involves writing enterprise applications using
DBMS.
What are the general patterns people follow in Cassandra world when
migrating a code that is within transaction boundaries in a
Hi,
What is the best way to retrieve the latest rows from a CF with OPP.
We are using OPP and key range queries but I cannot find an easy way to
get the latest 10 keys for example from a column family with 1000s of keys.
I really don't want to create another CF to store row key names as
What is the best way to retrieve the latest rows from a CF with OPP.
Use inverted timestamps (for example, 2^64 - timestamp) with zeros for
padding as the row keys.
This way you can do a normal forward range scan and get the N latest rows.
--
Tyler Hobbs
Software Engineer, DataStax
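The inverted-timestamp keys suggested above can be sketched like this. It assumes the partitioner orders row keys as plain strings and that timestamps are non-negative integers no larger than 2^64; the function name is hypothetical.

```python
def inverted_key(timestamp: int) -> str:
    """Zero-padded string of 2^64 - timestamp; newer timestamps sort first."""
    if not 0 <= timestamp <= 2**64:
        raise ValueError("timestamp out of range")
    # 2^64 has 20 decimal digits, so pad every key to that width.
    return str(2**64 - timestamp).zfill(20)


timestamps = [100, 300, 200]
keys = sorted(inverted_key(ts) for ts in timestamps)
print(keys[0] == inverted_key(300))  # True
```

The padding matters because string comparison is character by character; without it, a shorter number could sort after a longer one.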
On Tue, Feb 15, 2011 at 11:40 AM, buddhasystem potek...@bnl.gov wrote:
So, if I don't need indexes, what is the most stable, reliable version of
Cassandra that I can put in production? I'm seeing bug reports here and some
sound quite serious, I just want something that works day in, day out.
HH is one aspect, and the other aspect is that when a new node joins there needs
to be some balancing, which may take time as well.
But I also understand it will add a lot of complexity in the code.
Is there any place where I can read other things of concern that one should
be aware of?
I would like to subscribe to your newsletter.
On Tue, Feb 15, 2011 at 8:04 AM, A J s5a...@gmail.com wrote:
Looks like your wish has been granted.
2011/2/15 Chris Goffinet c...@chrisgoffinet.com
I would like to subscribe to your newsletter.
On Tue, Feb 15, 2011 at 8:04 AM, A J s5a...@gmail.com wrote:
But wouldn't using timestamp as row keys cause conflicts?
On Tue, 2011-02-15 at 19:11 -0600, Tyler Hobbs wrote:
What is the best way to retrieve the latest rows from a CF
with OPP.
Use inverted timestamps (for example, 2^64 - timestamp) with zeros for
padding as the row
Created https://issues.apache.org/jira/browse/CASSANDRA-2172.
On Tue, Feb 15, 2011 at 3:34 PM, B. Todd Burruss bburr...@real.com wrote:
it happens when i start the node. just tried it again. here's the
saved_caches directory:
[cassandra@kv-app02 ~]$ ls -l /data/cassandra-data/saved_caches/
command never returns means it's waiting for the nodes to agree on
the new schema version. Bad Mojo will ensue if you issue more schema
updates anyway.
On Tue, Feb 15, 2011 at 3:46 PM, Bill Speirs bill.spe...@gmail.com wrote:
Has anyone ever tried to drop a column family and/or create one and
What would/could take so long for the nodes to agree? It's a small cluster (7
nodes) all on local LAN and not being used by anything else.
I think a delete refresh might be in order...
Thanks!
Bill-
On 02/15/2011 09:13 PM, Jonathan Ellis wrote:
command never returns means it's waiting for
Enterprise applications is a very broad topic. There's no one answer for every
type.
You specifically mention a transactional scenario. For that, I can recommend
you look at Cages (http://code.google.com/p/cages) if you haven't already.
On Feb 15, 2011, at 19:45, Ritesh Tijoriwala
2011/2/5 Jonathan Ellis jbel...@gmail.com
Start with grep -i down system.log on each machine
I grepped all machines but found nothing
I'm seeing this as well; several column families with keys_cached = 0 on 0.7.1.
Debug level logs: http://pastebin.com/qvujKDth
--
Dan Washusen
On Wednesday, 16 February 2011 at 1:12 PM, Jonathan Ellis wrote:
Created https://issues.apache.org/jira/browse/CASSANDRA-2172.
On Tue, Feb 15, 2011
Could this be a result of compaction?
2011/2/16 ruslan usifov ruslan.usi...@gmail.com
2011/2/5 Jonathan Ellis jbel...@gmail.com
Start with grep -i down system.log on each machine
I grepped all machines but found nothing
Recently upgraded my 8 node cluster from 0.6.6 to 0.7.0 (even more recently
0.7.1) for ExpiringColumn, among the many other spectacular improvements.
Retuned the GC settings based on experience from 0.6.6 and new defaults.
After about a week, two of the nodes were very far behind on minor
But wouldn't using timestamp as row keys cause conflicts?
Depending on client behavior, yes. If that's an issue for you, make your
own UUIDs by appending something random or client-specific to the timestamp.
--
Tyler Hobbs
Software Engineer, DataStax http://datastax.com/
Maintainer of the
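Appending a random or client-specific suffix to the timestamp part of the key could look like the sketch below. The key layout (20-digit inverted timestamp, a dash, then the suffix) and the function name are assumptions for illustration.

```python
import uuid
from typing import Optional


def unique_key(timestamp: int, suffix: Optional[str] = None) -> str:
    """Inverted, zero-padded timestamp plus a suffix that separates writers."""
    if not 0 <= timestamp <= 2**64:
        raise ValueError("timestamp out of range")
    if suffix is None:
        suffix = uuid.uuid4().hex  # random suffix when no client id is given
    return f"{2**64 - timestamp:020d}-{suffix}"


print(unique_key(5, "client-a"))         # 18446744073709551611-client-a
print(unique_key(5) != unique_key(5))    # True
```

Because the timestamp part has a fixed width and comes first, keys still sort by time, and the suffix only decides the order among keys with the same timestamp.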
Thanks, it works.
roland
From: Michal Augustýn [mailto:augustyn.mic...@gmail.com]
Sent: Tuesday, 15 February 2011 16:22
To: user@cassandra.apache.org
Subject: Re: cant seem to figure out secondary index definition
Ah, ok. I checked that in source and the problem is that you wrote