Hi AJ,
I am storing simplified chinese data in columns without any issues at
the moment
萨莎
I can retrieve the data, but haven't tried secondary indexes or
something a bit more advanced yet
-sd
On Thu, Feb 24, 2011 at 5:21 PM, A J s5a...@gmail.com wrote:
Hello,
Have there been Cassandra
Sylvain Lebresne sylvain at datastax.com writes:
However, if that simple conflict detection/resolution mechanism is not good
enough for some of your use case and you need to keep two concurrent updates,
it
is easy enough. Just make sure that the updates don't end up in the same column.
This
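A minimal sketch of the point above, not Cassandra code: a plain dict stands in for a row, and the `write` helper, the `comment:` prefix, and the timestamps are all hypothetical. It assumes conflict resolution keeps the write with the highest timestamp per column, and it simplifies tie-breaking.

```python
import uuid

def write(row, column, value, timestamp):
    """Apply a write, keeping the highest timestamp per column (ties simplified)."""
    current = row.get(column)
    if current is None or timestamp >= current[1]:
        row[column] = (value, timestamp)

row = {}

# Two clients update the same column: only the later timestamp survives.
write(row, "email", "a@example.com", 100)
write(row, "email", "b@example.com", 101)

# Two clients write to distinct column names: both updates are kept.
write(row, "comment:" + str(uuid.uuid1()), "first", 100)
write(row, "comment:" + str(uuid.uuid1()), "second", 101)
```

Giving each update its own column name (here a time-based UUID suffix) means neither write can overwrite the other, and the reader merges them.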
We experienced the java.lang.NegativeArraySizeException when upgrading to
0.7.2 in staging. The proposed solution (running compaction) seems to have
solved this. However it took a lot of time to run.
Is it safe to invoke a major compaction on all of the machines concurrently?
I can't see a
I am suggesting that you probably want to rethink your schema design,
since partitioning by year is going to perform badly: the
old servers are going to be nothing more than expensive tape drives.
You fail to see the obvious
It is just the fact that most of the data is stale
@Thibaut Britz
Caveat: Using simple strategy.
This works because cassandra scans data at startup and then serves
what it finds. For a join for example you can rsync all the data from
the node below/to the right of where the new node is joining. Then
join without bootstrap then cleanup both
If your cluster has the overall IO capacity to perform a simultaneous
compaction on every node and still adequately service reads and
writes, then yes. If you're concerned about availability, your best
bet will be to stagger the compactions.
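One way to stagger the compactions is sketched below. It assumes `nodetool -h <host> compact` is on the PATH and blocks until that node's compaction finishes; the helper names and the `pause_seconds` parameter are hypothetical.

```python
import subprocess
import time

def compact_command(host):
    # Command line asking one node to run a major compaction.
    return ["nodetool", "-h", host, "compact"]

def staggered_compaction(hosts, run=subprocess.check_call, pause_seconds=0):
    """Compact one node at a time so the others keep serving traffic."""
    for host in hosts:
        run(compact_command(host))   # assumed to return when compaction is done
        time.sleep(pause_seconds)    # optional breathing room between nodes
```

Calling `staggered_compaction(["node1", "node2", "node3"])` would walk the ring one node at a time; the host names are placeholders.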
Gary.
On Fri, Feb 25, 2011 at 04:24, Daniel
He has a product to sell, so you can expect some advertising. But in
general, Stonebraker's articles are very deep (another one that
challenges general conceptions is
http://voltdb.com/voltdb-webinar-sql-urban-myths ) . He is the creator
of Postgres and considered a guru in databases by many.
And
Compaction assumes that the sstables it has as input are ordered
correctly (otherwise it would have to read the full row into memory to
re-sort). So it would have to be a new operation, and not feasible in
general for larger-than-memory rows. I don't think we'll ever add
this.
On Wed, Feb 23,
Nice!
On Wed, Feb 23, 2011 at 9:06 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
On the mailing list and IRC there are many questions about Cassandra
internals. I understand where the questions are coming from because it
took me a while to get a grip on it.
However if you have a laptop
You should upgrade before wasting time troubleshooting such an old install.
On Thu, Feb 24, 2011 at 8:45 AM, Tomer B tomer...@gmail.com wrote:
Hi
I'm using a 3-node cluster of Cassandra 0.6.1 together with Hector as the API for the
Java client.
every few days I get a situation where I cannot connect
http://wiki.apache.org/cassandra/RunningCassandra may be useful, but
really you should be using the debian package:
http://wiki.apache.org/cassandra/DebianPackaging
2011/2/24 ko...@vivinavi.com ko...@vivinavi.com:
Hi everyone
I am new to JAVA and Cassandra.
I just get started to install
That article is heavily biased by "I am selling a competitor to Cassandra."
First, read Coda's original piece if you haven't:
http://codahale.com/you-cant-sacrifice-partition-tolerance/
Then, Jeff Darcy's response: http://pl.atyp.us/wordpress/?p=3110
On Thu, Feb 24, 2011 at 2:56 PM, A J
On Fri, Feb 25, 2011 at 7:38 AM, Terje Marthinussen
tmarthinus...@gmail.com wrote:
@Thibaut Britz
Caveat: Using simple strategy.
This works because cassandra scans data at startup and then serves
what it finds. For a join for example you can rsync all the data from
the node below/to the right
Though you are not really implying that, I am not selling anything. I
don't work for VoltDB. I had other issues for my use case with the
software when I was evaluating it (their claim of durability is weak
according to me. Though it does not matter I'd rather they call
themselves NOSQL. they just
Yeah - no worries - I don't think anyone was thinking you were trying to drink
kool-aid or selling anything. Jonathan was just pointing out thoughtful
replies to his claims.
This past year, Michael Stonebraker with voltdb and other things seems to have
tried to take advantage of momentum
And everyone has a bias - and I think most people working with any of these
solutions realize that.
I think it's interesting how many organizations use multiple data storage
solutions versus just using one as they have different capabilities - like the
recent Netflix news about using
I read in some cassandra notes that each node should be allocated
twice the storage capacity you wish it to contain. I think the reason
was that during compaction another copy of the SSTables has to be made before
the original ones are discarded.
Can someone confirm if that is actually true ? During
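The arithmetic behind that rule of thumb can be sketched as follows, assuming the reasoning in the question holds: the merged output is written in full before the inputs are deleted, and nothing is purged during the merge. The function name and the SSTable sizes are hypothetical.

```python
def peak_disk_during_compaction(sstable_sizes_gb, compacting):
    """Worst-case disk usage while the SSTables at the given indexes are merged.

    Assumes the output is as large as the inputs (no tombstones or
    overwritten columns are dropped), so inputs and output coexist.
    """
    live = sum(sstable_sizes_gb)
    output = sum(sstable_sizes_gb[i] for i in compacting)
    return live + output

sizes = [40, 30, 20, 10]   # hypothetical SSTable sizes in GB, 100 GB total

major = peak_disk_during_compaction(sizes, range(len(sizes)))  # 200 GB
minor = peak_disk_during_compaction(sizes, [2, 3])             # 130 GB
```

Under these assumptions a compaction that touches every SSTable briefly needs twice the live data size, while one that merges only the two smallest files needs much less.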
On Fri, Feb 25, 2011 at 9:22 AM, A J s5a...@gmail.com wrote:
I read in some cassandra notes that each node should be allocated
twice the storage capacity you wish it to contain. I think the reason
was that during compaction another copy of the SSTables has to be made before
the original ones are
I'm wondering if anyone has used cassandra as a datastore for a
user-profile service. I'm thinking of applications like behavioral
targeting, where there are lots and lots of users (10s to 100s of millions),
and lots and lots of data about them intermixed in, say, weblogs (probably TBs
worth).
I wanted to let everyone know that we're expanding our beta for the
Acunu Storage Platform, which comprises a modified version of
Cassandra that interfaces directly on to a storage stack reengineered
for Big Data workloads.
Acunu runs Cassandra applications unmodified, but provides (as we'll
be
OK. Is it also driven by type of compaction ? Does a minor compaction
require less working space than major compaction ?
On Fri, Feb 25, 2011 at 12:40 PM, Robert Coli rc...@digg.com wrote:
On Fri, Feb 25, 2011 at 9:22 AM, A J s5a...@gmail.com wrote:
I read in some cassandra notes that each node
I updated the cassandra version in the hector package from 0.7.0 to 0.7.2. The
occasional slow-down in the CF-index went away. I then upped the heap to
512MB, and the secondary-indexing then works. Seems awfully memory hungry for
my small dataset. Even the CF-index was faster with more heap.
On Fri, Feb 25, 2011 at 10:14 AM, A J s5a...@gmail.com wrote:
OK. Is it also driven by type of compaction ? Does a minor compaction
require less working space than major compaction ?
Yes, unless that minor compaction happens to involve all SStables due
to compaction thresholds, at which time it
On Fri, Feb 25, 2011 at 12:14 PM, A J s5a...@gmail.com wrote:
OK. Is it also driven by type of compaction ? Does a minor compaction
require less working space than major compaction ?
No, every so often a minor compaction ends up compacting all SSTables, so
it's effectively the same as a major
Thanks. What happens when my compaction fails for space reasons ?
Is no compaction possible till I add more space ?
I would assume writes are not impacted though the latency of reads
would increase, right ?
Also though writes are not seek-intensive, compactions are seek-intensive, no ?
On Fri,
It's nice to see some testing in this regard, however, it's worth pointing
out something that gets lost in CF index vs secondary index discussions.
What you're really proving is that get_slice (across columns) is faster than
get_indexed_slices (across keys). For up to a certain size (and it would
Does it mean that we should design the data model such that row keys
actually become columns (and create a secondary index) so that data
retrieval is faster? I am soon setting up big test instances to test
all this.
On Fri, Feb 25, 2011 at 11:18 AM, Ed Anuff e...@anuff.com wrote:
It's nice to see
At the risk of recapitulating a conversation that seems to happen with some
frequency on this list, the answer is going to boil down to "it depends on your
data model, but using rows as indexes is one of the core usage patterns of
Cassandra, whether to store the list of keys to rows in another column
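The "rows as indexes" pattern can be sketched with plain dicts standing in for column families. This is an illustration only, not client code: `users`, `users_by_city`, and both helpers are hypothetical names, and it does not handle removing stale index entries when the indexed value changes.

```python
from collections import defaultdict

users = {}                         # row key -> {column name: value}
users_by_city = defaultdict(dict)  # index row key -> {user row key: ""}

def insert_user(key, columns):
    users[key] = columns
    # Maintain the index row: each column name is the key of an indexed row.
    users_by_city[columns["city"]][key] = ""

def users_in_city(city):
    # One slice across the columns of a single index row, then a fetch per key.
    index_row = users_by_city.get(city, {})
    return [users[k] for k in sorted(index_row)]

insert_user("u1", {"name": "Ann", "city": "Oslo"})
insert_user("u2", {"name": "Bob", "city": "Paris"})
insert_user("u3", {"name": "Cy", "city": "Oslo"})
```

Here `users_in_city("Oslo")` reads one index row and returns the rows for `u1` and `u3`, which is the column-slice access path discussed in this thread.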
If you search the list there is some discussion about this. Best advice is to
send in plain text. https://issues.apache.org/jira/browse/INFRA-3356
Personally I prefer the emails to have the whole discussion.
Aaron
On 25/02/2011, at 4:55 AM, Anthony John chirayit...@gmail.com wrote:
Do not
I just noticed this thread. Does this mean that (assuming the same setup of an
empty keyspace and CFs added later) if I have a CF that I write to for some
time, but not enough to hit the flush limits, it will never get flushed until
the server is restarted? I believe this is causing commit logs
Another related question:
Can the minor compactions across nodes be staggered so that I can
control how many nodes are compacting at any given point ?
On Fri, Feb 25, 2011 at 2:01 PM, A J s5a...@gmail.com wrote:
Thanks. What happens when my compaction fails for space reasons ?
Is no compaction
Hi Guys,
for all of those who prefer forums over mailing lists, I set up a forum for
cassandra, please have a look
http://www.cassandraforums.com/
thanks
Jo
Yes.
On Fri, Feb 25, 2011 at 4:29 PM, Jeffrey Wang jw...@palantir.com wrote:
I just noticed this thread. Does this mean that (assuming the same setup of
an empty keyspace and CFs added later) if I have a CF that I write to for
some time, but not enough to hit the flush limits, it will never
Hi,
I would like to know the internals of how node failure detection works in
Cassandra. And in the absence of any network partition, do all nodes see the
same view of live nodes? Is there a concept of Coordinator/Election? If yes,
how is merge handled after network partition heals?
thanks,
Ritesh
On Fri, Feb 25, 2011 at 5:32 PM, tijoriwala.ritesh
tijoriwala.rit...@gmail.com wrote:
Hi,
I would like to know the internals of how node failure detection works in
Cassandra.
http://bit.ly/phi_accrual
Is there a concept of Coordinator/Election?
No.
-Brandon
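For readers who want the idea behind the phi accrual link in a few lines, here is a minimal sketch. It assumes heartbeat intervals follow an exponential distribution, which is a simplification, and it is not Cassandra's implementation; the class name and window size are hypothetical.

```python
import math
from collections import deque

class PhiAccrualDetector:
    """Tracks heartbeat arrival times and reports a suspicion level (phi)."""

    def __init__(self, window=1000):
        self.intervals = deque(maxlen=window)  # recent inter-arrival times
        self.last = None                       # time of the latest heartbeat

    def heartbeat(self, now):
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def phi(self, now):
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        # P(silence lasts at least t) = exp(-t / mean); phi = -log10(P).
        return (now - self.last) / (mean * math.log(10))

detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)
```

Phi grows the longer a node stays silent relative to its usual heartbeat interval, and a node is treated as down once phi crosses a chosen threshold. Each node computes this locally from the heartbeats it observes, so no coordinator or election is involved.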
On Fri, Feb 25, 2011 at 2:41 PM, A J s5a...@gmail.com wrote:
Can the minor compactions across nodes be staggered so that I can
control how many nodes are compacting at any given point ?
Not without some crazy scheme where you control the compaction
thresholds dynamically via some external
Cassandra never compacts more than one column family at a time?
Regards,
Terje
On 26 Feb 2011, at 02:40, Robert Coli rc...@digg.com wrote:
On Fri, Feb 25, 2011 at 9:22 AM, A J s5a...@gmail.com wrote:
I read in some cassandra notes that each node should be allocated
twice the storage
On Fri, Feb 25, 2011 at 4:55 PM, Terje Marthinussen
tmarthinus...@gmail.com wrote:
Cassandra never compacts more than one column family at a time?
Nope, compaction is single threaded currently.
https://issues.apache.org/jira/browse/CASSANDRA-2191