session.execute blocks until C* returns the response. Use the async
version, but do so with caution: if you don't throttle the requests, you
will start seeing timeouts on the client side pretty quickly. For
throttling I've used a Semaphore, but I think Guava's RateLimiter is better
suited.
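A minimal sketch of the Semaphore approach mentioned above, assuming Java 8. It does not use the driver: fakeExecuteAsync is a hypothetical stand-in for an asynchronous call such as session.executeAsync, and the class name, pool size, and limits are illustrative only. The idea is to take a permit before each request and give it back when the request completes, so the submitting loop stalls instead of flooding the cluster.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottledAsyncSketch {

    // Hypothetical stand-in for an async driver call; it only sleeps briefly
    // and records how many "requests" are in flight at the same time.
    public static CompletableFuture<Void> fakeExecuteAsync(
            ExecutorService pool, AtomicInteger inFlight, AtomicInteger maxSeen) {
        return CompletableFuture.runAsync(() -> {
            int now = inFlight.incrementAndGet();
            maxSeen.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            inFlight.decrementAndGet();
        }, pool);
    }

    // Submits totalRequests fake requests with at most maxInFlight outstanding.
    // Returns the highest number of requests observed in flight at once.
    public static int run(int totalRequests, int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        ExecutorService pool = Executors.newFixedThreadPool(16);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        try {
            for (int i = 0; i < totalRequests; i++) {
                // Blocks the submitting thread once maxInFlight requests are outstanding.
                permits.acquireUninterruptibly();
                fakeExecuteAsync(pool, inFlight, maxSeen)
                        // Release the permit on success or failure.
                        .whenComplete((result, error) -> permits.release());
            }
            // Taking every permit back means all outstanding requests have finished.
            permits.acquireUninterruptibly(maxInFlight);
            permits.release(maxInFlight);
        } finally {
            pool.shutdown();
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println("max in flight: " + run(200, 8));
    }
}
```

A Semaphore caps the number of outstanding requests, while a rate limiter caps requests per second; which one fits better depends on whether the timeouts come from concurrency or from raw request rate.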
Corollary:
what is getting shipped over the wire? The ganglia screenshot shows the
network traffic on all the three hosts on which I ran the nodetool repair.
[image: Inline image 1]
remember
UN 10.1.2.11 107.47 KB 256 32.9%
1f800723-10e4-4dcd-841f-73709a81d432 rack1
UN 10.1.2.10 127.
Howdy!
Not a matter of life or death, just curious.
I've just stood up a three node cluster (v1.2.8) on three c3.2xlarge boxes
in AWS. Silly me forgot the correct replication factor for one of the
needed keyspaces. So I changed it via cli and ran a nodetool repair.
Well .. there is no data at all
Hello Cassandra users,
For one of our new Big Data BI projects, we are using Apache Cassandra
1.2.10 as our primary data store with the support of Hadoop for analytics.
For prototyping purposes we have 1 node each for Apache Cassandra/Hadoop.
Pig is our choice to process the data from/to C*.
B
I can’t speak for Astyanax; their thrift transport I believe is abstracted out,
however the object model is very CF wide row vs table-y.
I have no idea what the plans are for further Astyanax dev (maybe someone on
this list), but I believe the thrift API is not going away, so considering
Astyan
Hi all,
I need to set up a 2-node Cassandra cluster. I know that DataStax recommends
using JBOD as the disk configuration and relying on replication for redundancy.
I was planning to use RAID 10, but using JBOD can save 50% disk space and
increase the performance. But I am not sure I should use JBOD wit
Comments added, not sure about the usefulness seeing as the issue is
already "resolved" :)
-Anne
On 12/10/2013 03:12 PM, Robert Coli
wrote:
On Tue, Dec 10, 2013 at 5:58 AM, Anne Sullivan
wro
On Tue, Dec 10, 2013 at 12:15 AM, horschi wrote:
> And my feeling gets worse when I look at Murmur3Partitioner.normalize().
> This one explicitly excludes Long.MIN_VALUE by changing it to
> Long.MAX_VALUE.
>
> I think I'll just avoid it in the future. Better safe than sorry...
>
I see, your ques
On Tue, Dec 10, 2013 at 5:58 AM, Anne Sullivan <
anne.b.sulli...@alcatel-lucent.com> wrote:
> My understanding is that a node won't auto-bootstrap if it thinks it's a
> seed node. So when adding a new node to an existing cluster, I want to
> make sure it will auto-bootstrap, and I don't want to
On Tue, Dec 10, 2013 at 1:45 AM, Bonnet Jonathan <
jonathan.bon...@externe.bnpparibas.com> wrote:
> Take a look at the code? I'm not a developer but a DBA; where should I
> look? Thank you.
>
In all seriousness, if you plan to operate Cassandra, get used to the idea
of reading Java source co
Great. Thanks Aaron.
FWIW, I am/was porting Virgil over to CQL.
I should be able to release a new REST API for C* (using CQL) shortly.
-brian
---
Brian O'Neill
Chief Architect
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 •
Looks like a bug, will try to fix today
https://issues.apache.org/jira/browse/CASSANDRA-6472
Cheers
-
Aaron Morton
New Zealand
@aaronmorton
Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
On 6/12/2013, at 10:25 am, Brian O'Neill wrote
Michael has a good point about the system tables - particularly hints and
batching (though neither should be a real tax unless you have bigger
issues).
If you have a monitoring system, add in the flush counters for these and
other higher traffic tables and see if there is a correlation at the four
You could shard your rows like the following.
You would need over 100 shards, possibly... so testing is in order :)
Michael
-- put this in and run using 'cqlsh -f
DROP KEYSPACE robert_test;
CREATE KEYSPACE robert_test WITH replication = {
'class': 'SimpleStrategy',
'replication_facto
Where you’ll run into trouble is with compaction. It looks as if pord is some
sequentially increasing value. Try your experiment again with a clustering key
which is a bit more random at the time of insertion.
On Dec 10, 2013, at 5:41 AM, Robert Wille wrote:
> I have a question about this s
comments below
On Dec 9, 2013, at 11:33 PM, Aaron Morton wrote:
>> But this becomes troublesome if I add or remove nodes. What effectively I
>> want is to partition on the unique id of the record modulus N (id % N; where
>> N is the number of nodes).
> This is exactly the problem consistent ha
Thanks a lot,
It works; I see the commit log being archived. I'll try the restore
command tomorrow. Thanks again.
Bonnet Jonathan.
2.0.3: system tables have a 1 hour memtable_flush_period which I have
observed to trigger compaction on the 4 hour mark. Going by memory tho...
-ml
On Tue, Dec 10, 2013 at 10:31 AM, Andre Sprenger
wrote:
> As far as I know there is nothing hard coded in Cassandra that kicks in
> every 4 hours. T
As far as I know there is nothing hard coded in Cassandra that kicks in
every 4 hours. Turn on GC logging, maybe dump the output of jstat to a
file, and correlate this data with the Cassandra logs. Cassandra logs are
pretty good at telling you what is going on.
2013/12/10 Joel Samuelsson
> Hell
Hi,
I have an error with a Pig action in Oozie 4.0.0 using cassandraStorage
(Cassandra 1.2.10).
I can run Pig scripts fine with Cassandra, but when I try to use
cassandraStorage to load data I get this error:
*Run pig script using PigRunner.run() for Pig version 0.8+*
*Apache Pig version 0.10
Hello,
We've been having a lot of problems with extremely long GC (and still do)
which I've asked about several times on this list (I can find links to
those discussions if anyone is interested).
We noticed a pattern that the GC pauses may be related to something
happening every 4 hours. Is there
My understanding is that a node won't auto-bootstrap if it thinks
it's a seed node. So when adding a new node to an existing cluster,
I want to make sure it will auto-bootstrap, and I don't want to do 2
edits to the config file (first start without node as seed, then add
I have a question about this statement:
When rows get above a few 10's of MB things can slow down, when they get
above 50 MB they can be a pain, when they get above 100 MB it's a warning
sign. And when they get above 1 GB, well, you don't want to know what
happens then.
I tested a data model t
Hmm. I have read that the thrift interface to Cassandra is out of
favour and the CQL interface is in. Where does that leave Astyanax?
On Tue, Dec 10, 2013 at 1:14 PM, graham sanderson wrote:
> Perhaps not the way forward, however I can bulk insert data via astyanax at a
> rate that maxes out our
I should probably give you a number: about 300 meg/s via the Thrift API,
using 1 MB batches.
On Dec 10, 2013, at 5:14 AM, graham sanderson wrote:
> Perhaps not the way forward, however I can bulk insert data via astyanax at a
> rate that maxes out our (fast) networks. That said for our ne
Perhaps not the way forward, however I can bulk insert data via astyanax at a
rate that maxes out our (fast) networks. That said for our next release (of
this part of our product - our other current is node.js via binary protocol) we
will be looking at insert speed via java driver, and also alte
I have tried the DataStax Java driver and it seems the fastest way to
insert data is to compose a CQL string with all parameters inline.
This loop takes 2500ms or so on my test cluster:
PreparedStatement ps = session.prepare("INSERT INTO perf_test.wibble
(id, info) VALUES (?, ?)")
for (int i = 0;
Hi,
There are some docs on the internet for this operation. It is basically
as described in the commit log archiving file
(commitlog_archiving.properties).
The way the operation works: it is called automatically with
parameters that give you control over what you want to do with it.
Vicky Kak gmail.com> writes:
>
>
>
> >>Why, can you give me a good example and the good way to configure archive
> >>commit logs ?
> Take a look at the cassandra code ;)
>
>
Take a look at the code? I'm not a developer but a DBA; where should I
look? Thank you.
Regards,
Bonnet Jona
Hi Aaron,
thanks for your response. But that is exactly what scares me:
RandomPartitioner.MIN is -1, which is not a valid token :-)
And my feeling gets worse when I look at Murmur3Partitioner.normalize().
This one explicitly excludes Long.MIN_VALUE by changing it to
Long.MAX_VALUE.
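To make the behaviour described above concrete, here is a minimal sketch. It is not the Cassandra source: the class name is hypothetical, and the method only reproduces the mapping described in this message (Long.MIN_VALUE becomes Long.MAX_VALUE, everything else passes through).

```java
public class NormalizeSketch {

    // Illustrative only: mirrors the behaviour described in the message,
    // not the actual Murmur3Partitioner implementation.
    public static long normalize(long hash) {
        return hash == Long.MIN_VALUE ? Long.MAX_VALUE : hash;
    }

    public static void main(String[] args) {
        // Long.MIN_VALUE is never returned, so no key ends up with that value.
        System.out.println(normalize(Long.MIN_VALUE) == Long.MAX_VALUE); // prints true
        System.out.println(normalize(42L));                              // prints 42
    }
}
```

The effect is that two distinct hash values share one output, which is presumably the price of keeping Long.MIN_VALUE out of the set of key tokens.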
I think I'll