> I thought that C* had no null values... I use a lot of CFs in which only the
> column names are filled in, and I request a range of columns to see which
> references (like 1228#16866) exist. So I would like those columns to simply
> disappear from the table.
Cassandra does not store null values.
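That answers the question above: a row stores only the columns that actually exist, so an absent or deleted column simply never appears in a range slice. A minimal sketch of that sparse-row model (a toy illustration, not Cassandra's storage code):

```python
# Toy sparse-row model: a row holds only columns that exist; there is no
# null placeholder for columns that were never written or were deleted.
row = {"1228#16866": b"", "1228#16867": b""}  # column name -> value

def slice_columns(row, start, end):
    """Return column names in [start, end], in sorted (comparator) order."""
    return [name for name in sorted(row) if start <= name <= end]

# Only columns that actually exist come back.
print(slice_columns(row, "1228#16866", "1228#16999"))

# Deleting a column removes it from the row entirely, so it vanishes from
# subsequent range queries (after the tombstone is purged, in real Cassandra).
del row["1228#16866"]
print(slice_columns(row, "1228#16866", "1228#16999"))
```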
Hi All,
We are happy to announce the release of Kundera 2.5.
Kundera is a JPA 2.0 compliant, object-datastore mapping library for NoSQL
datastores. The idea behind Kundera is to make working with NoSQL databases
drop-dead simple and fun. It currently supports Cassandra, HBase, MongoDB,
Redis,
Opened a ticket:
https://issues.apache.org/jira/browse/CASSANDRA-5525
On Mon, Apr 29, 2013 at 2:24 AM, aaron morton wrote:
> is this understanding correct "we had a 12 node cluster with 256 vnodes on
> each node (upgraded from 1.1), we added two additional nodes that streamed
> so much data (60
On Mon, Apr 29, 2013 at 1:17 PM, aaron morton wrote:
> Bulk Loader does not use CL, it's more like a repair / bootstrap.
> If you have to skip a node then use repair.
The bulk loader ("sstableloader") can ignore replica nodes via the -i option:
./src/java/org/apache/cassandra/tools/BulkLoader.java
On Mon, Apr 29, 2013 at 3:52 PM, John Watson wrote:
> Same behavior on 1.1.3, 1.1.5 and 1.1.9.
> Currently: 1.2.3
(below snippets are from trunk)
./src/java/org/apache/cassandra/tools/NodeCmd.java
"
case SETCOMPACTIONTHROUGHPUT :
if (arguments.length != 1) { badUs
Same behavior on 1.1.3, 1.1.5 and 1.1.9.
Currently: 1.2.3
On Mon, Apr 29, 2013 at 11:43 AM, Robert Coli wrote:
> On Sun, Apr 28, 2013 at 2:28 PM, John Watson wrote:
> > Running these 2 commands is a no-op IO-wise:
> > nodetool setcompactionthroughput 0
> > nodetool setstreamtrhoughput 0
>
>
> I used JMX to check current number of threads in a production cassandra
> machine, and it was ~27,000.
That does not sound too good.
My first guess would be lots of client connections. What client are you using,
and does it do connection pooling?
See the comments in cassandra.yaml around rpc_se
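A back-of-the-envelope sketch of why unpooled clients can produce thread counts like the ~27,000 reported above (the client and connection counts below are made-up assumptions for illustration, not figures from the thread): with a synchronous thread-per-connection RPC server, server threads grow with every open connection.

```python
# Hypothetical numbers: a server thread per open connection means total
# threads scale with (clients x connections each), while pooling caps it.
def server_threads(clients, connections_per_client):
    return clients * connections_per_client

unpooled = server_threads(clients=300, connections_per_client=90)  # leaky clients
pooled = server_threads(clients=300, connections_per_client=8)     # pool of 8
print(unpooled, pooled)  # 27000 2400
```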
nodetool scrub will repair out-of-order rows in the source SSTables via the
compaction process. Or you can stop the node and use the offline
bin/sstablescrub tool.
Not sure how they got there, there was a ticket for similar problems in 1.1.1
Cheers
-
Aaron Morton
Freelance Cas
Is this a once off data load or something you need to do regularly?
One option you have with RF3 and 3 Nodes is to place a copy of all the SSTables
on each node and use nodetool refresh to directly load the sstables into the
node without any streaming.
> 1. Please can anyone suggest how we can
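A hedged sketch of why the copy-everywhere approach above works with RF 3 on 3 nodes (a toy SimpleStrategy model, not Cassandra code): replicas for a token are the next RF nodes clockwise on the ring, so when RF equals the node count, every node is a replica for every token and can hold a complete copy of the data.

```python
# Toy replica placement: the rf nodes at or after a token, clockwise.
def replicas(ring, token, rf):
    """ring: list of (node_token, node_name). Returns the rf replica nodes."""
    ordered = sorted(ring)
    # first node whose token is >= the key's token, wrapping around the ring
    start = next((i for i, (t, _) in enumerate(ordered) if t >= token), 0)
    return {ordered[(start + i) % len(ordered)][1] for i in range(rf)}

ring = [(-6 * 10**18, "node1"), (0, "node2"), (6 * 10**18, "node3")]
for token in (-7 * 10**18, -1, 5 * 10**18, 8 * 10**18):
    print(token, replicas(ring, token, rf=3))  # always all three nodes
```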
Messages are sent to all replicas involved in the request at the same time.
All nodes in the cluster must be able to communicate with all other nodes.
The coordinator the client is talking to, the local coordinator, groups
messages (for one read/mutation) to be sent to remote data centres and on
I did a talk on the internals at Apache Con this year, it goes through the
architecture and the code
http://www.slideshare.net/aaronmorton/apachecon-nafeb2013
Not sure if/when the videos are going to be put up.
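The routing described above can be sketched as follows (a model of the behaviour, not Cassandra's actual code; node and DC names are invented): the local coordinator writes to every local replica directly, but sends only one message per remote data centre, to a replica there that forwards it to the rest of its own DC.

```python
# Model: direct messages to local replicas, one forwarded message per remote DC.
def outbound_messages(replicas_by_dc, local_dc):
    """Return {destination_node: role} for one mutation."""
    out = {}
    for dc, replicas in replicas_by_dc.items():
        if dc == local_dc:
            for node in replicas:          # direct message to each local replica
                out[node] = "direct"
        else:                              # a single cross-DC message per remote DC
            out[replicas[0]] = "forward to %d peers in %s" % (len(replicas) - 1, dc)
    return out

topology = {"DC1": ["a1", "a2", "a3"], "DC2": ["b1", "b2", "b3"]}
msgs = outbound_messages(topology, local_dc="DC1")
print(msgs)  # three direct sends in DC1, one forwarded send into DC2
```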
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
> About 80% of these CFs should be truncated every day and if we decrease many
> CF by creating one key field in one CF, a huge amount of tombstones will
> appear.
>
>
Truncation requires that all nodes be available, so if you are doing it each
day you may run into trouble if a node is down.
On Sun, Apr 28, 2013 at 2:28 PM, John Watson wrote:
> Running these 2 commands is a no-op IO-wise:
> nodetool setcompactionthroughput 0
> nodetool setstreamtrhoughput 0
What version of cassandra?
=Rob
On Mon, Apr 29, 2013 at 12:33 AM, Sasha Yanushkevich wrote:
> 1) We’ve tested 100 threads in parallel and each thread created 10 tables.
With this pattern of creating CFs, you are begging for schema desynch.
If this actually works any meaningful percentage of the time in modern
cassandra, I would
They were all restarted a couple times after adding 'num_tokens: 256' to
cassandra.yaml.
Yes and nodetool ring became 'unusable' due to all the new tokens.
On Mon, Apr 29, 2013 at 10:24 AM, Sam Overton wrote:
> Did you update num_tokens on the existing hosts and restart them, before
> you trie
Hi, we have a 9-node ring on m1.xlarge AWS hosts. We started having
some trouble a while ago, and it's making me pull out all of my hair.
The host in position #3 has been replaced 4 times. Each time the host
joins the ring, I do a nodetool repair -pr, and she seems fine for about
a day. The
Did you update num_tokens on the existing hosts and restart them, before
you tried bootstrapping in the new node? If the new node tried to stream
all the data in the cluster then this would be consistent with you having
missed that step.
You should see "Calculating new tokens" in the logs of the e
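An illustrative model of what goes wrong when that step is missed (assumptions: random token placement on a uniform ring; node counts match the thread, but the tokens are simulated): a 256-vnode node joining a ring of single-token nodes initially owns almost the entire ring, which is consistent with it streaming far more data than its peers hold.

```python
# Simulate 12 single-token nodes plus one new node with 256 vnodes, and
# compute what fraction of the ring each node owns.
import random

random.seed(42)
RING = 2**64
tokens = [("old%d" % i, random.randrange(RING)) for i in range(12)]  # 1 token each
tokens += [("new", random.randrange(RING)) for _ in range(256)]      # 256 vnodes

def ownership(tokens):
    """Fraction of the ring each node owns; each token owns the range back
    to the previous token, with wraparound."""
    ordered = sorted(tokens, key=lambda nt: nt[1])
    frac = {}
    for i, (node, t) in enumerate(ordered):
        size = (t - ordered[i - 1][1]) % RING
        frac[node] = frac.get(node, 0.0) + size / RING
    return frac

frac = ownership(tokens)
print("new node owns %.0f%% of the ring" % (100 * frac["new"]))
```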
That's what we tried first, before the shuffle, and we ran into the space
issue. That's detailed in another thread, titled "Adding nodes in 1.2 with
vnodes requires huge disks".
On Mon, Apr 29, 2013 at 4:08 AM, Sam Overton wrote:
> An alternative to running shuffle is to do a rolling
> bootstrap/dec
Hi all:

I can run Pig with Cassandra and Hadoop in EC2.

I'm trying to run Pig with a Cassandra ring and Hadoop.
The Cassandra ring has the tasktrackers and datanodes, too.

And I am running Pig from another machine, where I have installed the
namenode/jobtracker.
I have a
For starters: If you are using the Murmur3 partitioner, which is the default
in cassandra.yaml, then you need to calculate the tokens using:
python -c 'print [str(((2**64 / 2) * i) - 2**63) for i in range(2)]'
which gives the following values:
['-9223372036854775808', '0']
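The one-liner generalises to any node count (a convenience sketch; `murmur3_tokens` is a hypothetical helper name): evenly spaced tokens over Murmur3's signed 64-bit range [-2**63, 2**63).

```python
# Evenly spaced initial tokens for n nodes under the Murmur3 partitioner.
def murmur3_tokens(n):
    return [str((2**64 // n) * i - 2**63) for i in range(n)]

print(murmur3_tokens(2))  # ['-9223372036854775808', '0']
```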
From: Rahul [mailto:r
Hi,
I am testing out Cassandra 1.2 on two of my local servers, but I am facing
problems with assigning tokens to my nodes. When I use nodetool to set a
token, I end up getting a Java exception.
My test setup is as follows:
Node1: (seed)
Node2: (seed)
Since I have two nodes, I calculated the tokens as
I created it almost a year ago with cassandra-cli. Now show_schema returns:
create column family myCF
with column_type = 'Standard'
and comparator = 'UTF8Type'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'UTF8Type'
and read_repair_chance = 0.1
and dclocal_read_
Hi,
I'm having some issues. I keep getting:
ERROR [GossipStage:1] 2013-04-28 07:48:48,876 AbstractCassandraDaemon.java
(line 135) Exception in thread Thread[GossipStage:1,5,main]
java.lang.OutOfMemoryError: unable to create new native thread
--
after a day or two of runti
We saw this exception with 1.1.1 and also with 1.1.11 (we upgraded for
unrelated reasons, to fix the FD leak during slice queries) -- name of
the CF replaced with "*" for confidentiality:
10419 ERROR [CompactionExecutor:36] 2013-04-29 07:50:49,060
AbstractCassandraDaemon.java (line 132) Except
Hi All,
We have a requirement to load approximately 10 million records, each record
with approximately 100 columns. We are planning to use the Bulk-loader program
to convert the data into SSTables and then load them using SSTABLELOADER.
Everything is working fine when all nodes are up and runni
Hello.
I would like to know whether updates are propagated from the local DC to
remote DCs simultaneously (so all-to-all network connections are preferable),
or whether Cassandra can somehow determine the nearest DCs and send updates
only to them (so those nearest DCs have to propagate the updates further)?
Is there s
An alternative to running shuffle is to do a rolling
bootstrap/decommission. You would set num_tokens on the existing hosts (and
restart them) so that they split their ranges, then bootstrap in N new
hosts, then decommission the old ones.
On 28 April 2013 22:21, John Watson wrote:
> The amount
Not really, I've passed on the comments to the doc teams.
The column timestamp is just a 64 bit int like I said.
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
On 29/04/2013, at 10:06 AM, Michael Theroux wrote:
> Y
is this understanding correct "we had a 12 node cluster with 256 vnodes on each
node (upgraded from 1.1), we added two additional nodes that streamed so much
data (600+Gb when other nodes had 150-200GB) during the joining phase that they
filled their local disks and had to be killed" ?
Can you
Dear all,
I am trying to understand and analyze the source code of Cassandra. What I
expect (and have seen in other code bases) is that there are three sections
in a program: 1) initialization and input reading, 2) core computation, and
3) finalizing and gathering the output.
However, I cannot find such
1) We’ve tested 100 threads in parallel and each thread created 10
tables. I think we will change our data model, but another problem may
occur. About 80% of these CFs should be truncated every day and if we
decrease many CF by creating one key field in one CF, a huge amount of
tombstones will appe