I'm in the process of migrating data over to Cassandra for several of our
apps, and a few of the schemas use secondary indexes. Four times in the
last couple months I've run into a corrupted sstable belonging to a
secondary index, but have never seen this on any other sstables. When it
happens, any
Were you able to solve or work around this problem?
On 06/05/2014 11:47 AM, Tom van den Berge wrote:
Hi,
I'm trying to migrate a development cluster from 1.2.14 to 2.0.8. When
starting up 2.0.8, I'm seeing the following error in the logs:
INFO 17:40:25,405 Snapshotting drillster, Account to
On Tue, Jun 10, 2014 at 7:31 AM, Jeremy Jongsma wrote:
> I'm in the process of migrating data over to cassandra for several of our
> apps, and a few of the schemas use secondary indexes. Four times in the
> last couple months I've run into a corrupted sstable belonging to a
> secondary index, but
If you've been dropping and recreating tables with the same name, you might
be seeing this: https://issues.apache.org/jira/browse/CASSANDRA-6525
On Tue, Jun 10, 2014 at 12:19 PM, Robert Coli wrote:
> On Tue, Jun 10, 2014 at 7:31 AM, Jeremy Jongsma
> wrote:
>
>> I'm in the process of migrating
Honestly, this has been by far my single biggest obstacle with Cassandra
for time-based data--cleaning up the old data when the deletion criteria
(i.e., date) isn't the primary key. I've asked about a few different
approaches, but I haven't really seen any feasible options that can be
implemented
On Mon, Jun 9, 2014 at 10:43 PM, Colin Kuo wrote:
> You can use "nodetool repair" instead. Repair is able to re-transmit the
> data that belongs to the new node.
>
Repair is not very likely to work in cases where bootstrap doesn't.
@OP : you probably will have to tune your phi detector to be more
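Tuning the failure detector here usually means raising `phi_convict_threshold` in cassandra.yaml (default 8), so a node that is slow to heartbeat while streaming isn't convicted as down. A simplified sketch of the accrual model behind that number, assuming exponentially distributed heartbeat gaps (Cassandra's actual FailureDetector keeps a sliding window of observed intervals):

```python
import math

# Simplified accrual failure detector: assume heartbeat gaps are
# exponentially distributed with the observed mean, so the probability
# of still hearing nothing after t ms is exp(-t/mean), and
# phi = -log10(exp(-t/mean)) = t / (mean * ln 10).
def phi(ms_since_last_heartbeat: float, mean_interval_ms: float) -> float:
    return ms_since_last_heartbeat / (mean_interval_ms * math.log(10))

# With a 1 s mean heartbeat interval, ~10 s of silence gives phi ~ 4.3,
# well under the default phi_convict_threshold of 8; raising the
# threshold tolerates even longer pauses (e.g. during heavy streaming).
print(phi(10_000, 1_000))
```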
I just wanted to verify the procedures to add and remove nodes in my
environment; please feel free to comment or advise.
I have a 3-node cluster N1, N2, N3 with vnodes configured (256 tokens) on each
node. All are in one data center.
1. Procedure to Change node hardware or replace to new node machine
Hi,
I tried to double the size of an existing cluster from 4 to 8 nodes. First
I added one node, which joined after 120min successfully. During that time
there was no additional load on the cluster. Afterwards I started the other
3 new nodes after each other in order to join the cluster simultaneously.
Our approach for this scenario is to run a hadoop job that periodically
cleans old entries, but I admit it's far from ideal. Would be nice to have
a more native way to perform these kinds of tasks.
There's a legend about a compaction strategy that keeps only the N first
entries of a partition key,
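The selection step such a cleanup job performs can be sketched as follows, assuming the scan (e.g. the Hadoop job mentioned above) has already produced a row-key-to-last-write-timestamp mapping; the key names and the `expired_keys` helper are hypothetical:

```python
from datetime import datetime

def expired_keys(last_written, cutoff):
    """Return the row keys whose newest data is older than the retention
    cutoff; a periodic cleanup job can then delete exactly those keys,
    even though the date is not part of the primary key."""
    return sorted(k for k, ts in last_written.items() if ts < cutoff)

# Usage sketch: timestamps would come from the periodic scan.
seen = {
    "user:1": datetime(2014, 1, 15),
    "user:2": datetime(2014, 6, 1),
}
print(expired_keys(seen, datetime(2014, 3, 1)))  # → ['user:1']
```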
I ran an application today that attempted to fetch 20,000+ unique row keys
in one query against a set of completely empty column families. On a 4-node
cluster (EC2 m1.large instances) with the recommended memory settings (2 GB
heap), every single node immediately ran out of memory and became
unresponsive.
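One way to keep a large multiget from concentrating that much load at once is to split the key set into bounded chunks and issue one small request per chunk. A minimal, driver-agnostic sketch (`fetch_rows` is a hypothetical placeholder for your client's multiget or IN-clause query):

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def batched(keys: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of at most `size` keys, so each round trip to the
    cluster stays bounded instead of one giant 20,000-key request."""
    batch: List[T] = []
    for key in keys:
        batch.append(key)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

# Usage sketch -- fetch_rows stands in for whatever per-chunk
# query call your driver provides (hypothetical name):
# results = {}
# for chunk in batched(all_keys, 100):
#     results.update(fetch_rows(chunk))
```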
On Thu, Jun 5, 2014 at 2:38 PM, Charlie Mason wrote:
>
> I can't do the initial account insert with a TTL as I can't guarantee when
> a new value would come along and so replace this account record. However
> when I insert the new account record, instead of deleting the old one could
> I reinsert
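The reinsert-with-TTL idea above can be sketched as building an `INSERT ... USING TTL` statement (standard CQL) for the superseded record; the `accounts` table and `superseded` column here are hypothetical placeholders:

```python
def expire_instead_of_delete(table: str, account_id: str, ttl_seconds: int) -> str:
    """Build the CQL for rewriting a superseded record with a TTL so it
    ages out on its own rather than being removed by an explicit DELETE.
    Table and column names are hypothetical placeholders."""
    return (f"INSERT INTO {table} (account_id, superseded) "
            f"VALUES ('{account_id}', true) USING TTL {ttl_seconds}")

print(expire_instead_of_delete("accounts", "acct-42", 86400))
```

One caveat worth noting: cells that expire via TTL are still converted to tombstones at compaction time, so this trades explicit deletes for deferred expiry rather than eliminating tombstones entirely.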
Hello Jeremy
Basically what you are doing is to ask Cassandra to do a distributed full
scan on all the partitions across the cluster, it's normal that the nodes
are somehow stressed.
How did you make the query? Are you using Thrift or CQL3 API?
Please note that there is another way to get al
I didn't explain clearly - I'm not requesting 2 unknown keys (resulting
in a full scan), I'm requesting 2 specific rows by key.
On Jun 10, 2014 6:02 PM, "DuyHai Doan" wrote:
> Hello Jeremy
>
> Basically what you are doing is to ask Cassandra to do a distributed full
> scan on all the part
Perhaps if you described both the schema and the query in more detail, we
could help... e.g. did the query have an IN clause with 2 keys? Or is
the key compound? More detail will help.
On Tue, Jun 10, 2014 at 7:15 PM, Jeremy Jongsma wrote:
> I didn't explain clearly - I'm not requesting 200
On Tue, Jun 10, 2014 at 2:21 PM, Philipp Potisk wrote:
> First I added one node, which joined after 120min successfully. During
> that time there was no additional load on the cluster. Afterwards I started
> the other 3 new nodes after each other in order to join the cluster
> simultaneously.
>
Have a look at http://www.tinc-vpn.org/, mesh based and handles multiple
gateways for the same network in a graceful manner (so you can run two gateways
per region for HA).
Also supports NAT traversal if you need to do public-private clusters.
We are currently evaluating it for our managed Cassandra
Hey Rob,
thanks for pointing out the issue with simultaneous bootstraps. However, I
am not sure if this applies in my case. As a matter of fact I did not start
the nodes simultaneously - I waited about 10min until they were receiving
streams from other nodes. So I guess the topology-changes were e
Hi all,
I encountered a strange phenomenon (at least I believe it's strange) when
trying to set a TTL for a whole row.
When trying to set a TTL for a row using an update statement and updating all
values, I'm getting kind of a "phantom CQL row".
When trying to do the same thing using an insert statement