upgradesstables/cleanup/compaction strategy change

2016-05-23 Thread Erik Forsberg
Hi! I have a 2.0.13 cluster which I have just extended, and I'm now looking into upgrading it to 2.1. * The cleanup after the extension is partially done. * I'm also looking into changing a few tables into Leveled Compaction Strategy. In the interest of speeding up things by avoiding unnece

Re: Extending a partially upgraded cluster - supported

2016-05-18 Thread Erik Forsberg
radesstables before you try to extend the cluster. On 5/18/16, 11:17 AM, "Erik Forsberg" wrote: Hi! I have a 2.0.13 cluster which I need to do two things with: * Extend it * Upgrade to 2.1.14 I'm pondering in what order to do things. Is it a supported operation to extend a partia

Extending a partially upgraded cluster - supported

2016-05-18 Thread Erik Forsberg
Hi! I have a 2.0.13 cluster which I need to do two things with: * Extend it * Upgrade to 2.1.14 I'm pondering in what order to do things. Is it a supported operation to extend a partially upgraded cluster, i.e. a cluster upgraded to 2.0 where not all sstables have been upgraded? If I do tha

Lot's of hints, but only on a few nodes

2016-05-10 Thread Erik Forsberg
I have this situation where a few (like, 3-4 out of 84) nodes misbehave. Very long GC pauses, dropping out of cluster etc. This happens while loading data (via CQL), and analyzing metrics it looks like on these few nodes, a lot of hints are being generated close to the time when they start to

Re: A few misbehaving nodes

2016-04-21 Thread Erik Forsberg
On 2016-04-19 15:54, sai krishnam raju potturi wrote: hi; do we see any hung process like Repairs on those 3 nodes? what does "nodetool netstats" show?? No hung process from what I can see. root@cssa02-06:~# nodetool tpstats Pool Name Active Pending Completed B

A few misbehaving nodes

2016-04-19 Thread Erik Forsberg
Hi! I have this problem where 3 of my 84 nodes misbehave with too long GC times, leading to them being marked as DN. This happens when I load data to them using CQL from a hadoop job, so quite a lot of inserts at a time. The CQL loading job is using TokenAwarePolicy with fallback to DCAwareR

How are writes handled while adding nodes to cluster?

2015-10-06 Thread Erik Forsberg
Hi! How are writes handled while I'm adding a node to a cluster, i.e. while the new node is in JOINING state? Are they queued up as hinted handoffs, or are they being written to the joining node? In the former case I guess I have to make sure my max_hint_window_in_ms is long enough for the node

One node misbehaving (lot's of GC), ideas?

2015-04-15 Thread Erik Forsberg
Hi! We're having problems with one node (out of 56 in total) misbehaving. Symptoms are: * High number of full CMS old space collections during early morning when we're doing bulkloads. Yes, bulkloads, not CQL, and only a few thrift insertions. * Really long stop-the-world GC events (I've seen up to

Re: Cluster status instability

2015-04-08 Thread Erik Forsberg
To elaborate a bit on what Marcin said: * Once a node starts to believe that a few other nodes are down, it seems to stay that way for a very long time (hours). I'm not even sure it will recover without a restart. * I've tried to stop then start gossip with nodetool on the node that thinks several

Re: Anonymous user in permissions system?

2015-02-16 Thread Erik Forsberg
On 2015-02-05 12:39, Carlos Rolo wrote: > Hello Erik, > > It seems possible, refer to the following documentation to see if it > fits your needs: > http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secureInternalAuthenticationTOC.html > http://www.datastax.com/documentation/ca

changes to metricsReporterConfigFile requires restart of cassandra?

2015-02-11 Thread Erik Forsberg
Hi! I was pleased to find out that Cassandra 2.0.x has added support for pluggable metrics export, which even includes a Graphite metrics sender. Question: Will changes to the metricsReporterConfigFile require a restart of Cassandra to take effect? I.e., if I want to add a new exported metric to

Anonymous user in permissions system?

2015-02-05 Thread Erik Forsberg
Hi! Is there such a thing as the anonymous/unauthenticated user in the cassandra permissions system? What I would like to do is to grant select, i.e. provide read-only access, to users which have not presented a username and password. Then grant update/insert to other users which have presented

Re: Working with legacy data via CQL

2014-11-19 Thread Erik Forsberg
On 2014-11-19 01:37, Robert Coli wrote: > > Thanks, I can reproduce the issue with that, and I should be able to > look into it tomorrow. FWIW, I believe the issue is server-side, > not in the driver. I may be able to suggest a workaround once I > figure out what's going on. > >

Re: Working with legacy data via CQL

2014-11-17 Thread Erik Forsberg
On 2014-11-17 09:56, Erik Forsberg wrote: > On 2014-11-15 01:24, Tyler Hobbs wrote: >> What version of cassandra did you originally create the column family >> in? Have you made any schema changes to it through cql or >> cassandra-cli, or has it always been exactly the s

Re: Working with legacy data via CQL

2014-11-17 Thread Erik Forsberg
as been around since 2011. So CF was probably created in Cassandra 0.7 or 0.8 via thrift calls from pycassa, and I don't think there have been any schema changes to it since. Thanks, \EF > > On Wed, Nov 12, 2014 at 2:06 AM, Erik Forsberg <forsb...@opera.com> wrote:

Re: Working with legacy data via CQL

2014-11-12 Thread Erik Forsberg
On 2014-11-11 19:40, Alex Popescu wrote: > On Tuesday, November 11, 2014, Erik Forsberg <forsb...@opera.com> wrote: > > > You'll have better chances to get an answer about the Python driver on > its own mailing > list > https://groups.google.com

Working with legacy data via CQL

2014-11-11 Thread Erik Forsberg
Hi! I have some data in a table created using thrift. In cassandra-cli, the 'show schema' output for this table is: create column family Users with column_type = 'Standard' and comparator = 'AsciiType' and default_validation_class = 'UTF8Type' and key_validation_class = 'LexicalUUIDType'

Restart joining node

2014-09-20 Thread Erik Forsberg
Hi! On the same subject as before - due to full disk during bootstrap, my joining nodes are stuck. What's the correct procedure here, will a plain restart of the node do the right thing, i.e. continue where bootstrap stopped, or is it better to clean the data directories before new start of daemon

Running out of disk at bootstrap in low-disk situation

2014-09-20 Thread Erik Forsberg
Hi! We have unfortunately managed to put ourselves in a situation where we are really close to full disks on our existing 27 nodes. We are now trying to add 15 more nodes, but running into problems with out of disk space on the new nodes while joining. We're using vnodes, on Cassandra 1.2.18 (ye

Re: LeveledCompaction, streaming bulkload, and lot's of small sstables

2014-08-20 Thread Erik Forsberg
On 2014-08-18 19:52, Robert Coli wrote: > On Mon, Aug 18, 2014 at 6:21 AM, Erik Forsberg <forsb...@opera.com> wrote: > > Is there some configuration knob I can tune to make this happen faster? > I'm getting a bit confused by the description for min_sstab

LeveledCompaction, streaming bulkload, and lot's of small sstables

2014-08-18 Thread Erik Forsberg
Hi! I'm bulkloading via streaming from Hadoop to my Cassandra cluster. This results in a rather large set of relatively small (~1MiB) sstables as the number of mappers that generate sstables on the hadoop cluster is high. With SizeTieredCompactionStrategy, the cassandra cluster would quickly comp

Running sstableloader from live Cassandra server

2014-08-16 Thread Erik Forsberg
Hi! I'm looking into moving some data from one Cassandra cluster to another, both of them running Cassandra 1.2.13 (or maybe some later 1.2 version if that helps me avoid some fatal bug). Sstableloader will probably be the right thing for me, and given the size of my tables, I will want to run the

sstableloader and ttls

2014-08-16 Thread Erik Forsberg
Hi! If I use sstableloader to load data to a cluster, and the source sstables contain some columns where the TTL has expired, i.e. the sstable has not yet been compacted - will those entries be properly removed on the destination side? Thanks, \EF

Re: EOFException in bulkloader, then IllegalStateException

2014-01-27 Thread Erik Forsberg
On 2014-01-27 12:56, Erik Forsberg wrote: This is on Cassandra 1.2.1. I know that's pretty old, but I would like to avoid upgrading until I have made this migration from old to new hardware. Upgrading to 1.2.13 might be an option. Update: Exactly the same behaviour on Cassandra 1

EOFException in bulkloader, then IllegalStateException

2014-01-27 Thread Erik Forsberg
Hi! I'm bulkloading from Hadoop to Cassandra. Currently in the process of moving to new hardware for both Hadoop and Cassandra, and while testrunning bulkload, I see the following error: Exception in thread "Streaming to /2001:4c28:1:413:0:1:1:12:1" java.lang.RuntimeException: java.io.EOFExc

Graveyard compactions, when do they occur?

2012-03-27 Thread Erik Forsberg
Hi! I was trying out the "truncate" command in cassandra-cli. http://wiki.apache.org/cassandra/CassandraCli08 says "A snapshot of the data is created, which is deleted asyncronously during a 'graveyard' compaction." When do "graveyard" compactions happen? Do I have to trigger them somehow?

Re: sstable size increase at compaction

2012-03-21 Thread Erik Forsberg
On 2012-03-21 16:36, Erik Forsberg wrote: Hi! We're using the bulkloader to load data to Cassandra. During and after bulkloading, the minor compaction process seems to result in larger sstables being created. An example: This is on Cassandra 1.1, btw. \EF

sstable size increase at compaction

2012-03-21 Thread Erik Forsberg
Hi! We're using the bulkloader to load data to Cassandra. During and after bulkloading, the minor compaction process seems to result in larger sstables being created. An example: INFO [CompactionExecutor:105] 2012-03-21 15:18:46,608 CompactionTask.java (line 115) Compacting [SSTableReader(p

On Bloom filters and Key Cache

2012-03-21 Thread Erik Forsberg
Hi! We're currently testing Cassandra with a large number of row keys per node - nodetool cfstats approximated number of keys to something like 700M per node. This seems to have caused a very large heap consumption. After reading http://wiki.apache.org/cassandra/LargeDataSetConsiderations I
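To get a feel for why roughly 700M row keys per node can translate into a large heap footprint, here is a rough sketch based on the textbook Bloom filter sizing formula (bits per key = -ln(p) / (ln 2)^2). The false-positive rates used below are illustrative assumptions, not values taken from the message, and Cassandra's actual filter sizing may differ from this idealized estimate.

```python
import math


def bloom_bits_per_key(fp_rate: float) -> float:
    """Optimal bits per key for an ideal Bloom filter with the given
    false-positive rate (textbook formula, not Cassandra-specific)."""
    return -math.log(fp_rate) / (math.log(2) ** 2)


def bloom_size_bytes(num_keys: int, fp_rate: float) -> int:
    """Approximate size in bytes of an ideal Bloom filter for num_keys keys."""
    return math.ceil(num_keys * bloom_bits_per_key(fp_rate) / 8)


if __name__ == "__main__":
    keys_per_node = 700_000_000  # figure quoted in the message
    # Hypothetical false-positive rates, chosen only for illustration.
    for fp_rate in (0.1, 0.01, 0.001):
        size_mib = bloom_size_bytes(keys_per_node, fp_rate) / 2**20
        print(f"fp_rate={fp_rate}: ~{size_mib:.0f} MiB")
```

The estimate scales linearly with the number of keys, so a looser false-positive rate is the main lever for shrinking it in this idealized model.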

Re: Max TTL?

2012-02-21 Thread Erik Forsberg
On 2012-02-20 21:20, aaron morton wrote: Nothing obvious. Samarth (working on same project) found that his patch to CASSANDRA-3754 was cleaned up a bit too much, which caused a negative ttl. https://issues.apache.org/jira/browse/CASSANDRA-3754?focusedCommentId=13212395&page=com.atlassian.jir

Max TTL?

2012-02-20 Thread Erik Forsberg
Hi! When setting ttl on columns, is there a maximum value (other than MAXINT, 2**31-1) that can be used? I have a very odd behaviour here, where I try to set ttl to 9 622 973 (~111 days) which works, but setting it to 11 824 305 (~137 days) does not - it seems columns are deleted instantly
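As a quick sanity check of the figures in the message, the sketch below converts the two TTL values from seconds to days and compares them with the signed 32-bit limit the message mentions. It only verifies the arithmetic; it does not model how Cassandra handles TTLs internally, and the helper name `ttl_days` is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MAX_INT32 = 2**31 - 1  # the MAXINT bound mentioned in the message


def ttl_days(ttl_seconds: int) -> float:
    """Convert a TTL given in seconds to days."""
    return ttl_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for ttl in (9_622_973, 11_824_305, MAX_INT32):
        fits = ttl <= MAX_INT32
        print(f"{ttl} s = {ttl_days(ttl):.1f} days, fits in int32: {fits}")
```

Both values are far below the 32-bit limit, so integer overflow of the TTL value alone would not account for the difference in behaviour.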

Streaming sessions from BulkOutputFormat job being listed long after they were killed

2012-02-17 Thread Erik Forsberg
Hi! If I run a hadoop job that uses BulkOutputFormat to write data to Cassandra, and that hadoop job is aborted, i.e. streaming sessions are not completed, it seems like the streaming sessions hang around for a very long time, I've observed at least 12-15h, in output from 'nodetool netstats'.

Recommended configuration for good streaming performance?

2012-02-02 Thread Erik Forsberg
Hi! We're experimenting with streaming from Hadoop to Cassandra using BulkOutputFormat, on the cassandra-1.1 branch. Are there any specific settings we should tune on the Cassandra servers in order to get the best streaming performance? Our Cassandra hardware is 16 core (including HT cores) wi

Can I use BulkOutputFormat from 1.1 to load data to older Cassandra versions?

2012-01-08 Thread Erik Forsberg
Hi! Can the new BulkOutputFormat (https://issues.apache.org/jira/browse/CASSANDRA-3045) be used to load data to servers running cassandra 0.8.7 and/or Cassandra 1.0.6? I'm thinking of using jar files from the development version to load data onto a production cluster which I want to keep on

Re: Multiple large disks in server - setup considerations

2011-06-07 Thread Erik Forsberg
e the fastest way to get the node up to speed with the rest of the cluster? Thanks, \EF > > On Tue, May 31, 2011 at 7:47 AM, Erik Forsberg > wrote: > > Hi! > > > > I'm considering setting up a small (4-6 nodes) Cassandra cluster on > > machines that eac

Multiple large disks in server - setup considerations

2011-05-31 Thread Erik Forsberg
I'm limiting the size of the total amount of data in the largest CF at compaction to, hmm.. the free space on the disk with most free space, correct? Comments welcome! Thanks, \EF -- Erik Forsberg Developer, Opera Software - http://www.opera.com/