RE: Why "select count(*) from .." hangs ?
Hi Shahab,

Are you using anything in the WHERE clause of the query? If not, you are doing a full scan of your data. In iteration 8 it will scan 1 500 000 entries, and the default time-out value is pretty low. If you do

select count(*) from traffic_by_day where segment_id = 1 and day = 1

it should still work (and very fast). If you really need the count, count your data as you insert it, incrementing a counter with the number of records you inserted. That way, instead of doing a count(*), you just read the current counter value.

Kind regards,
Pieter

From: shahab [mailto:shahab.mok...@gmail.com]
Sent: dinsdag 25 maart 2014 17:19
To: user@cassandra.apache.org
Subject: Re: Why "select count(*) from .." hangs ?

Thanks. I run it on a Linux server: dual-processor Intel(R) Xeon(R) CPU E5440 @ 2.83GHz, 4 cores each, and 8 GB RAM.

Just to give an example of the data inserted:

INSERT INTO traffic_by_day (segment_id, day, event_time, traffic_value) VALUES (100, 84, '2013-04-03 07:02:00', 79);

Here is the schema:

CREATE TABLE traffic_by_day (
  segment_id int,
  day int,
  event_time timestamp,
  traffic_value int,
  PRIMARY KEY ((segment_id, day), event_time)
) WITH bloom_filter_fp_chance=0.01
  AND caching='KEYS_ONLY'
  AND comment=''
  AND dclocal_read_repair_chance=0.00
  AND gc_grace_seconds=864000
  AND index_interval=128
  AND read_repair_chance=0.10
  AND replicate_on_write='true'
  AND populate_io_cache_on_flush='false'
  AND default_time_to_live=0
  AND speculative_retry='99.0PERCENTILE'
  AND memtable_flush_period_in_ms=0
  AND compaction={'class': 'SizeTieredCompactionStrategy'}
  AND compression={'sstable_compression': 'LZ4Compressor'};

On Tue, Mar 25, 2014 at 4:58 PM, Michael Shuler <mich...@pbandjelly.org> wrote:
On 03/25/2014 10:36 AM, shahab wrote:
In our application, we need to insert roughly 30 sensor data points every 30 seconds (basically we need to store time-series data).
I wrote a simple Java program to insert 30 random data points every 30 seconds for 10 iterations, and measured the number of entries in the table after each insertion. But after iteration 8 (i.e. inserting 150 sensor data points), "select count(*) ..." throws a time-out exception and doesn't work anymore. I even tried to execute "select count(*) ..." using the DataStax DevCenter GUI, but I got the same result.

If you could post your schema, folks may be able to help a bit better. Your C* version couldn't hurt.

cqlsh> DESC KEYSPACE $your_ks;

--
Kind regards,
Michael
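Pieter's counter suggestion can be sketched as follows. In a real deployment the map below would be a Cassandra counter table updated alongside every insert (the table name and statement would be your own — they are not from this thread); here a plain in-memory map keyed the same way as the partition key illustrates the idea of reading a count in O(1) instead of scanning.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: maintain a running count per (segment_id, day) partition at
// insert time, instead of running "select count(*)" (a full scan).
// In production this map would be a Cassandra counter column, e.g.
// "UPDATE traffic_counts SET n = n + 1 WHERE ..." -- that table name
// is hypothetical, not something from the thread.
public class TrafficCounter {
    private final Map<String, Long> countsByPartition = new HashMap<>();

    // Call alongside every INSERT INTO traffic_by_day(...).
    public void recordInsert(int segmentId, int day) {
        // Mirrors PRIMARY KEY ((segment_id, day), event_time).
        countsByPartition.merge(segmentId + ":" + day, 1L, Long::sum);
    }

    // O(1) read of the count for one partition -- no scan, no timeout.
    public long count(int segmentId, int day) {
        return countsByPartition.getOrDefault(segmentId + ":" + day, 0L);
    }

    public long total() {
        return countsByPartition.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        TrafficCounter c = new TrafficCounter();
        // 10 iterations of 30 sensor rows, as in shahab's test.
        for (int i = 0; i < 10; i++)
            for (int s = 0; s < 30; s++)
                c.recordInsert(100 + s, 84);
        System.out.println(c.total()); // 300
    }
}
```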
RE: Getting into Too many open files issues
Hi Murthy,

32768 is a bit low (I know the DataStax docs recommend this). Our production env is now running on 1kk (1 000 000), or you can even set it to unlimited.

Pieter

From: Murthy Chelankuri [mailto:kmurt...@gmail.com]
Sent: donderdag 7 november 2013 12:46
To: user@cassandra.apache.org
Subject: Re: Getting into Too many open files issues

Thanks Pieter for the quick reply.

I have downloaded the tar ball and changed limits.conf as per the documentation, like below:

* soft nofile 32768
* hard nofile 32768
root soft nofile 32768
root hard nofile 32768
* soft memlock unlimited
* hard memlock unlimited
root soft memlock unlimited
root hard memlock unlimited
* soft as unlimited
* hard as unlimited
root soft as unlimited
root hard as unlimited
root soft/hard nproc 32000

For some reason, within less than an hour the Cassandra node opens 32768 files, and Cassandra stops responding after that. It is still not clear why Cassandra is opening that many files and not closing them properly (does the latest Cassandra 2.0.1 version have some bugs?).

What I have been experimenting with is 300 writes per sec and 500 reads per sec, on a 2-node cluster with 8-core CPUs and 32 GB RAM (virtual machines).

Do we need to increase the nofile limit to more than 32768?

On Thu, Nov 7, 2013 at 4:55 PM, Pieter Callewaert <pieter.callewa...@be-mobile.be> wrote:
Hi Murthy,

Did you do a package install (.deb?) or did you download the tar? If the latter, you have to adjust the limits.conf file (/etc/security/limits.conf) to raise the nofile limit (number of open files) for the cassandra user. If you are using the .deb package, the limit is already raised to 100 000 files (can be found in /etc/init.d/cassandra, FD_LIMIT). However, with 2.0.x I had to raise it to 1 000 000 because 100 000 was too low.
Kind regards,
Pieter Callewaert

From: Murthy Chelankuri [mailto:kmurt...@gmail.com]
Sent: donderdag 7 november 2013 12:15
To: user@cassandra.apache.org
Subject: Getting into Too many open files issues

I have been experimenting with the latest Cassandra version for storing huge amounts of data in our application. Writes are doing well, but when it comes to reads I have observed that Cassandra runs into "too many open files" issues. When I check the logs, it is not able to open the Cassandra data files any more because of the file descriptor limit.

Can someone suggest what I am doing wrong, or what issues could cause the read operations to lead to the "Too many open files" issue?
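A quick way to confirm from the JVM side which nofile limit the Cassandra process actually ended up with is to parse the "Max open files" row of `/proc/<pid>/limits` (Linux only) — the same file quoted later in this archive. The helper name below is hypothetical; the `/proc` column layout is the standard one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Sketch: read the effective soft nofile limit of the current process
// by parsing /proc/self/limits, so you can verify that limits.conf /
// FD_LIMIT actually took effect. parseMaxOpenFiles is an illustrative
// helper, not a Cassandra API.
public class NofileCheck {
    // Returns the soft limit from a /proc/<pid>/limits dump, or -1 if absent.
    static long parseMaxOpenFiles(String limitsText) {
        for (String line : limitsText.split("\n")) {
            if (line.startsWith("Max open files")) {
                // Columns after the label: soft limit, hard limit, units.
                String[] cols = line.substring("Max open files".length()).trim().split("\\s+");
                return Long.parseLong(cols[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        String text = new String(Files.readAllBytes(Paths.get("/proc/self/limits")));
        System.out.println("soft nofile limit: " + parseMaxOpenFiles(text));
    }
}
```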
RE: Getting into Too many open files issues
Hi Murthy,

Did you do a package install (.deb?) or did you download the tar? If the latter, you have to adjust the limits.conf file (/etc/security/limits.conf) to raise the nofile limit (number of open files) for the cassandra user. If you are using the .deb package, the limit is already raised to 100 000 files (can be found in /etc/init.d/cassandra, FD_LIMIT). However, with 2.0.x I had to raise it to 1 000 000 because 100 000 was too low.

Kind regards,
Pieter Callewaert

From: Murthy Chelankuri [mailto:kmurt...@gmail.com]
Sent: donderdag 7 november 2013 12:15
To: user@cassandra.apache.org
Subject: Getting into Too many open files issues

I have been experimenting with the latest Cassandra version for storing huge amounts of data in our application. Writes are doing well, but when it comes to reads I have observed that Cassandra runs into "too many open files" issues. When I check the logs, it is not able to open the Cassandra data files any more because of the file descriptor limit.

Can someone suggest what I am doing wrong, or what issues could cause the read operations to lead to the "Too many open files" issue?
RE: OpsCenter not connecting to Cluster
Hi Nigel,

I currently have a similar problem. However, it has only been reproduced on Ubuntu... Are you using hsha as rpc_server_type? http://stackoverflow.com/questions/19633980/adding-cluster-error-creating-cluster-call-to-cluster-configs-timed-out is a guy with the same problem, showing how to reproduce it. I know the people at DataStax are investigating this, but there is no fix yet...

Kind regards,
Pieter Callewaert

-----Original Message-----
From: Nigel LEACH [mailto:nigel.le...@uk.bnpparibas.com]
Sent: dinsdag 29 oktober 2013 18:24
To: user@cassandra.apache.org
Subject: OpsCenter not connecting to Cluster

Cassandra 2.0.1, OpsCenter 3.2.2, Java 1.7.0_25, RHEL 6.4

This is a new test cluster with just three nodes, two seed nodes, SSL turned off, and GossipingPropertyFileSnitch. Pretty much an out-of-the-box environment, with both Cassandra and OpsCenter from the DataStax yum repository. Cassandra seems fine, and OpsCenter is installed on a seed node. The OpsCenter GUI comes up, but is unable to connect to the cluster; I get this error:

INFO: Starting factory
INFO: will retry in 2 seconds
DEBUG: Problem while pinging node: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/opscenterd/ThriftService.py", line 157, in checkThriftConnection
UserError: User aborted connection: Shutdown requested.
WARN: ProcessingError while calling CreateClusterConfController: Unable to connect to cluster
INFO: Stopping factory

Not getting very far troubleshooting it; any clues would be much appreciated. Should I maybe try installing the OpsCenter agent manually?

Many Thanks, Nigel

___
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorised copying, disclosure or distribution of the material in this e-mail is prohibited. Please refer to http://www.bnpparibas.co.uk/en/email-disclaimer/ for additional disclosures.
RE: Too many open files (Cassandra 2.0.1)
Investigated a bit more:

- I can reproduce it; it has already happened on several nodes when I do some stress testing (5 selects spread over multiple threads).
- The "Unexpected exception in the selector loop" seems not related to the Too many open files; it just happens.
- It's not socket related.
- Using Oracle Java(TM) SE Runtime Environment (build 1.7.0_40-b43).
- Using multiple data directories (maybe related?).

I'm stuck at the moment. I don't know if I should try DEBUG logging, because it may be too much information.

Kind regards,
Pieter Callewaert
Web & IT engineer
Web: www.be-mobile.be
Email: pieter.callewa...@be-mobile.be
Tel: +32 9 330 51 80

From: Pieter Callewaert [mailto:pieter.callewa...@be-mobile.be]
Sent: dinsdag 29 oktober 2013 13:40
To: user@cassandra.apache.org
Subject: Too many open files (Cassandra 2.0.1)

Hi,

I've noticed some nodes in our cluster are dying after some period of time.

WARN [New I/O server boss #17] 2013-10-29 12:22:20,725 Slf4JLogger.java (line 76) Failed to accept a connection.
java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
    at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

And other exceptions related to the same cause. Now, as we use the Cassandra package, the nofile limit is raised to 100 000.
To double-check whether this is correct:

root@de-cass09 ~ # cat /proc/18332/limits
Limit            Soft Limit   Hard Limit   Units
...
Max open files   100000       100000       files
...

Now I check how many files are open:

root@de-cass09 ~ # lsof -n -p 18332 | wc -l
100038

This seems an awful lot for size-tiered compaction...? When I checked the list, I noticed one (deleted) file passed by a lot:

...
java 18332 cassandra 4704r REG 8,1 10911921661 2147483839 /data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)
java 18332 cassandra 4705r REG 8,1 10911921661 2147483839 /data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)
...

Actually, if I count specifically for this file:

root@de-cass09 ~ # lsof -n -p 18332 | grep mapdata040-hos-jb-7648-Data.db | wc -l
52707

Other nodes are around a total of 350 open files... Any idea why the number of open files is so high?

The first exception I see is this:

WARN [New I/O worker #8] 2013-10-29 12:09:34,440 Slf4JLogger.java (line 76) Unexpected exception in the selector loop.
java.lang.NullPointerException
    at sun.nio.ch.EPollArrayWrapper.setUpdateEvents(EPollArrayWrapper.java:178)
    at sun.nio.ch.EPollArrayWrapper.add(EPollArrayWrapper.java:227)
    at sun.nio.ch.EPollSelectorImpl.implRegister(EPollSelectorImpl.java:164)
    at sun.nio.ch.SelectorImpl.register(SelectorImpl.java:133)
    at java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:209)
    at org.jboss.netty.channel.socket.nio.NioWorker$RegisterTask.run(NioWorker.java:151)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Several minutes later I get Too many open files.

Specs: 12-node cluster with Ubuntu 12.04 LTS, Cassandra 2.0.1 (DataStax packages), using a JBOD of 2 disks. JNA enabled.

Any suggestions?

Kind regards,
Pieter Callewaert
Web & IT engineer
Web: www.be-mobile.be
Email: pieter.callewa...@be-mobile.be
Tel: +32 9 330 51 80
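The leak above was found by grepping lsof output for one sstable. The same filtering can be done programmatically on captured lsof lines; the helper below is an illustrative equivalent of `lsof -n -p <pid> | grep <file> | grep '(deleted)' | wc -l`, not anything from Cassandra.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: count how many descriptors in lsof output still point at an
// already-deleted sstable -- the symptom investigated in this thread.
public class DeletedHandleCounter {
    static long countDeletedHandles(List<String> lsofLines, String sstableName) {
        return lsofLines.stream()
                .filter(l -> l.contains(sstableName))
                .filter(l -> l.contains("(deleted)"))
                .count();
    }

    public static void main(String[] args) {
        // Sample lines modeled on the lsof output quoted in the thread.
        List<String> sample = Arrays.asList(
            "java 18332 cassandra 4704r REG 8,1 10911921661 /data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)",
            "java 18332 cassandra 4705r REG 8,1 10911921661 /data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)",
            "java 18332 cassandra   12r REG 8,1       4096 /data1/mapdata040/hos/mapdata040-hos-jb-7650-Data.db");
        System.out.println(countDeletedHandles(sample, "mapdata040-hos-jb-7648-Data.db")); // 2
    }
}
```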
Too many open files (Cassandra 2.0.1)
Hi,

I've noticed some nodes in our cluster are dying after some period of time.

WARN [New I/O server boss #17] 2013-10-29 12:22:20,725 Slf4JLogger.java (line 76) Failed to accept a connection.
java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
    at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

And other exceptions related to the same cause. Now, as we use the Cassandra package, the nofile limit is raised to 100 000. To double-check whether this is correct:

root@de-cass09 ~ # cat /proc/18332/limits
Limit            Soft Limit   Hard Limit   Units
...
Max open files   100000       100000       files
...

Now I check how many files are open:

root@de-cass09 ~ # lsof -n -p 18332 | wc -l
100038

This seems an awful lot for size-tiered compaction...? When I checked the list, I noticed one (deleted) file passed by a lot:

...
java 18332 cassandra 4704r REG 8,1 10911921661 2147483839 /data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)
java 18332 cassandra 4705r REG 8,1 10911921661 2147483839 /data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)
...

Actually, if I count specifically for this file:

root@de-cass09 ~ # lsof -n -p 18332 | grep mapdata040-hos-jb-7648-Data.db | wc -l
52707

Other nodes are around a total of 350 open files... Any idea why the number of open files is so high?

The first exception I see is this:

WARN [New I/O worker #8] 2013-10-29 12:09:34,440 Slf4JLogger.java (line 76) Unexpected exception in the selector loop.
java.lang.NullPointerException
    at sun.nio.ch.EPollArrayWrapper.setUpdateEvents(EPollArrayWrapper.java:178)
    at sun.nio.ch.EPollArrayWrapper.add(EPollArrayWrapper.java:227)
    at sun.nio.ch.EPollSelectorImpl.implRegister(EPollSelectorImpl.java:164)
    at sun.nio.ch.SelectorImpl.register(SelectorImpl.java:133)
    at java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:209)
    at org.jboss.netty.channel.socket.nio.NioWorker$RegisterTask.run(NioWorker.java:151)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Several minutes later I get Too many open files.

Specs: 12-node cluster with Ubuntu 12.04 LTS, Cassandra 2.0.1 (DataStax packages), using a JBOD of 2 disks. JNA enabled.

Any suggestions?

Kind regards,
Pieter Callewaert
Web & IT engineer
Web: www.be-mobile.be
Email: pieter.callewa...@be-mobile.be
Tel: +32 9 330 51 80
RE: default_time_to_live
Thanks, it works perfectly with ALTER TABLE. Stupid that I didn't think of this. Maybe I overlooked it, but perhaps this should be added to the docs. Really a great feature!

Kind regards,
Pieter Callewaert
Web & IT engineer
Web: www.be-mobile.be
Email: pieter.callewa...@be-mobile.be
Tel: +32 9 330 51 80

From: Sylvain Lebresne [mailto:sylv...@datastax.com]
Sent: dinsdag 1 oktober 2013 19:10
To: user@cassandra.apache.org
Subject: Re: default_time_to_live

You're not supposed to change table settings by modifying system.schema_columnfamilies, as this skips proper propagation of the change. Instead, you're supposed to do an ALTER TABLE, so something like:

ALTER TABLE hol WITH default_time_to_live = 10;

That being said, if you restart the node on which you've made the update, the change "should" be picked up and propagated to all nodes. Still not a bad idea to do the ALTER TABLE to make sure everything is set right.

--
Sylvain

On Tue, Oct 1, 2013 at 10:50 AM, Pieter Callewaert <pieter.callewa...@be-mobile.be> wrote:
Hi,

We are starting up a new cluster with Cassandra 2.0.0, and one of the features we were interested in is the per-CF TTL (https://issues.apache.org/jira/browse/CASSANDRA-3974). I didn't find any command in CQL to set this value, so I used the following:

UPDATE system.schema_columnfamilies SET default_time_to_live = 10 WHERE keyspace_name = 'testschema' AND columnfamily_name = 'hol';

Confirming it is set:

cqlsh:testschema> select default_time_to_live from system.schema_columnfamilies where keyspace_name = 'testschema' and columnfamily_name = 'hol';

 default_time_to_live
----------------------
                   10

Then I insert some dummy data, but it never expires... Using the ttl function I get this:

cqlsh:testschema> select ttl(coverage) from hol;

 ttl(coverage)
---------------
          null

Am I doing something wrong? Or is this a bug?
Kind regards,
Pieter Callewaert
Web & IT engineer
Web: www.be-mobile.be
Email: pieter.callewa...@be-mobile.be
Tel: +32 9 330 51 80
default_time_to_live
Hi,

We are starting up a new cluster with Cassandra 2.0.0, and one of the features we were interested in is the per-CF TTL (https://issues.apache.org/jira/browse/CASSANDRA-3974). I didn't find any command in CQL to set this value, so I used the following:

UPDATE system.schema_columnfamilies SET default_time_to_live = 10 WHERE keyspace_name = 'testschema' AND columnfamily_name = 'hol';

Confirming it is set:

cqlsh:testschema> select default_time_to_live from system.schema_columnfamilies where keyspace_name = 'testschema' and columnfamily_name = 'hol';

 default_time_to_live
----------------------
                   10

Then I insert some dummy data, but it never expires... Using the ttl function I get this:

cqlsh:testschema> select ttl(coverage) from hol;

 ttl(coverage)
---------------
          null

Am I doing something wrong? Or is this a bug?

Kind regards,
Pieter Callewaert
Web & IT engineer
Web: www.be-mobile.be
Email: pieter.callewa...@be-mobile.be
Tel: +32 9 330 51 80
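For intuition about what `ttl(column)` should return once `default_time_to_live` is applied via ALTER TABLE: it is the number of seconds remaining until expiry, and a null ttl (as in the thread) means the write carried no TTL at all. The helper below is an illustrative model of that arithmetic, not Cassandra's internal code; the convention that write timestamps are in microseconds is Cassandra's.

```java
// Sketch: model of the remaining-TTL calculation behind cqlsh's ttl().
public class TtlModel {
    // writeTimeMicros: write timestamp in microseconds (Cassandra convention);
    // ttlSeconds: e.g. the table's default_time_to_live;
    // nowMillis: current wall-clock time in milliseconds.
    static long remainingTtlSeconds(long writeTimeMicros, int ttlSeconds, long nowMillis) {
        long expiryMillis = writeTimeMicros / 1000 + ttlSeconds * 1000L;
        return Math.max(0, (expiryMillis - nowMillis) / 1000);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long writtenFourSecondsAgo = (now - 4000) * 1000; // micros
        // With default_time_to_live = 10 as in the thread, ~6 seconds remain.
        System.out.println(remainingTtlSeconds(writtenFourSecondsAgo, 10, now));
    }
}
```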
RE: cryptic exception in Hadoop/Cassandra job
I have the same issue (but with sstableloaders). It should be fixed in the 1.2 release (https://issues.apache.org/jira/browse/CASSANDRA-4813).

Kind regards,
Pieter

-----Original Message-----
From: Brian Jeltema [mailto:brian.jelt...@digitalenvoy.net]
Sent: woensdag 30 januari 2013 13:58
To: user@cassandra.apache.org
Subject: Re: cryptic exception in Hadoop/Cassandra job

Cassandra 1.1.5, using BulkOutputFormat

Brian

On Jan 30, 2013, at 7:39 AM, Pieter Callewaert wrote:

> Hi Brian,
>
> Which version of cassandra are you using? And are you using the BOF to write to Cassandra?
>
> Kind regards,
> Pieter
>
> -----Original Message-----
> From: Brian Jeltema [mailto:brian.jelt...@digitalenvoy.net]
> Sent: woensdag 30 januari 2013 13:20
> To: user@cassandra.apache.org
> Subject: cryptic exception in Hadoop/Cassandra job
>
> I have a Hadoop/Cassandra map/reduce job that performs a simple transformation on a table with very roughly 1 billion columns spread across roughly 4 million rows. During reduction, I see a relative handful of the following:
>
> Exception in thread "Streaming to /10.4.0.3:1" java.lang.RuntimeException: java.io.EOFException
>     at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.EOFException
>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
>     at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
>     at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:104)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>     ... 3 more
>
> which ultimately leads to job failure. I can't tell if this is a bug in my code or in the underlying framework.
> Does anyone have suggestions on how to debug this?
>
> TIA
>
> Brian
RE: cryptic exception in Hadoop/Cassandra job
Hi Brian,

Which version of cassandra are you using? And are you using the BOF to write to Cassandra?

Kind regards,
Pieter

-----Original Message-----
From: Brian Jeltema [mailto:brian.jelt...@digitalenvoy.net]
Sent: woensdag 30 januari 2013 13:20
To: user@cassandra.apache.org
Subject: cryptic exception in Hadoop/Cassandra job

I have a Hadoop/Cassandra map/reduce job that performs a simple transformation on a table with very roughly 1 billion columns spread across roughly 4 million rows. During reduction, I see a relative handful of the following:

Exception in thread "Streaming to /10.4.0.3:1" java.lang.RuntimeException: java.io.EOFException
    at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.cassandra.streaming.FileStreamTask.receiveReply(FileStreamTask.java:194)
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:104)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more

which ultimately leads to job failure. I can't tell if this is a bug in my code or in the underlying framework. Does anyone have suggestions on how to debug this?

TIA

Brian
RE: idea drive layout - 4 drives + RAID question
We also have 4-disk nodes, and we use the following layout:

2 x OS + commit log in RAID 1
2 x data disks in RAID 0

This gives us the advantage that we never have to reinstall the node when a drive crashes.

Kind regards,
Pieter

From: Ran User [mailto:ranuse...@gmail.com]
Sent: dinsdag 30 oktober 2012 4:33
To: user@cassandra.apache.org
Subject: Re: idea drive layout - 4 drives + RAID question

Have you considered running RAID 10 for the data drives to improve MTBF? On one hand Cassandra is handling redundancy issues; on the other hand, reducing the frequency of dealing with failed nodes is attractive if it is cheap (switching RAID levels to 10). We have no experience with software RAID (we have always used hardware RAID with BBU). I'm assuming software RAID 1 or 10 (the mirroring part) is inherently reliable (perhaps minus some edge cases).

On Tue, Oct 30, 2012 at 1:07 AM, Tupshin Harper <tups...@tupshin.com> wrote:
I would generally recommend 1 drive for OS and commit log and a 3-drive RAID 0 for data. The RAID does give you a good performance benefit, and it can be convenient to have the OS on a separate drive for configuration ease and better MTBF.

-Tupshin

On Oct 29, 2012 8:56 PM, "Ran User" <ranuse...@gmail.com> wrote:
I was hoping to achieve approx. 2x IO (write and read) performance via RAID 0 (by accepting a higher MTBF). Do you believe the performance gains of RAID 0 are much lower and/or not worth it vs. the increased server failure rate?

From my understanding, RAID 10 would achieve the read performance benefits of RAID 0, but not the write benefits. I'm also considering RAID 10 to maximize server IO performance. Currently, we're working with 1 CF.

Thank you

On Mon, Oct 29, 2012 at 11:51 PM, Timmy Turner <timm.t...@gmail.com> wrote:
I'm not sure whether the RAID 0 gets you anything other than headaches should one of the drives fail.
You can already distribute the individual Cassandra column families on different drives by just setting up symlinks to the individual folders.

2012/10/30 Ran User <ranuse...@gmail.com>:
> For a server with 4 drive slots only, I'm thinking:
>
> either:
>
> - OS (1 drive)
> - Commit log (1 drive)
> - Data (2 drives, software RAID 0)
>
> vs
>
> - OS + Data (3 drives, software RAID 0)
> - Commit log (1 drive)
>
> or something else?
>
> Also, if I can spare the wasted storage, would RAID 10 for Cassandra data improve read performance and have no effect on write performance?
>
> Thank you!
RE: frequent node up/downs
Hi,

Had the same problem this morning; it seems related to the leap second bug. Rebooting the nodes fixed it for me, but there seems to be a fix that does not require rebooting the server.

Kind regards,
Pieter

From: feedly team [mailto:feedly...@gmail.com]
Sent: maandag 2 juli 2012 17:09
To: user@cassandra.apache.org
Subject: frequent node up/downs

Hello,

I recently set up a 2-node Cassandra cluster on dedicated hardware. In the logs there have been a lot of "InetAddress xxx is now dead" or UP messages. Comparing the log messages between the 2 nodes, they seem to coincide with extremely long ParNew collections; I have seen some of up to 50 seconds. The installation is pretty vanilla: I didn't change any settings, and the machines don't seem particularly busy. Cassandra is the only thing running on the machine, with an 8 GB heap; the machine has 64 GB of RAM, and CPU/IO usage looks pretty light. I do see a lot of 'Heap is xxx full. You may need to reduce memtable and/or cache sizes' messages. Would this help with the long ParNew collections? That message seems to be triggered on a full collection.
RE: forceUserDefinedCompaction in 1.1.0
Hi,

While I was typing my mail I had the idea to try the new directory layout. It seems you have to change the parameter settings from 1.0 to 1.1.

In 1.0:
Param 1:
Param 2:

In 1.1:
Param 1:
Param 2: /

Don't know if this is a bug or a breaking change?

Kind regards,
Pieter Callewaert

From: Pieter Callewaert [mailto:pieter.callewa...@be-mobile.be]
Sent: maandag 2 juli 2012 15:10
To: user@cassandra.apache.org
Subject: forceUserDefinedCompaction in 1.1.0

Hi guys,

We have a 6-node 1.0.9 cluster for production and a 3-node 1.1.0 cluster for testing the new version of Cassandra. In both we insert data in a particular CF, always with a TTL of 31 days. To clean up the files faster, we use forceUserDefinedCompaction to manually force compaction on old sstables in which we are sure the data has expired. In 1.0 this works perfectly, but in 1.1 the command executes without error, yet in the log of the node I see the following:

INFO [CompactionExecutor:546] 2012-07-02 15:05:11,837 CompactionManager.java (line 337) Will not compact MapData024CTwT/MapData024CTwT-HOS-hc-21150: it is not an active sstable
INFO [CompactionExecutor:546] 2012-07-02 15:05:11,838 CompactionManager.java (line 350) No file to compact for user defined compaction

I am pretty sure the sstable is active, because it should contain partly expired data and partly data which is still live. Does this have something to do with the new directory structure in 1.1? Or have the parameters of the function changed?

Kind regards,
Pieter Callewaert
forceUserDefinedCompaction in 1.1.0
Hi guys,

We have a 6-node 1.0.9 cluster for production and a 3-node 1.1.0 cluster for testing the new version of Cassandra. In both we insert data in a particular CF, always with a TTL of 31 days. To clean up the files faster, we use forceUserDefinedCompaction to manually force compaction on old sstables in which we are sure the data has expired. In 1.0 this works perfectly, but in 1.1 the command executes without error, yet in the log of the node I see the following:

INFO [CompactionExecutor:546] 2012-07-02 15:05:11,837 CompactionManager.java (line 337) Will not compact MapData024CTwT/MapData024CTwT-HOS-hc-21150: it is not an active sstable
INFO [CompactionExecutor:546] 2012-07-02 15:05:11,838 CompactionManager.java (line 350) No file to compact for user defined compaction

I am pretty sure the sstable is active, because it should contain partly expired data and partly data which is still live. Does this have something to do with the new directory structure in 1.1? Or have the parameters of the function changed?

Kind regards,
Pieter Callewaert
RE: supercolumns with TTL columns not being compacted correctly
Hi,

This means I have a serious flaw in my column family design. At this moment I am storing sensor data in the database: the row key is the sensor ID, the supercolumn is the timestamp, and the different columns in the supercolumn are sensor readings. This means that with my current design it is almost impossible to 'delete' data from disk unless I do a major compaction over all the sstables (all sstables will contain the same row keys)? And at this moment new data is being loaded every 5 minutes, which means it would be big trouble to do the major compaction. Is what I am thinking correct?

Kind regards,
Pieter Callewaert

From: Yuki Morishita [mailto:mor.y...@gmail.com]
Sent: dinsdag 22 mei 2012 16:21
To: user@cassandra.apache.org
Subject: Re: supercolumns with TTL columns not being compacted correctly

Data will not be deleted when those keys appear in other sstables outside of the compaction. This is to prevent obsolete data from appearing again.

yuki

On Tuesday, May 22, 2012 at 7:37 AM, Pieter Callewaert wrote:

Hi Samal,

Thanks for your time looking into this. I force the compaction by using forceUserDefinedCompaction on only that particular sstable. This guarantees me that the new sstable being written only contains the data from the old sstable. The data in the sstable is more than 31 days old and gc_grace is 0, but still the data from the old sstable is being written to the new one, while I am 100% sure all the data is invalid.

Kind regards,
Pieter Callewaert

From: samal [mailto:samalgo...@gmail.com]
Sent: dinsdag 22 mei 2012 14:33
To: user@cassandra.apache.org
Subject: Re: supercolumns with TTL columns not being compacted correctly

Data will remain till the next compaction but won't be available. Compaction will delete the old sstable and create a new one.

On 22-May-2012 5:47 PM, "Pieter Callewaert" <pieter.callewa...@be-mobile.be> wrote:
Hi,

I've had my suspicions for some months, but I think I am now sure about it.
Data is being written by the SSTableSimpleUnsortedWriter and loaded by the sstableloader. The data should stay alive for 31 days, so I use the following logic:

int ttl = 2678400;
long timestamp = System.currentTimeMillis() * 1000;
long expirationTimestampMS = (long) ((timestamp / 1000) + ((long) ttl * 1000));

And using this to write it:

sstableWriter.newRow(bytes(entry.id));
sstableWriter.newSuperColumn(bytes(superColumn));
sstableWriter.addExpiringColumn(nameTT, bytes(entry.aggregatedTTMs), timestamp, ttl, expirationTimestampMS);
sstableWriter.addExpiringColumn(nameCov, bytes(entry.observationCoverage), timestamp, ttl, expirationTimestampMS);
sstableWriter.addExpiringColumn(nameSpd, bytes(entry.speed), timestamp, ttl, expirationTimestampMS);

This works perfectly: data can be queried until 31 days have passed, then no results are returned, as expected. But the data stays on disk until the sstables are recompacted. One of our nodes (we have 6 in total) has the following sstables:

[cassandra@bemobile-cass3 ~]$ ls -hal /data/MapData007/HOS-* | grep G
-rw-rw-r--. 1 cassandra cassandra 103G May  3 03:19 /data/MapData007/HOS-hc-125620-Data.db
-rw-rw-r--. 1 cassandra cassandra 103G May 12 21:17 /data/MapData007/HOS-hc-163141-Data.db
-rw-rw-r--. 1 cassandra cassandra  25G May 15 06:17 /data/MapData007/HOS-hc-172106-Data.db
-rw-rw-r--. 1 cassandra cassandra  25G May 17 19:50 /data/MapData007/HOS-hc-181902-Data.db
-rw-rw-r--. 1 cassandra cassandra  21G May 21 07:37 /data/MapData007/HOS-hc-191448-Data.db
-rw-rw-r--. 1 cassandra cassandra 6.5G May 21 17:41 /data/MapData007/HOS-hc-193842-Data.db
-rw-rw-r--. 1 cassandra cassandra 5.8G May 22 11:03 /data/MapData007/HOS-hc-196210-Data.db
-rw-rw-r--. 1 cassandra cassandra 1.4G May 22 13:20 /data/MapData007/HOS-hc-196779-Data.db
-rw-rw-r--. 1 cassandra cassandra 401G Apr 16 08:33 /data/MapData007/HOS-hc-58572-Data.db
-rw-rw-r--. 1 cassandra cassandra 169G Apr 16 17:59 /data/MapData007/HOS-hc-61630-Data.db
-rw-rw-r--. 1 cassandra cassandra 173G Apr 17 03:46 /data/MapData007/HOS-hc-63857-Data.db
-rw-rw-r--. 1 cassandra cassandra 105G Apr 23 06:41 /data/MapData007/HOS-hc-87900-Data.db

As you can see, the following files should be invalid:

/data/MapData007/HOS-hc-58572-Data.db
/data/MapData007/HOS-hc-61630-Data.db
/data/MapData007/HOS-hc-63857-Data.db

because they were all written more than a month ago. gc_grace is 0, so that should also not be a problem.

As a test, I used forceUserDefinedCompaction on HOS-hc-61630-Data.db. The expected behavior is that an empty file is written, because all data in the sstable should be invalid. compactionstats gives:

compaction type   keyspace     column family   bytes compacted   bytes total    progress
Compaction        MapData007   HOS             11518215662       532355279724   2.16%

And when I ls the directory I find this:

-rw-r
RE: supercolumns with TTL columns not being compacted correctly
Hi Samal,

Thanks for your time looking into this. I force the compaction by using forceUserDefinedCompaction on only that particular sstable. This guarantees that the new sstable being written contains only the data from the old sstable. The data in the sstable is more than 31 days old and gc_grace is 0, but the data from the old sstable is still being written to the new one, while I am 100% sure all the data is invalid.

Kind regards,
Pieter Callewaert

From: samal [mailto:samalgo...@gmail.com]
Sent: dinsdag 22 mei 2012 14:33
To: user@cassandra.apache.org
Subject: Re: supercolumns with TTL columns not being compacted correctly

Data will remain till the next compaction but won't be available. Compaction will delete the old sstable and create a new one.

On 22-May-2012 5:47 PM, "Pieter Callewaert" <pieter.callewa...@be-mobile.be> wrote:

Hi,

I've had my suspicions for some months, but I think I am sure about it now. Data is being written by the SSTableSimpleUnsortedWriter and loaded by the sstableloader. The data should stay alive for 31 days, so I use the following logic:

int ttl = 2678400;
long timestamp = System.currentTimeMillis() * 1000;
long expirationTimestampMS = (long) ((timestamp / 1000) + ((long) ttl * 1000));

And I use this to write it:

sstableWriter.newRow(bytes(entry.id));
sstableWriter.newSuperColumn(bytes(superColumn));
sstableWriter.addExpiringColumn(nameTT, bytes(entry.aggregatedTTMs), timestamp, ttl, expirationTimestampMS);
sstableWriter.addExpiringColumn(nameCov, bytes(entry.observationCoverage), timestamp, ttl, expirationTimestampMS);
sstableWriter.addExpiringColumn(nameSpd, bytes(entry.speed), timestamp, ttl, expirationTimestampMS);

This works perfectly: data can be queried until 31 days have passed, after which no results are returned, as expected. But the data stays on disk until the sstables are recompacted. One of our nodes (we have 6 in total) has the following sstables:

[cassandra@bemobile-cass3 ~]$ ls -hal /data/MapData007/HOS-* | grep G
-rw-rw-r--. 1 cassandra cassandra 103G May  3 03:19 /data/MapData007/HOS-hc-125620-Data.db
-rw-rw-r--. 1 cassandra cassandra 103G May 12 21:17 /data/MapData007/HOS-hc-163141-Data.db
-rw-rw-r--. 1 cassandra cassandra  25G May 15 06:17 /data/MapData007/HOS-hc-172106-Data.db
-rw-rw-r--. 1 cassandra cassandra  25G May 17 19:50 /data/MapData007/HOS-hc-181902-Data.db
-rw-rw-r--. 1 cassandra cassandra  21G May 21 07:37 /data/MapData007/HOS-hc-191448-Data.db
-rw-rw-r--. 1 cassandra cassandra 6.5G May 21 17:41 /data/MapData007/HOS-hc-193842-Data.db
-rw-rw-r--. 1 cassandra cassandra 5.8G May 22 11:03 /data/MapData007/HOS-hc-196210-Data.db
-rw-rw-r--. 1 cassandra cassandra 1.4G May 22 13:20 /data/MapData007/HOS-hc-196779-Data.db
-rw-rw-r--. 1 cassandra cassandra 401G Apr 16 08:33 /data/MapData007/HOS-hc-58572-Data.db
-rw-rw-r--. 1 cassandra cassandra 169G Apr 16 17:59 /data/MapData007/HOS-hc-61630-Data.db
-rw-rw-r--. 1 cassandra cassandra 173G Apr 17 03:46 /data/MapData007/HOS-hc-63857-Data.db
-rw-rw-r--. 1 cassandra cassandra 105G Apr 23 06:41 /data/MapData007/HOS-hc-87900-Data.db

As you can see, the following files should be invalid:

/data/MapData007/HOS-hc-58572-Data.db
/data/MapData007/HOS-hc-61630-Data.db
/data/MapData007/HOS-hc-63857-Data.db

because they were all written more than a month ago. gc_grace is 0, so this should also not be a problem. As a test, I used forceUserDefinedCompaction on HOS-hc-61630-Data.db. The expected behavior would be an empty file being written, because all data in the sstable should be invalid. compactionstats gives:

compaction type  keyspace    column family  bytes compacted  bytes total   progress
Compaction       MapData007  HOS            11518215662      532355279724  2.16%

And when I ls the directory I find this:

-rw-rw-r--. 1 cassandra cassandra 3.9G May 22 14:12 /data/MapData007/HOS-tmp-hc-196898-Data.db

The sstable is being copied 1-to-1 to a new one. What am I missing here? TTL works perfectly, but is it a problem because the data is in a supercolumn, and so is never deleted from disk?

Kind regards

Pieter Callewaert | Web & IT engineer
Be-Mobile NV <http://www.be-mobile.be/> | TouringMobilis <http://www.touringmobilis.be/>
Technologiepark 12b - 9052 Ghent - Belgium
Tel + 32 9 330 51 80 | Fax + 32 9 330 51 81 | Cell + 32 473 777 121
supercolumns with TTL columns not being compacted correctly
Hi,

I've had my suspicions for some months, but I think I am sure about it now.

Data is being written by the SSTableSimpleUnsortedWriter and loaded by the sstableloader. The data should stay alive for 31 days, so I use the following logic:

int ttl = 2678400;
long timestamp = System.currentTimeMillis() * 1000;
long expirationTimestampMS = (long) ((timestamp / 1000) + ((long) ttl * 1000));

And I use this to write it:

sstableWriter.newRow(bytes(entry.id));
sstableWriter.newSuperColumn(bytes(superColumn));
sstableWriter.addExpiringColumn(nameTT, bytes(entry.aggregatedTTMs), timestamp, ttl, expirationTimestampMS);
sstableWriter.addExpiringColumn(nameCov, bytes(entry.observationCoverage), timestamp, ttl, expirationTimestampMS);
sstableWriter.addExpiringColumn(nameSpd, bytes(entry.speed), timestamp, ttl, expirationTimestampMS);

This works perfectly: data can be queried until 31 days have passed, after which no results are returned, as expected. But the data stays on disk until the sstables are recompacted. One of our nodes (we have 6 in total) has the following sstables:

[cassandra@bemobile-cass3 ~]$ ls -hal /data/MapData007/HOS-* | grep G
-rw-rw-r--. 1 cassandra cassandra 103G May  3 03:19 /data/MapData007/HOS-hc-125620-Data.db
-rw-rw-r--. 1 cassandra cassandra 103G May 12 21:17 /data/MapData007/HOS-hc-163141-Data.db
-rw-rw-r--. 1 cassandra cassandra  25G May 15 06:17 /data/MapData007/HOS-hc-172106-Data.db
-rw-rw-r--. 1 cassandra cassandra  25G May 17 19:50 /data/MapData007/HOS-hc-181902-Data.db
-rw-rw-r--. 1 cassandra cassandra  21G May 21 07:37 /data/MapData007/HOS-hc-191448-Data.db
-rw-rw-r--. 1 cassandra cassandra 6.5G May 21 17:41 /data/MapData007/HOS-hc-193842-Data.db
-rw-rw-r--. 1 cassandra cassandra 5.8G May 22 11:03 /data/MapData007/HOS-hc-196210-Data.db
-rw-rw-r--. 1 cassandra cassandra 1.4G May 22 13:20 /data/MapData007/HOS-hc-196779-Data.db
-rw-rw-r--. 1 cassandra cassandra 401G Apr 16 08:33 /data/MapData007/HOS-hc-58572-Data.db
-rw-rw-r--. 1 cassandra cassandra 169G Apr 16 17:59 /data/MapData007/HOS-hc-61630-Data.db
-rw-rw-r--. 1 cassandra cassandra 173G Apr 17 03:46 /data/MapData007/HOS-hc-63857-Data.db
-rw-rw-r--. 1 cassandra cassandra 105G Apr 23 06:41 /data/MapData007/HOS-hc-87900-Data.db

As you can see, the following files should be invalid:

/data/MapData007/HOS-hc-58572-Data.db
/data/MapData007/HOS-hc-61630-Data.db
/data/MapData007/HOS-hc-63857-Data.db

because they were all written more than a month ago. gc_grace is 0, so this should also not be a problem. As a test, I used forceUserDefinedCompaction on HOS-hc-61630-Data.db. The expected behavior would be an empty file being written, because all data in the sstable should be invalid. compactionstats gives:

compaction type  keyspace    column family  bytes compacted  bytes total   progress
Compaction       MapData007  HOS            11518215662      532355279724  2.16%

And when I ls the directory I find this:

-rw-rw-r--. 1 cassandra cassandra 3.9G May 22 14:12 /data/MapData007/HOS-tmp-hc-196898-Data.db

The sstable is being copied 1-to-1 to a new one. What am I missing here? TTL works perfectly, but is it a problem because the data is in a supercolumn, and so is never deleted from disk?

Kind regards

Pieter Callewaert | Web & IT engineer
Be-Mobile NV <http://www.be-mobile.be/> | TouringMobilis <http://www.touringmobilis.be/>
Technologiepark 12b - 9052 Ghent - Belgium
Tel + 32 9 330 51 80 | Fax + 32 9 330 51 81 | Cell + 32 473 777 121
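As a side note, the TTL arithmetic in the writer code above can be checked in isolation. The sketch below is plain Java with no Cassandra dependencies: the column timestamp is in microseconds, the TTL in seconds, and the expiration timestamp is in milliseconds. The helper name expirationMs is my own for illustration, not part of any Cassandra API.

```java
public class TtlMath {
    // Expiration in ms: write time (micros -> ms) plus TTL (seconds -> ms),
    // mirroring the expirationTimestampMS expression in the message above.
    static long expirationMs(long timestampMicros, int ttlSeconds) {
        return (timestampMicros / 1000) + ((long) ttlSeconds * 1000);
    }

    public static void main(String[] args) {
        int ttl = 2678400;                                  // 31 days, in seconds
        long timestamp = System.currentTimeMillis() * 1000; // microseconds, as in the writer code
        long expirationTimestampMS = expirationMs(timestamp, ttl);
        // The column expires exactly 31 days after its write timestamp.
        long deltaDays = (expirationTimestampMS - timestamp / 1000) / 86_400_000L;
        System.out.println("column expires " + deltaDays + " days after write");
    }
}
```

Running it prints "column expires 31 days after write", confirming the unit conversions line up; note that the on-disk cleanup issue discussed in this thread is separate from whether the expiration math is right.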
RE: 1.1 not removing commit log files?
Hi,

In 1.1 the commitlog files are pre-allocated as 128 MB segments (https://issues.apache.org/jira/browse/CASSANDRA-3411). This should, however, not exceed your commitlog size cap in cassandra.yaml: commitlog_total_space_in_mb: 4096

Kind regards,
Pieter Callewaert

From: Bryce Godfrey [mailto:bryce.godf...@azaleos.com]
Sent: maandag 21 mei 2012 9:52
To: user@cassandra.apache.org
Subject: 1.1 not removing commit log files?

The commit log drives on my nodes keep slowly filling up. I don't see any errors in my logs that I can map to this issue. Is this how 1.1 is supposed to work now? Previous versions seemed to keep this drive at a minimum as they flushed.

/dev/mapper/mpathf 25G 21G 4.2G 83% /opt/cassandra/commitlog
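For reference, the cap mentioned above is a single setting in cassandra.yaml; a minimal fragment (4096 is the value quoted in the reply, not necessarily right for every deployment):

```yaml
# Upper bound on total commit log size on disk. With 1.1's pre-allocated
# 128 MB segments, usage grows up to roughly this bound before Cassandra
# flushes the oldest dirty memtables and recycles old segments.
commitlog_total_space_in_mb: 4096
```

If the commitlog volume fills well past this value, that points at a flush problem rather than normal pre-allocation.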
RE: sstableloader 1.1 won't stream
Hi,

Sorry to say I didn't look further into this. I'm now using CentOS 6.2 for the loader without any problems.

Kind regards,
Pieter Callewaert

-Original Message-
From: sj.climber [mailto:sj.clim...@gmail.com]
Sent: vrijdag 18 mei 2012 3:56
To: cassandra-u...@incubator.apache.org
Subject: Re: sstableloader 1.1 won't stream

Pieter, Aaron,

Any further progress on this? I'm running into the same issue, although in my case I'm trying to stream from Ubuntu 10.10 to a 2-node cluster (also Cassandra 1.1.0, running on separate Ubuntu 10.10 hosts). Thanks in advance!

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/sstableloader-1-1-won-t-stream-tp7535517p7564811.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
RE: sstableloader 1.1 won't stream
Firstly I disabled IPv6 on the server to be sure it wasn't trying to use it, but no effect. I tried using sstableloader on one of the Cassandra nodes: no problem there, it worked perfectly! So I was doubting whether the first server was corrupt or something, and tried on another server (CentOS 5.7 x64 with Java 7u4) which is not running a Cassandra instance, and again I'm having problems streaming:

[root@bms-web2 ~]# ./apache-cassandra-1.1.0/bin/sstableloader --debug -d 10.10.10.100 MapData024/HOS/
Streaming revelant part of MapData024/HOS/MapData024-HOS-hc-1-Data.db to [/10.10.10.102, /10.10.10.100, /10.10.10.101]
progress: [/10.10.10.102 0/1 (0)] [/10.10.10.100 0/1 (0)] [/10.10.10.101 0/1 (0)] [total: 0 - 0MB/s (avg: 0MB/s)]
 WARN 10:53:18,575 Failed attempt 1 to connect to /10.10.10.101 to stream MapData024/HOS/MapData024-HOS-hc-1-Data.db sections=2 progress=0/6566400 - 0%. Retrying in 4000 ms. (java.net.SocketException: Invalid argument or cannot assign requested address)
 WARN 10:53:18,577 Failed attempt 1 to connect to /10.10.10.102 to stream MapData024/HOS/MapData024-HOS-hc-1-Data.db sections=1 progress=0/6557280 - 0%. Retrying in 4000 ms. (java.net.SocketException: Invalid argument or cannot assign requested address)
 WARN 10:53:18,594 Failed attempt 1 to connect to /10.10.10.100 to stream MapData024/HOS/MapData024-HOS-hc-1-Data.db sections=1 progress=0/6551840 - 0%. Retrying in 4000 ms. (java.net.SocketException: Invalid argument or cannot assign requested address)
progress: [/10.10.10.102 0/1 (0)] [/10.10.10.100 0/1 (0)] [/10.10.10.101 0/1 (0)] [total: 0 - 0MB/s (avg: 0MB/s)]
 WARN 10:53:22,598 Failed attempt 2 to connect to /10.10.10.101 to stream MapData024/HOS/MapData024-HOS-hc-1-Data.db sections=2 progress=0/6566400 - 0%. Retrying in 8000 ms. (java.net.SocketException: Invalid argument or cannot assign requested address)
 WARN 10:53:22,601 Failed attempt 2 to connect to /10.10.10.102 to stream MapData024/HOS/MapData024-HOS-hc-1-Data.db sections=1 progress=0/6557280 - 0%. Retrying in 8000 ms. (java.net.SocketException: Invalid argument or cannot assign requested address)
 WARN 10:53:22,611 Failed attempt 2 to connect to /10.10.10.100 to stream MapData024/HOS/MapData024-HOS-hc-1-Data.db sections=1 progress=0/6551840 - 0%. Retrying in 8000 ms. (java.net.SocketException: Invalid argument or cannot assign requested address)
progress: [/10.10.10.102 0/1 (0)] [/10.10.10.100 0/1 (0)] [/10.10.10.101 0/1 (0)] [total: 0 - 0MB/s (avg: 0MB/s)]

[root@bms-web2 ~]# java -version
java version "1.7.0_04"
Java(TM) SE Runtime Environment (build 1.7.0_04-b20)
Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)
[root@bms-web2 ~]# cat /etc/redhat-release
CentOS release 5.7 (Final)

Is it possible that sstableloader now only works if a Cassandra instance is also running on the same server? The only other difference I see is CentOS 6.2 vs CentOS 5.x. The new sstableloader: does it still use cassandra.yaml, or is it completely independent?

Kind regards

-Original Message-
From: Pieter Callewaert [mailto:pieter.callewa...@be-mobile.be]
Sent: woensdag 9 mei 2012 17:41
To: user@cassandra.apache.org
Subject: RE: sstableloader 1.1 won't stream

I don't see any entries in the logs of the nodes. I've disabled SELinux to be sure this wasn't a blocking factor, and tried adding -Djava.net.preferIPv4Stack=true to bin/sstableloader, but no change unfortunately. To summarize: I'm trying to use sstableloader from a server (CentOS release 5.8 (Final)) not running Cassandra to a 3-node Cassandra cluster, all running 1.1. My next step will be to try sstableloader on one of the nodes of the cluster, to see if that works... If anyone has any other ideas, please share.
Kind regards, Pieter Callewaert -Original Message- From: Sylvain Lebresne [mailto:sylv...@datastax.com] Sent: woensdag 9 mei 2012 10:45 To: user@cassandra.apache.org Subject: Re: sstableloader 1.1 won't stream Have you checked for errors in the servers' logs? -- Sylvain On Tue, May 8, 2012 at 1:24 PM, Pieter Callewaert wrote: > I've updated all nodes to 1.1 but I keep getting the same problem... > Any other thoughts about this? > > Kind regards, > Pieter > > -Original Message- > From: Benoit Perroud [mailto:ben...@noisette.ch] > Sent: maandag 7 mei 2012 22:21 > To: user@cassandra.apache.org > Subject: Re: sstableloader 1.1 won't stream > > You may want to upgrade all your nodes to 1.1. > > The streaming process connect to every living nodes of the cluster (you can > explicitely diable some nodes), so all nodes need to speak 1.1. > > > > 2012/5/7 Pieter Callewaert : >> Hi, >> >> >> >> I'm trying to upgrade our bulk load process in our testing env. >> >> We use the SSTableSimpleUnsortedWriter to write
RE: sstableloader 1.1 won't stream
I don't see any entries in the logs of the nodes. I've disabled SELinux, to be sure this wasn't a blocking factor, and tried adding -Djava.net.preferIPv4Stack=true to bin/sstableloader, but no change unfortunately. To summarize, I'm trying to use sstableloader from a server (CentOS release 5.8 (Final)) not running Cassandra to a 3-node Cassandra cluster. All running 1.1. My next step will be to try to use sstableloader on one of the nodes from the cluster, to see if that works... If anyone has any other ideas, please share. Kind regards, Pieter Callewaert -Original Message- From: Sylvain Lebresne [mailto:sylv...@datastax.com] Sent: woensdag 9 mei 2012 10:45 To: user@cassandra.apache.org Subject: Re: sstableloader 1.1 won't stream Have you checked for errors in the servers' logs? -- Sylvain On Tue, May 8, 2012 at 1:24 PM, Pieter Callewaert wrote: > I've updated all nodes to 1.1 but I keep getting the same problem... > Any other thoughts about this? > > Kind regards, > Pieter > > -Original Message- > From: Benoit Perroud [mailto:ben...@noisette.ch] > Sent: maandag 7 mei 2012 22:21 > To: user@cassandra.apache.org > Subject: Re: sstableloader 1.1 won't stream > > You may want to upgrade all your nodes to 1.1. > > The streaming process connect to every living nodes of the cluster (you can > explicitely diable some nodes), so all nodes need to speak 1.1. > > > > 2012/5/7 Pieter Callewaert : >> Hi, >> >> >> >> I'm trying to upgrade our bulk load process in our testing env. >> >> We use the SSTableSimpleUnsortedWriter to write tables, and use >> sstableloader to stream it into our cluster. >> >> I've changed the writer program to fit to the 1.1 api, but now I'm >> having troubles to load them to our cluster. The cluster exists out >> of one 1.1 node and two 1.0.9 nodes. >> >> >> >> I've enabled debug as parameter and in the log4j conf. 
>> >> >> >> [root@bms-app1 ~]# ./apache-cassandra/bin/sstableloader --debug -d >> 10.10.10.100 /tmp/201205071234/MapData024/HOS/ >> >> INFO 16:25:40,735 Opening >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1 (1588949 bytes) >> >> INFO 16:25:40,755 JNA not found. Native methods will be disabled. >> >> DEBUG 16:25:41,060 INDEX LOAD TIME for >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1: 327 ms. >> >> Streaming revelant part of >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db to >> [/10.10.10.102, /10.10.10.100, /10.10.10.101] >> >> INFO 16:25:41,083 Stream context metadata >> [/tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db >> sections=1 >> progress=0/6557280 - 0%], 1 sstables. >> >> DEBUG 16:25:41,084 Adding file >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db to be streamed. >> >> INFO 16:25:41,087 Streaming to /10.10.10.102 >> >> DEBUG 16:25:41,092 Files are >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db >> sections=1 >> progress=0/6557280 - 0% >> >> INFO 16:25:41,099 Stream context metadata >> [/tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db >> sections=1 >> progress=0/6551840 - 0%], 1 sstables. >> >> DEBUG 16:25:41,100 Adding file >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db to be streamed. >> >> INFO 16:25:41,100 Streaming to /10.10.10.100 >> >> DEBUG 16:25:41,100 Files are >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db >> sections=1 >> progress=0/6551840 - 0% >> >> INFO 16:25:41,102 Stream context metadata >> [/tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db >> sections=2 >> progress=0/6566400 - 0%], 1 sstables. >> >> DEBUG 16:25:41,102 Adding file >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db to be streamed. 
>> >> INFO 16:25:41,102 Streaming to /10.10.10.101 >> >> DEBUG 16:25:41,102 Files are >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db >> sections=2 >> progress=0/6566400 - 0% >> >> >> >> progress: [/10.10.10.102 0/1 (0)] [/10.10.10.100 0/1 (0)] >> [/10.10.10.101 0/1 (0)] [total: 0 - 0MB/s (avg: 0MB/s)] WARN >> 16:25:41,107 Failed attempt 1 to connect to /10.10.10.101 to stream >> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db >> sections=2 >> progress=0/6566400 - 0%. Retrying in 4000 ms. (java.net.SocketException: >> Invalid argument or cannot assign requested address) >> >> WARN 16:25:41,108 Fail
RE: sstableloader 1.1 won't stream
I've updated all nodes to 1.1 but I keep getting the same problem... Any other thoughts about this?

Kind regards,
Pieter

-Original Message-
From: Benoit Perroud [mailto:ben...@noisette.ch]
Sent: maandag 7 mei 2012 22:21
To: user@cassandra.apache.org
Subject: Re: sstableloader 1.1 won't stream

You may want to upgrade all your nodes to 1.1. The streaming process connects to every living node of the cluster (you can explicitly disable some nodes), so all nodes need to speak 1.1.

2012/5/7 Pieter Callewaert :
> Hi,
>
> I'm trying to upgrade our bulk load process in our testing env.
> We use the SSTableSimpleUnsortedWriter to write tables, and use
> sstableloader to stream it into our cluster.
> I've changed the writer program to fit the 1.1 API, but now I'm
> having trouble loading them into our cluster. The cluster consists of
> one 1.1 node and two 1.0.9 nodes.
>
> I've enabled debug as a parameter and in the log4j conf.
>
> [root@bms-app1 ~]# ./apache-cassandra/bin/sstableloader --debug -d
> 10.10.10.100 /tmp/201205071234/MapData024/HOS/
> INFO 16:25:40,735 Opening
> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1 (1588949 bytes)
> INFO 16:25:40,755 JNA not found. Native methods will be disabled.
> DEBUG 16:25:41,060 INDEX LOAD TIME for
> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1: 327 ms.
> Streaming revelant part of
> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db to
> [/10.10.10.102, /10.10.10.100, /10.10.10.101]
> INFO 16:25:41,083 Stream context metadata
> [/tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db
> sections=1
> progress=0/6557280 - 0%], 1 sstables.
> DEBUG 16:25:41,084 Adding file
> /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db to be streamed.
> > INFO 16:25:41,087 Streaming to /10.10.10.102 > > DEBUG 16:25:41,092 Files are > /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=1 > progress=0/6557280 - 0% > > INFO 16:25:41,099 Stream context metadata > [/tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=1 > progress=0/6551840 - 0%], 1 sstables. > > DEBUG 16:25:41,100 Adding file > /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db to be streamed. > > INFO 16:25:41,100 Streaming to /10.10.10.100 > > DEBUG 16:25:41,100 Files are > /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=1 > progress=0/6551840 - 0% > > INFO 16:25:41,102 Stream context metadata > [/tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=2 > progress=0/6566400 - 0%], 1 sstables. > > DEBUG 16:25:41,102 Adding file > /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db to be streamed. > > INFO 16:25:41,102 Streaming to /10.10.10.101 > > DEBUG 16:25:41,102 Files are > /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=2 > progress=0/6566400 - 0% > > > > progress: [/10.10.10.102 0/1 (0)] [/10.10.10.100 0/1 (0)] > [/10.10.10.101 0/1 (0)] [total: 0 - 0MB/s (avg: 0MB/s)] WARN > 16:25:41,107 Failed attempt 1 to connect to /10.10.10.101 to stream > /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=2 > progress=0/6566400 - 0%. Retrying in 4000 ms. (java.net.SocketException: > Invalid argument or cannot assign requested address) > > WARN 16:25:41,108 Failed attempt 1 to connect to /10.10.10.102 to > stream /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=1 > progress=0/6557280 - 0%. Retrying in 4000 ms. (java.net.SocketException: > Invalid argument or cannot assign requested address) > > WARN 16:25:41,108 Failed attempt 1 to connect to /10.10.10.100 to > stream /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=1 > progress=0/6551840 - 0%. Retrying in 4000 ms. 
(java.net.SocketException: > Invalid argument or cannot assign requested address) > > progress: [/10.10.10.102 0/1 (0)] [/10.10.10.100 0/1 (0)] > [/10.10.10.101 0/1 (0)] [total: 0 - 0MB/s (avg: 0MB/s)] WARN > 16:25:45,109 Failed attempt 2 to connect to /10.10.10.101 to stream > /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=2 > progress=0/6566400 - 0%. Retrying in 8000 ms. (java.net.SocketException: > Invalid argument or cannot assign requested address) > > WARN 16:25:45,110 Failed attempt 2 to connect to /10.10.10.102 to > stream /tmp/201205071234/MapData024/HOS/MapData024-HOS-hc-1-Data.db > sections=1 > progress=0/6557280 - 0%. Retrying in 8000 ms. (java.net.SocketException: > Invalid argument or cannot assign requested address) > > WARN 16:25:45,110 Failed attempt 2 to connect to /10.10.1
sstableloader 1.1 won't stream
.net.SocketException: Invalid argument or cannot assign requested address)
progress: [/10.10.10.102 0/1 (0)] [/10.10.10.100 0/1 (0)] [/10.10.10.101 0/1 (0)] [total: 0 - 0MB/s (avg: 0MB/s)]
...

Does anyone have any idea what I'm doing wrong?

Kind regards,
Pieter Callewaert
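As an aside, the retry behavior visible in the warnings above (4000 ms, then 8000 ms, ...) is a standard doubling backoff. The sketch below is a generic illustration of that pattern in plain Java; it is not Cassandra's actual streaming code, and every name in it (Backoff, retry) is made up for this example.

```java
import java.util.concurrent.Callable;

public class Backoff {
    // Run task, retrying on failure with a delay that doubles each attempt,
    // like the "Retrying in 4000 ms" / "Retrying in 8000 ms" lines above.
    static <T> T retry(Callable<T> task, int attempts, long firstDelayMs) throws Exception {
        long delay = firstDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (attempt >= attempts) throw e; // out of attempts: propagate
                System.out.println("Failed attempt " + attempt + ". Retrying in " + delay + " ms.");
                Thread.sleep(delay);
                delay *= 2; // double the wait each time
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical flaky task: fails twice, then succeeds.
        int[] calls = {0};
        String result = retry(() -> {
            if (++calls[0] < 3) throw new RuntimeException("connect failed");
            return "streamed";
        }, 5, 10);
        System.out.println(result);
    }
}
```

In the failure reported in this thread the retries never succeed, which fits the exception text: "cannot assign requested address" is a local socket/binding problem, so no amount of backoff helps.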