Re: Does DateTieredCompactionStrategy work with a compound clustering key?
> I believe that the DateTieredCompactionStrategy would work for
> PRIMARY KEY (timeblock, timestamp) -- but does it also work for
> PRIMARY KEY (timeblock, timestamp, hash) ?

Yes. (Sure you don't want to be using a timeuuid instead?)

~mck
Re: best practices for time-series data with massive amounts of records
> Here "partition" is a random digit from 0 to (N*M)
> where N=nodes in cluster, and M=arbitrary number.

Hopefully it was obvious, but here (unless you've got hot partitions) you
don't need N.

~mck
Re: best practices for time-series data with massive amounts of records
Clint,

> CREATE TABLE events (
>   id text,
>   date text, // Could also use year+month here or year+week or something else
>   event_time timestamp,
>   event blob,
>   PRIMARY KEY ((id, date), event_time))
> WITH CLUSTERING ORDER BY (event_time DESC);
>
> The downside of this approach is that we can no longer do a simple
> continuous scan to get all of the events for a given user. Some users
> may log lots and lots of interactions every day, while others may interact
> with our application infrequently, so I'd like a quick way to get the most
> recent interaction for a given user.
>
> Has anyone used different approaches for this problem?

One idea is to provide additional manual partitioning, like…

CREATE TABLE events (
    user_id text,
    partition int,
    event_time timeuuid,
    event_json text,
    PRIMARY KEY ((user_id, partition), event_time)
) WITH
    CLUSTERING ORDER BY (event_time DESC) AND
    compaction={'class': 'DateTieredCompactionStrategy'};

Here "partition" is a random digit from 0 to (N*M), where N=nodes in the
cluster and M=an arbitrary number.

Read performance will suffer a little because you need to query N*M times
as many partition keys for each read, but it should be constant enough that
it comes down to increasing the cluster's hardware and scaling out as need
be. The multi-key reads can be done with a SELECT…IN query, or better yet
with parallel reads (less pressure on the coordinator at the expense of
extra network calls); see the sketch below.

Starting with M=1, you have the option to increase it over time if any
user's partitions grow too wide. (We do¹ something similar for storing all
raw events in our enterprise platform, but because the data is not
user-centric the initial partition key is minute-by-minute timebuckets, and
M has remained at 1 the whole time.)

This approach is better than using an order-preserving partitioner (really,
don't do that).

I would also consider replacing "event blob" with "event text", choosing
json instead of any binary serialisation. We've learnt the hard way the
value of data transparency, and I'm guessing the storage cost is small
given c* compression.

Otherwise the advice here is largely repeating what Jens has already said.

~mck

¹ slide 19+20 from
https://prezi.com/vt98oob9fvo4/cassandra-summit-cassandra-and-hadoop-at-finnno/
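As a postscript, a minimal sketch of what the client side of that bucketing
can look like with the DataStax Java driver, against the events table
above. The class, the BUCKETS constant, and the method names are my own
illustration, not code lifted from a real system:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.utils.UUIDs;

public class BucketedEvents
{
    // BUCKETS is the N*M above; start small and grow it if partitions get too wide
    private static final int BUCKETS = 4;

    private final Session session;
    private final PreparedStatement insert;
    private final PreparedStatement selectLatest;

    public BucketedEvents(Session session)
    {
        this.session = session;
        insert = session.prepare(
            "INSERT INTO events (user_id, partition, event_time, event_json) VALUES (?, ?, ?, ?)");
        selectLatest = session.prepare(
            "SELECT event_time, event_json FROM events WHERE user_id = ? AND partition = ? LIMIT 1");
    }

    // writes pick a random bucket, spreading one user's events over BUCKETS partitions
    public void write(String userId, String json)
    {
        int bucket = ThreadLocalRandom.current().nextInt(BUCKETS);
        session.execute(insert.bind(userId, bucket, UUIDs.timeBased(), json));
    }

    // reads fan out to all buckets in parallel and keep the newest row seen;
    // LIMIT 1 works because of CLUSTERING ORDER BY (event_time DESC)
    public Row latestEvent(String userId)
    {
        List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();
        for (int bucket = 0; bucket < BUCKETS; bucket++)
            futures.add(session.executeAsync(selectLatest.bind(userId, bucket)));

        Row newest = null;
        for (ResultSetFuture future : futures)
        {
            Row row = future.getUninterruptibly().one();
            if (row == null)
                continue;
            if (newest == null || UUIDs.unixTimestamp(row.getUUID("event_time"))
                                  > UUIDs.unixTimestamp(newest.getUUID("event_time")))
                newest = row;
        }
        return newest;
    }
}

Writes land in a random bucket, so one user's events spread over N*M
partitions; reads fan out to every bucket in parallel and keep the newest
row, which stays cheap as long as BUCKETS is small.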
Re: how to scan all rows of cassandra using multiple threads
> Can I get data owned by a particular node and this way generate sum
> on different nodes by iterating over data from virtual nodes and later
> generate total sum by doing sum of data from all virtual nodes.

You're pretty much describing a map/reduce job using CqlInputFormat.
Re: Node stuck in joining the ring
Any errors in your log file?

We saw something similar when bootstrap crashed while rebuilding secondary
indexes. See CASSANDRA-8798

~mck
Re: Why no virtual nodes for Cassandra on EC2?
> … my understanding was that
> performance of Hadoop jobs on C* clusters with vnodes was poor because a
> given Hadoop input split has to run many individual scans (one for each
> vnode) rather than just a single scan. I've run C* and Hadoop in
> production with a custom input format that used vnodes (and just combined
> multiple vnodes in a single input split) and didn't have any issues (the
> jobs had many other performance bottlenecks besides starting multiple
> scans from C*).

You've described the ticket, and how it has been solved :-)

> This is one of the videos where I recall an off-hand mention of the Spark
> connector working with vnodes:
> https://www.youtube.com/watch?v=1NtnrdIUlg0

Thanks.

~mck
Re: Why no virtual nodes for Cassandra on EC2?
At least the problem of hadoop and vnodes described in CASSANDRA-6091
doesn't apply to spark. (Spark already allows multiple token ranges per
split.) If this is the reason why DSE hasn't enabled vnodes then fingers
crossed that'll change soon.

> Some of the DataStax videos that I watched discussed how the Cassandra
> Spark connector has optimizations to deal with vnodes.

Are these videos public? If so, got any link to them?

~mck
Re: How to speed up SELECT * query in Cassandra
> Could you please share how much data you store on the cluster and what
> is HW configuration of the nodes?

These nodes are dedicated HW: 24 CPUs and 50GB RAM. Each node has a few TBs
of data (you don't want to go over this) in raid50 (we're migrating over to
JBOD). Each c* node is running 2.0.11 and configured with an 8GB heap, 2GB
new gen, and jdk1.7.0_55.

Hadoop (2.2.0) tasktrackers and dfs run on these nodes as well; all up they
use up to 12GB RAM, leaving ~30GB RAM for kernel and page cache.
Data-locality is an important goal; in the worst-case scenarios we've seen
it mean a four-times throughput benefit.

HDFS, being a volatile hadoop-internals space for us, is on SSDs, providing
strong m/r performance. (The commitlog of course is also on SSD. We made
the mistake of putting it on the same SSD to begin with. Don't do that:
commitlog gets its own SSD.)

> I am really impressed that you are
> able to read 100M records in ~4minutes on 4 nodes. It makes something
> like 100k reads per node, which is something we are quite far away from.

These are not individual reads and not the number of partition keys, but
m/r records (or cql rows). But yes, the performance of spark against
cassandra is impressive.

> It leads me to question, whether reading from Spark goes through
> Cassandra's JVM and thus go through normal read path, or if it reads the
> sstables directly from disks sequentially and possibly filters out
> old/tombstone values by itself?

Both the Hadoop-Cassandra integration and the Spark-Cassandra connector go
through the normal read path, like all cql read queries.

With our m/r jobs each task works with just one partition key, doing
repeated column slice reads through that partition key according to the
ConfigHelper.rangeBatchSize setting, which we have set to 100 (see the
sketch below). These hadoop jobs use a custom written CqlInputFormat due to
the poor performance CqlInputFormat has today against a vnodes setup; the
customisation we have is pretty much the same as the patch on offer in
CASSANDRA-6091. This problem with vnodes we haven't experienced with the
spark connector. I presume that, like the hadoop integration, spark also
bulk reads (column slices) from each partition key.

Otherwise this is useful reading:
http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting

> This is also a cluster that serves requests to web applications that
> need low latency.

Let it be said this isn't something I'd recommend, just the path we had to
take because of our small initial dedicated-HW cluster. (You really want to
separate online and offline datacenters, so that you can maximise the
offline clusters for the heavy batch reads.)

~mck
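For completeness, a rough sketch of where those knobs live when configuring
such a job. ConfigHelper is the real class; the keyspace, table, host and
numbers are placeholders, and depending on your Cassandra/Hadoop versions
you may need further settings (rpc port, CqlConfigHelper query options, etc.):

import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlInputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class EventScanJob
{
    public static Job configure(Configuration conf) throws Exception
    {
        Job job = Job.getInstance(conf, "event-scan");
        job.setInputFormatClass(CqlInputFormat.class);

        Configuration c = job.getConfiguration();
        ConfigHelper.setInputInitialAddress(c, "cassandra-node1");
        ConfigHelper.setInputPartitioner(c, "Murmur3Partitioner");
        ConfigHelper.setInputColumnFamily(c, "my_keyspace", "events");

        // target number of rows per input split ("cassandra.input.split.size")
        ConfigHelper.setInputSplitSize(c, 65536);
        // rows fetched per column slice within a partition key (we run with 100)
        ConfigHelper.setRangeBatchSize(c, 100);

        return job;
    }
}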
Re: How to speed up SELECT * query in Cassandra
Jirka,

> But I am really interested how it can work well with Spark/Hadoop where
> you basically needs to read all the data as well (as far as I understand
> that).

I can't give you any benchmarking between technologies (nor am I
particularly interested in getting involved in such a discussion), but I
can share our experiences with Cassandra, Hadoop, and Spark over the past
4+ years, and hopefully assure you that Cassandra+Spark is a smart choice.

On a four node cluster we were running 5000+ small hadoop jobs each day,
each finishing within two minutes, often within one minute, resulting in
(give or take) a billion records read and 150 million records written from
and to c*. These small jobs are incrementally processing on limited
partition key sets each time. They are primarily reading data from a "raw
events store" that has a ttl of 3 months and 22+GB of tombstones a day
(reads over old partition keys are rare).

We also run full-table-scan jobs and have never come across any issues
particular to that. There are hadoop map/reduce settings to increase
durability if you have tables with troublesome partition keys. This is also
a cluster that serves requests to web applications that need low latency.

We recently wrote a spark job that does full table scans over 100 million+
rows, involves a handful of stages (two tables, 9 maps, 4 reduces, and 2
joins), and writes 5 million rows back to a new table. This job runs in
~260 seconds. Spark is becoming a natural complement to schema evolution
for cassandra, something you'll want to do to keep your schema optimised
against your read request patterns, even for little things like switching
cluster keys around.

With any new technology, hitting some hurdles (especially if you go
wandering outside recommended practices) will of course be part of the
game, but that said I've only had positive experiences with this
community's ability to help out (and do so quickly).

Starting from scratch I'd use Spark (on Scala) over Hadoop, no questions
asked. Otherwise Cassandra has always been our 'big data' platform;
hadoop/spark is just an extra tool on top. We've never kept data in hdfs
and are very grateful for having made that choice.

~mck

ref https://prezi.com/vt98oob9fvo4/cassandra-summit-cassandra-and-hadoop-at-finnno/
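To give a flavour of it, a bare-bones sketch of a full table scan using the
spark-cassandra-connector's Java API. The keyspace/table names and
connection host are placeholders, and the real jobs obviously do more than
count rows:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

public class EventCount
{
    public static void main(String[] args)
    {
        SparkConf conf = new SparkConf()
            .setAppName("event-count")
            .set("spark.cassandra.connection.host", "cassandra-node1");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // a full table scan; the connector groups token ranges into spark
        // partitions, so a vnodes cluster doesn't explode into thousands of tiny tasks
        long rows = javaFunctions(sc)
            .cassandraTable("my_keyspace", "events")
            .count();

        System.out.println("rows scanned: " + rows);
        sc.stop();
    }
}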
Re: CqlInputFormat and retired CqlPagingInputFormat creates lots of connections to query the server
Shenghua,

> The problem is the user might only want all the data via a "select *"
> like statement. It seems that 257 connections to query the rows are
> necessary. However, is there any way to prohibit 257 concurrent
> connections?

Your reasoning is correct. The number of connections should be tunable via
the "cassandra.input.split.size" property. See
ConfigHelper.setInputSplitSize(..)

The problem is that vnodes completely trash this, since the splits returned
don't span across vnodes. There's an issue out for this,
https://issues.apache.org/jira/browse/CASSANDRA-6091, but part of the
problem is that the thrift stuff involved here is getting rewritten¹ to be
pure cql.

In the meantime you can override the CqlInputFormat and manually re-merge
splits together where location sets match, so as to better honour
inputSplitSize and return to a more reasonable number of connections. We do
this, using code similar to this patch:
https://github.com/michaelsembwever/cassandra/pull/2/files

~mck

¹ https://issues.apache.org/jira/browse/CASSANDRA-8358
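A sketch of just the grouping half of that re-merge, to show the idea;
bucketing splits by replica set is the easy part, while building the merged
multi-token-range splits is what the linked patch actually does and is
omitted here:

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

import org.apache.hadoop.mapreduce.InputSplit;

public final class SplitGrouper
{
    // bucket the splits an input format returns by their replica location set;
    // splits in the same bucket cover data held by the same nodes and are the
    // candidates for being merged into one bigger split
    public static Map<Set<String>, List<InputSplit>> groupByLocations(List<InputSplit> splits)
            throws IOException, InterruptedException
    {
        Map<Set<String>, List<InputSplit>> grouped = new HashMap<Set<String>, List<InputSplit>>();
        for (InputSplit split : splits)
        {
            Set<String> locations = new TreeSet<String>(Arrays.asList(split.getLocations()));
            List<InputSplit> bucket = grouped.get(locations);
            if (bucket == null)
            {
                bucket = new ArrayList<InputSplit>();
                grouped.put(locations, bucket);
            }
            bucket.add(split);
        }
        return grouped;
    }
}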
Re: Which Topology fits best ?
> However I guess it can be easily changed?

That's correct.
Re: Which Topology fits best ?
NetworkTopologyStrategy gives you a better horizon and more flexibility as
you scale out, at least once you've gone past small-cluster problems like
wanting RF=3 in a 4 node two-dc cluster.

IMO I'd go with "DC:1,DC2:1".

~mck
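For reference, those per-DC replication factors are set when defining the
keyspace, along the lines of the following sketch (the driver usage and the
DC names are illustrative; use whatever names your snitch reports):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class CreateKeyspace
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // one replica in each datacenter; bump a DC's number later to raise its RF
        session.execute("CREATE KEYSPACE IF NOT EXISTS my_keyspace WITH replication = "
                + "{'class': 'NetworkTopologyStrategy', 'DC1': 1, 'DC2': 1}");

        cluster.close();
    }
}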
Re: Why does C* repeatedly compact the same tables over and over?
> Are you using Leveled compaction strategy?

And if you're using Date Tiered compaction strategy on a table that isn't
time-series data, for example where deletes happen, you'll find it
compacting over and over.

~mck
Re: Storing large files for later processing through hadoop
> Since the hadoop MR streaming job requires the file to be processed to be
> present in HDFS,
> I was thinking whether can it get directly from mongodb instead of me
> manually fetching it
> and placing it in a directory before submitting the hadoop job?

Hadoop M/R can get data directly from Cassandra. See CqlInputFormat.

~mck
Re: Storing large files for later processing through hadoop
> 1) The FAQ … informs that I can have only files of around 64 MB …

See http://wiki.apache.org/cassandra/CassandraLimitations

"A single column value may not be larger than 2GB; in practice, "single
digits of MB" is a more reasonable limit, since there is no streaming or
random access of blob values."

CASSANDRA-16 only covers pushing those objects through compaction. Getting
the objects in and out of the heap during normal requests is still a
problem. You could manually chunk them down to 64MB pieces (see the sketch
below).

> 2) Can I replace HDFS with Cassandra so that I don't have to sync/fetch
> the file from cassandra to HDFS when I want to process it in hadoop
> cluster?

We¹ keep HDFS as a volatile filesystem, simply for hadoop internals. No
need for backups of it, no need to upgrade data, and we're free to wipe it
whenever hadoop has been stopped. Otherwise all our hadoop jobs still read
from and write to Cassandra. Cassandra is our "big data" platform, with
hadoop/spark just providing additional aggregation abilities. I think this
is the effective way, rather than trying to completely gut out HDFS.

There was a DataStax project a while back for replacing HDFS with
Cassandra, but I don't think it's alive anymore.

~mck
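A sketch of such manual chunking with the DataStax Java driver. The files
table, its columns, and the chunk size are hypothetical, and given the
limitation quoted above you probably want chunks in the single digits of MB
rather than 64MB:

import java.nio.ByteBuffer;

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class FileChunker
{
    // hypothetical table:
    //   CREATE TABLE files (file_id text, chunk int, data blob,
    //                       PRIMARY KEY (file_id, chunk))
    private static final int CHUNK_SIZE = 8 * 1024 * 1024;

    public static void store(Session session, String fileId, byte[] contents)
    {
        PreparedStatement insert = session.prepare(
            "INSERT INTO files (file_id, chunk, data) VALUES (?, ?, ?)");

        for (int chunk = 0, offset = 0; offset < contents.length; chunk++, offset += CHUNK_SIZE)
        {
            int length = Math.min(CHUNK_SIZE, contents.length - offset);
            // wrap with offset/length so we don't copy the whole file once per chunk
            session.execute(insert.bind(fileId, chunk, ByteBuffer.wrap(contents, offset, length)));
        }
    }
}

Reassembly is the reverse: select the chunks in clustering order and
concatenate them.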
Re: 2.0.10 to 2.0.11 upgrade and immediate ParNew and CMS GC storm
> Perf is better, correctness seems less so. I value latter more than
> former.

Yeah, no doubt. Especially in CASSANDRA-6285 I see some scary stuff went
down. But there are no outstanding bugs that we know of, are there?
(CASSANDRA-6815 remains just a wrap-up of how options are to be presented
in cassandra.yaml?)

~mck
Re: 2.0.10 to 2.0.11 upgrade and immediate ParNew and CMS GC storm
>> Should I stick to 2048 or try
>> with something closer to 128 or even something else ?

2048 worked fine for us.

>> About HSHA,
> I anti-recommend hsha, serious apparently unresolved problems exist with
> it.

We saw an improvement when we switched to HSHA, particularly for our
offline (hadoop/spark) nodes. Sorry, I don't have the data anymore to
support that statement, although I can say that the improvement paled in
comparison to cross_node_timeout, which we enabled shortly afterwards.

~mck
Can initial_token be in decimal or hexadecimal format?
And does it matter when using different partitioners?

In the config it seems only strings are used. RandomPartitioner parses this
string into a BigInteger, so it needs to be in decimal format, but
ByteOrderedPartitioner uses FBUtilities.hexToBytes(..) when translating a
string to a token (BytesToken).

More to the point... For a 3 node cluster using BOP where my largest token
will be 0x8000 (coincidentally 2**127), should I write out initial_tokens
like

  node0: 0
  node1: 2AAA
  node2: 5554

or like

  node0: 0
  node1: 56713727820156410577229101238628035242
  node2: 113427455640312821154458202477256070484

If it is the former, there's some important documentation missing.

~mck

ps CASSANDRA-1006 seems to be of some relation.
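For the RandomPartitioner flavour of the question, those decimal tokens are
just the 2**127 token space divided evenly between the nodes. A quick
illustration (my own, not from the original mail):

import java.math.BigInteger;

public class InitialTokens
{
    public static void main(String[] args)
    {
        int nodes = 3;
        // RandomPartitioner's token space is 0 .. 2**127
        BigInteger step = BigInteger.valueOf(2).pow(127).divide(BigInteger.valueOf(nodes));

        for (int i = 0; i < nodes; i++)
            System.out.println("node" + i + ": " + step.multiply(BigInteger.valueOf(i)));
    }
}

That prints exactly the decimal list above.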
Task's map reading more records than CFIF's inputSplitSize
Cassandra-0.8.4 w/ ByteOrderedPartitioner
CFIF's inputSplitSize=196608

3 map tasks (out of 4013) are still running after reading 25 million rows.
Can this be a bug in StorageService.getSplits(..) ?

With this data I've had general headache with using tokens that are longer
than usual (and trying to move nodes around to balance the ring).

nodetool ring gives

Address        Status State   Load       Owns    Token
                                                 Token(bytes[76118303760208547436305468318170713656])
152.90.241.22  Up     Normal  270.46 GB  33.33%  Token(bytes[30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8])
152.90.241.24  Up     Normal  247.89 GB  33.33%  Token(bytes[303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8])
152.90.241.23  Up     Normal  1.1 TB     33.33%  Token(bytes[76118303760208547436305468318170713656])

~mck
Re: RF=1 w/ hadoop jobs
On Thu, 2011-08-18 at 08:54 +0200, Patrik Modesto wrote:
> But there is the another problem with Hadoop-Cassandra, if there is no
> node available for a range of keys, it fails on RuntimeError. For
> example having a keyspace with RF=1 and a node is down all MapReduce
> tasks fail.

CASSANDRA-2388 is related but not the same.

Before 0.8.4 the behaviour was that if the local cassandra node didn't have
the split's data, the tasktracker would connect to another cassandra node
where the split's data could be found. So even before 0.8.4, with RF=1 and
a node down, your hadoop job would fail.

Although I've reopened CASSANDRA-2388 (and reverted the code locally)
because the new behaviour in 0.8.4 leads to abysmal tasktracker throughput
(for me, task allocation doesn't seem to honour data-locality according to
split.getLocations()).

> I've reworked my previous patch, that was addressing this
> issue and now there are ConfigHelper methods for enable/disable
> ignoring unavailable ranges.
> It's available here: http://pastebin.com/hhrr8m9P (for version 0.7.8)

I'm interested in this patch and see its usefulness, but no one will act
until you attach it to an issue. (I think a new issue is appropriate here.)

~mck
[hadoop] Counters in ColumnFamilyOutputFormat?
I'd like to investigate using Counters in hadoop using
ColumnFamilyOutputFormat. But I see that this class uses outdated
..hadoop.avro classes.

Does it make sense to use counters for hadoop output? If I try rewriting
ColumnFamilyOutputFormat and friends, should it be to the normal ..avro
classes, or to something else?

~mck
Re: IOException: Unable to create hard link ... /snapshots/ ... (errno 17)
On Tue, 2011-05-03 at 14:22 -0500, Jonathan Ellis wrote:
> Can you create a ticket?

CASSANDRA-2598
Re: IOException: Unable to create hard link ... /snapshots/ ... (errno 17)
On Tue, 2011-05-03 at 13:52 -0500, Jonathan Ellis wrote:
> you should probably look to see what errno 17 means for the link
> system call on your system.

That the file already exists. It seems cassandra is trying to make the same
hard link in parallel (under heavy write load)?

I see now I can also reproduce the problem with hadoop and
ColumnFamilyOutputFormat. Turning off snapshot_before_compaction seems to
be enough to prevent it.

~mck
Re: IOException: Unable to create hard link ... /snapshots/ ... (errno 17)
On Tue, 2011-05-03 at 16:52 +0200, Mck wrote:
> Running a 3 node cluster with cassandra-0.8.0-beta1
>
> I'm seeing the first node logging many (thousands) times

The only "special" thing about this first node is that it receives all the
writes from our sybase->cassandra import job. This process migrates an
existing 60 million rows into cassandra (before the cluster is /turned on/
for normal operations). The import job runs over ~20 minutes.

I wiped everything and started from scratch, this time running the import
job with cassandra configured instead with:

  incremental_backups: false
  snapshot_before_compaction: false

This time the problem appeared on another node. So changing to these
settings on all nodes and running the import again fixed it: no more
"Unable to create hard link ...".

After the import I could turn both incremental_backups and
snapshot_before_compaction back to true without problems so far.

To me this says something is broken with incremental_backups and
snapshot_before_compaction under heavy writing?

~mck
IOException: Unable to create hard link ... /snapshots/ ... (errno 17)
Running a 3 node cluster with cassandra-0.8.0-beta1

I'm seeing the first node logging many (thousands of) times lines like

Caused by: java.io.IOException: Unable to create hard link from /iad/finn/countstatistics/cassandra-data/countstatisticsCount/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-5504-Data.db to /iad/finn/countstatistics/cassandra-data/countstatisticsCount/snapshots/compact-thrift_no_finntech_countstats_count_Count_1299479381593068337/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-5504-Data.db (errno 17)

This seems to happen for all column families (including system). It happens
a lot during startup. The hardlinks do exist. Stopping, deleting the
hardlinks, and starting again does not help.

But I haven't seen it once on the other nodes...

~mck

ps the stacktrace

java.io.IOError: java.io.IOException: Unable to create hard link from /iad/finn/countstatistics/cassandra-data/countstatisticsCount/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-3875-Data.db to /iad/finn/countstatistics/cassandra-data/countstatisticsCount/snapshots/compact-thrift_no_finntech_countstats_count_Count_1299479381593068337/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-3875-Data.db (errno 17)
    at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1629)
    at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1654)
    at org.apache.cassandra.db.Table.snapshot(Table.java:198)
    at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:504)
    at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:146)
    at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:112)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Unable to create hard link from /iad/finn/countstatistics/cassandra-data/countstatisticsCount/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-3875-Data.db to /iad/finn/countstatistics/cassandra-data/countstatisticsCount/snapshots/compact-thrift_no_finntech_countstats_count_Count_1299479381593068337/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-3875-Data.db (errno 17)
    at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:155)
    at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:713)
    at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1622)
    ... 10 more
Re: [RELEASE] Apache Cassandra 0.8.0 beta1
On Tue, 2011-04-26 at 12:53 +0100, Stephen Connolly wrote:
> (or did you want 20million unneeded deps for the
> client jars?)

Yes, that's a good reason :-)

Is there anything I can help with? Will beta versions be available under
the releases repository?

~mck
Re: [RELEASE] Apache Cassandra 0.8.0 beta1
On Fri, 2011-04-22 at 16:49 -0500, Eric Evans wrote:
> I am pleased to announce the release of Apache Cassandra 0.8.0 beta1.

*Truly Awesome!* CQL rocks in so many ways.

Is 0.8.0-beta1 available in apache's maven repository? And if not, why not?

~mck
Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse
On Wed, 2011-01-26 at 12:13 +0100, Patrik Modesto wrote:
> BTW how to get current time in microseconds in Java?

I'm using HFactory.clock() (from hector).

>> As far as moving the clone(..) into ColumnFamilyRecordWriter.write(..)
>> won't this hurt performance?
>
> The size of the queue is computed at runtime:
> ColumnFamilyOutputFormat.QUEUE_SIZE, 32 *
> Runtime.getRuntime().availableProcessors()
> So the queue is not too large so I'd say the performance shouldn't get hurt.

This is only the default. I'm running w/ 8. Testing has given this the best
throughput for me when processing 25+ million rows... In the end it is
still 25+ million .clone(..) calls.

>> The key isn't the only potential live byte[]. You also have names and
>> values in all the columns (and supercolumns) for all the mutations.

Now make that over a billion .clone(..) calls... :-(

byte[] copies are relatively quick and cheap, still I am seeing a
performance degradation in m/r reduce performance with cloning of keys.
It's not that you don't have my vote here, I'm just stating my uncertainty
on what the correct API should be.

~mck
Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse
>> is "d.timestamp = System.currentTimeMillis();" ok?
>
> You are correct that microseconds would be better but for the test it
> doesn't matter that much.

Have you tried? I'm very new to cassandra as well, and always uncertain as
to what to expect...

> ByteBuffer bbKey = ByteBufferUtil.clone(ByteBuffer.wrap(key.getBytes(), 0, key.getLength()));

An alternative approach to your client-side cloning is

  ByteBuffer bbKey = ByteBuffer.wrap(key.toString().getBytes(UTF_8));

Here at least it is obvious you are passing in the bytes from an immutable
object.

As far as moving the clone(..) into ColumnFamilyRecordWriter.write(..),
won't this hurt performance? Normally I would _always_ agree that a
defensive copy of an array/collection argument be stored, but has this
intentionally not been done (or should it be) because of large reduce jobs
(millions of records) and the performance impact here?

The key isn't the only potential live byte[]. You also have names and
values in all the columns (and supercolumns) for all the mutations.

~mck
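To spell out the trade-off being discussed, a small illustration of the
three options side by side (my own sketch; Text is Hadoop's reusable
writable and ByteBufferUtil is Cassandra's utility class):

import java.nio.ByteBuffer;
import java.nio.charset.Charset;

import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.io.Text;

public class KeyCopies
{
    private static final Charset UTF_8 = Charset.forName("UTF-8");

    // aliases Hadoop's reusable backing array: the bytes can change under you
    // once the framework reuses the Text instance for the next record
    public static ByteBuffer unsafe(Text key)
    {
        return ByteBuffer.wrap(key.getBytes(), 0, key.getLength());
    }

    // safe: copies the bytes out, at the cost of one array copy per record
    public static ByteBuffer cloned(Text key)
    {
        return ByteBufferUtil.clone(ByteBuffer.wrap(key.getBytes(), 0, key.getLength()));
    }

    // also safe: the byte[] comes from a freshly created String, so nothing else mutates it
    public static ByteBuffer wrappedImmutable(Text key)
    {
        return ByteBuffer.wrap(key.toString().getBytes(UTF_8));
    }
}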
Re: Should nodetool ring give equal load ?
On Wed, 2011-01-12 at 14:21 -0800, Ryan King wrote:
> What consistency level did you use to write the data?

R=1, W=1 (reads happen a long time afterwards).

~mck
Re: Timeout Errors while running Hadoop over Cassandra
On Wed, 2011-01-12 at 23:04 +0100, mck wrote:
>> Caused by: TimedOutException()
>
> What is the exception in the cassandra logs?

Or tried increasing rpc_timeout_in_ms?

~mck
Re: Should nodetool ring give equal load ?
> You're using an ordered partitioner and your nodes are evenly spread
> around the ring, but your data probably isn't evenly distributed.

This load number seems equal to `du -hs`, and since I've got N == RF
shouldn't the data size always be the same on every node?

~mck
Re: Timeout Errors while running Hadoop over Cassandra
On Wed, 2011-01-12 at 18:40, Jairam Chandar wrote:
> Caused by: TimedOutException()

What is the exception in the cassandra logs?

~mck
Should nodetool ring give equal load ?
I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner.
When I run "nodetool ring" it reports

> Address        Status State   Load      Owns    Token
>                                                 Token(bytes[ff034355152567a5b2d962b55990e692])
> 152.90.242.91  Up     Normal  12.26 GB  33.33%  Token(bytes[01cecd88847283229a3dc88292deff86])
> 152.90.242.93  Up     Normal  6.13 GB   33.33%  Token(bytes[d4a4de25c0dad34749e99219e227d896])
> 152.90.242.92  Up     Normal  6.13 GB   33.33%  Token(bytes[ff034355152567a5b2d962b55990e692])

Why would the first node have double the load? Is this expected, or is
something wrong?

The number of files in data_file_directories for the keyspace is roughly
the same on each node. But each Index and Filter file is double the size on
the first node (regardless of the cf they belong to). "cleanup" didn't
help. "compact" only took away 2GB.

Otherwise there is a lot here I don't understand.

~mck
Re: Hadoop Integration doesn't work when one node is down
> Is this a bug or feature or a misuse?

I can confirm this bug, on a 3 node cluster testing environment with RF=3.
(And no issue exists for it AFAIK.)

~mck
Re: nodetool can't jmx authenticate...
On Thu, 2010-12-30 at 08:03 -0600, Jonathan Ellis wrote:
> We don't have any explicit code for enabling that, no.

https://issues.apache.org/jira/browse/CASSANDRA-1921

The patch was simple (NodeCmd and NodeProbe). Just testing it now...

~mck
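For the record, passing credentials through to a JMX connection comes down
to something like the following generic JMX code. This is a sketch of the
idea only, not the actual NodeProbe change in that patch:

import java.util.HashMap;
import java.util.Map;

import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class AuthenticatedJmx
{
    public static MBeanServerConnection connect(String host, int port, String user, String password)
            throws Exception
    {
        JMXServiceURL url = new JMXServiceURL(
            String.format("service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", host, port));

        Map<String, Object> env = new HashMap<String, Object>();
        // the standard way to pass a username/password pair to a JMX agent
        env.put(JMXConnector.CREDENTIALS, new String[]{ user, password });

        JMXConnector connector = JMXConnectorFactory.connect(url, env);
        return connector.getMBeanServerConnection();
    }
}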
nodetool can't jmx authenticate...
I'm starting cassandra-0.7.0-rc3 with jmx secured by adding to JAVA_OPTS

  -Dcom.sun.management.jmxremote.password.file=/somepath/jmxpassword

But then I can't use nodetool, since it dumps

> Exception in thread "main" java.lang.SecurityException: Authentication failed! Credentials required
>     at com.sun.jmx.remote.security.JMXPluggableAuthenticator.authenticationFailure(JMXPluggableAuthenticator.java:193)
>     at [snip]
>     at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:118)

Adding "-Dcom.sun.management.jmxremote.password.file" to nodetool doesn't
help...

Is there any support for nodetool to connect to a password authenticated
jmx service?

~mck
Re: (newbie) ColumnFamilyOutputFormat only writes one column (per key)
> I then went to write a m/r job that deserialises the thrift objects and
> aggregates the data accordingly into a new column family. But what i've
> found is that ColumnFamilyOutputFormat will only write out one column
> per key.

I've entered a bug for this:
https://issues.apache.org/jira/browse/CASSANDRA-1774

~mck
(newbie) ColumnFamilyOutputFormat only writes one column (per key)
(I'm new here so forgive any mistakes or mis-presumptions...)

I've set up a cassandra-0.7.0-beta3 and populated it with thrift-serialised
objects via a scribe server. This seems a great way to get thrift beans out
of the application asap and have them sitting in cassandra for later
processing.

I then went to write a m/r job that deserialises the thrift objects and
aggregates the data accordingly into a new column family. But what I've
found is that ColumnFamilyOutputFormat will only write out one column per
key.

Alex Burkoff also reported this nearly two months ago, but nobody ever
replied... http://article.gmane.org/gmane.comp.db.cassandra.user/9325

Has anyone any ideas? Should it be possible to write multiple columns out?

This is very easy to reproduce. Use the contrib/wordcount example, with
OUTPUT_REDUCER=cassandra, and in WordCount.java add at line 132

     results.add(getMutation(key, sum));
  +  results.add(getMutation(new Text("doubled"), sum*2));

Only the last mutation for any key seems to be written.

~mck