Re: [VOTE] Release Apache Cassandra 2.0.0-rc2

2013-08-20 Thread Radim Kolar
what about failing shuffle? CASSANDRA-5876 https://issues.apache.org/jira/browse/CASSANDRA-5873

Re: Apache Cassandra 2.0.0 rc1

2013-08-12 Thread Radim Kolar
thrift, sync

Re: Apache Cassandra 2.0.0 rc1

2013-08-11 Thread Radim Kolar
On 10.8.2013 21:30, Brandon Williams wrote: Make a conf/triggers directory and that will fix it. We fixed this in trunk already. Yes, that fixed it. 2.0 is considerably slower than 1.2 for CPU-bound tasks; average throughput is 15% lower at 50 threads. 2.0 with 20 threads burst throughput with

Apache Cassandra 2.0.0 rc1

2013-08-10 Thread Radim Kolar
Did you package it correctly? Something seems to be missing: ERROR 20:27:11,578 Internal error processing batch_mutate java.lang.NoClassDefFoundError: Could not initialize class org.apache.cassandra.triggers.TriggerExecutor at org.apache.cassandra.service.StorageProxy.mutateWithTrigger

Re: cassandra vnodes

2013-07-13 Thread Radim Kolar
I decided to implement it in my cassandra too, but i am using zookeeper for cluster management. I scrapped the idea of consistent hashing with random tokens. There is too much variance in the effective ranges allocated to nodes in a large cluster; you need to have a lot of ranges, which is quite large ove

wiki bootstrap doc

2013-07-12 Thread Radim Kolar
http://wiki.apache.org/cassandra/Operations To bootstrap a node, turn _AutoBootstrap_ on in the configuration file, and start it. It's called auto_bootstrap in the config file.

cassandra vnodes

2013-07-07 Thread Radim Kolar
Are cassandra vnodes just an implementation of consistent hashing, or are there some improvements to make the splits similar in size? I decided to implement it in my cassandra too, but I am using zookeeper for cluster management.

manifest fsync

2013-05-27 Thread Radim Kolar
should be fine, the manifest is not rewritten that often. It's rewritten after each sstable flush? Databases should do fsync only at a checkpoint. The fsync scenario in WAFL is that a hard checkpoint is done after a predefined number of log segments. On checkpoint, fsync everything and then write checkp

EOL info

2013-05-22 Thread Radim Kolar
On 22.5.2013 18:22, Brandon Williams wrote: I don't see a 1.1.13 ever happening. Do you have a page with EOL information, something like http://www.freebsd.org/security/ ? Is the lifetime about 6 months per major release?

cassie

2013-05-21 Thread Radim Kolar
http://wiki.apache.org/cassandra/ClientOptionsThrift add cassie - https://github.com/twitter/cassie

Re: Time to roll 1.1.12?

2013-05-21 Thread Radim Kolar
* fsync leveled manifest to avoid corruption (CASSANDRA-5535) Are you sure that this does not have a performance impact? Most filesystems sync all their data, not just one file. Write to a .new file and then do a rename.
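
The write-to-.new-then-rename suggestion above could look roughly like the following. This is only a sketch under assumptions: the class and method names are hypothetical, it is not Cassandra's manifest code, and it relies on a POSIX-like filesystem where a rename replaces the target atomically. Whether an fsync is still needed before the rename is left open here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ManifestWriter {
    // Hypothetical helper: write the new content to "<name>.new" next to the
    // target, then rename it over the target so readers see either the old
    // or the new file, never a half-written one.
    static void writeViaRename(Path target, String content) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".new");
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        // Assumption: the filesystem supports atomic rename onto an existing
        // file (true on POSIX; other platforms may throw an IOException).
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Calling ManifestWriter.writeViaRename(path, json) twice in a row leaves only the second version on disk and no leftover .new file.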

Re: real leveldb vs cassandra leveldb

2013-02-22 Thread Radim Kolar
On 13.2.2013 16:32, Jonathan Ellis wrote: The only point here that would make a difference in practice is leveldb using a worse hash function. How do you know that it would not make a difference in practice? I have implemented some optimizations from leveldb in cassandra - different L0 leve

Re: real leveldb vs cassandra leveldb

2013-02-14 Thread Radim Kolar
On 13.2.2013 16:32, Jonathan Ellis wrote: The only point here that would make a difference in practice is leveldb using a worse hash function. For us it's not worth making partitioning worse to make compaction better. Then use two hash functions: one for splitting rows across nodes and a second f

real leveldb vs cassandra leveldb

2013-02-13 Thread Radim Kolar
Real leveldb is better in a lot of areas: L0 sstables are 1/10 of the L1 sstable size; tables can be promoted to upper levels if no merging is needed (there is a hole); variable number of sstables per level, but it tries to keep 1:10:100 sstable ratios (not a hard requirement); very important - better hash function.

Re: max_compaction_threshold removed - bad move

2013-01-09 Thread Radim Kolar
if this was renamed then update your documentation: http://www.datastax.com/docs/1.2/configuration/storage_configuration

max_compaction_threshold removed - bad move

2013-01-09 Thread Radim Kolar
Removing max_compaction_threshold in 1.2 was a bad move; keeping it low helps compaction throughput because it lowers the number of disk seeks. If you have redhat linux, select the category "performance tools" or something like that during install, and you will get tools for disk monitoring. Learn to use the

Re: Hector 0.8.0-2 update fails with : "All host pools marked down. Retry burden pushed out to client."

2012-12-04 Thread Radim Kolar
On 3.12.2012 9:15, Bisht, Jaikrit wrote: Hi there, What have been the problems with Hector? Problems with improper detection of down nodes; problems with improper detection of timeouts; some lost updates due to bad timestamp generation (splitting into more mutators helped); lack of support f

Re: Hector 0.8.0-2 update fails with : "All host pools marked down. Retry burden pushed out to client."

2012-12-01 Thread Radim Kolar
After 1 year of experience with Hector, I would recommend: stay away from Hector if possible.

slf4j

2012-11-22 Thread Radim Kolar
instead of this: if (logger.isDebugEnabled()) logger.debug("INDEX LOAD TIME for " + descriptor + ": " + (System.currentTimeMillis() - start) + " ms."); do this: logger.debug("INDEX LOAD TIME for {} : {} ms.", descriptor, (System.currentTimeMillis() - start));

Re: findbugs

2012-11-05 Thread Radim Kolar
On 30.7.2012 16:47, Edward Capriolo wrote: I am sure no one would have an issue with an optional findbugs target. https://issues.apache.org/jira/browse/CASSANDRA-4891 Here you have an optional findbugs target.

Re: maximum sstable size

2012-11-03 Thread Radim Kolar
On 4.11.2012 1:24, Edward Capriolo wrote: I have another ticket open for this. Which one?

Re: maximum sstable size

2012-11-03 Thread Radim Kolar
done https://issues.apache.org/jira/browse/CASSANDRA-4897

maximum sstable size

2012-10-29 Thread Radim Kolar
Is it possible to implement a maximum sstable size for tieredcompactionpolicy without many code changes? I am using it in lucene with a really good performance effect; max size is 4 GB, dataset total size is 30 GB. It prevents lucene from creating too big a segment, which takes too long to be merged wi

Re: customizable size tiered compaction

2012-09-22 Thread Radim Kolar
On 22.9.2012 16:01, Jonathan Ellis wrote: It's a compaction strategy option, so it's cluster-wide for that strategy. Coded in https://issues.apache.org/jira/browse/CASSANDRA-4704

customizable size tiered compaction

2012-09-22 Thread Radim Kolar
I am interested in experiments with size tiered compaction, because I get sstables which are never compacted because no other sstable is close to their size. I plan to experiment with the bucket ratio, which is currently 50-150 percent, to make it 33-200 percent. It's all about changing constan
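
To illustrate what widening the bucket ratio changes, here is a simplified sketch of size-based bucketing. It is not Cassandra's actual compaction code: the class name, the method signature, and the rule "a size joins a bucket if it lies within [low, high] times the bucket's average" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SizeBuckets {
    // Groups sstable sizes into buckets. A size joins the first bucket whose
    // current average satisfies low * avg <= size <= high * avg; otherwise it
    // starts a new bucket. (Hypothetical simplification for illustration.)
    static List<List<Long>> bucket(long[] sizes, double low, double high) {
        long[] sorted = sizes.clone();
        Arrays.sort(sorted);
        List<List<Long>> buckets = new ArrayList<>();
        List<Double> averages = new ArrayList<>();
        for (long size : sorted) {
            boolean placed = false;
            for (int i = 0; i < buckets.size(); i++) {
                double avg = averages.get(i);
                if (size >= low * avg && size <= high * avg) {
                    List<Long> members = buckets.get(i);
                    // Update the running average before adding the new member.
                    averages.set(i, (avg * members.size() + size) / (members.size() + 1));
                    members.add(size);
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                List<Long> members = new ArrayList<>();
                members.add(size);
                buckets.add(members);
                averages.add((double) size);
            }
        }
        return buckets;
    }
}
```

With sizes 100 and 190, a 50-150 percent ratio (0.5, 1.5) keeps them in separate buckets, so neither is ever compacted with the other, while a 33-200 percent ratio (0.33, 2.0) puts them in the same bucket.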

Re: findbugs

2012-07-30 Thread Radim Kolar
On 30.7.2012 16:52, Jonathan Ellis wrote: Is Jenkins smart enough to be able to say, "I know we had X findbugs warnings previously, which are known to be false positives, but now there are X+1" ? Yes. Look at the hadoop project's pre-commit check builds.

Re: findbugs

2012-07-30 Thread Radim Kolar
I am using maven to build cassandra. I didn't have in mind to contribute a build system because you are not interested in maven. In maven you just call the findbugs plugin; nothing special to contribute. I had in mind a patch fixing various FB-discovered problems, but because it's difficult to post it as

Re: findbugs

2012-07-30 Thread Radim Kolar
Was any decision about findbugs made? Do you not consider the code style recommended by findbugs as good practice which should be followed? I can submit a few findbugs patches, but it will probably turn into a flamewar, WE vs FINDBUGS, like there: https://issues.apache.org/jira/browse/HADOOP-8619 find

Re: findbugs

2012-07-23 Thread Radim Kolar
On 23.7.2012 16:34, Zoltan Farkas wrote: In general, I prefer integrating findbugs into the build process and fail the build if issues are found. I am a strong believer in this approach, increases the quality of the project significantly. That's true, I am currently in the process of fixing fin

Re: findbugs

2012-07-23 Thread Radim Kolar
The line numbers here don't appear to match with trunk. You are right, it was from an old trunk, 415 commits old. It was just a demo of findbugs; for serious use, developers should install the findbugs maven plugin or the eclipse plugin (preferred).

findbugs

2012-07-22 Thread Radim Kolar
I ran findbugs on cassandra and it returns 69 possible errors. The most problematic part of the code is CQL - a lot of null pointer problems there. Some interesting errors: C:/apache-nutch/eclipse/cassandra/src/java/org/apache/cassandra/service/AntiEntropyService.java:916 Condition.await() not in loop i
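
For context on the "Condition.await() not in loop" warning: a thread waiting on a java.util.concurrent.locks.Condition can wake up spuriously, so the usual pattern is to re-check the awaited state in a loop. The sketch below shows that pattern in isolation; the class is hypothetical and is not the AntiEntropyService code.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class FinishedFlag {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition done = lock.newCondition();
    private boolean finished = false;

    // Marks the work as finished and wakes all waiters.
    void finish() {
        lock.lock();
        try {
            finished = true;
            done.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Blocks until finish() has been called. The while loop re-checks the
    // flag after every wakeup, so a spurious wakeup cannot return early.
    void awaitFinished() throws InterruptedException {
        lock.lock();
        try {
            while (!finished) {
                done.await();
            }
        } finally {
            lock.unlock();
        }
    }

    boolean isFinished() {
        lock.lock();
        try {
            return finished;
        } finally {
            lock.unlock();
        }
    }
}
```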

Re: Cassandra in memory key index

2012-06-08 Thread Radim Kolar
On 8.6.2012 21:19, Jason Rutherglen wrote: Ok looks like the IndexSummary encapsulates everything, I can start with hacking that. Do the memory part first. I want to test it on existing serialized index data.

Re: Cassandra in memory key index

2012-06-08 Thread Radim Kolar
If you are interested I can help, I used the FST on a Hadoop project to implement a fast map side range join. Create a JIRA item with the patch attached; I will test it.

Re: PerRowSecondaryIndex

2012-05-27 Thread Radim Kolar
On 27.5.2012 12:54, Fábio Caldas wrote: by the way ... Solr with Cassandra was a great idea .. I'm using it and loving ... How much data do you have stored in solr?

Re: make default download cassandra 1.0

2012-05-19 Thread Radim Kolar
The message was wrong; it should be cass 1.1 vs 1.0. Cassandra 1.1 needs some time to stabilize. It took months to get cassandra 1.0 stable after it was released. The reworked schema changes in cass 1.1 produce some really weird bugs, like an entire keyspace disappearing (data are still there). I think t

make default download cassandra 1.0

2012-05-18 Thread Radim Kolar
Because cassandra 1.0 is not sufficiently stable, what about making cassandra 1.0 the default download and adding a bottom line - cassandra 1.0 is also available? I have seen this in other projects.

Re: [VOTE] Release Apache Cassandra 1.0.10

2012-05-04 Thread Radim Kolar
On 4.5.2012 18:33, Sylvain Lebresne wrote: CASSANDRA-4116 is kind of a big deal. Is it a bugfix or an improvement?

Re: [VOTE] Release Mojo's Cassandra Maven Plugin 1.0.0-1

2012-05-03 Thread Radim Kolar
I'd like to release version 1.1.0-1 of Mojo's Cassandra Maven Plugin. What is this plugin supposed to do?

maven 3 build system

2012-04-27 Thread Radim Kolar
In general, maintaining the pom is something that can fall off the C* devs. Maven is a really easy tool once you get it going and gain the necessary knowledge. It is really well integrated in Eclipse and in Jenkins, there are plugins for nearly anything, and writing your own plugins is easy and you can

Re: RFC: Cassandra Virtual Nodes

2012-03-19 Thread Radim Kolar
Hi Radim, The number of virtual nodes for each host would be configurable by the user, in much the same way that initial_token is configurable now. A host taking a larger number of virtual nodes (tokens) would have proportionately more of the data. This is how we anticipate support for heterog
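
The idea quoted above, that a host taking more tokens holds proportionately more data, can be sketched with a toy token ring. Everything here is an assumption for illustration: tokens are random positions on a unit ring rather than real partitioner tokens, and the class and host names are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

final class WeightedRing {
    // Token position on the unit ring [0, 1) mapped to the owning host.
    private final TreeMap<Double, String> ring = new TreeMap<>();
    private final Random random;

    WeightedRing(long seed) {
        this.random = new Random(seed);
    }

    // Gives the host the requested number of randomly placed tokens.
    void addHost(String host, int tokens) {
        for (int i = 0; i < tokens; i++) {
            ring.put(random.nextDouble(), host);
        }
    }

    // Owner of a position: the host of the first token at or after it,
    // wrapping around to the first token on the ring.
    String ownerOf(double position) {
        Map.Entry<Double, String> entry = ring.ceilingEntry(position);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Fraction of the ring owned by each host (requires at least one token).
    Map<String, Double> ownership() {
        Map<String, Double> share = new HashMap<>();
        double previous = ring.lastKey() - 1.0; // wrap-around range
        for (Map.Entry<Double, String> entry : ring.entrySet()) {
            share.merge(entry.getValue(), entry.getKey() - previous, Double::sum);
            previous = entry.getKey();
        }
        return share;
    }
}
```

Adding one host with 300 tokens and another with 100 tokens should give the first host roughly three quarters of the ring, with some variance because the tokens are random.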

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Radim Kolar
I don't like that every node will have the same portion of data. 1. We are using nodes with different HW sizes (number of disks). 2. Especially with the ordered partitioner there tend to be hotspots, and you must assign a smaller portion of data to nodes holding hotspots.

Re: Cassandra has moved to Git

2012-01-07 Thread Radim Kolar
On 5.1.2012 7:22, Peter Schuller wrote: (And btw, major +1 on the transition to git!) Please fix the github mirror already.

Re: Cassandra has moved to Git

2011-12-29 Thread Radim Kolar
git://git.apache.org/cassandra.git Does this still work?

Re: major version release schedule

2011-12-20 Thread Radim Kolar
Nobody's forcing you to upgrade. If you want twice as much time between upgrading, just wait for 1.2. Currently the 1.0 branch is still less stable than 0.8; I still get OOM on some nodes. Adding the 1.1 feature set on top will make it less stable. It's also worth noting that waiting for 2x as many f

major version release schedule

2011-12-20 Thread Radim Kolar
http://www.mail-archive.com/dev@cassandra.apache.org/msg01549.html I read it but things are different now because magic 1.0 is out. If you implement 1.0 and put it into production, you really do not want to retest your app on a new version every 4 months, and it's unlikely that you will get migration a

Re: 1.1 freeze approaching

2011-12-19 Thread Radim Kolar
Just a reminder that for us to meet our four-month major release schedule (i.e., 1.1 = Feb 18), Can you make the release cycle slower? It's better to have more new features and do major upgrades less often. It saves time needed for testing and migrations.

Re: hintedhandoff in 1.0.3

2011-11-17 Thread Radim Kolar
On 16.11.2011 18:17, Jonathan Ellis wrote: Keys in HCF are nodes it has hints for. Because it is a 2-node cluster, it must write HH to itself, and that explains why, after the second node comes back, HH for it are delivered and cleaned but HH with the second key are never delivered.

Re: How is Cassandra being used?

2011-11-17 Thread Radim Kolar
On 16.11.2011 23:58, Bill wrote: We'll turn this off, and would possibly patch it out of the code. That's not to say it wouldn't be useful to others. We patch the spyware out of the code in ehcache and quartz too. This is the only way to be sure that it will not be enabled by a configuration mistake. W

Re: How is Cassandra being used?

2011-11-15 Thread Radim Kolar
ppl hate EHCache and Quartz for doing this.

Re: hintedhandoff in 1.0.3

2011-11-15 Thread Radim Kolar
Same problem on other node: 2 keys in HintsColumnFamily. One delivered, one left. INFO [HintedHandoff:1] 2011-11-15 10:31:53,181 HintedHandOffManager.java (line 268) Started hinted handoff for token: 99070591730234615865843651857942052864 INFO [HintedHandoff:1] 2011-11-15 10:32:49,385 Colum

hintedhandoff in 1.0.3

2011-11-15 Thread Radim Kolar
I suspect these partial/invalid hints are left over from a failed hints delivery from before you upgraded to 1.0.3 and not something created by 1.0.3. Try to clear HintsColumnFamily (by removing the sstables for example) first and then see if you still can reproduce this issue afterwards. it s

Re: [VOTE] Release Apache Cassandra 1.0.3

2011-11-14 Thread Radim Kolar
I'm not sure why hints are not working for you. You might have hit some other issue. Some suggestions: 1. Verify that HintsColumnFamily actually contains some data with cassandra-cli and "list HintsColumnFamily" yes 2. Try restarting the node containing the hints to check if that gets your h

Re: [VOTE] Release Apache Cassandra 1.0.3

2011-11-14 Thread Radim Kolar
-1. fix for (CASSANDRA-3466) is included (no exception this time) but hints are not delivered to other node: Has anybody tested this? Did the included fix work for you?

Re: [VOTE] Release Apache Cassandra 1.0.3

2011-11-12 Thread Radim Kolar
-1 fix for (CASSANDRA-3466) is included (no exception this time) but hints are not delivered to other node: INFO [GossipTasks:1] 2011-11-12 15:05:35,001 Gossiper.java (line 759) InetAddress /**.99.40 is now dead. WARN [pool-1-thread-1] 2011-11-12 15:06:11,514 Memtable.java (line 169) s

Re: [VOTE] Release Apache Cassandra 1.0.3

2011-11-11 Thread Radim Kolar
-1 unless (CASSANDRA-3466) is included

Better CF stats

2011-10-06 Thread Radim Kolar
I need more detailed CF stats. Currently CASS supports read/write stats and cache hit ratio. I am interested in: 1. key not found: like get cf['non-existent-key'] 2. hits to tombstone: row existed but it is tombstoned now. Is this easy enough to implement?

slow read performance with leveldb compactor

2011-10-04 Thread Radim Kolar
Lets say i have this: { "generations" : [ { "generation" : 0, "members" : [ 650, 651, 652, 653, 654 ] }, { "generation" : 1, "members" : [ 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648 ] }, { "generation" : 2, "members" : [ 566, 575, 576, 578, 579, 580, 582,

avro in binary distro

2011-09-25 Thread Radim Kolar
The Cassandra binary distribution contains avro-1.4.0-sources-fixes.jar, which is a source code jar. Is it packaged by mistake?