Re: pig_cassandra problem - "Incompatible field schema" error

2011-10-11 Thread Pete Warden
For posterity, I ended up hacking around this by renaming the repeated 'value' alias in CassandraStorage and rebuilding it. Here's the patch: --- src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java.original 2011-10-11 23:42:19.0 -0700 +++ src/java/org/apache/cassandra/hadoop/pig

Re: Hector Problem Basic one

2011-10-11 Thread CASSANDRA learner
Thanks for the reply, Ben. Actually, the problem is that I was not able to run a basic Hector example from Eclipse. It's throwing "me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client". Can you please let me know why I am getting this,,,

Re: Cassandra as session store under heavy load

2011-10-11 Thread Maciej Miklas
- RF is 1. We have a few keyspaces; only this one is not replicated - this data is not very important. In case of an error, the customer will have to execute the process again. But again, I would like to persist it. - Serializing data is not an option, because I would like to have the possibility to access data

Re: pig_cassandra problem - "Incompatible field schema" error

2011-10-11 Thread Pete Warden
Thanks for all your help Brandon and Jeremy, that got me to the point where I could load data. I'm now hitting a new issue that seems like it could possibly be related. When I try to access the data like this: grunt> rows = LOAD 'cassandra://Frap/FriendsAlreadyRanked' USING CassandraStorage(); gr

Re: CompletedTasks attribute exposed via JMX

2011-10-11 Thread Tyler Hobbs
The OpsCenter graph you're referring to basically does the following: 1. For each node, find out how much the WriteOperations attribute of the StorageProxy increased during the last minute. 2. Sum these values to get a total for the cluster. 3. Divide by 60 to get an average number of WriteOperati
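
The steps described in this message can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not OpsCenter's actual code: the function name, node names, and counter values are made up, and reading the WriteOperations counter from each node (for example over JMX) is assumed to happen elsewhere. Treating a counter that went backwards as zero growth is also an assumption, added here to cope with node restarts.

```python
def cluster_writes_per_second(previous, current, interval_seconds=60):
    """Average cluster-wide writes per second over one sampling interval.

    previous and current map a node name to that node's cumulative
    WriteOperations counter at the start and end of the interval.
    """
    total_increase = 0
    for node, count_now in current.items():
        # A node with no earlier sample contributes nothing this round.
        count_before = previous.get(node, count_now)
        # Assumption: a negative delta means the counter was reset.
        total_increase += max(count_now - count_before, 0)
    return total_increase / interval_seconds


# Hypothetical samples taken one minute apart on a three-node cluster.
before = {"node1": 1000, "node2": 2500, "node3": 400}
after = {"node1": 1600, "node2": 3100, "node3": 1000}
print(cluster_writes_per_second(before, after))
```

Each node's counter grew by 600 in this made-up sample, so the cluster total is 1800 writes over 60 seconds.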

Using ttl to expire columns rather than using delete

2011-10-11 Thread Terry Cumaranatunge
Hello, If you set a ttl and expire a column, I've read that this eventually turns into a tombstone and will be cleaned out by the GC. Are expirations considered a form of delete that still requires a node repair to be run in gc_grace_period seconds? The operations guide says you have to run node r

Re: anyway to throttle nodetool repair?

2011-10-11 Thread Yan Chunlu
as I asked earlier: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/how-does-compaction-throughput-kb-per-sec-affect-disk-io-td6831711.html might not directly throttle the disk I/O? it would be easy if ionice could work with cassandra. not sure it is because of jvm or something e

Re: nodetool cfstats on 1.0.0-rc1 throws an exception

2011-10-11 Thread Jonathan Ellis
Are all 3 CFs using compression? On Tue, Oct 11, 2011 at 4:43 PM, Günter Ladwig wrote: > Hi all, > > I'm seeing the same problem on my 1.0.0-rc2 cluster. However, I do not have > 5000, but just three (compressed) CFs. > > The exception does not happen for the Migrations CF, but for one of mine:

Re: 0.7.9 RejectedExecutionException

2011-10-11 Thread Jonathan Ellis
grep -i 'killed process' /var/log/messages On Tue, Oct 11, 2011 at 5:25 PM, Ashley Martens wrote: > So we created a script to check if Cassandra is alive and run it every two > minutes. Here are some results for today: > > Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again

Re: Volunteers needed - Wiki

2011-10-11 Thread hani elabed
Hi Aaron, I got an account to the wiki, logged in, and claimed the 'Configuration' page a.k.a 'Storage Configuration' for now. I will let you know when done or if I get stumped. Will also work on "Setting up Eclipse" page and put it somewhere. Hani On Mon, Oct 10, 2011 at 4:24 PM, aaron morton wro

Re: Operator on secondary indexes in 0.8.x (GTE/LTE)

2011-10-11 Thread Jonathan Ellis
simple, elegant, and less performant than just doing a range scan without the index. :) On Tue, Oct 11, 2011 at 4:06 PM, Sasha Dolgy wrote: > ah, hadn't even thought of that.  simple.  elegant. > cheers. > > On Tue, Oct 11, 2011 at 11:01 PM, Jake Luciani wrote: >> >> This hasn't changed in AFAIK

Re: pig_cassandra problem - "Incompatible field schema" error

2011-10-11 Thread Brandon Williams
On Tue, Oct 11, 2011 at 4:24 PM, Pete Warden wrote: > I'm trying to run the most basic example for pig_cassandra, counting the > number of rows in a column family, and I'm hitting the following error: > 2011-10-11 14:13:32,321 [main] ERROR org.apache.pig.tools.grunt.Grunt - > ERROR 1031: Incompata

Re: pig_cassandra problem - "Incompatible field schema" error

2011-10-11 Thread Jeremy Hanna
Just for informational purposes, Pete and I tried to troubleshoot it via twitter. I was able to do the following with Cassandra 0.8.1 and Pig 0.9.1. He's going to dig in to see if there's something else going on. // Cassandra-cli stuff // bin/cassandra-cli -h localhost -p 9160 create keyspace

Re: 0.7.9 RejectedExecutionException

2011-10-11 Thread Ashley Martens
So we created a script to check if Cassandra is alive and run it every two minutes. Here are some results for today: Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again Tue Oct 11 19:00:10 UTC 2011 - F this Cassandra bullshit... it died again Tue Oct 11 19:30:10 UTC 2011 - F

Re: Volunteers needed - Wiki

2011-10-11 Thread Sasha Dolgy
while on the topic of the wiki ... it's not entirely pleasing to the senses or at all user friendly ... hacking around on it earlier today, there aren't that many options on how to give it some flair ... shame really that for such a cool piece of software, the wiki doesn't scream the same level of

Re: Volunteers needed - Wiki

2011-10-11 Thread Daria Hutchinson
Sounds like a good place to start! Thanks for taking the lead and please let me know how I can help! Daria On Tue, Oct 11, 2011 at 2:20 PM, aaron morton wrote: > Thanks Daria, I have a look at whats there and get in touch. > > Right now I'm not thinking beyond getting the wiki complete (e.g. it

Re: nodetool cfstats on 1.0.0-rc1 throws an exception

2011-10-11 Thread Günter Ladwig
Hi all, I'm seeing the same problem on my 1.0.0-rc2 cluster. However, I do not have 5000, but just three (compressed) CFs. The exception does not happen for the Migrations CF, but for one of mine: Keyspace: KeyspaceCumulus Read Count: 816 Read Latency: 8.926029411764706 ms.

pig_cassandra problem - "Incompatible field schema" error

2011-10-11 Thread Pete Warden
I'm trying to run the most basic example for pig_cassandra, counting the number of rows in a column family, and I'm hitting the following error: 2011-10-11 14:13:32,321 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1031: Incompatable field schema: left is "columns:bag{:tuple(name:bytearray

Re: Volunteers needed - Wiki

2011-10-11 Thread aaron morton
Thanks Daria, I'll have a look at what's there and get in touch. Right now I'm not thinking beyond getting the wiki complete (e.g. it lists all the command line tools) and correct for version 1.0. My main concern was people coming away from the site with incorrect information and having a bad out of

Re: Operator on secondary indexes in 0.8.x (GTE/LTE)

2011-10-11 Thread Sasha Dolgy
ah, hadn't even thought of that. simple. elegant. cheers. On Tue, Oct 11, 2011 at 11:01 PM, Jake Luciani wrote: > This hasn't changed in AFAIK, In Brisk we had the same problem in CFS so > we created a sentinel value that all rows shared then it works. > CASSANDRA-2915 should fix it. > > On

Re: Operator on secondary indexes in 0.8.x (GTE/LTE)

2011-10-11 Thread Jake Luciani
This hasn't changed, AFAIK. In Brisk we had the same problem in CFS, so we created a sentinel value that all rows share, and then it works. CASSANDRA-2915 should fix it. On Tue, Oct 11, 2011 at 4:48 PM, Sasha Dolgy wrote: > I was trying to get a range of rows based on a secondary_index that was >

Re: CompletedTasks attribute exposed via JMX

2011-10-11 Thread aaron morton
It's the number of mutations; a mutation is a collection of changes for a single row across one or more column families. Take a look at nodetool cfstats; I assume this is where OpsCenter is getting its data from. Cheers - Aaron Morton Freelance Cassandra Developer @aaro

Re: Cassandra as session store under heavy load

2011-10-11 Thread aaron morton
Some thoughts… > non replicated Key Space Not sure what you mean here. Do you mean RF 1? I would consider using 3. Consider what happens when you want to do a rolling upgrade of the cluster. > single Column Family, where key is session ID and each column within row > stores single key/value -

Operator on secondary indexes in 0.8.x (GTE/LTE)

2011-10-11 Thread Sasha Dolgy
I was trying to get a range of rows based on a secondary_index that was defined. Any rows where age was greater than or equal to ... it didn't work. Is this a continued limitation? Did a quick look in JIRA, couldn't find anything. The output from "help get;" on the cli contains the following, w

Unsubscribe

2011-10-11 Thread Jim Zamata

CassandraDaemon deactivate doesn't shutdown Cassandra

2011-10-11 Thread Shimi Kiviti
I am running an embedded Cassandra (0.8.7), and calling CassandraDaemon.deactivate() after I write rows (at least 1) doesn't shut down Cassandra. If I run only "reads", it does shut down, even without calling CassandraDaemon.deactivate(). Does anyone have any idea what can cause this problem? Shimi

Re: ApacheCon meetup?

2011-10-11 Thread Eric Evans
On Tue, Oct 11, 2011 at 11:05 AM, Eric Evans wrote: > On Tue, Oct 4, 2011 at 2:44 PM, Chris Burroughs > wrote: >> ApacheCon NA is coming up next month.  I suspect there will be at least >> a few Cassandra users there (yeah new release!).  Would anyone be >> interested in getting together and shar

Re: Volunteers needed - Wiki

2011-10-11 Thread Daria Hutchinson
DataStax would like to help with the wiki update effort. For example, we have a start on updates for 1.0, such as the storage configuration. http://www.datastax.com/docs/1.0/configuration/storage_configuration Let me know how we can help. Cheers, Daria (DataStax Tech Writer) Question - Are you

Schema versions reflect schemas on unwanted nodes

2011-10-11 Thread Eric Czech
Hi, I'm having what I think is a fairly uncommon schema issue -- My situation is that I had a cluster with 10 nodes and a consistent schema. Then, in an experiment to set up a second cluster with the same information (by copying the raw sstables), I left the LocationInfo* sstables in the system ke

Re: different size sstable on different nodes?

2011-10-11 Thread Yang
"46e70d80": [["0132f3726cbb30303030303030303030303030303030303030303030303030303030303030303030316431636633","4e945b0e",1318344486784,"d"] for the timestamp perl -e 'print gmtime(1318344486)."\n" ' Tue Oct 11 14:48:06 2011 $ TZ=GMT date Tue Oct 11 17:40:31 GMT 2011 so it's almost 3 hou

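The timestamp check in the message above can also be done in Python. This is a minimal sketch under one assumption read off from the message itself: the column timestamp 1318344486784 is in milliseconds since the Unix epoch, since the perl one-liner drops the last three digits. The function name is hypothetical; the "now" value is the output of the date command quoted in the message.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def column_time_utc(timestamp_ms):
    """Convert a millisecond epoch timestamp to an aware UTC datetime."""
    # timedelta arithmetic keeps the conversion exact (no float rounding).
    return EPOCH + timedelta(milliseconds=timestamp_ms)


# Timestamp of the column shown in the sstable2json output.
written = column_time_utc(1318344486784)
# Wall-clock time reported by `TZ=GMT date` in the message.
observed = datetime(2011, 10, 11, 17, 40, 31, tzinfo=timezone.utc)
age = observed - written

print(written.strftime("%a %b %d %H:%M:%S %Y"))
print(age)
```

The first line printed matches the perl output in the message, and the age comes out a little under three hours.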
different size sstable on different nodes?

2011-10-11 Thread Yang
After I did a major compaction on both nodes in my test cluster, I found that for the same CF, one node has a 100MB sstable file, while the other has a 1GB one. Since GC_grace is set in the schema, and both nodes have the same config, how could this happen? I'm still going through sstable2json to f

Re: add bloomfilter results to nodetool?

2011-10-11 Thread Brandon Williams
On Tue, Oct 11, 2011 at 12:19 PM, Yang wrote: > I find the info about bloomfilter very helpful, could we add that to NodeCmd ? Feel free to create a ticket and tag it 'lhf' -Brandon

add bloomfilter results to nodetool?

2011-10-11 Thread Yang
I find the info about bloomfilter very helpful, could we add that to NodeCmd ? Thanks Yang

Re: ApacheCon meetup?

2011-10-11 Thread Jake Luciani
Sounds good. I'll be giving a talk there about Cassandra 1.0 http://na11.apachecon.com/talks/19500 On Tue, Oct 11, 2011 at 12:05 PM, Eric Evans wrote: > On Tue, Oct 4, 2011 at 2:44 PM, Chris Burroughs > wrote: > > ApacheCon NA is coming up next month. I suspect there will be at least > > a fe

Re: Hector has a website

2011-10-11 Thread Aaron Turner
Just a FYI: http://hector-client.org is requesting a username/pass http://www.hector-client.org is working fine On Fri, Oct 7, 2011 at 12:51 AM, aaron morton wrote: > Thanks, will be handy for new peeps. > A > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http

Re: ApacheCon meetup?

2011-10-11 Thread Eric Evans
On Tue, Oct 4, 2011 at 2:44 PM, Chris Burroughs wrote: > ApacheCon NA is coming up next month.  I suspect there will be at least > a few Cassandra users there (yeah new release!).  Would anyone be > interested in getting together and sharing some stories?  This could > either be a "official" [1] m

Request for Transactional Scenarios

2011-10-11 Thread Henrique Moniz
Hi, From time to time discussions pop up here regarding the transactional or atomic capabilities of Cassandra (or lack thereof). There is at least one project dedicated to solving this problem (i.e., Cages). Unfortunately, in pretty much every discussion or blog post I’ve come across on this subj

CompletedTasks attribute exposed via JMX

2011-10-11 Thread Alexandru Dan Sicoe
Hello everyone, I was trying to get some cluster wide statistics of the total insertions performed in my 3 node Cassandra 0.8.6 cluster. So I wrote a nice little program that gets the CompletedTasks attribute of org.apache.cassandra.db:type=Commitlog from every node, sums up the values and records

Re: Multi DC setup

2011-10-11 Thread Eric Tamme
We already have two separate rings. Idea of bidirectional sync is, if one ring is down, we can still send the traffic to other ring. When original cluster comes back, it will pick up the data from available cluster. I'm not sure if it makes sense to have separate rings or combine these two rings

Re: Multi DC setup

2011-10-11 Thread Brandon Williams
On Tue, Oct 11, 2011 at 2:36 AM, Peter Schuller wrote: > Google/check wiki/read docs about NetworkTopologyStrategy and > PropertyFileSnitch. I don't have a good link to multi-dc off hand > (anyone got a good link to suggest that goes through this?). http://www.datastax.com/docs/0.8/cluster_archit

Cassandra as session store under heavy load

2011-10-11 Thread Maciej Miklas
Hi *, I would like to use Cassandra to store session-related information. I do not have a real HTTP session - it's a different protocol, but the same concept. Memcached would be fine, but I would like to additionally persist data. Cassandra setup: - non replicated Key Space - single Column F

Re: Existing column(s) not readable

2011-10-11 Thread aaron morton
kewl, > * Row is not deleted (other columns can be read, row survives compaction > with GCGraceSeconds=0) IIRC row tombstones can hang around for a while (until gc grace has passed), and they only have an effect on columns that have a lower timestamp. So it's possible to read columns from a row

Re: Hector Problem Basic one

2011-10-11 Thread Ben Ashton
Hey, We had this one. Even though the Hector documentation says that it retries failed servers every 30 seconds by default, it doesn't. Once we explicitly set it to X seconds, whenever there is a failure, e.g. with the network (AWS), it will retry and add the host back into the pool. Ben On 11 October 2011 11:0

Re: Existing column(s) not readable

2011-10-11 Thread Thomas Richter
Hi Aaron, I invalidated the caches but nothing changed. I didn't get the mentioned log line either, but as I read the code SliceByNamesReadCommand uses NamesQueryFilter and not SliceQueryFilter. Next, there is only one SSTable. I can rule out that the row is deleted because I deleted all other r

Hector Problem Basic one

2011-10-11 Thread CASSANDRA learner
Hi everyone, I was using Cassandra a long time back, and when I tried it today, I ran into a problem in Eclipse. When I try to run a basic Hector (Java) example, I get the exception me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden p

Re: Volunteers needed - Wiki

2011-10-11 Thread aaron morton
@maki thanks. Could you take a look at the CLI page http://wiki.apache.org/cassandra/CassandraCli ? There are a lot of online docs in the tool, so we don't need to replicate that. Just a simple getting started guide, some examples and a few tips about what to do if things don't wo

Re: Existing column(s) not readable

2011-10-11 Thread aaron morton
Nothing jumps out. The obvious answer is that the column has been deleted. Did you check all the SSTables? It looks like the query was answered from the row cache; otherwise you would see this as well… DEBUG [ReadStage:34] 2011-10-11 21:11:11,484 SliceQueryFilter.java (line 123) collecting 0 of 214748364

Re: Multi DC setup

2011-10-11 Thread Peter Schuller
> We already have two separate rings. Idea of bidirectional sync is, if one > ring is down, we can still send the traffic to other ring. When original > cluster comes back, it will pick up the data from available cluster. I'm not > sure if it makes sense to have separate rings or combine these two

Re: anyway to throttle nodetool repair?

2011-10-11 Thread Peter Schuller
> so how about disk io?  is there anyway to use ionice to control it? > I have tried to adjust the priority by "ionice -c3 -p [cassandra pid]. >  seems not working... Compaction throttling (and in 1.0 internode streaming throttling) both address disk I/O. -- / Peter Schuller (@scode on twitter)

Re: Volunteers needed - Wiki

2011-10-11 Thread Jérémy SEVELLEC
Hi Aaron, I think the CommitLog section is outdated (http://wiki.apache.org/cassandra/ArchitectureCommitLog): the CommitLogHeader no longer exists since this ticket: https://issues.apache.org/jira/browse/CASSANDRA-2419 Regards, Jérémy 2011/10/11 Sasha Dolgy > maybe that should be the fi