Re: SurgeCon 2012

2012-09-06 Thread Dan Kuebrich
I'm down--we had a good mini-meetup last year at lunch. How about trying to get something together on Wed or Thurs night? On Wed, Sep 5, 2012 at 5:46 PM, Chris Burroughs wrote: > Surge [1] is scalability focused conference in late September hosted in > Baltimore. It's a pretty cool conference w

Re: Python CQL Batching is slower than single statements

2012-01-25 Thread Dan Kuebrich
Not that familiar with CQL in particular, but what timeout is set in pycassa? It could be too low for your batch size. If your request is timing out, it will do exponential back off between retries. On Jan 25, 2012 2:53 AM, "aaron morton" wrote: > There are few slight differences in the execution
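To illustrate the retry behaviour mentioned above, here is a minimal sketch of exponential backoff between retries. It is not pycassa's actual retry code: `retry_with_backoff`, `max_retries`, and `base_delay` are hypothetical names chosen for illustration, and the doubling schedule is an assumption.

```python
import time


def retry_with_backoff(operation, max_retries=5, base_delay=0.01, sleep=time.sleep):
    """Call operation(); on TimeoutError, wait and retry with doubling delays.

    Returns (result, delays), where delays lists the waits that were applied.
    Hypothetical helper: this is NOT pycassa's implementation.
    """
    delays = []
    for attempt in range(max_retries + 1):
        try:
            return operation(), delays
        except TimeoutError:
            if attempt == max_retries:
                raise  # out of retries, surface the timeout
            delay = base_delay * (2 ** attempt)  # base, 2x base, 4x base, ...
            delays.append(delay)
            sleep(delay)
```

With a schedule like this, a batch that keeps timing out can spend far longer waiting than the same statements sent one at a time, which may be why the batch looks slower.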

Re: Surgecon Meetup?

2011-09-26 Thread Dan Kuebrich
I'll be at Surge on Thursday, would love to meet up. Anyone else planning to be there? On Sun, Sep 25, 2011 at 7:27 PM, Chris Burroughs wrote: > Surge [1] is scalability focused conference in late September hosted in > Baltimore. It's a pretty cool conference with a good mix of > operationally

Re: dropping secondary indexes

2011-08-18 Thread Dan Kuebrich
family Standard4 > with comparator = BytesType > and column_metadata = > [ > { > column_name : 'other name', > validation_class : LongType > }]; > > > Cheers > > - > Aaron Morton > Freelance Cassand

Re: dropping secondary indexes

2011-08-17 Thread Dan Kuebrich
). > > There is a minor over head, but only if the named column is updated. > > Cheers > > ----- > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 17/08/2011, at 2:12 AM, Dan Kuebrich wrote: > > > I

dropping secondary indexes

2011-08-16 Thread Dan Kuebrich
I think I've dropped all the indexes on a CF, but I see traces of them in the CLI output of show keyspaces. I see a few validators left behind, and one "built index". (output below) 1. Is there a better way to check schema for indexes? 2. I can't drop the "built" one so I assume they're all gone

Re: strange json2sstable cast exception

2011-08-06 Thread Dan Kuebrich
le2json with expiring columns. > > On Sat, Aug 6, 2011 at 9:29 AM, Dan Kuebrich wrote: >> Having run into a recurring compaction problem due to a corrupt sstable >> (perceived row size was 13 petabytes or something), I sstable2json -x 'd >> the key and am now trying to

strange json2sstable cast exception

2011-08-06 Thread Dan Kuebrich
Having run into a recurring compaction problem due to a corrupt sstable (perceived row size was 13 petabytes or something), I sstable2json -x 'd the key and am now trying to re-import the sstable without it. However, I'm running into the following exception: Importing 2882 keys... java.lang.Clas

Re: Storing single rows on multiple nodes

2011-07-09 Thread Dan Kuebrich
Perhaps I misunderstand your proposal, but it seems that even with your manual key placement schemes, the row would still be huge, no matter what node it gets placed on. A better solution might be figuring out how to make each row into a few smaller ones to get better balancing of load and also fa
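One way to turn a single huge row into a few smaller ones is to derive a shard suffix from the column name. The sketch below is only an illustration under that assumption: `shard_row_key` and `num_shards` are hypothetical names, and CRC32 is an arbitrary choice of stable hash.

```python
import zlib


def shard_row_key(row_key, column_name, num_shards=8):
    """Map a (row, column) pair to one of num_shards smaller rows.

    Hypothetical helper: the same column always lands in the same shard,
    so a reader can recompute the row key from the column name.
    """
    shard = zlib.crc32(column_name.encode("utf-8")) % num_shards
    return f"{row_key}:{shard}"
```

Reading the whole logical row then means fetching all `num_shards` row keys, which the partitioner can spread across different nodes.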

Re: 8.0.1 Released - Debian Package ETA?

2011-06-28 Thread Dan Kuebrich
Try running apt-get update (as opposed to upgrade) to pull down the latest listings from the repo. On Tue, Jun 28, 2011 at 8:40 PM, Oleg Tsvinev wrote: > Thank you Dan! But I only see 0.8.0 there :( > > > On Tue, Jun 28, 2011 at 5:35 PM, Dan Kuebrich wrote: > >> 0.8

Re: 8.0.1 Released - Debian Package ETA?

2011-06-28 Thread Dan Kuebrich
0.8.1 should be up--I've already installed it. Here are directions: http://wiki.apache.org/cassandra/DebianPackaging On Tue, Jun 28, 2011 at 8:24 PM, Oleg Tsvinev wrote: > Hi, > > First of all, thank you for releasing v8.0.1 and congrats! the list of > fixes and improvements is impressive. > Is th

Re: RAID or no RAID

2011-06-27 Thread Dan Kuebrich
Not sure what the intended purpose is, but we've mostly used it as an emergency disk-capacity-increase option. It's not as good as RAID because each disk's size is counted individually (a compacted sstable can only be on one disk) so compaction size limits aren't expanded as one might expect. On Mo

Re: solandra or pig or....?

2011-06-21 Thread Dan Kuebrich
Solandra is indeed distributed search, not distributed number-crunching. As a previous poster said, you could imagine structuring the data in a series of documents with fields containing playername, teamname, position, location, day, time, inning, at bat, outcome, etc. Then you could query to get

Re: Cassandra Statistics and Metrics

2011-06-14 Thread Dan Kuebrich
Here's what people usually monitor from munin (and how they get at it): https://github.com/jbellis/cassandra-munin-plugins . Sounds a lot like what these guys are doing (even the stack?): http://datadoghq.com/ On Tue, Jun 14, 2011 at 10:13 AM, Viktor Jevdokimov wrote: > We're using open source m

Re: problem in using get_range() function

2011-06-13 Thread Dan Kuebrich
Are you using the order preserving partitioner or the random partitioner for this CF? In order to get the results you expect, you'll need to use the OPP. More info: http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/ On Mon, Jun 13, 2011 at 8:47 AM,
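The difference can be pictured with a simplified model: an order-preserving partitioner walks rows in key order, while a random partitioner walks them in the order of a hash of the key. The sketch below assumes an MD5-derived token and uses hypothetical names (`md5_token`, `scan_order`); it is not Cassandra's code.

```python
import hashlib


def md5_token(key):
    """Simplified stand-in for a random-partitioner token: MD5 digest as an integer."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def scan_order(keys, order_preserving):
    """Return keys in the order a range scan would visit them in this model."""
    if order_preserving:
        return sorted(keys)               # key order: range queries behave as expected
    return sorted(keys, key=md5_token)    # token order: unrelated to key order
```

In this model a range scan over hashed tokens still returns every row, just not in an order that is meaningful for key ranges.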

Re: how to know there are some columns in a row

2011-06-07 Thread Dan Kuebrich
There might not be a built-in way to do this, but if you make two rows for each author, eg: nabokov_fulltext [ 'lolita' : 'Lolita, light of my life ...' , ...] nabokov_bookindex [ 'lolita' : None , ... ] you could query the bookindex for each author without cassandra having to load the full texts
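The two-rows-per-author layout can be sketched with plain dictionaries standing in for rows. This is an in-memory illustration only, not a Cassandra client call: `store`, `add_book`, and `list_titles` are hypothetical names.

```python
# Each top-level key plays the role of a row key; each inner dict is that row's columns.
store = {}


def add_book(author, title, text):
    """Write the full text to one row and only the title to a small index row."""
    store.setdefault(f"{author}_fulltext", {})[title] = text
    store.setdefault(f"{author}_bookindex", {})[title] = None  # empty value, name only


def list_titles(author):
    """Answer 'which books exist?' from the index row, never touching full texts."""
    return sorted(store.get(f"{author}_bookindex", {}))
```

The cost of this layout is that every write has to update both rows to keep the index in step with the full texts.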

Re: CLI set command returns null

2011-06-07 Thread Dan Kuebrich
Null response may mean an error on the server side. Have you checked your cassandra server's logs? On Tue, Jun 7, 2011 at 2:22 PM, AJ wrote: > Ver 0.8.0. > > Please help. I don't know what I'm doing wrong. One simple keyspace with > one simple CF with one simple column. I've tried two simple

Re: Appending to fields

2011-05-31 Thread Dan Kuebrich
On Tue, May 31, 2011 at 4:57 PM, Victor Kabdebon wrote: > As Jonathan stated I believe that the insert is in O(N + M), unless there > are some operations that I don't know. > > There are other NoSQL database that can be used with Cassandra as > "buffers" for quick access and modification and then

Re: Priority queue in a single row - performance falls over time

2011-05-25 Thread Dan Kuebrich
It sounds like the problem is that the row is getting filled up with tombstones and becoming enormous? Another idea then, which might not be worth the added complexity, is to progressively use new rows. Depending on volume, this could mean having 5-minute-window rows, or 1 minute, or whatever wor
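A minimal sketch of the time-window idea, assuming row keys are built from a queue name plus the start of the window. `bucket_row_key` and `window_seconds` are hypothetical names, and the 300-second default just mirrors the 5-minute example.

```python
def bucket_row_key(queue_name, timestamp, window_seconds=300):
    """Return the row key for the time window that contains timestamp.

    Hypothetical helper: all entries written in the same window share a row,
    so old rows (and their tombstones) stop being read once the window passes.
    """
    ts = int(timestamp)
    bucket_start = ts - ts % window_seconds  # round down to the window boundary
    return f"{queue_name}:{bucket_start}"
```

A consumer would then read the current window's row and possibly the previous one, instead of scanning one ever-growing row.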

Re: Upgrade to a different version?

2011-03-17 Thread Dan Kuebrich
Do people have success stories with 0.7.4? It seems like the list only hears if there's a major problem with a release, which means that if you're trying to judge the stability of a release you're looking for silence. But maybe that means not many people have tried it yet. Is there a record of t

Re: Storing photos, images, docs etc.

2011-03-03 Thread Dan Kuebrich
It's still maintained: https://github.com/mogilefs/ . I don't have a good sense of the community, though we did use it at my last job. On Thu, Mar 3, 2011 at 3:44 PM, mcasandra wrote: > Well it's not just metadata that I need to store but also Username, > profiles, > followers etc. What I meant

Re: "null" vs "value not found"?

2011-02-24 Thread Dan Kuebrich
I should mention that it took me a while to figure this out too. Might be a candidate for an improvement in the cli? On Thu, Feb 24, 2011 at 4:01 PM, buddhasystem wrote: > > Thanks! You are right. I see exception but have no idea what went wrong. > > > ERROR [ReadStage:14] 2011-02-24 21:51:29,3

Re: "null" vs "value not found"?

2011-02-24 Thread Dan Kuebrich
When I've gotten "null" as a result in cassandra-cli, it turned out to mean that there were exceptions being thrown on the server side. Have you checked your Cassandra logs? On Thu, Feb 24, 2011 at 3:44 PM, buddhasystem wrote: > > Thanks Tyler, > >ColumnFamily: index1 > Columns sorted b

Re: read latency in cassandra

2011-02-20 Thread Dan Kuebrich
the thrift client level below pycassa, so it should avoid library internals. I'm not sure how to compare this to the jmx latency report. Apologies for being so verbose! dan Sorry for all the questions, the answer to your initial question is "mmm, > that does not sound right.

Re: RandomPartitioner

2011-02-14 Thread Dan Kuebrich
You may find this part of the wiki helpful: http://wiki.apache.org/cassandra/Operations#Range_changes "If you explicitly specify an InitialToken in the configuration, the new node will bootstrap to that position on the ring. Otherwise, it will pick a Token that will give it half the keys from the

read latency in cassandra

2011-02-04 Thread Dan Kuebrich
Hi all, It often takes more than two seconds to load:
- one row of ~450 events comprising ~600k
- cluster size of 1
- client is pycassa 1.04
- timeout on recv
- cold read (I believe)
- load generally < 0.5 on a 4-core machine, 2 EC2 instance store drives for cassandra
- cpu wait generally < 1%
O

Re: Using Cassandra to store files

2011-02-03 Thread Dan Kuebrich
> > > CouchDB > That's not what document-oriented means! (har har) I don't know all the details of your case, but with serving static files I suspect you could do ok with something that has a much smaller memory/cpu footprint as you won't have as great of write throughput / read latency concerns.

Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-14 Thread Dan Kuebrich
We've done hundreds of gigs in and out of cassandra 0.6.8 with pycassa 0.3. Working on upgrading to 0.7 and pycassa 1.03. I don't know if we're using it wrong, but the "connection object is tied to a particular keyspace" constraint isn't that awesome--we have a number of keyspaces used simultaneo

Re: Cassandra Monitoring

2010-12-17 Thread Dan Kuebrich
Is anyone using cassandra with monit? All I have is this embarrassing bit of monit config:
check process cassandra with pidfile /var/run/cassandra.pid
  start program = "/etc/init.d/cassandra start" with timeout 60 seconds
  stop program = "/etc/init.d/cassandra stop"
  if failed port 9160 type