Re: CassandraBulkLoader

2010-07-19 Thread Torsten Curdt
When I run bmt_example, the M/R job gets executed and the Cassandra server gets the data, but it goes as HintedHandoff to 127.0.0.2, and it is trying to send data to 127.0.0.2 as if 127.0.0.2 is an actual node. Well, it kind of becomes an actual node. Any idea why StorageService returns 127.0.0.2

Re: CassandraBulkLoader

2010-07-15 Thread Torsten Curdt
If you could, can you please share the command line function (to load TSV)? There is no command line function ... you have to write code for this. And can you please help me with the part about storing storage-conf.xml on HDFS? As I said. Maybe you had better start with a simpler scenario and leave out HDFS

Re: CassandraBulkLoader

2010-07-13 Thread Torsten Curdt
On Tue, Jul 13, 2010 at 04:35, Mubarak Seyed mubarak.se...@gmail.com wrote: Where can I find the documentation for BinaryMemTable (bmt_example in contrib) to use CassandraBulkLoader? What is the input to be supplied to CassandraBulkLoader? How to form the input data and what is the format of

Re: CassandraBulkLoader

2010-07-13 Thread Torsten Curdt
look at contrib/bmt_example, with the caveat that it's usually premature optimization I wish that was true for us :) Fact: It has always been straightforward to send the output of Hadoop jobs to Cassandra, and Facebook, Digg, and others have been using Hadoop like this as a Cassandra

[OT] Re: unsubscribe

2010-06-22 Thread Torsten Curdt
Hey Dean ...and everyone else not managing to unsubscribe (and sending mails to the list instead): If you don't know how to unsubscribe you can always look at the List-Unsubscribe: header of any of the list emails. These days most of the time you will find that an -unsubscribe suffix is used
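
For anyone who wants to see what that header looks like in practice, here is a minimal Java sketch that pulls the targets out of a List-Unsubscribe value. The class name, method name, and the example.org addresses are made up for illustration; the only assumption about the header is the usual form of one or more targets wrapped in angle brackets and separated by commas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ListUnsubscribeSketch {
    // Matches each <...> target inside the header value.
    private static final Pattern TARGET = Pattern.compile("<([^>]+)>");

    // Returns the unsubscribe targets (mailto: or http(s): URIs) in header order.
    static List<String> targets(String headerValue) {
        List<String> result = new ArrayList<>();
        Matcher m = TARGET.matcher(headerValue);
        while (m.find()) {
            result.add(m.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical header value, not taken from a real list.
        String header = "<mailto:list-unsubscribe@example.org>, <https://example.org/unsubscribe>";
        for (String target : targets(header)) {
            System.out.println(target);
        }
    }
}
```

Sending an empty mail to the mailto: target is typically all it takes; the list software then replies with a confirmation request.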

Re: bulk loading

2010-06-22 Thread Torsten Curdt
I looked at the thrift service implementation and got it working. (Much faster import!) Thanks! On Mon, Jun 21, 2010 at 13:09, Oleg Anastasjev olega...@gmail.com wrote: Torsten Curdt tcurdt at vafer.org writes: First I tried with my one cassandra -f instance then I saw this requires

Re: bulk loading

2010-06-21 Thread Torsten Curdt
You should be using the thrift API, or a wrapper around the thrift API. It looks like you're using internal cassandra classes. The goal is to get around using the overhead of the Thrift API for a bulk import. There is a Java wrapper called Hector, and there was another talked about on the

bulk loading

2010-06-20 Thread Torsten Curdt
I am trying to get the bulk loading example to work for simple CF. List<ColumnFamily> columnFamilies = new LinkedList<ColumnFamily>(); while(...) { String[] fields = ... ColumnFamily columnFamily = ColumnFamily.create(keyspace, family); long now =

Re: Pelops - a new Java client library paradigm

2010-06-14 Thread Torsten Curdt
Also think this looks really promising. The fact that there are so many API wrappers now (3?) doesn't reflect well on the native API though :) /me ducks and runs On Mon, Jun 14, 2010 at 11:55, Dominic Williams thedwilli...@googlemail.com wrote: Hi Ran, thanks for the compliment. It is true that

Re: Beginner Assumptions

2010-06-14 Thread Torsten Curdt
<rant> TBH while we are using super columns, they somehow feel wrong to me. I would be happier if we could move what we do with super columns into the row key space. But in our case that does not seem to be so easy. </rant> I'd be quite interested to learn what you are doing with super columns

Re: Too many ParNew's

2010-06-09 Thread Torsten Curdt
As promised on IRC we also have collected some information as we are seeing (probably) the same problem. https://issues.apache.org/jira/browse/CASSANDRA-1177 On Wed, Jun 9, 2010 at 14:11, aaron morton aa...@thelastpickle.com wrote: May be related to CASSANDRA-1014

Re: Are 6..8 seconds to read 23.000 small rows - as it should be?

2010-06-04 Thread Torsten Curdt
Yes, I know. And I might end up doing this in the end. I do though have pretty hard upper limits of how many rows I will end up with for each key, but anyways it might be a good idea none the less. Thanks for the advice on that one. You set count to Integer.MAX. Did you try with say 3?

Re: Range search on keys not working?

2010-06-02 Thread Torsten Curdt
Sounds like you are not using an order preserving partitioner? On Wed, Jun 2, 2010 at 13:48, David Boxenhorn da...@lookin2.com wrote: Range search on keys is not working for me. I was assured in earlier threads that range search would work, but the results would not be ordered. I'm trying to
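
To illustrate why the partitioner matters for key ranges, here is a rough Java sketch. It assumes a hash-based partitioner that derives a token from the MD5 digest of the key (which is roughly what the random partitioner does) and compares the resulting order with plain key order. The class and method names are invented for this example and are not Cassandra classes.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class PartitionerOrderSketch {
    // Assumption: token = absolute value of the MD5 digest read as a big integer.
    static BigInteger md5Token(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Order a hash-based partitioner would walk the keys in (token order).
    static List<String> byToken(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.comparing(PartitionerOrderSketch::md5Token));
        return sorted;
    }

    // Order an order-preserving partitioner would walk the keys in (key order).
    static List<String> byKey(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("user1", "user2", "user3", "user4");
        System.out.println("key order:   " + byKey(keys));
        System.out.println("token order: " + byToken(keys));
    }
}
```

With hashed tokens, keys that are neighbours lexically end up scattered around the ring, so a start/end key pair does not describe a contiguous slice of rows.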

Re: Continuously increasing RAM usage

2010-06-02 Thread Torsten Curdt
We've also seen something like this. Will soon investigate and try again with 0.6.2 On Wed, Jun 2, 2010 at 20:27, Paul Brown paulrbr...@gmail.com wrote: FWIW, I'm seeing similar issues on a cluster. Three nodes, Cassandra 0.6.1, SUN JDK 1.6.0_b20. I will try to get some heap dumps to see

sorting by column value

2010-05-31 Thread Torsten Curdt
Is it possible to have columns in a super column sorted by value rather than name? I assume not but I thought I'd ask anyway. What I would love to do is something along the lines of /user/user/country/DE += 1 and then get the sorted result of /user/user/country cheers -- Torsten
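
If the store only sorts by column name, one fallback is to read the columns and sort them on the client. The Java sketch below assumes the per-country counters have already been read into a plain map; the class name, method name, and the sample counts are invented for illustration, and no Cassandra client API is used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SortByValueSketch {
    // Returns the columns ordered by value, largest first.
    static List<Map.Entry<String, Long>> sortByValueDesc(Map<String, Long> columns) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(columns.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        // Hypothetical counters as they might come back sorted by column name.
        Map<String, Long> countries = new LinkedHashMap<>();
        countries.put("DE", 5L);
        countries.put("FR", 1L);
        countries.put("US", 9L);

        for (Map.Entry<String, Long> e : sortByValueDesc(countries)) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

This is only practical while the number of columns per row stays small enough to fetch in one go.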

key path vs super column

2010-05-19 Thread Torsten Curdt
We are currently working on a prototype that is using Cassandra for a realtime-ish statistics system. This seems to be quite a common use case. If people are interested - maybe it would be worth collaborating on this beyond design discussions on the list. But first let me explain our approach and where

Re: Is SuperColumn necessary?

2010-05-06 Thread Torsten Curdt
+1 on all of that On Thu, May 6, 2010 at 09:04, David Boxenhorn da...@lookin2.com wrote: That would be a good time to get rid of the confusing column term, which incorrectly suggests a two-dimensional tabular structure. Suggestions: 1. A hypercube (or hypocube, if only two dimensions):