Re: Load Balancing Mapper Tasks

2010-05-16 Thread Jonathan Ellis
On Sun, May 16, 2010 at 2:52 PM, Joost Ouwerkerk wrote: > Meanwhile, I'm still getting TimedOutException errors when mapping this > 30-million row table, even when retrieving no data at all. It looks like it > is related to disk activity on "hot" nodes (when the same cassandra node has > to handl

Re: FileNotFoundException in ROW-READ-STAGE

2010-05-16 Thread Jonathan Ellis
we really need to know what the file history is. if it's a freshly flushed file, that's Bad. if it's a file that was a compaction victim, that's annoying but not something we need to worry about. both flush and compaction are logged at INFO. On Sun, May 16, 2010 at 11:13 AM, Mason Hale wrote:

Re: zookeeper, how do you feed the pets?

2010-05-16 Thread Yang
I guess you'd have better luck with the zookeeper user list, but according to my very limited understanding of ZK, you can use the ZooKeeper.setData()/getData() API http://hadoop.apache.org/zookeeper/docs/current/api/index.html storing a simple count should be simple to ZK since each ZNode is desi
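
The setData()/getData() approach mentioned above relies on ZooKeeper's versioned writes: getData() returns the znode's current version, and setData() with an expected version fails if another writer got there first. Below is a minimal Python sketch of that read, increment, conditional-write loop. FakeZNode, BadVersion, and increment are hypothetical stand-ins that only mimic the version check in memory; they are not the ZooKeeper client API, and this is not how any poster in the thread implemented counters.

```python
class BadVersion(Exception):
    """Raised when a write carries a stale version (hypothetical stand-in)."""


class FakeZNode:
    """Hypothetical in-memory stand-in for one znode; not the ZooKeeper client."""

    def __init__(self, data=b"0"):
        self.data = data
        self.version = 0

    def get_data(self):
        # Return the payload together with the version it was read at.
        return self.data, self.version

    def set_data(self, data, expected_version):
        # Conditional write: succeeds only if nobody wrote since our read.
        if expected_version != self.version:
            raise BadVersion()
        self.data = data
        self.version += 1


def increment(node, max_retries=10):
    """Read the count, add one, and write it back; retry on a stale version."""
    for _ in range(max_retries):
        data, version = node.get_data()
        new_value = int(data.decode("ascii")) + 1
        try:
            node.set_data(str(new_value).encode("ascii"), version)
            return new_value
        except BadVersion:
            continue  # another writer won the race; re-read and try again
    raise RuntimeError("gave up after too many conflicting writes")


node = FakeZNode()
increment(node)
increment(node)
print(node.get_data())
```

Every increment costs a read plus a conditional write, and conflicting writers must retry, so a write-heavy counter workload is a poor fit for this pattern.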

Re: zookeeper, how do you feed the pets?

2010-05-16 Thread S Ahmed
yes counts will be a big part of the project (user points). ok i'll wait for that vector implementation then (I think that is what it was called). thanks! On Sun, May 16, 2010 at 10:10 PM, Chris Goffinet wrote: > If you are running multiple datacenters, intend to have a lot of writes for > coun

Re: zookeeper, how do you feed the pets?

2010-05-16 Thread Chris Goffinet
If you are running multiple datacenters and intend to have a lot of writes for counters, I highly advise against it. We got rid of ZK because of that. -Chris On May 16, 2010, at 7:04 PM, S Ahmed wrote: > Can someone quickly go over how you go about using zookeeper if you want to > store counts an

Re: How to use binaryMemTable to insert large data?

2010-05-16 Thread Peng Guo
I can't put data into Cassandra. I looked into the code: StorageService.instance.getNaturalEndpoints(keyspace, key) doesn't return any InetAddress endpoint. I have started the Cassandra server on my PC. On Mon, May 17, 2010 at 12:54 AM, Sonny Heer wrote: > What do you mean it didn't work? > > Cassan

zookeeper, how do you feed the pets?

2010-05-16 Thread S Ahmed
Can someone quickly go over how you go about using zookeeper if you want to store counts and have those counts be accurate? e.g. in digg's case I believe, they are using zookeeper so they can keep track of diggs for a particular digg story. Is it a backend change only and then storing API calls

Re: list of columns

2010-05-16 Thread Jonathan Shook
I think you are correct, David. What Bill is asking for specifically is not in the API. Bill, if this is a performance concern (i.e., your column values are/could be vastly larger than your column names, and you need to query the namespace before loading the values), then you might consider keepin

Re: Nodes Levels of Hierarchy in Cassandra.

2010-05-16 Thread Benjamin Black
Not in Cassandra. Your description of the levels is not quite accurate, either. The keyspaces and CFs are generally considered fixed since it is rather expensive to change them compared to the row keys and columns. Within an SCF, you have: row_key: {supercolumn1: {column1A: value1A, column1B: v
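
As a rough picture of the nesting described above for a super column family (SCF), here is a minimal Python sketch using nested dicts. All keys and values are placeholders chosen for illustration, and get_column is a hypothetical helper; this shows the three levels only and is not Cassandra's API.

```python
# Illustrative only: a super column family pictured as nested dicts.
# Levels: row key -> super column name -> column name -> value.
# Every name and value below is a placeholder.
scf = {
    "row_key": {
        "supercolumn1": {"column1A": "value1A", "column1B": "value1B"},
        "supercolumn2": {"column2A": "value2A"},
    }
}


def get_column(scf, row_key, supercolumn, column):
    """Walk the three levels in order and return one value."""
    return scf[row_key][supercolumn][column]


print(get_column(scf, "row_key", "supercolumn1", "column1A"))
```

The keyspace and column family sit above this structure and are, as the message notes, generally treated as fixed, while row keys and columns vary freely.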

Re: Load Balancing Mapper Tasks

2010-05-16 Thread Joost Ouwerkerk
Hadoop doesn't make any assumptions about how input source data is distributed. It can't 'know' that the data for the first 30 splits emitted by the InputFormat are all stored on the same cassandra node. The new case with the patch is CASSANDRA-1096. Meanwhile, I'm still getting TimedOutException

Re: chiton

2010-05-16 Thread Sonny Heer
> On Sun, May 16, 2010 at 1:29 PM, Sonny Heer wrote: > Look like you hit this bug which is now solved: > http://github.com/driftx/chiton/issues/closed#issue/3 sweet, thanks. > It will display the raw value for most column types except UUIDs, which it > will decode.  It doesn't yet support longs

Re: chiton

2010-05-16 Thread Brandon Williams
On Sun, May 16, 2010 at 1:29 PM, Sonny Heer wrote: > it doesn't like the int(port) call. somehow the port # isn't > getting through? > > I hardcoded the port to 9160, and it works. > Looks like you hit this bug, which is now solved: http://github.com/driftx/chiton/issues/closed#issue/3 Does

Re: chiton

2010-05-16 Thread Sonny Heer
it doesn't like the int(port) call. somehow the port # isn't getting through? I hardcoded the port to 9160, and it works. Does chiton handle all the column types for display? I'm assuming it doesn't show mixed type column names (e.g. first x bits are a long, and the rest is a text utf-8)?
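
To make the "mixed type column name" idea concrete, here is a minimal Python sketch that builds and decodes such a name. It assumes the long is 8 bytes, signed, and big-endian, followed directly by UTF-8 text; the message says only "first x bits", so the width and byte order are assumptions, and pack_name / unpack_name are hypothetical helpers, not part of chiton or Cassandra.

```python
import struct

LONG_SIZE = 8  # assumed width of the leading long, in bytes


def pack_name(number, text):
    """Encode a column name as an 8-byte big-endian signed long plus UTF-8 text."""
    return struct.pack(">q", number) + text.encode("utf-8")


def unpack_name(raw):
    """Split a packed column name back into its (long, text) parts."""
    (number,) = struct.unpack(">q", raw[:LONG_SIZE])
    return number, raw[LONG_SIZE:].decode("utf-8")


raw = pack_name(1274000000, "clicks")
print(unpack_name(raw))
```

A viewer that treats column names as a single type would show such a name as raw bytes; decoding it requires knowing where the long ends and the text begins.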

Re: chiton

2010-05-16 Thread Sonny Heer
thanks for the help. I had two versions of python, it was using the wrong one. I got the GUI to show up, but when i connect by going to file/connect and enter localhost/9160... i get the following in the output: chiton/bin/chiton-client 2010-05-16 10:59:45-0700 [-] Log opened. 2010-05-16 10:5

Re: How to use binaryMemTable to insert large data?

2010-05-16 Thread Sonny Heer
What do you mean it didn't work? CassandraBulkLoader is a map/reduce program, so you'll need hadoop setup. On Sun, May 16, 2010 at 7:43 AM, Peng Guo wrote: > I try the contrib\bmt_example\CassandraBulkLoader.java Example, but it not > work. > > Can i try contrib\bmt_example\CassandraBulkLoader.j

Re: FileNotFoundException in ROW-READ-STAGE

2010-05-16 Thread Mason Hale
Grepping through the logs, I haven't been able to deduce what was going on at the same time, but I have found many instances of these errors for multiple files going back for weeks now. (Found 848 of these errors since 4-23-2010). The files not found include Index and Data files spanning multiple

How to use binaryMemTable to insert large data?

2010-05-16 Thread Peng Guo
I tried the contrib\bmt_example\CassandraBulkLoader.java example, but it did not work. Can I run contrib\bmt_example\CassandraBulkLoader.java on a PC which is not a Cassandra server? -- Regards Peng Guo

Re: list of columns

2010-05-16 Thread David Boxenhorn
Bill, I am a new user of Cassandra, so I've been following this discussion with interest. I think the answer is "no", except for the brute force method of looping through all your data. It's like asking for a list of all the files on your C: drive. The term "column" is very misleading, since "colum