A few recent changes made at hector:
1. We keep several branches in parallel: 0.5.0, 0.5.1, 0.6.0 and master.
We've now changed master to be at version 0.6.0. 0.6.1 is compatible with
0.6.0 as the API didn't change, so practically master is now at the latest
released cassandra version.
2. We added
Awesome thx..
Carlos
From: Jonathan Ellis [jbel...@gmail.com]
Sent: Tuesday, April 20, 2010 10:52 PM
To: user@cassandra.apache.org
Subject: Re: Batch row deletion
This will be done in https://issues.apache.org/jira/browse/CASSANDRA-293
On Tue, Apr 20, 201
This will be done in https://issues.apache.org/jira/browse/CASSANDRA-293
On Tue, Apr 20, 2010 at 10:45 PM, Carlos Sanchez
wrote:
> All,
>
> Is there or will there be a feature to batch delete rows? (KeyRange delete?)
>
> Thanks
>
> Carlos
>
> This email message and any attachments are for the sol
All,
Is there or will there be a feature to batch delete rows? (KeyRange delete?)
Thanks
Carlos
This email message and any attachments are for the sole use of the intended
recipients and may contain proprietary and/or confidential information which
may be privileged or otherwise protected fro
Thanks Ryan, I also noticed this parameter in storage-conf just now. I am
going to increase this number to test whether it will work.
2010/4/21 Ryan King
> what's your RPC timeout in storage-conf?
>
> -ryan
>
> On Tue, Apr 20, 2010 at 6:46 PM, Jeff Zhang wrote:
> > Hi all,
> >
> > When I insert
what's your RPC timeout in storage-conf?
-ryan
On Tue, Apr 20, 2010 at 6:46 PM, Jeff Zhang wrote:
> Hi all,
>
> When I insert a very large value, Thrift throws a TimeOutException,
> even if I set the socket timeout to 10 minutes. I believe 10 minutes
> is enough for inserting the large
Hi all,
When I insert a very large value, Thrift throws a TimeOutException,
even if I set the socket timeout to 10 minutes. I believe 10 minutes
is enough for inserting the large value and spreading the replicas to other
machines; the ConsistencyLevel I chose is DCQUORUM. So is there any
On Tue, Apr 20, 2010 at 11:54 AM, Mark Jones wrote:
> When I look at this arrangement, I see one lookup by key for the user,
> followed by a large read for all the "email indexes" (these are all columns
> in the same row, right?)
>
> Then one lookup by key for each email. Seems very seek in
Reminder - price goes up after tonight at http://bigdataworkshop.eventbrite.com
We now have enough people interested in a bus or van from SF to Mountain View
to offer one. Check the interested box when you register and we will send you
pickup point information.
We will have people from the Cass
you should use keys, not tokens. start with empty string.
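A minimal sketch of that advice, assuming the start key is inclusive, so every page after the first repeats the last key of the previous page. `fetch_page` is a hypothetical stand-in for a get_range_slices call that returns up to `count` keys starting at `start_key`; here it is simulated over an in-memory sorted list rather than a real cluster.

```python
from bisect import bisect_left


def make_fetch_page(sorted_keys):
    """Build a hypothetical stand-in for get_range_slices over a sorted key list."""
    def fetch_page(start_key, count):
        # Inclusive start: return up to `count` keys that are >= start_key.
        i = bisect_left(sorted_keys, start_key)
        return sorted_keys[i:i + count]
    return fetch_page


def iter_all_keys(fetch_page, page_size=100):
    """Yield every key exactly once, fetching `page_size` keys per request."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start = ""       # the empty string means "start from the beginning"
    first = True
    while True:
        page = fetch_page(start, page_size)
        # After the first page, the first key repeats the previous last key.
        fresh = page if first else page[1:]
        for key in fresh:
            yield key
        if len(page) < page_size:
            return   # a short page means the range is exhausted
        start = page[-1]
        first = False
```

For a CF with 10,000 keys, `iter_all_keys(fetch_page, 100)` issues pages of 100 and stops at the first short page.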
On Tue, Apr 20, 2010 at 5:12 PM, Chris Dean wrote:
> I'd like to use get_range_slices to pull all the keys from a small CF
> with 10,000 keys. I'd also like to get them in chunks of 100 at a time.
> Is there a way to do that?
>
> I thoug
I'd like to use get_range_slices to pull all the keys from a small CF
with 10,000 keys. I'd also like to get them in chunks of 100 at a time.
Is there a way to do that?
I thought I could set start_token and end_token in KeyRange, but I can't
figure out what the initial start_token should be.
Chee
> It seems to me you might get by with putting the actual assets into
> cassandra (possibly breaking them up into chunks depending on how big
> they are) and storing the pointers to them in Postgres along with all
> the other metadata. If it were me, I'd split each file into a fixed
> chunksize an
And the key would be the state or value matched. Am I understanding it correctly?
On Tue, Apr 20, 2010 at 2:46 PM, Christian Torres wrote:
> So the suggestion would be to create a column family with the values or states,
> and save the matches as columns?
>
>
> On Tue, Apr 20, 2010 at 11:27 AM, Roger Schildmeij
So the suggestion would be to create a column family with the values or states,
and save the matches as columns?
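One way to read that suggestion, as a rough sketch: a separate index column family whose row key is the state and whose column names are the keys of the matching rows. Column families are modelled here as plain dictionaries; `state_index`, `set_state` and `keys_in_state` are hypothetical names for illustration, not part of any Cassandra client.

```python
# Hypothetical index CF: {state: {matching_row_key: ""}}
state_index = {}


def set_state(rows, row_key, new_state):
    """Write a row's state and keep the index row for each state in sync."""
    old_state = rows.get(row_key, {}).get("state")
    if old_state is not None:
        # Remove the row from the index row of its previous state.
        state_index.get(old_state, {}).pop(row_key, None)
    rows.setdefault(row_key, {})["state"] = new_state
    # The column name is the matching row key; the value is unused.
    state_index.setdefault(new_state, {})[row_key] = ""


def keys_in_state(state):
    """Read one index row to find every row key in the given state."""
    return sorted(state_index.get(state, {}))
```

The application has to update the index on every write, since nothing here maintains it automatically.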
On Tue, Apr 20, 2010 at 11:27 AM, Roger Schildmeijer wrote:
> My bad. Missed your one-to-one relationship (row key <-> column)
> On 20 apr 2010, at 19.24em, Christian Torres wrote:
On Tue, Apr 20, 2010 at 1:37 PM, tsuraan wrote:
> The assets are binary files on a document tracking system. Our
> current platform is postgres-backed; the entire system we've written
> is fairly easily distributed across multiple computers, but postgres
> isn't. There are reliable databases tha
When I look at this arrangement, I see one lookup by key for the user, followed
by a large read for all the "email indexes" (these are all columns in the same
row, right?)
Then one lookup by key for each email. Seems very seek-intensive.
Would a better way be to index each email with a ke
We haven't gotten around to implementing this yet and so far no one needed
that badly enough to write it.
We accept contributions or forks and we use github, so feel free to diy
(forks are preferable). http://github.com/rantav/hector
On Tue, Apr 20, 2010 at 3:25 AM, Chris Dean wrote:
> Ok, thank
I can't answer for its sanity, but I would not do it that way. I'd
have a CF for Emails, with 1 email per row, and another CF for
UserEmails with per-user index rows referencing the Emails rows.
b
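A rough sketch of that two-CF layout, with each column family modelled as a dictionary of rows. The column naming in UserEmails (a timestamp paired with the email's row key) and the function names are assumptions made for illustration.

```python
import uuid

# Each "column family" is modelled as {row_key: {column_name: value}}.
emails = {}        # CF Emails: one row per email
user_emails = {}   # CF UserEmails: one index row per user


def store_email(user_id, timestamp, header, body):
    """Write the email row, then add a reference to it in the user's index row."""
    email_id = str(uuid.uuid4())
    emails[email_id] = {"header": header, "body": body}
    # Assumed column name: (timestamp, email_id), so the index sorts by time.
    user_emails.setdefault(user_id, {})[(timestamp, email_id)] = ""
    return email_id


def emails_for_user(user_id, limit=10):
    """Read the user's index row, then fetch the referenced email rows."""
    index_row = user_emails.get(user_id, {})
    newest_first = sorted(index_row, reverse=True)[:limit]
    return [emails[email_id] for _, email_id in newest_first]
```

Reading a mailbox is then one read of the index row followed by a lookup of the referenced Emails rows.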
On Tue, Apr 20, 2010 at 9:44 AM, Mark Jones wrote:
> To make sure I'm clear on what you are sayin
thx, that did the trick.
Jonathan Ellis wrote:
Added to http://wiki.apache.org/cassandra/MemtableSSTable:
SSTables that are obsoleted by a compaction are deleted asynchronously
when the JVM performs a GC. You can force a GC from jconsole if
necessary but this is not necessary; Cassandra will f
i have done no deletes, just inserts. so you are correct, there isn't
any "data" to cleanup. however when i run some of the cleanup and/or
compaction tasks the space used on disk actually grows, and i would like
to force any unneeded files to be removed. as i write this, jonathan
has respond
Added to http://wiki.apache.org/cassandra/MemtableSSTable:
SSTables that are obsoleted by a compaction are deleted asynchronously
when the JVM performs a GC. You can force a GC from jconsole if
necessary but this is not necessary; Cassandra will force one itself
if it detects that it is low on sp
Are you deleting data through the API or just doing a bunch of inserts
and then running a compaction? The latter will not result in anything
to clean up since data must be explicitly deleted.
b
On Tue, Apr 20, 2010 at 10:33 AM, B. Todd Burruss wrote:
> i'm trying to draw some correlation betwe
How do I delete a row using the BMT method?
Do I simply do a mutate with column delete flag set to true? Thanks.
> I'm curious as to how you would have so many asset / user permissions that
> you couldn't use a standard relational database to model them. Is this some
> sort of multi-tenant system where you're providing some generalized asset
> check-out mechanism to many, many customers? Even so, I'm not sure
You're welcome Schubert.
I look forward to any new results you may come up with.
{ It would also be interesting, when you run your tests again, to look at
the GC logs and see to what extent
https://issues.apache.org/jira/browse/CASSANDRA-896 is the culprit for what
you will see. Identifying any ot
i'm trying to draw some correlation between the size of my data and the
space used on disk. i have set 1 so
there isn't any reason to keep data around.
my approach is this:
after only doing "puts" to cassandra for a while i stop my client and
want to perform the proper "cleanup" and/or "comp
My bad. Missed your one-to-one relationship (row key <-> column)
On 20 apr 2010, at 19.24em, Christian Torres wrote:
> Mmmm...
>
> According to this doc, http://wiki.apache.org/cassandra/API#get_slice, which a
> developer mailed to me, it's possible!!
>
> I'm sending it to you as a reference
>
> On Tue, Ap
http://wiki.apache.org/cassandra/API#get_slice
get_slice retrieves the values for either (a) a list of column names or (b)
a range of columns, depending on the SlicePredicate you use. It does not
allow you to filter a la SQL's WHERE. You would need to create your own
index to do so, at least unti
Notice that the SlicePredicate accepts column names, but not values. You can
tell it to pull these 3 columns, but there is no "if/where" in there.
SliceRange is, I think, based on the key, since it doesn't have a way to pair up
column names/values
From: Christian Torres [mailto:chtor...@gmail.co
great, I'm happy you found hector useful and reused it in your client.
On Tue, Apr 20, 2010 at 5:11 PM, Dop Sun wrote:
> Hi,
>
>
>
> I have downloaded hector-0.6.0-10.jar. As you mentioned, it has a good
> implementation of connection pooling and JMX counters.
>
>
>
> What I’m doing is: using H
Dop,
Thank you for trying out hector. I think you have the right approach
for using it with your project. Feel free to ping us directly
regarding Hector on either of these mailing lists as appropriate:
http://wiki.github.com/rantav/hector/mailing-lists
Cheers,
-Nate
On Tue, Apr 20, 2010 at 7:11
Mmmm...
According to this doc, http://wiki.apache.org/cassandra/API#get_slice, which
a developer mailed to me, it's possible!!
I'm sending it to you as a reference
On Tue, Apr 20, 2010 at 11:17 AM, Mark Jones wrote:
> You will have to pull the columns and filter yourself.
>
>
>
> *From:* Christian Torres [m
http://sourceforge.net/projects/clusterssh/
Roger Schildmeijer wrote:
dancer's shell / distributed shell
http://www.netfort.gr.jp/~dancer/software/dsh.html.en
On 20 apr 2010, at 17.18em, Joost Ouwerkerk wrote:
What are people using to manage Cassandra cluster nodes? i.e. to s
The short answer as to what people normally do is that they use a
relational database for something like this.
I'm curious as to how you would have so many asset / user permissions that
you couldn't use a standard relational database to model them. Is this some
sort of multi-tenant system w
I would think this is on the roadmap, just not available yet. It can be
managed by adjusting the Heap size (to a large degree).
-Original Message-
From: Tatu Saloranta [mailto:tsalora...@gmail.com]
Sent: Tuesday, April 20, 2010 12:18 PM
To: user@cassandra.apache.org
Subject: Re: 0.6.1 in
You will have to pull the columns and filter yourself.
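A minimal sketch of that client-side filtering, assuming the rows have already been fetched into a dictionary of `{row_key: {column_name: value}}`. The `state` column name and the sample rows are made up for illustration.

```python
def rows_with_state(rows, wanted_state, column="state"):
    """Return the keys of rows whose `column` value equals `wanted_state`."""
    return sorted(
        key
        for key, columns in rows.items()
        if columns.get(column) == wanted_state
    )


# Sample data standing in for the result of a slice query.
fetched_rows = {
    "doc1": {"state": "Public", "title": "a"},
    "doc2": {"state": "Private", "title": "b"},
    "doc3": {"title": "c"},  # rows without the column never match
}

private_keys = rows_with_state(fetched_rows, "Private")
```

Every row still has to be read and transferred before it can be discarded, so this only suits small result sets.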
From: Christian Torres [mailto:chtor...@gmail.com]
Sent: Tuesday, April 20, 2010 11:50 AM
To: user@cassandra.apache.org
Cc: d...@cassandra.apache.org
Subject: Filters
Hello!
Is there any way to make filters (WHEREs) in cassandra? Or I have t
On Mon, Apr 19, 2010 at 7:12 PM, Brandon Williams wrote:
> On Mon, Apr 19, 2010 at 9:06 PM, Schubert Zhang wrote:
>>
> >> 2. Reject the request when short of resources, instead of throwing an OOME
> >> and exiting (crashing).
>
> > Right, that is the crux of the problem. It will be addressed here:
> https://iss
Hello!
Is there any way to apply filters (WHEREs) in Cassandra? Or do I have to manage
to do it?
For example:
I have a ColumnFamily with a column in each row whose value is a state,
Public or Private, so I want to filter all rows that are private, and also
the public ones in another form... Beside in
> Suppose I have a CF that holds some sort of assets that some users of
> my program have access to, and that some do not. In SQL-ish terms it
> would look something like this:
>
> TABLE Assets (
> asset_id serial primary key,
> ...
> );
>
> TABLE Users (
> user_id serial primary key,
> user_n
To make sure I'm clear on what you are saying:
Are the "Individual Emails" in the example below, Supercolumns and the {body,
header, tags...} the subcolumns?
Is that a sane data layout for an email system? Where the Supercolumn
identifier is the "conversation label"
Sorry to be so daft, but
Not all the data associated w/ the key is brought into memory, just
all the data associated w/ the supercolumns being queried.
Supercolumns are so you can update a smallish number of subcolumns
independently (e.g. when denormalizing an entire narrow row, usually
with a finite set of columns). If
Sorry, I didn't answer your question in my response, I have at this point:
Key(ID)
When/Where SuperColumn Tag: and Columns {Data: One Value (not yet written,
tags, flags)}
Under some keys (very small #) there will be 2 values like:
Key(ID)
When/Where SuperColumn Tag: and Columns {Da
When I first read this, it bothered me because it seemed like it couldn't be
so. So I read the link, and it says the whole thing, so I have to ask for some
clarification here.
I had always assumed a super column was similar to a local keyspace, and that
the SubColumns under it were similar to
How many columns are in the supercolumn total?
"in super columnfamilies there is a third level of subcolumns; these
are not indexed, and any request for a subcolumn deserializes _all_
the subcolumns in that supercolumn"
http://wiki.apache.org/cassandra/CassandraLimitations
On Tue, Apr 20, 2010 a
dancer's shell / distributed shell
http://www.netfort.gr.jp/~dancer/software/dsh.html.en
On 20 apr 2010, at 17.18em, Joost Ouwerkerk wrote:
> What are people using to manage Cassandra cluster nodes? i.e. to start,
> stop, copy config files, etc. I'm using cssh and wondering if there is a
> b
What are people using to manage Cassandra cluster nodes? i.e. to start,
stop, copy config files, etc. I'm using cssh and wondering if there is a
better way...
Joost.
I too am seeing very slow performance while testing worst case scenarios of 1
key leading to 1 supercolumn and 1 column beyond that.
Key -> SuperColumn -> 1 Column (of ~ 500 bytes)
Drive utilization is 80-90% and I'm only dealing with 50-70 million rows.
(With NO swapping) So far, I've found
I traced the IncomingStreamReader source and found that the incoming socket comes
from MessagingService$SocketThread,
but there is no close() call on either the accepted socket or the socketChannel.
Should I file a bug report?
On Tue, Apr 20, 2010 at 11:02, Ingram Chen wrote:
> this happened after several hour
On Tue, 2010-04-20 at 10:39 +0800, Ken Sandney wrote:
> Sorry I just don't know how to resolve this :)
http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
> On Tue, Apr 20, 2010 at 10:37 AM, Jonathan Ellis
> wrote:
>
> > Ken, I linked you to the FAQ answering your problem in the
Hi,
I have downloaded hector-0.6.0-10.jar. As you mentioned, it has a good
implementation of connection pooling and JMX counters.
What I’m doing is: using Hector to create the Cassandra client (to be specific:
borrow_client(url, port)). And my understanding is: in this way, the Jassandra
wi
I get 10 column families by key, and one column family has 30 columns.
I use multigetSlice once to get the 10 column families, but the performance is
very poor.
Does anyone have other ideas for improving the performance?
Not so reasonable, given what you are trying to accomplish. A 1GB
heap (on a 2GB machine) is fine for development and functional
testing, but I wouldn't try to deal with the number of rows you are
describing with less than 8GB/node with 4-6GB heap.
b
On Mon, Apr 19, 2010 at 7:32 PM, Ken Sandney