Re: Option for ordering columns by timestamp in CF

2012-10-13 Thread Martin Koch
One example could be to identify when a row was last updated. For example,
if I have a column family for storing users, the row key is a user ID and
the columns are values for that user, e.g. natural column names would be
firstName, lastName, address, etc; column names don't naturally
include a date here.

Sorting the columns by timestamp and picking the last would allow me to
know when the row was last modified. (I could manually maintain a 'last
modified' column as well, I know, but just coming up with a use case :).

/Martin Koch

On Fri, Oct 12, 2012 at 11:39 PM, B. Todd Burruss bto...@gmail.com wrote:

 trying to think of a use case where you would want to order by
 timestamp, and also have unique column names for direct access.

 not really trying to challenge the use case, but you can get ordering
 by timestamp and still maintain a name for the column using
 composites. if the first component of the composite is a timestamp,
 then you can order on it.  when retrieved you could have a name
 in the second component .. and have dupes as long as the timestamp is
 unique (use TimeUUID)
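
A minimal sketch of the composite idea described above, in plain Java rather than any Cassandra client API. The class and method names (CompositeColumnSketch, CompositeName, namesInStoredOrder, sampleRow) are made up for illustration, and a plain long stands in for the TimeUUID mentioned above. The first component drives the ordering, and the second component carries the name, which may repeat as long as the timestamps differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only: models a composite column name as
// (timestamp, name). This is not the Cassandra API.
class CompositeColumnSketch {

    static final class CompositeName {
        final long timestamp; // first component: drives the sort order
        final String name;    // second component: the "real" column name

        CompositeName(long timestamp, String name) {
            this.timestamp = timestamp;
            this.name = name;
        }
    }

    // Compare by timestamp first, then by name to break ties.
    static final Comparator<CompositeName> BY_TIME_THEN_NAME =
        Comparator.comparingLong((CompositeName c) -> c.timestamp)
                  .thenComparing(c -> c.name);

    // Returns the second components in the order the sorted row holds them.
    static List<String> namesInStoredOrder(TreeMap<CompositeName, String> row) {
        List<String> names = new ArrayList<>();
        for (CompositeName c : row.keySet()) {
            names.add(c.name);
        }
        return names;
    }

    // Three writes, inserted out of time order; "item-7" appears twice.
    static TreeMap<CompositeName, String> sampleRow() {
        TreeMap<CompositeName, String> row = new TreeMap<>(BY_TIME_THEN_NAME);
        row.put(new CompositeName(300L, "item-7"), "bookmarked");
        row.put(new CompositeName(100L, "item-9"), "bookmarked");
        row.put(new CompositeName(200L, "item-7"), "unbookmarked");
        return row;
    }

    public static void main(String[] args) {
        System.out.println(namesInStoredOrder(sampleRow()));
    }
}
```

Note that in this model a lookup by name alone has to scan the row, because the name is no longer the leading sort component.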


 On Fri, Oct 12, 2012 at 7:20 AM, Derek Williams de...@fyrie.net wrote:
  You probably already know this but I'm pretty sure it wouldn't be a
  trivial change, since to efficiently look up a column by name requires
  the columns to be ordered by name. A separate index would be needed in
  order to provide lookup by column name if the row was sorted by
  timestamp (which is the way Redis implements its sorted set).
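
To make the "separate index" point concrete, here is a rough in-memory sketch in plain Java. It is not Cassandra or Redis code, and the names (TimestampOrderedRow, put, get, namesByTimestamp, lastModified) are hypothetical. One map serves lookups by column name; a second, sorted map keeps the names in timestamp order. Every write has to update both.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of a row kept in timestamp order
// with an extra index for lookup by column name.
class TimestampOrderedRow {

    private static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Index 1: direct lookup by column name.
    private final Map<String, Cell> byName = new HashMap<>();
    // Index 2: column names grouped by timestamp, in ascending order.
    private final TreeMap<Long, TreeSet<String>> byTimestamp = new TreeMap<>();

    // Writes a column; a write older than the stored one is ignored.
    void put(String name, String value, long timestamp) {
        Cell old = byName.get(name);
        if (old != null) {
            if (old.timestamp > timestamp) {
                return;
            }
            // Drop the stale entry from the timestamp index.
            TreeSet<String> bucket = byTimestamp.get(old.timestamp);
            bucket.remove(name);
            if (bucket.isEmpty()) {
                byTimestamp.remove(old.timestamp);
            }
        }
        byName.put(name, new Cell(value, timestamp));
        byTimestamp.computeIfAbsent(timestamp, t -> new TreeSet<>()).add(name);
    }

    // Lookup by name, served by the name index.
    String get(String name) {
        Cell cell = byName.get(name);
        return cell == null ? null : cell.value;
    }

    // Column names from oldest to newest write.
    List<String> namesByTimestamp() {
        List<String> names = new ArrayList<>();
        for (TreeSet<String> bucket : byTimestamp.values()) {
            names.addAll(bucket);
        }
        return names;
    }

    // Timestamp of the most recent write, or null for an empty row.
    Long lastModified() {
        return byTimestamp.isEmpty() ? null : byTimestamp.lastKey();
    }

    public static void main(String[] args) {
        TimestampOrderedRow row = new TimestampOrderedRow();
        row.put("firstName", "Ann", 10L);
        row.put("lastName", "Lee", 20L);
        row.put("firstName", "Anna", 30L);
        System.out.println(row.namesByTimestamp() + " last=" + row.lastModified());
    }
}
```

The cost is visible in put(): an overwrite must first remove the old position from the timestamp index, which is the extra bookkeeping a name-ordered row avoids.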
 
 
  On Fri, Oct 12, 2012 at 12:13 AM, Ertio Lew ertio...@gmail.com wrote:
 
  "Make column timestamps optional" - kidding me, right? :)  I do
  understand that this won't be possible, as then Cassandra won't be able
  to distinguish the latest among several copies of the same column. I
  don't mean that. I just want that, while ordering the columns, Cassandra
  (in an optional mode per CF) should not look at column names (they will
  still exist, but for retrieval purposes, not for ordering); instead
  Cassandra would order the columns by looking at the timestamp values
  (timestamps would exist!). So the change would be just to provide a mode
  in which Cassandra, while ordering, uses timestamps instead of column
  names.
 
 
  On Fri, Oct 12, 2012 at 2:26 AM, Tyler Hobbs ty...@datastax.com wrote:
 
  Without thinking too deeply about it, this is basically equivalent to
  disabling timestamps for a column family and using timestamps for column
  names, though in a very indirect (and potentially confusing) manner.
  So, if you want to open a ticket, I would suggest framing it as "make
  column timestamps optional".
 
 
  On Wed, Oct 10, 2012 at 4:44 AM, Ertio Lew ertio...@gmail.com wrote:
 
  I think Cassandra should provide a configurable option on a per column
  family basis to sort columns by timestamp rather than by column name.
  This would be really helpful for maintaining time-sorted columns without
  using up the column names as timestamps, since the names could otherwise
  store the most relevant values for retrievals. Very frequently we need
  to store data sorted in time order. Therefore I think this may be a very
  general requirement and not specific to just my use case alone.
 
  Does it make sense to create an issue for this?
 
 
 
 
  On Fri, Mar 25, 2011 at 2:38 AM, aaron morton aa...@thelastpickle.com
  wrote:
 
  If you mean order by the column timestamp (as passed by the client),
  that is not possible.
 
  Can you use your own timestamps as the column name and store them as
  long values?
 
  Aaron
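
A small sketch of what "timestamps as the column name, stored as long values" could look like on the client side, assuming the names are sent as 8-byte big-endian values. The helper names (LongColumnNameSketch, encode, decode, compareUnsigned) are hypothetical. The point it shows: for non-negative timestamps, comparing the encoded bytes as unsigned values gives the same order as comparing the numbers, so even a plain byte-ordered column family would keep such columns in time order. Negative values would sort after positive ones under a byte comparison, so this sketch assumes they do not occur.

```java
import java.nio.ByteBuffer;

// Hypothetical helpers for using a timestamp as a column name.
class LongColumnNameSketch {

    // Encodes the timestamp as 8 bytes; ByteBuffer is big-endian by default.
    static byte[] encode(long timestamp) {
        return ByteBuffer.allocate(8).putLong(timestamp).array();
    }

    // Reads the timestamp back from an 8-byte column name.
    static long decode(byte[] columnName) {
        return ByteBuffer.wrap(columnName).getLong();
    }

    // Lexicographic comparison treating each byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] earlier = encode(1350140400000L);
        byte[] later = encode(1350140400001L);
        System.out.println(compareUnsigned(earlier, later) < 0);
    }
}
```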
 
  On 25 Mar 2011, at 09:30, Narendra Sharma wrote:
 
   Cassandra 0.7.4
   Column names in my CF are of type byte[] but I want to order columns
   by timestamp. What is the best way to achieve this? Does it make sense
   for Cassandra to support ordering of columns by timestamp as an option
   for a column family irrespective of the column name type?
  
   Thanks,
   Naren
 
 
 
 
 
  --
  Tyler Hobbs
  DataStax
 
 
 
 
 
  --
  Derek Williams
 



Issue removing rows

2012-10-13 Thread Nick Morizio
I'm wondering if anyone has seen this issue before:

We are running Cassandra 1.1.5 on Linux, latest Oracle JDK 6.

Starting with a fresh, empty Cassandra on a new ring (~7 nodes), we create our
keyspace and insert a row.  We then try to remove that row, at which point the
operation fails and times out. We can keep inserting new rows, but this failure
occurs as soon as we try to remove one.

The specific error we are getting is an exception in MutationStage, caused by
an IllegalArgumentException from Buffer.limit() while trying to perform the
mutate.  This same error occurs on any node where we try this test.


The commit logs are created, but as far as I can tell no data is ever written 
to the keyspace dir.

We are using StorageProxy directly, but in this case it is pretty much a 
copy/paste from what the thrift server is doing.

This code has been working without issue on other servers, but for some reason 
it is not working on a new set of servers and I'm at a loss trying to diagnose.


Has anybody seen a similar issue?


Re: Option for ordering columns by timestamp in CF

2012-10-13 Thread Ertio Lew
@B Todd Burruss:
Regarding the use cases, I think they are pretty common. At least I see them
very frequently in my project. Let's say the application needs to store a
timeline of a user's bookmark activity on certain items. If I could store
the activity data in columns (with the concerned item ID as the column
name) and get them ordered by timestamp, then I could also check from that
same row whether or not a particular item was bookmarked by the user.
Ordering columns by time is a very common requirement in any application,
so if such a mechanism were provided by Cassandra, it would be really
useful and convenient to app developers.


Re: Issue removing rows

2012-10-13 Thread B. Todd Burruss
i have used StorageProxy and was forgetting to rewind (or otherwise
set up my ByteBuffer properly) and was getting, i believe, the same
error.

check your ByteBuffers
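
For anyone checking their ByteBuffers, a small plain-Java sketch of the states involved. Nothing here touches StorageProxy, and the class and method names are made up for illustration. After relative put() calls the position sits at the end of the written bytes, so the buffer has to be flipped before it is handed on. After a relative read the position is at the limit again, so the buffer has to be rewound before reuse. Separately, Buffer.limit(int) throws IllegalArgumentException when the requested limit is negative or larger than the capacity. Whether that is what happens inside the mutation in this case is only a guess.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of ByteBuffer position/limit handling.
class ByteBufferStateSketch {

    // The mistake: after put(), position == limit, so nothing is readable.
    static ByteBuffer writtenButNotFlipped(String key) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(bytes.length);
        buf.put(bytes);
        return buf;
    }

    // The fix: flip() sets limit to the old position and position to 0.
    static ByteBuffer writtenAndFlipped(String key) {
        ByteBuffer buf = writtenButNotFlipped(key);
        buf.flip();
        return buf;
    }

    // Reads all remaining bytes with a relative get, which moves position.
    static String consume(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    // True if limit(newLimit) is rejected with IllegalArgumentException.
    static boolean limitRejects(ByteBuffer buf, int newLimit) {
        try {
            buf.limit(newLimit);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writtenButNotFlipped("user42").remaining());
        ByteBuffer key = writtenAndFlipped("user42");
        System.out.println(consume(key));
        key.rewind(); // without this, a second read would see no bytes
        System.out.println(consume(key));
        System.out.println(limitRejects(ByteBuffer.allocate(4), 5));
    }
}
```

ByteBuffer.wrap(bytes) avoids the flip altogether, since a wrapped buffer starts at position 0 with the limit at the array length. duplicate() gives a reader its own position and limit when the same buffer is shared.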


Re: Issue removing rows

2012-10-13 Thread Nick Morizio
Thanks, I will check into that!




