[jira] [Comment Edited] (CASSANDRA-3779) need forked language document

2012-05-24 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283104#comment-13283104
 ] 

Eric Evans edited comment on CASSANDRA-3779 at 5/25/12 4:10 AM:


Sorry. :(  I plan to have a look over the next few days, but if you get tired 
of waiting, I see no reason not to post them (docs are better than none).

  was (Author: urandom):
Sorry. :(  I plan to have a look over the next few days, but if you get 
tired of waiting, I see no reason not to post them (no docs are better than 
none).
  
> need forked language document
> -
>
> Key: CASSANDRA-3779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3779
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Affects Versions: 1.1.0
>Reporter: Eric Evans
>Assignee: Sylvain Lebresne
>  Labels: cql
> Fix For: 1.1.2
>
>
> The language doc ({{doc/cql/CQL.textile}}) needs to be forked for CQLv3 and 
> updated accordingly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Comment Edited] (CASSANDRA-3779) need forked language document

2012-05-24 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283104#comment-13283104
 ] 

Eric Evans edited comment on CASSANDRA-3779 at 5/25/12 4:10 AM:


Sorry. :(  I plan to have a look over the next few days, but if you get tired 
of waiting, I see no reason not to post them (any docs are better than no docs).

  was (Author: urandom):
Sorry. :(  I plan to have a look over the next few days, but if you get 
tired of waiting, I see no reason not to post them (docs are better than none).
  
> need forked language document
> -
>
> Key: CASSANDRA-3779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3779
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Affects Versions: 1.1.0
>Reporter: Eric Evans
>Assignee: Sylvain Lebresne
>  Labels: cql
> Fix For: 1.1.2
>
>
> The language doc ({{doc/cql/CQL.textile}}) needs to be forked for CQLv3 and 
> updated accordingly.





[jira] [Commented] (CASSANDRA-3779) need forked language document

2012-05-24 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283104#comment-13283104
 ] 

Eric Evans commented on CASSANDRA-3779:
---

Sorry. :(  I plan to have a look over the next few days, but if you get tired 
of waiting, I see no reason not to post them (no docs are better than none).

> need forked language document
> -
>
> Key: CASSANDRA-3779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3779
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Affects Versions: 1.1.0
>Reporter: Eric Evans
>Assignee: Sylvain Lebresne
>  Labels: cql
> Fix For: 1.1.2
>
>
> The language doc ({{doc/cql/CQL.textile}}) needs to be forked for CQLv3 and 
> updated accordingly.





[jira] [Updated] (CASSANDRA-4127) migration support for vnodes

2012-05-24 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-4127:
--

Description: 
If, when starting up for the first time, the host only has 1 token but 
num_tokens is configured differently, then this will trigger a migration 
process:
* The host will assign itself num_tokens tokens in its own range
* The new tokens will be gossiped

This will allow a rolling migration where N new hosts are bootstrapped into the 
cluster (with num_tokens set appropriately) and then the N old nodes are 
decommissioned. This will result in even distribution of the data among the new 
nodes with randomly assigned tokens.
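
A minimal sketch of the first migration step, assuming the host splits its existing range evenly. The description does not say how the new tokens are chosen, so the even spacing, the `split_own_range` helper, and the 2**127 ring size (RandomPartitioner) are all assumptions made for illustration, not the patch's behaviour.

```python
RING_SIZE = 2 ** 127  # assumed token space (RandomPartitioner)


def split_own_range(left, right, num_tokens, ring_size=RING_SIZE):
    """Return num_tokens tokens inside the range (left, right].

    Hypothetical helper: the tokens are evenly spaced and the last one is
    always `right`, so the host keeps owning exactly the same data.
    """
    if num_tokens < 1:
        raise ValueError("num_tokens must be at least 1")
    # Width of the owned range; a single-node ring (left == right) owns it all.
    width = (right - left) % ring_size or ring_size
    if num_tokens > width:
        raise ValueError("range too small for num_tokens distinct tokens")
    return [
        (left + width * i // num_tokens) % ring_size
        for i in range(1, num_tokens + 1)
    ]


# Example: a node that owned (0, 2**126] now advertises 256 tokens there.
tokens = split_own_range(0, 2 ** 126, 256)
```

Because every new token stays inside the range the host already owned, no data has to move at this step; the gossip step only announces the finer-grained ownership.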

_Edit0: Appended patch information._

h3. Patches
||Compare||Raw diff||Description||
|[01_migration_path|https://github.com/acunu/cassandra/compare/top-bases/p/4127/01_migration_path...p/4127/01_migration_path]|[01_migration_path.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4127/01_migration_path...p/4127/01_migration_path.diff]|Migrate from one token to many|



_Note: These are branches managed with TopGit. If you are applying the patch 
output manually, you will either need to filter the TopGit metadata files (i.e. 
{{wget -O -  | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), or remove 
them afterward ({{rm .topmsg .topdeps}})._

  was:
If, when starting up for the first time, the host only has 1 token but 
num_tokens is configured differently, then this will trigger a migration 
process:
* The host will assign itself num_tokens tokens in its own range
* The new tokens will be gossiped

This will allow a rolling migration where N new hosts are bootstrapped into the 
cluster (with num_tokens set appropriately) and then the N old nodes are 
decommissioned. This will result in even distribution of the data among the new 
nodes with randomly assigned tokens.

 Labels: vnodes  (was: )

> migration support for vnodes
> 
>
> Key: CASSANDRA-4127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4127
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Sam Overton
>Assignee: Sam Overton
>  Labels: vnodes
>
> If, when starting up for the first time, the host only has 1 token but 
> num_tokens is configured differently, then this will trigger a migration 
> process:
> * The host will assign itself num_tokens tokens in its own range
> * The new tokens will be gossiped
> This will allow a rolling migration where N new hosts are bootstrapped into 
> the cluster (with num_tokens set appropriately) and then the N old nodes are 
> decommissioned. This will result in even distribution of the data among the 
> new nodes with randomly assigned tokens.
> _Edit0: Appended patch information._
> h3. Patches
> ||Compare||Raw diff||Description||
> |[01_migration_path|https://github.com/acunu/cassandra/compare/top-bases/p/4127/01_migration_path...p/4127/01_migration_path]|[01_migration_path.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4127/01_migration_path...p/4127/01_migration_path.diff]|Migrate from one token to many|
> 
> _Note: These are branches managed with TopGit. If you are applying the patch 
> output manually, you will either need to filter the TopGit metadata files 
> (i.e. {{wget -O -  | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), 
> or remove them afterward ({{rm .topmsg .topdeps}})._





[jira] [Updated] (CASSANDRA-3865) Cassandra-cli returns 'command not found' instead of syntax error

2012-05-24 Thread Dave Brosius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Brosius updated CASSANDRA-3865:


Attachment: 3865_better_cli_ex_handling.txt

3865_better_cli_ex_handling.txt

has the same improved decimal parsing as before.

Also has better error messaging when the exception is NoViableAltException.

For some reason the code special cased that exception and returned the 
non-useful "Command not found" message. I just removed the special casing and 
now the message is useful.

If someone knows of a case where the message "Command not found" is useful, 
I can put it back in for that specific case, but I couldn't see one.

against cassandra-1.1

> Cassandra-cli returns 'command not found' instead of syntax error
> -
>
> Key: CASSANDRA-3865
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3865
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE 1.0.5
>Reporter: Eric Lubow
>Assignee: Dave Brosius
>Priority: Trivial
>  Labels: cassandra-cli
> Fix For: 1.1.2
>
> Attachments: 3865_better_cli_ex_handling.txt, parse_doubles_better.txt
>
>
> When creating a column family from the output of 'show schema' with an index, 
> there is a trailing comma after "index_type: 0,"  The return from this is a 
> 'command not found'  This is misleading because the command is found, there 
> is just a syntax error.
> 'Command not found: `create column family $cfname ...`





[jira] [Assigned] (CASSANDRA-4277) hsha default thread limits make no sense, and yaml comments look confused

2012-05-24 Thread Peter Schuller (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller reassigned CASSANDRA-4277:
-

Assignee: Peter Schuller

> hsha default thread limits make no sense, and yaml comments look confused
> -
>
> Key: CASSANDRA-4277
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4277
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Peter Schuller
>Assignee: Peter Schuller
>
> The cassandra.yaml states with respect to {{rpc_max_threads}}:
> {code}
> # For the Hsha server, the min and max both default to quadruple the number of
> # CPU cores.
> {code}
> The code seems to indeed do this. But this makes, as far as I can tell, no 
> sense what-so-ever since the number of concurrent RPC threads you need is a 
> function of the throughput and the average latency of requests (that includes 
> synchronously waiting on network traffic).
> Defaulting to anything having to do with CPU cores seems inherently wrong. If 
> a default is non-static, a closer guess might be to look at thread stack size 
> and heap size and infer what "might" be reasonable.
> *NOTE*: The effect of having this too low is "strange" (if you don't know 
> what's going on) latencies observed from the client on all thrift requests 
> (*any* thrift request, including e.g. {{describe_ring()}}), which aren't 
> visible in any latency metric exposed by Cassandra. This is why I consider 
> this "major", since unwitting users may be seeing detrimental performance for 
> no good reason.
> In addition, I read this about async:
> {code}
> # async -> Nonblocking server implementation with one thread to serve 
> #  rpc connections.  This is not recommended for high throughput use
> #  cases. Async has been tested to be about 50% slower than sync
> #  or hsha and is deprecated: it will be removed in the next major release.
> {code}
> This makes even less sense. Running with *one* rpc thread limits you to a 
> single concurrent request. How was that 50% number even attained? By 
> single-node testing being completely CPU bound locally on a node? The actual 
> effect should be "stupidly slow" in any real situation with lots of requests 
> on a cluster of many nodes and network traffic (though I didn't test that) - 
> especially in the event of any kind of hiccup like a node doing GC. I agree 
> that if the above is true, async should *definitely* be deprecated, but the 
> reasons seem *much* stronger than implied.
> I may be missing something here, in which case I apologize, but I 
> specifically double-checked after I fixed this setting on our clusters 
> after seeing exactly the expected side-effect of having it be too low. I 
> always was under the impression that rpc_max_threads affects the number of 
> RPC requests running concurrently, and code inspection (it being used for the 
> worker thread limit) + the effects of client-observed latency is consistent 
> with my understanding.
> I suspect the setting was set strangely by someone because the phrasing of 
> the comments in {{cassandra.yaml}} strongly suggest that this should be tied 
> to CPU cores, hiding the fact that this really has to do with the number of 
> requests that can be serviced concurrently regardless of implementation 
> details of thrift/networking being sync/async/etc.
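
The relationship the description appeals to (threads needed as a function of throughput and average latency) is Little's law: mean requests in flight = arrival rate × mean time per request. A small sketch, with entirely hypothetical workload numbers and a hypothetical `threads_needed` helper, shows how far a cores-based default can be from that figure.

```python
import math


def threads_needed(requests_per_second, avg_latency_ms):
    """Little's law: mean in-flight requests = arrival rate * mean latency.

    With one thread held per active request, this is the minimum number of
    RPC worker threads needed to keep requests from queueing.
    """
    return math.ceil(requests_per_second * avg_latency_ms / 1000)


def hsha_default_threads(cpu_cores):
    """Default described in the yaml comment: quadruple the number of cores."""
    return 4 * cpu_cores


# Hypothetical node: 8 cores, 20,000 requests/s, 5 ms average latency
# (including time spent waiting on other nodes).
needed = threads_needed(20000, 5)
default = hsha_default_threads(8)
```

Under these assumed numbers the workload needs about 100 concurrent threads while the default provides 32, so requests would wait for a free worker before Cassandra starts timing them, which fits the client-only latency described above.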





[jira] [Comment Edited] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-05-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282683#comment-13282683
 ] 

Jonathan Ellis edited comment on CASSANDRA-2897 at 5/24/12 10:03 PM:
-

bq. this doesn't work for us, since (unlike Bigtable) we don't make an effort 
to preserve all older versions of a column on disk

We can fix this without having to go full-on Bigtable with value retention.  
"All" we need to do is have the memtable update code special case replacements 
in the CF map to issue an index delete against the replaced value.  Messy, but 
not as messy as having to maintain two KEYS index implementations.

So, we can add that as step 2.5 to my list above and we should be good.
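
As a rough illustration of the special case described above, here is a toy model in which a memtable write that replaces an existing value also deletes the index entry for the replaced value. The `IndexedMemtable` class and its dict-based layout are hypothetical and stand in for the real memtable and KEYS index code, which this sketch does not reproduce.

```python
class IndexedMemtable:
    """Toy memtable with one indexed column per row (hypothetical model)."""

    def __init__(self):
        self.rows = {}   # row key -> current value of the indexed column
        self.index = {}  # indexed value -> set of row keys

    def put(self, key, value):
        if key in self.rows:
            old = self.rows[key]
            if old != value:
                # Replacement in the map: issue an index delete against
                # the replaced value, without reading anything from disk.
                entries = self.index.get(old, set())
                entries.discard(key)
                if not entries:
                    self.index.pop(old, None)
        self.rows[key] = value
        self.index.setdefault(value, set()).add(key)


m = IndexedMemtable()
m.put("user1", "paris")
m.put("user1", "berlin")  # overwrite: the "paris" index entry is removed
```

Note that this only covers overwrites of values still held in the memtable; values already flushed to disk are outside what the sketch models.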


  was (Author: jbellis):
Adding code to the memtable update to issue an index delete when an 
overwrite happens would be messy, but not as messy as having to maintain two 
KEYS index implementations.

So, we can add that as step 2.5 to my list above and we should be good.

  
> Secondary indexes without read-before-write
> ---
>
> Key: CASSANDRA-2897
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Sylvain Lebresne
>Priority: Minor
>  Labels: secondary_index
>
> Currently, secondary index updates require a read-before-write to maintain 
> the index consistency. Keeping the index consistent at all times is not 
> necessary, however. We could let the (secondary) index get inconsistent on 
> writes and repair those on reads. This would be easy because on reads, we 
> make sure to request the indexed columns anyway, so we can just skip the rows 
> that are not needed and repair the index at the same time.
> This does trade work on writes for work on reads. However, read-before-write 
> is sufficiently costly that it will likely be a win overall.
> There are (at least) two small technical difficulties here though:
> # If we repair on read, this will be racy with writes, so we'll probably have 
> to synchronize there.
> # We probably shouldn't only rely on read to repair and we should also have a 
> task to repair the index for things that are rarely read. It's unclear how to 
> make that low impact though.
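
The repair-on-read idea in the description can be sketched as follows. This is a toy model under stated assumptions: `index` and `rows` are plain dicts, `indexed_lookup` is a hypothetical name, and the synchronization with concurrent writes raised in the first difficulty is deliberately left out.

```python
def indexed_lookup(index, rows, value):
    """Return the keys whose indexed column currently equals `value`.

    The index may hold stale entries because writes skip the
    read-before-write. Each candidate row is checked against its current
    value; mismatches are skipped and removed from the index on the spot.
    """
    live = []
    # Iterate over a sorted copy so the set can be modified while scanning.
    for key in sorted(index.get(value, set())):
        if rows.get(key) == value:
            live.append(key)
        else:
            index[value].discard(key)  # stale entry: repair on read
    return live


rows = {"a": "x", "b": "y"}            # current column values
index = {"x": {"a", "b"}, "y": {"b"}}  # "b" under "x" is stale
result = indexed_lookup(index, rows, "x")
```

Entries for values that are never queried are never repaired this way, which is the gap the second difficulty (a background repair task) is meant to cover.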





[jira] [Updated] (CASSANDRA-4238) Pig secondary index usage could be improved

2012-05-24 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-4238:


Attachment: 4238-v3.txt

v3 makes one small tweak, and prepends "index_" instead of appending "_index", 
since pig identifiers need to always begin with an alphanumeric character and 
this can guarantee that.
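
A tiny sketch of why the prefix helps, using hypothetical helper names rather than anything from the patch: whatever the column is called, a name built as `"index_" + column` always starts with the letter `i`, while a suffixed name inherits the column's first character.

```python
def index_field_name(column_name):
    """v3-style schema field name for an indexed column (prefix form)."""
    return "index_" + column_name


def suffixed_field_name(column_name):
    """v2-style name, shown only for comparison (suffix form)."""
    return column_name + "_index"


def starts_with_letter(identifier):
    """True if the first character of the identifier is alphabetic."""
    return identifier[:1].isalpha()


# A column whose name starts with a digit, chosen as a hypothetical example.
column = "2012_total"
```

With this example column, `index_field_name(column)` gives `index_2012_total`, while the suffixed form would begin with a digit.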

> Pig secondary index usage could be improved
> ---
>
> Key: CASSANDRA-4238
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4238
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Affects Versions: 1.1.0
>Reporter: Brandon Williams
>Assignee: Brandon Williams
> Attachments: 4238-v2.txt, 4238-v3.txt, 4238.txt
>
>
> As Dmitriy suggested on CASSANDRA-2246, CassandraStorage could implement 
> LoadMetadata.getPartitionKeys and LoadMetadata.setPartitionFilter to 
> automatically apply secondary indexes.





[jira] [Reopened] (CASSANDRA-3708) Support "composite prefix" tombstones

2012-05-24 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay reopened CASSANDRA-3708:
--


There is a bug in DeletionInfo.
{code}
public int dataSize()
{
    int size = TypeSizes.NATIVE.sizeof(topLevel.markedForDeleteAt);
    for (RangeTombstone r : ranges)
        size += r.data.markedForDeleteAt;
    return size;
}
{code}

1) We should do TypeSizes.NATIVE.sizeof(r.data.markedForDeleteAt)
2) We should also calculate the type sizes for the range tombstones.
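
To make point 1 concrete, here is a language-neutral model in Python rather than Cassandra's Java. It assumes `markedForDeleteAt` is an 8-byte long and uses hypothetical function names; point 2 (sizes of the range tombstones' other fields) is not modelled. In the Java original, the compound `+=` implicitly narrows the long timestamp to `int`, which is why the buggy line compiles at all.

```python
import struct

LONG_SIZE = struct.calcsize(">q")  # 8 bytes, matching a Java long


def data_size_buggy(top_level_marked_at, range_marked_ats):
    """Mirrors the reported bug: adds each timestamp VALUE to the size."""
    size = LONG_SIZE
    for marked_at in range_marked_ats:
        size += marked_at
    return size


def data_size_fixed(top_level_marked_at, range_marked_ats):
    """Point 1 of the report: add the serialized SIZE of each timestamp."""
    size = LONG_SIZE
    for _ in range_marked_ats:
        size += LONG_SIZE
    return size


# Two range tombstones with microsecond timestamps (hypothetical values).
timestamps = [1337900000000000, 1337900000000001]
fixed = data_size_fixed(0, timestamps)
buggy = data_size_buggy(0, timestamps)
```

The fixed version depends only on how many timestamps there are, whereas the buggy one grows with the timestamp values themselves.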

> Support "composite prefix" tombstones
> -
>
> Key: CASSANDRA-3708
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3708
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jonathan Ellis
>Assignee: Sylvain Lebresne
> Fix For: 1.2
>
>






[jira] [Updated] (CASSANDRA-3865) Cassandra-cli returns 'command not found' instead of syntax error

2012-05-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3865:
--

Fix Version/s: (was: 1.0.11)
   (was: 1.1.1)
   1.1.2

My fault, I added 1.0 as a fix target after Dave's patch since I thought it was 
going to be a quick fix.  Let's target 1.1 instead.

> Cassandra-cli returns 'command not found' instead of syntax error
> -
>
> Key: CASSANDRA-3865
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3865
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE 1.0.5
>Reporter: Eric Lubow
>Assignee: Dave Brosius
>Priority: Trivial
>  Labels: cassandra-cli
> Fix For: 1.1.2
>
> Attachments: parse_doubles_better.txt
>
>
> When creating a column family from the output of 'show schema' with an index, 
> there is a trailing comma after "index_type: 0,"  The return from this is a 
> 'command not found'  This is misleading because the command is found, there 
> is just a syntax error.
> 'Command not found: `create column family $cfname ...`





[jira] [Commented] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-05-24 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282687#comment-13282687
 ] 

Jeremy Hanna commented on CASSANDRA-2897:
-

So you're saying that it won't persist forever (as per your 18 May comment) 
because of the index delete that you would add to the memtable update.  That 
sounds fair.  We were just talking about the 18 May comment and thinking that 
it would be fine if there were either some automated regular cleanup or at 
least a nodetool type of command to clean it up.  Otherwise, to clean it up 
you'd have to delete and re-create the index or, less intrusively, do a full 
scan of the index.

> Secondary indexes without read-before-write
> ---
>
> Key: CASSANDRA-2897
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Sylvain Lebresne
>Priority: Minor
>  Labels: secondary_index
>
> Currently, secondary index updates require a read-before-write to maintain 
> the index consistency. Keeping the index consistent at all times is not 
> necessary, however. We could let the (secondary) index get inconsistent on 
> writes and repair those on reads. This would be easy because on reads, we 
> make sure to request the indexed columns anyway, so we can just skip the rows 
> that are not needed and repair the index at the same time.
> This does trade work on writes for work on reads. However, read-before-write 
> is sufficiently costly that it will likely be a win overall.
> There are (at least) two small technical difficulties here though:
> # If we repair on read, this will be racy with writes, so we'll probably have 
> to synchronize there.
> # We probably shouldn't only rely on read to repair and we should also have a 
> task to repair the index for things that are rarely read. It's unclear how to 
> make that low impact though.





[jira] [Commented] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-05-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282683#comment-13282683
 ] 

Jonathan Ellis commented on CASSANDRA-2897:
---

Adding code to the memtable update to issue an index delete when an overwrite 
happens would be messy, but not as messy as having to maintain two KEYS index 
implementations.

So, we can add that as step 2.5 to my list above and we should be good.


> Secondary indexes without read-before-write
> ---
>
> Key: CASSANDRA-2897
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Sylvain Lebresne
>Priority: Minor
>  Labels: secondary_index
>
> Currently, secondary index updates require a read-before-write to maintain 
> the index consistency. Keeping the index consistent at all times is not 
> necessary, however. We could let the (secondary) index get inconsistent on 
> writes and repair those on reads. This would be easy because on reads, we 
> make sure to request the indexed columns anyway, so we can just skip the rows 
> that are not needed and repair the index at the same time.
> This does trade work on writes for work on reads. However, read-before-write 
> is sufficiently costly that it will likely be a win overall.
> There are (at least) two small technical difficulties here though:
> # If we repair on read, this will be racy with writes, so we'll probably have 
> to synchronize there.
> # We probably shouldn't only rely on read to repair and we should also have a 
> task to repair the index for things that are rarely read. It's unclear how to 
> make that low impact though.





[jira] [Updated] (CASSANDRA-4238) Pig secondary index usage could be improved

2012-05-24 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-4238:


Attachment: 4238-v2.txt

v2 implements a workaround.  If PIG_PARTITION_FILTER is enabled, then each 
index (actual index, not plain validation) is appended as a top-level field to 
the schema after the bag, and the name has '_index' appended.  Thus, if there 
is an index on a column called 'name', you can use it with a statement like 
"filter rows by name_index eq 'foo'".

The caveat to this is that we have to relax the putNext function a bit to 
ignore these fields, so if you have this enabled and are storing a completely 
bad schema, it will just silently drop your bad fields as well.  However this 
is a small price to pay for the added functionality.

> Pig secondary index usage could be improved
> ---
>
> Key: CASSANDRA-4238
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4238
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Hadoop
>Affects Versions: 1.1.0
>Reporter: Brandon Williams
>Assignee: Brandon Williams
> Attachments: 4238-v2.txt, 4238.txt
>
>
> As Dmitriy suggested on CASSANDRA-2246, CassandraStorage could implement 
> LoadMetadata.getPartitionKeys and LoadMetadata.setPartitionFilter to 
> automatically apply secondary indexes.





[jira] [Commented] (CASSANDRA-4277) hsha default thread limits make no sense, and yaml comments look confused

2012-05-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282596#comment-13282596
 ] 

Sylvain Lebresne commented on CASSANDRA-4277:
-

I agree with Peter: StorageProxy being synchronous, we do need one thread per 
active request (this is "annoying" for CASSANDRA-2478 too). It would be neat to 
make StorageProxy asynchronous but that's likely very much non-trivial. So on 
the thread numbers, I also agree that some big number would be much better. 
Those threads will mostly spend time waiting, so I don't think the context 
switching will kill us anyway.


> hsha default thread limits make no sense, and yaml comments look confused
> -
>
> Key: CASSANDRA-4277
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4277
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Peter Schuller
>
> The cassandra.yaml states with respect to {{rpc_max_threads}}:
> {code}
> # For the Hsha server, the min and max both default to quadruple the number of
> # CPU cores.
> {code}
> The code seems to indeed do this. But this makes, as far as I can tell, no 
> sense what-so-ever since the number of concurrent RPC threads you need is a 
> function of the throughput and the average latency of requests (that includes 
> synchronously waiting on network traffic).
> Defaulting to anything having to do with CPU cores seems inherently wrong. If 
> a default is non-static, a closer guess might be to look at thread stack size 
> and heap size and infer what "might" be reasonable.
> *NOTE*: The effect of having this too low is "strange" (if you don't know 
> what's going on) latencies observed from the client on all thrift requests 
> (*any* thrift request, including e.g. {{describe_ring()}}), which aren't 
> visible in any latency metric exposed by Cassandra. This is why I consider 
> this "major", since unwitting users may be seeing detrimental performance for 
> no good reason.
> In addition, I read this about async:
> {code}
> # async -> Nonblocking server implementation with one thread to serve 
> #  rpc connections.  This is not recommended for high throughput use
> #  cases. Async has been tested to be about 50% slower than sync
> #  or hsha and is deprecated: it will be removed in the next major release.
> {code}
> This makes even less sense. Running with *one* rpc thread limits you to a 
> single concurrent request. How was that 50% number even attained? By 
> single-node testing being completely CPU bound locally on a node? The actual 
> effect should be "stupidly slow" in any real situation with lots of requests 
> on a cluster of many nodes and network traffic (though I didn't test that) - 
> especially in the event of any kind of hiccup like a node doing GC. I agree 
> that if the above is true, async should *definitely* be deprecated, but the 
> reasons seem *much* stronger than implied.
> I may be missing something here, in which case I apologize, but I 
> specifically double-checked after I fixed this setting on our clusters 
> after seeing exactly the expected side-effect of having it be too low. I 
> always was under the impression that rpc_max_threads affects the number of 
> RPC requests running concurrently, and code inspection (it being used for the 
> worker thread limit) + the effects of client-observed latency is consistent 
> with my understanding.
> I suspect the setting was set strangely by someone because the phrasing of 
> the comments in {{cassandra.yaml}} strongly suggest that this should be tied 
> to CPU cores, hiding the fact that this really has to do with the number of 
> requests that can be serviced concurrently regardless of implementation 
> details of thrift/networking being sync/async/etc.





[jira] [Resolved] (CASSANDRA-4281) schema agreement accross the nodes

2012-05-24 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-4281.
-

Resolution: Duplicate

Solved by CASSANDRA-4269

> schema agreement accross the nodes
> --
>
> Key: CASSANDRA-4281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4281
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.1.0
>Reporter: Claudio Atzori
>
> I'm creating a cluster of 2 nodes (for now), of cassandra 1.1.0, installed on 
> Ubuntu 10.04
> root@node2.d:/etc/cassandra# uname -a
> Linux node2 2.6.32-5-xen-amd64 #1 SMP Fri Sep 9 22:23:19 UTC 2011 x86_64 
> GNU/Linux
> with all defaults in cassandra.yaml, except for:
> cluster_name
> initial_token (I set a 50/50 balancing between the 2 nodes)
> seeds list (one of the 2 nodes ip address)
> #listen_address: localhost
> #rpc_address: localhost
> The 2 nodes recognize each other
> root@node2.d:/etc/cassandra# nodetool ring
> Address         DC          Rack   Status State   Load      Effective-Owership  Token
>                                                                                 85070591730234615865843651857942052864
> 146.48.122.136  datacenter1 rack1  Up     Normal  28.62 KB  100.00%             0
> 146.48.122.137  datacenter1 rack1  Up     Normal  21.79 KB  100.00%             85070591730234615865843651857942052864
> But, I'm experiencing an issue. I'm trying to define a new keyspace from the 
> cqlsh.
> cqlsh> CREATE KEYSPACE efg_mr WITH strategy_class = 'SimpleStrategy' AND 
> strategy_options:replication_factor=2 ;
> ..and ok, the new keyspace is seen across the 2 nodes. 
> cqlsh> DESCRIBE KEYSPACE efg_mr ;
> CREATE KEYSPACE efg_mr WITH strategy_class = 'SimpleStrategy'
>   AND strategy_options:replication_factor = '2';
> now I wanted to define a column family:
> cqlsh> CREATE COLUMNFAMILY records (KEY varchar PRIMARY KEY, title varchar, 
> year varchar) ;
> at this point I noticed an exception in /var/log/cassandra/output.log
> ERROR 14:28:47,475 Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.RuntimeException: java.nio.charset.MalformedInputException: Input 
> length = 1
>   at 
> org.apache.cassandra.cql3.ColumnIdentifier.<init>(ColumnIdentifier.java:50)
>   at 
> org.apache.cassandra.cql3.CFDefinition.getKeyId(CFDefinition.java:125)
>   at org.apache.cassandra.cql3.CFDefinition.<init>(CFDefinition.java:59)
>   at 
> org.apache.cassandra.config.CFMetaData.updateCfDef(CFMetaData.java:1278)
>   at org.apache.cassandra.config.CFMetaData.keyAlias(CFMetaData.java:221)
>   at 
> org.apache.cassandra.config.CFMetaData.fromSchemaNoColumns(CFMetaData.java:1162)
>   at 
> org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1190)
>   at 
> org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:291)
>   at 
> org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:358)
>   at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:270)
>   at 
> org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:248)
>   at 
> org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:48)
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.nio.charset.MalformedInputException: Input length = 1
>   at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
>   at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:781)
>   at 
> org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:163)
>   at 
> org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:120)
>   at 
> org.apache.cassandra.cql3.ColumnIdentifier.<init>(ColumnIdentifier.java:46)
>   ... 18 more
> and from now on, only one of the 2 nodes knows about the new column family, 
> the other one somehow hasn't been informed, or didn't complete the agreement 
> on the new column family.
> Since I'm creating a new cluster, I tried several times to drop all the data 
> (rm -rf /var/lib/cassandra/*) and start over again, but sometimes this 
> error still happens
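
For readers unfamiliar with the exception in the trace above, here is a minimal
sketch of how a strict UTF-8 decode produces "MalformedInputException: Input
length = 1". It only illustrates the JDK behaviour; the class and method names
are hypothetical and it does not claim to reproduce what Cassandra's
ByteBufferUtil does internally or which bytes triggered the failure here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

public class StrictDecodeDemo
{
    // Decode bytes as UTF-8. A fresh CharsetDecoder reports malformed input
    // by throwing instead of silently substituting a replacement character.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException
    {
        return StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws Exception
    {
        // A plain ASCII name decodes without trouble.
        System.out.println(decodeStrict("records".getBytes(StandardCharsets.UTF_8)));

        try
        {
            // The byte 0xFF can never appear in well-formed UTF-8.
            decodeStrict(new byte[]{ (byte) 0xFF });
        }
        catch (MalformedInputException e)
        {
            System.out.println("decode failed: " + e.getMessage());
        }
    }
}
```

Any byte sequence that is not text in the expected charset (for example a
binary column name) would fail the same way when treated as a string.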

[jira] [Commented] (CASSANDRA-4281) schema agreement accross the nodes

2012-05-24 Thread Claudio Atzori (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282584#comment-13282584
 ] 

Claudio Atzori commented on CASSANDRA-4281:
---

BTW, I already tried 
http://wiki.apache.org/cassandra/FAQ#schema_disagreement

But it doesn't seem to be a problem of different schema versions


[default@unknown] describe cluster; 
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions: 
c427ad5b-6780-3893-bb82-d66cb895a986: [146.48.122.137, 146.48.122.136]



[jira] [Commented] (CASSANDRA-4217) Easy access to column timestamps (and maybe ttl) during queries

2012-05-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282554#comment-13282554
 ] 

Jonathan Ellis commented on CASSANDRA-4217:
---

WFM.

> Easy access to column timestamps (and maybe ttl) during queries
> ---
>
> Key: CASSANDRA-4217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4217
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Affects Versions: 1.1.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>  Labels: cql3
> Fix For: 1.1.2
>
> Attachments: 4217.txt
>
>
> It would be interesting to allow accessing the timestamp/ttl of a column 
> through some syntax like
> {noformat}
> SELECT key, value, timestamp(value) FROM foo;
> {noformat}
> and the same for ttl.
> I'll note that currently timestamp and ttl are returned in the resultset 
> because it includes the thrift Column object, but adding such syntax would make 
> our future protocol potentially simpler as we wouldn't then have to care 
> about timestamps explicitly (and more compact in general as we would only 
> return timestamps when asked)





[jira] [Commented] (CASSANDRA-4279) kick off background compaction when min/max changed

2012-05-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282540#comment-13282540
 ] 

Sylvain Lebresne commented on CASSANDRA-4279:
-

+1

> kick off background compaction when min/max changed
> ---
>
> Key: CASSANDRA-4279
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4279
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jonathan Ellis
>Assignee: Jonathan Ellis
>Priority: Trivial
>  Labels: compaction
> Fix For: 1.0.11, 1.1.1
>
> Attachments: 4279.txt
>
>
> When the threshold changes, we may be eligible for a compaction immediately 
> (without waiting for a flush to trigger the eligibility check).





[jira] [Updated] (CASSANDRA-4278) Can't specify certain keyspace properties in CQL

2012-05-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4278:
--

Reviewer: xedin

> Can't specify certain keyspace properties in CQL
> 
>
> Key: CASSANDRA-4278
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4278
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: paul cannon
>Assignee: Sylvain Lebresne
>Priority: Minor
>  Labels: cql, cql3
> Fix For: 1.1.1
>
> Attachments: 4278.txt
>
>
> A user using EC2MultiRegionSnitch, where the datacenter names have to match the 
> AWS region names, will not be able to specify a keyspace's replica counts for 
> those datacenters using CQL. AWS region names contain hyphens, which are not 
> valid in CQL identifiers, and CQL keyspace/columnfamily properties must be 
> identifiers or identifiers separated by colons.
> Example:
> {noformat}
> CREATE KEYSPACE Foo
>   WITH strategy_class = 'NetworkTopologyStrategy'
>   AND strategy_options:"us-east"=1
>   AND strategy_options:"us-west"=1;
> {noformat}
> (see 
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201205.mbox/browser 
> for context)
> ..will not currently work, with or without the double quotes.
> CQL should either allow hyphens in COMPIDENT, or allow quoted parts of a 
> COMPIDENT token.
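
To make the failure concrete, here is a small sketch of why a name such as
us-east is rejected as a bare identifier. The regular expression is an assumed
approximation of an unquoted identifier (a letter followed by letters, digits
or underscores); it is not copied from Cql.g, and the class and method names
are hypothetical.

```java
import java.util.regex.Pattern;

public class PropertyNameCheck
{
    // Assumption: an unquoted identifier is a letter followed by letters,
    // digits or underscores. The real grammar rule may differ in detail.
    private static final Pattern UNQUOTED = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    static boolean isUnquotedIdentifier(String s)
    {
        return UNQUOTED.matcher(s).matches();
    }

    public static void main(String[] args)
    {
        // The hyphenated AWS-style name is the only one that fails the check.
        for (String name : new String[]{ "replication_factor", "us_east", "us-east" })
            System.out.println(name + " -> " + isUnquotedIdentifier(name));
    }
}
```

Under that assumption the hyphen is simply outside the identifier alphabet,
which is why the report asks for either hyphen support or quoted parts.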





[2/3] git commit: Fix range queries with secondary indexes

2012-05-24 Thread slebresne
Fix range queries with secondary indexes

patch by slebresne; reviewed by xedin for CASSANDRA-4257


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cbf04361
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cbf04361
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cbf04361

Branch: refs/heads/trunk
Commit: cbf0436181fe1f47eff98f54aa161dd5fbca0479
Parents: f77cd11
Author: Sylvain Lebresne 
Authored: Thu May 24 16:44:32 2012 +0200
Committer: Sylvain Lebresne 
Committed: Thu May 24 16:44:32 2012 +0200

--
 CHANGES.txt|1 +
 .../cassandra/cql3/statements/SelectStatement.java |7 +--
 2 files changed, 6 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/cbf04361/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 487f388..8c58af7 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -62,6 +62,7 @@
  * Fix exception on colum metadata with non-string comparator (CASSANDRA-4269)
  * Check for unknown/invalid compression options (CASSANDRA-4266)
  * (cql3) Adds simple access to column timestamp and ttl (CASSANDRA-4217)
+ * (cql3) Fix range queries with secondary indexes (CASSANDRA-4257)
 Merged from 1.0:
  * Fix super columns bug where cache is not updated (CASSANDRA-4190)
  * fix maxTimestamp to include row tombstones (CASSANDRA-4116)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cbf04361/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index d7089f9..26f082b 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -546,8 +546,11 @@ public class SelectStatement implements CQLStatement
 {
 for (Bound b : Bound.values())
 {
-ByteBuffer value = 
restriction.bound(b).getByteBuffer(name.type, variables);
-expressions.add(new IndexExpression(name.name.key, 
restriction.getIndexOperator(b), value));
+if (restriction.bound(b) != null)
+{
+ByteBuffer value = 
restriction.bound(b).getByteBuffer(name.type, variables);
+expressions.add(new IndexExpression(name.name.key, 
restriction.getIndexOperator(b), value));
+}
 }
 }
 }
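
The effect of the added null check can be sketched in isolation. The example
below is a simplified stand-in, not the Cassandra code: Restriction, its
with(...) helper and the string expressions are hypothetical, and only the
shape of the loop (iterate over every Bound, skip the ones that are not set)
follows the diff above.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class BoundGuardSketch
{
    enum Bound { START, END }

    // Hypothetical stand-in for a range restriction: either bound may be absent.
    static class Restriction
    {
        private final Map<Bound, Integer> bounds = new EnumMap<Bound, Integer>(Bound.class);

        Restriction with(Bound b, int value)
        {
            bounds.put(b, value);
            return this;
        }

        // Returns null when the query did not specify this bound.
        Integer bound(Bound b)
        {
            return bounds.get(b);
        }
    }

    static List<String> toExpressions(String column, Restriction r)
    {
        List<String> expressions = new ArrayList<String>();
        for (Bound b : Bound.values())
        {
            Integer value = r.bound(b);
            // The guard: without it, a one-sided range would dereference null.
            if (value != null)
                expressions.add(column + (b == Bound.START ? " >= " : " <= ") + value);
        }
        return expressions;
    }

    public static void main(String[] args)
    {
        // Only a lower bound is set, so only one expression is produced.
        System.out.println(toExpressions("age", new Restriction().with(Bound.START, 5)));
    }
}
```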



[3/3] git commit: Allow accessing column timestamp and ttl in CQL3

2012-05-24 Thread slebresne
Allow accessing column timestamp and ttl in CQL3

patch by slebresne; reviewed by xedin for CASSANDRA-4217


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f77cd113
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f77cd113
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f77cd113

Branch: refs/heads/trunk
Commit: f77cd11373a85c4136e76f004c5bf8c45e875f09
Parents: a4f06c2
Author: Sylvain Lebresne 
Authored: Thu May 24 16:42:01 2012 +0200
Committer: Sylvain Lebresne 
Committed: Thu May 24 16:42:01 2012 +0200

--
 CHANGES.txt|1 +
 .../apache/cassandra/cql3/ColumnIdentifier.java|   19 ++-
 src/java/org/apache/cassandra/cql3/Cql.g   |   21 ++-
 .../cassandra/cql3/statements/SelectStatement.java |  127 +++
 .../apache/cassandra/cql3/statements/Selector.java |   98 +++
 5 files changed, 225 insertions(+), 41 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f77cd113/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 55a34ef..487f388 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -61,6 +61,7 @@
  * rename stress to cassandra-stress for saner packaging (CASSANDRA-4256)
  * Fix exception on colum metadata with non-string comparator (CASSANDRA-4269)
  * Check for unknown/invalid compression options (CASSANDRA-4266)
+ * (cql3) Adds simple access to column timestamp and ttl (CASSANDRA-4217)
 Merged from 1.0:
  * Fix super columns bug where cache is not updated (CASSANDRA-4190)
  * fix maxTimestamp to include row tombstones (CASSANDRA-4116)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f77cd113/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java
--
diff --git a/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java 
b/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java
index 337e619..557e420 100644
--- a/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java
+++ b/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java
@@ -25,10 +25,12 @@ import java.nio.ByteBuffer;
 import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.utils.ByteBufferUtil;
 
+import org.apache.cassandra.cql3.statements.Selector;
+
 /**
  * Represents an identifer for a CQL column definition.
  */
-public class ColumnIdentifier implements Comparable<ColumnIdentifier>
+public class ColumnIdentifier implements Comparable<ColumnIdentifier>, Selector
 {
 public final ByteBuffer key;
 private final String text;
@@ -70,4 +72,19 @@ public class ColumnIdentifier implements 
Comparable<ColumnIdentifier>
 {
 return key.compareTo(other.key);
 }
+
+public ColumnIdentifier id()
+{
+return this;
+}
+
+public boolean hasFunction()
+{
+return false;
+}
+
+public Selector.Function function()
+{
+return null;
+}
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f77cd113/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g 
b/src/java/org/apache/cassandra/cql3/Cql.g
index 6e6240f..5123cbb 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -179,14 +179,21 @@ selectStatement returns [SelectStatement.RawStatement 
expr]
   }
 ;
 
-selectClause returns [List expr]
-: ids=cidentList { $expr = ids; }
-| '\*'   { $expr = Collections.emptyList();}
+selectClause returns [List expr]
+: t1=selector { $expr = new ArrayList(); $expr.add(t1); } (',' 
tN=selector { $expr.add(tN); })*
+| '\*' { $expr = Collections.emptyList();}
 ;
 
-selectCountClause returns [List expr]
-: c=selectClause { $expr = c; }
-| i=INTEGER  { if (!i.getText().equals("1")) addRecognitionError("Only 
COUNT(1) is supported, got COUNT(" + i.getText() + ")"); $expr = 
Collections.emptyList();}
+selector returns [Selector s]
+: c=cident { $s = c; }
+| K_WRITETIME '(' c=cident ')' { $s = new Selector.WithFunction(c, 
Selector.Function.WRITE_TIME); }
+| K_TTL '(' c=cident ')'   { $s = new Selector.WithFunction(c, 
Selector.Function.TTL); }
+;
+
+selectCountClause returns [List expr]
+: ids=cidentList { $expr = new ArrayList(ids); }
+| '\*'   { $expr = Collections.emptyList();}
+| i=INTEGER  { if (!i.getText().equals("1")) addRecognitionError("Only 
COUNT(1) is supported, got COUNT(" + i.getText() + ")"); $expr = 
Collections.emptyList();}
 ;
 
 whereClause returns [List clause]
@@ -547,6 +554,7 @@ unreserved_keyword returns [String str]
 | K_STORAGE
 | K_TYPE
 | K_VAL

[1/3] git commit: Merge branch 'cassandra-1.1' into trunk

2012-05-24 Thread slebresne
Updated Branches:
  refs/heads/trunk 969a310c5 -> 2979820e5


Merge branch 'cassandra-1.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2979820e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2979820e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2979820e

Branch: refs/heads/trunk
Commit: 2979820e5cde11bb6b3f5e6e984af87c9a1572ab
Parents: 969a310 cbf0436
Author: Sylvain Lebresne 
Authored: Thu May 24 16:47:02 2012 +0200
Committer: Sylvain Lebresne 
Committed: Thu May 24 16:47:02 2012 +0200

--
 CHANGES.txt|2 +
 .../apache/cassandra/cql3/ColumnIdentifier.java|   19 ++-
 src/java/org/apache/cassandra/cql3/Cql.g   |   21 ++-
 .../cassandra/cql3/statements/SelectStatement.java |  134 +++
 .../apache/cassandra/cql3/statements/Selector.java |   98 +++
 5 files changed, 231 insertions(+), 43 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2979820e/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2979820e/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2979820e/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--



[jira] [Commented] (CASSANDRA-4217) Easy access to column timestamps (and maybe ttl) during queries

2012-05-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282536#comment-13282536
 ] 

Sylvain Lebresne commented on CASSANDRA-4217:
-

Committed, thanks.

Btw, my patch uses 'writetime'. Is that fine for everyone? There is still time 
to change it.
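
For anyone following along without the patch, here is a rough sketch of the
selector model it introduces. The method names id(), hasFunction() and
function(), the WithFunction class and the WRITE_TIME/TTL constants come from
the committed diff; everything else (the Column class, String ids and the
render helper) is a hypothetical simplification, since Selector.java itself is
not quoted in this thread.

```java
public class SelectorSketch
{
    enum Function { WRITE_TIME, TTL }

    interface Selector
    {
        String id();
        boolean hasFunction();
        Function function();
    }

    // A bare column: selects the value itself, no function applied.
    static class Column implements Selector
    {
        private final String name;

        Column(String name) { this.name = name; }

        public String id() { return name; }
        public boolean hasFunction() { return false; }
        public Function function() { return null; }
    }

    // A column wrapped in writetime(...) or ttl(...).
    static class WithFunction implements Selector
    {
        private final Column column;
        private final Function function;

        WithFunction(Column column, Function function)
        {
            this.column = column;
            this.function = function;
        }

        public String id() { return column.id(); }
        public boolean hasFunction() { return true; }
        public Function function() { return function; }
    }

    // Hypothetical helper: print a selector the way it would appear in a query.
    static String render(Selector s)
    {
        if (!s.hasFunction())
            return s.id();
        String name = s.function() == Function.WRITE_TIME ? "writetime" : "ttl";
        return name + "(" + s.id() + ")";
    }

    public static void main(String[] args)
    {
        System.out.println(render(new Column("value")));
        System.out.println(render(new WithFunction(new Column("value"), Function.WRITE_TIME)));
        System.out.println(render(new WithFunction(new Column("value"), Function.TTL)));
    }
}
```

If the keyword is renamed, only the string used for WRITE_TIME in render would
change in this sketch.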

> Easy access to column timestamps (and maybe ttl) during queries
> ---
>
> Key: CASSANDRA-4217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4217
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API
>Affects Versions: 1.1.0
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>  Labels: cql3
> Fix For: 1.1.2
>
> Attachments: 4217.txt
>
>
> It would be interesting to allow accessing the timestamp/ttl of a column 
> through some syntax like
> {noformat}
> SELECT key, value, timestamp(value) FROM foo;
> {noformat}
> and the same for ttl.
> I'll note that currently timestamp and ttl are returned in the resultset 
> because it includes the thrift Column object, but adding such syntax would make 
> our future protocol potentially simpler as we wouldn't then have to care 
> about timestamps explicitly (and more compact in general as we would only 
> return timestamps when asked)





[2/2] git commit: Allow accessing column timestamp and ttl in CQL3

2012-05-24 Thread slebresne
Allow accessing column timestamp and ttl in CQL3

patch by slebresne; reviewed by xedin for CASSANDRA-4217


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f77cd113
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f77cd113
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f77cd113

Branch: refs/heads/cassandra-1.1
Commit: f77cd11373a85c4136e76f004c5bf8c45e875f09
Parents: a4f06c2
Author: Sylvain Lebresne 
Authored: Thu May 24 16:42:01 2012 +0200
Committer: Sylvain Lebresne 
Committed: Thu May 24 16:42:01 2012 +0200

--
 CHANGES.txt|1 +
 .../apache/cassandra/cql3/ColumnIdentifier.java|   19 ++-
 src/java/org/apache/cassandra/cql3/Cql.g   |   21 ++-
 .../cassandra/cql3/statements/SelectStatement.java |  127 +++
 .../apache/cassandra/cql3/statements/Selector.java |   98 +++
 5 files changed, 225 insertions(+), 41 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f77cd113/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 55a34ef..487f388 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -61,6 +61,7 @@
  * rename stress to cassandra-stress for saner packaging (CASSANDRA-4256)
  * Fix exception on colum metadata with non-string comparator (CASSANDRA-4269)
  * Check for unknown/invalid compression options (CASSANDRA-4266)
+ * (cql3) Adds simple access to column timestamp and ttl (CASSANDRA-4217)
 Merged from 1.0:
  * Fix super columns bug where cache is not updated (CASSANDRA-4190)
  * fix maxTimestamp to include row tombstones (CASSANDRA-4116)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f77cd113/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java
--
diff --git a/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java 
b/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java
index 337e619..557e420 100644
--- a/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java
+++ b/src/java/org/apache/cassandra/cql3/ColumnIdentifier.java
@@ -25,10 +25,12 @@ import java.nio.ByteBuffer;
 import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.utils.ByteBufferUtil;
 
+import org.apache.cassandra.cql3.statements.Selector;
+
 /**
  * Represents an identifer for a CQL column definition.
  */
-public class ColumnIdentifier implements Comparable<ColumnIdentifier>
+public class ColumnIdentifier implements Comparable<ColumnIdentifier>, Selector
 {
 public final ByteBuffer key;
 private final String text;
@@ -70,4 +72,19 @@ public class ColumnIdentifier implements 
Comparable<ColumnIdentifier>
 {
 return key.compareTo(other.key);
 }
+
+public ColumnIdentifier id()
+{
+return this;
+}
+
+public boolean hasFunction()
+{
+return false;
+}
+
+public Selector.Function function()
+{
+return null;
+}
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f77cd113/src/java/org/apache/cassandra/cql3/Cql.g
--
diff --git a/src/java/org/apache/cassandra/cql3/Cql.g 
b/src/java/org/apache/cassandra/cql3/Cql.g
index 6e6240f..5123cbb 100644
--- a/src/java/org/apache/cassandra/cql3/Cql.g
+++ b/src/java/org/apache/cassandra/cql3/Cql.g
@@ -179,14 +179,21 @@ selectStatement returns [SelectStatement.RawStatement 
expr]
   }
 ;
 
-selectClause returns [List expr]
-: ids=cidentList { $expr = ids; }
-| '\*'   { $expr = Collections.emptyList();}
+selectClause returns [List expr]
+: t1=selector { $expr = new ArrayList(); $expr.add(t1); } (',' 
tN=selector { $expr.add(tN); })*
+| '\*' { $expr = Collections.emptyList();}
 ;
 
-selectCountClause returns [List expr]
-: c=selectClause { $expr = c; }
-| i=INTEGER  { if (!i.getText().equals("1")) addRecognitionError("Only 
COUNT(1) is supported, got COUNT(" + i.getText() + ")"); $expr = 
Collections.emptyList();}
+selector returns [Selector s]
+: c=cident { $s = c; }
+| K_WRITETIME '(' c=cident ')' { $s = new Selector.WithFunction(c, 
Selector.Function.WRITE_TIME); }
+| K_TTL '(' c=cident ')'   { $s = new Selector.WithFunction(c, 
Selector.Function.TTL); }
+;
+
+selectCountClause returns [List expr]
+: ids=cidentList { $expr = new ArrayList(ids); }
+| '\*'   { $expr = Collections.emptyList();}
+| i=INTEGER  { if (!i.getText().equals("1")) addRecognitionError("Only 
COUNT(1) is supported, got COUNT(" + i.getText() + ")"); $expr = 
Collections.emptyList();}
 ;
 
 whereClause returns [List clause]
@@ -547,6 +554,7 @@ unreserved_keyword returns [String str]
 | K_STORAGE
 | K_TYPE

[1/2] git commit: Fix range queries with secondary indexes

2012-05-24 Thread slebresne
Updated Branches:
  refs/heads/cassandra-1.1 a4f06c237 -> cbf043618


Fix range queries with secondary indexes

patch by slebresne; reviewed by xedin for CASSANDRA-4257


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cbf04361
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cbf04361
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cbf04361

Branch: refs/heads/cassandra-1.1
Commit: cbf0436181fe1f47eff98f54aa161dd5fbca0479
Parents: f77cd11
Author: Sylvain Lebresne 
Authored: Thu May 24 16:44:32 2012 +0200
Committer: Sylvain Lebresne 
Committed: Thu May 24 16:44:32 2012 +0200

--
 CHANGES.txt|1 +
 .../cassandra/cql3/statements/SelectStatement.java |7 +--
 2 files changed, 6 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/cbf04361/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 487f388..8c58af7 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -62,6 +62,7 @@
  * Fix exception on colum metadata with non-string comparator (CASSANDRA-4269)
  * Check for unknown/invalid compression options (CASSANDRA-4266)
  * (cql3) Adds simple access to column timestamp and ttl (CASSANDRA-4217)
+ * (cql3) Fix range queries with secondary indexes (CASSANDRA-4257)
 Merged from 1.0:
  * Fix super columns bug where cache is not updated (CASSANDRA-4190)
  * fix maxTimestamp to include row tombstones (CASSANDRA-4116)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cbf04361/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index d7089f9..26f082b 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -546,8 +546,11 @@ public class SelectStatement implements CQLStatement
 {
 for (Bound b : Bound.values())
 {
-ByteBuffer value = 
restriction.bound(b).getByteBuffer(name.type, variables);
-expressions.add(new IndexExpression(name.name.key, 
restriction.getIndexOperator(b), value));
+if (restriction.bound(b) != null)
+{
+ByteBuffer value = 
restriction.bound(b).getByteBuffer(name.type, variables);
+expressions.add(new IndexExpression(name.name.key, 
restriction.getIndexOperator(b), value));
+}
 }
 }
 }



[jira] [Commented] (CASSANDRA-4018) Add column metadata to system columnfamilies

2012-05-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282526#comment-13282526
 ] 

Sylvain Lebresne commented on CASSANDRA-4018:
-

Some remarks on the patches:
* The patch removes the gc_grace (of 3 months) for schema tables, but schema 
tables really need their gc_grace.
* The second patch allows uppercase for properties in create table, but 
CASSANDRA-4278 has a more generic solution for that. It'd probably be better to 
rebase this on top of CASSANDRA-4278 (once it's committed).
* In HHOM, the table definition in the comments doesn't correspond to the actual 
schema. And since the rest of the code seems to expect the schema from the 
comment, I'm not sure this works as expected (as a side note, it'd be safer to 
reuse the actual CF comparator rather than redefining the compositeType).
* Given that the peers table will later potentially store multiple tokens per 
host, it seems 'peers' is not the best name. Mostly this will be an inverted 
index for tokens, so maybe 'tokens' or 'tokens_maps' or something like that 
would be more suitable? But on a more general level, if we had CASSANDRA-3647, 
we could imagine having a peers system table where the key would be the unique 
host id and we would store as info the inet address and the list of tokens. 
Just saying.
* In hints, following CASSANDRA-4120, the row key is the host id, not the 
token, so we should probably rename it in the schema (and maybe use the UUID 
type).
* In SystemTable, when inserting into Peers, it uses the column name "token", 
but the table definition uses the name "peer" (which is better since it's the 
inet address that this column stores).
* In SystemTable, when fetching keys in Peers, instead of doing the query 
manually and doing a resultify, why not use CQL3 for the query itself?
* In SystemTable.getLocalHostId, the PEERS_CF is queried instead of LOCAL_CF.
* In NodeIdCf, the id is actually a UUID (and the code does rely on the sorting 
being that of a (time) UUID).

Otherwise, after this patch only IndexCf and NodeIdCf will have case-sensitive 
names. Do we want to do something about that? I guess those are not the most 
useful tables for users to query, so it's probably not a big deal, but I wanted 
to mention it to get other opinions.


> Add column metadata to system columnfamilies
> 
>
> Key: CASSANDRA-4018
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4018
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Jonathan Ellis
>Assignee: Jonathan Ellis
>Priority: Minor
> Fix For: 1.2
>
>
> CASSANDRA-3792 adds this to the schema CFs; we should modernize the other 
> system CFs as well





[jira] [Updated] (CASSANDRA-4278) Can't specify certain keyspace properties in CQL

2012-05-24 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-4278:


Attachment: 4278.txt

Attaching a patch that cleans up (I think) the definition of property names.
Not only does it allow quoted identifiers, it also makes the unquoted ones case
insensitive. I'll note that it removes support for integers without quotes,
i.e., one can't write "strategy_options:4 = ...", but I'm pretty sure this was
neither used, nor is it useful.
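
A minimal sketch of the naming rule described in this comment, written in Python for illustration only. It is not the patch or the CQL grammar: the function names are hypothetical, lowercasing is just one assumed way to get case insensitivity, and the sketch assumes no colons or escaped quotes inside a quoted part.

```python
def normalize_property_part(part: str) -> str:
    """Normalize one part of a property name.

    Quoted parts keep their exact spelling (so hyphens and case survive);
    unquoted parts are lowercased, making them case insensitive.
    """
    if len(part) >= 2 and part.startswith('"') and part.endswith('"'):
        return part[1:-1]
    return part.lower()

def normalize_property_name(name: str) -> list:
    """Split a colon-separated property name and normalize each part.

    Assumption for this sketch: quoted parts never contain a colon.
    """
    return [normalize_property_part(p) for p in name.split(":")]

parts = normalize_property_name('Strategy_Options:"us-east"')
```

With this rule, `Strategy_Options` and `strategy_options` name the same property, while `"us-east"` is passed through unchanged.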

> Can't specify certain keyspace properties in CQL
> 
>
> Key: CASSANDRA-4278
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4278
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: paul cannon
>Assignee: Sylvain Lebresne
>Priority: Minor
>  Labels: cql, cql3
> Fix For: 1.1.1
>
> Attachments: 4278.txt
>
>
> A user using EC2MultiRegionSnitch, where the datacenter name has to match the 
> AWS region names, will not be able to specify a keyspace's replica counts for 
> those datacenters using CQL. AWS region names contain hyphens, which are not 
> valid identifiers in CQL, and CQL keyspace/columnfamily properties must be 
> identifiers or identifiers separated by colons.
> Example:
> {noformat}
> CREATE KEYSPACE Foo
>   WITH strategy_class = 'NetworkTopologyStrategy'
>   AND strategy_options:"us-east"=1
>   AND strategy_options:"us-west"=1;
> {noformat}
> (see 
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201205.mbox/browser 
> for context)
> ..will not currently work, with or without the double quotes.
> CQL should either allow hyphens in COMPIDENT, or allow quoted parts of a 
> COMPIDENT token.





[jira] [Comment Edited] (CASSANDRA-4278) Can't specify certain keyspace properties in CQL

2012-05-24 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282524#comment-13282524
 ] 

Sylvain Lebresne edited comment on CASSANDRA-4278 at 5/24/12 2:34 PM:
--

Attaching a patch that cleans up (I think) the definition of property names.
Not only does it allow quoted identifiers, it also makes the unquoted ones case
insensitive. I'll note that it removes support for integers without quotes,
i.e., one can't write "strategy_options:4 = ...", but I'm pretty sure this was
neither used, nor is it useful.

  was (Author: slebresne):
Attaching a patch that cleans up (I think) the definition of property names.
Not only does it allow quoted identifiers, it also makes the unquoted ones
case insensitive. I'll note that it removes support for integers without
quotes, i.e., one can't write "strategy_options:4 = ...", but I'm pretty sure
this was neither used, not is it useful.
  





[jira] [Created] (CASSANDRA-4281) schema agreement accross the nodes

2012-05-24 Thread Claudio Atzori (JIRA)
Claudio Atzori created CASSANDRA-4281:
-

 Summary: schema agreement accross the nodes
 Key: CASSANDRA-4281
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4281
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
Reporter: Claudio Atzori


I'm creating a cluster of 2 nodes (for now) running Cassandra 1.1.0, installed
on Ubuntu 10.04:

root@node2.d:/etc/cassandra# uname -a
Linux node2 2.6.32-5-xen-amd64 #1 SMP Fri Sep 9 22:23:19 UTC 2011 x86_64 
GNU/Linux

with all defaults in cassandra.yaml, except for:
cluster_name
initial_token (I set a 50/50 balance between the 2 nodes)
seeds list (the IP address of one of the 2 nodes)
#listen_address: localhost
#rpc_address: localhost

The 2 nodes recognize each other:

root@node2.d:/etc/cassandra# nodetool ring
Address         DC          Rack   Status State   Load      Effective-Owership  Token
                                                                                85070591730234615865843651857942052864
146.48.122.136  datacenter1 rack1  Up     Normal  28.62 KB  100.00%             0
146.48.122.137  datacenter1 rack1  Up     Normal  21.79 KB  100.00%             85070591730234615865843651857942052864

But I'm experiencing an issue. I'm trying to define a new keyspace from cqlsh:

cqlsh> CREATE KEYSPACE efg_mr WITH strategy_class = 'SimpleStrategy' AND 
strategy_options:replication_factor=2 ;

...and OK, the new keyspace is seen across the 2 nodes:

cqlsh> DESCRIBE KEYSPACE efg_mr ;

CREATE KEYSPACE efg_mr WITH strategy_class = 'SimpleStrategy'
  AND strategy_options:replication_factor = '2';

Next, I wanted to define a column family:

cqlsh> CREATE COLUMNFAMILY records (KEY varchar PRIMARY KEY, title varchar, 
year varchar) ;

At this point I noticed an exception in /var/log/cassandra/output.log:

ERROR 14:28:47,475 Exception in thread Thread[MigrationStage:1,5,main]
java.lang.RuntimeException: java.nio.charset.MalformedInputException: Input length = 1
    at org.apache.cassandra.cql3.ColumnIdentifier.<init>(ColumnIdentifier.java:50)
    at org.apache.cassandra.cql3.CFDefinition.getKeyId(CFDefinition.java:125)
    at org.apache.cassandra.cql3.CFDefinition.<init>(CFDefinition.java:59)
    at org.apache.cassandra.config.CFMetaData.updateCfDef(CFMetaData.java:1278)
    at org.apache.cassandra.config.CFMetaData.keyAlias(CFMetaData.java:221)
    at org.apache.cassandra.config.CFMetaData.fromSchemaNoColumns(CFMetaData.java:1162)
    at org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1190)
    at org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:291)
    at org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:358)
    at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:270)
    at org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:248)
    at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:781)
    at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:163)
    at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:120)
    at org.apache.cassandra.cql3.ColumnIdentifier.<init>(ColumnIdentifier.java:46)
    ... 18 more

From then on, only one of the 2 nodes knows about the new column family; the
other one somehow hasn't been informed, or didn't complete the agreement on the
new column family.

Since I'm creating a new cluster, I tried several times to drop all the data
(rm -rf /var/lib/cassandra/*) and start over again. But sometimes this error
happens on the column family definition, sometimes after a CREATE INDEX command.

Am I doing something wrong?
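
The bottom of the trace shows ColumnIdentifier calling ByteBufferUtil.string, which fails inside CharsetDecoder.decode with MalformedInputException. In other words, some bytes were decoded strictly as text and were not valid in the expected charset. The sketch below reproduces that class of failure in Python for illustration only. It assumes UTF-8, the function names are hypothetical, and it says nothing about which bytes the key alias actually contained in this report.

```python
def decode_identifier(raw: bytes) -> str:
    """Strictly decode raw bytes as UTF-8; invalid input raises an error."""
    return raw.decode("utf-8", errors="strict")

def try_decode(raw: bytes):
    """Return the decoded text, or None when the bytes are not valid UTF-8."""
    try:
        return decode_identifier(raw)
    except UnicodeDecodeError:
        # Python's counterpart of the malformed-input failure in the trace.
        return None

ok = try_decode(b"KEY")      # plain ASCII is valid UTF-8
bad = try_decode(b"\xff")    # 0xFF can never appear in valid UTF-8
```

A strict decoder refuses the whole input as soon as it meets one invalid byte, which fits the "Input length = 1" detail in the exception message.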


[jira] [Commented] (CASSANDRA-3865) Cassandra-cli returns 'command not found' instead of syntax error

2012-05-24 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282501#comment-13282501
 ] 

Pavel Yaskevich commented on CASSANDRA-3865:


I think that Dave's patch is one part of it; another would be to change
"Command not found" to "Error in the command" and add information from
RecognitionException (which NoViableAltException extends) about where the
recognition error actually happened.
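
A minimal sketch of the kind of message this suggests, in Python for illustration only. The `RecognitionInfo` class and `format_cli_error` function are hypothetical stand-ins, not the cassandra-cli code or the ANTLR API; the sketch only assumes that the parser error carries a line, a column and the offending token, and that the statement fits on one line.

```python
from dataclasses import dataclass

@dataclass
class RecognitionInfo:
    """Hypothetical holder for the position data a parser error provides."""
    line: int
    column: int
    token: str

def format_cli_error(statement: str, info: RecognitionInfo) -> str:
    """Build an 'Error in the command' message that points at the bad token."""
    pointer = " " * info.column + "^"
    return (
        f"Error in the command at line {info.line}, column {info.column} "
        f"(near '{info.token}'):\n{statement}\n{pointer}"
    )

# A trailing comma before the terminator, as in the reported case.
message = format_cli_error("index_type: 0,;", RecognitionInfo(1, 14, ";"))
```

Printing the statement with a caret under the offending token tells the user the command was recognized and shows where the syntax went wrong.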

> Cassandra-cli returns 'command not found' instead of syntax error
> -
>
> Key: CASSANDRA-3865
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3865
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE 1.0.5
>Reporter: Eric Lubow
>Assignee: Dave Brosius
>Priority: Trivial
>  Labels: cassandra-cli
> Fix For: 1.0.11, 1.1.1
>
> Attachments: parse_doubles_better.txt
>
>
> When creating a column family from the output of 'show schema' with an index, 
> there is a trailing comma after "index_type: 0,". The return from this is 
> 'command not found'. This is misleading because the command is found; there 
> is just a syntax error.
> 'Command not found: `create column family $cfname ...`





[jira] [Comment Edited] (CASSANDRA-3865) Cassandra-cli returns 'command not found' instead of syntax error

2012-05-24 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282501#comment-13282501
 ] 

Pavel Yaskevich edited comment on CASSANDRA-3865 at 5/24/12 1:30 PM:
-

I think that Dave's patch is one part of it; another would be to change
"Command not found" to "Error in the command" and add information from
RecognitionException (which NoViableAltException extends) about where the
recognition error actually happened.

Edit: Also, Dave, please don't forget that this is intended for inclusion in
1.0, so patches should be against the cassandra-1.0 branch instead.

  was (Author: xedin):
I think that Dave's patch is one part of it; another would be to change
"Command not found" to "Error in the command" and add information from
RecognitionException (which NoViableAltException extends) about where the
recognition error actually happened.
  





[jira] [Resolved] (CASSANDRA-4272) Keyspace name case sensitivity

2012-05-24 Thread Claudio Atzori (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Claudio Atzori resolved CASSANDRA-4272.
---

Resolution: Invalid

> Keyspace name case sensitivity
> --
>
> Key: CASSANDRA-4272
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4272
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.1.0
> Environment: cassandra-cli
>Reporter: Claudio Atzori
>
> I've been trying to define a keyspace from the cassandra-cli on a cluster of 
> 5 nodes (1.1.0), but I got a NotFoundException() after the "schemas agree 
> across the cluster" message, which I guess can be quite confusing, because 
> the keyspace wasn't created at all.
> [default@unknown] create keyspace 'MY_NEW_KEYSPACE' with placement_strategy = 
> 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = 
> {replication_factor:2} ;  
> ba6a9a70-e983-3b9c-bd8c-b0022865bb3e
> Waiting for schema agreement...
> ... schemas agree across the cluster
> NotFoundException() 
> BTW, I've been able to create the keyspace by lowercasing its name.





[jira] [Commented] (CASSANDRA-4272) Keyspace name case sensitivity

2012-05-24 Thread Claudio Atzori (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282491#comment-13282491
 ] 

Claudio Atzori commented on CASSANDRA-4272:
---

OK, never mind. I really needed to start over with an empty cluster. We've been
using those 5 nodes for development since Cassandra 0.8 and I guess they held
quite old test data, so a fresh start was indeed needed.









[jira] [Comment Edited] (CASSANDRA-3702) CQL count() needs paging support

2012-05-24 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281727#comment-13281727
 ] 

Sam Tunnicliffe edited comment on CASSANDRA-3702 at 5/24/12 12:59 PM:
--

I didn't think overriding the key bounds would cause a problem, as queries of
the form 
{{SELECT COUNT(*) FROM  WHERE key='foo'}} and {{SELECT COUNT(*) FROM  
WHERE key in ('foo', 'bar')}}
aren't treated as key range queries and so go down the branch that uses
getSlice(). However, I'd overlooked the token() function (and, I imagine, the
behaviour when using OrderPreservingPartitioner), where a key range can be
specified.
Am I right that the issue with not paging within internal rows is that 
fetching/materialising PAGE_COUNT_SIZE wide rows could still present a problem
by blowing up memory usage? If so, I could rework to add paging within the rows 
(and fix the token()/range queries), or given your comment re: CASSANDRA-2478,  
do you think it'd be better to just abandon this patch?
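
For context, the general idea of counting with paging can be sketched as follows. This is an illustration in Python under stated assumptions, not the attached patch: `fetch_page` is a hypothetical stand-in for a range query over sorted row keys, and the sketch pages across rows only (it does not page within a wide row, which is the memory concern raised above).

```python
from bisect import bisect_right

def fetch_page(sorted_keys, start_after, page_size):
    """Hypothetical range query: up to page_size keys strictly after start_after."""
    start = 0 if start_after is None else bisect_right(sorted_keys, start_after)
    return sorted_keys[start:start + page_size]

def paged_count(sorted_keys, page_size):
    """Count all keys by walking pages instead of issuing one capped query."""
    total = 0
    last_key = None
    while True:
        page = fetch_page(sorted_keys, last_key, page_size)
        total += len(page)
        if len(page) < page_size:
            # A short (or empty) page means the range is exhausted.
            return total
        last_key = page[-1]

keys = sorted(f"k{i:05d}" for i in range(25000))
total = paged_count(keys, 10000)
```

Each page only needs to hold `page_size` keys at a time, so the count is no longer bounded by a single query limit.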

  was (Author: beobal):
I didn't think overriding the key bounds would cause a problem, as queries of
the form 
{{SELECT COUNT(*) FROM  WHERE key='foo'}} and {{SELECT COUNT(*) FROM  
WHERE key in ('foo', 'bar')}}
aren't treated as key range queries and so go down the branch that uses
getSlice(). However, I'd overlooked the token() function (and, I imagine, the
behaviour when using OrderPreservingPartitioner), where a key range can be
specified.

Am I right that the issue with not paging within internal rows is that 
fetching/materialising PAGE_COUNT_SIZE wide rows could still present a problem
by blowing up memory usage? If so, I could rework to add paging within the rows 
(and fix the token()/range queries), or given your comment re: CASSANDRA-2478,  
do you think it'd be better to just abandon this patch and close the ticket as 
a "won't fix"?
  
> CQL count() needs paging support
> 
>
> Key: CASSANDRA-3702
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3702
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Nick Bailey
>Assignee: Sam Tunnicliffe
>  Labels: cql3, lhf
> Fix For: 1.1.2
>
> Attachments: v1-0001-CASSANDRA-3702-CQL-count-needs-paging-support.txt
>
>
> Doing
> {noformat}
> SELECT count(*) from ;
> {noformat}
> will max out at 10,000 because that is the default limit for CQL queries.
