[jira] [Updated] (CASSANDRA-4784) Create separate sstables for each token range handled by a node

2012-11-05 Thread Benjamin Coverston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Coverston updated CASSANDRA-4784:
--

Attachment: 4784.patch

> Create separate sstables for each token range handled by a node
> ---
>
> Key: CASSANDRA-4784
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4784
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: sankalp kohli
>Assignee: Benjamin Coverston
>Priority: Minor
>  Labels: perfomance
> Fix For: 1.3
>
> Attachments: 4784.patch
>
>
> Currently, each sstable holds data for all the ranges that a node is 
> handling. If we instead keep separate sstables for each range the node 
> handles, several improvements follow.
> Improvements:
> 1) Node rebuild will be very fast, as sstables can be copied directly to 
> the bootstrapping node. This minimizes application-level logic. We can 
> use native Linux methods to transfer sstables without using CPU and with 
> less pressure on the serving node. I think in theory it will be the 
> fastest way to transfer data. 
> 2) Backup need only transfer the sstables that belong to a node's primary 
> key range. 
> 3) An ETL process need only copy one replica of the data and will be much 
> faster. 
> Changes:
> We can split writes into multiple memtables, one for each range the node 
> is handling. Each sstable flushed from these can record which range of 
> data it holds.
> I think there will be no change for reads, as they work with interleaved 
> data anyway. But maybe we can improve there as well? 
> Complexities:
> The change does not look very complicated, though I am not taking into 
> account how it will work when the ranges assigned to nodes change. 
> Vnodes might make this work more complicated. We can also have a bit on 
> each sstable that says whether it holds primary data or not. 
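
A minimal sketch of the "one memtable per range" idea from the description above, assuming tokens are plain longs and each range is identified by its inclusive upper bound. The class RangeRouter and the memtable labels are hypothetical and are not taken from 4784.patch; the sketch only shows how a write could be routed to the bucket for its range.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: map a token to the range (and so the memtable) that owns it. */
final class RangeRouter {
    // Key: inclusive upper bound of a range; value: a label for that range's memtable.
    private final TreeMap<Long, String> rangesByUpperBound = new TreeMap<>();

    void addRange(long upperBoundInclusive, String memtableLabel) {
        rangesByUpperBound.put(upperBoundInclusive, memtableLabel);
    }

    /** Returns the memtable label for a token; tokens past the last bound wrap to the first range. */
    String memtableFor(long token) {
        if (rangesByUpperBound.isEmpty())
            throw new IllegalStateException("no ranges registered");
        Map.Entry<Long, String> owner = rangesByUpperBound.ceilingEntry(token);
        return (owner != null ? owner : rangesByUpperBound.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        RangeRouter router = new RangeRouter();
        router.addRange(100L, "memtable-primary");
        router.addRange(200L, "memtable-replica-1");
        System.out.println(router.memtableFor(42L));   // falls in the first range
        System.out.println(router.memtableFor(150L));  // falls in the second range
        System.out.println(router.memtableFor(250L));  // wraps around to the first range
    }
}
```

A sorted map keeps the lookup logarithmic in the number of ranges, which may matter once vnodes multiply the range count.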

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-4917) Optimize tombstone creation for ExpiringColumns

2012-11-05 Thread Christian Spriegel (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Spriegel updated CASSANDRA-4917:
--

Description: 
The goal of this ticket is to reduce the number of tombstones created from 
ExpiringColumns.

Currently tombstones always stay for a full gc_grace period, which is not 
necessary for ExpiringColumns. We only need to ensure that the ExpiringColumn 
and its tombstone together live as long as gc_grace. If the ExpiringColumn's 
TTL >= gc_grace, then we can create an already gcable tombstone and drop it 
instantly.

My initial proposal was to use the ExpiringColumn's creation timestamp as the 
deletion time for the tombstone, but Sylvain pointed out that we should not mix 
local and client timestamps. So I changed it to this:
{code}
public static Column create(ByteBuffer name, ByteBuffer value, long timestamp,
                            int timeToLive, int localExpirationTime, int expireBefore,
                            IColumnSerializer.Flag flag)
{
    if (localExpirationTime >= expireBefore || flag == IColumnSerializer.Flag.PRESERVE_SIZE)
        return new ExpiringColumn(name, value, timestamp, timeToLive, localExpirationTime);
    // the column is now expired, we can safely return a simple tombstone
    return new DeletedColumn(name, localExpirationTime-timeToLive, timestamp);
    // return new DeletedColumn(name, localExpirationTime, timestamp); // old code
}
{code}


This was discussed on the mailing list: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html

  was:

The goal of this ticket is to reduce the number of tombstones created from 
ExpiringColumns.

Currently tombstones always stay for a full gc_grace period, which is not 
necessary for ExpiringColumns. We only need to ensure that the ExpiringColumn 
and its tombstone together live as long as gc_grace. If the ExpiringColumn's 
TTL >= gc_grace, then we can create an already gcable tombstone and drop it 
instantly.

My initial proposal was to use the ExpiringColumn's creation timestamp as the 
deletion time for the tombstone, but Sylvain pointed out that we should not mix 
local and client timestamps. So I changed it to this:
{code}
public static Column create(ByteBuffer name, ByteBuffer value, long timestamp,
                            int timeToLive, int localExpirationTime, int expireBefore,
                            IColumnSerializer.Flag flag)
{
    if (localExpirationTime >= expireBefore || flag == IColumnSerializer.Flag.PRESERVE_SIZE)
        return new ExpiringColumn(name, value, timestamp, timeToLive, localExpirationTime);
    // the column is now expired, we can safely return a simple tombstone
    return new DeletedColumn(name, *localExpirationTime-timeToLive*, timestamp);
    // return new DeletedColumn(name, localExpirationTime, timestamp); // old code
}
{code}


This was discussed on the mailing list: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html


> Optimize tombstone creation for ExpiringColumns
> ---
>
> Key: CASSANDRA-4917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4917
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Christian Spriegel
>
> The goal of this ticket is to reduce the number of tombstones created from 
> ExpiringColumns.
> Currently tombstones always stay for a full gc_grace period, which is not 
> necessary for ExpiringColumns. We only need to ensure that the ExpiringColumn 
> and its tombstone together live as long as gc_grace. If the ExpiringColumn's 
> TTL >= gc_grace, then we can create an already gcable tombstone and drop it 
> instantly.
> My initial proposal was to use the ExpiringColumn's creation timestamp as the 
> deletion time for the tombstone, but Sylvain pointed out that we should not 
> mix local and client timestamps. So I changed it to this:
> {code}
> public static Column create(ByteBuffer name, ByteBuffer value, long timestamp,
>                             int timeToLive, int localExpirationTime, int expireBefore,
>                             IColumnSerializer.Flag flag)
> {
>     if (localExpirationTime >= expireBefore || flag == IColumnSerializer.Flag.PRESERVE_SIZE)
>         return new ExpiringColumn(name, value, timestamp, timeToLive, localExpirationTime);
>     // the column is now expired, we can safely return a simple tombstone
>     return new DeletedColumn(name, localExpirationTime-timeToLive, timestamp);
>     // return new DeletedColumn(name, localExpirationTime, timestamp); // old code
> }
> {code}
> This was discussed on the mailing list: 
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html


[jira] [Created] (CASSANDRA-4917) Optimize tombstone creation for ExpiringColumns

2012-11-05 Thread Christian Spriegel (JIRA)
Christian Spriegel created CASSANDRA-4917:
-

 Summary: Optimize tombstone creation for ExpiringColumns
 Key: CASSANDRA-4917
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4917
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Christian Spriegel



The goal of this ticket is to reduce the number of tombstones created from 
ExpiringColumns.

Currently tombstones always stay for a full gc_grace period, which is not 
necessary for ExpiringColumns. We only need to ensure that the ExpiringColumn 
and its tombstone together live as long as gc_grace. If the ExpiringColumn's 
TTL >= gc_grace, then we can create an already gcable tombstone and drop it 
instantly.

My initial proposal was to use the ExpiringColumn's creation timestamp as the 
deletion time for the tombstone, but Sylvain pointed out that we should not mix 
local and client timestamps. So I changed it to this:
{code}
public static Column create(ByteBuffer name, ByteBuffer value, long timestamp,
                            int timeToLive, int localExpirationTime, int expireBefore,
                            IColumnSerializer.Flag flag)
{
    if (localExpirationTime >= expireBefore || flag == IColumnSerializer.Flag.PRESERVE_SIZE)
        return new ExpiringColumn(name, value, timestamp, timeToLive, localExpirationTime);
    // the column is now expired, we can safely return a simple tombstone
    return new DeletedColumn(name, localExpirationTime-timeToLive, timestamp);
    // return new DeletedColumn(name, localExpirationTime, timestamp); // old code
}
{code}


This was discussed on the mailing list: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html
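
The timing argument above can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: all times are local seconds, and a tombstone is taken to be collectable once its local deletion time is older than now minus gc_grace. The class TombstoneTiming and its method names are hypothetical and are not Cassandra APIs.

```java
/** Hypothetical sketch of the timing argument; all times are in seconds. */
final class TombstoneTiming {
    /** Old behaviour: the tombstone's local deletion time is the expiration time. */
    static int oldDeletionTime(int localExpirationTime) {
        return localExpirationTime;
    }

    /** Proposed behaviour: back-date the tombstone by the TTL. */
    static int newDeletionTime(int localExpirationTime, int timeToLive) {
        return localExpirationTime - timeToLive;
    }

    /** Assumed purge rule: a tombstone is collectable once it is older than gc_grace. */
    static boolean isGcable(int localDeletionTime, int now, int gcGraceSeconds) {
        return localDeletionTime < now - gcGraceSeconds;
    }

    public static void main(String[] args) {
        int gcGrace = 864000;              // 10 days
        int ttl = 1000000;                 // longer than gc_grace
        int writeTime = 5000000;
        int expiration = writeTime + ttl;  // 6000000
        int now = expiration + 1;          // the column has just expired

        // Old code: the tombstone must wait another full gc_grace.
        System.out.println(isGcable(oldDeletionTime(expiration), now, gcGrace));      // false
        // New code: the tombstone is already old enough to drop.
        System.out.println(isGcable(newDeletionTime(expiration, ttl), now, gcGrace)); // true
    }
}
```

When the TTL is shorter than gc_grace, the back-dated tombstone still has to wait for the remainder, so column and tombstone together cover gc_grace.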



[jira] [Commented] (CASSANDRA-4906) Avoid flushing other columnfamilies on truncate

2012-11-05 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491040#comment-13491040
 ] 

Yuki Morishita commented on CASSANDRA-4906:
---

I'm seeing truncated data appear again after commit log replay.
It looks like we have to flush system.local after truncate; otherwise we 
cannot query 'truncated_at' properly before commit log replay applies the 
update to truncated_at.

> Avoid flushing other columnfamilies on truncate
> ---
>
> Key: CASSANDRA-4906
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4906
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Jonathan Ellis
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: 4906.txt
>
>
> Currently truncate flushes *all* columnfamilies so it can get rid of the 
> commitlog segments containing truncated data.  Otherwise, it could be 
> replayed on restart since the replay position is contained in the sstables 
> we're trying to delete.



[jira] [Commented] (CASSANDRA-4097) Classes in org.apache.cassandra.deps:avro:1.4.0-cassandra-1 clash with core Avro classes

2012-11-05 Thread Andrew Swan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491036#comment-13491036
 ] 

Andrew Swan commented on CASSANDRA-4097:


Hi Rahul,

For unrelated reasons we stopped using Avro, so I don't run into this problem 
any more.

However it seems like it would be easy to fix - just move the conflicting 
classes from one package to another, and rebuild any dependent projects (e.g. 
Cassandra itself). I guess there aren't enough people using both Avro and 
Cassandra in the same JVM for this to be worth solving.

> Classes in org.apache.cassandra.deps:avro:1.4.0-cassandra-1 clash with core 
> Avro classes
> 
>
> Key: CASSANDRA-4097
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4097
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Andrew Swan
>Priority: Minor
>
> Cassandra has this dependency:
> {code:title=build.xml}...
>  version="1.4.0-cassandra-1">
> ...{code}
> Unfortunately this JAR file contains classes in the {{org.apache.avro}} 
> package that are incompatible with classes of the same fully-qualified name 
> in the current release of Avro. For example, the inner class 
> {{org.apache.avro.Schema$Parser}} found in Avro 1.6.1 is missing from the 
> Cassandra version of that class. This makes it impossible to have both 
> Cassandra and the latest Avro version on the classpath (my use case is an 
> application that embeds Cassandra but also uses Avro 1.6.1 for unrelated 
> serialization purposes). A simple and risk-free solution would be to change 
> the package declaration of Cassandra's Avro classes from {{org.apache.avro}} 
> to (say) {{org.apache.cassandra.avro}}, assuming that the above dependency is 
> only used by Cassandra and no other projects (which seems a reasonable 
> assumption given its name).
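
One way to see which variant of a clashing class wins on a given classpath is to probe for a member that only one variant has. The sketch below checks for the inner class named in the description; the helper class AvroClashCheck is hypothetical, and the result depends entirely on which jars are present at run time.

```java
/** Hypothetical sketch: probe the classpath for a class without initializing it. */
final class AvroClashCheck {
    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, AvroClashCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // With only the Cassandra-bundled Avro on the classpath, the description
        // above suggests this inner class would be missing.
        System.out.println("Schema$Parser present: "
                + isPresent("org.apache.avro.Schema$Parser"));
    }
}
```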



[jira] [Commented] (CASSANDRA-4813) Problem using BulkOutputFormat while streaming several SSTables simultaneously from a given node.

2012-11-05 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491032#comment-13491032
 ] 

Michael Kjellman commented on CASSANDRA-4813:
-

Exception in thread "Streaming to /10.25.9.5:1" java.lang.RuntimeException: java.net.SocketException: Already bound
    at com.google.common.base.Throwables.propagate(Throwables.java:156)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.SocketException: Already bound
    at sun.nio.ch.Net.translateToSocketException(Net.java:109)
    at sun.nio.ch.Net.translateException(Net.java:141)
    at sun.nio.ch.Net.translateException(Net.java:147)
    at sun.nio.ch.SocketAdaptor.bind(SocketAdaptor.java:147)
    at org.apache.cassandra.net.OutboundTcpConnectionPool.newSocket(OutboundTcpConnectionPool.java:128)
    at org.apache.cassandra.streaming.FileStreamTask.connectAttempt(FileStreamTask.java:236)
    at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:88)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    ... 3 more
Caused by: java.nio.channels.AlreadyBoundException
    at sun.nio.ch.SocketChannelImpl.bind(SocketChannelImpl.java:556)
    at sun.nio.ch.SocketAdaptor.bind(SocketAdaptor.java:145)
    ... 7 more

Is it intended that we fail that reducer? This looks like just a more elegant 
collision, no? I'm seeing the same failure on every node.
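
The innermost cause in the trace, java.nio.channels.AlreadyBoundException, is what SocketChannel.bind raises when the channel already has a local address. The sketch below reproduces only that JDK behaviour in isolation; it is not the Cassandra streaming code, and the class DoubleBindDemo is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.AlreadyBoundException;
import java.nio.channels.SocketChannel;

/** Hypothetical sketch: binding one channel twice is rejected by the JDK. */
final class DoubleBindDemo {
    /** Binds the same channel twice; returns true if the second bind is rejected. */
    static boolean secondBindRejected() {
        try (SocketChannel channel = SocketChannel.open()) {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            channel.bind(new InetSocketAddress(loopback, 0)); // port 0: any free port
            try {
                channel.bind(new InetSocketAddress(loopback, 0));
                return false;
            } catch (AlreadyBoundException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second bind rejected: " + secondBindRejected());
    }
}
```

Going through channel.socket().bind(...) instead, as the trace does via SocketAdaptor, surfaces the same condition as a SocketException with the message "Already bound".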



> Problem using BulkOutputFormat while streaming several SSTables 
> simultaneously from a given node.
> -
>
> Key: CASSANDRA-4813
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4813
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: I am using SLES 10 SP3, Java 6, 4 Cassandra + Hadoop 
> nodes, 3 Hadoop only nodes (datanodes/tasktrackers), 1 namenode/jobtracker. 
> The machines used are Six-Core AMD Opteron(tm) Processor 8431, 24 cores and 
> 33 GB of RAM. I get the issue on both cassandra 1.1.3, 1.1.5 and I am using 
> Hadoop 0.20.2.
>Reporter: Ralph Romanos
>Assignee: Yuki Morishita
>Priority: Minor
>  Labels: Bulkoutputformat, Hadoop, SSTables
> Fix For: 1.2.0
>
> Attachments: 4813.txt
>
>
> The issue occurs when streaming several SSTables simultaneously from the 
> same node to a Cassandra cluster using SSTableloader. It seems to me that 
> Cassandra cannot handle receiving SSTables simultaneously from the same 
> node. However, when it receives SSTables simultaneously from two different 
> nodes, everything works fine. As a consequence, when using BulkOutputFormat 
> to generate SSTables and stream them to a Cassandra cluster, I cannot use 
> more than one reducer per node; otherwise I get a java.io.EOFException in 
> the tasktracker's logs and a java.io.IOException: Broken pipe in the 
> Cassandra logs.



[jira] [Comment Edited] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Henrik Ring (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491002#comment-13491002
 ] 

Henrik Ring edited comment on CASSANDRA-4909 at 11/5/12 10:51 PM:
--

Now I got it running and ran the test - it passed:

{code:title=CASSANDRA_4909.java|borderStyle=solid}
package org.apache.cassandra;

import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.junit.AfterClass;
import org.junit.BeforeClass;

import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.sql.Connection;
import java.util.List;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;

public class CASSANDRA_4909 {
public static final Charset charset = Charset.forName("UTF-8");
public static final CharsetEncoder encoder = charset.newEncoder();
public static final CharsetDecoder decoder = charset.newDecoder();

private static final String KEYSPACE_NAME = "test_cassandra_4909_ks";
private static final String TABLE_NAME = "test_cassandra_4909_tlb1";
private static Connection _con = null;
private static Cassandra.Client _client = null;


@BeforeClass
public static void setUp() throws Exception {
TTransport tr = new TFramedTransport(new TSocket("localhost", 9160));
TProtocol proto = new TBinaryProtocol(tr);
_client = new Cassandra.Client(proto);
tr.open();
_client.set_cql_version("3.0.0");
_client.set_keyspace("system");

// Create Keyspace
executeCQL_legacy("CREATE KEYSPACE " + KEYSPACE_NAME + " WITH strategy_class = SimpleStrategy AND strategy_options:replication_factor = 1");

// Switch default KS
_client.set_keyspace(KEYSPACE_NAME);
}

@AfterClass
public static void tearDown() throws Exception {
if (_client != null) {
executeCQL3("DROP KEYSPACE " + KEYSPACE_NAME);
}
}

@org.junit.Test
public void test1() throws Exception {
assertNotNull("Expected a connection to be defined?", _client);

executeCQL3("CREATE TABLE " + TABLE_NAME + "\n" +
"(A1 set<text>,\n" +
"A2 set<text>,\n" +
"B1 text PRIMARY KEY,\n" +
"B2 text);");

// Should fail: InvalidRequestException("Indexes on collections are no yet supported")
try {
executeCQL3("CREATE INDEX " + TABLE_NAME + "_INX1 ON " + TABLE_NAME + " (A1)");
} catch (InvalidRequestException e) {
assertEquals("Unexpected message in exception",
"Indexes on collections are no yet supported", e.getWhy());
}

// Should succeed
executeCQL3("CREATE INDEX " + TABLE_NAME + "_INX2 ON " + TABLE_NAME + " (B2)");

executeCQL3("INSERT INTO " + TABLE_NAME + " (A1, A2, B1, B2)\n" +
"VALUES({'A1-ROW1-ELM-1', 'A1-ROW1-ELM2'}, {'A2-ROW1-ELM-1', 'A2-ROW1-ELM2'}, 'B1-ROW1', 'B2-ROW1' );\n");

executeCQL3("INSERT INTO " + TABLE_NAME + " (A1, A2, B1, B2)\n" +
"VALUES({'A1-ROW2-ELM-1', 'A1-ROW2-ELM2'}, {'A2-ROW2-ELM-1', 'A2-ROW2-ELM2'}, 'B1-ROW2', 'B2-ROW2' );\n");

// The select would fail with
// InvalidRequestException(why:No indexed columns present in by-columns clause with Equal operator)
// if the index is not in effect.
CqlResult r = executeCQL3("SELECT B1 FROM " + TABLE_NAME + " WHERE B2 = 'B2-ROW2';");
List<CqlRow> rows = r.getRows();
CqlRow row = rows.get(0);
Column c_b1 = row.getColumns().get(0); // B1
String B1_value = bb2str(c_b1.bufferForValue());
assertEquals("Expected other value for B1", "B1-ROW2", B1_value);
int dummy = 0;
}

public static String bb2str(ByteBuffer buffer) {
String data = "";
try {
int old_position = buffer.position();
data = decoder.decode(buffer).toString();
buffer.position(old_position);
} catch (Exception e) {
e.printStackTrace();
return "";
}
return data;
}

private static ByteBuffer str2bb(String str) {
try {
return ByteBuffer.wrap(str.getBytes("UTF-8"));
} catch (UnsupportedEncodingException e) {
throw new RuntimeException("UTF-8 is unavailable?", e);
}
}

private static CqlResult executeCQL_legacy(String cql) throws Exception {
return _client.execute_cql_query(str2bb(cql), Compression.NONE);
}

private static CqlResult executeCQL3(String cql) throws Exception {
return _client.execute_cql3_qu

[jira] [Comment Edited] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Henrik Ring (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491002#comment-13491002
 ] 

Henrik Ring edited comment on CASSANDRA-4909 at 11/5/12 10:49 PM:
--

Now I got it running and ran the test - it passed:

{code:title=CASSANDRA_4909.java|borderStyle=solid}
package org.apache.cassandra;

import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.junit.AfterClass;
import org.junit.BeforeClass;

import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.sql.Connection;
import java.util.List;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;

public class CASSANDRA_4909 {
public static final Charset charset = Charset.forName("UTF-8");
public static final CharsetEncoder encoder = charset.newEncoder();
public static final CharsetDecoder decoder = charset.newDecoder();

private static final String KEYSPACE_NAME = "test_cassandra_4909_ks";
private static final String TABLE_NAME = "test_cassandra_4909_tlb1";
private static Connection _con = null;
private static Cassandra.Client _client = null;


@BeforeClass
public static void setUp() throws Exception {
TTransport tr = new TFramedTransport(new TSocket("localhost", 9160));
TProtocol proto = new TBinaryProtocol(tr);
_client = new Cassandra.Client(proto);
tr.open();
_client.set_cql_version("3.0.0");
_client.set_keyspace("system");

// Create Keyspace
executeCQL_legacy("CREATE KEYSPACE " + KEYSPACE_NAME + " WITH strategy_class = SimpleStrategy AND strategy_options:replication_factor = 1");

// Switch default KS
_client.set_keyspace(KEYSPACE_NAME);
}

@AfterClass
public static void tearDown() throws Exception {
if (_client != null) {
executeCQL3("DROP KEYSPACE " + KEYSPACE_NAME);
}
}

@org.junit.Test
public void test1() throws Exception {
assertNotNull("Expected a connection to be defined?", _client);

executeCQL3("CREATE TABLE " + TABLE_NAME + "\n" +
"(A1 set<text>,\n" +
"A2 set<text>,\n" +
"B1 text PRIMARY KEY,\n" +
"B2 text);");

// Should fail: InvalidRequestException("Indexes on collections are no yet supported")
try {
executeCQL3("CREATE INDEX " + TABLE_NAME + "_INX1 ON " + TABLE_NAME + " (A1)");
} catch (InvalidRequestException e) {
assertEquals("Unexpected message in exception",
"Indexes on collections are no yet supported", e.getWhy());
}

// Should fail: InvalidRequestException("Indexes on collections are no yet supported")
executeCQL3("CREATE INDEX " + TABLE_NAME + "_INX2 ON " + TABLE_NAME + " (B2)");

executeCQL3("INSERT INTO " + TABLE_NAME + " (A1, A2, B1, B2)\n" +
"VALUES({'A1-ROW1-ELM-1', 'A1-ROW1-ELM2'}, {'A2-ROW1-ELM-1', 'A2-ROW1-ELM2'}, 'B1-ROW1', 'B2-ROW1' );\n");

executeCQL3("INSERT INTO " + TABLE_NAME + " (A1, A2, B1, B2)\n" +
"VALUES({'A1-ROW2-ELM-1', 'A1-ROW2-ELM2'}, {'A2-ROW2-ELM-1', 'A2-ROW2-ELM2'}, 'B1-ROW2', 'B2-ROW2' );\n");

// The select would fail with
// InvalidRequestException(why:No indexed columns present in by-columns clause with Equal operator)
// if the index is not in effect.
CqlResult r = executeCQL3("SELECT B1 FROM " + TABLE_NAME + " WHERE B2 = 'B2-ROW2';");
List<CqlRow> rows = r.getRows();
CqlRow row = rows.get(0);
Column c_b1 = row.getColumns().get(0); // B1
String B1_value = bb2str(c_b1.bufferForValue());
assertEquals("Expected other value for B1", "B1-ROW2", B1_value);
int dummy = 0;
}

public static String bb2str(ByteBuffer buffer) {
String data = "";
try {
int old_position = buffer.position();
data = decoder.decode(buffer).toString();
buffer.position(old_position);
} catch (Exception e) {
e.printStackTrace();
return "";
}
return data;
}

private static ByteBuffer str2bb(String str) {
try {
return ByteBuffer.wrap(str.getBytes("UTF-8"));
} catch (UnsupportedEncodingException e) {
throw new RuntimeException("UTF-8 is unavailable?", e);
}
}

private static CqlResult executeCQL_legacy(String cql) throws Exception {
return _client.execute_cql_query(str2bb(cql), Compression.NONE);
}

private static CqlResult executeCQL3

[jira] [Commented] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Henrik Ring (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491002#comment-13491002
 ] 

Henrik Ring commented on CASSANDRA-4909:


Now I got it running and ran the test:

{code:title=CASSANDRA_4909.java|borderStyle=solid}
package org.apache.cassandra;

import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.junit.AfterClass;
import org.junit.BeforeClass;

import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.sql.Connection;
import java.util.List;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;

public class CASSANDRA_4909 {
public static final Charset charset = Charset.forName("UTF-8");
public static final CharsetEncoder encoder = charset.newEncoder();
public static final CharsetDecoder decoder = charset.newDecoder();

private static final String KEYSPACE_NAME = "test_cassandra_4909_ks";
private static final String TABLE_NAME = "test_cassandra_4909_tlb1";
private static Connection _con = null;
private static Cassandra.Client _client = null;


@BeforeClass
public static void setUp() throws Exception {
TTransport tr = new TFramedTransport(new TSocket("localhost", 9160));
TProtocol proto = new TBinaryProtocol(tr);
_client = new Cassandra.Client(proto);
tr.open();
_client.set_cql_version("3.0.0");
_client.set_keyspace("system");

// Create Keyspace
executeCQL_legacy("CREATE KEYSPACE " + KEYSPACE_NAME + " WITH strategy_class = SimpleStrategy AND strategy_options:replication_factor = 1");

// Switch default KS
_client.set_keyspace(KEYSPACE_NAME);
}

@AfterClass
public static void tearDown() throws Exception {
if (_client != null) {
executeCQL3("DROP KEYSPACE " + KEYSPACE_NAME);
}
}

@org.junit.Test
public void test1() throws Exception {
assertNotNull("Expected a connection to be defined?", _client);

executeCQL3("CREATE TABLE " + TABLE_NAME + "\n" +
"(A1 set<text>,\n" +
"A2 set<text>,\n" +
"B1 text PRIMARY KEY,\n" +
"B2 text);");

// Should fail: InvalidRequestException("Indexes on collections are no yet supported")
try {
executeCQL3("CREATE INDEX " + TABLE_NAME + "_INX1 ON " + TABLE_NAME + " (A1)");
} catch (InvalidRequestException e) {
assertEquals("Unexpected message in exception",
"Indexes on collections are no yet supported", e.getWhy());
}

// Should fail: InvalidRequestException("Indexes on collections are no yet supported")
executeCQL3("CREATE INDEX " + TABLE_NAME + "_INX2 ON " + TABLE_NAME + " (B2)");

executeCQL3("INSERT INTO " + TABLE_NAME + " (A1, A2, B1, B2)\n" +
"VALUES({'A1-ROW1-ELM-1', 'A1-ROW1-ELM2'}, {'A2-ROW1-ELM-1', 'A2-ROW1-ELM2'}, 'B1-ROW1', 'B2-ROW1' );\n");

executeCQL3("INSERT INTO " + TABLE_NAME + " (A1, A2, B1, B2)\n" +
"VALUES({'A1-ROW2-ELM-1', 'A1-ROW2-ELM2'}, {'A2-ROW2-ELM-1', 'A2-ROW2-ELM2'}, 'B1-ROW2', 'B2-ROW2' );\n");

// The select would fail with
// InvalidRequestException(why:No indexed columns present in by-columns clause with Equal operator)
// if the index is not in effect.
CqlResult r = executeCQL3("SELECT B1 FROM " + TABLE_NAME + " WHERE B2 = 'B2-ROW2';");
List<CqlRow> rows = r.getRows();
CqlRow row = rows.get(0);
Column c_b1 = row.getColumns().get(0); // B1
String B1_value = bb2str(c_b1.bufferForValue());
assertEquals("Expected other value for B1", "B1-ROW2", B1_value);
int dummy = 0;
}

public static String bb2str(ByteBuffer buffer) {
String data = "";
try {
int old_position = buffer.position();
data = decoder.decode(buffer).toString();
buffer.position(old_position);
} catch (Exception e) {
e.printStackTrace();
return "";
}
return data;
}

private static ByteBuffer str2bb(String str) {
try {
return ByteBuffer.wrap(str.getBytes("UTF-8"));
} catch (UnsupportedEncodingException e) {
throw new RuntimeException("UTF-8 is unavailable?", e);
}
}

private static CqlResult executeCQL_legacy(String cql) throws Exception {
return _client.execute_cql_query(str2bb(cql), Compression.NONE);
}

private static CqlResult executeCQL3(String cql) throws Exception {
return _client.execute_

[jira] [Commented] (CASSANDRA-4915) CQL should force limit when query samples data.

2012-11-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490994#comment-13490994
 ] 

Jonathan Ellis commented on CASSANDRA-4915:
---

I think you may be seeing CASSANDRA-4858 -- we don't do pre-query sampling.

> CQL should force limit when query samples data.
> ---
>
> Key: CASSANDRA-4915
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4915
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 1.2.0 beta 1
>Reporter: Edward Capriolo
>Priority: Minor
>
> When issuing a query like:
> {noformat}
> CREATE TABLE videos (
>   videoid uuid,
>   videoname varchar,
>   username varchar,
>   description varchar,
>   tags varchar,
>   upload_date timestamp,
>   PRIMARY KEY (videoid,videoname)
> );
> SELECT * FROM videos WHERE videoname = 'My funny cat';
> {noformat}
> Cassandra samples some data using get_range_slice and then applies the query.
> This is very confusing to me, because as an end user I am not sure whether 
> the query is fast because Cassandra is performing an optimized query (over 
> an index, or using a slicePredicate) or because Cassandra is simply sampling 
> some random rows and returning me some results. 
> My suggestions:
> 1) force people to supply a LIMIT clause on any query that is going to
> page over get_range_slice
> 2) having some type of explain support so I can establish if this
> query will work in the
> I will champion suggestion 1) because CQL has put itself in a rather unique, 
> un-SQL-like position by applying an automatic limit clause without the user 
> asking for it. I also do not believe the CQL language should let the user 
> issue queries that will not work as intended with "larger-than-auto-limit" 
> size data sets.



[jira] [Created] (CASSANDRA-4916) Starting Cassandra throws EOF while reading saved cache

2012-11-05 Thread Michael Kjellman (JIRA)
Michael Kjellman created CASSANDRA-4916:
---

 Summary: Starting Cassandra throws EOF while reading saved cache
 Key: CASSANDRA-4916
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4916
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0 beta 1
Reporter: Michael Kjellman


Nodes currently throw an EOFException while reading a saved cache for the 
system schema when starting Cassandra:

 WARN 14:25:54,896 error reading saved cache /ssd/saved_caches/system-schema_columns-KeyCache-b.db
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:349)
    at org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:378)
    at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:144)
    at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:278)
    at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:393)
    at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:365)
    at org.apache.cassandra.db.Table.initCf(Table.java:334)
    at org.apache.cassandra.db.Table.<init>(Table.java:272)
    at org.apache.cassandra.db.Table.open(Table.java:102)
    at org.apache.cassandra.db.Table.open(Table.java:80)
    at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:320)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:203)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:395)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:438)


To reproduce: delete all data files, start a cluster, and leave the cluster up 
long enough to build a cache. Run nodetool drain, then kill the Cassandra 
process. Start the Cassandra process in the foreground and note the EOF thrown 
(see the stack trace above).
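
For illustration only, here is a minimal Java sketch of how a loader could tolerate a truncated cache file instead of surfacing the EOFException. It assumes a simple layout of a 4-byte length prefix followed by that many bytes per entry, which is not necessarily the real saved-cache format; the class and method names (SavedCacheReadSketch, readEntries) are hypothetical and do not exist in Cassandra.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Cassandra code: reads [int length][bytes] entries
// and keeps every complete entry seen before the stream ends.
final class SavedCacheReadSketch {
    static List<byte[]> readEntries(InputStream raw) {
        List<byte[]> entries = new ArrayList<>();
        DataInputStream in = new DataInputStream(raw);
        try {
            while (true) {
                // readInt throws EOFException at a clean end of stream
                // and also when fewer than four bytes remain.
                int length = in.readInt();
                if (length < 0)
                    throw new IOException("negative entry length: " + length);
                byte[] value = new byte[length];
                // readFully throws EOFException if the entry body is cut short.
                in.readFully(value);
                entries.add(value);
            }
        } catch (EOFException e) {
            // End of file or truncated final entry: return what was read so far.
            return entries;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real loader would probably also want to bound the length value before allocating, since a corrupted prefix could request a very large array.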



[jira] [Created] (CASSANDRA-4915) CQL should force limit when query samples data.

2012-11-05 Thread Edward Capriolo (JIRA)
Edward Capriolo created CASSANDRA-4915:
--

 Summary: CQL should force limit when query samples data.
 Key: CASSANDRA-4915
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4915
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0 beta 1
Reporter: Edward Capriolo
Priority: Minor


When issuing a query like:
{noformat}
CREATE TABLE videos (
  videoid uuid,
  videoname varchar,
  username varchar,
  description varchar,
  tags varchar,
  upload_date timestamp,
  PRIMARY KEY (videoid,videoname)
);
SELECT * FROM videos WHERE videoname = 'My funny cat';
{noformat}

Cassandra samples some data using get_range_slice and then applies the query.

This is very confusing to me, because as an end user I am not sure if the query 
is fast because Cassandra is performing an optimized query (over an index, or 
using a slicePredicate) or if Cassandra is simply sampling some random rows and 
returning me some results. 

My suggestions:
1) force people to supply a LIMIT clause on any query that is going to
page over get_range_slice
2) having some type of explain support so I can establish if this
query will work in the

I will champion suggestion 1) because CQL has put itself in a rather unique, 
un-SQL-like position by applying an automatic LIMIT clause without the user 
asking for it. I also do not believe the CQL language should let the user 
issue queries that will not work as intended with "larger-than-auto-limit" size 
data sets.
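
As a rough illustration of suggestion 1), a validation step along the following lines could reject range-scanning queries that omit LIMIT. This is only a sketch under stated assumptions: QueryShape and its fields are hypothetical stand-ins for whatever the CQL parser actually produces, and none of these names exist in Cassandra.

```java
// Hypothetical sketch: none of these types or fields exist in Cassandra.
final class LimitCheckSketch {
    /** Minimal description of a parsed SELECT; every field is an assumption. */
    static final class QueryShape {
        final boolean restrictsPartitionKey; // WHERE clause pins the partition key
        final boolean usesSecondaryIndex;    // an index can serve the WHERE clause
        final boolean hasExplicitLimit;      // the user wrote LIMIT n

        QueryShape(boolean restrictsPartitionKey, boolean usesSecondaryIndex, boolean hasExplicitLimit) {
            this.restrictsPartitionKey = restrictsPartitionKey;
            this.usesSecondaryIndex = usesSecondaryIndex;
            this.hasExplicitLimit = hasExplicitLimit;
        }
    }

    /** A query that neither pins the key nor uses an index has to page over ranges. */
    static boolean isRangeScan(QueryShape q) {
        return !q.restrictsPartitionKey && !q.usesSecondaryIndex;
    }

    /** Throws if a range-scanning query was issued without an explicit LIMIT. */
    static void validate(QueryShape q) {
        if (isRangeScan(q) && !q.hasExplicitLimit)
            throw new IllegalArgumentException("range scan requires an explicit LIMIT clause");
    }
}
```

Under this sketch, the SELECT on videoname above would be rejected unless the user added a LIMIT, because videoname is neither the partition key nor indexed.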



[jira] [Updated] (CASSANDRA-4914) Aggregate functions in CQL

2012-11-05 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-4914:
-

Description: 
The requirement is to do aggregation of data in Cassandra (Wide row of column 
values of int, double, float etc).

With some basic aggregate functions like AVG, SUM, Mean, Min, Max, etc. (for 
the columns within a row).

Example:

SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

 empid | deptid | first_name | last_name | salary
-------+--------+------------+-----------+--------
   130 |      3 | joe        | doe       |   10.1
   130 |      2 | joe        | doe       |    100
   130 |      1 | joe        | doe       |  1e+03

SELECT sum(salary), empid FROM emp WHERE empID IN (130);

 sum(salary) | empid
-------------+-------
      1110.1 |   130
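
To make the intended semantics concrete, the sketch below computes the same aggregates client-side over the salary values of one row. It is only an illustration of what the proposed functions would return, not a proposal for the server-side implementation; RowAggregateSketch and summarize are hypothetical names.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

// Illustrative only: a client-side stand-in for the proposed CQL aggregates.
final class RowAggregateSketch {
    /** Summarises the numeric column values of a single wide row. */
    static DoubleSummaryStatistics summarize(List<Double> columnValues) {
        return columnValues.stream()
                           .mapToDouble(Double::doubleValue)
                           .summaryStatistics();
    }

    public static void main(String[] args) {
        // Salary values taken from the example rows for empid 130.
        List<Double> salaries = Arrays.asList(10.1, 100.0, 1e3);
        DoubleSummaryStatistics stats = summarize(salaries);
        System.out.printf("sum=%.1f avg=%.4f min=%.1f max=%.1f%n",
                          stats.getSum(), stats.getAverage(), stats.getMin(), stats.getMax());
    }
}
```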




  was:
The requirement is to do aggregation of data in Cassandra (Wide row of column 
values of int, double, float etc).

With some basic aggregate functions like AVG, SUM, Mean, Min, Max, etc. (for 
the columns within a row).

Example:

SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

 empid | deptid | first_name | last_name | salary
-------+--------+------------+-----------+--------
   130 |      3 | sughit     | singh     |   10.1
   130 |      2 | sughit     | singh     |    100
   130 |      1 | sughit     | singh     |  1e+03

SELECT sum(salary), empid FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

 sum(salary) | empid
-------------+-------
      1110.1 |   130





> Aggregate functions in CQL
> --
>
> Key: CASSANDRA-4914
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4914
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Vijay
>Assignee: Vijay
> Fix For: 1.2.0 beta 2
>
>
> The requirement is to do aggregation of data in Cassandra (Wide row of column 
> values of int, double, float etc).
> With some basic aggregate functions like AVG, SUM, Mean, Min, Max, etc. (for 
> the columns within a row).
> Example:
> SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;
>
>  empid | deptid | first_name | last_name | salary
> -------+--------+------------+-----------+--------
>    130 |      3 | joe        | doe       |   10.1
>    130 |      2 | joe        | doe       |    100
>    130 |      1 | joe        | doe       |  1e+03
>
> SELECT sum(salary), empid FROM emp WHERE empID IN (130);
>
>  sum(salary) | empid
> -------------+-------
>       1110.1 |   130



[jira] [Created] (CASSANDRA-4914) Aggregate functions in CQL

2012-11-05 Thread Vijay (JIRA)
Vijay created CASSANDRA-4914:


 Summary: Aggregate functions in CQL
 Key: CASSANDRA-4914
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4914
 Project: Cassandra
  Issue Type: Bug
Reporter: Vijay
Assignee: Vijay
 Fix For: 1.2.0 beta 2


The requirement is to do aggregation of data in Cassandra (Wide row of column 
values of int, double, float etc).

With some basic aggregate functions like AVG, SUM, Mean, Min, Max, etc. (for 
the columns within a row).

Example:

SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

 empid | deptid | first_name | last_name | salary
-------+--------+------------+-----------+--------
   130 |      3 | sughit     | singh     |   10.1
   130 |      2 | sughit     | singh     |    100
   130 |      1 | sughit     | singh     |  1e+03

SELECT sum(salary), empid FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

 sum(salary) | empid
-------------+-------
      1110.1 |   130






git commit: fix hinted handoff threadpool's JMX path

2012-11-05 Thread yukim
Updated Branches:
  refs/heads/trunk 8782c366a -> 5467fb52f


fix hinted handoff threadpool's JMX path


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5467fb52
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5467fb52
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5467fb52

Branch: refs/heads/trunk
Commit: 5467fb52f85b40346e789b9a31cafd254d75e902
Parents: 8782c36
Author: Yuki Morishita 
Authored: Mon Nov 5 13:40:04 2012 -0600
Committer: Yuki Morishita 
Committed: Mon Nov 5 13:40:04 2012 -0600

--
 .../apache/cassandra/db/HintedHandOffManager.java  |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5467fb52/src/java/org/apache/cassandra/db/HintedHandOffManager.java
--
diff --git a/src/java/org/apache/cassandra/db/HintedHandOffManager.java b/src/java/org/apache/cassandra/db/HintedHandOffManager.java
index 2b8b59a..a8b14b2 100644
--- a/src/java/org/apache/cassandra/db/HintedHandOffManager.java
+++ b/src/java/org/apache/cassandra/db/HintedHandOffManager.java
@@ -101,7 +101,7 @@ public class HintedHandOffManager implements HintedHandOffManagerMBean
                   Integer.MAX_VALUE,
                   TimeUnit.SECONDS,
                   new LinkedBlockingQueue(),
-                  new NamedThreadFactory("HintedHandoff", Thread.MIN_PRIORITY), "HintedHandoff");
+                  new NamedThreadFactory("HintedHandoff", Thread.MIN_PRIORITY), "internal");
 
     public void start()
     {
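
To show why the last constructor argument matters, here is a small Java sketch of how a JMX path and a pool name might combine into an MBean name. The naming scheme "org.apache.cassandra.<path>:type=<pool>" is an assumption made for this illustration and is not taken from the diff above; JmxPathSketch and mbeanName are hypothetical names.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical sketch: the naming scheme below is an assumption, not quoted from Cassandra.
final class JmxPathSketch {
    /** Builds an MBean name whose domain ends in the given JMX path. */
    static ObjectName mbeanName(String jmxPath, String poolName) {
        try {
            return new ObjectName("org.apache.cassandra." + jmxPath + ":type=" + poolName);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Before the fix the path argument was "HintedHandoff"; after it, "internal".
        System.out.println(mbeanName("HintedHandoff", "HintedHandoff"));
        System.out.println(mbeanName("internal", "HintedHandoff"));
    }
}
```

Under that assumption, the change would move the pool from its own domain into the same domain as the other internal thread pools.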



[jira] [Updated] (CASSANDRA-4237) Add back 0.8-style memtable_lifetime feature

2012-11-05 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-4237:
--

Attachment: 4237.txt

Attaching rebased and updated version.

bq. is there a race here where if user manually calls force flush while it's 
expired, we get two scheduled tasks added?

You are right, so I moved the rescheduling code from forceFlush to 
scheduleFlush. Periodic flushes are now scheduled without canceling 
already scheduled ones, and each task checks expiration before running the flush.
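
The idea of letting a stale task check expiration before flushing can be sketched as follows. This is a simplified model rather than the attached patch: ExpiringFlushSketch, runScheduledFlush, and the explicit "now" parameters are hypothetical and exist only to make the behavior easy to follow.

```java
// Hypothetical model of the approach; not the code in the attached patch.
final class ExpiringFlushSketch {
    private final long lifetimeMillis;
    private long memtableCreatedAt; // creation time of the current memtable
    private int flushes;

    ExpiringFlushSketch(long lifetimeMillis, long now) {
        this.lifetimeMillis = lifetimeMillis;
        this.memtableCreatedAt = now;
    }

    /** Manual flush: swaps in a fresh memtable; pending periodic tasks are not cancelled. */
    synchronized void forceFlush(long now) {
        memtableCreatedAt = now;
        flushes++;
    }

    /**
     * Body of a periodic task. It flushes only if the current memtable has
     * really expired, so a task left over from an already flushed memtable
     * does nothing.
     */
    synchronized boolean runScheduledFlush(long now) {
        if (now - memtableCreatedAt < lifetimeMillis)
            return false;
        forceFlush(now);
        return true;
    }

    synchronized int flushCount() {
        return flushes;
    }
}
```

With a 1000 ms lifetime, a manual flush at t=1500 followed by a leftover task firing at t=2000 results in no second flush, because the new memtable is only 500 ms old.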

> Add back 0.8-style memtable_lifetime feature
> 
>
> Key: CASSANDRA-4237
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4237
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Jonathan Ellis
>Assignee: Yuki Morishita
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: 4237.txt
>
>
> Back in 0.8 we had a memtable_lifetime_in_minutes setting.  We got rid of it 
> in 1.0 when we added CASSANDRA-2427, which is a better way to  ensure 
> flushing on low-activity memtables.
> However, at the same time we also added the ability to disable durable 
> writes.  So it's entirely possible to configure a low-activity memtable, that 
> isn't part of the commitlog.  So, we should add back a memtable lifetime 
> setting.
> An additional motive is pointed out by 
> http://www.fsl.cs.sunysb.edu/~pshetty/socc11-gtssl.pdf: if you have a *high* 
> activity columnfamily, and don't require absolute durability, the commitlog 
> is redundant if you are flushing faster than the commitlog sync period.  So, 
> disabling durable writes but setting memtable lifetime to the same as the 
> commitlog sync would be a reasonable optimization.
> Thus, when we add back memtable lifetime, I think we should measure it in 
> seconds or possibly even milliseconds (to match commitlog_sync_period) rather 
> than minutes.



[jira] [Updated] (CASSANDRA-4237) Add back 0.8-style memtable_lifetime feature

2012-11-05 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-4237:
--

Attachment: (was: 4237.txt)

> Add back 0.8-style memtable_lifetime feature
> 
>
> Key: CASSANDRA-4237
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4237
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Jonathan Ellis
>Assignee: Yuki Morishita
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: 4237.txt
>
>
> Back in 0.8 we had a memtable_lifetime_in_minutes setting.  We got rid of it 
> in 1.0 when we added CASSANDRA-2427, which is a better way to  ensure 
> flushing on low-activity memtables.
> However, at the same time we also added the ability to disable durable 
> writes.  So it's entirely possible to configure a low-activity memtable, that 
> isn't part of the commitlog.  So, we should add back a memtable lifetime 
> setting.
> An additional motive is pointed out by 
> http://www.fsl.cs.sunysb.edu/~pshetty/socc11-gtssl.pdf: if you have a *high* 
> activity columnfamily, and don't require absolute durability, the commitlog 
> is redundant if you are flushing faster than the commitlog sync period.  So, 
> disabling durable writes but setting memtable lifetime to the same as the 
> commitlog sync would be a reasonable optimization.
> Thus, when we add back memtable lifetime, I think we should measure it in 
> seconds or possibly even milliseconds (to match commitlog_sync_period) rather 
> than minutes.



[jira] [Created] (CASSANDRA-4913) DESC KEYSPACE from cqlsh won't show cql3 cfs

2012-11-05 Thread Nick Bailey (JIRA)
Nick Bailey created CASSANDRA-4913:
--

 Summary: DESC KEYSPACE  from cqlsh won't show cql3 cfs
 Key: CASSANDRA-4913
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4913
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.2.0 beta 1
Reporter: Nick Bailey
Assignee: Aleksey Yeschenko
 Fix For: 1.2.0


I'm assuming this is because we made 'describe_keyspaces' from thrift not 
return cql3 cfs.



Git Push Summary

2012-11-05 Thread slebresne
Updated Tags:  refs/tags/1.2.0-beta2-tentative [created] f04bebfa7


[2/2] git commit: Update version and add missing license

2012-11-05 Thread slebresne
Update version and add missing license


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fdf29594
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fdf29594
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fdf29594

Branch: refs/heads/trunk
Commit: fdf29594caae2ec0b59a485edc59250a7f49e792
Parents: 52f3912
Author: Sylvain Lebresne 
Authored: Mon Nov 5 18:41:01 2012 +0100
Committer: Sylvain Lebresne 
Committed: Mon Nov 5 18:41:01 2012 +0100

--
 build.xml  |2 +-
 debian/changelog   |6 +
 .../db/columniterator/IColumnIteratorFactory.java  |   21 +++
 .../db/columniterator/LazyColumnIterator.java  |   21 +++
 .../apache/cassandra/tools/AbstractJmxClient.java  |   21 +++
 src/java/org/apache/cassandra/tools/Shuffle.java   |   21 +++
 .../cassandra/utils/AlwaysPresentFilter.java   |   21 +++
 .../org/apache/cassandra/db/HintedHandOffTest.java |  139 +--
 8 files changed, 192 insertions(+), 60 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fdf29594/build.xml
--
diff --git a/build.xml b/build.xml
index 097da2a..a9a1b14 100644
--- a/build.xml
+++ b/build.xml
@@ -25,7 +25,7 @@
 
 
 
-
+
 
 
 http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fdf29594/debian/changelog
--
diff --git a/debian/changelog b/debian/changelog
index e5db090..a0caacb 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+cassandra (1.2.0~beta2) unstable; urgency=low
+
+  * New beta release
+
+ -- Sylvain Lebresne   Mon, 05 Nov 2012 18:17:03 +0100
+
 cassandra (1.2.0~beta1) unstable; urgency=low
 
   * New release

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fdf29594/src/java/org/apache/cassandra/db/columniterator/IColumnIteratorFactory.java
--
diff --git a/src/java/org/apache/cassandra/db/columniterator/IColumnIteratorFactory.java b/src/java/org/apache/cassandra/db/columniterator/IColumnIteratorFactory.java
index 91fcdd8..1d618a5 100644
--- a/src/java/org/apache/cassandra/db/columniterator/IColumnIteratorFactory.java
+++ b/src/java/org/apache/cassandra/db/columniterator/IColumnIteratorFactory.java
@@ -1,4 +1,25 @@
 package org.apache.cassandra.db.columniterator;
+/*
+ * 
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ * 
+ */
+
 
 public interface IColumnIteratorFactory
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fdf29594/src/java/org/apache/cassandra/db/columniterator/LazyColumnIterator.java
--
diff --git a/src/java/org/apache/cassandra/db/columniterator/LazyColumnIterator.java b/src/java/org/apache/cassandra/db/columniterator/LazyColumnIterator.java
index aa2c188..80c7037 100644
--- a/src/java/org/apache/cassandra/db/columniterator/LazyColumnIterator.java
+++ b/src/java/org/apache/cassandra/db/columniterator/LazyColumnIterator.java
@@ -1,4 +1,25 @@
 package org.apache.cassandra.db.columniterator;
+/*
+ * 
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for

[1/2] git commit: log index scan subject in CompositesSearcher

2012-11-05 Thread slebresne
Updated Branches:
  refs/heads/trunk 52f3912d4 -> f04bebfa7


log index scan subject in CompositesSearcher

patch by slebresne; reviewed by jbellis for CASSANDRA-4904


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f04bebfa
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f04bebfa
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f04bebfa

Branch: refs/heads/trunk
Commit: f04bebfa7e5fb0efd003685b10c0e31250de441d
Parents: fdf2959
Author: Sylvain Lebresne 
Authored: Mon Nov 5 18:41:48 2012 +0100
Committer: Sylvain Lebresne 
Committed: Mon Nov 5 18:41:48 2012 +0100

--
 CHANGES.txt|1 +
 .../AbstractSimplePerColumnSecondaryIndex.java |   13 +
 .../db/index/composites/CompositesIndex.java   |6 ++
 .../db/index/composites/CompositesSearcher.java|8 
 .../apache/cassandra/db/index/keys/KeysIndex.java  |6 ++
 .../cassandra/db/index/keys/KeysSearcher.java  |   14 +++---
 6 files changed, 33 insertions(+), 15 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f04bebfa/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 02385df..b64237e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -52,6 +52,7 @@
  * Don't allow prepared marker inside collections (CASSANDRA-4890)
  * Re-allow order by on non-selected columns (CASSANDRA-4645)
  * Bug when composite index is created in a table having collections (CASSANDRA-4909)
+ * log index scan subject in CompositesSearcher (CASSANDRA-4904)
 Merged from 1.1:
  * add get[Row|Key]CacheEntries to CacheServiceMBean (CASSANDRA-4859)
  * fix get_paged_slice to wrap to next row correctly (CASSANDRA-4816)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f04bebfa/src/java/org/apache/cassandra/db/index/AbstractSimplePerColumnSecondaryIndex.java
--
diff --git a/src/java/org/apache/cassandra/db/index/AbstractSimplePerColumnSecondaryIndex.java b/src/java/org/apache/cassandra/db/index/AbstractSimplePerColumnSecondaryIndex.java
index ecabf23..63af51b 100644
--- a/src/java/org/apache/cassandra/db/index/AbstractSimplePerColumnSecondaryIndex.java
+++ b/src/java/org/apache/cassandra/db/index/AbstractSimplePerColumnSecondaryIndex.java
@@ -25,6 +25,7 @@ import org.apache.cassandra.config.ColumnDefinition;
 import org.apache.cassandra.db.*;
 import org.apache.cassandra.db.marshal.*;
 import org.apache.cassandra.dht.*;
+import org.apache.cassandra.thrift.IndexExpression;
 import org.apache.cassandra.utils.ByteBufferUtil;
 
 /**
@@ -73,6 +74,18 @@ public abstract class AbstractSimplePerColumnSecondaryIndex extends PerColumnSec
 
     protected abstract ByteBuffer makeIndexColumnName(ByteBuffer rowKey, IColumn column);
 
+    protected abstract AbstractType getExpressionComparator();
+
+    public String expressionString(IndexExpression expr)
+    {
+        return String.format("'%s.%s %s %s'",
+                             baseCfs.columnFamily,
+                             getExpressionComparator().getString(expr.column_name),
+                             expr.op,
+                             baseCfs.metadata.getColumn_metadata().get(expr.column_name).getValidator().getString(expr.value));
+    }
+
+
     public void delete(ByteBuffer rowKey, IColumn column)
     {
         if (column.isMarkedForDelete())

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f04bebfa/src/java/org/apache/cassandra/db/index/composites/CompositesIndex.java
--
diff --git a/src/java/org/apache/cassandra/db/index/composites/CompositesIndex.java b/src/java/org/apache/cassandra/db/index/composites/CompositesIndex.java
index 552d75f..f1aa4aa 100644
--- a/src/java/org/apache/cassandra/db/index/composites/CompositesIndex.java
+++ b/src/java/org/apache/cassandra/db/index/composites/CompositesIndex.java
@@ -67,6 +67,12 @@ public class CompositesIndex extends AbstractSimplePerColumnSecondaryIndex
         return builder.build();
     }
 
+    protected AbstractType getExpressionComparator()
+    {
+        CompositeType baseComparator = (CompositeType)baseCfs.getComparator();
+        return baseComparator.types.get(prefixSize);
+    }
+
     @Override
     public boolean indexes(ByteBuffer name)
     {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f04bebfa/src/java/org/apache/cassandra/db/index/composites/CompositesSearcher.java
--
diff --git 
a/src/java/org/apache/cassandra/db/index/composites/CompositesSearcher.java 
b/src/java/org/apache/cassandra/db/index/composites

[jira] [Commented] (CASSANDRA-4904) log index scan subject in CompositesSearcher

2012-11-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490758#comment-13490758
 ] 

Jonathan Ellis commented on CASSANDRA-4904:
---

+1

> log index scan subject in CompositesSearcher
> 
>
> Key: CASSANDRA-4904
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4904
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Jonathan Ellis
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: 4904.txt
>
>
> Would like to do the equivalent of this from KeysSearcher:
> {code}
> if (logger.isDebugEnabled())
>     logger.debug("Most-selective indexed predicate is on {}", baseCfs.getComparator().getString(primary.column_name));
> {code}
> Not sure how to figure out what part of the comparator the indexed name 
> belongs to here, with which to getString it.
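
For readers who want to see what such a log subject could look like, here is a tiny Java sketch that formats an index expression as 'columnfamily.column op value'. It takes already decoded strings as input, so it sidesteps the comparator question raised above; ExpressionStringSketch and its parameters are hypothetical and only illustrate the output shape.

```java
// Hypothetical sketch: works on already decoded strings, unlike real index code.
final class ExpressionStringSketch {
    /** Formats an index expression for logging, e.g. 'cf.column OP value'. */
    static String expressionString(String columnFamily, String columnName, String op, String value) {
        return String.format("'%s.%s %s %s'", columnFamily, columnName, op, value);
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        System.out.println(expressionString("videos", "username", "EQ", "bob"));
    }
}
```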



[jira] [Comment Edited] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Henrik Ring (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490748#comment-13490748
 ] 

Henrik Ring edited comment on CASSANDRA-4909 at 11/5/12 5:19 PM:
-

Just to clarify: Comments above are my understanding of the issue and a 
suggestion for a test. Since you agree I will next try and actually run the test 
to see if it does as expected. I have not done that yet. I need to get the 
development environment up and running first. I am new to all of this so it is 
taking a bit of time to get it all going :-). I'm not a Python developer, so I 
was hoping someone else with better skills in that area could include the test 
in the automated tests as appropriate. 

  was (Author: henrikring):
Just to clarify: Comments above are my understanding of the issue and a 
suggestion for test. Since you agree I will next try and actually run the test 
to see if it does as expected. I have not done that yet. I need to get the 
development environment up and running first. I new to all of this so it is 
taking a bit of time :-). I'm not a Python developer, so I was hoping someone 
else with better skills that area could include the test in the automated tests 
as appropriate. 
  
> Bug when composite index is created in a table having collections
> -
>
> Key: CASSANDRA-4909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4909
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.2.0 beta 2
>
> Attachments: 4909.txt
>
>
> CASSANDRA-4511 is open to add proper indexing of collection, but currently 
> indexing doesn't work correctly if we index a value in a table having 
> collection, even if that value is not a collection itself.
> We also don't refuse creating index on collections, even though we don't 
> support it. Attaching patch to fix both.



[jira] [Comment Edited] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Henrik Ring (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490748#comment-13490748
 ] 

Henrik Ring edited comment on CASSANDRA-4909 at 11/5/12 5:19 PM:
-

Just to clarify: Comments above are my understanding of the issue and a 
suggestion for a test. Since you agree I will next try and actually run the test 
to see if it does as expected. I have not done that yet. I need to get the 
development environment up and running first. I am new to all of this so it is 
taking a bit of time :-). I'm not a Python developer, so I was hoping someone 
else with better skills in that area could include the test in the automated 
tests as appropriate. 

  was (Author: henrikring):
Just to clarify: Comments below are my understanding of the issue and a 
suggestion for test. Since you agree I will next try and actually run the test 
to see if it does as expected. I have not done that yet. I need to get the 
development environment up and running first. I new to all of this so it is 
taking a bit of time :-). I'm not a Python developer, so I was hoping someone 
else with better skills that area could include the test in the automated tests 
as appropriate. 
  
> Bug when composite index is created in a table having collections
> -
>
> Key: CASSANDRA-4909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4909
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.2.0 beta 2
>
> Attachments: 4909.txt
>
>
> CASSANDRA-4511 is open to add proper indexing of collection, but currently 
> indexing doesn't work correctly if we index a value in a table having 
> collection, even if that value is not a collection itself.
> We also don't refuse creating index on collections, even though we don't 
> support it. Attaching patch to fix both.



[jira] [Commented] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Henrik Ring (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490748#comment-13490748
 ] 

Henrik Ring commented on CASSANDRA-4909:


Just to clarify: Comments below are my understanding of the issue and a 
suggestion for a test. Since you agree I will next try and actually run the test 
to see if it does as expected. I have not done that yet. I need to get the 
development environment up and running first. I am new to all of this so it is 
taking a bit of time :-). I'm not a Python developer, so I was hoping someone 
else with better skills in that area could include the test in the automated 
tests as appropriate. 

> Bug when composite index is created in a table having collections
> -
>
> Key: CASSANDRA-4909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4909
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.2.0 beta 2
>
> Attachments: 4909.txt
>
>
> CASSANDRA-4511 is open to add proper indexing of collection, but currently 
> indexing doesn't work correctly if we index a value in a table having 
> collection, even if that value is not a collection itself.
> We also don't refuse creating index on collections, even though we don't 
> support it. Attaching patch to fix both.



[jira] [Assigned] (CASSANDRA-4897) Allow tiered compaction define max sstable size

2012-11-05 Thread Radim Kolar (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radim Kolar reassigned CASSANDRA-4897:
--

Assignee: (was: Radim Kolar)

> Allow tiered compaction define max sstable size
> ---
>
> Key: CASSANDRA-4897
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4897
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Radim Kolar
> Attachments: cass-maxsize1.txt
>
>
> Lucene does the same thing. A correctly configured max segment size will 
> recycle old data faster and with less disk space.
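
One way to picture the request is a filter that keeps sstables at or above a configured maximum size out of the candidate set for tiered compaction. The Java sketch below illustrates only that idea and is not the attached patch; MaxSizeFilterSketch, compactionCandidates, and maxSSTableSize are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the idea; not the code in the attached patch.
final class MaxSizeFilterSketch {
    /**
     * Returns the sizes of sstables still eligible for tiered compaction,
     * skipping any that have already reached the configured maximum size.
     */
    static List<Long> compactionCandidates(List<Long> sstableSizes, long maxSSTableSize) {
        List<Long> candidates = new ArrayList<>();
        for (long size : sstableSizes) {
            if (size < maxSSTableSize)
                candidates.add(size);
        }
        return candidates;
    }
}
```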



[jira] [Commented] (CASSANDRA-4879) CQL help in trunk/doc/cql3/CQL.textile outdated

2012-11-05 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490733#comment-13490733
 ] 

Eric Evans commented on CASSANDRA-4879:
---

bq. Patch attached to document collection types

And another to fix {{CREATE KEYSPACE}} for map vs. properties

> CQL help in trunk/doc/cql3/CQL.textile outdated
> ---
>
> Key: CASSANDRA-4879
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4879
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Kristine Hahn
>Assignee: Sylvain Lebresne
>  Labels: documentation
> Attachments: 
> v2-0001-CASSANDRA-4879-update-CQL-doc-for-collections.txt, 
> v2-0002-update-CREATE-KEYSPACE-for-map-syntax.txt
>
>
> https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile doesn't 
> include the new create keyspace syntax or the collections. Last time, I 
> updated the CQL.textile for Paul Cannon to review. Want me to do it again? 
> BNF-like formatting needs to be replaced, right, because the brackets now 
> have literal meaning. I test-applied this custom formatting to commands and 
> it seems ok: Uppercase means literal (lowercase nonliteral), italics mean 
> optional, the | symbol means OR, ... means repeatable. The ... in italics 
> doesn't strictly explain things like nested [...] does, but it's easier on 
> the eyes and loosely understandable. Any doubt could be erased by examples, I 
> think. 



[jira] [Updated] (CASSANDRA-4879) CQL help in trunk/doc/cql3/CQL.textile outdated

2012-11-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-4879:
--

Attachment: v2-0002-update-CREATE-KEYSPACE-for-map-syntax.txt
v2-0001-CASSANDRA-4879-update-CQL-doc-for-collections.txt

> CQL help in trunk/doc/cql3/CQL.textile outdated
> ---
>
> Key: CASSANDRA-4879
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4879
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Kristine Hahn
>Assignee: Sylvain Lebresne
>  Labels: documentation
> Attachments: 
> v2-0001-CASSANDRA-4879-update-CQL-doc-for-collections.txt, 
> v2-0002-update-CREATE-KEYSPACE-for-map-syntax.txt
>
>
> https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile doesn't 
> include the new create keyspace syntax or the collections. Last time, I 
> updated the CQL.textile for Paul Cannon to review. Want me to do it again? 
> BNF-like formatting needs to be replaced, right, because the brackets now 
> have literal meaning. I test-applied this custom formatting to commands and 
> it seems ok: Uppercase means literal (lowercase nonliteral), italics mean 
> optional, the | symbol means OR, ... means repeatable. The ... in italics 
> doesn't strictly explain things like nested [...] does, but it's easier on 
> the eyes and loosely understandable. Any doubt could be erased by examples, I 
> think. 



[jira] [Commented] (CASSANDRA-4813) Problem using BulkOutputFormat while streaming several SSTables simultaneously from a given node.

2012-11-05 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490732#comment-13490732
 ] 

Michael Kjellman commented on CASSANDRA-4813:
-

Sounds good, Yuki. I've been trying to get a secondary cluster set up since the 
31st but keep getting pulled away. I promise I'll get multiple reducers tested 
today :)

> Problem using BulkOutputFormat while streaming several SSTables 
> simultaneously from a given node.
> -
>
> Key: CASSANDRA-4813
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4813
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: I am using SLES 10 SP3, Java 6, 4 Cassandra + Hadoop 
> nodes, 3 Hadoop only nodes (datanodes/tasktrackers), 1 namenode/jobtracker. 
> The machines used are Six-Core AMD Opteron(tm) Processor 8431, 24 cores and 
> 33 GB of RAM. I get the issue on both cassandra 1.1.3, 1.1.5 and I am using 
> Hadoop 0.20.2.
>Reporter: Ralph Romanos
>Assignee: Yuki Morishita
>Priority: Minor
>  Labels: Bulkoutputformat, Hadoop, SSTables
> Fix For: 1.2.0
>
> Attachments: 4813.txt
>
>
> The issue occurs when streaming several SSTables simultaneously from the same 
> node to a Cassandra cluster using SSTableloader. It seems that Cassandra cannot 
> handle receiving several SSTables simultaneously from the same node. However, 
> when it receives SSTables simultaneously from two different nodes, everything 
> works fine. As a consequence, when using BulkOutputFormat to generate SSTables 
> and stream them to a Cassandra cluster, I cannot use more than one reducer per 
> node; otherwise I get a java.io.EOFException in the tasktracker's logs and a 
> java.io.IOException: Broken pipe in the Cassandra logs.



[jira] [Updated] (CASSANDRA-4879) CQL help in trunk/doc/cql3/CQL.textile outdated

2012-11-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-4879:
--

Attachment: (was: 
v1-0001-CASSANDRA-4879-update-CQL-doc-for-collections.txt)

> CQL help in trunk/doc/cql3/CQL.textile outdated
> ---
>
> Key: CASSANDRA-4879
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4879
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Kristine Hahn
>Assignee: Sylvain Lebresne
>  Labels: documentation
> Attachments: 
> v2-0001-CASSANDRA-4879-update-CQL-doc-for-collections.txt, 
> v2-0002-update-CREATE-KEYSPACE-for-map-syntax.txt
>
>
> https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile doesn't 
> include the new create keyspace syntax or the collections. Last time, I 
> updated the CQL.textile for Paul Cannon to review. Want me to do it again? 
> BNF-like formatting needs to be replaced, right, because the brackets now 
> have literal meaning. I test-applied this custom formatting to commands and 
> it seems ok: Uppercase means literal (lowercase nonliteral), italics mean 
> optional, the | symbol means OR, ... means repeatable. The ... in italics 
> doesn't strictly explain things like nested [...] does, but it's easier on 
> the eyes and loosely understandable. Any doubt could be erased by examples, I 
> think. 



[jira] [Commented] (CASSANDRA-4813) Problem using BulkOutputFormat while streaming several SSTables simultaneously from a given node.

2012-11-05 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490728#comment-13490728
 ] 

Yuki Morishita commented on CASSANDRA-4813:
---

Michael,

I couldn't reproduce your error, and I believe that is not related to this 
issue.
So if you see that error constantly, please open another issue.

> Problem using BulkOutputFormat while streaming several SSTables 
> simultaneously from a given node.
> -
>
> Key: CASSANDRA-4813
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4813
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: I am using SLES 10 SP3, Java 6, 4 Cassandra + Hadoop 
> nodes, 3 Hadoop only nodes (datanodes/tasktrackers), 1 namenode/jobtracker. 
> The machines used are Six-Core AMD Opteron(tm) Processor 8431, 24 cores and 
> 33 GB of RAM. I get the issue on both cassandra 1.1.3, 1.1.5 and I am using 
> Hadoop 0.20.2.
>Reporter: Ralph Romanos
>Assignee: Yuki Morishita
>Priority: Minor
>  Labels: Bulkoutputformat, Hadoop, SSTables
> Fix For: 1.2.0
>
> Attachments: 4813.txt
>
>
> The issue occurs when streaming several SSTables simultaneously from the same 
> node to a Cassandra cluster using SSTableloader. It seems that Cassandra cannot 
> handle receiving several SSTables simultaneously from the same node. However, 
> when it receives SSTables simultaneously from two different nodes, everything 
> works fine. As a consequence, when using BulkOutputFormat to generate SSTables 
> and stream them to a Cassandra cluster, I cannot use more than one reducer per 
> node; otherwise I get a java.io.EOFException in the tasktracker's logs and a 
> java.io.IOException: Broken pipe in the Cassandra logs.



[jira] [Commented] (CASSANDRA-4897) Allow tiered compaction define max sstable size

2012-11-05 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490729#comment-13490729
 ] 

Radim Kolar commented on CASSANDRA-4897:


This one is very trivial; you can fix it yourself.

> Allow tiered compaction define max sstable size
> ---
>
> Key: CASSANDRA-4897
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4897
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Radim Kolar
>Assignee: Radim Kolar
> Attachments: cass-maxsize1.txt
>
>
> Lucene does the same thing. A correctly configured max segment size will 
> recycle old data faster and use less disk space.



[jira] [Updated] (CASSANDRA-4905) Repair should exclude gcable tombstones from merkle-tree computation

2012-11-05 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4905:
--

Fix Version/s: 1.2.0
   1.1.7

Tagging version based on the basic part in the title.  Let's open a new ticket 
for 1.3 if we want to get crazy with protocol changes.

> Repair should exclude gcable tombstones from merkle-tree computation
> 
>
> Key: CASSANDRA-4905
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4905
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Christian Spriegel
> Fix For: 1.1.7, 1.2.0
>
>
> Currently, gcable tombstones get repaired if some replicas have already 
> compacted them away but others have not.
> This could be avoided by ignoring all gcable tombstones during merkle tree 
> calculation.
> This was discussed with Sylvain on the mailing list:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html
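
A minimal sketch of the idea in Java, assuming a tombstone counts as gcable when its local deletion time is older than a gcBefore cutoff. The Cell class, the digest method and the choice of SHA-256 are hypothetical stand-ins for illustration; they are not Cassandra's repair code. The point is only that a replica still holding a gcable tombstone and a replica that has already compacted it away produce the same hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical sketch: skip gcable tombstones when hashing a row for repair.
final class RepairDigestSketch {
    /** Illustrative cell: either a live value or a tombstone with a deletion time in seconds. */
    static final class Cell {
        final String name;
        final String value;          // null for tombstones
        final boolean tombstone;
        final int localDeletionTime; // only meaningful for tombstones

        Cell(String name, String value, boolean tombstone, int localDeletionTime) {
            this.name = name;
            this.value = value;
            this.tombstone = tombstone;
            this.localDeletionTime = localDeletionTime;
        }
    }

    /** Hashes a row, ignoring tombstones deleted before gcBefore. */
    static byte[] digest(List<Cell> row, int gcBefore) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
        for (Cell cell : row) {
            if (cell.tombstone && cell.localDeletionTime < gcBefore) {
                continue; // gcable: compaction may already have dropped it elsewhere
            }
            md.update(cell.name.getBytes(StandardCharsets.UTF_8));
            md.update((byte) (cell.tombstone ? 1 : 0));
            if (!cell.tombstone) {
                md.update(cell.value.getBytes(StandardCharsets.UTF_8));
            }
        }
        return md.digest();
    }
}
```

Tombstones that are not yet gcable still contribute to the hash, so replicas that missed a recent delete are still detected as out of sync.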



git commit: bump hadoop version to 1.0.3

2012-11-05 Thread jbellis
Updated Branches:
  refs/heads/trunk 0aef00489 -> 52f3912d4


bump hadoop version to 1.0.3


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/52f3912d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/52f3912d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/52f3912d

Branch: refs/heads/trunk
Commit: 52f3912d42be16f418993893734681308da1c3a2
Parents: 0aef004
Author: Jonathan Ellis 
Authored: Mon Nov 5 10:41:15 2012 -0600
Committer: Jonathan Ellis 
Committed: Mon Nov 5 10:41:19 2012 -0600

--
 build.xml |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/52f3912d/build.xml
--
diff --git a/build.xml b/build.xml
index 04a5307..097da2a 100644
--- a/build.xml
+++ b/build.xml
@@ -362,7 +362,7 @@
   
  
   
-  
+  
   
   
   



[jira] [Resolved] (CASSANDRA-4679) Fix binary protocol NEW_NODE event

2012-11-05 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-4679.
-

Resolution: Fixed

Alright, I've reverted the 'start thrift before gossip' part of this patch. 
That has never been the most important part anyway. This should fix any 
regression at least as far as this ticket is concerned.

> Fix binary protocol NEW_NODE event
> --
>
> Key: CASSANDRA-4679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4679
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0 beta 2
>
> Attachments: 0001-4679.txt, 
> 0002-Start-RPC-binary-protocol-before-gossip.txt, 
> 0003-Remove-hardcoded-initServer-from-AntiEntropyServiceTes.txt
>
>
> As discussed on CASSANDRA-4480, the NEW_NODE/REMOVED_NODE of the binary 
> protocol are not correctly fired (NEW_NODE is fired on node UP basically). 
> This ticket is to fix that.
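
The distinction the ticket asks for can be sketched in Java as follows, assuming the intended behaviour is that NEW_NODE fires only the first time an endpoint is seen and UP fires when an already known endpoint comes back. The class, the method names and the membership set are hypothetical and much simpler than the real gossip and binary protocol code, which may also emit more than one event per state change.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: telling NEW_NODE apart from UP.
final class TopologyEventSketch {
    enum Event { NEW_NODE, REMOVED_NODE, UP, DOWN }

    // Endpoints that are currently part of the cluster, alive or not.
    private final Set<String> knownMembers = new HashSet<>();

    /** An endpoint was seen alive: NEW_NODE only if it was not a member before. */
    Event onAlive(String endpoint) {
        boolean firstSighting = knownMembers.add(endpoint);
        return firstSighting ? Event.NEW_NODE : Event.UP;
    }

    /** An endpoint stopped responding but is still a member. */
    Event onDead(String endpoint) {
        return Event.DOWN;
    }

    /** An endpoint left the cluster for good. */
    Event onRemoved(String endpoint) {
        knownMembers.remove(endpoint);
        return Event.REMOVED_NODE;
    }
}
```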



git commit: Revert 'start thrift before gossip' part of 4679

2012-11-05 Thread slebresne
Updated Branches:
  refs/heads/trunk 617a4ab6f -> 0aef00489


Revert 'start thrift before gossip' part of 4679


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0aef0048
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0aef0048
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0aef0048

Branch: refs/heads/trunk
Commit: 0aef004896951372b0d9f7a755d0ad906cc810b2
Parents: 617a4ab
Author: Sylvain Lebresne 
Authored: Mon Nov 5 17:35:05 2012 +0100
Committer: Sylvain Lebresne 
Committed: Mon Nov 5 17:35:05 2012 +0100

--
 .../apache/cassandra/service/CassandraDaemon.java  |   22 +++
 .../apache/cassandra/service/StorageService.java   |   19 -
 2 files changed, 10 insertions(+), 31 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/0aef0048/src/java/org/apache/cassandra/service/CassandraDaemon.java
--
diff --git a/src/java/org/apache/cassandra/service/CassandraDaemon.java b/src/java/org/apache/cassandra/service/CassandraDaemon.java
index 7b8e7d8..03da7e3 100644
--- a/src/java/org/apache/cassandra/service/CassandraDaemon.java
+++ b/src/java/org/apache/cassandra/service/CassandraDaemon.java
@@ -300,7 +300,16 @@ public class CassandraDaemon
 
 // start server internals
 StorageService.instance.registerDaemon(this);
-StorageService.instance.initServerLocally();
+try
+{
+StorageService.instance.initServer();
+}
+catch (ConfigurationException e)
+{
+logger.error("Fatal configuration error", e);
+System.err.println(e.getMessage() + "\nFatal configuration error; unable to start server.  See log for stacktrace.");
+System.exit(1);
+}
 
 Mx4jTool.maybeLoad();
 
@@ -348,17 +357,6 @@ public class CassandraDaemon
 nativeServer.start();
 else
 logger.info("Not starting native transport as requested. Use JMX (StorageService->startNativeTransport()) to start it");
-
-try
-{
-StorageService.instance.maybeJoinRing(StorageService.RING_DELAY);
-}
-catch (ConfigurationException e)
-{
-logger.error("Fatal configuration error", e);
-System.err.println(e.getMessage() + "\nFatal configuration error; unable to start server.  See log for stacktrace.");
-System.exit(1);
-}
 }
 
 /**

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0aef0048/src/java/org/apache/cassandra/service/StorageService.java
--
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index 6302507..0e6fd30 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -400,12 +400,6 @@ public class StorageService implements IEndpointStateChangeSubscriber, StorageSe
 
 public synchronized void initServer(int delay) throws ConfigurationException
 {
-initServerLocally();
-maybeJoinRing(delay);
-}
-
-public void initServerLocally()
-{
 logger.info("Cassandra version: " + FBUtilities.getReleaseVersionString());
 logger.info("Thrift API version: " + Constants.VERSION);
 logger.info("CQL supported versions: " + StringUtils.join(ClientState.getCQLSupportedVersion(), ",") + " (default: " + ClientState.DEFAULT_CQL_VERSION + ")");
@@ -501,19 +495,6 @@ public class StorageService implements IEndpointStateChangeSubscriber, StorageSe
 }
 }, "StorageServiceShutdownHook");
 Runtime.getRuntime().addShutdownHook(drainOnShutdown);
-}
-
-public synchronized void maybeJoinRing(int delay) throws ConfigurationException
-{
-// This method should only be called as part of the server initialization, so if initialized == true, we've already gone
-// through that. If the ring must be joined after the server initialization, use joinTokenRing() directly.
-if (initialized)
-{
-if (isClientMode)
-throw new UnsupportedOperationException("StorageService does not support switching modes.");
-return;
-}
-initialized = true;
 
 if (Boolean.parseBoolean(System.getProperty("cassandra.join_ring", "true")))
 {



[jira] [Commented] (CASSANDRA-4898) Authentication provider in Cassandra itself

2012-11-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490716#comment-13490716
 ] 

Jonathan Ellis commented on CASSANDRA-4898:
---

One wrinkle I forgot about: CASSANDRA-4648 made executeInternal local-only.  So 
we'd need to re-add a method to execute a CQL query that may have non-local 
answers.  Should be pretty easy to pull that code out of the pre-4648 class 
though.

> Authentication provider in Cassandra itself
> ---
>
> Key: CASSANDRA-4898
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4898
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 1.1.6
>Reporter: Dirkjan Bussink
>  Labels: authentication, authorization
>
> I've been working on an implementation for both IAuthority2 and 
> IAuthenticator that uses Cassandra itself to store the necessary credentials. 
> I'm planning on open sourcing this shortly.
> Is there any interest in this? It tries to provide reasonable security, for 
> example using PBKDF2 to store passwords with a configurable configuration 
> cycle and managing all the rights available in IAuthority2. 
> My main use goal isn't security / confidentiality of the data, but more that 
> I don't want multiple consumers of the cluster to accidentally screw stuff 
> up. Only certain users can write data, others can read it out again and 
> further process it.
> I'm planning on releasing this soon under an open source license (probably 
> the same as Cassandra itself). Would there be interest in incorporating it as 
> a new reference implementation instead of the properties file implementation 
> perhaps? Or would it be better to maintain it separately? I would love it if 
> people from the community wanted to review it, since I have been dabbling in the 
> Cassandra source code only for a short while now.
> During the development of this I've encountered a few bumps and I wonder 
> whether they could be addressed or not.
> = Moment when validateConfiguration() runs =
> Is there a deliberate reason that validateConfiguration() is executed before 
> all information about keyspaces, column families etc. is available? In the 
> current form I therefore can't validate whether column families etc. are 
> available for authentication since they aren't loaded yet.
> I've wanted to use this to make relatively easy bootstrapping possible. My 
> approach here would be to only enable authentication if the needed keyspace 
> is available. This allows for configuring the cluster, then importing the 
> necessary authentication data for an admin user to bootstrap further, and then 
> restarting every node in the cluster.
> Basically the questions here are, can the moment when validateConfiguration() 
> runs for an authentication provider be changed? Is this approach to 
> bootstrapping reasonable or do people have better ideas?
> = AbstractReplicationStrategy has package visible constructor =
> I've added a strategy that basically says that data should be available on 
> all nodes. The amount of data used for authentication is very limited. 
> Replicating it to every node is therefore not very problematic and allows 
> every node to have all data locally available for verifying requests.
> I wanted to put this strategy into its own package inside the authentication 
> module, but since the constructor of AbstractReplicationStrategy has no 
> visibility explicitly marked, it's only available inside the same package.
> I'm not sure whether implementing a strategy to replicate data to all nodes 
> is a sane idea and whether my implementation of this strategy is correct. 
> What do you people think of this? Would people want to review the 
> implementation?
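
For readers unfamiliar with the PBKDF2 approach mentioned above, here is a minimal Java sketch using only the JDK's javax.crypto classes. It is not the reporter's implementation, which is not attached to the ticket. The class and method names are hypothetical, and the "configurable" setting is assumed here to be the PBKDF2 iteration count.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical sketch: salted PBKDF2 password hashing with a configurable iteration count.
final class Pbkdf2Sketch {
    private static final int KEY_LENGTH_BITS = 256;

    /** Generates a fresh random salt to store next to the hash. */
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Derives the value to store instead of the clear-text password. */
    static byte[] hash(char[] password, byte[] salt, int iterations) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, KEY_LENGTH_BITS);
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
            return factory.generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available on this JVM", e);
        } finally {
            spec.clearPassword();
        }
    }

    /** Re-derives the hash from a login attempt and compares it with the stored one. */
    static boolean matches(char[] attempt, byte[] salt, int iterations, byte[] storedHash) {
        return MessageDigest.isEqual(hash(attempt, salt, iterations), storedHash);
    }
}
```

The salt and iteration count would have to be stored alongside the hash so that a login attempt can be re-derived with the same parameters.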



[jira] [Commented] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490713#comment-13490713
 ] 

Sylvain Lebresne commented on CASSANDRA-4909:
-

bq. Working at the ApacheConEU Hackathon

I'm not sure what that means, but let me point out that there are already a fair 
number of tests here: 
https://github.com/riptano/cassandra-dtest/blob/master/cql_tests.py, including 
a test for this ticket. It's not necessarily perfect, but in any case it would 
be nice to avoid duplication of efforts.

> Bug when composite index is created in a table having collections
> -
>
> Key: CASSANDRA-4909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4909
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.2.0 beta 2
>
> Attachments: 4909.txt
>
>
> CASSANDRA-4511 is open to add proper indexing of collection, but currently 
> indexing doesn't work correctly if we index a value in a table having 
> collection, even if that value is not a collection itself.
> We also don't refuse creating index on collections, even though we don't 
> support it. Attaching patch to fix both.



[jira] [Commented] (CASSANDRA-4679) Fix binary protocol NEW_NODE event

2012-11-05 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490710#comment-13490710
 ] 

Brandon Williams commented on CASSANDRA-4679:
-

Sorry, edited my comment to clarify while you were posting yours :)

> Fix binary protocol NEW_NODE event
> --
>
> Key: CASSANDRA-4679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4679
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0 beta 2
>
> Attachments: 0001-4679.txt, 
> 0002-Start-RPC-binary-protocol-before-gossip.txt, 
> 0003-Remove-hardcoded-initServer-from-AntiEntropyServiceTes.txt
>
>
> As discussed on CASSANDRA-4480, the NEW_NODE/REMOVED_NODE of the binary 
> protocol are not correctly fired (NEW_NODE is fired on node UP basically). 
> This ticket is to fix that.



git commit: Bug when composite index is created in a table having collections

2012-11-05 Thread slebresne
Updated Branches:
  refs/heads/trunk 811359db0 -> 617a4ab6f


Bug when composite index is created in a table having collections

patch by slebresne; reviewed by jbellis for CASSANDRA-4909


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/617a4ab6
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/617a4ab6
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/617a4ab6

Branch: refs/heads/trunk
Commit: 617a4ab6fd1ee76820a95de5fe32a5b283e07c06
Parents: 811359d
Author: Sylvain Lebresne 
Authored: Mon Nov 5 17:17:16 2012 +0100
Committer: Sylvain Lebresne 
Committed: Mon Nov 5 17:17:16 2012 +0100

--
 CHANGES.txt|1 +
 .../cql3/statements/CreateIndexStatement.java  |5 -
 2 files changed, 5 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/617a4ab6/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 89c046e..02385df 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -51,6 +51,7 @@
  * Add tracing support to the binary protocol (CASSANDRA-4699)
  * Don't allow prepared marker inside collections (CASSANDRA-4890)
  * Re-allow order by on non-selected columns (CASSANDRA-4645)
+ * Bug when composite index is created in a table having collections (CASSANDRA-4909)
 Merged from 1.1:
  * add get[Row|Key]CacheEntries to CacheServiceMBean (CASSANDRA-4859)
  * fix get_paged_slice to wrap to next row correctly (CASSANDRA-4816)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/617a4ab6/src/java/org/apache/cassandra/cql3/statements/CreateIndexStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/CreateIndexStatement.java b/src/java/org/apache/cassandra/cql3/statements/CreateIndexStatement.java
index 710de11..4e0f536 100644
--- a/src/java/org/apache/cassandra/cql3/statements/CreateIndexStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/CreateIndexStatement.java
@@ -76,11 +76,14 @@ public class CreateIndexStatement extends SchemaAlteringStatement
 if (logger.isDebugEnabled())
 logger.debug("Updating column {} definition for index {}", columnName, indexName);
 
+if (cd.getValidator().isCollection())
+throw new InvalidRequestException("Indexes on collections are no yet supported");
+
 if (cfDef.isComposite)
 {
 CompositeType composite = (CompositeType)cfm.comparator;
 Map opts = new HashMap();
-opts.put(CompositesIndex.PREFIX_SIZE_OPTION, String.valueOf(composite.types.size() - 1));
+opts.put(CompositesIndex.PREFIX_SIZE_OPTION, String.valueOf(composite.types.size() - (cfDef.hasCollections ? 2 : 1)));
 cd.setIndexType(IndexType.COMPOSITES, opts);
 }
 else
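
The arithmetic changed by this hunk can be isolated in a few lines of Java. The sketch below is not the committed code: the class and parameter names are hypothetical. It assumes, as the diff suggests, that a table with collections carries one extra trailing comparator component, so two components rather than one must be dropped to get the index prefix size.

```java
// Hypothetical sketch of the prefix-size computation from the hunk above.
final class PrefixSizeSketch {
    /**
     * @param comparatorComponents number of components in the composite comparator
     * @param hasCollections       whether the table declares any collection column
     * @return number of leading components used as the index prefix
     */
    static int prefixSize(int comparatorComponents, boolean hasCollections) {
        // Without collections only the last component is dropped;
        // with collections the extra trailing component is dropped as well.
        return comparatorComponents - (hasCollections ? 2 : 1);
    }
}
```

Under that assumption, a three-component comparator without collections and a four-component comparator with collections both give a prefix size of 2, which is the case the old code got wrong.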



[jira] [Commented] (CASSANDRA-4679) Fix binary protocol NEW_NODE event

2012-11-05 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490708#comment-13490708
 ] 

Sylvain Lebresne commented on CASSANDRA-4679:
-

As much as I'm happy to revert the part about starting the thrift/binary 
protocol server first, I'd prefer some precision about "breaks all kind of 
things". That is, does that break things only if people query the thrift 
interface before MS and gossip are up (which would make sense, but doesn't seem 
very likely to happen in real life, which doesn't mean we shouldn't fix it, 
btw), or does it break things in other cases?

> Fix binary protocol NEW_NODE event
> --
>
> Key: CASSANDRA-4679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4679
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0 beta 2
>
> Attachments: 0001-4679.txt, 
> 0002-Start-RPC-binary-protocol-before-gossip.txt, 
> 0003-Remove-hardcoded-initServer-from-AntiEntropyServiceTes.txt
>
>
> As discussed on CASSANDRA-4480, the NEW_NODE/REMOVED_NODE of the binary 
> protocol are not correctly fired (NEW_NODE is fired on node UP basically). 
> This ticket is to fix that.



[jira] [Commented] (CASSANDRA-4910) CQL3 doesn't allow static CF definition with compact storage in C* 1.1

2012-11-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490709#comment-13490709
 ] 

Jonathan Ellis commented on CASSANDRA-4910:
---

+1

> CQL3 doesn't allow static CF definition with compact storage in C* 1.1
> --
>
> Key: CASSANDRA-4910
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4910
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.1.1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.1.7
>
> Attachments: 4910.txt
>
>
> In Cassandra 1.1, the following CQL3 definition:
> {noformat}
> CREATE TABLE user_profiles (
> user_id text PRIMARY KEY,
> first_name text,
> last_name text,
> year_of_birth int
> ) WITH COMPACT STORAGE;
> {noformat}
> yields:
> {noformat}
> Bad Request: COMPACT STORAGE requires at least one column part of the 
> clustering key, none found
> {noformat}
> This works fine in 1.2 however.



[jira] [Updated] (CASSANDRA-4879) CQL help in trunk/doc/cql3/CQL.textile outdated

2012-11-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-4879:
--

Attachment: v1-0001-CASSANDRA-4879-update-CQL-doc-for-collections.txt

> CQL help in trunk/doc/cql3/CQL.textile outdated
> ---
>
> Key: CASSANDRA-4879
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4879
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Kristine Hahn
>Assignee: Sylvain Lebresne
>  Labels: documentation
> Attachments: v1-0001-CASSANDRA-4879-update-CQL-doc-for-collections.txt
>
>
> https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile doesn't 
> include the new create keyspace syntax or the collections. Last time, I 
> updated the CQL.textile for Paul Cannon to review. Want me to do it again? 
> BNF-like formatting needs to be replaced, right, because the brackets now 
> have literal meaning. I test-applied this custom formatting to commands and 
> it seems ok: Uppercase means literal (lowercase nonliteral), italics mean 
> optional, the | symbol means OR, ... means repeatable. The ... in italics 
> doesn't strictly explain things like nested [...] does, but it's easier on 
> the eyes and loosely understandable. Any doubt could be erased by examples, I 
> think. 



[jira] [Commented] (CASSANDRA-4897) Allow tiered compaction define max sstable size

2012-11-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490707#comment-13490707
 ] 

Jonathan Ellis commented on CASSANDRA-4897:
---

If I review this, are you going to just close it as wontfix if you need to 
change something?

Because if so let's save us both some time and do it now.

> Allow tiered compaction define max sstable size
> ---
>
> Key: CASSANDRA-4897
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4897
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Radim Kolar
>Assignee: Radim Kolar
> Attachments: cass-maxsize1.txt
>
>
> Lucene does the same thing. A correctly configured max segment size will 
> recycle old data faster and with less disk space.



[jira] [Created] (CASSANDRA-4912) BulkOutputFormat should support Hadoop MultipleOutput

2012-11-05 Thread Michael Kjellman (JIRA)
Michael Kjellman created CASSANDRA-4912:
---

 Summary: BulkOutputFormat should support Hadoop MultipleOutput
 Key: CASSANDRA-4912
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4912
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Affects Versions: 1.2.0 beta 1
Reporter: Michael Kjellman


Much like CASSANDRA-4208, BOF (BulkOutputFormat) should support outputting to 
multiple column families. The current approach taken in the patch for COF 
(ColumnFamilyOutputFormat) results in only one stream being sent.



[jira] [Comment Edited] (CASSANDRA-4679) Fix binary protocol NEW_NODE event

2012-11-05 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490699#comment-13490699
 ] 

Brandon Williams edited comment on CASSANDRA-4679 at 11/5/12 4:11 PM:
--

Reopening because (at least sometimes) this allows thrift to start before MS 
and gossip, which is wrong; other things, such as bootstrap, also seem to be 
breaking.

  was (Author: brandon.williams):
Reopening because (at least sometimes) this allows thrift to start before 
MS and gossip, which breaks all kinds of things.
  
> Fix binary protocol NEW_NODE event
> --
>
> Key: CASSANDRA-4679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4679
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0 beta 2
>
> Attachments: 0001-4679.txt, 
> 0002-Start-RPC-binary-protocol-before-gossip.txt, 
> 0003-Remove-hardcoded-initServer-from-AntiEntropyServiceTes.txt
>
>
> As discussed on CASSANDRA-4480, the NEW_NODE/REMOVED_NODE of the binary 
> protocol are not correctly fired (NEW_NODE is fired on node UP basically). 
> This ticket is to fix that.



[jira] [Comment Edited] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Henrik Ring (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490697#comment-13490697
 ] 

Henrik Ring edited comment on CASSANDRA-4909 at 11/5/12 4:09 PM:
-

Working at the ApacheConEU Hackathon. Suggestions for testing the patch.
Below is my understanding of the issue and the solution as well as a suggestion 
for how to test it.

The patch changes the code so that the exception
InvalidRequestException("Indexes on collections are no yet supported")
is raised if an attempt is made to create an index on a collection type column.

Creation of indexes on tables that have collection column values is allowed as 
long as 
the collection columns are NOT part of the index. 

Tests:
In CQL3 create CF with 2 collection type columns "A1" and "A2" and 2 
noncollection type columns "B1" and "B2".
Create index on A1 or A2 should fail. 
Create an index on B1 or B2 should succeed.
Insert data.
Select on column with index to ensure index is in place.

-- Should succeed
CREATE KEYSPACE TEST_CASSANDRA_4909_KS;

-- Should succeed
USE TEST_CASSANDRA_4909_KS; 

-- Should succeed
CREATE TABLE TEST_CASSANDRA_4909_TLB1
   (A1 set<text>,
    A2 set<text>,
    B1 text,
    B2 text,
    PRIMARY KEY (B1));

-- Should fail: InvalidRequestException("Indexes on collections are no yet supported")
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX1 ON TEST_CASSANDRA_4909_TLB1 (A1);

-- Should Succeed:
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX2 ON TEST_CASSANDRA_4909_TLB1 (B2);

-- Wait for schema agreement.

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW1-ELM-1', 'A1-ROW1-ELM2'}, {'A2-ROW1-ELM-1', 
'A2-ROW1-ELM2'}, 'B1-ROW1', 'B2-ROW1' );

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW2-ELM-1', 'A1-ROW2-ELM2'}, {'A2-ROW2-ELM-1', 
'A2-ROW2-ELM2'}, 'B1-ROW2', 'B2-ROW2' );

-- Should succeed (ensure index is in place):
SELECT * FROM TEST_CASSANDRA_4909_TLB1 WHERE B2 = 'B2-ROW2';

-- Should succeed:
DROP TABLE TEST_CASSANDRA_4909_TLB1;


  was (Author: henrikring):
Working at the ApacheConEU Hackathon. Suggestions for testing the patch.
Below is my understanding of the issue and the solution and a suggestion for 
how to test it.

The patch changes the code so that the exception
InvalidRequestException("Indexes on collections are no yet supported")
is raised if an attempt is made to create an index on a collection type column.

Creation of indexes on tables that have collection column values is allowed as 
long as 
the collection columns are NOT part of the index. 

Tests:
In CQL3 create CF with 2 collection type columns "A1" and "A2" and 2 
noncollection type columns "B1" and "B2".
Create index on A1 or A2 should fail. 
Create an index on B1 or B2 should succeed.
Insert data.
Select on column with index to ensure index is in place.

-- Should succeed
CREATE KEYSPACE TEST_CASSANDRA_4909_KS;

-- Should succeed
USE TEST_CASSANDRA_4909_KS; 

-- Should succeed
CREATE TABLE TEST_CASSANDRA_4909_TLB1
   (A1 set<text>,
    A2 set<text>,
    B1 text,
    B2 text,
    PRIMARY KEY (B1));

-- Should fail: InvalidRequestException("Indexes on collections are no yet supported")
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX1 ON TEST_CASSANDRA_4909_TLB1 (A1);

-- Should Succeed:
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX2 ON TEST_CASSANDRA_4909_TLB1 (B2);

-- Wait for schema agreement.

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW1-ELM-1', 'A1-ROW1-ELM2'}, {'A2-ROW1-ELM-1', 
'A2-ROW1-ELM2'}, 'B1-ROW1', 'B2-ROW1' );

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW2-ELM-1', 'A1-ROW2-ELM2'}, {'A2-ROW2-ELM-1', 
'A2-ROW2-ELM2'}, 'B1-ROW2', 'B2-ROW2' );

-- Should succeed (ensure index is in place):
SELECT * FROM TEST_CASSANDRA_4909_TLB1 WHERE B2 = 'B2-ROW2';

-- Should succeed:
DROP TABLE TEST_CASSANDRA_4909_TLB1;

  
> Bug when composite index is created in a table having collections
> -
>
> Key: CASSANDRA-4909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4909
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.2.0 beta 2
>
> Attachments: 4909.txt
>
>
> CASSANDRA-4511 is open to add proper indexing of collection, but currently 
> indexing doesn't work correctly if we index a value in a table having 
> collection, even if that value is not a collection itself.
> We also don't refuse creating index on collections, even though we don't 
> support it. Attaching patch to fix both.


[jira] [Comment Edited] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Henrik Ring (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490697#comment-13490697
 ] 

Henrik Ring edited comment on CASSANDRA-4909 at 11/5/12 4:08 PM:
-

Working at the ApacheConEU Hackathon. Suggestions for testing the patch.
Below is my understanding of the issue and the solution and a suggestion for 
how to test it.

The patch changes the code so that the exception
InvalidRequestException("Indexes on collections are no yet supported")
is raised if an attempt is made to create an index on a collection type column.

Creation of indexes on tables that have collection column values is allowed as 
long as 
the collection columns are NOT part of the index. 

Tests:
In CQL3 create CF with 2 collection type columns "A1" and "A2" and 2 
noncollection type columns "B1" and "B2".
Create index on A1 or A2 should fail. 
Create an index on B1 or B2 should succeed.
Insert data.
Select on column with index to ensure index is in place.

-- Should succeed
CREATE KEYSPACE TEST_CASSANDRA_4909_KS;

-- Should succeed
USE TEST_CASSANDRA_4909_KS; 

-- Should succeed
CREATE TABLE TEST_CASSANDRA_4909_TLB1
   (A1 set<text>,
    A2 set<text>,
    B1 text,
    B2 text,
    PRIMARY KEY (B1));

-- Should fail: InvalidRequestException("Indexes on collections are no yet supported")
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX1 ON TEST_CASSANDRA_4909_TLB1 (A1);

-- Should Succeed:
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX2 ON TEST_CASSANDRA_4909_TLB1 (B2);

-- Wait for schema agreement.

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW1-ELM-1', 'A1-ROW1-ELM2'}, {'A2-ROW1-ELM-1', 
'A2-ROW1-ELM2'}, 'B1-ROW1', 'B2-ROW1' );

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW2-ELM-1', 'A1-ROW2-ELM2'}, {'A2-ROW2-ELM-1', 
'A2-ROW2-ELM2'}, 'B1-ROW2', 'B2-ROW2' );

-- Should succeed (ensure index is in place):
SELECT * FROM TEST_CASSANDRA_4909_TLB1 WHERE B2 = 'B2-ROW2';

-- Should succeed:
DROP TABLE TEST_CASSANDRA_4909_TLB1;


  was (Author: henrikring):
Working at the ApacheConEU Hackathon. Suggestions for testing the patch.

The patch changes the code so that the exception
InvalidRequestException("Indexes on collections are no yet supported")
is raised if an attempt is made to create an index on a collection type column.

Creation of indexes on tables that have collection column values is allowed as 
long as 
the collection columns are NOT part of the index. 

Tests:
In CQL3 create CF with 2 collection type columns "A1" and "A2" and 2 
noncollection type columns "B1" and "B2".
Create index on A1 or A2 should fail. 
Create an index on B1 or B2 should succeed.
Insert data.
Select on column with index to ensure index is in place.

-- Should succeed
CREATE KEYSPACE TEST_CASSANDRA_4909_KS;

-- Should succeed
USE TEST_CASSANDRA_4909_KS; 

-- Should succeed
CREATE TABLE TEST_CASSANDRA_4909_TLB1
   (A1 set<text>,
    A2 set<text>,
    B1 text,
    B2 text,
    PRIMARY KEY (B1));

-- Should fail: InvalidRequestException("Indexes on collections are no yet supported")
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX1 ON TEST_CASSANDRA_4909_TLB1 (A1);

-- Should Succeed:
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX2 ON TEST_CASSANDRA_4909_TLB1 (B2);

-- Wait for schema agreement.

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW1-ELM-1', 'A1-ROW1-ELM2'}, {'A2-ROW1-ELM-1', 
'A2-ROW1-ELM2'}, 'B1-ROW1', 'B2-ROW1' );

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW2-ELM-1', 'A1-ROW2-ELM2'}, {'A2-ROW2-ELM-1', 
'A2-ROW2-ELM2'}, 'B1-ROW2', 'B2-ROW2' );

-- Should succeed (ensure index is in place):
SELECT * FROM TEST_CASSANDRA_4909_TLB1 WHERE B2 = 'B2-ROW2';

-- Should succeed:
DROP TABLE TEST_CASSANDRA_4909_TLB1;

  
> Bug when composite index is created in a table having collections
> -
>
> Key: CASSANDRA-4909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4909
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.2.0 beta 2
>
> Attachments: 4909.txt
>
>
> CASSANDRA-4511 is open to add proper indexing of collection, but currently 
> indexing doesn't work correctly if we index a value in a table having 
> collection, even if that value is not a collection itself.
> We also don't refuse creating index on collections, even though we don't 
> support it. Attaching patch to fix both.


git commit: Re-allow order by on non-selected columns

2012-11-05 Thread slebresne
Updated Branches:
  refs/heads/trunk f8e52ea98 -> 811359db0


Re-allow order by on non-selected columns

patch by slebresne; reviewed by jbellis for CASSANDRA-4645


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/811359db
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/811359db
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/811359db

Branch: refs/heads/trunk
Commit: 811359db0adeb3ac312708840379497ea950a0a4
Parents: f8e52ea
Author: Sylvain Lebresne 
Authored: Mon Nov 5 17:05:45 2012 +0100
Committer: Sylvain Lebresne 
Committed: Mon Nov 5 17:05:45 2012 +0100

--
 CHANGES.txt                                        |    1 +
 .../cassandra/cql3/statements/SelectStatement.java |    6 ++++--
 2 files changed, 5 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/811359db/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 75416d5..89c046e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -50,6 +50,7 @@
  * Fix short read protection for CQL3 (CASSANDRA-4882)
  * Add tracing support to the binary protocol (CASSANDRA-4699)
  * Don't allow prepared marker inside collections (CASSANDRA-4890)
+ * Re-allow order by on non-selected columns (CASSANDRA-4645)
 Merged from 1.1:
  * add get[Row|Key]CacheEntries to CacheServiceMBean (CASSANDRA-4859)
  * fix get_paged_slice to wrap to next row correctly (CASSANDRA-4816)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/811359db/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index 365d7cb..945d2e8 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -1180,8 +1180,10 @@ public class SelectStatement implements CQLStatement
 if (stmt.isKeyRange)
 throw new InvalidRequestException("ORDER BY is only supported when the partition key is restricted by an EQ or an IN.");
 
-// check if we are trying to order by column that wouldn't be included in the results
-if (!stmt.selectedNames.isEmpty()) // empty means wildcard was used
+// If we order an IN query, we'll have to do a manual sort post-query. Currently, this sorting requires that we
+// have queried the column on which we sort (TODO: we should update it to add the column on which we sort to the one
+// queried automatically, and then removing it from the resultSet afterwards if needed)
+if (stmt.keyIsInRelation && !stmt.selectedNames.isEmpty()) // empty means wildcard was used
 {
 for (ColumnIdentifier column : stmt.parameters.orderings.keySet())
 {



[jira] [Commented] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490698#comment-13490698
 ] 

Jonathan Ellis commented on CASSANDRA-4909:
---

+1

> Bug when composite index is created in a table having collections
> -
>
> Key: CASSANDRA-4909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4909
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.2.0 beta 2
>
> Attachments: 4909.txt
>
>
> CASSANDRA-4511 is open to add proper indexing of collection, but currently 
> indexing doesn't work correctly if we index a value in a table having 
> collection, even if that value is not a collection itself.
> We also don't refuse creating index on collections, even though we don't 
> support it. Attaching patch to fix both.



[jira] [Reopened] (CASSANDRA-4679) Fix binary protocol NEW_NODE event

2012-11-05 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reopened CASSANDRA-4679:
-


Reopening because (at least sometimes) this allows thrift to start before MS 
and gossip, which breaks all kinds of things.

> Fix binary protocol NEW_NODE event
> --
>
> Key: CASSANDRA-4679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4679
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0 beta 2
>
> Attachments: 0001-4679.txt, 
> 0002-Start-RPC-binary-protocol-before-gossip.txt, 
> 0003-Remove-hardcoded-initServer-from-AntiEntropyServiceTes.txt
>
>
> As discussed on CASSANDRA-4480, the NEW_NODE/REMOVED_NODE of the binary 
> protocol are not correctly fired (NEW_NODE is fired on node UP basically). 
> This ticket is to fix that.



[jira] [Updated] (CASSANDRA-4834) Old-style mapred interface only populates row key for first column when using wide rows

2012-11-05 Thread Ben Kempe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kempe updated CASSANDRA-4834:
-

Attachment: (was: cassandra-1.1-CASSANDRA-4834.txt)

> Old-style mapred interface only populates row key for first column when using 
> wide rows
> ---
>
> Key: CASSANDRA-4834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 1.1.0
>Reporter: Ben Kempe
>Assignee: Ben Kempe
>Priority: Minor
> Fix For: 1.1.7
>
> Attachments: cassandra-1.1-CASSANDRA-4834.txt, TestJob.java, 
> TestJobOldHadoop.java, trunk-CASSANDRA-4834.txt
>
>
> When using the ColumnFamilyRecordReader with the old-style Hadoop interface 
> to iterate over wide row columns, the row key is only populated on the first 
> column.
> See attached tests.



[jira] [Commented] (CASSANDRA-4909) Bug when composite index is created in a table having collections

2012-11-05 Thread Henrik Ring (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490697#comment-13490697
 ] 

Henrik Ring commented on CASSANDRA-4909:


Working at the ApacheConEU Hackathon. Suggestions for testing the patch.

The patch changes the code so that the exception
InvalidRequestException("Indexes on collections are no yet supported")
is raised if an attempt is made to create an index on a collection type column.

Creation of indexes on tables that have collection column values is allowed as 
long as 
the collection columns are NOT part of the index. 

Tests:
In CQL3 create CF with 2 collection type columns "A1" and "A2" and 2 
noncollection type columns "B1" and "B2".
Create index on A1 or A2 should fail. 
Create an index on B1 or B2 should succeed.
Insert data.
Select on column with index to ensure index is in place.

-- Should succeed
CREATE KEYSPACE TEST_CASSANDRA_4909_KS;

-- Should succeed
USE TEST_CASSANDRA_4909_KS; 

-- Should succeed
CREATE TABLE TEST_CASSANDRA_4909_TLB1
   (A1 set<text>,
    A2 set<text>,
    B1 text,
    B2 text,
    PRIMARY KEY (B1));

-- Should fail: InvalidRequestException("Indexes on collections are no yet supported")
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX1 ON TEST_CASSANDRA_4909_TLB1 (A1);

-- Should Succeed:
CREATE INDEX TEST_CASSANDRA_4909_TLB1_INX2 ON TEST_CASSANDRA_4909_TLB1 (B2);

-- Wait for schema agreement.

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW1-ELM-1', 'A1-ROW1-ELM2'}, {'A2-ROW1-ELM-1', 
'A2-ROW1-ELM2'}, 'B1-ROW1', 'B2-ROW1' );

-- Should succeed:
INSERT INTO TEST_CASSANDRA_4909_TLB1 (A1, A2, B1, B2)
   VALUES({'A1-ROW2-ELM-1', 'A1-ROW2-ELM2'}, {'A2-ROW2-ELM-1', 
'A2-ROW2-ELM2'}, 'B1-ROW2', 'B2-ROW2' );

-- Should succeed (ensure index is in place):
SELECT * FROM TEST_CASSANDRA_4909_TLB1 WHERE B2 = 'B2-ROW2';

-- Should succeed:
DROP TABLE TEST_CASSANDRA_4909_TLB1;
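
The rule exercised by the test plan above can be sketched in plain Java. This is only an illustration under stated assumptions: `IndexValidationSketch`, `ColumnKind`, `validateIndexTarget`, `testTable` and `isRejected` are hypothetical names, not Cassandra classes, and `IllegalArgumentException` stands in for `InvalidRequestException`. Only the error message is taken from the comment, typo included.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the validation the patch is described as adding.
// None of these names come from the Cassandra code base.
class IndexValidationSketch {
    enum ColumnKind { REGULAR, COLLECTION }

    // Rejects an index whose target is a collection column. A table that
    // merely contains collection columns is accepted.
    static void validateIndexTarget(Map<String, ColumnKind> table, String column) {
        ColumnKind kind = table.get(column);
        if (kind == null)
            throw new IllegalArgumentException("Unknown column " + column);
        if (kind == ColumnKind.COLLECTION)
            // Message copied verbatim from the comment above.
            throw new IllegalArgumentException("Indexes on collections are no yet supported");
    }

    // Same shape as the test table: two collection columns, two regular ones.
    static Map<String, ColumnKind> testTable() {
        Map<String, ColumnKind> table = new LinkedHashMap<>();
        table.put("A1", ColumnKind.COLLECTION);
        table.put("A2", ColumnKind.COLLECTION);
        table.put("B1", ColumnKind.REGULAR);
        table.put("B2", ColumnKind.REGULAR);
        return table;
    }

    // Convenience wrapper: true when the index request is refused.
    static boolean isRejected(Map<String, ColumnKind> table, String column) {
        try {
            validateIndexTarget(table, column);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, ColumnKind> table = testTable();
        System.out.println("index on A1 rejected: " + isRejected(table, "A1"));
        System.out.println("index on B2 rejected: " + isRejected(table, "B2"));
    }
}
```

The sketch covers only the "refuse indexes on collections" half of the patch; the fix for indexing a regular column in a table that has collections is not modelled here.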


> Bug when composite index is created in a table having collections
> -
>
> Key: CASSANDRA-4909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4909
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.2.0 beta 2
>
> Attachments: 4909.txt
>
>
> CASSANDRA-4511 is open to add proper indexing of collection, but currently 
> indexing doesn't work correctly if we index a value in a table having 
> collection, even if that value is not a collection itself.
> We also don't refuse creating index on collections, even though we don't 
> support it. Attaching patch to fix both.



[jira] [Updated] (CASSANDRA-4834) Old-style mapred interface only populates row key for first column when using wide rows

2012-11-05 Thread Ben Kempe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kempe updated CASSANDRA-4834:
-

Attachment: cassandra-1.1-CASSANDRA-4834.txt

added patch for cassandra-1.1 branch

> Old-style mapred interface only populates row key for first column when using 
> wide rows
> ---
>
> Key: CASSANDRA-4834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 1.1.0
>Reporter: Ben Kempe
>Assignee: Ben Kempe
>Priority: Minor
> Fix For: 1.1.7
>
> Attachments: cassandra-1.1-CASSANDRA-4834.txt, TestJob.java, 
> TestJobOldHadoop.java, trunk-CASSANDRA-4834.txt
>
>
> When using the ColumnFamilyRecordReader with the old-style Hadoop interface 
> to iterate over wide row columns, the row key is only populated on the first 
> column.
> See attached tests.



[jira] [Updated] (CASSANDRA-4834) Old-style mapred interface only populates row key for first column when using wide rows

2012-11-05 Thread Ben Kempe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kempe updated CASSANDRA-4834:
-

Comment: was deleted

(was: added patch for cassandra-1.1 branch)

> Old-style mapred interface only populates row key for first column when using 
> wide rows
> ---
>
> Key: CASSANDRA-4834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 1.1.0
>Reporter: Ben Kempe
>Assignee: Ben Kempe
>Priority: Minor
> Fix For: 1.1.7
>
> Attachments: cassandra-1.1-CASSANDRA-4834.txt, TestJob.java, 
> TestJobOldHadoop.java, trunk-CASSANDRA-4834.txt
>
>
> When using the ColumnFamilyRecordReader with the old-style Hadoop interface 
> to iterate over wide row columns, the row key is only populated on the first 
> column.
> See attached tests.



[jira] [Comment Edited] (CASSANDRA-4834) Old-style mapred interface only populates row key for first column when using wide rows

2012-11-05 Thread Ben Kempe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490693#comment-13490693
 ] 

Ben Kempe edited comment on CASSANDRA-4834 at 11/5/12 3:49 PM:
---

added patch for cassandra-1.1 branch

  was (Author: bkempe):
patch for cassandra-1.1 branch
  
> Old-style mapred interface only populates row key for first column when using 
> wide rows
> ---
>
> Key: CASSANDRA-4834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 1.1.0
>Reporter: Ben Kempe
>Assignee: Ben Kempe
>Priority: Minor
> Fix For: 1.1.7
>
> Attachments: cassandra-1.1-CASSANDRA-4834.txt, TestJob.java, 
> TestJobOldHadoop.java, trunk-CASSANDRA-4834.txt
>
>
> When using the ColumnFamilyRecordReader with the old-style Hadoop interface 
> to iterate over wide row columns, the row key is only populated on the first 
> column.
> See attached tests.



[jira] [Updated] (CASSANDRA-4834) Old-style mapred interface only populates row key for first column when using wide rows

2012-11-05 Thread Ben Kempe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Kempe updated CASSANDRA-4834:
-

Attachment: cassandra-1.1-CASSANDRA-4834.txt

patch for cassandra-1.1 branch

> Old-style mapred interface only populates row key for first column when using 
> wide rows
> ---
>
> Key: CASSANDRA-4834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 1.1.0
>Reporter: Ben Kempe
>Assignee: Ben Kempe
>Priority: Minor
> Fix For: 1.1.7
>
> Attachments: cassandra-1.1-CASSANDRA-4834.txt, TestJob.java, 
> TestJobOldHadoop.java, trunk-CASSANDRA-4834.txt
>
>
> When using the ColumnFamilyRecordReader with the old-style Hadoop interface 
> to iterate over wide row columns, the row key is only populated on the first 
> column.
> See attached tests.



[jira] [Commented] (CASSANDRA-4645) (CQL3) Re-allow order by on non-selected columns

2012-11-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490687#comment-13490687
 ] 

Jonathan Ellis commented on CASSANDRA-4645:
---

+1

> (CQL3) Re-allow order by on non-selected columns
> 
>
> Key: CASSANDRA-4645
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4645
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0 beta 2
>
> Attachments: 4645.txt
>
>
> CASSANDRA-4612 added a limitation to ORDER BY queries in that it requires the 
> columns in the ORDER BY to be in the select clause, which wasn't 
> the case previously.
> The reason is that for ORDER BY with IN queries, the sorting is done 
> post-query, and by the time we do the ordering, we've already cut down the 
> result set to the select clause, so if the columns are not in the select 
> clause we cannot sort on them.
> We should remove that limitation, however, as this is a regression from 
> what we had before. As far as 1.2.0 is concerned, at the very least we should 
> lift the limitation for EQ queries, since we don't do any post-query sorting 
> in that case and that was working correctly pre-CASSANDRA-4612. But we should 
> also remove that limitation for IN queries, even if that happens later.
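
The distinction between EQ and IN queries described above can be illustrated with a small Java sketch. It is not the SelectStatement code: `OrderBySketch`, `orderingAllowed`, `sortPostQuery` and `row` are hypothetical names, rows are modelled as plain maps of column name to integer value, and the check only mirrors the rule as the ticket states it (the restriction applies to IN queries with an explicit select list).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the rule in the ticket; not Cassandra code.
class OrderBySketch {
    // EQ queries need no post-query sort, so any ORDER BY column is fine.
    // IN queries are sorted after projection, so the column must have been
    // selected, unless the select list is empty (meaning a wildcard).
    static boolean orderingAllowed(boolean keyIsInRelation, Set<String> selectedNames, String orderColumn) {
        if (!keyIsInRelation)
            return true;
        if (selectedNames.isEmpty())
            return true;
        return selectedNames.contains(orderColumn);
    }

    // Post-query sort on already-projected rows. If orderColumn was not
    // selected, row.get returns null and the comparison fails, which is
    // why the check above exists.
    static List<Map<String, Integer>> sortPostQuery(List<Map<String, Integer>> rows, String orderColumn) {
        List<Map<String, Integer>> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((Map<String, Integer> row) -> row.get(orderColumn)));
        return sorted;
    }

    // Helper to build a two-column row for the example.
    static Map<String, Integer> row(String c1, int v1, String c2, int v2) {
        Map<String, Integer> r = new HashMap<>();
        r.put(c1, v1);
        r.put(c2, v2);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> rows = new ArrayList<>();
        rows.add(row("a", 1, "b", 9));
        rows.add(row("a", 2, "b", 3));
        System.out.println(sortPostQuery(rows, "b"));
    }
}
```

Lifting the restriction for IN queries as well would mean fetching the ORDER BY column internally and dropping it from the result set after sorting; the sketch does not attempt that.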



[jira] [Comment Edited] (CASSANDRA-4905) Repair should exclude gcable tombstones from merkle-tree computation

2012-11-05 Thread Christian Spriegel (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490106#comment-13490106
 ] 

Christian Spriegel edited comment on CASSANDRA-4905 at 11/5/12 1:34 PM:


-Also I wonder if an expired column could create the digest of a tombstone if it 
is timed out.-

-If a gcable tombstone does not alter the digest, then a timed-out 
ExpiredColumn should behave like a tombstone.-

Edit: It already works like this

  was (Author: christianmovi):
Also I wonder if an expired column could create the digest of a tombstone if 
it is timed out.

If a gcable tombstone does not alter the digest, then a timed-out ExpiredColumn 
should behave like a tombstone.
  
> Repair should exclude gcable tombstones from merkle-tree computation
> 
>
> Key: CASSANDRA-4905
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4905
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Christian Spriegel
>
> Currently gcable tombstones get repaired if some replicas have already 
> compacted them away but others have not.
> This could be avoided by ignoring all gcable tombstones during merkle tree 
> calculation.
> This was discussed with Sylvain on the mailing list:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html
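
A rough Java sketch of the idea follows. It assumes, for illustration only, that a tombstone counts as gcable when its local deletion time is earlier than a `gcBefore` threshold; `TombstoneDigestSketch`, `Cell`, `live`, `tombstone`, `isGcable` and `digest` are hypothetical names, and the hashing is a simplification rather than Cassandra's merkle-tree code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: skip gcable tombstones when hashing a row.
class TombstoneDigestSketch {
    static final class Cell {
        final String name;
        final String value;          // null for tombstones
        final boolean tombstone;
        final int localDeletionTime; // only meaningful for tombstones

        Cell(String name, String value, boolean tombstone, int localDeletionTime) {
            this.name = name;
            this.value = value;
            this.tombstone = tombstone;
            this.localDeletionTime = localDeletionTime;
        }
    }

    static Cell live(String name, String value) {
        return new Cell(name, value, false, Integer.MAX_VALUE);
    }

    static Cell tombstone(String name, int localDeletionTime) {
        return new Cell(name, null, true, localDeletionTime);
    }

    // Assumption: gcable means "deleted before the gcBefore threshold".
    static boolean isGcable(Cell cell, int gcBefore) {
        return cell.tombstone && cell.localDeletionTime < gcBefore;
    }

    // Hashes every cell except gcable tombstones, so a replica that has
    // already compacted them away produces the same digest as one that has not.
    static byte[] digest(List<Cell> cells, int gcBefore) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        for (Cell cell : cells) {
            if (isGcable(cell, gcBefore))
                continue;
            md.update(cell.name.getBytes(StandardCharsets.UTF_8));
            md.update((byte) (cell.tombstone ? 1 : 0));
            if (cell.value != null)
                md.update(cell.value.getBytes(StandardCharsets.UTF_8));
        }
        return md.digest();
    }

    public static void main(String[] args) {
        List<Cell> notCompacted = Arrays.asList(live("a", "1"), tombstone("b", 100));
        List<Cell> compacted = Arrays.asList(live("a", "1"));
        System.out.println(Arrays.equals(digest(notCompacted, 200), digest(compacted, 200)));
    }
}
```

With the skip in place the two replicas agree once the tombstone is past the threshold, so repair has nothing to stream for it; before the threshold the tombstone still contributes to the digest and differences are repaired as usual.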



[jira] [Commented] (CASSANDRA-4898) Authentication provider in Cassandra itself

2012-11-05 Thread Dirkjan Bussink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490597#comment-13490597
 ] 

Dirkjan Bussink commented on CASSANDRA-4898:


I've published the current work here:

https://github.com/nedap/cassandra-auth

The biggest issue with switching to CQL3 internally at the moment is that the 
processInternal API is different in 1.1.x and 1.2.x / trunk it seems. I haven't 
switched this over to CQL3 for that reason, since our current cluster that we 
run is running 1.1.5 at the moment. 

The same goes for moving it to using Map or Set collections.

I don't think this would be much work. I'll probably update it anyway when we 
switch to 1.2.x, which probably won't be long after the release.

> Authentication provider in Cassandra itself
> ---
>
> Key: CASSANDRA-4898
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4898
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 1.1.6
>Reporter: Dirkjan Bussink
>  Labels: authentication, authorization
>
> I've been working on an implementation for both IAuthority2 and 
> IAuthenticator that uses Cassandra itself to store the necessary credentials. 
> I'm planning on open sourcing this shortly.
> Is there any interest in this? It tries to provide reasonable security, for 
> example using PBKDF2 to store passwords with a configurable iteration count 
> and managing all the rights available in IAuthority2. 
> My main goal isn't security / confidentiality of the data, but more that 
> I don't want multiple consumers of the cluster to accidentally screw stuff 
> up. Only certain users can write data, others can read it out again and 
> further process it.
> I'm planning on releasing this soon under an open source license (probably 
> the same as Cassandra itself). Would there be interest in incorporating it as 
> a new reference implementation instead of the properties file implementation 
> perhaps? Or would it be better for me to maintain it separately? I would love 
> it if people from the community would review it, since I have been dabbling 
> in the Cassandra source code only for a short while now.
> During the development of this I've encountered a few bumps and I wonder 
> whether they could be addressed or not.
> = Moment when validateConfiguration() runs =
> Is there a deliberate reason that validateConfiguration() is executed before 
> all information about keyspaces, column families etc. is available? In the 
> current form I therefore can't validate whether column families etc. are 
> available for authentication since they aren't loaded yet.
> I wanted to use this to make relatively easy bootstrapping possible. My 
> approach here would be to only enable authentication if the needed keyspace 
> is available. This allows for configuring the cluster, then importing the 
> necessary authentication data for an admin user to bootstrap further, and 
> then restarting every node in the cluster.
> Basically the questions here are, can the moment when validateConfiguration() 
> runs for an authentication provider be changed? Is this approach to 
> bootstrapping reasonable or do people have better ideas?
> = AbstractReplicationStrategy has package visible constructor =
> I've added a strategy that basically says that data should be available on 
> all nodes. The amount of data used for authentication is very limited. 
> Replicating it to every node is therefore not very problematic and allows 
> every node to have all data locally available for verifying requests.
> I wanted to put this strategy into its own package inside the authentication 
> module, but since the constructor of AbstractReplicationStrategy has no 
> visibility explicitly marked, it's only available inside the same package.
> I'm not sure whether implementing a strategy to replicate data to all nodes 
> is a sane idea and whether my implementation of this strategy is correct. 
> What do you people think of this? Would people want to review the 
> implementation?
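
The description above mentions storing passwords with PBKDF2. As a rough illustration of what that can look like with the JDK alone, here is a minimal sketch. It assumes that the "configurable" part refers to the PBKDF2 iteration count; the class name Pbkdf2Sketch, the 16-byte salt and the 160-bit output length are assumptions made for this example and are not taken from the cassandra-auth code.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper, not part of Cassandra or of the published module.
final class Pbkdf2Sketch {

    /** Generates a random 16-byte salt (the size is an assumption for this sketch). */
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Derives a 160-bit hash from the password using the given salt and iteration count. */
    static byte[] hash(char[] password, byte[] salt, int iterations) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 160);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                                   .generateSecret(spec)
                                   .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available on this JVM", e);
        } finally {
            // Wipe the copy of the password held by the key spec.
            spec.clearPassword();
        }
    }

    /** Recomputes the hash and compares it with the stored one. */
    static boolean verify(char[] password, byte[] salt, int iterations, byte[] expected) {
        return MessageDigest.isEqual(hash(password, salt, iterations), expected);
    }
}
```

The salt and the iteration count would have to be stored next to the hash so that verify() can recompute it later; raising the iteration count makes each guess more expensive for an attacker, at the cost of slower logins.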



[jira] [Commented] (CASSANDRA-4097) Classes in org.apache.cassandra.deps:avro:1.4.0-cassandra-1 clash with core Avro classes

2012-11-05 Thread rahul jain (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490585#comment-13490585
 ] 

rahul jain commented on CASSANDRA-4097:
---

Hi Andrew,

I am also facing the same issue. What was your resolution? Have you already 
submitted your change? Can you please share your fix?

Regards,
Rahul


> Classes in org.apache.cassandra.deps:avro:1.4.0-cassandra-1 clash with core 
> Avro classes
> 
>
> Key: CASSANDRA-4097
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4097
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Andrew Swan
>Priority: Minor
>
> Cassandra has this dependency:
> {code:title=build.xml}...
> <dependency groupId="org.apache.cassandra.deps" artifactId="avro" 
>  version="1.4.0-cassandra-1">
> ...{code}
> Unfortunately this JAR file contains classes in the {{org.apache.avro}} 
> package that are incompatible with classes of the same fully-qualified name 
> in the current release of Avro. For example, the inner class 
> {{org.apache.avro.Schema$Parser}} found in Avro 1.6.1 is missing from the 
> Cassandra version of that class. This makes it impossible to have both 
> Cassandra and the latest Avro version on the classpath (my use case is an 
> application that embeds Cassandra but also uses Avro 1.6.1 for unrelated 
> serialization purposes). A simple and risk-free solution would be to change 
> the package declaration of Cassandra's Avro classes from {{org.apache.avro}} 
> to (say) {{org.apache.cassandra.avro}}, assuming that the above dependency is 
> only used by Cassandra and no other projects (which seems a reasonable 
> assumption given its name).
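
A quick way to see which variant of a clashing class a given classpath provides is to probe for a member that only one variant has, such as the inner class named in the description. The following is a minimal sketch; ClasspathProbe is a hypothetical helper written for this example and is not part of Cassandra or Avro.

```java
// Hypothetical diagnostic helper; the class name is made up for this sketch.
final class ClasspathProbe {

    /**
     * Returns true if a class with the given binary name (for example
     * "org.apache.avro.Schema$Parser") can be loaded from the classpath.
     * The class is looked up without being initialized.
     */
    static boolean isPresent(String binaryName) {
        try {
            Class.forName(binaryName, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the binary class name to probe as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.avro.Schema$Parser";
        System.out.println(name + " present: " + isPresent(name));
    }
}
```

If the probe reports the inner class as missing while a recent Avro JAR is on the classpath, that suggests the older, repackaged classes are being loaded first.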



[jira] [Updated] (CASSANDRA-4645) (CQL3) Re-allow order by on non-selected columns

2012-11-05 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-4645:


Attachment: 4645.txt

Attaching trivial patch to fix the limitation introduced by CASSANDRA-4612. 
I.e. it doesn't force the order by columns to be selected, unless this is an IN 
query. I've opened CASSANDRA-4911 to remove the limitation for IN queries, but 
that part has never worked anyway so it's more of an improvement.

> (CQL3) Re-allow order by on non-selected columns
> 
>
> Key: CASSANDRA-4645
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4645
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.2.0 beta 1
>Reporter: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0 beta 2
>
> Attachments: 4645.txt
>
>
> CASSANDRA-4612 added a limitation to ORDER BY queries in that it requires the 
> columns in the ORDER BY to be in the select clause, while this wasn't 
> the case previously.
> The reason for that is that for ORDER BY with IN queries, the sorting is done 
> post-query, and by the time we do the ordering, we've already cut down the 
> result set to the select clause, so if the columns are not in the select 
> clause we cannot sort on them.
> We should remove that limitation however, as this is a regression from 
> what we had before. As far as 1.2.0 is concerned, at the very least we should 
> lift the limitation for EQ queries since we don't do any post-query sorting 
> in that case and that was working correctly pre-CASSANDRA-4612. But we should 
> also remove that limitation for IN queries, even if that happens later.



[jira] [Created] (CASSANDRA-4911) Lift limitation that order by columns must be selected for IN queries

2012-11-05 Thread Sylvain Lebresne (JIRA)
Sylvain Lebresne created CASSANDRA-4911:
---

 Summary: Lift limitation that order by columns must be selected 
for IN queries
 Key: CASSANDRA-4911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4911
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.2.0 beta 1
Reporter: Sylvain Lebresne
Priority: Minor
 Fix For: 1.2.1


This is the followup of CASSANDRA-4645. We should remove the limitation that, 
for IN queries, the columns on which you have an ORDER BY must be in the 
select clause.

For that, we'll need to automatically add the columns on which we have an ORDER 
BY to the ones queried internally, and remove them afterwards (once the sorting 
is done) from the resultSet.
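
The approach described above (query the ORDER BY column internally, sort post-query, then strip it from the result) can be sketched on an in-memory model. Everything here is hypothetical: rows are plain maps, the column names k, v and ts are made up, and none of the names correspond to Cassandra's actual query classes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model; not Cassandra code.
final class OrderByProjectionSketch {

    /** Builds a row with the illustrative columns k, v and ts. */
    static Map<String, Integer> row(int k, int v, int ts) {
        Map<String, Integer> r = new LinkedHashMap<>();
        r.put("k", k);
        r.put("v", v);
        r.put("ts", ts);
        return r;
    }

    /** Keeps only the requested columns of a row. */
    static Map<String, Integer> project(Map<String, Integer> row, Collection<String> columns) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (String c : columns) {
            out.put(c, row.get(c));
        }
        return out;
    }

    static List<Map<String, Integer>> selectOrdered(List<Map<String, Integer>> rows,
                                                    List<String> selected,
                                                    final String orderBy) {
        // 1. Internally query the selected columns plus the ORDER BY column.
        Set<String> internal = new LinkedHashSet<>(selected);
        internal.add(orderBy);
        List<Map<String, Integer>> fetched = new ArrayList<>();
        for (Map<String, Integer> r : rows) {
            fetched.add(project(r, internal));
        }
        // 2. Sort post-query; the ORDER BY column is still available here.
        fetched.sort((a, b) -> a.get(orderBy).compareTo(b.get(orderBy)));
        // 3. Strip the extra column so the result matches the select clause.
        List<Map<String, Integer>> result = new ArrayList<>();
        for (Map<String, Integer> r : fetched) {
            result.add(project(r, selected));
        }
        return result;
    }
}
```

If the ORDER BY column is already part of the select clause, step 1 adds nothing and step 3 leaves it in place, so the same code path covers both cases.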



[jira] [Updated] (CASSANDRA-4883) Optimize mostRecentTomstone vs maxTimestamp check in CollationController.collectAllData

2012-11-05 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-4883:


Attachment: 4883.txt

Patch attached. Note that the current CollationControllerTest demonstrates the 
improvement in the sense that, with this patch, collectAllData iterates over 
only 1 sstable, as collectTimeOrdered does (since the most recent sstable is 
also the one having the most recent row tombstone). I've however updated the 
test so that it exemplifies the difference between both methods.

> Optimize mostRecentTomstone vs maxTimestamp check in 
> CollationController.collectAllData
> ---
>
> Key: CASSANDRA-4883
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4883
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: 4883.txt
>
>
> CollationController.collectAllData eliminates a sstable if we've already read 
> a row tombstone more recent than its maxTimestamp. This is however done in 2 
> passes and can be inefficient (or rather, it's not as efficient as it could 
> be). More precisely, say we have 10 sstables s0, ... s9, where s0 is the most 
> recent and s9 the least recent (and their maxTimestamps reflect that), and s0 
> has a row tombstone that is more recent than all of the s1-s9 maxTimestamps. 
> Now in collectAllData(), we first iterate over sstables in a "random" order 
> (because DataTracker keeps sstables in a more or less random order), meaning 
> that we may iterate in the order s9, s8, ... s0. In that case, we will end up 
> reading the row header from all the sstables (hitting disk each time). Then, 
> and only then, the 2nd pass of collectAllData will eliminate s1 to s9.
> However, if we were to iterate over sstables in maxTimestamp order (as we do 
> in collectTimeOrdered), we would only need one pass but, more importantly, we 
> would minimize the number of row headers we read to perform that sstable 
> elimination. In my example, we would only ever read the row tombstone from 
> s0 and eliminate all other sstables directly, simply based on their 
> maxTimestamp.
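
The single-pass idea from the description can be sketched as follows. This is a simplified model, not the actual CollationController code: SSTable here is a hypothetical stand-in that only carries a maxTimestamp and an optional row tombstone timestamp, and headerReads counts how often the simulated row header is read.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the elimination logic; not Cassandra code.
final class TimeOrderedEliminationSketch {

    static final class SSTable {
        final String name;
        final long maxTimestamp;
        final long rowTombstoneTimestamp; // Long.MIN_VALUE when there is no row tombstone
        int headerReads = 0;              // counts simulated disk reads of the row header

        SSTable(String name, long maxTimestamp, long rowTombstoneTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
            this.rowTombstoneTimestamp = rowTombstoneTimestamp;
        }

        long readRowTombstone() {
            headerReads++;
            return rowTombstoneTimestamp;
        }
    }

    /** Returns the sstables that actually need to be read, most recent first. */
    static List<SSTable> sstablesToRead(List<SSTable> sstables) {
        List<SSTable> ordered = new ArrayList<>(sstables);
        // Most recent maxTimestamp first.
        ordered.sort((a, b) -> Long.compare(b.maxTimestamp, a.maxTimestamp));

        long mostRecentRowTombstone = Long.MIN_VALUE;
        List<SSTable> kept = new ArrayList<>();
        for (SSTable s : ordered) {
            // Everything after this point has an even older maxTimestamp,
            // so it is shadowed by the tombstone as well.
            if (s.maxTimestamp < mostRecentRowTombstone) {
                break;
            }
            mostRecentRowTombstone = Math.max(mostRecentRowTombstone, s.readRowTombstone());
            kept.add(s);
        }
        return kept;
    }
}
```

With the s0 ... s9 example from the description, only the row header of s0 is read; s1 to s9 are skipped based on their maxTimestamp alone.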



[jira] [Assigned] (CASSANDRA-4883) Optimize mostRecentTomstone vs maxTimestamp check in CollationController.collectAllData

2012-11-05 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne reassigned CASSANDRA-4883:
---

Assignee: Sylvain Lebresne

> Optimize mostRecentTomstone vs maxTimestamp check in 
> CollationController.collectAllData
> ---
>
> Key: CASSANDRA-4883
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4883
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: 4883.txt
>
>
> CollationController.collectAllData eliminates a sstable if we've already read 
> a row tombstone more recent than its maxTimestamp. This is however done in 2 
> passes and can be inefficient (or rather, it's not as efficient as it could 
> be). More precisely, say we have 10 sstables s0, ... s9, where s0 is the most 
> recent and s9 the least recent (and their maxTimestamps reflect that), and s0 
> has a row tombstone that is more recent than all of the s1-s9 maxTimestamps. 
> Now in collectAllData(), we first iterate over sstables in a "random" order 
> (because DataTracker keeps sstables in a more or less random order), meaning 
> that we may iterate in the order s9, s8, ... s0. In that case, we will end up 
> reading the row header from all the sstables (hitting disk each time). Then, 
> and only then, the 2nd pass of collectAllData will eliminate s1 to s9.
> However, if we were to iterate over sstables in maxTimestamp order (as we do 
> in collectTimeOrdered), we would only need one pass but, more importantly, we 
> would minimize the number of row headers we read to perform that sstable 
> elimination. In my example, we would only ever read the row tombstone from 
> s0 and eliminate all other sstables directly, simply based on their 
> maxTimestamp.



[jira] [Commented] (CASSANDRA-4803) CFRR wide row iterators improvements

2012-11-05 Thread Piotr Kołaczkowski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490548#comment-13490548
 ] 

Piotr Kołaczkowski commented on CASSANDRA-4803:
---

Hold off on applying patch 2 for a while. We just discovered that it breaks 
running Hive queries during a rolling upgrade. We need to fall back to the old 
describe_splits method if describe_splits_ex is not found.
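
The fallback mentioned above could take roughly the following shape. This is only a sketch under stated assumptions: SplitClient is a hypothetical stand-in for the real Thrift client, splits are represented as plain strings, and an UnsupportedOperationException stands in for whatever error the real client raises when a node running an older version does not know describe_splits_ex.

```java
import java.util.List;

// Hypothetical stand-in types; not the Thrift-generated Cassandra client.
final class SplitFallbackSketch {

    interface SplitClient {
        /** Newer call; assumed to throw UnsupportedOperationException on nodes that lack it. */
        List<String> describeSplitsEx(String startToken, String endToken, int keysPerSplit);

        /** Older call that every node understands. */
        List<String> describeSplits(String startToken, String endToken, int keysPerSplit);
    }

    /** Tries the newer call first and falls back to the older one if it is not found. */
    static List<String> splits(SplitClient client, String startToken, String endToken, int keysPerSplit) {
        try {
            return client.describeSplitsEx(startToken, endToken, keysPerSplit);
        } catch (UnsupportedOperationException e) {
            // The node has not been upgraded yet, so use the old method.
            return client.describeSplits(startToken, endToken, keysPerSplit);
        }
    }
}
```

During a rolling upgrade the same job may talk to both old and new nodes, so the fallback has to be decided per call rather than once per job.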

> CFRR wide row iterators improvements
> 
>
> Key: CASSANDRA-4803
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4803
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 1.1.0
>Reporter: Piotr Kołaczkowski
>Assignee: Piotr Kołaczkowski
> Fix For: 1.1.7, 1.2.0
>
> Attachments: 0001-Wide-row-iterator-counts-rows-not-columns.patch, 
> 0002-Fixed-bugs-in-describe_splits.-CFRR-uses-row-counts-.patch, 
> 0003-Fixed-get_paged_slice-memtable-and-sstable-column-it.patch, 
> 0004-Better-token-range-wrap-around-handling-in-CFIF-CFRR.patch, 
> 0005-Fixed-handling-of-start_key-end_token-in-get_range_s.patch, 
> 0006-Code-cleanup-refactoring-in-CFRR.-Fixed-bug-with-mis.patch
>
>
> {code}
>     public float getProgress()
>     {
>         // TODO this is totally broken for wide rows
>         // the progress is likely to be reported slightly off the actual but close enough
>         float progress = ((float) iter.rowsRead() / totalRowCount);
>         return progress > 1.0F ? 1.0F : progress;
>     }
> {code}
> The problem is iter.rowsRead() does not return the number of rows read from 
> the wide row iterator, but returns number of *columns* (every row is counted 
> multiple times). 
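
To illustrate the difference between counting columns and counting rows, here is a minimal sketch of a counter that only increments when the row key changes. WideRowProgressSketch and its methods are hypothetical names for this example; the actual fix lives in the attached patches and may look different.

```java
// Hypothetical counter; not the actual CFRR wide row iterator.
final class WideRowProgressSketch {
    private String lastRowKey = null;
    private int rowsRead = 0;

    /** Called once per column returned by the wide row iterator. */
    void onColumn(String rowKey) {
        // Count a row only when its key differs from the previous column's key.
        if (!rowKey.equals(lastRowKey)) {
            rowsRead++;
            lastRowKey = rowKey;
        }
    }

    int rowsRead() {
        return rowsRead;
    }

    /** Fraction of rows read so far, capped at 1.0. */
    float progress(int totalRowCount) {
        if (totalRowCount <= 0) {
            return 0.0F;
        }
        float progress = (float) rowsRead / totalRowCount;
        return progress > 1.0F ? 1.0F : progress;
    }
}
```

This relies on all columns of a row arriving consecutively, which is an assumption of the sketch.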



[jira] [Commented] (CASSANDRA-4803) CFRR wide row iterators improvements

2012-11-05 Thread Piotr Kołaczkowski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490531#comment-13490531
 ] 

Piotr Kołaczkowski commented on CASSANDRA-4803:
---

#04 - what about virtual nodes in 1.2? Do we insist that a split may not span 
more than one contiguous token range? It will be harder to avoid splits that 
are too small, and a too-small split means more task bookkeeping overhead.

> CFRR wide row iterators improvements
> 
>
> Key: CASSANDRA-4803
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4803
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 1.1.0
>Reporter: Piotr Kołaczkowski
>Assignee: Piotr Kołaczkowski
> Fix For: 1.1.7, 1.2.0
>
> Attachments: 0001-Wide-row-iterator-counts-rows-not-columns.patch, 
> 0002-Fixed-bugs-in-describe_splits.-CFRR-uses-row-counts-.patch, 
> 0003-Fixed-get_paged_slice-memtable-and-sstable-column-it.patch, 
> 0004-Better-token-range-wrap-around-handling-in-CFIF-CFRR.patch, 
> 0005-Fixed-handling-of-start_key-end_token-in-get_range_s.patch, 
> 0006-Code-cleanup-refactoring-in-CFRR.-Fixed-bug-with-mis.patch
>
>
> {code}
>     public float getProgress()
>     {
>         // TODO this is totally broken for wide rows
>         // the progress is likely to be reported slightly off the actual but close enough
>         float progress = ((float) iter.rowsRead() / totalRowCount);
>         return progress > 1.0F ? 1.0F : progress;
>     }
> {code}
> The problem is iter.rowsRead() does not return the number of rows read from 
> the wide row iterator, but returns number of *columns* (every row is counted 
> multiple times). 



[jira] [Updated] (CASSANDRA-4762) Support multiple OR clauses for CQL3 Compact storage

2012-11-05 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-4762:


Fix Version/s: (was: 1.2.0)
   1.2.1

> Support multiple OR clauses for CQL3 Compact storage
> 
>
> Key: CASSANDRA-4762
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4762
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: T Jake Luciani
> Fix For: 1.2.1
>
>
> Given CASSANDRA-3885
> It seems it should be possible to store multiple ranges for many predicates, 
> even the inner parts of a composite column.
> They could be expressed as an expanded set of filter queries.
> example:
> {code}
> CREATE TABLE test (
>    name text,
>    tdate timestamp,
>    tdate2 timestamp,
>    tdate3 timestamp,
>    num double,
>    PRIMARY KEY(name,tdate,tdate2,tdate3)
>  ) WITH COMPACT STORAGE;
> SELECT * FROM test WHERE 
>   name IN ('a','b') and
>   tdate IN ('2010-01-01','2011-01-01') and
>   tdate2 IN ('2010-01-01','2011-01-01') and
>   tdate3 IN ('2010-01-01','2011-01-01') and
>   num > 1.0;
> {code}
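
One way to read "an expanded set of filter queries" is as the Cartesian product of the IN lists, with each combination becoming one equality filter. The sketch below shows only that expansion step; InExpansionSketch is a hypothetical helper and says nothing about how Cassandra would actually execute the resulting filters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the expansion; not Cassandra code.
final class InExpansionSketch {

    /**
     * Expands one list of values per column into every combination,
     * e.g. [[a, b], [x, y]] becomes [[a, x], [a, y], [b, x], [b, y]].
     */
    static List<List<String>> expand(List<List<String>> inLists) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<String>()); // start with a single empty prefix
        for (List<String> values : inLists) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String v : values) {
                    List<String> combined = new ArrayList<>(prefix);
                    combined.add(v);
                    next.add(combined);
                }
            }
            result = next;
        }
        return result;
    }
}
```

For the query in the description, four IN lists of two values each expand to 2 x 2 x 2 x 2 = 16 combinations of (name, tdate, tdate2, tdate3), with the num > 1.0 predicate applied to each of them.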



[jira] [Updated] (CASSANDRA-4910) CQL3 doesn't allow static CF definition with compact storage in C* 1.1

2012-11-05 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-4910:


Attachment: 4910.txt

Simple fix attached (not sure why we've refused that in the first place).

> CQL3 doesn't allow static CF definition with compact storage in C* 1.1
> --
>
> Key: CASSANDRA-4910
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4910
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 1.1.1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 1.1.7
>
> Attachments: 4910.txt
>
>
> In Cassandra 1.1, the following CQL3 definition:
> {noformat}
> CREATE TABLE user_profiles (
>     user_id text PRIMARY KEY,
>     first_name text,
>     last_name text,
>     year_of_birth int
> ) WITH COMPACT STORAGE;
> {noformat}
> yields:
> {noformat}
> Bad Request: COMPACT STORAGE requires at least one column part of the 
> clustering key, none found
> {noformat}
> This works fine in 1.2 however.



[jira] [Created] (CASSANDRA-4910) CQL3 doesn't allow static CF definition with compact storage in C* 1.1

2012-11-05 Thread Sylvain Lebresne (JIRA)
Sylvain Lebresne created CASSANDRA-4910:
---

 Summary: CQL3 doesn't allow static CF definition with compact 
storage in C* 1.1
 Key: CASSANDRA-4910
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4910
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 1.1.7


In Cassandra 1.1, the following CQL3 definition:
{noformat}
CREATE TABLE user_profiles (
    user_id text PRIMARY KEY,
    first_name text,
    last_name text,
    year_of_birth int
) WITH COMPACT STORAGE;
{noformat}
yields:
{noformat}
Bad Request: COMPACT STORAGE requires at least one column part of the 
clustering key, none found
{noformat}

This works fine in 1.2 however.



[jira] [Commented] (CASSANDRA-4905) Repair should exclude gcable tombstones from merkle-tree computation

2012-11-05 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490494#comment-13490494
 ] 

Sylvain Lebresne commented on CASSANDRA-4905:
-

In other words, for existing releases, we should probably just do what the title 
here suggests. But in the long run (because that requires adding a new parameter 
to the network protocol, so at best it can be done for 1.2), we should probably 
consider having repair agree on a starting timestamp and use that as the 
reference to expire columns and decide whether a tombstone is gcable or not.
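
The idea of deciding gcability against an agreed reference time, rather than each replica's own clock, can be sketched as follows. This is a simplified model: Tombstone, the second-based timestamps and the method names are hypothetical and do not mirror Cassandra's classes or its repair code.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model; not Cassandra's repair implementation.
final class GcableFilterSketch {

    static final class Tombstone {
        final String key;
        final long localDeletionTime; // seconds, when the deletion was recorded

        Tombstone(String key, long localDeletionTime) {
            this.key = key;
            this.localDeletionTime = localDeletionTime;
        }
    }

    /**
     * A tombstone is treated as gcable once the grace period has elapsed,
     * measured against a reference time that all replicas agree on.
     */
    static boolean isGcable(Tombstone t, long gcGraceSeconds, long referenceTimeSeconds) {
        return t.localDeletionTime < referenceTimeSeconds - gcGraceSeconds;
    }

    /** Returns the tombstones that would still be fed into the merkle tree. */
    static List<Tombstone> tombstonesToHash(List<Tombstone> all, long gcGraceSeconds, long referenceTimeSeconds) {
        List<Tombstone> kept = new ArrayList<>();
        for (Tombstone t : all) {
            if (!isGcable(t, gcGraceSeconds, referenceTimeSeconds)) {
                kept.add(t);
            }
        }
        return kept;
    }
}
```

Because every replica evaluates isGcable() with the same reference time, they skip the same tombstones, whether or not they have already compacted them away.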

> Repair should exclude gcable tombstones from merkle-tree computation
> 
>
> Key: CASSANDRA-4905
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4905
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Christian Spriegel
>
> Currently, gcable tombstones get repaired if some replicas have already 
> compacted them but others have not.
> This could be avoided by ignoring all gcable tombstones during merkle tree 
> calculation.
> This was discussed with Sylvain on the mailing list:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html
