secondary index problem

2013-03-15 Thread Brett Tinling
We have a CF with an indexed column 'type', but we get incomplete results when 
we query that CF for all rows matching 'type'.  We can find the missing rows if 
we query by key.

 * we are seeing this on a small, single node, 1.2.2 instance with few rows.
 * we use thrift execute_cql_query, no CL is specified
 * none of repair, restart, compact, scrub helped
 
 Finally, nodetool rebuild_index fixed it.  
 
 Is index rebuild something we need to do periodically?  How often?  Is there a 
way to know when it needs to be done?  Do we have to run rebuild on all nodes?
 
 We have not noticed this until 1.2
 
 Regards,
  - Brett
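
One possible way to tell when an index has drifted, sketched under assumptions: compare what reads by key return with what the indexed query returns. The helper name and the data shapes below are hypothetical, and fetching `rows_by_key` and `indexed_keys` from Cassandra is left to whichever client is in use.

```python
def find_unindexed_keys(rows_by_key, indexed_keys, column, value):
    """Return keys whose row has column == value but which the index query missed.

    rows_by_key:  dict of row key -> dict of column name -> value, as read
                  directly by key (hypothetical shape).
    indexed_keys: iterable of row keys returned by the indexed query.
    """
    seen = set(indexed_keys)
    return sorted(
        key
        for key, columns in rows_by_key.items()
        if columns.get(column) == value and key not in seen
    )


# Illustrative data only: k2 matches 'type' but the index query did not return it.
rows = {
    "k1": {"type": "user"},
    "k2": {"type": "user"},
    "k3": {"type": "group"},
}
missing = find_unindexed_keys(rows, ["k1"], "type", "user")
print(missing)
```

A non-empty result would suggest the index is missing entries for those keys; an empty result does not prove the index is healthy for other values.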
 

 
 






Re: secondary index problem

2013-03-15 Thread Janne Jalkanen

This could be either of the following bugs (which might be the same thing).  I 
get it too every time I recycle a node on 1.1.10.

https://issues.apache.org/jira/browse/CASSANDRA-4973
or
https://issues.apache.org/jira/browse/CASSANDRA-4785

/Janne




Re: secondary index problem

2013-03-18 Thread aaron morton
Brett, 
Do you have some steps to reproduce the problem? If so, please create a 
ticket on JIRA. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com




Re: secondary index problem

2013-03-18 Thread Brett Tinling
Aaron,

No recipe yet.  It pops up randomly and, I think due to the nature of our app, 
goes away.

Seems like when we have updates that are large (10k rows in one mutate) the 
problem is more likely to occur.

I'll try to work out a repro...

-Brett






Re: secondary index problem

2013-03-19 Thread aaron morton
> Seems like when we have updates that are large (10k rows in one mutate) the 
> problem is more likely to occur.
10K rows in one mutate is a very bad idea. 
It will take the nodes a long time to process them, risking a timeout, and it 
will essentially starve other requests. 
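
A minimal sketch of one way to avoid a single huge mutate: split the rows into smaller chunks and send each chunk separately. The `send` callable and the default chunk size of 100 are assumptions made for illustration, not recommendations from this thread.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def send_in_chunks(rows, send, size=100):
    """Call `send` once per chunk and return the number of calls made.

    `send` is a hypothetical stand-in for whatever issues the mutate.
    """
    calls = 0
    for chunk in chunked(rows, size):
        send(chunk)
        calls += 1
    return calls
```

For example, `send_in_chunks(range(250), sent.append, 100)` with `sent = []` makes three calls, with chunks of 100, 100 and 50 rows.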

You should also specify a CL so we can tell whether you are using 
Cassandra in an eventually consistent or strongly consistent way. 
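
As a rough illustration of the distinction, a commonly cited rule of thumb is that reads are strongly consistent when the replicas read plus the replicas written exceed the replication factor (R + W > RF). The function names below are hypothetical, and the sketch ignores everything except that arithmetic.

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum for the given RF."""
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas, write_replicas, replication_factor):
    """Apply the R + W > RF rule of thumb."""
    return read_replicas + write_replicas > replication_factor


# RF=3 with ONE for both reads and writes: 1 + 1 is not greater than 3.
one_one = is_strongly_consistent(1, 1, 3)
# RF=3 with QUORUM for both: 2 + 2 is greater than 3.
quorum_quorum = is_strongly_consistent(quorum(3), quorum(3), 3)
# RF=1 (a single node) with ONE for both: 1 + 1 is greater than 1.
single_node = is_strongly_consistent(1, 1, 1)
```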

Cheers
 
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com




Re: secondary index problem

2013-03-19 Thread Brett Tinling
We are using CL ONE for mutates.

As for the large batches, yes, our usage pattern has outgrown our initial 
understanding.  We plan to rewrite this bit, but it has not been a problem so 
far (or maybe this index thing is the problem that forces the rewrite?).  On 
the rare timeout, we retry... 
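
The retry-on-timeout approach could look something like the sketch below. `MutationTimeout` is a hypothetical stand-in for whatever timeout error the client raises, and the attempt count and backoff delays are arbitrary assumptions.

```python
import time


class MutationTimeout(Exception):
    """Hypothetical stand-in for the client's timeout error."""


def retry_on_timeout(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run `operation`, retrying on MutationTimeout with exponential backoff.

    Re-raises the last MutationTimeout once `attempts` tries have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except MutationTimeout:
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper testable without real delays.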

I have a test program trying to cause the error now.  So far, no luck.




