[ 
https://issues.apache.org/jira/browse/CASSANDRA-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Matarazzo updated CASSANDRA-5501:
---------------------------------------

    Description: 
We have a 3 nodes cluster, and a keyspace with RF = 3.

>From cassandra-cli everything is fine (we actually never use it, I just 
>launched it for a check in this particular case).

[default@goh_master] get agents where station_id = ascii(1110129);
-------------------
RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
=> (column=, value=, timestamp=1364580868176000)
=> (column=character_points, value=, timestamp=1361030686890000)
=> (column=component_id, value=0, timestamp=1364580868176000)
=> (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216, 
timestamp=1361030686890000)
=> (column=entity_id, value=0, timestamp=1364580868176000)
=> (column=manufacturing, value=, timestamp=1361030686890000)
=> (column=model, value=500005, timestamp=1361030686890000)
=> (column=name, value=Jenny Olifield, timestamp=1361030686890000)
=> (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
=> (column=station_id, value=1110129, timestamp=1364580868176000)
=> (column=stats_intellect, value=8, timestamp=1361030686890000)
=> (column=stats_reflexes, value=8, timestamp=1361030686890000)
=> (column=stats_stamina, value=7, timestamp=1361030686890000)
=> (column=stats_technology, value=7, timestamp=1361030686890000)
=> (column=trading, value=, timestamp=1361030686890000)
-------------------
RowKey: dc413373-6b06-11e2-8943-aacc00000216
=> (column=, value=, timestamp=1366568185220000)
=> (column=character_points, value=100, timestamp=1364580381651000)
=> (column=component_id, value=, timestamp=1364580381651000)
=> (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200, 
timestamp=1364580381651000)
=> (column=entity_id, value=0, timestamp=1364580381651000)
=> (column=manufacturing, value=, timestamp=1364580381651000)
=> (column=model, value=500018, timestamp=1364580381651000)
=> (column=name, value=Darren Matar, timestamp=1364580381651000)
=> (column=name_check, value=darren_matar, timestamp=1364580381651000)
=> (column=station_id, value=1110129, timestamp=1364580381651000)
=> (column=stats_intellect, value=10, timestamp=1364580381651000)
=> (column=stats_reflexes, value=10, timestamp=1364580381651000)
=> (column=stats_stamina, value=10, timestamp=1364580381651000)
=> (column=stats_technology, value=10, timestamp=1364580381651000)
=> (column=trading, value=1, timestamp=1366568185220000)
-------------------
RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
=> (column=, value=, timestamp=1364828039093000)
=> (column=character_points, value=, timestamp=1361030686760000)
=> (column=component_id, value=0, timestamp=1364828039093000)
=> (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201, 
timestamp=1361030686760000)
=> (column=entity_id, value=0, timestamp=1364828039093000)
=> (column=manufacturing, value=1, timestamp=1362517535613000)
=> (column=model, value=500008, timestamp=1361030686760000)
=> (column=name, value=Tom Bishop, timestamp=1361030686760000)
=> (column=name_check, value=tom_bishop, timestamp=1361030686760000)
=> (column=station_id, value=1110129, timestamp=1364828039093000)
=> (column=stats_intellect, value=9, timestamp=1361030686760000)
=> (column=stats_reflexes, value=7, timestamp=1361030686760000)
=> (column=stats_stamina, value=5, timestamp=1361030686760000)
=> (column=stats_technology, value=9, timestamp=1361030686760000)
=> (column=trading, value=, timestamp=1361030686760000)
-------------------
RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
=> (column=, value=, timestamp=1366568185096000)
=> (column=character_points, value=100, timestamp=1364580381537000)
=> (column=component_id, value=, timestamp=1364580381537000)
=> (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e, 
timestamp=1364580381537000)
=> (column=entity_id, value=0, timestamp=1364580381537000)
=> (column=manufacturing, value=, timestamp=1364580381537000)
=> (column=model, value=500003, timestamp=1364580381537000)
=> (column=name, value=Andrea Len, timestamp=1364580381537000)
=> (column=name_check, value=andrea_len, timestamp=1364580381537000)
=> (column=station_id, value=1110129, timestamp=1364580381537000)
=> (column=stats_intellect, value=10, timestamp=1364580381537000)
=> (column=stats_reflexes, value=10, timestamp=1364580381537000)
=> (column=stats_stamina, value=10, timestamp=1364580381537000)
=> (column=stats_technology, value=10, timestamp=1364580381537000)
=> (column=trading, value=1, timestamp=1366568185096000)

4 Rows Returned.


>From CQLSH, hovewer, the result is different, and 2 rows are missing.


cqlsh:goh_master> select agent_id,name,station_id from agents where 
station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129

cqlsh:goh_master> select agent_id, name, station_id from agents where agent_id 
= '1b462f09-65f3-4148-a1a6-536b52b3bcfa';

 agent_id                             | name       | station_id
--------------------------------------+------------+------------
 1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len |    1110129


Updating one column makes the single row reappear in the index, but just for 
that row and that columns/index.

cqlsh:goh_master> update agents set station_id = '1110129' where agent_id = 
'1b462f09-65f3-4148-a1a6-536b52b3bcfa';

cqlsh:goh_master> select agent_id,name,station_id from agents where 
station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
 1b462f09-65f3-4148-a1a6-536b52b3bcfa |     Andrea Len |    1110129


Updating one columns does not make all the row re-appear on all indexes (as it 
would be somewhat expected), but just on the updated one.

cqlsh:goh_master> select * from agents where name = 'Andrea Len';
cqlsh:goh_master> 


Running nodetool rebuild_index on all three nodes apparently DOES NOT fixes the 
problem, neither do nodetool repair.


We also used COPY TO to dump the entire row to check for hidden spaces or 
anything like that, but we can't see anything:

....
dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren
 Matar,darren_matar,1110129,10,10,10,10,1
1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea
 Len,andrea_len,1110129,10,10,10,10,1
....


Logs shows no error. I lowered log level to DEBUG and the only thing that I see 
when I launch the query is just:

DEBUG [Thrift:16] 2013-04-22 10:47:27,727 CassandraServer.java (line 1703) 
execute_cql3_query
DEBUG [Thrift:16] 2013-04-22 10:47:27,737 Tracing.java (line 167) request 
complete


In attachment, a log of the query with TRACE level log. 

I don't know how the internal mechanism of Cassandra works, so this could 
absolutely be no related, but the two columns that are not coming out have a 
more recent timestamp as it can be seen from cassandra-cli, and this CF has two 
sstables on disk, ...-435 and ...-436, and with log level lowered to trace, it 
seems to me that -436 get skipped due to bloom filter. I am really guessing 
here, I admit.

The very same skipping is shown in logs when i try to rebuild indexes, btw.


  was:
We have a 3 nodes cluster, and a keyspace with RF = 3.

>From cassandra-cli everything is fine (we actually never use it, I just 
>launched it for a check in this particular case).

[default@goh_master] get agents where station_id = ascii(1110129);
-------------------
RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
=> (column=, value=, timestamp=1364580868176000)
=> (column=character_points, value=, timestamp=1361030686890000)
=> (column=component_id, value=0, timestamp=1364580868176000)
=> (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216, 
timestamp=1361030686890000)
=> (column=entity_id, value=0, timestamp=1364580868176000)
=> (column=manufacturing, value=, timestamp=1361030686890000)
=> (column=model, value=500005, timestamp=1361030686890000)
=> (column=name, value=Jenny Olifield, timestamp=1361030686890000)
=> (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
=> (column=station_id, value=1110129, timestamp=1364580868176000)
=> (column=stats_intellect, value=8, timestamp=1361030686890000)
=> (column=stats_reflexes, value=8, timestamp=1361030686890000)
=> (column=stats_stamina, value=7, timestamp=1361030686890000)
=> (column=stats_technology, value=7, timestamp=1361030686890000)
=> (column=trading, value=, timestamp=1361030686890000)
-------------------
RowKey: dc413373-6b06-11e2-8943-aacc00000216
=> (column=, value=, timestamp=1366568185220000)
=> (column=character_points, value=100, timestamp=1364580381651000)
=> (column=component_id, value=, timestamp=1364580381651000)
=> (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200, 
timestamp=1364580381651000)
=> (column=entity_id, value=0, timestamp=1364580381651000)
=> (column=manufacturing, value=, timestamp=1364580381651000)
=> (column=model, value=500018, timestamp=1364580381651000)
=> (column=name, value=Darren Matar, timestamp=1364580381651000)
=> (column=name_check, value=darren_matar, timestamp=1364580381651000)
=> (column=station_id, value=1110129, timestamp=1364580381651000)
=> (column=stats_intellect, value=10, timestamp=1364580381651000)
=> (column=stats_reflexes, value=10, timestamp=1364580381651000)
=> (column=stats_stamina, value=10, timestamp=1364580381651000)
=> (column=stats_technology, value=10, timestamp=1364580381651000)
=> (column=trading, value=1, timestamp=1366568185220000)
-------------------
RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
=> (column=, value=, timestamp=1364828039093000)
=> (column=character_points, value=, timestamp=1361030686760000)
=> (column=component_id, value=0, timestamp=1364828039093000)
=> (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201, 
timestamp=1361030686760000)
=> (column=entity_id, value=0, timestamp=1364828039093000)
=> (column=manufacturing, value=1, timestamp=1362517535613000)
=> (column=model, value=500008, timestamp=1361030686760000)
=> (column=name, value=Tom Bishop, timestamp=1361030686760000)
=> (column=name_check, value=tom_bishop, timestamp=1361030686760000)
=> (column=station_id, value=1110129, timestamp=1364828039093000)
=> (column=stats_intellect, value=9, timestamp=1361030686760000)
=> (column=stats_reflexes, value=7, timestamp=1361030686760000)
=> (column=stats_stamina, value=5, timestamp=1361030686760000)
=> (column=stats_technology, value=9, timestamp=1361030686760000)
=> (column=trading, value=, timestamp=1361030686760000)
-------------------
RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
=> (column=, value=, timestamp=1366568185096000)
=> (column=character_points, value=100, timestamp=1364580381537000)
=> (column=component_id, value=, timestamp=1364580381537000)
=> (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e, 
timestamp=1364580381537000)
=> (column=entity_id, value=0, timestamp=1364580381537000)
=> (column=manufacturing, value=, timestamp=1364580381537000)
=> (column=model, value=500003, timestamp=1364580381537000)
=> (column=name, value=Andrea Len, timestamp=1364580381537000)
=> (column=name_check, value=andrea_len, timestamp=1364580381537000)
=> (column=station_id, value=1110129, timestamp=1364580381537000)
=> (column=stats_intellect, value=10, timestamp=1364580381537000)
=> (column=stats_reflexes, value=10, timestamp=1364580381537000)
=> (column=stats_stamina, value=10, timestamp=1364580381537000)
=> (column=stats_technology, value=10, timestamp=1364580381537000)
=> (column=trading, value=1, timestamp=1366568185096000)

4 Rows Returned.


>From CQLSH, hovewer, the result is different, and 2 rows are missing.


cqlsh:goh_master> select agent_id,name,station_id from agents where 
station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129

cqlsh:goh_master> select agent_id, name, station_id from agents where agent_id 
= '1b462f09-65f3-4148-a1a6-536b52b3bcfa';

 agent_id                             | name       | station_id
--------------------------------------+------------+------------
 1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len |    1110129


Updating one column makes the single row reappear in the index, but just for 
that row and that columns/index.

cqlsh:goh_master> update agents set station_id = '1110129' where agent_id = 
'1b462f09-65f3-4148-a1a6-536b52b3bcfa';

cqlsh:goh_master> select agent_id,name,station_id from agents where 
station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
 1b462f09-65f3-4148-a1a6-536b52b3bcfa |     Andrea Len |    1110129


Updating one columns does not make all the row re-appear on all indexes (as it 
would be somewhat expected), but just on the updated one.

cqlsh:goh_master> select * from agents where name = 'Andrea Len';
cqlsh:goh_master> 


Running nodetool rebuild_index on all three nodes apparently DOES NOT fixes the 
problem, neither do nodetool repair.


We also used COPY TO to dump the entire row to check for hidden spaces or 
anything like that, but we can't see anything:

....
dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren
 Matar,darren_matar,1110129,10,10,10,10,1
1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea
 Len,andrea_len,1110129,10,10,10,10,1
....


Logs shows no error. I lowered log level to DEBUG and the only thing that I see 
when I launch the query is just:

DEBUG [Thrift:16] 2013-04-22 10:47:27,727 CassandraServer.java (line 1703) 
execute_cql3_query
DEBUG [Thrift:16] 2013-04-22 10:47:27,737 Tracing.java (line 167) request 
complete


In attachment, a log of the query with TRACE level log. 

I don't know how the internal mechanism of Cassandra works, so this could 
absolutely be no related, but the two columns that are not coming out have a 
more recent timestamp as it can be seen from cassandra-cli, and this CF has two 
sstables on disk, ...-435 and ...-436, and with log level lowered to trace, it 
seems to me that -436 get skipped due to bloom filter.

The very same skipping is shown in logs when i try to rebuild indexes, btw.


    
> Missing data on SELECT on secondary index 
> ------------------------------------------
>
>                 Key: CASSANDRA-5501
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5501
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.2.4
>         Environment: linux ubuntu 12.04
>            Reporter: Marco Matarazzo
>         Attachments: query_log.txt
>
>
> We have a 3 nodes cluster, and a keyspace with RF = 3.
> From cassandra-cli everything is fine (we actually never use it, I just 
> launched it for a check in this particular case).
> [default@goh_master] get agents where station_id = ascii(1110129);
> -------------------
> RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
> => (column=, value=, timestamp=1364580868176000)
> => (column=character_points, value=, timestamp=1361030686890000)
> => (column=component_id, value=0, timestamp=1364580868176000)
> => (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216, 
> timestamp=1361030686890000)
> => (column=entity_id, value=0, timestamp=1364580868176000)
> => (column=manufacturing, value=, timestamp=1361030686890000)
> => (column=model, value=500005, timestamp=1361030686890000)
> => (column=name, value=Jenny Olifield, timestamp=1361030686890000)
> => (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
> => (column=station_id, value=1110129, timestamp=1364580868176000)
> => (column=stats_intellect, value=8, timestamp=1361030686890000)
> => (column=stats_reflexes, value=8, timestamp=1361030686890000)
> => (column=stats_stamina, value=7, timestamp=1361030686890000)
> => (column=stats_technology, value=7, timestamp=1361030686890000)
> => (column=trading, value=, timestamp=1361030686890000)
> -------------------
> RowKey: dc413373-6b06-11e2-8943-aacc00000216
> => (column=, value=, timestamp=1366568185220000)
> => (column=character_points, value=100, timestamp=1364580381651000)
> => (column=component_id, value=, timestamp=1364580381651000)
> => (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200, 
> timestamp=1364580381651000)
> => (column=entity_id, value=0, timestamp=1364580381651000)
> => (column=manufacturing, value=, timestamp=1364580381651000)
> => (column=model, value=500018, timestamp=1364580381651000)
> => (column=name, value=Darren Matar, timestamp=1364580381651000)
> => (column=name_check, value=darren_matar, timestamp=1364580381651000)
> => (column=station_id, value=1110129, timestamp=1364580381651000)
> => (column=stats_intellect, value=10, timestamp=1364580381651000)
> => (column=stats_reflexes, value=10, timestamp=1364580381651000)
> => (column=stats_stamina, value=10, timestamp=1364580381651000)
> => (column=stats_technology, value=10, timestamp=1364580381651000)
> => (column=trading, value=1, timestamp=1366568185220000)
> -------------------
> RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
> => (column=, value=, timestamp=1364828039093000)
> => (column=character_points, value=, timestamp=1361030686760000)
> => (column=component_id, value=0, timestamp=1364828039093000)
> => (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201, 
> timestamp=1361030686760000)
> => (column=entity_id, value=0, timestamp=1364828039093000)
> => (column=manufacturing, value=1, timestamp=1362517535613000)
> => (column=model, value=500008, timestamp=1361030686760000)
> => (column=name, value=Tom Bishop, timestamp=1361030686760000)
> => (column=name_check, value=tom_bishop, timestamp=1361030686760000)
> => (column=station_id, value=1110129, timestamp=1364828039093000)
> => (column=stats_intellect, value=9, timestamp=1361030686760000)
> => (column=stats_reflexes, value=7, timestamp=1361030686760000)
> => (column=stats_stamina, value=5, timestamp=1361030686760000)
> => (column=stats_technology, value=9, timestamp=1361030686760000)
> => (column=trading, value=, timestamp=1361030686760000)
> -------------------
> RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
> => (column=, value=, timestamp=1366568185096000)
> => (column=character_points, value=100, timestamp=1364580381537000)
> => (column=component_id, value=, timestamp=1364580381537000)
> => (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e, 
> timestamp=1364580381537000)
> => (column=entity_id, value=0, timestamp=1364580381537000)
> => (column=manufacturing, value=, timestamp=1364580381537000)
> => (column=model, value=500003, timestamp=1364580381537000)
> => (column=name, value=Andrea Len, timestamp=1364580381537000)
> => (column=name_check, value=andrea_len, timestamp=1364580381537000)
> => (column=station_id, value=1110129, timestamp=1364580381537000)
> => (column=stats_intellect, value=10, timestamp=1364580381537000)
> => (column=stats_reflexes, value=10, timestamp=1364580381537000)
> => (column=stats_stamina, value=10, timestamp=1364580381537000)
> => (column=stats_technology, value=10, timestamp=1364580381537000)
> => (column=trading, value=1, timestamp=1366568185096000)
> 4 Rows Returned.
> From CQLSH, hovewer, the result is different, and 2 rows are missing.
> cqlsh:goh_master> select agent_id,name,station_id from agents where 
> station_id='1110129';
>  agent_id                             | name           | station_id
> --------------------------------------+----------------+------------
>  6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
>  0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
> cqlsh:goh_master> select agent_id, name, station_id from agents where 
> agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';
>  agent_id                             | name       | station_id
> --------------------------------------+------------+------------
>  1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len |    1110129
> Updating one column makes the single row reappear in the index, but just for 
> that row and that columns/index.
> cqlsh:goh_master> update agents set station_id = '1110129' where agent_id = 
> '1b462f09-65f3-4148-a1a6-536b52b3bcfa';
> cqlsh:goh_master> select agent_id,name,station_id from agents where 
> station_id='1110129';
>  agent_id                             | name           | station_id
> --------------------------------------+----------------+------------
>  6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
>  0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
>  1b462f09-65f3-4148-a1a6-536b52b3bcfa |     Andrea Len |    1110129
> Updating one columns does not make all the row re-appear on all indexes (as 
> it would be somewhat expected), but just on the updated one.
> cqlsh:goh_master> select * from agents where name = 'Andrea Len';
> cqlsh:goh_master> 
> Running nodetool rebuild_index on all three nodes apparently DOES NOT fixes 
> the problem, neither do nodetool repair.
> We also used COPY TO to dump the entire row to check for hidden spaces or 
> anything like that, but we can't see anything:
> ....
> dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren
>  Matar,darren_matar,1110129,10,10,10,10,1
> 1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea
>  Len,andrea_len,1110129,10,10,10,10,1
> ....
> Logs shows no error. I lowered log level to DEBUG and the only thing that I 
> see when I launch the query is just:
> DEBUG [Thrift:16] 2013-04-22 10:47:27,727 CassandraServer.java (line 1703) 
> execute_cql3_query
> DEBUG [Thrift:16] 2013-04-22 10:47:27,737 Tracing.java (line 167) request 
> complete
> In attachment, a log of the query with TRACE level log. 
> I don't know how the internal mechanism of Cassandra works, so this could 
> absolutely be no related, but the two columns that are not coming out have a 
> more recent timestamp as it can be seen from cassandra-cli, and this CF has 
> two sstables on disk, ...-435 and ...-436, and with log level lowered to 
> trace, it seems to me that -436 get skipped due to bloom filter. I am really 
> guessing here, I admit.
> The very same skipping is shown in logs when i try to rebuild indexes, btw.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to