Marco Matarazzo created CASSANDRA-5501:
------------------------------------------

             Summary: Missing data on SELECT on secondary index 
                 Key: CASSANDRA-5501
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5501
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.2.4
         Environment: linux ubuntu 12.04
            Reporter: Marco Matarazzo


We have a 3-node cluster, and a keyspace with RF = 3.

From cassandra-cli everything is fine (we actually never use it, I just 
launched it for a check in this particular case).

[default@goh_master] get agents where station_id = ascii(1110129);
-------------------
RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
=> (column=, value=, timestamp=1364580868176000)
=> (column=character_points, value=, timestamp=1361030686890000)
=> (column=component_id, value=0, timestamp=1364580868176000)
=> (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216, timestamp=1361030686890000)
=> (column=entity_id, value=0, timestamp=1364580868176000)
=> (column=manufacturing, value=, timestamp=1361030686890000)
=> (column=model, value=500005, timestamp=1361030686890000)
=> (column=name, value=Jenny Olifield, timestamp=1361030686890000)
=> (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
=> (column=station_id, value=1110129, timestamp=1364580868176000)
=> (column=stats_intellect, value=8, timestamp=1361030686890000)
=> (column=stats_reflexes, value=8, timestamp=1361030686890000)
=> (column=stats_stamina, value=7, timestamp=1361030686890000)
=> (column=stats_technology, value=7, timestamp=1361030686890000)
=> (column=trading, value=, timestamp=1361030686890000)
-------------------
RowKey: dc413373-6b06-11e2-8943-aacc00000216
=> (column=, value=, timestamp=1366568185220000)
=> (column=character_points, value=100, timestamp=1364580381651000)
=> (column=component_id, value=, timestamp=1364580381651000)
=> (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200, timestamp=1364580381651000)
=> (column=entity_id, value=0, timestamp=1364580381651000)
=> (column=manufacturing, value=, timestamp=1364580381651000)
=> (column=model, value=500018, timestamp=1364580381651000)
=> (column=name, value=Darren Matar, timestamp=1364580381651000)
=> (column=name_check, value=darren_matar, timestamp=1364580381651000)
=> (column=station_id, value=1110129, timestamp=1364580381651000)
=> (column=stats_intellect, value=10, timestamp=1364580381651000)
=> (column=stats_reflexes, value=10, timestamp=1364580381651000)
=> (column=stats_stamina, value=10, timestamp=1364580381651000)
=> (column=stats_technology, value=10, timestamp=1364580381651000)
=> (column=trading, value=1, timestamp=1366568185220000)
-------------------
RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
=> (column=, value=, timestamp=1364828039093000)
=> (column=character_points, value=, timestamp=1361030686760000)
=> (column=component_id, value=0, timestamp=1364828039093000)
=> (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201, timestamp=1361030686760000)
=> (column=entity_id, value=0, timestamp=1364828039093000)
=> (column=manufacturing, value=1, timestamp=1362517535613000)
=> (column=model, value=500008, timestamp=1361030686760000)
=> (column=name, value=Tom Bishop, timestamp=1361030686760000)
=> (column=name_check, value=tom_bishop, timestamp=1361030686760000)
=> (column=station_id, value=1110129, timestamp=1364828039093000)
=> (column=stats_intellect, value=9, timestamp=1361030686760000)
=> (column=stats_reflexes, value=7, timestamp=1361030686760000)
=> (column=stats_stamina, value=5, timestamp=1361030686760000)
=> (column=stats_technology, value=9, timestamp=1361030686760000)
=> (column=trading, value=, timestamp=1361030686760000)
-------------------
RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
=> (column=, value=, timestamp=1366568185096000)
=> (column=character_points, value=100, timestamp=1364580381537000)
=> (column=component_id, value=, timestamp=1364580381537000)
=> (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e, timestamp=1364580381537000)
=> (column=entity_id, value=0, timestamp=1364580381537000)
=> (column=manufacturing, value=, timestamp=1364580381537000)
=> (column=model, value=500003, timestamp=1364580381537000)
=> (column=name, value=Andrea Len, timestamp=1364580381537000)
=> (column=name_check, value=andrea_len, timestamp=1364580381537000)
=> (column=station_id, value=1110129, timestamp=1364580381537000)
=> (column=stats_intellect, value=10, timestamp=1364580381537000)
=> (column=stats_reflexes, value=10, timestamp=1364580381537000)
=> (column=stats_stamina, value=10, timestamp=1364580381537000)
=> (column=stats_technology, value=10, timestamp=1364580381537000)
=> (column=trading, value=1, timestamp=1366568185096000)

4 Rows Returned.


From CQLSH, however, the result is different: 2 rows are missing.


cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
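
To make the discrepancy explicit, the missing rows can be worked out by diffing the keys from the two outputs above. This is only a minimal sketch in Python: the keys are copied by hand from the listings, and nothing here talks to the cluster.

```python
# Row keys copied by hand from the cassandra-cli output above (4 rows).
cli_keys = {
    "6c8efeb6-7209-11e2-890a-aacc00000216",
    "dc413373-6b06-11e2-8943-aacc00000216",
    "0e7074ac-64bd-11e2-8c38-aacc00000201",
    "1b462f09-65f3-4148-a1a6-536b52b3bcfa",
}

# agent_id values copied by hand from the cqlsh output above (2 rows).
cqlsh_keys = {
    "6c8efeb6-7209-11e2-890a-aacc00000216",
    "0e7074ac-64bd-11e2-8c38-aacc00000201",
}

# Keys that cassandra-cli returns but the CQL index query does not.
missing = sorted(cli_keys - cqlsh_keys)
for key in missing:
    print(key)
```

The two keys it prints are the rows for Andrea Len and Darren Matar.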

cqlsh:goh_master> select agent_id, name, station_id from agents where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';

 agent_id                             | name       | station_id
--------------------------------------+------------+------------
 1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len |    1110129


Updating one column makes that single row reappear in the index, but only for 
that row and that column/index.

cqlsh:goh_master> update agents set station_id = '1110129' where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';

cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield |    1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 |     Tom Bishop |    1110129
 1b462f09-65f3-4148-a1a6-536b52b3bcfa |     Andrea Len |    1110129


Updating one column does not make the row reappear in all indexes (as one 
might expect), only in the index on the updated column.

cqlsh:goh_master> select * from agents where name = 'Andrea Len';
cqlsh:goh_master> 


Running nodetool rebuild_index on all three nodes apparently DOES NOT fix the 
problem, and neither does nodetool repair.


We also used COPY TO to dump the entire rows to check for hidden spaces or 
anything like that, but we can't see anything wrong:

....
dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren Matar,darren_matar,1110129,10,10,10,10,1
1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea Len,andrea_len,1110129,10,10,10,10,1
....
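
For anyone who wants to repeat the hidden-space check mechanically rather than by eye, something like the following Python sketch could be run over the COPY TO output. The function name `suspicious_fields` is made up for illustration, and the sample is just the two rows shown above, assuming the line breaks inside the names were added by the mail client and are not in the real dump.

```python
import csv
import io


def suspicious_fields(csv_text):
    """Return (row number, column index, repr of value) for every field that
    has leading/trailing whitespace or contains non-printable characters."""
    findings = []
    for row_no, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        for col_no, value in enumerate(row):
            if value != value.strip() or not value.isprintable():
                findings.append((row_no, col_no, repr(value)))
    return findings


# The two dumped rows from above, with the mail line wrapping undone.
sample = (
    "dc413373-6b06-11e2-8943-aacc00000216,100,,"
    "574934cc-6b06-11e2-a512-aacc00000200,0,,500018,"
    "Darren Matar,darren_matar,1110129,10,10,10,10,1\n"
    "1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,"
    "1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,"
    "Andrea Len,andrea_len,1110129,10,10,10,10,1\n"
)

for finding in suspicious_fields(sample):
    print(finding)
```

On these two rows it reports nothing, which matches what we saw by hand. It only catches whitespace and control characters in the exported text, so it would not reveal a difference in how the values are stored or indexed.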


The problem still persists, so I am available to run any further checks if 
needed.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
