Marco Matarazzo created CASSANDRA-5501:
------------------------------------------

             Summary: Missing data on SELECT on secondary index
                 Key: CASSANDRA-5501
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5501
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 1.2.4
         Environment: linux ubuntu 12.04
            Reporter: Marco Matarazzo


We have a 3-node cluster, and a keyspace with RF = 3.

From cassandra-cli everything is fine (we actually never use it, I just launched it for a check in this particular case).

[default@goh_master] get agents where station_id = ascii(1110129);
-------------------
RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
=> (column=, value=, timestamp=1364580868176000)
=> (column=character_points, value=, timestamp=1361030686890000)
=> (column=component_id, value=0, timestamp=1364580868176000)
=> (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216, timestamp=1361030686890000)
=> (column=entity_id, value=0, timestamp=1364580868176000)
=> (column=manufacturing, value=, timestamp=1361030686890000)
=> (column=model, value=500005, timestamp=1361030686890000)
=> (column=name, value=Jenny Olifield, timestamp=1361030686890000)
=> (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
=> (column=station_id, value=1110129, timestamp=1364580868176000)
=> (column=stats_intellect, value=8, timestamp=1361030686890000)
=> (column=stats_reflexes, value=8, timestamp=1361030686890000)
=> (column=stats_stamina, value=7, timestamp=1361030686890000)
=> (column=stats_technology, value=7, timestamp=1361030686890000)
=> (column=trading, value=, timestamp=1361030686890000)
-------------------
RowKey: dc413373-6b06-11e2-8943-aacc00000216
=> (column=, value=, timestamp=1366568185220000)
=> (column=character_points, value=100, timestamp=1364580381651000)
=> (column=component_id, value=, timestamp=1364580381651000)
=> (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200, timestamp=1364580381651000)
=> (column=entity_id, value=0, timestamp=1364580381651000)
=> (column=manufacturing, value=, timestamp=1364580381651000)
=> (column=model, value=500018, timestamp=1364580381651000)
=> (column=name, value=Darren Matar, timestamp=1364580381651000)
=> (column=name_check, value=darren_matar, timestamp=1364580381651000)
=> (column=station_id, value=1110129, timestamp=1364580381651000)
=> (column=stats_intellect, value=10, timestamp=1364580381651000)
=> (column=stats_reflexes, value=10, timestamp=1364580381651000)
=> (column=stats_stamina, value=10, timestamp=1364580381651000)
=> (column=stats_technology, value=10, timestamp=1364580381651000)
=> (column=trading, value=1, timestamp=1366568185220000)
-------------------
RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
=> (column=, value=, timestamp=1364828039093000)
=> (column=character_points, value=, timestamp=1361030686760000)
=> (column=component_id, value=0, timestamp=1364828039093000)
=> (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201, timestamp=1361030686760000)
=> (column=entity_id, value=0, timestamp=1364828039093000)
=> (column=manufacturing, value=1, timestamp=1362517535613000)
=> (column=model, value=500008, timestamp=1361030686760000)
=> (column=name, value=Tom Bishop, timestamp=1361030686760000)
=> (column=name_check, value=tom_bishop, timestamp=1361030686760000)
=> (column=station_id, value=1110129, timestamp=1364828039093000)
=> (column=stats_intellect, value=9, timestamp=1361030686760000)
=> (column=stats_reflexes, value=7, timestamp=1361030686760000)
=> (column=stats_stamina, value=5, timestamp=1361030686760000)
=> (column=stats_technology, value=9, timestamp=1361030686760000)
=> (column=trading, value=, timestamp=1361030686760000)
-------------------
RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
=> (column=, value=, timestamp=1366568185096000)
=> (column=character_points, value=100, timestamp=1364580381537000)
=> (column=component_id, value=, timestamp=1364580381537000)
=> (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e, timestamp=1364580381537000)
=> (column=entity_id, value=0, timestamp=1364580381537000)
=> (column=manufacturing, value=, timestamp=1364580381537000)
=> (column=model, value=500003, timestamp=1364580381537000)
=> (column=name, value=Andrea Len, timestamp=1364580381537000)
=> (column=name_check, value=andrea_len, timestamp=1364580381537000)
=> (column=station_id, value=1110129, timestamp=1364580381537000)
=> (column=stats_intellect, value=10, timestamp=1364580381537000)
=> (column=stats_reflexes, value=10, timestamp=1364580381537000)
=> (column=stats_stamina, value=10, timestamp=1364580381537000)
=> (column=stats_technology, value=10, timestamp=1364580381537000)
=> (column=trading, value=1, timestamp=1366568185096000)

4 Rows Returned.

From CQLSH, however, the result is different, and 2 rows are missing.

cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield | 1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 | Tom Bishop     | 1110129

cqlsh:goh_master> select agent_id, name, station_id from agents where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';

 agent_id                             | name       | station_id
--------------------------------------+------------+------------
 1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len | 1110129

Updating one column makes that row reappear in the index, but only for that row and that column's index.

cqlsh:goh_master> update agents set station_id = '1110129' where agent_id = '1b462f09-65f3-4148-a1a6-536b52b3bcfa';
cqlsh:goh_master> select agent_id,name,station_id from agents where station_id='1110129';

 agent_id                             | name           | station_id
--------------------------------------+----------------+------------
 6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield | 1110129
 0e7074ac-64bd-11e2-8c38-aacc00000201 | Tom Bishop     | 1110129
 1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len     | 1110129

Updating one column does not make the whole row reappear in all indexes (as one might expect), but only in the index on the updated column.

cqlsh:goh_master> select * from agents where name = 'Andrea Len';
cqlsh:goh_master>

Running nodetool rebuild_index on all three nodes apparently DOES NOT fix the problem, and neither does nodetool repair.

We also used COPY TO to dump the entire row to check for hidden spaces or anything like that, but we can't see anything:

....
dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren Matar,darren_matar,1110129,10,10,10,10,1
1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea Len,andrea_len,1110129,10,10,10,10,1
....

The problem still persists, so if needed I am available to run whatever checks would help.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira