[ https://issues.apache.org/jira/browse/CASSANDRA-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alberto Pujante updated CASSANDRA-5721:
---------------------------------------

    Description:

I don't know whether this is a bug or normal behaviour. After some insertions and deletions (new keyspace, new table) followed by a flush of the table, Cassandra gives extremely slow reads (and eventually timeouts). Compactions are also extremely slow, with just a few hundred columns. I've created an example script to reproduce this. The resulting table has 750 live columns and 750 tombstones, and only one memtable is flushed.

When I don't do deletions and read the entire row, Cassandra gives normal times. At this rate it would be better for performance to mark columns as deleted manually and, once a row reaches a certain percentage of deletes, copy the non-deleted columns into a new row and delete the old one (a row deletion).

Even in models with only a few column deletions, Cassandra would become extremely slow after a while, and compactions would be very painful.

Thanks in advance

    public void test() throws InterruptedException {
        session.execute("CREATE KEYSPACE ks WITH replication = "
                + "{'class':'SimpleStrategy', 'replication_factor':1};");
        session.execute("use ks;");
        //session.execute("drop table timelineTable;");
        session.execute("CREATE TABLE ks.timelineTable ("
                + "key blob,"
                + "timeline timestamp,"
                + "value blob,"
                + "PRIMARY KEY (key, timeline)"
                + ") WITH CLUSTERING ORDER BY (timeline DESC) AND gc_grace_seconds=0;");

        long interval;
        long time = new Date().getTime();
        int j = 0;
        while (j < 15) {
            int i = 0;
            interval = new Date().getTime();
            while (i < 100) {
                session.execute("INSERT INTO timelineTable (key,timeline,value) VALUES (0x01,"
                        + time + ",0x0" + Integer.toHexString(j) + ")");
                time++;
                i++;
            }
            System.out.println("Insert Interval:" + (new Date().getTime() - interval));

            interval = new Date().getTime();
            ResultSet results = session.execute("SELECT * FROM timelineTable"
                    + " WHERE key = 0x01 ORDER BY timeline DESC LIMIT 100");
            System.out.println("Read Interval:" + (new Date().getTime() - interval));

            i = 0;
            interval = new Date().getTime();
            for (Row row : results) {
                if (i >= 50) {
                    session.execute("DELETE FROM timelineTable WHERE key = 0x01 AND timeline="
                            + row.getDate("timeline").getTime());
                }
                i++;
            }
            System.out.println("Delete Interval:" + (new Date().getTime() - interval));
            j++;
            System.out.println("");
        }
    }

> Extremely slow reads after flushing a table with column deletes
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-5721
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5721
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>        Environment: Ubuntu 32 bits, Windows XP
>            Reporter: Alberto Pujante
>
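The reported slowdown is consistent with the read path having to step over every tombstone between the newest cell and the 100th live cell before it can satisfy the LIMIT. The following is a hypothetical stand-alone model (plain Java collections, not Cassandra or driver code) of the partition shape the script produces, where each batch of 100 columns ends up as 50 live cells followed by 50 tombstones in reverse-timeline order:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the partition after the script runs: each entry is
// one column in CLUSTERING ORDER BY (timeline DESC); `true` marks a
// tombstone left by a column-level DELETE.
public class TombstoneScanModel {

    // Number of cells a newest-first read must touch to return `limit`
    // live columns, skipping tombstones along the way.
    static int cellsScanned(List<Boolean> cells, int limit) {
        int scanned = 0, live = 0;
        for (boolean tombstone : cells) {
            scanned++;
            if (!tombstone && ++live == limit) break;
        }
        return scanned;
    }

    public static void main(String[] args) {
        // Shape produced by the script: 15 batches, each leaving its newer
        // 50 columns live and its older 50 deleted -> 750 live, 750 tombstones.
        List<Boolean> cells = new ArrayList<>();
        for (int batch = 0; batch < 15; batch++) {
            for (int i = 0; i < 50; i++) cells.add(false); // live
            for (int i = 0; i < 50; i++) cells.add(true);  // tombstone
        }
        // LIMIT 100 must cross a 50-tombstone run between two live runs.
        System.out.println("cells scanned for LIMIT 100: "
                + cellsScanned(cells, 100)); // 150
    }
}
```

In this idealized model the overhead is modest (150 cells scanned for 100 returned), so the timeouts the script provokes suggest the per-tombstone cost on the real read path is far higher than a simple skip, which is what makes this look like a bug rather than expected tombstone overhead.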
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira