[ https://issues.apache.org/jira/browse/CASSANDRA-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alberto Pujante updated CASSANDRA-5721:
---------------------------------------

    Description:

I don't know whether this is a bug or normal behaviour. After some insertions and deletions (new keyspace, new table) followed by a flush of the table, Cassandra gives extremely slow reads (and eventually timeouts). Compactions are also extremely slow, with just a few hundred columns. I've created an example script to reproduce this. The resulting table has 750 live columns and 750 tombstones, and only one memtable is flushed.

When I don't do deletions and read the entire row, Cassandra gives normal times. At this rate it would be better for performance to mark columns as deleted manually and, once a row reaches a certain percentage of deletes, copy the non-deleted columns into a new row and delete the old one (a row deletion).

Even in models with only a few column deletions, Cassandra would become extremely slow after a while, and compactions would be very painful.

Thanks in advance

    public void test() throws InterruptedException {
        session.execute("CREATE KEYSPACE ks WITH replication = "
                + "{'class':'SimpleStrategy', 'replication_factor':1};");
        session.execute("use ks;");
        //session.execute("drop table timelineTable;");
        session.execute("CREATE TABLE ks.timelineTable ("
                + "key blob,"
                + "timeline timestamp,"
                + "value blob,"
                + "PRIMARY KEY (key, timeline)"
                + ") WITH CLUSTERING ORDER BY (timeline DESC) AND gc_grace_seconds=0;");

        long interval;
        long time = new Date().getTime();
        int j = 0;
        while (j < 15) {
            int i = 0;
            interval = new Date().getTime();
            while (i < 100) {
                session.execute("INSERT INTO timelineTable (key,timeline,value) VALUES (0x01,"
                        + time + ",0x0" + Integer.toHexString(j) + ")");
                time++;
                i++;
            }
            System.out.println("Insert Interval:" + (new Date().getTime() - interval));

            interval = new Date().getTime();
            ResultSet results = session.execute("SELECT * FROM timelineTable"
                    + " WHERE key = 0x01 ORDER BY timeline DESC LIMIT 100");
            System.out.println("Read Interval:" + (new Date().getTime() - interval));

            i = 0;
            interval = new Date().getTime();
            for (Row row : results) {
                if (i >= 50) {
                    session.execute("DELETE FROM timelineTable WHERE key = 0x01 AND timeline="
                            + row.getDate("timeline").getTime());
                }
                i++;
            }
            System.out.println("Delete Interval:" + (new Date().getTime() - interval));
            j++;
            System.out.println("");
        }
    }

> Extremely slow reads after flushing a table with column deletes
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-5721
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5721
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>        Environment: Ubuntu 32 bits, Windows XP
>            Reporter: Alberto Pujante
>
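The reported slowdown is consistent with the read path having to step over every tombstone between the newest cell and the 100th live cell before it can satisfy the LIMIT. The following is a hypothetical stand-alone model (plain Java collections, not Cassandra or driver code) of the partition shape the script produces, where each batch of 100 columns ends up as 50 live cells followed by 50 tombstones in reverse-timeline order:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the partition after the script runs: each entry is
// one column in CLUSTERING ORDER BY (timeline DESC); `true` marks a
// tombstone left by a column-level DELETE.
public class TombstoneScanModel {

    // Number of cells a newest-first read must touch to return `limit`
    // live columns, skipping tombstones along the way.
    static int cellsScanned(List<Boolean> cells, int limit) {
        int scanned = 0, live = 0;
        for (boolean tombstone : cells) {
            scanned++;
            if (!tombstone && ++live == limit) break;
        }
        return scanned;
    }

    public static void main(String[] args) {
        // Shape produced by the script: 15 batches, each leaving its newer
        // 50 columns live and its older 50 deleted -> 750 live, 750 tombstones.
        List<Boolean> cells = new ArrayList<>();
        for (int batch = 0; batch < 15; batch++) {
            for (int i = 0; i < 50; i++) cells.add(false); // live
            for (int i = 0; i < 50; i++) cells.add(true);  // tombstone
        }
        // LIMIT 100 must cross a 50-tombstone run between two live runs.
        System.out.println("cells scanned for LIMIT 100: "
                + cellsScanned(cells, 100)); // 150
    }
}
```

In this idealized model the overhead is modest (150 cells scanned for 100 returned), so the timeouts the script provokes suggest the per-tombstone cost on the real read path is far higher than a simple skip, which is what makes this look like a bug rather than expected tombstone overhead.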
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira