[ 
https://issues.apache.org/jira/browse/CASSANDRA-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Deng updated CASSANDRA-11654:
---------------------------------
    Description: 
This is trivial to reproduce. Here are the steps I used (on a single-node 
C* 3.x cluster):

{noformat}
echo "CREATE KEYSPACE IF NOT EXISTS testks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};" | cqlsh
echo "CREATE TABLE IF NOT EXISTS testks.testcf ( k int, c text, val0_int int, PRIMARY KEY (k, c) );" | cqlsh
echo "INSERT INTO testks.testcf (k, c, val0_int) VALUES (1, 'c1', 100);" | cqlsh
echo "DELETE FROM testks.testcf where k=1 and c='c1';" | cqlsh
echo "INSERT INTO testks.testcf (k, c, val0_int) VALUES (1, 'c1', 100);" | cqlsh
nodetool flush testks testcf
echo "SELECT * FROM testks.testcf;" | cqlsh
{noformat}

The last step above confirms that there is one live row in the 
testks.testcf table. However, if you now go to the SSTable data directory 
and run sstabledump as follows, you will see that the row is still marked as 
deleted and no row content is shown:

{noformat}
$ sstabledump ma-1-big-Data.db
[
  {
    "partition" : {
      "key" : [ "1" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 18,
        "clustering" : [ "c1" ],
        "liveness_info" : { "tstamp" : 1461633248542342 },
        "deletion_info" : { "deletion_time" : 1461633248212499, "tstamp" : 1461633248 }
      }
    ]
  }
]
{noformat}
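
The timestamps in the dump above already show that the row deletion is older than the row's liveness info. As a minimal sketch (not part of sstabledump), the following Python snippet flags such rows in sstabledump JSON. The function name shadowed_row_tombstones is hypothetical, and the snippet assumes that "deletion_time" and the liveness "tstamp" are both microsecond write timestamps, which is what the values above suggest:

```python
import json

# Output copied from the sstabledump run above.
DUMP = """
[
  {
    "partition" : { "key" : [ "1" ], "position" : 0 },
    "rows" : [
      {
        "type" : "row",
        "position" : 18,
        "clustering" : [ "c1" ],
        "liveness_info" : { "tstamp" : 1461633248542342 },
        "deletion_info" : { "deletion_time" : 1461633248212499, "tstamp" : 1461633248 }
      }
    ]
  }
]
"""


def shadowed_row_tombstones(dump_json):
    """Return (partition key, clustering, delta in microseconds) for every row
    whose liveness timestamp is newer than its row deletion timestamp."""
    found = []
    for partition in json.loads(dump_json):
        key = partition["partition"]["key"]
        for row in partition.get("rows", []):
            liveness = row.get("liveness_info", {}).get("tstamp")
            # Assumption: "deletion_time" is in the same unit as the liveness tstamp.
            deletion = row.get("deletion_info", {}).get("deletion_time")
            if liveness is None or deletion is None:
                continue
            if liveness > deletion:
                found.append((key, row.get("clustering"), liveness - deletion))
    return found


if __name__ == "__main__":
    for key, clustering, delta in shadowed_row_tombstones(DUMP):
        print(key, clustering, delta)
```

For the dump above, the second INSERT is 329843 microseconds newer than the row delete, which matches the live row returned by the SELECT.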

This reproduces on both the latest 3.0.5 and 3.6-snapshot (i.e. trunk as of 
Apr 25, 2016).

It looks like only a row tombstone affects sstabledump. If you generate cell 
tombstones, even by deleting all non-PK, non-static columns in the row, 
sstabledump works fine as long as there is no explicit row delete (so the 
clustering is still considered alive). See the following example steps:

{noformat}
echo "CREATE KEYSPACE IF NOT EXISTS testks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};" | cqlsh
echo "CREATE TABLE IF NOT EXISTS testks.testcf ( k int, c text, val0_int int, val1_int int, PRIMARY KEY (k, c) );" | cqlsh
echo "INSERT INTO testks.testcf (k, c, val0_int, val1_int) VALUES (1, 'c1', 100, 200);" | cqlsh
echo "DELETE val0_int, val1_int FROM testks.testcf where k=1 and c='c1';" | cqlsh
echo "INSERT INTO testks.testcf (k, c, val0_int, val1_int) VALUES (1, 'c1', 300, 400);" | cqlsh
nodetool flush testks testcf
echo "select * from testks.testcf;" | cqlsh

$ sstabledump ma-1-big-Data.db
[
  {
    "partition" : {
      "key" : [ "1" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 18,
        "clustering" : [ "c1" ],
        "liveness_info" : { "tstamp" : 1461634633566479 },
        "cells" : [
          { "name" : "val0_int", "value" : "300" },
          { "name" : "val1_int", "value" : "400" }
        ]
      }
    ]
  }
]
{noformat}
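
To contrast the two dumps mechanically, here is a rough sketch that labels each row by which sections sstabledump printed for it. The helper classify_rows and its label strings are hypothetical, made up for illustration; the two inputs are the rows from the dumps in this description:

```python
import json


def classify_rows(dump_json):
    """Label each row of sstabledump JSON by the sections that were printed."""
    labels = []
    for partition in json.loads(dump_json):
        for row in partition.get("rows", []):
            has_cells = bool(row.get("cells"))
            has_row_deletion = "deletion_info" in row
            if has_row_deletion and not has_cells:
                label = "row deletion, no cells printed"
            elif has_row_deletion:
                label = "row deletion and cells printed"
            elif has_cells:
                label = "cells printed"
            else:
                label = "nothing printed"
            labels.append((row.get("clustering"), label))
    return labels


# Row from the first dump (explicit row delete followed by a re-insert).
ROW_TOMBSTONE_DUMP = """
[ { "partition" : { "key" : [ "1" ], "position" : 0 },
    "rows" : [ { "type" : "row", "position" : 18, "clustering" : [ "c1" ],
                 "liveness_info" : { "tstamp" : 1461633248542342 },
                 "deletion_info" : { "deletion_time" : 1461633248212499,
                                     "tstamp" : 1461633248 } } ] } ]
"""

# Row from the second dump (cell deletes followed by a re-insert).
CELL_TOMBSTONE_DUMP = """
[ { "partition" : { "key" : [ "1" ], "position" : 0 },
    "rows" : [ { "type" : "row", "position" : 18, "clustering" : [ "c1" ],
                 "liveness_info" : { "tstamp" : 1461634633566479 },
                 "cells" : [ { "name" : "val0_int", "value" : "300" },
                             { "name" : "val1_int", "value" : "400" } ] } ] } ]
"""

if __name__ == "__main__":
    print(classify_rows(ROW_TOMBSTONE_DUMP))
    print(classify_rows(CELL_TOMBSTONE_DUMP))
```

Only the first dump ends up in the "row deletion, no cells printed" bucket, which is the case this ticket is about.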


> sstabledump is not able to properly print out SSTable that may contain 
> historical (but "shadowed") row tombstone
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11654
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11654
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Wei Deng
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
