Look inside a secondary index

Tobias Eriksson Sun, 16 Aug 2020 13:52:44 -0700

Hi,
I am curious about the internals of a secondary index, and in particular
how its data is stored.
This article from 2016 (is it still valid?)
https://www.datastax.com/blog/2016/04/cassandra-native-secondary-index-deep-dive
indicates that a secondary index is really just a normal table, only hidden.
The index table uses the indexed column as its partition key, and the base
table's partition key and clustering keys as its clustering keys.
A read then simply does a reverse lookup.
Original table: customer

CREATE TABLE customer (
    id int PRIMARY KEY,
    city text,
    name text
);

The index on “city” would thus conceptually be a table like this:

CREATE INDEX customer_city_idx ON customer (city);

-- conceptual shape of the hidden index table, as I understand it
CREATE TABLE customer_city_idx (
    city text,
    id int,
    PRIMARY KEY (city, id)
);
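
To make the mental model concrete, here is a toy Python sketch of the reverse lookup as I understand it from the article. It is not Cassandra's implementation; the function names and the sample rows are made up for illustration, and sorting the ids is only there to give the toy a deterministic result.

```python
from collections import defaultdict

# Toy base table, keyed by its primary key (id). Sample rows are made up.
customer = {
    1: {"city": "Jonkoping", "name": "Anna"},
    2: {"city": "Stockholm", "name": "Bjorn"},
    3: {"city": "Jonkoping", "name": "Cecilia"},
}


def build_city_index(table):
    """Map each indexed value (city) to the set of base-table keys holding it."""
    index = defaultdict(set)
    for pk, row in table.items():
        index[row["city"]].add(pk)
    return index


def find_by_city(table, index, city):
    """Reverse lookup: indexed value -> primary keys -> full rows."""
    return [{"id": pk, **table[pk]} for pk in sorted(index.get(city, ()))]


# Hypothetical stand-in for the hidden index table.
customer_city_idx = build_city_index(customer)
rows = find_by_city(customer, customer_city_idx, "Jonkoping")
```

The point of the sketch is only that the index partition ("Jonkoping") holds the keys of the matching base rows, and the read goes back to the base table with those keys.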

So I decided to investigate further and ran
sstabledump md-1-big-Data.db
where md-1-big-Data.db is the data file inside the index directory.

But now I get errors (see below), which surprised me because I thought it
was a traditional table.
Is there any other way to dump the content of the index (and its internal
structure)?

-Tobias





root@f772b8c66fe3:/var/lib/cassandra/data/cim/customer-cea10680764211ea8502f58bd8d3766a/.customer_city_idx# /opt/cassandra/tools/bin/sstabledump md-1-big-Data.db
WARN  20:30:48,790 Only 44.417GiB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
[
  {
    "partition" : {
      "key" : [ "Jonkoping" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 23,
        "clustering" : [ ] } ] } ]Exception in thread "main" java.lang.UnsupportedOperationException
       at org.apache.cassandra.db.marshal.PartitionerDefinedOrder.toJSONString(PartitionerDefinedOrder.java:88)
       at org.apache.cassandra.tools.JsonTransformer.serializeClustering(JsonTransformer.java:349)
       at org.apache.cassandra.tools.JsonTransformer.serializeRow(JsonTransformer.java:246)
       at org.apache.cassandra.tools.JsonTransformer.serializePartition(JsonTransformer.java:213)
       at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
       at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
       at java.util.Iterator.forEachRemaining(Iterator.java:116)
       at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
       at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
       at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
       at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
       at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
       at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
       at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
       at org.apache.cassandra.tools.JsonTransformer.toJson(JsonTransformer.java:101)
       at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:240)
