I'm not sure why sstable2json doesn't work for collections, but if you're reading raw SSTables, we have had good success with the following code:
https://github.com/coursera/aegisthus/blob/77c73f6259f2a30d3d8ca64578be5c13ecc4e6f4/aegisthus-hadoop/src/main/java/org/coursera/mapreducer/CQLMapper.java#L85

Thanks,
Daniel

On Mon, Jun 8, 2015 at 1:22 PM, java8964 <java8...@hotmail.com> wrote:

> Hi, Cassandra users:
>
> I have a question related to how to Deserialize the new collection types
> data in the Cassandra 2.x. (The exactly version is C 2.0.10).
>
> I create the following example tables in the CQLSH:
>
> CREATE TABLE coupon (
>     account_id bigint,
>     campaign_id uuid,
>     ........................,
>     discount_info map<text, text>,
>     ........................,
>     PRIMARY KEY (account_id, campaign_id)
> )
>
> The other columns can be ignored in this case. Then I inserted into the
> one test data like this:
>
> insert into coupon (account_id, campaign_id, discount_info) values
> (111,uuid(), {'test_key':'test_value'});
>
> After this, I got the SSTable files. I use the sstable2json file to check
> the output:
>
> $./resources/cassandra/bin/sstable2json /xxx/test-coupon-jb-1-Data.db
> [
> {"key": "000000000000006f","columns":
> [["0336e50d-21aa-4b3a-9f01-989a8c540e54:","",1433792922055000],
> ["0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info","0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info:!",1433792922054999,"t",1433792922],
> ["0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info:746573745f6b6579","746573745f76616c7565",1433792922055000]]}
> ]
>
> What I want to is to get the {"test_key" : "test_value"} as key/value pair
> that I input into "discount_info" column. I followed the sstable2json code,
> and try to deserialize the data by myself, but to my surprise, I cannot
> make it work, even I tried several ways, but kept getting Exception.
>
> From what I researched, I know that Cassandra put the "campaign_id" +
> "discount_info" + "Another ByteBuffer" as composite column in this case.
> When I deserialize this columnName, I got the following dumped out as
> String:
>
> "0336e50d-21aa-4b3a-9f01-989a8c540e54:discount_info:746573745f6b6579".
>
> It includes 3 parts: the first part is the uuid for the campaign_id. The
> 2nd part as "discount_info", which is the static name I defined in the
> table. The 3 part is a bytes array as length of 46, which I am not sure
> what it is.
>
> The corresponding value part of this composite column is another byte
> array as length of 10, hex as "746573745f76616c7565" if I dump it out.
>
> Now, here is what I did and not sure why it doesn't work.
> First, I assume the value part stores the real value I put in the Map, so
> I did the following:
>
> ByteBuffer value = ByteBufferUtil.clone(column.value());
>
> MapType<String, String> result = MapType.getInstance(UTF8Type.instance,
> UTF8Type.instance);
> Map<String, String> output = result.compose(value);
>
> // it gave me the following exception:
> org.apache.cassandra.serializers.MarshalException: Not enough bytes to read a
> map
>
> Then I am think that the real value must be stored as part of the column
> names (the 3rd part of 46 bytes), so I did this:
>
> MapType<String, String> result = MapType.getInstance(UTF8Type.instance,
> UTF8Type.instance);
> Map<String, String> output = result.compose(third_part.value);
>
> // I got the following exception:
>
> java.lang.IllegalArgumentException
>     at java.nio.Buffer.limit(Buffer.java:267)
>     at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:587)
>     at org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
>     at org.apache.cassandra.serializers.MapSerializer.deserialize(MapSerializer.java:63)
>     at org.apache.cassandra.serializers.MapSerializer.deserialize(MapSerializer.java:28)
>     at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:142)
>
> I can get all other non-collection types data, but I cannot get the data from
> the Map. My questions are:
>
> 1) How does the Cassandra store the collection data in the SSTable files?
> From the length of bytes, it is most likely as part of the composite column.
> If so, why I got the exception as above?
>
> 2) The sstable2json doesn't deserialize the real data out from the collection
> type. So I don't have an example to follow. Do I use the wrong way trying to
> compose the Map type data?
>
> Thanks
>
> Yong
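
On question 1, judging from the sstable2json output quoted above, each map entry appears to be stored as its own cell: the last component of the composite cell name is the map key, and the cell value is the map value. That would explain both exceptions, since MapType.compose expects a whole serialized map rather than a single key or value. The sketch below illustrates this with plain Java and no Cassandra classes. It assumes the usual composite layout of [2-byte length][bytes][1-byte end-of-component] per component; the class and helper names (MapCellSketch, hexToBytes, splitComposite, utf8) are made up for illustration. Under that layout, 46 bytes would match the length of the entire composite name (19 + 16 + 11), not of its third part.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Cassandra or aegisthus.
class MapCellSketch {

    /** Decodes a hex string (as printed by sstable2json) into raw bytes. */
    public static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /**
     * Splits a composite cell name into its components, assuming each one is
     * laid out as [2-byte length][bytes][1-byte end-of-component].
     */
    public static List<byte[]> splitComposite(ByteBuffer name) {
        ByteBuffer bb = name.duplicate(); // leave the caller's position untouched
        List<byte[]> parts = new ArrayList<>();
        while (bb.remaining() > 0) {
            int len = bb.getShort() & 0xFFFF; // unsigned 16-bit length
            byte[] part = new byte[len];
            bb.get(part);
            bb.get(); // skip the end-of-component byte
            parts.add(part);
        }
        return parts;
    }

    /** Interprets raw bytes as UTF-8 text (the map<text, text> case). */
    public static String utf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hex strings copied from the sstable2json output in the quoted message.
        String key = utf8(hexToBytes("746573745f6b6579"));       // last name component
        String value = utf8(hexToBytes("746573745f76616c7565")); // cell value
        System.out.println(key + " -> " + value); // prints "test_key -> test_value"
    }
}
```

With the Cassandra classes on the classpath, the equivalent would presumably be to call UTF8Type.instance.compose(...) on the third name component and on column.value() separately, and to collect the pairs into a Map yourself, one cell per entry.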