[ 
https://issues.apache.org/jira/browse/CASSANDRA-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260204#comment-14260204
 ] 

Aleksey Yeschenko commented on CASSANDRA-8543:
----------------------------------------------

Use native protocol batching with separately prepared inserts - but make sure 
that you only batch columns/rows with the same partition key.
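
A minimal sketch of what such a batch looks like at the CQL level (the 
`metrics` table and its columns are assumed here purely for illustration; in 
practice you would prepare the single-row INSERT once in your driver and add 
bound instances of it to one batch):

```sql
-- Hypothetical time-series table, shown only to make the example concrete.
CREATE TABLE metrics (
    sensor_id text,
    ts        timestamp,
    value     int,
    PRIMARY KEY (sensor_id, ts)
);

-- Every row in the batch shares the partition key 'sensor-1', so the
-- whole batch lands on one replica set and the coordinator does not
-- have to fan writes out across the cluster.
BEGIN UNLOGGED BATCH
    INSERT INTO metrics (sensor_id, ts, value)
        VALUES ('sensor-1', '2014-12-29 10:00:00', 1);
    INSERT INTO metrics (sensor_id, ts, value)
        VALUES ('sensor-1', '2014-12-29 10:00:01', 2);
APPLY BATCH;
```

An UNLOGGED batch is the right fit here: since all rows are in one partition, 
the batch log's atomicity guarantee buys nothing and only adds overhead.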

Use DateTieredCompactionStrategy 
(https://labs.spotify.com/2014/12/18/date-tiered-compaction/).
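
Switching an existing table to DTCS is a one-line schema change; the table 
name and the option values below are assumptions chosen for a time-series 
workload, not recommendations from this ticket:

```sql
-- Group SSTables by write time so old, immutable data stops being
-- recompacted; tune the windows to your ingest rate and retention.
ALTER TABLE metrics
  WITH compaction = {
    'class': 'DateTieredCompactionStrategy',
    'base_time_seconds': '3600',
    'max_sstable_age_days': '365'
  };
```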

And, more importantly, don't try to optimize before you actually need it.

In any case, CASSANDRA-6412 is very unlikely to make it into Cassandra until 
3.1 or 3.2, if at all, so any wins that you could get from your blob-packing 
will be negated by the need to do a read before write.

You also lose convenient querying at granularities finer than your 1024-value 
blobs, and the ability to reuse the 3.0 aggregate functions on your values. It 
also complicates MR/Spark jobs and costs you the use of some of their 
pre-defined methods.

> Allow custom code to control behavior of reading and compaction
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-8543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8543
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Pavol Slamka
>            Priority: Minor
>
> When storing series data in blob objects for speed, it is sometimes 
> necessary to change only a few values of a single blob (say, a few integers 
> out of 1024). Right now one could rewrite these using compare-and-set with 
> versioning: read the blob and its version, change a few values, write the 
> whole updated blob with an incremented version if the version did not 
> change, and repeat the whole process otherwise (an optimistic approach). 
> However, compare-and-set brings some overhead. Let's try to leave out 
> compare-and-set: instead of reading and updating, let's write only a 
> "blank" blob with only a few values set. A blank blob contains special 
> blank placeholder data such as NULL, the max value of int, or similar. 
> Since this write in fact only appends a new SSTable record, we have not 
> overwritten the old data yet. That happens during read or compaction. But 
> if we provided a custom read and a custom compaction, which would not 
> replace the blob with the new "sparse blank" blob, but would instead 
> replace values in the first blob (first SSTable record) with only the 
> "non-blank" values from the second blob (second SSTable record), we would 
> achieve a fast partial blob update, without compare-and-set, on a 
> last-write-wins basis. Is such an approach feasible? Would it be possible 
> to customize Cassandra so that custom code for compaction and data reading 
> could be provided for a column (blob)? 
> There may be other, better solutions, but speed-wise this seems best to me. 
> Sorry for any mistakes, I am new to Cassandra.
> Thanks.
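
For reference, the optimistic read-modify-write cycle the description starts 
from maps onto CQL lightweight transactions like this (the `blobs` table, 
column names, and values are assumed for illustration):

```sql
-- Step 1: read the current blob and its version.
SELECT payload, version FROM blobs WHERE id = 'series-1';

-- Step 2: write back the modified blob only if no concurrent writer
-- bumped the version in the meantime; if the conditional update reports
-- [applied] = false, go back to step 1 and retry.
UPDATE blobs
   SET payload = 0x0102, version = 43
 WHERE id = 'series-1'
    IF version = 42;
```

The `IF version = 42` clause is what triggers the Paxos round - and hence the 
extra round trips - that this proposal is trying to avoid.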



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
