Re: Pointers on writing your own Compaction Strategy

2014-09-07 Thread Tupshin Harper
In addition to what Marcus said, take a look at the latest patch in
https://issues.apache.org/jira/browse/CASSANDRA-6602 for a relevant
example.

-Tupshin
On Sep 4, 2014 2:28 PM, Marcus Eriksson krum...@gmail.com wrote:

 1. Create a class that extends AbstractCompactionStrategy (I would keep it
 in-tree while developing to avoid classpath issues, etc.)
 2. Implement the abstract methods
- getNextBackgroundTask - called when cassandra wants to do a new minor
 (background) compaction - return a CompactionTask with the sstables you
 want compacted
- getMaximalTask - called when a user triggers a major compaction
- getUserDefinedTask - called when a user triggers a user-defined compaction
 from JMX
- getEstimatedRemainingTasks - return the guessed number of tasks before
 we are done
- getMaxSSTableBytes - if your compaction strategy puts a limit on the
 size of sstables
 3. Execute this in cqlsh to enable your compaction strategy: ALTER TABLE
 foo WITH compaction = { 'class': 'Bar' };
 4. Things to think about:
 - make sure you mark sstables as compacting before returning them from
 the compaction strategy (and check the return value!):

 https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java#L271
 - if you do this on 2.1, don't mix repaired and unrepaired sstables
 (see SSTableReader#isRepaired)

 Let me know if you need any more information

 /Marcus
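
The two cautions in point 4 above can be sketched as follows. This is a
minimal, self-contained illustration and not Cassandra code: SketchSSTable,
SketchTracker, SketchStrategy and the minThreshold parameter are all
hypothetical stand-ins invented for this example. A real strategy would
extend AbstractCompactionStrategy and return a CompactionTask from
getNextBackgroundTask. The sketch only shows the shape of the logic: pick
candidates from a single repair state, mark them as compacting, and hand
them out only if the marking succeeded.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an sstable; NOT Cassandra's SSTableReader.
final class SketchSSTable {
    final String name;
    final long sizeBytes;
    final boolean repaired;

    SketchSSTable(String name, long sizeBytes, boolean repaired) {
        this.name = name;
        this.sizeBytes = sizeBytes;
        this.repaired = repaired;
    }
}

// Hypothetical stand-in for whatever tracks in-progress compactions.
final class SketchTracker {
    private final Set<SketchSSTable> compacting = new HashSet<>();

    // All-or-nothing: returns false and marks nothing if any candidate
    // is already part of another compaction.
    synchronized boolean markCompacting(List<SketchSSTable> candidates) {
        for (SketchSSTable s : candidates) {
            if (compacting.contains(s)) {
                return false;
            }
        }
        compacting.addAll(candidates);
        return true;
    }

    synchronized void unmarkCompacting(List<SketchSSTable> finished) {
        compacting.removeAll(finished);
    }
}

// Hypothetical strategy skeleton (assumption: compact the smallest
// minThreshold sstables of one repair state at a time).
final class SketchStrategy {
    private final SketchTracker tracker;
    private final int minThreshold;

    SketchStrategy(SketchTracker tracker, int minThreshold) {
        this.tracker = tracker;
        this.minThreshold = minThreshold;
    }

    // Returns the sstables to compact, or an empty list if there is nothing to do.
    List<SketchSSTable> nextBackgroundCandidates(List<SketchSSTable> live) {
        // Handle each repair state separately so the two are never mixed.
        for (boolean repaired : new boolean[] {false, true}) {
            List<SketchSSTable> group = new ArrayList<>();
            for (SketchSSTable s : live) {
                if (s.repaired == repaired) {
                    group.add(s);
                }
            }
            if (group.size() < minThreshold) {
                continue;
            }
            group.sort(Comparator.comparingLong((SketchSSTable s) -> s.sizeBytes));
            List<SketchSSTable> candidates = new ArrayList<>(group.subList(0, minThreshold));
            // Check the return value: only hand out sstables we managed to mark.
            if (tracker.markCompacting(candidates)) {
                return candidates;
            }
        }
        return Collections.emptyList();
    }
}
```

In this sketch a failed marking simply moves on to the other repair state.
A real strategy would more likely recompute its candidates from the
sstables that are not already compacting and try again.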



 On Thu, Sep 4, 2014 at 6:50 PM, Ghosh, Mainak mgho...@illinois.edu
 wrote:

  Hello,
 
  I am planning to write a new compaction strategy, and I was hoping
  someone could point me to the relevant functions and how they relate in
  the call hierarchy.
 
  Thanks for the help.
 
  Regards,
  Mainak.
 



Re: Node side processing

2014-02-27 Thread Tupshin Harper
Hi David,

Check out the ongoing discussion in
https://issues.apache.org/jira/browse/CASSANDRA-6704 as well as some
related tickets linked to from that one.

No consensus at this point, but I'm personally hoping to see something
along the general lines of Hive's UDFs.

-Tupshin


On Thu, Feb 27, 2014 at 8:50 AM, David Semeria da...@lmframework.com wrote:

 Hi List,

 I was wondering whether there have been any past proposals for
 implementing node side processing (NSP) in C*. By NSP, I mean passing a
 reference to a Java class which would then process the result set before it
 is returned to the client.

 In our particular use case our clients typically loop through result sets
 of a million or more rows to produce a tiny amount of output (sums, means,
 variance, etc). The bottleneck -- quite obviously -- is the need to
 transfer a million rows to the client before processing can take place. It
 would be extremely useful to execute this processing on the coordinator
 node and only transfer the results to the client.

 I mention this here because I can imagine other C* users having similar
 requirements.

 Thanks

 D.
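
To make the use case above concrete, here is a minimal sketch of the kind
of reduction described: a one-pass accumulator that turns an arbitrarily
long stream of values into a handful of numbers (count, sum, mean,
variance). The class name RowStatsSketch is hypothetical and nothing here
is a Cassandra API. It uses Welford's online algorithm for the variance,
which is my choice for the illustration and not something from the thread.
The point is only that the output stays tiny however many rows are fed in,
which is why running it next to the data would save the transfer.

```java
// Hypothetical one-pass accumulator; not part of Cassandra or any driver.
final class RowStatsSketch {
    private long count;
    private double sum;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    // Fold one value into the running statistics (Welford's update).
    void add(double x) {
        count++;
        sum += x;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    long count() {
        return count;
    }

    double sum() {
        return sum;
    }

    // Mean of the values seen so far; NaN if no values were added.
    double mean() {
        return count > 0 ? mean : Double.NaN;
    }

    // Population variance of the values seen so far; NaN if no values were added.
    double variance() {
        return count > 0 ? m2 / count : Double.NaN;
    }
}
```

A caller would invoke add once per row value and read the four results at
the end. No rows need to be buffered along the way.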



Re: network compatibility from 0.6 to 0.7

2010-07-22 Thread Tupshin Harper
As long as network compatibility is in place, it is possible to
incrementally upgrade a cluster by restricting thrift clients to only talk
to the 0.6 nodes until half the cluster is upgraded and then modify them to
talk to the 0.7 nodes. If networking compatibility breaks, there is no way
to avoid downtime or even test 0.7 under production load.

On Jul 22, 2010 9:50 AM, Jonathan Ellis jbel...@gmail.com wrote:

How useful is this to insist on, given that the 0.7 thrift api is fairly
incompatible with 0.6's?  (timestamp -> Clock change being the biggest
problem there)

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com