Re: How to retrieve snappy compressed data from Cassandra using Datastax?

2014-01-28 Thread Alex Popescu
Wouldn't you be better off delegating the compression to Cassandra (which
supports Snappy [1])? That way the compression would be completely
transparent to your application.

[1] http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression
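For reference, a minimal sketch of what that could look like with the Java driver. The testingks keyspace, sample_table name and columns are taken from your mail, but the column types, the schema and the 1.x driver calls are assumptions; the table-level sstable_compression option is what makes Cassandra Snappy-compress the data on disk:

import java.nio.ByteBuffer;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class ServerSideSnappySketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("testingks");

        // Hypothetical schema: Cassandra compresses the SSTables with Snappy,
        // so the application only ever deals with the raw bytes.
        session.execute("CREATE TABLE sample_table ("
                + " test_id text, name text, value blob,"
                + " PRIMARY KEY (test_id, name))"
                + " WITH compression = {'sstable_compression': 'SnappyCompressor'}");

        byte[] bytesToStore = "Byte Array Test For Big Endian".getBytes();

        // Write the uncompressed payload as a blob; no Snappy.compress() needed.
        PreparedStatement insert = session.prepare(
                "INSERT INTO sample_table (test_id, name, value) VALUES (?, ?, ?)");
        session.execute(insert.bind("0123", "e1", ByteBuffer.wrap(bytesToStore)));

        // Read it back; again, no application-level decompression.
        Row row = session.execute(
                "SELECT value FROM sample_table WHERE test_id = '0123'").one();
        ByteBuffer value = row.getBytes("value");
        System.out.println("read " + value.remaining() + " bytes");

        cluster.shutdown();
    }
}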


On Tue, Jan 28, 2014 at 8:51 PM, Check Peck  wrote:

> I am working on a project in which I am supposed to store snappy-compressed
> data in Cassandra, so that when I retrieve that data from Cassandra it is
> still snappy compressed in memory, and I then decompress it using snappy to
> get the actual data.
>
> I have a byte array in the `bytesToStore` variable; I snappy-compress it
> using Google `Snappy` and store it in Cassandra -
>
> // .. some code here
> System.out.println(bytesToStore);
>
> byte[] compressed = Snappy.compress(bytesToStore);
>
> attributesMap.put("e1", compressed);
>
> ICassandraClient client = CassandraFactory.getInstance().getDao();
> // write to Cassandra
> client.upsertAttributes("0123", attributesMap, "sample_table");
>
> After inserting the data into Cassandra, I went back into cqlsh, queried it,
> and I can see this data in my table for test_id `0123` -
>
> cqlsh:testingks> select * from sample_table where test_id = '0123';
>
>  test_id | name | value
> ---------+------+----------------------------------------------------------------------------------------
>     0123 |   e1 | 0x2cac7fff012c4ebb9555001e42797465204172726179205465737420466f722042696720456e6469616e
>
>
> Now I am trying to read the same data back from Cassandra, and every time it
> gives me an `IllegalArgumentException` -
>
> public Map<String, byte[]> getDataFromCassandra(final String rowKey,
>         final Collection<String> attributeNames) {
>
>     Map<String, byte[]> dataFromCassandra = new ConcurrentHashMap<String, byte[]>();
>
>     try {
>         String query = "SELECT test_id, name, value from sample_table where test_id = '" + rowKey + "';";
>         // SELECT test_id, name, value from sample_table where test_id = '0123';
>         System.out.println(query);
>
>         DatastaxConnection.getInstance();
>
>         ResultSet result = DatastaxConnection.getSession().execute(query);
>
>         Iterator<Row> it = result.iterator();
>
>         while (it.hasNext()) {
>             Row r = it.next();
>             for (String str : attributeNames) {
>                 ByteBuffer bb = r.getBytes(str); // this line is throwing an exception for me
>                 byte[] ba = new byte[bb.remaining()];
>                 bb.get(ba, 0, ba.length);
>                 dataFromCassandra.put(str, ba);
>             }
>         }
>     } catch (Exception e) {
>         e.printStackTrace();
>     }
>
>     return dataFromCassandra;
> }
>
> This is the Exception I am getting -
>
> java.lang.IllegalArgumentException: e1 is not a column defined in this
> metadata
>
> In the above method, I am passing rowKey as `0123` and `attributeNames`
> contains `e1` as the string.
>
> I am expecting snappy-compressed data in the `dataFromCassandra` map. In this
> map the key should be `e1` and the value should be the snappy-compressed data,
> if I am not wrong. I will then iterate over this map and snappy-decompress
> the data.
>
> I am using the Datastax Java client with Cassandra 1.2.9.
>
> Any thoughts on what I am doing wrong here?
>



-- 

:- a)


Alex Popescu
Sen. Product Manager @ DataStax
@al3xandru


Re: Issues with seeding on EC2 for C* 2.0.4 - help needed

2014-01-28 Thread Kumar Ranjan
Hi Michael - Yes, 7000, 7001, 9042 and 9160 are all open on EC2.

The issue was that seeds was set to 127.0.0.1 while listen_address was set to the private IP.

This will help anyone who hits the same problem:

http://stackoverflow.com/questions/20690987/apache-cassandra-unable-to-gossip-with-any-seeds


On Wed, Jan 29, 2014 at 1:12 AM, Michael Shuler wrote:

> Did you open up the ports so they can talk to each other?
>
> http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/install/installAMISecurityGroup.html
>
> --
> Michael
>


Re: Issues with seeding on EC2 for C* 2.0.4 - help needed

2014-01-28 Thread Michael Shuler

Did you open up the ports so they can talk to each other?

http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/install/installAMISecurityGroup.html

--
Michael


Re: OpenJDK is not recommended? Why

2014-01-28 Thread Kumar Ranjan
Yes, I got rid of OpenJDK and installed the Oracle version, and the warning went away.
Happy happy... Thank you folks.


On Tue, Jan 28, 2014 at 11:59 PM, Michael Shuler wrote:

> On 01/28/2014 09:55 PM, Kumar Ranjan wrote:
>
>> I am in the process of setting up a 2 node cluster with C* version 2.0.4. When I
>> started each node, they failed to communicate; thus each is running
>> separately and not in the same ring. So I started looking at the log files and
>> saw the message below:
>>
>
> This is probably just a configuration issue and not likely to be the fault
> of OpenJDK.  OpenJDK is ok for testing the waters and light dev work; it is
> the reference implementation for Oracle Java SE 7.
>
>
>  WARN [main] 2014-01-28 06:02:17,861 CassandraDaemon.java (line 155)
>> OpenJDK is not recommended. Please upgrade to the newest Oracle Java
>> release
>>
>> Is this message informational only or can it be a real issue?
>>
>
> Source of the above warning has some comments (attached, so they don't
> wrap so badly, I hope).
>
> https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob;f=src/java/org/apache/cassandra/service/CassandraDaemon.java;h=424dbfa58ec72ea812362e2b428d0c4534626307;hb=HEAD#l106
>
> --
> Kind regards,
> Michael
>
>


Issues with seeding on EC2 for C* 2.0.4 - help needed

2014-01-28 Thread Kumar Ranjan
Hey Folks - I am burning the midnight oil but can't figure out what I
am doing wrong. The log files have this. I have also listed partial
configurations for both the seed node and node 2.


 INFO [main] 2014-01-29 05:15:11,515 CommitLog.java (line 127) Log replay
complete, 46 replayed mutations

 INFO [main] 2014-01-29 05:15:12,734 StorageService.java (line 490)
Cassandra version: 2.0.4

 INFO [main] 2014-01-29 05:15:12,743 StorageService.java (line 491) Thrift
API version: 19.39.0

 INFO [main] 2014-01-29 05:15:12,755 StorageService.java (line 492) CQL
supported versions: 2.0.0,3.1.3 (default: 3.1.3)

 INFO [main] 2014-01-29 05:15:12,821 StorageService.java (line 515) Loading
persisted ring state

 INFO [main] 2014-01-29 05:15:12,864 MessagingService.java (line 458)
Starting Messaging Service on port 7000

ERROR [main] 2014-01-29 05:15:43,890 CassandraDaemon.java (line 478)
Exception encountered during startup

java.lang.RuntimeException: Unable to gossip with any seeds


Seed node 1:

(cassandra.yaml: I just have a 2 node cluster and this is the seed node)


seed_provider:

# Addresses of hosts that are deemed contact points.

# Cassandra nodes use this list of hosts to find each other and learn

# the topology of the ring.  You must change this if you are running

# multiple nodes!

- class_name: org.apache.cassandra.locator.SimpleSeedProvider

  parameters:

  # seeds is actually a comma-delimited list of addresses.

  # Ex: "<ip1>,<ip2>,<ip3>"

  - seeds: "127.0.0.1"

storage_port: 7000

ssl_storage_port: 7001

listen_address: 10.xxx.xxx.xxx ( Private IP of this node )

start_native_transport: true

native_transport_port: 9042

start_rpc: true

rpc_address: 0.0.0.0

rpc_port: 9160

rpc_keepalive: true

rpc_server_type: sync


Node 2:

seed_provider:

# Addresses of hosts that are deemed contact points.

# Cassandra nodes use this list of hosts to find each other and learn

# the topology of the ring.  You must change this if you are running

# multiple nodes!

- class_name: org.apache.cassandra.locator.SimpleSeedProvider

  parameters:

  # seeds is actually a comma-delimited list of addresses.

  # Ex: "<ip1>,<ip2>,<ip3>"

  - seeds: "10.xxx.xxx.xxx"   ---> private IP of seed node listed
above

storage_port: 7000

ssl_storage_port: 7001

listen_address: 10.xxx.xxx.xxx ---> private IP of this node

start_native_transport: true

native_transport_port: 9042

start_rpc: true

rpc_address: 0.0.0.0

rpc_port: 9160

rpc_keepalive: true

rpc_server_type: sync


Re: OpenJDK is not recommended? Why

2014-01-28 Thread Michael Shuler

On 01/28/2014 09:55 PM, Kumar Ranjan wrote:

I am in the process of setting up a 2 node cluster with C* version 2.0.4. When I
started each node, they failed to communicate; thus each is running
separately and not in the same ring. So I started looking at the log files and
saw the message below:


This is probably just a configuration issue and not likely to be the
fault of OpenJDK.  OpenJDK is ok for testing the waters and light dev
work; it is the reference implementation for Oracle Java SE 7.



WARN [main] 2014-01-28 06:02:17,861 CassandraDaemon.java (line 155)
OpenJDK is not recommended. Please upgrade to the newest Oracle Java release

Is this message informational only or can it be a real issue?


Source of the above warning has some comments (attached, so they don't 
wrap so badly, I hope).


https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob;f=src/java/org/apache/cassandra/service/CassandraDaemon.java;h=424dbfa58ec72ea812362e2b428d0c4534626307;hb=HEAD#l106

--
Kind regards,
Michael

(git blame excerpt: CassandraDaemon.java lines 106-121, commits dc0bc878 and d1e91a77 by Jonathan Ellis, 2013-03-07)

// log warnings for different kinds of sub-optimal JVMs.  tldr use 64-bit Oracle >= 1.6u32
if (!System.getProperty("os.arch").contains("64"))
    logger.info("32bit JVM detected.  It is recommended to run Cassandra on a 64bit JVM for better performance.");
String javaVersion = System.getProperty("java.version");
String javaVmName = System.getProperty("java.vm.name");
logger.info("JVM vendor/version: {}/{}", javaVmName, javaVersion);
if (javaVmName.contains("OpenJDK"))
{
    // There is essentially no QA done on OpenJDK builds, and
    // clusters running OpenJDK have seen many heap and load issues.
    logger.warn("OpenJDK is not recommended. Please upgrade to the newest Oracle Java release");
}
else if (!javaVmName.contains("HotSpot"))
{
    logger.warn("Non-Oracle JVM detected.  Some features, such as immediate unmap of compacted SSTables, may not work as intended");
}


How to retrieve snappy compressed data from Cassandra using Datastax?

2014-01-28 Thread Check Peck
I am working on a project in which I am supposed to store snappy-compressed
data in Cassandra, so that when I retrieve that data from Cassandra it is
still snappy compressed in memory, and I then decompress it using snappy to
get the actual data.

I have a byte array in the `bytesToStore` variable; I snappy-compress it
using Google `Snappy` and store it in Cassandra -

// .. some code here
System.out.println(bytesToStore);

byte[] compressed = Snappy.compress(bytesToStore);

attributesMap.put("e1", compressed);

ICassandraClient client = CassandraFactory.getInstance().getDao();
// write to Cassandra
client.upsertAttributes("0123", attributesMap, "sample_table");

After inserting the data into Cassandra, I went back into cqlsh, queried it,
and I can see this data in my table for test_id `0123` -

cqlsh:testingks> select * from sample_table where test_id = '0123';

 test_id | name | value
---------+------+----------------------------------------------------------------------------------------
    0123 |   e1 | 0x2cac7fff012c4ebb9555001e42797465204172726179205465737420466f722042696720456e6469616e


Now I am trying to read the same data back from Cassandra, and every time it
gives me an `IllegalArgumentException` -

public Map<String, byte[]> getDataFromCassandra(final String rowKey,
        final Collection<String> attributeNames) {

    Map<String, byte[]> dataFromCassandra = new ConcurrentHashMap<String, byte[]>();

    try {
        String query = "SELECT test_id, name, value from sample_table where test_id = '" + rowKey + "';";
        // SELECT test_id, name, value from sample_table where test_id = '0123';
        System.out.println(query);

        DatastaxConnection.getInstance();

        ResultSet result = DatastaxConnection.getSession().execute(query);

        Iterator<Row> it = result.iterator();

        while (it.hasNext()) {
            Row r = it.next();
            for (String str : attributeNames) {
                ByteBuffer bb = r.getBytes(str); // this line is throwing an exception for me
                byte[] ba = new byte[bb.remaining()];
                bb.get(ba, 0, ba.length);
                dataFromCassandra.put(str, ba);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }

    return dataFromCassandra;
}

This is the Exception I am getting -

java.lang.IllegalArgumentException: e1 is not a column defined in this
metadata

In the above method I am passing rowKey as `0123`, and `attributeNames`
contains `e1` as the string.

I am expecting snappy-compressed data in the `dataFromCassandra` map. In this
map the key should be `e1` and the value should be the snappy-compressed data,
if I am not wrong. I will then iterate over this map and snappy-decompress
the data.

I am using the Datastax Java client with Cassandra 1.2.9.

Any thoughts on what I am doing wrong here?
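The exception message suggests the driver is being asked for a column literally named e1, whereas in this table e1 is a value of the name column; the actual columns are test_id, name and value. A minimal sketch of the read path under that assumption (keying the map by the name column instead of asking for a column named "e1"), using the xerial snappy-java binding; the class and method names here are hypothetical, not the original helper:

import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import org.xerial.snappy.Snappy;

public class ReadSnappyValuesSketch {
    // Hypothetical helper: returns name -> decompressed bytes for one test_id.
    public static Map<String, byte[]> read(Session session, String rowKey) throws Exception {
        Map<String, byte[]> dataFromCassandra = new HashMap<String, byte[]>();
        ResultSet rs = session.execute(
                "SELECT name, value FROM sample_table WHERE test_id = '" + rowKey + "'");
        for (Row r : rs) {
            // Key the map by the 'name' column (e.g. "e1") and read the blob
            // from the 'value' column, instead of asking for a column named "e1".
            String name = r.getString("name");
            ByteBuffer bb = r.getBytes("value");
            byte[] compressed = new byte[bb.remaining()];
            bb.get(compressed);
            dataFromCassandra.put(name, Snappy.uncompress(compressed));
        }
        return dataFromCassandra;
    }
}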


Re: OpenJDK is not recommended? Why

2014-01-28 Thread Colin
OpenJDK has known issues and they will raise their ugly little heads from time
to time - I have experienced them myself.

To be safe, I would use the latest Oracle 7 release.

You may also be experiencing a configuration issue; make sure one node is
specified as the seed node and that the other node knows that address as well.
There's a good guide to configuration on the DataStax website.

--
Colin 
+1 320 221 9531

 

> On Jan 28, 2014, at 9:55 PM, Kumar Ranjan  wrote:
> 
> I am in the process of setting up a 2 node cluster with C* version 2.0.4. When I
> started each node, they failed to communicate; thus each is running separately
> and not in the same ring. So I started looking at the log files and saw the message
> below:
> 
> WARN [main] 2014-01-28 06:02:17,861 CassandraDaemon.java (line 155) OpenJDK 
> is not recommended. Please upgrade to the newest Oracle Java release
> 
> Is this message informational only or can it be a real issue? Is this why the two
> nodes are not in a ring?
> 
> -- Kumar 


Re: question about secondary index or not

2014-01-28 Thread Jimmy Lin
In my #2 example:
select * from people where company_id='xxx' and gender='male'

I already specify the first part of the primary key (the row key) in my WHERE
clause, so how does the secondary-indexed column gender='male' help
determine which rows to return? It is more like filtering a list of columns
from a row (which is exactly what I can do in the #1 example).
But then if I don't create the index first, the CQL statement runs into a
syntax error.




On Tue, Jan 28, 2014 at 11:37 AM, Mullen, Robert
wrote:

> I would do #2.   Take a look at this blog which talks about secondary
> indexes, cardinality, and what it means for cassandra.   Secondary indexes
> in cassandra are a different beast, so often old rules of thumb about
> indexes don't apply.   http://www.wentnet.com/blog/?p=77
>
>
> On Tue, Jan 28, 2014 at 10:41 AM, Edward Capriolo 
> wrote:
>
>> Generally, indexes on binary fields (true/false, male/female) are not
>> terribly effective.
>>
>>
>> On Tue, Jan 28, 2014 at 12:40 PM, Jimmy Lin wrote:
>>
>>> I have a simple column family like the following
>>>
>>> create table people(
>>> company_id text,
>>> employee_id text,
>>> gender text,
>>> primary key(company_id, employee_id)
>>> );
>>>
>>> if I want to find out all the "male" employee given a company id, I can
>>> do
>>>
>>> 1/
>>> select * from people where company_id='xxx'
>>> and loop through the result efficiently to pick the employee who has
>>> gender column value equal to "male"
>>>
>>> 2/
>>> add a seconday index
>>> create index gender_index on people(gender)
>>> select * from people where company_id='xxx' and gender='male'
>>>
>>>
>>> I thought #2 seemed more appropriate, but I also thought the secondary
>>> index only helps locate the primary row key. With the select clause
>>> in #2, is it more efficient than #1, where the application is responsible for
>>> looping through the result and filtering the right content?
>>>
>>> (
>>> It totally makes sense if I only need to find out all the male
>>> employees (and not within a company) by using
>>> select * from people where gender='male'
>>> )
>>>
>>> thanks
>>>
>>
>>
>


OpenJDK is not recommended? Why

2014-01-28 Thread Kumar Ranjan
I am in the process of setting up a 2 node cluster with C* version 2.0.4. When I
started each node, they failed to communicate; thus each is running separately
and not in the same ring. So I started looking at the log files and saw the
message below:

WARN [main] 2014-01-28 06:02:17,861 CassandraDaemon.java (line 155) OpenJDK
is not recommended. Please upgrade to the newest Oracle Java release

Is this message informational only or can it be a real issue? Is this why the
two nodes are not in a ring?

-- Kumar


Re: Heavy update dataset and compaction

2014-01-28 Thread Robert Wille
> 
> Perhaps a log structured database with immutable data files is not best suited
> for this use case?

Perhaps not, but I have other data structures I'm moving to Cassandra as
well. This is just the first. Cassandra has actually worked quite well for
this first step, in spite of it not being an optimal tool for this use case.
And, I have to say that records being modified a thousand times is an
extreme case. Most of my data is far less volatile (perhaps dozens of
times).

I greatly appreciate all the information contributed to this mailing list.
It's a great resource.

Robert





Re: resetting nodetool info exception count

2014-01-28 Thread Robert Coli
On Tue, Jan 28, 2014 at 2:16 PM, John Pyeatt wrote:

> Is there any way of resetting the value of a nodetool info Exceptions
> value manually?
>
> Is there a JMX call I can make?
>

Almost certainly not.

=Rob


Re: GC eden filled instantly (any size). Dropping messages.

2014-01-28 Thread Arya Goudarzi
Dimetrio,

Look at my last post. I showed you how to turn on all useful GC logging
flags. From there we can get information on why GC has long pauses. From
the changes you have made it seems you are changing things without knowing
the effect. Here are a few things to consider:

- Having a 9GB NewGen out of a 16GB heap is one recipe for disaster. I am
sure if you turn on GC logs, you will see lots of promotion failures. The
standard is for NewGen to be at most 1/4 of your heap, to allow for healthy GC
promotions;
- The jstat output suggests that the survivor spaces aren't utilized. This
is one sign of premature promotion. Consider
increasing MaxTenuringThreshold to a value higher than what it is. The
higher it is, the slower things get promoted out of Eden; but we should
really examine your GC logs before making this part of the resolution;
- If you are going with a 16GB heap, then reduce your NewGen to 1/4 of it;
- It seems you have lowered compaction so much that SSTables aren't
compacting fast enough; tpstats should tell you something about this if my
assumption is true;

I also agree with Jonathan about data model and access pattern issues. It
seems your queries are creating long rows with lots of tombstones. If you
are deleting lots of columns from a single row, writing more to it, and
then fetching lots of columns, you end up having to read a large row,
causing it to stay in the heap while being processed and causing long GCs. The GC
histograms inside the GC logs (after you enable them) should tell you what is
in the heap, either columns from slice queries or columns from compaction
(these two are usually the cases, based on my experience of tuning GC
pauses).

Hope this helps


On Mon, Jan 27, 2014 at 4:07 AM, Dimetrio  wrote:

> None of the advice helped me reduce the GC load
>
> I tried these:
>
> MAX_HEAP_SIZE from default(8GB) to 16G with HEAP_NEWSIZE from 400M to 9600M
> key cache on/off
> compacting memory size and other limits
> 15 c3.4xlarge nodes (adding 5 nodes to a 10 node cluster didn't help):
> and many other
>
> Reads ~5000 ops/s
> Writes ~ 5000 ops/s
> max batch is 50
> heavy reads and heavy writes (and heavy deletes)
> sometimes i have message:
> Read 1001 live and 2691
> Read 12 live and 2796
>
>
> sudo jstat -gcutil -h15 `sudo cat /var/run/cassandra/cassandra.pid` 250ms 0
>   S0 S1 E  O  P YGC YGCTFGCFGCT GCT
>  18.93   0.00   4.52  75.36  59.77225   30.11918   28.361   58.480
>   0.00  13.12   3.78  81.09  59.77226   30.19318   28.617   58.810
>   0.00  13.12  39.50  81.09  59.78226   30.19318   28.617   58.810
>   0.00  13.12  80.70  81.09  59.78226   30.19318   28.617   58.810
>  17.21   9.13   0.66  87.38  59.78228   30.23518   28.617   58.852
>   0.00  10.96  29.43  87.89  59.78228   30.32818   28.617   58.945
>   0.00  10.96  62.67  87.89  59.78228   30.32818   28.617   58.945
>   0.00  10.96  96.62  87.89  59.78228   30.32818   28.617   58.945
>   0.00  10.69  10.29  94.56  59.78230   30.46218   28.617   59.078
>   0.00  10.69  38.08  94.56  59.78230   30.46218   28.617   59.078
>   0.00  10.69  71.70  94.56  59.78230   30.46218   28.617   59.078
>  15.91   6.24   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>   S0 S1 E  O  P YGC YGCTFGCFGCT GCT
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>  15.91   8.02   0.03  99.96  59.78232   30.50618   28.617   59.123
>
>
> $ nodetool cfhistograms Social home_timeline
> Social/home_timeline histograms
> Offset  SSTables Write Latency  Read LatencyPartition Size
> Cell Count
>   (micros)  (micros)   (bytes)
> 1  10458 0  

Re: Help me on Cassandra Data Modelling

2014-01-28 Thread Thunder Stumpges
Hey Naresh,

Unfortunately I don't have any further advice. I keep feeling like you're
looking at a search problem instead of a lookup problem. Perhaps Cassandra
is not the right tool for your need in this case. Perhaps something with a
full-text index type feature would help.

Or perhaps someone more experienced than I could come up with another
design.

Good luck,
Thunder



On Tue, Jan 28, 2014 at 9:07 AM, Naresh Yadav  wrote:

> Please provide your inputs on my last email, if any.
>
>
>
> On Tue, Jan 28, 2014 at 7:18 AM, Naresh Yadav wrote:
>
>> Yes Thunder, you are right; I had simplified that by moving the *tags*
>> search (partial/exact)
>> into a separate column family, tagcombination, which will act as the index for all
>> searches based on tags, and my original metricresult table will store
>> tagcombinationid and time in columns; otherwise it was getting complicated and
>> I was not getting good results.
>>
>> Yes, I agree with you on duplicating the storage with the tagcombination
>> column family... if I have billions of real tagcombinations with 8 tags in
>> each, then I am duplicating 2^8 combinations for each one to support partial
>> matches for that tagcombination, which will make this a very heavy table... with
>> individual keys I will not be able to support search with a set of tags.
>> Please suggest an alternative solution.
>>
>> Also, one of my colleagues suggested a totally different approach, but I
>> am not able to map it onto Cassandra.
>> According to him, we store all possible tags as columns, and for each combination
>> we just mark 0s and 1s for whichever tags
>> appear in that combination... So the data (TC1 as India, Pencil AND TC2 as
>> India, Pen) would look like:
>>
>>         India   Pencil   Pen
>> TC1       1        1      0
>> TC2       1        0      1
>>
>> I am not able to design an optimal column family for this in Cassandra... if I
>> design it as-is, then for a search of India, Pen I will select the India and Pen
>> columns, but that will touch each and every row, because I am not able to
>> apply the criterion of matching 1s only... I believe there can be a better design
>> for this that makes use of this good thought.
>>
>> Please help me on this..
>>
>> Thanks
>> Naresh
>>
>>
>>
>> On Mon, Jan 27, 2014 at 11:30 PM, Thunder Stumpges <
>> thunder.stump...@gmail.com> wrote:
>>
>>> Hey Naresh,
>>>
>>> You asked a similar question a week or two ago. It looks like you have
>>> simplified your needs quite a bit. Were you able to adjust your
>>> requirements or separate the issue? You had a complicated time dimension
>>> before, as well as a single "query" for multiple AND cases on tags.
>>>
>>> 
 c)Give data for Metric=Sales AND Tag=U.S.A
O/P : 5 rows
 d)Give data for Metric=Sales AND Period=Jan-10 AND Tag=U.S.A AND Tag=Pen
O/P :1 row"
>>>
>>>
>>>
>>> I agree with Jonathan on the model for this simplified use case. However
>>> looking at how you are storing each partial tag combination as well as
>>> individual tags in the partitioning key, you will be severely duplicating
>>> your storage. You might want to just store individual keys in the
>>> partitioning key.
>>>
>>> Good luck,
>>> Thunder
>>>
>>>
>>>
>>>
>>> On Mon, Jan 27, 2014 at 8:48 AM, Naresh Yadav wrote:
>>>
 Thanks Jonathan for guiding me..i just want to confirm my understanding
 :

 create columnfamily tagcombinations {
  partialtags text,
  tagcombinationid text,
  tagcombinationtags set<text>,
  Primary Key((partialtags), tagcombinationid)
 }
 If I need to store TWO tagcombinations, TC1 as India, Pencil AND TC2 as
 India, Pen, then the data will be stored as:

 partialtags = India:        TC1 -> India,Pencil    TC2 -> India,Pen
 partialtags = Pencil:       TC1 -> India,Pencil
 partialtags = Pen:          TC2 -> India,Pen
 partialtags = India,Pencil: TC1 -> India,Pencil
 partialtags = India,Pen:    TC2 -> India,Pen

 I hope I have understood the idea properly; please confirm the design.

 Thanks
 Naresh


 On Mon, Jan 27, 2014 at 7:05 PM, Jonathan Lacefield <
 jlacefi...@datastax.com> wrote:

> Hello,
>
>   The trick with this data model is to get to partition based, and/or
> cluster based access pattern so C* returns results quickly.  In C* you 
> want
> to model your tables based on your query access patterns and remember that
> writes are cheap and fast in C*.
>
>   So, try something like the following:
>
>   1 Table with a Partition Key = Tag String
>  Tag String = "Tag" or "set of Tags"
>  Cluster based on tag combination (probably desc order)
>  This will allow you to select any combination that includes
> Tag or "set of Tags"
>  This will duplicate data as you will store 

resetting nodetool info exception count

2014-01-28 Thread John Pyeatt
Is there any way of resetting the value of a nodetool info Exceptions value
manually?

Is there a JMX call I can make?

-- 
John Pyeatt
Singlewire Software, LLC
www.singlewire.com
--
608.661.1184
john.pye...@singlewire.com


Re: question about secondary index or not

2014-01-28 Thread Mullen, Robert
I would do #2.   Take a look at this blog which talks about secondary
indexes, cardinality, and what it means for cassandra.   Secondary indexes
in cassandra are a different beast, so often old rules of thumb about
indexes don't apply.   http://www.wentnet.com/blog/?p=77


On Tue, Jan 28, 2014 at 10:41 AM, Edward Capriolo wrote:

> Generally, indexes on binary fields (true/false, male/female) are not terribly
> effective.
>
>
> On Tue, Jan 28, 2014 at 12:40 PM, Jimmy Lin  wrote:
>
>> I have a simple column family like the following
>>
>> create table people(
>> company_id text,
>> employee_id text,
>> gender text,
>> primary key(company_id, employee_id)
>> );
>>
>> if I want to find out all the "male" employee given a company id, I can do
>>
>> 1/
>> select * from people where company_id='xxx'
>> and loop through the result efficiently to pick the employee who has
>> gender column value equal to "male"
>>
>> 2/
>> add a seconday index
>> create index gender_index on people(gender)
>> select * from people where company_id='xxx' and gender='male'
>>
>>
>> I thought #2 seemed more appropriate, but I also thought the secondary
>> index only helps locate the primary row key. With the select clause
>> in #2, is it more efficient than #1, where the application is responsible for
>> looping through the result and filtering the right content?
>>
>> (
>> It totally makes sense if I only need to find out all the male
>> employees (and not within a company) by using
>> select * from people where gender='male'
>> )
>>
>> thanks
>>
>
>


Re: no more zookeeper?

2014-01-28 Thread S Ahmed
Sorry guys, I am confusing things with HBase.  But the JIRA issue Nate linked
sure looks interesting, thanks.


On Tue, Jan 28, 2014 at 12:25 PM, Edward Capriolo wrote:

> Some people had done some custom Cassandra-ZooKeeper integration back in
> the day. Triggers, for example; there is some reference to zk in the original
> Facebook code that was thrown over the wall. No official release has ever used
> zk directly, though people have suggested it.
>
>
> On Tue, Jan 28, 2014 at 12:08 PM, Andrey Ilinykh wrote:
>
>> Why would cassandra use zookeeper?
>>
>>
>> On Tue, Jan 28, 2014 at 7:18 AM, S Ahmed  wrote:
>>
>>> Does C* no longer use zookeeper?
>>>
>>> I don't see a reference to it in the
>>> https://github.com/apache/cassandra/blob/trunk/build.xml
>>>
>>> If not, what replaced it?
>>>
>>
>>
>


Re: A question to OutboundTcpConnection.expireMessages()

2014-01-28 Thread Robert Coli
On Mon, Jan 27, 2014 at 11:40 PM, Lu, Boying  wrote:

> When I read the codes of OutboundTcpConnection.expireMessages(), I found
> the following snippet in a loop:
>
>  if (qm.timestamp >= System.currentTimeMillis() -
> qm.message.getTimeout())
>
> *return*;
>
>
>
> My understanding is that this method is to remove all the expired messages
> from the backlog queue, so
>
> I think the '*return*' statement should be changed to "*continue*".
>
>
>
> Is that correct?
>

This question is probably better suited for the cassandra-dev mailing list
or #cassandra-dev IRC channel, or the Apache Cassandra JIRA.

=Rob


Re: Heavy update dataset and compaction

2014-01-28 Thread Robert Coli
On Tue, Jan 28, 2014 at 7:57 AM, Robert Wille  wrote:

> I have a dataset which is heavy on updates. The updates are actually
> performed by inserting new records and deleting the old ones the following
> day. Some records might be updated (replaced) a thousand times before they
> are finished.
>

Perhaps a log structured database with immutable data files is not best
suited for this use case?

Are you deleting rows or columns each day?

As I watch SSTables get created and compacted on my staging server (I
> haven't gone live with this yet), it appears that if I let the compactor do
> its default behavior, I'll probably end up consuming several times the
> amount of disk space as is actually required. I probably need to
> periodically trigger a major compaction if I want to avoid that. However,
> I've read that major compactions aren't really recommended. I'd like to get
> people's take on this. I'd also be interested in people's recommendations
> on compaction strategy and other compaction-related configuration settings.
>

This is getting to be a FAQ... but... briefly..

1) yes, you are correct about the amount of space waste. this is why most
people avoid write patterns with lots of overwrite.
2) the docs used to say something incoherent about major compactions, but
suffice it to say that running them regularly is often a viable solution.
they are the optimal way cassandra has available to it to merge data.
3) if you really have some problem related to your One Huge SSTable, you
can always use sstablesplit to split it into N smaller ones.
4) if you really don't want to run a major compaction, you can either use
Level compaction (which has its own caveats) or use checksstablegarbage [1]
and UserDefinedCompaction to strategically manually compact SSTables.

=Rob
[1] https://github.com/cloudian/support-tools#checksstablegarbage


question about secondary index or not

2014-01-28 Thread Jimmy Lin
I have a simple column family like the following

create table people(
company_id text,
employee_id text,
gender text,
primary key(company_id, employee_id)
);

if I want to find out all the "male" employee given a company id, I can do

1/
select * from people where company_id='xxx'
and loop through the result efficiently to pick the employee who has gender
column value equal to "male"

2/
add a seconday index
create index gender_index on people(gender)
select * from people where company_id='xxx' and gender='male'


I thought #2 seemed more appropriate, but I also thought the secondary index
only helps locate the primary row key. With the select clause in #2,
is it more efficient than #1, where the application is responsible for looping
through the result and filtering the right content?

(
It totally makes sense if I only need to find out all the male employees (and
not within a company) by using
select * from people where gender='male'
)

thanks
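For what it's worth, a minimal sketch of option 1 with the Java driver, filtering in the application; the class name and session handling here are assumptions, and the table is the people table defined above:

import java.util.ArrayList;
import java.util.List;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class MaleEmployeesSketch {
    // Option 1: fetch the whole company partition and filter client-side.
    public static List<String> maleEmployees(Session session, String companyId) {
        List<String> ids = new ArrayList<String>();
        ResultSet rs = session.execute(
                "SELECT employee_id, gender FROM people WHERE company_id = '" + companyId + "'");
        for (Row row : rs) {
            if ("male".equals(row.getString("gender")))
                ids.add(row.getString("employee_id"));
        }
        return ids;
    }
}

Option 2 is the same query with the gender index created and AND gender='male' appended, so the filtering happens server-side instead of in this loop.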


Re: question about secondary index or not

2014-01-28 Thread Edward Capriolo
Generally, indexes on binary fields (true/false, male/female) are not terribly
effective.


On Tue, Jan 28, 2014 at 12:40 PM, Jimmy Lin  wrote:

> I have a simple column family like the following
>
> create table people(
> company_id text,
> employee_id text,
> gender text,
> primary key(company_id, employee_id)
> );
>
> if I want to find out all the "male" employee given a company id, I can do
>
> 1/
> select * from people where company_id='xxx'
> and loop through the result efficiently to pick the employee who has
> gender column value equal to "male"
>
> 2/
> add a seconday index
> create index gender_index on people(gender)
> select * from people where company_id='xxx' and gender='male'
>
>
> I thought #2 seemed more appropriate, but I also thought the secondary index
> only helps locate the primary row key. With the select clause in #2,
> is it more efficient than #1, where the application is responsible for looping
> through the result and filtering the right content?
>
> (
> It totally makes sense if I only need to find out all the male employees (and
> not within a company) by using
> select * from people where gender='male'
> )
>
> thanks
>


Re: no more zookeeper?

2014-01-28 Thread Edward Capriolo
Some people had done some custom Cassandra-ZooKeeper integration back in the
day. Triggers, for example; there is some reference to zk in the original Facebook
code that was thrown over the wall. No official release has ever used zk directly,
though people have suggested it.


On Tue, Jan 28, 2014 at 12:08 PM, Andrey Ilinykh  wrote:

> Why would cassandra use zookeeper?
>
>
> On Tue, Jan 28, 2014 at 7:18 AM, S Ahmed  wrote:
>
>> Does C* no longer use zookeeper?
>>
>> I don't see a reference to it in the
>> https://github.com/apache/cassandra/blob/trunk/build.xml
>>
>> If not, what replaced it?
>>
>
>


Re: no more zookeeper?

2014-01-28 Thread Andrey Ilinykh
Why would cassandra use zookeeper?


On Tue, Jan 28, 2014 at 7:18 AM, S Ahmed  wrote:

> Does C* no longer use zookeeper?
>
> I don't see a reference to it in the
> https://github.com/apache/cassandra/blob/trunk/build.xml
>
> If not, what replaced it?
>


Re: Help me on Cassandra Data Modelling

2014-01-28 Thread Naresh Yadav
Please provide your inputs on my last email, if any.


On Tue, Jan 28, 2014 at 7:18 AM, Naresh Yadav  wrote:

> Yes Thunder, you are right; I had simplified that by moving the *tags*
> search (partial/exact)
> into a separate column family, tagcombination, which will act as the index for all
> searches based on tags, and my original metricresult table will store
> tagcombinationid and time in columns; otherwise it was getting complicated and
> I was not getting good results.
>
> Yes, I agree with you on duplicating the storage with the tagcombination
> column family... if I have billions of real tagcombinations with 8 tags in
> each, then I am duplicating 2^8 combinations for each one to support partial
> matches for that tagcombination, which will make this a very heavy table... with
> individual keys I will not be able to support search with a set of tags.
> Please suggest an alternative solution.
>
> Also, one of my colleagues suggested a totally different approach, but I
> am not able to map it onto Cassandra.
> According to him, we store all possible tags as columns, and for each combination
> we just mark 0s and 1s for whichever tags
> appear in that combination... So the data (TC1 as India, Pencil AND TC2 as
> India, Pen) would look like:
>
>         India   Pencil   Pen
> TC1       1        1      0
> TC2       1        0      1
>
> I am not able to design an optimal column family for this in Cassandra... if I
> design it as-is, then for a search of India, Pen I will select the India and Pen
> columns, but that will touch each and every row, because I am not able to
> apply the criterion of matching 1s only... I believe there can be a better design
> for this that makes use of this good thought.
>
> Please help me on this..
>
> Thanks
> Naresh
>
>
>
> On Mon, Jan 27, 2014 at 11:30 PM, Thunder Stumpges <
> thunder.stump...@gmail.com> wrote:
>
>> Hey Naresh,
>>
>> You asked a similar question a week or two ago. It looks like you have
>> simplified your needs quite a bit. Were you able to adjust your
>> requirements or separate the issue? You had a complicated time dimension
>> before, as well as a single "query" for multiple AND cases on tags.
>>
>> 
>>> c)Give data for Metric=Sales AND Tag=U.S.A
>>>O/P : 5 rows
>>> d)Give data for Metric=Sales AND Period=Jan-10 AND Tag=U.S.A AND Tag=Pen
>>>O/P :1 row"
>>
>>
>>
>> I agree with Jonathan on the model for this simplified use case. However
>> looking at how you are storing each partial tag combination as well as
>> individual tags in the partitioning key, you will be severely duplicating
>> your storage. You might want to just store individual keys in the
>> partitioning key.
>>
>> Good luck,
>> Thunder
>>
>>
>>
>>
>> On Mon, Jan 27, 2014 at 8:48 AM, Naresh Yadav wrote:
>>
>>> Thanks Jonathan for guiding me..i just want to confirm my understanding :
>>>
>>> create columnfamily tagcombinations {
>>>  partialtags text,
>>>  tagcombinationid text,
>>>  tagcombinationtags set<text>,
>>>  Primary Key((partialtags), tagcombinationid)
>>> }
>>> If I need to store TWO tagcombinations, TC1 as India, Pencil AND TC2 as
>>> India, Pen, then the data will be stored as:
>>>
>>> partialtags = India:        TC1 -> India,Pencil    TC2 -> India,Pen
>>> partialtags = Pencil:       TC1 -> India,Pencil
>>> partialtags = Pen:          TC2 -> India,Pen
>>> partialtags = India,Pencil: TC1 -> India,Pencil
>>> partialtags = India,Pen:    TC2 -> India,Pen
>>>
>>>
>>> I hope I have understood the idea properly; please confirm the design.
>>>
>>> Thanks
>>> Naresh
>>>
>>>
>>> On Mon, Jan 27, 2014 at 7:05 PM, Jonathan Lacefield <
>>> jlacefi...@datastax.com> wrote:
>>>
 Hello,

   The trick with this data model is to get to partition based, and/or
 cluster based access pattern so C* returns results quickly.  In C* you want
 to model your tables based on your query access patterns and remember that
 writes are cheap and fast in C*.

   So, try something like the following:

   1 Table with a Partition Key = Tag String
  Tag String = "Tag" or "set of Tags"
  Cluster based on tag combination (probably desc order)
  This will allow you to select any combination that includes
 Tag or "set of Tags"
  This will duplicate data as you will store 1 tag combination
 in every Tag partition, i.e. if a tag combination has 2 parts, then you
 will have 2 rows

   Hope this helps.

 Jonathan Lacefield
 Solutions Architect, DataStax
 (404) 822 3487
  



 


 On Mon, Jan 27, 2014 at 7:24 AM, Naresh Yadav wrote:

> Hi all,
>
> Urgently need help on modelling this usecase on Cassandra.
>
> I have concept of tags and

Re: Possible optimization: avoid creating tombstones for TTLed columns if updates to TTLs are disallowed

2014-01-28 Thread horschi
Hi Donald,

I was the reporter of the ticket you mentioned, so I kind of feel like I should
answer this :-)


 I presume the point is that GCable tombstones can still do work
> (preventing spurious writing from nodes that were down) but only until the
> data is flushed to disk.
>
I am not sure I understand this correctly. Could you rephrase that sentence?



> If the effective TTL exceeds gc_grace_seconds then the tombstone will be
> deleted anyway.
>
It's not even written (since CASSANDRA-4917). There is no delete on the
tombstone in that case.



>  It occurred to me that if you never update the TTL of a column, then
> there should be no need for tombstones at all:  any replicas will have the
> same TTL.  So there'd be no risk of missed deletes.  You wouldn't even need
> GCable tombstones
>
I think so too. There should be no need for a tombstone at all if the
following conditions are met:
- column was not deleted manually, but timed out by itself
- column was not updated in the last gc_grace days

If I am not mistaken, the second point would even be necessary for
CASSANDRA-4917 to be able to handle changing TTLs correctly: I think the
current implementation might break, if a column gets updated with a smaller
TTL, or to be more precise when  (old.creationdate + old.ttl) <
(new.creationdate + new.ttl) && new.ttl < gc_grace


Imho, for any further tombstone-optimization to work, compaction would have
to be smarter:
 I think it should be able to track max(old.creationdate + old.ttl ,
new.creationdate + new.ttl) when merging columns. I have no idea if that is
possible though.


>
> So, if - and it's a big if - a table disallowed updates to TTL, then you
> could really optimize deletion of TTLed columns: you could do away with
> tombstones entirely.   If a table allows updates to TTL then it's possible
> a different node will have the row without the TTL and the tombstone would
> be needed.
>
I am not sure I understand this. My "thrift" understanding of cassandra is
that you cannot update the TTL, you can just update an entire column. Also
each column has its own TTL. There is no TTL on the row.


cheers,
Christian


Re: no more zookeeper?

2014-01-28 Thread Nate McCall
AFAIK zookeeper was never in use. It was discussed once or twice over the
years, but never seriously.

If you are talking about the origins of the current lightweight
transactions in 2.0, take a look at this issue (warning - it's one of the
longer ASF jira issues I've seen, but some good stuff in there):
https://issues.apache.org/jira/browse/CASSANDRA-5062


On Tue, Jan 28, 2014 at 9:18 AM, S Ahmed  wrote:

> Does C* no longer use zookeeper?
>
> I don't see a reference to it in the
> https://github.com/apache/cassandra/blob/trunk/build.xml
>
> If not, what replaced it?
>



-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Heavy update dataset and compaction

2014-01-28 Thread Nate McCall
LeveledCompactionStrategy is ideal for update-heavy workloads. If you are
using a pre-1.2.8 version, make sure you set sstable_size_in_mb up to
the new default of 160.
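A minimal sketch of switching an existing table over via the Java driver (the keyspace and table names are placeholders; the same ALTER TABLE statement can of course be run from cqlsh instead):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SwitchToLeveledCompactionSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Move the table to leveled compaction and set the larger sstable
        // target size (the post-1.2.8 default) explicitly.
        session.execute("ALTER TABLE mykeyspace.mytable WITH compaction = "
                + "{'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160}");

        cluster.shutdown();
    }
}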

Also, keep an eye on "Average live cells per slice" and "Average tombstones
per slice" (available in versions > 1.2.11 - so I guess just upgrade if you
are using an older version and not in production yet) in nodetool cfstats
to make sure your reads are not traversing too many tombstones.


On Tue, Jan 28, 2014 at 9:57 AM, Robert Wille  wrote:

> I have a dataset which is heavy on updates. The updates are actually
> performed by inserting new records and deleting the old ones the following
> day. Some records might be updated (replaced) a thousand times before they
> are finished.
>
> As I watch SSTables get created and compacted on my staging server (I
> haven't gone live with this yet), it appears that if I let the compactor do
> its default behavior, I'll probably end up consuming several times the
> amount of disk space as is actually required. I probably need to
> periodically trigger a major compaction if I want to avoid that. However,
> I've read that major compactions aren't really recommended. I'd like to get
> people's take on this. I'd also be interested in people's recommendations
> on compaction strategy and other compaction-related configuration settings.
>
> Thanks
>
> Robert
>



-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Heavy update dataset and compaction

2014-01-28 Thread Robert Wille
I have a dataset which is heavy on updates. The updates are actually
performed by inserting new records and deleting the old ones the following
day. Some records might be updated (replaced) a thousand times before they
are finished.

As I watch SSTables get created and compacted on my staging server (I
haven't gone live with this yet), it appears that if I let the compactor do
its default behavior, I'll probably end up consuming several times the
amount of disk space as is actually required. I probably need to
periodically trigger a major compaction if I want to avoid that. However,
I've read that major compactions aren't really recommended. I'd like to get
people's take on this. I'd also be interested in people's recommendations on
compaction strategy and other compaction-related configuration settings.

Thanks

Robert




RE: no more zookeeper?

2014-01-28 Thread S Ahmed
Does C* no longer use zookeeper?

I don't see a reference to it in the
https://github.com/apache/cassandra/blob/trunk/build.xml

If not, what replaced it?


Re: No deletes - is periodic repair needed? I think not...

2014-01-28 Thread Sylvain Lebresne
>
>
> I have actually set up one of our application streams such that the same
> key is only overwritten with a monotonically increasing ttl.
>
> For example, a breaking news item might have an initial ttl of 60 seconds,
> followed in 45 seconds by an update with a ttl of 3000 seconds, followed by
> an 'ignore me' update in 600 seconds with a ttl of 30 days (our maximum
> ttl) when the article is published.
>
> My understanding is that this case fits the criteria and no 'periodic
> repair' is needed.
>

That's correct. The real criterion for not needing repair, if you do no
deletes but only TTLs, is "update only with a monotonically increasing (not
necessarily strictly) ttl". Always setting the same TTL is just a special
case of that, but it's the most commonly used one I think, so I tend to
simplify it to that case.


>
> I guess another thing I would point out that is easy to miss or forget (if
> you are a newish user like me), is that ttl's are fine-grained, by column.
> So we are talking 'fixed' or 'variable' by individual column, not by table.
> Which means, in my case, that ttl's can vary widely across a table, but as
> long as I constrain them by key value to be fixed or monotonically
> increasing, it fits the criteria.
>

We're talking monotonically increasing ttl "for a given primary key" if
we're talking the CQL language and "for a given column" if we're talking
the Thrift one. Not "by table".

--
Sylvain



>
> Cheers,
>
> Michael
>
>
> On Tue, Jan 28, 2014 at 4:18 AM, Sylvain Lebresne wrote:
>
>> On Tue, Jan 28, 2014 at 1:05 AM, Edward Capriolo 
>> wrote:
>>
>>> If you have only ttl columns, and you never update the column I would
>>> not think you need a repair.
>>>
>>
>> Right, no deletes and no updates is the case 1. of Michael on which I
>> think we all agree 'periodic repair to avoid resurrected columns' is not
>> required.
>>
>>
>>>
>>> Repair cures lost deletes. If all your writes have a ttl a lost write
>>> should not matter since the column was never written to the node and thus
>>> could never be resurected on said node.
>>>
>>
>>  I'm sure we're all in agreement here, but for the record, this is only
>> true if you have no updates (overwrites) and/or if all writes have the
>> *same* ttl. Because in the general case, a column with a relatively short
>> TTL is basically very close to a delete, while a column with a long TTL is
>> very close from one that has no TTL. If the former column (with short TTL)
>> overwrites the latter one (with long TTL), and if one nodes misses the
>> overwrite, that node could resurrect the column with the longer TTL (until
>> that column expires that is). Hence the separation of the case 2. (fixed
>> ttl, no repair needed) and 2.a. (variable ttl, repair may be needed).
>>
>> --
>> Sylvain
>>
>>
>>>
>>> Unless i am missing something.
>>>
>>> On Monday, January 27, 2014, Laing, Michael 
>>> wrote:
>>> > Thanks Sylvain,
>>> > Your assumption is correct!
>>> > So I think I actually have 4 classes:
>>> > 1.Regular values, no deletes, no overwrites, write heavy, variable
>>> ttl's to manage size
>>> > 2.Regular values, no deletes, some overwrites, read heavy (10 to
>>> 1), fixed ttl's to manage size
>>> > 2.a. Regular values, no deletes, some overwrites, read heavy (10 to
>>> 1), variable ttl's to manage size
>>> > 3.Counter values, no deletes, update heavy, rotation/truncation to
>>> manage size
>>> > Only 2.a. above requires me to do 'periodic repair'.
>>> > What I will actually do is change my schema and applications slightly
>>> to eliminate the need for overwrites on the only table I have in that
>>> category.
>>> > And I will set gc_grace_period to 0 for the tables in the updated
>>> schema and drop 'periodic repair' from the schedule.
>>> > Cheers,
>>> > Michael
>>> >
>>> >
>>> > On Mon, Jan 27, 2014 at 4:22 AM, Sylvain Lebresne <
>>> sylv...@datastax.com> wrote:
>>> >>
>>> >> By periodic repair, I'll assume you mean "having to run repair every
>>> gc_grace period to make sure no deleted entries resurrect". With that
>>> assumption:
>>> >>
>>> >>>
>>> >>> 1. Regular values, no deletes, no overwrites, write heavy, ttl's to
>>> manage size
>>> >>
>>> >> Since 'repair within gc_grace' is about avoiding value that have been
>>> deleted to resurrect, if you do no delete nor overwrites, you're in no risk
>>> of that (and don't need to 'repair withing gc_grace').
>>> >>
>>> >>>
>>> >>> 2. Regular values, no deletes, some overwrites, read heavy (10 to
>>> 1), ttl's to manage size
>>> >>
>>> >> It depends a bit. In general, if you always set the exact same TTL on
>>> every insert (implying you always set a TTL), then you have nothing to
>>> worry about. If the TTL varies (of if you only set TTL some of the times),
>>> then you might still need to have some periodic repairs. That being said,
>>> if there is no deletes but only TTLs, then the TTL kind of lengthen the
>>> period at which you need to do repair: instead of needing to repair withing
>>> gc_grace, 

Re: No deletes - is periodic repair needed? I think not...

2014-01-28 Thread Laing, Michael
Thanks again Sylvain!

I have actually set up one of our application streams such that the same
key is only overwritten with a monotonically increasing ttl.

For example, a breaking news item might have an initial ttl of 60 seconds,
followed in 45 seconds by an update with a ttl of 3000 seconds, followed by
an 'ignore me' update in 600 seconds with a ttl of 30 days (our maximum
ttl) when the article is published.

My understanding is that this case fits the criteria and no 'periodic
repair' is needed.
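Concretely, that write pattern could look something like this with the Java driver (the news_items table and its columns are made up for the sketch; the TTLs are the ones from the breaking-news example above):

import com.datastax.driver.core.Session;

public class MonotonicTtlSketch {
    // Each overwrite of the same key carries a TTL at least as large as the
    // previous one (60s, then 3000s, then 30 days = 2592000s), i.e. the
    // monotonically increasing TTL pattern discussed in this thread.
    public static void publish(Session session, String itemId) {
        session.execute("INSERT INTO news_items (item_id, body) VALUES ('"
                + itemId + "', 'breaking') USING TTL 60");
        // ~45 seconds later, the first update:
        session.execute("INSERT INTO news_items (item_id, body) VALUES ('"
                + itemId + "', 'updated') USING TTL 3000");
        // when the article is published:
        session.execute("INSERT INTO news_items (item_id, body) VALUES ('"
                + itemId + "', 'published') USING TTL 2592000");
    }
}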

I guess another thing I would point out that is easy to miss or forget (if
you are a newish user like me), is that ttl's are fine-grained, by column.
So we are talking 'fixed' or 'variable' by individual column, not by table.
Which means, in my case, that ttl's can vary widely across a table, but as
long as I constrain them by key value to be fixed or monotonically
increasing, it fits the criteria.

Cheers,

Michael


On Tue, Jan 28, 2014 at 4:18 AM, Sylvain Lebresne wrote:

> On Tue, Jan 28, 2014 at 1:05 AM, Edward Capriolo wrote:
>
>> If you have only ttl columns, and you never update the column I would not
>> think you need a repair.
>>
>
> Right, no deletes and no updates is the case 1. of Michael on which I
> think we all agree 'periodic repair to avoid resurrected columns' is not
> required.
>
>
>>
>> Repair cures lost deletes. If all your writes have a ttl a lost write
>> should not matter since the column was never written to the node and thus
>> could never be resurected on said node.
>>
>
> I'm sure we're all in agreement here, but for the record, this is only
> true if you have no updates (overwrites) and/or if all writes have the
> *same* ttl. Because in the general case, a column with a relatively short
> TTL is basically very close to a delete, while a column with a long TTL is
> very close from one that has no TTL. If the former column (with short TTL)
> overwrites the latter one (with long TTL), and if one nodes misses the
> overwrite, that node could resurrect the column with the longer TTL (until
> that column expires that is). Hence the separation of the case 2. (fixed
> ttl, no repair needed) and 2.a. (variable ttl, repair may be needed).
>
> --
> Sylvain
>
>
>>
>> Unless i am missing something.
>>
>> On Monday, January 27, 2014, Laing, Michael 
>> wrote:
>> > Thanks Sylvain,
>> > Your assumption is correct!
>> > So I think I actually have 4 classes:
>> > 1.Regular values, no deletes, no overwrites, write heavy, variable
>> ttl's to manage size
>> > 2.Regular values, no deletes, some overwrites, read heavy (10 to
>> 1), fixed ttl's to manage size
>> > 2.a. Regular values, no deletes, some overwrites, read heavy (10 to 1),
>> variable ttl's to manage size
>> > 3.Counter values, no deletes, update heavy, rotation/truncation to
>> manage size
>> > Only 2.a. above requires me to do 'periodic repair'.
>> > What I will actually do is change my schema and applications slightly
>> to eliminate the need for overwrites on the only table I have in that
>> category.
>> > And I will set gc_grace_period to 0 for the tables in the updated
>> schema and drop 'periodic repair' from the schedule.
>> > Cheers,
>> > Michael
>> >
>> >
>> > On Mon, Jan 27, 2014 at 4:22 AM, Sylvain Lebresne 
>> wrote:
>> >>
>> >> By periodic repair, I'll assume you mean "having to run repair every
>> gc_grace period to make sure no deleted entries resurrect". With that
>> assumption:
>> >>
>> >>>
>> >>> 1. Regular values, no deletes, no overwrites, write heavy, ttl's to
>> manage size
>> >>
>> >> Since 'repair within gc_grace' is about avoiding value that have been
>> deleted to resurrect, if you do no delete nor overwrites, you're in no risk
>> of that (and don't need to 'repair withing gc_grace').
>> >>
>> >>>
>> >>> 2. Regular values, no deletes, some overwrites, read heavy (10 to 1),
>> ttl's to manage size
>> >>
>> >> It depends a bit. In general, if you always set the exact same TTL on
>> every insert (implying you always set a TTL), then you have nothing to
>> worry about. If the TTL varies (of if you only set TTL some of the times),
>> then you might still need to have some periodic repairs. That being said,
>> if there is no deletes but only TTLs, then the TTL kind of lengthen the
>> period at which you need to do repair: instead of needing to repair withing
>> gc_grace, you only need to repair every gc_grace + min(TTL) (where min(TTL)
>> is the smallest TTL you set on columns).
>> >>>
>> >>> 3. Counter values, no deletes, update heavy, rotation/truncation to
>> manage size
>> >>
>> >> No deletes and no TTL implies that your fine (as in, there is no need
>> for 'repair withing gc_grace').
>> >>
>> >> --
>> >> Sylvain
>> >
>>
>> --
>> Sorry this was sent from mobile. Will do less grammar and spell check
>> than usual.
>>
>
>


Re: No deletes - is periodic repair needed? I think not...

2014-01-28 Thread Sylvain Lebresne
On Tue, Jan 28, 2014 at 1:05 AM, Edward Capriolo wrote:

> If you have only ttl columns, and you never update the column I would not
> think you need a repair.
>

Right, no deletes and no updates is the case 1. of Michael on which I think
we all agree 'periodic repair to avoid resurrected columns' is not required.


>
> Repair cures lost deletes. If all your writes have a ttl a lost write
> should not matter since the column was never written to the node and thus
> could never be resurected on said node.
>

I'm sure we're all in agreement here, but for the record, this is only true
if you have no updates (overwrites) and/or if all writes have the *same*
ttl. Because in the general case, a column with a relatively short TTL is
basically very close to a delete, while a column with a long TTL is very
close to one that has no TTL. If the former column (with short TTL)
overwrites the latter one (with long TTL), and if one node misses the
overwrite, that node could resurrect the column with the longer TTL (until
that column expires, that is). Hence the separation of the case 2. (fixed
ttl, no repair needed) and 2.a. (variable ttl, repair may be needed).

--
Sylvain


>
> Unless i am missing something.
>
> On Monday, January 27, 2014, Laing, Michael 
> wrote:
> > Thanks Sylvain,
> > Your assumption is correct!
> > So I think I actually have 4 classes:
> > 1.Regular values, no deletes, no overwrites, write heavy, variable
> ttl's to manage size
> > 2.Regular values, no deletes, some overwrites, read heavy (10 to 1),
> fixed ttl's to manage size
> > 2.a. Regular values, no deletes, some overwrites, read heavy (10 to 1),
> variable ttl's to manage size
> > 3.Counter values, no deletes, update heavy, rotation/truncation to
> manage size
> > Only 2.a. above requires me to do 'periodic repair'.
> > What I will actually do is change my schema and applications slightly to
> eliminate the need for overwrites on the only table I have in that category.
> > And I will set gc_grace_period to 0 for the tables in the updated schema
> and drop 'periodic repair' from the schedule.
> > Cheers,
> > Michael
> >
> >
> > On Mon, Jan 27, 2014 at 4:22 AM, Sylvain Lebresne 
> wrote:
> >>
> >> By periodic repair, I'll assume you mean "having to run repair every
> gc_grace period to make sure no deleted entries resurrect". With that
> assumption:
> >>
> >>>
> >>> 1. Regular values, no deletes, no overwrites, write heavy, ttl's to
> manage size
> >>
> >> Since 'repair within gc_grace' is about avoiding value that have been
> deleted to resurrect, if you do no delete nor overwrites, you're in no risk
> of that (and don't need to 'repair withing gc_grace').
> >>
> >>>
> >>> 2. Regular values, no deletes, some overwrites, read heavy (10 to 1),
> ttl's to manage size
> >>
> >> It depends a bit. In general, if you always set the exact same TTL on
> every insert (implying you always set a TTL), then you have nothing to
> worry about. If the TTL varies (of if you only set TTL some of the times),
> then you might still need to have some periodic repairs. That being said,
> if there is no deletes but only TTLs, then the TTL kind of lengthen the
> period at which you need to do repair: instead of needing to repair withing
> gc_grace, you only need to repair every gc_grace + min(TTL) (where min(TTL)
> is the smallest TTL you set on columns).
> >>>
> >>> 3. Counter values, no deletes, update heavy, rotation/truncation to
> manage size
> >>
> >> No deletes and no TTL implies that your fine (as in, there is no need
> for 'repair withing gc_grace').
> >>
> >> --
> >> Sylvain
> >
>
> --
> Sorry this was sent from mobile. Will do less grammar and spell check than
> usual.
>