Re: Not-Equals (!=) in Where Clause

2014-10-01 Thread Sylvain Lebresne
Right, my bad, thanks Tyler for the correction.

On Tue, Sep 30, 2014 at 5:44 PM, Tyler Hobbs  wrote:

> I think Sylvain may not have had his coffee yet.  You can't use IF's in
> SELECT statements, but you can in INSERT/UPDATE/DELETE:
>
> UPDATE foo SET a = 0 WHERE k = 0 IF b != 0;
>
> On Tue, Sep 30, 2014 at 2:36 AM, Sylvain Lebresne wrote:
>
>>> Is != supported as part of the where clause in Cassandra?
>>
>> It's not.
>>
>>> Or is it the grammar for some other purpose?
>>
>> It's supported in 'IF' conditions. You can do something like:
>>   SELECT * FROM foo WHERE k = 0 IF v != 3;
>> --
>> Sylvain
>>
>
>
>
> --
> Tyler Hobbs
> DataStax 
>


Regarding Cassandra-Stress tool

2014-10-01 Thread shahab
Hi,

I am trying to benchmark our custom schema in Cassandra and I managed to
run it. However, there are a couple of settings and issues for which I
couldn't find any solution or explanation. I would appreciate any comments.

1- The default number of warm-up iterations in the stress tool is about 5.
I would like to reduce this number (due to my storage space limitations),
but I couldn't find any input parameter to do this. Is it possible to
change this setting?

2- I do not fully understand what the output of the cassandra stress tool
means. I read this:
http://www.datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsCStressOutput_c.html
but, for example, what does "latency" mean here? Does it mean how long a
read/write operation is delayed before it is executed? In that case, what is
the measure for the actual read/write operation?
The documentation also seems to be outdated: there is an output parameter
"partition_rate" which is not explained in the documentation.

best,
/Shahab


Re: disk space issue

2014-10-01 Thread Ken Hancock
Major compaction is bad if you're using size-tiered, especially if you're
already having capacity issues.  Once you have one huge table, with default
settings, you'll need 4x that huge table worth of storage in order for it
to compact again to ever reclaim your TTL'd data.

If you're running into space issues that are ultimately going to get your
system wedged and you're using columns with TTL, I'd recommend using the
jmx operation to compact individual tables.  This will free the TTL'd data
assuming that you've exceeded your gc_grace_seconds.  This can probably be
scripted up in a relatively easy manner with a nice,
shellshocked-vulnerable bash script and jmxterm.
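
For anyone sizing this up, here is a rough back-of-the-envelope sketch in Python of when TTL'd data can at the latest be dropped by a compaction. It assumes the conservative rule that an expired cell must also age gc_grace_seconds past its expiry; depending on the Cassandra version, expired data may become purgeable earlier. The function name is made up for illustration. 864000 is the default gc_grace_seconds (10 days).

```python
from datetime import datetime, timedelta

# Default gc_grace_seconds in Cassandra: 864000 seconds = 10 days.
DEFAULT_GC_GRACE_SECONDS = 864000


def conservative_purge_time(write_time, ttl_seconds,
                            gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Upper bound on when a compaction may drop a TTL'd cell.

    Assumption for this sketch: the cell expires at write_time + TTL and
    must then age gc_grace_seconds before it is purgeable.
    """
    expires_at = write_time + timedelta(seconds=ttl_seconds)
    return expires_at + timedelta(seconds=gc_grace_seconds)


# Data written on 2014-09-01 with a 10-day TTL and default gc_grace.
written = datetime(2014, 9, 1)
print(conservative_purge_time(written, ttl_seconds=10 * 86400))
# prints 2014-09-21 00:00:00
```

Under that assumption, a 10-day TTL with default gc_grace means up to roughly 20 days of data sits on disk before a compaction of the table can reclaim anything.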


On Wed, Oct 1, 2014 at 2:43 AM, Nikolay Mihaylov  wrote:

> my 2 cents:
>
> try major compaction on the column family with TTLs - it will certainly be
> faster than a full rebuild.
>
> also try things not related to Cassandra, such as checking for and removing
> old log files, backups etc.
>
> On Wed, Oct 1, 2014 at 9:34 AM, Sumod Pawgi  wrote:
>
>> In the past in such scenarios it has helped us to check the partition
>> where cassandra is installed and allocate more space for the partition.
>> Maybe it is a disk space issue, but it is good to check whether it is
>> related to the space allocated to that partition. My 2 cents.
>>
>> Sent from my iPhone
>>
>> On 01-Oct-2014, at 11:53 am, Dominic Letz 
>> wrote:
>>
>> This is a shot in the dark but you could check whether you have too
>> many snapshots lying around that you actually don't need. You can get rid
>> of those with a quick "nodetool clearsnapshot".
>>
>> On Wed, Oct 1, 2014 at 5:49 AM, cem  wrote:
>>
>>> Hi All,
>>>
>>> I have a 7 node cluster. One node ran out of disk space and others are
>>> around 80% disk utilization.
>>> The data has a 10-day TTL but I think compaction wasn't fast enough to
>>> clean up the expired data.  The gc_grace value is set to the default. I
>>> have a replication factor of 3. Do you think it may help if I delete all
>>> data for that node and run repair? Does node repair check the TTL value
>>> before retrieving data from other nodes? Do you have any other suggestions?
>>>
>>> Best Regards,
>>> Cem.
>>>
>>
>>
>>
>> --
>> Dominic Letz
>> Director of R&D
>> Exosite 
>>
>>
>


-- 
*Ken Hancock *| System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC

Office: +1 (978) 889-3329 | Google Talk: ken.hanc...@schange.com |
Skype: hancockks | Yahoo IM: hancockks

This e-mail and any attachments may contain
information which is SeaChange International confidential. The information
enclosed is intended only for the addressees herein and may not be copied
or forwarded without permission from SeaChange International.


CASSANDRA-7649 : upgrade existing db to 2.0.10

2014-10-01 Thread Desimpel, Ignace
I deploy/distribute the Cassandra database as an embedded service, which allows
me to create a basic cassandra.yaml file based on the global cluster of machines
(seeds, non-seeds, ports, disks, etc.). That lets me configure and
upgrade my own software and the Cassandra software using the same
cassandra.yaml. That yaml file has no tokens specified in it, yet I still have a
vnode cluster (thanks, Cassandra).

In previous versions that was OK, since the Cassandra code simply accepted
the tokens it had saved in its own database, disregarding any changes made in
the yaml file (there was no test like bootstrapTokens.size() !=
DatabaseDescriptor.getNumTokens()). I guess there was some logic to that:
at that point the system is not bootstrapping and thus should/could use
the known token configuration without using the yaml token parameter.

Also, isn't the small code change in CASSANDRA-7649 motivated by the balancing
problems when moving to vnodes (CASSANDRA-7601) with a random partitioner? In my
case I'm using a ByteOrdered partitioner, which forces me to balance/move/add
nodes/tokens myself.
And while the description says it was meant to prevent changing the number
of tokens, that test does a little more than that (from my point of view).

Well, in short: I would be in favor of removing that test, while clearly logging
a message that the "saved tokens" are used, not the yaml-configured tokens.
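
To make the two behaviours concrete, here is a minimal Python sketch (not the actual Cassandra code, which is Java). It assumes the strict check refuses to start when the saved token count differs from the configured one, and that the lenient variant keeps the saved tokens and only warns. All names here (resolve_tokens, ConfigurationError, the strict flag) are hypothetical.

```python
import logging

logger = logging.getLogger("token_check")


class ConfigurationError(Exception):
    """Raised in strict mode when saved and configured token counts differ."""


def resolve_tokens(saved_tokens, configured_num_tokens, strict=True):
    """Return the tokens a non-bootstrapping node would use.

    strict=True mimics a check like
    bootstrapTokens.size() != DatabaseDescriptor.getNumTokens()
    (assumed here to abort startup). strict=False is the proposed
    alternative: warn and keep using the saved tokens.
    """
    if len(saved_tokens) != configured_num_tokens:
        message = (
            "saved tokens (%d) differ from configured num_tokens (%d); "
            "using saved tokens" % (len(saved_tokens), configured_num_tokens)
        )
        if strict:
            raise ConfigurationError(message)
        logger.warning(message)
    return list(saved_tokens)
```

With strict=False, a node whose yaml carries no token settings would start with its saved tokens and leave a warning in the log.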

Regards,
Ignace






Re: disk space issue

2014-10-01 Thread cem
thanks for the answers!

Cem

On Wed, Oct 1, 2014 at 2:38 PM, Ken Hancock  wrote:

> https://github.com/hancockks/cassandra-compact-cf
>
> On Tue, Sep 30, 2014 at 5:49 PM, cem  wrote:
>
>> Hi All,
>>
>> I have a 7 node cluster. One node ran out of disk space and others are
>> around 80% disk utilization.
>> The data has a 10-day TTL but I think compaction wasn't fast enough to
>> clean up the expired data.  The gc_grace value is set to the default. I
>> have a replication factor of 3. Do you think it may help if I delete all
>> data for that node and run repair? Does node repair check the TTL value
>> before retrieving data from other nodes? Do you have any other suggestions?
>>
>> Best Regards,
>> Cem.
>>


Cassandra & Java 8

2014-10-01 Thread Tony Anecito
Hi All,
Has anyone done any performance testing of, say, Cassandra 2.1 using Java 8?
Thanks,
-Tony


Question about incremental repair

2014-10-01 Thread John Sumsion
If you only run incremental repairs, does that mean that bitrot will go 
undetected for already repaired sstables?

If so, is there any other process that will detect bitrot for all the repaired 
sstables other than full repair (or an unfortunate user)?

John...


 NOTICE: This email message is for the sole use of the intended recipient(s) 
and may contain confidential and privileged information. Any unauthorized 
review, use, disclosure or distribution is prohibited. If you are not the 
intended recipient, please contact the sender by reply email and destroy all 
copies of the original message.



Re: Question about incremental repair

2014-10-01 Thread Tyler Hobbs
Compressed SSTables store a checksum for every compressed block, which is
checked each time the block is decompressed.  I believe there's a ticket
out there to add something similar for non-compressed SSTables.

We also store the SHA-1 hash of each SSTable in its own file on disk.
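
As a rough illustration of the per-block checksum idea (a sketch only, not Cassandra's on-disk format or its actual checksum algorithm), the following Python stores a CRC32 next to each compressed block and verifies it before decompressing. The block size and all names are assumptions made for the example.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # hypothetical block size for this sketch


class CorruptBlockError(Exception):
    """Raised when a stored checksum does not match the block contents."""


def write_blocks(data, block_size=BLOCK_SIZE):
    """Compress data block by block, keeping a checksum next to each block."""
    blocks = []
    for start in range(0, len(data), block_size):
        compressed = zlib.compress(data[start:start + block_size])
        blocks.append((compressed, zlib.crc32(compressed)))
    return blocks


def read_block(blocks, index):
    """Verify the checksum first, then decompress; fail loudly on mismatch."""
    compressed, expected = blocks[index]
    if zlib.crc32(compressed) != expected:
        raise CorruptBlockError("checksum mismatch in block %d" % index)
    return zlib.decompress(compressed)
```

Note that a check like this only fires when a block is actually read, so blocks that are never read are never verified.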

On Wed, Oct 1, 2014 at 4:45 PM, John Sumsion 
wrote:

>  If you only run incremental repairs, does that mean that bitrot will go
> undetected for already repaired sstables?
>
>  If so, is there any other process that will detect bitrot for all the
> repaired sstables other than full repair (or an unfortunate user)?
>
>  John...


-- 
Tyler Hobbs
DataStax 


Re: Question about incremental repair

2014-10-01 Thread Robert Coli
On Wed, Oct 1, 2014 at 3:11 PM, Tyler Hobbs  wrote:

> Compressed SSTables store a checksum for every compressed block, which is
> checked each time the block is decompressed.  I believe there's a ticket
> out there to add something similar for non-compressed SSTables.
>
> We also store the sha1 hash of SSTables in its own file on disk.
>

@OP : this came up a few weeks ago on the list, search for bitrot for the
previous thread.

Expanding on the discussion further, I plan to file a JIRA on
you-must-mark-all-sstables-for-that-range-unrepaired-if-you-fail-CRC-on-read.
I'll try to remember to reply on thread when I do.

Once there is a CRC on uncompressed read,
marking-all-sstables-unrepaired-on-failed-CRC would handle the bitrot case
for both uncompressed and compressed reads.

=Rob


Re: cassandra stress tools

2014-10-01 Thread Sumod Pawgi
Not a direct answer to your post but you can also take a look at YCSB.

Sent from my iPhone

> On 01-Oct-2014, at 8:38 pm, shahab  wrote:
> 
> Hi,
> 
> I am trying to benchmark our custom schema in Cassandra and I managed to run 
> it. However there are couple of setting and issues which I couldn't find any 
> solution/explanation for. I appreciate any comments.
> 1- The default number of warm-up iterations in stress tool is about 5. I 
> would like to reduce this number (due to my storage space limitations), but I 
> couldn't find any input parameters to do this. I just wonder if this setting 
> is possible ?
> 
> 
> 2- I did not understand well what does the output of cassandra stress tool 
> mean? I read  this 
> http://www.datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsCStressOutput_c.html,
>  but . for example, what does "latency" means here? does it mean how long a 
> read/write operation is delayed until it is executed? in this case, what is 
> the measure for actual read/write operation?
> It seems that the documentation is outdated, there is an output parameter 
> "partition_rate" which is not explained in documentation?
> 
> best,
> /Shahab