question on uda/udf

2017-02-18 Thread Kant Kodali
Hi All,

Goal: I want to create a check_duplicate UDA on a blob column.

Context: I have a partition of 10 million rows with a size of 10GB (I know
this is bad). I want to check whether there are duplicates in a blob column in
this partition. The blob column can be at most 256 bytes.

Question: can I create a state map? Since a blob is represented as
a ByteBuffer, I suspect this won't work because the .equals method would
just compare the references. So can I convert the blob into a hex string
first? If so, how should I do that?

Thanks,
kant
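
A minimal sketch of what such an aggregate could look like, assuming UDFs
are enabled on the cluster (enable_user_defined_functions: true in
cassandra.yaml) and that the state is a map keyed by the hex encoding of
the blob; the names dup_state and check_duplicate are only illustrative,
and the function body keeps to plain java.lang/java.nio:

CREATE OR REPLACE FUNCTION dup_state (state map<text, int>, val blob)
  CALLED ON NULL INPUT
  RETURNS map<text, int>
  LANGUAGE java
  AS '
    if (val == null) return state;
    // hex-encode the ByteBuffer so map keys compare by value, not by reference
    StringBuilder sb = new StringBuilder();
    java.nio.ByteBuffer b = val.duplicate();
    while (b.hasRemaining()) sb.append(String.format("%02x", b.get()));
    String key = sb.toString();
    Integer seen = state.get(key);
    state.put(key, seen == null ? 1 : seen + 1);
    return state;
  ';

CREATE OR REPLACE AGGREGATE check_duplicate (blob)
  SFUNC dup_state
  STYPE map<text, int>
  INITCOND {};

Any entry with a count greater than 1 in the returned map is a duplicate.
Note that the state map is built up on the coordinator, so over 10 million
mostly distinct 256-byte values it can grow very large, and the aggregate
may be impractical for a 10GB partition.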


Re: High disk io read load

2017-02-18 Thread Bhuvan Rawal
This looks fine, 8k read ahead as you mentioned.
It doesn't look like a data model issue either, since the reads shown at
https://cl.ly/2c3Z1u2k0u2I appear balanced.

In all likelihood this looks like a configuration issue on the new node
to me. The fact that very little data is going out of the node rules out
the possibility of more "hot" data than can be cached. Are your nodes
running Spark jobs with data locality that filter data locally and send
only limited data out?

I find 800M of disk IO for 4M of network transfer really fishy!

As a starting point, you can try debugging page faults with:
sar -B 1 10

Regards,

On Sun, Feb 19, 2017 at 2:57 AM, Benjamin Roth 
wrote:

> Just for the record, that's what dstat looks like while CS is starting:
>
> root@cas10:~# dstat -lrnv 10
> ---load-avg--- --io/total- -net/total- ---procs--- --memory-usage-
> ---paging-- -dsk/total- ---system-- total-cpu-usage
>  1m   5m  15m | read  writ| recv  send|run blk new| used  buff  cach
>  free|  in   out | read  writ| int   csw |usr sys idl wai hiq siq
> 0.69 0.18 0.06| 228  24.3 |   0 0 |0.0   0  24|17.8G 3204k  458M
>  108G|   0 0 |5257k  417k|  17k 3319 |  2   1  97   0   0   0
> 0.96 0.26 0.09| 591  27.9 | 522k  476k|4.1   0  69|18.3G 3204k  906M
>  107G|   0 0 |  45M  287k|  22k 6943 |  7   1  92   0   0   0
> 13.2 2.83 0.92|2187  28.7 |1311k  839k|5.3  90  18|18.9G 3204k 9008M
> 98.1G|   0 0 | 791M 8346k|  49k   25k| 17   1  36  46   0   0
> 30.6 6.91 2.27|2188  67.0 |4200k 3610k|8.8 106  27|19.5G 3204k 17.9G
> 88.4G|   0 0 | 927M 8396k| 116k  119k| 24   2  17  57   0   0
> 43.6 10.5 3.49|2136  24.3 |4371k 3708k|6.3 108 1.0|19.5G 3204k 26.7G
> 79.6G|   0 0 | 893M   13M| 117k  159k| 15   1  17  66   0   0
> 56.9 14.4 4.84|2152  32.5 |3937k 3767k| 11  83 5.0|19.5G 3204k 35.5G
> 70.7G|   0 0 | 894M   14M| 126k  160k| 16   1  16  65   0   0
> 63.2 17.1 5.83|2135  44.1 |4601k 4185k|6.9  99  35|19.6G 3204k 44.3G
> 61.9G|   0 0 | 879M   15M| 133k  168k| 19   2  19  60   0   0
> 64.6 18.9 6.54|2174  42.2 |4393k 3522k|8.4  93 2.2|20.0G 3204k 52.7G
> 53.0G|   0 0 | 897M   14M| 138k  160k| 14   2  15  69   0   0
>
> The IO shoots up (791M) as soon as CS has started up and accepts requests.
> I also diffed sysctl of the both machines. No significant differences.
> Only CPU-related, random values and some hashes differ.
>
> 2017-02-18 21:49 GMT+01:00 Benjamin Roth :
>
>> 256 tokens:
>>
>> root@cas9:/sys/block/dm-0# blockdev --report
>> RORA   SSZ   BSZ   StartSecSize   Device
>> rw   256   512  4096  067108864   /dev/ram0
>> rw   256   512  4096  067108864   /dev/ram1
>> rw   256   512  4096  067108864   /dev/ram2
>> rw   256   512  4096  067108864   /dev/ram3
>> rw   256   512  4096  067108864   /dev/ram4
>> rw   256   512  4096  067108864   /dev/ram5
>> rw   256   512  4096  067108864   /dev/ram6
>> rw   256   512  4096  067108864   /dev/ram7
>> rw   256   512  4096  067108864   /dev/ram8
>> rw   256   512  4096  067108864   /dev/ram9
>> rw   256   512  4096  067108864   /dev/ram10
>> rw   256   512  4096  067108864   /dev/ram11
>> rw   256   512  4096  067108864   /dev/ram12
>> rw   256   512  4096  067108864   /dev/ram13
>> rw   256   512  4096  067108864   /dev/ram14
>> rw   256   512  4096  067108864   /dev/ram15
>> rw16   512  4096  0800166076416   /dev/sda
>> rw16   512  4096   2048800164151296   /dev/sda1
>> rw16   512  4096  0644245094400   /dev/dm-0
>> rw16   512  4096  0  2046820352   /dev/dm-1
>> rw16   512  4096  0  1023410176   /dev/dm-2
>> rw16   512  4096  0800166076416   /dev/sdb
>>
>> 512 tokens:
>> root@cas10:/sys/block# blockdev --report
>> RORA   SSZ   BSZ   StartSecSize   Device
>> rw   256   512  4096  067108864   /dev/ram0
>> rw   256   512  4096  067108864   /dev/ram1
>> rw   256   512  4096  067108864   /dev/ram2
>> rw   256   512  4096  067108864   /dev/ram3
>> rw   256   512  4096  067108864   /dev/ram4
>> rw   256   512  4096  067108864   /dev/ram5
>> rw   256   512  4096  067108864   /dev/ram6
>> rw   256   512  4096  067108864   /dev/ram7
>> rw   256   512  4096  067108864   /dev/ram8
>> rw   256   512  4096  067108864   /dev/ram9
>> rw   256   512  4096  067108864   /dev/ram10
>> rw   256   512  4096  067108864   /dev/ram11
>> rw   256   512  4096  0

Re: High disk io read load

2017-02-18 Thread Benjamin Roth
Just for the record, that's what dstat looks like while CS is starting:

root@cas10:~# dstat -lrnv 10
---load-avg--- --io/total- -net/total- ---procs--- --memory-usage- ---paging-- -dsk/total- ---system-- total-cpu-usage
 1m   5m  15m | read  writ| recv  send|run blk new| used  buff  cach  free|  in   out | read  writ| int   csw |usr sys idl wai hiq siq
0.69 0.18 0.06| 228  24.3 |   0 0 |0.0   0  24|17.8G 3204k  458M  108G|  0 0 |5257k  417k|  17k 3319 |  2   1  97   0   0   0
0.96 0.26 0.09| 591  27.9 | 522k  476k|4.1   0  69|18.3G 3204k  906M  107G|  0 0 |  45M  287k|  22k 6943 |  7   1  92   0   0   0
13.2 2.83 0.92|2187  28.7 |1311k  839k|5.3  90  18|18.9G 3204k 9008M 98.1G|  0 0 | 791M 8346k|  49k   25k| 17   1  36  46   0   0
30.6 6.91 2.27|2188  67.0 |4200k 3610k|8.8 106  27|19.5G 3204k 17.9G 88.4G|  0 0 | 927M 8396k| 116k  119k| 24   2  17  57   0   0
43.6 10.5 3.49|2136  24.3 |4371k 3708k|6.3 108 1.0|19.5G 3204k 26.7G 79.6G|  0 0 | 893M   13M| 117k  159k| 15   1  17  66   0   0
56.9 14.4 4.84|2152  32.5 |3937k 3767k| 11  83 5.0|19.5G 3204k 35.5G 70.7G|  0 0 | 894M   14M| 126k  160k| 16   1  16  65   0   0
63.2 17.1 5.83|2135  44.1 |4601k 4185k|6.9  99  35|19.6G 3204k 44.3G 61.9G|  0 0 | 879M   15M| 133k  168k| 19   2  19  60   0   0
64.6 18.9 6.54|2174  42.2 |4393k 3522k|8.4  93 2.2|20.0G 3204k 52.7G 53.0G|  0 0 | 897M   14M| 138k  160k| 14   2  15  69   0   0

The IO shoots up (791M) as soon as CS has started up and accepts requests.
I also diffed sysctl on both machines. No significant differences; only
CPU-related values, random values, and some hashes differ.

2017-02-18 21:49 GMT+01:00 Benjamin Roth :

> 256 tokens:
>
> root@cas9:/sys/block/dm-0# blockdev --report
> RORA   SSZ   BSZ   StartSecSize   Device
> rw   256   512  4096  067108864   /dev/ram0
> rw   256   512  4096  067108864   /dev/ram1
> rw   256   512  4096  067108864   /dev/ram2
> rw   256   512  4096  067108864   /dev/ram3
> rw   256   512  4096  067108864   /dev/ram4
> rw   256   512  4096  067108864   /dev/ram5
> rw   256   512  4096  067108864   /dev/ram6
> rw   256   512  4096  067108864   /dev/ram7
> rw   256   512  4096  067108864   /dev/ram8
> rw   256   512  4096  067108864   /dev/ram9
> rw   256   512  4096  067108864   /dev/ram10
> rw   256   512  4096  067108864   /dev/ram11
> rw   256   512  4096  067108864   /dev/ram12
> rw   256   512  4096  067108864   /dev/ram13
> rw   256   512  4096  067108864   /dev/ram14
> rw   256   512  4096  067108864   /dev/ram15
> rw16   512  4096  0800166076416   /dev/sda
> rw16   512  4096   2048800164151296   /dev/sda1
> rw16   512  4096  0644245094400   /dev/dm-0
> rw16   512  4096  0  2046820352   /dev/dm-1
> rw16   512  4096  0  1023410176   /dev/dm-2
> rw16   512  4096  0800166076416   /dev/sdb
>
> 512 tokens:
> root@cas10:/sys/block# blockdev --report
> RORA   SSZ   BSZ   StartSecSize   Device
> rw   256   512  4096  067108864   /dev/ram0
> rw   256   512  4096  067108864   /dev/ram1
> rw   256   512  4096  067108864   /dev/ram2
> rw   256   512  4096  067108864   /dev/ram3
> rw   256   512  4096  067108864   /dev/ram4
> rw   256   512  4096  067108864   /dev/ram5
> rw   256   512  4096  067108864   /dev/ram6
> rw   256   512  4096  067108864   /dev/ram7
> rw   256   512  4096  067108864   /dev/ram8
> rw   256   512  4096  067108864   /dev/ram9
> rw   256   512  4096  067108864   /dev/ram10
> rw   256   512  4096  067108864   /dev/ram11
> rw   256   512  4096  067108864   /dev/ram12
> rw   256   512  4096  067108864   /dev/ram13
> rw   256   512  4096  067108864   /dev/ram14
> rw   256   512  4096  067108864   /dev/ram15
> rw16   512  4096  0800166076416   /dev/sda
> rw16   512  4096   2048800164151296   /dev/sda1
> rw16   512  4096  0800166076416   /dev/sdb
> rw16   512  4096   2048800165027840   /dev/sdb1
> rw16   512  4096  0   1073741824000   /dev/dm-0
> rw16   512  4096  0  2046820352   /dev/dm-1
> rw16   512  4096  0  1023410176   /dev/dm-2
>
> 2017-02-18 21:41 GMT+01:00 Bhuvan Rawal :
>
>> Hi Ben,
>>
>> If its same on both machines then something else could 

Re: Logging queries

2017-02-18 Thread Igor Leão
Thanks Bhuvan!

Matija, I'm looking forward to this new release. Cassandra-diagnostics is
just great and this feature will make it awesome.
Hope to hear from you soon.

2017-02-18 17:20 GMT-03:00 Bhuvan Rawal :

> Im not sure if you can create an index on system_traces keyspace for this
> use case.
>
> If the performance issue that you are trying to troubleshoot is consistent
> than you can switch on tracing for a while and do dump of
> system_traces.events table say using COPY into csv. You can do analysis on
> that for finding the problematic query.
>
> copy system_traces.events TO 'traces_dump.csv';
>
> Also do make sure you dont set trace probability to a high number if
> working on a production database as it can adversely impact performance.
>
> Regards,
>
> On Sun, Feb 19, 2017 at 1:28 AM, Igor Leão  wrote:
>
>> Hi Bhuvan,
>> Thanks a lot!
>>
>> Any idea if something can be done for C* 2.X?
>>
>> Best,
>> Igor
>>
>> 2017-02-18 16:41 GMT-03:00 Bhuvan Rawal :
>>
>>> Hi Igor,
>>>
>>> If you are using java driver, you can log slow queries on client side
>>> using QueryLogger.
>>> https://docs.datastax.com/en/developer/java-driver/2.1/manual/logging/
>>>
>>> Slow Query logger for server was introduced in C* 3.10 version. Details:
>>> https://issues.apache.org/jira/browse/CASSANDRA-12403
>>>
>>> Regards,
>>> Bhuvan
>>>
>>> On Sun, Feb 19, 2017 at 12:59 AM, Igor Leão  wrote:
>>>
 Hi there,

 I'm wondering how to log queries from Cassandra. These queries can be
 either slow queries or all queries. The only constraint is that I should do
 this on server side.

 I tried using `nodetool settraceprobability`, which writes all queries
 to the keyspace `system_traces`. When I try to see which queries are slower
 than a given number, I get:

 Result: ```InvalidRequest: code=2200 [Invalid query] message="No
 secondary indexes on the restricted columns support the provided operators:
 "```
 Query: `select * from events where source_elapsed >= 1000;`

 My goal is to debug performance issues in a production database. I want
 to know which queries are degrading the performance of the db.

 Thanks in advance!







>>>
>>
>>
>> --
>> Igor Leão  Site Reliability Engineer
>>
>> Mobile: +55 81 99727-1083 
>> Skype: *igorvpcleao*
>> Office: +55 81 4042-9757 
>> Website: inlocomedia.com 
>>
>>
>>
>>
>>
>>
>>
>


-- 
Igor Leão  Site Reliability Engineer

Mobile: +55 81 99727-1083 
Skype: *igorvpcleao*
Office: +55 81 4042-9757 
Website: inlocomedia.com 



Re: High disk io read load

2017-02-18 Thread Benjamin Roth
256 tokens:

root@cas9:/sys/block/dm-0# blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0        67108864   /dev/ram0
rw   256   512  4096          0        67108864   /dev/ram1
rw   256   512  4096          0        67108864   /dev/ram2
rw   256   512  4096          0        67108864   /dev/ram3
rw   256   512  4096          0        67108864   /dev/ram4
rw   256   512  4096          0        67108864   /dev/ram5
rw   256   512  4096          0        67108864   /dev/ram6
rw   256   512  4096          0        67108864   /dev/ram7
rw   256   512  4096          0        67108864   /dev/ram8
rw   256   512  4096          0        67108864   /dev/ram9
rw   256   512  4096          0        67108864   /dev/ram10
rw   256   512  4096          0        67108864   /dev/ram11
rw   256   512  4096          0        67108864   /dev/ram12
rw   256   512  4096          0        67108864   /dev/ram13
rw   256   512  4096          0        67108864   /dev/ram14
rw   256   512  4096          0        67108864   /dev/ram15
rw    16   512  4096          0    800166076416   /dev/sda
rw    16   512  4096       2048    800164151296   /dev/sda1
rw    16   512  4096          0    644245094400   /dev/dm-0
rw    16   512  4096          0      2046820352   /dev/dm-1
rw    16   512  4096          0      1023410176   /dev/dm-2
rw    16   512  4096          0    800166076416   /dev/sdb

512 tokens:
root@cas10:/sys/block# blockdev --report
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256   512  4096          0        67108864   /dev/ram0
rw   256   512  4096          0        67108864   /dev/ram1
rw   256   512  4096          0        67108864   /dev/ram2
rw   256   512  4096          0        67108864   /dev/ram3
rw   256   512  4096          0        67108864   /dev/ram4
rw   256   512  4096          0        67108864   /dev/ram5
rw   256   512  4096          0        67108864   /dev/ram6
rw   256   512  4096          0        67108864   /dev/ram7
rw   256   512  4096          0        67108864   /dev/ram8
rw   256   512  4096          0        67108864   /dev/ram9
rw   256   512  4096          0        67108864   /dev/ram10
rw   256   512  4096          0        67108864   /dev/ram11
rw   256   512  4096          0        67108864   /dev/ram12
rw   256   512  4096          0        67108864   /dev/ram13
rw   256   512  4096          0        67108864   /dev/ram14
rw   256   512  4096          0        67108864   /dev/ram15
rw    16   512  4096          0    800166076416   /dev/sda
rw    16   512  4096       2048    800164151296   /dev/sda1
rw    16   512  4096          0    800166076416   /dev/sdb
rw    16   512  4096       2048    800165027840   /dev/sdb1
rw    16   512  4096          0   1073741824000   /dev/dm-0
rw    16   512  4096          0      2046820352   /dev/dm-1
rw    16   512  4096          0      1023410176   /dev/dm-2

2017-02-18 21:41 GMT+01:00 Bhuvan Rawal :

> Hi Ben,
>
> If its same on both machines then something else could be the issue. We
> faced high disk io due to misconfigured read ahead which resulted in high
> amount of disk io for comparatively insignificant network transfer.
>
> Can you post output of blockdev --report for a normal node and 512 token
> node.
>
> Regards,
>
> On Sun, Feb 19, 2017 at 2:07 AM, Benjamin Roth 
> wrote:
>
>> cat /sys/block/sda/queue/read_ahead_kb
>> => 8
>>
>> On all CS nodes. Is that what you mean?
>>
>> 2017-02-18 21:32 GMT+01:00 Bhuvan Rawal :
>>
>>> Hi Benjamin,
>>>
>>> What is the disk read ahead on both nodes?
>>>
>>> Regards,
>>> Bhuvan
>>>
>>> On Sun, Feb 19, 2017 at 1:58 AM, Benjamin Roth 
>>> wrote:
>>>
 This is status of the largest KS of these both nodes:
 UN  10.23.71.10  437.91 GiB  512  49.1%
 2679c3fa-347e-4845-bfc1-c4d0bc906576  RAC1
 UN  10.23.71.9   246.99 GiB  256  28.3%
 2804ef8a-26c8-4d21-9e12-01e8b6644c2f  RAC1

 So roughly as expected.

 2017-02-17 23:07 GMT+01:00 kurt greaves :

> what's the Owns % for the relevant keyspace from nodetool status?
>



 --
 Benjamin Roth
 Prokurist

 Jaumo GmbH · www.jaumo.com
 Wehrstraße 46 · 73035 Göppingen · Germany
 Phone +49 7161 304880-6 · Fax +49 7161 304880-1
 AG Ulm · HRB 731058 · Managing Director: Jens Kammerer

>>>
>>>
>>
>>
>> --
>> Benjamin Roth
>> Prokurist
>>
>> Jaumo GmbH · www.jaumo.com
>> Wehrstraße 46 · 73035 Göppingen · Germany
>> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
>> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>>
>
>


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing 

Re: High disk io read load

2017-02-18 Thread Bhuvan Rawal
Hi Ben,

If it's the same on both machines then something else could be the issue. We
faced high disk IO due to a misconfigured read ahead, which resulted in a high
amount of disk IO for a comparatively insignificant network transfer.

Can you post the output of blockdev --report for a normal node and the
512-token node?

Regards,

On Sun, Feb 19, 2017 at 2:07 AM, Benjamin Roth 
wrote:

> cat /sys/block/sda/queue/read_ahead_kb
> => 8
>
> On all CS nodes. Is that what you mean?
>
> 2017-02-18 21:32 GMT+01:00 Bhuvan Rawal :
>
>> Hi Benjamin,
>>
>> What is the disk read ahead on both nodes?
>>
>> Regards,
>> Bhuvan
>>
>> On Sun, Feb 19, 2017 at 1:58 AM, Benjamin Roth 
>> wrote:
>>
>>> This is status of the largest KS of these both nodes:
>>> UN  10.23.71.10  437.91 GiB  512  49.1%
>>> 2679c3fa-347e-4845-bfc1-c4d0bc906576  RAC1
>>> UN  10.23.71.9   246.99 GiB  256  28.3%
>>> 2804ef8a-26c8-4d21-9e12-01e8b6644c2f  RAC1
>>>
>>> So roughly as expected.
>>>
>>> 2017-02-17 23:07 GMT+01:00 kurt greaves :
>>>
 what's the Owns % for the relevant keyspace from nodetool status?

>>>
>>>
>>>
>>> --
>>> Benjamin Roth
>>> Prokurist
>>>
>>> Jaumo GmbH · www.jaumo.com
>>> Wehrstraße 46 · 73035 Göppingen · Germany
>>> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
>>> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>>>
>>
>>
>
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>


Re: High disk io read load

2017-02-18 Thread Benjamin Roth
cat /sys/block/sda/queue/read_ahead_kb
=> 8

On all CS nodes. Is that what you mean?

2017-02-18 21:32 GMT+01:00 Bhuvan Rawal :

> Hi Benjamin,
>
> What is the disk read ahead on both nodes?
>
> Regards,
> Bhuvan
>
> On Sun, Feb 19, 2017 at 1:58 AM, Benjamin Roth 
> wrote:
>
>> This is status of the largest KS of these both nodes:
>> UN  10.23.71.10  437.91 GiB  512  49.1%
>> 2679c3fa-347e-4845-bfc1-c4d0bc906576  RAC1
>> UN  10.23.71.9   246.99 GiB  256  28.3%
>> 2804ef8a-26c8-4d21-9e12-01e8b6644c2f  RAC1
>>
>> So roughly as expected.
>>
>> 2017-02-17 23:07 GMT+01:00 kurt greaves :
>>
>>> what's the Owns % for the relevant keyspace from nodetool status?
>>>
>>
>>
>>
>> --
>> Benjamin Roth
>> Prokurist
>>
>> Jaumo GmbH · www.jaumo.com
>> Wehrstraße 46 · 73035 Göppingen · Germany
>> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
>> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>>
>
>


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: High disk io read load

2017-02-18 Thread Benjamin Roth
We are talking about a read IO increase of over 2000% with 512 tokens
compared to 256 tokens. A 100% increase would be linear, which would be
perfect. 200% would even be okay, taking the RAM/load ratio for caching into
account. But >20x the read IO is really incredible.
The nodes are configured with Puppet, they share the same roles, and no
manual "optimizations" are applied. So I can't imagine that a different
configuration is responsible for it.

2017-02-18 21:28 GMT+01:00 Benjamin Roth :

> This is status of the largest KS of these both nodes:
> UN  10.23.71.10  437.91 GiB  512  49.1%
> 2679c3fa-347e-4845-bfc1-c4d0bc906576  RAC1
> UN  10.23.71.9   246.99 GiB  256  28.3%
> 2804ef8a-26c8-4d21-9e12-01e8b6644c2f  RAC1
>
> So roughly as expected.
>
> 2017-02-17 23:07 GMT+01:00 kurt greaves :
>
>> what's the Owns % for the relevant keyspace from nodetool status?
>>
>
>
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>



-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: High disk io read load

2017-02-18 Thread Bhuvan Rawal
Hi Benjamin,

What is the disk read ahead on both nodes?

Regards,
Bhuvan

On Sun, Feb 19, 2017 at 1:58 AM, Benjamin Roth 
wrote:

> This is status of the largest KS of these both nodes:
> UN  10.23.71.10  437.91 GiB  512  49.1%
> 2679c3fa-347e-4845-bfc1-c4d0bc906576  RAC1
> UN  10.23.71.9   246.99 GiB  256  28.3%
> 2804ef8a-26c8-4d21-9e12-01e8b6644c2f  RAC1
>
> So roughly as expected.
>
> 2017-02-17 23:07 GMT+01:00 kurt greaves :
>
>> what's the Owns % for the relevant keyspace from nodetool status?
>>
>
>
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>


Re: High disk io read load

2017-02-18 Thread Benjamin Roth
This is the status of the largest KS on both of these nodes:
UN  10.23.71.10  437.91 GiB  512  49.1%  2679c3fa-347e-4845-bfc1-c4d0bc906576  RAC1
UN  10.23.71.9   246.99 GiB  256  28.3%  2804ef8a-26c8-4d21-9e12-01e8b6644c2f  RAC1

So roughly as expected.

2017-02-17 23:07 GMT+01:00 kurt greaves :

> what's the Owns % for the relevant keyspace from nodetool status?
>



-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Logging queries

2017-02-18 Thread Bhuvan Rawal
I'm not sure you can create an index on the system_traces keyspace for this
use case.

If the performance issue you are trying to troubleshoot is consistent,
then you can switch on tracing for a while and take a dump of the
system_traces.events table, say using COPY into CSV. You can then analyze
that to find the problematic query.

copy system_traces.events TO 'traces_dump.csv';

Also make sure you don't set the trace probability to a high number if
working on a production database, as it can adversely impact performance.
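
If the trace volume is large, a lighter variation on the same idea
(assuming the standard system_traces schema) is to dump the much smaller
sessions table first, pick the sessions with the largest duration (in
microseconds), and then fetch the events for just those sessions by
session_id, which is a plain partition-key lookup and needs no index:

copy system_traces.sessions (session_id, duration, started_at, request) TO 'sessions_dump.csv';

-- after spotting a slow session in the CSV:
select activity, source, source_elapsed
from system_traces.events
where session_id = <slow-session-uuid>;

Here <slow-session-uuid> is just a placeholder for a session_id taken from
the dump.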

Regards,

On Sun, Feb 19, 2017 at 1:28 AM, Igor Leão  wrote:

> Hi Bhuvan,
> Thanks a lot!
>
> Any idea if something can be done for C* 2.X?
>
> Best,
> Igor
>
> 2017-02-18 16:41 GMT-03:00 Bhuvan Rawal :
>
>> Hi Igor,
>>
>> If you are using java driver, you can log slow queries on client side
>> using QueryLogger.
>> https://docs.datastax.com/en/developer/java-driver/2.1/manual/logging/
>>
>> Slow Query logger for server was introduced in C* 3.10 version. Details:
>> https://issues.apache.org/jira/browse/CASSANDRA-12403
>>
>> Regards,
>> Bhuvan
>>
>> On Sun, Feb 19, 2017 at 12:59 AM, Igor Leão  wrote:
>>
>>> Hi there,
>>>
>>> I'm wondering how to log queries from Cassandra. These queries can be
>>> either slow queries or all queries. The only constraint is that I should do
>>> this on server side.
>>>
>>> I tried using `nodetool settraceprobability`, which writes all queries
>>> to the keyspace `system_traces`. When I try to see which queries are slower
>>> than a given number, I get:
>>>
>>> Result: ```InvalidRequest: code=2200 [Invalid query] message="No
>>> secondary indexes on the restricted columns support the provided operators:
>>> "```
>>> Query: `select * from events where source_elapsed >= 1000;`
>>>
>>> My goal is to debug performance issues in a production database. I want
>>> to know which queries are degrading the performance of the db.
>>>
>>> Thanks in advance!
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
> --
> Igor Leão  Site Reliability Engineer
>
> Mobile: +55 81 99727-1083 
> Skype: *igorvpcleao*
> Office: +55 81 4042-9757 
> Website: inlocomedia.com 
>
>
>
>
>
>
>


Re: Logging queries

2017-02-18 Thread Matija Gobec
Hi Igor,

Your best bet is to wait for our next release of diagnostics for the 2.x
branch. We are planning it for next week.

Best,
Matija

On Sat, Feb 18, 2017 at 8:58 PM, Igor Leão  wrote:

> Hi Bhuvan,
> Thanks a lot!
>
> Any idea if something can be done for C* 2.X?
>
> Best,
> Igor
>
> 2017-02-18 16:41 GMT-03:00 Bhuvan Rawal :
>
>> Hi Igor,
>>
>> If you are using java driver, you can log slow queries on client side
>> using QueryLogger.
>> https://docs.datastax.com/en/developer/java-driver/2.1/manual/logging/
>>
>> Slow Query logger for server was introduced in C* 3.10 version. Details:
>> https://issues.apache.org/jira/browse/CASSANDRA-12403
>>
>> Regards,
>> Bhuvan
>>
>> On Sun, Feb 19, 2017 at 12:59 AM, Igor Leão  wrote:
>>
>>> Hi there,
>>>
>>> I'm wondering how to log queries from Cassandra. These queries can be
>>> either slow queries or all queries. The only constraint is that I should do
>>> this on server side.
>>>
>>> I tried using `nodetool settraceprobability`, which writes all queries
>>> to the keyspace `system_traces`. When I try to see which queries are slower
>>> than a given number, I get:
>>>
>>> Result: ```InvalidRequest: code=2200 [Invalid query] message="No
>>> secondary indexes on the restricted columns support the provided operators:
>>> "```
>>> Query: `select * from events where source_elapsed >= 1000;`
>>>
>>> My goal is to debug performance issues in a production database. I want
>>> to know which queries are degrading the performance of the db.
>>>
>>> Thanks in advance!
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
> --
> Igor Leão  Site Reliability Engineer
>
> Mobile: +55 81 99727-1083 
> Skype: *igorvpcleao*
> Office: +55 81 4042-9757 
> Website: inlocomedia.com 
>
>
>
>
>
>
>


Re: Logging queries

2017-02-18 Thread Igor Leão
Hi Bhuvan,
Thanks a lot!

Any idea if something can be done for C* 2.X?

Best,
Igor

2017-02-18 16:41 GMT-03:00 Bhuvan Rawal :

> Hi Igor,
>
> If you are using java driver, you can log slow queries on client side
> using QueryLogger.
> https://docs.datastax.com/en/developer/java-driver/2.1/manual/logging/
>
> Slow Query logger for server was introduced in C* 3.10 version. Details:
> https://issues.apache.org/jira/browse/CASSANDRA-12403
>
> Regards,
> Bhuvan
>
> On Sun, Feb 19, 2017 at 12:59 AM, Igor Leão  wrote:
>
>> Hi there,
>>
>> I'm wondering how to log queries from Cassandra. These queries can be
>> either slow queries or all queries. The only constraint is that I should do
>> this on server side.
>>
>> I tried using `nodetool settraceprobability`, which writes all queries to
>> the keyspace `system_traces`. When I try to see which queries are slower
>> than a given number, I get:
>>
>> Result: ```InvalidRequest: code=2200 [Invalid query] message="No
>> secondary indexes on the restricted columns support the provided operators:
>> "```
>> Query: `select * from events where source_elapsed >= 1000;`
>>
>> My goal is to debug performance issues in a production database. I want
>> to know which queries are degrading the performance of the db.
>>
>> Thanks in advance!
>>
>>
>>
>>
>>
>>
>>
>


-- 
Igor Leão  Site Reliability Engineer

Mobile: +55 81 99727-1083 
Skype: *igorvpcleao*
Office: +55 81 4042-9757 
Website: inlocomedia.com 



Re: Logging queries

2017-02-18 Thread Bhuvan Rawal
Hi Igor,

If you are using the Java driver, you can log slow queries on the client side
using QueryLogger.
https://docs.datastax.com/en/developer/java-driver/2.1/manual/logging/

A slow query logger on the server side was introduced in C* 3.10. Details:
https://issues.apache.org/jira/browse/CASSANDRA-12403

Regards,
Bhuvan

On Sun, Feb 19, 2017 at 12:59 AM, Igor Leão  wrote:

> Hi there,
>
> I'm wondering how to log queries from Cassandra. These queries can be
> either slow queries or all queries. The only constraint is that I should do
> this on server side.
>
> I tried using `nodetool settraceprobability`, which writes all queries to
> the keyspace `system_traces`. When I try to see which queries are slower
> than a given number, I get:
>
> Result: ```InvalidRequest: code=2200 [Invalid query] message="No secondary
> indexes on the restricted columns support the provided operators: "```
> Query: `select * from events where source_elapsed >= 1000;`
>
> My goal is to debug performance issues in a production database. I want to
> know which queries are degrading the performance of the db.
>
> Thanks in advance!
>
>
>
>
>
>
>


Logging queries

2017-02-18 Thread Igor Leão
Hi there,

I'm wondering how to log queries from Cassandra. These queries can be
either slow queries or all queries. The only constraint is that I have to do
this on the server side.

I tried using `nodetool settraceprobability`, which writes all queries to
the keyspace `system_traces`. When I try to see which queries are slower
than a given number, I get:

Result: ```InvalidRequest: code=2200 [Invalid query] message="No secondary
indexes on the restricted columns support the provided operators: "```
Query: `select * from events where source_elapsed >= 1000;`

My goal is to debug performance issues in a production database. I want to
know which queries are degrading the performance of the db.

Thanks in advance!


Re: is there a query to find out the largest partition in a table?

2017-02-18 Thread Kant Kodali
I did the following. Now I wonder: is this for one node or for multiple nodes?
Does this value really tell me I have a large partition?

nodetool cfhistograms test hello // This reports the max partition size is
10GB

nodetool tablestats test.hello // This also reports Compacted partition
maximum bytes: 10299432635


Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             0.00             20.50             51.01         155469300            654949
75%             0.00             24.60             88.15        4139110981          17436917
95%             6.00             29.52         155469.30       10299432635          43388628
98%             6.00             42.51         668489.53       10299432635          43388628
99%             6.00             61.21         802187.44       10299432635          43388628
Min             0.00              5.72              9.89               125                 5
Max             6.00         668489.53        8582860.53       10299432635          43388628
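
For what it is worth, both of those commands report per-node figures:
cfhistograms and tablestats are computed from the SSTables on the node you
run nodetool against, so "Compacted partition maximum bytes: 10299432635"
means that node holds a compacted partition of roughly 10GB. The
system.size_estimates table, on the other hand, only stores per-token-range
mean partition sizes for the local node, so max(mean_partition_size) will
not point at the single largest partition. A quick way to see what it
actually contains, using the keyspace and table from the example above:

select range_start, range_end, partitions_count, mean_partition_size
from system.size_estimates
where keyspace_name = 'test' and table_name = 'hello';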

On Sat, Feb 18, 2017 at 12:28 AM, Kant Kodali  wrote:

> is there a query to find out the largest partition in a table? Does the
> query below give me the largest partition?
>
> select max(mean_partition_size) from size_estimates ;
>
> Thanks,
> Kant
>


is there a query to find out the largest partition in a table?

2017-02-18 Thread Kant Kodali
is there a query to find out the largest partition in a table? Does the
query below give me the largest partition?

select max(mean_partition_size) from size_estimates ;

Thanks,
Kant