Upcoming Cassandra-related Conferences

2018-10-05 Thread Max C.
Some upcoming Cassandra-related conferences, if anyone is interested:

Scylla Summit
November 5-7, 2018
Pullman San Francisco Bay Hotel, Redwood City CA
https://www.scylladb.com/scylla-summit-2018/ 


(This one seems to be almost entirely Scylla-focused; it may not be terribly 
useful for non-Scylla users.)

DataStax Accelerate
May 21-23, 2019 
National Harbor, Maryland
https://www.datastax.com/accelerate 

(No talks list or sponsors have been posted yet)

DISCLAIMER:
I’m not in the middle of the politics, nor do I have any affiliation with 
either of these companies.  I just thought lowly users like myself might 
appreciate a mention of these on the -users list.  

I wish we had had a post or two about the Distributed Data Summit; I 
think we probably would have had an even better conference!  :-)

- Max

Re: Connections info

2018-10-05 Thread Abdul Patel
Thanks, will try both options.

On Friday, October 5, 2018, Alain RODRIGUEZ  wrote:

> Hello Abdul,
>
> I was caught by a different topic while answering, sending the message
> over, even though it's similar to Romain's solution.
>
> There is the metric mentioned above, or to have more details such as the
> app node IP, you can use:
>
> $ sudo netstat -tupawn | grep 9042 | grep ESTABLISHED
>
> tcp   0   0 :::9042   :::*       LISTEN       -
>
> tcp   0   0 :::9042   :::51486   ESTABLISHED  -
>
> tcp   0   0 :::9042   :::37624   ESTABLISHED  -
> [...]
>
> tcp   0   0 :::9042   :::49108   ESTABLISHED  -
>
> or to count them:
>
> $ sudo netstat -tupawn | grep 9042 | grep ESTABLISHED | wc -l
>
> 113
>
> I'm not sure about all of the '-tupawn' options; they give me the format
> I need, and I must say I never wondered much about them. Maybe some of
> the options are unnecessary.
>
> Sending this command through ssh would allow you to gather the information
> in one place. You can also run similar commands on the clients (apps). I
> hope that helps.
>
> C*heers
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On Fri, Oct 5, 2018 at 06:28, Max C. wrote:
>
>> Looks like the number of connections is available in JMX as:
>>
>> org.apache.cassandra.metrics:type=Client,name=connectedNativeClients
>>
>> http://cassandra.apache.org/doc/4.0/operating/metrics.html
>>
>> "Number of clients connected to this nodes native protocol server”
>>
>> As for where they’re coming from — I’m not sure how to get that from
>> JMX.  Maybe you’ll have to use “lsof” or something to get that.
>>
>> - Max
>>
>> On Oct 4, 2018, at 8:57 pm, Abdul Patel  wrote:
>>
>> Hi All,
>>
>> Can we get the number of users connected to each node in Cassandra?
>> Also, can we get which app node they are connecting from?
>>
>>
>>


Re: Connections info

2018-10-05 Thread Alain RODRIGUEZ
Hello Abdul,

I was caught by a different topic while answering, sending the message
over, even though it's similar to Romain's solution.

There is the metric mentioned above, or to have more details such as the
app node IP, you can use:

$ sudo netstat -tupawn | grep 9042 | grep ESTABLISHED

tcp   0   0 :::9042   :::*       LISTEN       -

tcp   0   0 :::9042   :::51486   ESTABLISHED  -

tcp   0   0 :::9042   :::37624   ESTABLISHED  -
[...]

tcp   0   0 :::9042   :::49108   ESTABLISHED  -

or to count them:

$ sudo netstat -tupawn | grep 9042 | grep ESTABLISHED | wc -l

113

I'm not sure about all of the '-tupawn' options; they give me the format
I need, and I must say I never wondered much about them. Maybe some of
the options are unnecessary.

Sending this command through ssh would allow you to gather the information
in one place. You can also run similar commands on the clients (apps). I
hope that helps.
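As a sketch, the pipeline above can be extended to break established connections down per client IP. The netstat sample lines here are invented, and the flag meanings are given for reference (-t TCP, -u UDP, -p show process, -a all sockets, -w raw, -n numeric):

```shell
# Count ESTABLISHED native-protocol (9042) connections per client IP.
# The sample below is invented; in practice you would pipe in the output
# of `sudo netstat -tn | grep ':9042 '` instead.
sample='tcp 0 0 10.0.0.1:9042 10.0.0.2:51486 ESTABLISHED
tcp 0 0 10.0.0.1:9042 10.0.0.2:37624 ESTABLISHED
tcp 0 0 10.0.0.1:9042 10.0.0.3:49108 ESTABLISHED'
# Field 5 is the foreign (client) address; strip the port and count per IP.
echo "$sample" | awk '$6 == "ESTABLISHED" {split($5, a, ":"); print a[1]}' \
  | sort | uniq -c
```

Run against real netstat output, this gives one line per client host with its connection count.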

C*heers
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Fri, Oct 5, 2018 at 06:28, Max C. wrote:

> Looks like the number of connections is available in JMX as:
>
> org.apache.cassandra.metrics:type=Client,name=connectedNativeClients
>
> http://cassandra.apache.org/doc/4.0/operating/metrics.html
>
> "Number of clients connected to this nodes native protocol server”
>
> As for where they’re coming from — I’m not sure how to get that from JMX.
> Maybe you’ll have to use “lsof” or something to get that.
>
> - Max
>
> On Oct 4, 2018, at 8:57 pm, Abdul Patel  wrote:
>
> Hi All,
>
> Can we get the number of users connected to each node in Cassandra?
> Also, can we get which app node they are connecting from?
>
>
>


Re: Connections info

2018-10-05 Thread Romain Hardouin
Note that one "user"/application can open multiple connections. You also have
the number of Thrift connections available in JMX if you run a legacy
application.

Max is right; regarding where they come from, you can use lsof. For instance
on AWS - but you can adapt it for your needs:

IP=...
REGION=...
ssh $IP "sudo lsof -i -n | grep 9042 | grep -Po '(?<=->)[^:]+' | sort -u" | \
  xargs -P 20 -I '{}' aws --output json --region $REGION ec2 describe-instances \
    --filter Name=private-ip-address,Values={} \
    --query 'Reservations[].Instances[*].Tags[*]' | \
  jq '.[0][0] | map(select(.Key == "Name")) | .[0].Value' | sort | uniq -c

You'll get the number of instances grouped by AWS Name tag:

      3 "name_ABC"
     15 "name_example"
     37 "name_test"

Best,
Romain
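To illustrate just the extraction step of that pipeline: the `grep -Po '(?<=->)[^:]+'` part pulls the peer address out of lsof's connection column. A hypothetical sketch with invented lsof output (GNU grep is assumed, for -P):

```shell
# Invented lsof lines; the real input would be `sudo lsof -i -n | grep 9042`.
sample='java 1234 cassandra 56u IPv6 TCP 10.0.0.1:9042->10.0.0.2:51486 (ESTABLISHED)
java 1234 cassandra 57u IPv6 TCP 10.0.0.1:9042->10.0.0.3:37624 (ESTABLISHED)'
# The lookbehind (?<=->) anchors each match right after "->", and [^:]+
# grabs the peer IP up to (but not including) the port separator.
echo "$sample" | grep -Po '(?<=->)[^:]+' | sort -u
```

The resulting unique peer IPs are what gets fed into the `aws ec2 describe-instances` lookup.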
On Friday, October 5, 2018 at 06:28:51 UTC+2, Max C. wrote:

Looks like the number of connections is available in JMX as:

org.apache.cassandra.metrics:type=Client,name=connectedNativeClients

http://cassandra.apache.org/doc/4.0/operating/metrics.html

"Number of clients connected to this node's native protocol server"

As for where they're coming from, I'm not sure how to get that from JMX.
Maybe you'll have to use "lsof" or something to get that.

- Max


On Oct 4, 2018, at 8:57 pm, Abdul Patel wrote:

Hi All,

Can we get the number of users connected to each node in Cassandra?
Also, can we get which app node they are connecting from?

  

Re: Metrics matrix: migrate 2.1.x metrics to 2.2.x+

2018-10-05 Thread Alain RODRIGUEZ
I feel for you on most of the troubles you've faced; I've been facing most
of them too. Again, Datadog support can probably help you with most of
those. You should really consider sharing this feedback with them.

> there is re-namespacing of the metric names in lots of cases, and these
> don't appear to be centrally documented, but maybe i haven't found the
> magic page.
>

I don't know if that would be the 'magic' page, but that's something:
https://github.com/DataDog/integrations-core/blob/master/cassandra/metadata.csv

> There are so many good stats.


Yes, and it's still improving. I love this about Cassandra. It's our job
to pick the relevant ones for each situation. I would not like Cassandra to
reduce the number of metrics exposed; we need to learn to handle them
properly. Also, this is the reason we designed 4 dashboards out of the box;
the goal was to have everything we need for distinct scenarios:
- Overview - global health-check / anomaly detection
- Read Path - troubleshooting / optimizing read ops
- Write Path - troubleshooting / optimizing write ops
- SSTable Management - troubleshooting / optimizing
  compaction/flushes/... anything related to sstables.

instead of the single overview dashboard that was present before. We are
also perfectly aware that it's far from perfect, but aiming for perfect
would only have meant never releasing anything. Anyone interested could
now build missing dashboards or improve existing ones for themselves
and/or suggest improvements to Datadog :). I hope I'll do some more of
this work at some point in the future.

Good luck,
C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Thu, Oct 4, 2018 at 21:21, Carl Mueller wrote:

> For 2.1.x we had a custom reporter that delivered metrics to Datadog's
> endpoint via HTTPS, bypassing the agent-imposed 350-metric limit. But
> integrating it required targeting the other shared libs in the Cassandra
> classpath, so the build is a bit of a pain when we update major versions.
>
> We are migrating our 2.1.x specific dashboards, and we will use
> agent-delivered metrics for non-table, and adapt the custom library to
> deliver the table-based ones, at a slower rate than the "core" ones.
>
> Datadog is also super annoying because there doesn't appear to be anything
> that reports which metrics the agent is sending (the metric count can
> indicate whether a newly configured metric increased the count and is being
> reported, but it's still... a guess), and there is re-namespacing of the
> metric names in lots of cases, and these don't appear to be centrally
> documented, but maybe I haven't found the magic page.
>
> There are so many good stats. We might also implement some facility to
> dynamically turn on the delivery of detailed metrics on the nodes.
>
> On Tue, Oct 2, 2018 at 5:21 AM Alain RODRIGUEZ  wrote:
>
>> Hello Carl,
>>
>> I guess we can use bean_regex to do specific targetted metrics for the
>>> important tables anyway.
>>>
>>
>> Yes, this would work, but 350 is very limited for Cassandra dashboards.
>> We have a LOT of metrics available.
>>
>> Datadog 350 metric limit is a PITA for tables once you get over 10 tables
>>>
>>
>> I noticed this while I was working on providing default dashboards for
>> Cassandra-Datadog integration. I was told by Datadog team it would not be
>> an issue for users, that I should not care about it. As you pointed out,
>> per table metrics quickly increase the total number of metrics we need to
>> collect.
>>
>> I believe you can set the following option: *"max_returned_metrics:
>> 1000"* - it can be used to raise the limit on the number of collected
>> metrics when some are missing. Be aware of the CPU utilization this
>> might imply (greatly improved in dd-agent version 6+, I believe - thanks
>> to the Datadog teams for that - making this fully usable for Cassandra).
>> This option should go in the *cassandra.yaml* file for the Cassandra
>> integration, off the top of my head.
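For reference, a hypothetical sketch of what that option might look like in the Datadog agent's Cassandra integration config; the file location and surrounding keys are assumptions based on the dd-agent JMX integrations of that era, so check the agent docs for your version:

```yaml
# conf.d/cassandra.yaml on the Datadog agent side
# (not Cassandra's own cassandra.yaml)
instances:
  - host: localhost
    port: 7199                  # Cassandra JMX port
    max_returned_metrics: 1000  # raise the default per-instance metric cap
```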
>>
>> Also, do not hesitate to reach out to Datadog directly for this kind of
>> question. I have always been very happy with their support so far, and I
>> am sure they would guide you through this as well, probably better than
>> we can do :). It also provides them with feedback on what people are
>> struggling with, I imagine.
>>
>> I am interested to know if you still have issues getting more metrics
>> (option above not working / CPU under too much load) as this would make the
>> dashboards we built mostly unusable for clusters with more tables. We might
>> then need to review the design.
>>
>> As a side note, I believe metrics are handled the same way across
>> versions; they got the same name/label for C* 2.1, 2.2 and 3+ on
>> Datadog. There is an abstraction layer that removes this complexity (if
>> I remember correctly; we built those dashboards a while ago).
>>
>> C*heers
>> ---
>> Alain Rodriguez - @arodream