Dear all, here in Brazil Zabbix is widely used, but it requires deeper database knowledge.
I like Nagios Core as a collector of UP/DOWN and service state. For the LPI DevOps Tools Engineer, Nagios is more widely accepted, because it is used around the world.

2017-08-27 19:39 GMT-03:00 Fabian Thorns <[email protected]>:

> Hi Jeroen,
>
> On Tue, Aug 22, 2017 at 12:27 PM, Jeroen Baten <[email protected]> wrote:
>
>> In case you missed my earlier reply (lost at the bottom of an old thread
>> in your email client :-) ) I repost it like this.
>
> Good point, I still owe you a response on that thread. Here it comes :)
>
>> Having thought about devops and monitoring I must admit that I am not
>> happy about where it was heading.
>>
>> I love LPI's generic and practical approach, so I spent some time on
>> that regarding devops and monitoring.
>>
>> Yes, a devops guy needs to know about monitoring.
>> Yes, he should know that there are a few popular open source projects
>> that do monitoring: Nagios, Icinga, Zabbix, Prometheus (if you must
>> insist, although I think it is not nearly mature enough).
>>
>> No, he should not become an expert in one of these packages.
>> (Well, I could say it must be Zabbix, but 10 to 1 somebody will see that
>> completely differently.)
>
> We had some opinions on whether or not to test a specific product / project.
> The problem with concepts is that they are hard to test. Examples ease this
> a lot because they avoid long verbal explanations. Specifying a specific
> tool might also provide guidance for candidates who are new to a topic,
> because it gives them a point to start their study (and maybe learn enough
> to pick another solution that better serves their needs). Take email
> servers as an example; one might try to test email delivery on a conceptual
> level only, but (for very good reasons) we're testing Postfix in LPIC-2. In
> fact, we used to test several MTAs in former times.
> After all, we have to find the right balance between having some meat on
> the bones (by having examples for the concepts we test), being useful (to
> those who use the objectives to learn new topics) and being efficient (to
> those who know a different tool and prepare for the exam). All this has to
> be decided from the candidate's perspective.
>
> For the DevOps Tools Engineer exam, this candidate's focus is
> microservices in a dynamic environment where new containers / VMs are
> spawned automatically, potentially at a very high frequency, potentially
> triggered by some automatism. This has significant influence on how
> monitoring works. Keeping track of a dynamic environment requires a
> monitoring system to use some kind of service discovery. Furthermore, tools
> like Kubernetes can detect the failure of a container/pod and restart it
> automatically. Monitoring the old pod wouldn't be of great benefit; instead,
> the number of container failures or pod restarts might be a better indicator
> for finding problems, since the failure of a single container/pod might
> not affect the overall availability of a service. This shifts the interest
> from a single server/container to services, and from simple up/down to more
> detailed metrics. The main reason why we reconsidered Icinga2 was these
> requirements and how easy it is to fulfill them.
>
>> But we can tell students about things like:
>> - Be aware of sizing. The amount of monitoring information is the number
>>   of items times the number of servers.
>
> Not necessarily. It could also be the general availability of a service, no
> matter how many (virtual/containerized) servers provide the service. It
> might also be the overall rate of failing requests, the overall number of
> certain API calls, the overall number of available processing nodes, the
> number of container restarts...
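A restart count like the one mentioned above is exactly the kind of thing Prometheus alerting rules express. A minimal sketch in the YAML rule-file format, assuming kube-state-metrics exposes `kube_pod_container_status_restarts_total` (the group name, alert name, and threshold below are illustrative assumptions, not from any real deployment):

```yaml
# Hypothetical Prometheus alerting rule; names and thresholds are examples.
groups:
  - name: pod-restarts
    rules:
      - alert: PodRestartingTooOften
        # roughly: more than 3 restarts per hour, averaged over 15 minutes
        expr: rate(kube_pod_container_status_restarts_total[15m]) * 3600 > 3
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} is restarting frequently"
```

Because the rule is evaluated against the whole time series, new pods discovered via service discovery are covered automatically, without registering each one by hand.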
>> - Know the difference between storing in an RRD database, a SQL database,
>>   or an Elasticsearch database, and the difference in housekeeping.
>
> Here we run into the same problems as mentioned above: in a general
> approach, questions on these topics can easily become vague, while using a
> specific example requires us to make a choice.
>
>> - The sort-of standard way return/error levels are organised:
>>
>> Nagios/Icinga:
>> Plugin Return Code   Service State   Host State
>> 0                    OK              UP
>> 1                    WARNING         UP or DOWN/UNREACHABLE*
>> 2                    CRITICAL        DOWN/UNREACHABLE
>> 3                    UNKNOWN         DOWN/UNREACHABLE
>>
>> Zabbix: any exit code that is different from 0 is considered an
>> execution failure.
>>
>> Prometheus: ? (couldn't find it, pointers welcome)
>
> Short answer: there is the "up" time series, which might be an initial
> indicator.
>
> Longer answer: what defines a warning / critical / ok state of a service
> or an application? A lot of these definitions stem from metrics. Nagios and
> Icinga allow us to collect performance data, but they are pretty static in
> how they interpret it (basically warning / critical thresholds).
> Prometheus can also collect multiple metrics and can be configured to alert
> on thresholds. For Nagios / Icinga(2), storing performance data over a
> longer period of time requires additional helpers; I heard the cool kids
> use datastores like InfluxDB or Graphite and dashboards like Grafana --
> which basically ends up with the same dashboards Prometheus creates. The
> result for what people want when monitoring their microservices seems to be
> pretty similar, although Prometheus seems to be the easier way to get
> there. That doesn't mean there is no use case for Nagios / Icinga(2); but
> given the context of the DevOps Tools Engineer exam, Prometheus seems to be
> the better fit.
>
> I know this is arguable, but I hope this makes the motivation of the
> change to Prometheus a little more transparent.
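The Nagios/Icinga return-code convention in the table above is easy to show in code. A minimal sketch of a check plugin in shell (the thresholds and the choice of checking root-filesystem usage are illustrative assumptions, not taken from any shipped plugin):

```shell
#!/bin/sh
# Sketch of a Nagios/Icinga-style check: measure root filesystem usage
# and map it onto the standard plugin return codes 0-3.
# WARN/CRIT thresholds are made-up example values.

WARN=80
CRIT=90

check_disk() {
    # Percentage used on /, without the trailing '%'
    usage=$(df -P / | awk 'NR==2 { gsub(/%/, ""); print $5 }')

    if [ -z "$usage" ]; then
        echo "UNKNOWN - could not determine disk usage"
        return 3
    elif [ "$usage" -ge "$CRIT" ]; then
        echo "CRITICAL - / at ${usage}% | usage=${usage}%"
        return 2
    elif [ "$usage" -ge "$WARN" ]; then
        echo "WARNING - / at ${usage}% | usage=${usage}%"
        return 1
    fi
    echo "OK - / at ${usage}% | usage=${usage}%"
    return 0
}

check_disk
echo "plugin exit code: $?"
```

Nagios or Icinga would run such a script, read its exit code, and derive the service state from the table above; anything after the `|` in the output is treated as performance data. The Zabbix convention is simpler: it only distinguishes exit code 0 from everything else.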
> Feel free to follow up :)
>
> Fabian
>
> PS: For those of you who like numbers:
> curl -s https://hub.docker.com/v2/repositories/prom/prometheus/ | jq '{ pull_count }'
>
>> On 17-08-17 at 15:55, Jeroen Baten wrote:
>> >
>> > I thought I'd give Prometheus a try.
>> > I really don't understand the enthusiasm.
>> > All I can see is that the agent sends data to the server and I can get a
>> > graph for the data.
>> >
>> > Maybe I am mistaken, but I can't see things like templates
>> > (pre-configured lists of triggers and data) like in Zabbix, or how to
>> > configure alerts.
>> >
>> > And if I want a Prometheus dashboard I have to install Rails and install
>> > PromDash. But even then I have to add all the metrics that I need to
>> > monitor.
>> >
>> > I want to be able to add servers daily from a CMDB to my monitoring
>> > solution and attach some templates.
>> > I do not want to be bothered by metrics unless something goes wrong.
>> >
>> > AFAICT Prometheus is nice for small-scale projects, but that's it.
>> >
>> > Am I maybe missing something?
>> > regards,
>> > Jeroen
>>
>> --
>> Jeroen Baten | email: [email protected]
>>              | web:   www.i2rs.nl
>>              | tel:   +31 (0)345 - 75 26 28
>>              | Molenwindsingel 46, 4105 HK, Culemborg, the Netherlands
>> _______________________________________________
>> lpi-examdev mailing list
>> [email protected]
>> http://list.lpi.org/cgi-bin/mailman/listinfo/lpi-examdev

--
Alex
[email protected]
Linux, Unix, Virtualization and Middleware Analyst
Linux and Open Source Instructor
-----------------------------
AWS Technical Professional
Azure Datacenter in Cloud Platform for Technical
CompTIA Linux+ Powered by LPI
SUSE 11 Certified Linux Administrator
SUSE 11 Technical Specialist
LPIC-1 Certified Linux Administrator
LPIC-2 Certified Linux Engineer
