Re: [opnfv-tech-discuss] [barometer] VES Heartbeats

2017-11-20 Thread SULLIVAN, BRYAN L (BRYAN L)
OK, here are some thoughts then to consider.

  *   Collectd (or another monitoring frontend) on each host will probably be
sending measurements at least once a minute (VNFs may have longer
cycles, but for bare metal hosts I imagine we would want measurements at least
every minute). Those would make an easy trigger for proxied host heartbeats
from the VES Agent.
  *   No report from collectd means the host may be down.
  *   To keep collectd itself from causing a false positive heartbeat failure,
it should be run in a container that is managed by a framework like Kubernetes,
so that if/when it fails it is automatically respawned on the same host.
  *   Heartbeats from VNFs should be integrated into the VNF code,
such that if the VNF is dead or a zombie, the heartbeat fails. They should not be
issued by a separate process on the VNF host (VM or container), since that may
result in a false negative (heartbeats that continue after the VNF has failed).
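
The "no report means the host may be down" point above can be sketched roughly as follows. This is only an illustration, not the ves_app.py implementation: the HostLiveness class, its method names, and the 120-second timeout are all assumptions made for the example.

```python
import time


class HostLiveness:
    """Hypothetical helper: remembers when each host last sent a measurement."""

    def __init__(self, timeout_s=120.0, clock=time.monotonic):
        # timeout_s is an assumed value (two missed one-minute reports).
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_seen = {}  # host name -> time of the last report

    def record_report(self, host):
        """Call whenever a collectd measurement arrives for a host."""
        self._last_seen[host] = self._clock()

    def suspect_hosts(self):
        """Return hosts whose last report is older than the timeout."""
        now = self._clock()
        return sorted(
            host
            for host, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )
```

A host only shows up in suspect_hosts() after it has reported at least once, so hosts that never came up would need to be seeded from an inventory, which this sketch leaves out.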

If that makes sense, maybe Gokul can work with Intel to update the ves_app.py
Agent to proxy the heartbeats for all the hosts it gets reports from. I'll work
on options for deploying the collectd containers using, e.g., Kubernetes+Helm, and
on creating a demo/dummy VNF that issues heartbeats via the ONAP Agent libraries
(Gokul is welcome to help there as well...).
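
As a rough idea of what the demo/dummy VNF could look like, here is a minimal sketch in which the heartbeat is issued from the same loop that does the VNF's work, so a hung or dead VNF stops sending it. The DummyVnf class and the send_heartbeat callable are hypothetical; send_heartbeat stands in for whatever the ONAP Agent libraries provide, whose actual API is not shown here.

```python
class DummyVnf:
    """Hypothetical dummy VNF that emits heartbeats from its own work loop."""

    def __init__(self, name, send_heartbeat, heartbeat_every=5):
        self.name = name
        # send_heartbeat is a placeholder for the real agent library call.
        self._send = send_heartbeat
        self._every = heartbeat_every  # assumed: one heartbeat per N work steps
        self._steps = 0

    def step(self):
        """Do one unit of (pretend) work, then heartbeat if one is due."""
        self._steps += 1
        if self._steps % self._every == 0:
            self._send({
                "source": self.name,
                "sequence": self._steps // self._every,
            })


if __name__ == "__main__":
    vnf = DummyVnf("dummy-vnf-1", send_heartbeat=print)
    for _ in range(10):
        vnf.step()
```

Because the heartbeat is tied to progress in step(), there is no separate process that could keep reporting after the work loop has stopped.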

Thanks,
Bryan Sullivan | AT

From: GUPTA, ALOK
Sent: Monday, November 20, 2017 6:19 AM
To: SULLIVAN, BRYAN L (BRYAN L) 
Cc: 'opnfv-tech-discuss@lists.opnfv.org' 
Subject: RE: [barometer] VES Heartbeats

Bryan:

Interesting question. It is similar to getting VNF events from an EMS or OAM
VM. The rationale for the heartbeat event was to avoid sending
pings/queries to the devices, and instead have DCAE analytics analyze
heartbeats and metrics to determine the health status of the device (compared to
the health-check query to the VNF that is used currently). We discussed earlier
whether incoming events could be taken to mean the VNF is OK, instead of using a
heartbeat, and the team felt otherwise. Fault and syslog event frequency can vary
(you may not receive an event for hours if the VNF is running smoothly). With
metrics the interval can be long (5 to 15 minutes), thus the need for a
Heartbeat event.
The heartbeat is not from the agent but for the devices for which data is being
sent. In some cases the entity forwarding the data could determine the health and
create and send a HB event. As you said very well in your email, for the
infrastructure scenario we may need to proxy the heartbeats for hosts not
running the agent.
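
To illustrate why a long metrics interval is a poor substitute for a heartbeat, here is a small sketch of a missed-interval check. It does not describe how DCAE analytics actually works: the function names, the 60-second heartbeat interval, and the threshold of three missed intervals are assumptions for the example. Only the 15-minute metrics interval comes from the discussion above.

```python
def missed_intervals(last_event_s, now_s, interval_s):
    """Number of whole reporting intervals that have passed with no event."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return max(0, int((now_s - last_event_s) // interval_s))


def is_suspect(last_event_s, now_s, interval_s, max_missed=3):
    """True once at least max_missed intervals passed silently (assumed rule)."""
    return missed_intervals(last_event_s, now_s, interval_s) >= max_missed


if __name__ == "__main__":
    silent_for_s = 200
    # Assumed 60 s heartbeat: three intervals already missed.
    print(is_suspect(0, silent_for_s, interval_s=60))
    # 15 min metrics: the first interval has not even elapsed yet.
    print(is_suspect(0, silent_for_s, interval_s=900))
```

With the same threshold, a 15-minute metrics cycle would need 2700 seconds of silence before the device is flagged, against 180 seconds for a one-minute heartbeat.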

Hope this helps.


Regards,

Alok Gupta
732-420-7007
MT B2 3D30
ag1...@att.com

From: SULLIVAN, BRYAN L
Sent: Monday, November 20, 2017 8:55 AM
To: GUPTA, ALOK
Cc: 'opnfv-tech-discuss@lists.opnfv.org'
Subject: [barometer] VES Heartbeats

Alok,

Thinking about the shared Agent (ves_app.py from Barometer) design, in which we
don't need an agent running on each node but can use a single agent running on
the local cloud that aggregates VES events from the Kafka bus, brings up
the question of how heartbeats are supposed to work (and what we use them for)
in the VES design.

Beyond the VNF (presumably by integration of the ONAP VES library into the 
VNF), have you been assuming that the heartbeats represent the health of:

  *   The VES agent
  *   A host (real or virtual, whether running a VES agent or not) from which 
VES events are received

If the latter, we need to consider how the agent can proxy the heartbeats for
hosts on which there is no agent running. For example, the agent can keep a
per-host flag that is set whenever a collectd event is picked up on the Kafka bus
during the heartbeat period, and send a Heartbeat report for each flagged host at
the end of the period. But really, in that case, couldn't DCAE derive that
information anyway from what it had received? So it calls into question the
purpose of the heartbeat beyond the VNF itself (the obvious use case). I just
need to clarify it.
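
The per-host flag idea could look roughly like the sketch below. It is an illustration only: HeartbeatProxy and its methods are hypothetical names, consuming from the Kafka bus and the periodic timer are left out, and send_heartbeat is a placeholder for however the agent would post a Heartbeat report.

```python
class HeartbeatProxy:
    """Hypothetical per-host flag tracker for proxied heartbeats."""

    def __init__(self, send_heartbeat):
        # send_heartbeat(host) is a placeholder for the real reporting call.
        self._send = send_heartbeat
        self._seen = {}  # host name -> True if an event arrived this period

    def on_collectd_event(self, host):
        """Call for every collectd event picked up from the bus."""
        self._seen[host] = True

    def end_of_period(self):
        """Send one heartbeat per flagged host, then clear all flags."""
        reported = []
        for host in sorted(self._seen):
            if self._seen[host]:
                self._send(host)
                reported.append(host)
            self._seen[host] = False
        return reported
```

A host that was seen in an earlier period but not in the current one simply gets no heartbeat, which is what the collector would have to notice.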

Thanks,
Bryan Sullivan | AT

___
opnfv-tech-discuss mailing list
opnfv-tech-discuss@lists.opnfv.org
https://lists.opnfv.org/mailman/listinfo/opnfv-tech-discuss


