[bess] Re: Inverse multi-layer OAM

2025-03-23 Thread Greg Mirsky
Hi Robert,
I wholeheartedly agree that local and e2e OAM are complementary tools in an
operator's toolbox. Usually, a multi-layer OAM is constructed so that e2e
provides the network with a safety net. In that manner, local repair of a
link failure is expected to restore services before the failure is detected
on the e2e level. As I understand it, the proposal uses a different scheme.
According to it, e2e network detection is expected to be more aggressive
than the link-level OAM. To me, that's an unusual arrangement.
As for performance monitoring, although some performance metrics can be
measured spatially to compose e2e metrics, e2e performance monitoring is
easier to deploy in many environments.

Regards,
Greg
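
A minimal sketch of the timer relationship discussed in this message, using the 3.3 ms link-layer interval quoted further down the thread. The detect multiplier of 3, the 5 ms local repair time, and both e2e probe intervals are assumptions chosen for illustration; none of them comes from the draft or from the thread. The check simply asks whether local detection plus local repair finishes before e2e detection would fire.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OamLayer:
    name: str
    tx_interval_ms: float   # interval between probes
    detect_multiplier: int  # consecutive missed probes before failure is declared

    @property
    def detection_time_ms(self) -> float:
        # Detection time modeled as interval times multiplier.
        return self.tx_interval_ms * self.detect_multiplier


def e2e_is_safety_net(link: OamLayer, e2e: OamLayer, local_repair_ms: float) -> bool:
    """True when local detection plus repair completes before e2e detection fires."""
    return link.detection_time_ms + local_repair_ms < e2e.detection_time_ms


# 3.3 ms is from the thread; the multiplier of 3 is an assumption.
link = OamLayer("single-hop BFD", tx_interval_ms=3.3, detect_multiplier=3)
# Hypothetical e2e settings: one conventional, one more aggressive than the link layer.
conventional_e2e = OamLayer("e2e probes", tx_interval_ms=50.0, detect_multiplier=3)
inverse_e2e = OamLayer("e2e probes", tx_interval_ms=2.0, detect_multiplier=3)

LOCAL_REPAIR_MS = 5.0  # hypothetical local protection switch time

print(e2e_is_safety_net(link, conventional_e2e, LOCAL_REPAIR_MS))  # True
print(e2e_is_safety_net(link, inverse_e2e, LOCAL_REPAIR_MS))       # False
```

With these assumed numbers the conventional arrangement lets local repair finish first, while the inverse arrangement has the e2e layer reacting before the link layer has even declared the failure.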

On Wed, Mar 19, 2025 at 11:21 PM Robert Raszuk  wrote:

> Hi Greg,
>
> I am very much in support of end to end path assurance. And by assurance I
> mean not only e2e liveness but also e2e loss, delays, jitter etc ...
>
> The main reason is that link-layer failure detection (even if done on
> every link in the path) does not provide any information about transit
> via network devices. And those can be subject to packet drops, selective
> packet drops (brownouts), delays and jitter via box fabrics in distributed
> systems etc ... So to me, even if e2e is slower than local link detection,
> it is still very much a preferred way to assure end to end path quality.
>
> Sure, some of this is done at the application layer, but then it is done
> mainly for statistics and reporting. Doing it at the network layer opens
> up the possibility of choosing a different path (quite likely via a
> different provider) when the original path experiences issues or service
> degradation that, with link-by-link failure detection, is invisible to
> the endpoints.
>
> I think at the end of the day those two are not really competing solutions
> but complementary. And of course end to end makes sense especially in
> deployments where you can have diverse paths end to end.
>
> Cheers
> Robert
>
> On Wed, Mar 19, 2025 at 4:58 AM Greg Mirsky  wrote:
>
>> Hi Himanshu,
>>
>> Thank you for the presentation of
>> draft-karboubi-spring-sidlist-optimized-cs-sr.
>> If I understood your response to Ali correctly, the proposed mechanism is
>> expected to use more aggressive network failure detection than the link
>> layer. If that is correct, I have several questions about the multi-layer
>> OAM:
>>
>>    - AFAIK link-layer failures are detected within 10 ms using a
>>      connectivity check mechanism (CCM of Y.1731 or a single-hop BFD)
>>      with a 3.3 ms interval.
>>    - If the link failure is detectable within 10 ms, what detection time
>>      for the path, i.e., E2E connection failure detection, is suggested?
>>      What interval between test probes will be used in that case?
>>    - Furthermore, even if the path converges around the link failure
>>      before the local protection is deployed, the link failure will be
>>      detected, and the protection mechanism will be deployed despite the
>>      Orchestrator setting up its recovery path in the network. If that is
>>      correct, local defect detection and protection are unnecessary
>>      overheads. Would you agree?
>>
>>
>> Regards,
>>
>> Greg
>> ___
>> BESS mailing list -- [email protected]
>> To unsubscribe send an email to [email protected]
>>
>
___
BESS mailing list -- [email protected]
To unsubscribe send an email to [email protected]


[bess] Re: Inverse multi-layer OAM

2025-03-20 Thread Shah, Himanshu
Yes you did. Concerns are misplaced.
The thread has progressed and it is unfortunate that:

  *   Greg wrongly commented on the BESS WG list. This thread SHOULD NOT be on 
the BESS WG email list. It belongs on the SPRING WG list. The draft was 
discussed in the SPRING WG.
  *   It would really help to comment on the running thread rather than in the 
middle of the thread, which causes forks.

Please take this discussion to the SPRING mailing list.

Thanks,
Himanshu


From: Zafar Ali (zali)
Date: Thursday, March 20, 2025 at 11:19 AM
To: Joel Halpern; Greg Mirsky; Robert Raszuk
Cc: Shah, Himanshu; BESS; [email protected]; Zafar Ali (zali)
Subject: Re: [bess] Re: Inverse multi-layer OAM
Hi

I agree with Joel (as I also mentioned during the Spring session).

Thanks

Regards … Zafar

From: Joel Halpern
Date: Thursday, March 20, 2025 at 10:42 AM
To: Greg Mirsky; Robert Raszuk
Cc: Shah, Himanshu; BESS; [email protected]
Subject: [bess] Re: Inverse multi-layer OAM

It seems rather counter-intuitive to want to try to repair things end-to-end 
faster than one expects local devices to detect local failures.  The implied 
information race conditions seem an invitation to trouble.

Yours,

Joel

[bess] Re: Inverse multi-layer OAM

2025-03-19 Thread Zafar Ali (zali)
Hi

I agree with Joel (as I also mentioned during the Spring session).

Thanks

Regards … Zafar



[bess] Re: Inverse multi-layer OAM

2025-03-19 Thread Joel Halpern
It seems rather counter-intuitive to want to try to repair things 
end-to-end faster than one expects local devices to detect local 
failures.  The implied information race conditions seem an invitation to 
trouble.


Yours,

Joel



[bess] Re: Inverse multi-layer OAM

2025-03-19 Thread Robert Raszuk
Hi Greg,

I am very much in support of end to end path assurance. And by assurance I
mean not only e2e liveness but also e2e loss, delays, jitter etc ...

The main reason is that link-layer failure detection (even if done on every
link in the path) does not provide any information about transit via network
devices. And those can be subject to packet drops, selective packet drops
(brownouts), delays and jitter via box fabrics in distributed systems etc
... So to me, even if e2e is slower than local link detection, it is still
very much a preferred way to assure end to end path quality.

Sure, some of this is done at the application layer, but then it is done
mainly for statistics and reporting. Doing it at the network layer opens up
the possibility of choosing a different path (quite likely via a different
provider) when the original path experiences issues or service degradation
that, with link-by-link failure detection, is invisible to the endpoints.

I think at the end of the day those two are not really competing solutions
but complementary. And of course end to end makes sense especially in
deployments where you can have diverse paths end to end.

Cheers
Robert
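
The path-selection idea in this message can be sketched roughly as follows, assuming each endpoint keeps per-path probe results. The function names, thresholds, and sample values are hypothetical, and jitter is simplified to the mean absolute change between consecutive delays rather than any standardized estimator.

```python
from statistics import mean
from typing import Dict, List, Optional


def summarize(sent: int, delays_ms: List[float]) -> Dict[str, float]:
    """Summarize one path; delays_ms has one entry per probe that arrived."""
    loss = 1.0 - len(delays_ms) / sent
    delay = mean(delays_ms) if delays_ms else float("inf")
    # Simplified jitter: mean absolute change between consecutive delays.
    steps = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    jitter = mean(steps) if steps else 0.0
    return {"loss": loss, "delay_ms": delay, "jitter_ms": jitter}


def pick_path(stats: Dict[str, Dict[str, float]], max_loss: float,
              max_delay_ms: float, max_jitter_ms: float) -> Optional[str]:
    """Return the lowest-delay path that meets every threshold, or None."""
    usable = [
        name for name, s in stats.items()
        if s["loss"] <= max_loss
        and s["delay_ms"] <= max_delay_ms
        and s["jitter_ms"] <= max_jitter_ms
    ]
    return min(usable, key=lambda name: stats[name]["delay_ms"]) if usable else None


# Hypothetical samples: the primary path shows a brownout (2 of 10 probes
# lost) that per-link liveness checks would not report to the endpoints.
stats = {
    "primary": summarize(sent=10, delays_ms=[20.0, 22.0] * 4),
    "backup": summarize(sent=10, delays_ms=[30.0] * 10),
}
best = pick_path(stats, max_loss=0.01, max_delay_ms=50.0, max_jitter_ms=5.0)
print(best)  # backup
```

The primary path is faster on average, but its loss exceeds the assumed threshold, so the endpoint moves to the backup path.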
