Re: [Linuxptp-devel] [PATCH] Problem: Under GNSS error condition/fluctuation, ts2phc updating 1sec time into PHC resulting in wrong time.

2023-11-16 Thread ramesh t via Linuxptp-devel
Thanks Richard for the feedback.
Please find update comments with code changes.

>From a8e9b8ade110ff9eb9acc2bb28c6e3d1db136c29 Mon Sep 17 00:00:00 2001
From: Ramesh Gowda 
Date: Thu, 12 Oct 2023 15:26:27 +0530
Subject: [PATCH] GNSS fluctuation resulting in wrong update of NIC PHC time.


Context:
Under GNSS fluctuation condition due bad weather or improper location of GNSS 
antenna, observed ts2phc process updates 1 sec difference into the physical 
hard clock (PHC) of the NIC.
This 1 sec jump is seen momentarily as captured below.


    Mar 10 15:17:40 sno-cluster-nrnl04-master1 ts2phc: [161756.611] nmea delay: 
89943030 ns
    Mar 10 15:17:40 sno-cluster-nrnl04-master1 ts2phc: [161756.611] enp81s0f0 
extts index 0 at 1678461497.0 corr 0 src 1678461498.844774711 diff 
-10
    Mar 10 15:17:40 sno-cluster-nrnl04-master1 ts2phc: [161756.611] enp81s0f0 
master offset -10 s2 freq -1
    Mar 10 15:17:40 sno-cluster-nrnl04-master1 ts2phc: [161756.612] nmea delay: 
89943030 ns
    Mar 10 15:17:40 sno-cluster-nrnl04-master1 ts2phc: [161756.612] enp138s0f0 
extts index 0 at 1678461497.0 corr 0 src 1678461498.845574965 diff 
-10
    Mar 10 15:17:40 sno-cluster-nrnl04-master1 ts2phc: [161756.612] enp138s0f0 
master offset -10 s2 freq -1


Problem:
ts2phc process will update wrong timing into PHC, this time is periodically 
copied into system timing using phc2sys process.
This will result in sudden jump of time in system clock impacting others 
processes which are tightly bound to system timing.


Solution:
With ts2phc in SERVO_LOCKED state, time difference between PHC and GPS would be 
in +-1ns.
But under problem condition, the time difference will be +-1 sec.
To prevent wrong update of time, defined a new configurable variable called 
skip count.
With SERVO_LOCKED state and time diff is more than 500 msecond, ts2phc will 
skip update of PHC for configured skip count value.
Default skip count value would be set 120 (2 minutes).
Time difference stays more than 500 msecond for continuous intervals more than 
skip count value, ts2phc will update the PHC with time difference value.
---
 config.c       |  1 +
 ts2phc.c       |  3 +++
 ts2phc_slave.c | 23 +++
 3 files changed, 27 insertions(+)


diff --git a/config.c b/config.c
index d237de9..2a9ee6b 100644
--- a/config.c
+++ b/config.c
@@ -331,6 +331,7 @@ struct config_item config_tab[] = {
 GLOB_ITEM_INT("utc_offset", CURRENT_UTC_OFFSET, 0, INT_MAX),
 GLOB_ITEM_INT("verbose", 0, 0, 1),
 GLOB_ITEM_INT("write_phase_mode", 0, 0, 1),
+PORT_ITEM_INT("max_phc_update_skip_cnt", 120, 0, 14400),
 };
 
 static struct unicast_master_table *current_uc_mtab;
diff --git a/ts2phc.c b/ts2phc.c
index 2342858..5687c9b 100644
--- a/ts2phc.c
+++ b/ts2phc.c
@@ -15,6 +15,8 @@
 #include "ts2phc_slave.h"
 #include "version.h"
 
+int max_phc_update_skip_count;
+
 struct interface {
 STAILQ_ENTRY(interface) list;
 };
@@ -146,6 +148,7 @@ int main(int argc, char *argv[])
 print_set_verbose(config_get_int(cfg, NULL, "verbose"));
 print_set_syslog(config_get_int(cfg, NULL, "use_syslog"));
 print_set_level(config_get_int(cfg, NULL, "logging_level"));
+max_phc_update_skip_count = config_get_int(cfg, NULL, 
"max_phc_update_skip_cnt");
 
 STAILQ_FOREACH(iface, &cfg->interfaces, list) {
 if (1 == config_get_int(cfg, interface_name(iface), "ts2phc.master")) {
diff --git a/ts2phc_slave.c b/ts2phc_slave.c
index 749efe5..6af1aeb 100644
--- a/ts2phc_slave.c
+++ b/ts2phc_slave.c
@@ -29,6 +29,8 @@
 #define SAMPLE_WEIGHT1.0
 #define SERVO_SYNC_INTERVAL1.0
 
+extern int max_phc_update_skip_count;
+
 struct ts2phc_slave {
 char *name;
 STAILQ_ENTRY(ts2phc_slave) list;
@@ -42,6 +44,8 @@ struct ts2phc_slave {
 clockid_t clk;
 int no_adj;
 int fd;
+int max_offset_skip_count;
+int current_offset_skip_count;
 };
 
 struct ts2phc_slave_array {
@@ -219,6 +223,10 @@ static struct ts2phc_slave *ts2phc_slave_create(struct 
config *cfg, const char *
 goto no_ext_ts;
 }
 
+slave->max_offset_skip_count = max_phc_update_skip_count;
+slave->current_offset_skip_count = 0;
+pr_debug("PHC slave %s has skip cnt %d", device, 
slave->max_offset_skip_count);
+
 return slave;
 no_ext_ts:
 no_pin_func:
@@ -278,6 +286,17 @@ static int ts2phc_slave_event(struct ts2phc_slave *slave,
 adj = servo_sample(slave->servo, offset, extts_ts,
    SAMPLE_WEIGHT, &slave->state);
 
+if ((slave->state == SERVO_LOCKED) || (slave->state == 
SERVO_LOCKED_STABLE)) {
+if (llabs(offset) >= NS_PER_SEC / 2) {
+if (slave->current_offset_skip_count < 
slave->max_offset_skip_count) {
+pr_debug("Skip current PHC update %s offset %10" PRId64 " s%d 
freq %+7.0f",
+slave->name, offset, slave->state, adj);
+slave->current_offset_skip_count++;
+return 0;
+

Re: [Linuxptp-devel] [PATCH] Problem: Under GNSS error condition/fluctuation, ts2phc updating 1sec time into PHC resulting in wrong time.

2023-11-16 Thread Richard Cochran
On Thu, Nov 16, 2023 at 10:42:33AM +, ramesh t wrote:

> Under GNSS fluctuation condition due bad weather or improper
> location of GNSS antenna, observed ts2phc process updates 1 sec
> difference into the physical hard clock (PHC) of the NIC.  This 1
> sec jump is seen momentarily as captured below.

What is the root cause of the one second offset?

You need to identify the problem.  This issue sounds like a bug in the
GNSS radio firmware.  It does not seem like a bug in linuxptp.

I don't take patches that work around buggy third party equipment!

Thanks,
Richard


___
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel


Re: [Linuxptp-devel] [RFC PATCH v2 1/9] Add new TLV for CommonMeanLinkDelayInformation

2023-11-16 Thread Andrew Zaborowski
On Thu, 16 Nov 2023 at 05:27, Richard Cochran  wrote:
> On Mon, May 15, 2023 at 06:26:04PM -0400, Kishen Maloor wrote:
> > @@ -1129,6 +1130,27 @@ static int port_management_fill_response(struct port 
> > *target,
> >   memcpy(pwr, &target->pwr, sizeof(*pwr));
> >   datalen = sizeof(*pwr);
> >   break;
> > + case MID_CMLDS_INFO_NP:
> > + cmlds = (struct cmlds_info_np *)tlv->data;
> > + /* IEEE1588-2019 16.6.3.2 h) 1) && nrate.ratio_valid because
> > +  * we have no extra field to convey that separately.
> > +  */
> > + cmlds->serviceMeasurementValid =
> > + target->peer_portid_valid && !target->pdr_missing &&
> > + !target->multiple_pdr_detected &&
> > + target->nrate.ratio_valid;
> > + cmlds->meanLinkDelay = target->peerMeanPathDelay;
> > + cmlds->scaledNeighborRateRatio =
> > + (Integer32) (target->nrate.ratio * POW2_41 - POW2_41);
>
> > + /* 16.6.3.2: "Upon receipt of a request for information, the
> > +  * Common Mean Link Delay Service may in addition return the
> > +  * raw measurement data gathered by the service for use in
> > +  * estimating the  and ."
> > +  */
> > + cmlds->egress_ts = tmv_to_nanoseconds(target->peer_delay_t1);
> > + cmlds->rx_ts = tmv_to_nanoseconds(target->peer_delay_t2);
>
> Please drop these two fields.  They don't provide any benefit.  If the
> clients don't trust the CMLDS values, then they are free to measure
> the p2p delay themselves!

The two timestamps are passed to clock_peer_delay() by the receiving
port and stored in c->tsproc.  Then they're accessed by
get_raw_delay() which is used in the filter logic.  I'm not sure how
much value that has, we can possibly pass 0s to clock_peer_delay().

Regarding "source_port_index", I assume that would contain the ifindex
of the interface?  With virtual clocks I believe the PHC indices may
differ between the PTP ports on one physical port ("link port").

Best regards


___
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel


Re: [Linuxptp-devel] [RFC PATCH v2 1/9] Add new TLV for CommonMeanLinkDelayInformation

2023-11-16 Thread Richard Cochran
On Thu, Nov 16, 2023 at 11:11:50PM +0100, Andrew Zaborowski wrote:

> The two timestamps are passed to clock_peer_delay() by the receiving
> port and stored in c->tsproc.  Then they're accessed by
> get_raw_delay() which is used in the filter logic.  I'm not sure how
> much value that has, we can possibly pass 0s to clock_peer_delay().

If the client uses CMLDS, then it doesn't need tsproc logic at all.
It simply take the delay value from the CMLDS server.
 
> Regarding "source_port_index", I assume that would contain the ifindex
> of the interface?  With virtual clocks I believe the PHC indices may
> differ between the PTP ports on one physical port ("link port").

No, I mean the PTP port number.  These are taken from the order of the
interfaces on the command line and in the configuration file.

Thanks,
Richard


___
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel


Re: [Linuxptp-devel] [RFC PATCH v2 1/9] Add new TLV for CommonMeanLinkDelayInformation

2023-11-16 Thread Andrew Zaborowski
On Thu, 16 Nov 2023 at 23:46, Richard Cochran  wrote:
> On Thu, Nov 16, 2023 at 11:11:50PM +0100, Andrew Zaborowski wrote:
> > The two timestamps are passed to clock_peer_delay() by the receiving
> > port and stored in c->tsproc.  Then they're accessed by
> > get_raw_delay() which is used in the filter logic.  I'm not sure how
> > much value that has, we can possibly pass 0s to clock_peer_delay().
>
> If the client uses CMLDS, then it doesn't need tsproc logic at all.
> It simply take the delay value from the CMLDS server.

Make sense, the CMLDS would already be filtering the timestamps once.

>
> > Regarding "source_port_index", I assume that would contain the ifindex
> > of the interface?  With virtual clocks I believe the PHC indices may
> > differ between the PTP ports on one physical port ("link port").
>
> No, I mean the PTP port number.  These are taken from the order of the
> interfaces on the command line and in the configuration file.

Won't this be the same as the UDS message's sourcePortIdentity.portNumber?

Do you want to require the user to enforce that the port numbering is
the same between the ptp4l processes?  In our use case spec compliance
is the priority, including the unique clockIdentity for CMLDS
requirement, so we'd definitely need to be running a separate ptp4l
process for CMLDS in this schema.  I'm asking because the port
numbering thing is unobvious and the user effort to configure a
compliant setup with ptp4l is already very high.

So one solution would be to let the user configure a custom portNumber
under port settings.  My colleague even had a local change to allow
passing -i ,.  This would also remove the
(possibly unobvious) requirement to run all ports sharing the
clockIdentity as one ptp4l process.

Another option would be to pass the ifindex as I thought you were suggesting.

Best regards


___
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel


Re: [Linuxptp-devel] [RFC PATCH v2 1/9] Add new TLV for CommonMeanLinkDelayInformation

2023-11-16 Thread Richard Cochran
On Fri, Nov 17, 2023 at 01:41:19AM +0100, Andrew Zaborowski wrote:
> On Thu, 16 Nov 2023 at 23:46, Richard Cochran  
> wrote:
> > No, I mean the PTP port number.  These are taken from the order of the
> > interfaces on the command line and in the configuration file.
> 
> Won't this be the same as the UDS message's sourcePortIdentity.portNumber?

Ah, yes, you are right.  Even simpler!

> Do you want to require the user to enforce that the port numbering is
> the same between the ptp4l processes?

No.

> In our use case spec compliance
> is the priority, including the unique clockIdentity for CMLDS
> requirement, so we'd definitely need to be running a separate ptp4l
> process for CMLDS in this schema.

The clockIdentity is not exposed, so what do you mean by "compliance"?

> I'm asking because the port
> numbering thing is unobvious and the user effort to configure a
> compliant setup with ptp4l is already very high.

That is just par for the course.  I'm not the one making up tons of
crazy new options.

The strength of linuxptp is modularity and fine grained, configurable
options.  This means that a) ptp4l works even with equipment that
fails to follow the standards and/or profiles, and b) ptp4l can be
really hard to configure.

> So one solution would be to let the user configure a custom portNumber
> under port settings.

Port numbers start with 1 and increase in the order of the configured
interfaces.  If it really is required for CMLDS ports to have
*different* numbers than the normal ports, we can add a "port index
offset" option that would start the numbering at some given value.

Thanks,
Richard


___
Linuxptp-devel mailing list
Linuxptp-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-devel