Re: [Linuxptp-users] Using G.8275.2 profile and getting tx timestamp timeout, but changing logSyncInterval etc. changes how often this happens

Vladimir Oltean Tue, 12 Oct 2021 16:12:23 -0700

On Fri, Oct 08, 2021 at 03:22:10PM +0000, Brian.Hutchinson--- via Linuxptp-users wrote:
> Hi,
>
> I'm using Christian's DSA patches
> (https://lkml.org/lkml/2020/10/19/633) on a NXP iMX8MM with a Microchip
> ksz9567 with ptp4l.conf setup for E2E G.8275.2 profile.  I'm running a
> 1G RGMII interface and my GM and unit under test is connected via a 1G
> Netgear dumb switch.
>
> Using 5.10.32 kernel with CONFIG_HZ_1000 and nohz=off on cmdline.
>
> I've been getting the "timed out while polling for tx timestamp" error
> which causes linuxptp to restart.  When linuxptp restarts my 1PPS
> (generated from Microchip switch) walks all over the place on my O
> Scope until linuxptp gets a good sync again and pulls 1PPS back into
> sync with the GM sync out reference I'm also watching on the scope.
>
> Of course increasing tx_timestamp_timeout doesn't appear to help in
> this case.  I've tried values all the way up to 8000.
>
> But I can significantly reduce the frequency of the problem if I make
> changes to some ptp4l.conf settings.
>
> With ptp4l.conf settings:
>
> logAnnounceInterval 1
> logSyncInterval 0
> logMinDelayReqInterval 0
> logMinPdelayReqInterval 0
> announceReceiptTimeout 2
>
> I'll see the tx timestamp timeout probably 15 or so times running a
> test overnight.
>
> If I set:
>
> logAnnounceInterval 1
> logSyncInterval 2
> logMinDelayReqInterval 2
> logMinPdelayReqInterval 2
> announceReceiptTimeout 2
>
> ... then I might see the tx timestamp timeout only once or twice on an
> overnight run.
>
> I read a comment from Douglas Arnold from Meinberg that if basically
> anything goes wrong with fulfilling a grant, message rate or grant
> duration, or both, should be reduced.
>
> I've searched the archives and read all of the responses and a few
> caught my attention.  Most say it's a driver bug but some said it
> could be a stack issue.  So I'm wondering since I can significantly
> decrease the occurrence of the tx timeout by modifying above settings,
> what other settings would affect or tune this particular telco
> profile?
>
> I'm still fairly new to all this and I understand the telco profiles
> are a bit unique so I'm trying to understand what ptp4l.conf settings
> I need to focus on for this particular profile.
>
> If this is a "stack" issue, what can I do to reduce the "message rate"
> or "grant duration" if these are related to whatever a "stack" issue
> is?


I'd be willing to put my money on a driver bug. But for that you'd need
to confirm that the issue reproduces with the default.cfg and not just
with the G.8275.2 profile. Don't try to run before you can walk.

Make no mistake, there was a reason why the patches you've pointed to
were not applied to the mainline kernel in their given form at the time.

But regardless, which specific version of the patches have you applied?
Your link points to the RFC (aka "barely works"), whereas the latest
version, before being abandoned, was v5.
https://patchwork.kernel.org/project/netdevbpf/cover/[email protected]/
I specifically commented that TX timestamps could get lost if user space
attempted to timestamp one frame while another was still in progress,
and this only got fixed in v5 by the addition of a ksz9477_defer_xmit()
function that waits until the in-flight skb has been timestamped. There
might be other issues too.
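
To illustrate the kind of race I mean, here is a toy model in Python. It is
not the driver code: the class names, the single "in flight" slot and the
overwrite behavior are all assumptions made up for the illustration. It only
shows why deferring the second frame keeps the first timestamp from being
lost.

```python
from collections import deque


class OverwritingPort:
    """Toy model (assumption): one timestamp slot, a new request clobbers it."""

    def __init__(self):
        self.in_flight = None
        self.timestamps = {}

    def xmit(self, frame_id):
        # A second request replaces the pending one, so the earlier frame
        # never receives a TX timestamp.
        self.in_flight = frame_id

    def tx_complete(self, hw_time):
        # Hardware reports a timestamp for whatever frame is in the slot.
        if self.in_flight is not None:
            self.timestamps[self.in_flight] = hw_time
            self.in_flight = None


class DeferringPort(OverwritingPort):
    """Toy model: hold back new frames until the in-flight one is stamped."""

    def __init__(self):
        super().__init__()
        self.deferred = deque()

    def xmit(self, frame_id):
        if self.in_flight is None:
            self.in_flight = frame_id
        else:
            self.deferred.append(frame_id)

    def tx_complete(self, hw_time):
        super().tx_complete(hw_time)
        # Release the next deferred frame, if any.
        if self.deferred:
            self.in_flight = self.deferred.popleft()


def run(port):
    port.xmit("A")
    port.xmit("B")  # requested while A is still in flight
    port.tx_complete(100)
    port.tx_complete(200)
    return port.timestamps
```

In this model run(OverwritingPort()) only ever yields a timestamp for "B",
so whoever is waiting for "A" times out, whereas run(DeferringPort()) yields
one for each frame.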

The logSyncInterval should not be making a difference, because the
driver performs one-step timestamping for Sync messages, so their rate
shouldn't matter: the TX timestamp isn't reported to user space.
Only the two-step TX timestamp of the Pdelay_Req frame is, and
therefore changing the logMinPdelayReqInterval value is the only
thing that should be able to affect how often you see the issue.

[ also, don't be shy about giving logMinPdelayReqInterval negative values:
  for example, -3 means 2^-3 seconds == 125 ms. We should see something
  really quickly with a setting like that ]
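
For reference, the log*Interval options are base-2 logarithms of the message
period in seconds. A minimal Python sketch of the conversion follows; the
helper names are made up for illustration and are not part of linuxptp.

```python
def log_interval_to_seconds(log_interval: int) -> float:
    """Convert a PTP log2 message interval to a period in seconds."""
    return 2.0 ** log_interval


def messages_per_hour(log_interval: int) -> float:
    """Number of messages sent per hour at the given log2 interval."""
    return 3600.0 / log_interval_to_seconds(log_interval)


if __name__ == "__main__":
    # The two values from the original report, plus the suggested -3.
    for value in (2, 0, -3):
        period = log_interval_to_seconds(value)
        rate = messages_per_hour(value)
        print(f"log interval {value:>2}: {period} s period, {rate:.0f} msgs/hour")
```

So going from 2 to -3 means 32 times as many timestamped requests per hour,
which is why a failure that shows up once or twice a night should show up
much sooner.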

Once you have a simple reproducer with v5, maybe Christian would be
able to tell you where to put some trace points in the kernel for a
better understanding of what goes wrong with the Pdelay_Req messages.

> Regards,
>
> Brian
>
> My complete ptp4l.conf settings.  These settings will run with fewer
> "timed out while polling for tx timestamp" occurrences but increase the
> 1PPS jitter observed on my O Scope to +/- 600ish ns.  When I run with
> the first set of logXxx settings above the jitter is much better at +/-
> 200ish ns.
>
> [global]
> #
> # Default Data Set
> #
> twoStepFlag             0
> slaveOnly               1
> priority1               128
> priority2               255
> domainNumber            44
> utc_offset             37
> #clockClass              248
> clockClass              255
> #step_window            3
> clockAccuracy           0xFE
> offsetScaledLogVariance 0xFFFF
> free_running            0
> freq_est_interval       1
> dscp_event              0
> dscp_general            0
> #dataset_comparison     ieee1588
> #for G.8275.1
> dataset_comparison      G.8275.x
> G.8275.defaultDS.localPriority  128
> #
> # Port Data Set
> #
> logAnnounceInterval     1
> logSyncInterval         2
> logMinDelayReqInterval  2
> logMinPdelayReqInterval 2
> announceReceiptTimeout  2
> syncReceiptTimeout      0
> delayAsymmetry          0
> fault_reset_interval    -128
> #fault_reset_interval    4
> neighborPropDelayThresh 20000000
> masterOnly              0
> G.8275.portDS.localPriority     128
> #
> # Run time options
> #
> assume_two_step         0
> logging_level           6
> path_trace_enabled      0
> follow_up_info          0
> hybrid_e2e              1
> inhibit_multicast_service       1
> net_sync_monitor        0
> tc_spanning_tree        0
> #tx_timestamp_timeout    300
> tx_timestamp_timeout    8000
> unicast_listen          1
> unicast_req_duration    300
> unicast_master_table    1
> use_syslog              1
> verbose                 0
> summary_interval        4
> kernel_leap             1
> #check_fup_sync          0
> check_fup_sync          1
> #
> # Servo Options
> #
> #write_phase_mode       1
> servo_offset_threshold  100
> servo_num_offset_values 64
> pi_proportional_const   0.0
> #pi_proportional_const   0.7
> pi_integral_const       0.0
> #pi_integral_const       0.3
> pi_proportional_scale   0.0
> pi_proportional_exponent        -0.3
> pi_proportional_norm_max        0.7
> pi_integral_scale       0.0
> pi_integral_exponent    0.4
> pi_integral_norm_max    0.3
> step_threshold          0.0
> #step_threshold         0.00002
> first_step_threshold    0.00002
> max_frequency           900000000
> clock_servo             pi
> sanity_freq_limit       200000000
> ntpshm_segment          0
> #
> # Transport options
> #
> transportSpecific       0x0
> ptp_dst_mac            01:1B:19:00:00:00
> p2p_dst_mac            01:80:C2:00:00:0E
> udp_ttl                 1
> #udp6_scope             0x0E
> uds_address             /var/run/ptp4l
> #
> # Default interface options
> #
> clock_type              OC
> network_transport       UDPv4
> #delay_mechanism         P2P
> delay_mechanism         E2E
> time_stamping           p2p1step
> #time_stamping           onestep
> #time_stamping           hardware
> #tsproc_mode            filter
> tsproc_mode             filter_weight
> delay_filter           moving_median
> #delay_filter_length    10
> delay_filter_length    100
> egressLatency           0
> ingressLatency          0
> boundary_clock_jbod     0
> #
> # Clock description
> #
> productDescription      ;;
> revisionData            ;;
> manufacturerIdentity    00:00:00
> userDescription         ;
> timeSource              0xA0
> maxStepsRemoved         255
> #
> [unicast_master_table]
> table_id                        1
> logQueryInterval                2
> UDPv4                           192.168.0.250
> #UDPv4                           192.168.1.250
> #
> [lan1]
> unicast_master_table
>
>
>
> CONFIDENTIALITY NOTICE: This email and any attachments are for the
> sole use of the intended recipient and may contain material that is
> proprietary, confidential, privileged or otherwise legally protected
> or restricted under applicable government laws. Any review,
> disclosure, distributing or other use without expressed permission of
> the sender is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies without
> reading, printing, or saving.

Am I an intended recipient? Let me know so I can delete the email if needed.
What about the sourceforge mail archive?


_______________________________________________
Linuxptp-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linuxptp-users
