Hi Vladimir,

Thanks for digging into this ... answers below.

> -----Original Message-----
> From: Vladimir Oltean <olte...@gmail.com>
> Sent: Monday, October 18, 2021 7:40 AM
> To: Hutchinson, Brian (US) - PSPC <brian.hutchin...@l3harris.com>
> Cc: linuxptp-users@lists.sourceforge.net; cegg...@arri.de
> Subject: [EXTERNAL] Re: [Linuxptp-users] Using G.8275.2 profile and getting
> tx timestamp timeout, but changing logSyncInterval etc. changes how often
> this happens
>
> On Fri, Oct 15, 2021 at 12:01:24AM +0000, brian.hutchin...@l3harris.com
> wrote:
> > > > > If this is a "stack" issue, what can I do to reduce the "message rate"
> > > > > or "grant duration" if these are related to whatever a "stack"
> > > > > issue is?
> > > >
> > > > I'd be willing to put my money on a driver bug. But for that you'd
> > > > need to confirm that the issue reproduces with the default.cfg and
> > > > not just with the
> > > > G.8275.2 profile. Don't try to run before you can walk.
> >
> > So I ran tests using a plain 1588 profile and E2E and yes the problem still
> > happens. Here is that config:
>
> There's something that just doesn't compute for me.
> In those patches, Christian wrote:
>
>     /* Currently, only P2P delay measurement is supported. Setting ocmode
>      * to slave will work independently of actually being master or slave.
>      * For E2E delay measurement, switching between master and slave would
>      * be required, as the KSZ devices filters out PTP messages depending on
>      * the ocmode setting:
>      * - in slave mode, DelayReq messages are filtered out
>      * - in master mode, Sync messages are filtered out
>      * Currently (and probably also in future) there is no interface in the
>      * kernel which allows switching between master and slave mode. For
>      * this reason, E2E cannot be supported. See patchwork for full
>      * discussion:
>      * https://patchwork.ozlabs.org/project/netdev/patch/20201019172435.4416-8-cegg...@arri.de/
>      */
>     ksz9477_ptp_tcmode_set(dev, KSZ9477_PTP_TCMODE_P2P);
>     ksz9477_ptp_ocmode_set(dev, KSZ9477_PTP_OCMODE_SLAVE);
>
> Did you modify the driver's OCMODE? I am super confused as to which

Yes. You echo -n E2E > /sys/class/ptp/ptp1/device/tcmode ... and
echo -n slave > /sys/class/ptp/ptp1/device/ocmode ... but for me they
default to E2E and slave so I just verify that they are correct before
running.

For me I'm using the im8mm fec mac driver as a fixed-link. Before we
realized we needed G.8275.2 and bonding for redundancy we just used the
fec_ptp which shows up as /dev/ptp0 and the ksz9567 shows up as /dev/ptp1.

> packets ptp4l is actually waiting for a TX timestamp for. Because if you're
> using E2E and not P2P, then the entire ksz9477_port_deferred_xmit() is just
> dead code, is it not?

It doesn't look like dead code to me ...

>
> > [global]
> > #
> > # Default Data Set
>
> (summary of your changes)
>
> twoStepFlag: 1 to 0
> slaveOnly: 0 to 1
> clockClass: 248 to 6
> fault_reset_interval: 4 to -128
> tx_timestamp_timeout: 10 to 1000
> unicast_listen: 0 to 1
> unicast_req_duration: 3600 to 300
> summary_interval: 0 to 4
> time_stamping: hardware to p2p1step
> tsproc_mode: filter to raw_weight
>
> Can you just print the packet in ptp4l? You're using the default.cfg settings
> otherwise, so the UDPv4 network_transport, so:
>
> static int udp_send(struct transport *t, struct fdarray *fda,
>                     enum transport_event event, int peer, void *buf, int len,
>                     struct address *addr, struct hw_timestamp *hwts)
> ...
>
>     cnt = sendto(fd, buf, len, 0, &addr->sa, sizeof(addr->sin));
>     if (cnt < 1) {
>         pr_err("sendto failed: %m");
>         return -errno;
>     }
>     /*
>      * Get the time stamp right away.
>      */
>     return event == TRANS_EVENT ? sk_receive(fd, junk, len, NULL, hwts, MSG_ERRQUEUE) : cnt;
>                                   ^
>                                   you can print the buf here if
>                                   sk_receive returns negative

Ok, I'll look at it.

>
> The only place I find where this makes sense to be called from is:
> port_delay_request:
>     if (port_prepare_and_send(p, msg, TRANS_EVENT)) {
>
> But that further suggests that you've modified the driver, because:
>
> /* Defer transmit if waiting for egress time stamp is required. */
> static struct sk_buff *ksz9477_defer_xmit(struct dsa_port *dp,
>                                           struct sk_buff *skb)
> {
>     /* Use cached PTP msg type from ksz9477_ptp_port_txtstamp(). */
>     ptp_msg_type = KSZ9477_SKB_CB(clone)->ptp_msg_type;
>     if (ptp_msg_type != PTP_MSGTYPE_PDELAY_REQ)
>         goto out_free_clone;  /* only PDelay_Req is deferred */
>
> So could you share the exact list of changes you've made to the patches from
> the form that they were posted in?

I haven't really changed anything with Christian's code so maybe best to
check out his attached .tar in his recent email. I thought his patches
were all posted but maybe not.

>
> > And I did find a bug in the DSA driver but it didn't appear to change
> > anything.
> >
> > In ksz9477_ptp_txtstamp_skb function the "ret" that is being assigned
> > by "wait_for_completion_timeout" returning is declared as an "int"
> > instead of an "unsigned long" so I fixed that.
>
> Doesn't really make a difference on a 64-bit machine.
> Nonetheless, is that the sticking point? Do you see this error message in
> dmesg when user space loses the TX timestamp?
>
>     dev_err(dev->dev, "timeout waiting for time stamp\n");

Yes, that's what I'm seeing.

>
> > ... still looking for other stuff but again, I'm probably not
> > experienced enough (yet) with DSA and LinuxPTP to do much good.

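For the packet print suggested at the sk_receive call in udp_send, a
minimal sketch of a helper could look like the following. It assumes the
buffer starts with the common PTP header as laid out in IEEE 1588
(messageType in the low nibble of byte 0, sequenceId at bytes 30-31 in
network byte order). The names ptp_msg_type_name and ptp_describe_buf are
hypothetical and are not part of linuxptp.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Layout assumptions taken from the common PTP header (34 bytes). */
#define PTP_HDR_LEN        34
#define PTP_OFF_MSGTYPE     0   /* messageType is the low nibble of byte 0 */
#define PTP_OFF_SEQUENCEID 30   /* 16-bit sequenceId, network byte order */

/* Map the 4-bit messageType field to a readable name. */
static const char *ptp_msg_type_name(uint8_t type)
{
	switch (type & 0x0f) {
	case 0x0: return "Sync";
	case 0x1: return "Delay_Req";
	case 0x2: return "Pdelay_Req";
	case 0x3: return "Pdelay_Resp";
	case 0x8: return "Follow_Up";
	case 0x9: return "Delay_Resp";
	case 0xa: return "Pdelay_Resp_Follow_Up";
	case 0xb: return "Announce";
	case 0xc: return "Signaling";
	case 0xd: return "Management";
	default:  return "unknown";
	}
}

/*
 * Hypothetical helper (not a linuxptp function): write the message type,
 * sequenceId and a hex dump of buf into out.
 * Returns the number of characters written (without the trailing NUL),
 * or -1 if buf is too short for a PTP header or out is too small.
 */
static int ptp_describe_buf(const void *buf, int len, char *out, size_t outlen)
{
	const uint8_t *p = buf;
	size_t off;
	int n, i;

	if (!p || len < PTP_HDR_LEN)
		return -1;

	n = snprintf(out, outlen, "%s seq %u len %d:",
		     ptp_msg_type_name(p[PTP_OFF_MSGTYPE]),
		     (unsigned)((p[PTP_OFF_SEQUENCEID] << 8) |
				p[PTP_OFF_SEQUENCEID + 1]),
		     len);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	off = (size_t)n;

	/* Append every byte as two hex digits, space separated. */
	for (i = 0; i < len; i++) {
		n = snprintf(out + off, outlen - off, " %02x", p[i]);
		if (n < 0 || (size_t)n >= outlen - off)
			return -1;
		off += (size_t)n;
	}
	return (int)off;
}
```

One way to use it would be to call ptp_describe_buf(buf, len, line,
sizeof(line)) in udp_send when sk_receive returns a negative value and
pass the resulting string to pr_err. That should show whether the message
that lost its TX timestamp is a Delay_Req or a Pdelay_Req.
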
Regards,
Brian