On 11/8/2021 12:38 PM, Vladimir Oltean wrote:
On Mon, Nov 08, 2021 at 12:11:11PM -0800, Christopher Wingert wrote:
Hi,

I am working with an Aquantia AQC 107 Ethernet interface.  After the announce
message is sent on FD_GENERAL, a poll() of the FD_GENERAL descriptor
generates a POLLERR.  I see 3 delay messages go out the interface on
FD_EVENT (before the announce message) without issue (no socket error
on read on the FD_EVENT descriptor).

The only difference I see between the two sockets is how the sock_filter is
set up.

I am thinking this is an issue with the Aquantia driver, as the same command
on a Mellanox ConnectX-5 works fine.

Has anyone seen this issue, or does anyone have a clue as to where I should start?

Thanks!
Chris


ptp4l command line : ptp4l -i els1 -H -P -2 -m
Kernel is 4.18
I downloaded the latest Atlantic driver, 2.4.14.0, from the Marvell website.
I have upgraded the AQC 107 firmware to 3.1.121.

I've no experience with this driver whatsoever, but generally, what
ptp4l receives on the error queue of a socket is a TX timestamp. What is
surprising is that there's a TX timestamp for a general (not event)
message, because ptp4l does not ask for these to be timestamped.

Apart from the error messages, does the system otherwise behave ok?

You can try to read from the general message socket into a packet buffer
and hexdump it, put it in tcpdump and see what it is. Then the next step
might be to process its control messages (cmsg), although my first guess
would be that TX timestamping is what's going on.
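
The suggestion above could be sketched roughly as follows. This is a minimal example under stated assumptions, not ptp4l's own code: dump_errqueue() and describe_cmsg() are hypothetical helper names, and the buffer sizes are arbitrary. It reads one message from the socket's error queue without blocking, hexdumps the payload, and lists the control messages that arrived with it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/if_packet.h>

#ifndef SCM_TIMESTAMPING
#define SCM_TIMESTAMPING SO_TIMESTAMPING
#endif
#ifndef SOL_PACKET
#define SOL_PACKET 263
#endif

/* Hypothetical helper: name the control message types of interest here. */
const char *describe_cmsg(int level, int type)
{
	if (level == SOL_SOCKET && type == SCM_TIMESTAMPING)
		return "SCM_TIMESTAMPING (timestamp)";
	if (level == SOL_PACKET && type == PACKET_TX_TIMESTAMP)
		return "PACKET_TX_TIMESTAMP (sock_extended_err)";
	return "other";
}

/*
 * Hypothetical helper: read one message from the error queue of fd.
 * Returns the payload length, or a negative errno value
 * (-EAGAIN typically means the error queue was empty).
 */
int dump_errqueue(int fd)
{
	unsigned char pkt[1600];
	union { char buf[512]; struct cmsghdr align; } control;
	struct iovec iov = { .iov_base = pkt, .iov_len = sizeof(pkt) };
	struct msghdr msg;
	struct cmsghdr *cm;
	ssize_t cnt, i;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = control.buf;
	msg.msg_controllen = sizeof(control.buf);

	cnt = recvmsg(fd, &msg, MSG_ERRQUEUE | MSG_DONTWAIT);
	if (cnt < 0)
		return -errno;

	/* Hexdump the looped-back packet, 16 bytes per line. */
	for (i = 0; i < cnt; i++)
		printf("%02x%c", pkt[i], (i % 16 == 15) ? '\n' : ' ');
	printf("\n");

	/* Walk the control messages and print what each one is. */
	for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm))
		printf("cmsg level %d type %d: %s\n", cm->cmsg_level,
		       cm->cmsg_type,
		       describe_cmsg(cm->cmsg_level, cm->cmsg_type));

	return (int)cnt;
}
```

Calling dump_errqueue() on the general message descriptor right after poll() reports POLLERR would show whether anything is queued at all, and if so, whether it carries a timestamp.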

There are plenty of things that could go wrong in a driver (especially
in one you downloaded from the vendor's website and not from kernel.org).
If you're handy with the source code, you can check the condition under
which this driver offers hardware TX timestamps to the stack. It should
do so only if skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP is set for that
packet, AND hardware TX timestamping was requested through HWTSTAMP_TX_ON.
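
The condition described above can be modeled in user space as a small predicate. This is only an illustration of the logic, not code from the Atlantic driver: driver_should_report_tx_tstamp() is a hypothetical name, and MODEL_SKBTX_HW_TSTAMP is a local stand-in for the kernel-internal SKBTX_HW_TSTAMP flag, which is not exported to user space.

```c
#include <stdbool.h>
#include <linux/net_tstamp.h>

/*
 * Stand-in for the kernel-internal SKBTX_HW_TSTAMP flag; redefined
 * here purely for illustration.
 */
#define MODEL_SKBTX_HW_TSTAMP (1u << 0)

/*
 * A driver should hand a hardware TX timestamp to the stack only when
 * BOTH conditions hold: the packet itself asked for one (tx_flags),
 * and TX timestamping was enabled on the interface (tx_type).
 */
bool driver_should_report_tx_tstamp(unsigned int tx_flags, int tx_type)
{
	return (tx_flags & MODEL_SKBTX_HW_TSTAMP) &&
	       tx_type == HWTSTAMP_TX_ON;
}
```

A driver that checks only the second condition would produce timestamps for every transmitted packet, including general messages that never requested one.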

Thank you for the quick response!

This is what the current version from git looks like on the 107 without any code changes (3 delay requests, 1 announce). This loops indefinitely, and MASTER never gets enabled.
ptp4l[506134.862]: selected /dev/ptp11 as PTP clock
ptp4l[506134.889]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506134.889]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506134.889]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506141.948]: port 1 (els1): LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[506141.948]: selected local clock ac1f6b.fffe.dce92d as best master
ptp4l[506141.948]: port 1 (els1): assuming the grand master role
ptp4l[506141.950]: port 1 (els1): unexpected socket error
ptp4l[506141.950]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)


I changed the raw_send() function in raw.c to the code below to get the timestamp on both sockets.
   /*
    * Get the time stamp right away.
    */
   // return event == TRANS_EVENT ? sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE) : cnt;
   if ( event == TRANS_EVENT ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE);
   if ( event == TRANS_GENERAL ) return sk_receive(fd, pkt, len, NULL, hwts, MSG_ERRQUEUE);
   return cnt;

This is the result.
ptp4l[506201.215]: selected /dev/ptp11 as PTP clock
ptp4l[506201.245]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506201.245]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506201.245]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[506208.757]: port 1 (els1): LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[506208.757]: selected local clock ac1f6b.fffe.dce92d as best master
ptp4l[506208.757]: port 1 (els1): assuming the grand master role
ptp4l[506208.759]: poll for tx timestamp woke up on non ERR event
ptp4l[506208.759]: port 1 (els1): send announce failed
ptp4l[506208.759]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)

Unless there is something wrong in my code change, it doesn't seem to be a timestamp.

Are you saying that every POLLERR should be accompanied by a message in the error queue?
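
One way to narrow this down, sketched under assumptions: as far as I understand the Linux behavior, poll() can report POLLERR either because the error queue holds a message or because the socket has a pending error code. The hypothetical helper below, pending_socket_error(), reads that pending error with getsockopt(SO_ERROR); note that reading it also clears it.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return the pending error on a socket.
 *   0   no pending error, so a POLLERR more likely came from the
 *       error queue
 *   >0  an errno value that was pending on the socket
 *   <0  getsockopt() itself failed (negative errno)
 */
int pending_socket_error(int fd)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return -errno;
	return err;
}
```

If this returns a positive errno value right after the POLLERR, the wakeup was caused by a socket error rather than by a queued timestamp, which would match the "woke up on non ERR event" result above.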



_______________________________________________
Linuxptp-users mailing list
Linuxptp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-users
