On Mon, Nov 08, 2021 at 11:43:55PM +0200, Vladimir Oltean wrote: > On Mon, Nov 08, 2021 at 01:28:03PM -0800, Christopher Wingert wrote: > > > > > > On 11/8/2021 12:38 PM, Vladimir Oltean wrote: > > > On Mon, Nov 08, 2021 at 12:11:11PM -0800, Christopher Wingert wrote: > > > > Hi, > > > > > > > > I am working with a Aquantia AQC 107 ethernet interface. After the > > > > announce > > > > message is sent on FD_GENERAL, a poll() of the the FD_GENERAL descriptor > > > > generates a POLLERR. I see 3 delay messages go out the interface on > > > > FD_EVENT (previous to the announce message) without issue (no socket > > > > error > > > > on read on the FD_EVENT descriptor). > > > > > > > > The only difference i see between the two sockets is how the > > > > sock_filter is > > > > setup. > > > > > > > > I am thinking this is an issue with the Aquantia driver, as the same > > > > command > > > > on a Mellanox Connect X5 works fine. > > > > > > > > Has anyone seen this issue or have a clue as to where I should start? > > > > > > > > Thanks! > > > > Chris > > > > > > > > > > > > ptp4l command line : ptp4l -i els1 -H -P -2 -m > > > > Kernel is 4.18 > > > > I downloaded the latest Atlantic driver from the Marvell website > > > > 2.4.14.0 > > > > I have upgraded the AQC 107 firmware to 3.1.121 > > > I've no experience with this driver whatsoever, but generally, what > > > ptp4l receives on the error queue of a socket is a TX timestamp. What is > > > surprising is that there's a TX timestamp for a general (not event) > > > message, because ptp4l does not ask these to be timestamped. > > > > > > Apart from the error messages, does the system otherwise behave ok? > > > > > > You can try to read from the general message socket into a packet buffer > > > and hexdump it, put it in tcpdump and see what it is. Then the next step > > > might be to process its control messages (cmsg), although my first guess > > > would be that TX timestamping is what's going on. > > > > > > There are plenty of things that could go wrong in a driver (especially > > > in one you downloaded from the vendor's website and not from kernel.org). > > > If you're handy with the source code, you can check what is the > > > condition based on which this driver offers hardware TX timestamps to > > > the stack. It should be if skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP > > > is set for that packet, AND hardware TX timestamping was requested > > > through HWTSTAMP_TX_ON. > > > > Thank you for the quick response! > > > > This is what the current version from git looks like on the 107 without any > > code changes (3 delay requests, 1 announce), this loops indefinitely and > > MASTER never gets enabled. > > ptp4l[506134.862]: selected /dev/ptp11 as PTP clock > > ptp4l[506134.889]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE > > ptp4l[506134.889]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on > > INIT_COMPLETE > > ptp4l[506134.889]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on > > INIT_COMPLETE > > ptp4l[506141.948]: port 1 (els1): LISTENING to MASTER on > > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > ptp4l[506141.948]: selected local clock ac1f6b.fffe.dce92d as best master > > ptp4l[506141.948]: port 1 (els1): assuming the grand master role > > ptp4l[506141.950]: port 1 (els1): unexpected socket error > > ptp4l[506141.950]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED > > (FT_UNSPECIFIED) > > > > > > I changed raw.c function raw_send() to the below code to get the timestamp > > on both sockets. > > /* > > * Get the time stamp right away. > > */ > > // return event == TRANS_EVENT ? sk_receive(fd, pkt, len, NULL, hwts, > > MSG_ERRQUEUE) : cnt; > > if ( event == TRANS_EVENT ) return sk_receive(fd, pkt, len, NULL, hwts, > > MSG_ERRQUEUE); > > if ( event == TRANS_GENERAL ) return sk_receive(fd, pkt, len, NULL, > > hwts, MSG_ERRQUEUE); > > return cnt; > > > > This is the result. > > ptp4l[506201.215]: selected /dev/ptp11 as PTP clock > > ptp4l[506201.245]: port 1 (els1): INITIALIZING to LISTENING on INIT_COMPLETE > > ptp4l[506201.245]: port 0 (/var/run/ptp4l): INITIALIZING to LISTENING on > > INIT_COMPLETE > > ptp4l[506201.245]: port 0 (/var/run/ptp4lro): INITIALIZING to LISTENING on > > INIT_COMPLETE > > ptp4l[506208.757]: port 1 (els1): LISTENING to MASTER on > > ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES > > ptp4l[506208.757]: selected local clock ac1f6b.fffe.dce92d as best master > > ptp4l[506208.757]: port 1 (els1): assuming the grand master role > > ptp4l[506208.759]: poll for tx timestamp woke up on non ERR event > > ptp4l[506208.759]: port 1 (els1): send announce failed > > ptp4l[506208.759]: port 1 (els1): MASTER to FAULTY on FAULT_DETECTED > > (FT_UNSPECIFIED) > > > > Unless there is something wrong in my code change, it doesn't seem to be a > > timestamp. > > > > Are you saying that every POLLERR should be combined with a message in the > > Error Queue? > > It's still implausible that it's not a timestamp (and I don't know what > it can be if that's not it). "man poll" only says: > > POLLERR > Error condition (only returned in revents; ignored in > events). This bit is also set for a file descriptor > referring to the write end of a pipe when the read end has > been closed. > > and since ptp4l does not open connection-oriented sockets for general > PTP messages, I don't think it can detect that the read end has been > closed. > > What seems to be more likely to be going on is that you haven't made all > changes necessary for reading TX timestamps from the error queue of the > general socket. Have you called sk_timestamping_init? > > flags = 1; > if (setsockopt(fd, SOL_SOCKET, SO_SELECT_ERR_QUEUE, > &flags, sizeof(flags)) < 0) { > pr_warning("%s: SO_SELECT_ERR_QUEUE: %m", device); > sk_events = 0; > sk_revents = POLLERR; > } > > introduced by this kernel commit: > > commit 7d4c04fc170087119727119074e72445f2bb192b > Author: Keller, Jacob E <jacob.e.kel...@intel.com> > Date: Thu Mar 28 11:19:25 2013 +0000 > > net: add option to enable error queue packets waking select > > Currently, when a socket receives something on the error queue it only > wakes up > the socket on select if it is in the "read" list, that is the socket has > something to read. It is useful also to wake the socket if it is in the > error > list, which would enable software to wait on error queue packets without > waking > up for regular data on the socket. The main use case is for receiving > timestamped transmit packets which return the timestamp to the socket via > the > error queue. This enables an application to select on the socket for the > error > queue only instead of for the regular traffic. > > -v2- > * Added the SO_SELECT_ERR_QUEUE socket option to every architechture > specific file > * Modified every socket poll function that checks error queue > > Signed-off-by: Jacob Keller <jacob.e.kel...@intel.com> > Cc: Jeffrey Kirsher <jeffrey.t.kirs...@intel.com> > Cc: Richard Cochran <richardcoch...@gmail.com> > Cc: Matthew Vick <matthew.v...@intel.com> > Signed-off-by: David S. Miller <da...@davemloft.net> > > So you effectively cannot call poll() or select() on the error queue of > a socket without enabling this option. Also, I think the sk_receive() > function messes up quite badly, because of this incosistent mode in > which it's operating. See, it looks at this global variable called > sk_revents to figure out which events is poll() supposed to return. But > the code was written assuming that there's a single socket on which you > will poll for TX timestamps. And you have two, and configured > differently, at that: on one you call sk_timestamping_init() and on the > other you don't (or at least you don't mention that you do).
I think we are running around in circles. If you call sk_timestamping_init() you are effectively requesting TX timestamps for general messages, therefore changing the premise of the issue. Can you instead remove that check for !(pfd.revents & sk_revents)? _______________________________________________ Linuxptp-users mailing list Linuxptp-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linuxptp-users