Hi, Daniel >You've added all this extra functionality to pass arbitrary >options, but then not used it in any of the later patches. >We've been trying to remove complexity from this code, so >I'm not in favour of adding new functionality that is not >even used.
You are right, unused functionality should not be added. I was thinking about future usage, but that seems really unnecessary now. >I'm not seeing the point of adding the support for the O_NONBLOCK >in the listener socket either - that can easily be turned on after >you have the listener socket created. I don't quite understand how I can turn it on in socket_listen, because socket_listen will create a listener socket inside it, then bind and listen. Are there other ways than passing some kind of 'config parameter'? >The O_NONBLOCK functionality makes more sense in this context >but the implementation is really broken. Well.. sorry for the broken implementation, I guess I need more practice. >These functions do hostname lookups, so can never do non-blocking >mode correctly as the hostname lookup itself does blocking I/O >to the nameserver(s). Ignoring that, the way you handle the >connect() is wrong too. You're iterating over many addresses >returned by getaddrinfo() and doing a non-blocking connect >on each of them. These will essentially all fail with EAGAIN >and you'll skip onto the next address which is wrong. Why would the hostname lookup affect the listener socket afterwards? The socket is created after the lookup procedure is done. Therefore, the config should only affect the listener socket, not the hostname lookup process. Would you explain in more detail? I'm not an expert in socket programming, so I'm confused. Also, connect() indeed could return EAGAIN, however, the continue expression is inside the do-while loop of inet_connect_addr(), rather than the for loop inside inet_connect_saddr(), which is the caller of inet_connect_addr(). So it would just try to connect again instead of skipping to next address. I know this is not a well-written patch, but if you forget about the broken implementation for a while, it actually won't fail them all. I did a little test by connecting two VMs through socket, and they can ping each other. The docker test fails because I missed the NULL pointer check for sconf, which I ignored in the ping test above, but it is irrelevant to the problem here. After I fix it, the test passes. >This code used to have support for O_NONBLOCK but it was removed >because the DNS problem means that any code relying on it was >already broken. The rest of the QEMU codebase has been converted >to use QIOChannelSocket instead which can handle non-blocking >DNS properly I didn't know this, thanks for clarifying it. Do you mean if I want to use non-blocking socket, it's better to use QIOChannelSocket instead of socket_listen/socket_connect? If I want to set some other options, do I have to use the bind/listen/connect series functions directly? That seems the way in some functions in net/socket.c. Anyway, thanks for your comment, it helps clarify things. I had intended to get start with this task, but it seems to have already been fully explored before. I guess I need to review my patches throughly, Regards Zihan