[ https://issues.apache.org/jira/browse/PROTON-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759653#comment-17759653 ]
ASF GitHub Bot commented on PROTON-2748: ---------------------------------------- astitcher commented on code in PR #402: URL: https://github.com/apache/qpid-proton/pull/402#discussion_r1307634731 ########## c/src/proactor/epoll_raw_connection.c: ########## @@ -304,7 +305,7 @@ void pn_raw_connection_write_close(pn_raw_connection_t *rc) { pni_raw_write_close(rc); } -static pn_event_t *pni_raw_batch_next(pn_event_batch_t *batch) { +static pn_event_t *pni_epoll_raw_batch_next(pn_event_batch_t *batch, bool peek_only) { Review Comment: rename this: pni_raw_batch_next_common() or pni_raw_batch_next_or_peek(): Ading epoll to the name doesn't communicate anything useful about the purpose of this function. ########## c/src/proactor/epoll_raw_connection.c: ########## @@ -318,12 +319,18 @@ static pn_event_t *pni_raw_batch_next(pn_event_batch_t *batch) { unlock(&rc->task.mutex); if (waking) pni_raw_wake(raw); - pn_event_t *e = pni_raw_event_next(raw); - if (!e || pn_event_type(e) == PN_RAW_CONNECTION_DISCONNECTED) - rc->batch_empty = true; Review Comment: Where did this logic go? It seems no longer to be anywhere. Is it no longer needed? Specifically the check against the disconnected event. ########## c/src/proactor/raw_connection.c: ########## @@ -721,6 +723,14 @@ pn_event_t *pni_raw_event_next(pn_raw_connection_t *conn) { } while (true); } +pn_event_t *pni_raw_event_next(pn_raw_connection_t *conn) { + return pni_get_next_raw_event(conn, false); +} + +pn_event_t *pni_raw_event_peek(pn_raw_connection_t *conn) { + return pni_get_next_raw_event(conn, true); +} + Review Comment: I think I would refactor pni_raw_event_next differently: * Refactor the event generating if chain into a separate function that is called by ...next() and ...peek() (or if you follow my earlier suggestion ...has_events() * Then wrap that with pn_collector_next/peek logic. Just in case it's not clear the do ... while (true) is a bit of a red herring and somewhat bad coding on my part in that there is never more than one effective pass through the loop. There can be 2 but that if only if there was no initial event. So it make more sense really to separate out the event generating logic. Hope that is clear enough. ########## c/src/proactor/epoll_raw_connection.c: ########## @@ -408,32 +415,41 @@ pn_event_batch_t *pni_raw_connection_process(task_t *t, uint32_t io_events, bool } unlock(&rc->task.mutex); - if (events & EPOLLIN) pni_raw_read(&rc->raw_connection, fd, rcv, set_error); - if (events & EPOLLOUT) pni_raw_write(&rc->raw_connection, fd, snd, set_error); - rc->batch_empty = false; + if (rc->connected) { Review Comment: I don't think you need this condition (which is why it wasn't present previously) - if you get here without returning already you must be connected. ########## c/src/proactor/epoll_raw_connection.c: ########## @@ -408,32 +415,41 @@ pn_event_batch_t *pni_raw_connection_process(task_t *t, uint32_t io_events, bool } unlock(&rc->task.mutex); - if (events & EPOLLIN) pni_raw_read(&rc->raw_connection, fd, rcv, set_error); - if (events & EPOLLOUT) pni_raw_write(&rc->raw_connection, fd, snd, set_error); - rc->batch_empty = false; + if (rc->connected) { + if (events & EPOLLERR) { + // Read and write sides closed via RST. Tear down immediately. + int soerr; + socklen_t soerrlen = sizeof(soerr); + int ec = getsockopt(fd, SOL_SOCKET, SO_ERROR, &soerr, &soerrlen); + if (ec == 0 && soerr) { + psocket_error(rc, soerr, "async disconnect"); + } + pni_raw_async_disconnect(&rc->raw_connection); + } else if (events & EPOLLHUP) { Review Comment: The style of this function is early return. So please change this to have a return with no 'else if' if that is possible - I don't understand enough of this logic to see if you could get EPOLLERR as well as EPOLLIN, EPOLLOUT or EPOLLRDHUP at the same time with a useful action of reading or writing, but it seems unlikely to me. And if this change is possible it lowers the cognitive load (and indentation) of the function. ########## c/src/proactor/epoll_raw_connection.c: ########## @@ -318,12 +319,18 @@ static pn_event_t *pni_raw_batch_next(pn_event_batch_t *batch) { unlock(&rc->task.mutex); if (waking) pni_raw_wake(raw); - pn_event_t *e = pni_raw_event_next(raw); - if (!e || pn_event_type(e) == PN_RAW_CONNECTION_DISCONNECTED) - rc->batch_empty = true; + pn_event_t *e = peek_only ? pni_raw_event_peek(raw) : pni_raw_event_next(raw); return e; } +static pn_event_t *pni_raw_batch_next(pn_event_batch_t *batch) { + return pni_epoll_raw_batch_next(batch, false); +} + +static pn_event_t *pni_raw_batch_peek(pn_event_batch_t *batch) { + return pni_epoll_raw_batch_next(batch, true); +} Review Comment: Are you sure you want to introduce a peek operation? This is only ever used to find out whether there is an outstanding event so why not just introduce an operation `bool pni_raw_batch_has_events()`? ########## c/src/proactor/raw_connection-internal.h: ########## @@ -134,9 +134,11 @@ void pni_raw_write_close(pn_raw_connection_t *conn); void pni_raw_read(pn_raw_connection_t *conn, int sock, long (*recv)(int, void*, size_t), void (*set_error)(pn_raw_connection_t *, const char *, int)); void pni_raw_write(pn_raw_connection_t *conn, int sock, long (*send)(int, const void*, size_t), void (*set_error)(pn_raw_connection_t *, const char *, int)); void pni_raw_process_shutdown(pn_raw_connection_t *conn, int sock, int (*shutdown_rd)(int), int (*shutdown_wr)(int)); +void pni_raw_async_disconnect(pn_raw_connection_t *conn); Review Comment: As this is a new event going to the raw connection state machine maybe there should be a new test for this event in raw_connection_test. ########## c/src/proactor/epoll_raw_connection.c: ########## @@ -442,24 +458,31 @@ void pni_raw_connection_done(praw_connection_t *rc) { // The task may be in the ready state even if we've got no raw connection // wakes outstanding because we dealt with it already in pni_raw_batch_next() notify = (pni_task_wake_pending(&rc->task) || have_event) && schedule(&rc->task); - ready = rc->task.ready; + ready = rc->task.ready; // No need to poll. Already scheduled. unlock(&rc->task.mutex); - pn_raw_connection_t *raw = &rc->raw_connection; - int fd = rc->psocket.epoll_io.fd; - pni_raw_process_shutdown(raw, fd, shutr, shutw); - int wanted = - (pni_raw_can_read(raw) ? EPOLLIN : 0) | - (pni_raw_can_write(raw) ? EPOLLOUT : 0); - if (wanted) { - rc->psocket.epoll_io.wanted = wanted; - rearm_polling(&rc->psocket.epoll_io, p->epollfd); // TODO: check for error + bool finished_disconnect = raw->state==conn_fini && !ready && !raw->disconnectpending; + if (finished_disconnect) { + // If we're closed and we've sent the disconnect then close + pni_raw_finalize(raw); + praw_connection_cleanup(rc); + } else if (ready) { + // Already scheduled to run. Skip poll. Remember if we want a read. + rc->read_check = pni_raw_can_read(raw); + } else if (!rc->connected) { + // Connect logic has already armed the socket. } else { - bool finished_disconnect = raw->state==conn_fini && !ready && !raw->disconnectpending; - if (finished_disconnect) { - // If we're closed and we've sent the disconnect then close - pni_raw_finalize(raw); - praw_connection_cleanup(rc); + // Must poll for iO. Review Comment: typo > Raw connections do not always complete close operations > ------------------------------------------------------- > > Key: PROTON-2748 > URL: https://issues.apache.org/jira/browse/PROTON-2748 > Project: Qpid Proton > Issue Type: Bug > Components: proton-c > Affects Versions: proton-c-0.39.0 > Environment: linux epoll > Reporter: Clifford Jansen > Assignee: Clifford Jansen > Priority: Major > Attachments: pn2748.patch > > > The Proton raw_connection_t currently requires cooperation from the > application layer to complete a close. There is a baked in assumption that > the application will always eventually provide a read buffer. A second > assumption is that the peer (not necessarily a Proton raw connection) will > detect a read close on its side, and do a graceful close of it's write side > "soon". > These incorrect assumptions can leave the raw connection in a hung state > waiting for non-existent wind up activity by the application or peer, > respectively. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org For additional commands, e-mail: dev-h...@qpid.apache.org