Re: [lwip-users] event_callback() context switch when calling sys_sem_signal()
Joel Cunningham wrote:
> 1) Should SYS_ARCH_PROTECT() do more than just disable interrupts? Something
> that would act as a critical section in the case where a context switch
> happens?

Up to now, it should block task switching too, I guess. Although this is not cleanly documented, it's just expected. However, I think by now it would be better to not make that assumption, so could you please file a bug report including your fix? (Or do we already have one? I can't remember)

> 2) Is it assumed that calling sys_sem_signal() will not cause a voluntary
> context switch?

Right now, that's what is assumed, yes.

Simon
___
lwip-users mailing list
lwip-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/lwip-users
Re: [lwip-users] TCP retransmission flooding at end of stream
Sergio R. Caprile wrote:
> Anyway, glad you managed to solve your issue Michael, next user with an
> STM bug will be charged ;^) I wonder if the SICS can take donations...

I would take donations as well :-) I'm not getting paid for this, and my slooow 2007er MacBook is one of the reasons I dislike development lately ;-) However, sadly STM doesn't produce notebooks, or do they?

Simon
[lwip-users] event_callback() context switch when calling sys_sem_signal()
I'm running LwIP 1.4.1 and have some questions about event_callback() in sockets.c.

In my project, I am experiencing a crash related to synchronization between event_callback() and an application thread calling select(). My project is a uniprocessor system running an RTOS that implements a static priority scheduler. SYS_ARCH_PROTECT() is implemented by disabling interrupts. sys_sem_signal() is implemented using a counting semaphore. The TCPIP thread has a higher priority than the application threads.

The crash happens when the application thread is waiting in select() and the TCPIP thread is calling event_callback() to process an event. What happens is that, in the loop below, calling sys_sem_signal() results in a context switch on my project's RTOS even though the application thread has a lower priority. The RTOS's semaphore construct doesn't support priority inheritance/elevation. The application thread wakes up and finishes the select call, modifying the select_cb_list. When the context switches back to the TCPIP thread, it finishes the loop iteration and crashes because the select_cb_list has been modified.

What I've done to mitigate the context switch is move the line

  last_select_cb_ctr = select_cb_ctr;

to the top of the for loop. To me, the loop already had handling for a context switch per iteration, but it only saved the counter at the end. So now it can handle a switch in the call to sys_sem_signal() as well.

Whether this is a bug depends on the answers to two questions:

1) Should SYS_ARCH_PROTECT() do more than just disable interrupts? Something that would act as a critical section in the case where a context switch happens?

2) Is it assumed that calling sys_sem_signal() will not cause a voluntary context switch?

  SYS_ARCH_PROTECT(lev);
  ...
again:
  for (scb = select_cb_list; scb != NULL; scb = scb->next) {
    if (scb->sem_signalled == 0) {
      /* semaphore not signalled yet */
      int do_signal = 0;
      /* Test this select call for our socket */
      if (sock->rcvevent > 0) {
        if (scb->readset && FD_ISSET(s, scb->readset)) {
          do_signal = 1;
        }
      }
      if (sock->sendevent != 0) {
        if (!do_signal && scb->writeset && FD_ISSET(s, scb->writeset)) {
          do_signal = 1;
        }
      }
      if (sock->errevent != 0) {
        if (!do_signal && scb->exceptset && FD_ISSET(s, scb->exceptset)) {
          do_signal = 1;
        }
      }
      if (do_signal) {
        scb->sem_signalled = 1;
        /* Don't call SYS_ARCH_UNPROTECT() before signaling the semaphore, as this might
           lead to the select thread taking itself off the list, invalidating the semaphore. */
        sys_sem_signal(&scb->sem);
      }
    }
    last_select_cb_ctr = select_cb_ctr;
    /* unlock interrupts with each step */
    SYS_ARCH_UNPROTECT(lev);
    /* this makes sure interrupt protection time is short */
    SYS_ARCH_PROTECT(lev);
    if (last_select_cb_ctr != select_cb_ctr) {
      /* someone has changed select_cb_list, restart at the beginning */
      goto again;
    }
  }

Thanks,
Joel
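The mitigation described in this message can be tried out in isolation. What follows is only a minimal sketch, not lwIP code: the struct, signal_with_context_switch(), event_callback_model() and run_demo() are hypothetical stand-ins, and the model assumes that every waiter is ready and that signalling runs the waiter to completion before control returns (the behaviour reported for this RTOS). The only point it illustrates is that taking the counter snapshot before the signal lets the loop notice the list change and restart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for lwIP's select callback entry. */
struct select_cb {
  struct select_cb *next;
  int sem_signalled;
  int woken;
};

struct select_cb *select_cb_list; /* model of the global waiter list */
int select_cb_ctr;                /* bumped whenever the list changes */

/* Model of sys_sem_signal() on an RTOS that switches to the waiter at once:
   the waiter leaves select(), unlinks itself and bumps the counter before
   control returns to the caller. */
static void signal_with_context_switch(struct select_cb *scb)
{
  struct select_cb **pp;
  for (pp = &select_cb_list; *pp != NULL; pp = &(*pp)->next) {
    if (*pp == scb) {
      *pp = scb->next;
      break;
    }
  }
  scb->next = NULL; /* the entry is gone; its next pointer must not be used */
  scb->woken = 1;
  select_cb_ctr++;
}

/* Loop shape with the counter saved at the top of each iteration.
   Returns how many times the walk had to restart. */
int event_callback_model(void)
{
  struct select_cb *scb;
  int last_select_cb_ctr;
  int restarts = 0;

again:
  for (scb = select_cb_list; scb != NULL; scb = scb->next) {
    last_select_cb_ctr = select_cb_ctr; /* snapshot BEFORE signalling */
    if (scb->sem_signalled == 0) {
      scb->sem_signalled = 1;
      signal_with_context_switch(scb);
    }
    /* The real code does SYS_ARCH_UNPROTECT()/SYS_ARCH_PROTECT() here. */
    if (last_select_cb_ctr != select_cb_ctr) {
      restarts++;
      goto again; /* list changed: never follow scb->next */
    }
  }
  return restarts;
}

/* Three waiters, all ready: each signal removes one entry from the list. */
int run_demo(void)
{
  static struct select_cb w[3];
  int i;
  int restarts;

  for (i = 0; i < 3; i++) {
    w[i].next = (i < 2) ? &w[i + 1] : NULL;
    w[i].sem_signalled = 0;
    w[i].woken = 0;
  }
  select_cb_list = &w[0];
  select_cb_ctr = 0;
  restarts = event_callback_model();
  assert(w[0].woken && w[1].woken && w[2].woken);
  return restarts;
}
```

In this model run_demo() restarts the walk once per removed waiter and ends with an empty list. If the snapshot were taken after the signal instead, the comparison would see equal counters and the loop would go on to read the next pointer of an entry that is no longer on the list.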
Re: [lwip-users] TCP retransmission flooding at end of stream
For any Dragon Ball Z fans out there, this STM bug looks like Majin Buu to me...

Anyway, glad you managed to solve your issue Michael, next user with an STM bug will be charged ;^) I wonder if the SICS can take donations...

As for tcp_poll() vs tcp_sent() in your scenario, it depends on what is more important for your main task. The fastest way to transmit is to fill the TCP buffer from tcp_sent(), just after the acked pbuf is freed. This way you can keep the buffer full and respond quickly to window changes (if any). However, you'll keep your hardware dedicated to this task, and if sending is not your highest priority, that might not be what you want. If you instead just fill the current buffer and let tcp_poll() wake you up to keep sending, you probably won't saturate your link, but you'll use less processing power.

The send loops in servers mostly work on tcp_sent(); this is OK when serving "regular" amounts of data. For CGI handlers that might serve long logs, I like to put in a pause once in a while in case the rest of the tasks need to breathe, but I don't use an RTOS, just bare metal.

In any case, I think it is clearer if you handle data sending in the tcp_sent() callback and keep the polling for closure or for resuming after a pause (if any).
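To make the "refill from the sent callback" pattern concrete, here is a minimal sketch that runs without lwIP. Everything in it is a hypothetical model: fake_pcb stands in for the connection, its snd_free field plays the role of what tcp_sndbuf() reports, fill_send_buffer() stands in for a tcp_write() loop, and on_sent() is what a callback registered with tcp_sent() would do. The buffer size and ack sizes are made-up numbers.

```c
#include <assert.h>
#include <stddef.h>

#define SND_BUF_SIZE 4096u /* assumed send buffer size for the model */

/* Hypothetical stand-in for a connection: only tracks free buffer space. */
struct fake_pcb {
  size_t snd_free;
};

/* Application state: how many bytes are still waiting to be queued. */
struct send_state {
  size_t remaining;
};

/* Queue as much pending data as currently fits; returns bytes queued. */
size_t fill_send_buffer(struct send_state *st, struct fake_pcb *pcb)
{
  size_t n = (st->remaining < pcb->snd_free) ? st->remaining : pcb->snd_free;
  pcb->snd_free -= n;
  st->remaining -= n;
  return n;
}

/* Model of the sent callback: acked bytes free up buffer space, and the
   application refills the buffer right away to keep it full. */
size_t on_sent(struct send_state *st, struct fake_pcb *pcb, size_t acked)
{
  assert(acked <= SND_BUF_SIZE - pcb->snd_free); /* cannot ack more than queued */
  pcb->snd_free += acked;
  return fill_send_buffer(st, pcb);
}

/* Drive the model: the peer acks up to ack_size bytes at a time.
   Returns the number of sent callbacks needed to drain everything. */
unsigned run_send_demo(size_t total, size_t ack_size)
{
  struct fake_pcb pcb = { SND_BUF_SIZE };
  struct send_state st;
  unsigned callbacks = 0;

  if (ack_size == 0) {
    return 0; /* nothing would ever be acked in this model */
  }
  st.remaining = total;
  fill_send_buffer(&st, &pcb); /* prime the buffer once, outside the callback */
  while (pcb.snd_free < SND_BUF_SIZE) { /* some data is still unacked */
    size_t in_flight = SND_BUF_SIZE - pcb.snd_free;
    size_t acked = (in_flight < ack_size) ? in_flight : ack_size;
    on_sent(&st, &pcb, acked);
    callbacks++;
  }
  return callbacks;
}
```

With these made-up numbers, run_send_demo(10000, 1000) takes 10 callbacks. The buffer stays full until the pending data runs out, which is the behaviour described above for sending from tcp_sent(). A poll-driven variant would call fill_send_buffer() from a periodic callback instead, trading throughput for less processing.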