Re: [lwip-users] lwIP hangs after some data transferred
Grzegorz Niemirowski grzeg...@grzegorz.net wrote:
> Descriptors are freed inside low_level_input after PBUF has been allocated and incoming packet has been copied. But it happens that pbuf_alloc returns NULL. What could be the reason?

I've finally found the solution. I had SYS_LIGHTWEIGHT_PROT defined as 0. After defining the macro as 1, the stack became stable.

Grzegorz Niemirowski
___
lwip-users mailing list
lwip-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/lwip-users
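For readers wondering why this setting matters: as far as I understand lwIP 1.4.x, with SYS_LIGHTWEIGHT_PROT set to 0 the SYS_ARCH_PROTECT()/SYS_ARCH_UNPROTECT() macros compile to nothing, so the pool operations behind pbuf_alloc and pbuf_free are not guarded when they run from different contexts (for example the Ethernet input task and the tcpip thread). A corrupted free list can then lose elements until allocation fails, which would match a stall after a few thousand packets. The following is only a host-side sketch of a guarded pool, not lwIP's actual code; pool_protect(), pool_unprotect(), POOL_SIZE and the other pool_* names are hypothetical, and the protect hooks merely count nesting instead of masking interrupts.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4   /* hypothetical pool depth, for illustration only */

/* Stand-ins for SYS_ARCH_PROTECT()/SYS_ARCH_UNPROTECT(). On a real target
 * these would disable and restore interrupts (or take a mutex); here they
 * only track nesting so that the sketch can run on a PC. */
static int protect_depth = 0;

void pool_protect(void)
{
    protect_depth++;
}

void pool_unprotect(void)
{
    assert(protect_depth > 0);
    protect_depth--;
}

/* A free element only needs a link to the next free element. */
struct pool_elem {
    struct pool_elem *next;
};

static struct pool_elem pool_storage[POOL_SIZE];
static struct pool_elem *pool_free_list = NULL;

/* Link every element into the free list. */
void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Pop one element; returns NULL when the pool is exhausted. */
struct pool_elem *pool_alloc(void)
{
    pool_protect();                 /* the list must not change under us */
    struct pool_elem *e = pool_free_list;
    if (e != NULL) {
        pool_free_list = e->next;
    }
    pool_unprotect();
    return e;
}

/* Push an element back onto the free list. */
void pool_free(struct pool_elem *e)
{
    pool_protect();
    e->next = pool_free_list;
    pool_free_list = e;
    pool_unprotect();
}
```

On a FreeRTOS target the two hooks would typically map to entering and leaving a critical section; the option itself is enabled with `#define SYS_LIGHTWEIGHT_PROT 1` in lwipopts.h.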
Re: [lwip-users] lwIP hangs after some data transferred
Grzegorz Niemirowski grzeg...@grzegorz.net wrote:
> I'm still fighting with the problem. The stack hangs on fetching packets from mbox.

Further analysis gave the following observations:
- the thread blocks on fetch because packets are no longer inserted into the mbox
- packets are not inserted because the ETH interrupt is no longer raised
- there is no interrupt because reception is suspended; the bits set in the DMASR register show:
    RPS bits: 100: Suspended: Receive descriptor unavailable
    RBUS bit: Receive buffer unavailable status

Descriptors are freed inside low_level_input after a pbuf has been allocated and the incoming packet has been copied. But sometimes pbuf_alloc returns NULL. What could be the reason? The traffic is not heavy, just a few kB/s. Maybe there is not enough space for pbufs, but the problem doesn't occur immediately; usually a few thousand packets are transmitted without any issue before the stack stalls.

Best regards,
Grzegorz Niemirowski
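The register state reported in this message can be recognised programmatically, which is handy for a watchdog or a debug print. This is a minimal sketch that only decodes a DMASR value passed in as a number; the bit positions (RBUS at bit 7, RPS at bits 19:17) are how I read the STM32F4 reference manual and should be verified for the exact part, and the function and macro names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ETH_DMASR layout; check against the reference manual. */
#define DMASR_RBUS           (1UL << 7)   /* Receive buffer unavailable status */
#define DMASR_RPS_SHIFT      17
#define DMASR_RPS_MASK       (0x7UL << DMASR_RPS_SHIFT)
#define DMASR_RPS_SUSPENDED  0x4UL        /* 100: receive descriptor unavailable */

/* Extract the 3-bit receive process state field. */
uint32_t dmasr_rps(uint32_t dmasr)
{
    return (uint32_t)((dmasr & DMASR_RPS_MASK) >> DMASR_RPS_SHIFT);
}

/* True when reception is suspended because no descriptor is available,
 * i.e. the combination of RPS = 100 and RBUS = 1 described above. */
bool rx_stalled(uint32_t dmasr)
{
    return (dmasr & DMASR_RBUS) != 0 &&
           dmasr_rps(dmasr) == DMASR_RPS_SUSPENDED;
}
```

On the target one would pass the live register value, for example `rx_stalled(ETH->DMASR)` when using the CMSIS device header.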
Re: [lwip-users] lwIP hangs after some data transferred
Sergio R. Caprile scapr...@gmail.com wrote:
> Sorry for hijacking your thread, should I start a new one or do you prefer to elaborate on the FAQ here? You are the OP

Feel free to start the FAQ here.

I'm still fighting with the problem. The stack hangs on fetching packets from the mbox. I have tried the following code (added printfs):

    printf("fetch %u\n", num++);
    sys_timeouts_mbox_fetch(&mbox, (void **)&msg);
    printf("OK\n");

And I get "fetch 2272" without "OK".

Best regards,
Grzegorz Niemirowski
Re: [lwip-users] lwIP hangs after some data transferred
Hi Grzegorz,

let's make this a kickstarter for the FAQ ;^)

"Single thread" applies mostly to raw API users and/or vendor code; they use an RTOS and the raw API and forget to call the lwIP stack from a single thread. I've seen some vendor code which also calls the stack from within interrupt code. Not your case, since you are using the sockets API.

The ST rx handler bug is the winner this August: the interrupt routine takes the first frame off the controller and forgets to ask if there are any others waiting, so fast consecutive frames cause lost frames. IIRC, there is also a task running every second that somehow gets frames out of the chip (?), so some people notice this issue by observing a 500 ms average ping delay, constantly changing (but I might be confusing it with another bug I read about on the list). I'm glad you found it in the list and fixed it.

Fortunately I don't use vendor code, so I can't tell you if there are any other serious bugs; I prefer to deal with my own bugs (which keeps me very busy indeed ;^)).

Sorry for hijacking your thread, should I start a new one or do you prefer to elaborate on the FAQ here? You are the OP.

Best regards
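The Rx handler bug described in this message comes down to servicing one frame per wake-up instead of draining everything the controller has completed. The sketch below models that difference on a PC with a toy counter in place of a real descriptor ring; struct rx_ring and the rx_* functions are hypothetical names, and a real driver would hand each frame to the stack where the comment indicates.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a receive ring: number of frames the "controller" has
 * completed but the driver has not taken yet. */
struct rx_ring {
    size_t pending;
};

/* Take one frame if available (stands in for low_level_input()). */
bool rx_take_frame(struct rx_ring *ring)
{
    if (ring->pending == 0) {
        return false;
    }
    ring->pending--;
    return true;
}

/* Buggy pattern: at most one frame per interrupt/wake-up. Frames that
 * arrived back to back stay in the ring until the next interrupt. */
size_t rx_service_once(struct rx_ring *ring)
{
    return rx_take_frame(ring) ? 1 : 0;
}

/* Fixed pattern: keep taking frames until the ring is empty. */
size_t rx_service_drain(struct rx_ring *ring)
{
    size_t handled = 0;
    while (rx_take_frame(ring)) {
        handled++;   /* a real driver would pass the frame to the stack here */
    }
    return handled;
}
```

With three frames pending, rx_service_once() leaves two behind, while rx_service_drain() empties the ring in a single call.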
Re: [lwip-users] lwIP hangs after some data transferred
It may be some problem with memory management, but there is no place where I should free memory. There are just automatic variables (allocated on the stack) and global variables (statically allocated). There are also four FreeRTOS queues, but they are created only once, when one of the tasks starts. I don't use malloc or similar functions. Here is the source code of the task:

    #include "lwip/sockets.h"
    #include "stm32f4xx_hal.h"
    #include "cmsis_os.h"
    #include <stdint.h>
    #include <string.h>
    #include "flash_settings.h"

    void serve(int conn);

    uint16_t RX_BUFFER[50];
    uint16_t TX_BUFFER[50];
    extern struct settings_struct settings;
    uint8_t changePort = 0;
    extern xQueueHandle appQueueRx;
    extern xQueueHandle appQueueTx;

    void appTask(void const * argument)
    {
        appQueueRx = xQueueCreate(1, 100);
        int newconn, size;
        struct sockaddr_in address, remotehost;
        int appSocket;

        /* create a TCP socket */
        if ((appSocket = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
            return;
        }
        fcntl(appSocket, F_SETFL, lwip_fcntl(appSocket, F_GETFL, 0) | O_NONBLOCK);

        /* bind to port 2324 at any interface */
        memset(&address, 0, sizeof(address));
        address.sin_family = AF_INET;
        address.sin_port = htons(settings.appPort);
        address.sin_addr.s_addr = INADDR_ANY;
        if (bind(appSocket, (struct sockaddr *)&address, sizeof(address)) < 0) {
            return;
        }

        /* listen for incoming connections (TCP listen backlog = 5) */
        listen(appSocket, 5);
        size = sizeof(remotehost);

        while (1) {
            newconn = accept(appSocket, (struct sockaddr *)&remotehost, (socklen_t *)&size);
            if (changePort) {
                /* reopen the listening socket on the newly configured port */
                changePort = 0;
                close(appSocket);
                appSocket = socket(AF_INET, SOCK_STREAM, 0);
                fcntl(appSocket, F_SETFL, lwip_fcntl(appSocket, F_GETFL, 0) | O_NONBLOCK);
                address.sin_port = htons(settings.appPort);
                bind(appSocket, (struct sockaddr *)&address, sizeof(address));
                listen(appSocket, 5);
                continue;
            }
            if (newconn >= 0)
                serve(newconn);
            else
                vTaskDelay(100);
        }
    }

    void serve(int conn)
    {
        int ret;

        ret = read(conn, TX_BUFFER, sizeof(TX_BUFFER));
        if (ret > 0) {
            xQueueSend(appQueueTx, TX_BUFFER, 100);
            xQueueReceive(appQueueRx, RX_BUFFER, 100);
            write(conn, (const unsigned char *)RX_BUFFER, (size_t)sizeof(RX_BUFFER));
        }
        close(conn);
    }

Noam weissman n...@silrd.com wrote:
> Please check if you free memory properly. It sounds like a memory leak?
[lwip-users] lwIP hangs after some data transferred
Hello,

I am writing code for an STM32 board using FreeRTOS and lwIP 1.4.1. I use sockets. There is a simple TCP server: it accepts an incoming connection, reads 100 bytes sent from the PC, sends a 100-byte reply, closes the connection and waits for another connection. I observed that after some time lwIP stops ACKing packets fast enough and eventually it hangs. It no longer responds to SYN or ICMP ping.

Here are two examples of such communication. Both files contain the whole communication between the PC and my device after a device reset. For testing purposes there were four applications running at the same time so that the problem could be observed within a shorter time.

http://www.grzegorz.net/test1a.pcapng
http://www.grzegorz.net/test2a.pcapng

Is it a problem with lwIP, the low-level interface driver or FreeRTOS?

Best regards,
Grzegorz Niemirowski
Re: [lwip-users] lwIP hangs after some data transferred
Hi,

You must check your interrupt priorities and your FreeRTOSConfig definition file. Here are my interrupt settings from the FreeRTOSConfig.h file:

    /* Cortex-M specific definitions. */
    #ifdef __NVIC_PRIO_BITS
        /* __NVIC_PRIO_BITS will be specified when CMSIS is being used. */
        #define configPRIO_BITS    __NVIC_PRIO_BITS
    #else
        #define configPRIO_BITS    4    /* 16 priority levels (0-15) */
    #endif

    /* The lowest interrupt priority that can be used in a call to a
       "set priority" function. */
    #define configLIBRARY_LOWEST_INTERRUPT_PRIORITY    0xF

    /* The highest interrupt priority that can be used by any interrupt
       service routine that makes calls to interrupt safe FreeRTOS API
       functions. DO NOT CALL INTERRUPT SAFE FREERTOS API FUNCTIONS FROM ANY
       INTERRUPT THAT HAS A HIGHER PRIORITY THAN THIS! (higher priorities are
       lower numeric values.) */
    #define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY    5

    /* Interrupt priorities used by the kernel port layer itself. These are
       generic to all Cortex-M ports, and do not rely on any particular
       library functions. */
    #define configKERNEL_INTERRUPT_PRIORITY    ( configLIBRARY_LOWEST_INTERRUPT_PRIORITY << (8 - configPRIO_BITS) )

    /* configMAX_SYSCALL_INTERRUPT_PRIORITY must not be set to zero.
       See http://www.FreeRTOS.org/RTOS-Cortex-M3-M4.html. */
    #define configMAX_SYSCALL_INTERRUPT_PRIORITY    ( configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY << (8 - configPRIO_BITS) )

    #define ETH_ISR_PRIO       10
    #define SERIAL_ISR_PRIO    6

As you can see in the above file, configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY is 5 and ETH_ISR_PRIO is 10. Set your ETH interrupt as follows:

    NVIC_PriorityGroupConfig(NVIC_PriorityGroup_4);

    /* Configure and enable the Ethernet global interrupt. */
    NVIC_InitStructure.NVIC_IRQChannel = ETH_IRQn;
    NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = ETH_ISR_PRIO;
    NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
    NVIC_Init(&NVIC_InitStructure);

FreeRTOS functions use critical section protection that actually blocks interrupts. Any ISR code that uses FreeRTOS functions MUST have an interrupt priority that is logically at or below configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY (that is, a numerically equal or higher value). If you do not follow this, the critical section protection will not work and you will get an unpredictable or unstable system.

I also suggest reading the FreeRTOS section on interrupts:
http://www.freertos.org/a00110.html
http://www.freertos.org/RTOS-Cortex-M3-M4.html

Hope that helped,
Noam.
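To make the arithmetic behind these settings concrete: with 4 priority bits the library value is shifted into the top of the 8-bit NVIC priority field, so 5 becomes 0x50 and 0xF becomes 0xF0, and an ISR may call interrupt-safe FreeRTOS API functions only if its priority value is numerically equal to or greater than the max-syscall value. The helper names below are hypothetical; this is just a host-side sketch of the rule, not FreeRTOS code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shift a library priority (0..15 when prio_bits is 4) into the upper
 * bits of the 8-bit NVIC priority field, as the config macros do. */
uint8_t nvic_raw_priority(uint8_t library_prio, uint8_t prio_bits)
{
    return (uint8_t)(library_prio << (8u - prio_bits));
}

/* On Cortex-M a numerically higher value means a logically lower priority.
 * An ISR may use interrupt-safe FreeRTOS API calls only if it is at or
 * below the max-syscall priority, i.e. numerically greater than or equal. */
bool isr_may_use_freertos_api(uint8_t isr_library_prio,
                              uint8_t max_syscall_library_prio)
{
    return isr_library_prio >= max_syscall_library_prio;
}
```

With the values quoted in this message, ETH_ISR_PRIO = 10 and SERIAL_ISR_PRIO = 6 both pass the check against 5, whereas a priority of 3 would not.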
Re: [lwip-users] lwIP hangs after some data transferred
Thanks Noam. I have interrupt priorities set exactly as you have written. The problem must be somewhere else.

Best regards,
Grzegorz Niemirowski
Re: [lwip-users] lwIP hangs after some data transferred
Hi guys,

I've been on the list for a couple of months and I'm already tired of reading about this; I can imagine how you guys feel... ;^)

Anyway, joking aside: since people don't search the list, they probably won't read the wiki either. Is there anything we can do to induce RTOS users to check their priorities, the single-thread rule, and the Rx handler missing every other packet? Some sort of FAQ? Then we only need a volunteer to paste the FAQ link as a response... ;^)

Regards
Re: [lwip-users] lwIP hangs after some data transferred
Krzysztof Wesołowski wrote:
> Second possible issue is that some old (?) STM32 demo code only processed one packet per interrupt, causing extra packets to be stalled in memory.

That's a really good hint. I keep forgetting about that, although I'm using their controllers - but not with the code they provided :-)

Simon
Re: [lwip-users] lwIP hangs after some data transferred
I can imagine how you feel, but I really avoid posting to any forum or mailing list before investigating a problem myself. I have read about priorities and they are set correctly. I don't know what you mean by single thread, but every socket I use is accessed only in the thread where it was created. I also fixed the Rx handler so that it reads all the packets, using this patch:
http://lists.gnu.org/archive/html/lwip-users/2014-03/msg00033.html

Maybe there are more serious bugs in the STM32Cube-generated code which I use?

Best regards,
Grzegorz Niemirowski

Sergio R. Caprile scapr...@gmail.com wrote:
> is there anything we can do to induce RTOS users to check for their priorities and single thread and the Rx handler missing every other packet ? Sort of a FAQ ?
Re: [lwip-users] lwIP hangs after some data transferred
Hi,

Please check if you free memory properly. It sounds like a memory leak?

BR,
Noam.