Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-26 Thread via GitHub


xiaoxiang781216 merged PR #16431:
URL: https://github.com/apache/nuttx/pull/16431


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-26 Thread via GitHub


pussuw commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2909054752

   > @pussuw please review. For the previous two patches, we made enhancements 
or implementations based on your PR.
   
   Sorry, it will go to tomorrow


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-26 Thread via GitHub


pussuw commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2909054131

   > @pussuw here is our plan:
   > 
   > 1. Merge the file handle related fix found in our internal stress test 
before your file handle restructure patch
   > 2. Resolve the conflict in your patch base on item 1 by @Donny9
   > 3. Run all internal function and stress tests to ensure all work as before 
by @Donny9
   > 
   > Do you think this plan reasonable?
   
   Yes, thank you. I'm OoO until tomorrow so don't have time to read github 
much until then.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-25 Thread via GitHub


Donny9 commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2908284093

   @pussuw please review. For the previous two patches, we made enhancements or 
implementations based on your PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-24 Thread via GitHub


xiaoxiang781216 commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2906852037

   > I completely agree with @pussuw here! There is nice cleanup available for 
the FDs in PR #16361, which fixes the operation to be acc. to the posix.
   > 
   
   Yes, all agree #16361 is the right direction to go.
   
   > Please don't inject debug functionality which should NOT be inside kernel 
in there. Integrate first the #16361, and then implement the debug 
functionality in where it belongs. That is, in the kernel-userspace interface, 
not directly in the kernel.
   
   The patchset doesn't change or add any new debugging functionality, but fix 
the nasty bug found by fdsan/fdcheck. We can improve fdsan/fdcheck after 
merging this and #16361.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-24 Thread via GitHub


xiaoxiang781216 commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2906850406

   @pussuw here is our plan:
   
   1. Merge the file handle related fix found in our internal stress test 
before your file handle restructure patch
   2. Resolve the conflict in your patch base on item 1 by @Donny9 
   3. Run all internal function and stress tests to ensure all work as before 
by @Donny9 
   
   Do you think this plan reasonable?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


Donny9 commented on code in PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#discussion_r2104480060


##
fs/inode/fs_files.c:
##
@@ -476,14 +479,32 @@ void files_putlist(FAR struct filelist *list)
* because there should not be any references in this context.
*/
 
+#ifdef CONFIG_FS_REFCOUNT
+again:
+  loop = false;
+#endif
   for (i = list->fl_rows - 1; i >= 0; i--)
 {
   for (j = CONFIG_NFILE_DESCRIPTORS_PER_BLOCK - 1; j >= 0; j--)
 {
+#ifdef CONFIG_FS_REFCOUNT
+  if (fs_putfilep(&list->fl_files[i][j]) > 0)

Review Comment:
   > I mean another CPU can be performing a write/read/whatever operation while 
another CPU exit()'s the process. This is userspace racing against itself. The 
kernel must be able to endure this.
   
   @pussuw 
   The scenario you're describing should be: 
   **A task contains multiple pthreads, where CPU A is scheduling one of the 
pthreads and is currently performing read/write operations, while another CPU B 
is executing the exit process for that task.** 
   
   In this situation, we ensure in group_release that all child pthreads exit 
first via group_remove_children, and then the main task's exit is completed, 
releasing all file descriptors (fds). There are no issues with this; it's been 
done this way before, and it has no relation to the modifications in the 
current patch.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


pussuw commented on code in PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#discussion_r2104470027


##
fs/inode/fs_files.c:
##
@@ -482,11 +482,11 @@ void files_putlist(FAR struct filelist *list)
 {
   file_close(&list->fl_files[i][j]);
 }
+}
 
-  if (i != 0)
-{
-  fs_heap_free(list->fl_files[i]);
-}
+  for (i = list->fl_rows - 1; i > 0; i--)

Review Comment:
   Goes to next week for me



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


pussuw commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2904240438

   > > > > > > Why is fdcheck implemented in the kernel, the kernel should not 
have to care about such things. Shouldn't it be implemented at the user 
interface i.e. libc ?
   > > > > > 
   > > > > > 
   > > > > > @pussuw The implementation of fdcheck originally resides in libc 
https://github.com/apache/nuttx/blob/889a26db1d229ee462bc8ea075c2330dc5f79b94/libs/libc/misc/lib_fdcheck.c
   > > > > > But we need to perform the conversion from file descriptor (fd) to 
file structure pointer (filep) within the kernel, as this conversion cannot be 
done in user space. And, the parameter for system calls is an fd, nothing else, 
and the filep is also a kernel data structure.
   > > > > 
   > > > > 
   > > > > Yes the problem here is that the data is in filep. After #16361 you 
could do this magic in userspace.
   > > > 
   > > > 
   > > > @pussuw #16361 would definitely be the best if we could cover this 
type of scenario. However, the changes in this current Pull Request (PR) are 
quite extensive, so it's unlikely to be merged into the mainline quickly. 
There's a possibility of stability issues, so we should ensure its reliability 
through some testing. Therefore, I will pick it up internally and conduct LTP 
(Linux Test Project) functional testing, 24-hour Monkey stress testing, as well 
as functional testing on multiple platforms, before approving it.
   > > > Therefore, our priority is to ensure that the current design is sound. 
You can continue to rework based on #16361 .
   > > 
   > > 
   > > Thanks, I do really prefer fixing the file descriptor subsystem like 
@xiaoxiang781216 suggested in #16326 . It is better to do a comprehensive fix 
that as far as I can see, will solve most of your problems as well.
   > 
   > @pussuw @jlaitine Actually, this current PR doesn't just address the issue 
with fdcheck. Firstly, optimizing the file descriptors in epoll to be managed 
as files is quite reasonable, because fds shouldn't exist in the kernel in 
their raw form; they should be converted into `file structures` or the 
subsequent `struct fd` representation. 
[4629fc5](https://github.com/apache/nuttx/commit/4629fc52164b8607b8c5343f1042425531d0df16)
 Secondly, we've removed poll_fdsetup to ensure that everyone uses interfaces 
with reference counting to access fds, which is much 
safer."[0cb3037](https://github.com/apache/nuttx/commit/0cb3037484a143c3c72be4fe421a4fb8983e767a)
 The other two modifications are minor and only target the usage of file 
references.
   
   Yes, I agree with patches 1 and 4, this is why I did not comment on them


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


Donny9 commented on code in PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#discussion_r2104462877


##
fs/inode/fs_files.c:
##
@@ -482,11 +482,11 @@ void files_putlist(FAR struct filelist *list)
 {
   file_close(&list->fl_files[i][j]);
 }
+}
 
-  if (i != 0)
-{
-  fs_heap_free(list->fl_files[i]);
-}
+  for (i = list->fl_rows - 1; i > 0; i--)

Review Comment:
   > This problem will not exist after #16361 I think ?
   
   @pussuw  You can construct a test case and try it out using the example I 
provided above, or I can test your patch over the weekend to see if it resolves 
this issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


Donny9 commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2904216638

   > > > > > Why is fdcheck implemented in the kernel, the kernel should not 
have to care about such things. Shouldn't it be implemented at the user 
interface i.e. libc ?
   > > > > 
   > > > > 
   > > > > @pussuw The implementation of fdcheck originally resides in libc 
https://github.com/apache/nuttx/blob/889a26db1d229ee462bc8ea075c2330dc5f79b94/libs/libc/misc/lib_fdcheck.c
   > > > > But we need to perform the conversion from file descriptor (fd) to 
file structure pointer (filep) within the kernel, as this conversion cannot be 
done in user space. And, the parameter for system calls is an fd, nothing else, 
and the filep is also a kernel data structure.
   > > > 
   > > > 
   > > > Yes the problem here is that the data is in filep. After #16361 you 
could do this magic in userspace.
   > > 
   > > 
   > > @pussuw #16361 would definitely be the best if we could cover this type 
of scenario. However, the changes in this current Pull Request (PR) are quite 
extensive, so it's unlikely to be merged into the mainline quickly. There's a 
possibility of stability issues, so we should ensure its reliability through 
some testing. Therefore, I will pick it up internally and conduct LTP (Linux 
Test Project) functional testing, 24-hour Monkey stress testing, as well as 
functional testing on multiple platforms, before approving it.
   > > Therefore, our priority is to ensure that the current design is sound. 
You can continue to rework based on #16361 .
   > 
   > Thanks, I do really prefer fixing the file descriptor subsystem like 
@xiaoxiang781216 suggested in #16326 . It is better to do a comprehensive fix 
that as far as I can see, will solve most of your problems as well.
   
   @pussuw @jlaitine 
   Actually, this current PR doesn't just address the issue with fdcheck. 
   Firstly, optimizing the file descriptors in epoll to be managed as files is 
quite reasonable, because fds shouldn't exist in the kernel in their raw form; 
they should be converted into ``file structures` or the subsequent `struct fd` 
representation. 
https://github.com/apache/nuttx/pull/16431/commits/4629fc52164b8607b8c5343f1042425531d0df16
   Secondly, we've removed poll_fdsetup to ensure that everyone uses interfaces 
with reference counting to access fds, which is much 
safer."https://github.com/apache/nuttx/pull/16431/commits/0cb3037484a143c3c72be4fe421a4fb8983e767a
   The other two modifications are minor and only target the usage of file 
references.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


pussuw commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2904150857

   > > > > Why is fdcheck implemented in the kernel, the kernel should not have 
to care about such things. Shouldn't it be implemented at the user interface 
i.e. libc ?
   > > > 
   > > > 
   > > > @pussuw The implementation of fdcheck originally resides in libc 
https://github.com/apache/nuttx/blob/889a26db1d229ee462bc8ea075c2330dc5f79b94/libs/libc/misc/lib_fdcheck.c
   > > > But we need to perform the conversion from file descriptor (fd) to 
file structure pointer (filep) within the kernel, as this conversion cannot be 
done in user space. And, the parameter for system calls is an fd, nothing else, 
and the filep is also a kernel data structure.
   > > 
   > > 
   > > Yes the problem here is that the data is in filep. After #16361 you 
could do this magic in userspace.
   > 
   > @pussuw #16361 would definitely be the best if we could cover this type of 
scenario. However, the changes in this current Pull Request (PR) are quite 
extensive, so it's unlikely to be merged into the mainline quickly. There's a 
possibility of stability issues, so we should ensure its reliability through 
some testing. Therefore, I will pick it up internally and conduct LTP (Linux 
Test Project) functional testing, 24-hour Monkey stress testing, as well as 
functional testing on multiple platforms, before approving it.
   > 
   > Therefore, our priority is to ensure that the current design is sound. You 
can continue to rework based on #16361 .
   
   Thanks, I do really prefer fixing the file descriptor subsystem like 
@xiaoxiang781216 suggested in #16326 . It is better to do a comprehensive fix 
that as far as I can see, will solve most of your problems as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


Donny9 commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2904143761

   > > > Why is fdcheck implemented in the kernel, the kernel should not have 
to care about such things. Shouldn't it be implemented at the user interface 
i.e. libc ?
   > > 
   > > 
   > > @pussuw The implementation of fdcheck originally resides in libc 
https://github.com/apache/nuttx/blob/889a26db1d229ee462bc8ea075c2330dc5f79b94/libs/libc/misc/lib_fdcheck.c
   > > But we need to perform the conversion from file descriptor (fd) to file 
structure pointer (filep) within the kernel, as this conversion cannot be done 
in user space. And, the parameter for system calls is an fd, nothing else, and 
the filep is also a kernel data structure.
   > 
   > Yes the problem here is that the data is in filep. After #16361 you could 
do this magic in userspace.
   
   @pussuw  #16361  would definitely be the best if we could cover this type of 
scenario. 
   However, the changes in this current Pull Request (PR) are quite extensive, 
so it's unlikely to be merged into the mainline quickly. There's a possibility 
of stability issues, so we should ensure its reliability through some testing. 
   Therefore, I will pick it up internally and conduct LTP (Linux Test Project) 
functional testing, 24-hour Monkey stress testing, as well as functional 
testing on multiple platforms, before approving it.  
   
   Therefore, our priority is to ensure that the current design is sound. You 
can continue to rework based on #16361 .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


pussuw commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2904055101

   > > Why is fdcheck implemented in the kernel, the kernel should not have to 
care about such things. Shouldn't it be implemented at the user interface i.e. 
libc ?
   > 
   > @pussuw The implementation of fdcheck originally resides in libc 
https://github.com/apache/nuttx/blob/889a26db1d229ee462bc8ea075c2330dc5f79b94/libs/libc/misc/lib_fdcheck.c
   > 
   > But we need to perform the conversion from file descriptor (fd) to file 
structure pointer (filep) within the kernel, as this conversion cannot be done 
in user space. And, the parameter for system calls is an fd, nothing else, and 
the filep is also a kernel data structure.
   
   Yes the problem here is that the data is in filep. After #16361 you could do 
this magic in userspace.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


jlaitine commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2904072439

   I completely agree with @pussuw here! There is nice cleanup available for 
the FDs in PR #16361, which fixes the operation to be acc. to the posix.
   
   Please don't inject debug functionality which should NOT be inside kernel in 
there. Integrate first the #16361, and then implement the debug functionality 
in where it belongs. That is, in the kernel-userspace interface, not directly 
in the kernel.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


pussuw commented on code in PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#discussion_r2104343400


##
fs/inode/fs_files.c:
##
@@ -476,14 +479,32 @@ void files_putlist(FAR struct filelist *list)
* because there should not be any references in this context.
*/
 
+#ifdef CONFIG_FS_REFCOUNT
+again:
+  loop = false;
+#endif
   for (i = list->fl_rows - 1; i >= 0; i--)
 {
   for (j = CONFIG_NFILE_DESCRIPTORS_PER_BLOCK - 1; j >= 0; j--)
 {
+#ifdef CONFIG_FS_REFCOUNT
+  if (fs_putfilep(&list->fl_files[i][j]) > 0)

Review Comment:
   I mean another CPU can be performing a write operation while another CPU 
exit()'s the process. This is userspace racing against itself. The kernel must 
be able to endure this.



##
fs/inode/fs_files.c:
##
@@ -476,14 +479,32 @@ void files_putlist(FAR struct filelist *list)
* because there should not be any references in this context.
*/
 
+#ifdef CONFIG_FS_REFCOUNT
+again:
+  loop = false;
+#endif
   for (i = list->fl_rows - 1; i >= 0; i--)
 {
   for (j = CONFIG_NFILE_DESCRIPTORS_PER_BLOCK - 1; j >= 0; j--)
 {
+#ifdef CONFIG_FS_REFCOUNT
+  if (fs_putfilep(&list->fl_files[i][j]) > 0)

Review Comment:
   I mean another CPU can be performing a write/read/whatever operation while 
another CPU exit()'s the process. This is userspace racing against itself. The 
kernel must be able to endure this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


pussuw commented on code in PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#discussion_r2104340437


##
fs/inode/fs_files.c:
##
@@ -482,11 +482,11 @@ void files_putlist(FAR struct filelist *list)
 {
   file_close(&list->fl_files[i][j]);
 }
+}
 
-  if (i != 0)
-{
-  fs_heap_free(list->fl_files[i]);
-}
+  for (i = list->fl_rows - 1; i > 0; i--)

Review Comment:
   This problem will not exist after #16361 I think ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


Donny9 commented on code in PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#discussion_r2104217444


##
fs/inode/fs_files.c:
##
@@ -476,14 +479,32 @@ void files_putlist(FAR struct filelist *list)
* because there should not be any references in this context.
*/
 
+#ifdef CONFIG_FS_REFCOUNT
+again:
+  loop = false;
+#endif
   for (i = list->fl_rows - 1; i >= 0; i--)
 {
   for (j = CONFIG_NFILE_DESCRIPTORS_PER_BLOCK - 1; j >= 0; j--)
 {
+#ifdef CONFIG_FS_REFCOUNT
+  if (fs_putfilep(&list->fl_files[i][j]) > 0)

Review Comment:
   @pussuw Hello, the call of this function occurs when a task exits. There are 
two key points that need to be clarified first:
   1. All resources must be released when a task exits, including any opened 
file descriptors (fd).
   2. If the reference count (refs) of the filep corresponding to an fd is 
greater than 1, it means that the filep is still being used somewhere, possibly 
by another duplicated fd.
   
   Currently, we close all fds in the file_list in reverse order. 
   For example:
   Suppose the epoll fd is 5, and we add fd1=10, fd2=6, and fd3=3 to the epoll 
queue.
   So the closing order would be: fd1 -> fd2 -> epoll fd -> fd3.
   
   When we close the epoll fd, since it's part of the task exit process, 
there's no opportunity to execute epoll_ctl delete to remove fd1 and fd2. 
Therefore, when we call close(epoll_fd)->epoll_close, trying to find the filep 
through fd1 and fd2 will result in an error from fdcheck because they have 
already been closed.
   
   This is why we're considering replacing the fds in epoll with file 
references. By doing so, when we close fd1 and fd2 for the first time, we only 
decrement their reference counts by 1 instead of actually closing them. 
   
   However, we need to iterate through the for loop again to close them 
properly and release the resources.
   
   "What do you mean by 'other CPUs'? Are you referring to scenarios where 
other CPUs access across cores through rpmsg services? In such cases, other 
CPUs would receive a POLLHUP event from this fd, and this event needs to be 
handled properly."



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


Donny9 commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2903884059

   > Why is fdcheck implemented in the kernel, the kernel should not have to 
care about such things. Shouldn't it be implemented at the user interface i.e. 
libc ?
   
   @pussuw The implementation of fdcheck originally resides in libc 
https://github.com/apache/nuttx/blob/889a26db1d229ee462bc8ea075c2330dc5f79b94/libs/libc/misc/lib_fdcheck.c
   
   But we need to perform the conversion from file descriptor (fd) to file 
structure pointer (filep) within the kernel, as this conversion cannot be done 
in user space. And, the parameter for system calls is an fd, nothing else, and 
the filep is also a kernel data structure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


Donny9 commented on code in PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#discussion_r2104225346


##
fs/inode/fs_files.c:
##
@@ -482,11 +482,11 @@ void files_putlist(FAR struct filelist *list)
 {
   file_close(&list->fl_files[i][j]);
 }
+}
 
-  if (i != 0)
-{
-  fs_heap_free(list->fl_files[i]);
-}
+  for (i = list->fl_rows - 1; i > 0; i--)

Review Comment:
   @pussuw  No, you definitely haven't read my commit message carefully. 
   The reason is that there are **many file descriptors (fds) that are 
interdependent,** and they may not all be on the same line in the list; they 
can span multiple lines. 
   If, when closing one fd, you try to access another fd that has already been 
closed, but the line it was on may have already been freed, it could lead to a 
Use-After-Free (UAF) issue. 
   Therefore, we need to release all the memory uniformly at the end.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


pussuw commented on PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#issuecomment-2903797999

   Why is fdcheck implemented in the kernel, the kernel should not have to care 
about such things. Shouldn't it be implemented at the user interface i.e. libc ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


pussuw commented on code in PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#discussion_r2104159309


##
fs/inode/fs_files.c:
##
@@ -482,11 +482,11 @@ void files_putlist(FAR struct filelist *list)
 {
   file_close(&list->fl_files[i][j]);
 }
+}
 
-  if (i != 0)
-{
-  fs_heap_free(list->fl_files[i]);
-}
+  for (i = list->fl_rows - 1; i > 0; i--)

Review Comment:
   Looks like the change is redundant, the old code works as far as I can see ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


pussuw commented on code in PR #16431:
URL: https://github.com/apache/nuttx/pull/16431#discussion_r2104158434


##
fs/inode/fs_files.c:
##
@@ -476,14 +479,32 @@ void files_putlist(FAR struct filelist *list)
* because there should not be any references in this context.
*/
 
+#ifdef CONFIG_FS_REFCOUNT
+again:
+  loop = false;
+#endif
   for (i = list->fl_rows - 1; i >= 0; i--)
 {
   for (j = CONFIG_NFILE_DESCRIPTORS_PER_BLOCK - 1; j >= 0; j--)
 {
+#ifdef CONFIG_FS_REFCOUNT
+  if (fs_putfilep(&list->fl_files[i][j]) > 0)

Review Comment:
   I don't agree with this at all. Whomever holds the reference should release 
it. Now if the kernel needs the file still because it's doing something with it 
on another CPU it will crash.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



[PR] Resolving the issue of fdcheck reporting errors when an epoll file descriptor (fd) is closed [nuttx]

2025-05-23 Thread via GitHub


Donny9 opened a new pull request, #16431:
URL: https://github.com/apache/nuttx/pull/16431

   
   ## Summary
   
   Resolving the issue of fdcheck reporting errors when an epoll file 
descriptor (fd) is closed.
   
   The primary cause of the error is as follows: 
   When a task exits by calling exit, it closes all file descriptors (fds) 
associated with the current task. 
   When the epoll fd is closed, if there are still fds in the epoll queue that 
have values greater than the epoll fd, these fds may be closed first. This can 
lead to fdcheck failures when call epoll_close
   To resolve this, we should use a file-based approach instead of directly 
managing fds in epoll.
   
   ## Impact
   
   fix epoll fdcheck issue
   
   ## Testing
   
   open vela monkey
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]