Cynerd commented on code in PR #17360:
URL: https://github.com/apache/nuttx/pull/17360#discussion_r2549916009


##########
drivers/can/can.c:
##########
@@ -702,6 +717,13 @@ static ssize_t can_write(FAR struct file *filep, FAR const char *buffer,
       /* Increment the number of bytes that were sent */
 
       nsent += msglen;
+
+      if (msgalign > 1)
+        {
+          nsent = powerof2(msgalign)

Review Comment:
   That is absolutely true on platforms whose ALU has a hardware divider, but I 
would bet that on platforms without one (such as Cortex-M0) this is faster 
than emulated division, since the emulation can take hundreds of cycles (note 
that `roundup` contains both a multiplication and a division).
   
   Thus, my reasoning is:
   - Either it is a very small MCU without hardware division, where this 
branch can be beneficial.
   - Or it is a larger MCU with division built in, which will also be faster 
in general, so removing the condition would be a micro-optimization, and I 
don't think this code is hot enough to justify it.
   
   Depending on the optimization level, the compiler could also recognize that 
the code in `powerof2` is actually `stdc_has_single_bit` and that `roundup2` 
is `stdc_bit_ceil`. The macros could even replace these with an appropriate 
compiler builtin where one is available. That would make the "power of 2" 
happy path very fast, although whether those builtins compile down to a single 
instruction depends on the architecture.
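
   To make the trade-off concrete, here is a minimal sketch of the two paths 
being compared. The helper names (`is_power_of_two`, `round_up_pow2`, 
`round_up_any`, `align_length`) are hypothetical and only illustrate the idea. 
They are not the NuttX `powerof2`, `roundup2` or `roundup` macros, and this is 
not the code from the PR.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only, not the NuttX macros. */

/* True when x has exactly one bit set. */

static bool is_power_of_two(size_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* Round n up to a multiple of align using only add and mask.
 * Valid only when align is a power of two.
 */

static size_t round_up_pow2(size_t n, size_t align)
{
  return (n + align - 1) & ~(align - 1);
}

/* Round n up to a multiple of any non-zero align.  This needs one
 * division and one multiplication, which a core without a hardware
 * divider has to emulate in software.
 */

static size_t round_up_any(size_t n, size_t align)
{
  return ((n + align - 1) / align) * align;
}

/* Take the cheap path when possible, fall back to division otherwise. */

static size_t align_length(size_t n, size_t align)
{
  if (align <= 1)
    {
      return n;
    }

  return is_power_of_two(align) ? round_up_pow2(n, align)
                                : round_up_any(n, align);
}
```

   With an alignment of 8 only the mask path runs. An alignment such as 12 
falls back to the division path, which is the case that would be expensive on 
a core without a divider.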



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
