Hi Tim, On Thu, Nov 04, 2021 at 07:12:04PM +0100, Tim Düsterhus wrote: > Your patch is already merged and the bug is fixed. However I'd like to > comment on the reasons behind why I refactored the whole function to use the > ist API: > > I *strongly* dislike code that just works because of some implicit > assumptions that might or might not hold after future changes. This is C, > every messed up boundary check can easily result in some major issues. > > In this specific case the implicit assumption is that all supported > algorithms are exactly 5 characters long. This is true with the current > implementation in HAProxy which just supports the HS, RS, ES, and PS > families. However the JOSE specification [1] also specifies other values for > the 'alg' field (though most of them for encrypted instead of just signed > tokens) and they have differing lengths. > > If someone in the future wants to add support for a 6 character algorithm, > then they need to remember to add the length check to not run into the same > strncmp() issue like you did. My experience is that if a mistake is made > once, it is going to be made again. > > The ist API already contains a large number of functions for *safe* handling > of strings that are not NUL terminated. It is very hard to misuse them, > because they all include the appropriate checks. If you see isteq(dynamic, > ist("static")) then it is very clear that this will match the string > "statis" exactly. For strncmp you will need to perform the length check > yourself. You also need to remember to manually subtract 1 for the trailing > NUL when using sizeof, whereas this is done automatically by the ist() > function. > > While I am not an expert in optimization and getting the last percent of > performance out of a function, I don't think the ist API is going to be > measurably slower than strncmp. It is used everywhere within HTX after all! > For the JWT the OpenSSL crypto is going to take up the majority of the time > anyway. > > Hope this help to clarify why I'm strongly pushing the use of the ist API > and why I've contributed a fair number of additional functions in the past > when I encountered situations where the current functions were not ergonomic > to use. One example is the istsplit() function which I added for > uri_normalizer.c. It makes tokenization of strings very easy and safe.
I overall generally agree with your arguments above as I created ist exactly to address such trouble that always end up with someone spend 3 days on a but in someone else's code just to discover a logic mistake such as a string function called on a string that's not always nul-terminated, or reverse scanning that can occasionally continue before the beginning of the string etc. However, there is an important aspect to keep in mind, which is that Rémi is the maintainer of this part and he needs to use what he feels at ease with. I can easily understand that he's not much familiar with the usage of ist. While the functions are correctly documented and generally useful, their coverage is not always complete (and you had a few times to add new ones yourself), and we don't have a simple index of what's available with a one-liner description, which doesn't help to get familiar with them. In addition there are places where the input text is still made of nul terminated strings, that require an extra conversion that can also be error-prone, or just would add some code whose benefit is not immediately perceptible. I, too, hope that over time we make most of the code that deals with strings coming from user data evolve towards a more generalized use of ist for the reasons you mention. Safer, easier, friendlier API when you're in a code area that already makes use of them. I will just not force anyone to do that, though I can encourage by showing how to use it. However you can be sure that I'll complain very loudly if I have to debug something that was caused by unsafe code that could have been avoided this way. The performance aspect is indeed something that ought not be neglected in two domains: - CPU: the functions are made to be inlined, and as their end is known, usually this results in trivial operations that can be much faster than the usual ones. However the functions are designed to deal optimally with short strings (like keywords, cookies etc) and to result in optimal code for small loops (i.e. no function call and the smallest possible setup code). Using them on large blocks will result in lower performance than regular functions. Looking up a string inside a body for example ought not be done with them ; - size: the strings use two parts and they are thus larger than a single pointer. In some structures the space could be extremely tight and the cost of storing the length can sometimes cancel some of the gains. But this is quite rare as almost all of our strings are accompanied with a length nowadays. My general rule of thumb on this is: - if you're dealing with small strings (around 200 chars or less) that you manipulate very often and for which storing the size is not a problem, better use ist. - if you're manipulating strings a lot in a function (tokenizing, etc), it's often worth turning them to ist before performing the operations as the pointer and length will then just become registers that are passed and updated between calls. In addition many times the compiler can detect constants and optimize loops. - otherwise it's probably not worth it. There are some cases where a function like this one: int is_blah_str() { return strcmp(get_blah_str(), "blah") == 0; } can be more efficiently done like this: int is_blah_ist() { return isteq(get_blah_ist(), ist("blah")); } and provide more optimal code like this: 0000000000000080 <is_blah_ist>: 80: 53 push %rbx 81: 31 db xor %ebx,%ebx 83: 31 c0 xor %eax,%eax 85: e8 00 00 00 00 callq 8a <is_blah_ist+0xa> // get_blah_ist 8a: 48 83 fa 04 cmp $0x4,%rdx 8e: 75 0b jne 9b <is_blah_ist+0x1b> 90: 31 db xor %ebx,%ebx 92: 81 38 62 6c 61 68 cmpl $0x68616c62,(%rax) 98: 0f 94 c3 sete %bl 9b: 89 d8 mov %ebx,%eax 9d: 5b pop %rbx 9e: c3 retq where the constant was turned into a single 32-bit integer comparison. This is more efficient both in terms of CPU and code size. But in order for this to happen more often, we sometimes have to provide hints to the compiler, and the code really needs to be improved (this will happen over time). By the way I think that I should create an "api" sub-directory under "doc/internals" to place the docs about the internal API parts, such as buffers, htx, filters, initcalls, and the missing ist. This could probably save developers some time and encourage others to add new stuff there from time to time. Regards, Willy