On 08/04/2021 at 20:59, Tim Duesterhus wrote:
Willy,
Christopher,
I'm not very happy with the normalization logic, because I am processing the URI
in reverse. This requires me to directly access offsets instead of using the
`ist` API. However this way I don't need to backtrack once I encounter a `../`
which I consider to be a win.
It is not shocking. The function is readable, so it is not a real problem. Maybe
we can introduce an istoff() function to get the pointer at a given offset. You
may also choose to rely fully on pointers with a negative index. I know you want
to use the ist API as far as possible, but it is not always the easiest way :)
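To make the istoff() idea concrete, here is a rough sketch. It is only an
illustration: the struct below is a local stand-in modeled on the ist type
(pointer plus length), and istoff() and ist_last_slash() are hypothetical
names that do not exist in the tree today.

```c
#include <stddef.h>

/* Local stand-in for an indirect string: pointer plus length. */
struct ist {
    char *ptr;
    size_t len;
};

/* Hypothetical helper: pointer to the byte at offset <off>,
 * or NULL when the offset is outside the string.
 */
static inline char *istoff(const struct ist s, size_t off)
{
    return off < s.len ? s.ptr + off : NULL;
}

/* Hypothetical reverse scan built on istoff(): returns the offset of
 * the last '/' in <s>, or s.len when there is none.
 */
static inline size_t ist_last_slash(const struct ist s)
{
    size_t off = s.len;

    while (off > 0) {
        off--;
        if (*istoff(s, off) == '/')
            return off;
    }
    return s.len;
}
```

With something like this, a reverse walk over the URI can stay in terms of
offsets without touching the ptr and len members directly.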
In the end, it remains your choice. The function is quite good. I just wonder if
it could be valuable to also handle single dot-segments here in addition to
double dot-segments. In that case, the normalizer should be renamed
"dot-segments" or something similar.
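To illustrate what handling both kinds of dot-segments could look like, here
is a minimal forward-scanning sketch. It is not the code from the patch: the
function name is made up, it writes to a separate buffer instead of working
in reverse, and it follows the RFC 3986 behaviour of dropping leading ".."
segments and treating an empty segment as a regular segment. It only looks at
the path, not the query string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies <path> into <out> with "." and ".."
 * segments resolved. Returns the output length, or -1 if <out> is
 * too small. The output is never longer than the input.
 */
int normalize_dot_segments(const char *path, char *out, size_t size)
{
    size_t len = strlen(path);
    size_t i = 0; /* read offset in <path> */
    size_t o = 0; /* write offset in <out> */

    if (size < len + 1)
        return -1;

    while (i < len) {
        size_t start = i;
        size_t seglen;
        int has_slash;

        /* locate the end of the current segment */
        while (i < len && path[i] != '/')
            i++;
        seglen = i - start;
        has_slash = (i < len);
        if (has_slash)
            i++;

        if (seglen == 1 && path[start] == '.')
            continue; /* "." : drop the segment */

        if (seglen == 2 && path[start] == '.' && path[start + 1] == '.') {
            /* ".." : drop the previous output segment, if any */
            if (o > 1) {
                o--; /* step over the trailing '/' */
                while (o > 0 && out[o - 1] != '/')
                    o--;
            }
            continue;
        }

        /* regular (possibly empty) segment: copy it with its '/' */
        memcpy(out + o, path + start, seglen);
        o += seglen;
        if (has_slash)
            out[o++] = '/';
    }
    out[o] = '\0';
    return (int)o;
}
```

For example, "/a/./b/../c" becomes "/a/c" and "/../a" becomes "/a". Whether
leading dot-segments should be dropped or preserved is exactly the kind of
thing an option could control.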
Another point is about dot encoding. It may be good to handle the encoded dot
(%2E), maybe via an option. And IMHO, the way empty segments are handled is a
bit counterintuitive. Calling the "merge-slashes" normalizer first is a solution,
of course, but this means rewriting the URI twice. We must figure out what the
main expectation for this normalizer is, especially because ignoring empty
segments when dot-segments are merged is not exactly the same as merging all
slashes.
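A small sketch of the two points above, with made-up names and no claim to
match the patches. dot_segment_kind() shows how a segment test could
optionally accept the encoded dot, and merge_slashes() is a simplified
stand-in for a slash-merging pass on the path.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 for ".", 2 for "..", 0 otherwise.
 * When <accept_encoded> is non-zero, "%2E" and "%2e" count as a dot.
 */
int dot_segment_kind(const char *seg, size_t len, int accept_encoded)
{
    int dots = 0;
    size_t i = 0;

    while (i < len) {
        if (seg[i] == '.') {
            i++;
        } else if (accept_encoded && len - i >= 3 && seg[i] == '%' &&
                   seg[i + 1] == '2' &&
                   (seg[i + 2] == 'E' || seg[i + 2] == 'e')) {
            i += 3;
        } else {
            return 0;
        }
        if (++dots > 2)
            return 0;
    }
    return dots;
}

/* Hypothetical helper: copies <path> into <out>, collapsing each run
 * of '/' into a single one. Returns the output length, or -1 if <out>
 * is too small.
 */
int merge_slashes(const char *path, char *out, size_t size)
{
    size_t o = 0;
    size_t i;

    if (size == 0)
        return -1;

    for (i = 0; path[i] != '\0'; i++) {
        if (path[i] == '/' && o > 0 && out[o - 1] == '/')
            continue; /* skip repeated slash */
        if (o + 1 >= size)
            return -1;
        out[o++] = path[i];
    }
    out[o] = '\0';
    return (int)o;
}
```

Take "/a//../b". Merging slashes first gives "/a/../b", which then resolves to
"/b". A dot-segment pass that treats the empty segment as a real segment, as
RFC 3986 section 5.2.4 does, removes only the empty segment and gives "/a/b".
And a pass that merely skips empty segments while resolving ".." still leaves
the other double slashes in the URI, which is why the expected behaviour
needs to be decided explicitly.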
Note that I was at first surprised that leading dot-segments were preserved,
until I read the 6th patch, because for me it is the most important part. But
I'm fine with an option one way or another.
--
Christopher Faulet