Mike Lissner <mliss...@michaeljaylissner.com> added the comment:
> With the fix for this bug, urlsplit silently removes (some of) those > characters before we can replace them, modifying the output of our > sanitisation code I don't have any good solutions for 3.9.5, but going forward, this feels like another example of why we should just do parsing right (the way browsers do). That'd maintain tabs and whatnot in your output, and it'd fix the security issue by putting `java\nscript` into the scheme attribute instead of the path. > One solution that presents itself to me: add a `strip_insecure_characters: > bool = True` parameter. Doesn't this lose sight of what this tool is supposed to do? It's not supposed to have a good (new, correct) and a bad (old, obsolete) way of parsing. Woe unto whoever has to write the documentation for that parameter. Also, I should reiterate that these aren't "insecure" characters so if we did have a parameter for this, it'd be more like `do_rfc_3986_parsing` or maybe `do_naive_parsing`. The chars aren't insecure in themselves. They're fine. Python just gets tripped up on them. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue43882> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com