Hello,

David Maus <dm...@ictsoc.de> writes:

> IIRC org-link-escape is not used to create URLs but to escape
> characters in a link that would otherwise conflict with Orgmode syntax
> (e.g. square brackets).
> Org applies percent escaping to a link before
> it is stored in the buffer and applies unescaping when it reads a link
> back.
>
> The percent sign is hardcoded because if org-link-escape/unescape is
> used in this way we must make sure that the identity of a link is
> preserved. If we would *not* escape the percent sign, then an original
> link with percent encoded characters would be read back wrongly,
> i.e. with the percent escaped characters unescaped.

[...]

> There is, of course, the nasty thing that we don't know if the link in
> a buffer went through org-link-escape or not. E.g. if you paste
>
> ,----
> | [[http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml]]
> `----
>
> into the buffer you'll get a broken link because org-link-open assumes
> the link to be escaped by org.
>
> The bottom-line: Org creates link programmatically (org-store-link)
> and needs a mechanism to protected conflicting characters. It chose
> percent-escaping and in order to preserve the identity of a link Org
> has to escape the escape-character.
>
> Hope that helps!

It does.

I think we are hunting two hares, and that's why we have been failing
so far.

There are two URI transformations involved. One is mandatory (escaping
square brackets in the URI), and the other is optional (normalizing the
URI for consumption by external processes).

The former must be bi-directional, as escaping brackets must be
transparent to the user (e.g., when editing a link with
`org-insert-link'). The latter needn't be, and can happen on the fly,
just before the URI is sent to whatever needs it (e.g., a browser).

Therefore, I suggest using three functions:

  - `org-link-escape' will first %-escape "%" characters, and then "["
    and "]" characters. `org-link-unescape' will reverse the
    operation. These functions cannot break a link, encoded or not.
    They are applied when a link is created programmatically and when
    it is read back for user editing.

  - `org-link-encode'[1] will %-escape every forbidden character in the
    URI. It doesn't need any "reverse" function. It will be called when
    opening a link, or parsing it. I think it shouldn't escape "%"
    characters, though, so that it can be applied to both encoded and
    plain strings. Since it isn't perfect (it doesn't parse the URI),
    it should also be very conservative (i.e. allow more characters,
    such as "=" or "&") and not get in the way.

WDYT?

Regards,

[1] `url-encode-url' was introduced in Emacs 24.3. It is too young to be
    used mainstream, even though it does a better job than
    `org-link-escape'. We will benefit from it when Emacs 25 is out
    (i.e. when Emacs 23 support is dropped).

--
Nicolas Goaziou
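
To make the proposed split concrete, here is a minimal sketch of the three
functions. It is only an illustration under stated assumptions, not Org's
actual code: the `my-link-' names are hypothetical, and the set of characters
that `my-link-encode' treats as forbidden (control characters, space and
non-ASCII bytes) is an assumption chosen to stay conservative.

```emacs-lisp
;;; Sketch only.  The `my-link-' names are hypothetical, not Org functions.

(defun my-link-escape (link)
  "Return LINK with \"%\", \"[\" and \"]\" percent-escaped.
A single pass over LINK, so an escaped \"%\" is never escaped twice."
  (replace-regexp-in-string
   "[][%]"
   (lambda (ch) (format "%%%02X" (string-to-char ch)))
   link t t))

(defun my-link-unescape (link)
  "Reverse `my-link-escape': decode only %25, %5B and %5D in LINK.
Any other percent-encoded sequence is left untouched."
  (let ((case-fold-search nil))         ; match the exact output of escape
    (replace-regexp-in-string
     "%\\(25\\|5B\\|5D\\)"
     (lambda (m) (string (string-to-number (substring m 1) 16)))
     link t t)))

(defun my-link-encode (uri)
  "Percent-encode control characters, space and non-ASCII bytes in URI.
\"%\" is left alone, so URI may already be encoded.  The choice of
forbidden characters here is an assumption, kept deliberately small."
  (mapconcat
   (lambda (byte)
     (if (or (<= byte 32) (>= byte 127))
         (format "%%%02X" byte)
       (char-to-string byte)))
   (encode-coding-string uri 'utf-8)
   ""))
```

With these definitions, an already encoded link survives the round trip,
and encoding a link twice changes nothing:

```emacs-lisp
(my-link-escape "http://example.org/a[1]?q=100%")
;; => "http://example.org/a%5B1%5D?q=100%25"

(my-link-unescape (my-link-escape "http://example.org?url=http%3A%2F%2Fx"))
;; => "http://example.org?url=http%3A%2F%2Fx"

(my-link-encode "http://example.org/a b?x=%3A")
;; => "http://example.org/a%20b?x=%3A"
```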