On Wed, 09 Sep 2020 06:38:11 +0800, 積丹尼さん wrote:
> I'm saying that emacs-w3m might be following this rule:

>   https://en.wikipedia.org/wiki/URI_fragment#cite_ref-6
>   "Notably they cannot begin with a digit or hyphen."

> But nowadays it seems there are newer rules, so emacs-w3m should expect
> ones that do begin with a digit or hyphen.

BTW, I found something that might be the root cause of this bug.
In the HTML source of the page you first brought up
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=968589#36>
the "36" section begins as follows:

<hr>
<p class="msgreceived">
  <a name="36"></a>
  <a name="msg36"></a>
  <a href="#36">Message #36</a>
  received at 968...@bugs.debian.org

Note that there are *two* name anchors.  I guess emacs-w3m may
overwrite the first one it finds with the second.  But all of
them should be collected and kept so that each can be referred to.
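
To illustrate what "collected and kept" means here, below is a rough
Python sketch.  It is not emacs-w3m code; the names AnchorCollector
and collect_anchor_names are made up for this example, and it only
assumes the standard html.parser module.  It records every
<a name="..."> it sees instead of keeping just the last one:

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the value of every <a name="..."> in document order."""

    def __init__(self):
        super().__init__()
        self.names = []  # all anchor names found so far, none overwritten

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for key, value in attrs:
            # Skip href-only links and empty name attributes.
            if key == "name" and value:
                self.names.append(value)


def collect_anchor_names(html):
    """Return the list of anchor names that appear in the given HTML text."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return parser.names


# The beginning of the "36" section quoted above.
SAMPLE = """<hr>
<p class="msgreceived">
  <a name="36"></a>
  <a name="msg36"></a>
  <a href="#36">Message #36</a>
"""

print(collect_anchor_names(SAMPLE))
```

With the two adjacent anchors both kept, a later lookup can choose
the exact name "36" rather than whatever was stored last.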

Currently, if and only if the last element of a url (e.g., "#36")
matches "\\`#[0-9]+\\'", i.e., "#" followed by digits only, emacs-w3m
looks for the anchor names "36", "bla36", "36bla", and "bla36bla",
where "bla" is an arbitrary string containing no digits.  So, a
name that begins with a digit or hyphen will be found.
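
The rule described above can be sketched in Python as follows.  This
is only my paraphrase of the behavior, not the actual emacs-w3m
implementation; the function names are hypothetical, and the exact
match used for non-numeric fragments is an assumption made for the
sake of the example:

```python
import re


def candidate_pattern(fragment):
    """Return a compiled pattern for a purely numeric fragment, else None.

    Mirrors the Emacs regexp "\\`#[0-9]+\\'": "#" followed by digits only.
    """
    m = re.fullmatch(r"#([0-9]+)", fragment)
    if m is None:
        return None
    number = re.escape(m.group(1))
    # "bla" may be empty or any run of non-digit characters on either side.
    return re.compile(r"[^0-9]*" + number + r"[^0-9]*")


def fragment_matches(fragment, name):
    """Say whether the anchor NAME is accepted for the url FRAGMENT."""
    pattern = candidate_pattern(fragment)
    if pattern is None:
        # Assumption: a non-numeric fragment must match the name exactly.
        return name == fragment[1:]
    return pattern.fullmatch(name) is not None


for name in ["36", "msg36", "36bla", "x36y", "136", "msg360"]:
    print(name, fragment_matches("#36", name))
```

Here "36", "msg36", "36bla", and "x36y" are all accepted for "#36",
while "136" and "msg360" are rejected because the surrounding text
may not contain digits.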

However, there is no document that specifies such a rule
(i.e., that the url fragment "#36" points not only to the name "36"
but also to "msg36", etc.), is there?  I feel it is too ambiguous.

Anyway I'm going to work on the emacs-w3m code again.

Thanks.
