On 18/04/2024 19:31, Austin Leirvik wrote:
Hi dev team,
I have noticed that the PublicSuffixMatcher.getDomainRoot method does
not always return the expected registrable domain according to the
Public Suffix List formal algorithm
(https://github.com/publicsuffix/list/wiki/Format#formal-algorithm).
The result returned by getDomainRoot depends on whether the suffix is
in the ICANN or PRIVATE section of the list. For example:
ICANN: publicSuffixMatcher.getDomainRoot("test.com.ar") --> test.com.ar
PRIVATE: publicSuffixMatcher.getDomainRoot("test.appspot.com") --> appspot.com
From my understanding, the formal algorithm does not distinguish
between ICANN and PRIVATE suffixes. According to the formal algorithm,
"the registered or registrable domain is the public suffix plus one
additional label". appspot.com is a public suffix, so the registrable
domain would be "test.appspot.com".
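To illustrate the rule quoted above, here is a minimal sketch of "public suffix plus one additional label" that treats ICANN and PRIVATE entries alike. It does not use the HttpClient API: the class name, the method name and the four-entry rule set are hypothetical, and only plain rules are handled (no wildcard or exception rules).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class RegistrableDomainSketch {

    // Hypothetical, hand-picked subset of PSL rules, for illustration only.
    // "com", "ar" and "com.ar" are ICANN entries, "appspot.com" is a PRIVATE
    // entry; the sketch deliberately makes no distinction between them.
    static final Set<String> SAMPLE_RULES =
            new HashSet<>(Arrays.asList("com", "ar", "com.ar", "appspot.com"));

    /**
     * Returns the public suffix plus one additional label, or null if the
     * domain is itself a public suffix (or has no label to add).
     * Assumption: only plain rules; "*." and "!" rules are out of scope.
     */
    static String registrableDomain(String domain, Set<String> rules) {
        String[] labels = domain.toLowerCase(Locale.ROOT).split("\\.");
        // Try the longest candidate suffix first, so the first hit is the
        // longest matching rule.
        for (int i = 0; i < labels.length; i++) {
            String candidate =
                    String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (rules.contains(candidate)) {
                if (i == 0) {
                    return null; // the whole domain is a public suffix
                }
                return String.join(".", Arrays.copyOfRange(labels, i - 1, labels.length));
            }
        }
        // No rule matched: the default rule "*" makes the last label the suffix.
        if (labels.length < 2) {
            return null;
        }
        return String.join(".",
                Arrays.copyOfRange(labels, labels.length - 2, labels.length));
    }

    public static void main(String[] args) {
        System.out.println(registrableDomain("test.com.ar", SAMPLE_RULES));          // test.com.ar
        System.out.println(registrableDomain("test.appspot.com", SAMPLE_RULES));     // test.appspot.com
        System.out.println(registrableDomain("www.test.appspot.com", SAMPLE_RULES)); // test.appspot.com
    }
}
```

With this reading, "appspot.com" on its own yields null, because it is a public suffix with no additional label.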
I understand the intended behaviour of getDomainRoot based on the test
suite is to return simply "appspot.com" i.e. the public suffix
(https://github.com/apache/httpcomponents-client/blob/master/httpclient5/src/test/java/org/apache/hc/client5/http/psl/TestPublicSuffixMatcher.java#L61).
Is it possible to provide a new method, or overload the existing
method, to return the registrable domain according to the formal
algorithm regardless of ICANN vs PRIVATE?
Happy to provide more context if needed.
Hi Austin
Firstly, the point of having private domains in the PSL is to have them
treated differently from ICANN domains.
Secondly, there is nothing stopping you from having a custom PSL source,
for instance, without the PRIVATE section, or some entries moved from
PRIVATE to ICANN.
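As a rough sketch of the first option, the PRIVATE section could be stripped from a local copy of the list before it is handed to the matcher. The class and method names below are hypothetical; the only thing taken from the list itself is the pair of section marker comments used in the published public_suffix_list.dat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PslSectionFilterSketch {

    // Section markers as they appear in the published list file.
    static final String BEGIN_PRIVATE = "// ===BEGIN PRIVATE DOMAINS===";
    static final String END_PRIVATE = "// ===END PRIVATE DOMAINS===";

    /**
     * Returns the input lines with the PRIVATE section removed, markers
     * included. Lines outside that section are kept unchanged.
     */
    static List<String> withoutPrivateSection(List<String> lines) {
        List<String> kept = new ArrayList<>();
        boolean inPrivate = false;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.equals(BEGIN_PRIVATE)) {
                inPrivate = true;
                continue;
            }
            if (trimmed.equals(END_PRIVATE)) {
                inPrivate = false;
                continue;
            }
            if (!inPrivate) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Tiny hand-written stand-in for the real list file.
        List<String> sample = Arrays.asList(
                "// ===BEGIN ICANN DOMAINS===",
                "com",
                "// ===END ICANN DOMAINS===",
                "// ===BEGIN PRIVATE DOMAINS===",
                "appspot.com",
                "// ===END PRIVATE DOMAINS===");
        withoutPrivateSection(sample).forEach(System.out::println);
    }
}
```

The filtered lines would then be written to a file and loaded as the custom PSL source. Note that this drops the private rules entirely; moving selected entries into the ICANN section instead would be a manual edit of the same file.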
If you would still like to have an option of overriding the default
behavior, do feel free to propose API changes as a PR at GitHub. If they
are non-intrusive and backward compatible I see no reason for not
including them.
Oleg
Thanks,
Austin
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@hc.apache.org
For additional commands, e-mail: dev-h...@hc.apache.org