On 18/04/2024 19:31, Austin Leirvik wrote:
Hi dev team,

I have noticed that the PublicSuffixMatcher.getDomainRoot method does
not always return the expected registrable domain according to the
Public Suffix List formal algorithm
(https://github.com/publicsuffix/list/wiki/Format#formal-algorithm).

The result returned by getDomainRoot depends on whether the suffix is
in the ICANN or PRIVATE section of the list. For example:

ICANN:   publicSuffixMatcher.getDomainRoot("test.com.ar") --> test.com.ar
PRIVATE: publicSuffixMatcher.getDomainRoot("test.appspot.com") --> appspot.com

From my understanding, the formal algorithm does not distinguish
between ICANN and PRIVATE suffixes. According to the formal algorithm,
"the registered or registrable domain is the public suffix plus one
additional label". appspot.com is a public suffix, so the registrable
domain would be "test.appspot.com".
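
For illustration, the formal algorithm quoted above can be sketched in a
few lines of plain Java. This is not the HttpClient implementation: the
class name, the method name and the tiny hard-coded rule list are all
assumptions made for the example, and a real implementation would load
the full Public Suffix List.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of the PSL formal algorithm; not part of HttpClient.
final class RegistrableDomainSketch {

    // Tiny illustrative excerpt of rules (assumption), not the full list.
    static final List<String> RULES =
            List.of("com", "ar", "com.ar", "appspot.com", "*.ck", "!www.ck");

    // Returns the registrable domain, or null if the input is itself a public suffix.
    static String registrableDomain(String domain, List<String> rules) {
        String[] labels = domain.toLowerCase(Locale.ROOT).split("\\.");
        int suffixLen = 1; // if no rule matches, the prevailing rule is "*"
        for (String rule : rules) {
            boolean exception = rule.startsWith("!");
            String[] ruleLabels = (exception ? rule.substring(1) : rule).split("\\.");
            if (!matches(labels, ruleLabels)) {
                continue;
            }
            if (exception) {
                // An exception rule prevails; its leftmost label is dropped.
                suffixLen = ruleLabels.length - 1;
                break;
            }
            // Otherwise the matching rule with the most labels prevails.
            suffixLen = Math.max(suffixLen, ruleLabels.length);
        }
        if (labels.length <= suffixLen) {
            return null; // no additional label beyond the public suffix
        }
        // Registrable domain = public suffix plus one additional label.
        return String.join(".",
                Arrays.copyOfRange(labels, labels.length - suffixLen - 1, labels.length));
    }

    // A rule matches when its labels equal the domain's rightmost labels ("*" matches any label).
    private static boolean matches(String[] labels, String[] ruleLabels) {
        if (ruleLabels.length > labels.length) {
            return false;
        }
        for (int i = 1; i <= ruleLabels.length; i++) {
            String r = ruleLabels[ruleLabels.length - i];
            String d = labels[labels.length - i];
            if (!r.equals("*") && !r.equals(d)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(registrableDomain("test.com.ar", RULES));      // test.com.ar
        System.out.println(registrableDomain("test.appspot.com", RULES)); // test.appspot.com
        // With only the "com" rule, appspot.com is no longer a public suffix:
        System.out.println(registrableDomain("test.appspot.com", List.of("com"))); // appspot.com
    }
}
```

The last call shows how the result changes when appspot.com is not
treated as a public suffix, which corresponds to the appspot.com result
reported above.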

I understand the intended behaviour of getDomainRoot based on the test
suite is to return simply "appspot.com" i.e. the public suffix
(https://github.com/apache/httpcomponents-client/blob/master/httpclient5/src/test/java/org/apache/hc/client5/http/psl/TestPublicSuffixMatcher.java#L61).

Is it possible to provide a new method, or overload the existing
method, to return the registrable domain according to the formal
algorithm regardless of ICANN vs PRIVATE?

Happy to provide more context if needed.


Hi Austin

Firstly, the point of having private domains in the PSL is to have them treated differently from ICANN domains.

Secondly, there is nothing stopping you from having a custom PSL source, for instance, one without the PRIVATE section, or with some entries moved from PRIVATE to ICANN.
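
As a rough sketch of the first option, the PRIVATE section can be stripped from a copy of the list before it is handed to the matcher. The class and method names below are hypothetical, and the code assumes the section is delimited by the "// ===BEGIN PRIVATE DOMAINS===" and "// ===END PRIVATE DOMAINS===" marker lines used in the published list file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of HttpClient.
final class PslSectionFilter {

    // Section markers as they appear in the published list (assumed unchanged).
    static final String BEGIN_PRIVATE = "// ===BEGIN PRIVATE DOMAINS===";
    static final String END_PRIVATE = "// ===END PRIVATE DOMAINS===";

    // Returns the input lines with everything between the PRIVATE markers removed.
    static List<String> withoutPrivateSection(List<String> lines) {
        List<String> kept = new ArrayList<>();
        boolean inPrivate = false;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.equals(BEGIN_PRIVATE)) {
                inPrivate = true;   // start skipping, marker itself is dropped
                continue;
            }
            if (trimmed.equals(END_PRIVATE)) {
                inPrivate = false;  // stop skipping, marker itself is dropped
                continue;
            }
            if (!inPrivate) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

The filtered lines could then be written to a local file and used as the custom PSL source in place of the default list.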

If you would still like to have an option of overriding the default behavior, do feel free to propose API changes as a PR at GitHub. If they are non-intrusive and backward compatible, I see no reason for not including them.

Oleg


Thanks,
Austin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@hc.apache.org
For additional commands, e-mail: dev-h...@hc.apache.org

