[ https://issues.apache.org/jira/browse/HTTPCORE-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709024#comment-17709024 ]
Julian Reschke edited comment on HTTPCORE-739 at 4/5/23 5:43 PM:
-----------------------------------------------------------------

That is true. However, from the URI and HTTP specs' point of view, the query part is just an opaque string without any structure. You correctly point out that the URI spec does not mention "+". But the same is true for "=" and "&".

Like it or not, the syntactical structure of the query part as commonly used on the web is defined by the way browsers support form data with GET. That's why it's defined over there.

I'd absolutely agree with you if you said that this is weird and undesirable, but this is how it is right now.

was (Author: reschke):
    That is true. However, from the URI and HTTP specs' point of view, the query part is just an opaque string without any structure. You correctly point out that the URI spec does not mention "+". But the same is true for "=" and "&".

Like it or not, the syntactical structure used on the web is defined by the way browsers support form data with GET. That's why it's defined over there.

I'd absolutely agree with you if you said that this is weird and undesirable, but this is how it is right now.

> org.apache.hc.core5.net.URIBuilder does not decode plus characters (`+`) in the query part
> ------------------------------------------------------------------------------------------
>
>                 Key: HTTPCORE-739
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-739
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>          Components: HttpCore
>    Affects Versions: 5.2.1
>            Reporter: Andreas Loth
>            Priority: Major
>
> Currently, when decoding the query part of a URL, a plus sign is kept as a plus sign in the decoded name-value pairs.
> Expected would be that a plus sign is decoded to a space.
> https://www.w3.org/Addressing/URL/uri-spec.html
> > Within the query string, the plus sign is reserved as shorthand notation for a space. Therefore, real plus signs must be encoded.
> I'm perfectly fine with encoding space everywhere to %20 and the plus sign everywhere to %2B (in my experience this is the most unambiguous and least error-prone way to handle these characters). See HTTPCORE-628
> However, during decoding the position of the plus sign has to be respected: decode it to a space in the query part but leave it as a plus everywhere else.
> Test case for decoding:
> {noformat}
> * URL: https://example.org/abc/plus-+_enc-space-%20_enc-plus-%2B_/def?test=plus-+_enc-space-%20_enc-plus-%2B_&plus-+_enc-space-%20_enc-plus-%2B_=test
> * path: /abc/plus-+_enc-space- _enc-plus-+_/def
> * get argument 1 name: test
> * get argument 1 value: plus- _enc-space- _enc-plus-+_
> * get argument 2 name: plus- _enc-space- _enc-plus-+_
> * get argument 2 value: test
> {noformat}
> Test case for encoding:
> {noformat}
> * path: /abc/plus-+_space- _/def
> * get argument 1 name: test
> * get argument 1 value: plus-+_space- _
> * get argument 2 name: plus-+_space- _
> * get argument 2 value: test
> * URL: https://example.org/abc/plus-%2B_space-%20_/def?test=plus-%2B_space-%20_&plus-%2B_space-%20_=test
> {noformat}
> Potential fix (untested):
> https://github.com/apache/httpcomponents-core/blob/86ccd9b58ecc39ac5496af012a5decb33203ea1e/httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java#L410
> Change the value of the `plusAsBlank` argument from `false` to `true`.
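
For illustration, the position-dependent decoding requested in the issue can be sketched with the JDK alone. This is not URIBuilder's implementation: the class and method names below are hypothetical, and the sketch relies on java.net.URLDecoder (Java 10 or later for the Charset overload), which follows the form-data convention of treating "+" as a space. It assumes the raw component has already been split out of the URI.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpCore.
class PlusDecodingSketch {

    // Path: percent-decode only. URLDecoder would turn '+' into a space,
    // so literal plus signs are protected as %2B before decoding.
    static String decodePath(String raw) {
        return URLDecoder.decode(raw.replace("+", "%2B"), StandardCharsets.UTF_8);
    }

    // Query name or value under the form-data convention: '+' means space,
    // and %2B is the only way to carry a real plus sign.
    static String decodeQueryComponent(String raw) {
        return URLDecoder.decode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same raw text as in the decoding test case of the issue.
        String raw = "plus-+_enc-space-%20_enc-plus-%2B_";
        System.out.println("[" + decodePath(raw) + "]");           // [plus-+_enc-space- _enc-plus-+_]
        System.out.println("[" + decodeQueryComponent(raw) + "]"); // [plus- _enc-space- _enc-plus-+_]
    }
}
```

The same raw text decodes differently depending on where it sits in the URI, which is the behaviour the decoding test case in the issue asks for.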