[
https://issues.apache.org/jira/browse/HTTPCLIENT-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ahmed updated HTTPCLIENT-2395:
------------------------------
Description:
Hi team,
I recently upgraded Apache HttpClient from 5.3.1 to the newest version (5.5),
and one of the tests in my client-side service detected an issue. The issue
appears when forming an HTTP multipart request with attachments/inlines whose
filenames contain non-ASCII characters.
Example:
{code:java}
val attachment: Part? = mimeMessage.attachments.firstOrNull()
val multipart = MultipartEntityBuilder.create()
multipart.setMode(HttpMultipartMode.EXTENDED)
multipart.addBinaryBody(
    "attachments",
    attachment?.openDataStream()?.use { it.readBytes() },
    ContentType.parse(attachment?.contentType),
    attachment?.name
)
val httpPost = HttpPost(url())
// Build the entity once, when attaching it to the request.
httpPost.entity = multipart.build()
httpClient.execute(httpPost) { it.handleResponse() }
{code}
From the given MIME message:
{code:java}
Content-Type: multipart/alternative;
 boundary="------------705ZF0wSwOSffEDi6dR6B0hC"
Message-ID: <[email protected]>
From: "🌪️ R@nd0M ユーザー" <[email protected]>
To: "Tēst 🎯 Üser" <[email protected]>
Subject: =?UTF-8?B?Rml4IG1l?=

--------------705ZF0wSwOSffEDi6dR6B0hC
Content-Type: text/html

<p> HTML </p>
--------------705ZF0wSwOSffEDi6dR6B0hC
Content-Type: application/octet-stream; name="ติมเงินผิดเบอร์mPayเ.xlsx"
Content-Disposition: inline; filename="ติมเงินผิดเบอร์mPayเ.xlsx"
Content-Transfer-Encoding: base64

iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+P+/HgAFhAJ/wlseKgAAAABJRU5ErkJggg==
--------------705ZF0wSwOSffEDi6dR6B0hC--
{code}
This generates an HTTP request with the following problematic percent-encoded
part:
{code:java}
Content-Disposition: form-data; name="attachments";
filename="%F0%9F%90%99_inline-%E5%9B%BE%E5%83%8F_%E6%96%87%E4%BB%B6.png";
filename*="UTF-8''UTF-8%27%27%25F0%259F%2590%2599_inline-%25E5%259B%25BE%25E5%2583%258F_%25E6%2596%2587%25E4%25BB%25B6.png"
Content-Type: image/png
{code}
filename* gets percent-encoded twice, so the value carries a second UTF-8''
prefix (itself encoded as UTF-8%27%27). The expected value is:
{code:java}
Content-Disposition: form-data; name="attachments";
filename="%F0%9F%90%99_inline-%E5%9B%BE%E5%83%8F_%E6%96%87%E4%BB%B6.png";
filename*="UTF-8''%F0%9F%90%99_inline-%E5%9B%BE%E5%83%8F_%E6%96%87%E4%BB%B6.png"
Content-Type: image/png
{code}
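The double encoding can be illustrated without HttpClient at all. The following is only a minimal sketch, not the library's code: the class DoubleEncodingSketch and the helpers percentEncode and extendedValue are hypothetical names, and the encoder assumes that only RFC 3986 unreserved characters are left untouched. It shows what happens when an already prefixed and encoded value is passed through the same step a second time.

```java
import java.nio.charset.StandardCharsets;

public class DoubleEncodingSketch {

    // Hypothetical helper: percent-encode every UTF-8 byte that is not an
    // RFC 3986 unreserved character (ALPHA, DIGIT, '-', '.', '_', '~').
    static String percentEncode(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                out.append((char) c);
            } else {
                out.append('%').append(String.format("%02X", c));
            }
        }
        return out.toString();
    }

    // Hypothetical helper: build an extended parameter value in the
    // charset'language'encoded-text form, with an empty language tag.
    static String extendedValue(String value) {
        return "UTF-8''" + percentEncode(value);
    }

    public static void main(String[] args) {
        // The filename seen in the generated header, written with escapes.
        String name = "\uD83D\uDC19_inline-\u56FE\u50CF_\u6587\u4EF6.png";

        // Encoded once: the form the header is expected to carry.
        String once = extendedValue(name);
        System.out.println(once);

        // Encoded twice: the quotes and percent signs of the first pass are
        // themselves escaped, and a second prefix is added in front.
        String twice = extendedValue(once);
        System.out.println(twice);
    }
}
```

Under these assumptions the second printed line has the same shape as the filename* value in the generated request: the inner prefix shows up as UTF-8%27%27 and every % becomes %25.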
I suspect that the problem lies
[here|https://github.com/apache/httpcomponents-client/blob/3eda5098f82c0d5cf1ceaa72afb1c24d9836ff56/httpclient5/src/main/java/org/apache/hc/client5/http/entity/mime/HttpRFC7578Multipart.java#L104],
where an additional UTF-8'' prefix is prepended to the filename, on top of the
one already added while building the multipart part itself
[here|https://github.com/apache/httpcomponents-client/blob/3eda5098f82c0d5cf1ceaa72afb1c24d9836ff56/httpclient5/src/main/java/org/apache/hc/client5/http/entity/mime/FormBodyPartBuilder.java#L164].
The problem can be avoided by using LEGACY mode, which does not look like an
ideal solution to me, as it does not support UTF-8 headers such as the From or
To MIME headers.
Related JIRA: https://issues.apache.org/jira/browse/HTTPCLIENT-2360
> Non-ASCII filename corrupted in HTTP request
> --------------------------------------------
>
> Key: HTTPCLIENT-2395
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2395
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient (classic)
> Affects Versions: 5.5
> Environment: Ubuntu 24.04
> Reporter: Ahmed
> Priority: Minor
> Labels: bug
> Fix For: 5.4.4