Hello,

I'm writing a unit test to simulate the behavior of different browsers
during multipart file upload when the filenames contain accented letters.
On the server side we have noticed that some browsers (such as Chrome)
send precomposed characters (a single code point for the letter plus its
accent, for example "é" as \u00e9) and others (such as Firefox) send the
base letter followed by a combining diacritical ("e" + \u0301).
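To make the two forms concrete, here is a minimal sketch using only the
JDK's java.text.Normalizer. The class name AccentForms and its helper
methods are hypothetical and not part of my test; the sketch only shows
that the two spellings of "é" are different strings until they are
normalized to the same form.

```java
import java.text.Normalizer;

public class AccentForms {
    // NFC: merge a base letter and its combining accent into one code point where possible.
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // NFD: split a precomposed letter into base letter + combining accent.
    static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        String composed = "\u00e9";     // "é" as a single code point
        String decomposed = "e\u0301";  // "e" followed by the combining acute accent

        // The raw strings differ, both in content and in length.
        System.out.println(composed.equals(decomposed));              // false
        System.out.println(composed.length() + " vs " + decomposed.length()); // 1 vs 2

        // After normalizing to a common form they compare equal.
        System.out.println(compose(decomposed).equals(composed));     // true
        System.out.println(decompose(composed).equals(decomposed));   // true
    }
}
```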

I'm having difficulty simulating this with HttpClient because, when the
actual multipart entity is sent to the server, the filename of each part
is encoded in ASCII.  It seems likely that I need to set a charset
somewhere, but I don't see where.  I do set the content type for each
part (such as "application/octet-stream", which has no character set, or
a user-supplied text file in some arbitrary encoding, which may differ
from one part to another), but surely the character encoding of a part's
content is unrelated to the character encoding of its filename?

Below is the relevant part of the test code (I hope it's readable; the
ByteArrayBody constructor seems to be where the issue lies), followed by
the HTTP wire log.

Thanks in advance.

--
Christopher

// standalone accent characters (grave, acute) without letters
final String standaloneAccents = "a\u0060b\u00b4c.txt";
// letters followed by combining diacriticals
final String withDecomposition = "ae\u0300be\u0301c.txt";
// letters combined with accents in a single code point
final String withComposition = "a\u00e8b\u00e9c.txt";
final String[] fileNames = new String[]{
    standaloneAccents,
    withDecomposition,
    withComposition
};

try
{
    // create data for a small dummy file
    byte[] data = new byte[64];
    try (RandomInputStream ris = new RandomInputStream(data.length))
    {
        for (int i = 0; i < data.length; i++)
        {
            data[i] = (byte)ris.read();
        }
    }

    final HttpPost post = new HttpPost(toUrl(svc));
    final MultipartEntityBuilder entityBuilder =
        MultipartEntityBuilder.create().setMode(BROWSER_COMPATIBLE);

    // one part per filename; the filename is passed to the ByteArrayBody constructor
    for (int i = 0; i < fileNames.length; i++)
    {
        final ByteArrayBody body = new ByteArrayBody(data,
            ContentType.create("application/octet-stream"), fileNames[i]);
        entityBuilder.addPart("f" + i, body);
    }

    post.setEntity(entityBuilder.build());

    final HttpResponse resp = getClient().execute(post);
    // rest of test omitted for brevity
}
catch (IOException e)
{
    fail(e.getMessage());
}


10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >> "POST
/test/U201/test011 HTTP/1.1[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"Content-Length: 671[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"Content-Type: multipart/form-data;
boundary=U0TBOWZi2jfYaKvP-BIB6VgFyhY4jgtX2[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >> "Host:
localhost:8088[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"Connection: Keep-Alive[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"User-Agent: Apache-HttpClient/4.3.5 (java 1.5)[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"--U0TBOWZi2jfYaKvP-BIB6VgFyhY4jgtX2[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"Content-Disposition: form-data; name="f0"; filename="a`b?c.txt"[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"Content-Type: application/octet-stream[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"0S3[0xe4][0x8f]|[0xc8][0xeb]s[0xb6]"[0x86]~[0xc5][0x92]_[0xb0][0xa1][0xf2]e"[0x16]*[0x9b][0xdc]a[0xaa][0x18][0xc9][0xe5][0xd4][3p$([
0xc3]r[0xcc]G[0x97][0x8f]tOFD[0xa6]=[0xb5][0xec]{[0x6][0xbd][0x9b][0x1b]w[0xb9][0x1][0xa5][0xa8]m[0x89]=[0x13][\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"--U0TBOWZi2jfYaKvP-BIB6VgFyhY4jgtX2[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"Content-Disposition: form-data; name="f1"; filename="ae?be?c.txt"[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"Content-Type: application/octet-stream[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"0S3[0xe4][0x8f]|[0xc8][0xeb]s[0xb6]"[0x86]~[0xc5][0x92]_[0xb0][0xa1][0xf2]e"[0x16]*[0x9b][0xdc]a[0xaa][0x18][0xc9][0xe5][0xd4][3p$([
0xc3]r[0xcc]G[0x97][0x8f]tOFD[0xa6]=[0xb5][0xec]{[0x6][0xbd][0x9b][0x1b]w[0xb9][0x1][0xa5][0xa8]m[0x89]=[0x13][\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"--U0TBOWZi2jfYaKvP-BIB6VgFyhY4jgtX2[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"Content-Disposition: form-data; name="f2"; filename="a?b?c.txt"[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"Content-Type: application/octet-stream[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"[\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"0S3[0xe4][0x8f]|[0xc8][0xeb]s[0xb6]"[0x86]~[0xc5][0x92]_[0xb0][0xa1][0xf2]e"[0x16]*[0x9b][0xdc]a[0xaa][0x18][0xc9][0xe5][0xd4][3p$([
0xc3]r[0xcc]G[0x97][0x8f]tOFD[0xa6]=[0xb5][0xec]{[0x6][0xbd][0x9b][0x1b]w[0xb9][0x1][0xa5][0xa8]m[0x89]=[0x13][\r][\n]"
10:16:59.424 [main] DEBUG org.apache.http.wire - http-outgoing-16 >>
"--U0TBOWZi2jfYaKvP-BIB6VgFyhY4jgtX2--[\r][\n]"
10:17:00.602 [main] DEBUG org.apache.http.wire - http-outgoing-16 <<
"HTTP/1.1 200 OK[\r][\n]"
10:17:00.602 [main] DEBUG org.apache.http.wire - http-outgoing-16 << "Date:
Thu, 12 Feb 2015 09:16:59 GMT[\r][\n]"
10:17:00.602 [main] DEBUG org.apache.http.wire - http-outgoing-16 <<
"Allow: POST[\r][\n]"
10:17:00.602 [main] DEBUG org.apache.http.wire - http-outgoing-16 <<
"Content-Type: text/plain; charset=UTF-8[\r][\n]"
10:17:00.602 [main] DEBUG org.apache.http.wire - http-outgoing-16 <<
"Content-Length: 32[\r][\n]"
10:17:00.602 [main] DEBUG org.apache.http.wire - http-outgoing-16 <<
"[\r][\n]"
10:17:00.602 [main] DEBUG org.apache.http.wire - http-outgoing-16 <<
"a`b?c.txt[\n]"
10:17:00.602 [main] DEBUG org.apache.http.wire - http-outgoing-16 <<
"ae?be?c.txt[\n]"
10:17:00.602 [main] DEBUG org.apache.http.wire - http-outgoing-16 <<
"a?b?c.txt[\n]"
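
The "?" characters in the logged filenames are consistent with the names
being converted to US-ASCII bytes, since Java substitutes "?" for any
character the charset cannot represent. The sketch below reproduces that
effect with the JDK alone; it is an assumption about what happens inside
the library, not a trace of HttpClient's code, and the class and method
names (AsciiFilenameDemo, roundTrip) are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class AsciiFilenameDemo {
    // Encode a name to bytes in the given charset and decode it again.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte, which is '?' for US-ASCII.
    static String roundTrip(String name, Charset charset) {
        return new String(name.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String[] fileNames = {
            "a\u0060b\u00b4c.txt",    // standalone grave and acute accent characters
            "ae\u0300be\u0301c.txt",  // letters followed by combining diacriticals
            "a\u00e8b\u00e9c.txt"     // precomposed accented letters
        };

        for (String name : fileNames) {
            // Non-ASCII characters are lost when going through US-ASCII...
            System.out.println(roundTrip(name, StandardCharsets.US_ASCII));
            // ...but survive unchanged when going through UTF-8.
            System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name));
        }
    }
}
```

With US-ASCII the three names come out as "a`b?c.txt", "ae?be?c.txt" and
"a?b?c.txt", matching the Content-Disposition headers in the log, so the
filenames would need to be written in a charset such as UTF-8 for either
accent form to reach the server intact.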
