[ 
https://issues.apache.org/jira/browse/LUCENE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17499423#comment-17499423
 ] 

Dawid Weiss commented on LUCENE-10447:
--------------------------------------

Just to clarify - I have implemented option (1) in the past and the only 
reliable way to "detect" the default encoding used by a forked Java process 
(from within randomized tests) was to fork a Java subprocess that echoes back 
the name of its default charset as bytes:
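{code:java}
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Nested class (hence static); its main() runs in the forked JVM.
public static class CharsetName {
  public static void main(String[] args) throws IOException {
    // Emit the name as raw UTF-8 bytes so the parent can decode it
    // without knowing the child's default encoding.
    String name = Charset.defaultCharset().name();
    System.out.write(name.getBytes(StandardCharsets.UTF_8));
    System.out.flush();
  }
}
{code}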
then read it on the test infrastructure side and use it for follow-up 
command/script execution.
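
For illustration, the reading side could look roughly like the sketch below. This is not the code from the earlier implementation. The names ForkedCharsetProbe, parseCharsetName and detectForkedCharset are made up for this example, and it assumes Java 9+ (for InputStream#readAllBytes) and that the child JVM can be launched with the same classpath as the parent.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
public class ForkedCharsetProbe {

  /** Entry point of the forked JVM: echoes its default charset name as UTF-8 bytes. */
  public static class CharsetName {
    public static void main(String[] args) throws IOException {
      String name = Charset.defaultCharset().name();
      System.out.write(name.getBytes(StandardCharsets.UTF_8));
      System.out.flush();
    }
  }

  /** Decodes the bytes echoed by the child and resolves them to a Charset. */
  public static Charset parseCharsetName(byte[] echoed) {
    // The child always writes UTF-8, so decoding here does not depend on
    // either JVM's default encoding. trim() is only defensive.
    String name = new String(echoed, StandardCharsets.UTF_8).trim();
    return Charset.forName(name);
  }

  /** Forks a JVM running CharsetName and returns the charset it reports. */
  public static Charset detectForkedCharset() throws IOException, InterruptedException {
    // Assumption: reuse the parent's java binary and classpath for the child.
    String java = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
    ProcessBuilder pb = new ProcessBuilder(
        java,
        "-cp", System.getProperty("java.class.path"),
        CharsetName.class.getName());
    // Keep stderr out of the pipe we read the charset name from.
    pb.redirectError(ProcessBuilder.Redirect.INHERIT);

    Process process = pb.start();
    byte[] echoed;
    try (InputStream in = process.getInputStream()) {
      echoed = in.readAllBytes();
    }
    int exitCode = process.waitFor();
    if (exitCode != 0) {
      throw new IOException("Charset probe exited with code " + exitCode);
    }
    return parseCharsetName(echoed);
  }
}
```

The Charset returned by detectForkedCharset() could then be used in place of the hard-coded StandardCharsets.US_ASCII when reading the process-*.out file, provided the script under test is launched in the same environment as the probe.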

> Charset issue in TestScripts#testLukeCanBeLaunched()
> ----------------------------------------------------
>
>                 Key: LUCENE-10447
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10447
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: luke
>            Reporter: Lu Xugang
>            Assignee: Dawid Weiss
>            Priority: Minor
>         Attachments: 1.png, process-10536545874299101128.out
>
>
> When TestScripts#testLukeCanBeLaunched() runs, a temp file is created at 
> lucene/distribution.tests/build/tmp/tests-tmp/process-*.out. Depending on the 
> operating system language, this file may contain non-US-ASCII content, and an 
> exception is then thrown because the test later reads the file with 
> StandardCharsets.US_ASCII.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
