Consider the following MSVC program:
--------------------- cut -------------------------
// PruebaOpenDlg.cpp : Defines the entry point for the console application.
//

#include <stdio.h>
#include <stdlib.h>
#include <string.h>     // strerror, memset
#include <errno.h>

#include <windows.h>


int main(int argc, char* argv[])
{
   OPENFILENAME ofn;       // common dialog box structure
   char szFile[260];       // buffer for file name

   // Initialize OPENFILENAME
   ZeroMemory(&ofn, sizeof(OPENFILENAME));
   ofn.lStructSize = sizeof(OPENFILENAME);
   ofn.hwndOwner = NULL;
   ofn.lpstrFile = szFile;
   ofn.nMaxFile = sizeof(szFile);
   ofn.lpstrFilter = "All\0*.*\0Text\0*.TXT\0";
   ofn.nFilterIndex = 1;
   ofn.lpstrFileTitle = NULL;
   ofn.nMaxFileTitle = 0;
   ofn.lpstrInitialDir = NULL;
//    ofn.Flags = OFN_PATHMUSTEXIST | OFN_FILEMUSTEXIST;

   // Clear the buffer so the dialog does not start from garbage.
   memset(szFile, 0, sizeof(szFile));

   // Display the Open dialog box.
   if (GetOpenFileName(&ofn)==TRUE) {
       char * p;
       FILE * hFile;

       printf("Chosen filename is: %s\n", ofn.lpstrFile);
       printf("Byte encoding is  :");
       for (p = ofn.lpstrFile; *p; p++) {
           printf(" (%c %02x)", *p, *p);
       }
       printf("\n");

       hFile = fopen(ofn.lpstrFile, "rb");
       if (hFile != NULL) {
           fclose(hFile);
           puts("File is readable through specified filename");
       } else {
           printf("Unable to reach file through %s - %s\n",
               ofn.lpstrFile, strerror(errno));
       }
   }
   return 0;
}
--------------------- cut -------------------------

Consider also the following Linux environment: the home directory is /home/alex, and it is mapped to drive F: in dosdevices. The home directory contains a directory named gatón (the name contains a [U+00F3 LATIN SMALL LETTER O WITH ACUTE] and is UTF-8 encoded as 0x67 0x61 0x74 0xC3 0xB3 0x6E), inside of which there is a sample file to be selected through the Open File dialog. All tests were made on a Fedora Core 4 system with a *default* LANG=es_EC.UTF-8.

The symptom is that, when wine runs with a UTF-8 locale (as specified with the LANG environment variable), and an attempt is made to choose a filename that is UTF-8 encoded in the filesystem, GetOpenFileNameA may return a byte string that CreateFile and other file functions are unable to map into a valid filename. Whether GetOpenFileNameA returns a valid filename or not seems to depend on the way the navigation is performed. If the application starts the Open File dialog from the current directory, and the user navigates by directory change only, the invalid filename is returned. However, if the user first chooses a drive letter (such as F:) and then navigates from there, the filename returned is a valid one.

The following tests illustrate the behavior. For each entry, the first two lines are the conditions for the test. The remaining three lines are the actual output from the supplied program, copied and pasted from the console. The instances of
\uffff seen are from invalid character encodings displayed in the console.

LANG=en_US
From current directory /home/alex:
Chosen filename is: f:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (f 66) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename

LANG=en_US
From explicit choice from drive F: :
Chosen filename is: F:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (F 46) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename

LANG=es_EC
From current directory /home/alex:
Chosen filename is: f:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (f 66) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename

LANG=es_EC
From explicit choice from drive F: :
Chosen filename is: F:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (F 46) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename

LANG=es_EC.UTF-8
From current directory /home/alex:
Chosen filename is: f:\gatón\Barenaked Ladies - One Week.mp3
Byte encoding is : (f 66) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff ffffffc3) (\uffff ffffffb3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
Unable to reach file through f:\gatón\Barenaked Ladies - One Week.mp3 - No such file or directory

LANG=es_EC.UTF-8
From explicit choice from drive F: :
Chosen filename is: F:\gat\uffffn\Barenaked Ladies - One Week.mp3
Byte encoding is : (F 46) (: 3a) (\ 5c) (g 67) (a 61) (t 74) (\uffff fffffff3) (n 6e) (\ 5c) (B 42) (a 61) (r 72) (e 65) (n 6e) (a 61) (k 6b) (e 65) (d 64) ( 20) (L 4c) (a 61) (d 64) (i 69) (e 65) (s 73) ( 20) (- 2d) ( 20) (O 4f) (n 6e) (e 65) ( 20) (W 57) (e 65) (e 65) (k 6b) (. 2e) (m 6d) (p 70) (3 33)
File is readable through specified filename
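
A side note on reading the dumps above: values such as ffffffc3 are the single byte 0xC3. The test program passes a plain char to printf with %02x, and where char is signed a byte above 0x7F is sign-extended when promoted to int. The following is only a sketch of a dump routine that avoids this by casting to unsigned char; format_bytes is a hypothetical helper name, not part of the test program.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes each byte of s into out as " xx".
 * Returns the number of characters written, or -1 if out is too small. */
static int format_bytes(const char *s, char *out, size_t outsize)
{
    size_t used = 0;

    assert(s != NULL && out != NULL && outsize > 0);
    out[0] = '\0';
    for (; *s; s++) {
        /* The cast prevents sign extension of bytes above 0x7F. */
        int n = snprintf(out + used, outsize - used, " %02x",
                         (unsigned char)*s);
        if (n < 0 || (size_t)n >= outsize - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

With this, the UTF-8 bytes of the directory name would come out as " 67 61 74 c3 b3 6e", matching the encoding given earlier.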

Case 5 (LANG=es_EC.UTF-8, navigating from the current directory) is incorrect, but it is the easiest one to hit in the UTF-8 locales.

This problem is significant because all Fedora distributions since at least Fedora Core 2 have UTF-8 support, which is probably enabled in non-US locales. Other popular distributions probably have this UTF-8 support enabled too.

I am posting this on wine-devel instead of creating a bug report because I wanted to receive some comments on what the expected behavior should be before trying to submit a patch myself. Unless somebody says otherwise, I would try to submit a patch that makes case 5 behave like case 6, by modifying the encoding of the ANSI string to match what the file-open functions would expect for the filename.

However, this essentially requires an answer to the following question: should non-Unicode strings that represent filenames be UTF-8 encoded, or locale encoded? In the UTF-8 locales, GetOpenFileNameA sometimes seems to return UTF-8 encoded names, but the file open functions expect locale-encoded names (in my case, ISO-8859-1). Hence the incorrect behavior. How would the answer change (if at all) for Chinese or Japanese locales with a need for multibyte characters?
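
To make the difference between case 5 and case 6 concrete, here is a minimal sketch of the byte-level conversion involved, assuming the target encoding is ISO-8859-1 as in my setup. The function utf8_to_latin1 is a hypothetical helper written only for illustration; it is not Wine code. A real fix inside Wine would presumably go through the Unicode string and the ANSI code page (for example with MultiByteToWideChar and WideCharToMultiByte) rather than hard-coding one encoding, and that would also be what matters for multibyte locales.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts a UTF-8 string to ISO-8859-1.
 * Returns the output length, or -1 if the input is malformed, contains a
 * character outside U+0000..U+00FF, or does not fit in dst. */
static int utf8_to_latin1(const char *src, char *dst, size_t dstsize)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dstsize == 0)
        return -1;
    while (*s) {
        unsigned char c;
        if (*s < 0x80) {
            c = *s++;                       /* plain ASCII byte */
        } else if ((*s == 0xC2 || *s == 0xC3) && (s[1] & 0xC0) == 0x80) {
            /* two-byte sequence encoding U+0080..U+00FF */
            c = (unsigned char)(((s[0] & 0x03) << 6) | (s[1] & 0x3F));
            s += 2;
        } else {
            return -1;                      /* not representable here */
        }
        if (n + 1 >= dstsize)
            return -1;                      /* no room for byte + NUL */
        dst[n++] = (char)c;
    }
    dst[n] = '\0';
    return (int)n;
}
```

Applied to the directory name, the two bytes 0xC3 0xB3 seen in case 5 collapse into the single byte 0xF3 seen in case 6.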

Alex Villacís Lasso


