Hello,
I tried to transfer files with Chinese file names using
"globus-url-copy" but failed to do so. The error message is "error:
[globus_gass_copy_get_url_mode]: globus_url_parse returned error code:
-8 for url: <my file path>"
To see which part went wrong, I traced the source code of
globus-url-copy and found that the problem comes from
$gt_home/source-trees/common/source/library/globus_url.c.
This is a small piece of the code from globus_url_get_path() where
the problem occurs:
    if(isalnum((*stringp)[pos]) ||
       globusl_url_issafe((*stringp)[pos]) ||
       globusl_url_isextra((*stringp)[pos]) ||
       globusl_url_isscheme_special((*stringp)[pos]) ||
       (*stringp)[pos] == '~' || /* incorrect, but de facto */
       (*stringp)[pos] == '/' ||
       (*stringp)[pos] == ' ') /* to be nice */
    {
        pos++;
    }
The function "globus_url_get_path()" checks the validity of the path
before retrieving its substring. It only accepts a limited set of
ASCII characters and does not accept anything else. However, Chinese
file names are encoded in UTF-8, where each Chinese character is a
multi-byte sequence and every byte of such a sequence has its most
significant bit set to "1", so none of these bytes passes the checks
above. This is why Chinese file names did not work with
globus-url-copy.
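To illustrate the point, here is a small sketch (not taken from the
Globus sources; the function names are my own) that counts the bytes
of a string with the high bit set. It also shows one detail worth
noting: when "char" is signed, such bytes are negative, and passing a
negative value to isalnum() is undefined behaviour in C, so the byte
should be cast to unsigned char first.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if the byte is a plain ASCII letter or digit, 0 otherwise.
 * The cast to unsigned char avoids undefined behaviour in isalnum()
 * when char is signed and the byte is >= 0x80. */
static int byte_is_ascii_alnum(char c)
{
    unsigned char uc = (unsigned char) c;
    return uc < 0x80 && isalnum(uc) != 0;
}

/* Counts the bytes in s that have their most significant bit set,
 * i.e. bytes that can only occur inside a multi-byte UTF-8 sequence. */
static size_t count_non_ascii_bytes(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
    {
        if (((unsigned char) *s) & 0x80)
        {
            n++;
        }
    }
    return n;
}
```

For example, the two Chinese characters of a name such as
"\xe4\xb8\xad\xe6\x96\x87.txt" occupy six bytes in UTF-8, all of them
at or above 0x80, so count_non_ascii_bytes() reports 6 for it and 0
for a plain ASCII name such as "file.txt".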
I do not fully understand the purpose of the check above. It seems to
me that the rest of the function would work fine with non-ASCII
characters. So I am wondering whether it would be appropriate to let
the function accept them, so that UTF-8 strings can be used.
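What I have in mind could look roughly like the sketch below. This is
only an assumption about how such a relaxed check might be written,
not a patch against globus_url.c: the helper is_ascii_punct_allowed()
is a hypothetical stand-in for globusl_url_issafe(),
globusl_url_isextra() and globusl_url_isscheme_special(), whose real
character sets may differ, and the other function names are mine too.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for the globusl_url_* helpers; the real
 * character sets are defined in globus_url.c and may differ. */
static int is_ascii_punct_allowed(unsigned char c)
{
    /* strchr() also matches the terminating NUL, so exclude it. */
    return c != '\0' && strchr("$-_.+!*'(),;:@&=~/ ", c) != NULL;
}

/* Returns 1 if the byte may appear in a path, 0 otherwise.
 * Bytes >= 0x80 (part of a multi-byte UTF-8 sequence) are let through. */
static int path_byte_allowed(char c)
{
    unsigned char uc = (unsigned char) c;

    if (uc >= 0x80)
    {
        return 1;
    }
    return isalnum(uc) != 0 || is_ascii_punct_allowed(uc);
}

/* Returns 1 if every byte of path passes path_byte_allowed(). */
static int path_is_acceptable(const char *path)
{
    assert(path != NULL);
    for (; *path != '\0'; path++)
    {
        if (!path_byte_allowed(*path))
        {
            return 0;
        }
    }
    return 1;
}
```

With this, a path such as "/home/user/\xe4\xb8\xad\xe6\x96\x87.txt"
is accepted, while "/tmp/a<b" is still refused. Note that the sketch
only lets high-bit bytes through; it does not verify that they form
well-formed UTF-8.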
By the way, I think it is important for grid middleware such as
Globus to support multiple languages, since grid computing requires
global cooperation. If developers considered more than just ASCII, or
programmed in Unicode from the start, life would be much easier.
However, in my experience, most programs lack multi-language
support.
Any comments would be helpful. Thanks.
Hai-Ning
--
Hai-Ning Wu
Academia Sinica Grid Computing
Taipei, Taiwan
Email: [EMAIL PROTECTED]