On Thu, 23 May 2013 14:02:58 +0400 Stanislav Frolov <frolosof...@gmail.com> wrote:
> I have trouble with filename encoding on Linux (utf-8) and windows (cp866?).
>
> Examples
>
> There is one file in directory: "тест" (means "test" in Russian).
> (directory "*") => (#P"/path/to/ÑеÑÑ")
>
> Let's try to create a pathname from a Cyrillic UTF-8 filename:
> (pathname "тест")
> Error: Cannot coerce string тест to a base-string

Unfortunately, path and file name encodings are OS-specific, file-system-specific and may also be locale-specific. POSIX filenames are byte strings; on filesystems that allow it, those bytes often hold UTF-8, but that is only one of the available encodings. Filenames also cannot be tagged with their encoding (short of an uncommon convention such as the one RFC 2047 uses for message headers, or non-portable attributes/subfiles), so files named by others on their systems may not display correctly locally, even on the same OS and FS. However, because POSIX syscalls expect C strings, UTF-8 is popular when the various single-byte encodings are not used.

My Windows experience is limited, but I think that it usually uses UTF-16 where Unicode strings are possible.

ECL internally stores Unicode strings as UCS-4 (32 bits per character), and the base-string only accepts character codes 0-255.

This might not be the only or cleanest solution, but it might work to create UTF-8 pathnames for POSIX systems:

    (defun utf-8-base-string<-string (string)
      "Encodes the supplied STRING to an UTF-8 base-string which it returns."
      (let ((v (make-array (+ 5 (length string)) ; Best case but we might grow
                           :element-type 'base-char
                           :adjustable t
                           :fill-pointer 0)))
        (with-open-stream (s (ext:make-sequence-output-stream
                              v :external-format :utf-8))
          (loop
             for c across string
             do
               (write-char c s)
               ;; Keep at least 5 free slots for the next encoded character
               (let ((d (array-dimension v 0)))
                 (when (< (- d (fill-pointer v)) 5)
                   (adjust-array v (* 2 d))))))
        v))

    ; (pathname (utf-8-base-string<-string "тест")) -> #P"Ñ\202еÑ\201Ñ\202"

If you need more portable encoding conversion code, the Babel CL library also supports this (http://common-lisp.net/project/babel/).

--
Matt

_______________________________________________
Ecls-list mailing list
Ecls-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ecls-list
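As a cross-check on the bytes that end up in such a pathname, here is a dependency-free sketch of the UTF-8 encoding step in plain Common Lisp. The name utf-8-octets is hypothetical (it is not part of ECL or Babel), and the sketch assumes that char-code returns Unicode code points, as it does in a Unicode-enabled ECL. It does not validate surrogate code points.

```lisp
(defun utf-8-octets (string)
  "Return a list of the UTF-8 octets that encode STRING (hypothetical helper)."
  (loop for c across string
        for code = (char-code c)
        append (cond
                 ;; 1 byte: U+0000..U+007F
                 ((< code #x80)
                  (list code))
                 ;; 2 bytes: U+0080..U+07FF
                 ((< code #x800)
                  (list (logior #xC0 (ash code -6))
                        (logior #x80 (logand code #x3F))))
                 ;; 3 bytes: U+0800..U+FFFF
                 ((< code #x10000)
                  (list (logior #xE0 (ash code -12))
                        (logior #x80 (logand (ash code -6) #x3F))
                        (logior #x80 (logand code #x3F))))
                 ;; 4 bytes: U+10000..U+10FFFF
                 (t
                  (list (logior #xF0 (ash code -18))
                        (logior #x80 (logand (ash code -12) #x3F))
                        (logior #x80 (logand (ash code -6) #x3F))
                        (logior #x80 (logand code #x3F)))))))

;; The four characters of the test filename, given by code point
;; (U+0442 U+0435 U+0441 U+0442) to avoid source-encoding issues.
(utf-8-octets (map 'string #'code-char '(#x442 #x435 #x441 #x442)))
;; => (209 130 208 181 209 129 209 130)
```

Octets 130 and 129 are octal 202 and 201, which is where the \202 and \201 escapes in a printed pathname built from these bytes come from.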