Hi Krishnan,

> I am working on implementing
> pdf_status_t
> pdf_fsys_disk_get_parent (const pdf_text_t path_name,
>                           pdf_text_t parent_path);
> as part of FS#39.
>
> I have a (partial) patch that does the necessary string (ASCII)
> manipulation to get the canonicalized path name for the relative
> path supplied to the function.
> I would need some pointers on what unicode utility functions I could
> use to get this patch working for unicode filenames as well.

I'll review the patch in more detail in the coming days; just a small
comment for now regarding Unicode strings. The path is always in host
encoding:

 * On GNU systems, paths are all UTF-8 compatible and end in a single
   NUL byte, so the standard string utility functions (e.g. strlen())
   are enough. If your patch assumes an ASCII-encoded path, you very
   probably don't need to do anything specific to support UTF-8.

 * On win32, the path may come as wide chars, in UTF-16. Thus, you'll
   need to use the wide-char versions of the same string utility
   functions in the implementation (like wcslen()).

Another note regarding the fsys module API. Currently this API uses
pdf_text_t objects as input path arguments. This ends up being far from
optimal. Also, the pdf_text_t to host encoding conversion doesn't
include trailing NUL bytes by default, so an additional realloc() is
needed after the conversion to add the end-of-string NUL bytes.

I think it's reasonable to change the fsys API so that it always takes
host-encoded strings as input, represented by NUL-terminated
(pdf_char_t *) strings (or 2-NUL terminated wide-char strings on
Windows) instead of (pdf_text_t *). I'll be discussing this with the
list in the coming weeks.

Cheers,

-- 
Aleksander
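
P.S. To illustrate the UTF-8 point, here is a minimal sketch of a
byte-oriented "get parent" on a NUL-terminated host-encoded path. The
helper name, its signature and its return convention are made up for
this mail and are not part of the fsys API, and it assumes '/' is the
only separator. It relies on the fact that in UTF-8 every byte of a
multi-byte sequence has its high bit set, so a 0x2F byte can only be a
real '/'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the fsys API): writes the parent
   directory of a NUL-terminated UTF-8 path into out.
   Returns 0 on success, -1 if out is too small. */
static int
parent_of_utf8_path (const char *path, char *out, size_t out_size)
{
  const char *src;
  size_t len;

  assert (path != NULL && out != NULL);

  src = path;
  len = strlen (path);          /* length in bytes, not characters */

  /* Ignore trailing separators, but keep a lone leading "/". */
  while (len > 1 && path[len - 1] == '/')
    len--;

  /* Drop the last path component by scanning back to a separator.
     Plain byte comparison is safe: no UTF-8 multi-byte sequence
     contains the byte 0x2F. */
  while (len > 0 && path[len - 1] != '/')
    len--;

  if (len == 0)
    {
      /* No separator at all: the parent is the current directory. */
      src = ".";
      len = 1;
    }
  else
    {
      /* Drop the separator(s) themselves, unless this is the root. */
      while (len > 1 && path[len - 1] == '/')
        len--;
    }

  if (len + 1 > out_size)
    return -1;

  memcpy (out, src, len);
  out[len] = '\0';
  return 0;
}
```

With this sketch, "/tmp/ñandú/file.pdf" gives "/tmp/ñandú" without any
Unicode-aware code. On win32 the same scan would presumably be written
over wchar_t with the separator the platform uses, but I haven't
sketched that here.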
