Georg Baum <[EMAIL PROTECTED]> writes:

| This small patch makes most of plain text readable again (in utf8).
| 
| Questions:
| 
| 1) Is it on purpose that the functions in unicode.h convert only between
| std::vectors of characters and C strings, but not std::string/docstring? I
| think we should have variants for these as well. Or are we always supposed
| to use such constructs as in the patch?

So far I have only created what I needed. But even if we add more
convenience functions, we should be careful when adding them; we do not
want too many, IMHO.
 
| 2) Do we agree that we should use lyx::docstring for all internal methods
| that store parts of the document, i.e. change
| 
| std::string const TocBackend::Item::str() const;
| 
| to
| 
| lyx::docstring const TocBackend::Item::str() const;
| 
| and convert to utf8 where needed (in this case for plain text output)? Or
| should we not change the type, but use utf8 as encoding instead? I believe
| the former is safer.

This is one of the things I am thinking about, especially in relation
to gettext and l10n.

Should a call to gettext (_()) give us utf8 or ucs4? So far I am
inclined to go for utf8.

| Index: src/output_plaintext.C
| ===================================================================
| --- src/output_plaintext.C    (Revision 14695)
| +++ src/output_plaintext.C    (Arbeitskopie)
| @@ -232,8 +233,10 @@ void asciiParagraph(Buffer const & buf,
|                               "writeAsciiFile: NULL char in structure." << endl;
|                       break;
|  
| -             default:
| -                     word += c;
| +             default: {
| +                     std::vector<char> tmp = ucs4_to_utf8(c);
| +                     tmp.push_back('\0');
| +                     word += &tmp[0];

What is word? A std::string? If so, you can append the byte range
directly and skip the extra NUL terminator:

std::vector<char> tmp = ucs4_to_utf8(c);
word.append(tmp.begin(), tmp.end());
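
To make the difference concrete, here is a minimal, self-contained sketch
of the append approach. It assumes word is a std::string. The encoder
ucs4_to_utf8_sketch() is a hypothetical stand-in written only for this
example; the real ucs4_to_utf8() lives in unicode.h, and its implementation
is not shown in this thread. The helper name append_utf8() is likewise
made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for ucs4_to_utf8(c): encodes one UCS-4 code point
// as UTF-8 bytes. It does not reject surrogate code points.
std::vector<char> ucs4_to_utf8_sketch(char32_t c)
{
	assert(c <= 0x10FFFF); // outside the Unicode range otherwise
	std::vector<char> out;
	if (c < 0x80) {
		// 1 byte: 0xxxxxxx
		out.push_back(static_cast<char>(c));
	} else if (c < 0x800) {
		// 2 bytes: 110xxxxx 10xxxxxx
		out.push_back(static_cast<char>(0xC0 | (c >> 6)));
		out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
	} else if (c < 0x10000) {
		// 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
		out.push_back(static_cast<char>(0xE0 | (c >> 12)));
		out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
		out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
	} else {
		// 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
		out.push_back(static_cast<char>(0xF0 | (c >> 18)));
		out.push_back(static_cast<char>(0x80 | ((c >> 12) & 0x3F)));
		out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
		out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
	}
	return out;
}

// Append the UTF-8 bytes of c to word using the iterator-pair overload
// of std::string::append, so no temporary NUL terminator is needed.
void append_utf8(std::string & word, char32_t c)
{
	std::vector<char> const tmp = ucs4_to_utf8_sketch(c);
	word.append(tmp.begin(), tmp.end());
}
```

Besides saving the push_back('\0'), the iterator form copies exactly
tmp.size() bytes, so it does not depend on where a NUL byte happens to
sit in the buffer, which the C-string form (word += &tmp[0]) does.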

-- 
        Lgb
