Georg Baum <[EMAIL PROTECTED]> writes:
| This small patch makes most of plain text readable again (in utf8).
|
| Questions:
|
| 1) Is it on purpose that the functions in unicode.h convert only between
| std::vectors of characters and C strings, but not std::string/docstring? I
| think we should have variants for these as well. Or are we always supposed
| to use such constructs as in the patch?
So far I have only created what I needed. But even if we add more
convenience functions we should be careful when adding them; we do not
want too many, IMHO.
| 2) Do we agree that we should use lyx::docstring for all internal methods
| that store parts of the document, i.e. change
|
| std::string const TocBackend::Item::str() const;
|
| to
|
| lyx::docstring const TocBackend::Item::str() const;
|
| and convert to utf8 where needed (in this case for plain text output)? Or
| should we not change the type, but use utf8 as encoding instead? I believe
| the former is safer.
This is one of the things I am thinking about, especially in relation to
gettext and l10n.
Should a call to gettext (_()) give us utf8 or ucs4? So far I am
inclined to go for utf8.
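
To make the trade-off a bit more concrete: if _() returned utf8 and the
internal document strings were ucs4, the conversion would have to happen
at that boundary. A rough sketch of such a decoder follows. The name
utf8_to_ucs4_sketch and the use of std::u32string in place of
lyx::docstring are assumptions for illustration only, not what unicode.h
actually provides.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of unicode.h: decode UTF-8 into UCS-4.
// Only truncated sequences are caught (by the assert); other malformed
// input, such as stray continuation bytes, is not diagnosed.
std::u32string utf8_to_ucs4_sketch(std::string const & s)
{
	std::u32string out;
	std::size_t i = 0;
	while (i < s.size()) {
		unsigned char const lead = static_cast<unsigned char>(s[i]);
		char32_t c;
		std::size_t extra; // number of continuation bytes expected
		if (lead < 0x80)      { c = lead;        extra = 0; }
		else if (lead < 0xE0) { c = lead & 0x1F; extra = 1; }
		else if (lead < 0xF0) { c = lead & 0x0F; extra = 2; }
		else                  { c = lead & 0x07; extra = 3; }
		++i;
		// Fold in the continuation bytes, six payload bits each.
		for (; extra > 0 && i < s.size(); --extra, ++i)
			c = (c << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
		assert(extra == 0); // input ended in the middle of a sequence
		out.push_back(c);
	}
	return out;
}
```

With something like this, a translated message would be decoded once
where it enters the document, and everything behind that point would
only ever see ucs4.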
| Index: src/output_plaintext.C
| ===================================================================
| --- src/output_plaintext.C (Revision 14695)
| +++ src/output_plaintext.C (Arbeitskopie)
| @@ -232,8 +233,10 @@ void asciiParagraph(Buffer const & buf,
| "writeAsciiFile: NULL char in structure." << endl;
| break;
|
| - default:
| - word += c;
| + default: {
| + std::vector<char> tmp = ucs4_to_utf8(c);
| + tmp.push_back('\0');
| + word += &tmp[0];
What is word? A std::string? If so, you can append the bytes directly
and skip the NUL terminator:
std::vector<char> tmp = ucs4_to_utf8(c);
word.append(tmp.begin(), tmp.end());
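
A self-contained sketch of the append-based variant, assuming word is a
std::string. The encoder ucs4_to_utf8_sketch is a hypothetical stand-in
so that the example compiles on its own; the real ucs4_to_utf8 in
unicode.h may work differently. It does not reject surrogate code points.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for ucs4_to_utf8() from unicode.h.
std::vector<char> ucs4_to_utf8_sketch(char32_t c)
{
	assert(c <= 0x10FFFF); // must lie inside the Unicode range
	std::vector<char> out;
	if (c < 0x80) {
		out.push_back(static_cast<char>(c));
	} else if (c < 0x800) {
		out.push_back(static_cast<char>(0xC0 | (c >> 6)));
		out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
	} else if (c < 0x10000) {
		out.push_back(static_cast<char>(0xE0 | (c >> 12)));
		out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
		out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
	} else {
		out.push_back(static_cast<char>(0xF0 | (c >> 18)));
		out.push_back(static_cast<char>(0x80 | ((c >> 12) & 0x3F)));
		out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
		out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
	}
	return out;
}

// Append one character to word as UTF-8. The iterator-range append
// copies exactly tmp.size() bytes, so no terminator is needed.
void append_utf8(std::string & word, char32_t c)
{
	std::vector<char> const tmp = ucs4_to_utf8_sketch(c);
	word.append(tmp.begin(), tmp.end());
}
```

Besides saving the push_back('\0'), the range form does not stop at an
embedded zero byte, which the `word += &tmp[0]` form would.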
--
Lgb