Two questions:

1. Is there a way to determine the prevalence of Unicode in electronic documents (vs. documents not in Unicode)? At least for the Web, has anyone done a statistical sampling to determine the percentage of Unicode-encoded webpages?
2. A graduate student mentioned that it was her impression that most Cyrillic webpages (at least for Russian--her interest) are still not encoded in Unicode. (She is doing some research on the use of certain words in Russian and wanted to know how best to do the search.) Again: Has anyone looked into the situation with Cyrillic in terms of the percentage of Web documents in Unicode?

With thanks,
Debbie Anderson

Deborah Anderson
Researcher, Dept. of Linguistics
UC Berkeley
Email: [EMAIL PROTECTED] or [EMAIL PROTECTED]
Script Encoding Initiative: www.linguistics.berkeley.edu/~dwanders