Hi Devin,
Thanks for your short list! Your insights are massively helpful.
Phil
Devin Asay wrote:
Hi guys,
Wow, where to start? I completely understand your confusion and feel
your pain. When I started with unicode I felt lost, too. Here's a
short list of Aha! points that should help. (Caveat--I'm describing my
understanding purely from a developer perspective; I have little
understanding about how Rev implements unicode "under the hood").
- When we talk about unicode in Rev, we're talking about UTF-16, not
UTF-8 or UTF-32.
- The current implementation of unicode is not perfect, but it is
perfectly usable. (Right-to-left languages are still problematic,
especially if you need to support user input. Display of same is
usually fine.)
- The useUnicode property has very limited application. It only
affects the behavior of the charToNum and numToChar functions. If
useUnicode is false, these two functions behave as we're accustomed; if
true, they assume two-byte characters instead of one-byte characters.
- The byte order in which unicode files are stored is dependent upon
the processor in the host machine. That means that if you're
transferring unicode files from, say, a PPC-based machine to an
Intel-based one, UTF-16 files will be scrambled unless you swap each
pair of bytes as you read them in.
- In light of the above, it's usually best to store unicode text as
UTF-8 or even htmlText. These have been the most reliable transfer
formats for me.
- In a Rev field unicode and ascii get mixed up all the time. For
instance, characters that normally fall within the ascii range, like
space, return and common punctuation, are considered ascii. While this
can be confusing, it does ensure that normal Rev chunk expressions
work as expected.
- There is no 100% reliable way I know of to look at a file and
determine heuristically whether it's unicode, or what flavor of
unicode it is.
- The section on unicode in the Rev User Guide (section 6.4) is pretty
good as far as it goes, but doesn't cover all the "gotchas".
- Dealing with unicode in text fields is different than in buttons and
menus.
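To make the byte-order and file-sniffing points above concrete, here is a
minimal sketch. It is written in Python rather than Rev, purely to show what
happens at the byte level; it is not how Rev handles unicode internally. The
function names (sniff_bom, swap_utf16_bytes, to_utf8) are made up for
illustration, and the fallback of big-endian UTF-16 for files without a byte
order mark is an assumption, not a rule.

```python
import codecs


def sniff_bom(data: bytes):
    """Return (encoding_name, bom_length) if data starts with a known
    byte order mark, else (None, 0). A BOM is only a hint: many files
    have none, so this cannot identify every unicode file."""
    # UTF-32 marks are checked first because the UTF-32 LE mark
    # begins with the same two bytes as the UTF-16 LE mark.
    candidates = (
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF8, "utf-8"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    )
    for bom, name in candidates:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def swap_utf16_bytes(data: bytes) -> bytes:
    """Swap each pair of bytes, converting big-endian UTF-16 to
    little-endian or the other way around."""
    if len(data) % 2:
        raise ValueError("UTF-16 data must have an even number of bytes")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]
    swapped[1::2] = data[0::2]
    return bytes(swapped)


def to_utf8(data: bytes, fallback: str = "utf-16-be") -> bytes:
    """Decode using the BOM if there is one (otherwise the assumed
    fallback) and re-encode as UTF-8 for safer transfer."""
    encoding, skip = sniff_bom(data)
    text = data[skip:].decode(encoding or fallback)
    return text.encode("utf-8")


sample = "café Привет 中文"
big_endian = sample.encode("utf-16-be")     # byte order a PPC machine would use
little_endian = sample.encode("utf-16-le")  # byte order an Intel machine would use
```

With these definitions, swap_utf16_bytes(big_endian) gives the same bytes as
little_endian, and to_utf8 produces identical UTF-8 from either byte order as
long as the file starts with a BOM. Without a BOM the fallback is a guess,
which is why sniffing is never fully reliable.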
Anyhow, those are some of the key points. For a more in-depth
discussion, see my Unicode presentation from RevLive if you've got the
DVD. Failing that, you're welcome to read my presentation notes at:
http://asay.byu.edu/revUnicode.pdf
The stack I used in that presentation, which shows lots of examples,
is at:
go url "http://asay.byu.edu/unicode-RevLive08.rev"
I'm happy to help if you still have specific issues after you look at
this stuff. Unicode is doable, once you learn the tricks and pitfalls.
Regards,
Devin
On Nov 24, 2008, at 6:45 PM, Scott Rossi wrote:
Recently, Phil Davis wrote:
Thanks for asking the questions, Scott. I'm interested in clarity here
too since I'll be working with Arabic again in the next few months, and
am still a Unicode lightweight.
You want questions? I got a truck-load of 'em...
For instance... I have characters from several languages in the text I'm
working with: Roman, French (accented), Chinese, and Russian. When I set
the unicodeText of a field to the text, the accented French characters
render incorrectly. Looking in the source text file, it appears the
original French characters may have been reformatted when saving the file
as UTF-16. Is there any way to keep the French characters intact within
the unicode text?
Thanks & Regards,
Scott Rossi
Creative Director
Tactile Media, Multimedia & Design
_______________________________________________
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your
subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
Devin Asay
Humanities Technology and Research Support Center
Brigham Young University
--
Phil Davis
PDS Labs
Professional Software Development
http://pdslabs.net