Re: is it safe to rely on the hash of a livecode variable from a character encoding standpoint?

Mark Waddingham via use-livecode Fri, 26 Jan 2018 10:34:18 -0800

On 2018-01-26 18:50, Tom Glod via use-livecode wrote:

Hi Everyone,
I want to ask how likely it is that at some point in the future some change in character encoding could start producing a different hash for the same sentence? just thinking about the nightmare scenarios facing a project that heavily uses hashing to verify and address content......in international characters......to boot.

The hash/digest functions (e.g. sha1Digest) operate on binary data. So if you do:


  put sha1Digest("foobar")

Then "foobar" is first converted to binary data using the native encoding (i.e. the backwards-compatibility rule we have), then that is hashed.

In every case where you produce a hash you have to explicitly choose an encoding - so pick your favourite (unicode friendly!) encoding and do:


  get sha1Digest(textEncode(tMyString, tMyEncoding))
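As a rough sketch of why the encoding has to be chosen once and then kept fixed: the same string encoded as UTF-8 and as UTF-16 yields different bytes, and so different digests. The handler name hexDigestOf and the sample string are made up for illustration; sha1Digest, textEncode and binaryDecode are the built-in functions.

```livecode
-- Hypothetical helper: hash pText under an explicitly chosen encoding
-- and return the digest as a hex string so it is easy to compare by eye.
function hexDigestOf pText, pEncoding
   local tDigest, tHex
   put sha1Digest(textEncode(pText, pEncoding)) into tDigest
   -- "H*" unpacks all of the binary digest into hexadecimal digits
   get binaryDecode("H*", tDigest, tHex)
   return tHex
end function

on mouseUp
   local tUtf8Hash, tUtf16Hash
   put hexDigestOf("foobar", "UTF-8") into tUtf8Hash
   put hexDigestOf("foobar", "UTF-16") into tUtf16Hash
   -- Same text, different bytes: the two hex strings should not match
   put tUtf8Hash & return & tUtf16Hash & return & (tUtf8Hash is tUtf16Hash)
end mouseUp
```

Whichever encoding you settle on, store or document it alongside the hashes so that every producer and consumer of them uses the same one.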

If you are generating hashes of strings to send to existing things, then it should say *somewhere* in the docs of the thing you are sending what encoding to use before applying the hash.

Also be aware that unicode allows the 'same' string to be encoded in multiple ways - so it's probably wise to choose a normalization form first too (see normalizeText) - otherwise you could have two strings which look the same (e.g. e followed by a combining acute / precomposed e-acute) but hash to a different value.
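To make the normalization point concrete, here is a minimal sketch that builds e-acute both ways and hashes each, first as-is and then after normalizing. The handler name normalizedDigest, and the choice of NFC and UTF-8, are assumptions for illustration; any normalization form works as long as it is applied consistently.

```livecode
-- Hypothetical helper: normalize to NFC, encode as UTF-8, then hash.
function normalizedDigest pText
   return sha1Digest(textEncode(normalizeText(pText, "NFC"), "UTF-8"))
end function

on mouseUp
   local tComposed, tDecomposed, tRawSame, tNormSame
   -- U+00E9 (decimal 233): precomposed e-acute
   put numToCodepoint(233) into tComposed
   -- U+0065 (101) followed by U+0301 (769): "e" plus combining acute
   put numToCodepoint(101) & numToCodepoint(769) into tDecomposed

   -- Without normalization the UTF-8 bytes differ, so the digests differ
   put (sha1Digest(textEncode(tComposed, "UTF-8")) is \
         sha1Digest(textEncode(tDecomposed, "UTF-8"))) into tRawSame

   -- After normalization both strings are the same code point sequence
   put (normalizedDigest(tComposed) is normalizedDigest(tDecomposed)) \
         into tNormSame

   put "raw digests equal:" && tRawSame & return & \
         "normalized digests equal:" && tNormSame
end mouseUp
```

The first comparison should come out false and the second true, which is the case described above: two strings that look identical on screen but only hash alike once a normalization form has been applied.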


Warmest Regards,

Mark.

--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Everyone can create apps

_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
