On 3/20/2020 1:44 PM, Richard Gaskin via use-livecode wrote:
> I would be interested to learn more about the details of the subsequent refinements over the decade since, but also the ROI proposition for today:

I'll try to remember to share the current code once this review is done. I'm happy to put it out there for others who may need something. It adds a few more statistical samplings for MacRoman vs CP1252/Latin-1 on top of your excellent original routine, which lets it make a few more correct guesses.
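
To give a rough idea of what "statistical sampling" means here, below is a minimal Python sketch. It is not the actual routine (which is LiveCode and not shown in this message), and the character sets, the weights, and the names score and guess_encoding are all assumptions made up for illustration. It decodes the high bytes (0x80 and up) both ways and prefers whichever encoding yields more characters that are plausible in Western European text.

```python
# Hypothetical character classes; a real routine would tune these on real data.
COMMON_LOWER = set("àáâãäåæçèéêëìíîïñòóôõöøùúûüß")
COMMON_UPPER = set("ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜ")
COMMON_PUNCT = set("‘’“”–—…")


def score(data: bytes, encoding: str) -> int:
    """Score how plausible the high bytes look when decoded as `encoding`."""
    high = bytes(b for b in data if b >= 0x80)
    # Bytes undefined in the encoding become U+FFFD and are penalized below.
    text = high.decode(encoding, errors="replace")
    total = 0
    for ch in text:
        if ch in COMMON_LOWER or ch in COMMON_PUNCT:
            total += 2   # very common in running text
        elif ch in COMMON_UPPER:
            total += 1   # plausible, but rarer than lowercase
        else:
            total -= 1   # symbols, unusual letters, undefined bytes
    return total


def guess_encoding(data: bytes) -> str:
    """Return 'ascii', 'mac_roman', or 'cp1252' (ties go to cp1252, arbitrarily)."""
    if all(b < 0x80 for b in data):
        return "ascii"
    mac = score(data, "mac_roman")
    win = score(data, "cp1252")
    return "mac_roman" if mac > win else "cp1252"
```

For example, guess_encoding("café déjà vu".encode("mac_roman")) picks mac_roman, because the same bytes read as CP1252 come out as unlikely characters. Short or unusual texts can still fool a heuristic like this, which is where extra samplings earn their keep.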

As for diminishing returns and ROI today, I am not sure there is any general ROI in further enhancing the current routine. It follows just about every best practice for detection there is (to the best of my knowledge). That said, the current case involves a researcher with an edge variant who happens to be a longtime customer AND has a *LOT* of text files that should come up as MacRoman but did not. With one more tweak (fixing a tiny bug, a mistyped variable name), they now detect correctly.

If this weren't a longtime customer with lots of data affected by this problem, I probably would not invest this level of effort.

