Re: Guessing the encoding of a test file...

Paul Dupuis via use-livecode Fri, 20 Mar 2020 11:08:45 -0700

On 3/20/2020 1:44 PM, Richard Gaskin via use-livecode wrote:

I would be interested to learn more about the details of the subsequent refinements over the decade since, but also the ROI proposition for today:

I'll try to remember to share the current code after this current review. I'm happy to put it out there for others who may need something. It adds a few more statistical samplings for MacRoman vs CP1252/Latin 1 on top of your excellent original routine, and those catch a few more correct guesses.
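
For readers unfamiliar with the idea, here is a rough illustration of one such statistical sampling. It is written in Python rather than LiveCode and is NOT the actual routine: the function name, the scoring rule (reward high bytes that decode to lowercase letters, penalize everything else) and the tie-break toward CP1252 are all assumptions made up for this sketch. It also assumes UTF-8 and UTF-16 have already been ruled out.

```python
def guess_single_byte_encoding(data: bytes) -> str:
    """Guess 'mac_roman' or 'cp1252' for 8-bit text (hypothetical heuristic)."""

    def score(codec: str) -> int:
        total = 0
        for byte in data:
            if byte < 0x80:
                continue  # ASCII bytes are identical in both encodings
            # Decode the single high byte; undefined CP1252 bytes become U+FFFD.
            ch = bytes([byte]).decode(codec, errors="replace")
            # Assumption: accented lowercase letters are the most common
            # high-byte characters in Western European prose.
            total += 1 if ch.islower() else -1
        return total

    # Assumption: ties (including pure ASCII input) fall back to CP1252.
    return "mac_roman" if score("mac_roman") > score("cp1252") else "cp1252"
```

A single rule like this is easy to fool (for example, CP1252 curly quotes decode to lowercase letters under MacRoman), which is why a real detector combines several samplings.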

As for the diminishing returns and ROI for today, I am not sure there is any sort of general ROI for further enhancing the current routine. It follows just about every best practice for detection there is (to the best of my knowledge). That said, the current case is a researcher with an edge-case variant who happens to be a long-time customer AND has a *LOT* of text files that should come up as MacRoman but were not. With one more tweak (fixing a tiny bug, a mistyped variable name) they are now detected correctly.

If this weren't a long-time customer with lots of data affected by this problem, I probably would not invest this level of effort.


_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
