If the files submitted to you do not need to retain their original formats for 
your purposes, why not just convert them all to a standard format? It's my 
understanding that if you open the file using low level file commands without the 
binfile parameter, LC will convert the data into the local encoding. I might be 
mistaken. 
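
For comparison while testing, here is a rough sketch of the usual first pass of an encoding guess: check for a byte order mark, then test whether the bytes are pure ASCII or valid UTF-8, and only then fall back to a single-byte encoding. It is written in Python purely to show the logic and is not LiveCode and not Paul's routine. The function name `guess_encoding` and the `fallback` parameter are made up for illustration, and the returned strings are simply the names from the list quoted below. It makes no attempt to tell CP1252 from MacRoman, which needs statistical heuristics.

```python
import codecs

# BOM signatures, longest first so that UTF-32LE (FF FE 00 00) is not
# mistaken for UTF-16LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def guess_encoding(data: bytes, fallback: str = "CP1252") -> str:
    """Return a best-guess encoding name for the raw bytes of a text file.

    Hypothetical helper: `fallback` is whatever single-byte encoding the
    caller assumes when nothing else matches.
    """
    # 1. A byte order mark, when present, is the strongest signal.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 2. No BOM: bytes that are all below 0x80 are plain ASCII.
    try:
        data.decode("ascii")
        return "ASCII"
    except UnicodeDecodeError:
        pass

    # 3. Text in a legacy single-byte encoding rarely happens to be valid
    #    UTF-8, so a clean strict decode is a good (not certain) indicator.
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        return fallback


if __name__ == "__main__":
    print(guess_encoding(b"plain text"))
    print(guess_encoding("caf\u00e9".encode("utf-8")))
    print(guess_encoding(b"caf\xe9"))
```

A check like this only works on the untouched bytes, so in LC the file would have to be read as binary (for example through a binfile: URL) before any conversion happens.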

It would help to have a sample file for testing. 

Bob S


> On Mar 19, 2020, at 16:46 , Paul Dupuis via use-livecode 
> <use-livecode@lists.runrev.com> wrote:
> 
> Users of our application may use text files in whatever encoding their local 
> system creates them in. We cannot tell them to only create such files with a 
> specific encoding. So, we need to detect the encoding of the text file the 
> user selects.
> 
> As I mentioned, I have an LC script that implements an encoding-guessing 
> algorithm. I am looking for an alternative or better one if someone out there 
> happened to have created one they might like to share or license.
> 
> Any such routine needs to work on macOS and Windows and return the types used 
> by the LC textDecode function.
> 
> I already knew about file on OSX, but I need a cross-platform solution.
> 
> 
> On 3/19/2020 6:15 PM, Pi Digital via use-livecode wrote:
>> On a Mac it’s easy. Use
>> file -I "MyFile.txt"
>>  as a shell command.
>> 
>> On Windows it’s near impossible without running a whole bunch of arbitrary 
>> tests that may or may not be correct - certainly not accurate.
>> 
>> What kind of text were you hoping to see? Were you looking for a particular 
>> encoding? If it is grammatical text, there are a bunch of runs you can do 
>> to see what character sets are used, but even then it’s only a 
>> ‘probably’/‘possibly’ response.
>> 
>> Sean Cole
>> Pi Digital
>> 
>> 
>>> On 19 Mar 2020, at 20:31, Paul Dupuis via use-livecode 
>>> <use-livecode@lists.runrev.com> wrote:
>>> 
>>> This has come up many times before, but I'll ask once again in case 
>>> something has changed or someone new sees this.
>>> 
>>> 
>>> Does anyone have a routine that will take a filespec to a text file and 
>>> return the guessed encoding of the text file?
>>> 
>>> 
>>> First, please don't respond with "you should know the encoding" or "the users 
>>> should know the encoding of their files." That is not possible in the widely 
>>> uncontrolled real world.
>>> 
>>> I do already have a routine to guess file encodings. It was written by 
>>> someone else. There are instances where it should work and does not. I fear 
>>> there may be errors in the algorithm and I do not have the original 
>>> algorithm to check it against. Hence, I am looking for an alternative that 
>>> is either free to use or to be licensed for a modest fee.
>>> 
>>> My current routine attempts to return the encoding as a string that can be 
>>> directly passed to textDecode(binaryData,encoding)
>>> 
>>> "ASCII"
>>> "UTF-16"
>>> "UTF-16BE"
>>> "UTF-16LE"
>>> "UTF-32"
>>> "UTF-32BE"
>>> "UTF-32LE"
>>> "UTF-8"
>>> "CP1252" *
>>> "MacRoman" *
>>> 
>>> * for these last 2, if the file is MacRoman on a Windows system, you 
>>> actually have to textDecode(macToISO(data),"CP1252") and if you have CP1252 
>>> on the Mac, you need to do textDecode(isoToMac(data),"MacRoman"). There is 
>>> an enhancement request to support MacRoman decoding under Windows and vice 
>>> versa at https://quality.livecode.com/show_bug.cgi?id=22391 if you want to 
>>> CC yourself to show interest.
>>> 
>>> 
>>> _______________________________________________
>>> use-livecode mailing list
>>> use-livecode@lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your 
>>> subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode
> 
> 
