> Thanks for your help Dan, but I'm no further forward, the answer is
> apparently 'ascii', which is puzzling, because the content is not
> ASCII - it is still legible in a web browser as it was written
> originally so the data is still intact.
>
> I'm guessing that Encode::Guess tests the beginning of the file to see
> what it contains, which being a HTML doc would have characters within
> the ASCII range?
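The situation described in the quote can be sketched as follows. This is only an illustration under stated assumptions: the sample strings are made up, the Shift_JIS body is hypothetical (the thread never says which encoding the file actually uses), and the suspect list is simply a set of common Japanese encodings. It compares a guess made from an ASCII-only HTML head with a guess made from the head plus the non-ASCII body.

```perl
use strict;
use warnings;
use Encode;
use Encode::Guess;

# Hypothetical sample data: an ASCII-only HTML head, followed by a body
# containing Japanese text encoded as Shift_JIS (an assumption made
# purely for this illustration).
my $head = "<html><head><title>test</title></head>\n";
my $body = encode('shiftjis', "<body>\x{65E5}\x{672C}\x{8A9E}</body>\n");

# Suspects to try in addition to the module's defaults.
my @suspects = qw(euc-jp shiftjis 7bit-jis);

# Guess from the ASCII-only head alone.
my $from_head = guess_encoding($head, @suspects);

# Guess from the head plus the non-ASCII body.
my $from_all = guess_encoding($head . $body, @suspects);

# guess_encoding returns an encoding object on success and a plain
# string describing the problem otherwise, so check with ref().
for my $result ($from_head, $from_all) {
    print ref($result) ? $result->name : "no single guess: $result", "\n";
}
```

As far as the documented interface goes, `guess_encoding` takes the octets and a list of suspects rather than a sample-length argument, so how much of the file gets examined is decided by how much the caller passes in. If only the ASCII head is handed over, a guess of `ascii` is what one would expect.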
So which is it: UTF-8, Shift-JIS, or EUC-JP? Even 7-bit JIS apparently tends to be mixed with ASCII, so if your first n characters are nothing but ASCII, the guess is ASCII?

Is there a parameter to force the sample length?

(For five brief seconds, I was thinking about the value of randomizing the starting point for samples. :*/ )

> On Wednesday, June 25, 2003, at 02:04 am, Dan Kogai wrote:
>
> > print $enc->name;

And it was a good thing he responded, because I was going to take a closer look at this "tomorrow". (Thanks, Dan!)

-- 
Joel Rees, programmer, Kansai Systems Group
Altech Corporation (Alpsgiken), Osaka, Japan
http://www.alpsgiken.co.jp