On Wed, Dec 31, 2003 at 04:38:02PM +0200, Chaim Keren Tzion wrote:
> I tried 'convmv' and it worked great. One point to be aware of: I had two
> directory structures and it seems that one was in cp862 and one was in
> iso-8859-8. At first I ran the same command on both directories:
> convmv -r -f cp862 -t utf8 --nfc directory1
> That worked fine for the cp862 encoded directory but it messed up the
> iso-8859-8 one.

It means it's not that smart. I suggest you report it to the author.
IMO, it should have done nothing.

> Good thing I backed them up first. I then had to play around to figure
> out what encoding the second directory was in and then ran:
> convmv -r -f iso-8859-8 -t utf8 --nfc --notest directory2
> It worked fine and I am now all UTF8.
>
> One question though; Is there a way to query what encoding a file or
> directory's name is in? I had to just keep trying different 'from' encodings
> until it worked.

There is no way to "query" it - it's not written anywhere. The only
thing you can do is _guess_ it. If you know it's Hebrew, there are only
a few possibilities. You can simply do
'ls --show-control-chars | od -tx1'
and see the raw data - cp862 starts at hex 80 and iso8859-8 starts at
hex E0.

If you don't know the language, you need a smarter tool. I think one
such tool is mguesser, but it has no maps for cp862 so I didn't try it
(but it's probably trivial to convert its iso8859-8 map to cp862). It
also needs a large amount of data to work on - I guess it compares
distributions of letters to known languages' distributions.
-- 
Didi
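
P.S. The byte-range check above can be sketched in a few lines of Python. This is only a rough guess under stated assumptions, not something convmv or mguesser does: the function name guess_hebrew_encoding is made up for illustration, the names are assumed to contain only ASCII plus Hebrew letters, and the ranges assumed are 0x80-0x9A for cp862 and 0xE0-0xFA for iso-8859-8. Anything that already decodes as UTF-8 is reported as such.

```python
import os


def guess_hebrew_encoding(raw: bytes) -> str:
    """Guess the encoding of a raw file name (hypothetical helper).

    Returns 'ascii', 'utf-8', 'cp862', 'iso-8859-8' or 'unknown'.
    """
    # Only bytes >= 0x80 tell the candidate encodings apart.
    high = [b for b in raw if b >= 0x80]
    if not high:
        return "ascii"

    # A name that is already valid UTF-8 should not be converted again.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # Assumed ranges: alef..tav is 0x80-0x9A in cp862, 0xE0-0xFA in iso-8859-8.
    if all(0x80 <= b <= 0x9A for b in high):
        return "cp862"
    if all(0xE0 <= b <= 0xFA for b in high):
        return "iso-8859-8"
    return "unknown"


if __name__ == "__main__":
    # Passing bytes to os.listdir returns the names as raw, undecoded bytes.
    for name in os.listdir(b"."):
        print(guess_hebrew_encoding(name), name)
```

Run it inside each directory before picking the -f argument for convmv. It is still a guess: a name with other high bytes (punctuation, other scripts) comes back as 'unknown'.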
