Hi Greg,

On Mon, Nov 29, 2004 at 09:08:47AM +0200, Greg Pendler wrote:
> Hi,
> 
> As Yedidyah suggested i'm currently testing "convmv". I think i'm in big 
> trouble:
> SMB.CONF from OLD samba shows:
>        "character set = ISO8859-5"
> Which is RUSSIAN - how it happened i've got no clue.
> 
> When running convmv it works perfectly converting names to UTF, but the
> filenames are shown in RUSSIAN.

Do you read Russian? If you do, I guess you can see that they are actually
garbage, not Russian: they only use Cyrillic letters.

> 
> Again in OLD samba all filenames were in hebrew.
> 
> Is there any way to fix that (i'm talking about several hundreds 
> users:), although majority of them doesn't use Hebrew filenames).

I did tell you to test well, didn't I?

I hope you did no damage, but can't be sure.

I suggest you first convert them all back to Russian (ISO8859-5), in
order to undo the damage. Then you'll have to somehow do a two- or even
three-part translation. Details for a similar thing were in a discussion
about zip/unzip here, a few weeks ago. Basically, I guess what you
need to do is this: first convert from ISO8859-5 to cp866, then
from cp862 to utf-8. As I said, _test_ before you do this to all of
the files: create a file through the old Samba, double-convert it, and
try it with the new Samba.
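
A rough sketch of such a dry run in Python, in case it helps. It assumes
the names on disk are currently ISO8859-5 bytes (that is, after undoing
the convmv run), and the names fix_name and plan_renames are made up for
illustration; they are not part of convmv or Samba. It only lists what
would be renamed and touches nothing:

```python
import os

def fix_name(raw: bytes) -> bytes:
    """ISO8859-5 bytes -> cp866 bytes, re-read as cp862 -> UTF-8 bytes."""
    dos_bytes = raw.decode("iso8859_5").encode("cp866")
    return dos_bytes.decode("cp862").encode("utf-8")

def plan_renames(directory: bytes):
    """Return (old, new) name pairs for one directory; renames nothing."""
    plan = []
    for raw in os.listdir(directory):   # bytes in, bytes out
        try:
            new = fix_name(raw)
        except UnicodeError:
            continue                    # does not fit this scheme; leave it alone
        if new != raw:                  # pure-ASCII names come out unchanged
            plan.append((raw, new))
    return plan

if __name__ == "__main__":
    for old, new in plan_renames(b"."):
        print(old, "->", new)
```

Run it on one test directory first and compare the output with what you
expect before renaming anything for real.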

> 
> In addition Solaris doesn't understand 'ls -l --show-control-chars'

Maybe try to compile GNU ls on it? Or use '-b' (without 'od').

> and all my tests with "od" didn't improve my understanding, actually now 
> i do understand the problem but i don't understand how it worked earlier.

Windows wants to save filenames in Hebrew. It has two ways to work. If it
knows the other side is Unicode, it sends the filenames as Unicode, with no
need to convert anything. Otherwise, it sends DOS filenames (e.g. cp862 for
Hebrew, cp866 for Russian). If the other side is Samba, you can ask it to
convert the filename back. What you did was to tell Samba to assume the
filenames arrive as cp866 and convert them to ISO8859-5. So now you should
do the opposite: convert them to cp866, and then _from_ cp862 (which is
what they actually were) to ISO8859-8 or utf-8.
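
To make this concrete, here is a small Python sketch of what I think the
old setup did to a single name. The sample name is made up, and the chain
(cp862 on the wire, read as cp866, stored as ISO8859-5) is my assumption
about your old configuration, not something I have verified:

```python
# Simulate the assumed old behaviour for one hypothetical Hebrew filename.
name = "\u05e9\u05dc\u05d5\u05dd.txt"   # four Hebrew letters plus ".txt"

# The Windows client sends the name in the Hebrew DOS code page.
wire = name.encode("cp862")

# Old Samba, told the client speaks cp866, converts "cp866" to ISO8859-5.
stored = wire.decode("cp866").encode("iso8859_5")

# What a tool that trusts the ISO8859-5 label shows for these bytes:
shown = stored.decode("iso8859_5")
print(shown)   # Cyrillic capitals, meaningless as Russian
```

The Hebrew letters in cp862 sit on byte values that cp866 uses for
Cyrillic capitals, which is why the result looks Russian but reads as
garbage.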

BTW, if you do not need to support more languages, only Hebrew, and you
do not care about how they are kept on Unix, you can simply tell the
new Samba to do the same 'character set = ISO8859-5' (or maybe the option
name was changed, I do not know) and everything will seem OK on the client
side. I do not think this is recommended, though: utf-8 is the way to go.

Good luck,
-- 
Didi

