[Samba] smbmount codepage/iocharset settings vs NT4
Hi,

I'm in the process of setting up a backup server for a somewhat antiquated NT4 server. The backup server runs CentOS-4 (~ RHEL-4), kernel-2.6.9-11.EL, samba-client-3.0.10-1.4E, rsync-2.6.3-1, LANG=en_US.UTF-8. The NT4 shares are mounted on the backup server and rsynced to local disk.

This setup is working pretty well; however, on the NT box there are some files with names containing odd characters like accented characters and ellipses. I'm a bit at a loss as to the correct settings of the smbmount iocharset and codepage parameters to use, and whether the display charset and unix charset options in smb.conf are relevant to the mounts.

I've set up a test share. An ls in smbclient gives me the correct output in a gnome-terminal, and an mget gets me the files with their correctly utf8ified names (the console seemed OK until after a toggle to X):

    $ smbclient -U auser //david-bowie/Test
    Password:
    Domain=[EVERYTHING] OS=[Windows NT 4.0] Server=[NT LAN Manager 4.0]
    smb: \> ls
      .                                   D        0  Sat Oct 29 18:58:49 2005
      ..                                  D        0  Sat Oct 29 18:58:49 2005
      ellipsis zijn heel fijn (…).doc     A    24064  Sat Oct 29 18:57:14 2005
      Nogmaals ellipsis ….doc             A    24064  Sat Oct 29 18:58:31 2005
      één document á €50.doc              A    24064  Sat Oct 29 18:55:28 2005
      één document.doc                    A    24064  Sat Oct 29 18:54:20 2005
      ‘‰’.doc                             A    24064  Sat Oct 29 18:57:55 2005
      “quotes”.doc                        A    24064  Sat Oct 29 18:53:40 2005

                    52004 blocks of size 262144. 2165 blocks available

However, an smbmount without any charset options gives me the following result:

    $ sudo mount -o username=auser //david-bowie/Test /mnt/tmp
    Password:
    $ ls /mnt/tmp
    `%'.doc  ??n document.doc  ellipsis zijn heel fijn (.).doc  Nogmaals ellipsis ..doc  ??n document ? ?50.doc  quotes.doc

Using cp850 improves the output somewhat:

    $ sudo mount -o username=auser,codepage=cp850 //david-bowie/Test /mnt/tmp
    Password:
    $ ls /mnt/tmp
    `%'.doc  ellipsis zijn heel fijn (.).doc  één document á ?50.doc  Nogmaals ellipsis ..doc  één document.doc  quotes.doc

I assumed the code page used by NT4 was cp1252 (MS-ANSI), but using cp1252 for the codepage gives me the same output for these files as the mount with no codepage option set.

To make a long story short: what are the proper options to pass to smbmount and/or set in /etc/samba/smb.conf?

Thanks,
Leonard.

--
To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/listinfo/samba
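Editorial illustration: the partial improvement with codepage=cp850 is consistent with which characters that code page can represent at all. The following is a minimal Python sketch, not a model of smbfs or of the NT4 server. The names `NAMES`, `render` and `unmappable` are made up for this example, and it assumes Python's `cp850` and `cp1252` codecs match the code pages discussed in the message.

```python
# File names on the test share, copied from the smbclient listing in the thread.
NAMES = [
    "ellipsis zijn heel fijn (…).doc",
    "Nogmaals ellipsis ….doc",
    "één document á €50.doc",
    "één document.doc",
    "‘‰’.doc",
    "“quotes”.doc",
]


def render(name, codec):
    """Round-trip a name through a code page; unmappable characters become '?'."""
    return name.encode(codec, errors="replace").decode(codec)


def unmappable(name, codec):
    """Return the characters of `name` that the code page cannot represent."""
    missing = []
    for ch in name:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    for codec in ("cp850", "cp1252"):
        print(codec)
        for name in NAMES:
            print("   %-34s missing: %s" % (render(name, codec), unmappable(name, codec)))
```

In this sketch cp850 covers é and á but not the ellipsis, the euro sign, the per-mille sign or the curly quotes, while cp1252 covers every character in these names. The `.` and `%` substitutions seen in the mount output are presumably a best-fit mapping applied elsewhere, which this sketch does not reproduce; it only uses `?` for anything unmappable.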
Re: [Samba] smbmount codepage/iocharset settings vs NT4
Hello Andrew,

On Mon, 2005-10-31 at 08:34 +1100, Andrew Bartlett wrote:
> > This setup is working pretty well; however, on the NT box there are some files with names containing odd characters like accented characters and ellipses. I'm a bit at a loss as to the correct settings of the smbmount iocharset and codepage parameters to use, and whether the display charset and unix charset options in smb.conf are relevant to the mounts.
>
> You should use the CIFS VFS for your backup operations, as it will correctly use unicode on the wire, and therefore allow a correct utf8 translation. smbfs is considered deprecated, and certainly should not be used for new installations.

Ok.

    $ sudo mount -t cifs -o username=auser //david-bowie/Test /mnt/tmp
    Password:
    $ ls /mnt/tmp
    één document á €50.doc  ellipsis zijn heel fijn (…).doc  ‘?颂ꋩ??  één document.doc  Nogmaals ellipsis ….doc  “???鲂닩?

That indeed solves the issue for the more common cases. Luckily, in real life I don't have to deal with these cases, but what about the two file names below? They are printed correctly in smbclient.

    ‘‰’.doc                             A    24064  Sat Oct 29 18:57:55 2005
    “quotes”.doc                        A    24064  Sat Oct 29 18:53:40 2005

Leonard.
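Editorial illustration: the "unicode on the wire, therefore a correct utf8 translation" step can be sketched in a few lines of Python. This assumes the wire form of a file name is UTF-16LE; the function name `wire_to_utf8` is hypothetical and this is not the CIFS VFS code. It does not attempt to reproduce the garbled names shown above.

```python
def wire_to_utf8(wire_bytes):
    """Decode a UTF-16LE file name (assumed wire form) and re-encode it as UTF-8."""
    return wire_bytes.decode("utf-16-le").encode("utf-8")


if __name__ == "__main__":
    # Two of the names from the test share, including one that came out garbled.
    for name in ("één document á €50.doc", "‘‰’.doc"):
        wire = name.encode("utf-16-le")   # what a Unicode-capable server would send
        local = wire_to_utf8(wire)        # what a UTF-8 client would store
        print(len(wire), "wire bytes ->", len(local), "UTF-8 bytes:", local.decode("utf-8"))
```

Because both encodings cover all of Unicode, a translation done this way is lossless for every name on the test share, the curly quotes and per-mille sign included.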
Re: [Samba] smbmount codepage/iocharset settings vs NT4
On Sun, 2005-10-30 at 23:08 +0100, Leonard den Ottolander wrote:
> Hello Andrew,
>
> On Mon, 2005-10-31 at 08:34 +1100, Andrew Bartlett wrote:
> > > This setup is working pretty well; however, on the NT box there are some files with names containing odd characters like accented characters and ellipses. I'm a bit at a loss as to the correct settings of the smbmount iocharset and codepage parameters to use, and whether the display charset and unix charset options in smb.conf are relevant to the mounts.
> >
> > You should use the CIFS VFS for your backup operations, as it will correctly use unicode on the wire, and therefore allow a correct utf8 translation. smbfs is considered deprecated, and certainly should not be used for new installations.
>
> Ok.
>
>     $ sudo mount -t cifs -o username=auser //david-bowie/Test /mnt/tmp
>     Password:
>     $ ls /mnt/tmp
>     één document á €50.doc  ellipsis zijn heel fijn (…).doc  ‘?颂ꋩ??  één document.doc  Nogmaals ellipsis ….doc  “???鲂닩?
>
> That indeed solves the issue for the more common cases. Luckily, in real life I don't have to deal with these cases, but what about the two file names below? They are printed correctly in smbclient.
>
>     ‘‰’.doc                             A    24064  Sat Oct 29 18:57:55 2005
>     “quotes”.doc                        A    24064  Sat Oct 29 18:53:40 2005

This will be smbclient correctly finding your 'display charset' (locale) from the environment, which the CIFS VFS can't tell from kernel space. You should use UTF8 everywhere if possible.

Andrew Bartlett

--
Andrew Bartlett                                http://samba.org/~abartlet/
Samba Developer, SuSE Labs, Novell Inc.        http://suse.de
Authentication Developer, Samba Team           http://samba.org
Student Network Administrator, Hawker College  http://hawkerc.net
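Editorial illustration: a userspace program can read the locale from its environment, which a kernel filesystem driver has no access to. The Python sketch below shows the general idea only; it is not how smbclient is implemented. The function names and the "ASCII" fallback are assumptions made for this example.

```python
import os


def effective_locale(env):
    """Pick the locale string using the usual POSIX precedence: LC_ALL, LC_CTYPE, LANG."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        if env.get(var):
            return env[var]
    return ""


def charset_from_locale(value, default="ASCII"):
    """Extract the codeset from a locale string such as 'en_US.UTF-8'.

    The "ASCII" default for locales without a codeset is an assumption.
    """
    if not value or "." not in value:
        return default
    codeset = value.split(".", 1)[1]
    # Drop an optional modifier such as '@euro'.
    return codeset.split("@", 1)[0] or default


if __name__ == "__main__":
    # Uses the real process environment; the result depends on the caller's shell.
    print(charset_from_locale(effective_locale(os.environ)))
```

With LANG=en_US.UTF-8, as on the backup server in this thread, the sketch yields UTF-8.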
Re: [Samba] smbmount codepage/iocharset settings vs NT4
On Sun, Oct 30, 2005 at 11:08:06PM +0100, Leonard den Ottolander wrote:
> Hello Andrew,
>
> On Mon, 2005-10-31 at 08:34 +1100, Andrew Bartlett wrote:
> > > This setup is working pretty well; however, on the NT box there are some files with names containing odd characters like accented characters and ellipses. I'm a bit at a loss as to the correct settings of the smbmount iocharset and codepage parameters to use, and whether the display charset and unix charset options in smb.conf are relevant to the mounts.
> >
> > You should use the CIFS VFS for your backup operations, as it will correctly use unicode on the wire, and therefore allow a correct utf8 translation. smbfs is considered deprecated, and certainly should not be used for new installations.
>
> Ok.
>
>     $ sudo mount -t cifs -o username=auser //david-bowie/Test /mnt/tmp
>     Password:
>     $ ls /mnt/tmp
>     één document á €50.doc  ellipsis zijn heel fijn (…).doc  ‘?颂ꋩ??  één document.doc  Nogmaals ellipsis ….doc  “???鲂닩?
>
> That indeed solves the issue for the more common cases. Luckily, in real life I don't have to deal with these cases, but what about the two file names below? They are printed correctly in smbclient.

Log a bug with Steve French. He should be using similar unicode conversions as smbclient.

Jeremy.
Re: [Samba] smbmount codepage/iocharset settings vs NT4
Hello Andrew,

On Mon, 2005-10-31 at 09:16 +1100, Andrew Bartlett wrote:
> On Sun, 2005-10-30 at 23:08 +0100, Leonard den Ottolander wrote:
> This will be smbclient correctly finding your 'display charset' (locale) from the environment, which the CIFS VFS can't tell from kernel space. You should use UTF8 everywhere if possible.

LANG=en_US.UTF-8. Setting display charset and unix charset to UTF8 in smb.conf does not solve this.

I'll consider filing a bug as Jeremy suggested. Right now I'm just going to be content :) .

Thanks guys.

Leonard.
Re: [Samba] smbmount codepage/iocharset settings vs NT4
On Sun, 2005-10-30 at 23:32 +0100, Leonard den Ottolander wrote:
> Hello Andrew,
>
> On Mon, 2005-10-31 at 09:16 +1100, Andrew Bartlett wrote:
> > On Sun, 2005-10-30 at 23:08 +0100, Leonard den Ottolander wrote:
> > This will be smbclient correctly finding your 'display charset' (locale) from the environment, which the CIFS VFS can't tell from kernel space. You should use UTF8 everywhere if possible.
>
> LANG=en_US.UTF-8. Setting display charset and unix charset to UTF8 in smb.conf does not solve this.

Just for clarification, that is because the CIFS VFS doesn't read our (userspace) smb.conf file, and smbclient didn't change because it was already on UTF8.

> I'll consider filing a bug as Jeremy suggested.

Sounds like a bug to me.

Andrew Bartlett
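Editorial illustration: since the CIFS VFS does not read smb.conf, any charset preference has to reach it through mount options. The Python sketch below assembles such a mount command with the `iocharset` option of mount.cifs. The helper name `cifs_mount_argv` is hypothetical, and the thread does not establish whether this option is honoured by the kernel in question or whether it fixes the two remaining names, so treat it as something to try rather than a confirmed solution.

```python
import shlex


def cifs_mount_argv(share, mountpoint, username, iocharset=None):
    """Build an argv list for mounting a share with the CIFS VFS.

    `iocharset` is passed through as a mount option when given; whether it
    helps with the names discussed in the thread is an open question.
    """
    opts = ["username=" + username]
    if iocharset:
        opts.append("iocharset=" + iocharset)
    return ["mount", "-t", "cifs", "-o", ",".join(opts), share, mountpoint]


if __name__ == "__main__":
    argv = cifs_mount_argv("//david-bowie/Test", "/mnt/tmp", "auser", iocharset="utf8")
    # Print the command instead of running it; mounting needs root and a reachable server.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the argument list separately from running it keeps the options easy to inspect and avoids shell quoting problems if a share name contains spaces.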