Rsync 3.0.0pre8 and Mac OS X

2008-01-23 Thread Rudolf E. Reiber

Hi,

I tried rsync 3.0.0pre8 on my Mac running OS X 10.5.

I was very pleased with the --iconv feature, as I have to sync some
Linux machines and I had real trouble with some filenames.

But I found one strange thing in connection with the Mac.

First of all, the translation between the Linux ISO-8859-15 and the
Mac UTF-8 works (nearly) perfectly.

As I live in Germany, we often have filenames containing special
characters (umlauts like äöüÄÖÜ).

And all the filenames look perfect on my Mac.

But whenever I run rsync again, all the files containing one of these
special characters in the name are deleted and copied again.

And there are quite a lot of them.

I found the reason for this behaviour.
Let me explain it with the example of the letter ä (&auml; in HTML).
On the Linux machines running UTF-8 the ä is coded as $C3A4, which is
the UTF-8 encoding of the character U+00E4. The ä thus occupies 2 bytes.


I was very astonished when I copied a Mac filename, pasted it into a
text editor and looked at the file:

In the Mac filename the letter ä is coded as $61CC88, which in UTF-8
means the letter a followed by U+0308 (from the Combining Diacritical
Marks block). So the Mac combines the letter a with the two dots above
it instead of using the single character U+00E4.
Now things are clear: the filenames are different, in spite of
looking the same.
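
The two byte sequences described above correspond to what Unicode calls the precomposed (NFC) and decomposed (NFD) forms of the same character. The following is a minimal Python sketch using only the standard `unicodedata` module; the variable names are illustrative and nothing here is part of rsync. It shows that the two spellings of ä differ byte for byte but become equal once both are normalized to the same form.

```python
import unicodedata

precomposed = "\u00e4"    # ä as the single code point U+00E4
decomposed = "a\u0308"    # letter a followed by U+0308 COMBINING DIAERESIS

linux_bytes = precomposed.encode("utf-8")
mac_bytes = decomposed.encode("utf-8")

print(linux_bytes.hex())  # c3a4
print(mac_bytes.hex())    # 61cc88

# A plain comparison treats the two names as different files.
names_differ = precomposed != decomposed

# Normalizing both sides to one form makes them compare equal.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_after_nfd = unicodedata.normalize("NFD", precomposed) == decomposed
```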


A question to the developers: do you see any solution to this problem?
Perhaps a --iconv=utf8mac,iso885915 ?


Rudolf E. Reiber


--
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


DO NOT REPLY [Bug 5220] PATCH SUBMITTED: New Feature: stdio model for client

2008-01-23 Thread samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=5220





--- Comment #3 from [EMAIL PROTECTED]  2008-01-23 10:59 CST ---
(In reply to comment #2)
> You can make rsync effectively use an existing fd by passing an
> $RSYNC_CONNECT_PROG that refers to a program that shuttles data between its
> stdin/stdout and the desired fd.  I'm not sure that it is worth adding the
> syntax to the official rsync.
 

I'm aware of that, but you have to have yet another process and have some
application to do that for you rather than follow the usual stdio pattern (set
up fd, fork/exec).  This patch makes things more resource efficient.
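
For illustration, the helper-program workaround from comment #2 could look roughly like the Python sketch below. It is not the submitted patch and not code from rsync: the function names `relay` and `write_all` are hypothetical, and how the helper obtains the already-open descriptor is left as an assumption. It copies data from an input descriptor to a connected socket and from the socket to an output descriptor until both directions reach end of file.

```python
import os
import select
import socket

def write_all(fd, data):
    """Write every byte of data to fd, looping over partial writes."""
    while data:
        written = os.write(fd, data)
        data = data[written:]

def relay(in_fd, out_fd, sock):
    """Shuttle in_fd -> sock and sock -> out_fd until both sides hit EOF."""
    in_open = True
    sock_open = True
    while in_open or sock_open:
        watched = []
        if in_open:
            watched.append(in_fd)
        if sock_open:
            watched.append(sock)
        readable, _, _ = select.select(watched, [], [])
        for ready in readable:
            if ready is sock:
                data = sock.recv(65536)
                if data:
                    write_all(out_fd, data)
                else:
                    sock_open = False       # peer finished sending
            else:
                data = os.read(in_fd, 65536)
                if data:
                    sock.sendall(data)
                else:
                    # Input is exhausted: tell the peer we are done sending.
                    sock.shutdown(socket.SHUT_WR)
                    in_open = False
```

A real helper would call something like `relay(0, 1, sock)` with `sock` wrapping the inherited descriptor. That extra process is the overhead the comment above objects to.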


-- 
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the QA contact for the bug, or are watching the QA contact.


Re: Rsync 3.0.0pre8 and Mac OS X

2008-01-23 Thread Matt McCutchen
On Wed, 2008-01-23 at 16:01 +0100, Rudolf E. Reiber wrote:
> I tried rsync 3.0.0pre8 on my Mac running OS X 10.5.
>
> I was very pleased with the --iconv feature, as I have to sync some
> Linux machines and I had real trouble with some filenames.
> But I found one strange thing in connection with the Mac.
>
> First of all, the translation between the Linux ISO-8859-15 and the
> Mac UTF-8 works (nearly) perfectly.
>
> As I live in Germany, we often have filenames containing special
> characters (umlauts like äöüÄÖÜ).
> And all the filenames look perfect on my Mac.
>
> But whenever I run rsync again, all the files containing one of these
> special characters in the name are deleted and copied again.
> And there are quite a lot of them.
>
> I found the reason for this behaviour.
> Let me explain it with the example of the letter ä (&auml; in HTML).
> On the Linux machines running UTF-8 the ä is coded as $C3A4, which is
> the UTF-8 encoding of the character U+00E4. The ä thus occupies 2 bytes.
>
> I was very astonished when I copied a Mac filename, pasted it into a
> text editor and looked at the file:
>
> In the Mac filename the letter ä is coded as $61CC88, which in UTF-8
> means the letter a followed by U+0308 (from the Combining Diacritical
> Marks block). So the Mac combines the letter a with the two dots above
> it instead of using the single character U+00E4.
> Now things are clear: the filenames are different, in spite of
> looking the same.

Yup.  The Mac HFS+ filesystem automatically decomposes Unicode
characters in the stored versions of filenames, which confuses a number
of programs, including rsync and git.  A flamewar about whether to blame
the problem on HFS+ or the application has been running on the git list
for a week now.

> A question to the developers: do you see any solution to this problem?
> Perhaps a --iconv=utf8mac,iso885915 ?

Precisely.  We need an iconv encoding name for the form of UTF-8 that
the Mac likes, and none of the existing encodings in the iconv on my
computer fit the bill.  Another option is to store the umlaut-named files
on a filesystem other than HFS+ on the Mac.
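
As a rough illustration of what such a conversion would have to do, here is a Python sketch under stated assumptions. The function names are hypothetical, this is not an rsync or iconv feature, and standard NFD is used only as an approximation of the decomposition HFS+ applies, which is a variant of it. The idea is to decompose when converting toward the Mac and recompose when converting back.

```python
import unicodedata

def latin9_to_mac_utf8(name: bytes) -> bytes:
    """ISO-8859-15 file name -> decomposed UTF-8 (approximating HFS+ storage)."""
    text = name.decode("iso8859_15")
    return unicodedata.normalize("NFD", text).encode("utf-8")

def mac_utf8_to_latin9(name: bytes) -> bytes:
    """Decomposed UTF-8 file name -> ISO-8859-15, recomposing first."""
    text = name.decode("utf-8")
    return unicodedata.normalize("NFC", text).encode("iso8859_15")

linux_name = b"M\xe4rz.txt"                  # "März.txt" in ISO-8859-15
mac_name = latin9_to_mac_utf8(linux_name)    # a + combining diaeresis inside
round_trip = mac_utf8_to_latin9(mac_name)    # back to the original bytes
```

With a mapping like this on both directions of the transfer, the name sent from the Linux side and the name read back from the Mac would compare equal, so the files would no longer look new on every run.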

Matt



Re: Thought on large files

2008-01-23 Thread Brendan Grieve




Matt McCutchen wrote:

> On Wed, 2008-01-23 at 13:38 +0900, Brendan Grieve wrote:
> > Let's say the file, whatever it is, is a 10 GB file, and that some
> > small amount of data changes in it. This is efficiently sent across
> > by rsync, BUT the rsync server side will correctly break the
> > hard-link and create a new file with the changed bits. This means,
> > if even 1 byte of that 10 GB file changes, you now have to store
> > that whole file again.
> >
> > What my thoughts were is that if the server could transparently
> > break a large file into chunks and store them that way, then one can
> > still make use of hard-links efficiently.
>
> This is a fine idea, but I don't think support for this should be added
> to rsync.  Instead, I suggest that you use rdiff-backup
> ( http://www.nongnu.org/rdiff-backup/ ), a backup tool that stores an
> ordinary latest snapshot of the source along with reverse deltas for
> previous snapshots and redundant attribute information both in its own
> format.
>
> Matt

  


I had a look at rdiff-backup, but I was trying to get something that
spoke native rsync (IE, not to force any change on the client side).

I do however agree that support should NOT be added in rsync. Rsync is
a mirroring tool and not some elaborate tool that needs to know really
how files are stored. In fact I'd go as far as to say many of the
options rsync does support veer away from being a simple mirror tool
(IE backup etc...).

After some thought I think the best place to put such a change would be
at the filesystem level. For example, one could have a FUSE filesystem
that simply runs on top of an existing one and writes its files as I
described (or uses diff-like methods), but presents a clean filesystem
for rsync (or indeed any tool) to make use of. I believe I may look in
that direction instead of hacking rsync.


Brendan Grieve



Re: Thought on large files

2008-01-23 Thread Matt McCutchen
On Thu, 2008-01-24 at 13:54 +0900, Brendan Grieve wrote:
> I had a look at rdiff-backup, but I was trying to get something that
> spoke native rsync (IE, not to force any change on the client side).

To achieve this, you can have the client push to an rsync daemon and
then have the daemon call rdiff-backup so that the rdiff-backup part
happens entirely on the server.  The idea is the same as the
daemon-and-rsnapshot setup I described in the following message, but
with rdiff-backup in place of rsnapshot as the backend:

http://lists.samba.org/archive/rsync/2007-December/019470.html

> After some thought I think the best place to put such a change would
> be at the filesystem level. For example, one could have a FUSE
> filesystem that simply runs on top of an existing one and writes its
> files as I described (or uses diff-like methods), but presents a clean
> filesystem for rsync (or indeed any tool) to make use of. I believe I
> may look in that direction instead of hacking rsync.

You could do that, but note that the rsync receiver won't explicitly
tell the filesystem what files are similar, so you'll have to either
keep a big hashtable to help you coalesce identical blocks globally or
use some kludge like looking at what other files the receiver has open
while it is writing the destination file.
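
The global hashtable option can be sketched in a few lines of Python. This is only an in-memory model under assumptions of my own: the class name `BlockStore`, the fixed block size and the use of SHA-256 are all hypothetical choices, not part of rsync or of any existing FUSE filesystem. Each file is stored as a list of block digests, and identical blocks are kept once no matter which file they belong to.

```python
import hashlib

BLOCK_SIZE = 4  # tiny for demonstration; a real store would use far larger blocks

class BlockStore:
    def __init__(self):
        self.blocks = {}  # digest -> block bytes, shared by all files
        self.files = {}   # file name -> ordered list of digests

    def put(self, name, data):
        """Split data into fixed-size blocks and store each distinct block once."""
        digests = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        """Reassemble a file from its block list."""
        return b"".join(self.blocks[d] for d in self.files[name])

store = BlockStore()
store.put("snapshot1/big.img", b"AAAABBBBCCCCDDDD")
store.put("snapshot2/big.img", b"AAAABBBBCxCCDDDD")  # one byte changed
stored_blocks = len(store.blocks)  # 5 blocks kept instead of 8
```

Fixed-offset blocks only coalesce when a change does not shift the data that follows it, so an insertion near the start of a file would defeat this simple scheme.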

Matt



Rsync iconv (Cygwin) (file has vanished)

2008-01-23 Thread Brendan Grieve
I have another question. I'm not sure if this is the correct place for
Cygwin rsync related questions.

I've compiled rsync 3.0.0pre8 under Cygwin. It compiles cleanly and
works splendidly. I made sure to have libiconv installed, and it supports
the --iconv option (at least it accepts it).

I've been using rsync to test a backup of some files from a Windows box
to a Linux box. I use the following command under Windows:
rsync.exe  -v -rlt -z --delete -y --delete-excluded 
--partial-dir=/.rsync-partial --iconv=utf-8,utf-8 
/cygdrive/D/Data_Tier1/ [EMAIL PROTECTED]::virtualdir/Data_Tier1/


I get the following error on files that have Russian Cyrillic letters in
their names:
---
file has vanished: /cygdrive/D/Data_Tier1/Home/xxx/???  
(thousands more entries similar)
---

Am I doing something wrong? I've also tried using:
 --iconv=.

The user I run as definitely has permissions on those files.

Brendan



Re: Thought on large files

2008-01-23 Thread Brendan Grieve




Matt McCutchen wrote:

> On Thu, 2008-01-24 at 13:54 +0900, Brendan Grieve wrote:
> > I had a look at rdiff-backup, but I was trying to get something that
> > spoke native rsync (IE, not to force any change on the client side).
>
> To achieve this, you can have the client push to an rsync daemon and
> then have the daemon call rdiff-backup so that the rdiff-backup part
> happens entirely on the server.  The idea is the same as the
> daemon-and-rsnapshot setup I described in the following message, but
> with rdiff-backup in place of rsnapshot as the backend:
>
> http://lists.samba.org/archive/rsync/2007-December/019470.html

  

Thanks, I do like that idea. 




DO NOT REPLY [Bug 5201] Rsync lets user corrupt dest by applying non-inplace batch in inplace mode

2008-01-23 Thread samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=5201


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |




--- Comment #2 from [EMAIL PROTECTED]  2008-01-23 23:27 CST ---
With --read-batch, when rsync processes the flags, the protocol_version is
always 30 (it has not yet been set from the batch file).  As a result,
--inplace is incorrectly forced off for any pre-protocol-30 batch file,
regardless of whether the batch file was written with --inplace.

Once the above is fixed, I am still hoping that rsync 3 will fail more
gracefully when applying a *pre-protocol-30* non-inplace batch file in inplace
mode.




DO NOT REPLY [Bug 5199] Exclusion of source arg ancestor short-circuits recursion

2008-01-23 Thread samba-bugs
https://bugzilla.samba.org/show_bug.cgi?id=5199





--- Comment #2 from [EMAIL PROTECTED]  2008-01-23 23:33 CST ---
IMHO, the old behavior of excluding just the implied dir from the file-list was
useful, not weird...but anyway the important problem is fixed.

