On Wed, Feb 12, 2003 at 01:00:35PM +0100, Thomas Osterried wrote: > I fully agree with jw schultz's first and second issue, to his --delete > assumption and to the the point that lexical order does not matter. > > > This unfortunately does mean that a means of preserving > > initial sequence must be incorporated or the qsort approach > > to finding duplicates would have to be forgotten. > > This could be as simple as running qsort on an array of > > indices to flist->files instead of flist->files itself. > > qsort is a problem and it's clear why. > i just wonder, why in real-life it always woked fine for me ;) > > > > Another concern I have about this fix in 2.5.6 is that there is risk > > > the change is not backward compatible with earlier protocol versions. > > > The file list is sent (unsorted and uncleaned) from the sender to the > > > receiver, and each side then sorts and cleans the list. > > > Remember that the files are referred to as an > > > integer index into the sorted file list, and the receiver skips > > > NULL (duplicate) files. > > > I suspect (but haven't checked) that if a 2.5.5 receiver is talking to > > > a 2.5.6 sender then 2.5.5 will send the index for the 3rd file, which > > > will be null_file on 2.5.6. > > FYI, I just ran a test, and indeed, this causes a seg fault. I'm > > working up a patch. > > I initially thought the file deletion is done by the sending site, > before sending the integer index list to the receiving rsync.. The > problem is more complex than it initially seemed. > > rsync2.5.5 to 2.5.6 could generally request either the last _or_ last-1 > file, depending on the files in the tree [last-1 for e.g., when given 4 > source trees with a missing correspondending file in tree2]. > > > This also means that if we change which file gets removed (switching > > back to leaving the last name) we'd need to bump the protocol number > > and add some compatibility code for older versions. Ick. > > Since there are some problems with this issue, an increased pid could be > a good idea.
OK. I think i might have an approach. My idea is to use a slightly different routine for comparisons by qsort(). This would never treat two files as equal. As a fall-back use their position in the array as a basis for comparison. This will preserve the command-line order of otherwise equal files. Because this is a small adjustment in the qsort the duplicate selection will be non-deterministic on a qsort version mismatch. I've attached a patch i _think_ will do this. It includes code to do fallback for deleted files based on the assumption that modtime will not be NULL if the file exists. I have confirmed it compiles and make check doesn't fail. PS. inlining file_compare() in qsort_file_compare would be a good idea. -- ________________________________________________________________ J.W. Schultz Pegasystems Technologies email address: [EMAIL PROTECTED] Remember Cernan and Schmitt
Index: flist.c =================================================================== RCS file: /cvsroot/rsync/flist.c,v retrieving revision 1.130 diff -u -b -r1.130 flist.c --- flist.c 12 Feb 2003 09:15:23 -0000 1.130 +++ flist.c 12 Feb 2003 21:40:50 -0000 @@ -1148,6 +1148,13 @@ return u_strcmp(f_name(*f1), f_name(*f2)); } +int qsort_file_compare(struct file_struct **f1,struct file_struct **f2) +{ + int rv; + rv = file_compare(f1, f2); + if (!rv) rv = (*f1 < *f2) ? -1 : 1; + return rv; +} int flist_find(struct file_list *flist, struct file_struct *f) { @@ -1249,17 +1256,19 @@ static void clean_flist(struct file_list *flist, int strip_root, int no_dups) { int i; + int remove_i, prev_i; char *name, *prev_name = NULL; if (!flist || flist->count == 0) return; qsort(flist->files, flist->count, - sizeof(flist->files[0]), (int (*)()) file_compare); + sizeof(flist->files[0]), (int (*)()) qsort_file_compare); for (i = no_dups? 0 : flist->count; i < flist->count; i++) { if (flist->files[i]->basename) { prev_name = f_name(flist->files[i]); + prev_i = i; break; } } @@ -1273,14 +1282,23 @@ "removing duplicate name %s from file list %d\n", name, i); } + if (flist->files[i]->modtime) + { + remove_i = prev_i; + prev_i = i; + } else { + remove_i = i; + } /* it's not great that the flist knows the semantics of * the file memory usage, but i'd rather not add a flag * byte to that struct. * XXX can i use a bit in the flags field? */ if (flist->string_area) - flist->files[i][0] = null_file; + flist->files[remove_i][0] = null_file; else - free_file(flist->files[i]); + free_file(flist->files[remove_i]); + } else { + prev_i = i; } prev_name = name; }
-- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html