On Wed, Feb 12, 2003 at 01:00:35PM +0100, Thomas Osterried wrote:
> I fully agree with jw schultz's first and second issue, to his --delete
> assumption and to the the point that lexical order does not matter.
> 
> > This unfortunately does mean that a means of preserving
> > initial sequence must be incorporated or the qsort approach
> > to finding duplicates would have to be forgotten.
> > This could be as simple as running qsort on an array of
> > indices to flist->files instead of flist->files itself.
> 
> qsort is a problem and it's clear why.
> i just wonder, why in real-life it always woked fine for me ;)
> 
> > > Another concern I have about this fix in 2.5.6 is that there is risk
> > > the change is not backward compatible with earlier protocol versions.
> > > The file list is sent (unsorted and uncleaned) from the sender to the
> > > receiver, and each side then sorts and cleans the list.
> > > Remember that the files are referred to as an
> > > integer index into the sorted file list, and the receiver skips
> > > NULL (duplicate) files.
> > > I suspect (but haven't checked) that if a 2.5.5 receiver is talking to
> > > a 2.5.6 sender then 2.5.5 will send the index for the 3rd file, which
> > > will be null_file on 2.5.6.
> > FYI, I just ran a test, and indeed, this causes a seg fault.  I'm
> > working up a patch.
> 
> I initially thought the file deletion is done by the sending site,
> before sending the integer index list to the receiving rsync.. The
> problem is more complex than it initially seemed.
> 
> rsync2.5.5 to 2.5.6 could generally request either the last _or_ last-1
> file, depending on the files in the tree [last-1 for e.g., when given 4
> source trees with a missing correspondending file in tree2].
> 
> > This also means that if we change which file gets removed (switching
> > back to leaving the last name) we'd need to bump the protocol number
> > and add some compatibility code for older versions.  Ick.
> 
> Since there are some problems with this issue, an increased pid could be
> a good idea.

OK.  I think i might have an approach.  My idea is to use a
slightly different routine for comparisons by qsort().  This
would never treat two files as equal.  As a fall-back use
their position in the array as a basis for comparison.  This
will preserve the command-line order of otherwise equal
files.

Because this is a small adjustment in the qsort the
duplicate selection will be non-deterministic on a qsort
version mismatch.

I've attached a patch i _think_ will do this.  It includes
code to do fallback for deleted files based on the
assumption that modtime will not be NULL if the file exists.
I have confirmed it compiles and make check doesn't fail.

PS. inlining file_compare() in qsort_file_compare would be a
good idea.

-- 
________________________________________________________________
        J.W. Schultz            Pegasystems Technologies
        email address:          [EMAIL PROTECTED]

                Remember Cernan and Schmitt
Index: flist.c
===================================================================
RCS file: /cvsroot/rsync/flist.c,v
retrieving revision 1.130
diff -u -b -r1.130 flist.c
--- flist.c     12 Feb 2003 09:15:23 -0000      1.130
+++ flist.c     12 Feb 2003 21:40:50 -0000
@@ -1148,6 +1148,13 @@
        return u_strcmp(f_name(*f1), f_name(*f2));
 }
 
+int qsort_file_compare(struct file_struct **f1,struct file_struct **f2)
+{
+       int rv;
+       rv = file_compare(f1, f2);
+       if (!rv) rv = (*f1 < *f2) ? -1 : 1;
+       return rv;
+}
 
 int flist_find(struct file_list *flist, struct file_struct *f)
 {
@@ -1249,17 +1256,19 @@
 static void clean_flist(struct file_list *flist, int strip_root, int no_dups)
 {
        int i;
+       int remove_i, prev_i;
        char *name, *prev_name = NULL;
 
        if (!flist || flist->count == 0)
                return;
 
        qsort(flist->files, flist->count,
-             sizeof(flist->files[0]), (int (*)()) file_compare);
+             sizeof(flist->files[0]), (int (*)()) qsort_file_compare);
 
        for (i = no_dups? 0 : flist->count; i < flist->count; i++) {
                if (flist->files[i]->basename) {
                        prev_name = f_name(flist->files[i]);
+                       prev_i = i;
                        break;
                }
        }
@@ -1273,14 +1282,23 @@
                                        "removing duplicate name %s from file list 
%d\n",
                                        name, i);
                        }
+                       if (flist->files[i]->modtime)
+                       {
+                               remove_i = prev_i;
+                               prev_i = i;
+                       } else {
+                               remove_i = i;
+                       }
                        /* it's not great that the flist knows the semantics of
                         * the file memory usage, but i'd rather not add a flag
                         * byte to that struct.
                         * XXX can i use a bit in the flags field? */
                        if (flist->string_area)
-                               flist->files[i][0] = null_file;
+                               flist->files[remove_i][0] = null_file;
                        else
-                               free_file(flist->files[i]);
+                               free_file(flist->files[remove_i]);
+               } else {
+                       prev_i = i;
                }
                prev_name = name;
        }
-- 
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html

Reply via email to