Re: TODO hardlink performance optimizations
On Tue, 6 Jan 2004 15:11:54 -0800, jw schultz [EMAIL PROTECTED] wrote: On Tue, Jan 06, 2004 at 02:04:19AM -0600, John Van Essen wrote: On Mon, 5 Jan 2004, jw schultz [EMAIL PROTECTED] wrote: [snip]

    union links {
        struct idev {
            INO64_T inode;
            DEV64_T dev;
        }
        struct hlinks {
            int count;
            file_struct **link_list;
        }
    }

[snip] Now that you have opened the door for re-using the inode and dev area in file_struct, I can make my initial suggestion that requires two 'additional' pointers in file_struct. I didn't suggest it before because I didn't want to add to the struct. I propose to reuse the DEV and INODE areas to store two new file_struct pointers (the 2nd part of the union). These would be set during the post-qsort phase during init_hard_links. The first pointer links together the file_structs for each hardlink group in a list in the same order as in the qsort so they can be walked by your proposed link-dest method. The second pointer points to the head (first file_struct of a hardlink group). For each file_struct that is modified in this way a flag bit needs to be set indicating so. No it doesn't. After we walk the qsorted hlink_list all file_struct->links pointers either point to a hlinks or are NULL. If there are no links file_struct->links is NULL. If you qsort the entire list without first filtering out any entries (e.g. !IS_REG), then yes, that would be true. Then, if the file_struct address equals its head pointer, you are at the head of a hardlink list, which can be processed using the new link-dest method that you outlined earlier. So you are proposing a singly linked list:

    struct hlinks {
        file_struct *head;
        file_struct *next;
    }

Yes. That would work too. With the pointers into the array you compare the first in the array with yourself instead of head. I preferred knowing in advance how many links there were. But there is no need to know exactly how many links, is there? I'm still not clear on what exactly will happen during that new processing method. 
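John's proposed slot reuse could look roughly like this (a sketch only; the type names and member layout are assumptions, not rsync's actual declarations). Before init_hard_links() the slot holds the dev/inode pair used for sorting; afterwards it is overwritten with the group pointers:

```c
#include <stdint.h>

typedef uint64_t INO64_T;   /* stand-ins for rsync's 64-bit types */
typedef uint64_t DEV64_T;
typedef struct file_struct file_struct;

union links {
    struct idev {                 /* valid while building and qsorting */
        INO64_T inode;
        DEV64_T dev;
    } idev;
    struct hlinks {               /* valid after the post-qsort walk */
        file_struct *head;        /* first file_struct of the group */
        file_struct *next;        /* next group member, in qsort order */
    } hlinks;
};
```

Since the two halves are never live at the same time, the union costs no more than the larger of the two members.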
I assume that if the head file does not exist, it will find any existent file and hardlink it to the head file, and if not found, will transfer the head file. Is that correct? Not exactly. Like my earlier post if the head (your term) doesn't exist (existence including compare_dest) we iterate over the link set and the first that does exist is used for fnamecmp. But that is phase two or three. Let me put it another way... Using your proposed method, after the 'head' file is processed, will it now exist on the receiver so that its siblings can be hardlinked to it? Processing of non-head files (file_struct address not equal to head pointer) can be skipped since they are hardlinked later. The final walk-through that creates the hardlinks for the non-head files can walk the qsorted list and use the head pointer as the target for the hardlinks of the non-head files. Actually, if we could do the hardlinks of non-head files as we encounter them while walking the file list and used your singly linked lists we could free the hlink_list after traversal. The problem is that we would need to be sure the head had completed processing so the hardlink would have to be created in receiver and that gets ugly. Right. You've just explained the heretofore unknown reason why that sibling hlinking was being done in a separate, final phase. If we keep it that way, I'd like to see a comment added to explain what you just explained. But here's an idea, still using my proposed head/next struct...
- Make *next a circular list (the last points to the first).
- Leave *head NULL initially.
- When a file of a hlink group is examined, and *head is NULL, then it is the first one of that group to be processed, and it becomes the head. All the *head pointers are set to its file_struct address.
- Subsequent processing of siblings can immediately hardlink them to the head file.
The drawback is that this will invoke CopyOnWrite (as discussed earlier in this thread). 
To avoid that, *head would have to point outside to a group head pointer, which would then be set. So you'd need an array of pointers of the same size as the number of distinct hardlink groups. Say! We already have the sorted hlink_list. We could just point to the first element of the group (setting it to NULL initially) after creating the circularly linked list. One possibility would be to keep the trailing walk-through but reduce the hlink_list to an array of heads. Keep the array and use just the heads, as per above. No extra memory required beyond that already required for the qsorted pointer list. No binary search required during the file processing phase. The key is getting
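The head-election scheme discussed above can be sketched in miniature (all names here are hypothetical; the real file_struct carries far more state): *next is circular through the hard-link group, *head starts NULL, and the first member processed becomes head for every member.

```c
#include <stddef.h>

/* Miniature model of the circular-list proposal. */
struct file {
    struct file *head;   /* NULL until the group has been visited */
    struct file *next;   /* circular list through the link group */
};

/* First visit to any member of a group elects it as head for the
 * whole group; later visits just return the existing head. */
static struct file *elect_head(struct file *f)
{
    if (f->head == NULL) {
        struct file *p = f;
        do {                      /* walk the circle exactly once */
            p->head = f;
            p = p->next;
        } while (p != f);
    }
    return f->head;               /* hardlink target for siblings */
}
```

This is the part of the idea that avoids both the flag bit and the extra array: whichever member is encountered first becomes the link target, and every sibling can find it in O(1).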
Re: TODO hardlink performance optimizations
On Tue, 6 Jan 2004 22:33:06 -0800, Wayne Davison [EMAIL PROTECTED] wrote: I'd suggest also changing the last line of the function:

    -return file_compare(f1, f2);
    +return file_compare(f1p, f2p);

This is because the old way asks the compiler to take the address of f1 and f2, thus forcing them to become real stack variables. Changing the code to use the passed-in f1p and f2p allows the compiler to leave both f1 and f2 as registers (if possible). Good point, but I have an even better suggestion, now that I finally understand the nuts and bolts of all the hlink.c code. The file_compare() is invoked when the dev and inode values match in order to present a consistent sorting order during the sort. There is no compelling reason to have the hlink list be sorted alphabetically. It just has to sort consistently. So the final comparison can be done on the addresses of the file_structs, since they are not moved around and will remain constant:

    return ( ( f1 < f2 ) ? -1 : ( f1 > f2 ) );

(Unsure if the code is right, but you get my drift.) For filesets with many hardlinks, this will use less CPU time. -- John Van Essen Univ of MN Alumnus [EMAIL PROTECTED] -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
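The address-based tie-break John suggests could be written as follows (a guess at the intended expression, since the post itself says the exact code is uncertain; the pointers stand in for file_struct addresses that all come from one list):

```c
/* Cheap, stable tie-break on the addresses of two entries.  Only
 * meaningful when both pointers come from the same allocation, as
 * the file_struct pointers in hlink_list do. */
static int addr_compare(const void *f1, const void *f2)
{
    if (f1 == f2)
        return 0;
    return f1 < f2 ? -1 : 1;
}
```

Because the addresses never change during the sort, this still yields the consistent total order that qsort() requires, without the string comparisons of file_compare().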
Re: TODO hardlink performance optimizations
On Wed, Jan 07, 2004 at 01:04:34AM -0600, John Van Essen wrote: On Tue, 6 Jan 2004 15:11:54 -0800, jw schultz [EMAIL PROTECTED] wrote: On Tue, Jan 06, 2004 at 02:04:19AM -0600, John Van Essen wrote: On Mon, 5 Jan 2004, jw schultz [EMAIL PROTECTED] wrote: [snip]

    union links {
        struct idev {
            INO64_T inode;
            DEV64_T dev;
        }
        struct hlinks {
            int count;
            file_struct **link_list;
        }
    }

[snip] Now that you have opened the door for re-using the inode and dev area in file_struct, I can make my initial suggestion that requires two 'additional' pointers in file_struct. I didn't suggest it before because I didn't want to add to the struct. I propose to reuse the DEV and INODE areas to store two new file_struct pointers (the 2nd part of the union). These would be set during the post-qsort phase during init_hard_links. The first pointer links together the file_structs for each hardlink group in a list in the same order as in the qsort so they can be walked by your proposed link-dest method. The second pointer points to the head (first file_struct of a hardlink group). For each file_struct that is modified in this way a flag bit needs to be set indicating so. No it doesn't. After we walk the qsorted hlink_list all file_struct->links pointers either point to a hlinks or are NULL. If there are no links file_struct->links is NULL. If you qsort the entire list without first filtering out any entries (e.g. !IS_REG), then yes, that would be true. Either NULL it during the hlink_list walk or don't populate file_struct->links while building the file list (saving a malloc) for unlinkable files. Then, if the file_struct address equals its head pointer, you are at the head of a hardlink list, which can be processed using the new link-dest method that you outlined earlier. So you are proposing a singly linked list:

    struct hlinks {
        file_struct *head;
        file_struct *next;
    }

Yes. That would work too. With the pointers into the array you compare the first in the array with yourself instead of head. 
I preferred knowing in advance how many links there were. But there is no need to know exactly how many links, is there? Not at this time. I'm still not clear on what exactly will happen during that new processing method. I assume that if the head file does not exist, it will find any existent file and hardlink it to the head file, and if not found, will transfer the head file. Is that correct? Not exactly. Like my earlier post if the head (your term) doesn't exist (existence including compare_dest) we iterate over the link set and the first that does exist is used for fnamecmp. But that is phase two or three. Let me put it another way... Using your proposed method, after the 'head' file is processed, will it now exist on the receiver so that its siblings can be hardlinked to it? Yes. Processing of non-head files (file_struct address not equal to head pointer) can be skipped since they are hardlinked later. The final walk-through that creates the hardlinks for the non-head files can walk the qsorted list and use the head pointer as the target for the hardlinks of the non-head files. Actually, if we could do the hardlinks of non-head files as we encounter them while walking the file list and used your singly linked lists we could free the hlink_list after traversal. The problem is that we would need to be sure the head had completed processing so the hardlink would have to be created in receiver and that gets ugly. Right. You've just explained the heretofore unknown reason why that sibling hlinking was being done in a separate, final phase. If we keep it that way, I'd like to see a comment added to explain what you just explained. But here's an idea, still using my proposed head/next struct...
- Make *next a circular list (the last points to the first).
- Leave *head NULL initially.
- When a file of a hlink group is examined, and *head is NULL, then it is the first one of that group to be processed, and it becomes the head. 
All the *head pointers are set to its file_struct address. - Subsequent processing of siblings can immediately hardlink them to the head file. The drawback is that this will invoke CopyOnWrite (as discussed earlier in this thread). To avoid that, *head would have to point outside to a group head pointer, which would then be set. So you'd need an array of pointers of the same size as the number of distinct hardlink groups. Say! We already have the sorted hlink_list. We could just point to the first element of the group (setting it to NULL initially) after creating the circularly
Re: TODO hardlink performance optimizations
On Wed, Jan 07, 2004 at 01:33:43AM -0600, John Van Essen wrote: On Tue, 6 Jan 2004 22:33:06 -0800, Wayne Davison [EMAIL PROTECTED] wrote: I'd suggest also changing the last line of the function:

    -return file_compare(f1, f2);
    +return file_compare(f1p, f2p);

This is because the old way asks the compiler to take the address of f1 and f2, thus forcing them to become real stack variables. Changing the code to use the passed-in f1p and f2p allows the compiler to leave both f1 and f2 as registers (if possible). Good point, but I have an even better suggestion, now that I finally understand the nuts and bolts of all the hlink.c code. The file_compare() is invoked when the dev and inode values match in order to present a consistent sorting order during the sort. There is no compelling reason to have the hlink list be sorted alphabetically. It just has to sort consistently. So the final comparison can be done on the addresses of the file_structs, since they are not moved around and will remain constant:

    return ( ( f1 < f2 ) ? -1 : ( f1 > f2 ) );

(Unsure if the code is right, but you get my drift.) For filesets with many hardlinks, this will use less CPU time. There may well be good reason for having the link sets subsorted consistently with the file list. See my notes regarding COW, fork and the modification of the link info. -- J.W. Schultz Pegasystems Technologies email address: [EMAIL PROTECTED] Remember Cernan and Schmitt -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
2.6.0 file has vanished fails to set exit code on local client
A new 2.6.0 feature is supposed to use a different exit code when the only 'errors' were from files that disappeared between the building of the file list and the actual transfer of files. But if the client is local and the server is remote, IOERR_VANISHED gets set on the remote server, but is never passed to the local client (the io_error value is passed at the end of the file list, not during or after the file transfer phase). The old scheme used FERROR for the "send_files failed to open" message. The new scheme uses FINFO for the "file has vanished:" message. The client receiver sets log_got_error when it receives a FERROR message from the sender. The old scheme used (io_error || log_got_error) to report a partial transfer (with no alternative of vanished files). The new scheme uses the IOERR_VANISHED flag to distinguish the two errors, and it will never be set in an rsync pull (nor will log_got_error get set if vanished files are the only errors). Hence, the exit code stays 0. Furthermore, if the local client is pre-2.6.0 and the remote server is 2.6.0, the same problem happens, since the only thing pre-2.6.0 keys on is an FERROR message coming from the server during the file transfers. So now it also (incorrectly) exits with a 0 exit code in the case of partial transfers from a 2.6.0 server. So this needs some work...
- On the server, if the client protocol is < 27, use FERROR instead of FINFO so the pre-2.6.0 client can use a RERR_PARTIAL exit code.
- On the client side, it has to somehow recognize the vanished error on the server. It could examine each FINFO message that comes over to see if it begins with "file has vanished:" and set the IOERR_VANISHED flag (but that's pretty kludgy...).
I haven't coded anything pending review of this bug by whoever coded the IOERR_VANISHED feature to verify my analysis. (Wayne?) 
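The admittedly kludgy client-side check floated above might amount to little more than a prefix test on each incoming FINFO line (the message text is taken from the post; how the client would hook this into its message handling is an assumption):

```c
#include <string.h>

/* Returns nonzero if an incoming FINFO line looks like a vanished-
 * file notice from a 2.6.0 server.  The prefix string is the one
 * quoted in the bug report; everything else here is hypothetical. */
static int is_vanished_msg(const char *msg)
{
    static const char prefix[] = "file has vanished: ";
    return strncmp(msg, prefix, sizeof prefix - 1) == 0;
}
```

A client could set IOERR_VANISHED when this matches, though keying on human-readable text is fragile, which is presumably why the post calls it kludgy.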
-- John Van Essen Univ of MN Alumnus [EMAIL PROTECTED] -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: TODO hardlink performance optimizations
On Wed, 7 Jan 2004 01:30:19 -0800, jw schultz [EMAIL PROTECTED] wrote: The steps i see are:
- The hlink_list change to a pointer array (just committed)
- Create the union and change file_struct and the routines that reference and populate it to use the union for dev and inode. This may include not allocating the union for unlinkable files.
- Overwrite the unions with the linked list stuff and change the logic to use them. Also free the unions for unlinked files. (this is the biggest step)
- Reduce the hlink_list to just the heads and change do_hard_links.
- consolidate the fnamecmp finder function for recv_generator() in generator.c and recv_files() in receiver.c
- Add the list walk for heads that don't exist yet.
Each of these is a discrete step that when complete the code will function correctly. Feel free to start coding. :-) Not that I'm lazy... <cough> Oh! Sorry to hear that, I am. The only thing preventing me from saying go ahead is my uncertainty whether we both have the same design. Except for the part about allocating and freeing the union, I'm with ya. For this initial attempt, shall we just leave the union in the file_struct (in place of the DEV / INODE vars)? Maybe move it to the end where it can be conditionally allocated (leaving a short structure when --hard-links is not used). Shall we take the coding details discussion off-list? I imagine that the faithful readers of this mailing list are getting a bit weary reading about this fairly obscure bit of code... -- John Van Essen Univ of MN Alumnus [EMAIL PROTECTED] -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
RE: Problem with many files in rsync server directory ?
Having had a night to sleep on this I think rsync's limit on filename globbing needs pointing out more clearly. I think we need:
1) An entry in the FAQ (Done)
2) A better error message from rsync when it exceeds the limit. Saying:

    rsync: read error: Connection reset by peer
    rsync error: error in rsync protocol data stream (code 12) at io.c(201)

doesn't help many users. Not even programmers with 20 years experience, like me ;-)
3) How about adding a file called LIMITS to the source distribution that tells system administrators and users of the limits that are built in to rsync, and how they can be changed.
4) Or maybe even - horror of horrors - some comments in the source file rsync.h.
Jon -Original Message- From: Wayne Davison [mailto:[EMAIL PROTECTED] Sent: 06 January 2004 17:30 To: Jon Hirst Cc: '[EMAIL PROTECTED]' Subject: Re: Problem with many files in rsync server directory ? On Tue, Jan 06, 2004 at 05:05:16PM -, Jon Hirst wrote: $ rsync [EMAIL PROTECTED]::gsh/* . There's a limit to how many files you can glob with a wildcard. Just remove the wildcard and let rsync transfer the whole directory: rsync [EMAIL PROTECTED]::gsh/ . While you're at it, you should probably add (at least) the -t option, which will preserve timestamps and make future updated copies faster (or just use -a). ..wayne.. -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: Problem with many files in rsync server directory ?
On Wed, Jan 07, 2004 at 10:26:16AM -, Jon Hirst wrote: Having had a night to sleep on this I think rsync's limit on filename globbing needs pointing out more clearly. I think we need:
1) An entry in the FAQ (Done)
2) A better error message from rsync when it exceeds the limit. Saying:

    rsync: read error: Connection reset by peer
    rsync error: error in rsync protocol data stream (code 12) at io.c(201)

doesn't help many users. Not even programmers with 20 years experience, like me ;-)
3) How about adding a file called LIMITS to the source distribution that tells system administrators and users of the limits that are built in to rsync, and how they can be changed.
4) Or maybe even - horror of horrors - some comments in the source file rsync.h.
5) It trumpeted from the mountain tops (and maybe in the documentation somewhere) that using * to get all files in a directory is stupid or ignorant.
a) * and ? are globbed by the shell unless quoted and may produce unexpected behaviour.
b) There are limits to the size of command-lines.
c) filenames with spaces glob badly.
d) The only time the argument globbing is done by rsync is on the daemon, all other times it is done by one shell or another.
I've lost track of the number of times someone has complained on this list because blah/blah/* didn't behave as he expected and the problem went away when he dropped the unnecessary wildcard. -- J.W. Schultz Pegasystems Technologies email address: [EMAIL PROTECTED] Remember Cernan and Schmitt -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
BUG in 2.6.0: make test fails if build dir is not source dir
There is a small bug in the build system of 2.6.0: If the directory you build rsync in differs from the source dir, make test fails:

    $ tar -xzf ~/rsync-2.6.0.tar.gz
    $ mkdir build
    $ cd build
    $ ../rsync-2.6.0/configure
    $ make test
    PASS    unsafe-byname
    PASS    unsafe-links
    ----- wildmatch log follows
    Testing for symlinks using 'test -h'
    + /tmp/bla/build/wildtest
    Unable to open wildtest.txt.
    ----- wildmatch log ends
    FAIL    wildmatch
    ----- overall results:
    14 passed
    1 failed
    3 skipped
    overall result is 1
    make: *** [check] Fehler 1

The problem is in wildtest.c:

    if ((fp = fopen("wildtest.txt", "r")) == NULL) {
        fprintf(stderr, "Unable to open wildtest.txt.\n");
        exit(1);
    }

cu, Stefan -- Stefan Nehlsen | ParlaNet Administration | [EMAIL PROTECTED] | +49 431 988-1260 -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
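One possible direction for a fix is to open the data file relative to the source directory instead of the current directory. This is purely a sketch: it assumes the build system can pass the source directory down to the test (the stock makefile may not do this today), and the helper name is invented.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "<srcdir>/wildtest.txt" so the test
 * finds its data file even when building outside the source tree.
 * Caller frees the returned string. */
static char *srcdir_path(const char *srcdir, const char *name)
{
    size_t len = strlen(srcdir) + strlen(name) + 2;  /* '/' + NUL */
    char *path = malloc(len);
    if (path)
        snprintf(path, len, "%s/%s", srcdir, name);
    return path;
}
```

wildtest.c could then fopen() the result of srcdir_path(SRCDIR, "wildtest.txt") with SRCDIR supplied by configure, rather than relying on the working directory.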
Re: Problem with many files in rsync server directory ?
--On Wednesday, January 07, 2004 03:10:23 -0800 jw schultz [EMAIL PROTECTED] wrote: I've lost track of the number of times someone has complained on this list because blah/blah/* didn't behave as he expected and the problem went away when he dropped the unnecessary wildcard. Hmmm... given the following files:

    foo/a
    foo/b
    foo/c/1

how do you do rsync foo/* bar without globs? Note that this is _not_ recursive. All I can think of is to replace the glob with an exclude, doing rsync -r --exclude='*/*' foo/ bar/, which is an absolutely terrible construct (please recurse - whoops, just kidding!). Hmmm... using bash, you can do rsync --files-from=<(find foo/. -maxdepth 1 ! -type d -printf '%P\n') foo bar/, but that's also wretched. -- Carson -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
RE: Copying hard-linked tree structure
I have a tree structure on one server similar to the following:

    /Current
    /01-04-2003
    /01-03-2003
    etc...

/Current holds the most recent rsynced data, and the date directories are created with cp -al on a daily basis so they are hard-linked. I'm going back 60 days. The question is how can I move this entire structure to a new server and preserve the links from the date directories to the /Current directory? Well, I ended up rsyncing the root directory to the new server with the -H option and it seemed to work. I have 30 directories for 30 days of rotating backups. However, I had a dir called /Current that had 12Gbs and then all the /date directories had 120mb, 60mb, etc...the daily changes that occurred. Well now the directory called /01-01-2004 has 12Gb and /Current has like 100mb. I guess /01-01-2004 went first due to sorting. Anyway to change /Current back as the real directory? Or does it even matter? Thanks, Max -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
RE: Copying hard-linked tree structure
I have a tree structure on one server similar to the following:

    /Current
    /01-04-2003
    /01-03-2003
    etc...

/Current holds the most recent rsynced data, and the date directories are created with cp -al on a daily basis so they are hard-linked. I'm going back 60 days. The question is how can I move this entire structure to a new server and preserve the links from the date directories to the /Current directory? Well, I ended up rsyncing the root directory to the new server with the -H option and it seemed to work. I have 30 directories for 30 days of rotating backups. However, I had a dir called /Current that had 12Gbs and then all the /date directories had 120mb, 60mb, etc...the daily changes that occurred. Well now the directory called /01-01-2004 has 12Gb and /Current has like 100mb. I guess /01-01-2004 went first due to sorting. It has to do with the tool you are using to measure them. Anyway to change /Current back as the real directory? Or does it even matter? What do you mean, real? With hardlinks all links for an inode are equal. I'm using du --max-depth=1 -h on the root dir. The actual file(s) has to be stored in some directory, right? And then the hard links point to this directory. Well they are all pointing to /01-01-2004 instead of /Current. Max -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: Copying hard-linked tree structure
On Wed, Jan 07, 2004 at 03:18:40PM -0600, Max Kipness wrote: I have a tree structure on one server similar to the following:

    /Current
    /01-04-2003
    /01-03-2003
    etc...

/Current holds the most recent rsynced data, and the date directories are created with cp -al on a daily basis so they are hard-linked. I'm going back 60 days. The question is how can I move this entire structure to a new server and preserve the links from the date directories to the /Current directory? Well, I ended up rsyncing the root directory to the new server with the -H option and it seemed to work. I have 30 directories for 30 days of rotating backups. However, I had a dir called /Current that had 12Gbs and then all the /date directories had 120mb, 60mb, etc...the daily changes that occurred. Well now the directory called /01-01-2004 has 12Gb and /Current has like 100mb. I guess /01-01-2004 went first due to sorting. It has to do with the tool you are using to measure them. Anyway to change /Current back as the real directory? Or does it even matter? What do you mean, real? With hardlinks all links for an inode are equal. I'm using du --max-depth=1 -h on the root dir. The actual file(s) has to be stored in some directory, right? And then the hard links point to this directory. Well they are all pointing to /01-01-2004 instead of /Current. Only symlinks point to another directory entry. All hardlinks are equal. The way you are using it du is simply using the directory order to pick which paths to descend first. ls -f should list the directory in the same order that du does. On the source system the directory order will be semi-random if you have been creating and deleting entries for awhile. On the destination they will be in lexical order because that was the creation order by rsync and you haven't mixed that up yet. 
Try this:

    mv 01-01-2004 01-01-2004-a-long-name
    mv 01-01-2004-a-long-name 01-01-2004

Now 01-01-2004 will likely not be the first on the list from ls -f and another directory will likely be held responsible for the space by du. -- J.W. Schultz Pegasystems Technologies email address: [EMAIL PROTECTED] Remember Cernan and Schmitt -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
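jw's point that all hard links are equal peers is easy to check with stat(): both names resolve to the identical inode and link count, so neither is the "real" file and du's attribution is purely a traversal-order artifact. A small POSIX demonstration (the scratch file names are invented for the demo):

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a file, hard-link it, and verify that both directory
 * entries report the same inode and the same link count (>= 2).
 * Returns 1 when the links are confirmed equal, otherwise -1. */
static int links_are_equal_peers(void)
{
    struct stat a, b;
    int ok;
    FILE *fp = fopen("hl_demo_a", "w");
    if (!fp)
        return -1;
    fputs("data\n", fp);
    fclose(fp);
    unlink("hl_demo_b");                  /* clear any stale run */
    if (link("hl_demo_a", "hl_demo_b") != 0)
        return -1;
    if (stat("hl_demo_a", &a) != 0 || stat("hl_demo_b", &b) != 0)
        return -1;
    ok = a.st_ino == b.st_ino && a.st_nlink == b.st_nlink
         && a.st_nlink >= 2;
    unlink("hl_demo_a");                  /* clean up both names */
    unlink("hl_demo_b");
    return ok;
}
```

Since the kernel stores one inode with a reference count, deleting either name leaves the data reachable through the other, which is why the mv trick above merely reshuffles which directory du blames.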
Re: TODO hardlink performance optimizations
On Wed, Jan 07, 2004 at 01:30:19AM -0800, jw schultz wrote: On Wed, Jan 07, 2004 at 02:45:46AM -0600, John Van Essen wrote: The point of this exercise was to find a way to avoid unnecessary transfers of already existing files I thought the point was to reduce the memory footprint and then get rid of the binary search. They are both desirable goals, and I'd like to see one other: a reduction in number of bytes transmitted when sending hard-link data. If we omit the dev/inode data for items that can't be linked together, we should be able to save a large amount of transmission size (but this will require a protocol bump). Of course this does not mean that the new optimized hard-link code would require this optimized sending in order to work. - Create the union and change file_struct and the routines that reference and populate it to use the union for dev and inode. This may include not allocating the union for unlinkable files. I had been considering possible ways to avoid having the extra pointer in the flist_struct, and a suggestion John made has made me think that we can leave it out if we allow the file_struct to be of variable length. We'd set a flag if it has the extra trailing data, and never refer to this data if the flag is not set. - Reduce the hlink_list to just the heads and change do_hard_links. I'm not sure this is worth the cost of copying the bytes, but we'll see. Each of these is a discrete step that when complete the code will function correctly. Yes. Nice plan. If either of you have started coding the next stuff, let me know -- I'm thinking about doing some coding. ..wayne.. -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: TODO hardlink performance optimizations
On Wed, Jan 07, 2004 at 03:25:39PM -0800, Wayne Davison wrote: On Wed, Jan 07, 2004 at 01:30:19AM -0800, jw schultz wrote: On Wed, Jan 07, 2004 at 02:45:46AM -0600, John Van Essen wrote: The point of this exercise was to find a way to avoid unnecessary transfers of already existing files I thought the point was to reduce the memory footprint and then get rid of the binary search. They are both desirable goals, and I'd like to see one other: a reduction in number of bytes transmitted when sending hard-link data. If we omit the dev/inode data for items that can't be linked together, we should be able to save a large amount of transmission size (but That would also require increasing the size of flags so the savings of 8-16 bytes would be offset somewhat by a 1 byte increase. Most likely use 2 bits (SAME_DEV and HAVE_INODE). That would give us 6 bits for future expansion. I'd also want to send for all !IS_DIR and not just IS_REG. Otherwise fixing the failure to preserve links on symlinks, device, fifos and sockets would need yet another protocol bump. this will require a protocol bump). Of course this does not mean that the new optimized hard-link code would require this optimized sending in order to work. - Create the union and change file_struct and the routines that reference and populate it to use the union for dev and inode. This may include not allocating the union for unlinkable files. I had been considering possible ways to avoid having the extra pointer in the flist_struct, and a suggestion John made has made me think that we can leave it out if we allow the file_struct to be of variable length. We'd set a flag if it has the extra trailing data, and never refer to this data if the flag is not set. Runtime variable sized structures should be avoided. Do you want to make rdev, link and sum conditional also? We are replacing two u64 with one pointer that will often be NULL, that should be enough. 
If you wanted i suppose you could make rdev, link and sum a union within file_struct since they are mutually exclusive and dependent on IS_*(mode). That would squeeze another 8 bytes/file with a minimal impact on the code. - Reduce the hlink_list to just the heads and change do_hard_links. I'm not sure this is worth the cost of copying the bytes, but we'll see. The cache lines are hot, it will free usable amounts of memory and it will simplify subsequent logic without complicating the code that walks the hlink_list. Each of these is a discrete step that when complete the code will function correctly. Yes. Nice plan. If either of you have started coding the next stuff, let me know -- I'm thinking about doing some coding. I've not started coding beyond what i've already committed. John seemed eager to start work on this but i'm not sure of his status. Having gotten the design hammered out he seemed to wish to take implementation details off-list, i'm sure he'll be glad to CC you. The transmission reduction above is largely independent of the other code. Q for lurkers: What is the value of dev and inode on systems that don't have them? 0 or -1? -- J.W. Schultz Pegasystems Technologies email address: [EMAIL PROTECTED] Remember Cernan and Schmitt -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
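The rdev/link/sum overlay jw floats could be as simple as the following (member types are guessed from context; rsync's real declarations may differ). Since the three are mutually exclusive by file type, they can share one slot selected by IS_*(mode):

```c
#include <stdint.h>

typedef uint64_t DEV64_T;   /* stand-in for rsync's 64-bit dev type */

/* Sketch of the suggested union: exactly one member is meaningful
 * for any given file, chosen by its mode. */
union type_extras {
    DEV64_T rdev;    /* IS_DEVICE(mode): device numbers */
    char *link;      /* IS_LNK(mode): symlink target */
    char *sum;       /* IS_REG(mode): checksum, when requested */
};
```

The union occupies only the largest member, which is where the estimated 8 bytes/file saving over three separate fields comes from on 64-bit hosts.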
CVS update: rsync
Date: Thu Jan 8 04:56:40 2004 Author: wayned Update of /data/cvs/rsync In directory dp.samba.org:/tmp/cvs-serv18275 Modified Files: proto.h Log Message: The latest prototypes. Revisions: proto.h 1.166 => 1.167 http://www.samba.org/cgi-bin/cvsweb/rsync/proto.h.diff?r1=1.166&r2=1.167 ___ rsync-cvs mailing list [EMAIL PROTECTED] http://lists.samba.org/mailman/listinfo/rsync-cvs