On Fri, Mar 12, 2004 at 09:41:03AM -0500, Wesley D Craig wrote:
> I'd be happy to work on this myself, since I already have very similar
> code, if there was some possibility that the rsync maintainers would be
> willing to accept the modifications.  Otherwise, it seems like a waste
> of effort.

I think the maintainers have to err on the side of caution regarding changes to the application, and especially to the protocol. If we can find a way to handle metadata in general, and then implement the specific needs for HFS+ (instead of the other way around, which has been my approach so far), then I think they would see more value in making changes.

The problem, as I see it (usual disclaimers apply), is that rsync depends in many places on a very strict sourcefile-to-destinationfile mapping. To make the sourcefile "virtual", i.e. created on the fly, would require deep restructuring (and extra memory). The easier way out is to create a new sourcefile in the FS from the metadata and add it to the flist, but that requires local disk space and, unless the protocol is updated, requires you to trick the sort routine.

The most straightforward and plausible idea I can think of is to update the protocol to include explicit file-IDs (instead of implicit offsets in the sorted flist), and add a "capabilities" header to the protocol version, e.g.:

    rsync protocol 28
        (means file-IDs (can be) explicit, and should exchange local capabilities)
    capabilities: (subset of)
        read_ufs
        store_ufs
        read_hfs+metadata
        store_hfs+metadata
        read_ntfs-streams
        store_ntfs-streams

The sender and receiver can figure out if they'll be able to accomplish what they need to by exchanging this info. (I'm not sure if rsync "dumbs-down" for lower protocol revs.)

If the receiver's protocol is less than 28, then the sender has to decide if it can send all the data it wants to, and whether to send what it can (or bail) if not.

If the receiver's protocol is >=28, the sender (can) check the receiver capabilities to see if it should just stream the whole HFS+ file (a la rsync_hfs), or convert to a common-capability format first (AppleSingle, MacBinary), or elect to split the file out into multiple streams (AppleDouble, Apple's newer ._<filename> scheme, etc).

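As a rough illustration of the sender-side decision described above, here is a minimal Python sketch. It is not rsync code: the function name, the strategy labels, the allow_partial flag, and the choice to prefer split streams over a single converted file are all assumptions made up for this example. Only the protocol number 28 and the capability names come from the proposal itself.

```python
# Hypothetical sketch only: none of these names exist in rsync.
CAPS_PROTOCOL = 28  # proposed first protocol version that exchanges capabilities


def choose_hfs_transfer(receiver_protocol, receiver_caps, allow_partial=False):
    """Pick how a sender might transmit an HFS+ file that carries metadata.

    receiver_caps is the set of capability strings the receiver advertised
    (empty for receivers that predate the capabilities exchange).
    """
    if receiver_protocol < CAPS_PROTOCOL:
        # Old receiver: no capabilities header, so metadata cannot be
        # negotiated. Either send what we can (data only) or give up.
        return "data-only" if allow_partial else "bail"

    if "store_hfs+metadata" in receiver_caps:
        # Receiver understands HFS+ natively: stream the whole file.
        return "native-stream"

    if "store_ufs" in receiver_caps:
        # Assumed policy: fall back to separate plain files for the extra
        # streams (AppleDouble style). Converting to a single container
        # file (AppleSingle, MacBinary) would be the other option.
        return "split-streams"

    return "bail"
```

For example, a protocol-28 receiver advertising only store_ufs would get "split-streams", while a protocol-27 receiver gets "bail" unless the user has opted into a partial transfer.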
Just getting explicit file-IDs into the protocol would greatly increase flexibility, if the capabilities idea is too fraught.

Rsync was written with UFS and similar file systems deeply in mind, and then optimized (ad absurdum -- if two files have the same byte-count, the protocol omits sending that piece of duplicate information for the second file) in the interest of keeping network traffic to the ABSOLUTE minimum. But rsync has become the best tool I know of to move files between any two machines. I think metadata awareness and capabilities would be a great addition. Filesystem designers should make the metadata more accessible*, but all modern filesystems have metadata, and I think it will only get more important as time goes on.

* Maybe the POSIX open() should open as a full-metadata stream so that /bin/cp and such will just *work*. Another open function could be used by the OS when it only wants part of the file (data fork), etc. Or flags on open(), but that would break POSIX compliance. The OS could fake it out, but it might not be worth the developer confusion. Anyway, so far filesystem/OS designers aren't doing this, so here we are.

Andrew