Re: Feature Concept: enable iCloud Drive with rsync

2023-09-12 Thread Kevin Korb via rsync
Is this being accessed via a fuse mount?  If so it seems like that is 
where this kind of feature should be implemented (like a mount option to 
decide how to handle such files).  Rsync shouldn't need special features 
to deal with every kind of file storage.


On 9/12/23 05:22, Brian "bex" Exelbierd via rsync wrote:

Hi,

I have also posted this on GitHub but it isn’t clear that was the right 
place: https://github.com/WayneD/rsync/issues/522


iCloud Drive will evict files that are unused or when additional space 
is needed on the local drive. The evicted files are replaced by 
"bookmark" files that allow macOS to continue to report the files in the 
file system as though they were actually present. The files are 
downloaded again either on request or when needed.


rsync, like all similar tools I can find, doesn't have any way of 
handling these evicted files. I have been thinking about this and I 
think I know how to make it work.


The short explanation is that iCloud Drive preserves access to the 
required metadata. Here is an abbreviated output from `mdls` of an 
evicted file:


```
% mdls .1998-tax-return.pdf.icloud
kMDItemContentCreationDate = 2018-08-28 18:32:24 +
kMDItemContentCreationDate_Ranking = 2021-11-05 00:00:00 +
kMDItemContentModificationDate = 2018-08-28 18:32:24 +
kMDItemDateAdded = 2021-11-05 15:42:15 +
kMDItemDisplayName = "1998-tax-return.pdf"
kMDItemFSContentChangeDate = 2018-08-28 18:32:24 +
kMDItemFSCreationDate = 2018-08-28 18:32:24 +
kMDItemFSSize = 1929932
kMDItemInterestingDate_Ranking = 2018-08-28 00:00:00 +
kMDItemLogicalSize = 1929932
kMDItemPhysicalSize = 1929932
```

This metadata, I think, is enough to pass the rsync quick check as 
described in the man page. Therefore, I suspect what needs to happen, 
from a code perspective, is that rsync needs to be modified to do the 
following when it finds an evicted file. All evicted files are named 
consistently, `.<filename>.iCloud`, so they are easy to spot.


1. Perform the system call equivalent of the mdls above to obtain the 
appropriate dates and sizes.
2. If the file cannot be skipped, because it has either changed or a 
checksum is required, perform the system call equivalent of 
`brctl download <filename>` to get the file downloaded.
3. Rsync the file.
4. If the file was downloaded by rsync, perform the system call 
equivalent of `brctl evict <filename>` to evict it again, leaving the 
system in the same state as before.


This simplistic algorithm would leave some open issues/caveats:
- It is likely that rsync can only be run one-way, from iCloud Drive 
to non-iCloud Drive data storage. While it is possible it could be run 
two-way, research is needed on whether you’d have to download the old 
file before you replace it.
- Timeouts may happen. iCloud Drive still gets stuck sometimes, and even 
downloads that don’t get stuck take a non-zero amount of time.
- There is no guarantee there is ever enough room on disk to hold a 
specific file from iCloud Drive. This is likely resolved by macOS 
directly; however, this may take excessive time.
- If you thought running with checksums was slow before, strap in, 
because those downloads are going to add up.


All that said, I think this would accomplish a high percentage of what 
users are looking for.


Is this kind of a patch welcome? Is the plan sound? I haven't written C 
since college (in the 90s!) and would be open to advice/mentoring.


Thank you.

regards,

bex



--
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
Kevin Korb  Phone:(407) 252-6853
Systems Administrator   Internet:
FutureQuest, Inc.   ke...@futurequest.net  (work)
Orlando, Florida    k...@sanitarium.net (personal)
Web page:   https://sanitarium.net/
PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Feature Concept: enable iCloud Drive with rsync

2023-09-12 Thread Brian "bex" Exelbierd via rsync
Hi,

I have also posted this on GitHub but it isn’t clear that was the right place: 
https://github.com/WayneD/rsync/issues/522

iCloud Drive will evict files that are unused or when additional space is 
needed on the local drive. The evicted files are replaced by "bookmark" files 
that allow macOS to continue to report the files in the file system as though 
they were actually present. The files are downloaded again either on request or 
when needed.

rsync, like all similar tools I can find, doesn't have any way of handling 
these evicted files. I have been thinking about this and I think I know how to 
make it work.

The short explanation is that iCloud Drive preserves access to the required 
metadata. Here is an abbreviated output from `mdls` of an evicted file:

```
% mdls .1998-tax-return.pdf.icloud
kMDItemContentCreationDate = 2018-08-28 18:32:24 +
kMDItemContentCreationDate_Ranking = 2021-11-05 00:00:00 +
kMDItemContentModificationDate = 2018-08-28 18:32:24 +
kMDItemDateAdded = 2021-11-05 15:42:15 +
kMDItemDisplayName = "1998-tax-return.pdf"
kMDItemFSContentChangeDate = 2018-08-28 18:32:24 +
kMDItemFSCreationDate = 2018-08-28 18:32:24 +
kMDItemFSSize = 1929932
kMDItemInterestingDate_Ranking = 2018-08-28 00:00:00 +
kMDItemLogicalSize = 1929932
kMDItemPhysicalSize = 1929932
```

This metadata, I think, is enough to pass the rsync quick check as described in 
the man page. Therefore, I suspect what needs to happen, from a code 
perspective, is that rsync needs to be modified to do the following when it 
finds an evicted file. All evicted files are named consistently, 
`.<filename>.iCloud`, so they are easy to spot.

1. Perform the system call equivalent of the mdls above to obtain the 
appropriate dates and sizes.
2. If the file cannot be skipped, because it has either changed or a checksum 
is required, perform the system call equivalent of `brctl download <filename>` 
to get the file downloaded.
3. Rsync the file.
4. If the file was downloaded by rsync, perform the system call equivalent of 
`brctl evict <filename>` to evict it again, leaving the system in the same 
state as before.
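
As a rough illustration of steps 1 through 4, here is a minimal sketch in Python rather than rsync's C. It is not a tested implementation. The helper names (`placeholder_target`, `quick_check_differs`, `sync_evicted`) are hypothetical, and the `.<name>.icloud` pattern is taken from the `mdls` example above. Passing the real file path to `brctl download` and `brctl evict` is an assumption, as is the download being complete when `brctl download` returns. The size and modification time are assumed to have been read from the Spotlight metadata already.

```python
import os
import subprocess


def placeholder_target(name):
    """Return the real file name behind an evicted placeholder, else None.

    Assumes the '.<name>.icloud' naming seen in the mdls example.
    """
    suffix = ".icloud"
    if (name.startswith(".") and name.lower().endswith(suffix)
            and len(name) > len("." + suffix)):
        return name[1:-len(suffix)]
    return None


def quick_check_differs(src_size, src_mtime, dst_size, dst_mtime):
    """Quick check: the file needs a transfer if size or mtime differ.

    Simplification: mtimes are compared to whole seconds.
    """
    return src_size != dst_size or int(src_mtime) != int(dst_mtime)


def sync_evicted(placeholder_path, meta, dest_stat, transfer,
                 run=subprocess.run, checksum=False):
    """Handle one evicted file.

    meta      -- (size, mtime) obtained from the metadata (step 1)
    dest_stat -- (size, mtime) of the destination file, or None if absent
    transfer  -- callable that copies the real file (step 3)
    """
    directory, name = os.path.split(placeholder_path)
    target = placeholder_target(name)
    if target is None:
        raise ValueError("not an evicted placeholder: " + placeholder_path)

    # Skip without downloading when the quick check passes.
    if (dest_stat is not None and not checksum
            and not quick_check_differs(*meta, *dest_stat)):
        return "skipped"

    real_path = os.path.join(directory, target)
    # Step 2. Assumption: brctl accepts the real path, not the placeholder.
    run(["brctl", "download", real_path], check=True)
    try:
        transfer(real_path)  # step 3
    finally:
        # Step 4: evict again so the system is left as it was found.
        run(["brctl", "evict", real_path], check=True)
    return "transferred"
```

The `transfer` callback stands in for whatever rsync does with an ordinary file, and `run` is a parameter only so the logic can be exercised without macOS.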

This simplistic algorithm would leave some open issues/caveats:
- It is likely that rsync can only be run one-way, from iCloud Drive to 
non-iCloud Drive data storage. While it is possible it could be run two-way, 
research is needed on whether you’d have to download the old file before you 
replace it.
- Timeouts may happen. iCloud Drive still gets stuck sometimes, and even 
downloads that don’t get stuck take a non-zero amount of time.
- There is no guarantee there is ever enough room on disk to hold a specific 
file from iCloud Drive. This is likely resolved by macOS directly; however, 
this may take excessive time.
- If you thought running with checksums was slow before, strap in, because 
those downloads are going to add up.
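
For the timeout caveat, one possible approach is to poll for the downloaded file against a deadline and give up if it never arrives. This is a minimal sketch, assuming the real file appears at its normal path once the download finishes, which would need verifying. `wait_for_download` and its parameters are hypothetical names.

```python
import os
import time


def wait_for_download(real_path, timeout=300.0, poll=1.0,
                      exists=os.path.exists,
                      clock=time.monotonic, sleep=time.sleep):
    """Wait until real_path exists or the timeout expires.

    Returns True if the file showed up in time, False on timeout.
    exists, clock and sleep are parameters so this can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if exists(real_path):
            return True
        if clock() >= deadline:
            return False
        sleep(poll)
```

On a timeout the caller could skip the file and report it, rather than stalling the whole run on one stuck download.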

All that said, I think this would accomplish a high percentage of what users 
are looking for.

Is this kind of a patch welcome? Is the plan sound? I haven't written C since 
college (in the 90s!) and would be open to advice/mentoring.

Thank you.

regards,

bex