The data portion will not be trivial, especially if you are using a
compressed pool.  The problem lies in the fact that IO arrives as a
series of reads of a given size at a given offset.  This means the
FUSE driver will receive a series of calls like:

read("FILEA",     0, 8192)
read("FILEA",  8192, 8192)
read("FILEA", 16384, 8192)
etc.

The problem is that reading the data via the BackupPC API gives you a
file handle to a data stream that begins at offset 0.  So if you try
to design the driver stateless (which is best in terms of resource
usage for metadata and overall stability), you have to open a fresh
handle, skip forward to the requested position, and then read the
requested block of data.  Setting up a handle on every IO like this is
very slow and wasteful of resources.  This is true even though, in
practice, IO generally either reads just the head of the file (such as
a file-typing request like "file FILEA") or reads the file
sequentially from beginning to end.
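To make the cost concrete, here is a minimal Python sketch of the
stateless approach.  The pool and open_backup_stream() are hypothetical
stand-ins for the real BackupPC API (which would decompress from the
pool); the point is only that a sequential-only stream forces every
read to re-consume all bytes before the requested offset:

import io

# Hypothetical stand-in for the BackupPC API: every open returns a
# fresh stream positioned at offset 0 (simulated with an in-memory file).
POOL = {"FILEA": b"x" * 32768}

def open_backup_stream(path):
    return io.BytesIO(POOL[path])  # the real API would decompress the pool file

def stateless_read(path, offset, size):
    # A fresh handle per call: we must skip 'offset' bytes every time,
    # re-reading (and, with a compressed pool, re-decompressing) the
    # head of the file on each IO.
    stream = open_backup_stream(path)
    remaining = offset
    while remaining > 0:
        chunk = stream.read(min(remaining, 8192))
        if not chunk:
            break
        remaining -= len(chunk)
    data = stream.read(size)
    stream.close()
    return data

# The kernel's sequential read pattern: each call pays for all prior bytes.
blocks = [stateless_read("FILEA", off, 8192) for off in (0, 8192, 16384)]

For a file read end to end in N blocks, the stateless driver reads
roughly N^2/2 blocks' worth of data from the pool, which is where the
slowness comes from.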

If you make the driver stateful, you could cache an open file handle,
which would serve sequential reads well but brings in headaches such
as how long to cache a handle before letting it time out and be
cleaned up.  The documentation around the open, read, and close calls
that are made isn't very clear, and you can have multiple overlapping
opens and closes for any given file (this supports different processes
holding independent file handles for the same file).
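The stateful variant might look something like the sketch below (again
with a hypothetical open_backup_stream() standing in for the real API).
It caches one stream position per path, reuses it while reads stay
sequential, and expires idle handles after a timeout -- which is
exactly the cleanup policy question raised above:

import io
import time

POOL = {"FILEA": b"y" * 32768}

def open_backup_stream(path):
    return io.BytesIO(POOL[path])  # stand-in for the real sequential-only stream

class HandleCache:
    """Cache one (stream, position) per path, expiring idle entries."""

    def __init__(self, timeout=30.0):
        self.timeout = timeout
        self.entries = {}  # path -> [stream, next_offset, last_used]

    def read(self, path, offset, size):
        now = time.monotonic()
        # Drop handles that have been idle longer than the timeout.
        stale = [p for p, e in self.entries.items() if now - e[2] > self.timeout]
        for p in stale:
            self.entries.pop(p)[0].close()
        entry = self.entries.get(path)
        # Reuse the cached stream only if this read continues sequentially;
        # otherwise fall back to a fresh handle (the stream cannot seek back).
        if entry is None or entry[1] != offset:
            if entry is not None:
                entry[0].close()
            stream = open_backup_stream(path)
            stream.read(offset)  # skip to position on the sequential stream
            entry = self.entries[path] = [stream, offset, now]
        data = entry[0].read(size)
        entry[1] += len(data)
        entry[2] = now
        return data

cache = HandleCache(timeout=30.0)
parts = [cache.read("FILEA", off, 8192) for off in (0, 8192, 16384, 24576)]

Note this simple version keeps only one cached position per path, so
two processes interleaving reads on the same file would defeat the
cache; handling truly independent per-open handles would need the
cache keyed on the FUSE file-handle rather than the path.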

All of this raises the question of whether it's better to fit a FUSE
interface onto the current pool/catalog functionality, or to look at
rewriting the whole pool/catalog functionality with deduplication
built directly into the FUSE filesystem.  That would allow BackupPC to
simply write its files to the filesystem and have the filesystem
handle all of the magic of deduplication and compression.

Jon


On Wed, Jan 28, 2009 at 8:18 PM, Kenneth Porter <[email protected]> wrote:
> I saw discussion about this on the users list and hoped to reawaken
> something here. I just popped up Alex Harrington's implementation and was
> able to navigate directories, but the first file I inspected seemed to
> contain just nulls, so I think the data part of the implementation is
> stubbed. How hard would it be to flesh that out? That would be enough to do
> something like:
>
> diff -rq /mnt/BackupPC/balrog/2361/D /mnt/Balrog-D
>
> (Where Balrog is my Windows host, 2361 is a backup archive number, and D is
> the share being compared.)
>
> What I'd like to do is script the above to run nightly so I can see if
> there are any trouble spots in the backup.
>



-- 
Jonathan Craig

------------------------------------------------------------------------------
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
_______________________________________________
BackupPC-devel mailing list
[email protected]
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-devel
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
