[BackupPC-users] BackupPC and MooseFS?

2011-05-17 Thread Mike

Has anyone tried using BackupPC and MooseFS (http://www.moosefs.org/)?



--
Achieve unprecedented app performance and reliability
What every C/C++ and Fortran developer should know.
Learn how Intel has extended the reach of its next-generation tools
to help boost performance applications - including clusters.
http://p.sf.net/sfu/intel-dev2devmay
___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/


Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-17 Thread Michael Stowe
>
> Has anyone tried using BackupPC and MooseFS (http://www.moosefs.org/)?
>

No, but it seems like a *really* bad idea -- the concept of slow, off-box,
redundant storage isn't a really good fit with the concepts of pooling and
linking.

Since it's fuse-based, there's the possibility it doesn't support
hard-linking at all, which would make it completely unfeasible.  (I don't
know, it's not obvious what it supports from the documentation.)




Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-17 Thread Papp Tamas

On 05/17/2011 06:57 PM, Michael Stowe wrote:
>> Has anyone tried using BackupPC and MooseFS (http://www.moosefs.org/)?
>>
> No, but it seems like a *really* bad idea -- the concept of slow, off-box,
> redundant storage isn't a really good fit with the concepts of pooling and
> linking.
>
> Since it's fuse-based, there's the possibility it doesn't support
> hard-linking at all, which would make it completely unfeasible.  (I don't
> know, it's not obvious what it supports from the documentation.)

I use it with dirvish.
It works just fine.

tamas



Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-17 Thread Les Mikesell
On 5/17/2011 11:57 AM, Michael Stowe wrote:
>>
>> Has anyone tried using BackupPC and MooseFS (http://www.moosefs.org/)?
>>
>
> No, but it seems like a *really* bad idea -- the concept of slow, off-box,
> redundant storage isn't a really good fit with the concepts of pooling and
> linking.

Actually the concept looks good.  It does rely on a single master node 
though.

> Since it's fuse-based, there's the possibility it doesn't support
> hard-linking at all, which would make it completely unfeasible.  (I don't
> know, it's not obvious what it supports from the documentation.)

The docs say it does handle hardlinks - but it is hard to tell if it 
does it well enough for backuppc.  I'd expect the fuse layer to be the 
bottleneck in the design - at least if you have several data servers.

-- 
   Les Mikesell
lesmikes...@gmail.com




Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-17 Thread Jeffrey J. Kosowsky
Michael Stowe wrote at about 11:57:42 -0500 on Tuesday, May 17, 2011:
 > >
 > > Has anyone tried using BackupPC and MooseFS (http://www.moosefs.org/)?
 > >
 > 
 > No, but it seems like a *really* bad idea -- the concept of slow, off-box,
 > redundant storage isn't a really good fit with the concepts of pooling and
 > linking.
 > 
 > Since it's fuse-based, there's the possibility it doesn't support
 > hard-linking at all, which would make it completely unfeasible.  (I don't
 > know, it's not obvious what it supports from the documentation.)

The first paragraph on the page linked by the OP says:
Symbolic links (file names pointing to target files, not necessarily
on MooseFS) and hard links (different names of files which refer to
the same data on MooseFS) 

So at least it supports hard links...

Your points about speed may very well be true...



Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-18 Thread Carl Wilhelm Soderstrom
On 05/17 01:25 , Mike wrote:
> Has anyone tried using BackupPC and MooseFS (http://www.moosefs.org/)?

Thanks for the link. That looks like a pretty cool project.

-- 
Carl Soderstrom
Systems Administrator
Real-Time Enterprises
www.real-time.com



Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-18 Thread Les Mikesell
On 5/18/2011 3:21 PM, Carl Wilhelm Soderstrom wrote:
> On 05/17 01:25 , Mike wrote:
>> Has anyone tried using BackupPC and MooseFS (http://www.moosefs.org/)?
>
> Thanks for the link. That looks like a pretty cool project.

I've been hoping someone would write a fuse layer on top of riak (a 
distributed, clustered DB that doesn't need a master node).  There is 
something called luwak that handles files as streams, but because the 
chunking step hashes the key from the chunk contents (and thus 
deduplicates with no reference counting) you can't ever delete anything.
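
The deletion problem described above can be illustrated with a toy
content-addressed store. This is only a sketch of the general idea, not
luwak's or riak's actual API: the class name, the 4-byte chunk size and the
use of SHA-1 are all assumptions made for illustration. Without reference
counts, deleting a file cannot safely remove its chunks, because another
file may hash to the same keys.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: each chunk's key is the SHA-1 of its bytes."""

    def __init__(self, refcounted):
        self.refcounted = refcounted
        self.chunks = {}  # key -> chunk bytes (identical chunks share one entry)
        self.refs = {}    # key -> reference count (only used when refcounted)
        self.files = {}   # file name -> ordered list of chunk keys

    def put(self, name, data, chunk_size=4):
        keys = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            key = hashlib.sha1(chunk).hexdigest()
            self.chunks[key] = chunk
            if self.refcounted:
                self.refs[key] = self.refs.get(key, 0) + 1
            keys.append(key)
        self.files[name] = keys

    def get(self, name):
        return b"".join(self.chunks[k] for k in self.files[name])

    def delete(self, name):
        for key in self.files.pop(name):
            if not self.refcounted:
                # No way to know whether another file still uses this chunk,
                # so the chunk has to stay in the store forever.
                continue
            self.refs[key] -= 1
            if self.refs[key] == 0:
                del self.refs[key]
                del self.chunks[key]


def demo(refcounted):
    store = ChunkStore(refcounted)
    store.put("a", b"AAAABBBB")
    store.put("b", b"AAAACCCC")  # shares the chunk b"AAAA" with file "a"
    store.delete("a")
    return len(store.chunks), store.get("b")
```

With reference counting, deleting "a" frees only the chunk that "b" does not
share; without it, the store never shrinks.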

-- 
   Les Mikesell
lesmikes...@gmail.com



Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-20 Thread Mike
On 11-05-18 05:21 PM, Carl Wilhelm Soderstrom wrote:
> On 05/17 01:25 , Mike wrote:
>> Has anyone tried using BackupPC and MooseFS (http://www.moosefs.org/)?
> Thanks for the link. That looks like a pretty cool project.
>
And from initial appearances / testing, it runs pretty darn well, too.
We only have ~2 TB in our test environment so far, spread over 3 machines,
so it's certainly nothing large.

I haven't tried BackupPC on it yet, but storing mail in maildir folders
works well, and virtual machine images work well.

Being able to say "I want to have 2 copies of anything in this directory
and 3 copies of anything in that directory" is very nice. So is being
able to put half your servers in one building and half in another.
Failing a disk and watching all the unmet goals (goal = minimum number of
copies of something) get resolved is fun as well.


Now, to convert the media machine at home...







Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-21 Thread Scott
But does MooseFS basically duplicate the data, so if you have 2 TB of
BackupPC data, you need a MooseFS with 2 TB of storage to duplicate the whole
thing?


On Fri, May 20, 2011 at 8:05 AM, Mike  wrote:

> On 11-05-18 05:21 PM, Carl Wilhelm Soderstrom wrote:
> > On 05/17 01:25 , Mike wrote:
> >> Has anyone tried using BackupPC and MooseFS (http://www.moosefs.org/)?
> > Thanks for the link. That looks like a pretty cool project.
> >
> and from initial appearances / testing, it runs pretty darn well, too.
> We only have ~2T on our test environment so far over 3 machines, so it's
> certainly nothing large.
>
> I haven't tried backuppc on it yet, but storing mail in maildir folder
> works well, and virtual machine images work well.
>
> being able to say "I want to have 2 copies of anything in this directory
> and 3 copies of anything in this directory" is very nice. So is being
> able to put half your servers in one building and half in another.
> Failing a disk and watching all the unmet goals (goal = min # of copies
> of something) get resolved is fun as well.
>
>
> Now, to convert the media machine at home...
>
>
>
>
>


Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-23 Thread Les Mikesell
On 5/21/2011 7:24 PM, Scott wrote:
> But does moosefs basically duplicate the data, so if you have 2tb of
> backuppc data, you need a moosefs with 2tb of storage to duplicate the
> whole thing?

Yes, it gives the effect of RAID1 mirrors - but if I understand it
correctly, the contents can be distributed across several machines, so
no single machine or drive needs space for a full copy of even one
instance of the whole filesystem.
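
A rough way to picture the space arithmetic: total raw storage is the data
size times the number of copies, but each server only holds its share. The
sketch below assumes a simple round-robin placement with copies on distinct
servers; that placement rule and the function name are made up for
illustration and are not taken from the MooseFS documentation.

```python
def place_chunks(n_chunks, n_servers, goal):
    """Return chunks held per server when each chunk gets `goal` copies.

    Assumed placement: copy k of chunk c goes to server (c + k) % n_servers,
    so the copies of one chunk always land on different servers.
    """
    if goal > n_servers:
        raise ValueError("goal cannot exceed the number of servers")
    per_server = [0] * n_servers
    for chunk in range(n_chunks):
        for copy in range(goal):
            per_server[(chunk + copy) % n_servers] += 1
    return per_server


# 3000 chunks of data, 3 servers, 2 copies of everything
usage = place_chunks(3000, 3, 2)
total_raw = sum(usage)
```

Under these assumptions the cluster stores 6000 chunks in total (twice the
data), while each of the three servers holds 2000, which is less than one
full copy of the 3000-chunk data set.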

-- 
   Les Mikesell
lesmikes...@gmail.com





Re: [BackupPC-users] BackupPC and MooseFS?

2011-05-23 Thread Holger Parplies
Hi,

Mike wrote on 2011-05-20 09:05:11 -0300 [Re: [BackupPC-users] BackupPC and 
MooseFS?]:
> [...]
> being able to say "I want to have 2 copies of anything in this directory 
> and 3 copies of anything in this directory" is very nice. [...]

Les Mikesell wrote on 2011-05-23 11:12:18 -0500 [Re: [BackupPC-users] BackupPC 
and MooseFS?]:
> On 5/21/2011 7:24 PM, Scott wrote:
> > But does moosefs basically duplicate the data, so if you have 2tb of
> > backuppc data, you need a moosefs with 2tb of storage to duplicate the
> > whole thing?
> 
> Yes, it gives the effect of raid1 mirrors -

From what Mike wrote, shouldn't you be able to say "I want to have only one
copy of anything in a given directory and two copies of everything else"?
With BackupPC, that doesn't seem to make any sense (but: see below) - why
would you want to replicate only part of the pool, and why only files that
happen to have a partial file md5sum starting with certain letters? You could
limit log files to a single copy, but is that enough data to even worry about?
This does bring up questions, though: how does it handle hardlinks, if you
determine numbers of copies by directory, i.e. how many copies do you get,
if a file is in one directory where you want three copies and in another
directory where you chose two copies?
If the answer is "five copies", it won't work with BackupPC ;-).

Mike, can you test what it does with hard links, e.g. by creating a large file
with several links?
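
A minimal sketch of such a test, using only the Python standard library. It
checks the file system semantics (link count and shared inode) and says
nothing about how many replicas the storage layer keeps; that would have to
be checked with MooseFS's own tools. The function name and default sizes are
arbitrary choices for illustration.

```python
import os
import tempfile


def check_hardlinks(directory, n_links=3, size=1024 * 1024):
    """Create a file with several hard links under `directory` and report
    the link count and whether all names share one inode."""
    workdir = tempfile.mkdtemp(dir=directory)
    original = os.path.join(workdir, "original")
    with open(original, "wb") as f:
        f.write(b"\0" * size)

    links = []
    for i in range(n_links):
        link = os.path.join(workdir, "link%d" % i)
        os.link(original, link)  # raises OSError if hard links are unsupported
        links.append(link)

    st = os.stat(original)
    same_inode = all(os.stat(p).st_ino == st.st_ino for p in links)
    result = {"nlink": st.st_nlink, "same_inode": same_inode}

    # Clean up the test files again.
    for path in links + [original]:
        os.remove(path)
    os.rmdir(workdir)
    return result
```

Calling check_hardlinks() with the mount point of the file system under test
should report nlink equal to n_links + 1 and same_inode True if hard links
behave as on a local UNIX file system.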

I'm just asking, because with normal UNIX file system usage patterns, you
could probably get away with cheating (and creating five copies) without
anyone complaining (or even noticing). Then again, the mechanism might be
totally different, like putting the number of copies in the inode (and
inheriting from the parent directory on file creation; presuming it *has* its
own inode and doesn't just use a different FS for local storage). If that is
the case, you could even conceivably have some hosts' data replicated X
times and other hosts' data Y times (e.g. Y=1) by tagging the appropriate pc/
directories accordingly. Only problem: *shared* data (file contents appearing
on hosts in both sets) would 'randomly' have X or Y copies, depending on which
set of hosts happened to contain the file first (but you could probably adjust
that later and "watch all the unmet goals get resolved" ;-).
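
The "goal stored in the inode" hypothesis can be modelled in a few lines.
This is purely a toy model of the speculation above, not a description of
how MooseFS is implemented; the class and attribute names are invented.

```python
class Inode:
    def __init__(self, goal):
        self.goal = goal  # number of copies wanted for this file's data


class Directory:
    def __init__(self, goal):
        self.goal = goal
        self.entries = {}

    def create(self, name):
        # Hypothesis: a new file inherits the goal of its parent directory.
        inode = Inode(self.goal)
        self.entries[name] = inode
        return inode

    def link(self, name, inode):
        # A hard link is just another name for the same inode,
        # so the goal set at creation time is unchanged.
        self.entries[name] = inode


host_x = Directory(goal=3)  # pc/ directory of a host replicated 3 times
host_y = Directory(goal=1)  # pc/ directory of a host kept as a single copy

# The shared file shows up on host_y first, then gets linked for host_x.
shared = host_y.create("file")
host_x.link("file", shared)
goal_seen_from_x = host_x.entries["file"].goal
```

In this model the shared file ends up with goal 1 even though it is also
visible under the directory tagged with goal 3; had host_x created it first,
the goal would be 3. Adjusting the goal afterwards would be a matter of
changing shared.goal.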

> but if I understand it correctly the contents can be distributed across
> several machines instead of needing space for a full copy of even a single
> instance of the whole filesystem on any single machine or drive.

The way BackupPC works (heavily relying on fast read performance), I would
expect it to be important for performance to have a full copy of the file
system locally on the BackupPC server. Is there a way to enforce that?

Another consideration would be, how well does it handle a large backlog of
unmet goals? If you're replicating over a comparatively slow connection, you
might need to spread out updates to the "mirror(s)" over more time than your
backup window contains. Does a large backlog of unmet goals deplete system
memory needed for caching?

Mike wrote:
> I haven't tried backuppc on it yet, but storing mail in maildir folder
> works well, and virtual machine images work well.

Unfortunately, neither of these examples resembles BackupPC's disk usage.
Virtual machine images are possibly high-bandwidth single large file
operations; maildir folders use many small files, but the bandwidth is
probably severely limited by your internet connection and MTA processing
(DNSBL lookups, sender verification, Spamassassin, ...). Reading mail is
limited by your POP or IMAP server's processing speed (well, or NFS). And
all of that only happens if there is actually incoming mail or users
checking their mailbox, which you probably don't have at a sustained high
rate for longer periods of time.

While BackupPC's performance may also be limited by link bandwidth or
client speed, from what I read on this list, server disk performance seems
to be the most important limiting factor.

So, while your results are encouraging, we still simply need to try it out,
unless we can establish a reason why it won't work. For any meaningful
results, it would be best to have an alternate BackupPC server with
"conventional" storage (and comparable hardware) backing up the same clients
(but not at the same time) to compare backup performance with.
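
Short of a full side-by-side BackupPC installation, a crude first impression
could come from timing a pool-like workload (many small files, each hard
linked once, then read back) on both kinds of storage. The sketch below is
only a rough proxy and makes several assumptions: the file count and size
are arbitrary, the directory names merely imitate BackupPC's layout, and
reads of freshly written files will mostly come from cache, so it says
little about sustained read performance.

```python
import os
import shutil
import tempfile
import time


def time_pool_like_workload(directory, n_files=200, size=4096):
    """Time a crude pool-style workload under `directory`; returns seconds."""
    workdir = tempfile.mkdtemp(dir=directory)
    pool = os.path.join(workdir, "pool")  # stands in for the shared pool
    pc = os.path.join(workdir, "pc")      # stands in for a per-host tree
    os.mkdir(pool)
    os.mkdir(pc)
    payload = os.urandom(size)
    try:
        start = time.perf_counter()
        # Write each file into the pool and hard link it into the host tree.
        for i in range(n_files):
            name = "%06d" % i
            with open(os.path.join(pool, name), "wb") as f:
                f.write(payload)
            os.link(os.path.join(pool, name), os.path.join(pc, name))
        # Read everything back through the linked names.
        for i in range(n_files):
            with open(os.path.join(pc, "%06d" % i), "rb") as f:
                f.read()
        return time.perf_counter() - start
    finally:
        shutil.rmtree(workdir)
```

Running it once against a directory on the distributed mount and once
against a directory on local disk, on comparable hardware and not at the
same time, gives two numbers to compare.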

Regards,
Holger
