Re: Snapshots in tmpfs

2012-03-05 Thread David Young
On Mon, Mar 05, 2012 at 06:14:04AM +, David Holland wrote:
 The problem with that scheme is that you rewrite everything to the
 flash over and over again anytime something changes, which is going to
 generate vastly more write cycles than just using a normal fs.

This scheme doesn't write anytime something changes; it writes
periodically.  The number of write cycles relative to a normal fs depends
on the period, on the rate of application writes, and on the proportion
of files changed v. unchanged in a typical period.
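
To make that dependence concrete, here is a rough back-of-the-envelope
sketch.  Every number and function name in it is a hypothetical
placeholder, not a measurement from any real device; it only illustrates
that either scheme can come out ahead depending on the period and the
update rate.

```sh
#!/bin/sh
# Rough estimate only; all names and inputs are hypothetical.

# Blocks written per day when the whole tmpfs is checkpointed every
# $1 seconds and each checkpoint occupies $2 blocks.
checkpoint_blocks_per_day() {
    period_s=$1 checkpoint_blocks=$2
    echo $(( 86400 / period_s * checkpoint_blocks ))
}

# Blocks written per day when each of $1 file updates costs $2 block
# writes on an ordinary filesystem.
normal_fs_blocks_per_day() {
    updates=$1 blocks_per_update=$2
    echo $(( updates * blocks_per_update ))
}

checkpoint_blocks_per_day 3600 100   # hourly checkpoint of 100 blocks
normal_fs_blocks_per_day 1000 3      # 1000 updates at 3 writes each
```

With these made-up inputs the hourly checkpoint writes fewer blocks per
day than the ordinary filesystem; shortening the period to ten minutes
(600 seconds) reverses the comparison.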

Dave

-- 
David Young
dyo...@pobox.com    Urbana, IL    (217) 721-9981


Re: Snapshots in tmpfs

2012-03-05 Thread Thor Lancelot Simon
On Mon, Mar 05, 2012 at 01:12:43PM -0600, David Young wrote:
 On Mon, Mar 05, 2012 at 06:14:04AM +, David Holland wrote:
  The problem with that scheme is that you rewrite everything to the
  flash over and over again anytime something changes, which is going to
  generate vastly more write cycles than just using a normal fs.
 
 This scheme doesn't write anytime something changes, it writes
 periodically.  The number of write cycles over/under a normal fs depends

Practically speaking, that requires the rest of the device runtime to
know that all configuration updates are to be batched together.  Generally
speaking, that requires enough coordination that it's not really any
harder to then tar up the tmpfs and put it aside somewhere (like a raw
partition, which is what some Linux embedded environments do) -- no kernel
code required.

Also, even when done carefully, this can result in writes of huge
numbers of blocks when only a few really change.

On another subject, though, I do want to talk about the question of
whether running FFS with WAPBL for /etc on flash devices causes double
write load for common cases.  Here is why I believe it does:

* The typical application pattern for config file updates is
  to create a temporary file -- one transaction written to the
  journal, plus the data blocks which must be written,
  then move it into place -- another transaction written to the
  journal.  Many applications also fsync the temporary file
  first.

* Without journaling, this looks like, pretending it's a 1 block
  file:

* 1 write for the data block
* 1 write for the file-create metadata change
* 1 write for the rename metadata change

* With journaling, though:

* 1 write into the journal for the file create
* 1 write for the data blocks
* possibly a journal flush caused by the fsync
* 1 write into the journal for the rename
* 2 writes for the actual metadata changes
* 1 write to mark the journal entries as done

Is that not the case?
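
Tallying the two lists above, as a sketch of the arithmetic only: it
assumes each listed item costs exactly the stated number of block writes
and leaves out the optional journal flush caused by the fsync.  The
function names are made up for illustration.

```sh
#!/bin/sh
# Per-update write counts for a 1-block file, taken from the lists above.
# The possible journal flush on fsync is not counted.

writes_without_journal() {
    data=1 create_meta=1 rename_meta=1
    echo $(( data + create_meta + rename_meta ))
}

writes_with_journal() {
    journal_create=1 data=1 journal_rename=1 metadata=2 journal_done=1
    echo $(( journal_create + data + journal_rename + metadata + journal_done ))
}

echo "without journaling: $(writes_without_journal)"
echo "with journaling:    $(writes_with_journal)"
```

Under those assumptions the journaled case comes to twice the number of
writes of the unjournaled one, which is the "double write load" in
question.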

Thor


Re: Snapshots in tmpfs

2012-03-04 Thread David Holland
On Wed, Feb 29, 2012 at 06:45:41PM -0600, David Young wrote:
Oh, my mistake, since there was concern about filesystem type I
thought you were talking about raw flash, but apparently CompactFlash
is not raw flash, same as USB sticks aren't.

In that case, just use wapbl.
   
   That doubles the write rate for the common create new version of
   file and rename into place pattern...

(no it doesn't)

   Translation layer or not, doubling the write rate to any type of
   flash is not a great idea.
  
  One way to hold writes to flash down to a very low rate is to keep files
  that change in a tmpfs, and everything else in a read-only FFS.
  
  Sometimes the files that change need to persist across reboots and power
  failures.  One way to make them persist is to periodically write a
  checkpoint of the tmpfs containing those files to flash.  After a reset
  or power failure, use the last checkpoint to restore the tmpfs.
  
  One way to store the checkpoints is to reserve a partition on flash for
  receiving them.  You don't put a filesystem on the checkpoint partition,
  but you treat it like a (circular) tape with big blocks.  Ideally, the
  block size is a multiple of the biggest block size that the flash uses.
  [...]

The problem with that scheme is that you rewrite everything to the
flash over and over again anytime something changes, which is going to
generate vastly more write cycles than just using a normal fs.

Sure, you can write incremental updates to the flash (you can probably
even use dump to generate them) but by the time you've sorted that all
out and debugged it all, you've done a lot of work to get roughly the
same results you'd get just by using a normal fs.

-- 
David A. Holland
dholl...@netbsd.org


Re: Snapshots in tmpfs

2012-02-29 Thread David Young
On Thu, Feb 23, 2012 at 08:04:01PM -0500, Thor Lancelot Simon wrote:
 On Fri, Feb 24, 2012 at 12:45:32AM +, David Holland wrote:
  On Thu, Feb 23, 2012 at 11:20:18PM +, David Holland wrote:
  Is CHFS really suitable for CompactFlash?  Is LFS even usable?
 
 No 

I thought the whole point of chfs was to be able to operate on raw
flash devices that don't have their own flash translation layer.
  
  Oh, my mistake, since there was concern about filesystem type I
  thought you were talking about raw flash, but apparently CompactFlash
  is not raw flash, same as USB sticks aren't.
  
  In that case, just use wapbl.
 
 That doubles the write rate for the common create new version of
 file and rename into place pattern...
 
 Translation layer or not, doubling the write rate to any type of
 flash is not a great idea.

One way to hold writes to flash down to a very low rate is to keep files
that change in a tmpfs, and everything else in a read-only FFS.

Sometimes the files that change need to persist across reboots and power
failures.  One way to make them persist is to periodically write a
checkpoint of the tmpfs containing those files to flash.  After a reset
or power failure, use the last checkpoint to restore the tmpfs.

One way to store the checkpoints is to reserve a partition on flash for
receiving them.  You don't put a filesystem on the checkpoint partition,
but you treat it like a (circular) tape with big blocks.  Ideally, the
block size is a multiple of the biggest block size that the flash uses.
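
A minimal sketch of picking such a block size, assuming you already know
the flash's largest internal block size.  The 128 KiB figure and the
round_up helper are hypothetical, chosen only for illustration.

```sh
#!/bin/sh
# round_up SIZE MULTIPLE: smallest multiple of MULTIPLE that is >= SIZE.
round_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

flash_block=131072      # hypothetical 128 KiB flash block
big_block_size=$(round_up 200000 "$flash_block")
echo "$big_block_size"
```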

To create a checkpoint of your tmpfs, first you create a (possibly
read-only) snapshot of it: in this way you can write a self-consistent
checkpoint, containing the tmpfs contents at a moment in time, without
suspending tmpfs activity.  Write the checkpoint to the first half of
the checkpoint partition with something like this:

{
        checkpoint_header       # writes checkpoint magic, a checkpoint
                                # generation number, checkpoint date and time
        cd "$tmpfs_mountpoint"
        pax -w . | gzip
        checkpoint_trailer      # SHA1 sum of previous
} | dd obs="$big_block_size" seek="$checkpoint_offset" of="$checkpoint_partition"

Finally, destroy the snapshot.

Write checkpoints to alternate halves of the checkpoint partition: the
2nd checkpoint to the 2nd half of the checkpoint partition, the 3rd to
the 1st half, 4th to the 2nd half, and so on.
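
The alternation can be sketched as a small helper that turns a
generation number into a dd seek value.  The name checkpoint_seek and
its blocks-per-half parameter are assumptions of this sketch; the result
is counted in output blocks, which is how dd interprets seek=.

```sh
#!/bin/sh
# checkpoint_seek GENERATION BLOCKS_PER_HALF
# Odd generations (1st, 3rd, ...) go to the first half of the partition,
# even generations to the second half.
checkpoint_seek() {
    generation=$1 blocks_per_half=$2
    echo $(( (generation + 1) % 2 * blocks_per_half ))
}

checkpoint_seek 1 64    # first half
checkpoint_seek 2 64    # second half
```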

The latest complete checkpoint is the one with the greatest generation
number of all checkpoints with a correct sum.
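
A sketch of that selection rule follows.  It assumes some earlier step
has already read both halves and produced one line per checkpoint in the
form "GENERATION HALF SUM_STATUS"; that input format and the name
latest_checkpoint are inventions of this example, not part of any
existing tool.

```sh
#!/bin/sh
# latest_checkpoint reads lines of the form
#   GENERATION HALF SUM_STATUS
# where SUM_STATUS is "ok" when the trailer sum matched.  It prints the
# HALF holding the greatest generation with a correct sum, or nothing
# if no checkpoint is valid.
latest_checkpoint() {
    awk '$3 == "ok" && (best == "" || $1 + 0 > gen) { gen = $1 + 0; best = $2 }
         END { if (best != "") print best }'
}

# Generation 8 was interrupted mid-write, so its sum is bad.
printf '%s\n' '7 first ok' '8 second bad' | latest_checkpoint
```

A checkpoint whose write was cut short by a power failure fails its sum
and is skipped, so the restore falls back to the previous generation in
the other half.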

(It's possible to be fancy, reserving space both for complete
checkpoints and for partials---think partial backups.)

This checkpoint scheme has the interesting property that once the kernel
part, the tmpfs snapshots, is done, you can write the rest using a
Bourne shell script, and there are countless alternate scripts that you
could write.  Also, you can write the checkpoints at the full bandwidth
of whichever device receives them, which can be very fast indeed!

Dave

-- 
David Young
dyo...@pobox.com    Urbana, IL    (217) 721-9981


Re: Snapshots in tmpfs

2012-02-28 Thread Manuel Bouyer
On Tue, Feb 28, 2012 at 08:12:03AM +0100, Adam Hamsik wrote:
 I can help with both zfs and ext3. If requirement for your thesis is to 
 design and implement something new then implementing DRBD like network raid 
 on top of device-mapper would be amazing. [1], [2]


DRBD-like functionality is something we need. It could even be a GSoC
project ...

-- 
Manuel Bouyer bou...@antioche.eu.org
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: Snapshots in tmpfs

2012-02-28 Thread Adam Hamsik

On Feb,Tuesday 28 2012, at 9:29 AM, Manuel Bouyer wrote:

 On Tue, Feb 28, 2012 at 08:12:03AM +0100, Adam Hamsik wrote:
 I can help with both zfs and ext3. If requirement for your thesis is to 
 design and implement something new then implementing DRBD like network raid 
 on top of device-mapper would be amazing. [1], [2]
 
 
 DRBD-like functionality is something we need. It could even be a GSoC
 project ...


It would be a really great thing with XenSMP and Xen Suspend, because we would
be able to create clusters based on NetBSD Dom0 with DomU running from shared
storage.

Regards

Adam.



Re: Snapshots in tmpfs

2012-02-28 Thread Manuel Wiesinger

I can help with both zfs and ext3. If requirement for your thesis is
to design and implement something new then implementing DRBD like
network raid on top of device-mapper would be amazing. [1], [2]

It is not required to do something new (awkward, I know), but there
is no reason not to. DRBD is a very good idea. My lecturer was
involved in it, as far as I know only for academic purposes, which
is good to know, since it might prevent conflicting interests.

My main interest is still a good thesis. If implementing a DRBD-like
feature is not required (I think it won't be), I won't do it this
time. If time allows, it would be interesting to work on that later.


I'll write to my lecturer, and let you know when the exact requirements 
are defined.


Thanks a lot!

Manuel


Re: Snapshots in tmpfs

2012-02-28 Thread Manuel Wiesinger



DRBD-like functionality is something we need. It could even be a
GSoC project ...

I thought about that too. I decided that GSoC plus the thesis is too much.
I'm too busy to start in April, and it would be impossible to stick to
the given deadlines.


Thank you anyway!

Manuel


Re: Snapshots in tmpfs

2012-02-27 Thread Manuel Wiesinger
It seems that implementing the snapshot feature does not really pay off.
Other projects seem to be much more rewarding for me.


An attractive alternative is the ext3 journaling feature. It seems to
be an appropriate amount of work, even more interesting and much more
useful.


Is there anybody already working on it?

Manuel

[1] http://wiki.netbsd.org/projects/project/ext3fs/


Re: Snapshots in tmpfs

2012-02-27 Thread Christos Zoulas
In article 4f4c09c6.3040...@bsdstammtisch.at,
Manuel Wiesinger  man...@bsdstammtisch.at wrote:
As it seems, implementing the snapshot feature does not really pay off. 
Other projects seem to be much more rewarding for me.

An attractive alternative, is the ext3 journaling feature. It seems to 
be an appropriate amount of work, even more interesting and much more 
useful.

Is there anybody already working on it?

No, I don't think anyone is. Even better, how about getting zfs in a working
state?

christos



Re: Snapshots in tmpfs

2012-02-27 Thread Manuel Wiesinger

No, I don't think anyone is. Even better, how about getting zfs in a
working state?
Hrm... I don't have enough experience with zfs to feel confident with
it. But I will consider it.


Re: Snapshots in tmpfs

2012-02-24 Thread Jonathan Stone
I think you mean halves the write rate.


--- On Thu, 2/23/12, Thor Lancelot Simon t...@panix.com wrote:

From: Thor Lancelot Simon t...@panix.com
Subject: Re: Snapshots in tmpfs
To: David Holland dholland-t...@netbsd.org
Cc: tech-kern@netbsd.org
Date: Thursday, February 23, 2012, 5:04 PM

On Fri, Feb 24, 2012 at 12:45:32AM +, David Holland wrote:
 On Thu, Feb 23, 2012 at 11:20:18PM +, David Holland wrote:
 Is CHFS really suitable for CompactFlash?  Is LFS even usable?
    
    No 
   
   I thought the whole point of chfs was to be able to operate on raw
   flash devices that don't have their own flash translation layer.
 
 Oh, my mistake, since there was concern about filesystem type I
 thought you were talking about raw flash, but apparently CompactFlash
 is not raw flash, same as USB sticks aren't.
 
 In that case, just use wapbl.

That doubles the write rate for the common create new version of
file and rename into place pattern...

Translation layer or not, doubling the write rate to any type of
flash is not a great idea.

-- 
Thor Lancelot Simon                                   t...@panix.com
  All of my opinions are consistent, but I cannot present them all
   at once.    -Jean-Jacques Rousseau, On The Social Contract


Re: Snapshots in tmpfs

2012-02-24 Thread David Laight
On Fri, Feb 24, 2012 at 12:45:32AM +, David Holland wrote:
 On Thu, Feb 23, 2012 at 11:20:18PM +, David Holland wrote:
 Is CHFS really suitable for CompactFlash?  Is LFS even usable?

No 
   
   I thought the whole point of chfs was to be able to operate on raw
   flash devices that don't have their own flash translation layer.
 
 Oh, my mistake, since there was concern about filesystem type I
 thought you were talking about raw flash, but apparently CompactFlash
 is not raw flash, same as USB sticks aren't.

OTOH I've seen a CF card with completely trashed contents.
Data in all the wrong sectors.
I suspect it had power removed in the middle of some internal action.

David

-- 
David Laight: da...@l8s.co.uk


Re: Snapshots in tmpfs

2012-02-23 Thread David Young
On Thu, Feb 23, 2012 at 07:58:11AM +, David Holland wrote:
 On Wed, Feb 22, 2012 at 08:17:15AM -0600, David Young wrote:
   On Wed, Feb 22, 2012 at 01:42:45PM +0100, Manuel Wiesinger wrote:
*)
What is it good for? The only practical use I can imagine are
backups on thin clients, which operate without a hard disk. But this
is clearly far-fetched, in my eyes.
   
   It's good for writing checkpoints of a tmpfs to non-volatile (NV)
   storage in an embedded system where writing to the NV storage is costly
   (it wears out, or it is slow, or both).  When you have a snapshot, you
   can stream it to NV storage using pax(1).  This is the best practical
   way that I can think of in NetBSD at this time.
 
 other than, say, chfs or lfs?

Is CHFS really suitable for CompactFlash?  Is LFS even usable?

 That sounds like a horrible hack, anyhow, and prone to dying horribly
 if you crash or lose power in the middle of a writeback. (plus you'd
 want to use rsync to transfer, or so I'd think, or rewriting
 unmodified blocks will burn write cycles faster than not bothering to
 do anything special.)

I agree that whatever you have in mind sounds like a horrible hack. :-)

Dave

-- 
David Young
dyo...@pobox.com    Urbana, IL    (217) 721-9981


Re: Snapshots in tmpfs

2012-02-23 Thread Adam Hoka
On 2/23/2012 7:34 PM, David Young wrote:
 On Thu, Feb 23, 2012 at 07:58:11AM +, David Holland wrote:
 On Wed, Feb 22, 2012 at 08:17:15AM -0600, David Young wrote:
   On Wed, Feb 22, 2012 at 01:42:45PM +0100, Manuel Wiesinger wrote:
*)
What is it good for? The only practical use I can imagine are
backups on thin clients, which operate without a hard disk. But this
is clearly far-fetched, in my eyes.
   
   It's good for writing checkpoints of a tmpfs to non-volatile (NV)
   storage in an embedded system where writing to the NV storage is costly
   (it wears out, or it is slow, or both).  When you have a snapshot, you
   can stream it to NV storage using pax(1).  This is the best practical
   way that I can think of in NetBSD at this time.

 other than, say, chfs or lfs?
 
 Is CHFS really suitable for CompactFlash?  Is LFS even usable?

No and no.

 
 That sounds like a horrible hack, anyhow, and prone to dying horribly
 if you crash or lose power in the middle of a writeback. (plus you'd
 want to use rsync to transfer, or so I'd think, or rewriting
 unmodified blocks will burn write cycles faster than not bothering to
 do anything special.)
 
 I agree that whatever you have in mind sounds like a horrible hack. :-)
 
 Dave
 



Re: Snapshots in tmpfs

2012-02-23 Thread J. Hannken-Illjes
On Feb 22, 2012, at 1:52 PM, Martin Husemann wrote:

 Note that we already have file system snapshots for ffs file systems,
 see fss(4). They are used for backup purposes (atomically create a snapshot,
 while the file system is busy, then backup the now quiet snapshot) - among
 others.

Right -- and as persistent (across unmount) snapshots of a tmpfs file system
don't make any sense, file system external snapshots are the way to go.

All that is needed here is the ability of tmpfs to suspend its
operations (see fstrans(9) and msdosfs for example).

Not sure if it is worth the effort though ...

--
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)

Re: Snapshots in tmpfs

2012-02-23 Thread David Holland
On Thu, Feb 23, 2012 at 08:07:50PM +0100, Adam Hoka wrote:
   Is CHFS really suitable for CompactFlash?  Is LFS even usable?
  
  No 

I thought the whole point of chfs was to be able to operate on raw
flash devices that don't have their own flash translation layer.

  and no.

Do you have any specific problems to report or are you just repeating
hearsay?

-- 
David A. Holland
dholl...@netbsd.org


Re: Snapshots in tmpfs

2012-02-23 Thread David Holland
On Thu, Feb 23, 2012 at 11:20:18PM +, David Holland wrote:
Is CHFS really suitable for CompactFlash?  Is LFS even usable?
   
   No 
  
  I thought the whole point of chfs was to be able to operate on raw
  flash devices that don't have their own flash translation layer.

Oh, my mistake, since there was concern about filesystem type I
thought you were talking about raw flash, but apparently CompactFlash
is not raw flash, same as USB sticks aren't.

In that case, just use wapbl.

-- 
David A. Holland
dholl...@netbsd.org


Re: Snapshots in tmpfs

2012-02-23 Thread Thor Lancelot Simon
On Fri, Feb 24, 2012 at 12:45:32AM +, David Holland wrote:
 On Thu, Feb 23, 2012 at 11:20:18PM +, David Holland wrote:
 Is CHFS really suitable for CompactFlash?  Is LFS even usable?

No 
   
   I thought the whole point of chfs was to be able to operate on raw
   flash devices that don't have their own flash translation layer.
 
 Oh, my mistake, since there was concern about filesystem type I
 thought you were talking about raw flash, but apparently CompactFlash
 is not raw flash, same as USB sticks aren't.
 
 In that case, just use wapbl.

That doubles the write rate for the common create new version of
file and rename into place pattern...

Translation layer or not, doubling the write rate to any type of
flash is not a great idea.
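
For reference, the pattern in question looks roughly like this from the
shell.  This is only an illustrative sketch: update_config is a made-up
name, and sync(1) stands in for the fsync(2) call a real application
would make on the temporary file.

```sh
#!/bin/sh
# update_config FILE: replace FILE with the contents of stdin by writing
# a temporary file in the same directory and renaming it into place.
update_config() {
    target=$1
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if cat > "$tmp"; then
        sync                    # stand-in for fsync(2) on the temp file
        mv -f "$tmp" "$target"  # rename into place
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Used as `generate_config | update_config /etc/some.conf`, where both the
generator and the path are placeholders.  Readers see either the old
file or the new one, never a half-written file.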

-- 
Thor Lancelot Simon                                   t...@panix.com
  All of my opinions are consistent, but I cannot present them all
   at once.    -Jean-Jacques Rousseau, On The Social Contract


Re: Snapshots in tmpfs

2012-02-23 Thread David Holland
On Thu, Feb 23, 2012 at 08:04:01PM -0500, Thor Lancelot Simon wrote:
  Is CHFS really suitable for CompactFlash?  Is LFS even usable?
 
 No 

I thought the whole point of chfs was to be able to operate on raw
flash devices that don't have their own flash translation layer.
   
   Oh, my mistake, since there was concern about filesystem type I
   thought you were talking about raw flash, but apparently CompactFlash
   is not raw flash, same as USB sticks aren't.
   
   In that case, just use wapbl.
  
  That doubles the write rate for the common create new version of
  file and rename into place pattern...

Uh no, no it doesn't, it doesn't even approximate it unless you're
continuously writing new versions of lots of files.

Furthermore, remember, it does so in order to give some chance that a
crash or power failure won't wipe out the data. Ad hoc workarounds
(including this tmpfs snapshots scheme) will lose that property.

(Also, doubles relative to what? Remember that these devices are
designed with FAT32 in mind.)

-- 
David A. Holland
dholl...@netbsd.org


Snapshots in tmpfs

2012-02-22 Thread Manuel Wiesinger

Hi folks,

when stumbling around in the wiki for interesting topics for my bachelor 
thesis I found this entry:

http://wiki.netbsd.org/projects/project/tmpfs-snapshot

I am considering working on that this summer.

What I've done so far:

*) spoke to a lecturer who is very interested in this topic and is 
prepared to supervise me.


*) spoke to two NetBSD developers personally. They said that they are
prepared to help me find support in the community.


*) spent enough time working with C to feel confident with this task.

*) Made it clear to myself that the main purpose is a good thesis, which
might not be ready for being committed to NetBSD, but might be a good
source for others.


What information I need:

*)
I'm into this out of interest in filesystems and because I'm looking for 
challenge. Are there any other interesting filesystem topics (which 
might not be on the wiki) to work on?


*)
What is it good for? The only practical use I can imagine is backups on
thin clients, which operate without a hard disk. But this is clearly
far-fetched, in my eyes.


*) Are there folks out there who are interested in supporting me?

Regards,
Manuel Wiesinger


Re: Snapshots in tmpfs

2012-02-22 Thread Martin Husemann
Note that we already have file system snapshots for ffs file systems,
see fss(4). They are used for backup purposes (atomically create a snapshot,
while the file system is busy, then backup the now quiet snapshot) - among
others.

 *)
 What is it good for? The only practical use I can imagine are backups on 
 thin clients, which operate without a hard disk. But this is clearly 
 far-fetched, in my eyes.

I have no clue. For the diskless (or r/o flash based) thin clients you
can probably do something easier at application level (like: mount
flash r/w, copy some files over).

Martin


Re: Snapshots in tmpfs

2012-02-22 Thread Greg Troxel

Another possible thing to do (instead) would be to look at Coda, and
consider something like porting Coda to use FUSE instead of a homegrown
(pre-FUSE, to be fair) kernel module.  A bigger challenge is to separate
the write-back caching from the upstream server protocol, so that one
could use something coda-like over a connected-mode-only remote
filesystem.  But that probably doesn't fit in a bachelor's thesis.


  




Re: Snapshots in tmpfs

2012-02-22 Thread David Young
On Wed, Feb 22, 2012 at 01:42:45PM +0100, Manuel Wiesinger wrote:
 *)
 What is it good for? The only practical use I can imagine are
 backups on thin clients, which operate without a hard disk. But this
 is clearly far-fetched, in my eyes.

It's good for writing checkpoints of a tmpfs to non-volatile (NV)
storage in an embedded system where writing to the NV storage is costly
(it wears out, or it is slow, or both).  When you have a snapshot, you
can stream it to NV storage using pax(1).  This is the best practical
way that I can think of in NetBSD at this time.

Dave

-- 
David Young
dyo...@pobox.com    Urbana, IL    (217) 721-9981


Re: Snapshots in tmpfs

2012-02-22 Thread David Holland
On Wed, Feb 22, 2012 at 08:17:15AM -0600, David Young wrote:
  On Wed, Feb 22, 2012 at 01:42:45PM +0100, Manuel Wiesinger wrote:
   *)
   What is it good for? The only practical use I can imagine are
   backups on thin clients, which operate without a hard disk. But this
   is clearly far-fetched, in my eyes.
  
  It's good for writing checkpoints of a tmpfs to non-volatile (NV)
  storage in an embedded system where writing to the NV storage is costly
  (it wears out, or it is slow, or both).  When you have a snapshot, you
  can stream it to NV storage using pax(1).  This is the best practical
  way that I can think of in NetBSD at this time.

other than, say, chfs or lfs?

That sounds like a horrible hack, anyhow, and prone to dying horribly
if you crash or lose power in the middle of a writeback. (plus you'd
want to use rsync to transfer, or so I'd think, or rewriting
unmodified blocks will burn write cycles faster than not bothering to
do anything special.)
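
To illustrate the idea of not rewriting what has not changed, here is a
minimal sketch at whole-file granularity.  It is a simplified stand-in
for what rsync does, not a replacement for it, and copy_if_changed is a
name made up for this example.

```sh
#!/bin/sh
# copy_if_changed SRC DST: write DST only when its contents differ from
# SRC (or DST does not exist), so unchanged files cost no writes.
# Prints what it did.
copy_if_changed() {
    if cmp -s "$1" "$2"; then
        echo "unchanged"
    else
        cp "$1" "$2" && echo "written"
    fi
}
```

Calling it once per file in the tmpfs would touch the flash only for
files whose contents actually differ from the stored copy.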

-- 
David A. Holland
dholl...@netbsd.org