Re: Single root filesystem evilness decreasing in 2010? (on workstations) [LONG]

2010-03-09 Thread Robert Brockway

On Thu, 4 Mar 2010, thib wrote:


OTOH - I haven't studied XFS - but from the little overviews I read about
it, I suppose its allocation groups are a way to scale with this problem
(along with other unrelated advantages like parallelism in multithreaded
environments).  What happens if a filesystem doesn't have anything like it?


Filesystems will hit scale problems at some point.  As you note, AGs in XFS 
help it to scale a lot, but you do need to be careful in selecting the 
number.  Too many and you can become CPU bound.
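
For illustration only (the device name and agcount here are made-up examples):

   # create an XFS filesystem with an explicit number of allocation groups
   mkfs.xfs -d agcount=16 /dev/sdb1
   # report the agcount/agsize an existing filesystem was created with
   xfs_info /mount/point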



Maybe no-one cares because we currently don't have filesystems big enough to
actually see the problem?


Some people definitely do.


I agree with that, but I know it's because I, personally, *need* to know
what's going on, all the time.  Some people are OK with letting a program
(even such a critical one) do some magic;  and without having tested any
"complex" one, I suspect they try to KIS for the user.


The problem is that if a backup system breaks you get to keep both pieces 
:)   Failing to understand your backup system and how you can DR under the 
worst case is a serious risk.



The problem is, if there's a problem with the backup system itself, then
it's going to be a long night.  If there's no need for such software, I,
again, agree, there's no use to take risks, even if they're minimal.


Amanda is a good example.  It keeps backup state information at the 
beginning of the tapes and allows that information to be dumped to a text 
file easily.  I have done a 10TB SAN DR with Amanda and used printed-out 
pages of the tape state information to guide me.  It was relatively 
painless considering the amount of data I was bringing back.
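
If I remember right, that state header is plain text in the first block of 
the tape, so it can be peeked at even without any Amanda infrastructure - 
a rough sketch, with the tape device name as an assumption:

   mt -f /dev/nst0 rewind
   # the tape label / state header should be readable in the first 32k block
   dd if=/dev/nst0 bs=32k count=1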



Considering your experience, I have to believe you;  we can always backup
very simply, even very large systems.  It's just weird to picture, all these
complex backup systems would be useless?  (I know, it's not a binary answer,
but you know what I mean.)


I'm not saying they are useless but organisations do need to take more time 
considering DR, I think.  Large organisations will have fully operational 
DR sites and they can afford to run a database for their backup system 
since they can expect at least one of their sites to be operational at any 
given time.


I have known people who run a copy of the backup DB on a laptop which is 
supposedly kept offsite.  These laptops likely come on site occasionally 
and they are a prime candidate for bitrot.


Anything that gets between me and data restoration makes me nervous :)

And for those people who think that off-site/off-line backups aren't needed 
anymore because you can just replicate data across the network, I'll give 
you 5 minutes to find the flaw in that plan :)


I guess I'm perfectly OK with that, but are we still talking about
workstations?  :-)


I'm talking about servers.  There is no substitute for offsite/offline 
backups and there never will be.  This is one of the few topics where I 
will use absolute statements like this.


You can never predict the nature of the failure.  If you try to figure out 
how a failure will occur then you will sooner or later run in to a failure 
of imagination.


The only way to guarantee against a single disaster of a certain size is 
to physically separate the data stores by a sufficient distance and keep 
the backups offline.


No technology can change this fundamental truth since our understanding of 
the possible failure modes will always be incomplete.



My understanding is that the "cached" column of the output of free(1) is the
sum of all pages, clean and dirty.  The "buffers" column would be the


Right.  It might be nice if free did display them separately.  It would 
confuse people less then :)  /proc certainly presents the info.  Check out 
the source of 'free' - it is a really simple application.



Since there's no "cached" column for the swapspace, I guess no clean page
gets pushed there, although it could be useful if that space is on a
significantly faster volume.  Anyway, the "used" column should be the total,
actual swapspace used, so your comment kind of confuses me.  Am I really
wrong here?


I'd recommend doing some reading.  The cached system memory and the swap 
space displayed by free are really unrelated concepts (at least at the 
level we're talking about here).
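
If you want to poke at the underlying numbers yourself, /proc/meminfo 
breaks them out, e.g.:

   # page cache, buffer cache and the swap counters, straight from the kernel
   grep -E '^(Buffers|Cached|SwapCached|SwapTotal|SwapFree|Dirty):' /proc/meminfo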


If you want to chat on IRC about fun subjects like caching and swap space 
sometime you can find me as Solver on Freenode & OFTC.


Cheers,

Rob

--
Email: rob...@timetraveller.org
IRC: Solver
Web: http://www.practicalsysadmin.com
I tried to change the world but they had a no-return policy





Re: Single root filesystem evilness decreasing in 2010? (on workstations) [LONG]

2010-03-04 Thread thib

Robert Brockway wrote:

[...]
Possibly.  I didn't mean to suggest that dd was a good way to backup.  I 
think it is a terrible way to backup[1].  I was talking about dump 
utilities.  I started using dump on Solaris in the mid 90s and really 
like the approach to backing up that dump utilities offer.  On Linux I 
use xfs a lot and backup with xfsdump in many cases.


OK, now we're on the same wavelength.


[...]
Sure.  GPFS (a commercial filesystem available for Linux) allows for the 
addition of i-nodes dynamically.  We can expect more and more dynamic 
changes to filesystems as the science advances.


I once nearly ran out of i-nodes on a 20TB GPFS filesystem on a SAN. 
Being able to dynamically add i-nodes was a huge relief.  I didn't even 
need to unmount the filesystem.


Oh, that.  I was looking too far again - that's neat indeed.

I was actually referring to the performance problem in my original post;
which of course highly depends on the filesystem type.  I remember reading
stuff about NTFS (old stuff, probably not relevant anymore) saying that a
big MFT would impact performance by 10-20% (whatever that means) depending
on its size.  I was wondering whether that could be true for other
filesystems as well, but I suspect not, since I've never seen anyone
actually considering this.

OTOH - I haven't studied XFS - but from the little overviews I read about
it, I suppose its allocation groups are a way to scale with this problem
(along with other unrelated advantages like parallelism in multithreaded
environments).  What happens if a filesystem doesn't have anything like it?

BTW, that's a way to dynamically (and automatically) add i-nodes, too
(unless I've missed the point).

Maybe no-one cares because we currently don't have filesystems big enough to
actually see the problem?


[...]
The core of any DR plan is the KISS principle.  There's a good chance 
that the poor guy doing the DR is doing it at 3am so the instructions 
need to be simple to reduce the chance of errors.


If the backup solution requires me to have a working DB just to extract 
data or wants me to install an OS and the app before I can get rolling 
then I view it with extreme suspicion.


I agree with that, but I know it's because I, personally, *need* to know
what's going on, all the time.  Some people are OK with letting a program
(even such a critical one) do some magic;  and without having tested any
"complex" one, I suspect they try to KIS for the user.

The problem is, if there's a problem with the backup system itself, then
it's going to be a long night.  If there's no need for such software, I,
again, agree, there's no use to take risks, even if they're minimal.

Considering your experience, I have to believe you;  we can always backup
very simply, even very large systems.  It's just weird to picture, all these
complex backup systems would be useless?  (I know, it's not a binary answer,
but you know what I mean.)

And for those people who think that off-site/off-line backups aren't 
needed anymore because you can just replicate data across the network, 
I'll give you 5 minutes to find the flaw in that plan :)


I guess I'm perfectly OK with that, but are we still talking about
workstations?  :-)


[...]
Free is telling you the total memory in disk cache.  Any given page in 
the cache may be 'dirty' or 'clean'.  A dirty page has not yet been 
written to disk.  New pages start out dirty.  Within about 30 seconds 
(varies by filesystem and other factors) the page is written to disk.  
The page in the cache is now clean.


Unless your system is writing heavily most pages in the cache are likely 
to be clean.


Yup, I think I had that right.

The difference is that clean pages can be dumped instantly to reclaim 
the memory.  Dirty pages must be flushed to disk before they can be 
reclaimed.  Using clean pages allows fast read access from the cache 
without the risk of not having committed the data.  I describe this as 
'having your cake and eating it too'[2].


My understanding is that the "cached" column of the output of free(1) is the
sum of all pages, clean and dirty.  The "buffers" column would be the
kernel-space (implied by the manpage), and "used"-"buffers"-"cached" would
be the userspace.

Since there's no "cached" column for the swapspace, I guess no clean page
gets pushed there, although it could be useful if that space is on a
significantly faster volume.  Anyway, the "used" column should be the total,
actual swapspace used, so your comment kind of confuses me.  Am I really
wrong here?

I can't find any documentation in the procps package, and I think I need some.


[...]


Thanks.

-thib





Re: Single root filesystem evilness decreasing in 2010? (on workstations) [LONG]

2010-03-03 Thread Robert Brockway

On Thu, 4 Mar 2010, thib wrote:

If restore speed is really that critical, it should still be possible to 
generate an image without including the free space - I know virtualization 
techs are doing it just fine for most filesystems.


Maybe we misunderstood each other - saw a different problem.


Possibly.  I didn't mean to suggest that dd was a good way to backup.  I 
think it is a terrible way to backup[1].  I was talking about dump 
utilities.  I started using dump on Solaris in the mid 90s and really like 
the approach to backing up that dump utilities offer.  On Linux I use xfs 
a lot and backup with xfsdump in many cases.


[1] A long time ago I used to use it to backup MS-Windows systems from 
Linux but disks grew so much it became infeasible.


I recommend backing up all system binaries.  It's the only way you can 
guarantee you will get back to the same system you had before the rebuild. 
This is most important for servers where even small behavioural changes can 
impact the system in a big way.


So you don't trust Debian stable to be stable?  :-)


Actually I'd say Debian is best-of-breed when it comes to backporting 
security patches to retain consistent functionality.  Having said that, 
system binaries represent an ever-reducing proportion of total data on a 
computer system.  When I first started with Linux the OS took up about 80% 
of the available disk space that I had.  Today I'd be generous if I said 
it took up 2%.  So even if there is an alternative, backing them up now is 
hardly onerous and improves the chances of a successful disaster recovery. 
I cover this more in the backup talk.


Thanks a lot;  that's a talk full of useful checklists.  I'll definitely eat 
your wiki pages when I have the time.


Great.  I'm gradually adding more and more info to the site.

While this may be a problem now I think it will be less of a problem in the 
future as some filesystems already allow you to add i-nodes dynamically and 
this will increasingly be the case.


I'm not sure I follow you, but that sounds cool.  Could you elaborate?


Sure.  GPFS (a commercial filesystem available for Linux) allows for the 
addition of i-nodes dynamically.  We can expect more and more dynamic 
changes to filesystems as the science advances.


I once nearly ran out of i-nodes on a 20TB GPFS filesystem on a SAN. 
Being able to dynamically add i-nodes was a huge relief.  I didn't even 
need to unmount the filesystem.
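
A quick sanity check that works on most filesystems (the mount point here 
is just an example) will show i-node usage before it bites:

   # -i reports i-node totals/used/free instead of blocks
   df -i /gpfs/data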


Anyway, my preference isn't based on my own experience so I'm not actually 
using anything like that, but I'm willing to look at and try fsarchiver and 
see if it can really beat simple ad-hoc scripts for my needs.  Or something 
heavier, just for fun (Bacula?).


I'm fairly particular about backup systems.  I think most people who 
design backup systems have never done a DR in the real world.


I seem to end up having to do at least one large scale DR per year.  I've 
done two in the last month.  I've done several DRs in the multi-TB range.


Virtually every DR I've done has had a hardware fault as the underlying cause. 
In several cases multiple (supposedly independent) systems failed 
simultaneously.


The core of any DR plan is the KISS principle.  There's a good chance that 
the poor guy doing the DR is doing it at 3am so the instructions need to 
be simple to reduce the chance of errors.


If the backup solution requires me to have a working DB just to extract 
data or wants me to install an OS and the app before I can get rolling 
then I view it with extreme suspicion.


And for those people who think that off-site/off-line backups aren't 
needed anymore because you can just replicate data across the network, 
I'll give you 5 minutes to find the flaw in that plan :)


Ah but they are.   Cache pages may be clean or dirty.  Your disk cache may 
be full of clean cache pages, which is just fine.


Am I interpreting the output of free(1) the wrong way?


Sort of :)

Free is telling you the total memory in disk cache.  Any given page in the 
cache may be 'dirty' or 'clean'.  A dirty page has not yet been written to 
disk.  New pages start out dirty.  Within about 30 seconds (varies by 
filesystem and other factors) the page is written to disk.  The page in 
the cache is now clean.


Unless your system is writing heavily most pages in the cache are likely 
to be clean.


The difference is that clean pages can be dumped instantly to reclaim the 
memory.  Dirty pages must be flushed to disk before they can be 
reclaimed.  Using clean pages allows fast read access from the cache 
without the risk of not having committed the data.  I describe this as 
'having your cake and eating it too'[2].
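
A simple way to watch this, if you're curious (values are in kB):

   grep -E '^(Dirty|Writeback):' /proc/meminfo
   sync     # force all dirty pages out to disk
   grep -E '^(Dirty|Writeback):' /proc/meminfo   # Dirty should now be near zero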


More info can be found here:

http://en.wikipedia.org/wiki/Page_cache

[2] Paraphrase of English language saying.


 cay:~$ free -o
              total       used       free     shared    buffers     cached
 Mem:       3116748    3029124      87624          0     721500    1548628
 Swap:      3145720        800    3144920

Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-03-03 Thread thib

Robert Brockway wrote:
> [...]
Some filesystems such as XFS & ZFS allow you to effectively set quotas 
on parts of the filesystem.  I think we'll see this becoming more 
common. This takes away a big part of the need for multiple filesystems.


This is a neat feature indeed.  And you're right;  apparently, work is 
being done on ext4.


  http://lwn.net/Articles/373513/


* Specific mount options
[...]


This is a good point.  I actually hadn't considered this in my list. 
I'll respond by saying that in general the mount options I use for 
different filesystems on the same box do not vary much (or at all) in 
practice.


I've just discovered bindfs [1], a FUSE-based virtual filesystem, which 
might just answer this problem partially.  It looks quite nice, simple and 
flexible, but obviously won't be able to enable optimizations like 
noatime.  Don't know about the possible overhead though - at that point, one 
might want to go with "true" access control systems instead.


[1] http://code.google.com/p/bindfs/
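
Basic usage seems to be just this, going by the project page (untested here; 
the paths are examples):

   # mirror an existing tree at a second location via FUSE
   bindfs /srv/media /home/thib/media-ro
   # (the docs list options for read-only mirroring and remapped permissions)
   # detach it again
   fusermount -u /home/thib/media-ro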

> If I want a filesystem marked noatime then I probably want
> all the filesystems marked noatime.  There are exceptions to this of
> course.

Yep, like giving relatime to a filesystem containing mboxes or something 
like that.  But it's true, yes, access times are becoming less and less 
useful, and I can't think of another real problem (that isn't answered by 
access control systems) besides that one.



* System software replacement

Easier to reinstall the system if it's on separate volumes than conf 
and data?  Come on..


That's true but the time savings is not terribly great IMHO.  The system 
can be backing up and restoring the data while the human is off doing 
other stuff.  Saves computer time (cheap) but not human time (expensive).


Either way, there's software to automate and abstract it all.  I think the 
real question is really processing vs storage resources;  human resources 
are the same.


The only reason I saw for doing inflexible volume imaging to do backups is 
to avoid the filesystem formatting operations as well as file unpacking and 
copying operations when restoring, which are theoretically slower than copying 
a volume byte-by-byte.  "Whatever".


If restore speed is really that critical, it should still be possible to 
generate an image without including the free space - I know virtualization 
techs are doing it just fine for most filesystems.


Maybe we misunderstood each other - saw a different problem.


[...]
I recommend backing up all system binaries.  It's the only way you can 
guarantee you will get back to the same system you had before the 
rebuild. This is most important for servers where even small behavioural 
changes can impact the system in a big way.


So you don't trust Debian stable to be stable?  :-)


See this link for my talk on backups which goes in to this issue further:

http://www.timetraveller.org/talks/backup_talk.pdf

All the info in this talk is being transferred to 
http://www.practicalsysadmin.com.


Thanks a lot;  that's a talk full of useful checklists.  I'll definitely eat 
your wiki pages when I have the time.



[...]

* Metadata (i-node) table sizes


While this may be a problem now I think it will be less of a problem in 
the future as some filesystems already allow you to add i-nodes 
dynamically and this will increasingly be the case.


I'm not sure I follow you, but that sounds cool.  Could you elaborate?


* Block/Volume level operations (dm-crypt, backup, ...)

[...]
As said earlier, I don't need a fast backup solution.  I already 
prefer smarter filesystem-based backup systems in general.


As do I.  What do you use?  If you want to use dump with ext2/3/4 you 
will need to snapshot for data safety.


Actually I would think dump is a fast but "dumb" solution (much like 
partimage).  And yep, I know, LVM2 is just great for that.
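
A rough sketch of the snapshot-then-dump dance, with made-up VG/LV names:

   # snapshot the volume, dump the frozen copy, then drop the snapshot
   lvcreate -s -L 2G -n root-snap /dev/vg0/root
   dump -0 -f /backup/root.dump /dev/vg0/root-snap
   lvremove -f /dev/vg0/root-snap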


Anyway, my preference isn't based on my own experience so I'm not actually 
using anything like that, but I'm willing to look at and try fsarchiver and 
see if it can really beat simple ad-hoc scripts for my needs.  Or something 
heavier, just for fun (Bacula?).



[...]
In modern disks the sector layout is hidden.  The fastest sectors may be 
at the beginning of the disk, the end or striped throughout.  This is 
specific to the design of the HDD and it is no longer possible to tell 
short of doing timing tests[1].  My recommendation is to ignore 
differences in sector speeds.


[1] I'd love to hear if anyone has found a method but I can't see how 
they could get through the h/w abstraction.


Good to know, I've actually never seen anything fancy like that (striped 
throughout).  I'll test my disks to see how I can make the best out of them 
anyway - but I agree with you in the case one wants to setup a portable, 
deployable system.
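
Something like this should be enough for a crude start-vs-end comparison 
(the device and offsets are examples; pick a skip value near the end of the 
disk):

   # sequential read at the start of the disk, bypassing the page cache
   dd if=/dev/sda of=/dev/null bs=1M count=1024 iflag=direct
   # the same amount of data, read from far into the disk
   dd if=/dev/sda of=/dev/null bs=1M count=1024 skip=900000 iflag=direct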


LVM won't theoretically guarantee the physical position of the logical 
volumes anyway.  And I'll need it if I do any partitioning.


So now it is abstracted (at least) tw

Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-03-03 Thread Robert Brockway

On Sun, 28 Feb 2010, Clive McBarton wrote:


Ignore swap, that's just small stuff, especially with 3GB. You could
have 64GB and it would still be not that important. Put it on any
partition or file you want.
The rule is 1:2 BTW.


Hi Clive.  I liked the rest of your post but I did want to make one little 
comment here.


The 1:2 rule was true on some versions of *nix in the past.  It has never 
been true on Linux and is not, AFAIK, true on any modern version of *nix. 
With improvements in disk capacity outstripping improvements in disk I/O 
by a significant margin, 1:2 would mean that the system would be 
effectively unusable from thrashing long before you had used half of 
the allocated swap.


I cover this a little more here:

http://practicalsysadmin.com/wiki/index.php/Swap_space

Cheers,

Rob


Well, here it is;  so, should I do it?

If you feel like tinkering and sorting out problems, then yes. If you
want to just get your computer running and never think about it again,
then no.





--
Email: rob...@timetraveller.org
IRC: Solver
Web: http://www.practicalsysadmin.com
I tried to change the world but they had a no-return policy





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-03-03 Thread Robert Brockway

On Sun, 28 Feb 2010, thib wrote:

Usually I never ask myself whether I should organize my disks into separate 
filesystems or not.  I just think "how?" and I go with a cool layout without 
thinking back - LVM lets us correct them easily anyway.  I should even say 
that I believed a single root filesystem on a system was "a first sign" (you 
know what I mean ;-).


But now I'm about to try a new setup for a Squeeze/Sid *workstation*, and I 
somehow feel I could be a little more open-minded.  I'd like some input on 
what I covered, and more importantly, on what I may have missed.  Maybe 
someone can point me to some actually useful lists of advantages to 
"partitioning"?  I find a lot of BS around the net, people often miss the 
purpose of it.


So, what are the advantages I see, and why don't they matter to me anymore?


I've been pondering this myself of late.   I was going to post to another 
list (SAGE or SAGE-AU) but you've done such a nice list of 
advantages/disadvantages that I think I'll piggyback here :)


I'm a long-time sysadmin and generally deal with servers but of course 
people ask my advice on workstations too.



* Filesystem corruption containment


A corrupt root filesystem is _much_ worse than a corrupt non-root 
filesystem.  As long as the root FS is ok the box will boot, possibly 
without network access.


OTOH a box booted with just the root FS mounted is probably pretty 
useless.  These days if a box has any filesystem problem I'm likely to 
boot it from a live cdrom to perform recoveries.  In the past reasonable 
alternatives were available too though.  Some Unixen in the 80s could boot 
from tape for example.


In the end the final defence against a corrupt filesystem is the backups.



* Free space issues

Since I'm the only one who uses this machine, I should know if something may 
go wrong and eat up my entire filesystem (which is quite big for a 
workstation).  Yes, I still monitor them constantly.


On servers there is a concern about one part of the filesystem gobbling 
all the space.  This has been one of the most compelling reasons to use 
multiple partitions.


Some filesystems such as XFS & ZFS allow you to effectively set quotas on 
parts of the filesystem.  I think we'll see this becoming more common. 
This takes away a big part of the need for multiple filesystems.
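
On XFS that's done with project quotas - roughly like this, with the mount 
point, directory and numbers made up:

   # mount with project quota accounting enabled
   mount -o prjquota /dev/vg0/data /data
   # tie project 42 ("logs") to one subtree, then cap it at 20G
   echo '42:/data/logs' >> /etc/projects
   echo 'logs:42'       >> /etc/projid
   xfs_quota -x -c 'project -s logs' /data
   xfs_quota -x -c 'limit -p bhard=20g logs' /data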



* Specific mount options

According to the Lenny manpage, mount(8) --bind won't allow me to set 
specific options to the remounted tree, I wonder if this limitation can 
possibly be lifted.  If not, I think a dummy virtual filesystem would do the 
trick, but that seems kludgy, doesn't it?  Pointers?


I guess I could live without it, but I would actually find this quite 
annoying.


This is a good point.  I actually hadn't considered this in my list. 
I'll respond by saying that in general the mount options I use for 
different filesystems on the same box do not vary much (or at all) in 
practice.  If I want a filesystem marked noatime then I probably want all 
the filesystems marked noatime.  There are exceptions to this of course.




* System software replacement

Easier to reinstall the system if it's on separate volumes than conf and 
data?  Come on..


That's true but the time savings is not terribly great IMHO.  The system 
can be backing up and restoring the data while the human is off doing 
other stuff.  Saves computer time (cheap) but not human time (expensive).


For a workstation, I don't need a fast system recovery mechanism, and I want 
to minimize my backup sizes.  I'd rather save a list of selections rather 
than a big archive of binaries.


I recommend backing up all system binaries.  It's the only way you can 
guarantee you will get back to the same system you had before the rebuild. 
This is most important for servers where even small behavioural changes can 
impact the system in a big way.
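
For what it's worth, the "list of selections" approach mentioned above would 
look roughly like this on Debian (the file name is just an example):

   # on the old box: record which packages were installed
   dpkg --get-selections > selections.txt
   # on the rebuilt box: feed the list back in and install everything
   dpkg --set-selections < selections.txt
   apt-get dselect-upgrade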


See this link for my talk on backups which goes in to this issue further:

http://www.timetraveller.org/talks/backup_talk.pdf

All the info in this talk is being transferred to 
http://www.practicalsysadmin.com.



* Fragmentation optimization

One of the most obvious advantages, and usually my main motivation to 
separate these logs, spools, misc system variable data, temporary 
directories, personal data, static software and configuration files.


This is less of an issue than it used to be.  Even ext2 will work towards 
minimising fragmentation.  Several *nix filesystems now allow for online 
defragmentation (eg, xfs).  I expect this problem will completely vanish 
in the future.
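
On XFS, for example, checking and fixing fragmentation online is a one-liner 
each (device and mount point are examples):

   # report the fragmentation factor
   xfs_db -r -c frag /dev/vg0/home
   # defragment the mounted filesystem in place
   xfs_fsr -v /home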



* Metadata (i-node) table sizes


While this may be a problem now I think it will be less of a problem in 
the future as some filesystems already allow you to add i-nodes 
dynamically and this will increasingly be the case.



* Block/Volume level operations (dm-crypt, backup, ...)

Encryption (with LUKS) in particular should beat any implementation at 
filesystem level.  I don't have any number to back that up, ho

Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-03-03 Thread Robert Brockway

On Sun, 28 Feb 2010, Stan Hoeppner wrote:


swap   4GB             may never need it, but u have plenty of disk
/boot  100MB   ext2    safe call, even if grub(2) doesn't need a /boot
/      40GB    ext2/3  journal may eliminate mandatory check interval
/var   up2u    ext2    sequential write/read, journal unnecessary


Hi Stan.  Questions about the need for a journal aside, if you run ext2
then you will get a fsck following a crash _whether you need it or not_.
Yes you can bypass it but the filesystem will be marked dirty.

My recommendation is to not use ext2 unless you have to and then only on
small filesystems.


/home  up2u    xfs     best performance for all file sizes and counts


As a long time sysadmin I counsel against mixing filesystems like this
unless there is a compelling reason.   Using different filesystems drives
up management overhead.  The tools and procedures to fix different
filesystems are different so you're expecting someone to have to deal with
both cases in the event of problems.  Methods to backup can also vary,
etc.

This is an example of the classic 'heterogeneous vs homogeneous system'
argument applied to filesystems.

*You may trust ext4 at this point, but I, and many others don't.  xfs 
beats ext4 in every category, so why bother with ext4?


I do rather like xfs and it can be used for every regular filesystem on
the box.

Cheers,

Rob

--
Email: rob...@timetraveller.org
IRC: Solver
Web: http://www.practicalsysadmin.com
I tried to change the world but they had a no-return policy





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-03-01 Thread thib

Clive McBarton wrote:

google "ext4 kde4" and the first hit is "Data loss may occurr when using
ext4 and KDE 4". I think Ubuntu offered ext4 as optional then and many
people ran into problems, supposedly massive data loss. XFS would be the
same. Application programmers don't cope with delayed allocation, and
since you cannot fix all the apps, you'd be stuck with the problem.

Apart from specific technical issues, there's general conservatism, most
of all in Debian.


[off topic]

Yep, I get that, and I know we can't possibly fix everything, but do we 
really need to?  In a way, all these apps are not even POSIX compliant for 
the operations they intend to do, they're crappy-ext3-lazy-standards 
compliant.  Nobody should support that.  Linux supports many filesystems, 
and we shouldn't be stuck with only one for pseudo-reliable usage because a 
single one of its generations had an odd behavior *some* people (not the 
entire world) based their software upon.  I understand people have lived 
happily with XFS for a long time, which "suffers" from the same "problems".


What's worse, in my opinion, is that people feel safer with ext3 than with 
ext4.  What's the difference *for these apps*?  Their data gets 
automatically flushed every 5 seconds instead of every 30-35 seconds 
(because of data=ordered).  Great.  If that's all that matters, let's just 
change some arbitrary numbers in ext4..


  /proc/sys/vm/dirty_expire_centisecs
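
For example (the value is in hundredths of a second; the numbers below are 
just the common default and an ext3-ish equivalent):

  cat /proc/sys/vm/dirty_expire_centisecs          # 3000, i.e. 30 seconds
  echo 500 > /proc/sys/vm/dirty_expire_centisecs   # roughly ext3's 5-second feel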

I think the real buzz around this issue comes from this user who got his DE 
personal conf and "some personal data" files nuked because he crashed before 
they could reach the disk - which spread fear and panic in exaggerated ways 
among bloggers (I read the story a hundred times).  This could have happened 
in ext3 with another bad timing.


Applications should tell when to sync if they need to, not rely on a 
particular filesystem to do it "frequently enough" (whatever that means). 
That allows software that does *not* need to (or needs it less often) to be 
more efficient.  In some cases, a temporary file might never need to reach 
the platters.  Let it happen more often.


That being said, I tend to think that there may be less software that 
really needs some love than we think.  Note that in the meantime, there 
are hacks available to force-flush truncated files and such, which 
should help apps that don't receive that love but still need it.  Again, 
probably little software needs any.


Those paranoid about delayed allocation can also disable it.  If Debian and 
some other distros decide to do that (and/or use hacks), I certainly 
wouldn't mind personally (it's not like we've ever been forced to accept 
anything), but I would [mind] if we decided to hold back on ext3 - just 
because it seems stupid to ignore every other feature ext4 comes with, 
based on a funny story on a random bug tracker.


Anyway, that topic is out of my league, and people have talked about it in 
far greater depth than that in various places - I believe I understand enough 
of it to make up my mind.  Anybody feel free to correct me if I sound misguided.



That's a very interesting point. Filesystems *not* responsible for data
integrity? Wow. While I do get the idea (move integrity checking up to
higher-level structures to improve throughput), and I am sure it will speed
things up greatly when it works, doesn't this require all your software
to first be rewritten to take care of it?


AFAIK, there has never been such a contract that filesystems should 
guarantee it.  In fact, they can't possibly do so.  If some data in file A only 
makes sense if there's some specific data in file B, for example, only the 
application that writes them knows - the filesystem can't detect the 
corruption of the data if one file has been written but not the other.


If data integrity is important for an application, its writer should always 
have the question "OK, what if it crashes *there*?" in mind and think about 
an ad-hoc mechanism to make the operation atomic, in the sense that fits the 
data structure of the application.  If one wants some generic integrity 
features, there's plenty of database software around - and by that I also 
mean simple embedded/nosql stuff which could even write plain text.


Filesystem journals only guarantee that you won't get backstabbed by the 
system losing or overwriting a block that is part of a previously coherent 
data structure.



[and back on, sorry for that]

> [...]

Your request is perfectly reasonable. It is clearly possible in theory,
and I believe some Unix OS actually have it (don't know which though).
It is actually required for some backup schemes (which hence don't work
under Linux).


Good to know.


Quick googling gave me http://lwn.net/Articles/281157/ where they say
the limitation exists up to 2.6.25 kernels (the article is from 2008
though).


Whoa, thanks, I couldn't dig that one up.  I'll go ask.
Apparently, it wasn't included in 2.6.26 (silently fails on a Lenny 
machine)

Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Stefan Monnier
> So, what are the advantages I see, and why don't they matter to me anymore?

First off, IIUC you seem to want to use LVM, right?
I'd agree with this choice: there's little reason not to use LVM nowadays.
Once you've decided to use LVM, then the rest (happily) doesn't really
matter anyway, since you can change your mind later on without too much
trouble.

The way I usually do it: one small /boot ext3 partition, and everything
else in LVM.  After that, do as you feel like.
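
For what it's worth, the LVM side of that is only a handful of commands 
(device names and sizes are just examples):

    pvcreate /dev/sda2                 # /dev/sda1 stays as the small /boot
    vgcreate vg0 /dev/sda2
    lvcreate -L 20G -n root vg0
    lvcreate -L 4G  -n swap vg0
    mkfs.ext3 /dev/vg0/root
    mkswap /dev/vg0/swap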

As for all those issues you're pondering: they really don't matter that
much for a workstation.  I still use a "/, and /home" split (plus /boot)
on my machines, mostly out of habit.

I've been known to create separate partitions for my music and my DVD
collections, but that's only to work around the static-number-of-inodes
limitation of ext3, where multimedia content ends up wasting a lot of
inodes and I wanted to gain back a few more percent of disk space.

Since I have separate filesystems, I sometimes mount them with different
options (like nodev for /home), but I'm not paranoid enough to use
separate filesystems just to be able to use such mount options.


Stefan





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Clive McBarton

Alex Samad wrote:
> my 2c, with the size of HD's and the processing power we have now, I
> really wonder if spending more than a second on deciding on a single
> partition or not is worth it. 

It's theoretical reasoning. It's good for understanding. And no, it's
not worth the time for people who, unlike the thread starter, just want
things up and running. But his questions are good to think about in
principle, since distros like Debian need such thinking prior to changing
the partition recommendations.

> Is the amount of space lost - expressed
> as a percentage of the disk - really worth all the time being spent on it
> ?
> 
> And the cpu overhead for using separate partitions and lvm - again with
> todays cpus

No, *they* won't be worth it. Neither space nor CPU will show noticeable
improvements either way as far as I can see.






Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Clive McBarton

thib wrote:
>> You trust ext4, and so does Ubuntu. Others (including most distros,
>> including Debian) do not.
> 
> I'm sorry if I should know, but is that a clear position or the general
> fear around delayed allocation?  

google "ext4 kde4" and the first hit is "Data loss may occurr when using
ext4 and KDE 4". I think Ubuntu offered ext4 as optional then and many
people ran into problems, supposedly massive data loss. XFS would be the
same. Application programmers don't cope with delayed allocation, and
since you cannot fix all the apps, you'd be stuck with the problem.

Apart from specific technical issues, there's general conservatism, most
of all in Debian.

> I'd say that I only trust it for its
> own integrity management, not that of my data.  I don't think anyone
> should expect that from a filesystem, that's, to my knowledge, what
> databases are for.  

That's a very interesting point. Filesystems *not* responsible for data
integrity? Wow. While I do get the idea (move integrity checking up to
higher-level structures to improve throughput), and I am sure it will speed
things up greatly when it works, doesn't this require all your software
to first be rewritten to take care of it?

>>> * Specific mount options
>>> mount(8) --bind won't allow me to set
>>> specific options to the remounted tree, I wonder if this limitation can
>>> possibly be lifted. 
>> I have not heard of any way around it, and since you find it annoying,
>> that speaks against your single filesystem plan.
> 
> Yep;  but that doesn't seem right, I don't see why it couldn't be possible.
> Can somebody recommend me where I could forward this discussion?  The
> kernel lists?  I'm not sure.

Your request is perfectly reasonable. It is clearly possible in theory,
and I believe some Unix OS actually have it (don't know which though).
It is actually required for some backup schemes (which hence don't work
under Linux).

Quick googling gave me http://lwn.net/Articles/281157/ where they say
the limitation exists up to 2.6.25 kernels (the article is from 2008
though).

> I
> actually managed to dig a benchmark, yes.  It showed a greater hit than that
> (I won't brag) but when you think about it, you'd really have to torture
> the filesystem to see it.  

Possible. I'd like to see it; I don't know any LVM benchmarks,
unfortunately.

> sequential read
> at the beginning of the disk can be twice as fast as at the end?

Sure. That's not fiddling with individual sectors and 3D coordinates on
the HDD, but simply using partitions at the beginning of the disk. If
you care about a factor 2, then do partition it.

> I think everybody should keep a handy recovery live CD around;  in fact,
> one would have enough with a separate partition only if the GRUB
> LVM/RAID modules break - if the core breaks, it's of no help.

Good point. A recovery CD obsoletes recovery partitions sometimes.





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread thib

Clive McBarton wrote:

*You may trust ext4 at this point, but I, and many others don't.  xfs beats
ext4 in every category, so why bother with ext4?


Exactly. If any Ubuntu maintainers were on this list, we could ask them;
presumably they see some reason for it (but I don't know what it is).


Are there really discussions about defaulting to XFS?

-thib





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread thib

Stan Hoeppner wrote:

Use LILO instead of grub(2), and stick the boot loader on the MBR.  The
/boot partition isn't absolutely necessary, but it provides a small amount
of additional safety and system compatibility from a boot perspective.


What exactly can I gain from LILO in this case?  I was just about to really 
dive into GRUB2, so if I missed a war or something, I'd be happy to know more.


From what I understand, LILO is filesystem agnostic (busted, I'm not an 
old-timer), so a separate /boot would be greatly recommended, indeed. 
Currently, I see that as a limitation.



XFS was specifically designed for excellent performance on gigantic volumes.
 It won't break a sweat with a 1-2TB filesystem.  A 10TB filesystem is a
small snack for XFS; a 500TB filesystem would be lunch; a Petabyte
filesystem might be dinner.  Maximum individual file size is 8 Exabytes and
maximum filesystem size is 16 Exabytes.


Okay, I'll bite.  I must admit I haven't considered it seriously before. 
There are tons of stuff to read about it, so I'll take my time.



Go with a single big XFS root filesystem and you'll never have to worry
about jockeying stuff around and resizing partitions.  You won't have to
worry about performance either.


That's quickly said but, thanks. :-)
I'll make up my mind.

-thib





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Stan Hoeppner
thib put forth on 2/28/2010 1:13 PM:
> [Not sure you've seen;  I messed up, hence the second message:  this
> conversation went private, would you like to keep it that way?]

Oh, I thought you meant to go private so I was honoring that.  I'll go back
on list with this.

> Well, for someone who owns a domain named "hardwarefreak.com" I'd have
> hoped you could understand why I'd want all I can get from my hardware
> ;-).  

The hardwarefreaks have an NFS/CIFS file server for data and an IMAP server
for mail, so it's easier deciding on the disk layout for a workstation,
basically /boot and /. ;)

> Many points I enumerated were about performance optimization,
> others are often said to be important even for a workstation.  I know no
> kitten will be murdered if I don't take the time to think about all
> this, but since I have it, and that apparently it's interesting to at
> least one other person, well, why not try to learn something today.

No argument there.

>> Why do you need to be resizing volumes?
> 
> I.. just need it.  My /usr/local might vary from 20 to 60, maybe 80
> gigs, for example, and I'd like that space to be available for other
> things when it's not used there.

Given the requirements you've stated, I think your best option would be

100 MB EXT2  /boot
remaining GBs  XFS   /

Use LILO instead of grub(2), and stick the boot loader on the MBR.  The
/boot partition isn't absolutely necessary, but it provides a small amount
of additional safety and system compatibility from a boot perspective.

XFS was specifically designed for excellent performance on gigantic volumes.
 It won't break a sweat with a 1-2TB filesystem.  A 10TB filesystem is a
small snack for XFS; a 500TB filesystem would be lunch; a Petabyte
filesystem might be dinner.  Maximum individual file size is 8 Exabytes and
maximum filesystem size is 16 Exabytes.

Go with a single big XFS root filesystem and you'll never have to worry
about jockeying stuff around and resizing partitions.  You won't have to
worry about performance either.

-- 
Stan







Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread thib

Alex Samad wrote:

my 2c, with the size of HD's and the processing power we have now, I
really wonder if spending more than a second on deciding on a single
partition or not is worth it.  Is the amount of space lost - expressed
as a percentage of the disk - really worth all the time being spent on it
?


I think so, yes.  My /usr/local may vary by more than 50G (very quickly, but 
very rarely), for example.  When it's not eating up ~60-80G, I'd like this 
space to be available for something else, probably /home.


So, it's either binding/linking the trees on a single filesystem (what I 
would call a hack), using LVM more than I should, or stopping just a second 
to think about the advantages of partitioning that I can afford to lose.
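
The LVM route isn't much work when the time comes, for what it's worth 
(names and sizes below are examples; growing works online with ext3, 
shrinking needs an unmount):

   lvextend -L +40G /dev/vg0/usrlocal
   resize2fs /dev/vg0/usrlocal         # grow the filesystem to fill the LV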



And the cpu overhead for using separate partitions and lvm - again with
todays cpus


Hmm, it works the other way around:  you usually get performance boosts 
*from* the separation of your volumes.  Even if you simply want the best 
performance, and you have the time and interest to actually look into it, 
you might still find yourself wondering, like me, if it is necessary at all 
to think about any volume layout.


(Since we're talking about the burden of thinking, I'd say going with a 
single root filesystem is quite easier.)


Now, as we just discussed, it's not all about CPU and disk space, but 
sometimes about practicalities (reinstallation, block level operations, 
boot) or safety (mount options, filesystem corruption containment, free 
space issues).



my default laptop build is
partition 1 /boot - ext2 or ext3 about 500M - 1G
swap = to mem - my last 3 laptops have >3G memory so if I am swapping
there is something wrong
rest = 1 big partition, currently ext4 


It seems you're already alright with losing the benefits of partitioning 
(except for your boot partition, which might not even be necessary in your 
case), but most people still believe it's better to take the time to 
separate the data into several filesystems (even just a few), even for a workstation.


-thib





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Alex Samad
On Sun, Feb 28, 2010 at 07:34:03PM +0100, thib wrote:
> Clive McBarton wrote:
> >I find the concept very interesting in principle, although I am not sure
> >I can recommend it. In some respects single file systems are more
> >acceptable nowadays. In others they are not. Here are my $.02:
> 

my 2c, with the size of HD's and the processing power we have now, I
really wonder if spending more than a second on deciding on a single
partition or not is worth it.  Is the amount of space lost - expressed
as a percentage of the disk - really worth all the time being spent on it
?

And the cpu overhead for using separate partitions and lvm - again with
todays cpus

my default laptop build is
partition 1 /boot - ext2 or ext3 about 500M - 1G
swap = to mem - my last 3 laptops have >3G memory so if I am swapping
there is something wrong
rest = 1 big partition, currently ext4 

alex


[snip]

> Thank you.




Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread thib

Clive McBarton wrote:

I find the concept very interesting in principle, although I am not sure
I can recommend it. In some respects single file systems are more
acceptable nowadays. In others they are not. Here are my $.02:


Thank you.


[...]
You trust ext4, and so does Ubuntu. Others (including most distros,
including Debian) do not.


I'm sorry if I should know, but is that a clear position or the general fear 
around delayed allocation?  I'd say that I only trust it for its own 
integrity management, not that of my data.  I don't think anyone should 
expect that from a filesystem, that's, to my knowledge, what databases are 
for.  Other than that, each application should take the necessary steps to 
ensure files are correctly flushed on disk (f(data)?sync(),..)


Anyway, one can still disable some features;  I hope ext4 will be mature 
enough in everyone's head by the time Squeeze is frozen; there are just 
some features that feel necessary.



[...]

* Specific mount options
mount(8) --bind won't allow me to set
specific options to the remounted tree, I wonder if this limitation can
possibly be lifted. 

I have not heard of any way around it, and since you find it annoying,
that speaks against your single filesystem plan.


Yep;  but that doesn't seem right, I don't see why it couldn't be possible.
Can somebody recommend where I could forward this discussion?  The kernel 
lists?  I'm not sure.
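
In the meantime, the two-step trick below reportedly gives at least a 
read-only bind mount on recent kernels - untested here, and the paths are 
examples:

   mount --bind /srv/data /home/thib/data
   mount -o remount,ro,bind /home/thib/data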



[...]
But you backup /home and the rest separately? Should.


Sure (but at filesystem level, not the whole byte-by-byte volume).


* Fragmentation optimization

What's "Fragmentation"? This is Unix ;) But seriously, unless the
difference is really measurable I wouldn't care.


Yes, you're right, especially with delayed allocation.


What's funny is that the physical extents now get fragmented, there's
just no way around it - and I believe that to this date, LVM2's
contiguous policy doesn't allow for defragmentation when it's stuck. 

Should it? Is there any noticeable impact? Hard evidence? Benchmarks?


That would be possible to do, so I guess it should, yes, but since that 
certainly should *not* be a priority for them, well, I'd forget about it.



[...]
If it's under 1%, ignore it.


I hate myself for not keeping track of these when I see them, but I actually 
managed to dig a benchmark, yes.  It showed a greater hit than that (I won't 
brag) but when you think about it, you'd really have to torture the 
filesystem to see it.  The article was quite old and seemed somewhat random, 
I wouldn't even trust it much.



* Block/Volume level operations (dm-crypt, backup, ...)
you know of any good benchmark of the main cryptographic virtual
filesystems?  

Ignore this issue, CPUs are much faster than needed for this.


Actually, with a fancy raid array and enough disks, you can achieve some 
throughput that might stress an older CPU, especially if it also has to 
manage the array in software.  Just went up to 1.20 load with a simple 
sequential write (and that CPU is not *that* old - 64 x2 3800+).  I think 
there's still some way to go to achieve absolute crypto transparency 
performance wise, and the CPU is the main player here.
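
A crude way to isolate the crypto cost is to read the same amount of data 
from the raw array and from the dm-crypt mapping on top of it while watching 
top - the device names are examples:

   dd if=/dev/md0 of=/dev/null bs=1M count=4096 iflag=direct
   dd if=/dev/mapper/secure of=/dev/null bs=1M count=4096 iflag=direct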



* Special block sizes for specific trees
I found a maildir with a 1k block size was more convenient than the
current 4k default

What's the advantage? Hardly size, unless you have more than 10^8 mails.


Well, since I don't need read performance for my mail, and the problem 
doesn't seem that hard to solve, I'd rather spend that space on something 
other than half-empty blocks.  So, yes, it's to gain space, but 
more in a "why not?" way.
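
Concretely, that's just a couple of mkfs options on the mail volume (device 
and numbers are examples):

   # 1k blocks and denser i-nodes for lots of small maildir files
   mkfs.ext3 -b 1024 -i 2048 /dev/vg0/mail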



* (Mad?) positioning optimizations
It's often said some sectors on some cylinders get better performance,

HDDs nowadays only use logical sector numbers. The old h/t/s
3D-interface is just there for compatibility and cannot access the true
h/t/s data of the HDD. Such optimization cannot work.


I found this as an example
http://www.tomshardware.co.uk/forum/250867-14-wd10eavs-disk-performance

And finally, that's something recent.  Apparently, um, sequential read at 
the beginning of the disk can be twice as fast as at the end?  I'm not sure 
if I should feel surprised, but I am.  As I just answered myself, that's 
another argument against a single root filesystem.



[...]
If grub2 breaks, you need another tiny partition, so might as well make
one now. The space loss won't hurt you.


I think everybody should keep a handy recovery live CD around;  in fact, one 
would have enough with a separate partition only if the GRUB LVM/RAID 
modules break - if the core breaks, it's of no help.



[...]
Ignore swap, that's just small stuff, especially with 3GB. You could
have 64GB and it would still be not that important. Put it on any
partition or file you want.


For a workstation, yeah.


The rule is 1:2 BTW.


Different schools ;-).


Well, here it is;  so, should I do it?

If you feel like tinkering and sorting out problems, then yes. If you
want

Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Clive McBarton

Stan Hoeppner wrote:
> /var   up2u    ext2    sequential write/read, journal unnecessary

I don't see the advantage of ext2 over ext3 here (or for that matter
anywhere else, which may just be my ignorance). The journal may be
unnecessary, but it doesn't cost much either, neither performance nor
space in noticeable quantities.

> *You may trust ext4 at this point, but I, and many others don't.  xfs beats
> ext4 in every category, so why bother with ext4?

Exactly. If any Ubuntu maintainers were on this list, we could ask them;
presumably they see some reason for it (but I don't know what it is).

> If you have a 500GB, 750GB, 1TB, 1.5TB, 2TB disk, leave the freak'n bulk of
> it unallocated until you actually need it. 

How exactly is that useful w/o LVM? How is the space supposed to be
included later?





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Clive McBarton

I find the concept very interesting in principle, although I am not sure
I can recommend it. In some respects single file systems are more
acceptable nowadays. In others they are not. Here are my $.02:

> * Filesystem corruption containment
> 
> I use ext4, and I've read enough about it to trust its developers for my
> workstations.  I don't think that's a risky bet.  
You trust ext4, and so does Ubuntu. Others (including most distros,
including Debian) do not.

> In fact, I believe
> this old statement dates back to when we didn't have journals, in the ext2 days.

It does not date back at all. Filesystem checks on ext3 can still take
hours on a perfectly clean filesystem. The quotient of read speed to
capacity of drives gets smaller with every new HDD generation,
converging to zero.

> * Free space issues
You are right on this one; single workstations have the fewest free space
issues without partitions.

> * Specific mount options
> mount(8) --bind won't allow me to set
> specific options to the remounted tree, I wonder if this limitation can
> possibly be lifted. 
I have not heard of any way around it, and since you find it annoying,
that speaks against your single filesystem plan.

> * System software replacement
> For a workstation, I don't need a fast system recovery mechanism, and I
> want to minimize my backup sizes. 
But you backup /home and the rest separately? Should.

> * Fragmentation optimization
What's "Fragmentation"? This is Unix ;) But seriously, unless the
difference is really measurable I wouldn't care.

> What's funny is that the physical extents now get fragmented, there's
> just no way around it - and I believe that to this date, LVM2's
> contiguous policy doesn't allow for defragmentation when it's stuck. 
Should it? Is there any noticeable impact? Hard evidence? Benchmarks?

> I also know the performance hit is minimal, the PE
> sizes can be and are typically quite big, but..  it's still there and
> should be avoided if possible.
If it's under 1%, ignore it.
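
For anyone who does want evidence on their own volumes, LVM will show how
the extents are actually laid out (device name invented):

  lvs --segments -o +devices    # how many segments each LV has been split into
  pvdisplay -m /dev/sda2        # map of allocated physical extents on one PV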

> there's an online
> defragmenter for ext4 I can afford to run regularly now.
I have not heard of fragmentation being a problem even with ext2.
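
For completeness, the tools thib is alluding to do exist, though how
useful they are is another matter (mount points and file names invented;
e4defrag ships only with fairly recent e2fsprogs):

  e4defrag -c /home       # ext4: -c only reports fragmentation, drop it to defragment
  xfs_fsr /home           # xfs: online reorganizer, meant to be run periodically
  filefrag some-big.iso   # extent count of a single file on any ext* filesystem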


> * Metadata (i-node) table sizes
Ignore this, whether the filesystem is over 1 TB or not. Unless you run out of inodes, it won't matter.
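
This is easy to verify on a running box (illustrative commands, device
name invented):

  df -i                                          # inodes used/free per mounted filesystem
  tune2fs -l /dev/sda1 | grep -i 'inode count'   # total inodes fixed at mkfs time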


> * Block/Volume level operations (dm-crypt, backup, ...)
> you know of any good benchmark of the main cryptographic virtual
> filesystems?  
Ignore this issue, CPUs are much faster than needed for this.
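
A quick way to sanity-check that on a given CPU, short of a real benchmark
of the crypto layers themselves (openssl is the portable check; cryptsetup
only gained a built-in benchmark command in later versions):

  openssl speed -evp aes-256-cbc   # raw cipher throughput of this machine's CPU
  # on newer cryptsetup (1.6 and later):
  # cryptsetup benchmark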

> * Special block sizes for specific trees
> I found a maildir with a 1k block size was more convenient than the
> current 4k default
What's the advantage? Hardly size, unless you have more than 10^8 mails.
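
For reference, the knob being discussed has to be chosen at mkfs time; a
sketch against invented spare partitions:

  mke2fs -t ext4 -b 1024 /dev/sdb5   # 1 KiB blocks: less slack per tiny maildir file
  mke2fs -t ext4 -b 4096 /dev/sdb6   # the usual default, for comparison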

> * (Mad?) positioning optimizations
> It's often said some sectors on some cylinders get better performance,
HDDs nowadays only expose logical sector numbers. The old
head/track/sector (C/H/S) interface is just there for compatibility and
cannot address the drive's true physical geometry. Such optimization
cannot work.

> * Boot obligations
>  I guess
> you'd still need a separate boot partition if you're stuck with another
> boot loader.  
If grub2 breaks, you need another tiny partition, so might as well make
one now. The space loss won't hurt you.
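
In practice that is one extra slice of the disk and one fstab line; a
sketch with invented device names:

  mke2fs /dev/sda1        # a small (100-200 MB) first partition, plain ext2 is plenty
  # /etc/fstab:
  # /dev/sda1   /boot   ext2   defaults,noatime   0   2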


> * Swap special-case
> I'm just OK with my three gigs.  The 1:1
> mem:swap rule has got to be wasting space here, hasn't it?
Ignore swap, that's just small stuff, especially with 3GB. You could
have 64GB and it still wouldn't be that important. Put it on any
partition or file you want.
The rule is 1:2 BTW.
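
And if more swap is ever wanted later, a swap file avoids repartitioning
entirely; a minimal sketch (path and size invented):

  dd if=/dev/zero of=/swapfile bs=1M count=2048   # a 2 GB file
  chmod 600 /swapfile
  mkswap /swapfile
  swapon /swapfile
  # to make it permanent, in /etc/fstab:
  # /swapfile   none   swap   sw   0   0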

> Well, here it is;  so, should I do it?
If you feel like tinkering and sorting out problems, then yes. If you
want to just get your computer running and never think about it again,
then no.





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread thib

Stan Hoeppner wrote:

> All of that talk and gyration over a workstation disk layout?  You never did
> mention what the primary application usage is on this machine, which should
> be a factor in how you set it up.  If you're an email warrior, what damn
> difference does it make, and why bother with LVM on a workstation?  What
> size is the new disk?


Well, sorry, I guess.  But why not?  Like you, I always come up with a 
little layout for my needs, and I'm here wondering if that's really useful.


*I'd bother with LVM to be able to resize my volumes easily, like I said. 
Also, I may not be able to use GPT because of dual-boot restrictions, and 
BIOS MBR's partition table may have limitations that affect me.  I don't 
think that's really the point anyway.



> Here's a safe bet, even with grub(2):
>
> swap    4GB             may never need it, but u have plenty of disk
> /boot   100MB   ext2    safe call, even if grub(2) doesn't need a /boot
> /       40GB    ext2/3  journal may eliminate mandatory check interval
> /var    up2u    ext2    sequential write/read, journal unnecessary
> /home   up2u    xfs     best performance for all file sizes and counts


For example, why would it be safer to put /boot aside on ext2 with grub2?
(Honest question; it was in my first post.)


I know, these layouts depend on so many things, and you can't possibly know
what I need if I don't tell you;  in fact I wasn't exactly asking for help
to partition my volumes (even though I appreciate yours, really), I was
trying to figure out how bad it would be if I didn't.  Yes, "in general" is
a broad subject that affects many things, and that's why I did all this talk
and gyration about it.  If the discussion isn't of interest, well, let it die.


You raised a point, however - that some trees might need journaling, and
some do not.  I don't follow you, especially for /var which, being
written to more often, is more subject to corruption.


BTW, it's possible to use ext4 without a journal (and before anyone points 
it out, no, it's not the same as ext2).
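
For the record, that looks like this (device name invented); mkfs can
create ext4 without a journal, and tune2fs can drop the journal from an
existing, unmounted ext4:

  mkfs.ext4 -O ^has_journal /dev/sdb7   # ext4 extents, allocator, etc., but no journal
  tune2fs -O ^has_journal /dev/sdb7     # or remove it later, filesystem unmounted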



> *You may trust ext4 at this point, but I, and many others don't.  xfs beats
> ext4 in every category, so why bother with ext4?


I don't think anybody wants to go there.  (That doesn't mean that I disagree 
with you.)



> If you have a 500GB, 750GB, 1TB, 1.5TB, 2TB disk, leave the freak'n bulk of
> it unallocated until you actually need it.  This rule alone eliminates much
> of the vacillation you are currently experiencing WRT "Omg what am I ever
> going to do with all this disk?!"


I'm sorry if it sounded that stupid to you, but I will need all of it.  What's
really a problem is that the disk usage of my local software will vary
*greatly* (nearly as much as my data, yes).  I either need to use LVM all
the time, mount --bind trees on a big filesystem containing all the most
variable data and software (which in some way defeats the purpose of
partitioning), or wonder whether any of that is necessary and go with the
noob "single root filesystem" way - which, as I'm trying to convince myself,
may not be that dumb.



Again, if you think this is pointless, leave me with it, I'll be alright.

-thib





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Stan Hoeppner
Andrei Popescu put forth on 2/28/2010 8:32 AM:
> On Sun,28.Feb.10, 03:20:38, Stan Hoeppner wrote:
> 
>> /var   up2u    ext2    sequential write/read, journal unnecessary
> 
> Would you mind going into details? I always thought the journal was 
> especially useful on partitions like /var where it is more likely that 
> the system will be writing something right before a crash/power failure, 
> but I'm definitely not an expert.

Depends on what you're writing and what platform.  Most of /var is
non-permanent data, /var/mail and /var/log being exceptions.  Depending on
what mail client is used and how it's used, /var/mail may not be used at
all, leaving most of the write activity to the log files.  Workstations
typically aren't writing a ton of data to /var/log, nothing like most
servers, so IMO a journaled FS isn't really necessary for /var.  Others may
have other opinions.
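
If one follows that reasoning, the practical difference is just which mkfs
gets run on that partition (device names invented):

  mke2fs /dev/sda5      # plain ext2 for /var: no journal
  mke2fs -j /dev/sda6   # ext3 (ext2 plus a journal) where a journal is wanted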

Something to consider is that millions of people have been running single
ext2 partitions with everything in / for quite some time.  I don't recall
reading many system crash horror stories of data loss in /var or anywhere
else in the Linux filesystem.

My recommendation of XFS for /home was mainly focused on speed.  XFS has
very fast copy speed and handles large files very well.  The journal is just
a bonus.  My comments in this thread are workstation specific.  In a server
environment there are many other reasons I choose XFS.

-- 
Stan





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Andrei Popescu
On Sun,28.Feb.10, 03:20:38, Stan Hoeppner wrote:

> /var    up2u    ext2    sequential write/read, journal unnecessary

Would you mind going into details? I always thought the journal was 
especially useful on partitions like /var where it is more likely that 
the system will be writing something right before a crash/power failure, 
but I'm definitely not an expert.

Regards,
Andrei
-- 
Offtopic discussions among Debian users and developers:
http://lists.alioth.debian.org/mailman/listinfo/d-community-offtopic




Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Eduardo M KALINOWSKI

On 02/27/2010 11:18 PM, thib wrote:

> Hello,
>
> Usually I never ask myself whether I should organize my disks into separate
> filesystems or not.  I just think "how?" and I go with a cool layout without
> thinking back - LVM lets us correct them easily anyway.  I should even say
> that I believed a single root filesystem on a system was "a first sign" (you
> know what I mean ;-).
>
> But now I'm about to try a new setup for a Squeeze/Sid *workstation*, and I
> somehow feel I could be a little more open-minded.  I'd like some input on
> what I covered, and more importantly, on what I may have missed.  Maybe
> someone can point me to some actually useful lists of advantages to
> "partitioning"?  I find a lot of BS around the net, people often miss the
> purpose of it.
>
> So, what are the advantages I see, and why don't they matter to me anymore?

> [snip]

I only split /home and /usr/local so that I could reinstall wiping the 
whole system but keeping my data. And, if necessary for boot purposes, 
/boot . Everything else goes into / .


--
Small is beautiful.

Eduardo M KALINOWSKI
edua...@kalinowski.com.br





Re: Single root filesystem evilness decreasing in 2010? (on workstations)

2010-02-28 Thread Stan Hoeppner
thib put forth on 2/27/2010 8:18 PM:
> Hello,
> 
> Usually I never ask myself whether I should organize my disks into
> separate filesystems or not.  I just think "how?" and I go with a cool
> layout without thinking back - LVM lets us correct them easily anyway. 
> I should even say that I believed a single root filesystem on a system
> was "a first sign" (you know what I mean ;-).



All of that talk and gyration over a workstation disk layout?  You never did
mention what the primary application usage is on this machine, which should
be a factor in how you set it up.  If you're an email warrior, what damn
difference does it make, and why bother with LVM on a workstation?  What
size is the new disk?

Here's a safe bet, even with grub(2):

swap    4GB             may never need it, but u have plenty of disk
/boot   100MB   ext2    safe call, even if grub(2) doesn't need a /boot
/       40GB    ext2/3  journal may eliminate mandatory check interval
/var    up2u    ext2    sequential write/read, journal unnecessary
/home   up2u    xfs     best performance for all file sizes and counts

*You may trust ext4 at this point, but I, and many others don't.  xfs beats
ext4 in every category, so why bother with ext4?

If you have a 500GB, 750GB, 1TB, 1.5TB, 2TB disk, leave the freak'n bulk of
it unallocated until you actually need it.  This rule alone eliminates much
of the vacillation you are currently experiencing WRT "Omg what am I ever
going to do with all this disk?!"
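
How much of the disk is still untouched is easy to keep an eye on, with or
without LVM (device name invented):

  parted /dev/sda unit GB print free   # 'Free Space' rows are the unallocated tail
  # later, carve out only what is needed, e.g. a new partition from part of
  # that tail, and mkfs/mount it where the space is actually required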

-- 
Stan

