Re: Detecting mount point of last mount?

2000-04-20 Thread Theodore Y. Ts'o

   Date:Thu, 20 Apr 2000 16:32:11 +0200
   From: Guest section DW <[EMAIL PROTECTED]>

   It depends.

   # tune2fs -l /dev/hda6 | grep mounted
   tune2fs 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
   Last mounted on:  /mnt/g1

   This output is incorrect: the last ten times this filesystem was mounted,
   it was mounted on /g1.
   In fact, neither the kernel nor mount keeps track of this, but
   ext2 utilities like mke2fs and tune2fs are willing to write this field
   (s_last_mounted).

The kernel *should* write this information, and in fact the high-level
VFS mount code has this information.  The problem is that the mount
point isn't getting passed down into the fs-specific mount routine;
once this is done, the change to allow ext2 to store this information on
mount is, as you can guess, trivial.  There were patches floating around
to do this, but unfortunately they haven't made it into the mainline
kernel yet.

- Ted



Re: [linux-audio-dev] Re: More results and thoughts on disk writing performance

2000-04-20 Thread Paul Barton-Davis

>> 
>> Incidentally, on the systems I tested it appears that preallocation *slows down*
>> data writing. Paul, have you compared your system with and without using
>> preallocation? What speed difference do you see?
>
>EXACTLY !
>I am experiencing the same thing!
>After Paul praised the preallocation so much, I decided to test it, and I
>get about a 20% performance slowdown over the case when running without
>preallocation.

.
.
.
>I can't explain why we do experience the slowdown with preallocation.

No big surprise here. Suppose you write the files in 256kB chunks, and
re-read them in the same way. If ext2 behaves the way I would expect (*)
it to, you end up with somewhat-to-totally block-interleaved files
that are read with little or no seeking (because the read pattern will
exactly match the write pattern).

The problem with not preallocating occurs only on the first write, and to
be honest, my preallocation scheme should be changed to mirror the
actual actions of a true "first write" by block-interleaving the files
instead of aiming for complete per-file contiguity. The one difficulty
with this is that if you change the size of the i/o requests, you may
get *worse* performance. At one time, I imagined this size to be
rather fluid, but it now appears likely to settle at a fairly
constant value across all disks on all systems (and certainly on any
particular system). This removes my only real objection to
block-interleaving the files.
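
To make that concrete, the kind of block-interleaved preallocation I have in
mind looks roughly like this (just a sketch: the track names, track count and
per-track chunk count are placeholders and error handling is omitted; only the
256kB chunk size comes from the discussion above):

/* preallocate N track files in round-robin 256kB chunks, so the on-disk
   layout mirrors the write pattern of a real recording pass */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NTRACKS  8              /* placeholder */
#define CHUNK    (256 * 1024)   /* the i/o size discussed above */
#define NCHUNKS  64             /* per-track preallocation, placeholder */

int main(void)
{
        int fd[NTRACKS];
        char *buf = calloc(1, CHUNK);   /* zero-filled chunk */
        char name[64];
        int t, c;

        for (t = 0; t < NTRACKS; t++) {
                snprintf(name, sizeof(name), "track%02d.raw", t);
                fd[t] = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        }

        /* interleave: chunk 0 of every track, then chunk 1 of every track... */
        for (c = 0; c < NCHUNKS; c++)
                for (t = 0; t < NTRACKS; t++)
                        write(fd[t], buf, CHUNK);

        for (t = 0; t < NTRACKS; t++)
                close(fd[t]);
        free(buf);
        return 0;
}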

I will change the way the files are pre-allocated and see if it speeds
things up even more.

Again, just in case anyone missed it: I have never encountered the
problems Benno has had (or at least, not from the same underlying causes -
I used to have disk i/o performance problems), probably due to my use
of SCSI h/w, and ardour is a working multichannel hard-disk recorder.

>No, you are wrong here:
>the audio thread requires higher priority because it needs finer granularity
>(we want low-latency response from our HD recorder).
>The audio thread releases the CPU during blocking write()s to the audio device,
>giving the disk thread all the time it needs to perform large disk I/Os, which
>block the disk thread almost all the time.
>Therefore you gain NOTHING (= zero disk performance increase) by giving the disk
>thread higher priority than the audio thread, except that the audio will
>drop out sometimes.

Fully agreed.

--p

(*) ext2 filesystems have a pre-allocate distance which someone
mentioned. I am hoping that allocating 256kB at a time makes this
figure irrelevant, but I am not sure at all that this is true.



Re: [linux-audio-dev] Re: File writes with O_SYNC slow

2000-04-20 Thread Benno Senoner

On Thu, 20 Apr 2000, Steve Lord wrote:

> I should comment on this one since it would be on XFS.
> 
> The difference between O_SYNC and O_DSYNC on XFS only exists if the
> file size does not change during the write. An O_DSYNC write that extends
> the file still does a synchronous transaction (a bad thing).
> XFS tries hard to do the right thing in these cases; we also have customers
> who use XFS inside of things like MRI scanners, and they insist that if they
> write data to the disk, it had better stay there, even if the power goes
> out.
> 
> I realize with audio recording it is difficult to know the size of the
> file in advance. But you would get better results if you bumped up
> the filesize periodically rather than on every write. Preallocation
> would also help.
> 
> Steve

My preliminary testing shows that preallocation under Linux with ext2 (writing
to an existing file without using O_TRUNC) slows down operations a bit in
the async case, and causes a considerable speed increase when using O_SYNC
(but O_SYNC + preallocation is still slower than plain writes).


Anyway, you are talking about bumping up filesizes periodically.
What do you mean exactly?
Using ftruncate() (can it be used to increase the filesize?), or
lseek() + writing a small amount of data periodically (every few secs?) in
order to increase the filesize?
But wouldn't that require touching all the metadata for the new data section,
thus ending up in the equivalent case as writing (new) large blocks (256kb in
my case) to the file?
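
To be concrete, the two variants I can think of look roughly like this (a
sketch only, untested; the grow step and descriptor are placeholders):

#include <sys/types.h>
#include <unistd.h>

/* variant 1: extend the file with ftruncate(); on Linux this grows the
   file and leaves a hole, so no data blocks are written yet */
static void grow_with_ftruncate(int fd, off_t cur_size, off_t step)
{
        ftruncate(fd, cur_size + step);
}

/* variant 2: seek past the current end and write one byte; this also
   extends the file, allocating only the block that holds that byte */
static void grow_with_lseek_write(int fd, off_t cur_size, off_t step)
{
        char zero = 0;

        lseek(fd, cur_size + step - 1, SEEK_SET);
        write(fd, &zero, 1);
}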

Benno.



Re: More results and thoughts on disk writing performance

2000-04-20 Thread Benno Senoner

On Thu, 20 Apr 2000, Karl JH Millar wrote:
> Benno,
> 
> I tried your benchmark on the linux systems I have available, and couldn't
> reproduce the slow down you find using sync/fsync/fdatasync. There is some
> overhead in the system calls themselves, which gave a few % performance loss,
> but not your 30-60%. The only other difference I noticed on one disk that is
> fairly full and fairly fragmented was more seeking using fsync() after each
> write than under other methods, but even that caused only a moderate
> performance hit (maybe 10-15% if my drive was faster).

Please benchmark with 500 - 1000MB total datasize,
to get more accurate results.
Anyway, I am running on 2.2.14-smp (RH 6.2), therefore the fsync() slowdown
on very large filesizes is what I may be experiencing in my tests
(read Stephen's posts).
> Given my results, it seems that the best method to
> write data would be to do a group of write()s, and follow them with a sync(),
> so we eliminate the bursty behaviour that was killing performance before.

As I said before, I tried it too, but with no luck.

> Using a single call to sync instead of running fsync() on every file descriptor
> saves a little kernel time, and allows the kernel to choose what order to write
> the data.

Although the single sync() is better than the separate fsync() / fdatasync(),
it is still slower than plain writes, with slowdowns peaking at up to 30%.

> On my machine at least, this gives us everything we need - close to maximal
> throughput (unlike O_SYNC) and predictable behaviour (which we don't get
> without calling sync()).

It isn't predictable with sync() either, since the metadata updates can disturb
the disk I/O quite a bit (just look at the hdrbench graphs).

> 
> Incidentally, on the systems I tested it appears that preallocation *slows down*
> data writing. Paul, have you compared your system with and without using
> preallocation? What speed difference do you see?

EXACTLY !
I am experiencing the same thing!
After Paul praised the preallocation so much, I decided to test it, and I
get about a 20% performance slowdown over the case when running without
preallocation.
But the interesting thing is that when running in O_SYNC mode,
writing to preallocated files is much, much faster than in the no-preallocation
case. But it is still far inferior to plain buffered, non-preallocated writes.
I can't explain why we experience the slowdown with preallocation.
Any thoughts?

> 
> Random other things that may help getting optimal disk throughput:
>- Mounting the file system using the noatime flag. Then at least
>reading data doesn't force the metadata to be updated, which
>should help a little.

Yes, that would be nice to check.
I made a quick test running hdrbench with 30 read tracks and 1 write track,
running both in atime and noatime mode.
noatime performs a bit better (looking at the plotted graphs), but not
significantly.


>- Swap should either be off or on a different drive so we don't have
>to worry about paging in the middle of intensive I/O.

That is for sure: a professional HD recording system should always have
one or more disks dedicated exclusively to audio.

>- Why are people giving the audio thread higher priority than the disk
>thread? In my mind it makes sense to have it the other way around,
>as the disk thread is going to block almost immediately anyway,
>whereas the audio thread may run for some time, during which we have
>no disk I/O going on. It seems to me to be better to start the disk
>I/O as soon as possible by giving it the higher priority. This could
>only cause problems if you try to write too much data at once,
>causing the thread not to block immediately, but this is easily
>avoided.

No, you are wrong here:
the audio thread requires higher priority because it needs finer granularity
(we want low-latency response from our HD recorder).
The audio thread releases the CPU during blocking write()s to the audio device,
giving the disk thread all the time it needs to perform large disk I/Os, which
block the disk thread almost all the time.
Therefore you gain NOTHING (= zero disk performance increase) by giving the disk
thread higher priority than the audio thread, except that the audio will
drop out sometimes.
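
In other words, the setup I mean is simply this (a sketch using POSIX
scheduling; the priority numbers are placeholders and SCHED_FIFO needs root):

#include <pthread.h>
#include <sched.h>

/* audio thread: SCHED_FIFO with the higher priority -> low latency;
   disk thread: lower (or normal) priority, since it blocks in
   write()/fsync() most of the time anyway and loses nothing */
static void set_prio(pthread_t t, int policy, int prio)
{
        struct sched_param sp;

        sp.sched_priority = prio;
        pthread_setschedparam(t, policy, &sp);
}

/* e.g.:  set_prio(audio_thread, SCHED_FIFO, 80);
          set_prio(disk_thread,  SCHED_OTHER, 0);  */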

Benno.




Re: Detecting mount point of last mount?

2000-04-20 Thread Guest section DW

On Thu, Apr 20, 2000 at 11:18:00AM -0400, James Antill wrote:

>  Get util-linux-2.9y

The latest is util-linux 2.10l.

>  Ps. You might want to contact the current utils-linux/mount
> maintainer as I know I sent some patches to fix a couple of things.

So, you might check whether the things that displeased you have been
changed (and otherwise resubmit your patches).

Andries - [EMAIL PROTECTED]



Re: [linux-audio-dev] Re: File writes with O_SYNC slow

2000-04-20 Thread Juhana Sadeharju

>From:  Benno Senoner <[EMAIL PROTECTED]>
>
>write() + fsync()/fdatasync() on linux doesn't work well either, since the kernel
>isn't able to optimize disk writing by using the elevator algorithm.
>
>You can try this yourself by fsync()/fdatasync()ing all output descriptors
>in the disk_thread routine.

Benno, I have not yet tested the following idea with my shmrec recorder, but
you could test it as well:

Instead of having one thread doing a "write(); fsync();" loop, there could
be multiple threads, each doing "write(); fsync();".
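
Roughly what I mean, as a sketch (untested; one POSIX thread per output file,
and the descriptors, buffer and length are placeholders):

#include <pthread.h>
#include <unistd.h>

struct track { int fd; const char *buf; size_t len; };

/* each writer syncs only its own file, so the fsync()s of different
   files can overlap instead of being serialized in one loop */
static void *writer(void *arg)
{
        struct track *t = arg;

        write(t->fd, t->buf, t->len);
        fsync(t->fd);
        return NULL;
}

/* usage: for each file i, pthread_create(&tid[i], NULL, writer, &tracks[i]);
   then pthread_join() them all before starting the next cycle */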

I have no idea why fsync() gets slower and slower as the filesize
increases. Does the above help at all if all writes happen to the same
file (as in my shmrec)? Does it help when it is done in a multitrack
program for each file separately (ending up with as many as 30 disk threads)?

Of course, even in my stereo recorder, I could write to multiple files
and later combine the files.

Juhana



Re: Detecting mount point of last mount?

2000-04-20 Thread James Antill

Jeff Garzik <[EMAIL PROTECTED]> writes:

> It would be handy if our installer could reconstruct /etc/fstab from a
> list of existing Linux partitions.  Is it possible to read the ext2
> superblock or similar, and obtain the last mount point of a partition?
> "hda6 was last mounted on /usr/local", etc.

 Get util-linux-2.9y and look in mount/linux_fs.h and
mount/mount_by_label.c.

 Basically you need to add "char s_last_mounted[64];" to the end of
struct ext2_super_block and then you need to do the check after one of
the many read()'s (grep for ext2magic()).

 Note that it won't be filled if you've crashed, and none of the magic
currently works for reiserfs (which I wouldn't mind if you fixed so I
could mount by label on reiserfs as well :).
 So it's probably ok for a hint to an installer but no more.
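
If you don't want to patch the util-linux struct, a rough standalone reader
is enough for an installer hint (sketch only; the offsets are from the
standard ext2 superblock layout: superblock at byte 1024, s_magic at +56,
s_last_mounted[64] at +136; double-check them against your headers, and
there is no real error reporting):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        unsigned char sb[1024];
        char last_mounted[65];
        unsigned int magic;
        int fd;

        if (argc < 2)
                return 1;
        fd = open(argv[1], O_RDONLY);               /* e.g. /dev/hda6 */
        if (fd < 0 || lseek(fd, 1024, SEEK_SET) != 1024 ||
            read(fd, sb, sizeof(sb)) != (ssize_t) sizeof(sb))
                return 1;

        magic = sb[56] | (sb[57] << 8);             /* s_magic, stored little-endian */
        if (magic != 0xEF53)
                return 1;                           /* not ext2 */

        memcpy(last_mounted, sb + 136, 64);         /* s_last_mounted[64] */
        last_mounted[64] = '\0';
        printf("last mounted on: %s\n", last_mounted);
        close(fd);
        return 0;
}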

 Ps. You might want to contact the current utils-linux/mount
maintainer as I know I sent some patches to fix a couple of things.

-- 
James Antill -- [EMAIL PROTECTED]
"If we can't keep this sort of thing out of the kernel, we might as well
pack it up and go run Solaris." -- Larry McVoy.



Re: Detecting mount point of last mount?

2000-04-20 Thread Guest section DW

On Thu, Apr 20, 2000 at 09:38:24AM -0400, Jeff Garzik wrote:

> It would be handy if our installer could reconstruct /etc/fstab from a
> list of existing Linux partitions.  Is it possible to read the ext2
> superblock or similar, and obtain the last mount point of a partition?
> "hda6 was last mounted on /usr/local", etc.

It depends.

# tune2fs -l /dev/hda6 | grep mounted
tune2fs 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
Last mounted on:  /mnt/g1

This output is incorrect: the last ten times this filesystem was mounted,
it was mounted on /g1.
In fact, neither the kernel nor mount keeps track of this, but
ext2 utilities like mke2fs and tune2fs are willing to write this field
(s_last_mounted).

Andries




Detecting mount point of last mount?

2000-04-20 Thread Jeff Garzik

It would be handy if our installer could reconstruct /etc/fstab from a
list of existing Linux partitions.  Is it possible to read the ext2
superblock or similar, and obtain the last mount point of a partition?
"hda6 was last mounted on /usr/local", etc.

Thanks,

Jeff





-- 
Jeff Garzik  | Nothing cures insomnia like the
Building 1024| realization that it's time to get up.
MandrakeSoft, Inc.   |-- random fortune



Re: [linux-audio-dev] Re: File writes with O_SYNC slow

2000-04-20 Thread Steve Lord


> 
> I tried O_SYNC and even O_DSYNC on the SGI (Origin 2k)
> (O_DSYNC syncs only data blocks but not metadata blocks),
> both only delivering 3.5MBytes/sec (plain buffered writes were about
> 15-16MB/sec).


I should comment on this one since it would be on XFS.

The difference between O_SYNC and O_DSYNC on XFS only exists if the
file size does not change during the write. An O_DSYNC write that extends
the file still does a synchronous transaction (a bad thing).
XFS tries hard to do the right thing in these cases; we also have customers
who use XFS inside of things like MRI scanners, and they insist that if they
write data to the disk, it had better stay there, even if the power goes
out.

I realize with audio recording it is difficult to know the size of the
file in advance. But you would get better results if you bumped up
the filesize periodically rather than on every write. Preallocation
would also help.

Steve


> 
> Benno.
> 





Re: [linux-audio-dev] Re: File writes with O_SYNC slow

2000-04-20 Thread Stephen C. Tweedie

Hi,

On Thu, Apr 20, 2000 at 10:57:15AM +0200, Benno Senoner wrote:
> 
> I tried all combinations using my hdtest.c which I posted yesterday.
> 
> I tried O_SYNC and even O_DSYNC on the SGI (Origin 2k)
> (O_DSYNC syncs only data blocks but not metadata blocks)

Not quite.  O_DSYNC syncs metadata too.  The only thing it skips is
inode timestamps.  

There is an important difference between the two when you are overwriting
an existing allocated file.  In that case, there are no metadata changes
except for timestamp updates, so O_DSYNC is very much faster.  However,
if you are appending to a file, then some metadata updates (for file
mapping information and for the file size) are necessary for both 
O_SYNC and O_DSYNC.
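
In code, the fast path is roughly this (a sketch; the path, offset and buffer
are placeholders, and your C library has to provide a real O_DSYNC for it to
behave differently from O_SYNC):

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* overwrite a region of an already-allocated file: with O_DSYNC only the
   data has to reach the disk on each write(), since no block mapping or
   size change is involved; appending would drag the metadata in as well */
static ssize_t overwrite_chunk(const char *path, off_t off,
                               const void *buf, size_t len)
{
        int fd = open(path, O_WRONLY | O_DSYNC);    /* no O_TRUNC, no O_APPEND */
        ssize_t n = -1;

        if (fd >= 0) {
                if (lseek(fd, off, SEEK_SET) == off)
                        n = write(fd, buf, len);
                close(fd);
        }
        return n;
}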
 
> write() + fsync()/fdatasync() on linux doesn't work well either, since the kernel
> isn't able to optimize disk writing by using the elevator algorithm.

There are other problems with fsync/fdatasync in 2.2.  In particular,
it is slow for larger files since it tries to scan all the mapping
information for the entire file.

I'll put together an old patch I did to make fsync/fdatasync and
O_DSYNC work much faster.  It will be interesting to see if it makes
much difference, and it may be the stick we need to beat Linus into
believing that this change is really quite important.

--Stephen



Re: [linux-audio-dev] Re: File writes with O_SYNC slow

2000-04-20 Thread Benno Senoner

On Thu, 20 Apr 2000, Eric W. Biederman wrote:
> "Benjamin C.R. LaHaise" <[EMAIL PROTECTED]> writes:
> 
> > On Wed, 19 Apr 2000, Karl JH Millar wrote:
> > 
> > > Furthermore, doing a write and then a fsync should be *just* as slow as a
> > > synchronous write, but I'm measuring it as over 10 times faster.
> > 
> > Doing synchronous writes involves synching metadata for every 4KB of data,
> > not the entire 256KB chunk.
> 
> Or more precisely, for every (data or indirect) block appended to the
> file there is a metadata sync.  Plus a forced write of the data
> blocks every 32 data blocks written...
> 
> The metadata sync will kill you on appends whichever way you slice it.
> 
> For the practical effect of keeping dirty buffers down, write+fsync
> looks like a better choice than using O_SYNC...
> 
> Eric

I tried all combinations using my hdtest.c which I posted yesterday.

I tried O_SYNC and even O_DSYNC on the SGI (Origin 2k)
(O_DSYNC syncs only data blocks but not metadata blocks),
both only delivering 3.5MBytes/sec (plain buffered writes were about
15-16MB/sec).

write() + fsync()/fdatasync() on linux doesn't work well either, since the kernel
isn't able to optimize disk writing by using the elevator algorithm.

You can try this yourself by fsync()/fdatasync()ing all output descriptors
in the disk_thread routine.

I tried:
A)
for each output file {
  write();
  fsync() or fdatasync();
}

B)

for each output file {
  write();
}
for each output file {
  fsync() or fdatasync();
}
 

C)
for each output file {
  write();
}

and every N seconds (I tried with N ranging from 1 to 20),
I call fsync() / fdatasync() or sync() in general,
in order to flush buffers to disk.
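
For reference, the B) variant spelled out as real code looks about like this
(a sketch; NFILES is a placeholder, the 256kb chunk is the size I use, the
descriptors and buffer come from elsewhere, and error checks are stripped):

#include <unistd.h>

#define NFILES  30              /* number of open track files (placeholder) */
#define CHUNK   (256 * 1024)    /* write size per track and cycle */

/* variant B: issue all the write()s first, then fdatasync() each file,
   hoping the kernel can sort the queued blocks before forcing them out */
void write_cycle(int fd[NFILES], const char *buf)
{
        int i;

        for (i = 0; i < NFILES; i++)
                write(fd[i], buf, CHUNK);

        for (i = 0; i < NFILES; i++)
                fdatasync(fd[i]);       /* or fsync(fd[i]) */
}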

All three methods, with all combinations of fsync()/fdatasync(), failed:
the IO performance was inferior to plain write()s
(without any data syncing).

For example, the B) method in the single-threaded case gives me
3.5MB/sec instead of 8.5MB/sec without fdatasync(),
and the A) method in the multithreaded case gives me 5MB/sec
instead of 8MB/sec without fdatasync().

As you can see, the performance drop ranges from 30-60%, which is simply
not acceptable for demanding applications.

For now, if you want to do HD recording on linux with many tracks on an EIDE
disk (SCSI seems to behave better, according to Paul's findings), the only chance
to succeed is to buy LOTS of RAM (with 2MB/track you can squeeze pretty much the
maximum out of your disk), to overcome these long-lasting (6-8 sec)
filesystem stalls.
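
(To put a rough number on the 2MB/track figure: assuming 44.1kHz / 16-bit mono
tracks, i.e. about 86KB/sec per track,

    2MB / 86KB per sec  =  roughly 23 seconds of audio buffered per track,

so a 2MB ring buffer per track comfortably rides out a 6-8 sec stall; the exact
figure of course depends on samplerate and sample width.)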

I still can't believe that windoze uses only 256kb buffers, because that leaves
little chance to optimize writes using the elevator algorithm.
I guess that they deliver fewer tracks (= less total MB/sec throughput).

Benno.