Re: technical comparison

2001-05-30 Thread Terry Lambert

Dave Hayes wrote:
 You can't make that assumption just yet (although it seems
 reasonable). We really don't know exactly what the problem they are
 trying to solve is. Network news sites running old versions of
 software (as an example, I know someone who still runs CNEWS) have
 very clear reasons for phenomena resembling 60,000 files in one
 directory.

I think it's the "how can we come up with an artificial
benchmark to prove the opinions we already have" problem...

Right up there with the Polygraph web caching benchmark,
which intentionally stacks the deck to test cache replacement,
and in which the best results go to those who cheat back and
use random replacement instead of LRU or some other sane
algorithm, since the test intentionally destroys locality
of reference.

People have made the same complaint about the lmbench micro
benchmarks, which test things which aren't really meaningful
any more (e.g. NULL system call overhead, when we have things
like kqueue, etc.).

I'm largely unimpressed with benchmarks written to beat a
particular drum for political reasons, rather than as a
tool for optimizing something that's meaningful to real
world performance under actual load conditions.  Call me
crazy that way...

-- Terry

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: technical comparison

2001-05-30 Thread Terry Lambert

Andrew Reilly wrote:
 On Sat, May 26, 2001 at 07:25:16PM +1000, Andrew Reilly wrote:
  One of my personal mail folders has 4400 messages in it, and
  I've only been collecting that one for a few years.  It's not
  millions, but it's a few more than the 500 that I've seen some
  discuss here as a reasonable limit (why is that reasonable?) and
  it's many many more than the 72 or so limit available in ADFS.
 
 I realised as soon as I pressed the send button that my current
 use of large directories for mail files doesn't actually involve
 any random access: the directory is read sequentially to build
 the header list.
 
 It is quite conceivable that a performance tweak to the IMAP
 server could involve a header cache in a relational database of
 some sort, and that would certainly contain references to the
 individual files, which would then be accessed randomly.
 
 /usr/ports/distfiles on any of the mirrors probably contains
 upwards of 5000 files too, and there is a strong likelihood that
 these will be accessed out-of-order by ports-makefile-driven
 fetch requests.

Cyrus IMAP uses a header cache in precisely this way.

And since the cache files are created very early on, they
are early in the directory, and so do not suffer a large
startup penalty.

The searches for specific files would indeed be linear,
but they would be O(1) linear for each file.


As I said before, I replaced the FFS directory code with
a trie-structured directory format.  Using these
n-ary structures, you could very quickly look up any
individual file, and a linear traversal of the directory
to iterate all files (an increasingly common thing for
visual file browsers to do) was still O(1) linear.

People didn't find the patches very useful, beyond their
being an interesting curiosity, since in practice the
problem of huge directories tends not to exist in nature:
code was written to deal with the limitations of S51K and
similar FS's, and thus tends not to do things like dump
all its large number of files into a single directory.

A similar set of patches caused iterated filenames to have
their vnodes prefaulted, which helps immensely in AppleTalk
and SMB file serving, where the protocol demands stat data
back at the same time because of assumptions about the host
OS's files.  This effectively enters them into the directory
cache, and locality of reference keeps them there.  This is
a really simple hack.  Then all you have to do is up your
directory cache size to whatever your favorite unreasonable
limit happens to be (e.g. 70,000), and everything becomes a
cache hit, after the initial load-up.

The second is still a clever hack (IMO), since it's still
useful, but you'd want it to be a per-FS option, or at
minimum an ioctl() to set the option on a directory fd
after opening it, so that an exported SMBFS share would
have the behaviour, but your news spool would not.

As I said before, however, the tradeoff of better performance
on really obscene directories was not really worth the binary
backward compatibility problems that resulted when switching
a system over (in other words, it was an interesting research
topic, but little more than that).

I think the benchmark in question is pretty lame, and the
things which it is attempting to prove will not occur in
real systems, unless you are running pessimal code.

-- Terry




Re: technical comparison

2001-05-29 Thread Peter Jeremy

On Sun, 27 May 2001 22:50:48 -0300 (BRST), Rik van Riel [EMAIL PROTECTED] wrote:
On Sat, 26 May 2001, Peter Wemm wrote:
 Which is more expensive?  Maintaining an on-disk hashed (or b+tree)
 directory format for *everything* or maintaining a simple low-cost
 format on disk with in-memory hashing for fast lookups?

I bet that for modest directory sizes the cost of disk IO outweighs
the added CPU usage by so much that you may as well take the trouble
of using the more scalable directory format.

I'm not sure I follow this.  Reading sequentially is always going to
be much faster than reading randomly.  For a modest directory size,
you run the risk that randomly accessing fewer blocks will actually
take longer than just reading the entire directory sequentially.

 For the small directory case I suspect the FFS+namecache way is more
 cost effective.  For the medium to large directory case (10,000 to
 100,000 entries), I suspect the FFS+namecache method isn't too shabby,
 providing you are not starved for memory.  For the insanely large
 cases - I don't want to think about :-).

The ext2 fs, which uses roughly the same directory structure as
UFS and has a name cache which isn't limited in size, seems to
bog down at about 10,000 directory entries.

As has been pointed out earlier, hash algorithms need a `maximum
number of entries' parameter as part of their algorithm.  Beyond some
point, defined by this number, the hash will degenerate to (typically)
O(N).  It sounds like the Linux name cache hashing algorithm is not
intended to handle so many directory entries.

Daniel Phillips is working on a hash extension to ext2; not a
replacement of the directory format, but a way to tack a hashed
index after the normal directory index.

I think a tree structure is better than a hash because there is no
inherent limit to the size (though the downside is O(log N) rather
than close to fixed time).  It may be possible to build a tree
structure around the UFS directory block structure in such a way that
it would be backward compatible[1].  Of course managing to correctly
handle soft-updates write ordering for a tree re-balance is
non-trivial.

One point that hasn't come out so far is that reading a UFS is quite
easy - hence boot2 can locate a loader or kernel by name within the
root filesystem, rather than needing hardwired block numbers to
load.  If the directory structure does change, we need to ensure that
it's possible to (possibly inefficiently) parse the structure in
a fairly small amount of code.

It also has the advantage of being able to keep using the
tried-and-tested fsck utilities.

Whatever is done, fsck would need to be enhanced to validate the
directory structure, otherwise you could wind up with files that
can't be found/deleted because they aren't where the hash/tree
algorithm expects them.

Suggestion for the "let's use the filesystem as a general-purpose
relational database" crowd: A userland implementation of the existing
directory search scheme (ignoring name caching) would be trivial (see
/usr/include/ufs/ufs/dir.h and dir(5) for details).  Modify postmark
(or similar) to simulate the creation/deletion of files in a userland
`directory' structure and demonstrate an algorithm that is faster for
the massive directory case and doesn't pessimize small directories.
The effect of the name cache and datafile I/O should be able to be
ignored since you just want to compare directory algorithms.

[1] Keep entries within each block sorted.  Reserve space at the end
of the block for left and right child branch pointers and other
overheads, with the left branch being less than the first entry in
the block and the right branch being greater than the last entry.
The reserved space is counted in d_reclen of the last entry (which
makes it backward compatible).  I haven't thought through the
block splitting/merging algorithm so this may not work.

Peter




Re: technical comparison

2001-05-27 Thread Doug Barton

Andrew Reilly wrote:

 It is quite conceivable that a performance tweak to the IMAP
 server could involve a header cache in a relational database of
 some sort, and that would certainly contain references to the
 individual files, which would then be accessed randomly.

You might want to give mbox format a try. imap-uw will use this format if
you perform a few tweaks described in the documentation that comes with it.
Basically, instead of the mailbox being in plain text it creates a type of
database at the top of the file that describes the contents. Makes access
much faster for large (> 1k letters) mailboxes.




Re: technical comparison

2001-05-27 Thread Marc G. Fournier

On Sun, 27 May 2001, Doug Barton wrote:

 Andrew Reilly wrote:

  It is quite conceivable that a performance tweak to the IMAP
  server could involve a header cache in a relational database of
  some sort, and that would certainly contain references to the
  individual files, which would then be accessed randomly.

   You might want to give mbox format a try. imap-uw will use this
 format if you perform a few tweaks described in the documentation that
 comes with it. Basically, instead of the mailbox being in plain text
 it creates a type of database at the top of the file that describes
 the contents. Makes access much faster for large (> 1k letters)
 mailboxes.

what you are suggesting sounds like something that Cyrus-IMAP has
already done, using Berkeley DB ... loading up several thousand emails
and sorting them takes no time ...





Re: technical comparison

2001-05-27 Thread Rik van Riel

On Sat, 26 May 2001, Peter Wemm wrote:

 Which is more expensive?  Maintaining an on-disk hashed (or b+tree)
 directory format for *everything* or maintaining a simple low-cost
 format on disk with in-memory hashing for fast lookups?

I bet that for modest directory sizes the cost of disk IO outweighs
the added CPU usage by so much that you may as well take the trouble
of using the more scalable directory format.

 For the small directory case I suspect the FFS+namecache way is more
 cost effective.  For the medium to large directory case (10,000 to
 100,000 entries), I suspect the FFS+namecache method isn't too shabby,
 providing you are not starved for memory.  For the insanely large
 cases - I don't want to think about :-).

The ext2 fs, which uses roughly the same directory structure as
UFS and has a name cache which isn't limited in size, seems to
bog down at about 10,000 directory entries.

Daniel Phillips is working on a hash extension to ext2; not a
replacement of the directory format, but a way to tack a hashed
index after the normal directory index.

This way the filesystem is backward compatible: older kernels
will just use the old directory format and will clear a flag
when they write to the directory, which can later be used by
the new kernel to rebuild the hashed directory index.

It also has the advantage of being able to keep using the
tried-and-tested fsck utilities.

Maybe this could be an idea to enhance UFS scalability for
huge directories without endangering reliability?

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)





Re: technical comparison

2001-05-26 Thread Andrew Reilly

On Fri, May 25, 2001 at 08:49:21PM +, Terry Lambert wrote:
 There is _no_ performance problem with the existing implementation,
 if you treat postgres as the existing implementation; it will do
 what you want, quickly and effectively, for millions of record keys.

Does postgres make a good mail archive database?  Can it handle
arbitrary record lengths?  It couldn't the last time I looked at
it.

 Why are you treating an FS as if it were a relational database?  It
 is a tool intended to solve an entirely different problem set.

I'm not treating it as a relational database.  But if mail
messages aren't conceptually files, then I don't know what
they are.

One of my personal mail folders has 4400 messages in it, and
I've only been collecting that one for a few years.  It's not
millions, but it's a few more than the 500 that I've seen some
discuss here as a reasonable limit (why is that reasonable?) and
it's many many more than the 72 or so limit available in ADFS.

I changed over to Maildirs because I like the fact that I
can use normal Unix file search and manipulation programs on
individual messages, as well as a wider set of MUAs (thanks
to courier IMAP...), and because folder opening doesn't bog
down when there are a couple of messages with really large
attachments in them, the way mbox folders do.

 You are bitching about your hammer not making a good screwdriver.

If the file system isn't a good place to store files, then what
is it good for?

Source code trees only?

There are application specific databases available for that too.

What have you got left?

-- 
Andrew




Re: technical comparison

2001-05-26 Thread Andrew Reilly

On Sat, May 26, 2001 at 07:25:16PM +1000, Andrew Reilly wrote:
 One of my personal mail folders has 4400 messages in it, and
 I've only been collecting that one for a few years.  It's not
 millions, but it's a few more than the 500 that I've seen some
 discuss here as a reasonable limit (why is that reasonable?) and
 it's many many more than the 72 or so limit available in ADFS.

I realised as soon as I pressed the send button that my current
use of large directories for mail files doesn't actually involve
any random access: the directory is read sequentially to build
the header list.

It is quite conceivable that a performance tweak to the IMAP
server could involve a header cache in a relational database of
some sort, and that would certainly contain references to the
individual files, which would then be accessed randomly.

/usr/ports/distfiles on any of the mirrors probably contains
upwards of 5000 files too, and there is a strong likelihood that
these will be accessed out-of-order by ports-makefile-driven
fetch requests.

-- 
Andrew




Re: technical comparison

2001-05-26 Thread Nat Lanza

Andrew Reilly [EMAIL PROTECTED] writes:

 Where in open(2) does it specify a limit on the number of files
 permissible in a directory?  The closest that it comes, that I can
 see is:

Well, read(2) doesn't tell you not to do your IO one character at a
time, but that doesn't mean it's a good idea. The point here is not
interface definitions, it's efficiency. Nobody's saying you shouldn't
be _allowed_ to put thousands and thousands of files in a directory if
you like. They're just saying that you shouldn't expect it to be fast.
Similarly, you can read data one byte at a time if you like, but you
shouldn't expect that to be fast either.

Pointing to manpages and saying you weren't warned that a particular
approach is slow is a really weak defense. Do you expect cliffs to
have little "If you drive off this cliff, you will die" warning signs
on them?

If a documented part of the API simply did not work, then you'd have a
point. Instead, what we have is a case where a method of storing files
that most people reasonably expect to be slow is in fact slow.

The folks who've pointed out the /a/a/aardvark solution are right --
directory hashing is a well-known solution to this problem. It isn't
a hack at all. No matter what method you use for storing directories,
larger directories are going to be slower to use than smaller ones,
and hashing filenames fixes that.


--nat

-- 
nat lanza --- there are no whole truths;
[EMAIL PROTECTED]  all truths are half-truths
http://www.cs.cmu.edu/~magus/ ---  -- alfred north whitehead




Re: technical comparison

2001-05-26 Thread .

Andrew Reilly writes:
 On Fri, May 25, 2001 at 08:49:21PM +, Terry Lambert wrote:
  There is _no_ performance problem with the existing implementation,
  if you treat postgres as the existing implementation; it will do
  what you want, quickly and effectively, for millions of record keys.
 
 Does postgres make a good mail archive database?  Can it handle
Yes, I am looking at it as a replacement for my message store.

 arbitrary record lengths?  It couldn't the last time I looked at
 it.
Now (7.1) it can.

-- 
@BABOLO  http://links.ru/




Re: technical comparison

2001-05-26 Thread .

Andrew Reilly writes:

 /usr/ports/distfiles on any of the mirrors probably contains
 upwards of 5000 files too, and there is a strong likelyhood that
 these will be accessed out-of-order by ports-makefile-driven
 fetch requests.
Oh!
You picked a good example!
0cicuta~(13)/bin/ls /usr/ports/distfiles/ | wc
    9672    9672  198244

-- 
@BABOLO  http://links.ru/




Re: technical comparison

2001-05-26 Thread Peter Wemm

[EMAIL PROTECTED] wrote:
 Andrew Reilly writes:
 
  /usr/ports/distfiles on any of the mirrors probably contains
  upwards of 5000 files too, and there is a strong likelyhood that
  these will be accessed out-of-order by ports-makefile-driven
  fetch requests.
 Oh!
 You picked a good example!
 0cicuta~(13)/bin/ls /usr/ports/distfiles/ | wc
     9672    9672  198244

.. Which is almost entirely stored in the name cache, which is hashed. Once
you scan the directory for the first time, the entries are pre-inserted
into the hash.  This cache is very long lived and is quite effective at
dealing with this sort of thing, especially if you have plenty of memory
and have vfs.vmiodirenable=1 turned on.  While it may not scale too well to
directories with millions of files, it certainly deals well with tens of
thousands of files.  We have recently made improvements to the hashing
algorithms to get better dispersion on small and iterative filenames,
e.g. 00, 01, 02 ... FF.

It is not perfect, but it is a hell of a lot better than the false
assumption that the linear search method is the usual case.

Which is more expensive?  Maintaining an on-disk hashed (or b+tree)
directory format for *everything* or maintaining a simple low-cost format
on disk with in-memory hashing for fast lookups?  For the small directory
case I suspect the FFS+namecache way is more cost effective.  For the
medium to large directory case (10,000 to 100,000 entries), I suspect the
FFS+namecache method isn't too shabby, providing you are not starved for
memory.  For the insanely large cases - I don't want to think about :-).

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





Benchmarking FreeBSD (was Re: technical comparison)

2001-05-25 Thread Dave Hayes

Jordan Hubbard [EMAIL PROTECTED] writes:
 Erm, folks?  Can anyone please tell me what this has to do with
 freebsd-hackers any longer? 

While the thread has diverged from its original intent, there is
something related I consider to be a more interesting topic. If it's
still not appropriate for hackers, please let me know. 

When people are doing benchmarks, I noted that there are -lots- of
little sysctl tweaks or kernel tweaks that tend to make a big
difference in the results. 

I know it is possible to define some sort of abstraction that uniquely
specifies a complete (and/or relevant) set of these tweaks when
comparing benchmarks. Does this already exist, and if not, how hard
would it be to catalog every single relevant tunable parameter in a
FreeBSD system?
--
Dave Hayes - Consultant - Altadena CA, USA - [EMAIL PROTECTED] 
 The opinions expressed above are entirely my own 

War doesn't determine who's right. War determines who's left.
-Confucius




Re: technical comparison

2001-05-25 Thread .

Greg Black writes:
 Andresen,Jason R. wrote:
 
 | On Thu, 24 May 2001, void wrote:
 | 
 |  On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote:
 |  
 |   Why is knowing the file names cheating?  It is almost certain
 |   that the application will know the names of it's own files
 |   (and won't be grepping the entire directory every time it
 |   needs to find a file).
 | 
 |  With 60,000 files, that would have the application duplicating
 |  60,000 pieces of information that are stored by the operating system.
 |  Operations like open() and unlink() still have to search the directory
 |  to get the inode, so there isn't much incentive for an application to
 |  do that, I think.
 | 
 | This still doesn't make sense to me.  It's not like the program is going
 | to want to do a find on the directory every time it has some data it
 | wants to put somewhere.  I think for the majority of the cases (I'm sure
 | there are exceptions) an application program that wants to interact with
 | files will know what filename it wants ahead of time.  This doesn't
 | necessarily mean storing 60,000 filenames either, it could be something
 | like:
 | I have files fooX where X is a number from 0 to 6 in that
 | directory.  I need to find a piece of information, so I run that
 | information through a hash of some sort and determine that the file I want
 | is number 23429, so I open that file.
 
 And if this imaginary program is going to do that, it's equally
 easy to use a multilevel directory structure and that will make
 the life of all users of the system simpler.  There's no real
 excuse for directories with millions (or even thousands) of
 files.
There is.
You assume that names are random.
Assume that they  are not.
VERY old example:
a
aa
...
aaa...aaa 255 times
aaa...aab
so on.
Yes, I know: hash.

Is it practical to do this in every application
(sometimes it is unknown, before practical use,
whether directories will become big) instead of
once in the file system?

Sorry for my bad English.

-- 
@BABOLO  http://links.ru/




Re: Benchmarking FreeBSD (was Re: technical comparison)

2001-05-25 Thread Matt Dillon


:Jordan Hubbard [EMAIL PROTECTED] writes:
: Erm, folks?  Can anyone please tell me what this has to do with
: freebsd-hackers any longer? 
:
:While the thread has diverged from its original intent, there is
:something related I consider to be a more interesting topic. If it's
:still not appropriate for hackers, please let me know. 
:
:When people are doing benchmarks, I noted that there are -lots- of
:little sysctl tweaks or kernel tweaks that tend to make a big
:difference in the results. 
:
:I know it is possible to define some sort of abstraction that uniquely
:specifies a complete (and/or relevant) set of these tweaks when
:comparing benchmarks. Does this already exist, and if not, how hard
:would it be to catalog every single relevant tunable parameter in a
:FreeBSD system?
:--
:Dave Hayes - Consultant - Altadena CA, USA - [EMAIL PROTECTED] 
: The opinions expressed above are entirely my own 

Well, it's been done before.  The problem is that the landscape changes
every time we do a new release.

I did a 'security' man page a while ago (which is still mostly relevant).
I suppose I could do a 'performance' man page.

-Matt





Re: technical comparison

2001-05-25 Thread Greg Black

I would have sent this to the original author if he had used a
proper email address on his post; sorry to those who don't want
to see it.

|  | I have files fooX where X is a number from 0 to 6 in that
|  | directory.  I need to find a piece of information, so I run that
|  | information through a hash of some sort and determine that the file I want
|  | is number 23429, so I open that file.
|  
|  And if this imaginary program is going to do that, it's equally
|  easy to use a multilevel directory structure and that will make
|  the life of all users of the system simpler.  There's no real
|  excuse for directories with millions (or even thousands) of
|  files.
| There is.
| You assume that names are random.
| Assume that they  are not.
| VERY old example:
| a
| aa
| ...
| aaa...aaa 255 times
| aaa...aab
| so on.
| Yes, I know: hash.
| 
| Is it practical to do this in every application
| (sometimes it is unknown, before practical use,
| whether directories will become big) instead of
| once in the file system?

Any real programmer has tools that make this trivial.  I keep a
pathname hashing function and a couple of standalone programs
that exercise it from shell scripts in my toolbox and can stitch
them into anything that needs fixing in no time.  My code allows
for nearly 1.3 trillion names in a six-level hierarchy if you
can limit yourself to about five hundred names per directory, but
can be easily extended for really idiotic uses.

| Sorry for a bad English.

We can live with that, but it's a bit rude to send messages out
without a valid From address.




Re: technical comparison

2001-05-25 Thread Terry Lambert

]  ]  1. I don't think I've ever seen a Linux distro which has write
]  ] caching enabled by default. Hell, DMA33 isn't even enabled
]  ] by default ;)
]  ] 
]  ] You are talking about controlling the IDE drive cache.
]  ] 
]  ] The issue here is write cache in the filesystem code.
]  
]  No.  The issue here is the write cache on the drive.
]  FreeBSD with soft updates will operate within 4% of the top memory
]  bandwidth; see the Ganger/Patt paper on the technology.
] 
] I have a file, CSE-TR-254-95.ps, that I think is probably the paper
] you are talking about. The title is Soft Updates: A Solution to the
] Metadata Update Problem in File Systems. The link on Ganger's page was
] dead, but I'm sure this is the one you mean.
] 
] Nowhere do they support the idea that soft updates can approach a
] system's memory bandwidth.

I said "top memory bandwidth", not "a system's memory bandwidth";
please be more careful.

Quoting from section 6, "Conclusions and Future Work":

We have described a new mechanism, soft updates, that
can be used to achieve memory-based file system
   
performance while providing stronger integrity and
***

security guarantees (e.g. allocation initialization)
and higher availability (via shorter recovery times)
than most UNIX file systems.  This translates into a
performance improvement of more than a factor of 2
in many cases (up to a maximum observed difference
of a factor of 15).


Terry Lambert
[EMAIL PROTECTED]
---
Any opinions in this posting are my own and not those of my present
or previous employers.




Re: technical comparison

2001-05-25 Thread Terry Lambert

] Nothing in Unix stops you from putting millions of files in a
] directory.  There are (I maintain _obviously_) good reasons to
] want to do that.  The only thing that stops you is that _some_
] Unix platforms, using _some_ file systems, behave badly if you
] do that.

There are _no_ good reasons for using an FS as if the directory
structure were a key file, file names were keys, and file contents
were data records in a relational database.

We have things which were built precisely for this type of use.

We call them relational databases.


] They should be fixed.

Feel free to submit patches, so long as they do not damage any
backward compatibility, and do not compromise performance under
normal workloads just to pass some obscure test that someone
has devised to prove one FS is better than another by doing
ridiculous things which will never happen except in special
purpose situations, in which special purpose tools are a better
fit.


Terry Lambert
[EMAIL PROTECTED]
---
Any opinions in this posting are my own and not those of my present
or previous employers.




Re: technical comparison

2001-05-25 Thread Terry Lambert

] It's got nothing to do with the basics of software engineering or
] computer science.  It's got to do with interface definitions and
] APIs.
] 
] Where in open(2) does it specify a limit on the number of files
] permissible in a directory?  The closest that it comes, that I can
] see is:

[ ... ]

] All of which quite clearly indicate that if one wants to put all
] of ones allocation of blocks or inodes into a single directory,
] then one can, as long as the name and path length limits are
] observed.

UNIX, in not preventing you from doing stupid things, permits
you to do clever things which other operating systems do not
permit.

I maintain that just because you are not administratively
prohibited from doing stupid things, that in no way makes
doing those things less stupid.


] You're welcome to claim a documentation bug, and add the
] appropriate caveat.  It seems clear to me that Hans Reiser (and
] Silicon Graphics before him) have taken the more obvious approach,
] of attempting to remove the performance limitation inherent in the
] existing implementation.

The performance limitation?

Get your story straight: is there a limitation, or isn't there?


] You can moan about tree-structured vs relational databases, but if
] your problem space doesn't intrinsically map to a tree, then it
] doesn't stop the tree-structring transformation that Terry
] mentioned from being a gratuitious hack to work around a
] performance problem with the existing implementation.


It is not a performance problem with the existing implementation,
it is pilot error.

There is _no_ performance problem with the existing implementation,
if you treat postgres as the existing implementation; it will do
what you want, quickly and effectively, for millions of record keys.

Why are you treating an FS as if it were a relational database?  It
is a tool intended to solve an entirely different problem set.

You are bitching about your hammer not making a good screwdriver.


Terry Lambert
[EMAIL PROTECTED]
---
Any opinions in this posting are my own and not those of my present
or previous employers.




Re: technical comparison

2001-05-25 Thread Matt Dillon

One word:  B+Tree.  Hash tables work well if the entire hash table fits
into memory and you know (approximately) what the upper limit on records
is going to be.  If you don't, then a B+Tree is the only proven way to
go.  (sure, there are plenty of other schemes, some hybrid, some 
completely different, but B+Trees have long been proven, so unless you
want to experiment, just use one).

In general I agree that UFS's only major pitfall is the sequential
directory scanning.  The reality, though, is that very few programs
actually need to create thousands or millions of files in a single
directory.   The biggest one used to be USENET news but that has
shifted into multi-article files and isn't an issue any more.  Now
the biggest one is probably squid.  Databases are big storage-wise,
but don't usually require lots of files.

-Matt





Re: technical comparison

2001-05-25 Thread Matt Dillon


Ultimately something like Reiser will win over UFS, but performance
figures aren't the whole picture.  Most of the bugs have been worked out
of UFS and the recovery tools are extremely mature.  Only a handful
of edge cases have been found in the last decade.  Nearly all the bugs
in the last few years have turned out to be buffer cache or VM bugs
rather than filesystem bugs.  ReiserFS has a long way to go before it
can be safely used on production systems.  Linux, having just moved
to a totally new VM system also has a long way to go (and, for the same
reason, FreeBSD-5 has a long way to go before it can safely be used in
production).  When Reiser starts to get close, I'll be the first one
to port it to FreeBSD :-)

Consider for a moment the development roadmap for UFS, EXT2FS, and
REISERFS.  It took UFS and its supporting tools years to get as good as
it is for production purposes.  It has taken EXT2FS a number of years
to reach where it is.  ReiserFS is new, and it is going to be a while.

-Matt






Re: technical comparison

2001-05-24 Thread Andresen,Jason R.

On Thu, 24 May 2001, void wrote:

 On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote:
 
  Why is knowing the file names cheating?  It is almost certain
  that the application will know the names of its own files
  (and won't be grepping the entire directory every time it
  needs to find a file).

 With 60,000 files, that would have the application duplicating
 60,000 pieces of information that are stored by the operating system.
 Operations like open() and unlink() still have to search the directory
 to get the inode, so there isn't much incentive for an application to
 do that, I think.

This still doesn't make sense to me.  It's not like the program is going
to want to do a find on the directory every time it has some data it
wants to put somewhere.  I think for the majority of the cases (I'm sure
there are exceptions) an application program that wants to interact with
files will know what filename it wants ahead of time.  This doesn't
necessarily mean storing 60,000 filenames either, it could be something
like:
I have files fooX where X is a number from 0 to 60,000 in that
directory.  I need to find a piece of information, so I run that
information through a hash of some sort and determine that the file I want
is number 23429, so I open that file.

I don't expect programs to try to offload this sort of information on the
filesystem.  Do you have an example of a program that interacts with the
filesystem without knowing the names of the files it wants?





Re: technical comparison

2001-05-24 Thread Rik van Riel

On Wed, 23 May 2001, Shannon wrote:
 On Wed, May 23, 2001 at 10:54:40PM -0300, Rik van Riel wrote:

  1. I don't think I've ever seen a Linux distro which has write
 caching enabled by default. Hell, DMA33 isn't even enabled
 by default ;)

 You are talking about controlling the IDE drive cache.

 The issue here is write cache in the filesystem code.

1) IIRC they were talking about hw.ata.wc

2) soft-updates _is_ a form of write cache in the
   filesystem code, in fact, that's one of the points
   of soft-updates in the first place ;)


regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com/





Re: technical comparison

2001-05-24 Thread Warner Losh

In message [EMAIL PROTECTED] Jason Andresen writes:
: If only FreeBSD could boot from those funky M-Systems flash disks. 

We boot FreeBSD off of M-Systems flash disks all the time.  Don't know
what the problem is with your boxes.

Warner




Re: technical comparison

2001-05-24 Thread Shannon Hendrix

On Thu, May 24, 2001 at 12:25:59PM -0300, Rik van Riel wrote:
 On Wed, 23 May 2001, Shannon wrote:
  On Wed, May 23, 2001 at 10:54:40PM -0300, Rik van Riel wrote:
 
   1. I don't think I've ever seen a Linux distro which has write
  caching enabled by default. Hell, DMA33 isn't even enabled
  by default ;)
 
  You are talking about controlling the IDE drive cache.
 
  The issue here is write cache in the filesystem code.
 
 1) IIRC they were talking about hw.ata.wc

In a subthread, yeah.  I think, though, the overall issue is the caching
ext2 does that ufs does not.  I'm not even sure that soft updates is
quite the same thing.  I think the soft-updates paper mentions that it
shouldn't increase risk, while a lot of people feel that ext2 is very
risky.

I never really notice a big difference when I turn on write caching
with my system (on the hard drive). It's been awhile since I did any
benchmarks though, since I no longer run IDE drives on most systems. You
can control the cache on them too with the right scsi tools, but I've
not really messed with it.





-- 
 There is no such thing as security.  Life is either bold   | | |
 adventure, or it is nothing -- Helen Keller| | |
/  |  \
s h a n n o n @ w i d o m a k e r . c o m _/   |   \_




Re: technical comparison

2001-05-24 Thread Greg Black

Andresen,Jason R. wrote:

| On Thu, 24 May 2001, void wrote:
| 
|  On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote:
|  
|   Why is knowing the file names cheating?  It is almost certain
|   that the application will know the names of its own files
|   (and won't be grepping the entire directory every time it
|   needs to find a file).
| 
|  With 60,000 files, that would have the application duplicating
|  60,000 pieces of information that are stored by the operating system.
|  Operations like open() and unlink() still have to search the directory
|  to get the inode, so there isn't much incentive for an application to
|  do that, I think.
| 
| This still doesn't make sense to me.  It's not like the program is going
| to want to do a find on the directory every time it has some data it
| wants to put somewhere.  I think for the majority of the cases (I'm sure
| there are exceptions) an application program that wants to interact with
| files will know what filename it wants ahead of time.  This doesn't
| necessarily mean storing 60,000 filenames either, it could be something
| like:
| I have files fooX where X is a number from 0 to 60,000 in that
| directory.  I need to find a piece of information, so I run that
| information through a hash of some sort and determine that the file I want
| is number 23429, so I open that file.

And if this imaginary program is going to do that, it's equally
easy to use a multilevel directory structure and that will make
the life of all users of the system simpler.  There's no real
excuse for directories with millions (or even thousands) of
files.




Re: technical comparison

2001-05-24 Thread Jason Andresen

Greg Black wrote:
 
 Andresen,Jason R. wrote:
 | This still doesn't make sense to me.  It's not like the program is going
 | to want to do a find on the directory every time it has some data it
 | wants to put somewhere.  I think for the majority of the cases (I'm sure
 | there are exceptions) an application program that wants to interact with
 | files will know what filename it wants ahead of time.  This doesn't
 | necessarily mean storing 60,000 filenames either, it could be something
 | like:
 | I have files fooX where X is a number from 0 to 60,000 in that
 | directory.  I need to find a piece of information, so I run that
 | information through a hash of some sort and determine that the file I want
 | is number 23429, so I open that file.
 
 And if this imaginary program is going to do that, it's equally
 easy to use a multilevel directory structure and that will make
 the life of all users of the system simpler.  There's no real
 excuse for directories with millions (or even thousands) of
 files.

No, there is no excuse; however, some third party application (FOR WHICH
YOU DO NOT HAVE THE SOURCE[1]) may do it anyway.  In the original parent
of this post that was the exact situation.  It would be nice if everybody
followed the rules and played nice, but it is just something you can't
count on in real life.

[1] Emphasis added because for people in the Free Software business, it
is easy to forget that you don't always have access to the source code,
and convincing a company to rewrite their product because it doesn't
like your (almost certainly unsupported) OS smacks of futility.





Re: technical comparison

2001-05-24 Thread Bsdguru

In a message dated 05/23/2001 5:04:36 PM Eastern Daylight Time, 
[EMAIL PROTECTED] writes:

 
   Tell them to fire 20K packets/second at the linux box and watch it 
crumble.
 
   Linux has lots of little kludges to make it appear faster on some 
 benchmarks,
   but from a networking standpoint it cant handle significant network 
loads.
  
  Are you sure this is still true?  The 2.4.x series kernel was supposed to
  have significant networking improvements over the previous kernels.

I don't know, but I doubt it.  The problem isn't the networking
performance; it's the inability of the memory system and the ethernet
drivers to handle overloads properly.  They are modeled in a way that
fails in practice.

Bryan




Re: technical comparison

2001-05-24 Thread Terry Lambert

] Terry Lambert writes:
] 
]  I don't understand the inability to perform the trivial
]  design engineering necessary to keep from needing to put
]  60,000 files in one directory.
] 
]  However, we can take it as a given that people who need
]  to do this are incapable of doing computer science.
] 
] One could say the same about the design engineering necessary
] to handle 60,000 files in one directory. You're making excuses.

No, I'm not.  I released trie patches for FreeBSD directory
storage in 1995.

No one thought they were very useful, because only morons
would treat a filesystem as if it were a database, instead
of using a database as a database.

If you want to get technical, a filesystem is a form of a
database... but it's a _hierarchical_ database, like DNS or
LDAP, and trying to use it as a _relational_ database, with
key/value pairs, is still a stupid idea.  Use the right tool
for the job.

] People _want_ to do this, and it often performs better on
] a modern filesystem. This is not about need; it's about
] keeping ugly hacks out of the app code.
] 
] http://www.namesys.com/5_1.html

I'm glad you said people want to do this instead of saying
computer professionals want to do this.

The 60,000 file benchmark is meaningless to a properly
designed system.

]  (the rationale behind this last is that people who can't
]  design around needing 60,000 files in a single directory
]  are probably going to to be unable to correctly remember
]  the names of the files they created, since if they could,
]  then they could remember things like ./a/a/aardvark or
]  ./a/b/abominable).
] 
] Eeew. ./a/b/abominable is a disgusting old hack used to
] work around traditional filesystem deficiencies.

No, it's a hack to work around being too damn lazy to use
a database where it makes sense to use a database.


Terry Lambert
[EMAIL PROTECTED]
---
Any opinions in this posting are my own and not those of my present
or previous employers.




Re: technical comparison

2001-05-24 Thread Terry Lambert

]  1. I don't think I've ever seen a Linux distro which has write
] caching enabled by default. Hell, DMA33 isn't even enabled
] by default ;)
] 
] You are talking about controlling the IDE drive cache.
] 
] The issue here is write cache in the filesystem code.

No.  The issue here is the write cache on the drive.
FreeBSD with soft updates will operate within 4% of the
top memory bandwidth; see the Ganger/Patt paper on the
technology.


Terry Lambert
[EMAIL PROTECTED]
---
Any opinions in this posting are my own and not those of my present
or previous employers.




RE: technical comparison

2001-05-24 Thread Charles Randall

From: Greg Black [mailto:[EMAIL PROTECTED]]
And if this imaginary program is going to do that, it's equally
easy to use a multilevel directory structure and that will make
the life of all users of the system simpler.  There's no real
excuse for directories with millions (or even thousands) of
files.

While I agree completely that there's no excuse for applications that behave
like that, a filesystem that scales well under these harsh conditions will
serve us all better in the long run.

Charles




Re: technical comparison

2001-05-24 Thread Daniel C. Sobral

Shannon Hendrix wrote:
 
   You are talking about controlling the IDE drive cache.
  
   The issue here is write cache in the filesystem code.
 
  1) IIRC they were talking about hw.ata.wc
 
 In a subthread, yeah. I think though, the overall issue is the caching
 ext2 does that ufs does not. I'm not even sure that soft updates is
 quite the same thing. I think the soft-updates paper mentions that it
 shouldn't increase risk, while a lot of people feel like ext2 is very
 risky.

Actually, no. Someone *specifically* mentioned that FreeBSD 4.3-RELEASE
disables hardware caching on IDE, and Linux does not.

-- 
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

wow regex humor... I'm a geek




Re: technical comparison

2001-05-24 Thread Daniel C. Sobral

Jason Andresen wrote:
 
  And if this imaginary program is going to do that, it's equally
  easy to use a multilevel directory structure and that will make
  the life of all users of the system simpler.  There's no real
  excuse for directories with millions (or even thousands) of
  files.
 
 No, there is no excuse, however some third party application (FOR WHICH
 YOU DO
 NOT HAVE THE SOURCE[1]) may do it anyway.  In the original parent of
 this post
 that was the exact situtation.  It would be nice if everybody followed
 the rules
 and played nice, but it is just something you can't count on in real
 life.

Uhhh, no. The original message was remarking on a software
development team which repeatedly failed to deliver the product to
the specs asked for, and that said team blamed FreeBSD and wanted
Linux instead.

So the comment applies. 

-- 
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

wow regex humor... I'm a geek




Re: technical comparison

2001-05-24 Thread Shannon Hendrix

On Thu, May 24, 2001 at 05:00:44PM -0400, [EMAIL PROTECTED] wrote:

Linux has lots of little kludges to make it appear faster on some 
  benchmarks,
but from a networking standpoint it cant handle significant network 
 loads.
   
   Are you sure this is still true?  The 2.4.x series kernel was supposed to
   have significant networking improvements over the previous kernels.
 
 I don't know, but I doubt it.

There were significant network and memory improvements in the 2.4
release. There were also some improvements that will have to wait for
the next release, but overall it is much improved.

FreeBSD 4.3 is much improved over 2.x and 3.x, so I'm not sure why that
would be considered unusual or surprising.

The memory system in Linux is still set up by default to give more speed
at the expense of smooth load handling.  It seems better, but you have
to go into /proc and tune things to get better load handling.

 the problem isn't the networking performance, it's the inability of the
 memory system and the ethernet drivers to handle overloads properly.
 They are modeled in a way that fails in practice.

The way I understood it was certain drivers were more affected by this
than others. Some were just fine, and handled very high loads. Another
problem was multiple ethernet cards, but I forgot what caused that. A
lot of that was addressed in the 2.4 release, and it seems to have made
a lot of people happier.

I can't test the difference because I have nothing but 10mbit ethernet.
However, the 2.4 kernel is definitely faster in my day-to-day work,
and has allowed me to delay a complete move to FreeBSD 4.x on my
workstation. It was that much of a step forward. Now I can wait until I
get proper 3D support for my nVidia graphics card.


-- 
We have nothing to prove -- Alan Dawkins




Re: technical comparison

2001-05-24 Thread Ted Faber

On Thu, May 24, 2001 at 04:42:02PM -0600, Charles Randall wrote:
 From: Greg Black [mailto:[EMAIL PROTECTED]]
   There's no real
 excuse for directories with millions (or even thousands) of
 files.
 
 While I agree completely that there's no excuse for applications that behave
 like that, a filesystem that scales well under these harsh conditions will
 serve us all better in the long run.

TANSTAAFL.  

It's not obvious that you can get such scaling for free.  If the
tradeoffs to make the system perform a dumb task well mean that it
won't perform a sane task well, you lose.




Re: technical comparison

2001-05-24 Thread Shannon Hendrix

On Thu, May 24, 2001 at 10:34:26PM +, Terry Lambert wrote:
 ]  1. I don't think I've ever seen a Linux distro which has write
 ] caching enabled by default. Hell, DMA33 isn't even enabled
 ] by default ;)
 ] 
 ] You are talking about controlling the IDE drive cache.
 ] 
 ] The issue here is write cache in the filesystem code.
 
 No.  The issue here is the write cache on the drive.
 FreeBSD with soft updates will operate within 4% of the top memory
 bandwidth; see the Ganger/Patt paper on the technology.

I have a file, CSE-TR-254-95.ps, that I think is probably the paper
you are talking about. The title is Soft Updates: A Solution to the
Metadata Update Problem in File Systems. The link on Ganger's page was
dead, but I'm sure this is the one you mean.

Nowhere do they support the idea that soft updates can approach a
system's memory bandwidth.

What they did say was that in _one_ case, creating and then immediately
deleting a directory entry, you are operating at processor/memory
speeds. They said soft updates in that case were 6 times faster than the
conventional system. That's not even close to the memory bandwidth of
the 486 system they were using, so they had to mean the filesystem code
in that test was able to run without waiting on I/O.

In the more general cases, their findings showed an improvement of more
than a factor of two over synchronous-write UFS.

I _wish_ my workstation was able to write metadata at nearly 1GB/s all
the time... :)

-- 
Star Wars Moral Number 17: Teddy bears are dangerous in herds.




Re: technical comparison

2001-05-24 Thread Andrew Reilly

On Fri, May 25, 2001 at 06:17:33AM +1000, Greg Black wrote:
 the life of all users of the system simpler.  There's no real
 excuse for directories with millions (or even thousands) of
 files.

One of the things that I've always liked about Unix was that
there aren't as many arbitrary limits on what you can do and how
you can do it, as there are on other platforms.

For example, I once used an Acorn Archimedes computer, which had
an OS called RISC-OS.  The advanced disk filing system, ADFS,
had some cute limits built in: no more than 10 characters in a
file name, and no more than 70 (?memory fades) files in a
directory.

Nothing in Unix stops you from putting millions of files in a
directory.  There are (I maintain _obviously_) good reasons to
want to do that.  The only thing that stops you is that _some_
Unix platforms, using _some_ file systems, behave badly if you
do that.

They should be fixed.

-- 
Andrew




Re: technical comparison

2001-05-24 Thread Greg Black

Andrew Reilly wrote:

| On Fri, May 25, 2001 at 06:17:33AM +1000, Greg Black wrote:
|  the life of all users of the system simpler.  There's no real
|  excuse for directories with millions (or even thousands) of
|  files.
| 
| [...]
| 
| Nothing in Unix stops you from putting millions of files in a
| directory.

This is just not true.  For the vast majority of the systems
that have ever been called Unix, attempting to put millions of
files into a directory would be an utter disaster.  No ifs or
buts.  It might be nice if this were different, although I see
no good reason to support it myself, but it's generally not a
serious possibility and so applications that depend on being
able to do that are plain stupid.  Their authors are either too
lazy to make their use of the file system a bit more sensible or
too stupid to know that file systems are not databases.  The
right answer is to write applications with some understanding of
the basics of software engineering or computer science.




Re: technical comparison

2001-05-24 Thread Andrew Reilly

On 25 May, Greg Black wrote:
 This is just not true.  For the vast majority of the systems
 that have ever been called Unix, attempting to put millions of
 files into a directory would be an utter disaster.  No ifs or
 buts.  It might be nice if this were different, although I see
 no good reason to support it myself, but it's generally not a
 serious possibility and so applications that depend on being
 able to do that are plain stupid.  Their authors are either too
 lazy to make their use of the file system a bit more sensible or
 too stupid to know that file systems are not databases.  The
 right answer is to write applications with some understanding of
 the basics of software engineering or computer science.

It's got nothing to do with the basics of software engineering or
computer science.  It's got to do with interface definitions and
APIs.

Where in open(2) does it specify a limit on the number of files
permissible in a directory?  The closest that it comes, that I can
see is:

 [ENOSPC]   O_CREAT is specified, the file does not exist, and the
directory in which the entry for the new file is being
placed cannot be extended because there is no space
left on the file system containing the directory.

 [ENOSPC]   O_CREAT is specified, the file does not exist, and
there are no free inodes on the file system on which
the file is being created.

 [EDQUOT]   O_CREAT is specified, the file does not exist, and the
directory in which the entry for the new file is being
placed cannot be extended because the user's quota of
disk blocks on the file system containing the direc-
tory has been exhausted.

 [EDQUOT]   O_CREAT is specified, the file does not exist, and the
user's quota of inodes on the file system on which the
file is being created has been exhausted.

or perhaps:

 [ENAMETOOLONG] A component of a pathname exceeded 255 characters, or
an entire path name exceeded 1023 characters.


All of which quite clearly indicate that if one wants to put all
of one's allocation of blocks or inodes into a single directory,
then one can, as long as the name and path length limits are
observed.

See: there's a system defined limit, and it's documented as such.

That's what I was getting at.

You're welcome to claim a documentation bug, and add the
appropriate caveat.  It seems clear to me that Hans Reiser (and
Silicon Graphics before him) have taken the more obvious approach,
of attempting to remove the performance limitation inherent in the
existing implementation.

You can moan about tree-structured vs relational databases, but if
your problem space doesn't intrinsically map to a tree, then it
doesn't stop the tree-structuring transformation that Terry
mentioned from being a gratuitous hack to work around a
performance problem with the existing implementation.

-- 
Andrew





Re: technical comparison

2001-05-24 Thread Greg Black

Andrew Reilly wrote:

| You can moan about tree-structured vs relational databases, [...]

I can moan about whatever I please -- for instance the fact that
you can't be bothered using a mailer that conforms with basic
rules.  Please figure out how to get a Message-Id header into
your mail and make sure that future messages go out with such a
header.




Re: technical comparison

2001-05-24 Thread Jordan Hubbard

Erm, folks?  Can anyone please tell me what this has to do with
freebsd-hackers any longer?  It's been quite a long thread
already - have a heart please and take it to -chat. :(

Thanks,

- Jordan




Re: technical comparison

2001-05-23 Thread Daniel C. Sobral

Shannon Hendrix wrote:
 
  And just to get things worse... :-) the test must be made on the *same*
  slice. If you configure two different slices, the one on the outer
  tracks will be faster.
 
 I cannot verify that with my drive, but my largest is 18GB so maybe
 the difference is not as pronounced as on some newer drives like those
 (currently) monster 70GB drives.

It should be measurable.

On one hand, more sectors per track, same time to read a single track =
more bytes read per second.

On the other hand, more sectors per track, more bytes per track, less
tracks per same size, less track seek needed.

-- 
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

wow regex humor... I'm a geek




Re: technical comparison

2001-05-23 Thread Eric Melville

 The proposed filesystem is most likely Reiserfs. This is a true
 journalling filesystem with a radically non-traditional layout.
 It is no problem to put millions of files in a single directory.
 (actually, the all-in-one approach performs better than a tree)
 
 XFS and JFS are similarly capable, but Reiserfs is well tested
 and part of the official Linux kernel. You can get the Reiserfs
 team to support you too, in case you want to bypass the normal
 filesystem interface for even better performance.

It should be noted that simply because something is tested and a part of a
release, it is not automatically wonderful. My last experience with Linux
was in the 2.2 days, and ended with a lost root filesystem while attempting
to access an msdosfs drive.

From what I've read, mixing ReiserFS and NFS is about as exciting as the
stock market has been in the last few months.




Re: technical comparison

2001-05-23 Thread Peter Pentchev

On Wed, May 23, 2001 at 08:17:12AM -0400, Andresen,Jason R. wrote:
 On Tue, 22 May 2001, Kris Kennaway wrote:
 
  On Tue, May 22, 2001 at 10:27:27PM +0300, Nadav Eiron wrote:
   I ran tests that I think are similar to what Jason ran on identically
   configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much much
   faster than UFS+softupdates on these tests.
  
   Linux (2.2.14-5 + ReiserFS):
   Time:
   164 seconds total
   97 seconds of transactions (103 per second)
  
   Files:
   65052 created (396 per second)
   Creation alone: 60000 files (1090 per second)
   Mixed with transactions: 5052 files (52 per second)
   4936 read (50 per second)
   5063 appended (52 per second)
   65052 deleted (396 per second)
   Deletion alone: 60104 files (5008 per second)
   Mixed with transactions: 4948 files (51 per second)
  
   Data:
   24.83 megabytes read (155.01 kilobytes per second)
   336.87 megabytes written (2.05 megabytes per second)
  
   FreeBSD 4.3-RELEASE (ufs/softupdates):
 
  Did you enable write caching?  You didn't mention, and it's off by
  default in 4.3, but I think enabled by default on Linux.
 
 I tried to leave the FreeBSD and Linux boxes as unchanged as possible for
 my tests (they are lab machines that have other uses, although I made sure
 they were idle during the test periods).
 
 I left write caching enabled in the Linux boxes, and left it disabled on
 the FreeBSD boxes.  Personally, I'm hesitant to enable write caching
 on FreeBSD because we tend to use it on machines where we really really
 don't want to lose data.  Write caching is ok on the Linux machines
 because we use them as pure testbeds that we can reconstruct easily if
 their disks go south.

If the tests on the Linux machines are made to simulate how those Linux
machines would operate if used as production servers, then do that:
configure the Linux machines exactly as if they were your production
servers.  That is, if you want write caching off on production servers,
turn it off at test time.

G'luck,
Peter

-- 
If you think this sentence is confusing, then change one pig.




Re: technical comparison

2001-05-23 Thread Andresen,Jason R.

On Tue, 22 May 2001, Daniel C. Sobral wrote:

 Jason Andresen wrote:
 
  If only FreeBSD could boot from those funky M-Systems flash disks.

 It can.

How?  Nothing I found in the documentation indicated this, or gave any
sort of hint as to how I might go about doing it.  The Linux driver has a
hacked version of Lilo that has to be installed prior to even thinking of
doing anything with the flash, but I found no equivalent for FreeBSD's
boot1.

FreeBSD can mount the disks just fine (I used a custom PicoBSD boot floppy
to fix up the Linux install on the flash disk enough so that it would boot
(the stupid script M-Systems provided installed a completely hosed
system!)).

This sort of information might be handy to have on freebsd.org.





Re: technical comparison

2001-05-23 Thread Andresen,Jason R.

On Tue, 22 May 2001, Daniel C. Sobral wrote:

 Jason Andresen wrote:
 
  Results:
  ufs+softupdates is a little slower than ext2fs+wc for low numbers of
  files, but scales better.  I wish I had a Reiserfs partition to
  test with.

 Ext2fs is a non-contender.

 Note, though, that there is some very recent performance improvement on
 very large directories known as dirpref (what changed, actually, was
 dirpref's algorithm). This is NOT present in 4.3-RELEASE, though it
 _might_ have since been committed to -stable.

The new dirpref code is mostly just a performance tweak.  We can't compete
with ReiserFS on large directories without a major improvement to the
code, assuming the previous post was true and ReiserFS has logarithmic-time
components where UFS has linear-time components.

Note that the improvement from using the new dirpref code is about 12%,
which isn't bad, but still doesn't put us in the right ballpark.





RE: technical comparison

2001-05-23 Thread Koster, K.J.

Dear All,

An interview with Reiser just appeared on http://www.slashdot.org/

Just to add a little oil to the fire. :-)

Kees Jan


 You are only young once,
   but you can stay immature all your life.




Re: technical comparison

2001-05-23 Thread Andresen,Jason R.

On Tue, 22 May 2001, Shannon Hendrix wrote:

 On Tue, May 22, 2001 at 09:31:34AM -0400, Jason Andresen wrote:

  We only have three Linux boxes here (and one is a PC104 with a flash
  disk) and already I've had to reinstall the entire OS once when we had a
  power glitch.  ext2fsck managed to destroy about 1/3 of the files on the
  system, in a pretty much random manner (the lib and etc were hit hard).

 This is not typical. Also, I have heard the same thing from other people
 about flash disks. fs crash, fsck, and a mess afterwards. It would be
 nice if you could use ufs and see if the same problem exists.

The scary thing is that it was the attached hard drive that lost all of the
files.  The situation is this:

Attached HD: I just installed Redhat on the hard drive.  I rebooted and
the system booted off of the harddrive normally.  Successful install.  I
logged into the system and started looking into rebuilding the kernel to
include the binary only M-Systems modules when a co-worker accidentally
unplugged the wrong plug (he was working on some nearby machines),
unplugging the power supply I was using to power the hard drives (and
pretty much crashing the PC104 system).  I powered down the PC104 system,
and we plugged everything in again.  When I tried to reboot the system,
Lilo couldn't even find the kernel.  I pulled out the emergency rescue
disc (RedHat's install disk) and booted it up.  When I ran fsck on the
drive, it found error after error on the drive.  Eventually I had to ^C
that fsck run and try it again with the -y option (my arm was getting
tired).  Once fsck was done / was pretty much a ghost town, at which point
I decided to just reinstall the system.

It's entirely possible that there is something I could have done to
prevent fsck from clearing out the filesystem, but it certainly isn't
obvious from the manual, and I've never seen a FreeBSD system do that.

Also, for anybody who says the pull the power test isn't realistic, I can
assure you that power failures DO happen (probably less in your area than
mine (I hope!)) and not planning for them only brings disaster later when
you have a room with 1000 servers lose power.





Re: technical comparison

2001-05-23 Thread Andresen,Jason R.

On Tue, 22 May 2001, Shannon Hendrix wrote:

 On Tue, May 22, 2001 at 02:49:21PM -0400, Jason Andresen wrote:

  60000 files took ~15 minutes to create as is.  I'm going to have to wait
  until tonight to run larger sets.  2.2.16 is what we have here.
  I'm still waiting to see how much faster ReiserFS is.

 I'm willing to overnight your test if you want. Do you have it packaged
 up to send? It would be interesting just to get numbers from a Linux
 system with a modern kernel. 2.4.1 gave me enough of a speed boost to
 put off another FreeBSD install until I fix some problems there.

 I cannot test FreeBSD with SCSI right now so my system will be an
 inequal set of results.

 I would offer to test NetBSD as well, but I suppose no one would be
 interested in that.

The test is 'postmark'.  It is in /usr/ports/benchmarks, but the
distribution is a single C file.  Just compile it with:
gcc -O -o postmark postmark.c on all of the systems.  That's what the port
uses.  Your system should be unladen but running in multiuser mode and the
test directory you choose should be empty.

The options you are interested in are:
set transactions 10000 -- what I used for all of my tests
set number <number of starting files in the directory to test>
set location /path/to/empty/local/directory
run
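For anyone who wants to reproduce this non-interactively: postmark reads its commands from standard input, so the whole run can be scripted.  A sketch (the test directory path here is just an example):

```shell
# Build postmark the same way the port does (it is a single C file).
gcc -O -o postmark postmark.c

# Use an empty local directory for the test (example path).
mkdir -p /var/tmp/pmtest

# Feed the run parameters on stdin instead of typing them at the pm> prompt.
./postmark <<'EOF'
set transactions 10000
set number 60000
set location /var/tmp/pmtest
run
quit
EOF
```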






Re: technical comparison

2001-05-23 Thread Andresen,Jason R.

I just finished the FreeBSD test with

vfs.vmiodirenable=1 (it was 0 before)

60000 simultaneous files, 10000 transactions, FreeBSD
4.0-RELEASE + softupdates with write caching disabled.  Results are pretty
much unchanged.  Do you have to enable vmiodirenable at boot time for it
to take effect?

Time:
1286 seconds total
505 seconds of transactions (19 per second)

Files:
65065 created (50 per second)
Creation alone: 60000 files (85 per second)
Mixed with transactions: 5065 files (10 per second)
5078 read (10 per second)
4921 appended (9 per second)
65065 deleted (50 per second)
Deletion alone: 60130 files (761 per second)
Mixed with transactions: 4935 files (9 per second)

Data:
26.01 megabytes read (20.23 kilobytes per second)
325.12 megabytes written (252.82 kilobytes per second)
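(For what it's worth, my understanding -- worth double-checking on 4.x -- is that vfs.vmiodirenable is an ordinary sysctl that takes effect for directories cached after you set it, so no reboot should be needed; a reboot just guarantees nothing is still cached under the old setting.  A sketch:)

```shell
# Check the current value, then enable VM-backed directory caching at runtime.
sysctl vfs.vmiodirenable
sysctl -w vfs.vmiodirenable=1      # 4.x syntax; newer releases drop the -w

# To make it persistent across reboots, set it in /etc/sysctl.conf:
echo 'vfs.vmiodirenable=1' >> /etc/sysctl.conf
```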






Re: technical comparison

2001-05-23 Thread Andresen,Jason R.

On Tue, 22 May 2001, Terry Lambert wrote:

 I don't understand the inability to perform the trivial
 design engineering necessary to keep from needing to put
 60,000 files in one directory.

 However, we can take it as a given that people who need
 to do this are incapable of doing computer science.

 I would suggest two things:

 1)If write caching is off on the Linux disks, turn
   it off on the FreeBSD disks.

 2)   -- and then turn it on on both.

 3)Modify the test to delete the files based on a
   directory traversal, instead of promiscuous
   knowledge of the file names, which is cheating
   to make the lookups appear faster.

 (the rationale behind this last is that people who can't
 design around needing 60,000 files in a single directory
 are probably going to to be unable to correctly remember
 the names of the files they created, since if they could,
 then they could remember things like ./a/a/aardvark or
 ./a/b/abominable).

The problem comes along when you are using a third party
application that keeps a bazillion files in a directory,
which was the problem that spawned this entire thread.

Why is knowing the file names cheating?  It is almost certain
that the application will know the names of its own files
(and won't be grepping the entire directory every time it
needs to find a file).  I doubt a human is ever going to want
to work in a directory where you have 60000 files lying about,
but an application might easily be written to work in just such
conditions.
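For what it's worth, the layout Terry is describing (./a/a/aardvark) is only a few lines of code.  A rough sh sketch, with all names illustrative, that buckets files by their first two characters instead of piling them into one flat directory:

```shell
#!/bin/sh
# Map a name like "aardvark" to a two-level path "a/a/aardvark",
# using its first two characters as subdirectory names.
hashed_path() {
    name=$1
    d1=$(printf '%s' "$name" | cut -c1)
    d2=$(printf '%s' "$name" | cut -c2)
    printf '%s/%s/%s\n' "$d1" "$d2" "$name"
}

# Create the bucket directories and the file under a given root.
store() {
    root=$1
    name=$2
    rel=$(hashed_path "$name")
    mkdir -p "$root/$(dirname "$rel")"
    : > "$root/$rel"   # placeholder: create an empty file
}
```

With 26 one-letter buckets per level, each leaf directory holds roughly 1/676 of the files, which keeps linear directory scans cheap on UFS and ext2 alike.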





Re: technical comparison

2001-05-23 Thread Andresen,Jason R.

On Tue, 22 May 2001, Shannon Hendrix wrote:

 On Tue, May 22, 2001 at 12:03:33PM -0400, Jason Andresen wrote:

  The data:
 
  Hardware:
  Both machines have the same hardware on paper (although it is TWO
  machines,
  YMMV).
  PII-300
  Intel PIIX4 ATA33 controller
  IBM-DHEA-38451 8063MB ata0-master using UDMA33 HD
 
  Note: all variables are left at default unless mentioned.
 
  1 transactions, 500 files.

 What did you set size to?  How much memory on the machine?

Size was left at the default.  The machines have 64MB of main memory.






Re: technical comparison

2001-05-23 Thread Shannon Hendrix

On Wed, May 23, 2001 at 06:53:37AM -0300, Daniel C. Sobral wrote:

  I cannot verify that with my drive, but my largest is 18GB so maybe
  the difference is not as pronounced as on some newer drives like those
  (currently) monster 70GB drives.
 
 It should be measurable.

Actually, I edited too much.  I have seen a difference, but it was too small
to care about on my system.  These are 7200rpm 18GB drives too.

The other variances in filesystem performance seem to overshadow the
difference.

The only thing I ever did to pick up some speed was to move some data
on a raw device to the faster tracks. I was streaming it in so the
speedup was good. I also picked up some performance on one Linux system
by putting swap in the faster tracks. But for the most part, I've never
been able to tell.

I have read that on the 40-80GB drives, it's very noticeable. In fact,
the IBM Ultrastars are supposed to be faster than their electronics can
handle on the very outer tracks.

-- 
 Secrecy is the beginning of tyranny. -- Unknown
s h a n n o n @ w i d o m a k e r . c o m




Re: technical comparison

2001-05-23 Thread julien

Hi all,

I tried your tests on a quite different configuration, a PIII 800 with
1GB ram, with an AcceleRAID 170 controller and a single RAID5 pack of
4*8GB IBM SCSI drives. The system is a 4.3-rc2, NO softupdates, default
configuration.
Here are the results:

pmset transactions 10000
pmset number 60000
pmset location /root/test/
pmrun
Creating files...Done
Performing transactions..Done
Deleting files...Done
Time:
1715 seconds total
199 seconds of transactions (50 per second)

Files:
65065 created (37 per second)
Creation alone: 60000 files (73 per second)
Mixed with transactions: 5065 files (25 per second)
5078 read (25 per second)
4921 appended (24 per second)
65065 deleted (37 per second)
Deletion alone: 60130 files (86 per second)
Mixed with transactions: 4935 files (24 per second)

Data:
26.01 megabytes read (15.17 kilobytes per second)
325.12 megabytes written (189.58 kilobytes per second)

--
[EMAIL PROTECTED]

- Original Message -
From: Andresen,Jason R. [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, May 23, 2001 3:15 PM
Subject: Re: technical comparison


 I just finished the FreeBSD test with

 vfs.vmiodirenable=1 (it was 0 before)

 60000 simultaneous files, 10000 transactions, FreeBSD
 4.0-RELEASE + softupdates with write caching disabled.  Results are pretty
 much unchanged.  Do you have to enable vmiodirenable at boot time for it
 to take effect?

 Time:
 1286 seconds total
 505 seconds of transactions (19 per second)

 Files:
 65065 created (50 per second)
 Creation alone: 60000 files (85 per second)
 Mixed with transactions: 5065 files (10 per second)
 5078 read (10 per second)
 4921 appended (9 per second)
 65065 deleted (50 per second)
 Deletion alone: 60130 files (761 per second)
 Mixed with transactions: 4935 files (9 per second)

 Data:
 26.01 megabytes read (20.23 kilobytes per second)
 325.12 megabytes written (252.82 kilobytes per second)









Re: technical comparison

2001-05-23 Thread Shannon Hendrix

On Wed, May 23, 2001 at 09:03:37AM -0400, Andresen,Jason R. wrote:

 The scary thing is that it was the attached hard drive that lost all of the
 files.  The situation is this:
[snip]

Sorry to hear that, but like I said, it isn't typical. ext2 in its
early days, and ext before that, were really bad. But I have few problems
with it these days. I've lost more ufs filesystems than I have ext2, but
I don't assume my results are typical: I know ufs is better. However,
ext2's problems are grossly exaggerated.

 It's entirely possible that there is something I could have done to
 prevent fsck from clearing out the filesystem, but it certainly isn't
 obvious from the manual, and I've never seen a FreeBSD system do that.

Nothing much you can do unless you happen to know ext2 inside and out,
and fix it manually.

It's not normal for ext2 to die like that, and be unable to recover. 

Over the years I have had more bizarre, inexplicable OS problems on
Intel PCs than any other.

 Also, for anybody who says the pull the power test isn't realistic, I can
 assure you that power failures DO happen (probably less in your area than

My point was that yanking power only tests one aspect of the filesystem.
Choosing one based on passing or not passing that test isn't a good idea.

 mine (I hope!)) and not planning for them only brings disaster later when
 you have a room with 1000 servers lose power.

Well, a UPS system is as important in any system you care about as the
computers and operating systems. If you run 1000 servers and they can
lose power, you're on borrowed time anyway.

Where I live, the power gets worse every year. I lost quite a few ext
filesystems, but only a couple of ufs and ext2 filesystems. Then I
bought a 1920VA UPS and it's no longer an issue. I just found it easier
to not lose power than to worry about which filesystem recovers from it
better.

-- 
There are nowadays professors of philosophy, but not philosophers.




Re: technical comparison

2001-05-23 Thread D. Rock

Quoting Daniel C. Sobral [EMAIL PROTECTED]:

 Note, though, that there is some very recent performance improvement on
 very large directories known as dirpref (what changed, actually, was
 dirpref's algorithm). This is NOT present in 4.3-RELEASE, though it
 _might_ have since been committed to -stable.
I don't think dirpref should help in this case.

IIRC the dirpref patches change the algorithm for choosing the cylinder
group of subdirectories relative to their parent. Since postmark by default
only uses one directory, there should be no benefit.

-- 
Daniel




Re: technical comparison

2001-05-23 Thread Andresen,Jason R.

On Wed, 23 May 2001, Shannon Hendrix wrote:

 Where I live, the power gets worse every year. I lost quite a few ext
 filesystems, but only a couple of ufs and ext2 filesystems. Then I
 bought a 1920VA UPS and it's no longer an issue. I just found it easier
 to not lose power than to worry about which filesystem recovers from it
 better.

One of the funny things about the place I used to work (which will remain
unnamed) was how the UPS folks were always testing their systems by
pulling the plug on the main power to the building.  The problem was they
apparently hired untrained monkeys to wire up the UPS systems (which were
just a few rooms chock full of batteries) and managed to kill power to the
entire building (including the computer rooms) at least once every three
months.  This was doubly annoying because we had well over 100 full RAID
racks (with 80 disks in each rack) in the facility.  Hard drives, as most
of you probably know, are most likely to fail at boot time, so every time
one of the brain cases managed to kill the power in the CRs, we had to
spend the rest of the day replacing failed RAID drives.





Re: technical comparison

2001-05-23 Thread Kris Kennaway

On Wed, May 23, 2001 at 08:17:12AM -0400, Andresen,Jason R. wrote:

  Did you enable write caching?  You didn't mention, and it's off by
  default in 4.3, but I think enabled by default on Linux.
 
 I tried to leave the FreeBSD and Linux boxes as unchanged as possible for
 my tests (they are lab machines that have other uses, although I made sure
 they were idle during the test periods).
 
 I left write caching enabled in the Linux boxes, and left it disabled on
 the FreeBSD boxes.  Personally, I'm hesitant to enable write caching
 on FreeBSD because we tend to use it on machines where we really really
 don't want to lose data.  Write caching is ok on the Linux machines
 because we use them as pure testbeds that we can reconstruct easily if
 their disks go south.

That's all well and good, but I thought the aim here was to compare
Linux and FreeBSD performance on as level playing field as possible?
You're not measuring FS performance, you're measuring FS performance
plus cache performance, so your numbers so far tell you nothing
concrete.

Kris

 PGP signature


Re: technical comparison

2001-05-23 Thread Andresen,Jason R.

On Wed, 23 May 2001, Kris Kennaway wrote:

 On Wed, May 23, 2001 at 08:17:12AM -0400, Andresen,Jason R. wrote:

   Did you enable write caching?  You didn't mention, and it's off by
   default in 4.3, but I think enabled by default on Linux.
 
  I tried to leave the FreeBSD and Linux boxes as unchanged as possible for
  my tests (they are lab machines that have other uses, although I made sure
  they were idle during the test periods).
 
  I left write caching enabled in the Linux boxes, and left it disabled on
  the FreeBSD boxes.  Personally, I'm hesitant to enable write caching
  on FreeBSD because we tend to use it on machines where we really really
  don't want to lose data.  Write caching is ok on the Linux machines
  because we use them as pure testbeds that we can reconstruct easily if
  their disks go south.

 That's all well and good, but I thought the aim here was to compare
 Linux and FreeBSD performance on as level playing field as possible?
 You're not measuring FS performance, you're measuring FS performance
 plus cache performance, so your numbers so far tell you nothing
 concrete.

Yes, they tell us that FreeBSD with softupdates and no write cache
performs better in large cases than Linux with ext2fs and write caching
enabled.

Also my FreeBSD 4.0 boxes don't have the hw.ata.wc knob, so it's harder
for me to test this.  Also, I don't know how one goes about disabling the
write cache in Linux without recompiling the kernel (we have some
custom mods in place, so I'm reluctant to do this).





Re: technical comparison

2001-05-23 Thread void

On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote:
 
 Why is knowing the file names cheating?  It is almost certain
 that the application will know the names of its own files
 (and won't be grepping the entire directory every time it
 needs to find a file).

With 60,000 files, that would have the application duplicating
60,000 pieces of information that are stored by the operating system.
Operations like open() and unlink() still have to search the directory
to get the inode, so there isn't much incentive for an application to
do that, I think.

-- 
 Ben

An art scene of delight
 I created this to be ...  -- Sun Ra




Re: technical comparison

2001-05-23 Thread Dave Hayes

Terry Lambert [EMAIL PROTECTED] writes:
 I don't understand the inability to perform the trivial
 design engineering necessary to keep from needing to put
 60,000 files in one directory.

Hear hear! ;) (Been waiting for that one)

 However, we can take it as a given that people who need
 to do this are incapable of doing computer science.

You can't make that assumption just yet (although it seems
reasonable). We really don't know exactly what the problem they are
trying to solve is. Network news sites running old versions of
software (as an example, I know someone who still runs CNEWS) have
very clear reasons for phenomena resembling 60,000 files in one
directory.

I would begin to question the assumption that seems to have been
unquestioned. Namely, why is the focus -just- on speed? FreeBSD
outperforms Linux on reliability and security as well. Not to mention
networking. 
--
Dave Hayes - Consultant - Altadena CA, USA - [EMAIL PROTECTED] 
 The opinions expressed above are entirely my own 

We can never have enough of that which we really do not want.  --Eric Hoffer








Re: technical comparison

2001-05-23 Thread Rik van Riel

On Wed, 23 May 2001, Andresen,Jason R. wrote:
 On Wed, 23 May 2001, Kris Kennaway wrote:

  That's all well and good, but I thought the aim here was to compare
  Linux and FreeBSD performance on as level playing field as possible?
  You're not measuring FS performance, you're measuring FS performance
  plus cache performance, so your numbers so far tell you nothing
  concrete.

*nod*

 Yes, they tell us that FreeBSD with softupdates and no write
 cache performs better in large cases than Linux with ext2fs and
 write caching enabled.

 Also my FreeBSD 4.0 boxes don't have the hw.ata.wc knob, so it's harder
 for me to test this.  Also, I don't know how ones goes about disabling the
 write cache in Linux without recompiling the kernel (which we have some
 custom mods in place, so I'm reluctant to do this).

1. I don't think I've ever seen a Linux distro which has write
   caching enabled by default. Hell, DMA33 isn't even enabled
   by default ;)

2. hdparm -W0 /dev/drive to turn write caching off, -W1 to
   turn it on

3. I've seen many disks which got _slower_ with write caching
   turned on. Sure, it helps for sequential IO, but with more
   random IO the write caching on the disk can interfere really
   badly with the IO scheduling in the OS ... I've seen as much
   as a 5x drop in random IO performance with write caching ON
   compared to OFF.

I guess it would be good to follow Kris' suggestions and try
to do the tests on a level playing field.  The results might
just be interesting ;)
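To actually level that field, the drive-cache knobs on both sides are, to the best of my knowledge (device names are examples, and on FreeBSD 4.x hw.ata.wc is a boot-time tunable rather than a runtime sysctl):

```shell
# Linux: show, disable, and re-enable the IDE drive's write cache.
hdparm -W /dev/hda     # query the current setting
hdparm -W0 /dev/hda    # write caching off
hdparm -W1 /dev/hda    # write caching on

# FreeBSD 4.x: set the loader tunable in /boot/loader.conf and reboot.
echo 'hw.ata.wc="0"' >> /boot/loader.conf
```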

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com/





Re: technical comparison

2001-05-23 Thread Shannon

On Wed, May 23, 2001 at 10:54:40PM -0300, Rik van Riel wrote:

 1. I don't think I've ever seen a Linux distro which has write
caching enabled by default. Hell, DMA33 isn't even enabled
by default ;)

You are talking about controlling the IDE drive cache.

The issue here is write cache in the filesystem code.

-- 
[EMAIL PROTECTED]  _
__/ armchairrocketscientistgraffitiexenstentialist
 And in billows of might swell the Saxons before her,-- Unite, oh
 unite!  Or the billows burst o'er her! -- Downfall of the Gael




Re: technical comparison

2001-05-22 Thread Jason Andresen

Albert D. Cahalan wrote:

 It should be immediately obvious that ext2 is NOT the filesystem
 being proposed, async or not. For large directories, ext2 sucks
 as bad as UFS does. This is because ext2 is a UFS clone.
 
 The proposed filesystem is most likely Reiserfs. This is a true
 journalling filesystem with a radically non-traditional layout.
 It is no problem to put millions of files in a single directory.
 (actually, the all-in-one approach performs better than a tree)
 
 XFS and JFS are similarly capable, but Reiserfs is well tested
 and part of the official Linux kernel. You can get the Reiserfs
 team to support you too, in case you want to bypass the normal
 filesystem interface for even better performance.

Er, I don't think ReiserFS is in the Linux kernel yet, although it is
the default filesystem on some distros apparently.  I think Linus has
some reservations about the stability of the filesystem since it is 
fairly new.  That said, it would be hard to be much worse than Ext2fs
with write caching enabled (default!) in the event of power failure.
We only have three Linux boxes here (and one is a PC104 with a flash
disk) and already I've had to reinstall the entire OS once when we had a 
power glitch.  ext2fsck managed to destroy about 1/3 of the files on the
system, in a pretty much random manner (the lib and etc were hit hard).  
Heck, the system didn't even try to boot when it came back, I had to pull
out the rescue disk and run fsck from there.  Good thing the rescue disk
was the same as the install disk, it saved me a disk swap. :(

If only FreeBSD could boot from those funky M-Systems flash disks. 

 So, no async here, and UFS + soft updates can't touch the
 performance on huge directories.





Re: technical comparison

2001-05-22 Thread Hroi Sigurdsson

[trimming CCs]

On Tue, May 22, 2001 at 09:31:34AM -0400, Jason Andresen wrote:

 Er, I don't think ReiserFS is in the Linux kernel yet, although it is
 the default filesystem on some distros apparently.  I think Linus has
 some reservations about the stability of the filesystem since it is 
 fairly new. 

It is in now AFAIK.

 That said, it would be hard to be much worse than Ext2fs
 with write caching enabled (default!) in the event of power failure.
 We only have three Linux boxes here (and one is a PC104 with a flash
 disk) and already I've had to reinstall the entire OS once when we had a 
 power glitch.  ext2fsck managed to destroy about 1/3 of the files on the
 system, in a pretty much random manner (the lib and etc were hit hard).  
 Heck, the system didn't even try to boot when it came back, I had to
 pull

FWIW, I lost two filesystems last week. One ext2 and the second reiser
and no crashes/power failures were involved.
The ext2 failure meant a complete reinstall (only 4-5 files where left
in / after fsck). A reiser filesystem started giving input/output errors
and could not be repaired with reiserfsck. Trying to back up the file
system before a repair only resulted in kernel panics.

-- 
Hroi Sigurdsson [EMAIL PROTECTED]
Netgroup A/S  http://www.netgroup.dk




Re: technical comparison

2001-05-22 Thread Jason Andresen

Albert D. Cahalan wrote:
 
 Gordon Tetlow writes:
  On Mon, 21 May 2001, Jordan Hubbard wrote:
  [Charles C. Figueire]
 
  c) A filesystem that will be fast in light of tens of thousands of
 files in a single directory (maybe even hundreds of thousands)
 
  I think we can more than hold our own with UFS + soft updates.  This
  is another area where you need to get hard numbers from the Linux
  folks.  I think your assumption that Linux handles this effectively
  is flawed and I'd like to see hard numbers which prove otherwise;
  you should demand no less.
 
  Also point out the reliability factor here which is a bit harder to point
  to a magic number and See, we *are* better! ext2 runs async by default
  which can lead to nasty filesystem corruption in the event of a power
  loss. With softupdates, the filesystem metadata will always be in sync and
  uncorrupted (barring media failure of course).
 
 It should be immediately obvious that ext2 is NOT the filesystem
 being proposed, async or not. For large directories, ext2 sucks
 as bad as UFS does. This is because ext2 is a UFS clone.
 
 The proposed filesystem is most likely Reiserfs. This is a true
 journalling filesystem with a radically non-traditional layout.
 It is no problem to put millions of files in a single directory.
 (actually, the all-in-one approach performs better than a tree)
 
 XFS and JFS are similarly capable, but Reiserfs is well tested
 and part of the official Linux kernel. You can get the Reiserfs
 team to support you too, in case you want to bypass the normal
 filesystem interface for even better performance.
 
 So, no async here, and UFS + soft updates can't touch the
 performance on huge directories.

Unfortunately I don't have a ReiserFS partition available to test with,
but I do have UFS and ext2fs partitions.

Here are the results I got from postmark, which seems to be the closest
match in the entire ports tree to the original problem.

Test setup:
Two machines with the same make and model hardware, one running
FreeBSD 4.0, the other running RedHat Linux 7.0.

The data:

Hardware:
Both machines have the same hardware on paper (although it is TWO
machines, YMMV).
PII-300
Intel PIIX4 ATA33 controller
IBM-DHEA-38451 8063MB ata0-master using UDMA33 HD

Note: all variables are left at default unless mentioned.

10000 transactions, 500 files.

FreeBSD 4.0 +Softupdates, write cache disabled:
Time:
35 seconds total
34 seconds of transactions (294 per second)

Files:
5513 created (157 per second)
Creation alone: 500 files (500 per second)
Mixed with transactions: 5013 files (147 per second)
4917 read (144 per second)
5016 appended (147 per second)
5513 deleted (157 per second)
Deletion alone: 526 files (526 per second)
Mixed with transactions: 4987 files (146 per second)

Data:
31.27 megabytes read (893.48 kilobytes per second)
34.71 megabytes written (991.70 kilobytes per second)


Linux 2.2.16 ext2fs and write caching enabled
Time:
28 seconds total
28 seconds of transactions (357 per second)

Files:
5513 created (196 per second)
Creation alone: 500 files (500 per second)
Mixed with transactions: 5013 files (179 per second)
4917 read (175 per second)
5016 appended (179 per second)
5513 deleted (196 per second)
Deletion alone: 526 files (526 per second)
Mixed with transactions: 4987 files (178 per second)

Data:
31.27 megabytes read (1.12 megabytes per second)
34.71 megabytes written (1.24 megabytes per second)


10000 transactions, 30000 files:
FreeBSD 4.0 +softupdates, write cache disabled: 

Time:
640 seconds total
410 seconds of transactions (24 per second)

Files:
34993 created (54 per second)
Creation alone: 30000 files (146 per second)
Mixed with transactions: 4993 files (12 per second)
5055 read (12 per second)
4944 appended (12 per second)
34993 deleted (54 per second)
Deletion alone: 29986 files (1199 per second)
Mixed with transactions: 5007 files (12 per second)

Data:
25.62 megabytes read (40.03 kilobytes per second)
179.79 megabytes written (280.92 kilobytes per second)

Linux 2.2.16 ext2fs with write caching enabled
Time:
1009 seconds total
612 seconds of transactions (16 per second)

Files:
34993 created (34 per second)
Creation alone: 30000 files (83 per second)
Mixed with transactions: 4993 files (8 per second)
5055 read (8 per second)
4944 appended (8 per second)
34993 deleted (34 per second)
Deletion alone: 29986 files (768 per second)
Mixed with transactions: 5007 files (8 per second)

Data:
25.62 megabytes read 

Re: technical comparison

2001-05-22 Thread Jason Andresen

Jason Andresen wrote:

Oops, I fubbed up the Linux 60000-file test, I'm rerunning it now, 
but it will take a while to finish.

 Results:
 ufs+softupdates is a little slower than ext2fs+wc for low numbers of
 files, but scales better.  I wish I had a Reiserfs partition to
 test with.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: technical comparison

2001-05-22 Thread Jason Andresen

Jason Andresen wrote:
 
 Jason Andresen wrote:
 
 Oops, I fubbed up the Linux 60000-file test, I'm rerunning it now,
 but it will take a while to finish.
 
  Results:
  ufs+softupdates is a little slower than ext2fs+wc for low numbers of
  files, but scales better.  I wish I had a Reiserfs partition to
  test with.

The test is done:

Linux 2.2.16 with ext2fs and write caching
10000 transactions, 60000 simultaneous files:

Time:
2084 seconds total
702 seconds of transactions (14 per second)

Files:
65065 created (31 per second)
Creation alone: 60000 files (48 per second)
Mixed with transactions: 5065 files (7 per second)
5078 read (7 per second)
4921 appended (7 per second)
65065 deleted (31 per second)
Deletion alone: 60130 files (395 per second)
Mixed with transactions: 4935 files (7 per second)

Data:
26.01 megabytes read (12.48 kilobytes per second)
325.12 megabytes written (156.01 kilobytes per second)

I don't suppose anybody has a FreeBSD and Linux box dual booting
(or identically spec'd) with ReiserFS anywhere?  I'm quite 
curious how much faster ReiserFS is in these tests.





Re: technical comparison

2001-05-22 Thread Terry Lambert

] I work in an environment consisting of 300+ systems, all FreeBSD
] and Solaris, along with lots of EMC and F5 stuff. Our engineering division
] has been working on a dynamic content server and search engine for the
] past 2.5 years. They have consistently not met up to performance and
] throughput requirements and have always blamed our use of FreeBSD for it.

You may wish to point out to them that their F5 boxes are
running FreeBSD.


Terry Lambert
[EMAIL PROTECTED]
---
Any opinions in this posting are my own and not those of my present
or previous employers.




Re: technical comparison

2001-05-22 Thread Albert D. Cahalan

Jason Andresen writes:
 Albert D. Cahalan wrote:

 It should be immediately obvious that ext2 is NOT the filesystem
 being proposed, async or not. For large directories, ext2 sucks
 as bad as UFS does. This is because ext2 is a UFS clone.

 The proposed filesystem is most likely Reiserfs. This is a true
 journalling filesystem with a radically non-traditional layout.
 It is no problem to put millions of files in a single directory.
 (actually, the all-in-one approach performs better than a tree)

 XFS and JFS are similarly capable, but Reiserfs is well tested
 and part of the official Linux kernel. You can get the Reiserfs
 team to support you too, in case you want to bypass the normal
 filesystem interface for even better performance.

 Er, I don't think ReiserFS is in the Linux kernel yet, although it is
 the default filesystem on some distros apparently.  I think Linus has
 some reservations about the stability of the filesystem since it is

It is in the kernel:
http://lxr.linux.no/source/fs/reiserfs/?v=2.4.4
Bugs died left and right when it went in.

 fairly new.  That said, it would be hard to be much worse than Ext2fs
 with write caching enabled (default!) in the event of power failure.
 We only have three Linux boxes here (and one is a PC104 with a flash
 disk) and already I've had to reinstall the entire OS once when we had a
 power glitch.  ext2fsck managed to destroy about 1/3 of the files on the
 system, in a pretty much random manner (the lib and etc were hit hard).

If you don't like ext2, why should it like you? :-)
I power cycle a Linux box nearly every day to reset
a board.

 If only FreeBSD could boot from those funky M-Systems flash disks.

If you want flash, use a filesystem designed for flash.
(not UFS, ext2, Reiserfs, XFS, JFS, or FAT... try JFFS2)

 So, no async here, and UFS + soft updates can't touch the
 performance on huge directories.

From another email you mention benchmarking with:

 Linux 2.2.16 with ext2fs and write caching
 10000 transactions, 60000 simultaneous files:

1. The 2.2.16 kernel is obsolete.
2. 60000 files is not a lot. Try a few million files.






RE: technical comparison

2001-05-22 Thread Matt Simerson

 -Original Message-
 From: Terry Lambert [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, May 22, 2001 10:59 AM
 To: [EMAIL PROTECTED]
 Subject: Re: technical comparison
 
 ]  I work in an environment consisting of 300+ systems, all FreeBSD
 ] and Solaris, along with lots of EMC and F5 stuff. Our engineering
division
 ] has been working on a dynamic content server and search engine for the
 ] past 2.5 years. They have consistently not met up to performance and
 ] throughput requirements and have always blamed our use of FreeBSD for
it.
 
 You may wish to point out to them that their F5 boxes are
 running FreeBSD.
 
   Terry Lambert
   [EMAIL PROTECTED]

When did that change?  As of March which was the last time I had my grubby
little hands all over a F5 BigIP box in our lab, it was NOT running FreeBSD.
It runs a tweaked version of BSDI's kernel. 

Matt  





Re: technical comparison

2001-05-22 Thread Jason Andresen

Albert D. Cahalan wrote:
 
 Jason Andresen writes:
  Er, I don't think ReiserFS is in the Linux kernel yet, although it is
  the default filesystem on some distros apparently.  I think Linus has
  some reservations about the stability of the filesystem since it is
 
 It is in the kernel:
 http://lxr.linux.no/source/fs/reiserfs/?v=2.4.4
 Bugs died left and right when it went in.

Looks like my news was out of date.  Thanks for the update.

  fairly new.  That said, it would be hard to be much worse than Ext2fs
  with write caching enabled (default!) in the event of power failure.
  We only have three Linux boxes here (and one is a PC104 with a flash
  disk) and already I've had to reinstall the entire OS once when we had a
  power glitch.  ext2fsck managed to destroy about 1/3 of the files on the
  system, in a pretty much random manner (the lib and etc were hit hard).
 
 If you don't like ext2, why should it like you? :-)
 I power cycle a Linux box nearly every day to reset
 a board.
 
  If only FreeBSD could boot from those funky M-Systems flash disks.
 
 If you want flash, use a filesystem designed for flash.
 (not UFS, ext2, Reiserfs, XFS, JFS, or FAT... try JFFS2)
 
  So, no async here, and UFS + soft updates can't touch the
  performance on huge directories.
 
 From another email you mention benchmarking with:
 
  Linux 2.2.16 with ext2fs and write caching
  10000 transactions, 60000 simultaneous files:
 
 1. The 2.2.16 kernel is obsolete.
 2. 60000 files is not a lot. Try a few million files.

60000 files took ~15 minutes to create as is.  I'm going to have to wait 
until tonight to run larger sets.  2.2.16 is what we have here.  
I'm still waiting to see how much faster ReiserFS is.





Re: technical comparison

2001-05-22 Thread Nadav Eiron

I ran tests that I think are similar to what Jason ran on identically
configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much, much
faster than UFS+softupdates on these tests. 

Linux (2.2.14-5 + ReiserFS):
Time:
164 seconds total
97 seconds of transactions (103 per second)

Files:
65052 created (396 per second)
Creation alone: 60000 files (1090 per second)
Mixed with transactions: 5052 files (52 per second)
4936 read (50 per second)
5063 appended (52 per second)
65052 deleted (396 per second)
Deletion alone: 60104 files (5008 per second)
Mixed with transactions: 4948 files (51 per second)

Data:
24.83 megabytes read (155.01 kilobytes per second)
336.87 megabytes written (2.05 megabytes per second)

FreeBSD 4.3-RELEASE (ufs/softupdates):
Time:
537 seconds total
155 seconds of transactions (64 per second)

Files:
65052 created (121 per second)
Creation alone: 60000 files (172 per second)
Mixed with transactions: 5052 files (32 per second)
4936 read (31 per second)
5063 appended (32 per second)
65052 deleted (121 per second)
Deletion alone: 60104 files (1717 per second)
Mixed with transactions: 4948 files (31 per second)

Data:
24.83 megabytes read (47.34 kilobytes per second)
336.87 megabytes written (642.38 kilobytes per second)

Both tests were done with postmark-1.5, 60000 files in 10000 transactions.
The machines are IBM Netfinity 4000R, the disk is an IBM DPSS-336950N,
connected to an Adaptec 2940UW.
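[For anyone who wants to repeat these runs: the configuration above maps
onto a short postmark command script. A sketch using postmark 1.5's
interactive commands, with a hypothetical mount point for the filesystem
under test:]

```shell
# 60000 simultaneous files, 10000 mixed transactions, run against
# the filesystem under test (mount point is hypothetical).
postmark <<'EOF'
set location /mnt/test
set number 60000
set transactions 10000
run
quit
EOF
```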

Nadav





Re: technical comparison

2001-05-22 Thread void

On Tue, May 22, 2001 at 12:40:11PM -0600, Matt Simerson wrote:

 When did that change?  As of March which was the last time I had my grubby
 little hands all over a F5 BigIP box in our lab, it was NOT running FreeBSD.
 It runs a tweaked version of BSDI's kernel. 

I believe it is Terry's information that's out of date, not yours.

-- 
 Ben

An art scene of delight
 I created this to be ...  -- Sun Ra




Re: technical comparison

2001-05-22 Thread Munish Chopra

ReiserFS entered Linux kernels in the pre 2.4.1 series, and was 'official' with 2.4.1. 




Re: technical comparison

2001-05-22 Thread Kris Kennaway

On Tue, May 22, 2001 at 10:27:27PM +0300, Nadav Eiron wrote:
 I ran tests that I think are similar to what Jason ran on identically
 configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much, much
 faster than UFS+softupdates on these tests. 
 
 Linux (2.2.14-5 + ReiserFS):
 Time:
 164 seconds total
 97 seconds of transactions (103 per second)
 
 Files:
 65052 created (396 per second)
 Creation alone: 60000 files (1090 per second)
 Mixed with transactions: 5052 files (52 per second)
 4936 read (50 per second)
 5063 appended (52 per second)
 65052 deleted (396 per second)
 Deletion alone: 60104 files (5008 per second)
 Mixed with transactions: 4948 files (51 per second)
 
 Data:
 24.83 megabytes read (155.01 kilobytes per second)
 336.87 megabytes written (2.05 megabytes per second)
 
 FreeBSD 4.3-RELEASE (ufs/softupdates):

Did you enable write caching?  You didn't mention, and it's off by
default in 4.3, but I think enabled by default on Linux.
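[To level the field Kris describes, write caching can be toggled on both
sides. A sketch with hypothetical device names; hdparm covers IDE drives,
while the SCSI disks in Nadav's test would need the drive's caching mode
page changed instead (e.g. via camcontrol on FreeBSD):]

```shell
# Linux (IDE): query and set the drive's write cache with hdparm
hdparm -W /dev/hda        # report the current write-cache state
hdparm -W0 /dev/hda       # disable write caching
hdparm -W1 /dev/hda       # enable write caching

# FreeBSD 4.x (ATA): write caching is a boot-time loader tunable
echo 'hw.ata.wc="1"' >> /boot/loader.conf   # "1" enables, "0" disables
```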

Kris



Re: technical comparison

2001-05-22 Thread Nadav Eiron

I didn't, but I believe Jason's numbers (for ext2 and ufs) also had write
caching enabled only on Linux.

On Tue, 22 May 2001, Kris Kennaway wrote:

 On Tue, May 22, 2001 at 10:27:27PM +0300, Nadav Eiron wrote:
  I ran tests that I think are similar to what Jason ran on identically
  configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much, much
  faster than UFS+softupdates on these tests. 
  
  Linux (2.2.14-5 + ReiserFS):
  Time:
  164 seconds total
  97 seconds of transactions (103 per second)
  
  Files:
  65052 created (396 per second)
  Creation alone: 60000 files (1090 per second)
Mixed with transactions: 5052 files (52 per second)
  4936 read (50 per second)
  5063 appended (52 per second)
  65052 deleted (396 per second)
  Deletion alone: 60104 files (5008 per second)
Mixed with transactions: 4948 files (51 per second)
  
  Data:
  24.83 megabytes read (155.01 kilobytes per second)
  336.87 megabytes written (2.05 megabytes per second)
  
  FreeBSD 4.3-RELEASE (ufs/softupdates):
 
 Did you enable write caching?  You didn't mention, and it's off by
 default in 4.3, but I think enabled by default on Linux.
 
 Kris
 





Re: technical comparison

2001-05-22 Thread Daniel C. Sobral

Jason Andresen wrote:
 
 If only FreeBSD could boot from those funky M-Systems flash disks.

It can.

-- 
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

wow regex humor... I'm a geek




Re: technical comparison

2001-05-22 Thread Daniel C. Sobral

Jason Andresen wrote:
 
 Results:
 ufs+softupdates is a little slower than ext2fs+wc for low numbers of
 files, but scales better.  I wish I had a Reiserfs partition to
 test with.

Ext2fs is a non-contender.

Note, though, that there is some very recent performance improvement on
very large directories known as dirpref (what changed, actually, was
dirpref's algorithm). This is NOT present on 4.3-RELEASE, though it
_might_ have since been committed to stable.

-- 
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

wow regex humor... I'm a geek




Re: technical comparison

2001-05-22 Thread Daniel C. Sobral

Nadav Eiron wrote:
 
 I ran tests that I think are similar to what Jason ran on identically
 configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much, much
 faster than UFS+softupdates on these tests.

For that matter, did you have vfs.vmiodirenable enabled?
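[For anyone rerunning the comparison, the tunable Daniel asks about can be
inspected and set at runtime; a sketch using the FreeBSD 4.x sysctl(8)
syntax, where -w is required to write a value:]

```shell
sysctl vfs.vmiodirenable           # show the current value (0 = off)
sysctl -w vfs.vmiodirenable=1      # enable VM-backed directory caching now
echo 'vfs.vmiodirenable=1' >> /etc/sysctl.conf   # persist across reboots
```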

-- 
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

wow regex humor... I'm a geek




Re: technical comparison

2001-05-22 Thread Shannon Hendrix

On Tue, May 22, 2001 at 02:49:21PM -0400, Jason Andresen wrote:

 60000 files took ~15 minutes to create as is.  I'm going to have to wait 
 until tonight to run larger sets.  2.2.16 is what we have here.  
 I'm still waiting to see how much faster ReiserFS is.

I'm willing to run your test overnight if you want. Do you have it packaged
up to send? It would be interesting just to get numbers from a Linux
system with a modern kernel. 2.4.1 gave me enough of a speed boost to
put off another FreeBSD install until I fix some problems there.

I cannot test FreeBSD with SCSI right now, so my system will give an
unequal set of results.

I would offer to test NetBSD as well, but I suppose no one would be
interested in that.

-- 
[EMAIL PROTECTED]  _
__/ armchairrocketscientistgraffitiexistentialist
 There is no such thing as security.  Life is either bold adventure,
 or it is nothing -- Helen Keller




Re: technical comparison

2001-05-22 Thread Shannon Hendrix

On Tue, May 22, 2001 at 09:31:34AM -0400, Jason Andresen wrote:

 Er, I don't think ReiserFS is in the Linux kernel yet, although it is
 the default filesystem on some distros apparently.  

ReiserFS, on my system anyway, started just losing files. I'd log in and
would notice some mp3 files or source code was just gone. No heavy load,
and no crashes. Nope, not for me. I think they'll get it in time if the
basic design isn't flawed, but things like an fs just take a lot of time
to debug and come to trust.

There are already some very good journaling systems, and it would seem
better to get them ported, and leave things like ReiserFS a research
project until it proves itself.

 That said, it would be hard to be much worse than Ext2fs with write caching
 enabled (default!) in the event of power failure.

Point taken, but the "yank power, see who survives" test is illogical
and dangerous thinking.

Besides, my drives have megabytes of write-cache that I cannot disable.
Most are large enough to cause problems for most any fs if they crash
at just the right moment. From what I have read, a lot of drives really
ignore commands to turn it off or do synchronous writes.

Both ext2 and ufs both handle my chores with little or no trouble. On
some systems, I've actually preferred ufs to the journaled file systems.

 We only have three Linux boxes here (and one is a PC104 with a flash
 disk) and already I've had to reinstall the entire OS once when we had a 
 power glitch.  ext2fsck managed to destroy about 1/3 of the files on the
 system, in a pretty much random manner (the lib and etc were hit hard).  

This is not typical. Also, I have heard the same thing from other people
about flash disks. fs crash, fsck, and a mess afterwards. It would be
nice if you could use ufs and see if the same problem exists.

-- 
 There's music along the river For Love wanders there, Pale | | |
 flowers on his mantle, Dark leaves on his hair. -- James Joyce | | |
/  |  \
s h a n n o n @ w i d o m a k e r . c o m _/   |   \_




Re: technical comparison

2001-05-22 Thread Daniel C. Sobral

Shannon Hendrix wrote:
 
 On Tue, May 22, 2001 at 02:49:21PM -0400, Jason Andresen wrote:
 
  60000 files took ~15 minutes to create as is.  I'm going to have to wait
  until tonight to run larger sets.  2.2.16 is what we have here.
  I'm still waiting to see how much faster ReiserFS is.
 
  I'm willing to run your test overnight if you want. Do you have it packaged
 up to send? It would be interesting just to get numbers from a Linux
 system with a modern kernel. 2.4.1 gave me enough of a speed boost to
 put off another FreeBSD install until I fix some problems there.
 
 I cannot test FreeBSD with SCSI right now so my system will be an
 inequal set of results.
 
 I would offer to test NetBSD as well, but I suppose no one would be
 interested in that.

And just to get things worse... :-) the test must be made on the *same*
slice. If you configure two different slices, the one on the outer
tracks will be faster.

-- 
Daniel C. Sobral(8-DCS)
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

wow regex humor... I'm a geek




Re: technical comparison

2001-05-22 Thread Shannon Hendrix

On Tue, May 22, 2001 at 12:03:33PM -0400, Jason Andresen wrote:

 Here's the results I got from postmark, which seems to be the closest
 match to the original problem in the entire ports tree. 
 
 Test setup:
 Two machines with the same make and model hardware, one running
 FreeBSD 4.0, the other running RedHat Linux 7.0.
 
 The data:
 
 Hardware:
 Both machines have the same hardware on paper (although it is TWO
 machines, YMMV).
 PII-300
 Intel PIIX4 ATA33 controller
 IBM-DHEA-38451 8063MB ata0-master using UDMA33 HD
 
 Note: all variables are left at default unless mentioned.
 
 10000 transactions, 500 files.

What did you set size to?  How much memory on the machine?

I tested on a 700MHz Athlon system with 256MB RAM, Adaptec 2940UW
controller, 18GB IBM Ultrastar SCSI drive. You must have really low
memory or something because I know that 10000 transactions and 500 files
can't be enough for anything faster than my old Sun SS5.

I hit over 16MB/sec and 5000 transactions per second on my Linux
machine.  On the larger tests, it was disappointing.  I can't
test FreeBSD on SCSI right now, but my NetBSD machine (the
old Sun SS5) wasn't terrible, at least:

Time:
220 seconds total
204 seconds of transactions (49 per second)

Files:
5564 created (25 per second)
Creation alone: 500 files (62 per second)
Mixed with transactions: 5064 files (24 per second)
4999 read (24 per second)
4967 appended (24 per second)
5564 deleted (25 per second)
Deletion alone: 628 files (78 per second)
Mixed with transactions: 4936 files (24 per second)

Data:
32.12 megabytes read (149.52 kilobytes per second)
35.61 megabytes written (165.73 kilobytes per second)

 10000 transactions, 60000 files
 FreeBSD 4.0 with Softupdates, write cache disabled
 Time:
 1259 seconds total
 495 seconds of transactions (20 per second)

I got about 60 per second right here.

I was actually expecting better results from Linux and NetBSD than I
got, and would expect more from FreeBSD than you got.

I'm going to test FreeBSD tomorrow and Linux again with much larger 
numbers of files and transactions.

-- 
 Star Wars Moral Number 17: Teddy bears are dangerous in| | |
 herds. | | |
/  |  \
s h a n n o n @ w i d o m a k e r . c o m _/   |   \_




Re: technical comparison

2001-05-22 Thread Shannon Hendrix

On Tue, May 22, 2001 at 10:55:09PM -0300, Daniel C. Sobral wrote:

 And just to get things worse... :-) the test must be made on the *same*
 slice. If you configure two different slices, the one on the outer
 tracks will be faster.

I cannot verify that with my drive, but my largest is 18GB so maybe
the difference is not as pronounced as on some newer drives like those
(currently) monster 70GB drives.

A 70GB IBM Ultrastar supposedly can physically outrun the internal
electronics on the faster tracks. One review I read mentioned it as a
problem, though I'm not sure why. 

In any case, I'm not quite that picky, and I would not think that
postmark would benefit as much from being on the faster tracks. It's
doing a lot more complicated things than just streaming data.

-- 
And in billows of might swell the Saxons before her,-- Unite, oh
unite!  Or the billows burst o'er her! -- Downfall of the Gael
__
Charles Shannon Hendrix  s h a n n o n @ w i d o m a k e r . c o m




Re: technical comparison

2001-05-22 Thread Terry Lambert

Nadav Eiron wrote:
 
 I ran tests that I think are similar to what Jason ran on identically
 configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much, much
 faster than UFS+softupdates on these tests.

[ ... ]

 Both tests were done with postmark-1.5, 60000 files in
 10000 transactions.  The machines are IBM Netfinity 4000R,
 the disk is an IBM DPSS-336950N, connected to an Adaptec
 2940UW.

I don't understand the inability to perform the trivial
design engineering necessary to keep from needing to put
60,000 files in one directory.

However, we can take it as a given that people who need
to do this are incapable of doing computer science.

I would suggest two things:

1)  If write caching is off on the Linux disks, turn
it off on the FreeBSD disks.

2) -- and then turn it on on both.

3)  Modify the test to delete the files based on a
directory traversal, instead of promiscuous
knowledge of the file names, which is cheating
to make the lookups appear faster.

(the rationale behind this last is that people who can't
design around needing 60,000 files in a single directory
are probably going to be unable to correctly remember
the names of the files they created, since if they could,
then they could remember things like ./a/a/aardvark or
./a/b/abominable).
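[For what it's worth, the traversal-based deletion in (3) is only a few
lines. A minimal sketch — a hypothetical helper, not postmark's own code:]

```c
/* Delete by scanning the directory instead of regenerating the
 * 60,000 known file names, forcing real directory lookups. */
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Unlink every entry in 'dir' except "." and "..".
 * Returns the number of files removed, or -1 if the directory
 * could not be opened. */
long unlink_all(const char *dir)
{
    DIR *dp = opendir(dir);
    struct dirent *de;
    char path[1024];
    long n = 0;

    if (dp == NULL)
        return -1;
    while ((de = readdir(dp)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
        if (unlink(path) == 0)
            n++;
    }
    closedir(dp);
    return n;
}
```

[Swapping something like this into the benchmark's deletion phase would
exercise name lookup the way a real application that has forgotten its
file names must.]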

-- Terry




Re: technical comparison

2001-05-22 Thread David Scheidt

On Tue, 22 May 2001, Shannon Hendrix wrote:

:
:Point taken, but the yank power, see who survives test is illogical
:and dangerous thinking.

Depends on the environment.  I've had lots of machines just lose power.
People will pull power cords out, the back-up generators won't start
before the battery back-up runs out, someone will push the Big Red
Switch.  Even the best back-up power isn't going to help if it catches
fire.  I sort of like machines to work when the power comes back.


-- 
[EMAIL PROTECTED]
Bipedalism is only a fad.





Re: technical comparison

2001-05-22 Thread Terry Lambert

void wrote:
 
 On Tue, May 22, 2001 at 12:40:11PM -0600, Matt Simerson wrote:
 
  When did that change?  As of March which was the last time
  I had my grubby little hands all over a F5 BigIP box in our
   lab, it was NOT running FreeBSD.  It runs a tweaked version
  of BSDI's kernel.
 
 I believe it is Terry's information that's out of date, not yours.

Yep; mea culpa.

I guess they will just have to install BSDI systems in
place of your FreeBSD and Linux systems.

-- Terry




Re: technical comparison

2001-05-22 Thread Albert D. Cahalan

Shannon Hendrix writes:
 On Tue, May 22, 2001 at 12:03:33PM -0400, Jason Andresen wrote:

 Here's the results I got from postmark, which seems to be the closest
 match to the original problem in the entire ports tree. 
 
 Test setup:
 Two machines with the same make and model hardware, one running
 FreeBSD 4.0, the other running RedHat Linux 7.0.

That should be FreeBSD 4.3 and Red Hat 7.1 at least,
or -current and 2.4.5-pre5. Considering that this is
about a new system, the latest software and hardware
ought to be used. Reiserfs only became stable just
recently; the 2.4.1 kernel would be a dumb choice.

 10000 transactions, 500 files.
...
 10000 transactions, 60000 files

Even 60000 files is insignificant by Reiserfs standards.
The test gets interesting with several million files.




Re: technical comparison

2001-05-22 Thread Albert D. Cahalan


Terry Lambert writes:

 I don't understand the inability to perform the trivial
 design engineering necessary to keep from needing to put
 60,000 files in one directory.

 However, we can take it as a given that people who need
 to do this are incapable of doing computer science.

One could say the same about the design engineering necessary
to handle 60,000 files in one directory. You're making excuses.

People _want_ to do this, and it often performs better on
a modern filesystem. This is not about need; it's about
keeping ugly hacks out of the app code.

http://www.namesys.com/5_1.html

 (the rationale behind this last is that people who can't
 design around needing 60,000 files in a single directory
  are probably going to be unable to correctly remember
 the names of the files they created, since if they could,
 then they could remember things like ./a/a/aardvark or
 ./a/b/abominable).

Eeew. ./a/b/abominable is a disgusting old hack used to
work around traditional filesystem deficiencies.
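[For reference, the ./a/b/ fan-out both posters are arguing about amounts
to a tiny path helper. A hypothetical sketch, not taken from either
poster's code:]

```c
/* Spread files across two levels of single-letter subdirectories
 * (./a/b/abominable) so no one directory collects tens of
 * thousands of entries. */
#include <ctype.h>
#include <stdio.h>

/* Write "root/x/y/name" into out, where x and y are the first two
 * characters of name (lowercased; '_' pads names shorter than two). */
void fanout_path(char *out, size_t outlen, const char *root, const char *name)
{
    char c1 = name[0] ? (char)tolower((unsigned char)name[0]) : '_';
    char c2 = (name[0] && name[1]) ? (char)tolower((unsigned char)name[1]) : '_';
    snprintf(out, outlen, "%s/%c/%c/%s", root, c1, c2, name);
}
```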




Re: technical comparison

2001-05-21 Thread Greg Black

Charles C. Figueiredo wrote:

|   I apologize if this is the improper channel for this sort of
| discussion, but it is in the best interests of the FreeBSD following,
| at least, within my organization.

It is the wrong place -- see the list descriptions.

| Linux on Intel fits the bill because it meets these three requirements
| *very* effectively.

So setup some Linux boxes and let them play.  If that solves
your problem, just go with it.  If it doesn't, then you know the
problem is different and you can look into it.  If it really
turns out that the Linux solution works and if you want to do
something to help FreeBSD do as well, then you'll have the data
to make that a possibility.




Re: technical comparison

2001-05-21 Thread Jordan Hubbard

From: Charles C. Figueiredo [EMAIL PROTECTED]
Subject: technical comparison
Date: Mon, 21 May 2001 17:10:54 -0400 (EDT)

   I work in an environment consisting of 300+ systems, all FreeBSD
 and Solaris, along with lots of EMC and F5 stuff. Our engineering division
 has been working on a dynamic content server and search engine for the
 past 2.5 years. They have consistently not met up to performance and
 throughput requirements and have always blamed our use of FreeBSD for it.

This is your first warning sign.  This has all the appearances of a
group of people who've _already_ made their conclusions and are now
busily engaged in fitting the data to match.  The only defense against
this kind of situation is to take their data head-on.  You're probably
not going to get them to alter their preexisting bias since they
probably have their own reasons for being Linux evangelists, but you
can at least fight them to a stand-still on the comparative data
front.  Since FreeBSD is already entrenched there, that means you win
the battle, at least for now.  Winning the war will require that you
not get complacent and continue with your objective measurements to
prove (or disprove) FreeBSD's suitability for your needs.  In the
cases where you disprove it, at least the data is in friendly hands
and you can open back-channel communications with us to try and
address those shortcomings, whatever they may be.

To take your current list:

 a) A machine that has fast character operations

I think that's probably more architecture (machine) dependent than it
is a function of the OS.  A PC does a fine job at many things, but an
IBM 3090 it's not.  You should probably establish as compared to
what for this argument and see what Linux's numbers are; I suspect it
will quickly become a non-issue since the beancounters won't want to
spend the kind of money truly improving this would cost.

 b) A *supported* Oracle client

That's a gotcha, no doubt about it.  About the best you can probably
do here is show that the Linux Oracle client works just fine under
compatibility mode and determine just how many support calls you make
to Oracle with respect to their client (and not the server) software a
year.

 c) A filesystem that will be fast in light of tens of thousands of
files in a single directory (maybe even hundreds of thousands)

I think we can more than hold our own with UFS + soft updates.  This
is another area where you need to get hard numbers from the Linux
folks.  I think your assumption that Linux handles this effectively
is flawed and I'd like to see hard numbers which prove otherwise;
you should demand no less.

- Jordan




Re: technical comparison

2001-05-21 Thread Gordon Tetlow

On Mon, 21 May 2001, Jordan Hubbard wrote:

  c) A filesystem that will be fast in light of tens of thousands of
 files in a single directory (maybe even hundreds of thousands)

 I think we can more than hold our own with UFS + soft updates.  This
 is another area where you need to get hard numbers from the Linux
 folks.  I think your assumption that Linux handles this effectively
 is flawed and I'd like to see hard numbers which prove otherwise;
 you should demand no less.

Also point out the reliability factor here which is a bit harder to point
to a magic number and See, we *are* better! ext2 runs async by default
which can lead to nasty filesystem corruption in the event of a power
loss. With softupdates, the filesystem metadata will always be in sync and
uncorrupted (barring media failure of course).
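[The two defaults Gordon contrasts can both be flipped for an
apples-to-apples run; a sketch with hypothetical device and mount-point
names:]

```shell
# Linux: remount ext2 with synchronous writes instead of the async default
mount -o remount,sync /mnt/test

# FreeBSD: enable soft updates on an (unmounted) UFS filesystem
umount /mnt/test
tunefs -n enable /dev/da0s1e
mount /mnt/test
```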

-gordon





Re: technical comparison

2001-05-21 Thread Albert D. Cahalan


Gordon Tetlow writes:
 On Mon, 21 May 2001, Jordan Hubbard wrote:
 [Charles C. Figueiredo]

 c) A filesystem that will be fast in light of tens of thousands of
files in a single directory (maybe even hundreds of thousands)

 I think we can more than hold our own with UFS + soft updates.  This
 is another area where you need to get hard numbers from the Linux
 folks.  I think your assumption that Linux handles this effectively
 is flawed and I'd like to see hard numbers which prove otherwise;
 you should demand no less.

 Also point out the reliability factor here which is a bit harder to point
 to a magic number and See, we *are* better! ext2 runs async by default
 which can lead to nasty filesystem corruption in the event of a power
 loss. With softupdates, the filesystem metadata will always be in sync and
 uncorrupted (barring media failure of course).

It should be immediately obvious that ext2 is NOT the filesystem
being proposed, async or not. For large directories, ext2 sucks
as bad as UFS does. This is because ext2 is a UFS clone.

The proposed filesystem is most likely Reiserfs. This is a true
journalling filesystem with a radically non-traditional layout.
It is no problem to put millions of files in a single directory.
(actually, the all-in-one approach performs better than a tree)

XFS and JFS are similarly capable, but Reiserfs is well tested
and part of the official Linux kernel. You can get the Reiserfs
team to support you too, in case you want to bypass the normal
filesystem interface for even better performance.

So, no async here, and UFS + soft updates can't touch the
performance on huge directories.
