Re: technical comparison
Dave Hayes wrote: You can't make that assumption just yet (although it seems reasonable). We really don't know exactly what the problem they are trying to solve is. Network news sites running old versions of software (as an example, I know someone who still runs CNEWS) have very clear reasons for phenomena resembling 60,000 files in one directory. I think it's the "how can we come up with an artificial benchmark to prove the opinions we already have" problem... Right up there with the Polygraph web caching benchmark, which intentionally stacks the deck to test cache replacement, and in which the people who get the best results are those who cheat back and use random replacement instead of LRU or some other sane algorithm, since the test intentionally destroys locality of reference. People have made the same complaint about the lmbench micro-benchmarks, which test things which aren't really meaningful any more (e.g. NULL system call overhead, when we have things like kqueue, etc.). I'm largely unimpressed with benchmarks written to beat a particular drum for political reasons, rather than as a tool for optimizing something that's meaningful to real-world performance under actual load conditions. Call me crazy that way... -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: technical comparison
Andrew Reilly wrote: On Sat, May 26, 2001 at 07:25:16PM +1000, Andrew Reilly wrote: One of my personal mail folders has 4400 messages in it, and I've only been collecting that one for a few years. It's not millions, but it's a few more than the 500 that I've seen some discuss here as a reasonable limit (why is that reasonable?) and it's many many more than the 72 or so limit available in ADFS. I realised as soon as I pressed the send button that my current use of large directories for mail files doesn't actually involve any random access: the directory is read sequentially to build the header list. It is quite conceivable that a performance tweak to the IMAP server could involve a header cache in a relational database of some sort, and that would certainly contain references to the individual files, which would then be accessed randomly. /usr/ports/distfiles on any of the mirrors probably contains upwards of 5000 files too, and there is a strong likelihood that these will be accessed out-of-order by ports-makefile-driven fetch requests. Cyrus IMAP uses a header cache in precisely this way. And since the cache files are created very early on, they are early in the directory, and so do not suffer a large startup penalty. The searches for specific files would indeed be linear, but they would be O(1) linear for each file. As I said before, I replaced the FFS directory code with a trie-structured directory implementation. Using these n-ary structures, you could very quickly look up any individual file, and a linear traversal of the directory to iterate over all files (an increasingly common thing for visual file browsers to do) was still O(1) linear.
People didn't find the patches very useful, beyond them being an interesting curiosity, since in reality the problem of huge directories tends not to exist in nature, where code was written to deal with the limitations of S51K and similar FS's, and thus doesn't tend to do things like dump all of its large number of files into a single directory. A similar set of patches caused iterated filenames to have their vnodes prefaulted, which helps immensely in AppleTalk and SMB file serving, where the protocol demands stat data back at the same time because of assumptions about the host OS's files. This effectively enters them into the directory cache, and locality of reference keeps them there. This is a really simple hack. Then all you have to do is up your directory cache size to whatever your favorite unreasonable limit happens to be (e.g. 70,000), and everything becomes a cache hit, after the initial load-up. The second is still a clever hack (IMO), since it's still useful, but you'd want it to be a per-FS option, or at least minimally an ioctl() to set the option on a directory fd after opening it, so that an exported SMBFS share would have the behaviour, but your news spool would not. As I said before, however, the tradeoff of better performance on really obscene directories was not really worth the binary backward compatibility problems that resulted when switching a system over (in other words, it was an interesting research topic, but little more than that). I think the benchmark in question is pretty lame, and the things which it is attempting to prove will not occur in real systems, unless you are running pessimal code. -- Terry
Re: technical comparison
On Sun, 27 May 2001 22:50:48 -0300 (BRST), Rik van Riel [EMAIL PROTECTED] wrote: On Sat, 26 May 2001, Peter Wemm wrote: Which is more expensive? Maintaining an on-disk hashed (or b+tree) directory format for *everything*, or maintaining a simple low-cost format on disk with in-memory hashing for fast lookups? I bet that for modest directory sizes the cost of disk IO outweighs the added CPU usage by so much that you may as well take the trouble of using the more scalable directory format. I'm not sure I follow this. Reading sequentially is always going to be much faster than reading randomly. For a modest directory size, you run the risk that randomly accessing fewer blocks will actually take longer than just reading the entire directory sequentially. For the small directory case I suspect the FFS+namecache way is more cost effective. For the medium to large directory case (10,000 to 100,000 entries), I suspect the FFS+namecache method isn't too shabby, providing you are not starved for memory. The insanely large cases I don't want to think about :-). The ext2 fs, which uses roughly the same directory structure as UFS and has a name cache which isn't limited in size, seems to bog down at about 10,000 directory entries. As has been pointed out earlier, hash algorithms need a "maximum number of entries" parameter as part of their algorithm. Beyond some point, defined by this number, the hash will degenerate to (typically) O(N). It sounds like the Linux name cache hashing algorithm is not intended to handle so many directory entries. Daniel Phillips is working on a hash extension to ext2; not a replacement of the directory format, but a way to tack a hashed index onto the normal directory index. I think a tree structure is better than a hash because there is no inherent limit to its size (though the downside is O(log N) rather than close to fixed time).
It may be possible to build a tree structure around the UFS directory block structure in such a way that it would be backward compatible[1]. Of course, managing to correctly handle soft-updates write ordering for a tree re-balance is non-trivial. One point that hasn't come out so far is that reading a UFS filesystem is quite easy - hence boot2 can locate a loader or kernel by name within the root filesystem, rather than needing to hard-wire block numbers to load. If the directory structure does change, we need to ensure that it's possible to (possibly inefficiently) parse the structure in a fairly small amount of code. It also has the advantage of being able to keep using the tried-and-tested fsck utilities. Whatever is done, fsck would need to be enhanced to validate the directory structure, otherwise you could wind up with files that can't be found/deleted because they aren't where the hash/tree algorithm expects them. Suggestion for the "let's use the filesystem as a general-purpose relational database" crowd: A userland implementation of the existing directory search scheme (ignoring name caching) would be trivial (see /usr/include/ufs/ufs/dir.h and dir(5) for details). Modify postmark (or similar) to simulate the creation/deletion of files in a userland "directory" structure and demonstrate an algorithm that is faster for the massive directory case and doesn't pessimize small directories. The effects of the name cache and datafile I/O should be able to be ignored, since you just want to compare directory algorithms. [1] Keep entries within each block sorted. Reserve space at the end of the block for left and right child branch pointers and other overheads, with the left branch being less than the first entry in the block and the right branch being greater than the last entry. The reserved space is counted in d_reclen of the last entry (which makes it backward compatible). I haven't thought through the block splitting/merging algorithm, so this may not work.
Peter
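The userland experiment suggested above can be sketched quickly. The following is a hypothetical illustration only (the names `make_dir_entries`, `linear_lookup`, and `hashed_lookup` are invented here; a real harness would use the dir(5) record layout and postmark-style workloads): it compares the existing sequential-scan scheme against an in-memory hashed index over the same entries.

```python
# Hypothetical userland sketch: a UFS-style directory modelled as a flat
# list of (name, inode) records searched linearly, versus a hashed
# in-memory index (a dict) built over the same records.

def make_dir_entries(n):
    # analogous to dumping n files into a single directory
    return [("file%06d" % i, i) for i in range(n)]

def linear_lookup(entries, name):
    # the existing scheme: scan the directory sequentially, O(n) per lookup
    for ename, ino in entries:
        if ename == name:
            return ino
    return None

def hashed_lookup(index, name):
    # the namecache-style scheme: one hash probe, roughly O(1) per lookup
    return index.get(name)

entries = make_dir_entries(10000)
index = dict(entries)
assert linear_lookup(entries, "file009999") == 9999
assert hashed_lookup(index, "file009999") == 9999
```

Timing both lookups over many random names would show the crossover Peter describes: for small directories the linear scan is cheap enough that the simpler on-disk format wins.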
Re: technical comparison
Andrew Reilly wrote: It is quite conceivable that a performance tweak to the IMAP server could involve a header cache in a relational database of some sort, and that would certainly contain references to the individual files, which would then be accessed randomly. You might want to give mbox format a try. imap-uw will use this format if you perform a few tweaks described in the documentation that comes with it. Basically, instead of the mailbox being in plain text, it creates a type of database at the top of the file that describes the contents. Makes access much faster for large (> 1k letters) mailboxes.
Re: technical comparison
On Sun, 27 May 2001, Doug Barton wrote: Andrew Reilly wrote: It is quite conceivable that a performance tweak to the IMAP server could involve a header cache in a relational database of some sort, and that would certainly contain references to the individual files, which would then be accessed randomly. You might want to give mbox format a try. imap-uw will use this format if you perform a few tweaks described in the documentation that comes with it. Basically, instead of the mailbox being in plain text, it creates a type of database at the top of the file that describes the contents. Makes access much faster for large (> 1k letters) mailboxes. What you are suggesting sounds like something that Cyrus-IMAP has already done, using Berkeley DB ... loading up several thousand emails and sorting them takes no time ...
Re: technical comparison
On Sat, 26 May 2001, Peter Wemm wrote: Which is more expensive? Maintaining an on-disk hashed (or b+tree) directory format for *everything*, or maintaining a simple low-cost format on disk with in-memory hashing for fast lookups? I bet that for modest directory sizes the cost of disk IO outweighs the added CPU usage by so much that you may as well take the trouble of using the more scalable directory format. For the small directory case I suspect the FFS+namecache way is more cost effective. For the medium to large directory case (10,000 to 100,000 entries), I suspect the FFS+namecache method isn't too shabby, providing you are not starved for memory. The insanely large cases I don't want to think about :-). The ext2 fs, which uses roughly the same directory structure as UFS and has a name cache which isn't limited in size, seems to bog down at about 10,000 directory entries. Daniel Phillips is working on a hash extension to ext2; not a replacement of the directory format, but a way to tack a hashed index onto the normal directory index. This way the filesystem is backward compatible: older kernels will just use the old directory format and will clear a flag when they write to the directory, which can later be used by the new kernel to rebuild the hashed directory index. It also has the advantage of being able to keep using the tried-and-tested fsck utilities. Maybe this could be an idea to enhance UFS scalability for huge directories without endangering reliability? regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://distro.conectiva.com/ Send all your spam to [EMAIL PROTECTED] (spam digging piggy)
Re: technical comparison
On Fri, May 25, 2001 at 08:49:21PM +, Terry Lambert wrote: There is _no_ performance problem with the existing implementation, if you treat postgres as the existing implementation; it will do what you want, quickly and effectively, for millions of record keys. Does postgres make a good mail archive database? Can it handle arbitrary record lengths? It couldn't the last time I looked at it. Why are you treating an FS as if it were a relational database? It is a tool intended to solve an entirely different problem set. I'm not treating it as a relational database. But if mail messages aren't conceptually files, then I don't know what they are. One of my personal mail folders has 4400 messages in it, and I've only been collecting that one for a few years. It's not millions, but it's a few more than the 500 that I've seen some discuss here as a reasonable limit (why is that reasonable?) and it's many many more than the 72 or so limit available in ADFS. I changed over to Maildirs because I like the fact that I can use normal Unix file search and manipulation programs on individual messages, as well as a wider set of MUAs (thanks to Courier IMAP...), and because folder opening doesn't bog down when there are a couple of messages with really large attachments in them, the way mbox folders do. You are bitching about your hammer not making a good screwdriver. If the file system isn't a good place to store files, then what is it good for? Source code trees only? There are application-specific databases available for that too. What have you got left? -- Andrew
Re: technical comparison
On Sat, May 26, 2001 at 07:25:16PM +1000, Andrew Reilly wrote: One of my personal mail folders has 4400 messages in it, and I've only been collecting that one for a few years. It's not millions, but it's a few more than the 500 that I've seen some discuss here as a reasonable limit (why is that reasonable?) and it's many many more than the 72 or so limit available in ADFS. I realised as soon as I pressed the send button that my current use of large directories for mail files doesn't actually involve any random access: the directory is read sequentially to build the header list. It is quite conceivable that a performance tweak to the IMAP server could involve a header cache in a relational database of some sort, and that would certainly contain references to the individual files, which would then be accessed randomly. /usr/ports/distfiles on any of the mirrors probably contains upwards of 5000 files too, and there is a strong likelihood that these will be accessed out-of-order by ports-makefile-driven fetch requests. -- Andrew
Re: technical comparison
Andrew Reilly [EMAIL PROTECTED] writes: Where in open(2) does it specify a limit on the number of files permissible in a directory? The closest that it comes, that I can see, is: Well, read(2) doesn't tell you not to do your IO one character at a time, but that doesn't mean it's a good idea. The point here is not interface definitions, it's efficiency. Nobody's saying you shouldn't be _allowed_ to put thousands and thousands of files in a directory if you like. They're just saying that you shouldn't expect it to be fast. Similarly, you can read data one byte at a time if you like, but you shouldn't expect that to be fast either. Pointing to manpages and saying you weren't warned that a particular approach is slow is a really weak defense. Do you expect cliffs to have little "If you drive off this cliff, you will die" warning signs on them? If a documented part of the API simply did not work, then you'd have a point. Instead, what we have is a case where a method of storing files that most people reasonably expect to be slow is in fact slow. The folks who've pointed out the /a/a/aardvark solution are right -- directory hashing is a well-known solution to this problem. It isn't a hack at all. No matter what method you use for storing directories, larger directories are going to be slower to use than smaller ones, and hashing filenames fixes that. --nat -- nat lanza [EMAIL PROTECTED] http://www.cs.cmu.edu/~magus/ "there are no whole truths; all truths are half-truths" -- alfred north whitehead
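The /a/a/aardvark scheme mentioned above is simple enough to sketch. This is a hypothetical illustration (the function name `fanout_path` and the padding rule for short names are invented here, not from any particular tool): the leading characters of a filename become intermediate directory names, so no single directory grows large.

```python
def fanout_path(name, depth=2):
    # /a/a/aardvark-style directory hashing: the first `depth` characters
    # of the filename become nested directory names, keeping each
    # directory small. Names shorter than `depth` are padded with "_"
    # (an arbitrary choice for this sketch).
    parts = [name[i] if i < len(name) else "_" for i in range(depth)]
    return "/".join(parts + [name])

assert fanout_path("aardvark") == "a/a/aardvark"
```

A mail store using this would keep each of its thousands of message files two directory levels down, so any single directory holds only a small fraction of the total.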
Re: technical comparison
Andrew Reilly writes: On Fri, May 25, 2001 at 08:49:21PM +, Terry Lambert wrote: There is _no_ performance problem with the existing implementation, if you treat postgres as the existing implementation; it will do what you want, quickly and effectively, for millions of record keys. Does postgres make a good mail archive database? Yes; I am looking at it to replace my mail storage. Can it handle arbitrary record lengths? It couldn't the last time I looked at it. Now (7.1) it can. -- @BABOLO http://links.ru/
Re: technical comparison
Andrew Reilly writes: /usr/ports/distfiles on any of the mirrors probably contains upwards of 5000 files too, and there is a strong likelihood that these will be accessed out-of-order by ports-makefile-driven fetch requests. Oh! You picked a good example! 0cicuta~(13) /bin/ls /usr/ports/distfiles/ | wc 9672 9672 198244 -- @BABOLO http://links.ru/
Re: technical comparison
[EMAIL PROTECTED] wrote: Andrew Reilly writes: /usr/ports/distfiles on any of the mirrors probably contains upwards of 5000 files too, and there is a strong likelihood that these will be accessed out-of-order by ports-makefile-driven fetch requests. Oh! You picked a good example! 0cicuta~(13) /bin/ls /usr/ports/distfiles/ | wc 9672 9672 198244 ... Which is almost entirely stored in the name cache, which is hashed. Once you scan the directory for the first time, the entries are pre-inserted into the hash. This cache is very long-lived and is quite effective at dealing with this sort of thing, especially if you have plenty of memory and have vfs.vmiodirenable=1 turned on. While it may not scale too well to directories with millions of files, it certainly deals well with tens of thousands of files. We have recently made improvements to the hashing algorithms to get better dispersion on small and iterative filenames, e.g. 00, 01, 02 ... FF. It is not perfect, but it is a hell of a lot better than the false assumption that the linear search method is the usual case. Which is more expensive? Maintaining an on-disk hashed (or b+tree) directory format for *everything*, or maintaining a simple low-cost format on disk with in-memory hashing for fast lookups? For the small directory case I suspect the FFS+namecache way is more cost effective. For the medium to large directory case (10,000 to 100,000 entries), I suspect the FFS+namecache method isn't too shabby, providing you are not starved for memory. The insanely large cases I don't want to think about :-). Cheers, -Peter -- Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] "All of this is for nothing if we don't go to the stars" - JMS/B5
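The dispersion property Peter describes can be illustrated with a toy hash. This sketch uses 32-bit FNV-1a purely for illustration (the FreeBSD namecache uses a different hash function; the bucket count here is also arbitrary): even fully sequential names like "00" through "FF" spread across many buckets instead of clustering.

```python
def fnv1a_32(s):
    # 32-bit FNV-1a string hash, used here only as an illustrative
    # example of a hash with good dispersion on short, similar names.
    h = 0x811C9DC5                       # FNV offset basis
    for byte in s.encode("ascii"):
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # FNV prime, modulo 2^32
    return h

NBUCKETS = 64
names = ["%02X" % i for i in range(256)]   # "00" .. "FF", iterative names
buckets_hit = {fnv1a_32(n) % NBUCKETS for n in names}
# sequential names still land in many distinct buckets
assert len(buckets_hit) >= 16
```

A hash with poor dispersion would map such names into a handful of buckets, degrading namecache chains back toward linear scans, which is exactly what the improvements were meant to avoid.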
Benchmarking FreeBSD (was Re: technical comparison)
Jordan Hubbard [EMAIL PROTECTED] writes: Erm, folks? Can anyone please tell me what this has to do with freebsd-hackers any longer? While the thread has diverged from its original intent, there is something related I consider to be a more interesting topic. If it's still not appropriate for hackers, please let me know. When people are doing benchmarks, I noted that there are -lots- of little sysctl tweaks or kernel tweaks that tend to make a big difference in the results. I know it is possible to define some sort of abstraction that uniquely specifies a complete (and/or relevant) set of these tweaks when comparing benchmarks. Does this already exist, and if not, how hard would it be to catalog every single relevant tunable parameter in a FreeBSD system? -- Dave Hayes - Consultant - Altadena CA, USA - [EMAIL PROTECTED] The opinions expressed above are entirely my own "War doesn't determine who's right. War determines who's left." -Confucius
Re: technical comparison
Greg Black writes: Andresen,Jason R. wrote: | On Thu, 24 May 2001, void wrote: | | On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote: | | Why is knowing the file names cheating? It is almost certain | that the application will know the names of its own files | (and won't be grepping the entire directory every time it | needs to find a file). | | With 60,000 files, that would have the application duplicating | 60,000 pieces of information that are stored by the operating system. | Operations like open() and unlink() still have to search the directory | to get the inode, so there isn't much incentive for an application to | do that, I think. | | This still doesn't make sense to me. It's not like the program is going | to want to do a find on the directory every time it has some data it | wants to put somewhere. I think for the majority of the cases (I'm sure | there are exceptions) an application program that wants to interact with | files will know what filename it wants ahead of time. This doesn't | necessarily mean storing 60,000 filenames either, it could be something | like: | I have files fooX where X is a number from 0 to 6 in that | directory. I need to find a piece of information, so I run that | information through a hash of some sort and determine that the file I want | is number 23429, so I open that file. And if this imaginary program is going to do that, it's equally easy to use a multilevel directory structure, and that will make the life of all users of the system simpler. There's no real excuse for directories with millions (or even thousands) of files. There is. You assume that names are random. Assume that they are not. VERY old example: a aa ... aaa...aaa 255 times aaa...aab and so on. Yes, I know: hash. Is it practical to do this in every application (sometimes it is unknown before practical use whether directories will become big) instead of in one file system? Sorry for a bad English.
-- @BABOLO http://links.ru/
Re: Benchmarking FreeBSD (was Re: technical comparison)
:Jordan Hubbard [EMAIL PROTECTED] writes: : Erm, folks? Can anyone please tell me what this has to do with : freebsd-hackers any longer? : :While the thread has diverged from its original intent, there is :something related I consider to be a more interesting topic. If it's :still not appropriate for hackers, please let me know. : :When people are doing benchmarks, I noted that there are -lots- of :little sysctl tweaks or kernel tweaks that tend to make a big :difference in the results. : :I know it is possible to define some sort of abstraction that uniquely :specifies a complete (and/or relevant) set of these tweaks when :comparing benchmarks. Does this already exist, and if not, how hard :would it be to catalog every single relevant tunable parameter in a :FreeBSD system? :-- :Dave Hayes - Consultant - Altadena CA, USA - [EMAIL PROTECTED] : The opinions expressed above are entirely my own Well, it's been done before. The problem is that the landscape changes every time we do a new release. I did a 'security' man page a while ago (which is still mostly relevant). I suppose I could do a 'performance' man page. -Matt
Re: technical comparison
I would have sent this to the original author if he had used a proper email address on his post; sorry to those who don't want to see it. | | I have files fooX where X is a number from 0 to 6 in that | | directory. I need to find a piece of information, so I run that | | information through a hash of some sort and determine that the file I want | | is number 23429, so I open that file. | | And if this imaginary program is going to do that, it's equally | easy to use a multilevel directory structure and that will make | the life of all users of the system simpler. There's no real | excuse for directories with millions (or even thousands) of | files. | There is. | You assume that names are random. | Assume that they are not. | VERY old example: | a | aa | ... | aaa...aaa 255 times | aaa...aab | so on. | Yes, I know: hash. | | Is it practical to do this in every application | (sometimes it is unknown before practical use | if directories become big) instead in | one file system? Any real programmer has tools that make this trivial. I keep a pathname hashing function and a couple of standalone programs that exercise it from shell scripts in my toolbox, and can stitch them into anything that needs fixing in no time. My code allows for nearly 1.3 trillion names in a six-level hierarchy if you can limit yourself to about 500 names per directory, but can be easily extended for really idiotic uses. | Sorry for a bad English. We can live with that, but it's a bit rude to send messages out without a valid From address.
Re: technical comparison
] ] 1. I don't think I've ever seen a Linux distro which has write ] ] caching enabled by default. Hell, DMA33 isn't even enabled ] ] by default ;) ] ] ] ] You are talking about controlling the IDE drive cache. ] ] ] ] The issue here is write cache in the filesystem code. ] ] No. The issue here is the write cache on the drive. ] FreeBSD with soft updates will operate within 4% of the top memory ] bandwidth; see the Ganger/Patt paper on the technology. ] ] I have a file, CSE-TR-254-95.ps, that I think is probably the paper ] you are talking about. The title is "Soft Updates: A Solution to the ] Metadata Update Problem in File Systems". The link on Ganger's page was ] dead, but I'm sure this is the one you mean. ] ] Nowhere do they support the idea that soft updates can approach a ] system's memory bandwidth. I said "top memory bandwidth", not "a system's memory bandwidth"; please be more careful. Quoting from section 6, "Conclusions and Future Work": "We have described a new mechanism, soft updates, that can be used to achieve memory-based file system performance while providing stronger integrity and security guarantees (e.g. allocation initialization) and higher availability (via shorter recovery times) than most UNIX file systems. This translates into a performance improvement of more than a factor of 2 in many cases (up to a maximum observed difference of a factor of 15)." Terry Lambert [EMAIL PROTECTED] --- Any opinions in this posting are my own and not those of my present or previous employers.
Re: technical comparison
] Nothing in Unix stops you from putting millions of files in a ] directory. There are (I maintain _obviously_) good reasons to ] want to do that. The only thing that stops you is that _some_ ] Unix platforms, using _some_ file systems, behave badly if you ] do that. There are _no_ good reasons for using an FS as if the directory structure were a key file, file names keys, and file contents data records in a relational database. We have things which were built precisely for this type of use. We call them relational databases. ] They should be fixed. Feel free to submit patches, so long as they do not damage any backward compatibility, and do not compromise performance under normal workloads just to pass some obscure test that someone has devised to prove one FS is better than another by doing ridiculous things which will never happen except in special purpose situations, in which special purpose tools are a better fit. Terry Lambert [EMAIL PROTECTED] --- Any opinions in this posting are my own and not those of my present or previous employers.
Re: technical comparison
] It's got nothing to do with the basics of software engineering or ] computer science. It's got to do with interface definitions and ] APIs. ] ] Where in open(2) does it specify a limit on the number of files ] permissible in a directory? The closest that it comes, that I can ] see is: [ ... ] ] All of which quite clearly indicate that if one wants to put all ] of one's allocation of blocks or inodes into a single directory, ] then one can, as long as the name and path length limits are ] observed. UNIX, in not preventing you from doing stupid things, permits you to do clever things which other operating systems do not permit. I maintain that just because you are not administratively prohibited from doing stupid things, that in no way makes doing those things less stupid. ] You're welcome to claim a documentation bug, and add the ] appropriate caveat. It seems clear to me that Hans Reiser (and ] Silicon Graphics before him) have taken the more obvious approach, ] of attempting to remove the performance limitation inherent in the ] existing implementation. "The performance limitation"? Get your story straight: is there a limitation, or isn't there? ] You can moan about tree-structured vs relational databases, but if ] your problem space doesn't intrinsically map to a tree, then it ] doesn't stop the tree-structuring transformation that Terry ] mentioned from being a gratuitous hack to work around a ] performance problem with the existing implementation. It is not a performance problem with the existing implementation, it is pilot error. There is _no_ performance problem with the existing implementation, if you treat postgres as the existing implementation; it will do what you want, quickly and effectively, for millions of record keys. Why are you treating an FS as if it were a relational database? It is a tool intended to solve an entirely different problem set. You are bitching about your hammer not making a good screwdriver.
Terry Lambert [EMAIL PROTECTED] --- Any opinions in this posting are my own and not those of my present or previous employers.
Re: technical comparison
One word: B+Tree. Hash tables work well if the entire hash table fits into memory and you know (approximately) what the upper limit on records is going to be. If you don't, then a B+Tree is the only proven way to go. (Sure, there are plenty of other schemes, some hybrid, some completely different, but B+Trees have long been proven, so unless you want to experiment, just use one.) In general I agree that UFS's only major pitfall is the sequential directory scanning. The reality, though, is that very few programs actually need to create thousands or millions of files in a single directory. The biggest one used to be USENET news, but that has shifted into multi-article files and isn't an issue any more. Now the biggest one is probably squid. Databases are big storage-wise, but don't usually require lots of files. -Matt
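The B+Tree argument above rests on keeping keys sorted so each node can be binary-searched. This is not a full B+Tree, just a hypothetical sketch of the node-level search step it relies on (the key names are made up): because every node page stays small and sorted, lookups over the whole tree remain O(log N) even with no a priori bound on the number of entries.

```python
from bisect import bisect_left

def node_search(keys, key):
    # Binary search within one sorted B+Tree node page. In a real tree
    # a miss would descend to a child page; here we just report it.
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return i
    return None

# zero-padded names so lexicographic order matches numeric order
keys = sorted("file%05d" % i for i in range(0, 1000, 7))
assert node_search(keys, "file00007") == 1
assert node_search(keys, "file00008") is None
```

Contrast this with a hash table: the sorted-node scheme needs no "maximum number of entries" parameter, which is exactly why it degrades gracefully where a fixed-size hash degenerates to O(N).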
Re: technical comparison
Ultimately something like Reiser will win over UFS, but performance figures aren't the whole picture. Most of the bugs have been worked out of UFS and the recovery tools are extremely mature. Only a handful of edge cases have been found in the last decade. Nearly all the bugs in the last few years have turned out to be buffer cache or VM bugs rather than filesystem bugs. ReiserFS has a long way to go before it can be safely used on production systems. Linux, having just moved to a totally new VM system, also has a long way to go (and, for the same reason, FreeBSD-5 has a long way to go before it can safely be used in production). When Reiser starts to get close, I'll be the first one to port it to FreeBSD :-) Consider for a moment the development roadmap for UFS, EXT2FS, and REISERFS. It took UFS and its supporting tools years to get as good as it is for production purposes. It has taken EXT2FS a number of years to reach where it is. ReiserFS is new, and it is going to be a while. -Matt
Re: technical comparison
On Thu, 24 May 2001, void wrote: On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote: Why is knowing the file names cheating? It is almost certain that the application will know the names of its own files (and won't be grepping the entire directory every time it needs to find a file). With 60,000 files, that would have the application duplicating 60,000 pieces of information that are stored by the operating system. Operations like open() and unlink() still have to search the directory to get the inode, so there isn't much incentive for an application to do that, I think. This still doesn't make sense to me. It's not like the program is going to want to do a find on the directory every time it has some data it wants to put somewhere. I think for the majority of the cases (I'm sure there are exceptions) an application program that wants to interact with files will know what filename it wants ahead of time. This doesn't necessarily mean storing 60,000 filenames either, it could be something like: I have files fooX where X is a number from 0 to 6 in that directory. I need to find a piece of information, so I run that information through a hash of some sort and determine that the file I want is number 23429, so I open that file. I don't expect programs to try to offload this sort of information on the filesystem. Do you have an example of a program that interacts with the filesystem without knowing the names of the files it wants?
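The hashing scheme Jason describes can be sketched in a few lines. Everything here is illustrative (the file count, the name prefix, and the choice of MD5 are not from the original post); the point is that the application computes the name instead of scanning the directory or storing 60,000 names itself.

```python
import hashlib

NUM_FILES = 60000  # illustrative fan-out chosen by the application

def file_for_key(key):
    """Map an arbitrary key to one of NUM_FILES fixed file names.
    The application never needs to list the directory or remember
    which names exist: the name is derived from the key."""
    h = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return "foo%d" % (h % NUM_FILES)

# The same key always maps to the same file name, so no lookup
# state has to be duplicated from the filesystem.
assert file_for_key("some record") == file_for_key("some record")
```

Note this only removes the application's need to *enumerate* names; as void points out, open() and unlink() on the kernel side still have to search the directory for the computed name.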
Re: technical comparison
On Wed, 23 May 2001, Shannon wrote: On Wed, May 23, 2001 at 10:54:40PM -0300, Rik van Riel wrote: 1. I don't think I've ever seen a Linux distro which has write caching enabled by default. Hell, DMA33 isn't even enabled by default ;) You are talking about controlling the IDE drive cache. The issue here is write cache in the filesystem code. 1) IIRC they were talking about hw.ata.wc 2) soft-updates _is_ a form of write cache in the filesystem code, in fact, that's one of the points of soft-updates in the first place ;) regards, Rik -- Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com/ To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: technical comparison
In message [EMAIL PROTECTED] Jason Andresen writes: : If only FreeBSD could boot from those funky M-Systems flash disks. We boot FreeBSD off of M-Systems flash disks all the time. Don't know what the problem is with your boxes. Warner To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: technical comparison
On Thu, May 24, 2001 at 12:25:59PM -0300, Rik van Riel wrote: On Wed, 23 May 2001, Shannon wrote: On Wed, May 23, 2001 at 10:54:40PM -0300, Rik van Riel wrote: 1. I don't think I've ever seen a Linux distro which has write caching enabled by default. Hell, DMA33 isn't even enabled by default ;) You are talking about controlling the IDE drive cache. The issue here is write cache in the filesystem code. 1) IIRC they were talking about hw.ata.wc In a subthread, yeah. I think, though, the overall issue is the caching ext2 does that ufs does not. I'm not even sure that soft updates is quite the same thing. I think the soft-updates paper mentions that it shouldn't increase risk, while a lot of people feel like ext2 is very risky. I never really notice a big difference when I turn on write caching with my system (on the hard drive). It's been a while since I did any benchmarks though, since I no longer run IDE drives on most systems. You can control the cache on them too with the right scsi tools, but I've not really messed with it. -- There is no such thing as security. Life is either bold adventure, or it is nothing. -- Helen Keller
Re: technical comparison
Andresen,Jason R. wrote: | On Thu, 24 May 2001, void wrote: | | On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote: | | Why is knowing the file names cheating? It is almost certain | that the application will know the names of its own files | (and won't be grepping the entire directory every time it | needs to find a file). | | With 60,000 files, that would have the application duplicating | 60,000 pieces of information that are stored by the operating system. | Operations like open() and unlink() still have to search the directory | to get the inode, so there isn't much incentive for an application to | do that, I think. | | This still doesn't make sense to me. It's not like the program is going | to want to do a find on the directory every time it has some data it | wants to put somewhere. I think for the majority of the cases (I'm sure | there are exceptions) an application program that wants to interact with | files will know what filename it wants ahead of time. This doesn't | necessarily mean storing 60,000 filenames either, it could be something | like: | I have files fooX where X is a number from 0 to 6 in that | directory. I need to find a piece of information, so I run that | information through a hash of some sort and determine that the file I want | is number 23429, so I open that file. And if this imaginary program is going to do that, it's equally easy to use a multilevel directory structure and that will make the life of all users of the system simpler. There's no real excuse for directories with millions (or even thousands) of files.
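Greg's multilevel suggestion is the layout squid (and later git-style object stores) uses: a couple of leading hash characters pick the subdirectory, bounding the size of any single directory. A hypothetical sketch, with the hash and fan-out depth as assumptions:

```python
import hashlib
import os

def multilevel_path(root, name, depth=2):
    """Fan files out across nested subdirectories keyed on a hash
    of the name, so no single directory grows without bound."""
    h = hashlib.md5(name.encode("utf-8")).hexdigest()
    # e.g. 'a' and '3' for depth=2, giving root/a/3/name
    return os.path.join(root, *[h[i] for i in range(depth)], name)

# 60,000 files spread over 16 * 16 = 256 hex buckets leave only a
# couple of hundred entries per directory -- well within the range
# where a linear directory scan is cheap.
```

Since the subdirectory is recomputed from the name on every access, the application still needs no directory listing; it trades one big scan for two tiny ones plus a hash.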
Re: technical comparison
Greg Black wrote: Andresen,Jason R. wrote: | This still doesn't make sense to me. It's not like the program is going | to want to do a find on the directory every time it has some data it | wants to put somewhere. I think for the majority of the cases (I'm sure | there are exceptions) an application program that wants to interact with | files will know what filename it wants ahead of time. This doesn't | necessarily mean storing 60,000 filenames either, it could be something | like: | I have files fooX where X is a number from 0 to 6 in that | directory. I need to find a piece of information, so I run that | information through a hash of some sort and determine that the file I want | is number 23429, so I open that file. And if this imaginary program is going to do that, it's equally easy to use a multilevel directory structure and that will make the life of all users of the system simpler. There's no real excuse for directories with millions (or even thousands) of files. No, there is no excuse, however some third party application (FOR WHICH YOU DO NOT HAVE THE SOURCE[1]) may do it anyway. In the original parent of this post that was the exact situation. It would be nice if everybody followed the rules and played nice, but it is just something you can't count on in real life. [1] Emphasis added because for people in the Free Software business, it is easy to forget that you don't always have access to the source code, and convincing a company to rewrite their product because it doesn't like your (almost certainly unsupported) OS smacks of futility.
Re: technical comparison
In a message dated 05/23/2001 5:04:36 PM Eastern Daylight Time, [EMAIL PROTECTED] writes: Tell them to fire 20K packets/second at the linux box and watch it crumble. Linux has lots of little kludges to make it appear faster on some benchmarks, but from a networking standpoint it can't handle significant network loads. Are you sure this is still true? The 2.4.x series kernel was supposed to have significant networking improvements over the previous kernels. I don't know, but I doubt it. The problem isn't the networking performance, it's the inability of the memory system and the ethernet drivers to handle overloads properly. They are modeled in a way that fails in practice. Bryan
Re: technical comparison
] Terry Lambert writes: ] ] I don't understand the inability to perform the trivial ] design engineering necessary to keep from needing to put ] 60,000 files in one directory. ] ] However, we can take it as a given that people who need ] to do this are incapable of doing computer science. ] ] One could say the same about the design engineering necessary ] to handle 60,000 files in one directory. You're making excuses. No, I'm not. I released trie patches for FreeBSD directory storage in 1995. No one thought they were very useful, because only morons would treat a filesystem as if it were a database, instead of using a database as a database. If you want to get technical, a filesystem is a form of a database... but it's a _hierarchical_ database, like DNS or LDAP, and trying to use it as a _relational_ database, with key/value pairs, is still a stupid idea. Use the right tool for the job. ] People _want_ to do this, and it often performs better on ] a modern filesystem. This is not about need; it's about ] keeping ugly hacks out of the app code. ] ] http://www.namesys.com/5_1.html I'm glad you said people want to do this instead of saying computer professionals want to do this. The 60,000 file benchmark is meaningless to a properly designed system. ] (the rationale behind this last is that people who can't ] design around needing 60,000 files in a single directory ] are probably going to be unable to correctly remember ] the names of the files they created, since if they could, ] then they could remember things like ./a/a/aardvark or ] ./a/b/abominable). ] ] Eeew. ./a/b/abominable is a disgusting old hack used to ] work around traditional filesystem deficiencies. No, it's a hack to work around being too damn lazy to use a database where it makes sense to use a database. Terry Lambert [EMAIL PROTECTED] --- Any opinions in this posting are my own and not those of my present or previous employers.
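Terry's actual 1995 trie patches aren't shown here, but the idea can be modeled in a toy sketch: with trie-structured directory storage, resolving a name costs time proportional to the length of the name, independent of how many other entries the directory holds.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.inode = None    # set when a complete name ends here

class DirTrie:
    """Toy model of trie-structured directory storage: lookup is
    O(len(name)) regardless of how many entries exist, unlike the
    O(n) sequential scan of a classic FFS directory."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, name, inode):
        node = self.root
        for ch in name:
            node = node.children.setdefault(ch, TrieNode())
        node.inode = inode

    def lookup(self, name):
        node = self.root
        for ch in name:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.inode

d = DirTrie()
d.insert("aardvark", 1201)     # name -> inode number (made up)
assert d.lookup("aardvark") == 1201
assert d.lookup("abominable") is None
```

A sequential walk over such a structure still visits every entry once, which matches the earlier observation that full-directory iteration stays linear while individual lookups get fast.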
Re: technical comparison
] 1. I don't think I've ever seen a Linux distro which has write ] caching enabled by default. Hell, DMA33 isn't even enabled ] by default ;) ] ] You are talking about controlling the IDE drive cache. ] ] The issue here is write cache in the filesystem code. No. The issue here is the write cache on the drive. FreeBSD with soft updates will operate within 4% of the top memory bandwidth; see the Ganger/Patt paper on the technology. Terry Lambert [EMAIL PROTECTED] --- Any opinions in this posting are my own and not those of my present or previous employers. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
RE: technical comparison
From: Greg Black [mailto:[EMAIL PROTECTED]] And if this imaginary program is going to do that, it's equally easy to use a multilevel directory structure and that will make the life of all users of the system simpler. There's no real excuse for directories with millions (or even thousands) of files. While I agree completely that there's no excuse for applications that behave like that, a filesystem that scales well under these harsh conditions will serve us all better in the long run. Charles To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: technical comparison
Shannon Hendrix wrote: You are talking about controlling the IDE drive cache. The issue here is write cache in the filesystem code. 1) IIRC they were talking about hw.ata.wc In a subthread, yeah. I think though, the overall issue is the caching ext2 does that ufs does not. I'm not even sure that soft updates is quite the same thing. I think the soft-updates paper mentions that it shouldn't increase risk, while a lot of people feel like ext2 is very risky. Actually, no. Someone *specifically* mentioned that FreeBSD 4.3-RELEASE disables hardware caching on IDE, and Linux does not. -- Daniel C. Sobral(8-DCS) [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] wow regex humor... I'm a geek To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: technical comparison
Jason Andresen wrote: And if this imaginary program is going to do that, it's equally easy to use a multilevel directory structure and that will make the life of all users of the system simpler. There's no real excuse for directories with millions (or even thousands) of files. No, there is no excuse, however some third party application (FOR WHICH YOU DO NOT HAVE THE SOURCE[1]) may do it anyway. In the original parent of this post that was the exact situation. It would be nice if everybody followed the rules and played nice, but it is just something you can't count on in real life. Uhhh, no. The original message was remarking about a software development team which repeatedly failed to deliver the product to the specs asked for, and that said team blamed FreeBSD and wanted Linux instead. So the comment applies. -- Daniel C. Sobral(8-DCS) [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] wow regex humor... I'm a geek
Re: technical comparison
On Thu, May 24, 2001 at 05:00:44PM -0400, [EMAIL PROTECTED] wrote: Linux has lots of little kludges to make it appear faster on some benchmarks, but from a networking standpoint it can't handle significant network loads. Are you sure this is still true? The 2.4.x series kernel was supposed to have significant networking improvements over the previous kernels. I don't know, but I doubt it. There were significant network and memory improvements in the 2.4 release. There were also some improvements that will have to wait for the next release, but overall it is much improved. FreeBSD 4.3 is much improved over 2.x and 3.x, so I'm not sure why that would be considered unusual or surprising. The memory system in Linux is still set up by default to give more speed at the expense of smooth load handling. It seems better, but you have to go into /proc and tune things to get better load handling. The problem isn't the networking performance, it's the inability of the memory system and the ethernet drivers to handle overloads properly. They are modeled in a way that fails in practice. The way I understood it was certain drivers were more affected by this than others. Some were just fine, and handled very high loads. Another problem was multiple ethernet cards, but I forgot what caused that. A lot of that was addressed in the 2.4 release, and it seems to have made a lot of people happier. I can't test the difference because I have nothing but 10mbit ethernet. However, the 2.4 kernel is definitely faster in my day-to-day work, and has allowed me to delay a complete move to FreeBSD 4.x on my workstation. It was that much of a step forward. Now I can wait until I get proper 3D support for my nVidia graphics card. -- We have nothing to prove -- Alan Dawkins
Re: technical comparison
On Thu, May 24, 2001 at 04:42:02PM -0600, Charles Randall wrote: From: Greg Black [mailto:[EMAIL PROTECTED]] There's no real excuse for directories with millions (or even thousands) of files. While I agree completely that there's no excuse for applications that behave like that, a filesystem that scales well under these harsh conditions will serve us all better in the long run. TANSTAAFL. It's not obvious that you can get such scaling for free. If the tradeoffs to make the system perform a dumb task well mean that it won't perform a sane task well, you lose. PGP signature
Re: technical comparison
On Thu, May 24, 2001 at 10:34:26PM +, Terry Lambert wrote: ] 1. I don't think I've ever seen a Linux distro which has write ] caching enabled by default. Hell, DMA33 isn't even enabled ] by default ;) ] ] You are talking about controlling the IDE drive cache. ] ] The issue here is write cache in the filesystem code. No. The issue here is the write cache on the drive. FreeBSD with soft updates will operate within 4% of the top memory bandwidth; see the Ganger/Patt paper on the technology. I have a file, CSE-TR-254-95.ps, that I think is probably the paper you are talking about. The title is Soft Updates: A Solution to the Metadata Update Problem in File Systems. The link on Ganger's page was dead, but I'm sure this is the one you mean. Nowhere do they support the idea that soft udpates can approach a system's memory bandwidth. What they did say was that in _one_ case, creating and then immediately deleting a directory entry, you are operating at processor/memory speeds. They said soft updates in that case were 6 times faster than the conventional system. That's not even close to the memory bandwidth of the 486 system they were using, so they had to mean the filesystem code in that test was able to run without waiting on I/O. In the more general cases, their findings were more than a factor of two compared to synchronous write ufs. I _wish_ my workstation was able to write metadata at nearly 1GB/s all the time... :) -- Star Wars Moral Number 17: Teddy bears are dangerous in herds. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: technical comparison
On Fri, May 25, 2001 at 06:17:33AM +1000, Greg Black wrote: the life of all users of the system simpler. There's no real excuse for directories with millions (or even thousands) of files. One of the things that I've always liked about Unix was that there aren't as many arbitrary limits on what you can do and how you can do it, as there are on other platforms. For example, I once used an Acorn Archimedes computer, which had an OS called RISC-OS. The advanced disk filing system, ADFS, had some cute limits built in: no more than 10 characters in a file name, and no more than 70 (?memory fades) files in a directory. Nothing in Unix stops you from putting millions of files in a directory. There are (I maintain _obviously_) good reasons to want to do that. The only thing that stops you is that _some_ Unix platforms, using _some_ file systems, behave badly if you do that. They should be fixed. -- Andrew
Re: technical comparison
Andrew Reilly wrote: | On Fri, May 25, 2001 at 06:17:33AM +1000, Greg Black wrote: | the life of all users of the system simpler. There's no real | excuse for directories with millions (or even thousands) of | files. | | [...] | | Nothing in Unix stops you from putting millions of files in a | directory. This is just not true. For the vast majority of the systems that have ever been called Unix, attempting to put millions of files into a directory would be an utter disaster. No ifs or buts. It might be nice if this were different, although I see no good reason to support it myself, but it's generally not a serious possibility and so applications that depend on being able to do that are plain stupid. Their authors are either too lazy to make their use of the file system a bit more sensible or too stupid to know that file systems are not databases. The right answer is to write applications with some understanding of the basics of software engineering or computer science. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: technical comparison
On 25 May, Greg Black wrote: This is just not true. For the vast majority of the systems that have ever been called Unix, attempting to put millions of files into a directory would be an utter disaster. No ifs or buts. It might be nice if this were different, although I see no good reason to support it myself, but it's generally not a serious possibility and so applications that depend on being able to do that are plain stupid. Their authors are either too lazy to make their use of the file system a bit more sensible or too stupid to know that file systems are not databases. The right answer is to write applications with some understanding of the basics of software engineering or computer science. It's got nothing to do with the basics of software engineering or computer science. It's got to do with interface definitions and APIs. Where in open(2) does it specify a limit on the number of files permissible in a directory? The closest that it comes, that I can see, is:

[ENOSPC] O_CREAT is specified, the file does not exist, and the directory in which the entry for the new file is being placed cannot be extended because there is no space left on the file system containing the directory.

[ENOSPC] O_CREAT is specified, the file does not exist, and there are no free inodes on the file system on which the file is being created.

[EDQUOT] O_CREAT is specified, the file does not exist, and the directory in which the entry for the new file is being placed cannot be extended because the user's quota of disk blocks on the file system containing the directory has been exhausted.

[EDQUOT] O_CREAT is specified, the file does not exist, and the user's quota of inodes on the file system on which the file is being created has been exhausted.

or perhaps:

[ENAMETOOLONG] A component of a pathname exceeded 255 characters, or an entire path name exceeded 1023 characters.
All of which quite clearly indicate that if one wants to put all of one's allocation of blocks or inodes into a single directory, then one can, as long as the name and path length limits are observed. See: there's a system defined limit, and it's documented as such. That's what I was getting at. You're welcome to claim a documentation bug, and add the appropriate caveat. It seems clear to me that Hans Reiser (and Silicon Graphics before him) have taken the more obvious approach, of attempting to remove the performance limitation inherent in the existing implementation. You can moan about tree-structured vs relational databases, but if your problem space doesn't intrinsically map to a tree, then it doesn't stop the tree-structuring transformation that Terry mentioned from being a gratuitous hack to work around a performance problem with the existing implementation. -- Andrew
Re: technical comparison
Andrew Reilly wrote: | You can moan about tree-structured vs relational databases, [...] I can moan about whatever I please -- for instance the fact that you can't be bothered using a mailer that conforms with basic rules. Please figure out how to get a Message-Id header into your mail and make sure that future messages go out with such a header. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: technical comparison
Erm, folks? Can anyone please tell me what this has to do with freebsd-hackers any longer? It's been quite a long thread already - have a heart please and take it to -chat. :( Thanks, - Jordan To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: technical comparison
Shannon Hendrix wrote: And just to get things worse... :-) the test must be made on the *same* slice. If you configure two different slices, the one on the outer tracks will be faster. I cannot verify that with my drive, but my largest is 18GB so maybe the difference is not as pronounced as on some newer drives like those (currently) monster 70GB drives. It should be measurable. On one hand, more sectors per track and the same time to read a single track means more bytes read per second. On the other hand, more sectors per track means more bytes per track, so fewer tracks for the same capacity and less seeking between tracks. -- Daniel C. Sobral(8-DCS) [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] wow regex humor... I'm a geek
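Daniel's zoned-recording argument is easy to quantify. The sector counts below are made-up zone sizes, not measurements from any real drive: rotation time is fixed, so sequential throughput scales directly with sectors per track.

```python
RPM = 7200
REV_PER_SEC = RPM / 60.0          # 120 revolutions per second
SECTOR_BYTES = 512

def track_rate(sectors_per_track):
    """Sequential transfer rate off one track: bytes per
    revolution times revolutions per second."""
    return sectors_per_track * SECTOR_BYTES * REV_PER_SEC

inner, outer = 400, 800           # assumed inner/outer zone sizes

# Twice the sectors per track -> twice the sequential throughput,
# which is why a slice on the outer tracks benchmarks faster.
assert track_rate(outer) == 2 * track_rate(inner)
```

With these assumed numbers the inner zone moves about 23 MB/s and the outer about 47 MB/s, so benchmarking two filesystems on different slices of the same disk can bake in a large constant bias.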
Re: technical comparison
The proposed filesystem is most likely Reiserfs. This is a true journalling filesystem with a radically non-traditional layout. It is no problem to put millions of files in a single directory. (Actually, the all-in-one approach performs better than a tree.) XFS and JFS are similarly capable, but Reiserfs is well tested and part of the official Linux kernel. You can get the Reiserfs team to support you too, in case you want to bypass the normal filesystem interface for even better performance. It should be noted that simply because something is tested and a part of a release, it is not automatically wonderful. My last experience with linux was in the 2.2 days, and ended with a lost root filesystem while attempting to access an msdosfs drive. From what I've read, mixing reiserfs and nfs is about as exciting as the stock market has been in the last few months.
Re: technical comparison
On Wed, May 23, 2001 at 08:17:12AM -0400, Andresen,Jason R. wrote: On Tue, 22 May 2001, Kris Kennaway wrote: On Tue, May 22, 2001 at 10:27:27PM +0300, Nadav Eiron wrote: I ran tests that I think are similar to what Jason ran on identically configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much much faster than UFS+softupdates on these tests.

Linux (2.2.14-5 + ReiserFS):
Time: 164 seconds total, 97 seconds of transactions (103 per second)
Files: 65052 created (396 per second)
Creation alone: 6 files (1090 per second)
Mixed with transactions: 5052 files (52 per second)
4936 read (50 per second)
5063 appended (52 per second)
65052 deleted (396 per second)
Deletion alone: 60104 files (5008 per second)
Mixed with transactions: 4948 files (51 per second)
Data: 24.83 megabytes read (155.01 kilobytes per second), 336.87 megabytes written (2.05 megabytes per second)

FreeBSD 4.3-RELEASE (ufs/softupdates): Did you enable write caching? You didn't mention, and it's off by default in 4.3, but I think enabled by default on Linux. I tried to leave the FreeBSD and Linux boxes as unchanged as possible for my tests (they are lab machines that have other uses, although I made sure they were idle during the test periods). I left write caching enabled in the Linux boxes, and left it disabled on the FreeBSD boxes. Personally, I'm hesitant to enable write caching on FreeBSD because we tend to use it on machines where we really really don't want to lose data. Write caching is ok on the Linux machines because we use them as pure testbeds that we can reconstruct easily if their disks go south. If the tests on the Linux machines are made to simulate how those Linux machines would operate if used as production servers, then do that: configure the Linux machines exactly as if they were your production servers. That is, if you want write caching off on production servers, turn it off at test time. G'luck, Peter -- If you think this sentence is confusing, then change one pig.
Re: technical comparison
On Tue, 22 May 2001, Daniel C. Sobral wrote: Jason Andresen wrote: If only FreeBSD could boot from those funky M-Systems flash disks. It can. How? Nothing I found in the documentation indicated this, or gave any sort of hint as to how I might go about doing it. The Linux driver has a hacked version of Lilo that has to be installed prior to even thinking of doing anything with the flash, but I found no equivalent for FreeBSD's boot1. FreeBSD can mount the disks just fine (I used a custom PicoBSD boot floppy to fix up the Linux install on the flash disk enough so that it would boot (the stupid script M-Systems provided installed a completely hosed system!)). This sort of information might be handy to have on freebsd.org.
Re: technical comparison
On Tue, 22 May 2001, Daniel C. Sobral wrote: Jason Andresen wrote: Results: ufs+softupdates is a little slower than ext2fs+wc for low numbers of files, but scales better. I wish I had a Reiserfs partition to test with. Ext2fs is a non-contender. Note, though, that there is some very recent performance improvement on very large directories known as dirpref (what changed, actually, was dirpref's algorithm). This is NOT present on 4.3-RELEASE, though it _might_ have since been committed to stable. The new dirpref code is mostly just a performance tweak. We can't compete with ReiserFS on large directories without a major improvement to the code, assuming the previous post was true and ReiserFS has some log time components where ufs has linear time components. Note that the improvement from using the new dirpref code is about 12%, which isn't bad, but still doesn't put us in the right ballpark.
RE: technical comparison
Dear All, An interview with Reiser just appeared on http://www.slashdot.org/ Just to add a little oil to the fire. :-) Kees Jan You are only young once, but you can stay immature all your life. To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-hackers in the body of the message
Re: technical comparison
On Tue, 22 May 2001, Shannon Hendrix wrote: On Tue, May 22, 2001 at 09:31:34AM -0400, Jason Andresen wrote: We only have three Linux boxes here (and one is a PC104 with a flash disk) and already I've had to reinstall the entire OS once when we had a power glitch. ext2fsck managed to destroy about 1/3 of the files on the system, in a pretty much random manner (the lib and etc were hit hard). This is not typical. Also, I have heard the same thing from other people about flash disks. fs crash, fsck, and a mess afterwards. It would be nice if you could use ufs and see if the same problem exists. The scary thing is that it was the attached harddrive that lost all of the files. The situation is this: Attached HD: I just installed Redhat on the hard drive. I rebooted and the system booted off of the harddrive normally. Successful install. I logged into the system and started looking into rebuilding the kernel to include the binary only M-Systems modules when a co-worker accidentally unplugged the wrong plug (he was working on some nearby machines), unplugging the power supply I was using to power the hard drives (and pretty much crashing the PC104 system). I powered down the PC104 system, and we plugged everything in again. When I tried to reboot the system, Lilo couldn't even find the kernel. I pulled out the emergency rescue disc (RedHat's install disk) and booted it up. When I ran fsck on the drive, it found error after error on the drive. Eventually I had to ^C that fsck run and try it again with the -y option (my arm was getting tired). Once fsck was done / was pretty much a ghost town, at which point I decided to just reinstall the system. It's entirely possible that there is something I could have done to prevent fsck from clearing out the filesystem, but it certainly isn't obvious from the manual, and I've never seen a FreeBSD system do that.
Also, for anybody who says the "pull the power" test isn't realistic, I can assure you that power failures DO happen (probably less in your area than in mine, I hope!) and not planning for them only brings disaster later, when you have a room with 1000 servers lose power.
Re: technical comparison
On Tue, 22 May 2001, Shannon Hendrix wrote: On Tue, May 22, 2001 at 02:49:21PM -0400, Jason Andresen wrote: 60000 files took ~15 minutes to create as is. I'm going to have to wait until tonight to run larger sets. 2.2.16 is what we have here. I'm still waiting to see how much faster ReiserFS is. I'm willing to overnight your test if you want. Do you have it packaged up to send? It would be interesting just to get numbers from a Linux system with a modern kernel. 2.4.1 gave me enough of a speed boost to put off another FreeBSD install until I fix some problems there. I cannot test FreeBSD with SCSI right now, so my system would be an unequal set of results. I would offer to test NetBSD as well, but I suppose no one would be interested in that. The test is 'postmark'. It is in /usr/ports/benchmarks, but the distribution is a single C file. Just compile it with: gcc -O -o postmark postmark.c on all of the systems. That's what the port uses. Your system should be unladen but running in multiuser mode, and the test directory you choose should be empty. The options you are interested in are: set transactions 10000 (what I used for all of my tests); set number <number of starting files in the directory to test>; set location /path/to/empty/local/directory; run
Re: technical comparison
I just finished the FreeBSD test with vfs.vmiodirenable=1 (it was 0 before). 60000 simultaneous files, 10000 transactions, FreeBSD 4.0-RELEASE + softupdates with write caching disabled. Results are pretty much unchanged. Do you have to enable vmiodirenable at boot time for it to take effect?
Time: 1286 seconds total
  505 seconds of transactions (19 per second)
Files: 65065 created (50 per second)
  Creation alone: 60000 files (85 per second)
  Mixed with transactions: 5065 files (10 per second)
  5078 read (10 per second)
  4921 appended (9 per second)
  65065 deleted (50 per second)
  Deletion alone: 60130 files (761 per second)
  Mixed with transactions: 4935 files (9 per second)
Data: 26.01 megabytes read (20.23 kilobytes per second)
  325.12 megabytes written (252.82 kilobytes per second)
Re: technical comparison
On Tue, 22 May 2001, Terry Lambert wrote: I don't understand the inability to perform the trivial design engineering necessary to keep from needing to put 60,000 files in one directory. However, we can take it as a given that people who need to do this are incapable of doing computer science. I would suggest two things: 1) If write caching is off on the Linux disks, turn it off on the FreeBSD disks -- and then turn it on on both. 2) Modify the test to delete the files based on a directory traversal, instead of promiscuous knowledge of the file names, which is cheating to make the lookups appear faster. (The rationale behind this last is that people who can't design around needing 60,000 files in a single directory are probably going to be unable to correctly remember the names of the files they created, since if they could, then they could remember things like ./a/a/aardvark or ./a/b/abominable.) The problem comes along when you are using a third-party application that keeps a bazillion files in a directory, which was the problem that spawned this entire thread. Why is knowing the file names cheating? It is almost certain that the application will know the names of its own files (and won't be grepping the entire directory every time it needs to find a file). I doubt a human is ever going to want to work in a directory where you have 60,000 files lying about, but an application might easily be written to work in just such conditions.
Re: technical comparison
On Tue, 22 May 2001, Shannon Hendrix wrote: On Tue, May 22, 2001 at 12:03:33PM -0400, Jason Andresen wrote: The data: Hardware: Both machines have the same hardware on paper (although it is TWO machines, YMMV). PII-300, Intel PIIX4 ATA33 controller, IBM-DHEA-38451 8063MB ata0-master using UDMA33 HD. Note: all variables are left at default unless mentioned. 10000 transactions, 500 files. What did you set size to? How much memory on the machine? Size was left at the default. The machines have 64MB of main memory.
Re: technical comparison
On Wed, May 23, 2001 at 06:53:37AM -0300, Daniel C. Sobral wrote: I cannot verify that with my drive, but my largest is 18GB so maybe the difference is not as pronounced as on some newer drives like those (currently) monster 70GB drives. It should be measurable. Actually, I edited too much. I have seen a difference, but it was too small to care about on my system. These are 7200rpm 18GB drives too. The other variances in filesystem performance seem to overshadow the difference. The only thing I ever did to pick up some speed was to move some data on a raw device to the faster tracks. I was streaming it in, so the speedup was good. I also picked up some performance on one Linux system by putting swap on the faster tracks. But for the most part, I've never been able to tell. I have read that on the 40-80GB drives, it's very noticeable. In fact, the IBM Ultrastars are supposed to be faster than their electronics can handle on the very outer tracks. -- Secrecy is the beginning of tyranny. -- Unknown -- shannon@widomaker.com
Re: technical comparison
Hi all, I tried your tests on a quite different configuration: a PIII 800 with 1GB RAM, with an AcceleRAID 170 controller and a single RAID5 pack of 4*8GB IBM SCSI drives. The system is 4.3-RC2, NO softupdates, default configuration. Here are the results:
pm>set transactions 10000
pm>set number 60000
pm>set location /root/test/
pm>run
Creating files...Done
Performing transactions...Done
Deleting files...Done
Time: 1715 seconds total
  199 seconds of transactions (50 per second)
Files: 65065 created (37 per second)
  Creation alone: 60000 files (73 per second)
  Mixed with transactions: 5065 files (25 per second)
  5078 read (25 per second)
  4921 appended (24 per second)
  65065 deleted (37 per second)
  Deletion alone: 60130 files (86 per second)
  Mixed with transactions: 4935 files (24 per second)
Data: 26.01 megabytes read (15.17 kilobytes per second)
  325.12 megabytes written (189.58 kilobytes per second)
-- [EMAIL PROTECTED]
----- Original Message ----- From: Andresen,Jason R. [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Wednesday, May 23, 2001 3:15 PM Subject: Re: technical comparison
I just finished the FreeBSD test with vfs.vmiodirenable=1 (it was 0 before). 60000 simultaneous files, 10000 transactions, FreeBSD 4.0-RELEASE + softupdates with write caching disabled. Results are pretty much unchanged. Do you have to enable vmiodirenable at boot time for it to take effect?
[snip]
Re: technical comparison
On Wed, May 23, 2001 at 09:03:37AM -0400, Andresen,Jason R. wrote: The scary thing is that it was the attached hard drive that lost all of the files. The situation is this: [snip] Sorry to hear that, but like I said, it isn't typical. ext2 in its early days, and ext before that, were really bad. But I have few problems with it these days. I've lost more ufs filesystems than I have ext2, but I don't assume my results are typical: I know ufs is better. However, ext2's problems are grossly exaggerated. It's entirely possible that there is something I could have done to prevent fsck from clearing out the filesystem, but it certainly isn't obvious from the manual, and I've never seen a FreeBSD system do that. Nothing much you can do unless you happen to know ext2 inside and out, and fix it manually. It's not normal for ext2 to die like that and be unable to recover. Over the years I have had more bizarre, inexplicable OS problems on Intel PCs than any other. Also, for anybody who says the "pull the power" test isn't realistic, I can assure you that power failures DO happen (probably less in your area than mine (I hope!)) and not planning for them only brings disaster later when you have a room with 1000 servers lose power. My point was that yanking power only tests one aspect of the filesystem. Choosing one based on passing or not passing that test isn't a good idea. Well, a UPS system is as important in any system you care about as the computers and operating systems. If you run 1000 servers and they can lose power, you're on borrowed time anyway. Where I live, the power gets worse every year. I lost quite a few ext filesystems, but only a couple of ufs and ext2 filesystems. Then I bought a 1920VA UPS and it's no longer an issue. I just found it easier to not lose power than to worry about which filesystem recovers from it better. -- There are nowadays professors of philosophy, but not philosophers.
Re: technical comparison
Quoting Daniel C. Sobral [EMAIL PROTECTED]: Note, though, that there is some very recent performance improvement on very large directories known as dirpref (what changed, actually, was dirpref's algorithm). This is NOT present on 4.3-RELEASE, though it _might_ have since been committed to -STABLE. I don't think dirpref should help in this case. IIRC the dirpref patches change the algorithm for choosing the cylinder group (cg) of subdirectories relative to their parent. Since postmark by default only uses one directory, there should be no benefit. -- Daniel
Re: technical comparison
On Wed, 23 May 2001, Shannon Hendrix wrote: Where I live, the power gets worse every year. I lost quite a few ext filesystems, but only a couple of ufs and ext2 filesystems. Then I bought a 1920VA UPS and it's no longer an issue. I just found it easier to not lose power than to worry about which filesystem recovers from it better. One of the funny things about the place I used to work (which will remain unnamed) was how the UPS folks were always testing their systems by pulling the plug on the main power to the building. The problem was they apparently hired untrained monkeys to wire up the UPS systems (which were just a few rooms chock full of batteries) and managed to kill power to the entire building (including the computer rooms) at least once every three months. This was doubly annoying because we had well over 100 full RAID racks (with 80 disks in each rack) in the facility. Hard drives, as most of you probably know, are most likely to fail at power-on, so every time one of the brain cases managed to kill the power in the CRs, we had to spend the rest of the day replacing failed RAID drives.
Re: technical comparison
On Wed, May 23, 2001 at 08:17:12AM -0400, Andresen,Jason R. wrote: Did you enable write caching? You didn't mention, and it's off by default in 4.3, but I think enabled by default on Linux. I tried to leave the FreeBSD and Linux boxes as unchanged as possible for my tests (they are lab machines that have other uses, although I made sure they were idle during the test periods). I left write caching enabled on the Linux boxes, and left it disabled on the FreeBSD boxes. Personally, I'm hesitant to enable write caching on FreeBSD because we tend to use it on machines where we really, really don't want to lose data. Write caching is OK on the Linux machines because we use them as pure testbeds that we can reconstruct easily if their disks go south. That's all well and good, but I thought the aim here was to compare Linux and FreeBSD performance on as level a playing field as possible? You're not measuring FS performance, you're measuring FS performance plus cache performance, so your numbers so far tell you nothing concrete. Kris
Re: technical comparison
On Wed, 23 May 2001, Kris Kennaway wrote: On Wed, May 23, 2001 at 08:17:12AM -0400, Andresen,Jason R. wrote: Did you enable write caching? You didn't mention, and it's off by default in 4.3, but I think enabled by default on Linux. I tried to leave the FreeBSD and Linux boxes as unchanged as possible for my tests (they are lab machines that have other uses, although I made sure they were idle during the test periods). I left write caching enabled on the Linux boxes, and left it disabled on the FreeBSD boxes. Personally, I'm hesitant to enable write caching on FreeBSD because we tend to use it on machines where we really, really don't want to lose data. Write caching is OK on the Linux machines because we use them as pure testbeds that we can reconstruct easily if their disks go south. That's all well and good, but I thought the aim here was to compare Linux and FreeBSD performance on as level a playing field as possible? You're not measuring FS performance, you're measuring FS performance plus cache performance, so your numbers so far tell you nothing concrete. Yes, they tell us that FreeBSD with softupdates and no write cache performs better in the large cases than Linux with ext2fs and write caching enabled. Also, my FreeBSD 4.0 boxes don't have the hw.ata.wc knob, so it's harder for me to test this. Also, I don't know how one goes about disabling the write cache in Linux without recompiling the kernel (we have some custom mods in place, so I'm reluctant to do this).
Re: technical comparison
On Wed, May 23, 2001 at 09:20:51AM -0400, Andresen,Jason R. wrote: Why is knowing the file names cheating? It is almost certain that the application will know the names of its own files (and won't be grepping the entire directory every time it needs to find a file). With 60,000 files, that would have the application duplicating 60,000 pieces of information that are stored by the operating system. Operations like open() and unlink() still have to search the directory to get the inode, so there isn't much incentive for an application to do that, I think. -- Ben An art scene of delight I created this to be ... -- Sun Ra
Re: technical comparison
Terry Lambert [EMAIL PROTECTED] writes: I don't understand the inability to perform the trivial design engineering necessary to keep from needing to put 60,000 files in one directory. Hear hear! ;) (Been waiting for that one) However, we can take it as a given that people who need to do this are incapable of doing computer science. You can't make that assumption just yet (although it seems reasonable). We really don't know exactly what the problem they are trying to solve is. Network news sites running old versions of software (as an example, I know someone who still runs CNEWS) have very clear reasons for phenomena resembling 60,000 files in one directory. I would begin to question the assumption that seems to have been unquestioned. Namely, why is the focus -just- on speed? FreeBSD outperforms Linux on reliability and security as well. Not to mention networking. -- Dave Hayes - Consultant - Altadena CA, USA - [EMAIL PROTECTED] The opinions expressed above are entirely my own. We can never have enough of that which we really do not want. --Eric Hoffer
Re: technical comparison
On Wed, 23 May 2001, Andresen,Jason R. wrote: On Wed, 23 May 2001, Kris Kennaway wrote: That's all well and good, but I thought the aim here was to compare Linux and FreeBSD performance on as level a playing field as possible? You're not measuring FS performance, you're measuring FS performance plus cache performance, so your numbers so far tell you nothing concrete. *nod* Yes, they tell us that FreeBSD with softupdates and no write cache performs better in the large cases than Linux with ext2fs and write caching enabled. Also, my FreeBSD 4.0 boxes don't have the hw.ata.wc knob, so it's harder for me to test this. Also, I don't know how one goes about disabling the write cache in Linux without recompiling the kernel (we have some custom mods in place, so I'm reluctant to do this). 1. I don't think I've ever seen a Linux distro which has write caching enabled by default. Hell, DMA33 isn't even enabled by default ;) 2. hdparm -W0 /dev/drive to turn write caching off, -W1 to turn it on. 3. I've seen many disks which got _slower_ with write caching turned on. Sure, it helps for sequential IO, but with more random IO the write caching on the disk can interfere really badly with the IO scheduling in the OS ... I've seen as much as a 5x drop in random IO performance with write caching ON compared to OFF. I guess it would be good to follow Kris' suggestions and try to do the tests on a level playing field. The results might just be interesting ;) regards, Rik -- Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com/
Re: technical comparison
On Wed, May 23, 2001 at 10:54:40PM -0300, Rik van Riel wrote: 1. I don't think I've ever seen a Linux distro which has write caching enabled by default. Hell, DMA33 isn't even enabled by default ;) You are talking about controlling the IDE drive cache. The issue here is write cache in the filesystem code. -- [EMAIL PROTECTED] And in billows of might swell the Saxons before her,-- Unite, oh unite! Or the billows burst o'er her! -- Downfall of the Gael
Re: technical comparison
Albert D. Cahalan wrote: It should be immediately obvious that ext2 is NOT the filesystem being proposed, async or not. For large directories, ext2 sucks as badly as UFS does. This is because ext2 is a UFS clone. The proposed filesystem is most likely Reiserfs. This is a true journalling filesystem with a radically non-traditional layout. It is no problem to put millions of files in a single directory. (Actually, the all-in-one approach performs better than a tree.) XFS and JFS are similarly capable, but Reiserfs is well tested and part of the official Linux kernel. You can get the Reiserfs team to support you too, in case you want to bypass the normal filesystem interface for even better performance. Er, I don't think ReiserFS is in the Linux kernel yet, although it is the default filesystem on some distros apparently. I think Linus has some reservations about the stability of the filesystem since it is fairly new. That said, it would be hard to be much worse than ext2fs with write caching enabled (the default!) in the event of power failure. We only have three Linux boxes here (and one is a PC104 with a flash disk) and already I've had to reinstall the entire OS once when we had a power glitch. ext2fsck managed to destroy about 1/3 of the files on the system, in a pretty much random manner (lib and etc were hit hard). Heck, the system didn't even try to boot when it came back; I had to pull out the rescue disk and run fsck from there. Good thing the rescue disk was the same as the install disk, it saved me a disk swap. :( If only FreeBSD could boot from those funky M-Systems flash disks. So, no async here, and UFS + soft updates can't touch the performance on huge directories.
Re: technical comparison
[trimming CCs] On Tue, May 22, 2001 at 09:31:34AM -0400, Jason Andresen wrote: Er, I don't think ReiserFS is in the Linux kernel yet, although it is the default filesystem on some distros apparently. I think Linus has some reservations about the stability of the filesystem since it is fairly new. It is in now, AFAIK. That said, it would be hard to be much worse than ext2fs with write caching enabled (the default!) in the event of power failure. We only have three Linux boxes here (and one is a PC104 with a flash disk) and already I've had to reinstall the entire OS once when we had a power glitch. ext2fsck managed to destroy about 1/3 of the files on the system, in a pretty much random manner (lib and etc were hit hard). Heck, the system didn't even try to boot when it came back, I had to pull [snip] FWIW, I lost two filesystems last week: one ext2 and the second reiser, and no crashes/power failures were involved. The ext2 failure meant a complete reinstall (only 4-5 files were left in / after fsck). A reiser filesystem started giving input/output errors and could not be repaired with reiserfsck. Trying to back up the file system before a repair only resulted in kernel panics. -- Hroi Sigurdsson [EMAIL PROTECTED] Netgroup A/S http://www.netgroup.dk
Re: technical comparison
Albert D. Cahalan wrote: Gordon Tetlow writes: On Mon, 21 May 2001, Jordan Hubbard wrote: [Charles C. Figueire] c) A filesystem that will be fast in light of tens of thousands of files in a single directory (maybe even hundreds of thousands) I think we can more than hold our own with UFS + soft updates. This is another area where you need to get hard numbers from the Linux folks. I think your assumption that Linux handles this effectively is flawed and I'd like to see hard numbers which prove otherwise; you should demand no less. Also point out the reliability factor here which is a bit harder to point to a magic number and See, we *are* better! ext2 runs async by default which can lead to nasty filesystem corruption in the event of a power loss. With softupdates, the filesystem metadata will always be in sync and uncorrupted (barring media failure of course). It should be immediately obvious that ext2 is NOT the filesystem being proposed, async or not. For large directories, ext2 sucks as badly as UFS does. This is because ext2 is a UFS clone. The proposed filesystem is most likely Reiserfs. This is a true journalling filesystem with a radically non-traditional layout. It is no problem to put millions of files in a single directory. (Actually, the all-in-one approach performs better than a tree.) XFS and JFS are similarly capable, but Reiserfs is well tested and part of the official Linux kernel. You can get the Reiserfs team to support you too, in case you want to bypass the normal filesystem interface for even better performance. So, no async here, and UFS + soft updates can't touch the performance on huge directories. Unfortunately I don't have a ReiserFS partition available to test with, but I do have UFS and ext2fs partitions. Here's the results I got from postmark, which seems to be the closest match to the original problem in the entire ports tree.
Test setup: Two machines with the same make and model hardware, one running FreeBSD 4.0, the other running RedHat Linux 7.0. The data: Hardware: Both machines have the same hardware on paper (although it is TWO machines, YMMV). PII-300, Intel PIIX4 ATA33 controller, IBM-DHEA-38451 8063MB ata0-master using UDMA33 HD. Note: all variables are left at default unless mentioned. 10000 transactions, 500 files.
FreeBSD 4.0 + softupdates, write cache disabled:
Time: 35 seconds total
  34 seconds of transactions (294 per second)
Files: 5513 created (157 per second)
  Creation alone: 500 files (500 per second)
  Mixed with transactions: 5013 files (147 per second)
  4917 read (144 per second)
  5016 appended (147 per second)
  5513 deleted (157 per second)
  Deletion alone: 526 files (526 per second)
  Mixed with transactions: 4987 files (146 per second)
Data: 31.27 megabytes read (893.48 kilobytes per second)
  34.71 megabytes written (991.70 kilobytes per second)
Linux 2.2.16, ext2fs, write caching enabled:
Time: 28 seconds total
  28 seconds of transactions (357 per second)
Files: 5513 created (196 per second)
  Creation alone: 500 files (500 per second)
  Mixed with transactions: 5013 files (179 per second)
  4917 read (175 per second)
  5016 appended (179 per second)
  5513 deleted (196 per second)
  Deletion alone: 526 files (526 per second)
  Mixed with transactions: 4987 files (178 per second)
Data: 31.27 megabytes read (1.12 megabytes per second)
  34.71 megabytes written (1.24 megabytes per second)
10000 transactions, 30000 files:
FreeBSD 4.0 + softupdates, write cache disabled:
Time: 640 seconds total
  410 seconds of transactions (24 per second)
Files: 34993 created (54 per second)
  Creation alone: 30000 files (146 per second)
  Mixed with transactions: 4993 files (12 per second)
  5055 read (12 per second)
  4944 appended (12 per second)
  34993 deleted (54 per second)
  Deletion alone: 29986 files (1199 per second)
  Mixed with transactions: 5007 files (12 per second)
Data: 25.62 megabytes read (40.03 kilobytes per second)
  179.79
megabytes written (280.92 kilobytes per second)
Linux 2.2.16, ext2fs, write caching enabled:
Time: 1009 seconds total
  612 seconds of transactions (16 per second)
Files: 34993 created (34 per second)
  Creation alone: 30000 files (83 per second)
  Mixed with transactions: 4993 files (8 per second)
  5055 read (8 per second)
  4944 appended (8 per second)
  34993 deleted (34 per second)
  Deletion alone: 29986 files (768 per second)
  Mixed with transactions: 5007 files (8 per second)
Data: 25.62 megabytes read
Re: technical comparison
Jason Andresen wrote: Oops, I flubbed the Linux at 60000 files test; I'm rerunning it now, but it will take a while to finish. Results: ufs+softupdates is a little slower than ext2fs+wc for low numbers of files, but scales better. I wish I had a ReiserFS partition to test with.
Re: technical comparison
Jason Andresen wrote: Jason Andresen wrote: Oops, I flubbed the Linux at 60000 files test; I'm rerunning it now, but it will take a while to finish. Results: ufs+softupdates is a little slower than ext2fs+wc for low numbers of files, but scales better. I wish I had a ReiserFS partition to test with. The test is done: Linux 2.2.16 with ext2fs and write caching, 10000 transactions, 60000 simultaneous files:
Time: 2084 seconds total
  702 seconds of transactions (14 per second)
Files: 65065 created (31 per second)
  Creation alone: 60000 files (48 per second)
  Mixed with transactions: 5065 files (7 per second)
  5078 read (7 per second)
  4921 appended (7 per second)
  65065 deleted (31 per second)
  Deletion alone: 60130 files (395 per second)
  Mixed with transactions: 4935 files (7 per second)
Data: 26.01 megabytes read (12.48 kilobytes per second)
  325.12 megabytes written (156.01 kilobytes per second)
I don't suppose anybody has a FreeBSD and Linux box dual booting (or identically specced) with ReiserFS anywhere? I'm quite curious how much faster ReiserFS is in these tests.
Re: technical comparison
] I work in an environment consisting of 300+ systems, all FreeBSD ] and Solaris, along with lots of EMC and F5 stuff. Our engineering division ] has been working on a dynamic content server and search engine for the ] past 2.5 years. They have consistently not met up to performance and ] throughput requirements and have always blamed our use of FreeBSD for it. You may wish to point out to them that their F5 boxes are running FreeBSD. Terry Lambert [EMAIL PROTECTED] --- Any opinions in this posting are my own and not those of my present or previous employers.
Re: technical comparison
Jason Andresen writes: Albert D. Cahalan wrote: It should be immediately obvious that ext2 is NOT the filesystem being proposed, async or not. For large directories, ext2 sucks as badly as UFS does. This is because ext2 is a UFS clone. The proposed filesystem is most likely Reiserfs. This is a true journalling filesystem with a radically non-traditional layout. It is no problem to put millions of files in a single directory. (Actually, the all-in-one approach performs better than a tree.) XFS and JFS are similarly capable, but Reiserfs is well tested and part of the official Linux kernel. You can get the Reiserfs team to support you too, in case you want to bypass the normal filesystem interface for even better performance. Er, I don't think ReiserFS is in the Linux kernel yet, although it is the default filesystem on some distros apparently. I think Linus has some reservations about the stability of the filesystem since it is It is in the kernel: http://lxr.linux.no/source/fs/reiserfs/?v=2.4.4 Bugs died left and right when it went in. fairly new. That said, it would be hard to be much worse than ext2fs with write caching enabled (the default!) in the event of power failure. We only have three Linux boxes here (and one is a PC104 with a flash disk) and already I've had to reinstall the entire OS once when we had a power glitch. ext2fsck managed to destroy about 1/3 of the files on the system, in a pretty much random manner (lib and etc were hit hard). If you don't like ext2, why should it like you? :-) I power cycle a Linux box nearly every day to reset a board. If only FreeBSD could boot from those funky M-Systems flash disks. If you want flash, use a filesystem designed for flash (not UFS, ext2, Reiserfs, XFS, JFS, or FAT... try JFFS2). So, no async here, and UFS + soft updates can't touch the performance on huge directories. From another email you mention benchmarking with: Linux 2.2.16 with ext2fs and write caching, 10000 transactions, 60000 simultaneous files: 1.
The 2.2.16 kernel is obsolete. 2. 60000 files is not a lot. Try a few million files.
RE: technical comparison
-Original Message- From: Terry Lambert [mailto:[EMAIL PROTECTED]] Sent: Tuesday, May 22, 2001 10:59 AM To: [EMAIL PROTECTED] Subject: Re: technical comparison ] I work in an environment consisting of 300+ systems, all FreeBSD ] and Solaris, along with lots of EMC and F5 stuff. Our engineering division ] has been working on a dynamic content server and search engine for the ] past 2.5 years. They have consistently not met up to performance and ] throughput requirements and have always blamed our use of FreeBSD for it. You may wish to point out to them that their F5 boxes are running FreeBSD. Terry Lambert [EMAIL PROTECTED] When did that change? As of March, which was the last time I had my grubby little hands all over an F5 BigIP box in our lab, it was NOT running FreeBSD. It runs a tweaked version of BSDI's kernel. Matt
Re: technical comparison
Albert D. Cahalan wrote: Jason Andresen writes: Er, I don't think ReiserFS is in the Linux kernel yet, although it is the default filesystem on some distros apparently. I think Linus has some reservations about the stability of the filesystem since it is fairly new. It is in the kernel: http://lxr.linux.no/source/fs/reiserfs/?v=2.4.4 Bugs died left and right when it went in. Looks like my news was out of date. Thanks for the update. That said, it would be hard to be much worse than Ext2fs with write caching enabled (default!) in the event of power failure. We only have three Linux boxes here (and one is a PC104 with a flash disk) and already I've had to reinstall the entire OS once when we had a power glitch. ext2fsck managed to destroy about 1/3 of the files on the system, in a pretty much random manner (the lib and etc were hit hard). If you don't like ext2, why should it like you? :-) I power cycle a Linux box nearly every day to reset a board. If only FreeBSD could boot from those funky M-Systems flash disks. If you want flash, use a filesystem designed for flash. (not UFS, ext2, Reiserfs, XFS, JFS, or FAT... try JFFS2) So, no async here, and UFS + soft updates can't touch the performance on huge directories. From another email you mention benchmarking with: Linux 2.2.16 with ext2fs and write caching 10,000 transactions, 60,000 simultaneous files: 1. The 2.2.16 kernel is obsolete. 2. 60,000 files is not a lot. Try a few million files. 60,000 files took ~15 minutes to create as is. I'm going to have to wait until tonight to run larger sets. 2.2.16 is what we have here. I'm still waiting to see how much faster ReiserFS is.
Re: technical comparison
I ran tests that I think are similar to what Jason ran on identically configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much much faster than UFS+softupdates on these tests. Linux (2.2.14-5 + ReiserFS): Time: 164 seconds total 97 seconds of transactions (103 per second) Files: 65052 created (396 per second) Creation alone: 60,000 files (1090 per second) Mixed with transactions: 5052 files (52 per second) 4936 read (50 per second) 5063 appended (52 per second) 65052 deleted (396 per second) Deletion alone: 60104 files (5008 per second) Mixed with transactions: 4948 files (51 per second) Data: 24.83 megabytes read (155.01 kilobytes per second) 336.87 megabytes written (2.05 megabytes per second) FreeBSD 4.3-RELEASE (ufs/softupdates): Time: 537 seconds total 155 seconds of transactions (64 per second) Files: 65052 created (121 per second) Creation alone: 60,000 files (172 per second) Mixed with transactions: 5052 files (32 per second) 4936 read (31 per second) 5063 appended (32 per second) 65052 deleted (121 per second) Deletion alone: 60104 files (1717 per second) Mixed with transactions: 4948 files (31 per second) Data: 24.83 megabytes read (47.34 kilobytes per second) 336.87 megabytes written (642.38 kilobytes per second) Both tests were done with postmark-1.5, 60,000 files and 10,000 transactions. The machines are IBM Netfinity 4000R, the disk is an IBM DPSS-336950N, connected to an Adaptec 2940UW. Nadav
Re: technical comparison
On Tue, May 22, 2001 at 12:40:11PM -0600, Matt Simerson wrote: When did that change? As of March, which was the last time I had my grubby little hands all over an F5 BigIP box in our lab, it was NOT running FreeBSD. It runs a tweaked version of BSDI's kernel. I believe it is Terry's information that's out of date, not yours. -- Ben An art scene of delight I created this to be ... -- Sun Ra
Re: technical comparison
ReiserFS entered Linux kernels in the pre 2.4.1 series, and was 'official' with 2.4.1.
Re: technical comparison
On Tue, May 22, 2001 at 10:27:27PM +0300, Nadav Eiron wrote: I ran tests that I think are similar to what Jason ran on identically configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much much faster than UFS+softupdates on these tests. Linux (2.2.14-5 + ReiserFS): Time: 164 seconds total 97 seconds of transactions (103 per second) Files: 65052 created (396 per second) Creation alone: 60,000 files (1090 per second) Mixed with transactions: 5052 files (52 per second) 4936 read (50 per second) 5063 appended (52 per second) 65052 deleted (396 per second) Deletion alone: 60104 files (5008 per second) Mixed with transactions: 4948 files (51 per second) Data: 24.83 megabytes read (155.01 kilobytes per second) 336.87 megabytes written (2.05 megabytes per second) FreeBSD 4.3-RELEASE (ufs/softupdates): Did you enable write caching? You didn't mention, and it's off by default in 4.3, but I think enabled by default on Linux. Kris
Re: technical comparison
I didn't, but I believe Jason's numbers (for ext2 and ufs) also had write caching only enabled on Linux. On Tue, 22 May 2001, Kris Kennaway wrote: On Tue, May 22, 2001 at 10:27:27PM +0300, Nadav Eiron wrote: I ran tests that I think are similar to what Jason ran on identically configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much much faster than UFS+softupdates on these tests. Linux (2.2.14-5 + ReiserFS): Time: 164 seconds total 97 seconds of transactions (103 per second) Files: 65052 created (396 per second) Creation alone: 60,000 files (1090 per second) Mixed with transactions: 5052 files (52 per second) 4936 read (50 per second) 5063 appended (52 per second) 65052 deleted (396 per second) Deletion alone: 60104 files (5008 per second) Mixed with transactions: 4948 files (51 per second) Data: 24.83 megabytes read (155.01 kilobytes per second) 336.87 megabytes written (2.05 megabytes per second) FreeBSD 4.3-RELEASE (ufs/softupdates): Did you enable write caching? You didn't mention, and it's off by default in 4.3, but I think enabled by default on Linux. Kris
Re: technical comparison
Jason Andresen wrote: If only FreeBSD could boot from those funky M-Systems flash disks. It can. -- Daniel C. Sobral(8-DCS) [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] wow regex humor... I'm a geek
Re: technical comparison
Jason Andresen wrote: Results: ufs+softupdates is a little slower than ext2fs+wc for low numbers of files, but scales better. I wish I had a Reiserfs partition to test with. Ext2fs is a non-contender. Note, though, that there is a very recent performance improvement on very large directories known as dirpref (what changed, actually, was dirpref's algorithm). This is NOT present on 4.3-RELEASE, though it _might_ have since been committed to -stable. -- Daniel C. Sobral(8-DCS) [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] wow regex humor... I'm a geek
Re: technical comparison
Nadav Eiron wrote: I ran tests that I think are similar to what Jason ran on identically configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much much faster than UFS+softupdates on these tests. For that matter, did you have vfs.vmiodirenable enabled? -- Daniel C. Sobral(8-DCS) [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] wow regex humor... I'm a geek
Re: technical comparison
On Tue, May 22, 2001 at 02:49:21PM -0400, Jason Andresen wrote: 60,000 files took ~15 minutes to create as is. I'm going to have to wait until tonight to run larger sets. 2.2.16 is what we have here. I'm still waiting to see how much faster ReiserFS is. I'm willing to overnight your test if you want. Do you have it packaged up to send? It would be interesting just to get numbers from a Linux system with a modern kernel. 2.4.1 gave me enough of a speed boost to put off another FreeBSD install until I fix some problems there. I cannot test FreeBSD with SCSI right now, so my system will give an unequal set of results. I would offer to test NetBSD as well, but I suppose no one would be interested in that. -- [EMAIL PROTECTED] _ __/ armchairrocketscientistgraffitiexistentialist There is no such thing as security. Life is either bold adventure, or it is nothing -- Helen Keller
Re: technical comparison
On Tue, May 22, 2001 at 09:31:34AM -0400, Jason Andresen wrote: Er, I don't think ReiserFS is in the Linux kernel yet, although it is the default filesystem on some distros apparently. ReiserFS, on my system anyway, started just losing files. I'd log in and would notice some mp3 files or source code was just gone. No heavy load, and no crashes. Nope, not for me. I think they'll get it in time if the basic design isn't flawed, but things like an fs just take a lot of time to debug and come to trust. There are already some very good journaling systems, and it would seem better to get them ported, and leave things like ReiserFS a research project until it proves itself. That said, it would be hard to be much worse than Ext2fs with write caching enabled (default!) in the event of power failure. Point taken, but the "yank power, see who survives" test is illogical and dangerous thinking. Besides, my drives have megabytes of write cache that I cannot disable. Most are large enough to cause problems for most any fs if they crash at just the right moment. From what I have read, a lot of drives really ignore commands to turn it off or do synchronous writes. Both ext2 and ufs handle my chores with little or no trouble. On some systems, I've actually preferred ufs to the journaled file systems. We only have three Linux boxes here (and one is a PC104 with a flash disk) and already I've had to reinstall the entire OS once when we had a power glitch. ext2fsck managed to destroy about 1/3 of the files on the system, in a pretty much random manner (the lib and etc were hit hard). This is not typical. Also, I have heard the same thing from other people about flash disks. fs crash, fsck, and a mess afterwards. It would be nice if you could use ufs and see if the same problem exists. -- There's music along the river For Love wanders there, Pale flowers on his mantle, Dark leaves on his hair. -- James Joyce | | | / | \ s h a n n o n @ w i d o m a k e r . c o m _/ | \_
Re: technical comparison
Shannon Hendrix wrote: On Tue, May 22, 2001 at 02:49:21PM -0400, Jason Andresen wrote: 60,000 files took ~15 minutes to create as is. I'm going to have to wait until tonight to run larger sets. 2.2.16 is what we have here. I'm still waiting to see how much faster ReiserFS is. I'm willing to overnight your test if you want. Do you have it packaged up to send? It would be interesting just to get numbers from a Linux system with a modern kernel. 2.4.1 gave me enough of a speed boost to put off another FreeBSD install until I fix some problems there. I cannot test FreeBSD with SCSI right now, so my system will give an unequal set of results. I would offer to test NetBSD as well, but I suppose no one would be interested in that. And just to make things worse... :-) the test must be made on the *same* slice. If you configure two different slices, the one on the outer tracks will be faster. -- Daniel C. Sobral(8-DCS) [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] wow regex humor... I'm a geek
Re: technical comparison
On Tue, May 22, 2001 at 12:03:33PM -0400, Jason Andresen wrote: Here's the results I got from postmark, which seems to be the closest match to the original problem in the entire ports tree. Test setup: Two machines with the same make and model hardware, one running FreeBSD 4.0, the other running RedHat Linux 7.0. The data: Hardware: Both machines have the same hardware on paper (although it is TWO machines, YMMV). PII-300 Intel PIIX4 ATA33 controller IBM-DHEA-38451 8063MB ata0-master using UDMA33 HD Note: all variables are left at default unless mentioned. 10,000 transactions, 500 files. What did you set size to? How much memory on the machine? I tested on a 700MHz Athlon system with 256MB RAM, Adaptec 2940UW controller, 18GB IBM Ultrastar SCSI drive. You must have really low memory or something, because I know that 10,000 transactions and 500 files can't be enough for anything faster than my old Sun SS5. I hit over 16MB/sec and 5000 transactions per second on my Linux machine. On the larger tests, it was disappointing. I can't test FreeBSD on SCSI right now, but my NetBSD machine (the old Sun SS5) wasn't terrible at least: Time: 220 seconds total 204 seconds of transactions (49 per second) Files: 5564 created (25 per second) Creation alone: 500 files (62 per second) Mixed with transactions: 5064 files (24 per second) 4999 read (24 per second) 4967 appended (24 per second) 5564 deleted (25 per second) Deletion alone: 628 files (78 per second) Mixed with transactions: 4936 files (24 per second) Data: 32.12 megabytes read (149.52 kilobytes per second) 35.61 megabytes written (165.73 kilobytes per second) 10,000 transactions, 60,000 files FreeBSD 4.0 with Softupdates, write cache disabled Time: 1259 seconds total 495 seconds of transactions (20 per second) I got about 60 per second right here. I was actually expecting better results from Linux and NetBSD than I got, and would expect more from FreeBSD than you got.
I'm going to test FreeBSD tomorrow and Linux again with much larger numbers of files and transactions. -- Star Wars Moral Number 17: Teddy bears are dangerous in herds. | | | / | \ s h a n n o n @ w i d o m a k e r . c o m _/ | \_
Re: technical comparison
On Tue, May 22, 2001 at 10:55:09PM -0300, Daniel C. Sobral wrote: And just to make things worse... :-) the test must be made on the *same* slice. If you configure two different slices, the one on the outer tracks will be faster. I cannot verify that with my drive, but my largest is 18GB, so maybe the difference is not as pronounced as on some newer drives like those (currently) monster 70GB drives. A 70GB IBM Ultrastar supposedly can physically outrun the internal electronics on the faster tracks. One review I read mentioned it as a problem, though I'm not sure why. In any case, I'm not quite that picky, and I would not think that postmark would benefit as much from being on the faster tracks. It's doing a lot more complicated things than just streaming data. -- And in billows of might swell the Saxons before her,-- Unite, oh unite! Or the billows burst o'er her! -- Downfall of the Gael __ Charles Shannon Hendrix s h a n n o n @ w i d o m a k e r . c o m
Re: technical comparison
Nadav Eiron wrote: I ran tests that I think are similar to what Jason ran on identically configured FreeBSD and Linux/ReiserFS machines. ReiserFS is much much faster than UFS+softupdates on these tests. [ ... ] Both tests were done with postmark-1.5, 60,000 files in 10,000 transactions. The machines are IBM Netfinity 4000R, the disk is an IBM DPSS-336950N, connected to an Adaptec 2940UW. I don't understand the inability to perform the trivial design engineering necessary to keep from needing to put 60,000 files in one directory. However, we can take it as a given that people who need to do this are incapable of doing computer science. I would suggest two things: 1) If write caching is off on the Linux disks, turn it off on the FreeBSD disks. 2) -- and then turn it on on both. 3) Modify the test to delete the files based on a directory traversal, instead of promiscuous knowledge of the file names, which is cheating to make the lookups appear faster. (the rationale behind this last is that people who can't design around needing 60,000 files in a single directory are probably going to be unable to correctly remember the names of the files they created, since if they could, then they could remember things like ./a/a/aardvark or ./a/b/abominable). -- Terry
Re: technical comparison
On Tue, 22 May 2001, Shannon Hendrix wrote: : :Point taken, but the "yank power, see who survives" test is illogical :and dangerous thinking. Depends on the environment. I've had lots of machines just lose power. People will pull power cords out, the back-up generators won't start before the battery back-up runs out, someone will push the Big Red Switch. Even the best back-up power isn't going to help if it catches fire. I sort of like machines to work when the power comes back. -- [EMAIL PROTECTED] Bipedalism is only a fad.
Re: technical comparison
void wrote: On Tue, May 22, 2001 at 12:40:11PM -0600, Matt Simerson wrote: When did that change? As of March, which was the last time I had my grubby little hands all over an F5 BigIP box in our lab, it was NOT running FreeBSD. It runs a tweaked version of BSDI's kernel. I believe it is Terry's information that's out of date, not yours. Yep; mea culpa. I guess they will just have to install BSDI systems in place of your FreeBSD and Linux systems. -- Terry
Re: technical comparison
Shannon Hendrix writes: On Tue, May 22, 2001 at 12:03:33PM -0400, Jason Andresen wrote: Here's the results I got from postmark, which seems to be the closest match to the original problem in the entire ports tree. Test setup: Two machines with the same make and model hardware, one running FreeBSD 4.0, the other running RedHat Linux 7.0. That should be FreeBSD 4.3 and Red Hat 7.1 at least, or -current and 2.4.5-pre5. Considering that this is about a new system, the latest software and hardware ought to be used. Reiserfs only became stable just recently; the 2.4.1 kernel would be a dumb choice. 10,000 transactions, 500 files. ... 10,000 transactions, 60,000 files Even 60,000 files is insignificant by Reiserfs standards. The test gets interesting with several million files.
Re: technical comparison
Terry Lambert writes: I don't understand the inability to perform the trivial design engineering necessary to keep from needing to put 60,000 files in one directory. However, we can take it as a given that people who need to do this are incapable of doing computer science. One could say the same about the design engineering necessary to handle 60,000 files in one directory. You're making excuses. People _want_ to do this, and it often performs better on a modern filesystem. This is not about need; it's about keeping ugly hacks out of the app code. http://www.namesys.com/5_1.html (the rationale behind this last is that people who can't design around needing 60,000 files in a single directory are probably going to be unable to correctly remember the names of the files they created, since if they could, then they could remember things like ./a/a/aardvark or ./a/b/abominable). Eeew. ./a/b/abominable is a disgusting old hack used to work around traditional filesystem deficiencies.
Re: technical comparison
Charles C. Figueiredo wrote: | I apologize if this is the improper channel for this sort of | discussion, but it is in the best interests of the FreeBSD following, | at least, within my organization. It is the wrong place -- see the list descriptions. | Linux on Intel fits the bill because it meets these three requirements | *very* effectively. So set up some Linux boxes and let them play. If that solves your problem, just go with it. If it doesn't, then you know the problem is different and you can look into it. If it really turns out that the Linux solution works and if you want to do something to help FreeBSD do as well, then you'll have the data to make that a possibility.
Re: technical comparison
From: Charles C. Figueiredo [EMAIL PROTECTED] Subject: technical comparison Date: Mon, 21 May 2001 17:10:54 -0400 (EDT) I work in an environment consisting of 300+ systems, all FreeBSD and Solaris, along with lots of EMC and F5 stuff. Our engineering division has been working on a dynamic content server and search engine for the past 2.5 years. They have consistently not met up to performance and throughput requirements and have always blamed our use of FreeBSD for it. This is your first warning sign. This has all the appearances of a group of people who've _already_ made their conclusions and are now busily engaged in fitting the data to match. The only defense against this kind of situation is to take their data head-on. You're probably not going to get them to alter their preexisting bias since they probably have their own reasons for being Linux evangelists, but you can at least fight them to a stand-still on the comparative data front. Since FreeBSD is already entrenched there, that means you win the battle, at least for now. Winning the war will require that you not get complacent and continue with your objective measurements to prove (or disprove) FreeBSD's suitability for your needs. In the cases where you disprove it, at least the data is in friendly hands and you can open back-channel communications with us to try and address those shortcomings, whatever they may be. To take your current list: a) A machine that has fast character operations I think that's probably more architecture (machine) dependent than it is a function of the OS. A PC does a fine job at many things, but an IBM 3090 it's not. You should probably establish "as compared to what" for this argument and see what Linux's numbers are; I suspect it will quickly become a non-issue since the beancounters won't want to spend the kind of money truly improving this would cost. b) A *supported* Oracle client That's a gotcha, no doubt about it.
About the best you can probably do here is show that the Linux Oracle client works just fine under compatibility mode and determine just how many support calls you make to Oracle with respect to their client (and not the server) software a year. c) A filesystem that will be fast in light of tens of thousands of files in a single directory (maybe even hundreds of thousands) I think we can more than hold our own with UFS + soft updates. This is another area where you need to get hard numbers from the Linux folks. I think your assumption that Linux handles this effectively is flawed and I'd like to see hard numbers which prove otherwise; you should demand no less. - Jordan
Re: technical comparison
On Mon, 21 May 2001, Jordan Hubbard wrote: c) A filesystem that will be fast in light of tens of thousands of files in a single directory (maybe even hundreds of thousands) I think we can more than hold our own with UFS + soft updates. This is another area where you need to get hard numbers from the Linux folks. I think your assumption that Linux handles this effectively is flawed and I'd like to see hard numbers which prove otherwise; you should demand no less. Also point out the reliability factor here, which is a bit harder to point to a magic number and say "See, we *are* better!" ext2 runs async by default, which can lead to nasty filesystem corruption in the event of a power loss. With softupdates, the filesystem metadata will always be in sync and uncorrupted (barring media failure of course). -gordon
Re: technical comparison
Gordon Tetlow writes: On Mon, 21 May 2001, Jordan Hubbard wrote: [Charles C. Figueire] c) A filesystem that will be fast in light of tens of thousands of files in a single directory (maybe even hundreds of thousands) I think we can more than hold our own with UFS + soft updates. This is another area where you need to get hard numbers from the Linux folks. I think your assumption that Linux handles this effectively is flawed and I'd like to see hard numbers which prove otherwise; you should demand no less. Also point out the reliability factor here, which is a bit harder to point to a magic number and say "See, we *are* better!" ext2 runs async by default, which can lead to nasty filesystem corruption in the event of a power loss. With softupdates, the filesystem metadata will always be in sync and uncorrupted (barring media failure of course). It should be immediately obvious that ext2 is NOT the filesystem being proposed, async or not. For large directories, ext2 sucks as bad as UFS does. This is because ext2 is a UFS clone. The proposed filesystem is most likely Reiserfs. This is a true journalling filesystem with a radically non-traditional layout. It is no problem to put millions of files in a single directory. (actually, the all-in-one approach performs better than a tree) XFS and JFS are similarly capable, but Reiserfs is well tested and part of the official Linux kernel. You can get the Reiserfs team to support you too, in case you want to bypass the normal filesystem interface for even better performance. So, no async here, and UFS + soft updates can't touch the performance on huge directories.