Re: [ANN] Squashfs 3.3 released
Christoph Hellwig wrote:
> On Wed, Nov 21, 2007 at 02:02:43PM +, Phillip Lougher wrote:
>> Unfortunately the move to a fixed little-endian filesystem will involve
>> another filesystem layout change. The current filesystem layout still
>> uses packed bitfield structures, and it is impossible to swap these
>> using the standard kernel swap macros. Removal of my routines that can
>> properly swap packed bitfield structures is another change demanded by
>> the Linux kernel mailing list.
>
> The normal way to do it is to use shift and mask after doing the endian
> conversion. But the problem with bitfields is that they can have
> different kinds of layouts depending on the compiler or ABI, which is
> another reason to avoid them in on-disk/wire formats.

Yes, the bitfields are packed differently on little- and big-endian architectures, which means they appear in different places in the structure. I want to move away from that mess when I move to little endian only.

Phillip
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [ANN] Squashfs 3.3 released
Dave Jones wrote:
> The biggest problem we've seen with it (aside from having to rediff it
> every time we rebase when there isn't a newer upstream)

Yes, this is mainly my fault. There was a gap of 10 months between the 3.2 release in January this year, and the latest in November. With the rate of new kernel releases this wasn't really acceptable, because the January release was stuck with a patch for kernels no newer than 2.6.20. I received numerous complaints about it. Some of you may be aware that I started work at Canonical, and this left almost no spare time to work on Squashfs for 9 months.

> is complaints along the lines of "my Fedora 7 kernel can't unpack
> squashfs images from Fedora 5" (s/Fedora 5/other random older distros/)

Squashfs has backwards compatibility with older versions, and it should mount all older versions back to 2.0 (released May 2004). Unfortunately RedHat grabbed a CVS version of Squashfs just before the 3.0 release. This was development code, and release testing showed it had a bug where it couldn't mount older versions. It was fixed for release.

> If the format is now stable however, it would be great to get it upstream.

The move from the 2.0 format to the later 3.0 format was mainly forced by the demands of the Linux kernel mailing list when I first submitted it in early 2005. There was no other way to incorporate demands for larger-than-4GB filesystems, and to provide support for "." and ".." in readdir, without modifying the filesystem format.

Unfortunately the move to a fixed little-endian filesystem will involve another filesystem layout change. The current filesystem layout still uses packed bitfield structures, and it is impossible to swap these using the standard kernel swap macros. Removal of my routines that can properly swap packed bitfield structures is another change demanded by the Linux kernel mailing list. Once the little-endian work has been done, and hopefully once it is in the kernel, I don't anticipate any further layout changes.
Phillip
Re: [PATCH 0/2] cramfs: Add mount option "swapendian"
Linus Torvalds wrote:
> But it should be *trivial* to compress the metadata too if the code just
> were to put the metadata at the beginning of the image, and the real
> data at the end, and then you can build up the image from both ends and
> you *can* have a fixed starting point for the data (up at the top of the
> image) even though you are changing the size of the metadata by
> compression.

I decided to compress the metadata when I designed Squashfs, a read-only filesystem which was inspired by Cramfs. Squashfs stores the data at the front of the filesystem and puts the metadata at the end, so the data is always at a fixed point. Doing that and a couple of other things allows the metadata to be built up and compressed in one pass while the filesystem is being created. The metadata is split into an inode table and a directory table, which are compressed separately because they compress better that way.

> But I literally designed and wrote the thing in a couple of days, and I
> really didn't think it through right. As a result, the metadata may be
> dense, but it's totally uncompressed. It would have been better to allow
> a less dense basic format (allowing bigger uid/gid values, and offsets
> and file sizes), but compress it.

Squashfs stores much more metadata information, but as it is compressed it is much smaller than Cramfs. Typically the inode table compresses to less than 40% and the directory table to less than 50%.

> So a "v2" cramfs would be a great idea.

That is what I always considered Squashfs to be. But I also made the mistake of making Squashfs both little and big endian. That's going to be fixed, and then I'll make a second attempt at submitting it for inclusion in the mainline kernel.

Phillip
Re: [ANN] Squashfs 3.3 released
maximilian attems wrote:
> On Mon, Nov 05, 2007 at 11:13:14AM +, Phillip Lougher wrote:
>> The next stage after this release is to fix the one remaining blocking
>> issue (filesystem endianness), and then try to get Squashfs mainlined
>> into the Linux kernel again.
> that would be very cool!

Yes, it would be cool :) Five years is a long time to maintain something out of tree, especially recently when there have been so many minor changes to the VFS interface between kernel releases.

> with my hat as debian kernel maintainer i'd be very relieved to see it
> mainlined. i don't know of any major distro that doesn't ship it.

I don't know of any major distro that doesn't ship Squashfs either (except arguably Slackware). Putting my other hat on (one of the Ubuntu kernel maintainers), I don't think Squashfs has caused distros that many problems, because it is an easy patch to apply (it doesn't touch that many kernel files), but it is always good to minimise the differences from the stock kernel.org kernel.

Phillip
Re: [ANN] Squashfs 3.3 released
Michael Tokarev wrote:
> A tiny bug[fix] I always forgot to send... In fs/squashfs/inode.c,
> constants TASK_UNINTERRUPTIBLE and TASK_INTERRUPTIBLE are used, but they
> sometimes aren't defined (declared in linux/sched.h):

Thanks - Squashfs gained a lot of #includes over time, many of which I deemed unnecessary and removed in Squashfs 3.2. I obviously removed too many. Fix applied to CVS.

Phillip
[ANN] Squashfs 3.3 released
Hi,

I'm pleased to announce another release of Squashfs. This is the 22nd release in just over five years. Squashfs 3.3 has lots of nice improvements, both to the filesystem itself (bigger blocks and sparse files) and to the Squashfs-tools Mksquashfs and Unsquashfs.

The next stage after this release is to fix the one remaining blocking issue (filesystem endianness), and then try to get Squashfs mainlined into the Linux kernel again.

The list of changes from the change-log is as follows:

1. Filesystem improvements:

1.1. The maximum block size has been increased to 1 Mbyte, and the default block size has been increased to 128 Kbytes. This improves compression.

1.2. Sparse files are now supported. Sparse files are files which have large areas of unallocated data, commonly called holes. These files are now detected by Squashfs and stored more efficiently. This improves compression and read performance for sparse files.

2. Mksquashfs improvements:

2.1. Exclude files have been extended to use wildcard pattern matching and regular expressions. Support has also been added for non-anchored excludes, which means it is now possible to specify excludes which match anywhere in the filesystem (i.e. leaf files), rather than always having to specify exclude files starting from the root directory (anchored excludes).

2.2. Recovery files are now created when appending to existing Squashfs filesystems. This allows the original filesystem to be recovered if Mksquashfs aborts unexpectedly (e.g. power failure).

3. Unsquashfs improvements:

3.1. Multiple extract files can now be specified on the command line, and the files/directories to be extracted can now also be given in a file.

3.2. Extract files have been extended to use wildcard pattern matching and regular expressions.

3.3. Filename printing has been enhanced and Unsquashfs can now display filenames with file attributes ('ls -l' style output).

3.4. A -stat option has been added which displays the filesystem superblock information.
3.5. Unsquashfs now supports 1.x filesystems.

4. Miscellaneous improvements/bug fixes:

4.1. The Squashfs kernel code has been improved to use SetPageError in squashfs_readpage() if an I/O error occurs.

4.2. Fixed a Squashfs kernel code bug preventing file seeking beyond 2GB.

4.3. Mksquashfs now detects file size changes between the first-phase directory scan and the second-phase filesystem create.

Regards

Phillip Lougher
Re: [PATCH 7/17] jffs2: convert jffs2_gc_fetch_page to read_cache_page
Nate Diller wrote:
> wow, you're right. I was sure I compile-tested this ... oh, "depends on
> MTD". oops. thanks for reviewing. does it look OK to you otherwise?

Yes..

NATE
Re: [PATCH 7/17] jffs2: convert jffs2_gc_fetch_page to read_cache_page
Nate Diller wrote:
> +	page = read_cache_page(OFNI_EDONI_2SFFJ(f)->i_mapping,
> +			       start >> PAGE_CACHE_SHIFT,
> +			       (void *)jffs2_do_readpage_unlock,
> +			       OFNI_EDONI_2SFFJ(f));
> -	if (IS_ERR(pg_ptr)) {
> +	if (IS_ERR(page)) {
> 		printk(KERN_WARNING "read_cache_page() returned error: %ld\n",
> 		       PTR_ERR(pg_ptr));

should be

	printk(KERN_WARNING "read_cache_page() returned error: %ld\n",
	       PTR_ERR(page));

> -	return PTR_ERR(pg_ptr);
> +	return PTR_ERR(page);
Re: Writing a VFS driver
On 7 Jan 2007, at 23:28, Avishay Traeger wrote:
> On Sun, 2007-01-07 at 17:36 -0500, David H. Lynch Jr. wrote:
>> I am looking for something really simple to start from, but also
>> something that actually uses an underlying block device. All the
>> "tutorial" examples I have tripped over (rkfs, ols2006 samplefs) seem
>> to implement in-memory filesystems - unless I am misunderstanding how
>> VFS to block device mapping works.
> You may want to look at cramfs (fs/cramfs), which is a read-only file
> system that doesn't have much code to it.

I personally wouldn't look at Cramfs; although it is a simple block-device based filesystem, it has some elements, such as compression and non-unique inode numbers, that make it unnecessarily complicated for your needs. I would personally use Romfs as a guide. This, although an old filesystem, has all the elements you need.

You don't mention whether your filesystem has unique inode numbers, but you can use the disk location of the inode to generate the inode number. Doing this ensures your inode numbers are unique, and you can use the standard VFS iget routine, which can use this inode number to go straight to the information on disk.

You mention your filesystem aligns each inode on an 8 Kbyte boundary; however, the file data appears to follow immediately after the inode header, and hence this won't be aligned on a page boundary (4 Kbytes). Because of this you cannot use the generic read page function (block_read_full_page); however, Romfs has the same non-alignment issue and you can simply copy what it does.

Hope that helps.

Phillip
[announce] Squashfs 3.2 released
Hi,

I'm pleased to announce the release of Squashfs 3.2. NFS exporting is now supported, and the kernel code has been hardened against accidentally or maliciously corrupted filesystems. The new release correctly handles all corrupted filesystems generated by the fsfuzzer tool (written by LMH/Steve Grubb) without oopsing the kernel. This in particular fixes the MOKB (Month of Kernel Bugs) report raised against Squashfs.

Squashfs can be downloaded from http://squashfs.sourceforge.net.

The full list of changes is:

Improvements:

1. Squashfs filesystems can now be exported via NFS.
2. Unsquashfs now supports 2.x filesystems.
3. Mksquashfs now displays a progress bar.
4. The Squashfs kernel code has been hardened against accidentally or maliciously corrupted Squashfs filesystems.

Bug fixes:

5. A race condition occurring on S390 in readpage() has been fixed.
6. Odd behaviour of the MIPS memcpy in the read_data() routine has been worked around.
7. A missing cache flush in the Squashfs symlink_readpage() has been added.

Phillip
Re: Finding hardlinks
On 29 Dec 2006, at 00:44, Bryan Henderson wrote:
> Plus, in some cases optimization is a matter of life or death -- the
> extra resources (storage space, cache space, access time, etc) for the
> duplicated files might be enough to move you from practical to
> impractical.

You see this a lot with LiveCDs that use hardlinks to cram as much information onto a CDROM as possible. People copy the LiveCD, lose the hardlinks, and wonder why their recreated LiveCD filesystem doesn't fit. In fact, LiveCDs are a good example of things which are difficult to back up with the standard POSIX interface. Most LiveCDs sort the files on disk to optimise boot time, and this sort information is always lost by copying.

> People tend to demand that restore programs faithfully restore what was
> backed up. (I've even seen requirements that the inode numbers upon
> restore be the same). Given the difficulty of dealing with multi-linked
> files, not to mention various nonstandard file attributes fancy
> filesystem types have, I suppose they probably don't have really high
> expectations of that nowadays, but it's still a worthy goal not to turn
> one file into two.

It is equally important not to turn two files into one (i.e. incorrectly make hardlinks of files which are not hardlinks). Cramfs doesn't support hardlinks, but it does detect duplicates - duplicates share the file data on disk. Unfortunately, Cramfs computes inode numbers from the file data location, which means two files with the same data get the same inode number, even if they were not hardlinks in the original filesystem. If it wasn't for the fact that Cramfs always stores nlink as 1, they would look like hardlinks, and probably look sufficiently like hardlinks to fool a lot of applications. Of course, as Cramfs is a read-only filesystem, it doesn't matter unless the filesystem is copied.

I think "statement 2" is extremely important. Without this guarantee applications have to guess which files are hardlinks.
Any guessing is going to be wrong sometimes, with potentially disastrous results.

Phillip
Re: [UPDATED PATCH] fix memory corruption from misinterpreted bad_inode_ops return values
Eric Sandeen wrote:
> but Al felt that it was probably better to create an EIO-returner for each
> actual op signature. Since so few ops share a signature, I just went ahead
> & created an EIO function for each individual file & inode op that returns
> a value.

Hmm, the problem with this is it bloats bad_inode.o with lots of empty functions that return -EIO. Even though we're not interested in the parameters, GCC doesn't know this, and doesn't fold the functions down to the couple of definitions that return different types.

Text size of the original bad_inode.o:

  Idx Name  Size  VMA  LMA  File off  Algn
    0 .text 006c            0034      2**2

== 108 bytes

Size with the patch applied:

  Idx Name  Size  VMA  LMA  File off  Algn
    0 .text 016b            0034      2**2

== 363 bytes, or over three times larger!

> I originally had coded up the fix by creating a return_EIO_ macro
> for each return type,

This adds two extra functions (returners for ssize_t and long), which gives an increase in size of only 12 bytes:

  Idx Name  Size  VMA  LMA  File off  Algn
    0 .text 0078            0034      2**2

== 120 bytes. Isn't this better?

Thanks

Phillip
Re: Finding hardlinks
On 29 Dec 2006, at 08:41, Arjan van de Ven wrote:
>> I think "statement 2" is extremely important. Without this guarantee
>> applications have to guess which files are hardlinks. Any guessing is
>> going to be wrong sometimes, with potentially disastrous results.
> actually no. Statement 1 will tell them when the kernel knows they are
> hardlinks. It's the kernel's job to make a reasonable quality of
> implementation so that that works most of the time. Statement 2 requires
> that "all of the time", which suddenly creates a lot of evil corner
> cases (like "what if I mount a network filesystem twice and the server
> doesn't quite tell me enough to figure it out" cases) that make it
> impractical.

Actually no. Statement 2 for me is important in terms of archive correctness. With my "archiver" program Mksquashfs, if two files are the same, and the filesystem says they're hardlinks, I make them hardlinks in the Squashfs filesystem; otherwise they're stored as duplicates (same data, different inode). It doesn't matter much in terms of storage overhead, but it does matter if two files become one, or vice versa.

If a filesystem cannot guarantee statement 2 in the "normal" case, I wouldn't use hardlinks in that filesystem, period. Using "evil corner cases" and network filesystems as an objection is somewhat like saying that because we can't do it in every case, we shouldn't bother doing it in the "normal" case either. Disk-based filesystems should be able to handle statements 1 and 2. No-one expects things to always work correctly in "evil corner cases" or with network filesystems.

Phillip

> Think of it as the difference between good and perfect. (and perfect is
> the enemy of good :) the kernel will tell you when it knows within
> reason, via statement 1 technology. It's not perfect, but reasonably
> will be enough for normal userspace to depend on it. Your case is NOT a
> case of "I require 100%".. it's a "we'd like to take hardlinks into
> account" case.
Re: [PATCH] fix cramfs making duplicate entries in inode cache
Dave Johnson wrote:
> Patch below fixes this by making get_cramfs_inode() use the inode cache
> before blindly creating a new entry every time. This eliminates the
> duplicate inodes and duplicate buffer cache.
>
> +	struct inode * inode = iget_locked(sb, CRAMINO(cramfs_inode));

Doesn't iget_locked() assume inode numbers are unique? In Cramfs inode numbers are set to 1 for non-data inodes (fifos, sockets, devices, empty directories), i.e.

% stat device namedpipe
  File: `device'
  Size: 0  Blocks: 0  IO Block: 4096  character special file
Device: 700h/1792d  Inode: 1  Links: 1  Device type: 1,1
Access: (0644/crw-r--r--)  Uid: (0/root)  Gid: (0/root)
Access: 1970-01-01 01:00:00.0 +0100
Modify: 1970-01-01 01:00:00.0 +0100
Change: 1970-01-01 01:00:00.0 +0100
  File: `namedpipe'
  Size: 0  Blocks: 0  IO Block: 4096  fifo
Device: 700h/1792d  Inode: 1  Links: 1
Access: (0644/prw-r--r--)  Uid: (0/root)  Gid: (0/root)
Access: 1970-01-01 01:00:00.0 +0100
Modify: 1970-01-01 01:00:00.0 +0100
Change: 1970-01-01 01:00:00.0 +0100

Should iget5_locked() be used here?

Phillip
Re: files of size larger than fs size
Max wrote:
> Hello! I've discovered that it is possible to create files of size much
> larger than the partition size. I thought that this is a JFS bug, so
> I've filed a bugreport against it at
> http://bugzilla.kernel.org/show_bug.cgi?id=4345
> Detailed info and a testcase program are provided there. Later I found
> that at least the XFS and EXT3 filesystems have the same problem (though
> the resulting filesize is different for each fs). So the problem may be
> not in fs code but in some other piece of the kernel. Could kernel gurus
> please investigate the problem?

Your test case isn't writing a full file, it is only writing 4 bytes at various offsets (2^32, 2^40, 2^48, 2^56). The filesystems you mention support files with "holes" in them; in other words, they support gaps between data which don't take up any storage. Even though your test case is creating a huge file, only a couple of bytes are written, and the rest of the huge file doesn't take up any space. The behaviour you're seeing isn't a bug...

Phillip Lougher