Re: [ANN] Squashfs 3.3 released

2007-11-21 Thread Phillip Lougher

Christoph Hellwig wrote:

On Wed, Nov 21, 2007 at 02:02:43PM +0000, Phillip Lougher wrote:
Unfortunately the move to a fixed little endian filesystem will involve
another filesystem layout change.  The current filesystem layout still
uses packed bitfield structures, and it is impossible to swap these
using the standard kernel swap macros.  Removal of my routines that can
properly swap packed bitfield structures is another change demanded by
the Linux kernel mailing list.


The normal way to do it is to use shift and mask after doing the endian
conversion.  But the problem with bitfields is that they can have
different kinds of layouts depending on the compiler or ABI, which is
another reason to avoid them in ondisk/wire formats.



Yes, the bitfields are packed differently on little and big endian
architectures, which means they appear in different places in the
structure.  I want to move away from that mess when I move to little
endian only.
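
For illustration, the shift-and-mask approach Christoph describes looks
something like this minimal sketch (the field names and widths are
hypothetical, not the Squashfs layout): the on-disk value is a plain
little-endian word, converted once with the standard macro and then
unpacked by masking, so the result is the same on every architecture
and ABI.

#include <linux/types.h>
#include <asm/byteorder.h>

/*
 * Instead of a packed bitfield such as:
 *
 *     struct inode_disk {
 *             unsigned int mode:12;
 *             unsigned int uid:8;
 *             unsigned int offset:12;
 *     } __attribute__((packed));
 *
 * store a fixed little-endian word and unpack it explicitly.
 */
static inline unsigned int disk_mode(__le32 raw)
{
        return le32_to_cpu(raw) & 0xfff;                /* bits 0..11  */
}

static inline unsigned int disk_uid(__le32 raw)
{
        return (le32_to_cpu(raw) >> 12) & 0xff;         /* bits 12..19 */
}

static inline unsigned int disk_offset(__le32 raw)
{
        return (le32_to_cpu(raw) >> 20) & 0xfff;        /* bits 20..31 */
}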


Phillip



Re: [ANN] Squashfs 3.3 released

2007-11-21 Thread Phillip Lougher

Dave Jones wrote:



The biggest problem we've seen with it (aside from having to rediff
it every time we rebase when there isn't a newer upstream)


Yes, this is mainly my fault.  There was a gap of ten months between the 
3.2 release in January this year and the 3.3 release in November.  Given 
the rate of new kernel releases this wasn't really acceptable, because the 
January release was stuck with a patch for kernels no newer than 2.6.20. 
I received numerous complaints about it.


Some of you may be aware that I started work at Canonical, and this left 
almost no spare time to work on Squashfs for 9 months.



is complaints
along the lines of "my Fedora 7 kernel can't unpack squashfs images
from Fedora 5"
(s/Fedora 5/other random older distros/ )



Squashfs has backwards compatibility with older versions, and it should 
mount all older versions back to 2.0 (released May 2004).  Unfortunately 
Red Hat grabbed a CVS version of Squashfs just before the 3.0 release. 
This was development code, and release testing showed it had a bug where 
it couldn't mount older versions.  It was fixed for release.


If the format is now stable, however, it would be great to get it upstream.



The move from the 2.0 format to the later 3.0 format was mainly forced 
by the demands of the Linux kernel mailing list when I first submitted 
it in early 2005.  There was no other way to incorporate demands for 
filesystems larger than 4GB, and to provide support for "." and ".." in 
readdir, without modifying the filesystem format.


Unfortunately the move to a fixed little endian filesystem will involve
another filesystem layout change.  The current filesystem layout still
uses packed bitfield structures, and it is impossible to swap these
using the standard kernel swap macros.  Removal of my routines that can
properly swap packed bitfield structures is another change demanded by
the Linux kernel mailing list.


Once the little endian work has been done, and hopefully once it is in 
the kernel, I don't anticipate any further layout changes.
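
As a sketch of what the fixed-endian layout buys (a minimal
illustration; the structure and field names are hypothetical, not the
actual Squashfs definitions): with the on-disk structures declared in
__le32/__le64 terms, the kernel's standard conversion helpers are all
that is needed, on both little and big endian hosts.

#include <linux/types.h>
#include <asm/byteorder.h>

#define EXAMPLE_MAGIC   0x73717368      /* illustrative magic value */

/* Hypothetical on-disk superblock fragment, always little endian. */
struct example_super {
        __le32 magic;
        __le32 block_size;
        __le64 inode_table_start;
} __attribute__((packed));

static int example_check_super(const struct example_super *raw)
{
        /* One conversion per field; identical source on all hosts. */
        if (le32_to_cpu(raw->magic) != EXAMPLE_MAGIC)
                return -1;
        return 0;
}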


Phillip



Re: [PATCH 0/2] cramfs: Add mount option "swapendian"

2007-11-15 Thread Phillip Lougher

Linus Torvalds wrote:



But it should be *trivial* to compress the metadata too if the code just 
were to put the metadata at the beginning of the image, and the real data 
at the end, and then you can build up the image from both ends and you 
*can* have a fixed starting point for the data (up at the top of the 
image) even though you are changing the size of the metadata by 
compression.




I decided to compress the metadata when I designed Squashfs, a read-only 
filesystem which was inspired by Cramfs.  Squashfs stores the data at the 
front of the filesystem and puts the metadata at the end, so the data is 
always at a fixed point.  Doing that and a couple of other things allows 
the metadata to be built up and compressed in one pass while the 
filesystem is being created.  The metadata is split into an inode table 
and a directory table, which are compressed separately because they 
compress better that way.
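
To make the one-pass idea concrete, here is an illustrative sketch (the
diagram and field names are mine, not the exact Squashfs format): the
data is written from a fixed starting point, the compressed tables are
appended afterwards, and the superblock records where they ended up.

/*
 * Illustrative image layout:
 *
 *   +------------+------------------+-------------+-----------------+
 *   | superblock | data blocks ...  | inode table | directory table |
 *   +------------+------------------+-------------+-----------------+
 *                 ^ fixed start       ^ compressed metadata, written
 *                                       last and located via the
 *                                       superblock fields below
 */
struct example_layout_super {
        unsigned long long inode_table_start;     /* set after data pass */
        unsigned long long directory_table_start; /* ditto */
};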


But I literally designed and wrote the thing in a couple of days, and I 
really didn't think it through right. As a result, the metadata may be 
dense, but it's totally uncompressed. It would have been better to allow a 
less dense basic format (allowing bigger uid/gid values, and offsets and 
file sizes), but compress it.




Squashfs stores much more metadata, but as it is compressed it is much 
smaller than Cramfs.  Typically the inode table compresses to less than 
40% and the directory table to less than 50%.



So a "v2" cramfs would be a great idea.


That is what I always considered Squashfs to be.  But I also made the 
mistake of making Squashfs both little and big endian.  That's going to 
be fixed, and then I'll make a second attempt at submitting it for 
inclusion in the mainline kernel.

Phillip




Re: [ANN] Squashfs 3.3 released

2007-11-07 Thread Phillip Lougher

maximilian attems wrote:

On Mon, Nov 05, 2007 at 11:13:14AM +0000, Phillip Lougher wrote:


The next stage after this release is to fix the one remaining blocking issue
(filesystem endianness), and then try to get Squashfs mainlined into the
Linux kernel again.



that would be very cool!


Yes, it would be cool :)  Five years is a long time to maintain
something out of tree, especially recently, when there have been
so many minor changes to the VFS interface between kernel releases.


with my hat as debian kernel maintainer i'd be very relieved to see it
mainlined. i don't know of any major distro that doesn't ship it.



I don't know of any major distro that doesn't ship Squashfs either
(except arguably Slackware).  Putting my other hat on (one of the
Ubuntu kernel maintainers) I don't think Squashfs has caused
distros that many problems because it is an easy patch to apply
(it doesn't touch that many kernel files), but it is always good
to minimise the differences from the stock kernel.org kernel.

Phillip



Re: [ANN] Squashfs 3.3 released

2007-11-07 Thread Phillip Lougher

Michael Tokarev wrote:



A tiny bug[fix] I always forgot to send...  In fs/squashfs/inode.c,
the constants TASK_UNINTERRUPTIBLE and TASK_INTERRUPTIBLE are used, but
they aren't always defined (they are declared in linux/sched.h):



Thanks - Squashfs gained a lot of #includes over time, many of which I
deemed unnecessary and removed in Squashfs 3.2.  I obviously removed too
many.
Fix applied to CVS.
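
For reference, the fix amounts to including the header directly rather
than relying on it being pulled in indirectly:

#include <linux/sched.h>  /* TASK_UNINTERRUPTIBLE, TASK_INTERRUPTIBLE */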

Phillip




[ANN] Squashfs 3.3 released

2007-11-05 Thread Phillip Lougher

Hi,

I'm pleased to announce another release of Squashfs.  This is the 22nd
release in just over five years.  Squashfs 3.3 has lots of nice
improvements, both to the filesystem itself (bigger blocks and sparse
files) and to the Squashfs tools, Mksquashfs and Unsquashfs.

The next stage after this release is to fix the one remaining blocking issue
(filesystem endianness), and then try to get Squashfs mainlined into the
Linux kernel again.

The list of changes from the change-log is as follows:

1. Filesystem improvements:

   1.1. Maximum block size has been increased to 1 Mbyte, and the
default block size has been increased to 128 Kbytes.
This improves compression.

   1.2. Sparse files are now supported.  Sparse files are files
which have large areas of unallocated data, commonly called
holes.  These files are now detected by Squashfs and stored
more efficiently.  This improves compression and read
performance for sparse files (see the sketch after this change list).

2. Mksquashfs improvements:

   2.1.  Exclude files have been extended to use wildcard pattern
 matching and regular expressions.  Support has also been
 added for non-anchored excludes, which means it is
 now possible to specify excludes which match anywhere
 in the filesystem (i.e. leaf files), rather than always
 having to specify exclude files starting from the root
 directory (anchored excludes).

   2.2.  Recovery files are now created when appending to existing
 Squashfs filesystems.  This allows the original filesystem
 to be recovered if Mksquashfs aborts unexpectedly
 (e.g. power failure).

3. Unsquashfs improvements:

3.1. Multiple extract files can now be specified on the
 command line, and the files/directories to be extracted can
 now also be given in a file.

3.2. Extract files have been extended to use wildcard pattern
 matching and regular expressions.

3.3. Filename printing has been enhanced and Unsquashfs can
 now display filenames with file attributes
 ('ls -l' style output).

3.4. A -stat option has been added which displays the filesystem
 superblock information.

3.5. Unsquashfs now supports 1.x filesystems.

4. Miscellaneous improvements/bug fixes:

   4.1. Squashfs kernel code improved to use SetPageError in
squashfs_readpage() if an I/O error occurs.

   4.2. Fixed Squashfs kernel code bug preventing file
seeking beyond 2GB.

   4.3. Mksquashfs now detects file size changes between the
first-phase directory scan and the second-phase filesystem create.
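
As promised in 1.2, here is a minimal sketch of sparse-file detection
(illustrative only, not the actual Mksquashfs code): a block that reads
back as all zeroes is a hole and need not be written out, only recorded
in the block list.

#include <stddef.h>

/* Return 1 if the block is entirely zero bytes (a hole), 0 otherwise. */
static int block_is_sparse(const unsigned char *block, size_t len)
{
        size_t i;

        for (i = 0; i < len; i++)
                if (block[i] != 0)
                        return 0;
        return 1;
}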

Regards

Phillip Lougher


Re: [PATCH 7/17] jffs2: convert jffs2_gc_fetch_page to read_cache_page

2007-04-12 Thread Phillip Lougher

Nate Diller wrote:



wow, you're right.  I was sure I compile-tested this ... oh, "depends
on MTD".  oops.

thanks for reviewing.  does it look OK to you otherwise?



Yes..


NATE





Re: [PATCH 7/17] jffs2: convert jffs2_gc_fetch_page to read_cache_page

2007-04-12 Thread Phillip Lougher

Nate Diller wrote:


+   page = read_cache_page(OFNI_EDONI_2SFFJ(f)->i_mapping,
+                          start >> PAGE_CACHE_SHIFT,
+                          (void *)jffs2_do_readpage_unlock,
+                          OFNI_EDONI_2SFFJ(f));

-   if (IS_ERR(pg_ptr)) {
+   if (IS_ERR(page)) {
        printk(KERN_WARNING "read_cache_page() returned error: %ld\n",
               PTR_ERR(pg_ptr));


should be

printk(KERN_WARNING "read_cache_page() returned error: %ld\n", 
PTR_ERR(page));


-   return PTR_ERR(pg_ptr);
+   return PTR_ERR(page);



Re: Writing a VFS driver

2007-01-07 Thread Phillip Lougher


On 7 Jan 2007, at 23:28, Avishay Traeger wrote:


On Sun, 2007-01-07 at 17:36 -0500, David H. Lynch Jr. wrote:


I am looking for something really simple to start from, but also
something that actually uses an underlying block device.
All the "tutorial" examples I have tripped over (rkfs, ols2006
samplefs) seem to implement in-memory filesystems - unless I
am misunderstanding how VFS-to-block-device mapping works.


You may want to look at cramfs (fs/cramfs), which is a read-only file
system that doesn't have much code to it.



I personally wouldn't look at Cramfs; although it is a simple block 
device based filesystem, it has some elements, such as compression and 
non-unique inode numbers, that make it unnecessarily complicated for 
your needs.


I would personally use Romfs as a guide.  This, although an old 
filesystem, has all the elements you need.  You don't mention whether 
your filesystem has unique inode numbers, but you can use the disk 
location of the inode to generate the inode number.  Doing this 
ensures your inode numbers are unique, and you can use the standard 
VFS iget routine, which can use this inode number to go straight to 
the information on disk.  You mention your filesystem aligns each 
inode on an 8 Kbyte boundary; however, the file data appears to follow 
immediately after the inode header, and hence this won't be aligned 
on a page boundary (4 Kbytes).  Because of this you cannot use the 
generic readpage function (block_read_full_page).  However, Romfs has 
the same non-alignment issue, and you can simply copy what it does.
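
A rough sketch of the kind of readpage this implies (example_copy_data()
is a hypothetical helper, and this is a simplification of what Romfs
actually does): read the file's unaligned on-disk data into the page by
hand instead of calling block_read_full_page.

#include <linux/fs.h>
#include <linux/highmem.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/pagemap.h>
#include <linux/string.h>

/* Hypothetical helper: reads 'count' bytes of the file's (unaligned)
 * on-disk data, starting at file offset 'offset', into 'buf'. */
extern int example_copy_data(struct inode *inode, void *buf,
                             loff_t offset, size_t count);

static int example_readpage(struct file *file, struct page *page)
{
        struct inode *inode = page->mapping->host;
        loff_t offset = (loff_t)page->index << PAGE_CACHE_SHIFT;
        void *buf = kmap(page);
        int err = 0;

        memset(buf, 0, PAGE_CACHE_SIZE);
        if (offset < i_size_read(inode)) {
                size_t count = min_t(loff_t, PAGE_CACHE_SIZE,
                                     i_size_read(inode) - offset);
                err = example_copy_data(inode, buf, offset, count);
        }

        if (err)
                SetPageError(page);
        else
                SetPageUptodate(page);
        flush_dcache_page(page);
        kunmap(page);
        unlock_page(page);
        return err;
}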


Hope that helps.

Phillip



[announce] Squashfs 3.2 released

2007-01-06 Thread Phillip Lougher

Hi,

I'm pleased to announce the release of Squashfs 3.2.  NFS exporting 
is now supported, and the kernel code has been hardened against 
accidentally or maliciously corrupted filesystems.  The new release 
correctly handles all corrupted filesystems generated by the fsfuzzer 
tool (written by LMH/Steve Grubb) without oopsing the kernel.  This 
in particular fixes the MOKB (Month of Kernel Bugs) report raised 
against Squashfs.


Squashfs can be downloaded from http://squashfs.sourceforge.net.  The 
full list of changes is:


Improvements:

1. Squashfs filesystems can now be exported via NFS.
2. Unsquashfs now supports 2.x filesystems.
3. Mksquashfs now displays a progress bar.
4. Squashfs kernel code has been hardened against accidentally or
maliciously corrupted Squashfs filesystems.

Bug fixes:

   5. Race condition occurring on S390 in readpage() fixed.
   6. Odd behaviour of MIPS memcpy in read_data() routine worked around.
   7. Missing cache_flush in Squashfs symlink_readpage() added.

Phillip



Re: Finding hardlinks

2007-01-06 Thread Phillip Lougher


On 29 Dec 2006, at 00:44, Bryan Henderson wrote:


Plus, in some cases optimization is a matter of life or death -- the
extra resources (storage space, cache space, access time, etc) for the
duplicated files might be enough to move you from practical to
impractical.



You see this a lot with LiveCDs that use hardlinks to cram as much 
information onto a CDROM as possible.  People copy the LiveCD, lose 
the hardlinks, and wonder why their recreated LiveCD filesystem 
doesn't fit.  In fact, LiveCDs are a good example of things which are 
difficult to back up with the standard POSIX interface.  Most LiveCDs 
sort the files on disk to optimise boot time, and this sort 
information is always lost by copying.


People tend to demand that restore programs faithfully restore what was
backed up.  (I've even seen requirements that the inode numbers upon
restore be the same).  Given the difficulty of dealing with multi-linked
files, not to mention various nonstandard file attributes fancy
filesystem types have, I suppose they probably don't have really high
expectations of that nowadays, but it's still a worthy goal not to turn
one file into two.




It is also equally important to not turn two files into one (i.e.  
incorrectly make hardlinks of files which are not hardlinks).
Cramfs doesn't support hardlinks, but it does detect duplicates -  
duplicates share the file data on disk.  Unfortunately, cramfs  
computes inode numbers from the file data location, which means two  
files with the same data get the same inode number, even if they were  
not hardlinks in the original filesystem.  If it wasn't for the fact  
that cramfs always stores nlink as 1, they would look like hardlinks,  
and probably look sufficiently like hardlinks to fool a lot of  
applications.  Of course as cramfs is a read-only filesystem it  
doesn't matter unless the filesystem is copied.


I think "statement 2" is extremely important.  Without this guarantee  
applications have to guess which files are hardlinks.  Any guessing  
is going to be be got wrong sometimes with potentially disastrous  
results.


Phillip



Re: [UPDATED PATCH] fix memory corruption from misinterpreted bad_inode_ops return values

2007-01-05 Thread Phillip Lougher

Eric Sandeen wrote:
>but Al felt that it was probably better to create an EIO-returner for each 
>actual op signature.  Since so few ops share a signature, I just went ahead 
>& created an EIO function for each individual file & inode op that returns
>a value.

Hmm, the problem with this is that it bloats bad_inode.o with lots of
empty functions that return -EIO.  Even though we're not interested in
the parameters, GCC doesn't know this, and doesn't fold the functions
down to the couple of definitions with distinct return types.

Text size of original bad_inode.o:

Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000006c  00000000  00000000  00000034  2**2

== 108 bytes

Size with patch applied:

Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000016b  00000000  00000000  00000034  2**2

== 363 bytes, or over three times larger!

>I originally had coded up the fix by creating a return_EIO_ macro 
>for each return type,

This adds only two extra functions (returning ssize_t and long), which
gives an increase in size of only 12 bytes:

Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000078  00000000  00000000  00000034  2**2

== 120 bytes.

Isn't this better?
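
For comparison, a minimal sketch of the macro approach (the names are
illustrative): one stub per distinct return type, which the op tables
then share for every operation with that return type.

#include <linux/types.h>
#include <linux/errno.h>

/* One -EIO stub per return type.  The stubs take no arguments and rely
 * on the calling convention ignoring the (unused) arguments callers
 * pass, which is how the original bad_inode stubs already worked. */
#define define_return_EIO(type)                 \
        static type return_EIO_##type(void)     \
        {                                       \
                return -EIO;                    \
        }

define_return_EIO(int)
define_return_EIO(ssize_t)
define_return_EIO(long)

The stub addresses are then cast into the individual file and inode op
slots; the two extra functions mentioned above are the ssize_t and long
variants beyond the plain int one.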

Thanks

Phillip




Re: Finding hardlinks

2006-12-29 Thread Phillip Lougher


On 29 Dec 2006, at 08:41, Arjan van de Ven wrote:




I think "statement 2" is extremely important.  Without this guarantee
applications have to guess which files are hardlinks.  Any guessing
is going to be be got wrong sometimes with potentially disastrous
results.


actually no. Statement 1 will tell them when the kernel knows they are
hardlinks. It's the kernel's job to make a reasonable quality of
implementation so that that works most of the time.

Statement 2 requires that "all of the time", which suddenly creates a lot
of evil corner cases (like "what if I mount a network filesystem twice
and the server doesn't quite tell me enough to figure it out" cases) to
make it impractical.



Actually no.  Statement 2 for me is important in terms of archive 
correctness.  With my "archiver" program Mksquashfs, if two files 
are the same and the filesystem says they're hardlinks, I make them 
hardlinks in the Squashfs filesystem; otherwise they're stored as 
duplicates (same data, different inode).  It doesn't matter much in 
terms of storage overhead, but it does matter if two files become 
one, or vice versa.
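
In practice the "filesystem says they're hardlinks" test boils down to
stat(2) data; a userspace sketch (not the Mksquashfs source):

#include <sys/stat.h>

/* Hard links share a (device, inode) pair; st_nlink > 1 says the file
 * has more than one directory entry somewhere. */
static int same_file(const struct stat *a, const struct stat *b)
{
        return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

static int is_hardlinked(const struct stat *st)
{
        return st->st_nlink > 1;
}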


If a filesystem cannot guarantee statement 2 in the "normal" case, I 
wouldn't use hardlinks in that filesystem, period.  Using "evil corner 
cases" and network filesystems as an objection is somewhat like saying 
that because we can't do it in every case, we shouldn't bother doing it 
in the "normal" case either.  Disk based filesystems should be able to 
handle statements 1 and 2.  No-one expects things to always work 
correctly in "evil corner cases" or with network filesystems.


Phillip


Think of it as the difference between good and perfect.
(and perfect is the enemy of good :)

the kernel will tell you when it knows within reason, via statement 1
technology. It's not perfect, but reasonably will be enough for normal
userspace to depend on it. Your case is NOT a case of "I require 100%"..
it's a "we'd like to take hardlinks into account" case.


--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via
http://www.linuxfirmwarekit.org






Re: [PATCH] fix cramfs making duplicate entries in inode cache

2005-08-19 Thread Phillip Lougher

Dave Johnson wrote:



Patch below fixes this by making get_cramfs_inode() use the inode
cache before blindly creating a new entry every time.  This eliminates
the duplicate inodes and duplicate buffer cache.


> +  struct inode * inode = iget_locked(sb, CRAMINO(cramfs_inode));

Doesn't iget_locked() assume inode numbers are unique?

In Cramfs, inode numbers are set to 1 for non-data inodes (fifos,
sockets, devices, empty directories), i.e.:


%stat device namedpipe
  File: `device'
  Size: 0               Blocks: 0          IO Block: 4096   character special file
Device: 700h/1792d      Inode: 1           Links: 1     Device type: 1,1
Access: (0644/crw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 1970-01-01 01:00:00.000000000 +0100
Modify: 1970-01-01 01:00:00.000000000 +0100
Change: 1970-01-01 01:00:00.000000000 +0100
  File: `namedpipe'
  Size: 0               Blocks: 0          IO Block: 4096   fifo
Device: 700h/1792d      Inode: 1           Links: 1
Access: (0644/prw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 1970-01-01 01:00:00.000000000 +0100
Modify: 1970-01-01 01:00:00.000000000 +0100
Change: 1970-01-01 01:00:00.000000000 +0100

Should iget5_locked() be used here?
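
A sketch of how iget5_locked() would handle this (the callbacks and the
use of i_private are assumptions of this sketch, not existing cramfs
code): the extra test/set callbacks let the inode cache distinguish
inodes that share an inode number, for example by also comparing the
on-disk offset.

#include <linux/fs.h>

struct cramfs_iget_args {
        unsigned long ino;
        unsigned long offset;   /* disambiguates the shared ino 1 */
};

static int cramfs_test_inode(struct inode *inode, void *opaque)
{
        struct cramfs_iget_args *args = opaque;

        /* Keeping the offset in i_private is an assumption here. */
        return inode->i_ino == args->ino &&
               (unsigned long)inode->i_private == args->offset;
}

static int cramfs_set_inode(struct inode *inode, void *opaque)
{
        struct cramfs_iget_args *args = opaque;

        inode->i_ino = args->ino;
        inode->i_private = (void *)args->offset;
        return 0;
}

/*
 * iget5_locked() only returns a cached inode when the test callback
 * agrees, so two ino-1 inodes at different offsets stay distinct:
 *
 *      struct cramfs_iget_args args = { .ino = ino, .offset = offset };
 *      inode = iget5_locked(sb, ino, cramfs_test_inode,
 *                           cramfs_set_inode, &args);
 */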

Phillip


Re: files of size larger than fs size

2005-03-15 Thread Phillip Lougher

Max wrote:

Hello!
I've discovered that it is possible to create files of size much larger 
than partition size.
I thought that this is JFS bug, so I've filed a bugreport against it at 
http://bugzilla.kernel.org/show_bug.cgi?id=4345
Detailed info and testcase program are provided there.

Later I've found that at least XFS and EXT3 filesystems have the same 
problem (though the resulting filesize is different for each fs).  So the 
problem may not be in the fs code but in some other piece of the kernel.

Could kernel gurus please investigate the problem?

Your test case isn't writing a full file, it is only writing 4 bytes at 
various offsets (2^32, 2^40, 2^48, 2^56).

The filesystems you mention support files with "holes" in them; in 
other words, they support gaps between data which don't take up any 
storage.

Even though your test case creates a huge file, only a few bytes are 
actually written; the rest of the file doesn't take up any space.
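
The effect is easy to reproduce; a small userspace sketch:

#define _FILE_OFFSET_BITS 64    /* so off_t can hold the large offset */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
        struct stat st;
        int fd = open("holes", O_WRONLY | O_CREAT | O_TRUNC, 0644);

        if (fd < 0)
                return 1;
        /* 4 bytes at a 1 TB offset: everything before them is a hole. */
        pwrite(fd, "data", 4, (off_t)1 << 40);
        fstat(fd, &st);
        /* st_size is huge; st_blocks shows almost nothing allocated. */
        printf("size=%lld blocks=%lld\n",
               (long long)st.st_size, (long long)st.st_blocks);
        close(fd);
        return 0;
}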

The behaviour you're seeing isn't a bug...

Phillip Lougher