Re: [PATCH 1/12] ext3 balloc: reset windowsz when full

2006-11-28 Thread Hugh Dickins
On Tue, 28 Nov 2006, Mingming Cao wrote:

> Port a series ext2 balloc patches from Hugh to ext3/4. The first 6
> patches are against ext3, and the rest are aginst ext4.

Thanks for all that, Mingming:
whichever is appropriate, all twelve
Acked-by: Hugh Dickins <[EMAIL PROTECTED]>
or
Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>

I'll think about your other mails, those that need further thought,
later on: I need to pin down more accurately the repetitious sequence of
reservations in the mistaken case - maybe it indicated further issues,
maybe not; and I need to consider our different views of the my_rsv
find_next_usable_block.

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 9/12] ext4 balloc: fix off-by-one against rsv_end

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: fix off-by-one against rsv_end
From: Hugh Dickins <[EMAIL PROTECTED]>

rsv_end is the last block within the reservation, so alloc_new_reservation
should accept start_block == rsv_end as success.
--
Sync up  a ext2 reservation fix in ext4
Signed-Off-By: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-rsv_end 
fs/ext4/balloc.c
--- 
linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-rsv_end
2006-11-28 19:37:15.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:15.0 
-0800
@@ -1165,7 +1165,7 @@ retry:
 * check if the first free block is within the
 * free space we just reserved
 */
-   if (start_block >= my_rsv->rsv_start && start_block < my_rsv->rsv_end)
+   if (start_block >= my_rsv->rsv_start && start_block <= my_rsv->rsv_end)
return 0;   /* success */
/*
 * if the first free bit we found is out of the reservable space

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/12] ext3 balloc: fix off-by-one against grp_goal

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: fix off-by-one against grp_goal
From: Hugh Dickins <[EMAIL PROTECTED]>

grp_goal 0 is a genuine goal (unlike -1), so ext2_try_to_allocate_with_rsv
should treat it as such.
--

Sync up with ext2 reservation fix  in ext3

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-grp_goal 
fs/ext3/balloc.c
--- 
linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-grp_goal   
2006-11-28 19:36:48.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:48.0 
-0800
@@ -1271,7 +1271,7 @@ ext3_try_to_allocate_with_rsv(struct sup
}
/*
 * grp_goal is a group relative block number (if there is a goal)
-* 0 < grp_goal < EXT3_BLOCKS_PER_GROUP(sb)
+* 0 <= grp_goal < EXT3_BLOCKS_PER_GROUP(sb)
 * first block is a filesystem wide block number
 * first block is the block number of the first block in this group
 */
@@ -1307,7 +1307,7 @@ ext3_try_to_allocate_with_rsv(struct sup
if (!goal_in_my_reservation(&my_rsv->rsv_window,
grp_goal, group, sb))
grp_goal = -1;
-   } else if (grp_goal > 0) {
+   } else if (grp_goal >= 0) {
int curr = my_rsv->rsv_end -
(grp_goal + group_first_block) + 1;
 

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/12] ext2 balloc: fix _with_rsv freeze

2006-11-28 Thread Mingming Cao

Sync up a reservation fix from ext2 in ext3
--
Subject: ext2 balloc: fix _with_rsv freeze
From: Hugh Dickins <[EMAIL PROTECTED]>

After several days of testing ext2 with reservations, it got caught inside
ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding on
the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to find the
free block guaranteed to be included (unless there's contention).

Fix the range to find_next_usable_block's memscan: the scan from "here"
(0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes not 2
(the relevant bytes of bitmap in this case being f7 df ff - none 00, but the
premature cutoff implying that the last was found 00).

Is this a problem for mainline ext2?  No, because the "size" in its memscan is
always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a multiple of
8.  Is this a problem for ext3 or ext4?  No, because they have an additional
extN_test_allocatable test which rescues them from the error.

--

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-fix-_with_rsv-freeze fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-fix-_with_rsv-freeze  
2006-11-28 19:36:55.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:55.0 
-0800
@@ -730,7 +730,7 @@ find_next_usable_block(ext3_grpblk_t sta
here = 0;
 
p = ((char *)bh->b_data) + (here >> 3);
-   r = memscan(p, 0, (maxblocks - here + 7) >> 3);
+   r = memscan(p, 0, ((maxblocks + 7) >> 3) - (here >> 3));
next = (r - ((char *)bh->b_data)) << 3;
 
if (next < maxblocks && next >= start && ext3_test_allocatable(next, 
bh))

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 12/12] ext3 balloc: fix _with_rsv freeze

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: fix _with_rsv freeze
From: Hugh Dickins <[EMAIL PROTECTED]>

After several days of testing ext2 with reservations, it got caught inside
ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding on
the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to find the
free block guaranteed to be included (unless there's contention).

Fix the range to find_next_usable_block's memscan: the scan from "here"
(0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes not 2
(the relevant bytes of bitmap in this case being f7 df ff - none 00, but the
premature cutoff implying that the last was found 00).

Is this a problem for mainline ext2?  No, because the "size" in its memscan is
always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a multiple of
8.  Is this a problem for ext3 or ext4?  No, because they have an additional
extN_test_allocatable test which rescues them from the error.

--

Sync up a reservation fix from ext2 in ext4
Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-fix-_with_rsv-freeze fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-fix-_with_rsv-freeze  
2006-11-28 19:37:12.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:12.0 
-0800
@@ -747,7 +747,7 @@ find_next_usable_block(ext4_grpblk_t sta
here = 0;
 
p = ((char *)bh->b_data) + (here >> 3);
-   r = memscan(p, 0, (maxblocks - here + 7) >> 3);
+   r = memscan(p, 0, ((maxblocks + 7) >> 3 - (here >> 3));
next = (r - ((char *)bh->b_data)) << 3;
 
if (next < maxblocks && next >= start && ext4_test_allocatable(next, 
bh))

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 8/12] ext4 balloc: fix off-by-one against grp_goal

2006-11-28 Thread Mingming Cao

Subject: ext2 balloc: fix off-by-one against grp_goal
From: Hugh Dickins <[EMAIL PROTECTED]>

grp_goal 0 is a genuine goal (unlike -1), so ext2_try_to_allocate_with_rsv
should treat it as such.
--
Sync up with ext2 reservation fix  in ext4
Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-grp_goal 
fs/ext4/balloc.c
--- 
linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-fix-off-by-one-against-grp_goal   
2006-11-28 19:37:05.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:05.0 
-0800
@@ -1288,7 +1288,7 @@ ext4_try_to_allocate_with_rsv(struct sup
}
/*
 * grp_goal is a group relative block number (if there is a goal)
-* 0 < grp_goal < EXT4_BLOCKS_PER_GROUP(sb)
+* 0 <= grp_goal < EXT4_BLOCKS_PER_GROUP(sb)
 * first block is a filesystem wide block number
 * first block is the block number of the first block in this group
 */
@@ -1324,7 +1324,7 @@ ext4_try_to_allocate_with_rsv(struct sup
if (!goal_in_my_reservation(&my_rsv->rsv_window,
grp_goal, group, sb))
grp_goal = -1;
-   } else if (grp_goal > 0) {
+   } else if (grp_goal >= 0) {
int curr = my_rsv->rsv_end -
(grp_goal + group_first_block) + 1;
 

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/12] ext4 balloc: use io_error label

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: use io_error label
From: Hugh Dickins <[EMAIL PROTECTED]>

ext2_new_blocks has a nice io_error label for setting -EIO, so goto that in
the one place that doesn't already use it.

--
Fix it in ext4_new_blocks.

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-use-io_error-label fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-use-io_error-label
2006-11-28 19:42:45.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:43:21.0 
-0800
@@ -1529,10 +1529,8 @@ retry_alloc:
if (group_no >= ngroups)
group_no = 0;
gdp = ext4_get_group_desc(sb, group_no, &gdp_bh);
-   if (!gdp) {
-   *errp = -EIO;
-   goto out;
-   }
+   if (!gdp)
+   goto io_error;
free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
/*
 * skip this group if the number of

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/12] ext4 balloc: say rb_entry not list_entry

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: say rb_entry not list_entry
From: Hugh Dickins <[EMAIL PROTECTED]>

The reservations tree is an rb_tree not a list, so it's less confusing to use
rb_entry() than list_entry() - though they're both just container_of().

--

Sync up this fix in ext4

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff -puN fs/ext4/balloc.c~ext4-balloc-say-rb_entry-not-list_entry 
fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4-balloc-say-rb_entry-not-list_entry   
2006-11-28 19:37:08.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:08.0 
-0800
@@ -165,7 +165,7 @@ restart:
 
printk("Block Allocation Reservation Windows Map (%s):\n", fn);
while (n) {
-   rsv = list_entry(n, struct ext4_reserve_window_node, rsv_node);
+   rsv = rb_entry(n, struct ext4_reserve_window_node, rsv_node);
if (verbose)
printk("reservation window 0x%p "
   "start:  %llu, end:  %llu\n",
@@ -966,7 +966,7 @@ static int find_next_reservable_window(
 
prev = rsv;
next = rb_next(&rsv->rsv_node);
-   rsv = list_entry(next,struct ext4_reserve_window_node,rsv_node);
+   rsv = rb_entry(next,struct ext4_reserve_window_node,rsv_node);
 
/*
 * Reached the last reservation, we can just append to the
@@ -1210,7 +1210,7 @@ static void try_to_extend_reservation(st
if (!next)
my_rsv->rsv_end += size;
else {
-   next_rsv = list_entry(next, struct ext4_reserve_window_node, 
rsv_node);
+   next_rsv = rb_entry(next, struct ext4_reserve_window_node, 
rsv_node);
 
if ((next_rsv->rsv_start - my_rsv->rsv_end - 1) >= size)
my_rsv->rsv_end += size;

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/12] ext4 balloc: reset windowsz when full

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: reset windowsz when full
From: Hugh Dickins <[EMAIL PROTECTED]>

ext2_new_blocks should reset the reservation window size to 0 when squeezing
the last blocks out of an almost full filesystem, so the retry doesn't skip
any groups with less than half that free, reporting ENOSPC too soon.

--
Sync up reservation fix from ext2 in ext4

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext4/balloc.c |1 +
 1 file changed, 1 insertion(+)

diff -puN fs/ext4/balloc.c~ext4_reset_windowsz_in_full_fs fs/ext4/balloc.c
--- linux-2.6.19-rc5/fs/ext4/balloc.c~ext4_reset_windowsz_in_full_fs
2006-11-28 19:37:01.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext4/balloc.c   2006-11-28 19:37:01.0 
-0800
@@ -1566,6 +1566,7 @@ retry_alloc:
 */
if (my_rsv) {
my_rsv = NULL;
+   windowsz = 0;
group_no = goal_group;
goto retry_alloc;
}

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/12] ext3 balloc: use io_error label

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: use io_error label
From: Hugh Dickins <[EMAIL PROTECTED]>

ext2_new_blocks has a nice io_error label for setting -EIO, so goto that in
the one place that doesn't already use it.

--
Fix it in ext3_new_blocks.

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-use-io_error-label fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-use-io_error-label
2006-11-28 19:45:51.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:45:51.0 
-0800
@@ -1515,10 +1515,8 @@ retry_alloc:
if (group_no >= ngroups)
group_no = 0;
gdp = ext3_get_group_desc(sb, group_no, &gdp_bh);
-   if (!gdp) {
-   *errp = -EIO;
-   goto out;
-   }
+   if (!gdp)
+   goto io_error;
free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
/*
 * skip this group if the number of

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/12] ext3 balloc: fix off-by-one against rsv_end

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: fix off-by-one against rsv_end
From: Hugh Dickins <[EMAIL PROTECTED]>

rsv_end is the last block within the reservation, so alloc_new_reservation
should accept start_block == rsv_end as success.
--
Sync up  a ext2 reservation fix in ext3

Signed-Off-By: Mingming Cao <[EMAIL PROTECTED]>


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-rsv_end 
fs/ext3/balloc.c
--- 
linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-fix-off-by-one-against-rsv_end
2006-11-28 19:36:58.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:58.0 
-0800
@@ -1148,7 +1148,7 @@ retry:
 * check if the first free block is within the
 * free space we just reserved
 */
-   if (start_block >= my_rsv->rsv_start && start_block < my_rsv->rsv_end)
+   if (start_block >= my_rsv->rsv_start && start_block <= my_rsv->rsv_end)
return 0;   /* success */
/*
 * if the first free bit we found is out of the reservable space

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/12] ext3 balloc: say rb_entry not list_entry

2006-11-28 Thread Mingming Cao

--
Subject: ext2 balloc: say rb_entry not list_entry
From: Hugh Dickins <[EMAIL PROTECTED]>

The reservations tree is an rb_tree not a list, so it's less confusing to use
rb_entry() than list_entry() - though they're both just container_of().

--

Sync up this fix in ext3

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff -puN fs/ext3/balloc.c~ext3-balloc-say-rb_entry-not-list_entry 
fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3-balloc-say-rb_entry-not-list_entry   
2006-11-28 19:36:52.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:52.0 
-0800
@@ -144,7 +144,7 @@ restart:
 
printk("Block Allocation Reservation Windows Map (%s):\n", fn);
while (n) {
-   rsv = list_entry(n, struct ext3_reserve_window_node, rsv_node);
+   rsv = rb_entry(n, struct ext3_reserve_window_node, rsv_node);
if (verbose)
printk("reservation window 0x%p "
   "start:  %lu, end:  %lu\n",
@@ -949,7 +949,7 @@ static int find_next_reservable_window(
 
prev = rsv;
next = rb_next(&rsv->rsv_node);
-   rsv = list_entry(next,struct ext3_reserve_window_node,rsv_node);
+   rsv = rb_entry(next,struct ext3_reserve_window_node,rsv_node);
 
/*
 * Reached the last reservation, we can just append to the
@@ -1193,7 +1193,7 @@ static void try_to_extend_reservation(st
if (!next)
my_rsv->rsv_end += size;
else {
-   next_rsv = list_entry(next, struct ext3_reserve_window_node, 
rsv_node);
+   next_rsv = rb_entry(next, struct ext3_reserve_window_node, 
rsv_node);
 
if ((next_rsv->rsv_start - my_rsv->rsv_end - 1) >= size)
my_rsv->rsv_end += size;

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/12] ext3 balloc: reset windowsz when full

2006-11-28 Thread Mingming Cao
Port a series ext2 balloc patches from Hugh to ext3/4. The first 6
patches are against ext3, and the rest are aginst ext4.


--
Subject: ext2 balloc: reset windowsz when full
From: Hugh Dickins <[EMAIL PROTECTED]>

ext2_new_blocks should reset the reservation window size to 0 when squeezing
the last blocks out of an almost full filesystem, so the retry doesn't skip
any groups with less than half that free, reporting ENOSPC too soon.

--
Sync up reservation fix from ext2

Signed-off-by: Mingming Cao <[EMAIL PROTECTED]>
---


---

 linux-2.6.19-rc5-cmm/fs/ext3/balloc.c |1 +
 1 file changed, 1 insertion(+)

diff -puN fs/ext3/balloc.c~ext3_reset_windowsz_in_full_fs fs/ext3/balloc.c
--- linux-2.6.19-rc5/fs/ext3/balloc.c~ext3_reset_windowsz_in_full_fs
2006-11-28 19:36:41.0 -0800
+++ linux-2.6.19-rc5-cmm/fs/ext3/balloc.c   2006-11-28 19:36:41.0 
-0800
@@ -1552,6 +1552,7 @@ retry_alloc:
 */
if (my_rsv) {
my_rsv = NULL;
+   windowsz = 0;
group_no = goal_group;
goto retry_alloc;
}

_


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: a question about ext4_fsblk_t

2006-11-28 Thread Mingming Cao

On Sun, 2006-10-22 at 16:50 +0800, guomingyang wrote:
> 
>I also find many places where the block number type is not changed in 
> namei.c and dir.c. Why? 

They are all file logical blocks, ext4_fsblk_t is for on disk blocks.

Takashi Sato has a patch to define all file logical blocks as
ext3_fileblk_t, it did not make to mainline before ext4 was forked.
Probably we should bring the to ext3/4 to clarify the confusions.

Mingming
>Thank you to anyone who offer help
> >Hello everyone:
> >
> > I am a student interesting in linux filesystem, I have a 
> > problem about the replace of block number type to ext4_fsblk_t. Is it 
> > complete? Or has it passed all the test? Because I have found a place like 
> > this in linux-2.6.19-rc2/fs/ext4/inode.c 
> >
> >
> >
> >+struct buffer_head *ext4_bread(handle_t *handle, struct inode *inode,
> >+   int block, int create, int *err)
> >
> >
> >the block's type is not changed. Although I think it will bring no mistake, 
> >I doubt about why not change it. Thank you !
> >


> >
> >guomingyang
> >
> >-
> >To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> >the body of a message to [EMAIL PROTECTED]
> >More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> guomingyang
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/6] ext2 balloc: fix _with_rsv freeze

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 20:07 +, Hugh Dickins wrote:
> On Tue, 28 Nov 2006, Mingming Cao wrote:
> > On Tue, 2006-11-28 at 17:40 +, Hugh Dickins wrote:
> > > After several days of testing ext2 with reservations, it got caught inside
> > > ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding
> > > on the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to
> > > find the free block guaranteed to be included (unless there's contention).
> > >
> >
> > Hmm, I suspect there is other issue: alloc_new_reservation should not
> > repeatedly allocating the same window, if ext2_try_to_allocate
> > repeatedly fails to find a free block in that window.
> > find_next_reservable_window() takes my_rsv (the old window that he
> > thinks there is no free block) as a guide to find a window "after" the
> > end block of my_rsv, so how could this happen?
> 
> Hmmm.  I haven't studied that part of the code, but what you say sounds
> sensible: that would leave more to be explained, yes.  I guess it would
> happen if all the rest of the bitmap were either allocated or reserved,

But bitmap_search_next_usable_block() will fail in the case the rest of
bitmap were allocated, and find_next_reservable_space() will fail in the
case the rest of group were all reserved. alloc_new_reservation() should
not create a new window in this case. 

> but I don't believe that was the case here: I have noted that the map
> was all 00s from offset 0x1ae onwards, plenty unallocated; I've not
> recorded the following reservations, but it seems unlikely they covered
> the remaining free area (and still covered it even when the remaining
> tasks got to the point of just waiting for this one).
> 
> >
> > > Fix the range to find_next_usable_block's memscan: the scan from "here"
> > > (0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes
> > > not 2 (the relevant bytes of bitmap in this case being f7 df ff - none
> > > 00, but the premature cutoff implying that the last was found 00).
> > >
> >
> > alloc_new_reservation() reserved a window with free block, when come to
> > the time to claim it, it scans the window again. So it seems that the
> > range of the the scan is too small:
> 
> The range of the scan is 1 byte too small in this case, yes.
> 
> >
> > p = ((char *)bh->b_data) + (here >> 3);
> > r = memscan(p, 0, (maxblocks - here + 7) >> 3);
> > next = (r - ((char *)bh->b_data)) << 3;
> >
> > ->   next is -1
> 
> I don't understand you: next was not -1, it was 0xd08.
> 
> > if (next < maxblocks && next >= here)
> > return next;
> >
> > --> falls to false branch
> 
> No, it passed the "next < maxblocks && next >= here" test
> (maxblocks being 0xd0e and here being 0xcfe), so returned
> pointing to an allocated block - then the caller finds it
> cannot set the bit.
> 

Apologies for the confusion. I thought ext2_try_to_allocate() failed
because we could not find a free block in the reserved window (i.e.,
find_next_usable_block() failed)

It seems in this case, find_next_usable_block() incorrectly returns a
bit it *thinks* free, but ext2_try_to_allocate() fails to claim it as
it's being marked as used.


So yes, Acked this fix. Thanks.

> >
> > here = bitmap_search_next_usable_block(here, bh, maxblocks);
> > return here;
> >
> > So we failed to find a free byte in the range.  That's seems fine to me.
> > It's only a nice thing to have -- try to allocate a block in a place
> > where it's neighbors are all free also. If it fails, it will search the
> > window bit by bit. So I don't understand why it is not being recovered
> > by bitmap_search_next_usable_block(), which test the bitmap bit by bit?
> 
> It already returned, it doesn't reach that line.
> 
Yep.

> >
> > > Is this a problem for mainline ext2?  No, because the "size" in its 
> > > memscan
> > > is always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a
> > > multiple of 8.  Is this a problem for ext3 or ext4?  No, because they have
> > > an additional extN_test_allocatable test which rescues them from the 
> > > error.
> > >
> > Hmm, if the error is it prematurely think there is no free block in the
> > range (bitmap on disk), then even in ext3/4, it will not bother checking
> > the jbd copy of the bitmap. I am not sure this is the cause that ext3/4
> > may not has the problem.
> 
> In the ext3/4 case, it indeed won't bother to check the jbd copy
> (having found this bitmap bit set), it'll fall through to the
> bitmap_search_next_usable_block you indicated above,
> and that should do the right thing, finding the first
> free bit in the area originally reserved.
> 
Make sense.  
> >
> > > But the bigger question is, why does the my_rsv case come here to
> > > find_next_usable_block at all?
> >
> > Because grp_goal is -1?
> 
> Well, yes, but my point is that we've got a reservation, and we're
> hoping to al

Re: Boot failure with ext2 and initrds

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 14:33 -0800, Andrew Morton wrote: 
> On Tue, 28 Nov 2006 13:04:53 -0800
> Mingming Cao <[EMAIL PROTECTED]> wrote:
> 
> > Thanks, I have acked most of them, and will port them to ext3/4 soon.
> 
> You've acked #2 and #3.  #4, #5 and #6 remain un-commented-upon and #1 is
> unclear?

sorry, just acked #4, #5 and #6. I mean to do that before I said so but
they did not go out of my outbox till now ( I was out for a few hours).

#1 looks correct to me now, just thought there are other issues. I will
comment on that one in that thread.


Mingming

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/6] ext2 balloc: say rb_entry not list_entry

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 17:43 +, Hugh Dickins wrote:
> The reservations tree is an rb_tree not a list, so it's less confusing to
> use rb_entry() than list_entry() - though they're both just container_of().
> 
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> ---
> 

Acked.

Thanks,
Mingming
>  fs/ext2/balloc.c |6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> --- 2.6.19-rc6-mm2/fs/ext2/balloc.c   2006-11-24 08:18:02.0 +
> +++ linux/fs/ext2/balloc.c2006-11-27 19:28:41.0 +
> @@ -156,7 +156,7 @@ restart:
> 
>   printk("Block Allocation Reservation Windows Map (%s):\n", fn);
>   while (n) {
> - rsv = list_entry(n, struct ext2_reserve_window_node, rsv_node);
> + rsv = rb_entry(n, struct ext2_reserve_window_node, rsv_node);
>   if (verbose)
>   printk("reservation window 0x%p "
>   "start: %lu, end: %lu\n",
> @@ -753,7 +753,7 @@ static int find_next_reservable_window(
> 
>   prev = rsv;
>   next = rb_next(&rsv->rsv_node);
> - rsv = list_entry(next,struct ext2_reserve_window_node,rsv_node);
> + rsv = rb_entry(next,struct ext2_reserve_window_node,rsv_node);
> 
>   /*
>* Reached the last reservation, we can just append to the
> @@ -995,7 +995,7 @@ static void try_to_extend_reservation(st
>   if (!next)
>   my_rsv->rsv_end += size;
>   else {
> - next_rsv = list_entry(next, struct ext2_reserve_window_node, 
> rsv_node);
> + next_rsv = rb_entry(next, struct ext2_reserve_window_node, 
> rsv_node);
> 
>   if ((next_rsv->rsv_start - my_rsv->rsv_end - 1) >= size)
>   my_rsv->rsv_end += size;

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/6] ext2 balloc: use io_error label

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 17:44 +, Hugh Dickins wrote:
> ext2_new_blocks has a nice io_error label for setting -EIO,
> so goto that in the one place that doesn't already use it.
> 
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> ---
> 

Acked, 

Mingming
>  fs/ext2/balloc.c |7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> --- 2.6.19-rc6-mm2/fs/ext2/balloc.c   2006-11-24 08:18:02.0 +
> +++ linux/fs/ext2/balloc.c2006-11-27 19:28:41.0 +
> @@ -1258,10 +1258,9 @@ retry_alloc:
>   if (group_no >= ngroups)
>   group_no = 0;
>   gdp = ext2_get_group_desc(sb, group_no, &gdp_bh);
> - if (!gdp) {
> - *errp = -EIO;
> - goto out;
> - }
> + if (!gdp)
> + goto io_error;
> +
>   free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
>   /*
>* skip this group if the number of
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/6] ext2 balloc: fix off-by-one against grp_goal

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 17:42 +, Hugh Dickins wrote:
> grp_goal 0 is a genuine goal (unlike -1),
> so ext2_try_to_allocate_with_rsv should treat it as such.
> 
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>

Acked.

Thanks,

Mingming
> ---
> 
>  fs/ext2/balloc.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> --- 2.6.19-rc6-mm2/fs/ext2/balloc.c   2006-11-24 08:18:02.0 +
> +++ linux/fs/ext2/balloc.c2006-11-27 19:28:41.0 +
> @@ -1053,7 +1053,7 @@ ext2_try_to_allocate_with_rsv(struct sup
>   }
>   /*
>* grp_goal is a group relative block number (if there is a goal)
> -  * 0 < grp_goal < EXT2_BLOCKS_PER_GROUP(sb)
> +  * 0 <= grp_goal < EXT2_BLOCKS_PER_GROUP(sb)
>* first block is a filesystem wide block number
>* first block is the block number of the first block in this group
>*/
> @@ -1089,7 +1089,7 @@ ext2_try_to_allocate_with_rsv(struct sup
>   if (!goal_in_my_reservation(&my_rsv->rsv_window,
>   grp_goal, group, sb))
>   grp_goal = -1;
> - } else if (grp_goal > 0) {
> + } else if (grp_goal >= 0) {
>   int curr = my_rsv->rsv_end -
>   (grp_goal + group_first_block) + 1;
> 

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Boot failure with ext2 and initrds

2006-11-28 Thread Andrew Morton
On Tue, 28 Nov 2006 13:04:53 -0800
Mingming Cao <[EMAIL PROTECTED]> wrote:

> Thanks, I have acked most of them, and will port them to ext3/4 soon.

You've acked #2 and #3.  #4, #5 and #6 remain un-commented-upon and #1 is
unclear?
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC][PATCH 1/1] Comprehensive Secure Deletion for ext4

2006-11-28 Thread Harry Papaxenopoulos
Hello All,

As promised a few weeks ago, we have composed a complete patch to add
secure deletion functionality in ext4. With this patch, we can ensure
that data, meta-data and the filename of any file marked with the
special attribute "s", are overwritten right after it is deleted.

The patch performs N overwrites with configurable patterns. It can
handle separate overwrite configurations for each of the data, meta-data
and directory entry. In order to handle partial file truncations, our
patch will intercept truncate requests in addition to unlink requests
(something that user-mode programs such as shred cannot handle).
Moreover, to make the data overwrites and file truncation an atomic
operation, we add the file to an orphan list right before we begin to
overwrite and remove it from the orphan list after truncation has
finished. The patch works in all journaling modes.

As far as the journal is concerned, we leave it untouched. There are a
few reasons behind this approach. First, performance will be noticeably
hindered. This is because the journal will have to postpone all further
transactions until the necessary blocks are overwritten - which will
take time. Secondly, by the time we want to overwrite the necessary
blocks, the journal will have already done this for us by committing all
further transactions, especially in full journaling mode. That said, if
the community will appear to also want secure-deletion of journal
records, then we can add that (probably as a separate feature one can
turn on independently).

Secure deletion arguments during mount time are of the form:

mount -t ext4dev -o secdel="args" [dev] [mnt]

where args is a 6-way list delimited by colons as such:

secdel="N1:S1:N2:S2:N3:S3"

N1,N2,N3 denote the number of overwrites for data,meta-data and the
filename respectively.

S1,S2,S3 denote the character sequence used to overwrite data,meta-data
and the filename respectively. Random character sequences are denoted by
'R'.

Any item that is left empty is not securely deleted.

In addition, if one wants to secure-delete all portions of a file, an
abbreviated format can be used as such:

secdel="N:S"

N denotes the number of overwrites chosen
S denotes the character sequence to use for overwriting

For example, the mount option:

secdel="2:Rc0:::1:\0"

Means that data will be overwritten once using a random character,
then once using the ASCII character 'c', and finally once more using the
ASCII character 0. Finally this whole sequence will be written 2 times.
Meta-data will not be touched and the filename will be overwritten 1
time with the value zero.

Any comments, help or feedback will be greatly appreciated.

Thank you,
Harry Papaxenopoulos, Nikolai Joukov, and Erez Zadok
--
File systems and Storage Laboratory
Stony Brook University
--

Signed-off-by: Harry Papaxenopoulos <[EMAIL PROTECTED]>
Signed-off-by: Nikolai Joukov <[EMAIL PROTECTED]>
Signed-off-by: Erez Zadok <[EMAIL PROTECTED]>
Signed-off-by: Jeff Sipek <[EMAIL PROTECTED]>

diff -Naur 2.6.19-rc6/fs/ext4/inode.c 2.6.19-rc6sd/fs/ext4/inode.c
--- 2.6.19-rc6/fs/ext4/inode.c  2006-11-28 16:08:08.0 -0500
+++ 2.6.19-rc6sd/fs/ext4/inode.c2006-11-28 16:09:10.0 -0500
@@ -37,8 +37,10 @@
 #include 
 #include 
 #include 
+#include 
 #include "xattr.h"
 #include "acl.h"
+#include "secrm.h"

 /*
  * Test whether an inode is a fast symlink.
@@ -201,8 +203,19 @@
if (IS_SYNC(inode))
handle->h_sync = 1;
inode->i_size = 0;
+#ifndef CONFIG_EXT4DEV_FS_SECRM
if (inode->i_blocks)
ext4_truncate(inode);
+#else
+   if (EXT4_I(inode)->i_flags & EXT4_SECRM_FL)
+   ext4_journal_stop(handle);
+   if (inode->i_blocks)
+   ext4_truncate(inode);
+   if (EXT4_I(inode)->i_flags & EXT4_SECRM_FL) {
+   ext4_secrm_metadata(inode);
+   handle = start_transaction(inode);
+   }
+#endif
/*
 * Kill off the orphan record which ext4_truncate created.
 * AKPM: I think this can be inside the above `if'.
@@ -1082,6 +1095,272 @@
return NULL;
 }

+#ifdef CONFIG_EXT4DEV_FS_SECRM
+static int ext4_ovwt_blocks(handle_t *handle, struct inode *inode, char param,
+   sector_t first, sector_t last)
+{
+   int retval = 0;
+   sector_t sector;
+   sector_t batch = DIO_CREDITS;
+   struct buffer_head *bh = NULL;
+   struct buffer_head tmp;
+   struct super_block *sb;
+   J_ASSERT(handle != NULL);
+
+   tmp.b_state = 0;
+   tmp.b_blocknr = 0;
+   sb = inode->i_sb;
+
+   J_ASSERT( last >= first );
+
+   if (IS_ERR(handle)) {
+   retval = PTR_ERR(handle);
+   goto out;
+   }
+
+   for (sector = first; sector <= last; sector++) {
+   if (handle->h_buffer_credits <= EXT4_RESERVE_TRANS_BLOCKS) {
+

Re: Boot failure with ext2 and initrds

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 17:38 +, Hugh Dickins wrote:

> >
> > And could you check the start and end block for every rsv_window_add and
> > rsv_window_remove, to see if it was keep creating and removing the same
> > window in the same block group?
> 
> The same every time it settled on a usable reservation, but a lot of
> wasted adds and removes as it works across a fully allocated area of
> the bitmap.  Not very efficient, all those rb_tree insertions and
> removals until it gets to a free area; but I can't judge from this
> example how common that would be, may not be worth bothering about.
> 

Yeah, it's not so efficient to add a window before knowing it has a free
block, as rb tree insertion is quit expensive. Actually it used to be
this way: only insert a window to the rb tree when there is a free bit
inside of it. However we need to hold the per-fs reservation lock while
scanning the bitmap:( This badly hurt the performance in some case and
the real time folks complained about it. 

So we changed the code to the current way. I think it was not that
inefficient in the case it works across a fully allocated/reserved area,
since bitmap_search_next_usable_block() will skip the fully allocated
area fairly quickly, as it searchs from the beginning of the window till
the last block of the block group (not just the end of the window)


> >
> > And it might be useful to dump the whole filesystem block reservation
> > tree as well.
> 
> Not necessary, I've put just the relevant numbers in the patch comment,
> it helped me a lot to see the actual numbers and work it out from there.
> 
> >
> > Not sure if it worth the effort to test it on ext3.  The ext2 block
> > reservation code in 2.6.19-rc5-mm2 looks pretty much the same as ext3/4,
> > except the missing truncate_mutex. I understand this might take a few
> > days to reproduce, so this might not needed now.
> 
> Turns out vanilla-ext2 and ext3 and ext4 are safe:
> ext3 and ext4 slightly wrong in the same way, but safe.
> 

Good to know.

> I'll continue this thread with the bugfix patch 1/6; and five more
> (roughly descending order of importance, the latter just cosmetic)
> little fs/ext2/balloc.c patches to things I noticed on the way.
> Nothing very worrying.  Easier to send patches than ask questions:
> please check, perhaps my "off-by-one" accusations are wrong, for
> example.  If you're satisfied they're right, please apply the
> same to ext3 and ext4.
> 

Thanks, I have acked most of them, and will port them to ext3/4 soon.


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


xfs preallocation writeup, for comparison

2006-11-28 Thread Eric Sandeen
As promised, here is a writeup of xfs preallocation routines.

I don't hold these up as the perfect or best way to do this task, but it
is worth looking at what has been done before, to get ideas, find better
ways, and avoid pitfalls for ext4.

XFS preallocation interfaces.
=

The xfs preallocation interfaces are described in the xfsctl(3) manpage.
It's not the best doc, so I'll summarize:

XFS has these ioctl calls for space managment of files:

   XFS_IOC_ALLOCSP
   XFS_IOC_FREESP
   XFS_IOC_RESVSP
   XFS_IOC_UNRESVSP

All of these interfaces take an flock-style argument, and you use it to
specify the range of bytes in the file which should be preallocated,
essentially with an offset and a length.

The real work for all of this is done in xfs_change_file_space() in
xfs_vnodeops.c

The main difference between resvsp and allocsp is that resvsp marks the
blocks as "unwritten" meaning that they are allocated but not yet
written to, and if they are read, they will return zeros.  allocsp
actually writes zeros into the allocated blocks.  We can use the xfs_io
tool to demonstrate.

resvsp example:
==

[EMAIL PROTECTED] test]# touch resvsp
[EMAIL PROTECTED] test]# xfs_io resvsp
xfs_io> resvsp 0 10g
xfs_io> bmap -vp
resvsp:
 EXT: FILE-OFFSET   BLOCK-RANGEAG AG-OFFSET
TOTAL FLAGS
   0: [0..16657327]:16657456..33314783  1 (64..16657391)
16657328 1
   1: [16657328..20971519]: 96..4314287 0 (96..4314287)
4314192 1

so we got 2 extents for this 10g file - those are actual filesystem
blocks allocated.  The file is 0 length, but is using 10g of blocks:

[EMAIL PROTECTED] test]# ls -lh resvsp
-rw-r--r--  1 root root 0 Nov 28 14:11 resvsp
[EMAIL PROTECTED] test]# du -hc resvsp
10G resvsp
10G total

The extents are simply flagged as unwritten (0x1 above), so very
little IO occurs and the space reservation is fast..

allocsp example:
===
(note there's a bit of a buglet in xfs_io, hence the swapped arguments)

[EMAIL PROTECTED] test]# touch allocsp
[EMAIL PROTECTED] test]# xfs_io allocsp
xfs_io> allocsp 10g 0

xfs_io> bmap -vp
allocsp:
 EXT: FILE-OFFSET   BLOCK-RANGEAG AG-OFFSET
TOTAL
   0: [0..16657327]:33314848..49972175  2 (64..16657391)
16657328
   1: [16657328..20971519]: 4314288..86284790 (4314288..8628479)
4314192

We also got 2 extents here, but they are not flagged as unwritten -
those filesystem blocks were all actually filled with zeros.

[EMAIL PROTECTED] test]# ls -lh allocsp
-rw-r--r--  1 root root 10G Nov 28 14:19 allocsp
[EMAIL PROTECTED] test]# du -hc allocsp
10G allocsp
10G total

It would be very nice to see posix_fallocate hooked up to the underlying
filesystem, so that it can make smart decisions about how to efficiently
reserve space...

-Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


xfs defragmentation writeup, for comparison

2006-11-28 Thread Eric Sandeen
As promised, here is a writeup of xfs defragmentation routines.

I don't hold these up as the perfect or best way to do this task, but it
is worth looking at what has been done before, to get ideas, find better
ways, and avoid pitfalls for ext4.

XFS defragmentation interface.
=

xfs uses the xfs_fsr tool, found in the xfsdump (!) package, to
defragment files on the filesystem.

It has a few features for defragmenting a whole filesystem,
starting/stopping, etc, but I'll just sketch out how it defragments a
single file.

The xfs preallocation routines are central to this; fsr uses
preallocation to create the less-fragmented space for the file in
question.

The 10,000 foot overview is:

1. create a new temporary file - open & unlink
2. preallocate space for the new file to match the file to be defragmented
3. see if we got fewer extents than the original
4. do an O_DIRECT data copy into the new extents
5. call the kernel to swap the extents between the two, with sanity checks
6. close unlinked temporary file which now contains the fragmented extents.

userspace work is done in xfsdump/fsr/xfs_fsr.c
kernelspace work is done in fs/xfs/xfs_dfrag.c

In more detail...

in userspace, fsrfile_common() / packfile():

check for mandatory locks, skip this file if present
make sure there is room to copy the file
get inode attributes (ext2-style, append-only, immutable, etc)
 skip if immutable, append-only, or no-defrag set
get the current extent layout of the file (XFS_IOC_GETBMAP)
 stop if already best nr. of extents
open the temp file and immediately unlink it
set extended attributes on the temp file
set other extended inode flags on the temp file
set up buffers for direct IO
loop through original block map, preallocating extents for tmp file
 (this preserves holes as well)
double check that we have fewer extents
now loop through the block map, copying into temp file via O_DIRECT
truncate temp file to proper size (O_DIRECT alignment may have made it
larger)
switch to file owner's UID/GID to preserve quota information
set up swapext ioctl to swap extents
call kernel to swap extents between original & temp files:

typedef struct xfs_swapext
{
__int64_t   sx_version; /* swapext version */
__int64_t   sx_fdtarget;/* fd of target file */
__int64_t   sx_fdtmp;   /* fd of tmp file */
xfs_off_t   sx_offset;  /* offset into file */
xfs_off_t   sx_length;  /* leng from offset */
charsx_pad[16]; /* pad space, unused */
xfs_bstat_t sx_stat;/* stat of target b4 copy */
} xfs_swapext_t;

now in kernelspace, xfs_swapext() / xfs_swap_extents():

verify both files on same filesystem
verify that inode numbers differ
abort if filesystem is shut down
lock the inodes (ilock, ilock...)
check permissions on the files
verify that they both have the same format/type (S_IFMT, realtime, etc)
if temp file is cached, flush it
verify size of both files match
verify both files have extended attributes (or not)
compare change & modify times with what was passed in
 abort if they differ, file was changed before locking
abort if the original file is memory-mapped
set up transaction
swap the data forks of the inodes, fix up on-disk inode values
commit the transaction
unlock the inodes

-Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Boot failure with ext2 and initrds

2006-11-28 Thread Hugh Dickins
On Tue, 21 Nov 2006, Mingming Cao wrote:
> On Mon, 2006-11-20 at 16:19 +, Hugh Dickins wrote:
> > After four days of running, the EM64T has at last reproduced the same
> > hang as it did in an hour before: stuck in
> > ext2_try_to_allocate_with_rsv,
> > repeatedly ext2_rsv_window_add, ext2_try_to_allocate,
> > rsv_window_remove
> > (perhaps not in that order).
> > 

At last it did happen again, on both x86_64 and ppc64.

> 
> Don't have much clue, still...:(

Don't worry, KDB helped me work it out, patch follows in a moment.

> 
> The logic of the while loop in ext2_try_to_allocate_with_rsv() looks
> like:

Thanks for your clarifications and tips.

> > 
> A bit confused, did the whole system hang or just the "ld" writer hang
> in ext2 block allocation? And what other writers were waiting for? Were
> they trying to write to the same file?

The system was pingable, but couldn't do much else.  Only the one
"ld" was active on the ext2 filesystem by this time, other tasks of
the make just waiting on it, nothing else was trying to write there.

4 cpus, well, 2*HT: why couldn't I ssh or login?  I don't know,
something else to investigate, but that can be quite separate: very
unlikely to be related to the particular ext2 bug which showed it
(that filesystem was just for the test, it's not my root).
Probably stupidly obvious once I've worked it out.

> 
> It might be helpful to check the return value of ext2_try_to_allocate(),
> to see if it fails. It would be nice if you could check if there any
> free bit inside the reservation window.

After screwing up last time, I was ultra-paranoid this time, and did
not dare to set any breakpoints: proceeded by repeatedly breaking in
from the keyboard, and the time I happened to catch it on return from
memscan() was revealing.

> 
> And could you check the start and end block for every rsv_window_add and
> rsv_window_remove, to see if it was keep creating and removing the same
> window in the same block group?

The same every time it settled on a usable reservation, but a lot of
wasted adds and removes as it works across a fully allocated area of
the bitmap.  Not very efficient, all those rb_tree insertions and
removals until it gets to a free area; but I can't judge from this
example how common that would be, may not be worth bothering about.

> 
> And it might be useful to dump the whole filesystem block reservation
> tree as well.

Not necessary, I've put just the relevant numbers in the patch comment,
it helped me a lot to see the actual numbers and work it out from there.

> 
> Not sure if it worth the effort to test it on ext3.  The ext2 block
> reservation code in 2.6.19-rc5-mm2 looks pretty much the same as ext3/4,
> except the missing truncate_mutex. I understand this might take a few
> days to reproduce, so this might not needed now.

Turns out vanilla-ext2 and ext3 and ext4 are safe:
ext3 and ext4 slightly wrong in the same way, but safe.

I'll continue this thread with the bugfix patch 1/6; and five more
(roughly descending order of importance, the latter just cosmetic)
little fs/ext2/balloc.c patches to things I noticed on the way.
Nothing very worrying.  Easier to send patches than ask questions:
please check, perhaps my "off-by-one" accusations are wrong, for
example.  If you're satisfied they're right, please apply the
same to ext3 and ext4.
 
Although 1/6 does (I believe: too early for my tests to tell)
fix the bug in a minimal way, I rather think that area needs
cleanup - further comments below my Signed-off-by in 1/6.

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/6] ext2 balloc: fix _with_rsv freeze

2006-11-28 Thread Hugh Dickins
On Tue, 28 Nov 2006, Mingming Cao wrote:
> On Tue, 2006-11-28 at 17:40 +, Hugh Dickins wrote:
> > After several days of testing ext2 with reservations, it got caught inside
> > ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding
> > on the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to
> > find the free block guaranteed to be included (unless there's contention).
> > 
> 
> Hmm, I suspect there is other issue: alloc_new_reservation should not
> repeatedly allocating the same window, if ext2_try_to_allocate
> repeatedly fails to find a free block in that window.
> find_next_reservable_window() takes my_rsv (the old window that he
> thinks there is no free block) as a guide to find a window "after" the
> end block of my_rsv, so how could this happen?

Hmmm.  I haven't studied that part of the code, but what you say sounds
sensible: that would leave more to be explained, yes.  I guess it would
happen if all the rest of the bitmap were either allocated or reserved,
but I don't believe that was the case here: I have noted that the map
was all 00s from offset 0x1ae onwards, plenty unallocated; I've not
recorded the following reservations, but it seems unlikely they covered
the remaining free area (and still covered it even when the remaining
tasks got to the point of just waiting for this one).

> 
> > Fix the range to find_next_usable_block's memscan: the scan from "here"
> > (0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes
> > not 2 (the relevant bytes of bitmap in this case being f7 df ff - none
> > 00, but the premature cutoff implying that the last was found 00).
> > 
> 
> alloc_new_reservation() reserved a window with free block, when come to
> the time to claim it, it scans the window again. So it seems that the
> range of the the scan is too small:

The range of the scan is 1 byte too small in this case, yes.

> 
> p = ((char *)bh->b_data) + (here >> 3);
> r = memscan(p, 0, (maxblocks - here + 7) >> 3);
> next = (r - ((char *)bh->b_data)) << 3;
> 
>   ->   next is -1

I don't understand you: next was not -1, it was 0xd08.

> if (next < maxblocks && next >= here)
> return next;
> 
>   --> falls to false branch

No, it passed the "next < maxblocks && next >= here" test
(maxblocks being 0xd0e and here being 0xcfe), so returned
pointing to an allocated block - then the caller finds it
cannot set the bit.

> 
> here = bitmap_search_next_usable_block(here, bh, maxblocks);
> return here;
> 
> So we failed to find a free byte in the range.  That's seems fine to me.
> It's only a nice thing to have -- try to allocate a block in a place
> where it's neighbors are all free also. If it fails, it will search the
> window bit by bit. So I don't understand why it is not being recovered
> by bitmap_search_next_usable_block(), which test the bitmap bit by bit? 

It already returned, it doesn't reach that line.

> 
> > Is this a problem for mainline ext2?  No, because the "size" in its memscan
> > is always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a
> > multiple of 8.  Is this a problem for ext3 or ext4?  No, because they have
> > an additional extN_test_allocatable test which rescues them from the error.
> > 
> Hmm, if the error is it prematurely think there is no free block in the
> range (bitmap on disk), then even in ext3/4, it will not bother checking
> the jbd copy of the bitmap. I am not sure this is the cause that ext3/4
> may not has the problem.

In the ext3/4 case, it indeed won't bother to check the jbd copy
(having found this bitmap bit set), it'll fall through to the
bitmap_search_next_usable_block you indicated above,
and that should do the right thing, finding the first
free bit in the area originally reserved.

> 
> > But the bigger question is, why does the my_rsv case come here to
> > find_next_usable_block at all? 
> 
> Because grp_goal is -1?

Well, yes, but my point is that we've got a reservation, and we're
hoping to allocate from it (even though we've given up on the "goal"),
but find_next_usable_block is not respecting it at all - liable to be
allocating out of others' reservations instead.

> 
> >  Doesn't its 64-bit boundary limit, and its
> > memscan, blithely ignore what the reservation prepared?
> 
> I agree with you that the double check is urgly. But it's necessary:( If
> there to prevent contention: other file make steal that free block we
> reserved for this file, in the case filesystem is full of reservation...

I agree it's necessary to recheck the allocation; I disagree that the
64-bit boundary limit and memscan are appropriate when my_rsv is set.

> 
> >   It's messy too,
> > the complement of the memscan being that "i < 7" loop over in
> > ext2_try_to_allocate.  I think this ought to be cleaned up,
> > in ext2+reservations and ext3 and ext4.
> > 
> The "i<7" loop there is for

Re: [PATCH 3/6] ext2 balloc: fix off-by-one against rsv_end

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 17:41 +, Hugh Dickins wrote:
> rsv_end is the last block within the reservation,
> so alloc_new_reservation should accept start_block == rsv_end as success.
> 
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> ---
> 

Thanks, Acked.

This is not a problem for now, as the default window size is 8 blocks,
and we never shrink the window size.  But it could be a issue in the
future, if the reservation window could be dynamically shrink, when it
keep failing to create a new(large) reservation window, or we keep throw
away a just-allocated-window as the application is doing very seeky
random write.


>  fs/ext2/balloc.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- 2.6.19-rc6-mm2/fs/ext2/balloc.c   2006-11-24 08:18:02.0 +
> +++ linux/fs/ext2/balloc.c2006-11-27 19:28:41.0 +
> @@ -950,7 +950,7 @@ retry:
>* check if the first free block is within the
>* free space we just reserved
>*/
> - if (start_block >= my_rsv->rsv_start && start_block < my_rsv->rsv_end)
> + if (start_block >= my_rsv->rsv_start && start_block <= my_rsv->rsv_end)
>   return 0;   /* success */
>   /*
>* if the first free bit we found is out of the reservable space

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/6] ext2 balloc: reset windowsz when full

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 17:40 +, Hugh Dickins wrote:
> ext2_new_blocks should reset the reservation window size to 0 when squeezing
> the last blocks out of an almost full filesystem, so the retry doesn't skip
> any groups with less than half that free, reporting ENOSPC too soon.
> 

I realize this is a bug when I was looking at the code as well. I was
thinking we should not check for whether the window size is less than
the free blocks counter at all, when the allocation is fall back to non
reservation allocation. But your fix works as well.


Mingming



-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/6] ext2 balloc: fix _with_rsv freeze

2006-11-28 Thread Mingming Cao
On Tue, 2006-11-28 at 17:40 +, Hugh Dickins wrote:
> After several days of testing ext2 with reservations, it got caught inside
> ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding
> on the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to
> find the free block guaranteed to be included (unless there's contention).
> 

Hmm, I suspect there is other issue: alloc_new_reservation should not
repeatedly allocating the same window, if ext2_try_to_allocate
repeatedly fails to find a free block in that window.
find_next_reservable_window() takes my_rsv (the old window that he
thinks there is no free block) as a guide to find a window "after" the
end block of my_rsv, so how could this happen?


> Fix the range to find_next_usable_block's memscan: the scan from "here"
> (0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes
> not 2 (the relevant bytes of bitmap in this case being f7 df ff - none
> 00, but the premature cutoff implying that the last was found 00).
> 

alloc_new_reservation() reserved a window with free block, when come to
the time to claim it, it scans the window again. So it seems that the
range of the the scan is too small:

p = ((char *)bh->b_data) + (here >> 3);
r = memscan(p, 0, (maxblocks - here + 7) >> 3);
next = (r - ((char *)bh->b_data)) << 3;

->   next is -1
if (next < maxblocks && next >= here)
return next;

--> falls to false branch

here = bitmap_search_next_usable_block(here, bh, maxblocks);
return here;

So we failed to find a free byte in the range.  That's seems fine to me.
It's only a nice thing to have -- try to allocate a block in a place
where it's neighbors are all free also. If it fails, it will search the
window bit by bit. So I don't understand why it is not being recovered
by bitmap_search_next_usable_block(), which test the bitmap bit by bit? 

> Is this a problem for mainline ext2?  No, because the "size" in its memscan
> is always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a
> multiple of 8.  Is this a problem for ext3 or ext4?  No, because they have
> an additional extN_test_allocatable test which rescues them from the error.
> 
Hmm, if the error is it prematurely think there is no free block in the
range (bitmap on disk), then even in ext3/4, it will not bother checking
the jbd copy of the bitmap. I am not sure this is the cause that ext3/4
may not has the problem.

> But the bigger question is, why does the my_rsv case come here to
> find_next_usable_block at all? 

Because grp_goal is -1?

>  Doesn't its 64-bit boundary limit, and its
> memscan, blithely ignore what the reservation prepared?

I agree with you that the double check is urgly. But it's necessary:( If
there to prevent contention: other file make steal that free block we
reserved for this file, in the case filesystem is full of reservation...

>   It's messy too,
> the complement of the memscan being that "i < 7" loop over in
> ext2_try_to_allocate.  I think this ought to be cleaned up,
> in ext2+reservations and ext3 and ext4.
> 
The "i<7" loop there is for non reservation case. Since
find_next_usable_block() could find a free byte, it's trying to avoid
filesystem holes by shifting the start of the free block for at most 7
times.


Thanks!

Mingming
>  fs/ext2/balloc.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- 2.6.19-rc6-mm2/fs/ext2/balloc.c   2006-11-24 08:18:02.0 +
> +++ linux/fs/ext2/balloc.c2006-11-27 19:28:41.0 +
> @@ -570,7 +570,7 @@ find_next_usable_block(int start, struct
>   here = 0;
> 
>   p = ((char *)bh->b_data) + (here >> 3);
> - r = memscan(p, 0, (maxblocks - here + 7) >> 3);
> + r = memscan(p, 0, ((maxblocks + 7) >> 3) - (here >> 3));
>   next = (r - ((char *)bh->b_data)) << 3;
> 
>   if (next < maxblocks && next >= here)
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/6] ext2 balloc: fix off-by-one against grp_goal

2006-11-28 Thread Hugh Dickins
grp_goal 0 is a genuine goal (unlike -1),
so ext2_try_to_allocate_with_rsv should treat it as such.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 fs/ext2/balloc.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- 2.6.19-rc6-mm2/fs/ext2/balloc.c 2006-11-24 08:18:02.0 +
+++ linux/fs/ext2/balloc.c  2006-11-27 19:28:41.0 +
@@ -1053,7 +1053,7 @@ ext2_try_to_allocate_with_rsv(struct sup
}
/*
 * grp_goal is a group relative block number (if there is a goal)
-* 0 < grp_goal < EXT2_BLOCKS_PER_GROUP(sb)
+* 0 <= grp_goal < EXT2_BLOCKS_PER_GROUP(sb)
 * first block is a filesystem wide block number
 * first block is the block number of the first block in this group
 */
@@ -1089,7 +1089,7 @@ ext2_try_to_allocate_with_rsv(struct sup
if (!goal_in_my_reservation(&my_rsv->rsv_window,
grp_goal, group, sb))
grp_goal = -1;
-   } else if (grp_goal > 0) {
+   } else if (grp_goal >= 0) {
int curr = my_rsv->rsv_end -
(grp_goal + group_first_block) + 1;
 
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/6] ext2 balloc: say rb_entry not list_entry

2006-11-28 Thread Hugh Dickins
The reservations tree is an rb_tree not a list, so it's less confusing to
use rb_entry() than list_entry() - though they're both just container_of().

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 fs/ext2/balloc.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- 2.6.19-rc6-mm2/fs/ext2/balloc.c 2006-11-24 08:18:02.0 +
+++ linux/fs/ext2/balloc.c  2006-11-27 19:28:41.0 +
@@ -156,7 +156,7 @@ restart:
 
printk("Block Allocation Reservation Windows Map (%s):\n", fn);
while (n) {
-   rsv = list_entry(n, struct ext2_reserve_window_node, rsv_node);
+   rsv = rb_entry(n, struct ext2_reserve_window_node, rsv_node);
if (verbose)
printk("reservation window 0x%p "
"start: %lu, end: %lu\n",
@@ -753,7 +753,7 @@ static int find_next_reservable_window(
 
prev = rsv;
next = rb_next(&rsv->rsv_node);
-   rsv = list_entry(next,struct ext2_reserve_window_node,rsv_node);
+   rsv = rb_entry(next,struct ext2_reserve_window_node,rsv_node);
 
/*
 * Reached the last reservation, we can just append to the
@@ -995,7 +995,7 @@ static void try_to_extend_reservation(st
if (!next)
my_rsv->rsv_end += size;
else {
-   next_rsv = list_entry(next, struct ext2_reserve_window_node, 
rsv_node);
+   next_rsv = rb_entry(next, struct ext2_reserve_window_node, 
rsv_node);
 
if ((next_rsv->rsv_start - my_rsv->rsv_end - 1) >= size)
my_rsv->rsv_end += size;
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/6] ext2 balloc: fix _with_rsv freeze

2006-11-28 Thread Hugh Dickins
After several days of testing ext2 with reservations, it got caught inside
ext2_try_to_allocate_with_rsv: alloc_new_reservation repeatedly succeeding
on the window [12cff,12d0e], ext2_try_to_allocate repeatedly failing to
find the free block guaranteed to be included (unless there's contention).

Fix the range to find_next_usable_block's memscan: the scan from "here"
(0xcfe) up to (but excluding) "maxblocks" (0xd0e) needs to scan 3 bytes
not 2 (the relevant bytes of bitmap in this case being f7 df ff - none
00, but the premature cutoff implying that the last was found 00).

Is this a problem for mainline ext2?  No, because the "size" in its memscan
is always EXT2_BLOCKS_PER_GROUP(sb), which mkfs.ext2 requires to be a
multiple of 8.  Is this a problem for ext3 or ext4?  No, because they have
an additional extN_test_allocatable test which rescues them from the error.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---
But the bigger question is, why does the my_rsv case come here to
find_next_usable_block at all?  Doesn't its 64-bit boundary limit, and its
memscan, blithely ignore what the reservation prepared?  It's messy too,
the complement of the memscan being that "i < 7" loop over in
ext2_try_to_allocate.  I think this ought to be cleaned up,
in ext2+reservations and ext3 and ext4.

 fs/ext2/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- 2.6.19-rc6-mm2/fs/ext2/balloc.c 2006-11-24 08:18:02.0 +
+++ linux/fs/ext2/balloc.c  2006-11-27 19:28:41.0 +
@@ -570,7 +570,7 @@ find_next_usable_block(int start, struct
here = 0;
 
p = ((char *)bh->b_data) + (here >> 3);
-   r = memscan(p, 0, (maxblocks - here + 7) >> 3);
+   r = memscan(p, 0, ((maxblocks + 7) >> 3) - (here >> 3));
next = (r - ((char *)bh->b_data)) << 3;
 
if (next < maxblocks && next >= here)
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/6] ext2 balloc: fix off-by-one against rsv_end

2006-11-28 Thread Hugh Dickins
rsv_end is the last block within the reservation,
so alloc_new_reservation should accept start_block == rsv_end as success.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 fs/ext2/balloc.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- 2.6.19-rc6-mm2/fs/ext2/balloc.c 2006-11-24 08:18:02.0 +
+++ linux/fs/ext2/balloc.c  2006-11-27 19:28:41.0 +
@@ -950,7 +950,7 @@ retry:
 * check if the first free block is within the
 * free space we just reserved
 */
-   if (start_block >= my_rsv->rsv_start && start_block < my_rsv->rsv_end)
+   if (start_block >= my_rsv->rsv_start && start_block <= my_rsv->rsv_end)
return 0;   /* success */
/*
 * if the first free bit we found is out of the reservable space
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 6/6] ext2 balloc: use io_error label

2006-11-28 Thread Hugh Dickins
ext2_new_blocks has a nice io_error label for setting -EIO,
so goto that in the one place that doesn't already use it.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 fs/ext2/balloc.c |7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

--- 2.6.19-rc6-mm2/fs/ext2/balloc.c 2006-11-24 08:18:02.0 +
+++ linux/fs/ext2/balloc.c  2006-11-27 19:28:41.0 +
@@ -1258,10 +1258,9 @@ retry_alloc:
if (group_no >= ngroups)
group_no = 0;
gdp = ext2_get_group_desc(sb, group_no, &gdp_bh);
-   if (!gdp) {
-   *errp = -EIO;
-   goto out;
-   }
+   if (!gdp)
+   goto io_error;
+
free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
/*
 * skip this group if the number of
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/6] ext2 balloc: reset windowsz when full

2006-11-28 Thread Hugh Dickins
ext2_new_blocks should reset the reservation window size to 0 when squeezing
the last blocks out of an almost full filesystem, so the retry doesn't skip
any groups with less than half that free, reporting ENOSPC too soon.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 fs/ext2/balloc.c |1 +
 1 file changed, 1 insertion(+)

--- 2.6.19-rc6-mm2/fs/ext2/balloc.c 2006-11-24 08:18:02.0 +
+++ linux/fs/ext2/balloc.c  2006-11-27 19:28:41.0 +
@@ -1292,6 +1292,7 @@ retry_alloc:
 */
if (my_rsv) {
my_rsv = NULL;
+   windowsz = 0;
group_no = goal_group;
goto retry_alloc;
}
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html