date:20070504

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

2007-05-04 Thread Andrew Morton

On Fri, 4 May 2007 23:34:08 -0400 Quentin Godfroy <[EMAIL PROTECTED]> wrote:

> By the way, is init 32 bits or 64 bits? It could break the ia32
> emulation thing, but not the 64bit native mode.

akpm2:/home/akpm> file /sbin/init  
/sbin/init: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for 
GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, 
stripped

akpm2:/home/akpm> cat /etc/issue
Fedora Core release 6 (Zod)
Kernel \r on an \m
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

2007-05-04 Thread Jeremy Fitzhardinge

Quentin Godfroy wrote:
>> Won't this break with ET_DYN executables?  And besides, isn't this the
>> same thing?   
>> 
>
> Indeed, I haven't seen that. For ET_DYN executables, it could be done a
> thing like load_addr+elf_ppnt->p_vaddr (in the function that creates the
> auxv, as ity has access to the elf header), and for ET_EXEC do what I
> propose. I think this is trivial to do. I'll do it as soon as I come back
> in front of my machine.
>   

I don't think you need to special-case it.  You can compute the offset
between the linked address and the load address (first
PT_LOAD[0]->p_vaddr - load_addr) and use that to offset all the other
addresses.


> I don't understand. Yes it is what it is supposed to be, and the kernel
> is supposed to give the vaddr of the phdr table to the interpreter and
> not load addr + offset of phdr in file, which is sometimes wrong.
>   

How can it be wrong?  Does the PT_PHDR point to a different array of
Phdr entries?

J
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

2007-05-04 Thread Quentin Godfroy

On Fri, May 04, 2007 at 04:22:08PM -0700, Andrew Morton wrote:
> This patch kills my FC6 machine (using a config which was derived from RH's
> original):
> 
> Freeing unused kernel memory: 368k freed
> Write protecting the kernel read-only data: 959k
> request_module: runaway loop modprobe binfmt-464c
> request_module: runaway loop modprobe binfmt-464c
> request_module: runaway loop modprobe binfmt-464c
> request_module: runaway loop modprobe binfmt-464c
> request_module: runaway loop modprobe binfmt-464c
> 
> 
> .config: http://userweb.kernel.org/~akpm/config-akpm2.txt

I didn't try it on a 64bit kernel. I'll do as soon as I can reach my
machine. Probably the loop does not find PT_PHDR and then returns noexec.

I had such a problem, but it was because I forgot elf_ppnt = elf_phdata
before the loop.

By the way, is init 32 bits or 64 bits? It could break the ia32
emulation thing, but not the 64bit native mode.

Anyway the problem could be addressed by returning back to the old
behaviour if the loop fails, but it's not clean at all.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

2007-05-04 Thread Quentin Godfroy

On Fri, May 04, 2007 at 04:31:49PM -0700, Jeremy Fitzhardinge wrote:
> Quentin Godfroy wrote:
> > +   elf_ppnt = elf_phdata;
> > +   for (i = 0; i< loc->elf_ex.e_phnum; i++, elf_ppnt++)
> > +   if (elf_ppnt->p_type == PT_PHDR) {
> > +   phdr_addr = elf_ppnt->p_vaddr;
> >   
> 
> Won't this break with ET_DYN executables?  And besides, isn't this the
> same thing?   

Indeed, I haven't seen that. For ET_DYN executables, it could be done a
thing like load_addr+elf_ppnt->p_vaddr (in the function that creates the
auxv, as ity has access to the elf header), and for ET_EXEC do what I
propose. I think this is trivial to do. I'll do it as soon as I come back
in front of my machine.

>  Shouldn't PT_PHDR->p_vaddr point to the vaddr of the Phdr
> table itself?

I don't understand. Yes it is what it is supposed to be, and the kernel
is supposed to give the vaddr of the phdr table to the interpreter and
not load addr + offset of phdr in file, which is sometimes wrong.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

2007-05-04 Thread Jeremy Fitzhardinge

Quentin Godfroy wrote:
> + elf_ppnt = elf_phdata;
> + for (i = 0; i< loc->elf_ex.e_phnum; i++, elf_ppnt++)
> + if (elf_ppnt->p_type == PT_PHDR) {
> + phdr_addr = elf_ppnt->p_vaddr;
>   

Won't this break with ET_DYN executables?  And besides, isn't this the
same thing?  Shouldn't PT_PHDR->p_vaddr point to the vaddr of the Phdr
table itself?

J
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

2007-05-04 Thread Andrew Morton

On Fri, 4 May 2007 10:09:21 -0400
Quentin Godfroy <[EMAIL PROTECTED]> wrote:

> On a dynamic ELF executable, the current kernel loader gives to the
> interpreter (in the AUXV vector) the AT_PHDR argument as :
> offset_of_phdr_in_file + first address.
> 
> It can be wrong for an executable where the program headers are not located 
> in the first loaded segment.
> 
> This patch corrects the behaviour.
> 
> Signed-off-by: Quentin Godfroy <[EMAIL PROTECTED]>
> ---
>  Here is an example of such an ELF executable which the current code
>  fails on :
>  ftp://quatramaran.ens.fr/pub/godfroy/addrpath/broken-sample
> 
> --- linux-2.6.21.1/fs/binfmt_elf.c2007-05-04 03:20:00.0 -0400
> +++ linux-2.6.21.1-patch/fs/binfmt_elf.c  2007-05-04 08:02:18.0 
> -0400
> @@ -134,6 +134,7 @@ static int padzero(unsigned long elf_bss
>  static int
>  create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
>   int interp_aout, unsigned long load_addr,
> + unsigned long phdr_addr,
>   unsigned long interp_load_addr)
>  {
>   unsigned long p = bprm->p;
> @@ -190,7 +191,7 @@ create_elf_tables(struct linux_binprm *b
>   NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
>   NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE);
>   NEW_AUX_ENT(AT_CLKTCK, CLOCKS_PER_SEC);
> - NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff);
> + NEW_AUX_ENT(AT_PHDR, phdr_addr);
>   NEW_AUX_ENT(AT_PHENT, sizeof(struct elf_phdr));
>   NEW_AUX_ENT(AT_PHNUM, exec->e_phnum);
>   NEW_AUX_ENT(AT_BASE, interp_load_addr);
> @@ -529,7 +530,7 @@ static unsigned long randomize_stack_top
>  static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
>  {
>   struct file *interpreter = NULL; /* to shut gcc up */
> - unsigned long load_addr = 0, load_bias = 0;
> + unsigned long load_addr = 0, load_bias = 0, phdr_addr = 0;
>   int load_addr_set = 0;
>   char * elf_interpreter = NULL;
>   unsigned int interpreter_type = INTERPRETER_NONE;
> @@ -718,6 +719,16 @@ static int load_elf_binary(struct linux_
>   break;
>   }
>  
> + elf_ppnt = elf_phdata;
> + for (i = 0; i< loc->elf_ex.e_phnum; i++, elf_ppnt++)
> + if (elf_ppnt->p_type == PT_PHDR) {
> + phdr_addr = elf_ppnt->p_vaddr;
> + break;
> + }
> + retval = -ENOEXEC;
> + if (!phdr_addr)
> + goto out_free_dentry;
> +
>   /* Some simple consistency checks for the interpreter */
>   if (elf_interpreter) {
>   interpreter_type = INTERPRETER_ELF | INTERPRETER_AOUT;
> @@ -987,7 +998,7 @@ static int load_elf_binary(struct linux_
>   current->flags &= ~PF_FORKNOEXEC;
>   create_elf_tables(bprm, &loc->elf_ex,
> (interpreter_type == INTERPRETER_AOUT),
> -   load_addr, interp_load_addr);
> +   load_addr, phdr_addr, interp_load_addr);
>   /* N.B. passed_fileno might not be initialized? */
>   if (interpreter_type == INTERPRETER_AOUT)
>   current->mm->arg_start += strlen(passed_fileno) + 1;

This patch kills my FC6 machine (using a config which was derived from RH's
original):

Freeing unused kernel memory: 368k freed
Write protecting the kernel read-only data: 959k
request_module: runaway loop modprobe binfmt-464c
request_module: runaway loop modprobe binfmt-464c
request_module: runaway loop modprobe binfmt-464c
request_module: runaway loop modprobe binfmt-464c
request_module: runaway loop modprobe binfmt-464c


.config: http://userweb.kernel.org/~akpm/config-akpm2.txt
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] AF_RXRPC: Sort out MTU handling

2007-05-04 Thread David Miller

From: David Howells <[EMAIL PROTECTED]>
Date: Fri, 04 May 2007 15:56:47 +0100

> Sort out the MTU determination and handling in AF_RXRPC:
> 
>  (1) If it's present, parse the additional information supplied by the peer at
>  the end of the ACK packet (struct ackinfo) to determine the MTU sizes
>  that peer is willing to support.
> 
>  (2) Initialise the MTU size to that peer from the kernel's routing records.
> 
>  (3) Send ACKs rather than ACKALLs as the former carry the additional info,
>  and the latter do not.
> 
>  (4) Declare the interface MTU size in outgoing ACKs as a maximum amount of
>  data that can be stuffed into an RxRPC packet without it having to be
>  fragmented to come in this computer's NIC.
> 
>  (5) If sendmsg() is given MSG_MORE then it should allocate an skb of the
>  maximum size rather than one just big enough for the data it's got left
>  to process on the theory that there is more data to come that it can
>  append to that packet.
> 
>  This means, for example, that if AFS does a large StoreData op, all the
>  packets barring the last will be filled to the maximum unfragmented size.
> 
> Signed-off-by: David Howells <[EMAIL PROTECTED]>

Applied, thanks.

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] JFS: Fix race waking up jfsIO kernel thread

2007-05-04 Thread Dave Kleikamp

I've been looking at a hang that was reported off-list, and I believe I
found the cause.  I'm still waiting for confirmation on whether the
patch does fix the problem, but I wanted to distribute the patch in case
anyone else is seeing a similar hang.  It looks like the problem is in
kernels from 2.6.17 to the present.

Thanks,
Shaggy

JFS: Fix race waking up jfsIO kernel thread

It's possible for a journal I/O request to be added to the log_redrive
queue and the jfsIO thread to be awakened after the thread releases
log_redrive_lock but before it sets its state to TASK_INTERRUPTIBLE.

The jfsIO thread should set its state before giving up the spinlock, so
the waking thread will really wake it.

Signed-off-by: Dave Kleikamp <[EMAIL PROTECTED]>

diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index ff7f1be..16c6268 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -2354,12 +2354,13 @@ int jfsIOWait(void *arg)
lbmStartIO(bp);
spin_lock_irq(&log_redrive_lock);
}
-   spin_unlock_irq(&log_redrive_lock);
 
if (freezing(current)) {
+   spin_unlock_irq(&log_redrive_lock);
refrigerator();
} else {
set_current_state(TASK_INTERRUPTIBLE);
+   spin_unlock_irq(&log_redrive_lock);
schedule();
__set_current_state(TASK_RUNNING);
}

-- 
David Kleikamp
IBM Linux Technology Center

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] AF_RXRPC: Sort out MTU handling

2007-05-04 Thread David Howells

Sort out the MTU determination and handling in AF_RXRPC:

 (1) If it's present, parse the additional information supplied by the peer at
 the end of the ACK packet (struct ackinfo) to determine the MTU sizes
 that peer is willing to support.

 (2) Initialise the MTU size to that peer from the kernel's routing records.

 (3) Send ACKs rather than ACKALLs as the former carry the additional info,
 and the latter do not.

 (4) Declare the interface MTU size in outgoing ACKs as a maximum amount of
 data that can be stuffed into an RxRPC packet without it having to be
 fragmented to come in this computer's NIC.

 (5) If sendmsg() is given MSG_MORE then it should allocate an skb of the
 maximum size rather than one just big enough for the data it's got left
 to process on the theory that there is more data to come that it can
 append to that packet.

 This means, for example, that if AFS does a large StoreData op, all the
 packets barring the last will be filled to the maximum unfragmented size.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 net/rxrpc/ar-ack.c|   80 ++---
 net/rxrpc/ar-error.c  |2 +
 net/rxrpc/ar-output.c |2 +
 net/rxrpc/ar-peer.c   |   45 +++-
 4 files changed, 115 insertions(+), 14 deletions(-)

diff --git a/net/rxrpc/ar-ack.c b/net/rxrpc/ar-ack.c
index fc07a92..657ee69 100644
--- a/net/rxrpc/ar-ack.c
+++ b/net/rxrpc/ar-ack.c
@@ -543,6 +543,38 @@ static void rxrpc_zap_tx_window(struct rxrpc_call *call)
 }
 
 /*
+ * process the extra information that may be appended to an ACK packet
+ */
+static void rxrpc_extract_ackinfo(struct rxrpc_call *call, struct sk_buff *skb,
+ unsigned latest, int nAcks)
+{
+   struct rxrpc_ackinfo ackinfo;
+   struct rxrpc_peer *peer;
+   unsigned mtu;
+
+   if (skb_copy_bits(skb, nAcks + 3, &ackinfo, sizeof(ackinfo)) < 0) {
+   _leave(" [no ackinfo]");
+   return;
+   }
+
+   _proto("Rx ACK %%%u Info { rx=%u max=%u rwin=%u jm=%u }",
+  latest,
+  ntohl(ackinfo.rxMTU), ntohl(ackinfo.maxMTU),
+  ntohl(ackinfo.rwind), ntohl(ackinfo.jumbo_max));
+
+   mtu = min(ntohl(ackinfo.rxMTU), ntohl(ackinfo.maxMTU));
+
+   peer = call->conn->trans->peer;
+   if (mtu < peer->maxdata) {
+   spin_lock_bh(&peer->lock);
+   peer->maxdata = mtu;
+   peer->mtu = mtu + peer->hdrsize;
+   spin_unlock_bh(&peer->lock);
+   _net("Net MTU %u (maxdata %u)", peer->mtu, peer->maxdata);
+   }
+}
+
+/*
  * process packets in the reception queue
  */
 static int rxrpc_process_rx_queue(struct rxrpc_call *call,
@@ -606,6 +638,8 @@ process_further:
   rxrpc_acks[ack.reason],
   ack.nAcks);
 
+   rxrpc_extract_ackinfo(call, skb, latest, ack.nAcks);
+
if (ack.reason == RXRPC_ACK_PING) {
_proto("Rx ACK %%%u PING Request", latest);
rxrpc_propose_ACK(call, RXRPC_ACK_PING_RESPONSE,
@@ -801,9 +835,9 @@ void rxrpc_process_call(struct work_struct *work)
struct msghdr msg;
struct kvec iov[5];
unsigned long bits;
-   __be32 data;
+   __be32 data, pad;
size_t len;
-   int genbit, loop, nbit, ioc, ret;
+   int genbit, loop, nbit, ioc, ret, mtu;
u32 abort_code = RX_PROTOCOL_ERROR;
u8 *acks = NULL;
 
@@ -899,9 +933,30 @@ void rxrpc_process_call(struct work_struct *work)
}
 
if (test_bit(RXRPC_CALL_ACK_FINAL, &call->events)) {
-   hdr.type = RXRPC_PACKET_TYPE_ACKALL;
genbit = RXRPC_CALL_ACK_FINAL;
-   goto send_message;
+
+   ack.bufferSpace = htons(8);
+   ack.maxSkew = 0;
+   ack.serial  = 0;
+   ack.reason  = RXRPC_ACK_IDLE;
+   ack.nAcks   = 0;
+   call->ackr_reason = 0;
+
+   spin_lock_bh(&call->lock);
+   ack.serial = call->ackr_serial;
+   ack.previousPacket = call->ackr_prev_seq;
+   ack.firstPacket = htonl(call->rx_data_eaten + 1);
+   spin_unlock_bh(&call->lock);
+
+   pad = 0;
+
+   iov[1].iov_base = &ack;
+   iov[1].iov_len  = sizeof(ack);
+   iov[2].iov_base = &pad;
+   iov[2].iov_len  = 3;
+   iov[3].iov_base = &ackinfo;
+   iov[3].iov_len  = sizeof(ackinfo);
+   goto send_ACK;
}
 
if (call->events & ((1 << RXRPC_CALL_RCVD_BUSY) |
@@ -971,8 +1026,6 @@ void rxrpc_process_call(struct work_struct *work)
 
/* consider sending an ordinary ACK */
if (test_bit(RXRPC_CALL_ACK, &call->events)) {
-   __be32 pad;
-
_debug("send ACK: window: %d - %d { %lx }",

patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

2007-05-04 Thread Quentin Godfroy

On a dynamic ELF executable, the current kernel loader gives to the
interpreter (in the AUXV vector) the AT_PHDR argument as :
offset_of_phdr_in_file + first address.

It can be wrong for an executable where the program headers are not located 
in the first loaded segment.

This patch corrects the behaviour.

Signed-off-by: Quentin Godfroy <[EMAIL PROTECTED]>
---
 Here is an example of such an ELF executable which the current code
 fails on :
 ftp://quatramaran.ens.fr/pub/godfroy/addrpath/broken-sample

--- linux-2.6.21.1/fs/binfmt_elf.c  2007-05-04 03:20:00.0 -0400
+++ linux-2.6.21.1-patch/fs/binfmt_elf.c2007-05-04 08:02:18.0 
-0400
@@ -134,6 +134,7 @@ static int padzero(unsigned long elf_bss
 static int
 create_elf_tables(struct linux_binprm *bprm, struct elfhdr *exec,
int interp_aout, unsigned long load_addr,
+   unsigned long phdr_addr,
unsigned long interp_load_addr)
 {
unsigned long p = bprm->p;
@@ -190,7 +191,7 @@ create_elf_tables(struct linux_binprm *b
NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE);
NEW_AUX_ENT(AT_CLKTCK, CLOCKS_PER_SEC);
-   NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff);
+   NEW_AUX_ENT(AT_PHDR, phdr_addr);
NEW_AUX_ENT(AT_PHENT, sizeof(struct elf_phdr));
NEW_AUX_ENT(AT_PHNUM, exec->e_phnum);
NEW_AUX_ENT(AT_BASE, interp_load_addr);
@@ -529,7 +530,7 @@ static unsigned long randomize_stack_top
 static int load_elf_binary(struct linux_binprm *bprm, struct pt_regs *regs)
 {
struct file *interpreter = NULL; /* to shut gcc up */
-   unsigned long load_addr = 0, load_bias = 0;
+   unsigned long load_addr = 0, load_bias = 0, phdr_addr = 0;
int load_addr_set = 0;
char * elf_interpreter = NULL;
unsigned int interpreter_type = INTERPRETER_NONE;
@@ -718,6 +719,16 @@ static int load_elf_binary(struct linux_
break;
}
 
+   elf_ppnt = elf_phdata;
+   for (i = 0; i< loc->elf_ex.e_phnum; i++, elf_ppnt++)
+   if (elf_ppnt->p_type == PT_PHDR) {
+   phdr_addr = elf_ppnt->p_vaddr;
+   break;
+   }
+   retval = -ENOEXEC;
+   if (!phdr_addr)
+   goto out_free_dentry;
+
/* Some simple consistency checks for the interpreter */
if (elf_interpreter) {
interpreter_type = INTERPRETER_ELF | INTERPRETER_AOUT;
@@ -987,7 +998,7 @@ static int load_elf_binary(struct linux_
current->flags &= ~PF_FORKNOEXEC;
create_elf_tables(bprm, &loc->elf_ex,
  (interpreter_type == INTERPRETER_AOUT),
- load_addr, interp_load_addr);
+ load_addr, phdr_addr, interp_load_addr);
/* N.B. passed_fileno might not be initialized? */
if (interpreter_type == INTERPRETER_AOUT)
current->mm->arg_start += strlen(passed_fileno) + 1;
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Implement renaming for debugfs

2007-05-04 Thread Jan Kara

On Thu 03-05-07 17:16:02, Greg KH wrote:
> On Thu, May 03, 2007 at 11:54:52AM +0200, Jan Kara wrote:
> > On Tue 01-05-07 20:26:27, Greg KH wrote:
> > > On Mon, Apr 30, 2007 at 07:55:36PM +0200, Jan Kara wrote:
> > > >   Hello,
> > > > 
> > > >   attached patch implements renaming for debugfs. I was asked for this
> > > > feature by WLAN guys and I guess it makes sence (they have some debug 
> > > > info
> > > > in the directory identified by interface name and that can change...).
> > > > Could someone have a look at what I wrote whether it looks reasonable?
> > > > Thanks.
> > > > 
> > > > Honza
> > > > 
> > > > -- 
> > > > Jan Kara <[EMAIL PROTECTED]>
> > > > SuSE CR Labs
> > > 
> > > > Implement debugfs_rename() to allow renaming files/directories in 
> > > > debugfs.
> > > 
> > > I think you are going to need more infrastructure here, the caller
> > > doesn't want to have to allocate a new dentry themselves, they just want
> > > to pass in the new filename :)
> >   Actually, I wanted the call to be in the spirit of other debugfs calls.
> > So we have for example:
> > void debugfs_remove(struct dentry *dentry)
> 
> That is because 'debugfs_create' returns a dentry.
> 
> > struct dentry *debugfs_create_dir(const char *name, struct dentry *parent)
> > etc.
> 
> Same here, you already have a dentry to place this directory into, _and_
> all the user needs to provide is a name for the new directory.  They
> don't ever create a dentry themselves, which is what your function
> required them to do.
> 
> Try using your function and you'll see what I mean :)
  I've tried it when testing the function :). The code looked like:
dir1 = debugfs_create_dir("dir1", NULL);
dir2 = debugfs_create_dir("dir2", NULL);
file1 = debugfs_create_file("file1", 0644, dir1, NULL, NULL);
file2 = debugfs_rename(dir1, file1, dir2, "new_name");
  No new dentries needed to be created...
  
> >   So it seemed to me that the interface with dentries was perfectly
> > appropriate... One possibility would be to take filename of a file to
> > rename instead of old_dentry. But dirs should IMHO remain to be dentries...
> 
> Sure, they should be dentries, but the caller should just provide a
> name.
  Hmm, I'm not sure I understand... So you prefer debugfs_rename() to be:

struct dentry *debugfs_rename(struct dentry *old_dir, const char *old_name,
struct dentry *new_dir, const char *new_name);

  Right?
Honza
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/1] fs: add 4th case to do_path_lookup

2007-05-04 Thread Christoph Hellwig


Oh and btw, net/sunrpc/rpc_pipe.c:rpc_lookup_parent() and
fs/nfsctl.c:do_open() should  be switched to the new code, at which
point the path_walk() export can go.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

2007-05-04 Thread David Chinner

On Thu, May 03, 2007 at 11:28:15PM -0700, Andrew Morton wrote:
> On Fri, 4 May 2007 16:07:31 +1000 David Chinner <[EMAIL PROTECTED]> wrote:
> > On Thu, May 03, 2007 at 09:29:55PM -0700, Andrew Morton wrote:
> > > On Thu, 26 Apr 2007 23:33:32 +0530 "Amit K. Arora" <[EMAIL PROTECTED]> 
> > > wrote:
> > > 
> > > > This patch implements the fallocate() system call and adds support for
> > > > i386, x86_64 and powerpc.
> > > > 
> > > > ...
> > > > +{
> > > > +   struct file *file;
> > > > +   struct inode *inode;
> > > > +   long ret = -EINVAL;
> > > > +
> > > > +   if (len == 0 || offset < 0)
> > > > +   goto out;
> > > 
> > > The posix spec implies that negative `len' is permitted - presumably 
> > > "allocate
> > > ahead of `offset'".  How peculiar.
> > 
> > I just checked the man page for posix_fallocate() and it says:
> > 
> >   EINVAL  offset or len was less than zero.
> > 
> > We should probably follow this lead.
> 
> Yes, I think so.  I'm suspecting that
> http://www.opengroup.org/onlinepubs/009695399/functions/posix_fallocate.html
> is just buggy.  Or I can't read.
> 
> I mean, if we're going to support negative `len' then is the byte at
> `offset' inside or outside the segment?  Head spins.

I don't think we should care. If we provide a syscall with the
semantics of "allocate from offset to offset+len" then glibc's
implementation can turn negative length into two separate
fallocate syscalls

> > > > +   ret = -ENODEV;
> > > > +   if (!S_ISREG(inode->i_mode))
> > > > +   goto out_fput;
> > > 
> > > So we return ENODEV against an S_ISBLK fd, as per the posix spec.  That
> > > seems a bit silly of them.
> > 
> > H - I thought that the intention of sys_fallocate() was to
> > be generic enough to eventually allow preallocation on directories.
> > If that is the case, then this check will prevent that
> 
> The above opengroup page only permits S_ISREG.  Preallocating directories
> sounds quite useful to me, although it's something which would be pretty
> hard to emulate if the FS doesn't support it.  And there's a decent case to
> be made for emulating it - run-anywhere reasons.  Does glibc emulation support
> directories?  Quite unlikely.
> 
> But yes, sounds like a desirable thing.  Would XFS support it easily if the 
> above
> check was relaxed?

No - right now empty blocks are pruned from the directory immediately so I
don't think we really have a concept of empty blocks in the btree structure.
dir2 is bloody complex, so adding preallocation is probably not going to
be simple to do.

It's not high on my list to add, either, because we can typically avoid the
worst case directory fragmentation by using larger directory block sizes
(e.g. 16k instead of the default 4k on a 4k block size fs).

IIRC directory preallocation has been talked about more for ext3/4

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/1] fs: add 4th case to do_path_lookup

2007-05-04 Thread Christoph Hellwig

sorry, I proposed Jeff a reply long ago but haven't done yet.

On Fri, May 04, 2007 at 12:02:00AM -0700, Andrew Morton wrote:
> > @@ -1125,6 +1125,10 @@ static int fastcall do_path_lookup(int dfd, const 
> > char *name,
> > nd->mnt = mntget(fs->rootmnt);
> > nd->dentry = dget(fs->root);
> > read_unlock(&fs->lock);
> > +   } else if (flags & LOOKUP_ONE) {
> > +   /* nd->mnt and nd->dentry already set, just grab references */
> > +   mntget(nd->mnt);
> > +   dget(nd->dentry);
> > } else if (dfd == AT_FDCWD) {
> > read_lock(&fs->lock);
> > nd->mnt = mntget(fs->pwdmnt);
> 
> Well the patch passes my too-small-to-care-about test ;)
> 
> Unless someone objects I'd suggest that you add it to the unionfs tree.

The code is obviously correct.  There is one little thing that bothers
me, and that's that nd was purely an output paramter to path_lookup and
do_path_lookup, and no it's an input paramter for the least used case.
It might make sense to just a simple helper ala:

static int path_component_lookup(struct dentry *dentry, struct vfsmount *mnt,
const char *name, unsigned int flags, struct nameidata *nd)
{
int retval;

nd->last_type = LAST_ROOT;
nd->flags = flags;
nd->mnt = mntget(mnt);
nd->dentry = dget(dentry);
nd->depth = 0;

retval = path_walk(name, nd);
if (unlikely(!retval && !audit_dummy_context() &&
 nd->dentry && nd->dentry->d_inode))
audit_inode(name, nd->dentry->d_inode);

return retval;

}

instead.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/1] fs: add 4th case to do_path_lookup

2007-05-04 Thread Andrew Morton

On Sun, 29 Apr 2007 23:30:12 -0400 Josef Sipek <[EMAIL PROTECTED]> wrote:

> Signed-off-by: Josef 'Jeff' Sipek <[EMAIL PROTECTED]>
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index 2995fba..1516a9b 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1125,6 +1125,10 @@ static int fastcall do_path_lookup(int dfd, const char 
> *name,
>   nd->mnt = mntget(fs->rootmnt);
>   nd->dentry = dget(fs->root);
>   read_unlock(&fs->lock);
> + } else if (flags & LOOKUP_ONE) {
> + /* nd->mnt and nd->dentry already set, just grab references */
> + mntget(nd->mnt);
> + dget(nd->dentry);
>   } else if (dfd == AT_FDCWD) {
>   read_lock(&fs->lock);
>   nd->mnt = mntget(fs->pwdmnt);
> diff --git a/include/linux/namei.h b/include/linux/namei.h
> index 92b422b..aa89d97 100644
> --- a/include/linux/namei.h
> +++ b/include/linux/namei.h
> @@ -48,6 +48,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, 
> LAST_BIND};
>   *  - internal "there are more path compnents" flag
>   *  - locked when lookup done with dcache_lock held
>   *  - dentry cache is untrusted; force a real lookup
> + *  - lookup path from given dentry/vfsmount pair
>   */
>  #define LOOKUP_FOLLOW 1
>  #define LOOKUP_DIRECTORY  2
> @@ -55,6 +56,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT, 
> LAST_BIND};
>  #define LOOKUP_PARENT16
>  #define LOOKUP_NOALT 32
>  #define LOOKUP_REVAL 64
> +#define LOOKUP_ONE  128

Well the patch passes my too-small-to-care-about test ;)

Unless someone objects I'd suggest that you add it to the unionfs tree.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

Re: patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

Re: [PATCH] AF_RXRPC: Sort out MTU handling

[PATCH] JFS: Fix race waking up jfsIO kernel thread

[PATCH] AF_RXRPC: Sort out MTU handling

patch: VFS: fix passing of AT_PHDR value in auxv to ELF interpreter

Re: [PATCH] Implement renaming for debugfs

Re: [PATCH 1/1] fs: add 4th case to do_path_lookup

Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

Re: [PATCH 1/1] fs: add 4th case to do_path_lookup

Re: [PATCH 1/1] fs: add 4th case to do_path_lookup

15 matches

Site Navigation

Mail list logo

Footer information