Re: [PATCH] fscrypt: add a documentation file for filesystem-level encryption

2017-08-18 Thread Andreas Dilger
On Aug 18, 2017, at 1:47 PM, Eric Biggers  wrote:
> +Key hierarchy
> +=============
> +
> +Master Keys
> +-----------
> +
> +Userspace should generate master keys either using a cryptographically
> +secure random number generator, e.g. by reading from ``/dev/urandom``
> +or calling getrandom(), or by using a KDF (Key Derivation Function).
> +Note that whenever a KDF is used to "stretch" a lower-entropy secret
> +such as a passphrase, it is critical that a KDF designed for this
> +purpose be used, such as scrypt, PBKDF2, or Argon2.

One minor suggestion - when generating a master key for a filesystem,
I'd think it is preferable to use /dev/random instead of /dev/urandom
to ensure there is enough entropy.
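
For reference, a minimal sketch of the getrandom() route mentioned in the
quoted text, assuming glibc 2.25 or later for <sys/random.h>. With flags
set to 0, getrandom() draws from the same source as /dev/urandom but blocks
until the kernel's pool has been initialised, which covers the early-boot
entropy concern. The helper name generate_master_key() and the 64-byte
MASTER_KEY_SIZE are illustrative assumptions, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Hypothetical key size for illustration; not taken from the patch. */
#define MASTER_KEY_SIZE 64

/*
 * Fill key[0..len) with bytes from the kernel CSPRNG.
 * Returns 0 on success, -1 on failure (errno is set by getrandom()).
 */
static int generate_master_key(unsigned char *key, size_t len)
{
	size_t done = 0;

	assert(key != NULL);

	while (done < len) {
		/* flags == 0: urandom source, blocks until the pool is initialised */
		ssize_t n = getrandom(key + done, len - done, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			return -1;
		}
		done += (size_t)n;
	}
	return 0;
}
```

A caller would pass a buffer of MASTER_KEY_SIZE bytes and treat a non-zero
return as fatal rather than falling back to a weaker source.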

Cheers, Andreas

[PATCH] fscrypt: add a documentation file for filesystem-level encryption

2017-08-18 Thread Eric Biggers
From: Eric Biggers 

Perhaps long overdue, add a documentation file for filesystem-level
encryption, a.k.a. fscrypt or fs/crypto/, to the Documentation
directory.  The new file is based loosely on the latest version of the
"EXT4 Encryption Design Document (public version)" Google Doc, but with
many improvements made, including:

- Reflect the reality that it is not specific to ext4 anymore.
- More thoroughly document the design and user-visible API/behavior.
- Replace outdated information, such as the outdated explanation of how
  encrypted filenames are hashed for indexed directories and how
  encrypted filenames are presented to userspace without the key.
  (This was changed just before release.)

For now the focus is on the design and user-visible API/behavior, not on
how to add encryption support to a filesystem --- since the internal API
is still pretty messy and any standalone documentation for it would
become outdated as things get refactored over time.

Signed-off-by: Eric Biggers 
---
 Documentation/filesystems/fscrypt.rst | 587 ++
 Documentation/filesystems/index.rst   |  11 +
 MAINTAINERS   |   1 +
 3 files changed, 599 insertions(+)
 create mode 100644 Documentation/filesystems/fscrypt.rst

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
new file mode 100644
index ..633d859a0ab1
--- /dev/null
+++ b/Documentation/filesystems/fscrypt.rst
@@ -0,0 +1,587 @@
+=====================================
+Filesystem-level encryption (fscrypt)
+=====================================
+
+Introduction
+============
+
+fscrypt is a library which filesystems can hook into to support
+transparent encryption of files and directories.
+
+Note: "fscrypt" in this document refers to the kernel-level portion,
+implemented in ``fs/crypto/``, as opposed to the userspace tool
+`fscrypt `_.  This document only
+covers the kernel-level portion.  For command-line examples of how to
+use encryption, see the documentation for the userspace tool `fscrypt
+`_.
+
+Unlike dm-crypt, fscrypt operates at the filesystem level rather than
+at the block device level.  This allows it to encrypt different files
+with different keys and to have unencrypted files on the same
+filesystem.  This is useful for multi-user systems where each user's
+data-at-rest needs to be cryptographically isolated from the others.
+However, except for filenames, fscrypt does not encrypt filesystem
+metadata.
+
+Unlike eCryptfs, which is a stacked filesystem, fscrypt is integrated
+directly into supported filesystems --- currently ext4, F2FS, and
+UBIFS.  This allows encrypted files to be read and written without
+caching both the decrypted and encrypted pages in the pagecache,
+thereby halving the memory used and bringing it in line with
+unencrypted files.  Similarly, half as many dentries and inodes are
+needed.  eCryptfs also limits filenames to 143 bytes, causing
+application compatibility issues; fscrypt allows the full 255 bytes
+(NAME_MAX).  Finally, unlike eCryptfs, the fscrypt API can be used by
+unprivileged users, with no need to mount anything.
+
+fscrypt does not support encrypting files in-place.  Instead, it
+supports marking an empty directory as encrypted.  Then, after
+userspace provides the key, all regular files, directories, and
+symbolic links created in that directory tree are transparently
+encrypted.
+
+Threat model
+============
+
+Offline attacks
+---------------
+
+Provided that userspace chooses a strong encryption key, fscrypt
+protects the confidentiality of file contents and filenames in the
+event of a single point-in-time permanent offline compromise of the
+block device content.  fscrypt does not protect the confidentiality of
+non-filename metadata, e.g. file sizes, file permissions, file
+timestamps, and extended attributes.  Also, the existence and location
+of holes (unallocated blocks which logically contain all zeroes) in
+files is not protected.
+
+fscrypt is not guaranteed to protect confidentiality or authenticity
+if an attacker is able to manipulate the filesystem offline prior to
+an authorized user later accessing the filesystem.
+
+Online attacks
+--------------
+
+fscrypt (and storage encryption in general) can only provide limited
+protection, if any at all, against online attacks.  In detail:
+
+fscrypt is only resistant to side-channel attacks, such as timing or
+electromagnetic attacks, to the extent that the underlying Linux
+Cryptographic API algorithms are.  If a vulnerable algorithm is used,
+such as a table-based implementation of AES, it may be possible for an
+attacker to mount a side channel attack against the online system.
+
+After an encryption key has been provided, fscrypt is not designed to
+hide the plaintext file contents or filenames from other users on the
+same system, regardless of the visibility of the keyrin

Re: [PATCH v3 0/5] fs/dcache: Limit # of negative dentries

2017-08-18 Thread Waiman Long
On 08/18/2017 05:59 AM, Wangkai (Kevin,C) wrote:
>
>>> In my patch the DCACHE_FILE_REMOVED flag was meant to distinguish a
>>> removed file from a closed file. I found no difference between the
>>> dentry of a removed file and that of a closed file; they are all on
>>> the LRU list.
>> There is a difference between a removed file and a closed file. The type
>> field of d_flags will be empty for a removed file, which indicates a
>> negative dentry. Anything else is a positive dentry. Look at the inline
>> function d_is_negative() [d_is_miss()] and you will see how it is done.
> After the file was removed, the dentry flag was not MISS; the flags were:
> DCACHE_REFERENCED | DCACHE_RCUACCESS | DCACHE_LRU_LIST | DCACHE_REGULAR_TYPE
> So the dentry is never freed until the kernel reclaims the slab memory.

The dentry_unlink_inode() function will clear DCACHE_REGULAR_TYPE.
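
To illustrate the point about the type field, here is a small userspace
model of the check. The flag values are assumed to mirror
include/linux/dcache.h around v4.13 and should be verified against the
tree; struct dentry_model, model_is_negative() and model_unlink_inode()
are hypothetical stand-ins for struct dentry, d_is_negative() and the
flag-clearing step of dentry_unlink_inode().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values, modelled on include/linux/dcache.h around v4.13. */
#define DCACHE_REFERENCED	0x00000040u
#define DCACHE_RCUACCESS	0x00000080u
#define DCACHE_LRU_LIST		0x00080000u
#define DCACHE_ENTRY_TYPE	0x00700000u	/* mask covering the type field */
#define DCACHE_MISS_TYPE	0x00000000u	/* empty type: negative dentry */
#define DCACHE_REGULAR_TYPE	0x00400000u

/* Hypothetical stand-in for struct dentry; only the flags matter here. */
struct dentry_model {
	unsigned int d_flags;
};

/* A dentry is negative when its type field is empty. */
static bool model_is_negative(const struct dentry_model *d)
{
	assert(d != NULL);
	return (d->d_flags & DCACHE_ENTRY_TYPE) == DCACHE_MISS_TYPE;
}

/* Clear only the type field; REFERENCED, LRU_LIST etc. are left alone. */
static void model_unlink_inode(struct dentry_model *d)
{
	assert(d != NULL);
	d->d_flags &= ~DCACHE_ENTRY_TYPE;
}
```

In this model a dentry carrying the four flags listed above is positive,
and it becomes negative once the type field is cleared, while remaining on
the LRU list.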

Cheers,
Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 1/4] powerpc/fadump: reduce memory consumption for capture kernel

2017-08-18 Thread Hari Bathini



On Friday 18 August 2017 05:27 PM, Michal Suchánek wrote:
> On Fri, 18 Aug 2017 16:20:53 +0530
> Hari Bathini  wrote:
>
> > Hi Michal,
> >
> >
> > Thanks for the patches. I tried testing with the patches:
> >
> > [0.00] fadump: Firmware-assisted dump is active.
> > [0.00] fadump: Modifying command line to enforce the
> > additional parameters passed through 'fadump_extra_args='
> > [0.00] fadump: Original command line:
> > BOOT_IMAGE=/vmlinux-4.13.0-rc1-bz155783+ root=/dev/mapper/rhel-root
> > ro crashkernel=2048M fadump=on fadump_reserve_mem=1024M
> > "fadump_extra_args=nr_cpus=1 numa=off udev.childern-max=2"
> > rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap
> > [0.00] fadump: Modified command line:
> > BOOT_IMAGE=/vmlinux-4.13.0-rc1-bz155783+ root=/dev/mapper/rhel-root
> > ro crashkernel=2048M fadump=on fadump_reserve_mem=1024M
> > "fadump_extra_args nr_cpus=1 numa=off udev.childern-max=2"
> > rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap
> >
> > Looks like the quotes are retained not enforcing the parameters...
>
> Hello,
>
> You are passing an argument >>"fadump_extra_args<<  - that is an
> argument containing a quote in the name. I did not test this scenario

Actually, this was not intentional..
I passed fadump_extra_args="nr_cpus=1 numa=off udev.childern-max=2"
through grub loader
but chosen/bootargs ended up having
"fadump_extra_args=nr_cpus=1 numa=off udev.childern-max=2".

Need to check why that is..

Thanks
Hari

> assuming the argument name would not match in this case. It would
> probably not match if the quote was in the middle of the argument
> name but at the start it is skipped. Note that due to the requirement
> to remove quotes symmetrically which is added in the third patch this
> case does not break the commandline - it merely makes the arguments
> ineffective.
>
> The format suggested in the documentation is
> >>fadump_extra_args="nr_cpus=1 numa=off udev.childern-max=2"<< - that
> is quotes around the value. This format worked in my testing.
>
> Unfortunately, the format with quote before argument name would
> probably require another extra parameter to the callback to detect
> properly.
>
> Thanks
>
> Michal
>
> > I am yet to test the patches in other scenarios though..
> >
> >
> > Thanks
> >
> > Hari

> > On Friday 18 August 2017 01:44 AM, Michal Suchanek wrote:
> > > From: Hari Bathini 
> > >
> > > With fadump (dump capture) kernel booting like a regular kernel, it
> > > needs almost the same amount of memory to boot as the production
> > > kernel, which is unwarranted for a dump capture kernel. But with no
> > > option to disable some of the unnecessary subsystems in fadump
> > > kernel, that much memory is wasted on fadump, depriving the
> > > production kernel of that memory.
> > >
> > > Introduce kernel parameter 'fadump_extra_args=' that would take
> > > regular parameters as a space separated quoted string, to be
> > > enforced when fadump is active. This 'fadump_extra_args=' parameter
> > > can be leveraged to pass parameters like nr_cpus=1,
> > > cgroup_disable=memory and numa=off, to disable unwarranted
> > > resources/subsystems.
> > >
> > > Also, ensure the log "Firmware-assisted dump is active" is printed
> > > early in the boot process to put the subsequent fadump messages in
> > > context.
> > >
> > > Suggested-by: Michael Ellerman 
> > > Signed-off-by: Hari Bathini 
> > > Signed-off-by: Michal Suchanek 
> > > ---
> > > Changes from v6:
> > > Correct and simplify quote handling. Ideally I would like to extend
> > > parse_args to give the length of the original quoted value to
> > > callback. However, parse_args removes at most one double-quote from
> > > the start and one from the end so that is easy to detect. Otherwise
> > > all other users will have to be updated to trash the new argument.
> > > ---
> > >   arch/powerpc/include/asm/fadump.h |   2 +
> > >   arch/powerpc/kernel/fadump.c  | 109
> > > --
> > > arch/powerpc/kernel/prom.c|   7 +++ 3 files changed, 115
> > > insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/powerpc/include/asm/fadump.h
> > > b/arch/powerpc/include/asm/fadump.h index
> > > ce88bbe1d809..98ae00943fb3 100644 ---
> > > a/arch/powerpc/include/asm/fadump.h +++
> > > b/arch/powerpc/include/asm/fadump.h @@ -208,11 +208,13 @@ extern
> > > int early_init_dt_scan_fw_dump(unsigned long node, const char
> > > *uname, int depth, void *data); extern int fadump_reserve_mem(void);
> > >   extern int setup_fadump(void);
> > > +extern void enforce_fadump_extra_args(char *cmdline);
> > >   extern int is_fadump_active(void);
> > >   extern void crash_fadump(struct pt_regs *, const char *);
> > >   extern void fadump_cleanup(void);
> > >
> > >   #else /* CONFIG_FA_DUMP */
> > > +static inline void enforce_fadump_extra_args(char *cmdline) { }
> > >   static inline int is_fadump_active(void) { return 0; }
> > >   static inline void crash_fadump(struct pt_regs *regs, const char
> > > *str) { } #endif
> > > diff --git a/arch/powerpc/kernel/fadump.c
> > > b/arch/powerpc/kernel/fadump.c index dc0c49cfd90a..a1614d9b8a21
> > > 100644 --- a/arch/powerpc/kernel/fadump.c
> > > +++ b/arch/powerpc/kernel/fadump.c
> > > @@ -78,8 +78,10 @@ int __init early_init_dt_scan_fw_dump(unsigned
> > > long node,
> > >  * dump data waiting for us.
> > >  */
> > > fdm_active = of_get_flat_dt_prop(node, "ibm,kernel-dump",
> > > NULL);
> > > -   if (fdm_active)
> > > +   if (fdm_active)
Re: [PATCH] switchdev: documentation: minor typo fixes

2017-08-18 Thread Jonathan Corbet
On Fri, 18 Aug 2017 13:34:35 +1200
Chris Packham  wrote:

> Two typos in switchdev.txt

This looks good, but davem likes to take networking-related docs patches
through his tree.  Can I suggest resending with a copy to
net...@vger.kernel.org?

Thanks,

jon


Re: [PATCH v7 1/4] powerpc/fadump: reduce memory consumption for capture kernel

2017-08-18 Thread Michal Suchánek
On Fri, 18 Aug 2017 16:20:53 +0530
Hari Bathini  wrote:

> Hi Michal,
> 
> 
> Thanks for the patches. I tried testing with the patches:
> 
> [0.00] fadump: Firmware-assisted dump is active.
> [0.00] fadump: Modifying command line to enforce the
> additional parameters passed through 'fadump_extra_args='
> [0.00] fadump: Original command line: 
> BOOT_IMAGE=/vmlinux-4.13.0-rc1-bz155783+ root=/dev/mapper/rhel-root
> ro crashkernel=2048M fadump=on fadump_reserve_mem=1024M 
> "fadump_extra_args=nr_cpus=1 numa=off udev.childern-max=2" 
> rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap
> [0.00] fadump: Modified command line: 
> BOOT_IMAGE=/vmlinux-4.13.0-rc1-bz155783+ root=/dev/mapper/rhel-root
> ro crashkernel=2048M fadump=on fadump_reserve_mem=1024M
> "fadump_extra_args nr_cpus=1 numa=off udev.childern-max=2"
> rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap
> 
> Looks like the quotes are retained not enforcing the parameters...

Hello,

You are passing an argument >>"fadump_extra_args<<  - that is an
argument containing a quote in the name. I did not test this scenario
assuming the argument name would not match in this case. It would
probably not match if the quote was in the middle of the argument
name but at the start it is skipped. Note that due to the requirement
to remove quotes symmetrically which is added in the third patch this
case does not break the commandline - it merely makes the arguments
ineffective.

The format suggested in the documentation is
>>fadump_extra_args="nr_cpus=1 numa=off udev.childern-max=2"<< - that
is quotes around the value. This format worked in my testing.

Unfortunately, the format with quote before argument name would
probably require another extra parameter to the callback to detect
properly.
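
To make the two forms concrete, here is a simplified userspace model of
the quote handling described above. It is only a sketch: split_arg() is a
hypothetical helper, not the kernel's parser, and it assumes the token has
already been separated from the rest of the command line. It skips a quote
in front of the name (together with its closing partner) and strips quotes
around the value, so both spellings end up with the same name and value,
which is why a callback cannot tell them apart from name and value alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split one command-line token in place into name and value.
 * Hypothetical helper; a simplified model of the behaviour discussed
 * in this thread, not the kernel's implementation.
 */
static void split_arg(char *arg, char **name, char **val)
{
	size_t len;
	char *eq;

	assert(arg && name && val);

	/* A quote in front of the name is skipped, with its closing quote. */
	if (*arg == '"') {
		arg++;
		len = strlen(arg);
		if (len && arg[len - 1] == '"')
			arg[len - 1] = '\0';
	}

	*name = arg;
	eq = strchr(arg, '=');
	if (!eq) {
		*val = NULL;	/* bare flag without a value */
		return;
	}
	*eq = '\0';
	*val = eq + 1;

	/* Quotes around the value are removed symmetrically. */
	if (**val == '"') {
		(*val)++;
		len = strlen(*val);
		if (len && (*val)[len - 1] == '"')
			(*val)[len - 1] = '\0';
	}
}
```

Feeding it fadump_extra_args="nr_cpus=1 numa=off" and
"fadump_extra_args=nr_cpus=1 numa=off" yields the same pair in this model.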

Thanks

Michal

> 
> I am yet to test the patches in other scenarios though..
> 
> 
> Thanks
> 
> Hari
> 
> 
> On Friday 18 August 2017 01:44 AM, Michal Suchanek wrote:
> > From: Hari Bathini 
> >
> > With fadump (dump capture) kernel booting like a regular kernel, it
> > needs almost the same amount of memory to boot as the production
> > kernel, which is unwarranted for a dump capture kernel. But with no
> > option to disable some of the unnecessary subsystems in fadump
> > kernel, that much memory is wasted on fadump, depriving the
> > production kernel of that memory.
> >
> > Introduce kernel parameter 'fadump_extra_args=' that would take
> > regular parameters as a space separated quoted string, to be
> > enforced when fadump is active. This 'fadump_extra_args=' parameter
> > can be leveraged to pass parameters like nr_cpus=1,
> > cgroup_disable=memory and numa=off, to disable unwarranted
> > resources/subsystems.
> >
> > Also, ensure the log "Firmware-assisted dump is active" is printed
> > early in the boot process to put the subsequent fadump messages in
> > context.
> >
> > Suggested-by: Michael Ellerman 
> > Signed-off-by: Hari Bathini 
> > Signed-off-by: Michal Suchanek 
> > ---
> > Changes from v6:
> > Correct and simplify quote handling. Ideally I would like to extend
> > parse_args to give the length of the original quoted value to
> > callback. However, parse_args removes at most one double-quote from
> > the start and one from the end so that is easy to detect. Otherwise
> > all other users will have to be updated to trash the new argument.
> > ---
> >   arch/powerpc/include/asm/fadump.h |   2 +
> >   arch/powerpc/kernel/fadump.c  | 109
> > --
> > arch/powerpc/kernel/prom.c|   7 +++ 3 files changed, 115
> > insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/fadump.h
> > b/arch/powerpc/include/asm/fadump.h index
> > ce88bbe1d809..98ae00943fb3 100644 ---
> > a/arch/powerpc/include/asm/fadump.h +++
> > b/arch/powerpc/include/asm/fadump.h @@ -208,11 +208,13 @@ extern
> > int early_init_dt_scan_fw_dump(unsigned long node, const char
> > *uname, int depth, void *data); extern int fadump_reserve_mem(void);
> >   extern int setup_fadump(void);
> > +extern void enforce_fadump_extra_args(char *cmdline);
> >   extern int is_fadump_active(void);
> >   extern void crash_fadump(struct pt_regs *, const char *);
> >   extern void fadump_cleanup(void);
> >
> >   #else /* CONFIG_FA_DUMP */
> > +static inline void enforce_fadump_extra_args(char *cmdline) { }
> >   static inline int is_fadump_active(void) { return 0; }
> >   static inline void crash_fadump(struct pt_regs *regs, const char
> > *str) { } #endif
> > diff --git a/arch/powerpc/kernel/fadump.c
> > b/arch/powerpc/kernel/fadump.c index dc0c49cfd90a..a1614d9b8a21
> > 100644 --- a/arch/powerpc/kernel/fadump.c
> > +++ b/arch/powerpc/kernel/fadump.c
> > @@ -78,8 +78,10 @@ int __init early_init_dt_scan_fw_dump(unsigned
> > long node,
> >  * dump data waiting for us.
> >  */
> > fdm_active = of_get_flat_dt_prop(node, "ibm,kernel-dump",
> > NULL);
> > -   if (fdm_active)
> > +   if (fdm_active) {
> > +   pr_info("Firmware-as

Re: [PATCH v7 1/4] powerpc/fadump: reduce memory consumption for capture kernel

2017-08-18 Thread Hari Bathini

Hi Michal,


Thanks for the patches. I tried testing with the patches:

[0.00] fadump: Firmware-assisted dump is active.
[0.00] fadump: Modifying command line to enforce the additional 
parameters passed through 'fadump_extra_args='
[0.00] fadump: Original command line: 
BOOT_IMAGE=/vmlinux-4.13.0-rc1-bz155783+ root=/dev/mapper/rhel-root ro 
crashkernel=2048M fadump=on fadump_reserve_mem=1024M 
"fadump_extra_args=nr_cpus=1 numa=off udev.childern-max=2" 
rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap
[0.00] fadump: Modified command line: 
BOOT_IMAGE=/vmlinux-4.13.0-rc1-bz155783+ root=/dev/mapper/rhel-root ro 
crashkernel=2048M fadump=on fadump_reserve_mem=1024M "fadump_extra_args 
nr_cpus=1 numa=off udev.childern-max=2" rd.lvm.lv=rhel/root 
rd.lvm.lv=rhel/swap


Looks like the quotes are retained not enforcing the parameters...

I am yet to test the patches in other scenarios though..


Thanks

Hari


On Friday 18 August 2017 01:44 AM, Michal Suchanek wrote:

From: Hari Bathini 

With fadump (dump capture) kernel booting like a regular kernel, it needs
almost the same amount of memory to boot as the production kernel, which is
unwarranted for a dump capture kernel. But with no option to disable some
of the unnecessary subsystems in fadump kernel, that much memory is wasted
on fadump, depriving the production kernel of that memory.

Introduce kernel parameter 'fadump_extra_args=' that would take regular
parameters as a space separated quoted string, to be enforced when fadump
is active. This 'fadump_extra_args=' parameter can be leveraged to pass
parameters like nr_cpus=1, cgroup_disable=memory and numa=off, to disable
unwarranted resources/subsystems.

Also, ensure the log "Firmware-assisted dump is active" is printed early
in the boot process to put the subsequent fadump messages in context.

Suggested-by: Michael Ellerman 
Signed-off-by: Hari Bathini 
Signed-off-by: Michal Suchanek 
---
Changes from v6:
Correct and simplify quote handling. Ideally I would like to extend
parse_args to give the length of the original quoted value to callback.
However, parse_args removes at most one double-quote from the start and
one from the end so that is easy to detect. Otherwise all other users
will have to be updated to trash the new argument.
---
  arch/powerpc/include/asm/fadump.h |   2 +
  arch/powerpc/kernel/fadump.c  | 109 --
  arch/powerpc/kernel/prom.c|   7 +++
  3 files changed, 115 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
index ce88bbe1d809..98ae00943fb3 100644
--- a/arch/powerpc/include/asm/fadump.h
+++ b/arch/powerpc/include/asm/fadump.h
@@ -208,11 +208,13 @@ extern int early_init_dt_scan_fw_dump(unsigned long node,
const char *uname, int depth, void *data);
  extern int fadump_reserve_mem(void);
  extern int setup_fadump(void);
+extern void enforce_fadump_extra_args(char *cmdline);
  extern int is_fadump_active(void);
  extern void crash_fadump(struct pt_regs *, const char *);
  extern void fadump_cleanup(void);

  #else /* CONFIG_FA_DUMP */
+static inline void enforce_fadump_extra_args(char *cmdline) { }
  static inline int is_fadump_active(void) { return 0; }
  static inline void crash_fadump(struct pt_regs *regs, const char *str) { }
  #endif
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index dc0c49cfd90a..a1614d9b8a21 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -78,8 +78,10 @@ int __init early_init_dt_scan_fw_dump(unsigned long node,
 * dump data waiting for us.
 */
fdm_active = of_get_flat_dt_prop(node, "ibm,kernel-dump", NULL);
-   if (fdm_active)
+   if (fdm_active) {
+   pr_info("Firmware-assisted dump is active.\n");
fw_dump.dump_active = 1;
+   }

/* Get the sizes required to store dump data for the firmware provided
 * dump sections.
@@ -332,8 +334,11 @@ int __init fadump_reserve_mem(void)
  {
unsigned long base, size, memory_boundary;

-   if (!fw_dump.fadump_enabled)
+   if (!fw_dump.fadump_enabled) {
+   if (fw_dump.dump_active)
+   pr_warn("Firmware-assisted dump was active but kernel booted with fadump disabled!\n");
return 0;
+   }

if (!fw_dump.fadump_supported) {
printk(KERN_INFO "Firmware-assisted dump is not supported on"
@@ -373,7 +378,6 @@ int __init fadump_reserve_mem(void)
memory_boundary = memblock_end_of_DRAM();

if (fw_dump.dump_active) {
-   printk(KERN_INFO "Firmware-assisted dump is active.\n");
/*
 * If last boot has crashed then reserve all the memory
 * above boot_memory_size so that we don't touch it until
@@ -460,6 +464,105 @@ static int __init early_fadump_reserve_mem(char *p)
  }
  early_param(

RE: [PATCH v3 0/5] fs/dcache: Limit # of negative dentries

2017-08-18 Thread Wangkai (Kevin,C)


> -----Original Message-----
> From: Waiman Long [mailto:long...@redhat.com]
> Sent: Thursday, August 17, 2017 9:04 PM
> To: Wangkai (Kevin,C); Alexander Viro; Jonathan Corbet
> Cc: linux-ker...@vger.kernel.org; linux-doc@vger.kernel.org;
> linux-fsde...@vger.kernel.org; Paul E. McKenney; Andrew Morton; Ingo Molnar;
> Miklos Szeredi; Matthew Wilcox; Larry Woodman; James Bottomley
> Subject: Re: [PATCH v3 0/5] fs/dcache: Limit # of negative dentries
> 
> On 08/17/2017 12:00 AM, Wangkai (Kevin,C) wrote:
> >
> >>>
> >>> Hi Longman,
> >>> I am new to fsdevel; I joined this mailing list about two weeks
> >>> ago. Recently I ran into the same problem with negative dentries.
> >>> In my opinion, the dentries should be removed together with the
> >>> files or directories. I did not know you had submitted this patch;
> >>> I have another patch about this:
> >>> http://marc.info/?l=linux-fsdevel&m=150209902215266&w=2
> >>>
> >>> maybe this is a foo idea...
> >>>
> >>> regards
> >>> Kevin
> >> If you look at the code, the front dentries of the LRU list are
> >> removed when there are too many negative dentries. That includes
> >> positive dentries as well as it is not practical to just remove the
> >> negative dentries.
> >>
> >> I have looked at your patch. The dentry of a removed file becomes a
> >> negative dentry. The kernel can keep track of those negative entries
> >> and there is no need to add an additional flag for that.
> >>
> >> Cheers,
> >> Longman
> > One comment about your patch:
> > In patch 1/5, the function dentry_kill first gets dentry->d_flags and,
> > after locking the parent, compares d_flags again. Is this needed? The
> > d_flags was changed under lock.
> 
> Yes, it is necessary. We are talking about an SMP system with multiple threads
> running concurrently. If you look at the lock parent code, it may release the
> current dentry lock before taking the parent's and then the dentry lock again.
> As soon as the lock is released, anything can happen to the dentry including
> changes in d_flags.

Yes, I had not checked the lock parent code. It is necessary.
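
For anyone following along, the pattern can be modelled in a few lines of
userspace C. This is only a sketch under simplifying assumptions: struct
node, lock_parent_model(), kill_model() and racing_writer() are
hypothetical names, the "lock" is a plain boolean rather than a spinlock,
and the concurrent thread is simulated by a callback that runs in the
window where the child lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry with its own lock and flags. */
struct node {
	bool locked;		/* models a spinlock: held or not held */
	unsigned int flags;
	struct node *parent;
};

static void lock(struct node *n)   { assert(!n->locked); n->locked = true; }
static void unlock(struct node *n) { assert(n->locked);  n->locked = false; }

/* Simulates another CPU changing the flags while the lock is dropped. */
static void racing_writer(struct node *n)
{
	n->flags |= 1u;
}

/*
 * Lock order is parent first, then child, so the child lock is dropped
 * before the parent lock is taken. `window` (may be NULL) runs at the
 * point where a concurrent thread could modify the child.
 */
static void lock_parent_model(struct node *child, void (*window)(struct node *))
{
	unlock(child);
	if (window)
		window(child);
	lock(child->parent);
	lock(child);
}

/* Returns true only if the flags are still what was read before relocking. */
static bool kill_model(struct node *child, void (*window)(struct node *))
{
	unsigned int seen;
	bool unchanged;

	lock(child);
	seen = child->flags;			/* first read */
	lock_parent_model(child, window);
	unchanged = (child->flags == seen);	/* second comparison */
	unlock(child);
	unlock(child->parent);
	return unchanged;
}
```

Without the second comparison, kill_model() would act on the stale value
in `seen` whenever racing_writer() gets to run inside the window.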

> > In my patch the DCACHE_FILE_REMOVED flag was meant to distinguish a
> > removed file from a closed file. I found no difference between the
> > dentry of a removed file and that of a closed file; they are all on
> > the LRU list.
>
> There is a difference between a removed file and a closed file. The type
> field of d_flags will be empty for a removed file, which indicates a
> negative dentry. Anything else is a positive dentry. Look at the inline
> function d_is_negative() [d_is_miss()] and you will see how it is done.

After the file was removed, the dentry flag was not MISS; the flags were:
DCACHE_REFERENCED | DCACHE_RCUACCESS | DCACHE_LRU_LIST | DCACHE_REGULAR_TYPE
So the dentry is never freed until the kernel reclaims the slab memory.

Regards,
Kevin