Re: PAT support for i386 and x86_64

2007-08-08 Thread Cédric Augonnet
[Apologize for the double-post, messed up with my mailer... ]

2007/8/8, Andi Kleen <[EMAIL PROTECTED]>:
> > I don't see why we have to worry about cache corruption in the case at
> > hand. Write-combining is needed to map io (typically pci-mem regions)
> > which are never mapped cachable anywhere, including in the linear map.
>
> If we WC them using PAT then there would be a UC<->WC conflict with
> the direct mapping and possibly others. That's already undefined
> and not allowed.
>
> After some very bad experiences in the past I'm not going to take
> chances on this.
>
> We really need to keep all possible mappings synchronized.
>
> -Andi
>
>

Hi,

First thanks for your reactions.

Andi, what i don't understand is, if we only put a minimal patch, like only
modifying the register, perhaps we may avoid having all vendors doing
their own recipe, possibly all messing the one the other. As you said ,
changing this register is absolutely not the actual issue with PAT, but don't
you think such a first step is needed to avoid conflicts since people _do_
already set that register from their driver.

I agree there is work for a comprehensive support of PAT, especially when
dealing with ioremap, but even though we only handle a smallish portion
of the problem yet, perhaps it's time to at least modify that register ?

Of course there is no problem for that being dependant on a CONFIG_PAT.

Kind regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: PAT support for i386 and x86_64

2007-08-08 Thread Cédric Augonnet
[Apologize for the double-post, messed up with my mailer... ]

2007/8/8, Andi Kleen [EMAIL PROTECTED]:
  I don't see why we have to worry about cache corruption in the case at
  hand. Write-combining is needed to map io (typically pci-mem regions)
  which are never mapped cachable anywhere, including in the linear map.

 If we WC them using PAT then there would be a UC-WC conflict with
 the direct mapping and possibly others. That's already undefined
 and not allowed.

 After some very bad experiences in the past I'm not going to take
 chances on this.

 We really need to keep all possible mappings synchronized.

 -Andi



Hi,

First thanks for your reactions.

Andi, what i don't understand is, if we only put a minimal patch, like only
modifying the register, perhaps we may avoid having all vendors doing
their own recipe, possibly all messing the one the other. As you said ,
changing this register is absolutely not the actual issue with PAT, but don't
you think such a first step is needed to avoid conflicts since people _do_
already set that register from their driver.

I agree there is work for a comprehensive support of PAT, especially when
dealing with ioremap, but even though we only handle a smallish portion
of the problem yet, perhaps it's time to at least modify that register ?

Of course there is no problem for that being dependant on a CONFIG_PAT.

Kind regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] PAT support for write combining in pci_mmap_page_range

2007-08-06 Thread Cédric Augonnet

Adds support for write-combining in pci_mmap_page_range using PAT6.

Some distinction has to be made between huge pages and normal pages since
the position of the bit encoding PAT depends on that.

Signed-off-by: Cedric Augonnet <[EMAIL PROTECTED]>
CC: Loic Prylli <[EMAIL PROTECTED]>
CC: Brice Goglin <[EMAIL PROTECTED]>	
---
diff -urN 2.6.23-rc2/arch/i386/pci/i386.c 2.6.23-rc2-pat/arch/i386/pci/i386.c
--- 2.6.23-rc2/arch/i386/pci/i386.c	2007-07-31 10:48:58.0 -0400
+++ 2.6.23-rc2-pat/arch/i386/pci/i386.c	2007-08-06 18:19:18.0 -0400
@@ -299,13 +299,20 @@
 	 * address on this platform.
 	 */
 	prot = pgprot_val(vma->vm_page_prot);
-	if (boot_cpu_data.x86 > 3)
-		prot |= _PAGE_PCD | _PAGE_PWT;
-	vma->vm_page_prot = __pgprot(prot);
 
-	/* Write-combine setting is ignored, it is changed via the mtrr
-	 * interfaces on this platform.
+	/* Unless PAT is enabling write combining, it is here ignored and
+	 * changed via the mtrr interfaces on this platform.
 	 */
+	if (cpu_has_pat && write_combine) {
+		/* PAT bit is not the same if we have huge pages */
+		vma->vm_page_prot =
+			__pgprot_wc(prot, (vma->vm_flags & VM_HUGETLB));
+	} else {
+		if (boot_cpu_data.x86 > 3)
+			prot |= _PAGE_PCD | _PAGE_PWT;
+		vma->vm_page_prot = __pgprot(prot);
+	}
+
 	if (io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
 			   vma->vm_end - vma->vm_start,
 			   vma->vm_page_prot))
diff -urN 2.6.23-rc2/include/asm-i386/pgtable.h 2.6.23-rc2-pat/include/asm-i386/pgtable.h
--- 2.6.23-rc2/include/asm-i386/pgtable.h	2007-08-06 15:17:12.0 -0400
+++ 2.6.23-rc2-pat/include/asm-i386/pgtable.h	2007-08-06 18:33:48.0 -0400
@@ -119,6 +119,12 @@
 #define _PAGE_UNUSED2	0x400
 #define _PAGE_UNUSED3	0x800
 
+/* defined for write combining support with PAT enabled */
+#define _PAGE_PAT6		0x088		/* PAT6 for 4K pages */
+#define _PAGE_PAT6_HUGE		0x1008		/* PAT6 for huge pages */
+#define _PAGE_WR_CMB		_PAGE_PAT6
+#define _PAGE_WR_CMB_HUGE	_PAGE_PAT6_HUGE
+
 /* If _PAGE_PRESENT is clear, we use these: */
 #define _PAGE_FILE	0x040	/* nonlinear file mapping, saved PTE; unset:swap */
 #define _PAGE_PROTNONE	0x080	/* if the user mapped it with PROT_NONE;
@@ -171,6 +177,11 @@
 #define PAGE_KERNEL_LARGE	__pgprot(__PAGE_KERNEL_LARGE)
 #define PAGE_KERNEL_LARGE_EXEC	__pgprot(__PAGE_KERNEL_LARGE_EXEC)
 
+#define __pgprot_wc(flag, is_huge_page)\
+	(__pgprot((is_huge_page) ?\
+		((flag) & (~_PAGE_PWT)) | _PAGE_WR_CMB_HUGE :	\
+		((flag) & (~_PAGE_PWT)) | _PAGE_WR_CMB) )
+
 /*
  * The i386 can't do page protection for execute, and considers that
  * the same are read. Also, write permissions imply read permissions.
diff -urN 2.6.23-rc2/include/asm-x86_64/pgtable.h 2.6.23-rc2-pat/include/asm-x86_64/pgtable.h
--- 2.6.23-rc2/include/asm-x86_64/pgtable.h	2007-08-06 15:17:15.0 -0400
+++ 2.6.23-rc2-pat/include/asm-x86_64/pgtable.h	2007-08-06 18:29:28.0 -0400
@@ -164,6 +164,13 @@
 #define _PAGE_GLOBAL	0x100	/* Global TLB entry */
 
 #define _PAGE_PROTNONE	0x080	/* If not present */
+
+/* defined for write combining support with PAT enabled */
+#define _PAGE_PAT6		0x088		/* PAT6 for 4K pages */
+#define _PAGE_PAT6_HUGE		0x1008		/* PAT6 for huge pages */
+#define _PAGE_WR_CMB		_PAGE_PAT6
+#define _PAGE_WR_CMB_HUGE	_PAGE_PAT6_HUGE
+
 #define _PAGE_NX(_AC(1,UL)<<_PAGE_BIT_NX)
 
 #define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | _PAGE_ACCESSED | _PAGE_DIRTY)
@@ -196,6 +203,11 @@
 #define __PAGE_KERNEL_LARGE_EXEC \
 	(__PAGE_KERNEL_EXEC | _PAGE_PSE)
 
+#define __pgprot_wc(flag, is_huge_page)\
+	(__pgprot((is_huge_page) ?\
+		((flag) & (~_PAGE_PWT)) | _PAGE_WR_CMB_HUGE :	\
+		((flag) & (~_PAGE_PWT)) | _PAGE_WR_CMB))
+
 #define MAKE_GLOBAL(x) __pgprot((x) | _PAGE_GLOBAL)
 
 #define PAGE_KERNEL MAKE_GLOBAL(__PAGE_KERNEL)


[PATCH 1/2] PAT setting write combining on PAT6 at boot time

2007-08-06 Thread Cédric Augonnet

This sets PAT6 to write combining during boot on i386 and x86_64

Signed-off-by: Cedric Augonnet <[EMAIL PROTECTED]>
CC: Loic Prylli <[EMAIL PROTECTED]>
CC: Brice Goglin <[EMAIL PROTECTED]>
---
diff -urN 2.6.23-rc2/arch/i386/kernel/cpu/Makefile 2.6.23-rc2-pat/arch/i386/kernel/cpu/Makefile
--- 2.6.23-rc2/arch/i386/kernel/cpu/Makefile	2007-08-06 15:16:16.0 -0400
+++ 2.6.23-rc2-pat/arch/i386/kernel/cpu/Makefile	2007-08-06 17:44:36.0 -0400
@@ -11,6 +11,7 @@
 obj-y	+=	intel.o intel_cacheinfo.o addon_cpuid_features.o
 obj-y	+=	nexgen.o
 obj-y	+=	umc.o
+obj-y	+=	pat.o
 
 obj-$(CONFIG_X86_MCE)	+=	mcheck/
 
diff -urN 2.6.23-rc2/arch/i386/kernel/cpu/pat.c 2.6.23-rc2-pat/arch/i386/kernel/cpu/pat.c
--- 2.6.23-rc2/arch/i386/kernel/cpu/pat.c	1969-12-31 19:00:00.0 -0500
+++ 2.6.23-rc2-pat/arch/i386/kernel/cpu/pat.c	2007-08-06 18:16:37.0 -0400
@@ -0,0 +1,36 @@
+/*
+ * x86 Page Attribute Table (PAT) support.
+ * Copyright (C) 2007 Cedric Augonnet.
+ *   (C) 2007 Myricom, Inc.
+ *   (C) 2003 Jeff Hartmann
+ *   (C) 2001 Jeff Hartmann
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static inline void
+enable_pat_on_cpu(void *info)
+{
+	uint32_t low;
+	uint32_t high;
+
+	/* use PAT 6 for write combining */
+	rdmsr(MSR_CR_PAT, low, high);
+	if ((high & 0xff) != 0x01) {
+		high = (high & 0xff00) | 0x0001;
+		wrmsr(MSR_CR_PAT, low, high);
+	}
+}
+
+void __init
+pat_init(void)
+{
+	/* enable PAT is CPU supports it */
+	if (cpu_has_pat)
+		on_each_cpu(enable_pat_on_cpu, 0, 0, 1);
+}
diff -urN 2.6.23-rc2/arch/i386/kernel/setup.c 2.6.23-rc2-pat/arch/i386/kernel/setup.c
--- 2.6.23-rc2/arch/i386/kernel/setup.c	2007-08-06 15:16:17.0 -0400
+++ 2.6.23-rc2-pat/arch/i386/kernel/setup.c	2007-08-06 17:44:36.0 -0400
@@ -62,6 +62,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* This value is set up by the early boot code to point to the value
immediately after the boot time page tables.  It contains a *physical*
@@ -650,4 +651,5 @@
 	conswitchp = _con;
 #endif
 #endif
+	pat_init();
 }
diff -urN 2.6.23-rc2/arch/x86_64/kernel/Makefile 2.6.23-rc2-pat/arch/x86_64/kernel/Makefile
--- 2.6.23-rc2/arch/x86_64/kernel/Makefile	2007-08-06 15:16:27.0 -0400
+++ 2.6.23-rc2-pat/arch/x86_64/kernel/Makefile	2007-08-06 17:44:36.0 -0400
@@ -16,6 +16,7 @@
 obj-$(CONFIG_X86_MCE_INTEL)	+= mce_intel.o
 obj-$(CONFIG_X86_MCE_AMD)	+= mce_amd.o
 obj-$(CONFIG_MTRR)		+= ../../i386/kernel/cpu/mtrr/
+obj-y+= pat.o
 obj-$(CONFIG_ACPI)		+= acpi/
 obj-$(CONFIG_X86_MSR)		+= msr.o
 obj-$(CONFIG_MICROCODE)		+= microcode.o
@@ -61,3 +62,4 @@
 alternative-y			+= ../../i386/kernel/alternative.o
 pcspeaker-y			+= ../../i386/kernel/pcspeaker.o
 perfctr-watchdog-y		+= ../../i386/kernel/cpu/perfctr-watchdog.o
+pat-y+= ../../i386/kernel/cpu/pat.o
diff -urN 2.6.23-rc2/arch/x86_64/kernel/setup.c 2.6.23-rc2-pat/arch/x86_64/kernel/setup.c
--- 2.6.23-rc2/arch/x86_64/kernel/setup.c	2007-08-06 15:16:27.0 -0400
+++ 2.6.23-rc2-pat/arch/x86_64/kernel/setup.c	2007-08-06 17:44:36.0 -0400
@@ -64,6 +64,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Machine setup..
@@ -416,6 +417,7 @@
 	conswitchp = _con;
 #endif
 #endif
+	pat_init();
 }
 
 static int __cpuinit get_model_name(struct cpuinfo_x86 *c)
diff -urN 2.6.23-rc2/include/asm-i386/cpufeature.h 2.6.23-rc2-pat/include/asm-i386/cpufeature.h
--- 2.6.23-rc2/include/asm-i386/cpufeature.h	2007-08-06 15:17:12.0 -0400
+++ 2.6.23-rc2-pat/include/asm-i386/cpufeature.h	2007-08-06 17:44:36.0 -0400
@@ -164,6 +164,7 @@
 #define cpu_has_pebs 		boot_cpu_has(X86_FEATURE_PEBS)
 #define cpu_has_clflush		boot_cpu_has(X86_FEATURE_CLFLSH)
 #define cpu_has_bts 		boot_cpu_has(X86_FEATURE_BTS)
+#define cpu_has_pat		boot_cpu_has(X86_FEATURE_PAT)
 
 #endif /* __ASM_I386_CPUFEATURE_H */
 
diff -urN 2.6.23-rc2/include/asm-i386/pat.h 2.6.23-rc2-pat/include/asm-i386/pat.h
--- 2.6.23-rc2/include/asm-i386/pat.h	1969-12-31 19:00:00.0 -0500
+++ 2.6.23-rc2-pat/include/asm-i386/pat.h	2007-08-06 18:26:11.0 -0400
@@ -0,0 +1,32 @@
+/*
+ * x86 Page Attribute Table (PAT) support.
+ * Copyright (C) 2007 Cedric Augonnet.
+ *   (C) 2007 Myricom, Inc.
+ *   (C) 2003 Jeff Hartmann
+ *   (C) 2001 Jeff Hartmann
+ */
+
+#ifndef _LINUX_PAT_H
+#define _LINUX_PAT_H
+
+#define	MSR_CR_PAT	0x277
+
+/*
+ *   Once PAT is enabled using pat_init on each CPU,we should have the
+ * following layout : the first four are not changed for backward
+ * compatibility with processors not implementing PAT . PAT 6 is changed
+ * to write combining mode.
+ *
+ *   	PAT 0 : WB		 PAT 4 : WB
+ *   	PAT 1 : WT		 PAT 5 : WT
+ *   	PAT 2 : UC-		 PAT 6 : WC
+ *   	PAT 3 : UC		 PAT 7 : UC-
+ *
+ */
+
+/*
+ * This function enables write combining on PAT6 on each CPU.
+ */
+extern void pat_init(void);
+
+#endif
diff -urN 2.6.23-rc2/include/asm-x86_64/pat.h 2.6.23-rc2-pat/include/asm-x86_64/pat.h

[PATCH 0/2] PAT support for i386 and x86_64

2007-08-06 Thread Cédric Augonnet
Hi all,

For quite a while now, there as been numerous attempt to introduce support for
Page Attribute Table (PAT) in the Linux kernel, whereas most other OS already
have some support for this feature. Such a proposition popping up periodically,
perhaps it would be an opportunity to fix that lack for once.

  PAT actually makes life much easier for people needing to use write-combining
(WC) attribute. Indeed the only solution yet available is MTRRs, which are not
as flexible as PAT is.
  Indeed, even if they both allow to set memory types on memory regions, MTRRs
are intended to be used on a physical memory range, while PAT makes possible
to set such memory type based on a linear address mapping. The actual problem
with MTRR is that the BIOS usually covers physical RAM with write-back (WB),
and that MTRRs should not be overlapped.
  Using write-combining feature is sometimes crucial for performance of driver
for which writes operation on the bus are critical. High-speed networking
drivers such as myri10ge would take benefit of using PAT instead of MTRRs.
Video drivers using frame-buffers could also avoid using MTRRs that way.

  PAT6 can be a candidate for that purpose. Not only for backward compatibility
reasons, but also as various errata concerning PAT would end up by having the
PAT bit ignored, therefore if we use PAT6, when such an error occurs we would
have an uncached area, which means it would at worst not be effective, thus
resolving the issues raised in http://lkml.org/lkml/2004/4/13/102 .

 ***

 Back to 2.4.20, Terence Ripperda already submitted such a proposal for PAT
 support in
   - http://lkml.org/lkml/2003/5/20/131
 himself refering to
   - http://www.ussg.iu.edu/hypermail/linux/kernel/0303.1/0606.html

 In 2005, Eric W. Biederman submitted another similar patch :
   - http://lkml.org/lkml/2005/8/29/226

 More recently, there have been numerous threads dealing with MTRR issues,
 some of them suggesting that there should actually be some support for PAT.
 See for instance :
   - http://lkml.org/lkml/2007/6/6/333

 ***

 I therefore propose here a set of 2 patches to add some support for
 write-combining using PAT :

 [PATCH 1/2] - [PAT] Setting write combining on PAT6 at boot time
 [PATCH 2/2] - [PAT] Support for write combining in pci_mmap_page_range

Kind regards,
Cedric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/2] PAT support for i386 and x86_64

2007-08-06 Thread Cédric Augonnet
Hi all,

For quite a while now, there as been numerous attempt to introduce support for
Page Attribute Table (PAT) in the Linux kernel, whereas most other OS already
have some support for this feature. Such a proposition popping up periodically,
perhaps it would be an opportunity to fix that lack for once.

  PAT actually makes life much easier for people needing to use write-combining
(WC) attribute. Indeed the only solution yet available is MTRRs, which are not
as flexible as PAT is.
  Indeed, even if they both allow to set memory types on memory regions, MTRRs
are intended to be used on a physical memory range, while PAT makes possible
to set such memory type based on a linear address mapping. The actual problem
with MTRR is that the BIOS usually covers physical RAM with write-back (WB),
and that MTRRs should not be overlapped.
  Using write-combining feature is sometimes crucial for performance of driver
for which writes operation on the bus are critical. High-speed networking
drivers such as myri10ge would take benefit of using PAT instead of MTRRs.
Video drivers using frame-buffers could also avoid using MTRRs that way.

  PAT6 can be a candidate for that purpose. Not only for backward compatibility
reasons, but also as various errata concerning PAT would end up by having the
PAT bit ignored, therefore if we use PAT6, when such an error occurs we would
have an uncached area, which means it would at worst not be effective, thus
resolving the issues raised in http://lkml.org/lkml/2004/4/13/102 .

 ***

 Back to 2.4.20, Terence Ripperda already submitted such a proposal for PAT
 support in
   - http://lkml.org/lkml/2003/5/20/131
 himself refering to
   - http://www.ussg.iu.edu/hypermail/linux/kernel/0303.1/0606.html

 In 2005, Eric W. Biederman submitted another similar patch :
   - http://lkml.org/lkml/2005/8/29/226

 More recently, there have been numerous threads dealing with MTRR issues,
 some of them suggesting that there should actually be some support for PAT.
 See for instance :
   - http://lkml.org/lkml/2007/6/6/333

 ***

 I therefore propose here a set of 2 patches to add some support for
 write-combining using PAT :

 [PATCH 1/2] - [PAT] Setting write combining on PAT6 at boot time
 [PATCH 2/2] - [PAT] Support for write combining in pci_mmap_page_range

Kind regards,
Cedric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] PAT setting write combining on PAT6 at boot time

2007-08-06 Thread Cédric Augonnet

This sets PAT6 to write combining during boot on i386 and x86_64

Signed-off-by: Cedric Augonnet [EMAIL PROTECTED]
CC: Loic Prylli [EMAIL PROTECTED]
CC: Brice Goglin [EMAIL PROTECTED]
---
diff -urN 2.6.23-rc2/arch/i386/kernel/cpu/Makefile 2.6.23-rc2-pat/arch/i386/kernel/cpu/Makefile
--- 2.6.23-rc2/arch/i386/kernel/cpu/Makefile	2007-08-06 15:16:16.0 -0400
+++ 2.6.23-rc2-pat/arch/i386/kernel/cpu/Makefile	2007-08-06 17:44:36.0 -0400
@@ -11,6 +11,7 @@
 obj-y	+=	intel.o intel_cacheinfo.o addon_cpuid_features.o
 obj-y	+=	nexgen.o
 obj-y	+=	umc.o
+obj-y	+=	pat.o
 
 obj-$(CONFIG_X86_MCE)	+=	mcheck/
 
diff -urN 2.6.23-rc2/arch/i386/kernel/cpu/pat.c 2.6.23-rc2-pat/arch/i386/kernel/cpu/pat.c
--- 2.6.23-rc2/arch/i386/kernel/cpu/pat.c	1969-12-31 19:00:00.0 -0500
+++ 2.6.23-rc2-pat/arch/i386/kernel/cpu/pat.c	2007-08-06 18:16:37.0 -0400
@@ -0,0 +1,36 @@
+/*
+ * x86 Page Attribute Table (PAT) support.
+ * Copyright (C) 2007 Cedric Augonnet.
+ *   (C) 2007 Myricom, Inc.
+ *   (C) 2003 Jeff Hartmann
+ *   (C) 2001 Jeff Hartmann
+ */
+
+#include linux/init.h
+#include linux/kernel.h
+#include linux/smp.h
+#include linux/io.h
+#include asm/pat.h
+#include asm/cpufeature.h
+
+static inline void
+enable_pat_on_cpu(void *info)
+{
+	uint32_t low;
+	uint32_t high;
+
+	/* use PAT 6 for write combining */
+	rdmsr(MSR_CR_PAT, low, high);
+	if ((high  0xff) != 0x01) {
+		high = (high  0xff00) | 0x0001;
+		wrmsr(MSR_CR_PAT, low, high);
+	}
+}
+
+void __init
+pat_init(void)
+{
+	/* enable PAT is CPU supports it */
+	if (cpu_has_pat)
+		on_each_cpu(enable_pat_on_cpu, 0, 0, 1);
+}
diff -urN 2.6.23-rc2/arch/i386/kernel/setup.c 2.6.23-rc2-pat/arch/i386/kernel/setup.c
--- 2.6.23-rc2/arch/i386/kernel/setup.c	2007-08-06 15:16:17.0 -0400
+++ 2.6.23-rc2-pat/arch/i386/kernel/setup.c	2007-08-06 17:44:36.0 -0400
@@ -62,6 +62,7 @@
 #include asm/vmi.h
 #include setup_arch.h
 #include bios_ebda.h
+#include asm/pat.h
 
 /* This value is set up by the early boot code to point to the value
immediately after the boot time page tables.  It contains a *physical*
@@ -650,4 +651,5 @@
 	conswitchp = dummy_con;
 #endif
 #endif
+	pat_init();
 }
diff -urN 2.6.23-rc2/arch/x86_64/kernel/Makefile 2.6.23-rc2-pat/arch/x86_64/kernel/Makefile
--- 2.6.23-rc2/arch/x86_64/kernel/Makefile	2007-08-06 15:16:27.0 -0400
+++ 2.6.23-rc2-pat/arch/x86_64/kernel/Makefile	2007-08-06 17:44:36.0 -0400
@@ -16,6 +16,7 @@
 obj-$(CONFIG_X86_MCE_INTEL)	+= mce_intel.o
 obj-$(CONFIG_X86_MCE_AMD)	+= mce_amd.o
 obj-$(CONFIG_MTRR)		+= ../../i386/kernel/cpu/mtrr/
+obj-y+= pat.o
 obj-$(CONFIG_ACPI)		+= acpi/
 obj-$(CONFIG_X86_MSR)		+= msr.o
 obj-$(CONFIG_MICROCODE)		+= microcode.o
@@ -61,3 +62,4 @@
 alternative-y			+= ../../i386/kernel/alternative.o
 pcspeaker-y			+= ../../i386/kernel/pcspeaker.o
 perfctr-watchdog-y		+= ../../i386/kernel/cpu/perfctr-watchdog.o
+pat-y+= ../../i386/kernel/cpu/pat.o
diff -urN 2.6.23-rc2/arch/x86_64/kernel/setup.c 2.6.23-rc2-pat/arch/x86_64/kernel/setup.c
--- 2.6.23-rc2/arch/x86_64/kernel/setup.c	2007-08-06 15:16:27.0 -0400
+++ 2.6.23-rc2-pat/arch/x86_64/kernel/setup.c	2007-08-06 17:44:36.0 -0400
@@ -64,6 +64,7 @@
 #include asm/numa.h
 #include asm/sections.h
 #include asm/dmi.h
+#include asm/pat.h
 
 /*
  * Machine setup..
@@ -416,6 +417,7 @@
 	conswitchp = dummy_con;
 #endif
 #endif
+	pat_init();
 }
 
 static int __cpuinit get_model_name(struct cpuinfo_x86 *c)
diff -urN 2.6.23-rc2/include/asm-i386/cpufeature.h 2.6.23-rc2-pat/include/asm-i386/cpufeature.h
--- 2.6.23-rc2/include/asm-i386/cpufeature.h	2007-08-06 15:17:12.0 -0400
+++ 2.6.23-rc2-pat/include/asm-i386/cpufeature.h	2007-08-06 17:44:36.0 -0400
@@ -164,6 +164,7 @@
 #define cpu_has_pebs 		boot_cpu_has(X86_FEATURE_PEBS)
 #define cpu_has_clflush		boot_cpu_has(X86_FEATURE_CLFLSH)
 #define cpu_has_bts 		boot_cpu_has(X86_FEATURE_BTS)
+#define cpu_has_pat		boot_cpu_has(X86_FEATURE_PAT)
 
 #endif /* __ASM_I386_CPUFEATURE_H */
 
diff -urN 2.6.23-rc2/include/asm-i386/pat.h 2.6.23-rc2-pat/include/asm-i386/pat.h
--- 2.6.23-rc2/include/asm-i386/pat.h	1969-12-31 19:00:00.0 -0500
+++ 2.6.23-rc2-pat/include/asm-i386/pat.h	2007-08-06 18:26:11.0 -0400
@@ -0,0 +1,32 @@
+/*
+ * x86 Page Attribute Table (PAT) support.
+ * Copyright (C) 2007 Cedric Augonnet.
+ *   (C) 2007 Myricom, Inc.
+ *   (C) 2003 Jeff Hartmann
+ *   (C) 2001 Jeff Hartmann
+ */
+
+#ifndef _LINUX_PAT_H
+#define _LINUX_PAT_H
+
+#define	MSR_CR_PAT	0x277
+
+/*
+ *   Once PAT is enabled using pat_init on each CPU,we should have the
+ * following layout : the first four are not changed for backward
+ * compatibility with processors not implementing PAT . PAT 6 is changed
+ * to write combining mode.
+ *
+ *   	PAT 0 : WB		 PAT 4 : WB
+ *   	PAT 1 : WT		 PAT 5 : WT
+ *   	PAT 2 : UC-		 PAT 6 : WC
+ *   	PAT 3 : UC		 PAT 7 : UC-
+ *
+ */
+
+/*
+ * This function enables write 

[PATCH 2/2] PAT support for write combining in pci_mmap_page_range

2007-08-06 Thread Cédric Augonnet

Adds support for write-combining in pci_mmap_page_range using PAT6.

Some distinction has to be made between huge pages and normal pages since
the position of the bit encoding PAT depends on that.

Signed-off-by: Cedric Augonnet [EMAIL PROTECTED]
CC: Loic Prylli [EMAIL PROTECTED]
CC: Brice Goglin [EMAIL PROTECTED]	
---
diff -urN 2.6.23-rc2/arch/i386/pci/i386.c 2.6.23-rc2-pat/arch/i386/pci/i386.c
--- 2.6.23-rc2/arch/i386/pci/i386.c	2007-07-31 10:48:58.0 -0400
+++ 2.6.23-rc2-pat/arch/i386/pci/i386.c	2007-08-06 18:19:18.0 -0400
@@ -299,13 +299,20 @@
 	 * address on this platform.
 	 */
 	prot = pgprot_val(vma-vm_page_prot);
-	if (boot_cpu_data.x86  3)
-		prot |= _PAGE_PCD | _PAGE_PWT;
-	vma-vm_page_prot = __pgprot(prot);
 
-	/* Write-combine setting is ignored, it is changed via the mtrr
-	 * interfaces on this platform.
+	/* Unless PAT is enabling write combining, it is here ignored and
+	 * changed via the mtrr interfaces on this platform.
 	 */
+	if (cpu_has_pat  write_combine) {
+		/* PAT bit is not the same if we have huge pages */
+		vma-vm_page_prot =
+			__pgprot_wc(prot, (vma-vm_flags  VM_HUGETLB));
+	} else {
+		if (boot_cpu_data.x86  3)
+			prot |= _PAGE_PCD | _PAGE_PWT;
+		vma-vm_page_prot = __pgprot(prot);
+	}
+
 	if (io_remap_pfn_range(vma, vma-vm_start, vma-vm_pgoff,
 			   vma-vm_end - vma-vm_start,
 			   vma-vm_page_prot))
diff -urN 2.6.23-rc2/include/asm-i386/pgtable.h 2.6.23-rc2-pat/include/asm-i386/pgtable.h
--- 2.6.23-rc2/include/asm-i386/pgtable.h	2007-08-06 15:17:12.0 -0400
+++ 2.6.23-rc2-pat/include/asm-i386/pgtable.h	2007-08-06 18:33:48.0 -0400
@@ -119,6 +119,12 @@
 #define _PAGE_UNUSED2	0x400
 #define _PAGE_UNUSED3	0x800
 
+/* defined for write combining support with PAT enabled */
+#define _PAGE_PAT6		0x088		/* PAT6 for 4K pages */
+#define _PAGE_PAT6_HUGE		0x1008		/* PAT6 for huge pages */
+#define _PAGE_WR_CMB		_PAGE_PAT6
+#define _PAGE_WR_CMB_HUGE	_PAGE_PAT6_HUGE
+
 /* If _PAGE_PRESENT is clear, we use these: */
 #define _PAGE_FILE	0x040	/* nonlinear file mapping, saved PTE; unset:swap */
 #define _PAGE_PROTNONE	0x080	/* if the user mapped it with PROT_NONE;
@@ -171,6 +177,11 @@
 #define PAGE_KERNEL_LARGE	__pgprot(__PAGE_KERNEL_LARGE)
 #define PAGE_KERNEL_LARGE_EXEC	__pgprot(__PAGE_KERNEL_LARGE_EXEC)
 
+#define __pgprot_wc(flag, is_huge_page)\
+	(__pgprot((is_huge_page) ?\
+		((flag)  (~_PAGE_PWT)) | _PAGE_WR_CMB_HUGE :	\
+		((flag)  (~_PAGE_PWT)) | _PAGE_WR_CMB) )
+
 /*
  * The i386 can't do page protection for execute, and considers that
  * the same are read. Also, write permissions imply read permissions.
diff -urN 2.6.23-rc2/include/asm-x86_64/pgtable.h 2.6.23-rc2-pat/include/asm-x86_64/pgtable.h
--- 2.6.23-rc2/include/asm-x86_64/pgtable.h	2007-08-06 15:17:15.0 -0400
+++ 2.6.23-rc2-pat/include/asm-x86_64/pgtable.h	2007-08-06 18:29:28.0 -0400
@@ -164,6 +164,13 @@
 #define _PAGE_GLOBAL	0x100	/* Global TLB entry */
 
 #define _PAGE_PROTNONE	0x080	/* If not present */
+
+/* defined for write combining support with PAT enabled */
+#define _PAGE_PAT6		0x088		/* PAT6 for 4K pages */
+#define _PAGE_PAT6_HUGE		0x1008		/* PAT6 for huge pages */
+#define _PAGE_WR_CMB		_PAGE_PAT6
+#define _PAGE_WR_CMB_HUGE	_PAGE_PAT6_HUGE
+
 #define _PAGE_NX(_AC(1,UL)_PAGE_BIT_NX)
 
 #define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_RW | _PAGE_USER | _PAGE_ACCESSED | _PAGE_DIRTY)
@@ -196,6 +203,11 @@
 #define __PAGE_KERNEL_LARGE_EXEC \
 	(__PAGE_KERNEL_EXEC | _PAGE_PSE)
 
+#define __pgprot_wc(flag, is_huge_page)\
+	(__pgprot((is_huge_page) ?\
+		((flag)  (~_PAGE_PWT)) | _PAGE_WR_CMB_HUGE :	\
+		((flag)  (~_PAGE_PWT)) | _PAGE_WR_CMB))
+
 #define MAKE_GLOBAL(x) __pgprot((x) | _PAGE_GLOBAL)
 
 #define PAGE_KERNEL MAKE_GLOBAL(__PAGE_KERNEL)


Re: voyager_{thread,cat}.c compile warnings

2007-07-22 Thread Cédric Augonnet

2007/7/22, James Bottomley <[EMAIL PROTECTED]>:

On Sun, 2007-07-22 at 18:49 -0400, Cédric Augonnet wrote:
> iff -urN a/arch/i386/mach-voyager/voyager_cat.c
> b/arch/i386/mach-voyager/voyager_cat.c
> --- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c  
2007-07-20 11:50:17.0 -0400
> +++ linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c   2007-07-22
> 11:24:34.0 -0400
> @@ -682,7 +682,7 @@
> outb(VOYAGER_CAT_END, CAT_CMD);
> continue;
> }
> -   if(eprom_size > sizeof(eprom_buf)) {
> +   if((unsigned)eprom_size > sizeof(eprom_buf)) {

Actually, no.  If gcc can deduce that the comparison is always false
then I want it not to build the body of the if.  The only thing I don't
know how to do is to shut up the warning in this case.  What you've done
is make gcc pretend it doesn't know the if is always false.

> printk("**WARNING**: Voyager insufficient size
> to read EPROM data, module 0x%x.  Need %d\n", i, eprom_size);
> outb(VOYAGER_CAT_END, CAT_CMD);
> continue;
> @@ -752,7 +752,7 @@
> outb(VOYAGER_CAT_END, CAT_CMD);
> continue;
> }
> -   if(eprom_size > sizeof(eprom_buf)) {
> +   if((unsigned)eprom_size > sizeof(eprom_buf)) {
> printk("**WARNING**: Voyager insufficient size
> to read EPROM data, module 0x%x.  Need %d\n", i, eprom_size);
> outb(VOYAGER_CAT_END, CAT_CMD);
> continue;
> diff -urN a/arch/i386/mach-voyager/voyager_thread.c
> b/arch/i386/mach-voyager/voyager_thread.c
> --- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c 
  2007-07-20 11:50:17.0 -0400
> +++
> linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c2007-07-22
> 11:27:13.0 -0400
> @@ -92,7 +92,7 @@
> }
>  }
>
> -static int
> +static void
>  thread(void *unused)
>  {
> printk(KERN_NOTICE "Voyager starting monitor thread\n");

You didn't actually compile this, did you?  Apparently the signature of
the kthread_run function changed from returning void to returning int.
Unfortunately the person who fixed this up forgot to add a return 0 at
the end of the voyager thread() function .. which is the correct fix.


Arg i was caught by that one.


James



Ouch indeed this quick'n'dirty patch was, let's call it a full mistake
:) sorry for that, it could indeed not be tested as i don't have the
hardware.

Still, is it safe to compare two variable with different types anyway ?

In http://lists.infradead.org/pipermail/linux-pcmcia/2004-March/000586.html
they also have the same issue, they just do
s/ foo > 0x / foo & ~0x /
should not it solve the problem as well ?

Sorry again for the first patch, next time i'll just shut up.

Regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: voyager_{thread,cat}.c compile warnings

2007-07-22 Thread Cédric Augonnet

2007/7/22, Cédric Augonnet <[EMAIL PROTECTED]>:

Hi,

2007/7/21, Gabriel C <[EMAIL PROTECTED]>:
> Hi,
>
> I noticed this warnings on current git:
>
>
> ...
>
> arch/i386/mach-voyager/voyager_thread.c: In function 'thread':
> arch/i386/mach-voyager/voyager_thread.c:113: warning: no return statement in 
function returning non-void
>
> ...
>
> I think return 0; is missing on line 112 here.
>

I guess there is no use in always returning 0, simply not returning anything and
putting a void type might be cleaner, especially as this function
seems to never
return.

>
> arch/i386/mach-voyager/voyager_cat.c: In function 'voyager_cat_init':
> arch/i386/mach-voyager/voyager_cat.c:685: warning: comparison is always false 
due to limited range of data type
> arch/i386/mach-voyager/voyager_cat.c:755: warning: comparison is always false 
due to limited range of data type
>

Hum perhaps it does not like the comparision of a __u16 and an unsigned int ?

I enclose some dummy patch that should perhaps solve these issues,
could not test it as i don't have the hardware... just my 2 cents.
Don't know if casting is an
acceptable workaround though.

>
>
> Regards,
>
> Gabriel C
>

Cheers,
Cédric




(forgot to cc the maintainer, sorry ...)
Submitted by: Cedric Augonnet <[EMAIL PROTECTED]>
---
diff -urN a/arch/i386/mach-voyager/voyager_cat.c b/arch/i386/mach-voyager/voyager_cat.c
--- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c	2007-07-20 11:50:17.0 -0400
+++ linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c	2007-07-22 11:24:34.0 -0400
@@ -682,7 +682,7 @@
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
 		}
-		if(eprom_size > sizeof(eprom_buf)) {
+		if((unsigned)eprom_size > sizeof(eprom_buf)) {
 			printk("**WARNING**: Voyager insufficient size to read EPROM data, module 0x%x.  Need %d\n", i, eprom_size);
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
@@ -752,7 +752,7 @@
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
 		}
-		if(eprom_size > sizeof(eprom_buf)) {
+		if((unsigned)eprom_size > sizeof(eprom_buf)) {
 			printk("**WARNING**: Voyager insufficient size to read EPROM data, module 0x%x.  Need %d\n", i, eprom_size);
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
diff -urN a/arch/i386/mach-voyager/voyager_thread.c b/arch/i386/mach-voyager/voyager_thread.c
--- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c	2007-07-20 11:50:17.0 -0400
+++ linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c	2007-07-22 11:27:13.0 -0400
@@ -92,7 +92,7 @@
 	}
 }
 
-static int
+static void
 thread(void *unused)
 {
 	printk(KERN_NOTICE "Voyager starting monitor thread\n");


Re: voyager_{thread,cat}.c compile warnings

2007-07-22 Thread Cédric Augonnet

Hi,

2007/7/21, Gabriel C <[EMAIL PROTECTED]>:

Hi,

I noticed this warnings on current git:


...

arch/i386/mach-voyager/voyager_thread.c: In function 'thread':
arch/i386/mach-voyager/voyager_thread.c:113: warning: no return statement in 
function returning non-void

...

I think return 0; is missing on line 112 here.



I guess there is no use in always returning 0, simply not returning anything and
putting a void type might be cleaner, especially as this function
seems to never
return.



arch/i386/mach-voyager/voyager_cat.c: In function 'voyager_cat_init':
arch/i386/mach-voyager/voyager_cat.c:685: warning: comparison is always false 
due to limited range of data type
arch/i386/mach-voyager/voyager_cat.c:755: warning: comparison is always false 
due to limited range of data type



Hum perhaps it does not like the comparision of a __u16 and an unsigned int ?

I enclose some dummy patch that should perhaps solve these issues,
could not test it as i don't have the hardware... just my 2 cents.
Don't know if casting is an
acceptable workaround though.




Regards,

Gabriel C



Cheers,
Cédric
Submitted by: Cedric Augonnet <[EMAIL PROTECTED]>
---
diff -urN a/arch/i386/mach-voyager/voyager_cat.c b/arch/i386/mach-voyager/voyager_cat.c
--- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c	2007-07-20 11:50:17.0 -0400
+++ linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c	2007-07-22 11:24:34.0 -0400
@@ -682,7 +682,7 @@
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
 		}
-		if(eprom_size > sizeof(eprom_buf)) {
+		if((unsigned)eprom_size > sizeof(eprom_buf)) {
 			printk("**WARNING**: Voyager insufficient size to read EPROM data, module 0x%x.  Need %d\n", i, eprom_size);
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
@@ -752,7 +752,7 @@
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
 		}
-		if(eprom_size > sizeof(eprom_buf)) {
+		if((unsigned)eprom_size > sizeof(eprom_buf)) {
 			printk("**WARNING**: Voyager insufficient size to read EPROM data, module 0x%x.  Need %d\n", i, eprom_size);
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
diff -urN a/arch/i386/mach-voyager/voyager_thread.c b/arch/i386/mach-voyager/voyager_thread.c
--- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c	2007-07-20 11:50:17.0 -0400
+++ linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c	2007-07-22 11:27:13.0 -0400
@@ -92,7 +92,7 @@
 	}
 }
 
-static int
+static void
 thread(void *unused)
 {
 	printk(KERN_NOTICE "Voyager starting monitor thread\n");


Re: voyager_{thread,cat}.c compile warnings

2007-07-22 Thread Cédric Augonnet

Hi,

2007/7/21, Gabriel C [EMAIL PROTECTED]:

Hi,

I noticed this warnings on current git:


...

arch/i386/mach-voyager/voyager_thread.c: In function 'thread':
arch/i386/mach-voyager/voyager_thread.c:113: warning: no return statement in 
function returning non-void

...

I think return 0; is missing on line 112 here.



I guess there is no use in always returning 0, simply not returning anything and
putting a void type might be cleaner, especially as this function
seems to never
return.



arch/i386/mach-voyager/voyager_cat.c: In function 'voyager_cat_init':
arch/i386/mach-voyager/voyager_cat.c:685: warning: comparison is always false 
due to limited range of data type
arch/i386/mach-voyager/voyager_cat.c:755: warning: comparison is always false 
due to limited range of data type



Hum perhaps it does not like the comparision of a __u16 and an unsigned int ?

I enclose some dummy patch that should perhaps solve these issues,
could not test it as i don't have the hardware... just my 2 cents.
Don't know if casting is an
acceptable workaround though.




Regards,

Gabriel C



Cheers,
Cédric
Submitted by: Cedric Augonnet [EMAIL PROTECTED]
---
diff -urN a/arch/i386/mach-voyager/voyager_cat.c b/arch/i386/mach-voyager/voyager_cat.c
--- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c	2007-07-20 11:50:17.0 -0400
+++ linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c	2007-07-22 11:24:34.0 -0400
@@ -682,7 +682,7 @@
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
 		}
-		if(eprom_size  sizeof(eprom_buf)) {
+		if((unsigned)eprom_size  sizeof(eprom_buf)) {
 			printk(**WARNING**: Voyager insufficient size to read EPROM data, module 0x%x.  Need %d\n, i, eprom_size);
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
@@ -752,7 +752,7 @@
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
 		}
-		if(eprom_size  sizeof(eprom_buf)) {
+		if((unsigned)eprom_size  sizeof(eprom_buf)) {
 			printk(**WARNING**: Voyager insufficient size to read EPROM data, module 0x%x.  Need %d\n, i, eprom_size);
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
diff -urN a/arch/i386/mach-voyager/voyager_thread.c b/arch/i386/mach-voyager/voyager_thread.c
--- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c	2007-07-20 11:50:17.0 -0400
+++ linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c	2007-07-22 11:27:13.0 -0400
@@ -92,7 +92,7 @@
 	}
 }
 
-static int
+static void
 thread(void *unused)
 {
 	printk(KERN_NOTICE Voyager starting monitor thread\n);


Re: voyager_{thread,cat}.c compile warnings

2007-07-22 Thread Cédric Augonnet

2007/7/22, Cédric Augonnet [EMAIL PROTECTED]:

Hi,

2007/7/21, Gabriel C [EMAIL PROTECTED]:
 Hi,

 I noticed this warnings on current git:


 ...

 arch/i386/mach-voyager/voyager_thread.c: In function 'thread':
 arch/i386/mach-voyager/voyager_thread.c:113: warning: no return statement in 
function returning non-void

 ...

 I think return 0; is missing on line 112 here.


I guess there is no use in always returning 0, simply not returning anything and
putting a void type might be cleaner, especially as this function
seems to never
return.


 arch/i386/mach-voyager/voyager_cat.c: In function 'voyager_cat_init':
 arch/i386/mach-voyager/voyager_cat.c:685: warning: comparison is always false 
due to limited range of data type
 arch/i386/mach-voyager/voyager_cat.c:755: warning: comparison is always false 
due to limited range of data type


Hum perhaps it does not like the comparision of a __u16 and an unsigned int ?

I enclose some dummy patch that should perhaps solve these issues,
could not test it as i don't have the hardware... just my 2 cents.
Don't know if casting is an
acceptable workaround though.



 Regards,

 Gabriel C


Cheers,
Cédric




(forgot to cc the maintainer, sorry ...)
Submitted by: Cedric Augonnet [EMAIL PROTECTED]
---
diff -urN a/arch/i386/mach-voyager/voyager_cat.c b/arch/i386/mach-voyager/voyager_cat.c
--- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c	2007-07-20 11:50:17.0 -0400
+++ linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c	2007-07-22 11:24:34.0 -0400
@@ -682,7 +682,7 @@
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
 		}
-		if(eprom_size  sizeof(eprom_buf)) {
+		if((unsigned)eprom_size  sizeof(eprom_buf)) {
 			printk(**WARNING**: Voyager insufficient size to read EPROM data, module 0x%x.  Need %d\n, i, eprom_size);
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
@@ -752,7 +752,7 @@
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
 		}
-		if(eprom_size  sizeof(eprom_buf)) {
+		if((unsigned)eprom_size  sizeof(eprom_buf)) {
 			printk(**WARNING**: Voyager insufficient size to read EPROM data, module 0x%x.  Need %d\n, i, eprom_size);
 			outb(VOYAGER_CAT_END, CAT_CMD);
 			continue;
diff -urN a/arch/i386/mach-voyager/voyager_thread.c b/arch/i386/mach-voyager/voyager_thread.c
--- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c	2007-07-20 11:50:17.0 -0400
+++ linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c	2007-07-22 11:27:13.0 -0400
@@ -92,7 +92,7 @@
 	}
 }
 
-static int
+static void
 thread(void *unused)
 {
 	printk(KERN_NOTICE Voyager starting monitor thread\n);


Re: voyager_{thread,cat}.c compile warnings

2007-07-22 Thread Cédric Augonnet

2007/7/22, James Bottomley [EMAIL PROTECTED]:

On Sun, 2007-07-22 at 18:49 -0400, Cédric Augonnet wrote:
 iff -urN a/arch/i386/mach-voyager/voyager_cat.c
 b/arch/i386/mach-voyager/voyager_cat.c
 --- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c  
2007-07-20 11:50:17.0 -0400
 +++ linux-2.6.22/arch/i386/mach-voyager/voyager_cat.c   2007-07-22
 11:24:34.0 -0400
 @@ -682,7 +682,7 @@
 outb(VOYAGER_CAT_END, CAT_CMD);
 continue;
 }
 -   if(eprom_size  sizeof(eprom_buf)) {
 +   if((unsigned)eprom_size  sizeof(eprom_buf)) {

Actually, no.  If gcc can deduce that the comparison is always false
then I want it not to build the body of the if.  The only thing I don't
know how to do is to shut up the warning in this case.  What you've done
is make gcc pretend it doesn't know the if is always false.

 printk(**WARNING**: Voyager insufficient size
 to read EPROM data, module 0x%x.  Need %d\n, i, eprom_size);
 outb(VOYAGER_CAT_END, CAT_CMD);
 continue;
 @@ -752,7 +752,7 @@
 outb(VOYAGER_CAT_END, CAT_CMD);
 continue;
 }
 -   if(eprom_size  sizeof(eprom_buf)) {
 +   if((unsigned)eprom_size  sizeof(eprom_buf)) {
 printk(**WARNING**: Voyager insufficient size
 to read EPROM data, module 0x%x.  Need %d\n, i, eprom_size);
 outb(VOYAGER_CAT_END, CAT_CMD);
 continue;
 diff -urN a/arch/i386/mach-voyager/voyager_thread.c
 b/arch/i386/mach-voyager/voyager_thread.c
 --- /home/gonnet/tmp/linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c 
  2007-07-20 11:50:17.0 -0400
 +++
 linux-2.6.22/arch/i386/mach-voyager/voyager_thread.c2007-07-22
 11:27:13.0 -0400
 @@ -92,7 +92,7 @@
 }
  }

 -static int
 +static void
  thread(void *unused)
  {
 printk(KERN_NOTICE Voyager starting monitor thread\n);

You didn't actually compile this, did you?  Apparently the signature of
the kthread_run function changed from returning void to returning int.
Unfortunately the person who fixed this up forgot to add a return 0 at
the end of the voyager thread() function .. which is the correct fix.


Arg i was caught by that one.


James



Ouch indeed this quick'n'dirty patch was, let's call it a full mistake
:) sorry for that, it could indeed not be tested as i don't have the
hardware.

Still, is it safe to compare two variable with different types anyway ?

In http://lists.infradead.org/pipermail/linux-pcmcia/2004-March/000586.html
they also have the same issue, they just do
s/ foo  0x / foo  ~0x /
should not it solve the problem as well ?

Sorry again for the first patch, next time i'll just shut up.

Regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel Story

2007-07-12 Thread Cédric Augonnet

2007/7/12, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:

On Thu, Jul 12, 2007 at 11:08:52AM -0300, Renato S. Yamane wrote:
> Vijayakumar Subburaj escreveu:
> >My first mail to lkml.
>
> Welcome :-)
>
> >I would like to know what happened to linux kernel from its 1.0.
>
> From 1.0 to 2.6.22.1?
> Wowww
>
> I think that you can read all changelogs available in www.kernel.org. In
> 2 years you finish read 2.4 and 2.6 series :-)
>
> Are you try ?
>
  I think this is a good idea to know all the changes!

> Regards,
> Renato
> -


If you have to study some given sub-system, why not, you might have to
look at how this evolved from years to years, that must already be a
huge work from 2.0 to now for any single part of the kernel.
But seriously, if it's already virtually impossible to get a real
picture of the overall kernels, trying to figure out all changes from
1.0 to now, for all subsystems sounds a little illusory. Moreover they
might have just been all rewritten several times since.

But perhaps what Vijayakumar was looking for is more the different
feature that were added within the curse of years (eg : SMP) ?

Regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel Story

2007-07-12 Thread Cédric Augonnet

2007/7/12, [EMAIL PROTECTED] [EMAIL PROTECTED]:

On Thu, Jul 12, 2007 at 11:08:52AM -0300, Renato S. Yamane wrote:
 Vijayakumar Subburaj escreveu:
 My first mail to lkml.

 Welcome :-)

 I would like to know what happened to linux kernel from its 1.0.

 From 1.0 to 2.6.22.1?
 Wowww

 I think that you can read all changelogs available in www.kernel.org. In
 2 years you finish read 2.4 and 2.6 series :-)

 Are you try http://kernelnewbies.org/LinuxChanges?

  I think this is a good idea to know all the changes!

 Regards,
 Renato
 -


If you have to study some given sub-system, why not, you might have to
look at how this evolved from years to years, that must already be a
huge work from 2.0 to now for any single part of the kernel.
But seriously, if it's already virtually impossible to get a real
picture of the overall kernels, trying to figure out all changes from
1.0 to now, for all subsystems sounds a little illusory. Moreover they
might have just been all rewritten several times since.

But perhaps what Vijayakumar was looking for is more the different
feature that were added within the curse of years (eg : SMP) ?

Regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [EXT4 set 6][PATCH 1/1]Export jbd stats through procfs

2007-07-10 Thread Cédric Augonnet

2007/7/10, Andrew Morton <[EMAIL PROTECTED]>:

Hi all,


> + size = sizeof(struct transaction_stats_s);
> + s->stats = kmalloc(size, GFP_KERNEL);
> + if (s == NULL) {
> + kfree(s);
> + return -EIO;

ENOMEM


I'm sorry if i missed some point, but i just don't see the use of such
a kfree here, not sure Andrew meant you should only return ENOMEM
instead, but why issuing those kfree(NULL), instead of just a if (!s)
return ENOMEM ?

Regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [EXT4 set 6][PATCH 1/1]Export jbd stats through procfs

2007-07-10 Thread Cédric Augonnet

2007/7/10, Andrew Morton [EMAIL PROTECTED]:

Hi all,


 + size = sizeof(struct transaction_stats_s);
 + s-stats = kmalloc(size, GFP_KERNEL);
 + if (s == NULL) {
 + kfree(s);
 + return -EIO;

ENOMEM


I'm sorry if i missed some point, but i just don't see the use of such
a kfree here, not sure Andrew meant you should only return ENOMEM
instead, but why issuing those kfree(NULL), instead of just a if (!s)
return ENOMEM ?

Regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.21-mm2: HDAPS? BUG: at kernel/mutex.c:311

2007-05-26 Thread Cédric Augonnet

2007/5/14, Satyam Sharma <[EMAIL PROTECTED]>:

On 5/14/07, Dmitry Torokhov <[EMAIL PROTECTED]> wrote:
> Hi Satyam,
>
> On Saturday 12 May 2007 01:45, Satyam Sharma wrote:
> > Seems to be good-looking code!
>
> Thanks. Do you have the hardware? Were you able to test the patch?

Oh, sorry, no. I was actually referring to the input-polldev code. Let's
hope Matt can test and ack this.

Thanks,
Satyam
-


Hi all,

Since i still had that BUG on my T43 running a 2.6.22-rc2-mm1 kernel,
I tried that patch. It sounds like this has solved the bug since I
can't find any trace of that issue anymore once the patch is applied.

Thanks for the patch,
Regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.21-mm2: HDAPS? BUG: at kernel/mutex.c:311

2007-05-26 Thread Cédric Augonnet

2007/5/14, Satyam Sharma [EMAIL PROTECTED]:

On 5/14/07, Dmitry Torokhov [EMAIL PROTECTED] wrote:
 Hi Satyam,

 On Saturday 12 May 2007 01:45, Satyam Sharma wrote:
  Seems to be good-looking code!

 Thanks. Do you have the hardware? Were you able to test the patch?

Oh, sorry, no. I was actually referring to the input-polldev code. Let's
hope Matt can test and ack this.

Thanks,
Satyam
-


Hi all,

Since i still had that BUG on my T43 running a 2.6.22-rc2-mm1 kernel,
I tried that patch. It sounds like this has solved the bug since I
can't find any trace of that issue anymore once the patch is applied.

Thanks for the patch,
Regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + timer_stats-slimmed-down-using-statistics-infrastucture.patch added to -mm tree

2007-05-13 Thread Cédric Augonnet

2007/5/13, Arjan van de Ven <[EMAIL PROTECTED]>:

On Fri, 2007-04-20 at 17:18 -0700, [EMAIL PROTECTED] wrote:
> The patch titled
>  timer_stats slimmed down: using statistics infrastucture
> has been added to the -mm tree.  Its filename is
>  timer_stats-slimmed-down-using-statistics-infrastucture.patch
>
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>
> See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
> out what to do about this
>
> --
> Subject: timer_stats slimmed down: using statistics infrastucture
> From: Martin Peschke <[EMAIL PROTECTED]>
>
> This patch implements timer_stats based on the statistics infrastructure.
>
> Here is some sample output, which even somewhat
> resembles /proc/timer_stats.
> (>80 char lines because of symbol names, sorry for line breaks).
>
> It reads:
>
>
> with label being:
>   ()
>
> [EMAIL PROTECTED] timer_stats]# cat data

this patch changes the userspace API though, and breaks PowerTOP :(

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Moreover, the documentation still refers to /proc/timer_stats which do
not appear in my 2.6.21-mm2 kernel.

Regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + timer_stats-slimmed-down-using-statistics-infrastucture.patch added to -mm tree

2007-05-13 Thread Cédric Augonnet

2007/5/13, Arjan van de Ven [EMAIL PROTECTED]:

On Fri, 2007-04-20 at 17:18 -0700, [EMAIL PROTECTED] wrote:
 The patch titled
  timer_stats slimmed down: using statistics infrastucture
 has been added to the -mm tree.  Its filename is
  timer_stats-slimmed-down-using-statistics-infrastucture.patch

 *** Remember to use Documentation/SubmitChecklist when testing your code ***

 See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
 out what to do about this

 --
 Subject: timer_stats slimmed down: using statistics infrastucture
 From: Martin Peschke [EMAIL PROTECTED]

 This patch implements timer_stats based on the statistics infrastructure.

 Here is some sample output, which even somewhat
 resembles /proc/timer_stats.
 (80 char lines because of symbol names, sorry for line breaks).

 It reads:
 statistic name basket hits label

 with label being:
 pid task name start_fn(stop_fn)

 [EMAIL PROTECTED] timer_stats]# cat data

this patch changes the userspace API though, and breaks PowerTOP :(

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Moreover, the documentation still refers to /proc/timer_stats which do
not appear in my 2.6.21-mm2 kernel.

Regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20-mm1 - Oops using Minix 3 file system

2007-02-18 Thread Cédric Augonnet

2007/2/17, Cédric Augonnet <[EMAIL PROTECTED]>:

2007/2/17, Bill Davidsen <[EMAIL PROTECTED]>:
> Cédric Augonnet wrote:

That is my all point actually, i am not telling i have a valid
partition. I'm just describing the fact that the minix fs driver is
making too many assumptions on the partition it is given. For sure
there must be something nasty about this image, as you told this is
the duty of the kernel not to mount such a partition if it is not a
partition. Here I was only giving an example of the way we could trick
the driver. This is as important as having the driver working
correctly on valid partitions i suppose.



Eventually the problem got solved  by Andries Bouwer :
http://marc.theaimsgroup.com/?l=linux-mm-commits=117176125510900=2

Now i can issue a df command on this partition or whatever it may be,
and i obtain a consistent result wiythout any oops.

Regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20-mm1 - Oops using Minix 3 file system

2007-02-18 Thread Cédric Augonnet

2007/2/17, Cédric Augonnet [EMAIL PROTECTED]:

2007/2/17, Bill Davidsen [EMAIL PROTECTED]:
 Cédric Augonnet wrote:

That is my all point actually, i am not telling i have a valid
partition. I'm just describing the fact that the minix fs driver is
making too many assumptions on the partition it is given. For sure
there must be something nasty about this image, as you told this is
the duty of the kernel not to mount such a partition if it is not a
partition. Here I was only giving an example of the way we could trick
the driver. This is as important as having the driver working
correctly on valid partitions i suppose.



Eventually the problem got solved  by Andries Bouwer :
http://marc.theaimsgroup.com/?l=linux-mm-commitsm=117176125510900w=2

Now i can issue a df command on this partition or whatever it may be,
and i obtain a consistent result wiythout any oops.

Regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20-mm1 - Oops using Minix 3 file system

2007-02-17 Thread Cédric Augonnet

2007/2/17, Bill Davidsen <[EMAIL PROTECTED]>:

Cédric Augonnet wrote:
>
> Hi Daniel,
>
> On 2.6.20-rc6-mm3 and 2.6.20-mm1, i get an OOPS when using the minix 3
> file system. I enclose the dmesg and the .config to that mail.
>
> Here are the steps to reproduce this oops (they involve using qemu to
> run Minix 3)
> - First create a 2GB image using
>  qemu-img create minix.img 2G
>   (Please note that this seem to be producing an eroneous image)
> - Then launch Minix inside qemu to make a minix partition on this
> image using mkfs on the corresponding device.

That's two steps, right? First you make a partition on the "disk" qemu
provides, then you put a filesystem on the partition? Or did you put a
filesystem on the raw device?



Yes, i first create a RAW image of a disk. Then, i use this image file
as an image given to qemu. Inside qemu, running Minix i issue mkfs
/dev/c0d1 (this is corresponding to this image).



> - Mount the image on loopback using
>  mount -t minix -o loop minix.img /mnt/qemu/

Does mount know to use Minux3 with this command line?



Well at least it is yet allowing to do so. And most things work like hell.


> - issue a "df" command on /mnt/qemu
>
> This oops occurs everytime i use df on this directory. However, this
> does not occur if the image was for a 1MB partition. And it does not
> occur if the partition on which we created minix.img was the same as
> the partition on which qemu stands. Sounds like qemu has an issue and
> creates an erroneous partition which linux does not handle correctly.
>
> Regards, and thanks for your patch by the way !

Having been burned a few times by the fact that qemu provides disk
images which then (normally) get partitions, I'm not sure you aren't
having the same problem.

None of which justifies the OOPS, of course, nice kernels don't go down.



That is my all point actually, i am not telling i have a valid
partition. I'm just describing the fact that the minix fs driver is
making too many assumptions on the partition it is given. For sure
there must be something nasty about this image, as you told this is
the duty of the kernel not to mount such a partition if it is not a
partition. Here I was only giving an example of the way we could trick
the driver. This is as important as having the driver working
correctly on valid partitions i suppose.


--
Bill Davidsen <[EMAIL PROTECTED]>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot



Regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20-mm1 - Oops using Minix 3 file system

2007-02-17 Thread Cédric Augonnet

2007/2/17, Daniel Aragonés <[EMAIL PROTECTED]>:

On 2/17/07, Cédric Augonnet <[EMAIL PROTECTED]> wrote:

> It appears that the trouble is in  the count_free of file
> fs/minix/bitmap.c . This procedure is actually called twice when we
> issue a df command.
> The point where things start to get strange is
> i = ((numbits - (numblocks-1) * bh->b_size * 8) / 16) * 2; at line 36
>
> In the first call to that procedure i have
>i = 3506
>bh->b_data = 0xd4e79000
>
> Whereas in the second call i have something much different :
>i = 536838736
>bh->b_data = d4e78000
>


More precisely, i traced how we call the count_free procedure : it is
once called by minix_count_free_blocks (and succeeds) and then by
minix_count_free_inodes and fails.

For the first call, which does not fail :
  minix_count_free_blocks(0xd4105240)
  sbi->s_zmap_blocks = 0x0010
  sbi->s_nzones = 0x0008
  sbi->s_firstdatazone = 0x1266
  (sbi->s_nzones - sbi->s_firstdatazone + 1) = 0x0007ed9b

 count_free( , 0x0010, 0x0007ed9b)
   unsigned numblocks = 0x0010 = 16
   __u32 numbits =0x0007ed9b = 519579
   bh->b_size = 0x1000 = 4096
== > i = 3506
This is consistent with the formula



For the second call
  minix_count_free_inodes(0xd4105240)
  sbi->s_imap_blocks = 0x000a
  sbi->s_ninodes = 0x9280

  count_free(..., 0x000a, 0x9281)
   unsigned numblocks = 0x000a  = 10
   __u32 numbits = 0x9281 = 37505
   bh->b_size =0x1000 = 4096
==> i = 536838736

(gdb) p/x (( 0x9281 - (0x00a-1) * 0x1000 * 8) / 16) * 2
$20 = 0x8252
$21 = -32174

(Very sorry for than mess !)



Well, that line is modified by my patch. The original one is:
i = ((numbits-(numblocks-1)*BLOCK_SIZE*8)/16)*2;

As you see, the constant is substituted by a variable. As you are
running under emulation, the true value of that variable ( 4096 in
standard minix3 releases, instead of the constant 1024 in minix1 and
minix2), is probably found where it should not be found, or not found
at all. And the result is what you show. Minix 3 provides for a
variable size of the blocks. Try to find what block size you are
emulating.

Also, though your dmesg shows a mounted loop partition, the minix
subpartition in it is not stated. So it cannot be accessed.


Well i actually do access to this partition, i can edit it and use it,
this on Linux. Sorry if this is not clear.


By the way, you don't need to support minix on your Linux box to run
it through an emulator, do you?


Actually i am running a full Minix OS in the qemu emulator and all
Linux does is providing me some disk images.

To make it more clear, i created a disk image file on Linux.  I
created a FS on that partition in Minix using mkfs inside qemu running
Minix. And then i just want to mount this partition to have access to
this filesystem on my Linux. This is ugly for sure, but it should
still not be oopsing.



Regards

Daniel



I thought you could be interested in having the actual image doing all
these nasty things :
 http://perso.ens-lyon.fr/cedric.augonnet/Linux/br.tar
 tar -xvf br.tar   should produce some br.img file that you can
mount using
 mount -o loop -t minix br.img /mnt/foo
 df -h should there be issuing the oops.

 Please remark that all the files inside this image were created
with Linux, not Minix, the _only_ thing done with minix and qemu was
to create the file system on the image.

Hoping this will be a bit useful,
Regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20-mm1 - Oops using Minix 3 file system

2007-02-17 Thread Cédric Augonnet

2007/2/17, Daniel Aragonés <[EMAIL PROTECTED]>:

 Well, a glance at your dmesg doesn' show that a minix partition was
recognized. Otherwise it would sow it. So you have not such a
partition within your drives.

You are using an emulator to run minix. You will have the same problem
if you run minix2 or minix3 through an emulator and not from a real
minix2 or minix3 partition.

Regards,

Daniel



Indeed the only line of dmesg showing i mounted the partition is
  loop: loaded (max 8 devices)

Still I actually mount an image : i can add files, edit and so on. If
i use the emulator and mount the partition, this partition works
perfectly and even the df commands works.

Now i tried to understand what is going on and traced the program :

It appears that the trouble is in  the count_free of file
fs/minix/bitmap.c . This procedure is actually called twice when we
issue a df command.
The point where things start to get strange is
   i = ((numbits - (numblocks-1) * bh->b_size * 8) / 16) * 2; at line 36

In the first call to that procedure i have
  i = 3506
  bh->b_data = 0xd4e79000

Whereas in the second call i have something much different :
  i = 536838736
  bh->b_data = d4e78000

Tt really seems to me that this value should not be so large.
Unfortunately, i cannot get any deeper in tracing the problem since
even if i tried to read the code, it not documented, so i can't be
sure I understand everything you're doing in count_free ...

Regards,
Cédric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20-mm1 - Oops using Minix 3 file system

2007-02-17 Thread Cédric Augonnet

2007/2/17, Daniel Aragonés [EMAIL PROTECTED]:

 Well, a glance at your dmesg doesn' show that a minix partition was
recognized. Otherwise it would sow it. So you have not such a
partition within your drives.

You are using an emulator to run minix. You will have the same problem
if you run minix2 or minix3 through an emulator and not from a real
minix2 or minix3 partition.

Regards,

Daniel



Indeed the only line of dmesg showing i mounted the partition is
  loop: loaded (max 8 devices)

Still I actually mount an image : i can add files, edit and so on. If
i use the emulator and mount the partition, this partition works
perfectly and even the df commands works.

Now i tried to understand what is going on and traced the program :

It appears that the trouble is in  the count_free of file
fs/minix/bitmap.c . This procedure is actually called twice when we
issue a df command.
The point where things start to get strange is
   i = ((numbits - (numblocks-1) * bh-b_size * 8) / 16) * 2; at line 36

In the first call to that procedure i have
  i = 3506
  bh-b_data = 0xd4e79000

Whereas in the second call i have something much different :
  i = 536838736
  bh-b_data = d4e78000

Tt really seems to me that this value should not be so large.
Unfortunately, i cannot get any deeper in tracing the problem since
even if i tried to read the code, it not documented, so i can't be
sure I understand everything you're doing in count_free ...

Regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20-mm1 - Oops using Minix 3 file system

2007-02-17 Thread Cédric Augonnet

2007/2/17, Daniel Aragonés [EMAIL PROTECTED]:

On 2/17/07, Cédric Augonnet [EMAIL PROTECTED] wrote:

 It appears that the trouble is in  the count_free of file
 fs/minix/bitmap.c . This procedure is actually called twice when we
 issue a df command.
 The point where things start to get strange is
 i = ((numbits - (numblocks-1) * bh-b_size * 8) / 16) * 2; at line 36

 In the first call to that procedure i have
i = 3506
bh-b_data = 0xd4e79000

 Whereas in the second call i have something much different :
i = 536838736
bh-b_data = d4e78000



More precisely, i traced how we call the count_free procedure : it is
once called by minix_count_free_blocks (and succeeds) and then by
minix_count_free_inodes and fails.

For the first call, which does not fail :
  minix_count_free_blocks(0xd4105240)
  sbi-s_zmap_blocks = 0x0010
  sbi-s_nzones = 0x0008
  sbi-s_firstdatazone = 0x1266
  (sbi-s_nzones - sbi-s_firstdatazone + 1) = 0x0007ed9b

 count_free( , 0x0010, 0x0007ed9b)
   unsigned numblocks = 0x0010 = 16
   __u32 numbits =0x0007ed9b = 519579
   bh-b_size = 0x1000 = 4096
==  i = 3506
This is consistent with the formula



For the second call
  minix_count_free_inodes(0xd4105240)
  sbi-s_imap_blocks = 0x000a
  sbi-s_ninodes = 0x9280

  count_free(..., 0x000a, 0x9281)
   unsigned numblocks = 0x000a  = 10
   __u32 numbits = 0x9281 = 37505
   bh-b_size =0x1000 = 4096
== i = 536838736

(gdb) p/x (( 0x9281 - (0x00a-1) * 0x1000 * 8) / 16) * 2
$20 = 0x8252
$21 = -32174

(Very sorry for than mess !)



Well, that line is modified by my patch. The original one is:
i = ((numbits-(numblocks-1)*BLOCK_SIZE*8)/16)*2;

As you see, the constant is substituted by a variable. As you are
running under emulation, the true value of that variable ( 4096 in
standard minix3 releases, instead of the constant 1024 in minix1 and
minix2), is probably found where it should not be found, or not found
at all. And the result is what you show. Minix 3 provides for a
variable size of the blocks. Try to find what block size you are
emulating.

Also, though your dmesg shows a mounted loop partition, the minix
subpartition in it is not stated. So it cannot be accessed.


Well i actually do access to this partition, i can edit it and use it,
this on Linux. Sorry if this is not clear.


By the way, you don't need to support minix on your Linux box to run
it through an emulator, do you?


Actually i am running a full Minix OS in the qemu emulator and all
Linux does is providing me some disk images.

To make it more clear, i created a disk image file on Linux.  I
created a FS on that partition in Minix using mkfs inside qemu running
Minix. And then i just want to mount this partition to have access to
this filesystem on my Linux. This is ugly for sure, but it should
still not be oopsing.



Regards

Daniel



I thought you could be interested in having the actual image doing all
these nasty things :
 http://perso.ens-lyon.fr/cedric.augonnet/Linux/br.tar
 tar -xvf br.tar   should produce some br.img file that you can
mount using
 mount -o loop -t minix br.img /mnt/foo
 df -h should there be issuing the oops.

 Please remark that all the files inside this image were created
with Linux, not Minix, the _only_ thing done with minix and qemu was
to create the file system on the image.

Hoping this will be a bit useful,
Regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.20-mm1 - Oops using Minix 3 file system

2007-02-17 Thread Cédric Augonnet

2007/2/17, Bill Davidsen [EMAIL PROTECTED]:

Cédric Augonnet wrote:

 Hi Daniel,

 On 2.6.20-rc6-mm3 and 2.6.20-mm1, i get an OOPS when using the minix 3
 file system. I enclose the dmesg and the .config to that mail.

 Here are the steps to reproduce this oops (they involve using qemu to
 run Minix 3)
 - First create a 2GB image using
  qemu-img create minix.img 2G
   (Please note that this seem to be producing an eroneous image)
 - Then launch Minix inside qemu to make a minix partition on this
 image using mkfs on the corresponding device.

That's two steps, right? First you make a partition on the disk qemu
provides, then you put a filesystem on the partition? Or did you put a
filesystem on the raw device?



Yes, i first create a RAW image of a disk. Then, i use this image file
as an image given to qemu. Inside qemu, running Minix i issue mkfs
/dev/c0d1 (this is corresponding to this image).



 - Mount the image on loopback using
  mount -t minix -o loop minix.img /mnt/qemu/

Does mount know to use Minux3 with this command line?



Well at least it is yet allowing to do so. And most things work like hell.


 - issue a df command on /mnt/qemu

 This oops occurs everytime i use df on this directory. However, this
 does not occur if the image was for a 1MB partition. And it does not
 occur if the partition on which we created minix.img was the same as
 the partition on which qemu stands. Sounds like qemu has an issue and
 creates an erroneous partition which linux does not handle correctly.

 Regards, and thanks for your patch by the way !

Having been burned a few times by the fact that qemu provides disk
images which then (normally) get partitions, I'm not sure you aren't
having the same problem.

None of which justifies the OOPS, of course, nice kernels don't go down.



That is my all point actually, i am not telling i have a valid
partition. I'm just describing the fact that the minix fs driver is
making too many assumptions on the partition it is given. For sure
there must be something nasty about this image, as you told this is
the duty of the kernel not to mount such a partition if it is not a
partition. Here I was only giving an example of the way we could trick
the driver. This is as important as having the driver working
correctly on valid partitions i suppose.


--
Bill Davidsen [EMAIL PROTECTED]
   We have more to fear from the bungling of the incompetent than from
the machinations of the wicked.  - from Slashdot



Regards,
Cédric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/