Re: [PATCH] add transport class symlink to device object

2005-08-31 Thread Dmitry Torokhov
On Wednesday 31 August 2005 16:43, Greg KH wrote:
> On Thu, Aug 18, 2005 at 02:50:19PM -0500, Dmitry Torokhov wrote:
> > On 8/18/05, Greg KH <[EMAIL PROTECTED]> wrote:
> > > @@ -500,9 +519,13 @@ int class_device_add(struct class_device
> > >}
> > > 
> > >class_device_add_attrs(class_dev);
> > > -   if (class_dev->dev)
> > > +   if (class_dev->dev) {
> > > +   class_name = make_class_name(class_dev);
> > > >sysfs_create_link(&class_dev->kobj,
> > > >  &class_dev->dev->kobj, "device");
> > > > +   sysfs_create_link(&class_dev->dev->kobj, &class_dev->kobj,
> > > > + class_name);
> > > +   }
> > > 
> > 
> > I wonder if we need to grab a reference to class_dev->dev here:
> > 
> > dev = device_get(class_dev->dev);
> > if (dev) {
> >  
> > }
> > 
> > Otherwise, if device gets unregistered/deleted before class device is
> > deleted we'll get into trouble when removing the link since
> > class_dev->dev will be garbage.
> > 
> > .. But grabbing that reference will cause pains in SCSI system which,
> > when I looked, removed class devices from device's release function.
> 
> No the sysfs_create_link() call increments the kobject reference on the
> target of the symlink.  See sysfs_add_link() for details.  So this
> should be just fine, right?
>

Yes, you are right. Sorry for the noise.
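
The point Greg makes can be sketched in userspace: creating a link takes a reference on its target, so the target stays alive until the link is removed. This is only an illustration; obj, obj_get, obj_put, link_create and link_remove are hypothetical names, not the kobject or sysfs API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; stands in for a kobject. */
struct obj {
	int refcount;
	int released;	/* set when the last reference is dropped */
};

/* Hypothetical symlink record that pins its target. */
struct link {
	struct obj *target;
};

static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

static void obj_put(struct obj *o)
{
	if (o && --o->refcount == 0)
		o->released = 1;	/* a real release() would free here */
}

/* Creating the link takes a reference on the target. */
static struct link *link_create(struct obj *target)
{
	struct link *l = malloc(sizeof(*l));

	if (!l)
		return NULL;
	l->target = obj_get(target);
	return l;
}

/* Removing the link drops that reference again. */
static void link_remove(struct link *l)
{
	obj_put(l->target);
	free(l);
}
```

With this shape, dropping the device's own reference while the link still exists does not release the object; only removing the link does.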

-- 
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][FAT] FAT dirent scan with hint take #3

2005-08-31 Thread Machida, Hiroyuki

OGAWA Hirofumi wrote:

"Machida, Hiroyuki" <[EMAIL PROTECTED]> writes:



Right, it looks like a TLB, which caches the "physical address"
corresponding to a "logical address". In this case, the PID and the file
name being looked up play the role of the "logical address".



But there is a big difference between the hint table and a TLB. A TLB is
just a cache, and a TLB hit is always good, because the kernel flushes
stale values.

But this hint table just collects recent accesses; it's not a cache,
and it's not tracking the process's accesses at all.  So, since the
hint value is essentially random, the hint value may be bad.

I worry about the bad cases of this.


Umm... How about tracking the access pattern of the process?  If the
accesses look random, just give up tracking and return no hint.  And,
probably, I think it would be easy to improve the behavior later.

What do you think?


Sounds interesting...

One concern about a global LRU in general: it tends to be occupied
by one specific pattern, such as accesses from one process or to one dir.
That prevents it from capturing locality.

I think it's better to have limitations like:
entries for same process would be limited to 2/3
entries for same dir would be limited to 1/3



e.g.

#define FAT_LOOKUP_HINT_MAX 16

/* this data per task */
struct fat_lookup_hint {
struct list_head lru;
pid_t pid;
struct super_block *sb;
struct inode *dir;
loff_t last_pos;
/*  int state;*/
};


Does this mean recording the 16 most recent accesses to the FAT file
system for each process? If so, pid could be eliminated.

I guess it's better to record nr_slots for this entry.

As an implementation issue, if the number of entries is small enough,
we can use an array, not a list.



static void fat_lkup_hint_inval(struct super_block *, struct inode *);
static loff_t fat_lkup_hint_get(struct super_block *, struct inode *);
static void fat_lkup_hint_add(struct super_block *, struct inode *, loff_t);
static int fat_lkup_hint_init(void);


I think the super_block can be retrieved from the inode; did you have
some other intention?


In addition, we can do the following to check the exact match case:

0. Record the hash value of the file name in struct fat_lookup_hint.

1. Check the hash value to find an exact match.
1-1. If a matching entry is found, check that the file name and the
 file name retrieved from the dirent correspond.
1-2. If they do, we have found the entry.

2. Get a hint value, if there seems to be locality.
2-1. Check the locality of the access pattern for this PID and this
 DIR.
2-2. If we recognize access locality, return a hint value so that
 it covers a potential working set.
2-3. Use the hint value as the start position of the dirscan.
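
Step 0 and step 1, together with the array suggestion, could be sketched in userspace roughly as follows. This is a minimal sketch under assumptions, not the proposed kernel code: struct hint, hint_add, hint_get and the djb2-style name_hash are hypothetical, and the directory is represented by a plain number instead of an inode pointer.

```c
#include <assert.h>
#include <stdint.h>

#define HINT_MAX 16	/* mirrors FAT_LOOKUP_HINT_MAX in the thread */

/* Hypothetical userspace stand-in for struct fat_lookup_hint. */
struct hint {
	int used;
	int pid;
	unsigned long dir;	/* stands in for the directory inode */
	uint32_t name_hash;
	long pos;		/* directory offset of the entry */
	unsigned long stamp;	/* last-use tick, for LRU replacement */
};

static struct hint table[HINT_MAX];
static unsigned long tick;

/* Simple string hash (djb2 style); any stable hash would do. */
static uint32_t name_hash(const char *name)
{
	uint32_t h = 5381;

	while (*name)
		h = h * 33 + (unsigned char)*name++;
	return h;
}

/* Record a lookup result, reusing a free or the least recently used slot. */
static void hint_add(int pid, unsigned long dir, const char *name, long pos)
{
	struct hint *victim = &table[0];
	int i;

	for (i = 0; i < HINT_MAX; i++) {
		if (!table[i].used) {
			victim = &table[i];
			break;
		}
		if (table[i].stamp < victim->stamp)
			victim = &table[i];
	}
	victim->used = 1;
	victim->pid = pid;
	victim->dir = dir;
	victim->name_hash = name_hash(name);
	victim->pos = pos;
	victim->stamp = ++tick;
}

/*
 * Return the recorded position on a hash match, or -1 for "no hint".
 * A hash match is only a candidate: the caller must still compare the
 * name against the dirent found at that position (step 1-1).
 */
static long hint_get(int pid, unsigned long dir, const char *name)
{
	uint32_t h = name_hash(name);
	int i;

	for (i = 0; i < HINT_MAX; i++) {
		if (table[i].used && table[i].pid == pid &&
		    table[i].dir == dir && table[i].name_hash == h) {
			table[i].stamp = ++tick;
			return table[i].pos;
		}
	}
	return -1;
}
```

A linear scan over 16 slots is cheap, which is the reason an array may be enough here; the locality check of step 2 is left out of this sketch.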

--
Hiroyuki Machida


[PATCH 2.6.13] IOCHK interface for I/O error handling/detecting

2005-08-31 Thread Hidetoshi Seto

This patch implements IOCHK interfaces that enable PCI drivers to
detect error and make their error handling easier.

Please refer to the archives if needed, e.g. http://lwn.net/Articles/139240/

Thanks,
H.Seto

Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>

---

 drivers/pci/pci.c   |2 ++
 include/asm-generic/iomap.h |   32 
 2 files changed, 34 insertions(+)

Index: linux-2.6.13/include/asm-generic/iomap.h
===
--- linux-2.6.13.orig/include/asm-generic/iomap.h
+++ linux-2.6.13/include/asm-generic/iomap.h
@@ -65,4 +65,36 @@ struct pci_dev;
 extern void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long max);
 extern void pci_iounmap(struct pci_dev *dev, void __iomem *);

+/*
+ * IOMAP_CHECK provides additional interfaces for drivers to detect
+ * some IO errors, supports drivers having ability to recover errors.
+ *
+ * All works around iomap-check depends on the design of "iocookie"
+ * structure. Every architecture owning its iomap-check is free to
+ * define the actual design of iocookie to fit its special style.
+ */
+#ifndef HAVE_ARCH_IOMAP_CHECK
+/* Dummy definition of default iocookie */
+typedef int iocookie;
+#endif
+
+/*
+ * Clear/Read iocookie to check IO error while using iomap.
+ *
+ * Note that default iochk_clear-read pair interfaces don't have
+ * any effective error check, but some high-reliable platforms
+ * would provide useful information to you.
+ * And note that some action may be limited (ex. irq-unsafe)
+ * between the pair depend on the facility of the platform.
+ */
+#ifdef HAVE_ARCH_IOMAP_CHECK
+extern void iochk_init(void);
+extern void iochk_clear(iocookie *cookie, struct pci_dev *dev);
+extern int iochk_read(iocookie *cookie);
+#else
+static inline void iochk_init(void) {}
+static inline void iochk_clear(iocookie *cookie, struct pci_dev *dev) {}
+static inline int iochk_read(iocookie *cookie) { return 0; }
+#endif
+
 #endif
Index: linux-2.6.13/drivers/pci/pci.c
===
--- linux-2.6.13.orig/drivers/pci/pci.c
+++ linux-2.6.13/drivers/pci/pci.c
@@ -777,6 +777,8 @@ static int __devinit pci_init(void)
while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
pci_fixup_device(pci_fixup_final, dev);
}
+
+   iochk_init();
return 0;
 }
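
How a driver might use the clear/read pair can be sketched outside the kernel as below. The iocookie typedef and the two no-op fallbacks follow the patch; struct pci_dev is reduced to a stub, and fake_read_reg and read_regs_checked are hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins so the sketch compiles outside the kernel. */
struct pci_dev { int unused; };
typedef int iocookie;

/* Generic fallbacks, as in the patch: no real checking is done. */
static inline void iochk_clear(iocookie *cookie, struct pci_dev *dev)
{
	(void)cookie;
	(void)dev;
}

static inline int iochk_read(iocookie *cookie)
{
	(void)cookie;
	return 0;
}

/* Hypothetical register read; a real driver would use ioread32(). */
static unsigned int fake_read_reg(const unsigned int *regs, size_t idx)
{
	return regs[idx];
}

/*
 * Hypothetical driver helper: read 'n' registers inside one
 * clear/read pair and report -1 if the platform flagged an error.
 */
static int read_regs_checked(struct pci_dev *dev, const unsigned int *regs,
			     unsigned int *out, size_t n)
{
	iocookie cookie;
	size_t i;

	iochk_clear(&cookie, dev);
	for (i = 0; i < n; i++)
		out[i] = fake_read_reg(regs, i);
	if (iochk_read(&cookie))
		return -1;	/* values in 'out' must not be trusted */
	return 0;
}
```

With the generic fallbacks the check never fires; an architecture that defines HAVE_ARCH_IOMAP_CHECK would supply real implementations behind the same two calls.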




[PATCH 2.6.13] IOCHK interface for I/O error handling/detecting (for ia64)

2005-08-31 Thread Hidetoshi Seto

This patch implements ia64-specific IOCHK interfaces that enable
PCI drivers to detect error and make their error handling easier.

Please refer to the archives if needed, e.g. http://lwn.net/Articles/139240/

Thanks,
H.Seto

Signed-off-by: Hidetoshi Seto <[EMAIL PROTECTED]>

---
 arch/ia64/Kconfig   |   13 +++
 arch/ia64/kernel/mca.c  |   34 
 arch/ia64/kernel/mca_asm.S  |   32 
 arch/ia64/kernel/mca_drv.c  |   85 ++
 arch/ia64/lib/Makefile  |1
 arch/ia64/lib/iomap_check.c |  168 
 include/asm-ia64/io.h   |  139 
 7 files changed, 472 insertions(+)

Index: linux-2.6.13/arch/ia64/lib/Makefile
===
--- linux-2.6.13.orig/arch/ia64/lib/Makefile
+++ linux-2.6.13/arch/ia64/lib/Makefile
@@ -16,6 +16,7 @@ lib-$(CONFIG_MCKINLEY)+= copy_page_mck.
 lib-$(CONFIG_PERFMON)  += carta_random.o
 lib-$(CONFIG_MD_RAID5) += xor.o
 lib-$(CONFIG_HAVE_DEC_LOCK) += dec_and_lock.o
+lib-$(CONFIG_IOMAP_CHECK) += iomap_check.o

 AFLAGS___divdi3.o  =
 AFLAGS___udivdi3.o = -DUNSIGNED
Index: linux-2.6.13/arch/ia64/Kconfig
===
--- linux-2.6.13.orig/arch/ia64/Kconfig
+++ linux-2.6.13/arch/ia64/Kconfig
@@ -399,6 +399,19 @@ config PCI_DOMAINS
bool
default PCI

+config IOMAP_CHECK
+   bool "Support iochk interfaces for IO error detection."
+   depends on PCI && EXPERIMENTAL
+   ---help---
+ Saying Y provides iochk infrastructure for "RAS-aware" drivers
+ to detect and recover some IO errors, which strongly required by
+ some of very-high-reliable systems.
+ The implementation of this infrastructure is highly depend on arch,
+ bus system, chipset and so on.
+ Currently, very few drivers or architectures implement this support.
+
+ If you don't know what to do here, say N.
+
 source "drivers/pci/Kconfig"

 source "drivers/pci/hotplug/Kconfig"
Index: linux-2.6.13/arch/ia64/lib/iomap_check.c
===
--- /dev/null
+++ linux-2.6.13/arch/ia64/lib/iomap_check.c
@@ -0,0 +1,168 @@
+/*
+ * File:iomap_check.c
+ * Purpose: Implement the IA64 specific iomap recovery interfaces
+ */
+
+#include 
+#include 
+#include 
+
+void iochk_init(void);
+void iochk_clear(iocookie *cookie, struct pci_dev *dev);
+int  iochk_read(iocookie *cookie);
+
+struct list_head iochk_devices;
+DEFINE_RWLOCK(iochk_lock); /* all works are excluded on this lock */
+
+static struct pci_dev *search_host_bridge(struct pci_dev *dev);
+static int have_error(struct pci_dev *dev);
+
+void notify_bridge_error(struct pci_dev *bridge);
+void clear_bridge_error(struct pci_dev *bridge);
+void save_bridge_error(void);
+
+void iochk_init(void)
+{
+   /* setup */
+   INIT_LIST_HEAD(&iochk_devices);
+}
+
+void iochk_clear(iocookie *cookie, struct pci_dev *dev)
+{
+   unsigned long flag;
+
+   INIT_LIST_HEAD(&(cookie->list));
+
+   cookie->dev = dev;
+   cookie->host = search_host_bridge(dev);
+
+   write_lock_irqsave(&iochk_lock, flag);
+   if (cookie->host && have_error(cookie->host)) {
+   /* someone under my bridge causes error... */
+   notify_bridge_error(cookie->host);
+   clear_bridge_error(cookie->host);
+   }
+   list_add(&cookie->list, &iochk_devices);
+   write_unlock_irqrestore(&iochk_lock, flag);
+
+   cookie->error = 0;
+}
+
+int iochk_read(iocookie *cookie)
+{
+   unsigned long flag;
+   int ret = 0;
+
+   write_lock_irqsave(&iochk_lock, flag);
+   if ( cookie->error || have_error(cookie->dev)
+   || (cookie->host && have_error(cookie->host)) )
+   ret = 1;
+   list_del(&cookie->list);
+   write_unlock_irqrestore(&iochk_lock, flag);
+
+   return ret;
+}
+
+struct pci_dev *search_host_bridge(struct pci_dev *dev)
+{
+   struct pci_bus *pbus;
+
+   /* there is no bridge */
+   if (!dev->bus->self)
+   return NULL;
+
+   /* find root bus bridge */
+   for (pbus = dev->bus; pbus->parent && pbus->parent->self;
+   pbus = pbus->parent);
+
+   return pbus->self;
+}
+
+static int have_error(struct pci_dev *dev)
+{
+   u16 status;
+
+   /* check status */
+   switch (dev->hdr_type) {
+   case PCI_HEADER_TYPE_NORMAL: /* 0 */
+   pci_read_config_word(dev, PCI_STATUS, &status);
+   break;
+   case PCI_HEADER_TYPE_BRIDGE: /* 1 */
+   pci_read_config_word(dev, PCI_SEC_STATUS, &status);
+   break;
+   case PCI_HEADER_TYPE_CARDBUS: /* 2 */
+   return 0; /* FIX ME */
+   default:
+   BUG();
+   }
+
+   if ( (status & PCI_STATUS_REC_TARGET_ABORT)
+   || (status & PCI_STATUS_REC_MASTER_ABORT)
+   || (status & 

Re: Updated dynamic tick patches

2005-08-31 Thread Con Kolivas
On Thu, 1 Sep 2005 02:58 am, Srivatsa Vaddagiri wrote:
> Following patches related to dynamic tick are posted in separate mails,
> for convenience of review. The first patch probably applies w/o dynamic
> tick consideration also.
>
> Patch 1/3  -> Fixup lost tick calculation in timer_pm.c
> Patch 2/3  -> Dyn-tick cleanups
> Patch 3/3  -> Use lost tick information in dyn-tick time recovery
>
> These patches are against 2.6.13-rc6-mm2.
>
> Con, would be great if you can upload a consolidated new version of
> dyn-tick patch on your website!

Great, thanks. I'll wait till 2.6.13-mm1 is out since that's due shortly and 
I'll resync everything with that and perhaps tweak along the way.

Cheers,
Con


RE: ip_contrack refuses to load if built UP as a module on IA64

2005-08-31 Thread Peter Chubb


This patch makes UP and SMP do the same thing as far as module per-cpu
data go.

Unfortunately it affects core code.

To repeat the problem:
  IA64 keeps per-cpu data in a small data area that is referenced by a
  22-bit offset, for both UP and SMP cases.  If a module defines
  per-cpu data, it too will end up in the small-data area.  But the
  module loader at present special-cases the UP treatment of per-cpu
  data, assumes that it is in the GP-relative data area, and does
  nothing (for SMP it allocates space, and copies initialised data
  items into it).

  The effect is that modules defining per-cpu data fail to load if
  they're built UP, because of an impossible relocation.

  The appended patch makes the treatment of per-cpu data uniform
  between UP and SMP cases.  For most architectures, the per-cpu data
  section will be empty for UP, and so the per-cpu setup code will not
  be invoked.
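
The difference between the two cases can be sketched in userspace as follows. The offsets table and the _smp/_up function names are assumptions made for this illustration; the UP body matches the one-line memcpy() the patch adds, and the SMP loop omits the cpu_possible() test.

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS 4

/* Hypothetical per-CPU offsets into one backing area (SMP case). */
static const unsigned long per_cpu_offset[NR_CPUS] = { 0, 64, 128, 192 };

/* SMP flavour: replicate the initialised data once per CPU. */
static void percpu_modcopy_smp(void *pcpudst, const void *src,
			       unsigned long size)
{
	unsigned int i;

	for (i = 0; i < NR_CPUS; i++)
		memcpy((char *)pcpudst + per_cpu_offset[i], src, size);
}

/* UP flavour, as added by the patch: a single plain copy. */
static void percpu_modcopy_up(void *pcpudst, const void *src,
			      unsigned long size)
{
	memcpy(pcpudst, src, size);
}
```

Either way the module loader can call one function name; only the number of copies differs.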

Signed-off-by: Peter Chubb <[EMAIL PROTECTED]>

diff --git a/arch/ia64/kernel/module.c b/arch/ia64/kernel/module.c
--- a/arch/ia64/kernel/module.c
+++ b/arch/ia64/kernel/module.c
@@ -951,4 +951,10 @@ percpu_modcopy (void *pcpudst, const voi
if (cpu_possible(i))
memcpy(pcpudst + __per_cpu_offset[i], src, size);
 }
+#else
+void
+percpu_modcopy (void *pcpudst, const void *src, unsigned long size)
+{
+   memcpy(pcpudst, src, size);
+}
 #endif /* CONFIG_SMP */
diff --git a/kernel/module.c b/kernel/module.c
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -209,7 +209,6 @@ static struct module *find_module(const 
return NULL;
 }
 
-#ifdef CONFIG_SMP
 /* Number of blocks used and allocated. */
 static unsigned int pcpu_num_used, pcpu_num_allocated;
 /* Size of each block.  -ve means used. */
@@ -352,29 +351,7 @@ static int percpu_modinit(void)
return 0;
 }  
 __initcall(percpu_modinit);
-#else /* ... !CONFIG_SMP */
-static inline void *percpu_modalloc(unsigned long size, unsigned long align,
-   const char *name)
-{
-   return NULL;
-}
-static inline void percpu_modfree(void *pcpuptr)
-{
-   BUG();
-}
-static inline unsigned int find_pcpusec(Elf_Ehdr *hdr,
-   Elf_Shdr *sechdrs,
-   const char *secstrings)
-{
-   return 0;
-}
-static inline void percpu_modcopy(void *pcpudst, const void *src,
- unsigned long size)
-{
-   /* pcpusec should be 0, and size of that section should be 0. */
-   BUG_ON(size != 0);
-}
-#endif /* CONFIG_SMP */
+
 
 #ifdef CONFIG_MODULE_UNLOAD
 #define MODINFO_ATTR(field)\


Re: State of Linux graphics

2005-08-31 Thread Keith Packard
On Wed, 2005-08-31 at 18:58 -0700, Allen Akin wrote:
> On Wed, Aug 31, 2005 at 02:06:54PM -0700, Keith Packard wrote:
> | On Wed, 2005-08-31 at 13:06 -0700, Allen Akin wrote:
> | > ...
> | 
> | Right, the goal is to have only one driver for the hardware, whether an
> | X server for simple 2D only environments or a GL driver for 2D/3D
> | environments. ...
> 
> I count two drivers there; I was hoping the goal was for one. :-)

Yeah, two systems, but (I hope) only one used for each card. So far, I'm
not sure of the value of attempting to provide a mostly-software GL
implementation in place of existing X drivers.

> |   ... I think the only questions here are about the road from
> | where we are to that final goal.
> 
> Well there are other questions, including whether it's correct to
> partition the world into "2D only" and "2D/3D" environments.  There are
> many disadvantages and few advantages (that I can see) for doing so.

I continue to work on devices for which 3D isn't going to happen.  My
most recent window system runs on a machine with only 384K of memory,
and yet supports a reasonable facsimile of a linux desktop environment.

In the 'real world', we have linux machines continuing to move
"down-market" with a target price of $100. At this price point, it is
reasonable to look at what are now considered 'embedded' graphics
controllers with no acceleration other than simple copies and fills.

Again, the question is whether a mostly-software OpenGL implementation
can effectively compete against the simple X+Render graphics model for
basic 2D application operations, and whether there are people interested
in even trying to make this happen.

> |... However, at the
> | application level, GL is not a very friendly 2D application-level API.
> 
> The point of OpenGL is to expose what the vast majority of current
> display hardware does well, and not a lot more.  So if a class of apps
> isn't "happy" with the functionality that OpenGL provides, it won't be
> happy with the functionality that any other low-level API provides.  The
> problem lies with the hardware.

Not currently; the OpenGL we have today doesn't provide for
component-level compositing or off-screen drawable objects. The former
is possible in much modern hardware, and may be exposed in GL through
pixel shaders, while the latter spent far too long mired in the ARB and
is only now on the radar for implementation in our environment.

Off-screen drawing is the dominant application paradigm in the 2D world,
so we can't function without it while component-level compositing
provides superior text presentation on LCD screens, which is an
obviously increasing segment of the market.

> Conversely, if the apps aren't taking advantage of the functionality
> OpenGL provides, they're not exploiting the opportunities the hardware
> offers.  Of course I'm not saying all apps *must* use all of OpenGL;
> simply that their developers should be aware of exactly what they're
> leaving on the table.  It can make the difference between an app that's
> run-of-the-mill and one that's outstanding.

Most 2D applications aren't all about the presentation on the screen;
right now, we're struggling to just get basic office functionality
provided to the user. The cairo effort is more about making applications
portable to different window systems and printing systems than it is
about bling, although the bling does have a strong pull for some
developers.

So, my motivation for moving to GL drivers is far more about providing
drivers for closed source hardware and reducing developer effort needed
to support new hardware than it is about making the desktop graphics
faster or more fancy.

> "Friendliness" is another matter, and it makes a ton of sense to package
> common functionality in an easier-to-use higher-level library that a lot
> of apps can share.  In this discussion my concern isn't with Cairo, but
> with the number and type of back-end APIs we (driver developers and
> library developers and application developers) have to support.

Right, again the goal is to have only one driver per video card. Right
now we're not there, and the result is that the GL drivers take a back
seat in most environments to the icky X drivers that are required to
provide simple 2D graphics. That's not a happy place to be, and we do
want to solve that as soon as possible.

> | ... GL provides
> | far more functionality than we need for 2D applications being designed
> | and implemented today...
> 
> With the exception of lighting, it seems to me that pretty much all of
> that applies to today's "2D" apps.  It's just a myth that there's "far
> more" functionality in OpenGL than 2D apps can use.  (Especially for
> OpenGL ES, which eliminates legacy cruft from full OpenGL.)

The bulk of 2D applications need to paint solid rectangles, display a
couple of images with a bit of scaling and 

[patch] updated hdaps driver.

2005-08-31 Thread Robert Love
Below find an updated hdaps driver.

Various bug fixes, clean ups, additions to the DMI whitelist, and a new
automatic inversion detector (some ThinkPads have the axes negated).

Andrew, since a new 2.6-mm has yet to come out, feel free to replace the
original patch with this one.

Thanks,

Robert Love


Driver for the IBM Hard Drive Active Protection System (HDAPS), an
accelerometer found in most modern ThinkPads.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

diff -urN linux-2.6.13/drivers/hwmon/hdaps.c linux/drivers/hwmon/hdaps.c
--- linux-2.6.13/drivers/hwmon/hdaps.c  1969-12-31 19:00:00.0 -0500
+++ linux/drivers/hwmon/hdaps.c 2005-08-31 23:50:36.0 -0400
@@ -0,0 +1,739 @@
+/*
+ * drivers/hwmon/hdaps.c - driver for IBM's Hard Drive Active Protection System
+ *
+ * Copyright (C) 2005 Robert Love <[EMAIL PROTECTED]> 
+ * Copyright (C) 2005 Jesper Juhl <[EMAIL PROTECTED]> 
+ *
+ * The HardDisk Active Protection System (hdaps) is present in the IBM ThinkPad
+ * T41, T42, T43, R51, and X40, at least.  It provides a basic two-axis
+ * accelerometer and other data, such as the device's temperature.
+ *
+ * Based on the document by Mark A. Smith available at
+ * http://www.almaden.ibm.com/cs/people/marksmith/tpaps.html and a lot of trial
+ * and error.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License v2 as published by the
+ * Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define HDAPS_LOW_PORT      0x1600  /* first port used by hdaps */
+#define HDAPS_NR_PORTS      0x30    /* 0x1600 - 0x162f */
+
+#define STATE_FRESH         0x50    /* accelerometer data is fresh */
+
+#define REFRESH_ASYNC       0x00    /* do asynchronous refresh */
+#define REFRESH_SYNC        0x01    /* do synchronous refresh */
+
+#define HDAPS_PORT_STATE    0x1611  /* device state */
+#define HDAPS_PORT_YPOS     0x1612  /* y-axis position */
+#define HDAPS_PORT_XPOS     0x1614  /* x-axis position */
+#define HDAPS_PORT_TEMP1    0x1616  /* device temperature, in Celsius */
+#define HDAPS_PORT_YVAR     0x1617  /* y-axis variance (what is this?) */
+#define HDAPS_PORT_XVAR     0x1619  /* x-axis variance (what is this?) */
+#define HDAPS_PORT_TEMP2    0x161b  /* device temperature (again?) */
+#define HDAPS_PORT_UNKNOWN  0x161c  /* what is this? */
+#define HDAPS_PORT_KMACT    0x161d  /* keyboard or mouse activity */
+
+#define HDAPS_READ_MASK     0xff    /* some reads have the low 8 bits set */
+
+#define KEYBD_MASK          0x20    /* set if keyboard activity */
+#define MOUSE_MASK          0x40    /* set if mouse activity */
+#define KEYBD_ISSET(n) (!! (n & KEYBD_MASK))   /* keyboard used? */
+#define MOUSE_ISSET(n) (!! (n & MOUSE_MASK))   /* mouse used? */
+
+#define INIT_TIMEOUT_MSECS  4000    /* wait up to 4s for device init ... */
+#define INIT_WAIT_MSECS     200     /* ... in 200ms increments */
+
+static struct platform_device *pdev;
+static struct input_dev hdaps_idev;
+static struct timer_list hdaps_timer;
+static unsigned int hdaps_mousedev_threshold = 4;
+static unsigned long hdaps_poll_ms = 50;
+static unsigned int hdaps_mousedev;
+static unsigned int hdaps_invert;
+static u8 km_activity;
+static int rest_x;
+static int rest_y;
+
+static DECLARE_MUTEX(hdaps_sem);
+
+/*
+ * __get_latch - Get the value from a given port.  Callers must hold hdaps_sem.
+ */
+static inline u8 __get_latch(u16 port)
+{
+   return inb(port) & HDAPS_READ_MASK;
+}
+
+/*
+ * __check_latch - Check a port latch for a given value.  Callers must hold
+ * hdaps_sem.  Returns zero if the port contains the given value.
+ */
+static inline unsigned int __check_latch(u16 port, u8 val)
+{
+   if (__get_latch(port) == val)
+   return 0;
+   return -EINVAL;
+}
+
+/*
+ * __wait_latch - Wait up to 100us for a port latch to get a certain value,
+ * returning zero if the value is obtained.  Callers must hold hdaps_sem.
+ */
+static unsigned int __wait_latch(u16 port, u8 val)
+{
+   unsigned int i;
+
+   for (i = 0; i < 20; i++) {
+   if (!__check_latch(port, val))
+   return 0;
+   udelay(5);
+   }
+
+   return -EINVAL;
+}
+
+/*
+ * __device_refresh - Request a refresh from the accelerometer.
+ *
+ * If sync is 

Re: sr device can be written?

2005-08-31 Thread Randy.Dunlap
On Tue, 30 Aug 2005 19:07:55 +0800 jeff shia wrote:

> but it seems that I cannot open sr0 with open flags O_RDWR. Why?
> open("/dev/sr0",O_RDWR);
> 
> It says: sr0 is a read-only file system.
> Why?

What media did you have in the drive?

For me, with a CD-ROM, I get the same that you reported,
but with a DVD+RW disc, it opens successfully.
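
A program that wants to cope with both kinds of media could try read-write first and fall back to read-only. This is a minimal sketch assuming the failure shows up as EROFS, as the "read-only file system" message suggests; open_rw_or_ro is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open 'path' read-write; if the kernel reports read-only media
 * (EROFS), retry read-only.  '*writable' tells the caller which one won.
 * Returns the file descriptor, or -1 with errno set.
 */
static int open_rw_or_ro(const char *path, int *writable)
{
	int fd = open(path, O_RDWR);

	if (fd >= 0) {
		*writable = 1;
		return fd;
	}
	if (errno != EROFS)
		return -1;

	*writable = 0;
	return open(path, O_RDONLY);
}
```

Calling open_rw_or_ro("/dev/sr0", &writable) would then report whether the disc in the drive accepted a writable open.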


> On 8/30/05, Tino Keitel <[EMAIL PROTECTED]> wrote:
> > On Tue, Aug 30, 2005 at 16:11:58 +0800, jeff shia wrote:
> > > You mean the device file can be written?
> > 
> > Yes, like an ordinary block device.
> > 
> > >
> > >
> > > On 8/30/05, Tino Keitel <[EMAIL PROTECTED]> wrote:
> > > > On Tue, Aug 30, 2005 at 12:53:51 +0800, jeff shia wrote:
> > > > > Hello,
> > > > >  Sr is the SCSI CD-ROM device? So it should be read-only? But looking at the
> > > > > source code I notice that sr can be written also! Is that right?
> > > >
> > > > Just imagine a DVD-RAM drive.
> > > >
> > > > Regards,
> > > > Tino
> > > > -

---
~Randy


Re: APs from the Kernel Summit run Linux

2005-08-31 Thread Nigel Cunningham
Hi.

On Thu, 2005-09-01 at 13:29, Kyle Moffett wrote:
> The 4020 and 0402 look oddly symmetrical to me, but that could just
> be my imagination.

All I saw in it was byte n+1 = byte n >> 1. Can't see any use to that
either, though. Maybe it's just there to torment reverse engineers, or
trap memory corruption?
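
The "byte n+1 = byte n >> 1" observation is easy to check mechanically against the sequence 81 40 20 10 08 04 02 discussed in this thread. A small sketch; is_shift_chain is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if every byte equals the previous byte shifted right by one
 * bit, i.e. buf[i + 1] == buf[i] >> 1 for the whole buffer.
 */
static int is_shift_chain(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i++)
		if (buf[i + 1] != (buf[i] >> 1))
			return 0;
	return 1;
}
```

The plain sequence passes this test, while the altered first repetition (starting with 88) does not, since 0x88 >> 1 is 0x44.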

Nigel
-- 
Evolution.
Enumerate the requirements.
Consider the interdependencies.
Calculate the probabilities.



[Question] [Patch] How get instruction pointer of user space ???

2005-08-31 Thread [EMAIL PROTECTED]
Hi:

Thanks to Yingchao Zhou and Gaurav Dhiman first, for your answers.

I get it now! But it looks like we need to update what we know about this.

I read copy_thread() in arch/i386/kernel/process.c; the relevant piece of
that function is:
this function are:


	childregs = ((struct pt_regs *) (THREAD_SIZE + (unsigned long) p->thread_info)) - 1;
	/*
	 * The below -8 is to reserve 8 bytes on top of the ring0 stack.
	 * This is necessary to guarantee that the entire "struct pt_regs"
	 * is accessable even if the CPU haven't stored the SS/ESP registers
	 * on the stack (interrupt gate does not save these registers
	 * when switching to the same priv ring).
	 * Therefore beware: accessing the xss/esp fields of the
	 * "struct pt_regs" is possible, but they may contain the
	 * completely wrong values.
	 */
	childregs = (struct pt_regs *) ((unsigned long) childregs - 8);
	*childregs = *regs;

Oh, that clears up the whole mystery! That comment is very readable.

OK, here is the code I used to experiment in do_fork():

static void
my_check_regs_with_offset(struct pt_regs *orig_regs)
{
	struct thread_info *thread_info;
	struct pt_regs *pt_regs;
	unsigned long stack_bottom;

	thread_info = current_thread_info();
	/* skip the 8 bytes that copy_thread() reserves at the stack top */
	pt_regs = (struct pt_regs *)((unsigned long)thread_info + THREAD_SIZE - 8);
	pt_regs--;
	stack_bottom = 8 + (unsigned long)(pt_regs + 1);
	printk("\tbottom=%08lx, pt_regs = %p eip=%08lx\n",
	       stack_bottom, pt_regs, pt_regs->eip);
}

static void
my_check_regs_without_offset(struct pt_regs *orig_regs)
{
	struct thread_info *thread_info;
	struct pt_regs *pt_regs;
	unsigned long stack_bottom;

	thread_info = current_thread_info();
	/* no offset: pt_regs is assumed to end exactly at the stack top */
	pt_regs = (struct pt_regs *)((unsigned long)thread_info + THREAD_SIZE);
	pt_regs--;
	stack_bottom = (unsigned long)(pt_regs + 1);
	printk("\tbottom=%08lx, pt_regs = %p eip=%08lx\n",
	       stack_bottom, pt_regs, pt_regs->eip);
}

in do_fork() function, I insert code:

if (current->tgid && (current->tgid % 10 == 0) && current->mm) {
	unsigned long stack_bottom;
	struct thread_info *thread_info;
	struct mm_struct *mm;

	printk("withOUT offset: ");
	my_check_regs_without_offset(regs);

	printk("with offset: ");
	my_check_regs_with_offset(regs);

	printk("sizeof(struct pt_regs) = %d THREAD_SIZE=%x ",
	       (int)sizeof(struct pt_regs), (unsigned int)THREAD_SIZE);
	thread_info = current_thread_info();
	printk("thread_info=%p\n", thread_info);

	printk("In kernel words:");
	printk(" KSTK_TOP(task)=%lx\n", (unsigned long)KSTK_TOP(current));
	printk(" task_pt_regs(task)=%p\n", task_pt_regs(current));
	printk(" KSTK_EIP(task)=%lx\n", KSTK_EIP(current));

	printk("In fact:\n");
	stack_bottom = 8 + (unsigned long)(regs + 1);
	printk(" bottom=%08lx, pt_regs=%p, eip=%08lx\n", stack_bottom, regs, regs->eip);

	/* take the mm reference once and drop it again when done */
	mm = get_task_mm(current);
	if (mm) {
		printk(" code address range: [%lx-%lx]\n", mm->start_code, mm->end_code);
		mmput(mm);
	}
	printk("\n");
}

the printk() output:


withOUT offset: bottom=dac16000, pt_regs = dac15fc4 eip=0282
with offset: bottom=dac16000, pt_regs = dac15fbc eip=0012d402
sizeof(struct pt_regs) = 60 THREAD_SIZE=2000 thread_info=dac14000
In kernel words: KSTK_TOP(task)=deebb020
task_pt_regs(task)=dac15fc4
KSTK_EIP(task)=282
In fact:
bottom=dac16000, pt_regs=dac15fbc, eip=0012d402
code address range: [8047000-80d1b80]

withOUT offset: bottom=dac16000, pt_regs = dac15fc4 eip=0282
with offset: bottom=dac16000, pt_regs = dac15fbc eip=00c1f402
sizeof(struct pt_regs) = 60 THREAD_SIZE=2000 thread_info=dac14000
In kernel words: KSTK_TOP(task)=deebb020
task_pt_regs(task)=dac15fc4
KSTK_EIP(task)=282
In fact:
bottom=dac16000, pt_regs=dac15fbc, eip=00c1f402
code address range: [8047000-80d1b80]

withOUT offset: bottom=dac16000, pt_regs = dac15fc4 eip=0282
with offset: bottom=dac16000, pt_regs = dac15fbc eip=00c1f402
sizeof(struct pt_regs) = 60 THREAD_SIZE=2000 thread_info=dac14000
In kernel words: KSTK_TOP(task)=deebb020
task_pt_regs(task)=dac15fc4
KSTK_EIP(task)=282
In fact:
bottom=dac16000, pt_regs=dac15fbc, eip=00c1f402
code address range: [8047000-80d1b80]

It looks like some anonymous hero updated copy_thread() but forgot to
update the task_pt_regs(task) macro. After browsing LXR, I found that
2.6.11 does not have this change yet.

The copy_thread() in 2.6.12.3 and 2.6.13, at least, includes this "dummy"
offset. The attachment is a patch for that.
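
The two pt_regs addresses in the printk output follow directly from the numbers it reports (thread_info=dac14000, THREAD_SIZE=2000, sizeof(struct pt_regs)=60). A small userspace sketch of that arithmetic; the function and macro names other than THREAD_SIZE are made up for this illustration.

```c
#include <assert.h>

#define THREAD_SIZE   0x2000UL	/* from the printk output above */
#define PT_REGS_SIZE  60UL	/* sizeof(struct pt_regs) in that output */
#define STACK_PAD     8UL	/* bytes reserved by copy_thread() */

/* Top of the kernel stack that shares its pages with thread_info. */
static unsigned long stack_top(unsigned long thread_info)
{
	return thread_info + THREAD_SIZE;
}

/* pt_regs address without the 8-byte pad (what task_pt_regs() computes). */
static unsigned long regs_without_pad(unsigned long thread_info)
{
	return stack_top(thread_info) - PT_REGS_SIZE;
}

/* pt_regs address with the pad (where copy_thread() places it). */
static unsigned long regs_with_pad(unsigned long thread_info)
{
	return stack_top(thread_info) - STACK_PAD - PT_REGS_SIZE;
}
```

For thread_info = 0xdac14000 this gives 0xdac15fc4 without the pad and 0xdac15fbc with it, matching the two pt_regs values printed above.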

Am I right?

thanks again.

sailor


--- linux-2.6.13/include/asm-i386/processor.h.orig	2005-09-01 11:19:22.0 +0800
+++ linux-2.6.13/include/asm-i386/processor.h	2005-09-01 11:26:04.0 +0800
@@ -538,11 +538,13 @@
unsigned long *__ptr = (unsigned long *)(info); \
(unsigned long)(&__ptr[THREAD_SIZE_LONGS]); \
 })
-
+/*
+ * subtract 8 here, to skip dummy offset, see copy_thread() for detailed comment.
+ */
 #define task_pt_regs(task) \
 ({ \
struct pt_regs *__regs__;   \
-   __regs__ = (struct pt_regs *)KSTK_TOP((task)->thread_info); \
+   __regs__ = (struct pt_regs 

Re: APs from the Kernel Summit run Linux

2005-08-31 Thread Kyle Moffett

On Aug 31, 2005, at 16:32:11, Vojtech Pavlik wrote:

> On Wed, Aug 31, 2005 at 08:53:19PM +0100, Russell King wrote:
> > On Wed, Aug 31, 2005 at 12:55:12PM -0400, Mark Lord wrote:
> > > I'll try loading the works into another ARM
> > > system I have here, and see (1) if it runs as-is,
> > > and (2) what the disassembly shows.
> >
> > You can identify ARM code quite readily - look for a large number of
> > 32-bit words naturally aligned and grouped together whose top nibble
> > is 14 - ie 0xE...
> >
> > The top nibble is the conditional execution field, and 14 is "always".
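
The heuristic described above can be sketched in C roughly as follows. This
is only an illustration: it assumes a little-endian image, and the function
name is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count naturally aligned 32-bit words whose top nibble is 0xE, the ARM
 * "always" condition code. Assumes a little-endian image, so the most
 * significant byte of each word is the fourth byte in the file.
 */
size_t count_arm_always_words(const uint8_t *buf, size_t len)
{
    size_t hits = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i + 4 <= len; i += 4) {
        if ((buf[i + 3] >> 4) == 0xE)
            hits++;
    }
    return hits;
}
```

A high ratio of hits to total words would suggest ARM code. For a big-endian
image the test would look at buf[i] instead.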


> Didn't find that. Anyway:
>
> The first and third parts contain a repeating 7-byte sequence
>
> 81 40 20 10 08 04 02
>
> near the beginning, while part 2 is padded with zeroes in the same
> place.


That sequence is altered in the first and last repetitions, like this:

88 4020 1008 0402
81 4020 1008 0402
[...]
81 4020 1008 0402
81 4020 1008 04c2

The 4020 and 0402 look oddly symmetrical to me, but that could just
be my imagination.

I wrote a quick perl script to find the number of occurrences of 8-bit
aligned sequences of 16-bits, for all 16-bit values.  It has some
interesting (and potentially useful) results.

The script:
http://zeus.moffetthome.net/~kyle/hexfreq

The output:
http://zeus.moffetthome.net/~kyle/dwl.hexmult

Reprocessed output by frequency:
http://zeus.moffetthome.net/~kyle/dwl.hexfreq

Reprocessing command:
dwl.hexfreq
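
For readers without Perl handy, the counting step described above might look
roughly like this in C. It is a sketch of the idea, not a port of the script:
the function name is made up, and pairing bytes in file order (first byte as
the high half) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * For every byte offset, take the 16-bit value formed by that byte and the
 * next one (in file order) and count how often each value occurs.
 * counts must have room for 65536 entries.
 */
void count_byte_pairs(const uint8_t *buf, size_t len, uint32_t *counts)
{
    assert(counts != NULL);
    memset(counts, 0, 65536 * sizeof(counts[0]));

    for (size_t i = 0; i + 1 < len; i++) {
        unsigned int value = ((unsigned int)buf[i] << 8) | buf[i + 1];
        counts[value]++;
    }
}
```

Sorting the resulting table by count would give the by-frequency view.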


Cheers,
Kyle Moffett

--
Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
software stuff and not get a real job. Charles Schulz had the best answer:

"Why do musicians compose symphonies and poets write poems? They do it
because life wouldn't have any meaning for them if they didn't. That's why
I draw cartoons. It's my life."
  -- Charles Schulz




Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread Zhou Yingchao
2005/9/1, jmerkey <[EMAIL PROTECTED]>:
> Bernd,
>
> It might be helpful for someone to look at these sections of code I had
> to patch in 2.6.9. I discovered a case where the kernel scheduler will
> pass NULL for the array argument when I started hitting the extreme upper
> range (> 200 MB/s combined disk and LAN throughput). This was running
> with a preemptible kernel and hyperthreading enabled.
>
> The wheels come off in the kernel somewhere. I looked at later 2.6
> kernels and there have been some changes, but someone may get an "aha"
> from this fix, if there is an underlying problem in the kernel.
>
> Jeff
>
>
>  static void dequeue_task(struct task_struct *p, prio_array_t *array)
>  {
> -array->nr_active--;
> -list_del(&p->run_list);
> -if (list_empty(array->queue + p->prio))
> -__clear_bit(p->prio, array->bitmap);
> +if (!array)
> +   printk("WARN:  prio_array was NULL in dequeue task %08X"
> +  "pid-%d\n", (unsigned)p, (int)p->pid);
> +
> +if (array)
> +{
> +   array->nr_active--;
> +   list_del(&p->run_list);
> +   if (list_empty(array->queue + p->prio))
> +   __clear_bit(p->prio, array->bitmap);
> +}
>  }
>
>
> static void deactivate_task(struct task_struct *p, runqueue_t *rq)
>  {
> -rq->nr_running--;
> -if (p->state == TASK_UNINTERRUPTIBLE)
> -rq->nr_uninterruptible++;
> -dequeue_task(p, p->array);
> -p->array = NULL;
> +if (!p->array)
> +   printk("WARN:  prio_array was NULL in deactivate task %08X"
> +  "pid-%d\n", (unsigned)p, (int)p->pid);
> +
> +if (p->array)
> +{
> +   rq->nr_running--;
> +   if (p->state == TASK_UNINTERRUPTIBLE)
> +   rq->nr_uninterruptible++;
> +   dequeue_task(p, p->array);
> +   p->array = NULL;
> +}
>  }
>

 I think a BUG_ON(!array) should be there to catch the call trace. I
think the real bug is somewhere along that call path. The code you added
only works around the problem superficially.
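
A userspace sketch of the idea, for illustration only. The kernel's BUG_ON()
oopses and prints a call trace, whereas the assert() stand-in here merely
aborts. The struct and function names are simplified placeholders, not the
scheduler's real types.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel macro: stop immediately if cond is true. */
#define BUG_ON_SKETCH(cond) assert(!(cond))

struct prio_array_sketch {
    unsigned int nr_active;
};

struct task_sketch {
    struct prio_array_sketch *array;
};

void dequeue_task_sketch(struct task_sketch *p, struct prio_array_sketch *array)
{
    (void)p;
    /* Fail loudly at the point of misuse instead of silently returning. */
    BUG_ON_SKETCH(!array);
    array->nr_active--;
}

void deactivate_task_sketch(struct task_sketch *p)
{
    dequeue_task_sketch(p, p->array);
    p->array = NULL;
}
```

The point of failing hard is that the trace identifies the caller that passed
NULL, which a silent early return would hide.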


Re: [PATCH 2.6] I2C: Drop I2C_DEVNAME and i2c_clientname

2005-08-31 Thread Mauro Carvalho Chehab
On Wed, 2005-08-31 at 13:56 -0700, Greg KH wrote:
> On Wed, Aug 31, 2005 at 12:34:58PM -0300, Mauro Carvalho Chehab wrote:
> > On Tue, 2005-08-30 at 23:20 +0200, Jean Delvare wrote:
> > > Hi Mauro,
> > > 
> > > > (...) it would be nice not to have a different I2C
> > > > API for every single 2.6 version :-) It would be nice to change I2C
> > > > API once and keep it stable for a while.
> > 
> > > The Linux 2.6 development model is designed around a relatively fast
> > > move from -mm to Linus' tree, which implies incremental changes all the
> > > time. I'm only doing that.
> > It is ok to change code, but, IMHO, API should be more stable.
> 
> I take it you have not read Documentation/stable_api_nonsense.txt yet?
No, I didn't :-)
> If not, please do, it shows that what you are asking for will not
> happen.
I was not asking for a 'stable' one, just a less variable one... anyway, I
can live with this policy ;-)
> 
> good luck,
> 
> greg k-h
> 
Cheers, 
Mauro.



Re: State of Linux graphics

2005-08-31 Thread Ian Romanick
Allen Akin wrote:
> On Wed, Aug 31, 2005 at 02:06:54PM -0700, Keith Packard wrote:
> | 
> | ...So far, 3D driver work has proceeded almost entirely on the
> | newest documented hardware that people could get. Going back and
> | spending months optimizing software 3D rendering code so that it works
> | as fast as software 2D code seems like a thankless task.
> 
> Jon's right about this:  If you can accelerate a given simple function
> (blending, say) for a 2D driver, you can accelerate that same function
> in a Mesa driver for a comparable amount of effort, and deliver a
> similar benefit to apps.  (More apps, in fact, since it helps
> OpenGL-based apps as well as Cairo-based apps.)

The difference is that there is a much larger number of state
combinations possible in OpenGL than in something stripped down for
"just 2D".  That can make it more difficult to know where to spend the
time tuning.  I've spent a fair amount of time looking at Mesa's texture
blending code, so I know this to be true.

The real route forward is to dig deeper into run-time code generation.
There are a large number of possible combinations, but they all look
pretty similar.  This is ideal for run-time code gen.  The problem is
that writing correct, tuned assembly for this stuff takes a pretty
experienced developer, and writing correct, tuned code generation
routines takes an even more experienced developer.  Experienced and more
experienced developers are, alas, in short supply.

BTW, Allen, when are you going to start writing code again? >:)

> So long as people are encouraged by word and deed to spend their time on
> "2D" drivers, Mesa drivers will be further starved for resources and the
> belief that OpenGL has nothing to offer "2D" apps will become
> self-fulfilling.


'mdio_bus_exit' in discarded section .text.exit

2005-08-31 Thread Peter Chubb

When building with  CONFIG_PHYLIB=y on Itanium, I see:
 `mdio_bus_exit' referenced in section `.init.text' of
drivers/built-in.o: defined in discarded section `.exit.text' of
drivers/built-in.o

I believe that mdio_bus_exit should not be declared __exit, because it
is referenced from __init sections in, say, phy_init().

Signed-off-by: Peter Chubb <[EMAIL PROTECTED]>

diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -170,7 +170,7 @@ int __init mdio_bus_init(void)
   return bus_register(&mdio_bus_type);
 }
 
-void __exit mdio_bus_exit(void)
+void mdio_bus_exit(void)
 {
bus_unregister(&mdio_bus_type);
 }


-- 
Dr Peter Chubb  http://www.gelato.unsw.edu.au  peterc AT gelato.unsw.edu.au
The technical we do immediately,  the political takes *forever*


Re: [linux-pm] PowerOP Take 2 1/3: ARM OMAP1 platform support

2005-08-31 Thread Todd Poynor

David Brownell wrote:

> Interesting.  I start to like this shape better; it moves more of the
> logic to operating point code, where it can make the sysfs interface
> talk in terms of meaningful abstractions, not cryptic numeric offsets.
> But it was odd to see the first patch be platform-specific support,
> rather than be a neutral framework into which platform-aware code plugs
> different kinds of things...


Since it is at a low layer below a number of possible interfaces, and 
since there is no generic processing performed at this low layer (it's 
pretty much set or get an opaque structure), there isn't any 
higher-layer framework to plug into at the moment.  If something like 
these abstractions of power parameters and operating points are felt to 
be a good foundation for a runtime power management stack then turning 
our attentions to the next layer up (perhaps cpufreq or a new 
embedded-oriented stack) would create that generic structure.


It's worth noting that newer embedded SOCs are coming up with such
complicated clocking structures and rules for setting and switching 
operating points that some silicon vendors are starting to provide code 
at approximately the PowerOP level for their platforms, to plug into 
different upper-layer power management stacks (and possibly different 
open source OSes).  So there may be some value to settling on common 
interfaces for this.



> One part I don't like is that the platform would be limited to tweaking
> a predefined set of fields in registers.  That seems insufficient for
> subsystems that may not be present on all boards.


Yes, the code currently assumes it would be tweaked for different 
variants of platforms, partly due to the difficulty of implementing a 
lean and mean way of integrating the different pieces.  It sounds like 
registering multiple handlers for multiple sets of power parameters may 
be in order, although a single opaque structure shared between upper 
layers and the handlers probably won't be sufficient any more.  If the 
operating point data structure basically goes away and sysfs becomes the 
preferred interface then it should be fairly straightforward to discover 
what PM capabilities are registered and to get/set the associated power 
param attributes.  Otherwise in-kernel interfaces might need some 
further thought to specify something that routes to the proper handler.


> Plus, to borrow some terms from cpufreq, it only facilitates "usermode"
> governor models, never "ondemand" or any other efficient quick-response
> adaptive algorithms.


The sysfs interface does not itself handle such schemes, but the PowerOP 
layer is fine with inserting beneath in-kernel algorithms.  Low-latency, 
very frequent adjustments to power parameters are very much in mind for 
what I'm trying to do, assuming embedded hardware will increasingly be 
able to take advantage of aggressive runtime power management for 
battery savings.  (Much of this is driven by how embedded hardware can 
most aggressively but usefully be power managed, and it would be nice to 
get those folks more involved.)  What DPM does with approximately the 
same type of interface is set up some operating points and policies for
which operating point is appropriate in which situations, and then kick 
off a kernel state machine that handles the transitions.


...

> Alternatively, the "thing" could implement some adaptive algorithm
> using local measurements, predictions, and feedback to adjust any
> platform power parameters dynamically.  Maybe it'd delegate management
> of the ARM clock to "cpufreq", and focus on managing power for other
> board components that might never get really reusable code.  Switching
> between operating points wouldn't require userspace instruction;
> call it a "dynamic operating point" selection model.


Interesting, although such close coordination of changing various clocks 
and voltages is required on some platforms that it would be hard to 
distribute it much among kernel components.  To some degree the above is 
how DPM functions: some policy instructions are sent to the kernel and 
the kernel switches operating points accordingly.  Something more 
flexible than operating points could be specified in the policy info, 
possibly even something as abstract as "battery low", pushing the 
interpretation of high-level power policy into kernel components instead 
of a userspace app giving the kernel low-level instructions.



> The DSP clock might benefit from some support though.  I've never
> much looked at this, beyond noting that SPUs on CELL should have
> similar issues.  Wouldn't it be nice to have "ondemand" style
> governors for DSPs or SPUs?  That's got to be easy. ;)


So far as I understand, Linux-coordinated power management of the DSP 
side of dual-core general-purpose + DSP platforms is often handled by a 
Linux driver that knows how to talk to whatever it is that runs on the 
DSP (such as via shared memory message libs from the silicon vendor). 
Soon the other core will be 

Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread jmerkey

Bernd,

It might be helpful for someone to look at these sections of code I had to
patch in 2.6.9. I discovered a case where the kernel scheduler will pass
NULL for the array argument when I started hitting the extreme upper range
(> 200 MB/s combined disk and LAN throughput). This was running with a
preemptible kernel and hyperthreading enabled.

The wheels come off in the kernel somewhere. I looked at later 2.6 kernels
and there have been some changes, but someone may get an "aha" from this
fix, if there is an underlying problem in the kernel.


Jeff


static void dequeue_task(struct task_struct *p, prio_array_t *array)
{
-array->nr_active--;
-list_del(&p->run_list);
-if (list_empty(array->queue + p->prio))
-__clear_bit(p->prio, array->bitmap);
+if (!array)
+   printk("WARN:  prio_array was NULL in dequeue task %08X"
+  "pid-%d\n", (unsigned)p, (int)p->pid);
+
+if (array)
+{
+   array->nr_active--;
+   list_del(&p->run_list);
+   if (list_empty(array->queue + p->prio))
+   __clear_bit(p->prio, array->bitmap);
+}
}


static void deactivate_task(struct task_struct *p, runqueue_t *rq)
{
-rq->nr_running--;
-if (p->state == TASK_UNINTERRUPTIBLE)
-rq->nr_uninterruptible++;
-dequeue_task(p, p->array);
-p->array = NULL;
+if (!p->array)
+   printk("WARN:  prio_array was NULL in deactivate task %08X"
+  "pid-%d\n", (unsigned)p, (int)p->pid);
+
+if (p->array)
+{
+   rq->nr_running--;
+   if (p->state == TASK_UNINTERRUPTIBLE)
+   rq->nr_uninterruptible++;
+   dequeue_task(p, p->array);
+   p->array = NULL;
+}
}




Re: [PATCH 0/18] Updates & bug fixes for iseries_veth network driver

2005-08-31 Thread Jeff Garzik

applied patches 1-18



Re: Linux 2.6.13

2005-08-31 Thread Greg KH
On Wed, Aug 31, 2005 at 12:41:03AM +0200, Henrik Persson wrote:
> Linus Torvalds wrote:
> > There it is. 
> > 
> > The most painful part of 2.6.13 is likely to be the fact that we made x86
> > use the generic PCI bus setup code for assigning unassigned resources.  
> > That uncovered rather a lot of nasty small details, but should also mean
> > that a lot of laptops in particular should be able to discover PCI devices
> > behind bridges that the BIOS hasn't set up.
> > 
> > We've hopefully fixed up all the problems that the longish -rc series
> > showed, and it shouldn't be that painful, but if you have device problems,
> > please make a report that at a minimum contains the unified diff of the
> > output of "lspci -vvx" running on 2.6.12 vs 2.6.13. That might give us
> > some clues.
> 
> Well. 2.6.13 won't boot if I have my Netgear WG511 in the cardbus slot.
> It boots just fine if it isn't inserted, though. If I insert it later
> on, the computer will freeze and won't respond, just like it does on boot.
> 
> 2.6.12.5 works just fine, and I just did make oldconfig and used the
> defaults (except for the hardware monitoring).
> 
> Suggestions, anyone?

Can you try the patch posted to lkml at:
http://marc.theaimsgroup.com/?l=linux-kernel&m=112541348008047&w=2
from Ivan to see if that helps this?

thanks,

greg k-h


Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread jmerkey

Bernd Eckenfels wrote:


> In article <[EMAIL PROTECTED]> you wrote:
> > I mean, nvidia people also use proprietary code in the kernel (probably
> > violating the GPL anyway) and don't do such things.
>
> The Linux kernel allows binary drivers, you just have to live with a limited
> number of exported symbols and that the kernel is tainted. Which basically
> means nobody sane can help you with corrupted kernel data structures.
>
> Bernd

Bernd,

Thanks for the accurate and reasonable response.  I object to the use of
the word "tainted".  This implies the binary code is somehow infringing.
I would suggest changing the word to "non-GPL" or "Vendor Supported" since
this is more accurate.  Just a suggestion.

Thanks

Jeff



Re: State of Linux graphics

2005-08-31 Thread Allen Akin
On Wed, Aug 31, 2005 at 02:06:54PM -0700, Keith Packard wrote:
| On Wed, 2005-08-31 at 13:06 -0700, Allen Akin wrote:
| > ...
| 
| Right, the goal is to have only one driver for the hardware, whether an
| X server for simple 2D only environments or a GL driver for 2D/3D
| environments. ...

I count two drivers there; I was hoping the goal was for one. :-)

|   ... I think the only questions here are about the road from
| where we are to that final goal.

Well there are other questions, including whether it's correct to
partition the world into "2D only" and "2D/3D" environments.  There are
many disadvantages and few advantages (that I can see) for doing so.

[I'm going to reorder your comments just a bit to clarify my replies; I
hope that's OK]

|... However, at the
| application level, GL is not a very friendly 2D application-level API.

The point of OpenGL is to expose what the vast majority of current
display hardware does well, and not a lot more.  So if a class of apps
isn't "happy" with the functionality that OpenGL provides, it won't be
happy with the functionality that any other low-level API provides.  The
problem lies with the hardware.

Conversely, if the apps aren't taking advantage of the functionality
OpenGL provides, they're not exploiting the opportunities the hardware
offers.  Of course I'm not saying all apps *must* use all of OpenGL;
simply that their developers should be aware of exactly what they're
leaving on the table.  It can make the difference between an app that's
run-of-the-mill and one that's outstanding.

"Friendliness" is another matter, and it makes a ton of sense to package
common functionality in an easier-to-use higher-level library that a lot
of apps can share.  In this discussion my concern isn't with Cairo, but
with the number and type of back-end APIs we (driver developers and
library developers and application developers) have to support.

| ... GL provides
| far more functionality than we need for 2D applications being designed
| and implemented today...

When I look at OpenGL, I see ways to:

Create geometric primitives
Specify how those primitives are transformed
Apply lighting to objects made of those primitives
Convert geometric primitives to images
Create images
Specify how those images are transformed
Determine which portions of images should be visible
Combine images
Manage the physical resources for implementing this stuff

With the exception of lighting, it seems to me that pretty much all of
that applies to today's "2D" apps.  It's just a myth that there's "far
more" functionality in OpenGL than 2D apps can use.  (Especially for
OpenGL ES, which eliminates legacy cruft from full OpenGL.)

|... picking the right subset and sticking to that is
| our current challenge.

That would be fine with me.  I'm more worried about what Render (plus
EXA?) represents -- a second development path with the actual costs and
opportunity costs I've mentioned before, and if apps become wedded to it
(rather than to a higher level like Cairo), a loss of opportunity to
exploit new features and better performance at the application level.

|  ...The integration of 2D and 3D acceleration into a
| single GL-based system will take longer, largely as we wait for the GL
| drivers to catch up to the requirements of the Xgl implementation that
| we already have.

Like Jon, I'm concerned that the focus on Render and EXA will
simultaneously take resources away from and reduce the motivation for
those drivers.

| I'm not sure we have any significant new extensions to create here;
| we've got a pretty good handle on how X maps to GL and it seems to work
| well enough with suitable existing extensions.

I'm glad to hear it, though a bit surprised.

| This will be an interesting area of research; right now, 2D applications
| are fairly sketchy about the structure of their UIs, so attempting to
| wrap them into more structured models will take some effort.

Game developers have done a surprising amount of work in this area, and
I know of one company deploying this sort of UI on graphics-accelerated
cell phones.  So some practical experience exists, and we should find a
way to tap into it.

| Certainly ensuring that cairo on glitz can be used to paint into an
| arbitrary GL context will go some ways in this direction.

Yep, that's essential.

| ...So far, 3D driver work has proceeded almost entirely on the
| newest documented hardware that people could get. Going back and
| spending months optimizing software 3D rendering code so that it works
| as fast as software 2D code seems like a thankless task.

Jon's right about this:  If you can accelerate a given simple function
(blending, say) for a 2D driver, you can accelerate that same function
in a Mesa driver for a comparable 

[PATCH 9/18] iseries_veth: Use kobjects to track lifecycle of connection structs

2005-08-31 Thread Michael Ellerman
The iseries_veth driver can attach to multiple vlans, which correspond to
multiple net devices. However there is only 1 connection between each LPAR,
so the connection structure may be shared by multiple net devices.

This makes module removal messy, because we can't deallocate the connections
until we know there are no net devices still using them. The solution is to
use ref counts on the connections, so we can delete them (actually stop) as
soon as the ref count hits zero.

This patch fixes (part of) a bug we were seeing with IPv6 sending probes to
a dead LPAR, which would then hang us forever due to leftover skbs.
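
The lifecycle described above can be illustrated with a small userspace
sketch. Everything here is a simplified stand-in: the kernel's kobject uses
an atomic count and container_of(), while this version uses a plain int and
offsetof(), and all names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, non-atomic stand-in for the kobject reference count. */
struct ref_sketch {
    int count;
    void (*release)(struct ref_sketch *ref);
};

void ref_sketch_init(struct ref_sketch *ref,
                     void (*release)(struct ref_sketch *ref))
{
    ref->count = 1;          /* the reference held on behalf of the driver */
    ref->release = release;
}

void ref_sketch_get(struct ref_sketch *ref)
{
    assert(ref->count > 0);
    ref->count++;
}

void ref_sketch_put(struct ref_sketch *ref)
{
    assert(ref->count > 0);
    if (--ref->count == 0)
        ref->release(ref);   /* last user gone: tear the connection down */
}

/* A connection shared by several net devices. */
struct cnx_sketch {
    int stopped;
    struct ref_sketch ref;
};

void cnx_sketch_release(struct ref_sketch *ref)
{
    /* Recover the enclosing connection from the embedded member. */
    struct cnx_sketch *cnx =
        (struct cnx_sketch *)((char *)ref - offsetof(struct cnx_sketch, ref));
    cnx->stopped = 1;
}
```

Each net device that uses the connection would take a reference and drop it
on removal. The release callback runs only when the final reference goes away.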

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |  121 ++---
 1 files changed, 83 insertions(+), 38 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -129,6 +129,7 @@ struct veth_lpar_connection {
int num_events;
struct VethCapData local_caps;
 
+   struct kobject kobject;
struct timer_list ack_timer;
 
spinlock_t lock;
@@ -171,6 +172,11 @@ static void veth_recycle_msg(struct veth
 static void veth_flush_pending(struct veth_lpar_connection *cnx);
 static void veth_receive(struct veth_lpar_connection *, struct VethLpEvent *);
 static void veth_timed_ack(unsigned long connectionPtr);
+static void veth_release_connection(struct kobject *kobject);
+
+static struct kobj_type veth_lpar_connection_ktype = {
+   .release= veth_release_connection
+};
 
 /*
  * Utility functions
@@ -611,7 +617,7 @@ static int veth_init_connection(u8 rlp)
 {
struct veth_lpar_connection *cnx;
struct veth_msg *msgs;
-   int i;
+   int i, rc;
 
if ( (rlp == this_lp)
 || ! HvLpConfig_doLpsCommunicateOnVirtualLan(this_lp, rlp) )
@@ -632,6 +638,14 @@ static int veth_init_connection(u8 rlp)
 
veth_cnx[rlp] = cnx;
 
+   /* This gets us 1 reference, which is held on behalf of the driver
+* infrastructure. It's released at module unload. */
+   kobject_init(&cnx->kobject);
+   cnx->kobject.ktype = &veth_lpar_connection_ktype;
+   rc = kobject_set_name(&cnx->kobject, "cnx%.2d", rlp);
+   if (rc != 0)
+   return rc;
+
msgs = kmalloc(VETH_NUMBUFFERS * sizeof(struct veth_msg), GFP_KERNEL);
if (! msgs) {
veth_error("Can't allocate buffers for LPAR %d.\n", rlp);
@@ -660,11 +674,9 @@ static int veth_init_connection(u8 rlp)
return 0;
 }
 
-static void veth_stop_connection(u8 rlp)
+static void veth_stop_connection(struct veth_lpar_connection *cnx)
 {
-   struct veth_lpar_connection *cnx = veth_cnx[rlp];
-
-   if (! cnx)
+   if (!cnx)
return;
 
spin_lock_irq(&cnx->lock);
@@ -685,11 +697,9 @@ static void veth_stop_connection(u8 rlp)
flush_scheduled_work();
 }
 
-static void veth_destroy_connection(u8 rlp)
+static void veth_destroy_connection(struct veth_lpar_connection *cnx)
 {
-   struct veth_lpar_connection *cnx = veth_cnx[rlp];
-
-   if (! cnx)
+   if (!cnx)
return;
 
if (cnx->num_events > 0)
@@ -704,8 +714,16 @@ static void veth_destroy_connection(u8 r
  NULL, NULL);
 
kfree(cnx->msgs);
+   veth_cnx[cnx->remote_lp] = NULL;
kfree(cnx);
-   veth_cnx[rlp] = NULL;
+}
+
+static void veth_release_connection(struct kobject *kobj)
+{
+   struct veth_lpar_connection *cnx;
+   cnx = container_of(kobj, struct veth_lpar_connection, kobject);
+   veth_stop_connection(cnx);
+   veth_destroy_connection(cnx);
 }
 
 /*
@@ -1349,15 +1367,31 @@ static void veth_timed_ack(unsigned long
 
 static int veth_remove(struct vio_dev *vdev)
 {
-   int i = vdev->unit_address;
+   struct veth_lpar_connection *cnx;
struct net_device *dev;
+   struct veth_port *port;
+   int i;
 
-   dev = veth_dev[i];
-   if (dev != NULL) {
-   veth_dev[i] = NULL;
-   unregister_netdev(dev);
-   free_netdev(dev);
+   dev = veth_dev[vdev->unit_address];
+
+   if (! dev)
+   return 0;
+
+   port = netdev_priv(dev);
+
+   for (i = 0; i < HVMAXARCHITECTEDLPS; i++) {
+   cnx = veth_cnx[i];
+
+   if (cnx && (port->lpar_map & (1 << i))) {
+   /* Drop our reference to connections on our VLAN */
+   kobject_put(&cnx->kobject);
+   }
}
+
+   veth_dev[vdev->unit_address] = NULL;
+   unregister_netdev(dev);
+   free_netdev(dev);
+
return 0;
 }
 
@@ -1365,6 +1399,7 @@ static int veth_probe(struct vio_dev *vd
 {
int i = vdev->unit_address;
struct net_device *dev;
+   struct veth_port *port;
 
dev = veth_probe_one(i, &vdev->dev);
   

[PATCH 13/18] iseries_veth: Fix bogus counting of TX errors

2005-08-31 Thread Michael Ellerman
There are a number of problems with the way iseries_veth counts TX errors.

Firstly it counts conditions which aren't really errors as TX errors. This
includes if we don't have a connection struct for the other LPAR, or if the
other LPAR is currently down (or just doesn't want to talk to us). Neither
of these should count as TX errors.

Secondly, it counts one TX error for each LPAR that fails to accept the packet.
This can lead to TX error counts higher than the total number of packets sent
through the interface. This is confusing for users.

This patch fixes that behaviour. The non-error conditions are no longer
counted, and we introduce a new and I think saner meaning to the TX counts.

If a packet is successfully transmitted to any LPAR then it is transmitted
and tx_packets is incremented by 1.

If there is an error transmitting a packet to any LPAR then that is counted
as one error, ie. tx_errors is incremented by 1.
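
As an illustration of the counting rule just described, here is a userspace
sketch. The struct, the function name, and the results[] convention (0 means
the LPAR accepted the packet) are assumptions made for the example, not the
driver's code.

```c
#include <assert.h>
#include <stddef.h>

struct tx_stats_sketch {
    unsigned long tx_packets;
    unsigned long tx_bytes;
    unsigned long tx_errors;
};

/*
 * results[i] is 0 if transmission to the i-th targeted LPAR succeeded and
 * nonzero if it failed. One packet yields at most one tx_packets increment
 * and at most one tx_errors increment, however many LPARs were targeted.
 */
void account_tx_sketch(struct tx_stats_sketch *stats,
                       const int *results, size_t n, unsigned long len)
{
    int success = 0, error = 0;

    assert(results != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (results[i] == 0)
            success = 1;
        else
            error = 1;
    }

    if (error)
        stats->tx_errors++;
    if (success) {
        stats->tx_packets++;
        stats->tx_bytes += len;
    }
}
```

Under this rule tx_errors can never exceed the number of packets handed to
the interface, which addresses the second problem listed above.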

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   47 ++---
 1 files changed, 19 insertions(+), 28 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -938,31 +938,25 @@ static int veth_transmit_to_one(struct s
struct veth_port *port = (struct veth_port *) dev->priv;
HvLpEvent_Rc rc;
struct veth_msg *msg = NULL;
-   int err = 0;
unsigned long flags;
 
-   if (! cnx) {
-   port->stats.tx_errors++;
-   dev_kfree_skb(skb);
+   if (! cnx)
return 0;
-   }
 
spin_lock_irqsave(&cnx->lock, flags);
 
if (! (cnx->state & VETH_STATE_READY))
-   goto drop;
+   goto no_error;
 
-   if ((skb->len - 14) > VETH_MAX_MTU)
+   if ((skb->len - ETH_HLEN) > VETH_MAX_MTU)
goto drop;
 
msg = veth_stack_pop(cnx);
-
-   if (! msg) {
-   err = 1;
+   if (! msg)
goto drop;
-   }
 
msg->in_use = 1;
+   msg->skb = skb_get(skb);
 
msg->data.addr[0] = dma_map_single(port->dev, skb->data,
skb->len, DMA_TO_DEVICE);
@@ -970,9 +964,6 @@ static int veth_transmit_to_one(struct s
if (dma_mapping_error(msg->data.addr[0]))
goto recycle_and_drop;
 
-   /* Is it really necessary to check the length and address
-* fields of the first entry here? */
-   msg->skb = skb;
msg->dev = port->dev;
msg->data.len[0] = skb->len;
msg->data.eofmask = 1 << VETH_EOF_SHIFT;
@@ -992,43 +983,43 @@ static int veth_transmit_to_one(struct s
if (veth_stack_is_empty(cnx))
veth_stop_queues(cnx);
 
+ no_error:
spin_unlock_irqrestore(&cnx->lock, flags);
return 0;
 
  recycle_and_drop:
-   /* we free the skb below, so tell veth_recycle_msg() not to. */
-   msg->skb = NULL;
veth_recycle_msg(cnx, msg);
  drop:
-   port->stats.tx_errors++;
-   dev_kfree_skb(skb);
spin_unlock_irqrestore(&cnx->lock, flags);
-   return err;
+   return 1;
 }
 
-static HvLpIndexMap veth_transmit_to_many(struct sk_buff *skb,
+static void veth_transmit_to_many(struct sk_buff *skb,
  HvLpIndexMap lpmask,
  struct net_device *dev)
 {
struct veth_port *port = (struct veth_port *) dev->priv;
-   int i;
-   int rc;
+   int i, success, error;
+
+   success = error = 0;
 
for (i = 0; i < HVMAXARCHITECTEDLPS; i++) {
if ((lpmask & (1 << i)) == 0)
continue;
 
-   rc = veth_transmit_to_one(skb_get(skb), i, dev);
-   if (! rc)
-   lpmask &= ~(1stats.tx_packets++;
port->stats.tx_bytes += skb->len;
}
-
-   return lpmask;
 }
 
 static int veth_start_xmit(struct sk_buff *skb, struct net_device *dev)


[PATCH 11/18] iseries_veth: Add a per-connection ack timer

2005-08-31 Thread Michael Ellerman
Currently the iseries_veth driver contravenes the specification in
Documentation/networking/driver.txt, in that if packets are not acked by
the other LPAR they will sit around forever.

This patch adds a per-connection timer which fires if we've had no acks for
five seconds. This is superior to the generic TX timer because it catches
the case of a small number of packets being sent and never acked.

This fixes a bug we were seeing on real systems, where some IPv6 neighbour
discovery packets would not be acked and then prevent the module from being
removed, due to skbs lying around.
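The check the timer has to make can be sketched in userspace C. This is only a model under assumptions: the field names follow the struct additions in the diff below, but tick_after() and cnx_should_reset() are hypothetical helpers written for illustration, and the real timer handler lies beyond the part of the patch shown here.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one connection; field names follow the patch. */
struct cnx_model {
	unsigned long last_contact;	/* tick of the last ack or send */
	unsigned long reset_timeout;	/* ticks without contact before a reset */
	int outstanding_tx;		/* packets sent but not yet acked */
};

/* Wrap-safe "a is later than b", in the style of the kernel's time_after(). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Returns true if the timer handler should reset the connection. */
static bool cnx_should_reset(const struct cnx_model *cnx, unsigned long now)
{
	if (cnx->outstanding_tx <= 0)
		return false;	/* nothing is waiting for an ack */

	return tick_after(now, cnx->last_contact + cnx->reset_timeout);
}
```

Because the decision depends on outstanding_tx and last_contact rather than on queue state, even a single unacked packet eventually triggers a reset, which is the case the generic TX timer misses.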

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   75 +
 1 files changed, 69 insertions(+), 6 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -132,6 +132,11 @@ struct veth_lpar_connection {
struct kobject kobject;
struct timer_list ack_timer;
 
+   struct timer_list reset_timer;
+   unsigned int reset_timeout;
+   unsigned long last_contact;
+   int outstanding_tx;
+
spinlock_t lock;
unsigned long state;
HvLpInstanceId src_inst;
@@ -171,8 +176,9 @@ static int veth_start_xmit(struct sk_buf
 static void veth_recycle_msg(struct veth_lpar_connection *, struct veth_msg *);
 static void veth_flush_pending(struct veth_lpar_connection *cnx);
 static void veth_receive(struct veth_lpar_connection *, struct VethLpEvent *);
-static void veth_timed_ack(unsigned long connectionPtr);
 static void veth_release_connection(struct kobject *kobject);
+static void veth_timed_ack(unsigned long ptr);
+static void veth_timed_reset(unsigned long ptr);
 
 static struct kobj_type veth_lpar_connection_ktype = {
.release= veth_release_connection
@@ -360,7 +366,7 @@ static void veth_handle_int(struct VethL
HvLpIndex rlp = event->base_event.xSourceLp;
struct veth_lpar_connection *cnx = veth_cnx[rlp];
unsigned long flags;
-   int i;
+   int i, acked = 0;
 
BUG_ON(! cnx);
 
@@ -374,13 +380,22 @@ static void veth_handle_int(struct VethL
break;
case VethEventTypeFramesAck:
spin_lock_irqsave(&cnx->lock, flags);
+
for (i = 0; i < VETH_MAX_ACKS_PER_MSG; ++i) {
u16 msgnum = event->u.frames_ack_data.token[i];
 
-   if (msgnum < VETH_NUMBUFFERS)
+   if (msgnum < VETH_NUMBUFFERS) {
veth_recycle_msg(cnx, cnx->msgs + msgnum);
+   cnx->outstanding_tx--;
+   acked++;
+   }
}
+
+   if (acked > 0)
+   cnx->last_contact = jiffies;
+
spin_unlock_irqrestore(&cnx->lock, flags);
+
veth_flush_pending(cnx);
break;
case VethEventTypeFrames:
@@ -454,8 +469,6 @@ static void veth_statemachine(void *p)
 
  restart:
if (cnx->state & VETH_STATE_RESET) {
-   int i;
-
if (cnx->state & VETH_STATE_OPEN)
HvCallEvent_closeLpEventPath(cnx->remote_lp,
 HvLpEvent_Type_VirtualLan);
@@ -474,15 +487,20 @@ static void veth_statemachine(void *p)
| VETH_STATE_SENTCAPACK | VETH_STATE_READY);
 
/* Clean up any leftover messages */
-   if (cnx->msgs)
+   if (cnx->msgs) {
+   int i;
for (i = 0; i < VETH_NUMBUFFERS; ++i)
veth_recycle_msg(cnx, cnx->msgs + i);
+   }
+   cnx->outstanding_tx = 0;
 
/* Drop the lock so we can do stuff that might sleep or
 * take other locks. */
spin_unlock_irq(&cnx->lock);
 
del_timer_sync(&cnx->ack_timer);
+   del_timer_sync(&cnx->reset_timer);
+
veth_flush_pending(cnx);
 
spin_lock_irq(&cnx->lock);
@@ -631,9 +649,16 @@ static int veth_init_connection(u8 rlp)
cnx->remote_lp = rlp;
spin_lock_init(&cnx->lock);
INIT_WORK(&cnx->statemachine_wq, veth_statemachine, cnx);
+
init_timer(&cnx->ack_timer);
cnx->ack_timer.function = veth_timed_ack;
cnx->ack_timer.data = (unsigned long) cnx;
+
+   init_timer(&cnx->reset_timer);
+   cnx->reset_timer.function = veth_timed_reset;
+   cnx->reset_timer.data = (unsigned long) cnx;
+   cnx->reset_timeout = 5 * HZ * (VETH_ACKTIMEOUT / 100);
+
memset(&cnx->pending_acks, 0xff, sizeof (cnx->pending_acks));
 
veth_cnx[rlp] = cnx;
@@ -948,6 +973,13 @@ static int veth_transmit_to_one(struct s
if (rc != HvLpEvent_Rc_Good)
goto 

[PATCH 14/18] iseries_veth: Add sysfs support for connection structs

2005-08-31 Thread Michael Ellerman
To aid in field debugging, add sysfs support for iseries_veth's connection
structures. At the moment this is all read-only; however, we could think about
adding write support for some attributes in future.
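The dispatch pattern the patch relies on (a generic show routine that recovers the connection and the typed attribute from the structures embedded in them) can be sketched in userspace C. Everything prefixed model_ is a hypothetical stand-in, not the kernel's kobject or sysfs API, and the show callback here takes a buffer length that the driver's version does not.

```c
#include <stddef.h>
#include <stdio.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_kobject { int unused; };
struct model_attribute { const char *name; };

/* The connection embeds its kobject, as veth_lpar_connection does. */
struct model_cnx {
	int outstanding_tx;
	struct model_kobject kobject;
};

/* A typed attribute wraps the generic one and adds a typed callback. */
struct model_cnx_attribute {
	struct model_attribute attr;
	int (*show)(struct model_cnx *cnx, char *buf, size_t len);
};

static int outstanding_tx_show(struct model_cnx *cnx, char *buf, size_t len)
{
	return snprintf(buf, len, "%d\n", cnx->outstanding_tx);
}

static struct model_cnx_attribute model_attr_outstanding_tx = {
	.attr = { .name = "outstanding_tx" },
	.show = outstanding_tx_show,
};

/* Generic entry point: it only sees the embedded kobject and attribute. */
static int model_attribute_show(struct model_kobject *kobj,
				struct model_attribute *attr,
				char *buf, size_t len)
{
	struct model_cnx_attribute *cnx_attr =
		model_container_of(attr, struct model_cnx_attribute, attr);
	struct model_cnx *cnx =
		model_container_of(kobj, struct model_cnx, kobject);

	if (!cnx_attr->show)
		return -1;	/* the driver returns -EIO in this case */

	return cnx_attr->show(cnx, buf, len);
}
```

The generic layer never needs to know the connection type; each attribute carries its own typed callback, which is what lets one sysfs_ops structure serve every attribute in the list.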

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   94 +++--
 1 files changed, 90 insertions(+), 4 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -182,10 +182,6 @@ static void veth_release_connection(stru
 static void veth_timed_ack(unsigned long ptr);
 static void veth_timed_reset(unsigned long ptr);
 
-static struct kobj_type veth_lpar_connection_ktype = {
-   .release= veth_release_connection
-};
-
 /*
  * Utility functions
  */
@@ -280,6 +276,81 @@ static int veth_allocate_events(HvLpInde
 }
 
 /*
+ * sysfs support
+ */
+
+struct veth_cnx_attribute {
+   struct attribute attr;
+   ssize_t (*show)(struct veth_lpar_connection *, char *buf);
+   ssize_t (*store)(struct veth_lpar_connection *, const char *buf);
+};
+
+static ssize_t veth_cnx_attribute_show(struct kobject *kobj,
+   struct attribute *attr, char *buf)
+{
+   struct veth_cnx_attribute *cnx_attr;
+   struct veth_lpar_connection *cnx;
+
+   cnx_attr = container_of(attr, struct veth_cnx_attribute, attr);
+   cnx = container_of(kobj, struct veth_lpar_connection, kobject);
+
+   if (!cnx_attr->show)
+   return -EIO;
+
+   return cnx_attr->show(cnx, buf);
+}
+
+#define CUSTOM_CNX_ATTR(_name, _format, _expression)   \
+static ssize_t _name##_show(struct veth_lpar_connection *cnx, char *buf)\
+{  \
+   return sprintf(buf, _format, _expression);  \
+}  \
+struct veth_cnx_attribute veth_cnx_attr_##_name = __ATTR_RO(_name)
+
+#define SIMPLE_CNX_ATTR(_name) \
+   CUSTOM_CNX_ATTR(_name, "%lu\n", (unsigned long)cnx->_name)
+
+SIMPLE_CNX_ATTR(outstanding_tx);
+SIMPLE_CNX_ATTR(remote_lp);
+SIMPLE_CNX_ATTR(num_events);
+SIMPLE_CNX_ATTR(src_inst);
+SIMPLE_CNX_ATTR(dst_inst);
+SIMPLE_CNX_ATTR(num_pending_acks);
+SIMPLE_CNX_ATTR(num_ack_events);
+CUSTOM_CNX_ATTR(ack_timeout, "%d\n", jiffies_to_msecs(cnx->ack_timeout));
+CUSTOM_CNX_ATTR(reset_timeout, "%d\n", jiffies_to_msecs(cnx->reset_timeout));
+CUSTOM_CNX_ATTR(state, "0x%.4lX\n", cnx->state);
+CUSTOM_CNX_ATTR(last_contact, "%d\n", cnx->last_contact ?
+   jiffies_to_msecs(jiffies - cnx->last_contact) : 0);
+
+#define GET_CNX_ATTR(_name)(&veth_cnx_attr_##_name.attr)
+
+static struct attribute *veth_cnx_default_attrs[] = {
+   GET_CNX_ATTR(outstanding_tx),
+   GET_CNX_ATTR(remote_lp),
+   GET_CNX_ATTR(num_events),
+   GET_CNX_ATTR(reset_timeout),
+   GET_CNX_ATTR(last_contact),
+   GET_CNX_ATTR(state),
+   GET_CNX_ATTR(src_inst),
+   GET_CNX_ATTR(dst_inst),
+   GET_CNX_ATTR(num_pending_acks),
+   GET_CNX_ATTR(num_ack_events),
+   GET_CNX_ATTR(ack_timeout),
+   NULL
+};
+
+static struct sysfs_ops veth_cnx_sysfs_ops = {
+   .show = veth_cnx_attribute_show
+};
+
+static struct kobj_type veth_lpar_connection_ktype = {
+   .release= veth_release_connection,
+   .sysfs_ops  = &veth_cnx_sysfs_ops,
+   .default_attrs  = veth_cnx_default_attrs
+};
+
+/*
  * LPAR connection code
  */
 
@@ -1493,6 +1564,8 @@ void __exit veth_module_cleanup(void)
if (!cnx)
continue;
 
+   /* Remove the connection from sysfs */
+   kobject_del(&cnx->kobject);
/* Drop the driver's reference to the connection */
kobject_put(&cnx->kobject);
}
@@ -1523,6 +1596,19 @@ int __init veth_module_init(void)
if (rc != 0)
goto error;
 
+   for (i = 0; i < HVMAXARCHITECTEDLPS; ++i) {
+   struct kobject *kobj;
+
+   if (!veth_cnx[i])
+   continue;
+
+   kobj = &veth_cnx[i]->kobject;
+   kobj->parent = &veth_driver.driver.kobj;
+   /* If the add fails, complain but otherwise continue */
+   if (0 != kobject_add(kobj))
+   veth_error("cnx %d: Failed adding to sysfs.\n", i);
+   }
+
return 0;
 
 error:


[PATCH 7/18] iseries_veth: Only call dma_unmap_single() if dma_map_single() succeeded

2005-08-31 Thread Michael Ellerman
The iseries_veth driver unconditionally calls dma_unmap_single() even
when the corresponding dma_map_single() may have failed.

Rework the code a bit to keep the return value from dma_map_single()
around, and then check if it's a dma_mapping_error() before we do
the dma_unmap_single().
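The idea can be modelled in userspace C. This is a sketch under assumptions: MODEL_DMA_ERROR, struct msg_model and the model_ helpers are hypothetical stand-ins, and a counter takes the place of the real dma_unmap_single() call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cookie meaning "the mapping failed". */
#define MODEL_DMA_ERROR ((uint32_t)0xffffffffu)

struct msg_model {
	uint32_t addr;		/* result of the map call, stored even on failure */
	int in_use;
	int unmap_calls;	/* how many times we really unmapped */
};

static bool model_mapping_error(uint32_t addr)
{
	return addr == MODEL_DMA_ERROR;
}

/* Recycle a message: only undo the mapping if it actually succeeded. */
static void model_recycle(struct msg_model *msg)
{
	if (!msg->in_use)
		return;

	msg->in_use = 0;

	if (!model_mapping_error(msg->addr))
		msg->unmap_calls++;	/* stands in for dma_unmap_single() */
}
```

Storing the map result in the message itself means the recycle path needs no extra flag: the stored address doubles as the record of whether the mapping succeeded.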

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   17 -
 1 files changed, 8 insertions(+), 9 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -931,7 +931,6 @@ static int veth_transmit_to_one(struct s
struct veth_lpar_connection *cnx = veth_cnx[rlp];
struct veth_port *port = (struct veth_port *) dev->priv;
HvLpEvent_Rc rc;
-   u32 dma_address, dma_length;
struct veth_msg *msg = NULL;
int err = 0;
unsigned long flags;
@@ -959,20 +958,19 @@ static int veth_transmit_to_one(struct s
 
msg->in_use = 1;
 
-   dma_length = skb->len;
-   dma_address = dma_map_single(port->dev, skb->data,
-dma_length, DMA_TO_DEVICE);
+   msg->data.addr[0] = dma_map_single(port->dev, skb->data,
+   skb->len, DMA_TO_DEVICE);
 
-   if (dma_mapping_error(dma_address))
+   if (dma_mapping_error(msg->data.addr[0]))
goto recycle_and_drop;
 
/* Is it really necessary to check the length and address
 * fields of the first entry here? */
msg->skb = skb;
msg->dev = port->dev;
-   msg->data.addr[0] = dma_address;
-   msg->data.len[0] = dma_length;
+   msg->data.len[0] = skb->len;
msg->data.eofmask = 1 << VETH_EOF_SHIFT;
+
rc = veth_signaldata(cnx, VethEventTypeFrames, msg->token, &msg->data);
 
if (rc != HvLpEvent_Rc_Good)
@@ -1076,8 +1074,9 @@ static void veth_recycle_msg(struct veth
dma_address = msg->data.addr[0];
dma_length = msg->data.len[0];
 
-   dma_unmap_single(msg->dev, dma_address, dma_length,
-DMA_TO_DEVICE);
+   if (!dma_mapping_error(dma_address))
+   dma_unmap_single(msg->dev, dma_address, dma_length,
+   DMA_TO_DEVICE);
 
if (msg->skb) {
dev_kfree_skb_any(msg->skb);


[PATCH 12/18] iseries_veth: Simplify full-queue handling

2005-08-31 Thread Michael Ellerman
The iseries_veth driver often has multiple netdevices sending packets over
a single connection to another LPAR. If the bandwidth to the other LPAR is
exceeded, all the netdevices must have their queues stopped.

The current code achieves this by queueing one incoming skb on the
per-netdevice port structure. When the connection is able to send more packets
we iterate through the port structs and flush any packet that is queued,
as well as restarting the associated netdevice's queue.

This arrangement makes less sense now that we have per-connection TX timers,
rather than the per-netdevice generic TX timer.

The new code simply detects when one of the connections is full, and stops
the queue of all associated netdevices. Then when a packet is acked on that
connection (ie. there is space again) all the queues are woken up.
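The stop/wake rule described above can be sketched as a small userspace model. The names are hypothetical: one boolean stands in for stopping or waking the queue of every netdevice that uses the connection, and a counter stands in for the free message stack.

```c
#include <stdbool.h>

#define MODEL_NUMBUFFERS 2	/* tiny on purpose; the driver has far more */

struct cnx_queue_model {
	int free_msgs;		/* messages left on the free stack */
	bool queues_stopped;	/* stands in for stopping every port's queue */
};

static void model_cnx_init(struct cnx_queue_model *c)
{
	c->free_msgs = MODEL_NUMBUFFERS;
	c->queues_stopped = false;
}

/* Try to send one packet: returns 0 on success, 1 if it must be dropped. */
static int model_transmit(struct cnx_queue_model *c)
{
	if (c->free_msgs == 0)
		return 1;

	c->free_msgs--;

	/* We just took the last message, so stop the queues now. */
	if (c->free_msgs == 0)
		c->queues_stopped = true;

	return 0;
}

/* An ack returns one message to the stack and wakes the queues. */
static void model_ack(struct cnx_queue_model *c)
{
	if (c->free_msgs < MODEL_NUMBUFFERS)
		c->free_msgs++;

	c->queues_stopped = false;
}
```

Stopping at the moment the last message is taken, rather than when a send fails, is what makes the pending-skb bookkeeping in the old code unnecessary.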

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |  108 ++---
 1 files changed, 64 insertions(+), 44 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -158,10 +158,11 @@ struct veth_port {
u64 mac_addr;
HvLpIndexMap lpar_map;
 
-   spinlock_t pending_gate;
-   struct sk_buff *pending_skb;
-   HvLpIndexMap pending_lpmask;
+   /* queue_lock protects the stopped_map and dev's queue. */
+   spinlock_t queue_lock;
+   HvLpIndexMap stopped_map;
 
+   /* mcast_gate protects promiscuous, num_mcast & mcast_addr. */
rwlock_t mcast_gate;
int promiscuous;
int num_mcast;
@@ -174,7 +175,8 @@ static struct net_device *veth_dev[HVMAX
 
 static int veth_start_xmit(struct sk_buff *skb, struct net_device *dev);
 static void veth_recycle_msg(struct veth_lpar_connection *, struct veth_msg *);
-static void veth_flush_pending(struct veth_lpar_connection *cnx);
+static void veth_wake_queues(struct veth_lpar_connection *cnx);
+static void veth_stop_queues(struct veth_lpar_connection *cnx);
 static void veth_receive(struct veth_lpar_connection *, struct VethLpEvent *);
 static void veth_release_connection(struct kobject *kobject);
 static void veth_timed_ack(unsigned long ptr);
@@ -221,6 +223,12 @@ static inline struct veth_msg *veth_stac
return msg;
 }
 
+/* You must hold the connection's lock when you call this function. */
+static inline int veth_stack_is_empty(struct veth_lpar_connection *cnx)
+{
+   return cnx->msg_stack_head == NULL;
+}
+
 static inline HvLpEvent_Rc
 veth_signalevent(struct veth_lpar_connection *cnx, u16 subtype,
 HvLpEvent_AckInd ackind, HvLpEvent_AckType acktype,
@@ -391,12 +399,12 @@ static void veth_handle_int(struct VethL
}
}
 
-   if (acked > 0)
+   if (acked > 0) {
cnx->last_contact = jiffies;
+   veth_wake_queues(cnx);
+   }
 
spin_unlock_irqrestore(&cnx->lock, flags);
-
-   veth_flush_pending(cnx);
break;
case VethEventTypeFrames:
veth_receive(cnx, event);
@@ -492,7 +500,9 @@ static void veth_statemachine(void *p)
for (i = 0; i < VETH_NUMBUFFERS; ++i)
veth_recycle_msg(cnx, cnx->msgs + i);
}
+
cnx->outstanding_tx = 0;
+   veth_wake_queues(cnx);
 
/* Drop the lock so we can do stuff that might sleep or
 * take other locks. */
@@ -501,8 +511,6 @@ static void veth_statemachine(void *p)
del_timer_sync(&cnx->ack_timer);
del_timer_sync(&cnx->reset_timer);
 
-   veth_flush_pending(cnx);
-
spin_lock_irq(&cnx->lock);
 
if (cnx->state & VETH_STATE_RESET)
@@ -869,8 +877,9 @@ static struct net_device * __init veth_p
 
port = (struct veth_port *) dev->priv;
 
-   spin_lock_init(&port->pending_gate);
+   spin_lock_init(&port->queue_lock);
rwlock_init(&port->mcast_gate);
+   port->stopped_map = 0;
 
for (i = 0; i < HVMAXARCHITECTEDLPS; i++) {
HvLpVirtualLanIndexMap map;
@@ -980,6 +989,9 @@ static int veth_transmit_to_one(struct s
cnx->last_contact = jiffies;
cnx->outstanding_tx++;
 
+   if (veth_stack_is_empty(cnx))
+   veth_stop_queues(cnx);
+
spin_unlock_irqrestore(&cnx->lock, flags);
return 0;
 
@@ -1023,7 +1035,6 @@ static int veth_start_xmit(struct sk_buf
 {
unsigned char *frame = skb->data;
struct veth_port *port = (struct veth_port *) dev->priv;
-   unsigned long flags;
HvLpIndexMap lpmask;
 
if (! (frame[0] & 0x01)) {
@@ -1040,27 +1051,9 @@ static int veth_start_xmit(struct sk_buf
lpmask = port->lpar_map;
}
 
-   spin_lock_irqsave(>pending_gate, 

[PATCH 8/18] iseries_veth: Make init_connection() & destroy_connection() symmetrical

2005-08-31 Thread Michael Ellerman
This patch makes veth_init_connection() and veth_destroy_connection()
symmetrical in that they allocate/deallocate the same data.

Currently if there's an error while initialising connections (ie. ENOMEM)
we call veth_module_cleanup(), however this will oops because we call
driver_unregister() before we've called driver_register(). I've never seen
this actually happen though.

So instead we explicitly call veth_destroy_connection() for each connection;
any that have been set up will be deallocated.

We also fix a potential leak if vio_register_driver() fails.
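The unwinding pattern can be sketched in userspace C. This is a model under assumptions: the model_ names are hypothetical, malloc() stands in for the per-connection allocations, and the fail_at parameter exists only to simulate an ENOMEM at a chosen index.

```c
#include <stdlib.h>

#define MODEL_MAX_CNX 4

static int *model_cnx[MODEL_MAX_CNX];

/* Set up slot i; fail_at lets a caller simulate ENOMEM at that index. */
static int model_init_connection(int i, int fail_at)
{
	if (i == fail_at)
		return -1;

	model_cnx[i] = malloc(sizeof(int));
	return model_cnx[i] ? 0 : -1;
}

/* Safe to call on a slot that was never set up. */
static void model_destroy_connection(int i)
{
	free(model_cnx[i]);
	model_cnx[i] = NULL;
}

static int model_module_init(int fail_at)
{
	int i, rc = 0;

	for (i = 0; i < MODEL_MAX_CNX; ++i) {
		rc = model_init_connection(i, fail_at);
		if (rc != 0)
			goto error;
	}

	return 0;

error:
	/* Destroy every slot; the ones never set up are simply skipped. */
	for (i = 0; i < MODEL_MAX_CNX; ++i)
		model_destroy_connection(i);

	return rc;
}

/* Count the slots that currently hold an allocation. */
static int model_live_connections(void)
{
	int i, n = 0;

	for (i = 0; i < MODEL_MAX_CNX; ++i)
		if (model_cnx[i])
			n++;

	return n;
}
```

Because destroy tolerates empty slots, the error path does not need to remember how far initialisation got, which is the practical benefit of keeping the two functions symmetrical.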

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   35 ++-
 1 files changed, 22 insertions(+), 13 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -683,6 +683,14 @@ static void veth_stop_connection(u8 rlp)
 
/* Wait for the state machine to run. */
flush_scheduled_work();
+}
+
+static void veth_destroy_connection(u8 rlp)
+{
+   struct veth_lpar_connection *cnx = veth_cnx[rlp];
+
+   if (! cnx)
+   return;
 
if (cnx->num_events > 0)
mf_deallocate_lp_events(cnx->remote_lp,
@@ -694,14 +702,6 @@ static void veth_stop_connection(u8 rlp)
  HvLpEvent_Type_VirtualLan,
  cnx->num_ack_events,
  NULL, NULL);
-}
-
-static void veth_destroy_connection(u8 rlp)
-{
-   struct veth_lpar_connection *cnx = veth_cnx[rlp];
-
-   if (! cnx)
-   return;
 
kfree(cnx->msgs);
kfree(cnx);
@@ -1441,15 +1441,24 @@ int __init veth_module_init(void)
 
for (i = 0; i < HVMAXARCHITECTEDLPS; ++i) {
rc = veth_init_connection(i);
-   if (rc != 0) {
-   veth_module_cleanup();
-   return rc;
-   }
+   if (rc != 0)
+   goto error;
}
 
HvLpEvent_registerHandler(HvLpEvent_Type_VirtualLan,
&veth_handle_event);
 
-   return vio_register_driver(&veth_driver);
+   rc = vio_register_driver(&veth_driver);
+   if (rc != 0)
+   goto error;
+
+   return 0;
+
+error:
+   for (i = 0; i < HVMAXARCHITECTEDLPS; ++i) {
+   veth_destroy_connection(i);
+   }
+
+   return rc;
 }
 module_init(veth_module_init);


[PATCH 10/18] iseries_veth: Remove TX timeout code

2005-08-31 Thread Michael Ellerman
The iseries_veth driver uses the generic TX timeout watchdog. A better
solution is in the works, so remove this code.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   48 -
 1 files changed, 48 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -830,49 +830,6 @@ static struct ethtool_ops ops = {
.get_link = veth_get_link,
 };
 
-static void veth_tx_timeout(struct net_device *dev)
-{
-   struct veth_port *port = (struct veth_port *)dev->priv;
-   struct net_device_stats *stats = &port->stats;
-   unsigned long flags;
-   int i;
-
-   stats->tx_errors++;
-
-   spin_lock_irqsave(&port->pending_gate, flags);
-
-   if (!port->pending_lpmask) {
-   spin_unlock_irqrestore(&port->pending_gate, flags);
-   return;
-   }
-
-   printk(KERN_WARNING "%s: Tx timeout!  Resetting lp connections: %08x\n",
-  dev->name, port->pending_lpmask);
-
-   for (i = 0; i < HVMAXARCHITECTEDLPS; i++) {
-   struct veth_lpar_connection *cnx = veth_cnx[i];
-
-   if (! (port->pending_lpmask & (1<<i)))
[...]
-   cnx->state |= VETH_STATE_RESET;
-   veth_kick_statemachine(cnx);
-   spin_unlock(&cnx->lock);
-   }
-
-   spin_unlock_irqrestore(&port->pending_gate, flags);
-}
-
 static struct net_device * __init veth_probe_one(int vlan, struct device *vdev)
 {
struct net_device *dev;
@@ -921,9 +878,6 @@ static struct net_device * __init veth_p
dev->set_multicast_list = veth_set_multicast_list;
SET_ETHTOOL_OPS(dev, &ops);
 
-   dev->watchdog_timeo = 2 * (VETH_ACKTIMEOUT * HZ / 100);
-   dev->tx_timeout = veth_tx_timeout;
-
SET_NETDEV_DEV(dev, vdev);
 
rc = register_netdev(dev);
@@ -1058,8 +1012,6 @@ static int veth_start_xmit(struct sk_buf
 
lpmask = veth_transmit_to_many(skb, lpmask, dev);
 
-   dev->trans_start = jiffies;
-
if (! lpmask) {
dev_kfree_skb(skb);
} else {


[PATCH 16/18] iseries_veth: Incorporate iseries_veth.h in iseries_veth.c

2005-08-31 Thread Michael Ellerman
iseries_veth.h is only used by iseries_veth.c, so merge the former into
the latter.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.h |   46 -
 drivers/net/iseries_veth.c |   42 +++--
 2 files changed, 40 insertions(+), 48 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -81,12 +81,50 @@
 
 #undef DEBUG
 
-#include "iseries_veth.h"
-
 MODULE_AUTHOR("Kyle Lucke <[EMAIL PROTECTED]>");
 MODULE_DESCRIPTION("iSeries Virtual ethernet driver");
 MODULE_LICENSE("GPL");
 
+#define VethEventTypeCap   (0)
+#define VethEventTypeFrames(1)
+#define VethEventTypeMonitor   (2)
+#define VethEventTypeFramesAck (3)
+
+#define VETH_MAX_ACKS_PER_MSG  (20)
+#define VETH_MAX_FRAMES_PER_MSG(6)
+
+struct VethFramesData {
+   u32 addr[VETH_MAX_FRAMES_PER_MSG];
+   u16 len[VETH_MAX_FRAMES_PER_MSG];
+   u32 eofmask;
+};
+#define VETH_EOF_SHIFT (32-VETH_MAX_FRAMES_PER_MSG)
+
+struct VethFramesAckData {
+   u16 token[VETH_MAX_ACKS_PER_MSG];
+};
+
+struct VethCapData {
+   u8 caps_version;
+   u8 rsvd1;
+   u16 num_buffers;
+   u16 ack_threshold;
+   u16 rsvd2;
+   u32 ack_timeout;
+   u32 rsvd3;
+   u64 rsvd4[3];
+};
+
+struct VethLpEvent {
+   struct HvLpEvent base_event;
+   union {
+   struct VethCapData caps_data;
+   struct VethFramesData frames_data;
+   struct VethFramesAckData frames_ack_data;
+   } u;
+
+};
+
 #define VETH_NUMBUFFERS(120)
 #define VETH_ACKTIMEOUT(100) /* microseconds */
 #define VETH_MAX_MCAST (12)
Index: veth-dev2/drivers/net/iseries_veth.h
===
--- veth-dev2.orig/drivers/net/iseries_veth.h
+++ /dev/null
@@ -1,46 +0,0 @@
-/* File veth.h created by Kyle A. Lucke on Mon Aug  7 2000. */
-
-#ifndef _ISERIES_VETH_H
-#define _ISERIES_VETH_H
-
-#define VethEventTypeCap   (0)
-#define VethEventTypeFrames(1)
-#define VethEventTypeMonitor   (2)
-#define VethEventTypeFramesAck (3)
-
-#define VETH_MAX_ACKS_PER_MSG  (20)
-#define VETH_MAX_FRAMES_PER_MSG(6)
-
-struct VethFramesData {
-   u32 addr[VETH_MAX_FRAMES_PER_MSG];
-   u16 len[VETH_MAX_FRAMES_PER_MSG];
-   u32 eofmask;
-};
-#define VETH_EOF_SHIFT (32-VETH_MAX_FRAMES_PER_MSG)
-
-struct VethFramesAckData {
-   u16 token[VETH_MAX_ACKS_PER_MSG];
-};
-
-struct VethCapData {
-   u8 caps_version;
-   u8 rsvd1;
-   u16 num_buffers;
-   u16 ack_threshold;
-   u16 rsvd2;
-   u32 ack_timeout;
-   u32 rsvd3;
-   u64 rsvd4[3];
-};
-
-struct VethLpEvent {
-   struct HvLpEvent base_event;
-   union {
-   struct VethCapData caps_data;
-   struct VethFramesData frames_data;
-   struct VethFramesAckData frames_ack_data;
-   } u;
-
-};
-
-#endif /* _ISERIES_VETH_H */


[PATCH 2/18] iseries_veth: Remove a FIXME WRT deletion of the ack_timer

2005-08-31 Thread Michael Ellerman
The iseries_veth driver has a timer which we use to send acks. When the
connection is reset or stopped we need to delete the timer.

Currently we only call del_timer() when resetting a connection, which means
the timer might run again while the connection is being set up again. As it
turns out that's ok, because the flags the timer consults have been reset.

It's cleaner though to call del_timer_sync() once we've dropped the lock.
The timer may still run between us dropping the lock and calling
del_timer_sync(), but as above that's ok.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   21 +
 1 files changed, 13 insertions(+), 8 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -450,13 +450,15 @@ static void veth_statemachine(void *p)
if (cnx->state & VETH_STATE_RESET) {
int i;
 
-   del_timer(&cnx->ack_timer);
-
if (cnx->state & VETH_STATE_OPEN)
HvCallEvent_closeLpEventPath(cnx->remote_lp,
 HvLpEvent_Type_VirtualLan);
 
-   /* reset ack data */
+   /*
+* Reset ack data. This prevents the ack_timer actually
+* doing anything, even if it runs one more time when
+* we drop the lock below.
+*/
memset(&cnx->pending_acks, 0xff, sizeof (cnx->pending_acks));
cnx->num_pending_acks = 0;
 
@@ -469,9 +471,16 @@ static void veth_statemachine(void *p)
if (cnx->msgs)
for (i = 0; i < VETH_NUMBUFFERS; ++i)
veth_recycle_msg(cnx, cnx->msgs + i);
+
+   /* Drop the lock so we can do stuff that might sleep or
+* take other locks. */
spin_unlock_irq(&cnx->lock);
+
+   del_timer_sync(&cnx->ack_timer);
veth_flush_pending(cnx);
+
spin_lock_irq(&cnx->lock);
+
if (cnx->state & VETH_STATE_RESET)
goto restart;
}
@@ -658,13 +667,9 @@ static void veth_stop_connection(u8 rlp)
veth_kick_statemachine(cnx);
spin_unlock_irq(&cnx->lock);
 
+   /* Wait for the state machine to run. */
flush_scheduled_work();
 
-   /* FIXME: not sure if this is necessary - will already have
-* been deleted by the state machine, just want to make sure
-* its not running any more */
-   del_timer_sync(&cnx->ack_timer);
-
if (cnx->num_events > 0)
mf_deallocate_lp_events(cnx->remote_lp,
  HvLpEvent_Type_VirtualLan,


[PATCH 6/18] iseries_veth: Replace lock-protected atomic with an ordinary variable

2005-08-31 Thread Michael Ellerman
The iseries_veth driver uses atomic ops to manipulate the in_use field of
one of its per-connection structures. However all references to the
flag occur while the connection's lock is held, so the atomic ops aren't
necessary.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -117,7 +117,7 @@ struct veth_msg {
struct veth_msg *next;
struct VethFramesData data;
int token;
-   unsigned long in_use;
+   int in_use;
struct sk_buff *skb;
struct device *dev;
 };
@@ -957,6 +957,8 @@ static int veth_transmit_to_one(struct s
goto drop;
}
 
+   msg->in_use = 1;
+
dma_length = skb->len;
dma_address = dma_map_single(port->dev, skb->data,
 dma_length, DMA_TO_DEVICE);
@@ -971,7 +973,6 @@ static int veth_transmit_to_one(struct s
msg->data.addr[0] = dma_address;
msg->data.len[0] = dma_length;
msg->data.eofmask = 1 << VETH_EOF_SHIFT;
-   set_bit(0, &(msg->in_use));
rc = veth_signaldata(cnx, VethEventTypeFrames, msg->token, &msg->data);
 
if (rc != HvLpEvent_Rc_Good)
@@ -981,10 +982,8 @@ static int veth_transmit_to_one(struct s
return 0;
 
  recycle_and_drop:
+   /* we free the skb below, so tell veth_recycle_msg() not to. */
msg->skb = NULL;
-   /* need to set in use to make veth_recycle_msg in case this
-* was a mapping failure */
-   set_bit(0, &msg->in_use);
veth_recycle_msg(cnx, msg);
  drop:
port->stats.tx_errors++;
@@ -1066,12 +1065,14 @@ static int veth_start_xmit(struct sk_buf
return 0;
 }
 
+/* You must hold the connection's lock when you call this function. */
 static void veth_recycle_msg(struct veth_lpar_connection *cnx,
 struct veth_msg *msg)
 {
u32 dma_address, dma_length;
 
-   if (test_and_clear_bit(0, &msg->in_use)) {
+   if (msg->in_use) {
+   msg->in_use = 0;
dma_address = msg->data.addr[0];
dma_length = msg->data.len[0];
 


[PATCH 1/18] iseries_veth: Cleanup error and debug messages

2005-08-31 Thread Michael Ellerman
Currently the iseries_veth driver prints the file name and line number in its
error messages. This isn't very useful for most users, so just print
"iseries_veth: message" instead.

 - convert uses of veth_printk() to veth_debug()/veth_error()/veth_info()
 - make terminology consistent, ie. always refer to LPAR not lpar
 - be consistent about printing return codes as %d not %x
 - make format strings fit in 80 columns
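The shape of the new macros can be tried out in userspace. The sketch below is an assumption-laden stand-in: it writes to a caller-supplied buffer with snprintf() instead of calling printk(), the model_ names are hypothetical, and it relies on string literal concatenation, so the format must be a literal.

```c
#include <stdio.h>

/* Prefix every message with the driver name instead of file and line. */
#define model_info(buf, len, ...) \
	snprintf(buf, len, "iseries_veth: " __VA_ARGS__)

#define model_error(buf, len, ...) \
	snprintf(buf, len, "iseries_veth: Error: " __VA_ARGS__)

/* Debug output compiles away unless MODEL_DEBUG is defined. */
#ifdef MODEL_DEBUG
#define model_debug(buf, len, ...) \
	snprintf(buf, len, "iseries_veth: " __VA_ARGS__)
#else
#define model_debug(buf, len, ...) 0
#endif
```

For example, model_error(buf, sizeof(buf), "Unknown ack type %d from LPAR %d.\n", 3, 1) leaves "iseries_veth: Error: Unknown ack type 3 from LPAR 1.\n" in buf.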

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   98 +++--
 1 files changed, 51 insertions(+), 47 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -79,6 +79,8 @@
 #include 
 #include 
 
+#undef DEBUG
+
 #include "iseries_veth.h"
 
 MODULE_AUTHOR("Kyle Lucke <[EMAIL PROTECTED]>");
@@ -176,11 +178,18 @@ static void veth_timed_ack(unsigned long
  * Utility functions
  */
 
-#define veth_printk(prio, fmt, args...) \
-   printk(prio "%s: " fmt, __FILE__, ## args)
+#define veth_info(fmt, args...) \
+   printk(KERN_INFO "iseries_veth: " fmt, ## args)
 
 #define veth_error(fmt, args...) \
-   printk(KERN_ERR "(%s:%3.3d) ERROR: " fmt, __FILE__, __LINE__ , ## args)
+   printk(KERN_ERR "iseries_veth: Error: " fmt, ## args)
+
+#ifdef DEBUG
+#define veth_debug(fmt, args...) \
+   printk(KERN_DEBUG "iseries_veth: " fmt, ## args)
+#else
+#define veth_debug(fmt, args...) do {} while (0)
+#endif
 
 static inline void veth_stack_push(struct veth_lpar_connection *cnx,
   struct veth_msg *msg)
@@ -278,7 +287,7 @@ static void veth_take_cap(struct veth_lp
  HvLpEvent_Type_VirtualLan);
 
if (cnx->state & VETH_STATE_GOTCAPS) {
-   veth_error("Received a second capabilities from lpar %d\n",
+   veth_error("Received a second capabilities from LPAR %d.\n",
   cnx->remote_lp);
event->base_event.xRc = HvLpEvent_Rc_BufferNotAvailable;
HvCallEvent_ackLpEvent((struct HvLpEvent *) event);
@@ -297,7 +306,7 @@ static void veth_take_cap_ack(struct vet
 
spin_lock_irqsave(&cnx->lock, flags);
if (cnx->state & VETH_STATE_GOTCAPACK) {
-   veth_error("Received a second capabilities ack from lpar %d\n",
+   veth_error("Received a second capabilities ack from LPAR %d.\n",
   cnx->remote_lp);
} else {
memcpy(&cnx->cap_ack_event, event,
@@ -314,8 +323,7 @@ static void veth_take_monitor_ack(struct
unsigned long flags;
 
spin_lock_irqsave(&cnx->lock, flags);
-   veth_printk(KERN_DEBUG, "Monitor ack returned for lpar %d\n",
-   cnx->remote_lp);
+   veth_debug("cnx %d: lost connection.\n", cnx->remote_lp);
cnx->state |= VETH_STATE_RESET;
veth_kick_statemachine(cnx);
spin_unlock_irqrestore(&cnx->lock, flags);
@@ -336,8 +344,8 @@ static void veth_handle_ack(struct VethL
veth_take_monitor_ack(cnx, event);
break;
default:
-   veth_error("Unknown ack type %d from lpar %d\n",
-  event->base_event.xSubtype, rlp);
+   veth_error("Unknown ack type %d from LPAR %d.\n",
+   event->base_event.xSubtype, rlp);
};
 }
 
@@ -373,8 +381,8 @@ static void veth_handle_int(struct VethL
veth_receive(cnx, event);
break;
default:
-   veth_error("Unknown interrupt type %d from lpar %d\n",
-  event->base_event.xSubtype, rlp);
+   veth_error("Unknown interrupt type %d from LPAR %d.\n",
+   event->base_event.xSubtype, rlp);
};
 }
 
@@ -400,8 +408,8 @@ static int veth_process_caps(struct veth
 || (remote_caps->ack_threshold > VETH_MAX_ACKS_PER_MSG)
 || (remote_caps->ack_threshold == 0)
 || (cnx->ack_timeout == 0) ) {
-   veth_error("Received incompatible capabilities from lpar %d\n",
-  cnx->remote_lp);
+   veth_error("Received incompatible capabilities from LPAR %d.\n",
+   cnx->remote_lp);
return HvLpEvent_Rc_InvalidSubtypeData;
}
 
@@ -418,8 +426,8 @@ static int veth_process_caps(struct veth
cnx->num_ack_events += num;
 
if (cnx->num_ack_events < num_acks_needed) {
-   veth_error("Couldn't allocate enough ack events for lpar %d\n",
-  cnx->remote_lp);
+   veth_error("Couldn't allocate enough ack events "
+   "for LPAR %d.\n", cnx->remote_lp);
 
return 

[PATCH 3/18] iseries_veth: Try to avoid pathological reset behaviour

2005-08-31 Thread Michael Ellerman
The iseries_veth driver contains a state machine which is used to manage
how connections are set up and negotiated between LPARs.

If one side of a connection resets for some reason, the two LPARs can get
stuck in a race to re-setup the connection. This can lead to the connection
being declared dead by one or both ends. In practice the connection is
declared dead by one or both ends approximately 8/10 times a connection is
reset, although it is rare for connections to be reset.

(an example here: http://michael.ellerman.id.au/files/misc/veth-trace.html)

The core of the problem is that the end that resets the connection doesn't
wait for the other end to become aware of the reset. So the resetting end
starts setting the connection back up, and then receives a reset from the
other end (which is the response to the initial reset). And so on.

We're severely limited in what we can do to fix this. The protocol between
LPARs is essentially fixed, as we have to interoperate with both OS/400
and old Linux drivers. Which also means we need a fix that only changes the
code on one end.

The only fix I've found, given that, is to just blindly sleep for a bit when
resetting the connection, in the hope that the other end will get itself
sorted out. Needless to say, I'd love it if someone has a better idea.

This does work: so far I've been unable to get it to break, whereas without
the fix a reset of one end will lead to a dead connection ~8/10 times.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   25 +++--
 1 files changed, 23 insertions(+), 2 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -324,8 +324,14 @@ static void veth_take_monitor_ack(struct
 
spin_lock_irqsave(&cnx->lock, flags);
veth_debug("cnx %d: lost connection.\n", cnx->remote_lp);
-   cnx->state |= VETH_STATE_RESET;
-   veth_kick_statemachine(cnx);
+
+   /* Avoid kicking the statemachine once we're shutdown.
+* It's unnecessary and it could break veth_stop_connection(). */
+
+   if (! (cnx->state & VETH_STATE_SHUTDOWN)) {
+   cnx->state |= VETH_STATE_RESET;
+   veth_kick_statemachine(cnx);
+   }
spin_unlock_irqrestore(&cnx->lock, flags);
 }
 
@@ -483,6 +489,12 @@ static void veth_statemachine(void *p)
 
if (cnx->state & VETH_STATE_RESET)
goto restart;
+
+   /* Hack, wait for the other end to reset itself. */
+   if (! (cnx->state & VETH_STATE_SHUTDOWN)) {
+   schedule_delayed_work(&cnx->statemachine_wq, 5 * HZ);
+   goto out;
+   }
}
 
if (cnx->state & VETH_STATE_SHUTDOWN)
@@ -667,6 +679,15 @@ static void veth_stop_connection(u8 rlp)
veth_kick_statemachine(cnx);
spin_unlock_irq(&cnx->lock);
 
+   /* There's a slim chance the reset code has just queued the
+* statemachine to run in five seconds. If so we need to cancel
+* that and requeue the work to run now. */
+   if (cancel_delayed_work(&cnx->statemachine_wq)) {
+   spin_lock_irq(&cnx->lock);
+   veth_kick_statemachine(cnx);
+   spin_unlock_irq(&cnx->lock);
+   }
+
/* Wait for the state machine to run. */
flush_scheduled_work();
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
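
The two decisions this patch adds can be sketched outside the kernel as plain functions. This is only an illustration under assumptions: the SKETCH_* names and bit values are hypothetical and are not the driver's definitions; only the five second delay and the "skip when shutting down" rule come from the patch above.

```c
/* Hypothetical state bits; the driver's real values are not shown in the patch. */
enum {
	SKETCH_STATE_RESET    = 1u << 0,
	SKETCH_STATE_SHUTDOWN = 1u << 1,
};

#define SKETCH_RESET_DELAY_SECS 5	/* mirrors the 5 * HZ delay in the patch */

/*
 * After a reset has been handled, decide how long to wait before the
 * state machine runs again. Returns 0 when a shutdown is in progress
 * (carry on immediately), otherwise the back-off in seconds.
 */
static unsigned int sketch_reset_backoff(unsigned int state)
{
	if (state & SKETCH_STATE_SHUTDOWN)
		return 0;
	return SKETCH_RESET_DELAY_SECS;
}

/*
 * Mirror of the veth_take_monitor_ack() change: a lost connection only
 * flags a reset if we are not already shutting down.
 */
static unsigned int sketch_lost_connection(unsigned int state)
{
	if (!(state & SKETCH_STATE_SHUTDOWN))
		state |= SKETCH_STATE_RESET;
	return state;
}
```

In the driver the back-off is realised by queueing delayed work rather than by returning a number; the sketch only captures the condition under which the delay applies.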


[PATCH 5/18] iseries_veth: Remove redundant message stack lock

2005-08-31 Thread Michael Ellerman
The iseries_veth driver keeps a stack of messages for each connection
and a lock to protect the stack. However there is also a per-connection lock
which makes the message stack lock redundant.

Remove the message stack lock and document the fact that callers of the
stack-manipulation functions must hold the connection's lock.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   12 +++-
 1 files changed, 3 insertions(+), 9 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -143,7 +143,6 @@ struct veth_lpar_connection {
struct VethCapData remote_caps;
u32 ack_timeout;
 
-   spinlock_t msg_stack_lock;
struct veth_msg *msg_stack_head;
 };
 
@@ -190,27 +189,23 @@ static void veth_timed_ack(unsigned long
 #define veth_debug(fmt, args...) do {} while (0)
 #endif
 
+/* You must hold the connection's lock when you call this function. */
 static inline void veth_stack_push(struct veth_lpar_connection *cnx,
   struct veth_msg *msg)
 {
-   unsigned long flags;
-
-   spin_lock_irqsave(&cnx->msg_stack_lock, flags);
msg->next = cnx->msg_stack_head;
cnx->msg_stack_head = msg;
-   spin_unlock_irqrestore(&cnx->msg_stack_lock, flags);
 }
 
+/* You must hold the connection's lock when you call this function. */
 static inline struct veth_msg *veth_stack_pop(struct veth_lpar_connection *cnx)
 {
-   unsigned long flags;
struct veth_msg *msg;
 
-   spin_lock_irqsave(&cnx->msg_stack_lock, flags);
msg = cnx->msg_stack_head;
if (msg)
cnx->msg_stack_head = cnx->msg_stack_head->next;
-   spin_unlock_irqrestore(&cnx->msg_stack_lock, flags);
+
return msg;
 }
 
@@ -645,7 +640,6 @@ static int veth_init_connection(u8 rlp)
 
cnx->msgs = msgs;
memset(msgs, 0, VETH_NUMBUFFERS * sizeof(struct veth_msg));
-   spin_lock_init(&cnx->msg_stack_lock);
 
for (i = 0; i < VETH_NUMBUFFERS; i++) {
msgs[i].token = i;
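
For readers without the driver source to hand, the resulting stack helpers amount to a plain intrusive LIFO list with no locking of their own. A minimal userspace sketch follows; struct sketch_msg and struct sketch_cnx are hypothetical stand-ins for struct veth_msg and struct veth_lpar_connection, and the locking contract is carried only by the comments, exactly as in the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct veth_msg and the connection struct. */
struct sketch_msg {
	struct sketch_msg *next;
	int token;
};

struct sketch_cnx {
	struct sketch_msg *msg_stack_head;	/* protected by the caller's lock */
};

/* Caller must hold the connection's lock. */
static void sketch_stack_push(struct sketch_cnx *cnx, struct sketch_msg *msg)
{
	msg->next = cnx->msg_stack_head;
	cnx->msg_stack_head = msg;
}

/* Caller must hold the connection's lock. Returns NULL when the stack is empty. */
static struct sketch_msg *sketch_stack_pop(struct sketch_cnx *cnx)
{
	struct sketch_msg *msg = cnx->msg_stack_head;

	if (msg)
		cnx->msg_stack_head = msg->next;
	return msg;
}
```

Messages come back in the reverse of the order in which they were pushed, and a pop on an empty stack yields NULL rather than faulting.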


[PATCH 17/18] iseries_veth: Remove studly caps from iseries_veth.c

2005-08-31 Thread Michael Ellerman
Having merged iseries_veth.h, let's remove some of the studly caps that came
with it.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   74 ++---
 1 files changed, 37 insertions(+), 37 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -85,26 +85,26 @@ MODULE_AUTHOR("Kyle Lucke <[EMAIL PROTECTED]
 MODULE_DESCRIPTION("iSeries Virtual ethernet driver");
 MODULE_LICENSE("GPL");
 
-#define VethEventTypeCap   (0)
-#define VethEventTypeFrames(1)
-#define VethEventTypeMonitor   (2)
-#define VethEventTypeFramesAck (3)
+#define VETH_EVENT_CAP (0)
+#define VETH_EVENT_FRAMES  (1)
+#define VETH_EVENT_MONITOR (2)
+#define VETH_EVENT_FRAMES_ACK  (3)
 
 #define VETH_MAX_ACKS_PER_MSG  (20)
 #define VETH_MAX_FRAMES_PER_MSG(6)
 
-struct VethFramesData {
+struct veth_frames_data {
u32 addr[VETH_MAX_FRAMES_PER_MSG];
u16 len[VETH_MAX_FRAMES_PER_MSG];
u32 eofmask;
 };
 #define VETH_EOF_SHIFT (32-VETH_MAX_FRAMES_PER_MSG)
 
-struct VethFramesAckData {
+struct veth_frames_ack_data {
u16 token[VETH_MAX_ACKS_PER_MSG];
 };
 
-struct VethCapData {
+struct veth_cap_data {
u8 caps_version;
u8 rsvd1;
u16 num_buffers;
@@ -115,12 +115,12 @@ struct VethCapData {
u64 rsvd4[3];
 };
 
-struct VethLpEvent {
+struct veth_lpevent {
struct HvLpEvent base_event;
union {
-   struct VethCapData caps_data;
-   struct VethFramesData frames_data;
-   struct VethFramesAckData frames_ack_data;
+   struct veth_cap_data caps_data;
+   struct veth_frames_data frames_data;
+   struct veth_frames_ack_data frames_ack_data;
} u;
 
 };
@@ -153,7 +153,7 @@ struct VethLpEvent {
 
 struct veth_msg {
struct veth_msg *next;
-   struct VethFramesData data;
+   struct veth_frames_data data;
int token;
int in_use;
struct sk_buff *skb;
@@ -165,7 +165,7 @@ struct veth_lpar_connection {
struct work_struct statemachine_wq;
struct veth_msg *msgs;
int num_events;
-   struct VethCapData local_caps;
+   struct veth_cap_data local_caps;
 
struct kobject kobject;
struct timer_list ack_timer;
@@ -179,12 +179,12 @@ struct veth_lpar_connection {
unsigned long state;
HvLpInstanceId src_inst;
HvLpInstanceId dst_inst;
-   struct VethLpEvent cap_event, cap_ack_event;
+   struct veth_lpevent cap_event, cap_ack_event;
u16 pending_acks[VETH_MAX_ACKS_PER_MSG];
u32 num_pending_acks;
 
int num_ack_events;
-   struct VethCapData remote_caps;
+   struct veth_cap_data remote_caps;
u32 ack_timeout;
 
struct veth_msg *msg_stack_head;
@@ -217,7 +217,7 @@ static int veth_start_xmit(struct sk_buf
 static void veth_recycle_msg(struct veth_lpar_connection *, struct veth_msg *);
 static void veth_wake_queues(struct veth_lpar_connection *cnx);
 static void veth_stop_queues(struct veth_lpar_connection *cnx);
-static void veth_receive(struct veth_lpar_connection *, struct VethLpEvent *);
+static void veth_receive(struct veth_lpar_connection *, struct veth_lpevent *);
 static void veth_release_connection(struct kobject *kobject);
 static void veth_timed_ack(unsigned long ptr);
 static void veth_timed_reset(unsigned long ptr);
@@ -308,7 +308,7 @@ static int veth_allocate_events(HvLpInde
struct veth_allocation vc = { COMPLETION_INITIALIZER(vc.c), 0 };
 
mf_allocate_lp_events(rlp, HvLpEvent_Type_VirtualLan,
-   sizeof(struct VethLpEvent), number,
+   sizeof(struct veth_lpevent), number,
_complete_allocation, );
wait_for_completion();
 
@@ -456,7 +456,7 @@ static inline void veth_kick_statemachin
 }
 
 static void veth_take_cap(struct veth_lpar_connection *cnx,
- struct VethLpEvent *event)
+ struct veth_lpevent *event)
 {
unsigned long flags;
 
@@ -481,7 +481,7 @@ static void veth_take_cap(struct veth_lp
 }
 
 static void veth_take_cap_ack(struct veth_lpar_connection *cnx,
- struct VethLpEvent *event)
+ struct veth_lpevent *event)
 {
unsigned long flags;
 
@@ -499,7 +499,7 @@ static void veth_take_cap_ack(struct vet
 }
 
 static void veth_take_monitor_ack(struct veth_lpar_connection *cnx,
- struct VethLpEvent *event)
+ struct veth_lpevent *event)
 {
unsigned long flags;
 
@@ -516,7 +516,7 @@ static void veth_take_monitor_ack(struct
spin_unlock_irqrestore(>lock, flags);
 }
 
-static void veth_handle_ack(struct VethLpEvent 

[PATCH 4/18] iseries_veth: Fix broken promiscuous handling

2005-08-31 Thread Michael Ellerman
Due to a logic bug, once promiscuous mode is enabled in the iseries_veth
driver it is never disabled.

The driver keeps two flags, promiscuous and all_mcast which have exactly the
same effect. This is because we only ever receive packets destined for us,
or multicast packets. So consolidate them into one promiscuous flag for
simplicity.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   16 +---
 1 files changed, 5 insertions(+), 11 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -159,7 +159,6 @@ struct veth_port {
 
rwlock_t mcast_gate;
int promiscuous;
-   int all_mcast;
int num_mcast;
u64 mcast_addr[VETH_MAX_MCAST];
 };
@@ -756,17 +755,15 @@ static void veth_set_multicast_list(stru
 
write_lock_irqsave(&port->mcast_gate, flags);
 
-   if (dev->flags & IFF_PROMISC) { /* set promiscuous mode */
-   printk(KERN_INFO "%s: Promiscuous mode enabled.\n",
-  dev->name);
+   if ((dev->flags & IFF_PROMISC) || (dev->flags & IFF_ALLMULTI) ||
+   (dev->mc_count > VETH_MAX_MCAST)) {
port->promiscuous = 1;
-   } else if ( (dev->flags & IFF_ALLMULTI)
-   || (dev->mc_count > VETH_MAX_MCAST) ) {
-   port->all_mcast = 1;
} else {
struct dev_mc_list *dmi = dev->mc_list;
int i;
 
+   port->promiscuous = 0;
+
/* Update table */
port->num_mcast = 0;
 
@@ -1145,12 +1142,9 @@ static inline int veth_frame_wanted(stru
if ( (mac_addr == port->mac_addr) || (mac_addr == 0x) )
return 1;
 
-   if (! (((char *) &mac_addr)[0] & 0x01))
-   return 0;
-
read_lock_irqsave(&port->mcast_gate, flags);
 
-   if (port->promiscuous || port->all_mcast) {
+   if (port->promiscuous) {
wanted = 1;
goto out;
}
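
The consolidated test can be read as a pure function of the device flags and the multicast count, recomputed on every call so that the flag can go back to zero. A small sketch, assuming hypothetical SKETCH_* flag values (the limit of 12 matches VETH_MAX_MCAST in the driver):

```c
/* Hypothetical flag bits, chosen only for this sketch. */
#define SKETCH_IFF_PROMISC   0x1u
#define SKETCH_IFF_ALLMULTI  0x2u
#define SKETCH_MAX_MCAST     12

/*
 * Returns 1 when every frame should be accepted: promiscuous mode,
 * all-multicast mode, or more multicast addresses than the table holds.
 * Returns 0 otherwise, which is what lets the flag be cleared again.
 */
static int sketch_want_promiscuous(unsigned int flags, int mc_count)
{
	return (flags & (SKETCH_IFF_PROMISC | SKETCH_IFF_ALLMULTI)) ||
	       mc_count > SKETCH_MAX_MCAST;
}
```

Assigning the result unconditionally, rather than only setting it to 1 in one branch, is the part that addresses the "never disabled" bug described above.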


[PATCH 15/18] iseries_veth: Add sysfs support for port structs

2005-08-31 Thread Michael Ellerman
Also to aid debugging, add sysfs support for iseries_veth's port structures.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   67 +
 1 files changed, 67 insertions(+)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -167,6 +167,8 @@ struct veth_port {
int promiscuous;
int num_mcast;
u64 mcast_addr[VETH_MAX_MCAST];
+
+   struct kobject kobject;
 };
 
 static HvLpIndex this_lp;
@@ -350,6 +352,62 @@ static struct kobj_type veth_lpar_connec
.default_attrs  = veth_cnx_default_attrs
 };
 
+struct veth_port_attribute {
+   struct attribute attr;
+   ssize_t (*show)(struct veth_port *, char *buf);
+   ssize_t (*store)(struct veth_port *, const char *buf);
+};
+
+static ssize_t veth_port_attribute_show(struct kobject *kobj,
+   struct attribute *attr, char *buf)
+{
+   struct veth_port_attribute *port_attr;
+   struct veth_port *port;
+
+   port_attr = container_of(attr, struct veth_port_attribute, attr);
+   port = container_of(kobj, struct veth_port, kobject);
+
+   if (!port_attr->show)
+   return -EIO;
+
+   return port_attr->show(port, buf);
+}
+
+#define CUSTOM_PORT_ATTR(_name, _format, _expression)  \
+static ssize_t _name##_show(struct veth_port *port, char *buf) \
+{  \
+   return sprintf(buf, _format, _expression);  \
+}  \
+struct veth_port_attribute veth_port_attr_##_name = __ATTR_RO(_name)
+
+#define SIMPLE_PORT_ATTR(_name)\
+   CUSTOM_PORT_ATTR(_name, "%lu\n", (unsigned long)port->_name)
+
+SIMPLE_PORT_ATTR(promiscuous);
+SIMPLE_PORT_ATTR(num_mcast);
+CUSTOM_PORT_ATTR(lpar_map, "0x%X\n", port->lpar_map);
+CUSTOM_PORT_ATTR(stopped_map, "0x%X\n", port->stopped_map);
+CUSTOM_PORT_ATTR(mac_addr, "0x%lX\n", port->mac_addr);
+
+#define GET_PORT_ATTR(_name)   (&veth_port_attr_##_name.attr)
+static struct attribute *veth_port_default_attrs[] = {
+   GET_PORT_ATTR(mac_addr),
+   GET_PORT_ATTR(lpar_map),
+   GET_PORT_ATTR(stopped_map),
+   GET_PORT_ATTR(promiscuous),
+   GET_PORT_ATTR(num_mcast),
+   NULL
+};
+
+static struct sysfs_ops veth_port_sysfs_ops = {
+   .show = veth_port_attribute_show
+};
+
+static struct kobj_type veth_port_ktype = {
+   .sysfs_ops  = &veth_port_sysfs_ops,
+   .default_attrs  = veth_port_default_attrs
+};
+
 /*
  * LPAR connection code
  */
@@ -992,6 +1050,13 @@ static struct net_device * __init veth_p
return NULL;
}
 
+   kobject_init(&port->kobject);
+   port->kobject.parent = &dev->class_dev.kobj;
+   port->kobject.ktype  = &veth_port_ktype;
+   kobject_set_name(&port->kobject, "veth_port");
+   if (0 != kobject_add(&port->kobject))
+   veth_error("Failed adding port for %s to sysfs.\n", dev->name);
+
veth_info("%s attached to iSeries vlan %d (LPAR map = 0x%.4X)\n",
dev->name, vlan, port->lpar_map);
 
@@ -1486,6 +1551,8 @@ static int veth_remove(struct vio_dev *v
}
 
veth_dev[vdev->unit_address] = NULL;
+   kobject_del(&port->kobject);
+   kobject_put(&port->kobject);
unregister_netdev(dev);
free_netdev(dev);
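
The show path in this patch relies on recovering the enclosing structures from pointers to their embedded members. A userspace sketch of that dispatch is below. All sketch_* names are hypothetical, the macro is a simplified container_of() without the kernel's type checking, and -1 stands in for the -EIO the patch returns.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified userspace equivalent of the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_attr { const char *name; };
struct sketch_kobj { int unused; };

struct sketch_port {
	int num_mcast;
	struct sketch_kobj kobj;	/* embedded, like struct kobject */
};

struct sketch_port_attr {
	struct sketch_attr attr;	/* embedded generic attribute */
	int (*show)(struct sketch_port *port, char *buf, size_t len);
};

static int num_mcast_show(struct sketch_port *port, char *buf, size_t len)
{
	return snprintf(buf, len, "%d\n", port->num_mcast);
}

/* Generic entry point: sees only the embedded objects, recovers the outer ones. */
static int sketch_attr_show(struct sketch_kobj *kobj, struct sketch_attr *attr,
			    char *buf, size_t len)
{
	struct sketch_port_attr *pa =
		sketch_container_of(attr, struct sketch_port_attr, attr);
	struct sketch_port *port =
		sketch_container_of(kobj, struct sketch_port, kobj);

	if (!pa->show)
		return -1;	/* the patch returns -EIO here */
	return pa->show(port, buf, len);
}
```

The generic layer never needs to know about struct sketch_port; each attribute carries its own typed callback, which is the same division of labour as veth_port_attribute_show() and the CUSTOM_PORT_ATTR() macro.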
 


[PATCH 0/18] Updates & bug fixes for iseries_veth network driver

2005-08-31 Thread Michael Ellerman
Hi,

This is a series of patches for the iseries_veth driver. Most of these are
pretty much unchanged since I posted them earlier:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0506.3/1837.html

I've added patches to add sysfs support, and do some further code cleanups.

Please merge if they look ok.

cheers


[PATCH 18/18] iseries_veth: Be consistent about driver name, increment version

2005-08-31 Thread Michael Ellerman
The iseries_veth driver tells sysfs that it's called 'iseries_veth', but if
you ask it via ethtool it thinks it's called 'veth'. I think this comes from
2.4 when the driver was called 'veth', but it's definitely called
'iseries_veth' now, so fix it.

To make sure we don't do it again, define DRV_NAME and use it everywhere.

While we're at it, change the version number to 2.0, to reflect the changes
made in this patch series.

Signed-off-by: Michael Ellerman <[EMAIL PROTECTED]>
---

 drivers/net/iseries_veth.c |   16 ++--
 1 files changed, 10 insertions(+), 6 deletions(-)

Index: veth-dev2/drivers/net/iseries_veth.c
===
--- veth-dev2.orig/drivers/net/iseries_veth.c
+++ veth-dev2/drivers/net/iseries_veth.c
@@ -125,6 +125,9 @@ struct veth_lpevent {
 
 };
 
+#define DRV_NAME   "iseries_veth"
+#define DRV_VERSION"2.0"
+
 #define VETH_NUMBUFFERS(120)
 #define VETH_ACKTIMEOUT(100) /* microseconds */
 #define VETH_MAX_MCAST (12)
@@ -227,14 +230,14 @@ static void veth_timed_reset(unsigned lo
  */
 
 #define veth_info(fmt, args...) \
-   printk(KERN_INFO "iseries_veth: " fmt, ## args)
+   printk(KERN_INFO DRV_NAME ": " fmt, ## args)
 
 #define veth_error(fmt, args...) \
-   printk(KERN_ERR "iseries_veth: Error: " fmt, ## args)
+   printk(KERN_ERR DRV_NAME ": Error: " fmt, ## args)
 
 #ifdef DEBUG
 #define veth_debug(fmt, args...) \
-   printk(KERN_DEBUG "iseries_veth: " fmt, ## args)
+   printk(KERN_DEBUG DRV_NAME ": " fmt, ## args)
 #else
 #define veth_debug(fmt, args...) do {} while (0)
 #endif
@@ -997,9 +1000,10 @@ static void veth_set_multicast_list(stru
 
 static void veth_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
 {
-   strncpy(info->driver, "veth", sizeof(info->driver) - 1);
+   strncpy(info->driver, DRV_NAME, sizeof(info->driver) - 1);
info->driver[sizeof(info->driver) - 1] = '\0';
-   strncpy(info->version, "1.0", sizeof(info->version) - 1);
+   strncpy(info->version, DRV_VERSION, sizeof(info->version) - 1);
+   info->version[sizeof(info->version) - 1] = '\0';
 }
 
 static int veth_get_settings(struct net_device *dev, struct ethtool_cmd *ecmd)
@@ -1642,7 +1646,7 @@ static struct vio_device_id veth_device_
 MODULE_DEVICE_TABLE(vio, veth_device_table);
 
 static struct vio_driver veth_driver = {
-   .name = "iseries_veth",
+   .name = DRV_NAME,
.id_table = veth_device_table,
.probe = veth_probe,
.remove = veth_remove
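
The drvinfo hunk also adds the explicit terminator that the version field was missing. The idiom is worth spelling out, since strncpy() does not NUL-terminate when the source fills the buffer. A minimal sketch, with sketch_copy_field() as a hypothetical helper name:

```c
#include <string.h>

#define SKETCH_DRV_NAME "iseries_veth"

/* Copy src into dst[size], always leaving dst NUL-terminated. */
static void sketch_copy_field(char *dst, size_t size, const char *src)
{
	if (size == 0)
		return;
	strncpy(dst, src, size - 1);	/* may leave dst unterminated ... */
	dst[size - 1] = '\0';		/* ... so terminate explicitly */
}
```

With a destination shorter than the name the result is a truncated but valid string; with a longer one the name is copied whole.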


Re: State of Linux graphics

2005-08-31 Thread James Cloos
> "Ian" == Ian Romanick <[EMAIL PROTECTED]> writes:

Ian> I'd really like to see a list of areas where OpenGL
Ian> isn't up to snuff for 2D operations. 

Is that OpenVR spec from Khronos a reasonable baseline
for such a list?

-JimC
-- 
James H. Cloos, Jr. <[EMAIL PROTECTED]>


Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread Bernd Eckenfels
In article <[EMAIL PROTECTED]> you wrote:
> I disagree with the language and the characterization that our 
> proprietary user application code is "tainted."

The kernel is tainted if you install non-open source modules. You are not
allowed to circumvent this mechanism if you want to ship binary only
modules.

Regards
Bernd


Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread Bernd Eckenfels
In article <[EMAIL PROTECTED]> you wrote:
> I mean, nvidia people also use proprietary code in the kernel (probably
> violating the GPL anyway) and don't do such things.

The Linux kernel allows binary drivers, you just have to live with a limited
number of exported symbols and that the kernel is tainted. Which basically
means nobody sane can help you with corrupted kernel data structures.

Bernd


Re: [PATCH 2.6] I2C: Drop I2C_DEVNAME and i2c_clientname

2005-08-31 Thread Greg KH
On Wed, Aug 31, 2005 at 12:34:58PM -0300, Mauro Carvalho Chehab wrote:
> On Tue, 2005-08-30 at 23:20 +0200, Jean Delvare wrote:
> > Hi Mauro,
> > 
> > > (...) it would be nice not to have a different I2C
> > > API for every single 2.6 version :-) It would be nice to change I2C
> > > API once and keep it stable for a while.
> 
> > The Linux 2.6 development model is designed around a relatively fast
> > move from -mm to Linus' tree, which implies incremental changes all the
> > time. I'm only doing that.
>   It is ok to change code, but, IMHO, API should be more stable.

I take it you have not read Documentation/stable_api_nonsense.txt yet?
If not, please do, it shows that what you are asking for will not
happen.

good luck,

greg k-h


Re: [PATCH] add transport class symlink to device object

2005-08-31 Thread Greg KH
On Thu, Aug 18, 2005 at 02:50:19PM -0500, Dmitry Torokhov wrote:
> On 8/18/05, Greg KH <[EMAIL PROTECTED]> wrote:
> > @@ -500,9 +519,13 @@ int class_device_add(struct class_device
> >}
> > 
> >class_device_add_attrs(class_dev);
> > -   if (class_dev->dev)
> > +   if (class_dev->dev) {
> > +   class_name = make_class_name(class_dev);
> >sysfs_create_link(_dev->kobj,
> >  _dev->dev->kobj, "device");
> > +   sysfs_create_link(_dev->dev->kobj, _dev->kobj,
> > + class_name);
> > +   }
> > 
> 
> I wonder if we need to grab a reference to class_dev->dev here:
> 
> dev = device_get(class_dev->dev);
> if (dev) {
>  
> }
> 
> Otherwise, if device gets unregistered/deleted before class device is
> deleted we'll get into trouble when removing the link since
> class_dev->dev will be garbage.
> 
> .. But grabbing that reference will cause pains in SCSI system which,
> when I looked, removed class devices from device's release function.

No, the sysfs_create_link() call increments the kobject reference on the
target of the symlink.  See sysfs_add_link() for details.  So this
should be just fine, right?

thanks,

greg k-h
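
The point about the link holding its own reference can be illustrated with a toy reference count. This is not how sysfs is implemented; the sketch_* types and functions are hypothetical and only model the claim that creating a link pins its target until the link is removed.

```c
#include <stddef.h>

struct sketch_obj { int refcount; };
struct sketch_link { struct sketch_obj *target; };

static struct sketch_obj *sketch_get(struct sketch_obj *obj)
{
	if (obj)
		obj->refcount++;
	return obj;
}

/* Drops one reference; returns 1 when it was the last one. */
static int sketch_put(struct sketch_obj *obj)
{
	return --obj->refcount == 0;
}

/* Creating the link takes a reference on the target. */
static void sketch_link_create(struct sketch_link *link, struct sketch_obj *target)
{
	link->target = sketch_get(target);
}

/* Removing the link drops that reference; returns 1 if the target is now unused. */
static int sketch_link_remove(struct sketch_link *link)
{
	int released = sketch_put(link->target);

	link->target = NULL;
	return released;
}
```

Under this model the original owner can drop its reference first and the target is still valid when the link is later removed, which is the scenario raised in the quoted question.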


Re: [RFC/PATCH]reconfigure MSI registers after resume

2005-08-31 Thread Greg KH
On Thu, Aug 18, 2005 at 01:35:46PM +0800, Shaohua Li wrote:
> Hi,
> It appears pci_enable_msi doesn't reconfigure the MSI registers if it
> successfully looks up an MSI for a device. It assumes the data and address
> registers are unchanged after calling pci_disable_msi. But this isn't always
> true, such as in a suspend/resume cycle. In my test system, the
> registers unsurprisingly become zero after an S3 resume. This patch fixes my
> problem, please look at it. MSI-X might have the same issue, but I
> haven't taken a close look.

Tom, any comments on this?

thanks,

greg k-h


RE: FW: [RFC] A more general timeout specification

2005-08-31 Thread Daniel Walker
On Thu, 2005-09-01 at 01:50 +0200, Roman Zippel wrote:

> What "more versions" are you talking about? When you convert a user time 
> to kernel time you can automatically validate it and later you can use 
> standard kernel APIs, so you don't have to add even more API bloat.

What's kernel time? Are you talking about jiffies? The whole point of
multiple clocks is to allow for different degrees of precision. 

Daniel



Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread jmerkey

Diego Calleja wrote:

> On Wed, 31 Aug 2005 14:27:47 -0600,
> "Jeff V. Merkey" <[EMAIL PROTECTED]> wrote:
>
> > NOTE! This copyright does *not* cover user programs that use kernel
> > services by normal system calls - this is merely considered normal use
> > of the kernel, and does *not* fall under the heading of "derived work".
> > Also note that the GPL below is copyrighted by the Free Software
> > Foundation, but the instance of code that it refers to (the linux
> > kernel) is copyrighted by me and others who actually wrote it.
>
> So, that means that DSFS runs on userspace? (We can't see the source
> so it'd be nice to know how DSFS works)
>
> Also, I'm curious about this piece of code on your patch:
> ftp://ftp.soleranetworks.com/pub/dsfs/datascout-only-2.6.9-06-28-05.patch
>
> -   printk(KERN_WARNING "%s: module license '%s' taints kernel.\n",
> -  mod->name, license);
> +// printk(KERN_WARNING "%s: module license '%s' taints kernel.\n",
> +//mod->name, license);
>
> I mean, nvidia people also use proprietary code in the kernel (probably
> violating the GPL anyway) and don't do such things.

I disagree with the language and the characterization that our 
proprietary user application code is "tainted."


Jeff


RE: FW: [RFC] A more general timeout specification

2005-08-31 Thread Perez-Gonzalez, Inaky
>From: Roman Zippel [mailto:[EMAIL PROTECTED]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> I cannot produce (top of my head) any other POSIX API calls that
>> allow you to specify another clock source, but they are there,
>> somewhere. If I am to introduce a new API, I better make it
>> flexible enough so that other subsystems can use it for more stuff
>> other than...
>
>So we have to deal at kernel level with every broken timeout
>specification that comes along?

Hmm, I cannot think of more ways to specify a timeout than how
long I want to wait (relative) or until when (absolute) and which
is the reference clock. And they don't seem broken to me, common
sense, in any case. Do you have any examples?

In any case, like it or not, POSIX is what almost every application
uses to talk to the kernel.

>> ...adding more versions that add complexity and duplicate
>> code in many different places (user-to-kernel copy, syscall entry
>> points, timespec validation). And the minute you add a clock_id
>> you can steal some bits for specifying absolute/relative (or vice
>> versa), so it is almost a win-win situation.
>
>What "more versions" are you talking about? When you convert a user time
>to kernel time you can automatically validate it and later you can use
>standard kernel APIs, so you don't have to add even more API bloat.

The versions you were talking about:

>From: Roman Zippel [mailto:[EMAIL PROTECTED]
>...
>Why is not sufficient to just add a relative/absolute version,
>which convert the time at entry to kernel time?

Different versions of the same function that do relative, absolute.
If I keep going that way, the reason becomes:

sys_mutex_lock
sys_mutex_lock_timed_relative_clock_realtime
sys_mutex_lock_timed_absolute_clock_realtime
sys_mutex_lock_timed_relative_clock_monotonic
sys_mutex_lock_timed_absolute_clock_monotonic
sys_mutex_lock_timed_relative_clock_monotonic_highres
sys_mutex_lock_timed_absolute_clock_monotonic_highres

s/mutex_lock/ with whatever system call that takes a timeout you want and
keep adding combinations. On each of those check for validity of the
__user pointer, copy it, validate the timespec.

[admittedly I am stretching the point with the different clock types].

So where is the problem on unifying all that handling? You are still 
not offering any constructive criticism to solve the issue that now
the syscalls take relative timeouts vs the absolutes we need.

-- Inaky
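
To make the unification argument concrete, here is one way a single timeout description could replace the per-variant entry points listed above. It is a sketch under assumptions, not the interface proposed in the RFC: struct sketch_timeout and both helpers are hypothetical names, and the current time is passed in by the caller so the example stays independent of any particular clock API.

```c
#include <stdbool.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Hypothetical unified timeout; field names are illustrative only. */
struct sketch_timeout {
	int clock_id;		/* which reference clock the time refers to */
	bool absolute;		/* true: ts is a deadline; false: ts is a duration */
	struct timespec ts;
};

/* One validation routine shared by every caller. */
static bool sketch_timeout_valid(const struct sketch_timeout *t)
{
	return t->ts.tv_sec >= 0 &&
	       t->ts.tv_nsec >= 0 && t->ts.tv_nsec < SKETCH_NSEC_PER_SEC;
}

/*
 * Reduce either form to a relative wait, given the current time on the
 * timeout's clock. A deadline already in the past yields zero.
 */
static struct timespec sketch_timeout_remaining(const struct sketch_timeout *t,
						struct timespec now)
{
	struct timespec rel = t->ts;

	if (!t->absolute)
		return rel;

	rel.tv_sec -= now.tv_sec;
	rel.tv_nsec -= now.tv_nsec;
	if (rel.tv_nsec < 0) {
		rel.tv_sec--;
		rel.tv_nsec += SKETCH_NSEC_PER_SEC;
	}
	if (rel.tv_sec < 0) {
		rel.tv_sec = 0;
		rel.tv_nsec = 0;
	}
	return rel;
}
```

With this shape the user-pointer check, the copy and the timespec validation happen once, and the relative/absolute and clock choices become data rather than separate function names.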


[-mm PATCH] v9fs: cleanup fd transport

2005-08-31 Thread Eric Van Hensbergen
[PATCH] v9fs: cleanup fd transport

Signed-off-by: Latchesar Ionkov <[EMAIL PROTECTED]>
Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit a1949213f1723a7b8bba8edfa118985460d31604
tree 40224cafbfb68543c60a8e0f04ae669cba2cedf7
parent 3f92b2539fe581ee9011d687fbd43cebb641465e
author Eric Van Hensbergen <[EMAIL PROTECTED]> Wed, 31 Aug 2005 16:02:42 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Wed, 31 Aug 2005 16:02:42 -0500

 fs/9p/trans_fd.c |   42 +++---
 fs/9p/v9fs.c |5 -
 2 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/fs/9p/trans_fd.c b/fs/9p/trans_fd.c
--- a/fs/9p/trans_fd.c
+++ b/fs/9p/trans_fd.c
@@ -56,6 +56,9 @@ static int v9fs_fd_recv(struct v9fs_tran
 {
struct v9fs_trans_fd *ts = trans ? trans->priv : NULL;
 
+   if (!trans || trans->status != Connected || !ts)
+   return -EIO;
+
return kernel_read(ts->in_file, ts->in_file->f_pos, v, len);
 }
 
@@ -73,6 +76,9 @@ static int v9fs_fd_send(struct v9fs_tran
mm_segment_t oldfs = get_fs();
int ret = 0;
 
+   if (!trans || trans->status != Connected || !ts)
+   return -EIO;
+
set_fs(get_ds());
/* The cast to a user pointer is valid due to the set_fs() */
ret = vfs_write(ts->out_file, (void __user *)v, len,
&ts->out_file->f_pos);
@@ -95,6 +101,11 @@ v9fs_fd_init(struct v9fs_session_info *v
struct v9fs_trans_fd *ts = NULL;
struct v9fs_transport *trans = v9ses->transport;
 
+   if((v9ses->wfdno == ~0) || (v9ses->rfdno == ~0)) {
+   printk(KERN_ERR "v9fs: Insufficient options for proto=fd\n");
+   return -ENOPROTOOPT;
+   }
+
sema_init(>writelock, 1);
sema_init(>readlock, 1);
 
@@ -103,11 +114,21 @@ v9fs_fd_init(struct v9fs_session_info *v
if (!ts)
return -ENOMEM;
 
-   trans->priv = ts;
-
ts->in_file = fget( v9ses->rfdno );
ts->out_file = fget( v9ses->wfdno );
 
+   if (!ts->in_file || !ts->out_file) {
+   if (ts->in_file)
+   fput(ts->in_file);
+
+   if (ts->out_file)
+   fput(ts->out_file);
+
+   kfree(ts);
+   return -EIO;
+   }
+
+   trans->priv = ts;
trans->status = Connected;
 
return 0;
@@ -122,7 +143,22 @@ v9fs_fd_init(struct v9fs_session_info *v
 
 static void v9fs_fd_close(struct v9fs_transport *trans)
 {
-   struct v9fs_trans_fd *ts = trans ? trans->priv : NULL;
+   struct v9fs_trans_fd *ts;
+
+   if (!trans) 
+   return;
+
+   trans->status = Disconnected;
+   ts = trans->priv;
+
+   if (!ts)
+   return;
+
+   if (ts->in_file)
+   fput(ts->in_file);
+
+   if (ts->out_file)
+   fput(ts->out_file);
 
kfree(ts);
 }
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -296,11 +296,6 @@ v9fs_session_init(struct v9fs_session_in
case PROTO_FD:
trans_proto = _trans_fd;
*v9ses->remotename = 0;
-   if((v9ses->wfdno == ~0) || (v9ses->rfdno == ~0)) {
-   printk(KERN_ERR "v9fs: Insufficient options for proto=fd\n");
-   retval = -ENOPROTOOPT;
-   goto SessCleanUp;
-   }
break;
default:
printk(KERN_ERR "v9fs: Bad mount protocol %d\n", v9ses->proto);
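
The init-path change in trans_fd.c follows a common pattern: take both references, and if either is missing, release whatever was taken before reporting failure, publishing the private state only on success. A userspace sketch of that pattern, with hypothetical sketch_* names standing in for fget()/fput() and the transport struct:

```c
#include <stddef.h>

struct sketch_file { int refs; };

struct sketch_trans {
	struct sketch_file *in;
	struct sketch_file *out;
	int connected;
};

/* Stand-in for fget(): returns NULL when there is nothing to reference. */
static struct sketch_file *sketch_fget(struct sketch_file *f)
{
	if (f)
		f->refs++;
	return f;
}

/* Stand-in for fput(). */
static void sketch_fput(struct sketch_file *f)
{
	f->refs--;
}

/* Returns 0 on success, -1 on failure; never leaves a stray reference behind. */
static int sketch_trans_init(struct sketch_trans *t,
			     struct sketch_file *rd, struct sketch_file *wr)
{
	struct sketch_file *in = sketch_fget(rd);
	struct sketch_file *out = sketch_fget(wr);

	if (!in || !out) {
		if (in)
			sketch_fput(in);
		if (out)
			sketch_fput(out);
		return -1;
	}

	/* Publish only once both halves are known to be good. */
	t->in = in;
	t->out = out;
	t->connected = 1;
	return 0;
}
```

Deferring the assignment to the transport until after the checks matches the patch moving `trans->priv = ts` below the error handling.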




[PATCH] v9fs: Support to force umount

2005-08-31 Thread Eric Van Hensbergen
[PATCH] v9fs: Support to force umount

Support for force umount

Signed-off-by: Latchesar Ionkov <[EMAIL PROTECTED]>
Signed-off-by: Eric Van Hensbergen <[EMAIL PROTECTED]>

---
commit 3f92b2539fe581ee9011d687fbd43cebb641465e
tree cd34696129c3b636b85578f659f260100196dee1
parent 83f1fe3d2adc3746d719e430d0a794de1f151c40
author Eric Van Hensbergen <[EMAIL PROTECTED]> Wed, 31 Aug 2005 15:53:14 -0500
committer Eric Van Hensbergen <[EMAIL PROTECTED]> Wed, 31 Aug 2005 15:53:14 -0500

 fs/9p/mux.c   |   20 
 fs/9p/mux.h   |1 +
 fs/9p/v9fs.c  |9 +
 fs/9p/v9fs.h  |4 +---
 fs/9p/vfs_super.c |9 +
 5 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/fs/9p/mux.c b/fs/9p/mux.c
--- a/fs/9p/mux.c
+++ b/fs/9p/mux.c
@@ -331,6 +331,26 @@ v9fs_mux_rpc(struct v9fs_session_info *v
 }
 
 /**
+ * v9fs_mux_cancel_requests - cancels all pending requests
+ *
+ * @v9ses: session info structure
+ * @err: error code to return to the requests
+ */
+void v9fs_mux_cancel_requests(struct v9fs_session_info *v9ses, int err)
+{
+   struct v9fs_rpcreq *rptr;
+   struct v9fs_rpcreq *rreq;
+
+   dprintk(DEBUG_MUX, " %d\n", err);
+   spin_lock(&v9ses->muxlock);
+   list_for_each_entry_safe(rreq, rptr, &v9ses->mux_fcalls, next) {
+   rreq->err = err;
+   }
+   spin_unlock(&v9ses->muxlock);
+   wake_up_all(&v9ses->read_wait);
+}
+
+/**
  * v9fs_recvproc - kproc to handle demultiplexing responses
  * @data: session info structure
  *
diff --git a/fs/9p/mux.h b/fs/9p/mux.h
--- a/fs/9p/mux.h
+++ b/fs/9p/mux.h
@@ -38,3 +38,4 @@ struct v9fs_rpcreq {
 int v9fs_mux_init(struct v9fs_session_info *v9ses, const char *dev_name);
 long v9fs_mux_rpc(struct v9fs_session_info *v9ses,
  struct v9fs_fcall *tcall, struct v9fs_fcall **rcall);
+void v9fs_mux_cancel_requests(struct v9fs_session_info *v9ses, int err);
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -414,6 +414,15 @@ void v9fs_session_close(struct v9fs_sess
putname(v9ses->remotename);
 }
 
+/**
+ * v9fs_session_cancel - mark transport as disconnected 
+ * and cancel all pending requests.
+ */
+void v9fs_session_cancel(struct v9fs_session_info *v9ses) {
+   v9ses->transport->status = Disconnected;
+   v9fs_mux_cancel_requests(v9ses, -EIO);
+}
+
 extern int v9fs_error_init(void);
 
 /**
diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
--- a/fs/9p/v9fs.h
+++ b/fs/9p/v9fs.h
@@ -89,9 +89,7 @@ struct v9fs_session_info *v9fs_inode2v9s
 void v9fs_session_close(struct v9fs_session_info *v9ses);
 int v9fs_get_idpool(struct v9fs_idpool *p);
 void v9fs_put_idpool(int id, struct v9fs_idpool *p);
-int v9fs_get_option(char *opts, char *name, char *buf, int buflen);
-long long v9fs_get_int_option(char *opts, char *name, long long dflt);
-int v9fs_parse_tcp_devname(const char *devname, char **addr, char **remotename);
+void v9fs_session_cancel(struct v9fs_session_info *v9ses);
 
 #define V9FS_MAGIC 0x01021997
 
diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c
--- a/fs/9p/vfs_super.c
+++ b/fs/9p/vfs_super.c
@@ -257,10 +257,19 @@ static int v9fs_show_options(struct seq_
return 0;
 }
 
+static void
+v9fs_umount_begin(struct super_block *sb)
+{
+   struct v9fs_session_info *v9ses = sb->s_fs_info;
+
+   v9fs_session_cancel(v9ses);
+}
+
 static struct super_operations v9fs_super_ops = {
.statfs = simple_statfs,
.clear_inode = v9fs_clear_inode,
.show_options = v9fs_show_options,
+   .umount_begin = v9fs_umount_begin,
 };
 
 struct file_system_type v9fs_fs_type = {




RE: FW: [RFC] A more general timeout specification

2005-08-31 Thread Roman Zippel
Hi,

On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:

> I cannot produce (top of my head) any other POSIX API calls that
> allow you to specify another clock source, but they are there,
> somewhere. If I am to introduce a new API, I better make it 
> flexible enough so that other subsystems can use it for more stuff
> other than...

So we have to deal at kernel level with every broken timeout specification 
that comes along?

> >Why is not sufficient to just add a relative/absolute version,
> >which convert the time at entry to kernel time?
> 
> ...adding more versions that add complexity and duplicate
> code in many different places (user-to-kernel copy, syscall entry 
> points, timespec validation). And the minute you add a clock_id
> you can steal some bits for specifying absolute/relative (or vice
> versa), so it is almost a win-win situation.

What "more versions" are you talking about? When you convert a user time 
to kernel time you can automatically validate it and later you can use 
standard kernel APIs, so you don't have to add even more API bloat.

bye, Roman


Re: MAX_ARG_PAGES has no effect?

2005-08-31 Thread Andi Kleen
Ingo Molnar <[EMAIL PROTECTED]> writes:
> 
> MAX_ARG_PAGES should work just fine. I think the 'getconf ARG_MAX' 
> output is hardcoded. (because the kernel does not provide the 
> information dynamically)

Perhaps it would be a good idea to make it a sysctl. Is there 
any reason it should be hardcoded?  I cannot think of any.

Ok if someone lowers the sysctl then execve has to handle
the case of the args/environment possibly not fitting anymore,
but that should be easy.

-Andi
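
For reference, userspace normally learns this limit through sysconf(), which is also what `getconf ARG_MAX` reports. The following is a minimal sketch, not part of any patch in this thread; the helper names are hypothetical, and whether the reported value tracks a rebuilt MAX_ARG_PAGES depends on the C library in use.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns the ARG_MAX value reported by the C
 * library, or -1 if the limit is indeterminate or the query fails. */
long query_arg_max(void)
{
    return sysconf(_SC_ARG_MAX);
}

/* Hypothetical helper: print the limit in bytes, similar in spirit to
 * what `getconf ARG_MAX` shows. */
void print_arg_max(void)
{
    printf("ARG_MAX = %ld bytes\n", query_arg_max());
}
```

If the value came from a sysctl, as suggested above, a library could read it at run time instead of returning a compiled-in constant.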


Re: [Patch] Support UTF-8 scripts

2005-08-31 Thread H. Peter Anvin
Followup to:  <[EMAIL PROTECTED]>
By author:    "Martin v. Löwis" <[EMAIL PROTECTED]>
In newsgroup: linux.dev.kernel
>
> This patch adds support for UTF-8 signatures (aka BOM, byte order
> mark) to binfmt_script. Files that start with EF BB BF # ! are now
> recognized as scripts (in addition to files starting with # !).
> 
> With such support, creating scripts that reliably carry non-ASCII
> characters is simplified. Editors and the script interpreter can
> easily agree on what the encoding of the script is, and the
> interpreter can then render strings appropriately. Currently,
> Python supports source files that start with the UTF-8 signature;
> the approach would naturally extend to Perl to enhance/replace
> the "use utf8" pragma. Likewise, Tcl could use the UTF-8 signature
> to reliably identify UTF-8 source code (instead of assuming
> [encoding system] for source code).
> 

BOM should not be used in UTF-8.  In fact, it shouldn't be used at
all.

-hpa
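
For readers following the proposal being discussed, the check amounts to skipping an optional UTF-8 signature (the byte sequence EF BB BF) before looking for "#!". A minimal userspace sketch, assuming a hypothetical helper name; this is not the code from the patch, only an illustration of the test it describes:

```c
#include <stddef.h>
#include <string.h>

/* The UTF-8 signature (BOM) is the three-byte sequence EF BB BF. */
static const unsigned char utf8_sig[3] = { 0xEF, 0xBB, 0xBF };

/* Hypothetical helper: returns 1 if buf starts with "#!", optionally
 * preceded by the UTF-8 signature, and 0 otherwise. */
int is_script_header(const unsigned char *buf, size_t len)
{
    if (len >= 3 && memcmp(buf, utf8_sig, 3) == 0) {
        buf += 3;
        len -= 3;
    }
    return len >= 2 && buf[0] == '#' && buf[1] == '!';
}
```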


RE: FW: [RFC] A more general timeout specification

2005-08-31 Thread Perez-Gonzalez, Inaky
>From: Roman Zippel [mailto:[EMAIL PROTECTED]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> Usefulness: (see the rationale in the patch), but in a nutshell;
>> most POSIX timeout specs have to be absolute in CLOCK_REALTIME
>> (eg: pthread_mutex_timed_lock()). Current kernel needs the timeout
>> relative, so glibc calls the kernel/however gets the time, computes
>> relative times and syscalls. Race conditions, overhead...etc.
>>
>> This mechanism supports both. That's why it is more general.
>
>Your patch basically only mentions fusyn, why does it need multiple clock
>sources?

I cannot produce (top of my head) any other POSIX API calls that
allow you to specify another clock source, but they are there,
somewhere. If I am to introduce a new API, I better make it 
flexible enough so that other subsystems can use it for more stuff
other than...

>Why is not sufficient to just add a relative/absolute version,
>which convert the time at entry to kernel time?

...adding more versions that add complexity and duplicate
code in many different places (user-to-kernel copy, syscall entry 
points, timespec validation). And the minute you add a clock_id
you can steal some bits for specifying absolute/relative (or vice
versa), so it is almost a win-win situation.

To summarize: thought about that, but it is fugly and not too practical.


Consider also that this allows you to write extensions to POSIX or your
own user-level APIs that could allow (following the fusyn example) 
you to wait on a mutex with a timeout based off a monotonic clock, 
if you need it (or something that makes more sense than this--highres 
comes to mind). 

-- Inaky
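
To illustrate the userspace conversion described in the quoted rationale: a library that only has a relative-timeout syscall must read the clock, subtract, and then enter the kernel. A minimal sketch of that subtraction, assuming normalized timespec values; abs_to_rel() is a hypothetical name, not a glibc or kernel function.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: convert an absolute deadline into a timeout
 * relative to `now`. Both inputs are assumed normalized
 * (0 <= tv_nsec < NSEC_PER_SEC). A deadline already in the past is
 * clamped to a zero timeout. */
struct timespec abs_to_rel(struct timespec now, struct timespec deadline)
{
    struct timespec rel;

    rel.tv_sec = deadline.tv_sec - now.tv_sec;
    rel.tv_nsec = deadline.tv_nsec - now.tv_nsec;
    if (rel.tv_nsec < 0) {          /* borrow one second */
        rel.tv_sec -= 1;
        rel.tv_nsec += NSEC_PER_SEC;
    }
    if (rel.tv_sec < 0) {           /* deadline already passed */
        rel.tv_sec = 0;
        rel.tv_nsec = 0;
    }
    return rel;
}
```

Any time that passes between sampling `now` and the kernel starting the sleep is not accounted for, which is the race mentioned above; passing the absolute value into the kernel avoids it.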


Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread Diego Calleja
On Wed, 31 Aug 2005 14:27:47 -0600,
"Jeff V. Merkey" <[EMAIL PROTECTED]> wrote:

>  
> NOTE! This copyright does *not* cover user programs that use kernel
>  services by normal system calls - this is merely considered normal use
>  of the kernel, and does *not* fall under the heading of "derived work".
>  Also note that the GPL below is copyrighted by the Free Software
>  Foundation, but the instance of code that it refers to (the linux
>  kernel) is copyrighted by me and others who actually wrote it.

So, that means that DSFS runs on userspace? (We can't see the source
so it'd be nice to know how DSFS works)

Also, I'm curious about this piece of code on your patch:
ftp://ftp.soleranetworks.com/pub/dsfs/datascout-only-2.6.9-06-28-05.patch

-   printk(KERN_WARNING "%s: module license '%s' taints kernel.\n",
-  mod->name, license);
+// printk(KERN_WARNING "%s: module license '%s' taints kernel.\n",
+//mod->name, license);

I mean, nvidia people also use proprietary code in the kernel (probably
violating the GPL anyway) and don't do such things.





RE: FW: [RFC] A more general timeout specification

2005-08-31 Thread Perez-Gonzalez, Inaky
>From: Christopher Friesen [mailto:[EMAIL PROTECTED]
>Perez-Gonzalez, Inaky wrote:
>
>>>I can get the first sleep.  Suppose I oversleep by X nanoseconds.  I
>>>wake, and get an opaque timeout back.  How do I ask for the new wake
>>>time to be "endtime + INTERVAL"?
>>
>>
>> endtime.ts += INTERVAL
>> [we all know opaque is relative too]
>
>Heh. Okay, then what are the rules about what I'm allowed to do with
>endtime?  Joe mentioned there was a bit in there somewhere to denote
>absolute time.

Well, it doesn't really matter. The bit in endtime.clock_id (highest,
AFAIR) says if it is absolute or not, but because adding a relative
value to a value maintains its condition (absolute or relative), it
is not a concern. Just add it.

Unless I am missing something really basic, of course.

>> Or better, use itimers :)
>
>I was actually thinking in terms of implementing itimers on top of your
>new API.

Heh, got me.

-- Inaky
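
Note that "endtime.ts += INTERVAL" is shorthand: a struct timespec needs an explicit carry from nanoseconds into seconds. A minimal sketch of that step, assuming a non-negative interval; timespec_add_ns() is a hypothetical helper, not part of the proposed API.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: advance a deadline by interval_ns nanoseconds
 * (interval_ns >= 0), keeping tv_nsec in [0, NSEC_PER_SEC). */
void timespec_add_ns(struct timespec *ts, long long interval_ns)
{
    long long nsec = ts->tv_nsec + interval_ns;

    ts->tv_sec += (time_t)(nsec / NSEC_PER_SEC);
    ts->tv_nsec = (long)(nsec % NSEC_PER_SEC);
}
```

Because each new deadline is derived from the previous deadline rather than from the actual wake-up time, an oversleep of X nanoseconds does not accumulate across periods.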


Re: FW: [RFC] A more general timeout specification

2005-08-31 Thread Christopher Friesen

Perez-Gonzalez, Inaky wrote:

>> I can get the first sleep.  Suppose I oversleep by X nanoseconds.  I
>> wake, and get an opaque timeout back.  How do I ask for the new wake
>> time to be "endtime + INTERVAL"?
>
> endtime.ts += INTERVAL
> [we all know opaque is relative too]

Heh. Okay, then what are the rules about what I'm allowed to do with
endtime?  Joe mentioned there was a bit in there somewhere to denote
absolute time.

> Or better, use itimers :)

I was actually thinking in terms of implementing itimers on top of your
new API.


Chris



Re: [ck] 2.6.13-ck1

2005-08-31 Thread Rodney Gordon II
On Thu, Sep 01, 2005 at 08:47:36AM +1000, Con Kolivas wrote:
> On Thu, 1 Sep 2005 06:07 am, daniel mclellan wrote:
> > Yes.
> >
> >
> > Linux yavanna 2.6.13-ckx1 #1 Tue Aug 30 04:03:25 EST 2005 x86_64 AMD
> > Athlon(tm) 64 FX-53 Processor AuthenticAMD GNU/Linux
> >
> > On Wednesday 31 August 2005 14:49, Rodney Gordon II wrote:
> > > On Mon, Aug 29, 2005 at 05:03:24PM +1000, Con Kolivas wrote:
> > > > These are patches designed to improve system responsiveness and
> > > > interactivity. It is configurable to any workload but the default ck*
> > > > patch is aimed at the desktop and ck*-server is available with more
> > > > emphasis on serverspace.
> > > >
> > > >
> > > > Apply to 2.6.13
> > > > http://ck.kolivas.org/patches/2.6/2.6.13/2.6.13-ck1/patch-2.6.13-ck1.bz2
> > > > or development version:
> > > > http://ck.kolivas.org/patches/2.6/2.6.13/2.6.13-ck1/patch-2.6.13-ck1+.bz2
> > > >
> > > > or server version:
> > > > http://ck.kolivas.org/patches/2.6/2.6.13/2.6.13-ck1/patch-2.6.13-ck1-server.bz2
> > >
> > > I am having odd lockup problems with just the non-+ 'stable' ck lately..
> > > Trying a large copy will often lock my disk I/O up and I have to do a
> > > hard reboot. Nothing shows in logs..
> > >
> > > Is anyone having similar problems?
> 
> 2 things:
> 
> What HZ are you running?
> Can you set up netconsole or serial console as these will capture something 
> that won't be seen in your logs.
> 
> Cheers,
> Con

1: 1000HZ
2: Nope.. I am sorry, one computer household at the moment :( daniel?
-r

-- 
Rodney Gordon II (meff) | meff  pobox  com
GPG Key ID: 7FF4B2BC|   AIM ID: mefforz


RE: FW: [RFC] A more general timeout specification

2005-08-31 Thread Roman Zippel
Hi,

On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:

> >Why is that needed in a _general_ timeout API? What exactly makes it so
> >useful for everyone and not just more complex for everyone?
> 
> Because if a system call gets a timeout specification it needs to
> verify its correctness first. Instead of doing that at the point
> where it goes to sleep, that could be deep in an atomic section,
> we provide a separate function [timeout_validate()] which is the
> one you mention, to do that.
> 
> Usefulness: (see the rationale in the patch), but in a nutshell;
> most POSIX timeout specs have to be absolute in CLOCK_REALTIME
> (eg: pthread_mutex_timed_lock()). Current kernel needs the timeout
> relative, so glibc calls the kernel/however gets the time, computes
> relative times and syscalls. Race conditions, overhead...etc. 
> 
> This mechanism supports both. That's why it is more general.

Your patch basically only mentions fusyn, why does it need multiple clock 
sources? Why is not sufficient to just add a relative/absolute version, 
which convert the time at entry to kernel time?

bye, Roman


Re: [PATCH 1/3] Updated dynamic tick patches - Fix lost tick calculation in timer_pm.c

2005-08-31 Thread john stultz
On Wed, 2005-08-31 at 15:36 -0700, Zachary Amsden wrote:
> >I feel lost ticks can be based on cycles difference directly
> >rather than being based on microseconds that has elapsed.
> >
> >Following patch is in that direction. 
> >
> >With this patch, time had kept up really well on one particular
> >machine (Intel 4way Pentium 3 box) overnight, while
> >on another newer machine (Intel 4way Xeon with HT) it didn't do so
> >well (time sped up after 3 or 4 hours). Hence I consider this
> >particular patch will need more review/work.
> >
> >  
> >
> 
> Does this patch help address the issues pointed out here?
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=5127

Unfortunately no. The issue there is that once the lost tick
compensation code has fired, should those "lost" ticks appear later we
end up over-compensating.

This patch however does help to make sure that when the lost tick code
fires, the error from converting to usecs doesn't bite us. And could
probably go into mainline independent of the dynamic ticks patch (with
further testing, of course).

thanks
-john



Re: [ck] 2.6.13-ck1

2005-08-31 Thread Con Kolivas
On Thu, 1 Sep 2005 06:07 am, daniel mclellan wrote:
> Yes.
>
>
> Linux yavanna 2.6.13-ckx1 #1 Tue Aug 30 04:03:25 EST 2005 x86_64 AMD
> Athlon(tm) 64 FX-53 Processor AuthenticAMD GNU/Linux
>
> On Wednesday 31 August 2005 14:49, Rodney Gordon II wrote:
> > On Mon, Aug 29, 2005 at 05:03:24PM +1000, Con Kolivas wrote:
> > > These are patches designed to improve system responsiveness and
> > > interactivity. It is configurable to any workload but the default ck*
> > > patch is aimed at the desktop and ck*-server is available with more
> > > emphasis on serverspace.
> > >
> > >
> > > Apply to 2.6.13
> > > http://ck.kolivas.org/patches/2.6/2.6.13/2.6.13-ck1/patch-2.6.13-ck1.bz2
> > > or development version:
> > > http://ck.kolivas.org/patches/2.6/2.6.13/2.6.13-ck1/patch-2.6.13-ck1+.bz2
> > >
> > > or server version:
> > > http://ck.kolivas.org/patches/2.6/2.6.13/2.6.13-ck1/patch-2.6.13-ck1-server.bz2
> >
> > I am having odd lockup problems with just the non-+ 'stable' ck lately..
> > Trying a large copy will often lock my disk I/O up and I have to do a
> > hard reboot. Nothing shows in logs..
> >
> > Is anyone having similar problems?

2 things:

What HZ are you running?
Can you set up netconsole or serial console as these will capture something 
that won't be seen in your logs.

Cheers,
Con


Re: [PATCH 1/3] Updated dynamic tick patches - Fix lost tick calculation in timer_pm.c

2005-08-31 Thread Zachary Amsden

Srivatsa Vaddagiri wrote:

> On Wed, Aug 31, 2005 at 10:28:43PM +0530, Srivatsa Vaddagiri wrote:
>
>> Following patches related to dynamic tick are posted in separate mails,
>> for convenience of review. The first patch probably applies w/o dynamic
>> tick consideration also.
>>
>> Patch 1/3  -> Fixup lost tick calculation in timer_pm.c
>
> Currently, lost tick calculation in timer_pm.c is based on number
> of microseconds that has elapsed since the last tick. Calculating
> the number of microseconds is approximated by cyc2us, which
> basically does :
>
>   microsec = (cycles * 286) / 1024
>
> Consider 10 ticks lost. This amounts to 14319*10 = 143190 cycles
> (14319 = PMTMR_EXPECTED_RATE/(CALIBRATE_LATCH/LATCH)).
> This amounts to 39992 microseconds as per the above equation
> or 39992 / 4000 = 9 lost ticks, which is incorrect.
>
> I feel lost ticks can be based on cycles difference directly
> rather than being based on microseconds that has elapsed.
>
> Following patch is in that direction.
>
> With this patch, time had kept up really well on one particular
> machine (Intel 4way Pentium 3 box) overnight, while
> on another newer machine (Intel 4way Xeon with HT) it didn't do so
> well (time sped up after 3 or 4 hours). Hence I consider this
> particular patch will need more review/work.

Does this patch help address the issues pointed out here?

http://bugzilla.kernel.org/show_bug.cgi?id=5127
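
The rounding loss in the quoted arithmetic can be reproduced in userspace. This is a minimal sketch using only the figures quoted in this thread (286/1024, 14319 cycles per tick, 4000 microseconds per tick, which would correspond to HZ=250); the function names are hypothetical and this is not the timer_pm.c code itself.

```c
#include <stdint.h>

#define CYCLES_PER_TICK 14319u  /* figure quoted in the thread */
#define USEC_PER_TICK   4000u   /* divisor used in the thread */

/* The cyc2us approximation as described: (cycles * 286) / 1024. */
uint32_t cyc2us_approx(uint32_t cycles)
{
    return (uint32_t)(((uint64_t)cycles * 286u) / 1024u);
}

/* Lost ticks derived from the approximate microsecond count. */
uint32_t lost_ticks_via_usec(uint32_t cycles)
{
    return cyc2us_approx(cycles) / USEC_PER_TICK;
}

/* Lost ticks derived directly from the cycle difference. */
uint32_t lost_ticks_via_cycles(uint32_t cycles)
{
    return cycles / CYCLES_PER_TICK;
}
```

For 143190 cycles the microsecond path truncates twice and lands on 9 ticks, while the direct cycle division gives 10.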


Re: [PATCH] Fix kprobes handling of simultaneous probe hit/unregister

2005-08-31 Thread David S. Miller
From: Jim Keniston <[EMAIL PROTECTED]>
Date: 31 Aug 2005 14:53:37 -0700

> This bug doesn't exist on ppc64 and ia64, where a breakpoint
> instruction leaves the IP pointing to the beginning of the instruction.
> I don't know about sparc64.  (Dave, could you please advise?)

On sparc64 instructions are all 32-bit, 4-byte aligned, and a
breakpoint instruction leaves the PC pointing at the beginning of that
breakpoint instruction.

So I think sparc64 should be OK.


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread H. Peter Anvin

Jesper Juhl wrote:

> Well, it wouldn't have to be initrd specifically. Generally what's
> needed is *some* way to tell the kernel "please read more options from
> location ". The interesting bit is what 's supposed to be.

This is what initramfs (as opposed to initrd) does quite well.

-hpa



RE: FW: [RFC] A more general timeout specification

2005-08-31 Thread Perez-Gonzalez, Inaky
>From: Christopher Friesen [mailto:[EMAIL PROTECTED]
>Joe Korty wrote:
>
>> The returned timeout struct has a bit used to mark the value as absolute.  Thus
>> the caller treats the returned timeout as an opaque cookie that can be
>> reapplied to the next (or more likely, the to-be restarted) timeout.
>
>Okay, endtime is always absolute value of when it should have expired.
>But I think I see a problem with the opaque cookie scheme and repeating
>timeouts.
>
>Suppose I want to wake my application at INTERVAL nanoseconds from now
>on the MONOTONIC clock, then again every INTERVAL nanoseconds after that.

This API is not intended for your application to use directly, but
for kernel APIs that take sleeps from userspace (like pthread_mutex_lock()
and friends), so this scenario is not very likely.

Granted, sleep() can be implemented with it too, so...

>How do I do that with this API?
>
>I can get the first sleep.  Suppose I oversleep by X nanoseconds.  I
>wake, and get an opaque timeout back.  How do I ask for the new wake
>time to be "endtime + INTERVAL"?

endtime.ts += INTERVAL

[we all know opaque is relative too] 
Or better, use itimers :)

-- Inaky


Re: [patch 04/16] I/O driver for 8250-compatible UARTs

2005-08-31 Thread Tom Rini
On Wed, Aug 31, 2005 at 03:19:37PM -0600, Bjorn Helgaas wrote:
> On Wednesday 31 August 2005 2:10 pm, Tom Rini wrote:
> > On Wed, Aug 31, 2005 at 01:38:52PM -0600, Bjorn Helgaas wrote:
> > > On Monday 29 August 2005 10:09 am, Tom Rini wrote:
> > I've tried intentionally to not mention 'ttyS' anywhere (exposed to the
> > user) because it's really not 'ttySN' but it is the port registered to
> > us.
> 
> So kgdb's port N is different from ttySN?  That sounds really
> confusing.  And KGDB_SIMPLE_SERIAL does mention "ttyS".

It's not intentionally different, and really only might be different in
the case where we have ttySX but ttySX isn't registered to KGDB.

> > There's really two cases we have to deal with.  The first case is a
> > known at compile time or can be registered at boot-time easily port (ie
> > dumb old PC or ARM boards).  The second case is "serial port over
> > there".  Perhaps we should change the kgdb8250 arg to be an override of
> > the default port, so:
> > kgdb8250={io,mmio},,,
> 
> That makes sense.  But I'd make it {io,mmio},,,
> so it's more like the existing "console=uart" argument.

ok.

> > > > +   printk(KERN_ERR "kgdb8250: argument error, usage: "
> > > > +  "kgdb8250=,");
> > > > +#ifdef CONFIG_IA64
> > > > +   printk(",,");
> > > > +#endif
> > > 
> > > This isn't ia64-specific.
> > 
> > It is and it isn't.  Since no one's tried a PCI card uart for KGDB nor
> > had a case where we have to pass in the mmio addr except on ia64, it is
> > ia64-specific.
> 
> Maybe it's only been *tested* on ia64, but I don't think that's a
> reason to make it compiled only on ia64.

It's only been needed on ia64.  But it's moot since I've reworked things
so that kgdb8250= is always a complete override.

> Actually, I think KGDB_SIMPLE_SERIAL, KGDB_*BAUD, KGDB_PORT_*,
> KGDB_PORT, and KGDB_IRQ are overkill.  Could they all be nuked
> in favor of a KGDB_8250_DEVICE that could be set to things like
> "ttyS0,115200" or "io,0x3f8,115200,49"?

Hmm.  I'll give that a shot momentarily... That sounds like a good idea
'tho..

-- 
Tom Rini
http://gate.crashing.org/~trini/
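
As an illustration of the single-string form suggested above ("io,0x3f8,115200,49"), here is a minimal userspace parsing sketch. The field order (address, baud, irq), the struct, and the function name are assumptions made for the example; the thread does not spell them out, and the "ttyS0,115200" form is deliberately not handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for a parsed "{io,mmio},addr,baud,irq" string. */
struct kgdb_uart_spec {
    int is_mmio;            /* 0 = port I/O, 1 = memory-mapped */
    unsigned long addr;
    unsigned long baud;
    unsigned long irq;
};

/* Hypothetical parser: returns 0 on success, -1 on malformed input. */
int parse_uart_spec(const char *s, struct kgdb_uart_spec *out)
{
    char *end;

    if (strncmp(s, "io,", 3) == 0) {
        out->is_mmio = 0;
        s += 3;
    } else if (strncmp(s, "mmio,", 5) == 0) {
        out->is_mmio = 1;
        s += 5;
    } else {
        return -1;
    }

    out->addr = strtoul(s, &end, 0);    /* base 0 accepts 0x prefixes */
    if (end == s || *end != ',')
        return -1;
    s = end + 1;

    out->baud = strtoul(s, &end, 10);
    if (end == s || *end != ',')
        return -1;
    s = end + 1;

    out->irq = strtoul(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;

    return 0;
}
```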


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread Chris Wedgwood
On Thu, Sep 01, 2005 at 12:12:00AM +0200, Jesper Juhl wrote:

> b) add a new boot option telling the kernel the name of some file in
> initrd or similar from which to load additional options.

a file in initrd isn't a good choice; as the initrd is generally a fixed
image

the point is some bootloaders might want to pass quite a bit of state
to the kernel at times (i actually have this for a mip32 target where
i construct a table and pass a pointer to that in, a tad icky but for
lack of options)


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread H. Peter Anvin

Chris Wedgwood wrote:

> On Wed, Aug 31, 2005 at 03:01:57PM -0700, H. Peter Anvin wrote:
>
>> Maybe not.  Another option would simply be to bump it up
>> significantly (2x isn't really that much.)  4096, maybe.
>
> I wonder if we're not at the point where we need something different
> to what we have now.  The concept of a command-line works for passing
> simple state but for more complex things it's too cumbersome.

Well, we have initramfs for the really big stuff.  The kernel shouldn't
really need that much data, though.


-hpa


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread H. Peter Anvin

Chris Wedgwood wrote:

> On Thu, Sep 01, 2005 at 12:12:00AM +0200, Jesper Juhl wrote:
>
>> b) add a new boot option telling the kernel the name of some file in
>> initrd or similar from which to load additional options.
>
> a file in initrd isn't a good choice; as the initrd is generally a fixed
> image
>
> the point is some bootloaders might want to pass quite a bit of state
> to the kernel at times (i actually have this for a mip32 target where
> i construct a table and pass a pointer to that in, a tad icky but for
> lack of options)

initrd is a fixed image, but initramfs can be synthesized.

-hpa


RE: FW: [RFC] A more general timeout specification

2005-08-31 Thread Perez-Gonzalez, Inaky
>From: Roman Zippel [mailto:[EMAIL PROTECTED]
>On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:
>
>> +flags = tp->clock_id & TIMEOUT_FLAGS_MASK;
>> +clock_id = tp->clock_id & TIMEOUT_CLOCK_MASK;
>> +
>> +result = -EINVAL;
>> +if (flags & ~TIMEOUT_RELATIVE)
>> +goto out;
>> +
>> +/* someday, we should support *all* clocks available to us */
>> +if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
>> +goto out;
>> +if ((unsigned long)tp->ts.tv_nsec >= NSEC_PER_SEC)
>> +goto out;
>
>Why is that needed in a _general_ timeout API? What exactly makes it so
>useful for everyone and not just more complex for everyone?

Because if a system call gets a timeout specification it needs to
verify its correctness first. Instead of doing that at the point
where it goes to sleep, that could be deep in an atomic section,
we provide a separate function [timeout_validate()] which is the
one you mention, to do that.

Usefulness: (see the rationale in the patch), but in a nutshell;
most POSIX timeout specs have to be absolute in CLOCK_REALTIME
(eg: pthread_mutex_timed_lock()). Current kernel needs the timeout
relative, so glibc calls the kernel/however gets the time, computes
relative times and syscalls. Race conditions, overhead...etc. 

This mechanism supports both. That's why it is more general.

-- Inaky
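
For readers without the patch at hand, the validation step quoted above can be sketched as a standalone function. The struct layout and the exact bit positions of the flag and clock masks are assumptions made for this example (the thread only says the highest bit of clock_id marks absolute versus relative); the checks themselves mirror the quoted hunk.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L     /* for CLOCK_MONOTONIC */
#endif
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Assumed encoding: upper bits of clock_id carry flags. */
#define TIMEOUT_RELATIVE    0x80000000u
#define TIMEOUT_FLAGS_MASK  0xf0000000u
#define TIMEOUT_CLOCK_MASK  0x0fffffffu

/* Assumed layout of the timeout specification. */
struct timeout {
    unsigned int clock_id;  /* clock id with flag bits folded in */
    struct timespec ts;
};

/* Returns 0 if the specification is acceptable, -EINVAL otherwise. */
int timeout_validate(const struct timeout *tp)
{
    unsigned int flags = tp->clock_id & TIMEOUT_FLAGS_MASK;
    unsigned int clock_id = tp->clock_id & TIMEOUT_CLOCK_MASK;

    /* Reject any flag other than TIMEOUT_RELATIVE. */
    if (flags & ~TIMEOUT_RELATIVE)
        return -EINVAL;
    /* Only two clocks are accepted in the quoted hunk. */
    if (clock_id != (unsigned int)CLOCK_REALTIME &&
        clock_id != (unsigned int)CLOCK_MONOTONIC)
        return -EINVAL;
    /* The cast also rejects negative tv_nsec values. */
    if ((unsigned long)tp->ts.tv_nsec >= (unsigned long)NSEC_PER_SEC)
        return -EINVAL;
    return 0;
}
```

Doing this once at syscall entry means the code that later goes to sleep, possibly inside an atomic section, can assume the values are already sane.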


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread Jesper Juhl
On 9/1/05, Chris Wedgwood <[EMAIL PROTECTED]> wrote:
> On Thu, Sep 01, 2005 at 12:12:00AM +0200, Jesper Juhl wrote:
> 
> > b) add a new boot option telling the kernel the name of some file in
> > initrd or similar from which to load additional options.
> 
> a file in initrd isn't a good choice; as the initrd is generally a fixed
> image
> 
> the point is some bootloaders might want to pass quite a bit of state
> to the kernel at times (i actually have this for a mip32 target where
> i construct a table and pass a pointer to that in, a tad icky but for
> lack of options)
> 
Well, it wouldn't have to be initrd specifically. Generally what's
needed is *some* way to tell the kernel "please read more options from
location ". The interesting bit is what 's supposed to be.


-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread Chris Wedgwood
On Wed, Aug 31, 2005 at 03:12:58PM -0700, H. Peter Anvin wrote:

> Well, we have initramfs for the really big stuff.  The kernel
> shouldn't really need that much data, though.

except the initrd image is in many cases fairly fixed; right now i
have options i pass into initramfs by passing arguments on the command
line which the initrd then reads, parses and uses to grab a file from
the network

it's a tad disconnected to have to do this though


Re: FW: [RFC] A more general timeout specification

2005-08-31 Thread Roman Zippel
Hi,

On Wed, 31 Aug 2005, Perez-Gonzalez, Inaky wrote:

> + flags = tp->clock_id & TIMEOUT_FLAGS_MASK;
> + clock_id = tp->clock_id & TIMEOUT_CLOCK_MASK;
> +
> + result = -EINVAL;
> + if (flags & ~TIMEOUT_RELATIVE)
> + goto out;
> +
> + /* someday, we should support *all* clocks available to us */
> + if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
> + goto out;
> + if ((unsigned long)tp->ts.tv_nsec >= NSEC_PER_SEC)
> + goto out;

Why is that needed in a _general_ timeout API? What exactly makes it so 
useful for everyone and not just more complex for everyone?
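
For readers following the quoted hunk: it splits one clock_id word into a
flags part and a clock part, then validates each. A minimal user-space sketch
of that check follows. The two mask values, the TIMEOUT_RELATIVE bit and the
struct layout are assumptions made for illustration only; the RFC patch has
its own definitions, which are not shown in this thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <time.h>

/* Assumed values, for illustration only; the RFC patch defines its own. */
#define TIMEOUT_FLAGS_MASK  0xf0000000u
#define TIMEOUT_CLOCK_MASK  0x0fffffffu
#define TIMEOUT_RELATIVE    0x10000000u
#define NSEC_PER_SEC        1000000000UL

/* Hypothetical layout: clock in the low bits, flags in the high bits. */
struct timeout {
	unsigned int clock_id;
	struct timespec ts;
};

/* Returns 0 if the timeout is well formed, -EINVAL otherwise. */
static int timeout_validate(const struct timeout *tp)
{
	unsigned int flags = tp->clock_id & TIMEOUT_FLAGS_MASK;
	int clock_id = (int)(tp->clock_id & TIMEOUT_CLOCK_MASK);

	if (flags & ~TIMEOUT_RELATIVE)		/* unknown flag bits set */
		return -EINVAL;
	if (clock_id != CLOCK_REALTIME && clock_id != CLOCK_MONOTONIC)
		return -EINVAL;
	/* The cast makes a negative tv_nsec look huge, so it is rejected too. */
	if ((unsigned long)tp->ts.tv_nsec >= NSEC_PER_SEC)
		return -EINVAL;
	return 0;
}
```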

bye, Roman


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread Jesper Juhl
On 9/1/05, Chris Wedgwood <[EMAIL PROTECTED]> wrote:
> On Wed, Aug 31, 2005 at 03:01:57PM -0700, H. Peter Anvin wrote:
> 
> > Maybe not.  Another option would simply be to bump it up
> > significantly (2x isn't really that much.)  4096, maybe.
> 
> I wonder if we're not at the point where we need something different
> to what we have now.  The concept of a command-line works for passing
> simple state but for more complex things it's too cumbersome.

How about

a) bump the limit on the cmd line - it's still useful, and 256 really
is quite small for some things.

b) add a new boot option telling the kernel the name of some file in
initrd or similar from which to load additional options.

I don't know if b is feasible at all. It would mean that the kernel
would need to get a hold of the initrd or whatever quite early to be
able to process options from it, but if it's doable somehow it would
be a really neat thing.


-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread Chris Wedgwood
On Wed, Aug 31, 2005 at 03:01:57PM -0700, H. Peter Anvin wrote:

> Maybe not.  Another option would simply be to bump it up
> significantly (2x isn't really that much.)  4096, maybe.

I wonder if we're not at the point where we need something different
to what we have now.  The concept of a command-line works for passing
simple state but for more complex things it's too cumbersome.


Re: FW: [RFC] A more general timeout specification

2005-08-31 Thread Christopher Friesen

Joe Korty wrote:

> The returned timeout struct has a bit used to mark the value as absolute.  Thus
> the caller treats the returned timeout as an opaque cookie that can be
> reapplied to the next (or more likely, the to-be restarted) timeout.


Okay, endtime is always absolute value of when it should have expired. 
But I think I see a problem with the opaque cookie scheme and repeating 
timeouts.


Suppose I want to wake my application at INTERVAL nanoseconds from now 
on the MONOTONIC clock, then again every INTERVAL nanoseconds after that.


How do I do that with this API?

I can get the first sleep.  Suppose I oversleep by X nanoseconds.  I 
wake, and get an opaque timeout back.  How do I ask for the new wake 
time to be "endtime + INTERVAL"?
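
For comparison, with the existing POSIX interface the pattern being asked for
looks roughly like the sketch below. It uses clock_nanosleep() with
TIMER_ABSTIME, not the timeout_sleep() call proposed in the RFC, and
deadline_advance() and periodic_wait() are hypothetical names. The interval is
assumed to be non-negative.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute deadline by interval_ns, keeping tv_nsec normalized. */
static void deadline_advance(struct timespec *deadline, long interval_ns)
{
	deadline->tv_sec += interval_ns / NSEC_PER_SEC;
	deadline->tv_nsec += interval_ns % NSEC_PER_SEC;
	if (deadline->tv_nsec >= NSEC_PER_SEC) {
		deadline->tv_nsec -= NSEC_PER_SEC;
		deadline->tv_sec++;
	}
}

/*
 * Sleep until the next period boundary.  *deadline holds the previous
 * absolute wake time; it is advanced by interval_ns before sleeping, so
 * oversleeping in one period does not shift the following ones.
 * Returns 0 on success or the positive error number from clock_nanosleep().
 */
static int periodic_wait(struct timespec *deadline, long interval_ns)
{
	int err;

	deadline_advance(deadline, interval_ns);
	do {
		err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
				      deadline, NULL);
	} while (err == EINTR);	/* restart with the same absolute deadline */
	return err;
}
```

The caller reads CLOCK_MONOTONIC once with clock_gettime() to seed the
deadline and then calls periodic_wait() in a loop; the new wake time is always
"previous endtime + INTERVAL", never "now + INTERVAL".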


Chris


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread H. Peter Anvin

Chris Wedgwood wrote:
> On Wed, Aug 31, 2005 at 02:29:44PM -0700, H. Peter Anvin wrote:
>
> > I think someone on the SYSLINUX mailing list already sent a patch to
> > akpm to make 512 the default; making it configurable would be a
> > better idea.  Feel free to send your patch through me.
>
> So we really need this to be a configuration option?  We have too many
> of those already.

Maybe not.  Another option would simply be to bump it up significantly
(2x isn't really that much.)  4096, maybe.


-hpa


[-mm patch] relayfs: relayfs_remove fix

2005-08-31 Thread Tom Zanussi
This patch makes relayfs_remove use simple_rmdir for removing
directories instead of simple_unlink.  Thanks to Nathan Scott for the
original patch.

Andrew, please apply.

Thanks,

Tom

Signed-off-by: Tom Zanussi <[EMAIL PROTECTED]>

diff -urpN -X dontdiff linux-2.6.13-rc6-mm2/fs/relayfs/inode.c linux-2.6.13-rc6-mm2-cur/fs/relayfs/inode.c
--- linux-2.6.13-rc6-mm2/fs/relayfs/inode.c	2005-08-25 18:21:31.0 -0500
+++ linux-2.6.13-rc6-mm2-cur/fs/relayfs/inode.c	2005-08-31 17:11:12.0 -0500
@@ -189,26 +189,39 @@ struct dentry *relayfs_create_dir(const 
 /**
  * relayfs_remove - remove a file or directory in the relay filesystem
  * @dentry: file or directory dentry
+ *
+ * Returns 0 if successful, negative otherwise.
  */
 int relayfs_remove(struct dentry *dentry)
 {
-	struct dentry *parent = dentry->d_parent;
+	struct dentry *parent;
+	int error = 0;
+
+	if (!dentry)
+		return -EINVAL;
+	parent = dentry->d_parent;
 	if (!parent)
 		return -EINVAL;
 
 	parent = dget(parent);
 	down(&parent->d_inode->i_sem);
 	if (dentry->d_inode) {
-		simple_unlink(parent->d_inode, dentry);
-		d_delete(dentry);
+		if (S_ISDIR(dentry->d_inode->i_mode))
+			error = simple_rmdir(parent->d_inode, dentry);
+		else
+			error = simple_unlink(parent->d_inode, dentry);
+		if (!error)
+			d_delete(dentry);
 	}
-	dput(dentry);
+	if (!error)
+		dput(dentry);
 	up(&parent->d_inode->i_sem);
 	dput(parent);
 
-	simple_release_fs(&relayfs_mount, &relayfs_mount_count);
+	if (!error)
+		simple_release_fs(&relayfs_mount, &relayfs_mount_count);
 
-	return 0;
+	return error;
 }
 
 /**
@@ -219,9 +232,6 @@ int relayfs_remove(struct dentry *dentry
  */
 int relayfs_remove_dir(struct dentry *dentry)
 {
-	if (!dentry)
-		return -EINVAL;
-
 	return relayfs_remove(dentry);
 }
 
 




Re: Where is the performance bottleneck?

2005-08-31 Thread Holger Kiehl

On Thu, 1 Sep 2005, Nick Piggin wrote:

> Holger Kiehl wrote:
>
> > meminfo.dump:
> >
> >    MemTotal:      8124172 kB
> >    MemFree:         23564 kB
> >    Buffers:       7825944 kB
> >    Cached:          19216 kB
> >    SwapCached:          0 kB
> >    Active:          25708 kB
> >    Inactive:      7835548 kB
> >    HighTotal:           0 kB
> >    HighFree:            0 kB
> >    LowTotal:      8124172 kB
> >    LowFree:         23564 kB
> >    SwapTotal:    15631160 kB
> >    SwapFree:     15631160 kB
> >    Dirty:         3145604 kB
>
> Hmm OK, dirty memory is pinned pretty much exactly on dirty_ratio
> so maybe I've just led you on a goose chase.
>
> You could
>    echo 5 > /proc/sys/vm/dirty_background_ratio
>    echo 10 > /proc/sys/vm/dirty_ratio
>
> To further reduce dirty memory in the system, however this is
> a long shot, so please continue your interaction with the
> other people in the thread first.


Yes, this does make a difference, here are the results of running

  dd if=/dev/full of=/dev/sd?1 bs=4M count=4883

on 8 disks at the same time:

  34.273340
  33.938829
  33.598469
  32.970575
  32.841351
  32.723988
  31.559880
  29.778112

That's 32.710568 MB/s on average per disk with your change and without
it it was 24.958557 MB/s on average per disk.
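
As a cross-check of the arithmetic, a minimal sketch in C that recomputes the
per-disk average from the eight figures listed above and the relative gain
over the earlier run. mean() and relative_gain() are hypothetical helper
names, not part of any tool used in this thread.

```c
#include <stddef.h>

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return n ? sum / (double)n : 0.0;
}

/* Per-disk dd throughput in MB/s with the lowered dirty ratios. */
static const double with_change[] = {
	34.273340, 33.938829, 33.598469, 32.970575,
	32.841351, 32.723988, 31.559880, 29.778112,
};

/* Relative gain of `after` over `before`, e.g. 0.25 means +25%. */
static double relative_gain(double before, double after)
{
	return after / before - 1.0;
}
```

Going from 24.958557 MB/s to 32.710568 MB/s per disk is a gain of roughly 31%.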

I will do more tests tomorrow.

Thanks,
Holger



Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread Chris Wedgwood
On Wed, Aug 31, 2005 at 02:29:44PM -0700, H. Peter Anvin wrote:

> I think someone on the SYSLINUX mailing list already sent a patch to
> akpm to make 512 the default; making it configurable would be a
> better idea.  Feel free to send your patch through me.

So we really need this to be a configuration option?  We have too many
of those already.


[PATCH] Fix kprobes handling of simultaneous probe hit/unregister

2005-08-31 Thread Jim Keniston
This patch fixes a bug in kprobes's handling of a corner case on i386
and x86_64.  On an SMP system, if one CPU unregisters a kprobe just
after another CPU hits that probepoint, kprobe_handler() on the latter
CPU sees that the kprobe has been unregistered, and attempts to let the
CPU continue as if the probepoint hadn't been hit.  The bug is that on
i386 and x86_64, we were neglecting to set the IP back to the beginning
of the probed instruction.  This could cause an oops or crash.
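
The arithmetic behind the fix, as a minimal user-space sketch rather than
kernel code: int3 is a one-byte trap, so the saved instruction pointer ends up
one opcode past the probed address. The typedef below assumes that
kprobe_opcode_t is a single byte on i386 and x86_64, and probed_insn_addr() is
a hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/* Assumption: the breakpoint is the one-byte int3 opcode (0xCC). */
typedef uint8_t kprobe_opcode_t;

/*
 * int3 is a trap: the saved instruction pointer points just past the
 * breakpoint byte.  Given that saved IP, return the address of the probed
 * instruction, which is where execution has to resume once the int3 has
 * been removed by the unregistering CPU.
 */
static uintptr_t probed_insn_addr(uintptr_t saved_ip)
{
	return saved_ip - sizeof(kprobe_opcode_t);
}
```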

This bug doesn't exist on ppc64 and ia64, where a breakpoint
instruction leaves the IP pointing to the beginning of the instruction.
I don't know about sparc64.  (Dave, could you please advise?)

This fix has been tested on i386 and x86_64 SMP systems.  To reproduce
the problem, set one CPU to work registering and unregistering a kprobe
repeatedly, and another CPU pounding the probepoint in a tight loop.

Please apply.

Acked-by: Prasanna S Panchamukhi <[EMAIL PROTECTED]>
Signed-off-by: Jim Keniston <[EMAIL PROTECTED]>
--- linux-2.6.13/arch/i386/kernel/kprobes.c 2005-08-30 12:27:35.0 -0700
+++ linux-fixed/arch/i386/kernel/kprobes.c  2005-08-30 15:33:03.0 -0700
@@ -220,7 +220,10 @@
 * either a probepoint or a debugger breakpoint
 * at this address.  In either case, no further
 * handling of this interrupt is appropriate.
+* Back up over the (now missing) int3 and run
+* the original instruction.
 */
+   regs->eip -= sizeof(kprobe_opcode_t);
ret = 1;
}
/* Not one of ours: let kernel handle it */
--- linux-2.6.13/arch/x86_64/kernel/kprobes.c   2005-08-30 12:27:35.0 -0700
+++ linux-fixed/arch/x86_64/kernel/kprobes.c    2005-08-30 15:32:31.0 -0700
@@ -360,7 +360,10 @@
 * either a probepoint or a debugger breakpoint
 * at this address.  In either case, no further
 * handling of this interrupt is appropriate.
+* Back up over the (now missing) int3 and run
+* the original instruction.
 */
+   regs->rip = (unsigned long)addr;
ret = 1;
}
/* Not one of ours: let kernel handle it */


[PATCH 1/2] Whitespace cleanup in pageattr.c

2005-08-31 Thread Zachary Amsden
This highly technical change allows the kernel to jump atop the Eiffel Tower,
fly with acceleration fifty times that of a space shuttle, and ingest 15 times
its own weight.

Patch-subject: Whitespace cleanup in pageattr.c
Depends-on: add-pgtable-allocation-notifiers
Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
Index: linux-2.6.13/arch/i386/mm/pageattr.c
===================================================================
--- linux-2.6.13.orig/arch/i386/mm/pageattr.c   2005-08-31 14:41:45.0 -0700
+++ linux-2.6.13/arch/i386/mm/pageattr.c    2005-08-31 14:41:49.0 -0700
@@ -33,7 +33,7 @@ pte_t *lookup_address(unsigned long addr
return NULL;
if (pmd_large(*pmd))
return (pte_t *)pmd;
-return pte_offset_kernel(pmd, address);
+   return pte_offset_kernel(pmd, address);
 } 
 
 static struct page *split_large_page(unsigned long address, pgprot_t prot)
@@ -54,8 +54,8 @@ static struct page *split_large_page(uns
pbase = (pte_t *)page_address(base);
SetPagePTE(virt_to_page(pbase));
for (i = 0; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE) {
-   set_pte(&pbase[i], pfn_pte(addr >> PAGE_SHIFT,
-  addr == address ? prot : PAGE_KERNEL));
+   set_pte(&pbase[i], pfn_pte(addr >> PAGE_SHIFT,
+   addr == address ? prot : PAGE_KERNEL));
}
return base;
 } 


[PATCH 2/2] Use page present for pae pdpes

2005-08-31 Thread Zachary Amsden
Ok, the use of "1 + " and subtraction of one for PAE PDPEs has confused
many people now.  Make it explicit what is going on and why anding with
PAGE_MASK is a better idea to strip these bits.
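
A small user-space sketch of the bit arithmetic, assuming 4 KB pages and the
present flag in bit 0. PAGE_PRESENT_BIT stands in for the kernel's
_PAGE_PRESENT, and make_pdpe() and pdpe_to_phys() are hypothetical helpers,
not kernel functions.

```c
#include <stdint.h>

#define PAGE_SHIFT        12
#define PAGE_SIZE         (UINT64_C(1) << PAGE_SHIFT)
#define PAGE_MASK         (~(PAGE_SIZE - 1))
#define PAGE_PRESENT_BIT  UINT64_C(0x001)	/* stand-in for _PAGE_PRESENT */

/* Build a PDPE value from a page-aligned physical address. */
static uint64_t make_pdpe(uint64_t pmd_phys)
{
	return PAGE_PRESENT_BIT | pmd_phys;
}

/* Recover the physical address by masking off all low flag bits. */
static uint64_t pdpe_to_phys(uint64_t pdpe)
{
	return pdpe & PAGE_MASK;
}
```

For a page-aligned address, OR-ing in the flag gives the same value as adding
1, but subtracting 1 only recovers the address if no other flag bit is set,
whereas masking always does.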

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
Depends-on: add-pgtable-allocation-notifiers
Index: linux-2.6.13/arch/i386/mm/pgtable.c
===================================================================
--- linux-2.6.13.orig/arch/i386/mm/pgtable.c	2005-08-31 14:48:17.0 -0700
+++ linux-2.6.13/arch/i386/mm/pgtable.c	2005-08-31 14:48:53.0 -0700
@@ -247,14 +247,14 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 		if (!pmd)
 			goto out_oom;
 		SetPagePDE(virt_to_page(pmd));
-		set_pgd(&pgd[i], __pgd(1 + __pa(pmd)));
+		set_pgd(&pgd[i], __pgd(_PAGE_PRESENT | __pa(pmd)));
 	}
 	return pgd;
 
 out_oom:
 	for (i--; i >= 0; i--) {
 		ClearPagePDE(pfn_to_page(pgd_val(pgd[i]) >> PAGE_SHIFT));
-		kmem_cache_free(pmd_cache, (void *)__va(pgd_val(pgd[i])-1));
+		kmem_cache_free(pmd_cache, (void *)__va(pgd_val(pgd[i]) & PAGE_MASK));
 	}
 	kmem_cache_free(pgd_cache, pgd);
 	return NULL;
@@ -268,7 +268,7 @@ void pgd_free(pgd_t *pgd)
 	if (PTRS_PER_PMD > 1)
 		for (i = 0; i < USER_PTRS_PER_PGD; ++i) {
 			ClearPagePDE(pfn_to_page(pgd_val(pgd[i]) >> PAGE_SHIFT));
-			kmem_cache_free(pmd_cache, (void *)__va(pgd_val(pgd[i])-1));
+			kmem_cache_free(pmd_cache, (void *)__va(pgd_val(pgd[i]) & PAGE_MASK));
 		}
 	/* in the non-PAE case, free_pgtables() clears user pgd entries */
 	kmem_cache_free(pgd_cache, pgd);
Index: linux-2.6.13/arch/i386/mm/init.c
===================================================================
--- linux-2.6.13.orig/arch/i386/mm/init.c	2005-08-31 14:48:17.0 -0700
+++ linux-2.6.13/arch/i386/mm/init.c	2005-08-31 14:48:53.0 -0700
@@ -387,7 +387,7 @@ void zap_low_mappings (void)
 	 */
 	for (i = 0; i < USER_PTRS_PER_PGD; i++)
 #ifdef CONFIG_X86_PAE
-		set_pgd(swapper_pg_dir+i, __pgd(1 + __pa(empty_zero_page)));
+		set_pgd(swapper_pg_dir+i, __pgd(_PAGE_PRESENT | __pa(empty_zero_page)));
 #else
 		set_pgd(swapper_pg_dir+i, __pgd(0));
 #endif


Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread Jose Luis Domingo Lopez
On Wednesday, 31 August 2005, at 11:27:41 -0600,
Jeff V. Merkey wrote:

> I am very open to discussions of this. Please go ahead and argue the 
> merits of GPL vs. proprietary code. DSFS is platform
> neutral and will also run on Windows XP/2000/2003/Longhorn and Free BSD. 
> It uses no kernel headers or kernel files.
> 
So then, does it have _anything_ to do with linux kernel development? It
doesn't seem so. Is this "product" an attempt to raise some money, and
make your former "linux kernel buyout" offer, but now giving a higher
amount of money?

Damnit, hope I am not feeding some troll out there...

-- 
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Sid (Linux 2.6.13)





[PATCH 0/2] Trivial cleanups to virtualization tree

2005-08-31 Thread Zachary Amsden
Not very much of importance here, but the idea for these cleanups
came along during discussion of my last set of patches with Chris
Wright.

One cleans up whitespace, another improves understandability of
the mysterious +/- 1's in the page table init code.

Zachary Amsden <[EMAIL PROTECTED]>


Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread Jeff V. Merkey


NOTE! This copyright does *not* cover user programs that use kernel
services by normal system calls - this is merely considered normal use
of the kernel, and does *not* fall under the heading of "derived work".
Also note that the GPL below is copyrighted by the Free Software
Foundation, but the instance of code that it refers to (the linux
kernel) is copyrighted by me and others who actually wrote it.

Linus Torvalds




Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread Jeff V. Merkey

[EMAIL PROTECTED] wrote:

> On Wed, 31 Aug 2005 12:00:45 MDT, "Jeff V. Merkey" said:
>
> > There's also a more fundamental problem with the GPL language.  The GPL
> > stated it confers "RIGHT TO COPY".  This is not the same as "RIGHT TO GRANT
> > LICENSES TO DISTRIBUTE."  Under US copyright law, if you confer to any person
> > the "right to copy" in a license which states the software is FREE, you have
> > in essence effected a copyright transfer to each and every person who
> > receives the code.
>
> Bullshit.
>
> 17 USC 106(3) talks about transfer of ownership *of the item*, not of the
> copyright itself (see 17 USC 202, which clarifies this).  So you can sell a
> book - but that isn't transferring the copyright of the book.  There isn't any
> actual transfer without a document that actually *SAYS* "transfer of copyright" -
> see 17 USC 204 (a) (Note that there's whole companies in Utah, with actual
> large legal teams, that seem unclear on the concept in 17 USC 204(a), so I'm
> not surprised that you're confused on this as well).

I have responded all I am going to on this topic.  Further discussion
will not be helpful.  The patches are provided IAW the GPL.  Our
proprietary application is just like the thousands of others provided on
Linux, and it does not use or incorporate any GPL or Linux code.

I will not respond to any further discussion on this thread.  Thanks for
the input.  Please feel free to read Linus' statements on kernel.org
saying that applications that run on Linux and that use published
interfaces are unaffected by the GPL.

Thanks for your input.

Jeff



Re: FW: [RFC] A more general timeout specification

2005-08-31 Thread Joe Korty
On Wed, Aug 31, 2005 at 03:20:03PM -0600, Christopher Friesen wrote:
> Perez-Gonzalez, Inaky wrote:
> >In this structure,
> >the user specifies:
> >whether the time is absolute, or relative to 'now'.
> 
> 
> >Timeout_sleep has a return argument, endtime, which is also in
> >'struct timeout' format.  If the input time was relative, then
> >it is converted to absolute and returned through this argument.
> 
> Wouldn't it make more sense for the endtime to be returned in the same 
> format (relative/absolute) as the original timer was specified?  That 
> way an application can set a new timer for "timeout + SLEEPTIME" and on 
> average it will be reasonably accurate.
> 
> In the proposed method, for endtime to be useful the app needs to check 
> the current time, compare with the endtime, and figure out the delta. 
> If you're going to force the app to do all that work anyway, the app may 
> as well use absolute times.
> 
> Chris

The returned timeout struct has a bit used to mark the value as absolute.  Thus
the caller treats the returned timeout as an opaque cookie that can be
reapplied to the next (or more likely, the to-be restarted) timeout.

A general principle is, once a time has been converted to absolute, it
should never be converted back to relative time.  To do so means the
end-time starts to drift from the original end-time.
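
To put numbers on the drift, here is a toy model in C, not the proposed API:
each wakeup is assumed to be late by a fixed oversleep, and the two functions
(hypothetical names) compare re-arming with a relative timeout against
re-arming from the previous absolute end-time.

```c
#include <stdint.h>

/*
 * Model `periods` wakeups of `interval` ns, each one late by `oversleep` ns.
 * Both functions return the time of the last wakeup, starting from t = 0.
 */

/* Re-arm with a relative timeout measured from the (late) wakeup. */
static int64_t last_wakeup_relative(int periods, int64_t interval,
				    int64_t oversleep)
{
	int64_t now = 0;
	int i;

	for (i = 0; i < periods; i++)
		now += interval + oversleep;	/* lateness accumulates */
	return now;
}

/* Re-arm with an absolute end-time derived from the previous end-time. */
static int64_t last_wakeup_absolute(int periods, int64_t interval,
				    int64_t oversleep)
{
	int64_t endtime = 0, now = 0;
	int i;

	for (i = 0; i < periods; i++) {
		endtime += interval;		/* the schedule never moves */
		now = endtime + oversleep;	/* each wakeup is late only once */
	}
	return now;
}
```

With 1000 periods of 1 ms and 50 ns of oversleep, the relative scheme ends
50 us behind schedule while the absolute scheme is still only 50 ns late.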

Regards,
Joe
--
"Money can buy bandwidth, but latency is forever" -- John Mashey


Re: THE LINUX/I386 BOOT PROTOCOL - Breaking the 256 limit

2005-08-31 Thread H. Peter Anvin

Alon Bar-Lev wrote:

> Hello Peter,
>
> I am sorry that I am contacting you directly... Please refer me to
> correct contact if you are not the one.
>
> Lately, I've found that 256 bytes long kernel parameters are not enough
> for my configuration.
>
> I've found the place where the kernel defines the length, I've actually
> found it in two places... I cannot understand why...
>
> include/asm-i386/param.h: #define COMMAND_LINE_SIZE 256
> include/asm-i386/setup.h: #define COMMAND_LINE_SIZE 256
>
> Now... I've added an entry in the kernel configuration menu so that I
> can define these constants using menuconfig.
>
> I was quite happy...
>
> But then I've got into a discussion with grub's development team...
>
> From what I've read in the Documentation/i386/boot.txt I understood
> that if I use boot protocol 2.02+ there should be no reason for 256 byte
> limitation on the string pointed by the cmd_line_ptr, so I guessed they
> will deliver the command-line twice once for the old protocol truncated,
> and once for the new protocol not truncated.
>
> Grub and Lilo approach is to point the cmd_line_ptr to the old
> protocol's command line, thus truncating it to 256.
>
> I'm just wondering... Can the 256 limit be broken, without modifying the
> boot protocol?
>
> I think it can... But I need a formal answer so I can push it forward.



Yes, it can.  Several people on the SYSLINUX mailing list have tried 
this, and it works just fine.  The current version of SYSLINUX has a 
limit of 511 characters (because of memory management reasons inside 
SYSLINUX) instead of 255 (plus null).


I think someone on the SYSLINUX mailing list already sent a patch to 
akpm to make 512 the default; making it configurable would be a better 
idea.  Feel free to send your patch through me.
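
To make the "255 (plus null)" arithmetic concrete, a minimal sketch of a
truncating copy into a COMMAND_LINE_SIZE buffer. This is not the kernel's or
any bootloader's code; copy_cmdline() is a hypothetical helper written only
for illustration.

```c
#include <stddef.h>
#include <string.h>

#define COMMAND_LINE_SIZE 256	/* value quoted above from the i386 headers */

/*
 * Copy a boot command line into a fixed buffer of `size` bytes, truncating
 * if needed and always null-terminating.  Returns the number of characters
 * that were dropped (0 if everything fit).
 */
static size_t copy_cmdline(char *dst, size_t size, const char *src)
{
	size_t len = strlen(src);
	size_t keep;

	if (size == 0)
		return len;
	keep = len < size - 1 ? len : size - 1;	/* leave room for the null */
	memcpy(dst, src, keep);
	dst[keep] = '\0';
	return len - keep;
}
```

With a 256-byte buffer, a 300-character command line keeps its first 255
characters and silently loses the last 45, which is the failure mode
discussed in this thread.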


-hpa



2.4.27 and the ata_piix (Intel SATA) driver

2005-08-31 Thread Larry Lindsey
I'm trying to backport ata_piix SATA support to the 2.4.27 kernel.  I
found some patches at the jgarzik people page, but they didn't work. 
I'm building a 2.4.27 kernel with the drivers/scsi from 2.4.29.  Is
there anything else that I should do?  Should I be worried about any
issues?

Thanks,
Larry

PS. Please CC responses to [EMAIL PROTECTED]


Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-08-31 Thread Valdis . Kletnieks
On Wed, 31 Aug 2005 12:00:45 MDT, "Jeff V. Merkey" said:

> There's also a more fundamental problem with the GPL language.  The GPL
> stated it confers "RIGHT TO COPY".  This is not the same as "RIGHT TO GRANT
> LICENSES TO DISTRIBUTE."  Under US copyright law, if you confer to any person
> the "right to copy" in a license which states the software is FREE, you have
> in essence effected a copyright transfer to each and every person who
> receives the code.

Bullshit.

17 USC 106(3) talks about transfer of ownership *of the item*, not of the
copyright itself (see 17 USC 202, which clarifies this).  So you can sell a
book - but that isn't transferring the copyright of the book.  There isn't any
actual transfer without a document that actually *SAYS* "transfer of copyright" -
see 17 USC 204 (a) (Note that there's whole companies in Utah, with actual
large legal teams, that seem unclear on the concept in 17 USC 204(a), so I'm
not surprised that you're confused on this as well).





Re: [patch 04/16] I/O driver for 8250-compatible UARTs

2005-08-31 Thread Tom Rini
On Wed, Aug 31, 2005 at 10:03:34PM +0100, Russell King wrote:
> On Wed, Aug 31, 2005 at 01:10:39PM -0700, Tom Rini wrote:
> > On Wed, Aug 31, 2005 at 01:38:52PM -0600, Bjorn Helgaas wrote:
> > > On Monday 29 August 2005 10:09 am, Tom Rini wrote:
> > > >  linux-2.6.13-trini/drivers/serial/kgdb_8250.c  |  594 +
> > > 
> > > The existing stuff in drivers/serial is named "8250_*"; is
> > > there a reason you're using "kgdb_8250" rather than "8250_kgdb"?
> > 
> > All the other kgdb stuff tends to be prefixed, not suffixed.  But I
> > don't really care either way.
> 
> I'd prefer it was 8250_kgdb.c actually - that keeps it along side the
> other 8250 files.

Will do.

> > > > +   switch (CURRENTPORT.iotype) {
> > > > +   case UPIO_MEM:
> > > > +   if (CURRENTPORT.mapbase)
> > > > +   kgdb8250_needs_request_mem_region = 1;
> > > > +   if (CURRENTPORT.flags & UPF_IOREMAP) {
> > > > +   CURRENTPORT.membase = ioport_map(CURRENTPORT.mapbase,
> > > > +   8 << KGDB8250_REG_SHIFT);
> > > 
> > > Shouldn't this be ioremap instead of ioport_map?
> > 
> > If I remember right from the testing, no.  Or if my memory is wrong and
> > that's rhetorical, sure.
> 
> ioport_map() is supposed to be used to map the IO range for the ioread/
> iowrite operations.  IOW, it takes something compatible with inb() and
> friends and converts it to something compatible with ioread8() and
> friends.
> 
> It does not take a MMIO cookie, so the code above appears to be
> conceptually wrong.
> 

So it's luck (or another mapping I didn't see elsewhere) that this
worked, and it should still be ioremap(...) to use with ioread/write8
later on in the code?

-- 
Tom Rini
http://gate.crashing.org/~trini/


Re: KLive: Linux Kernel Live Usage Monitor

2005-08-31 Thread Andrea Arcangeli
On Wed, Aug 31, 2005 at 04:28:59PM +0200, Sven Ladegast wrote:
> Why not generate a unique system ID at the compilation stage of the kernel
> if the appropriate kernel option is enabled? This needn't have anything to
> do with klive...just a unique kernel-ID or something like that.

I could also store a unique ID on disk without involving the kernel, if
all you want is to track a single computer. But I didn't want to track a
single computer. The main reason there is an "host" (as md5 of the IP)
is to give more values to info coming from different IP (assuming not
everyone is out there to confuse data). But it's not really about
tracking.

However I like the idea of uploading the `lspci -v` output since it
could be useful to know about really good hardware and drivers.

About the cookie, I'm skeptical about the need for it, because it wouldn't
be secure anyway (there's no way for me to verify that the pci-ids are
the real ones that are in the computer, so any notion of security is
quite pointless here). If anything, we need an ack that the packet was
not lost and that we should keep sending the pciids in the next
packet too.

The only reason to use ssl would be to hide the pci-ids on the network
transfer (not really to make the cookie secure).

BTW, in the meantime I wrote a completely generic installer (this
is not an rpm/deb kind of installer, it's a quick and dirty approach, but it
should run on all distros and on all archs):

wget http://klive.cpushare.com/install.sh
sh install.sh --install

that will make it persistent. It goes into /var/tmp/klive-*

to uninstall it *completely*:

sh install.sh --uninstall
rm install.sh

You don't need root for the above (in fact I never tested it as root, but
it should work as root too ;).

Please let me know if there are problems with the quick and dirty
installer (I finished it a few minutes ago), thanks!


Re: FW: [RFC] A more general timeout specification

2005-08-31 Thread Christopher Friesen

Perez-Gonzalez, Inaky wrote:
> In this structure,
> the user specifies:
> whether the time is absolute, or relative to 'now'.

> Timeout_sleep has a return argument, endtime, which is also in
> 'struct timeout' format.  If the input time was relative, then
> it is converted to absolute and returned through this argument.

Wouldn't it make more sense for the endtime to be returned in the same
format (relative/absolute) as the original timer was specified?  That
way an application can set a new timer for "timeout + SLEEPTIME" and on
average it will be reasonably accurate.

In the proposed method, for endtime to be useful the app needs to check
the current time, compare with the endtime, and figure out the delta.
If you're going to force the app to do all that work anyway, the app may
as well use absolute times.


Chris


Re: [patch 04/16] I/O driver for 8250-compatible UARTs

2005-08-31 Thread Bjorn Helgaas
On Wednesday 31 August 2005 2:10 pm, Tom Rini wrote:
> On Wed, Aug 31, 2005 at 01:38:52PM -0600, Bjorn Helgaas wrote:
> > On Monday 29 August 2005 10:09 am, Tom Rini wrote:
> I've tried intentionally to not mention 'ttyS' anywhere (exposed to the
> user) because it's really not 'ttySN' but it is the port registered to
> us.

So kgdb's port N is different from ttySN?  That sounds really
confusing.  And KGDB_SIMPLE_SERIAL does mention "ttyS".

> There's really two cases we have to deal with.  The first case is a
> known at compile time or can be registered at boot-time easily port (ie
> dumb old PC or ARM boards).  The second case is "serial port over
> there".  Perhaps we should change the kgdb8250 arg to be an override of
> the default port, so:
> kgdb8250={io,mmio},,,

That makes sense.  But I'd make it {io,mmio},,,
so it's more like the existing "console=uart" argument.

> > > + switch (CURRENTPORT.iotype) {
> > > + case UPIO_MEM:
> > > + if (CURRENTPORT.mapbase)
> > > + kgdb8250_needs_request_mem_region = 1;
> > > + if (CURRENTPORT.flags & UPF_IOREMAP) {
> > > + CURRENTPORT.membase = ioport_map(CURRENTPORT.mapbase,
> > > +   8 << KGDB8250_REG_SHIFT);
> > 
> > Shouldn't this be ioremap instead of ioport_map?
> 
> If I remember right from the testing, no.  Or if my memory is wrong and
> that's rhetorical, sure.

ioport_map() certainly isn't going to do anything good with an MMIO
address.

> > > + printk(KERN_ERR "kgdb8250: argument error, usage: "
> > > +"kgdb8250=,");
> > > +#ifdef CONFIG_IA64
> > > + printk(",,");
> > > +#endif
> > 
> > This isn't ia64-specific.
> 
> It is and it isn't.  Since no one's tried a PCI card uart for KGDB nor
> had a case where we have to pass in the mmio addr except on ia64, it is
> ia64-specific.

Maybe it's only been *tested* on ia64, but I don't think that's a
reason to make it compiled only on ia64.

> > > + * Syntax for this cmdline option is "kgdb8250=ttyno,baudrate"
> > > + * with ",irq,iomembase" tacked on the end on IA64.
> > 
> > This syntax doesn't really make sense on ia64, because there are
> > no fixed "ttyno/iomembase" mappings.  It would be unambiguous to
> > specify either ttyno OR iomembase, but there's no good way to use
> > both.
> 
> It's true that ttyno isn't really useful on ia64.

Then I think it would be a mistake to have syntax that requires both
ttyno and iomembase in the same command-line option.  I'm visualizing
something like this:

kgdb8250=ttyS0,115200
kgdb8250=io,0x3f8,115200,49
kgdb8250=mmio,0xff5e,115200,49

where you can easily decide which type of device specification
you've got.
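
As a rough illustration of why the leading token makes the forms easy to tell apart, here is a userspace sketch of a parser for the three example strings above.  It is not the kgdb8250 code from the patch; the names kgdb_dev_spec, kgdb8250_parse() and parse_field() are invented for this example, and the field order (address, baud, irq) is simply taken from the examples.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum kgdb_dev_kind { KGDB_DEV_TTY, KGDB_DEV_IO, KGDB_DEV_MMIO };

/* Hypothetical result structure, for illustration only. */
struct kgdb_dev_spec {
    enum kgdb_dev_kind kind;
    unsigned long addr_or_line; /* ttyS index, I/O port or MMIO address */
    unsigned long baud;
    long irq;                   /* -1 when not given (ttyS form) */
};

/* Parse one unsigned number that must be followed by 'term'. */
static int parse_field(const char **s, int base, char term,
                       unsigned long *out)
{
    char *end;

    if (!isdigit((unsigned char)**s))
        return -1;
    *out = strtoul(*s, &end, base);
    if (end == *s || *end != term)
        return -1;
    *s = (term == '\0') ? end : end + 1;
    return 0;
}

/* Returns 0 on success, -1 if the string matches none of the forms. */
static int kgdb8250_parse(const char *arg, struct kgdb_dev_spec *spec)
{
    unsigned long irq;

    assert(arg && spec);

    /* Form 1: ttyS<line>,<baud> */
    if (strncmp(arg, "ttyS", 4) == 0) {
        arg += 4;
        spec->kind = KGDB_DEV_TTY;
        spec->irq = -1;
        if (parse_field(&arg, 10, ',', &spec->addr_or_line))
            return -1;
        return parse_field(&arg, 10, '\0', &spec->baud);
    }

    /* Forms 2 and 3: {io,mmio},<addr>,<baud>,<irq> */
    if (strncmp(arg, "io,", 3) == 0) {
        spec->kind = KGDB_DEV_IO;
        arg += 3;
    } else if (strncmp(arg, "mmio,", 5) == 0) {
        spec->kind = KGDB_DEV_MMIO;
        arg += 5;
    } else {
        return -1;
    }

    /* Base 0 lets strtoul() accept the 0x prefix on the address. */
    if (parse_field(&arg, 0, ',', &spec->addr_or_line) ||
        parse_field(&arg, 10, ',', &spec->baud) ||
        parse_field(&arg, 10, '\0', &irq))
        return -1;
    spec->irq = (long)irq;
    return 0;
}
```

The first token alone decides which fields follow, so ttyno and an explicit address never have to appear in the same option.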

> > > +config KGDB_SIMPLE_SERIAL
> > > + bool "Simple selection of KGDB serial port"
> > > + depends on KGDB_8250
> > > + default y
> > > + help
> > > +   If you say Y here, you will only have to pick the baud rate
> > > +   and serial port (ttyS) that you wish to use for KGDB.  If you
> > > +   say N, you will have provide the I/O port and IRQ number.  Note
> > > +   that if your serial ports are iomapped, such as on ia64, then
> > > +   you must say Y here.  If in doubt, say Y.
> > 
> > How about: "... you will have to provide the address (I/O port or MMIO
> > address) and IRQ ..."
> 
> I'd really rather not force everyone to pass in the address token and
> IRQ#.

My point was merely that you should support explicit MMIO addresses
as well as explicit I/O ports.  I guess it makes sense to accept
ttyS names as well, and accept the limitation that before the 8250
driver initializes, ttyS names only work for devices defined at
compile-time in SERIAL_PORT_DFNS.

> > I don't understand the "iomapped" bit -- does that mean MMIO?  And why
> > would it make any difference whether they're in I/O port or MMIO space?
> 
> It's the special "boot once, figure out your I/O address and IRQ, reboot
> and pass it in" case of IA64.  I'm under the impression that we play
> that game because ia64 is more dynamic than other arches, so we couldn't
> just register the ports with KGDB as we find them and let the user pick
> from a pre-registered port.

Yup, ia64 doesn't require serial ports at fixed addresses.  They're all
discovered via ACPI and PCI enumeration.

But "iomapped" doesn't suggest that to me.  And I would expect the
text to say that if you don't have any compiled-in UART names, you'd
have to say "N".  But it says use "Y" for ia64.

Actually, I think KGDB_SIMPLE_SERIAL, KGDB_*BAUD, KGDB_PORT_*,
KGDB_PORT, and KGDB_IRQ are overkill.  Could they all be nuked
in favor of a KGDB_8250_DEVICE that could be set to things like
"ttyS0,115200" or "io,0x3f8,115200,49"?

> > > +config KGDB_PORT
> > > + hex "hex I/O port address of the debug serial port"
> > > + depends on !KGDB_SIMPLE_SERIAL && KGDB_8250 && !IA64
> > > + default 3f8
> > > + help
> > > +   This is the unmapped (and on platforms with 1:1 mapping
> > > +   this 

Re: Where is the performance bottleneck?

2005-08-31 Thread Dr. David Alan Gilbert
* Holger Kiehl ([EMAIL PROTECTED]) wrote:

> There is however one difference, here I had set
> /sys/block/sd?/queue/nr_requests to 4096.

Well from that it looks like none of the queues get above 255
(hmm that's a round number)

> avg-cpu:  %user   %nice    %sys %iowait   %idle
>            0.10    0.00   21.85   58.55   19.50

Fair amount of system time.

> Device:   rrqm/s wrqm/s    r/s  w/s   rsec/s wsec/s    rkB/s wkB/s avgrq-sz avgqu-sz await svctm  %util
> sdf     11314.90   0.00 365.10 0.00 93440.00   0.00 46720.00  0.00   255.93     1.92  5.26  2.74  99.98
> sdg      7973.20   0.00 257.20 0.00 65843.20   0.00 32921.60  0.00   256.00     1.94  7.53  3.89 100.01

There seems to be quite a spread of read performance across the drives
(pretty consistent across the run); what makes sdg so much slower than
sdf (which seem to be the slowest and fastest drives respectively)?
I guess if everyone was running at sdf's speed you would be pretty happy.
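
For anyone wanting to cross-check the columns, here is a small sketch of how the derived iostat figures relate to the raw rates.  It assumes the usual iostat conventions (512-byte sectors, avgrq-sz as sectors per completed request); the struct and function names are invented for the example.

```c
#include <assert.h>

/* One row of the extended iostat output quoted above (rates per second). */
struct iostat_row {
    double r_per_s;    /* r/s    */
    double w_per_s;    /* w/s    */
    double rsec_per_s; /* rsec/s */
    double wsec_per_s; /* wsec/s */
};

/* avgrq-sz: average request size in sectors. */
static double avg_request_sectors(const struct iostat_row *row)
{
    double ios;

    assert(row);
    ios = row->r_per_s + row->w_per_s;
    return ios > 0.0 ? (row->rsec_per_s + row->wsec_per_s) / ios : 0.0;
}

/* rkB/s: read throughput, assuming 512-byte sectors. */
static double read_kb_per_s(const struct iostat_row *row)
{
    assert(row);
    return row->rsec_per_s * 512.0 / 1024.0;
}
```

Feeding in the sdf and sdg rows reproduces the avgrq-sz and rkB/s values shown, so both drives are being handed requests of about 256 sectors (128 kB); the difference between them is in how many of those requests complete per second.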

If you physically swap f and g does the performance follow the drive
or the letter?

Dave
--
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert| Running GNU/Linux on Alpha,68K| Happy  \ 
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
 \ _|_ http://www.treblig.org   |___/


Re: FW: [RFC] A more general timeout specification

2005-08-31 Thread Joe Korty
On Wed, Aug 31, 2005 at 01:55:54PM -0700, Perez-Gonzalez, Inaky wrote:
> Hi Andrew
> 
> This was developed by Joe Korty <[EMAIL PROTECTED]>, greatly 
> enhancing something I had done before, so I am signing it out 
> (although Joe should too, Joe?).


The fusyn (robust mutexes) project proposes the creation
of a more general data structure, 'struct timeout', for the
specification of timeouts in new services.  In this structure,
the user specifies:

    a time, in timespec format.
    the clock the time is specified against (eg, CLOCK_MONOTONIC).
    whether the time is absolute, or relative to 'now'.

That is, all combinations of useful timeout attributes become
possible.

Also proposed are two new kernel routines for the manipulation
of timeouts:

    timeout_validate()
    timeout_sleep()

timeout_validate() error-checks the syntax of a timeout
argument and returns either zero or -EINVAL.  By breaking
timeout_validate() out from timeout_sleep(), it becomes possible
to error check the timeout 'far away' from the places in the
code where we would actually do the timeout, as well as being
able to perform such checks only at those places we know the
timeout specification is coming from an unsafe source.
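
As a rough userspace illustration of the kind of syntax check this describes, consider the sketch below.  The flag values are the ones from the patch's timeout.h; everything else (the stand-in struct timeout_sketch, its field types, and the specific checks performed) is an assumption made for the example and may differ from the real timeout_validate().

```c
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Flag values as given in the patch's include/linux/timeout.h. */
enum {
    TIMEOUT_RELATIVE   = 0x1000, /* relative timeout */
    TIMEOUT_FLAGS_MASK = 0xf000, /* flags mask for clock_id */
    TIMEOUT_CLOCK_MASK = 0x0fff, /* clock mask for clock_id */
};

/* Stand-in for 'struct timeout'; the field types are assumed. */
struct timeout_sketch {
    int clock_id;       /* clock source ORed with flags */
    struct timespec ts; /* the time itself */
};

/* Returns 0 if the specification looks well formed, -EINVAL otherwise. */
static int timeout_validate_sketch(const struct timeout_sketch *t)
{
    assert(t);

    /* Reject bits outside both the clock and the flags fields. */
    if (t->clock_id & ~(TIMEOUT_FLAGS_MASK | TIMEOUT_CLOCK_MASK))
        return -EINVAL;

    /* TIMEOUT_RELATIVE is the only flag this sketch knows about. */
    if ((t->clock_id & TIMEOUT_FLAGS_MASK) & ~TIMEOUT_RELATIVE)
        return -EINVAL;

    /* The timespec must be normalized and non-negative. */
    if (t->ts.tv_sec < 0 || t->ts.tv_nsec < 0 ||
        t->ts.tv_nsec >= 1000000000L)
        return -EINVAL;

    return 0;
}
```

Because the check needs nothing but the structure itself, it can run once at the point where the value arrives from an untrusted source, and the sleeping code can then take the specification as already checked.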

timeout_sleep() puts the caller to sleep until the
specified end time is in the past, as measured against
the given clock, or until the caller is awakened by other
means (such as wake_up_process()).  Like schedule_timeout(),
TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE must be set ahead
of time; if TASK_INTERRUPTIBLE is set then signals will also
break the caller out of the sleep.

timeout_sleep() returns either 0 (returned early) or -ETIMEDOUT
(returned due to timeout).  It is up to the caller to resolve,
in the "returned early" case, why it returned early.

timeout_sleep() has a return argument, endtime, which is also in
'struct timeout' format.  If the input time was relative, then
it is converted to absolute and returned through this argument.
This can be used when an early-terminated service must be
restarted and side effects of the early termination-n-restart
(such as end time drift) are to be avoided.
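
The relative-to-absolute conversion can be pictured with the following userspace sketch.  It is not the kernel code from the patch: struct timeout_sketch is a stand-in with assumed field types, timeout_to_absolute() is a made-up name, and 'now' is passed in by the caller rather than read from the clock.

```c
#include <assert.h>
#include <time.h>

/* Flag value as given in the patch's include/linux/timeout.h. */
enum { TIMEOUT_RELATIVE = 0x1000 };

/* Stand-in for 'struct timeout'; the field types are assumed. */
struct timeout_sketch {
    int clock_id;       /* clock source ORed with flags */
    struct timespec ts; /* the time itself */
};

/*
 * Fill *endtime with the absolute form of *in.  An absolute input is
 * copied unchanged; a relative one is added to 'now' and has its
 * TIMEOUT_RELATIVE flag cleared.
 */
static void timeout_to_absolute(const struct timeout_sketch *in,
                                const struct timespec *now,
                                struct timeout_sketch *endtime)
{
    assert(in && now && endtime);

    *endtime = *in;
    if (!(in->clock_id & TIMEOUT_RELATIVE))
        return;

    endtime->clock_id &= ~TIMEOUT_RELATIVE;
    endtime->ts.tv_sec = now->tv_sec + in->ts.tv_sec;
    endtime->ts.tv_nsec = now->tv_nsec + in->ts.tv_nsec;

    /* Carry into seconds if the nanoseconds overflowed. */
    if (endtime->ts.tv_nsec >= 1000000000L) {
        endtime->ts.tv_sec += 1;
        endtime->ts.tv_nsec -= 1000000000L;
    }
}
```

A caller that is woken early can pass the returned absolute value straight back in on the restart, so the end time stays fixed instead of being pushed out by a full relative interval on every retry.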

Signed-off-by: Inaky Perez-Gonzalez <[EMAIL PROTECTED]>
Signed-off-by: Joe Korty <[EMAIL PROTECTED]>




 2.6.12-rc4-jak/include/linux/time.h    |    6 +
 2.6.12-rc4-jak/include/linux/timeout.h |   48 +
 2.6.12-rc4-jak/kernel/posix-timers.c   |    7 +
 2.6.12-rc4-jak/kernel/timer.c          |  184 +
 4 files changed, 245 insertions(+)

diff -puNa include/linux/time.h~a.more.flexible.timeout.approach include/linux/time.h
--- 2.6.12-rc4/include/linux/time.h~a.more.flexible.timeout.approach 2005-05-18 13:53:14.204417169 -0400
+++ 2.6.12-rc4-jak/include/linux/time.h 2005-05-18 13:53:14.212416002 -0400
@@ -25,6 +25,8 @@ struct timezone {
int tz_dsttime; /* type of dst correction */
 };
 
+#include 
+
 #ifdef __KERNEL__
 
 /* Parameters used to convert the timespec values */
@@ -103,6 +105,10 @@ struct itimerval;
 extern int do_setitimer(int which, struct itimerval *value, struct itimerval *ovalue);
 extern int do_getitimer(int which, struct itimerval *value);
 extern void getnstimeofday (struct timespec *tv);
+extern long clock_gettime(int which, struct timespec *tp);
+
+extern int FASTCALL(abs_timespec_to_abs_jiffies (clockid_t clock, const struct timespec *tp, unsigned long *jp));
+extern int FASTCALL(rel_to_abs_timespec(clockid_t clock, const struct timespec *tsrel, struct timespec *tsabs));
 
 extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
 
diff -puNa /dev/null include/linux/timeout.h
--- /dev/null   2004-06-24 14:04:38.0 -0400
+++ 2.6.12-rc4-jak/include/linux/timeout.h  2005-05-18 13:53:14.212416002 -0400
@@ -0,0 +1,48 @@
+/*
+ * Extended timeout specification
+ *
+ * (C) 2002-2005 Intel Corp
+ * Inaky Perez-Gonzalez <[EMAIL PROTECTED]>.
+ *
+ * Licensed under the FSF's GNU Public License v2 or later.
+ *
+ * Generic extended timeout specification.  Broken out by Joe Korty
+ * <[EMAIL PROTECTED]> from linux/time.h so that it can be included
+ * by userspace applications in conjunction with #include "time.h".
+ */
+
+#ifndef _LINUX_TIMEOUT_H
+#define _LINUX_TIMEOUT_H
+
+/* 'struct timeout' flag values.  OR these into clock_id along with
+ * a clock specification such as CLOCK_REALTIME or CLOCK_MONOTONIC.
+ */
+enum {
+   TIMEOUT_RELATIVE   = 0x1000,/* relative timeout */
+
+   TIMEOUT_FLAGS_MASK = 0xf000,/* flags mask for clock_id */
+   TIMEOUT_CLOCK_MASK = 0x0fff,/* clock mask for clock_id */
+};
+
+/* Magic values a 'struct timeout' pointer can have */
+
+#define TIMEOUT_MAX((struct timeout *) ~0UL) /* never time out */
+#define TIMEOUT_NONE   ((struct timeout *) 0UL)  /* time out immediately */
+
+/**
+ * struct timeout - general timeout specification
+ *
+ * @clock_id: which clock source to use ORed with flags describing use.
+ * @ts:   timespec 

Re: State of Linux graphics

2005-08-31 Thread Keith Packard
On Wed, 2005-08-31 at 13:06 -0700, Allen Akin wrote:
> On Wed, Aug 31, 2005 at 11:29:30AM -0700, Keith Packard wrote:
> | The real goal is to provide a good programming environment for 2D
> | applications, not to push some particular low-level graphics library.
> 
> I think that's a reasonable goal.
> 
> My red flag goes up at the point where the 2D programming environment
> pushes down into device drivers and becomes an OpenGL peer.  That's
> where we risk redundancy of concepts, duplication of engineering effort,
> and potential semantic conflicts.

Right, the goal is to have only one driver for the hardware, whether an
X server for simple 2D only environments or a GL driver for 2D/3D
environments. I think the only questions here are about the road from
where we are to that final goal.

> For just one small example, we now have several ways of specifying the
> format of a pixel and creating source and destination surfaces based on
> a format.  Some of those formats and surfaces can't be used directly by
> Render, and some can't be used directly by OpenGL.  Furthermore, the
> physical resources have to be managed by some chunk of software that
> must now resolve conflicts between two APIs.

As long as Render is capable of exposing enough information about the GL
formats for 2D applications to operate, I think we're fine. GL provides
far more functionality than we need for 2D applications being designed
and implemented today; picking the right subset and sticking to that is
our current challenge.

> The ARB took a heck of a long time getting consensus on the framebuffer
> object extension in OpenGL because image resource management is a
> difficult problem at the hardware level.  By adding a second low-level
> software interface we've made it even harder.  We've also put artificial
> barriers between "2D" clients and useful "3D" functionality, and between
> "3D" clients and useful "2D" functionality.  I don't see that the nature
> of computer graphics really justifies such a separation (and in fact the
> OpenGL designers argued against it almost 15 years ago).

At the hardware level, there is no difference. However, at the
application level, GL is not a very friendly 2D application-level API.
Abstracting 3D hardware functionality to make it palatable to 2D
developers remains the key goal of Render and cairo.

Note that by layering cairo directly on GL rather than the trip through
Render and the X server, one idea was to let application developers use
the cairo API to "paint" on 3D surfaces without creating an intermediate
texture. Passing through the X server and Render will continue to draw
application content to pixels before it is applied to the final screen
geometry.

> So I think better integration is also a reasonable goal.

Current efforts in solving the memory management issues with the DRM
environment should make the actual consumer of that memory irrelevant,
so we can (at least as a temporary measure) run GL and old-style X
applications on the same card and expect them to share memory in a more
integrated fashion. The integration of 2D and 3D acceleration into a
single GL-based system will take longer, largely as we wait for the GL
drivers to catch up to the requirements of the Xgl implementation that
we already have.

> I believe we're doing well with layered implementation strategies like
> Xgl and Glitz.

I've been pleased that our early assertions about Render being
compatible with GL drawing semantics have been borne out in practice,
and that our long term goal of a usable GL-based X server is possible
if not quite ready for prime-time.

>   Where we might do better is in (1) extending OpenGL to
> provide missing functionality, rather than creating peer low-level APIs;

I'm not sure we have any significant new extensions to create here;
we've got a pretty good handle on how X maps to GL and it seems to work
well enough with suitable existing extensions.

> (2) expressing the output of higher-level services in terms of OpenGL
> entities (vertex buffer objects, framebuffer objects including textures,
> shader programs, etc.) so that apps can mix-and-match them and
> scene-graph libraries can optimize their use; 

This will be an interesting area of research; right now, 2D applications
are fairly sketchy about the structure of their UIs, so attempting to
wrap them into more structured models will take some effort.

Certainly ensuring that cairo on glitz can be used to paint into an
arbitrary GL context will go some ways in this direction.

> (3) finishing decent
> OpenGL drivers for small and old hardware to address people's concerns
> about running modern apps on those systems.

The question is whether this is interesting enough to attract developer
resources. So far, 3D driver work has proceeded almost entirely on the
newest documented hardware that people could get. Going back and
spending months optimizing software 3D rendering code so that it works
as fast as software 2D code seems like 
