date:20071205

Re: [PATCH] pci: Fix bus resource assignment on 32 bits with 64b resources

2007-12-05 Thread Benjamin Herrenschmidt


On Wed, 2007-12-05 at 22:39 -0800, Greg KH wrote:
> > that is  it can be either unsigned int, unsigned long or unsigned long
> > long... and we have no way to reliably printk that.
> 
> We do this already just fine.  Take a look in the kernel, I think we
> just always cast it to long long to be uniform.

I wanted to avoid that for two reasons:

 - casts are fugly
 - it adds support code to cast & handle 64 bits to 32 bits platforms
   that wouldn't normally need it

Now, if you really think that's the way to go, I'll respin with casts
(I've used cast in subsequent patches merging bits & pieces of the
powerpc 32 and 64 bits PCI code too in fact).

I was just hoping somebody had a better idea, like a way to add a new
format specifier to printk without losing gcc type checking :-)

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc4-mm1: hostbyte=0x01 driverbyte=0x00 (now bisected)

2007-12-05 Thread Hannes Reinecke

Alexey Dobriyan wrote:
>>  git-scsi-misc.patch
> 
> Apologies for not looking into the problem earlier. See
> http://marc.info/?t=11962802235&r=1&w=2
> "2.6.24-rc3-mm2: Result: hostbyte=0x01 driverbyte=0x00\nend_request: I/O error"
> for previous installment.
> 
> I've bisected it to the following patch in git-scsi-misc branch.
> Revert on top of 2.6.24-rc4-mm1 also helps.
> 
> commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> Author: Hannes Reinecke <[EMAIL PROTECTED]>
> Date:   Tue Nov 6 09:23:40 2007 +0100
> 
> [SCSI] Do not requeue requests if REQ_FAILFAST is set
> 
> Any requests with the REQ_FAILFAST flag set should not be requeued
> to the request queue, but rather terminated directly.
> Otherwise the multipath failover will stall until the command
> timeout triggers.
> 
> Signed-off-by: Hannes Reinecke <[EMAIL PROTECTED]>
> Signed-off-by: James Bottomley <[EMAIL PROTECTED]>
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 0f44bdb..0da0dd0 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1286,6 +1286,11 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
>*/
>   if (!(req->cmd_flags & REQ_PREEMPT))
>   ret = BLKPREP_DEFER;
> + /*
> +  * Return failfast requests immediately
> +  */
> + if (req->cmd_flags & REQ_FAILFAST)
> + ret = BLKPREP_KILL;
>   break;
>   default:
>   /*
> @@ -1414,6 +1419,17 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>   return 1;
>  }
>  
> +static void __scsi_kill_request(struct request *req)
> +{
> + struct scsi_cmnd *cmd = req->special;
> + struct scsi_device *sdev = cmd->device;
> +
> + cmd->result = DID_NO_CONNECT << 16;
> + atomic_inc(&cmd->device->iorequest_cnt);
> + sdev->device_busy--;
> + __scsi_done(cmd);
> +}
> +
>  /*
>   * Kill a request for a dead device
>   */
> @@ -1527,8 +1543,16 @@ static void scsi_request_fn(struct request_queue *q)
>* accept it.
>*/
>   req = elv_next_request(q);
> - if (!req || !scsi_dev_queue_ready(q, sdev))
> + if (!req)
> + break;
> +
> + if (!scsi_dev_queue_ready(q, sdev)) {
> + if (req->cmd_flags & REQ_FAILFAST) {
> + scsi_kill_request(req, q);
> + continue;
> + }
>   break;
> + }
>  
>   if (unlikely(!scsi_device_online(sdev))) {
>   sdev_printk(KERN_ERR, sdev,
> @@ -1609,8 +1633,12 @@ static void scsi_request_fn(struct request_queue *q)
>* later time.
>*/
>   spin_lock_irq(q->queue_lock);
> - blk_requeue_request(q, req);
> - sdev->device_busy--;
> + if (unlikely(req->cmd_flags & REQ_FAILFAST))
> + __scsi_kill_request(req);
> + else {
> + blk_requeue_request(q, req);
> + sdev->device_busy--;
> + }
>   if(sdev->device_busy == 0)
>   blk_plug_device(q);
>   out:
Yeah, sorry. That patch was bad. Please use the attached one instead.
Andrew, can you replace them?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 13e7e09..9ec1566 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1284,13 +1284,15 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
 			/*
 			 * If the devices is blocked we defer normal commands.
 			 */
-			if (!(req->cmd_flags & REQ_PREEMPT))
-				ret = BLKPREP_DEFER;
-			/*
-			 * Return failfast requests immediately
-			 */
-			if (req->cmd_flags & REQ_FAILFAST)
-				ret = BLKPREP_KILL;
+			if (!(req->cmd_flags & REQ_PREEMPT)) {
+				/*
+				 * Return failfast requests immediately
+				 */
+				if (req->cmd_flags & REQ_FAILFAST)
+					ret = BLKPREP_KILL;
+				else
+					ret = BLKPREP_DEFER;
+			}
 			break;
 		default:
 			/*
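
The decision made by the corrected hunk can be tried out in isolation. The following is only a userspace sketch: the flag and return values are hypothetical stand-ins (the kernel's real definitions differ), blocked_device_prep() is an invented name, and the initial BLKPREP_OK value is an assumption about the surrounding function.

```c
/* Hypothetical flag and return values for this sketch; the kernel's differ. */
enum { REQ_PREEMPT = 1 << 0, REQ_FAILFAST = 1 << 1 };
enum { BLKPREP_OK, BLKPREP_KILL, BLKPREP_DEFER };

/* Decision for a blocked device, following the corrected hunk above. */
static int blocked_device_prep(unsigned int cmd_flags)
{
	int ret = BLKPREP_OK;	/* assumed default when nothing else applies */

	if (!(cmd_flags & REQ_PREEMPT)) {
		if (cmd_flags & REQ_FAILFAST)
			ret = BLKPREP_KILL;	/* fail fast instead of waiting */
		else
			ret = BLKPREP_DEFER;	/* normal command: retry later */
	}
	return ret;
}
```

In this sketch a request carrying both REQ_PREEMPT and REQ_FAILFAST is left alone, whereas the two independent if statements of the earlier version would have killed it. Whether that case is what triggered the reported I/O errors is not stated in the thread.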

AW: [PATCH] serial: add ADDI-DATA GmbH Communication cards in 8250_pci.c and pci_ids.h.

2007-12-05 Thread Krauth.Julien


Hello,

Here is the latest version of the patch, updated according to your remarks.

Regards,

Julien Krauth

Changes:


- Indentation.

- Code optimisation: all boards except the APCI-7800 are now managed
with the pci_default_setup() function.

- Add pbn_b0_8_115200 to manage the APCI-7800-3 with the
pci_default_setup() function.


---

From: Krauth Julien <[EMAIL PROTECTED]>

Add ADDI-DATA GmbH communication cards to 8250_pci driver.
Supported cards are:

APCI-7300, APCI-7420, APCI-7500, APCI-7800,
APCI-7300-2, APCI-7420-2, APCI-7500-2,
APCI-7300-3, APCI-7420-3, APCI-7500-3, APCI-7800-3

8250_pci.c.patch
===

Add ADDI-DATA GmbH communication cards to 8250_pci.c.

pci_ids.h.patch
===
Add ADDI-DATA GmbH communication cards Vendor and Device IDs to
pci_ids.h.

WARNING:

8250_pci.c.patch depends on pci_ids.h.patch.
8250_pci.c uses the Vendor and Device IDs defined in pci_ids.h.


Signed-off-by: Krauth J. <[EMAIL PROTECTED]>
---

This patch applies to kernel 2.6.23.9.

--- linux-2.6.23.9-vanilla/include/linux/pci_ids.h  2007-11-26 18:51:43.0 +0100
+++ linux-2.6.23.9/include/linux/pci_ids.h  2007-11-27 10:55:12.0 +0100
@@ -2019,6 +2019,23 @@
 #define PCI_VENDOR_ID_QUICKNET 0x15e2
 #define PCI_DEVICE_ID_QUICKNET_XJ  0x0500
 
+/*
+ * ADDI-DATA GmbH communication cards <[EMAIL PROTECTED]>
+ */
+#define PCI_VENDOR_ID_ADDIDATA_OLD             0x10E8
+#define PCI_VENDOR_ID_ADDIDATA                 0x15B8
+#define PCI_DEVICE_ID_ADDIDATA_APCI7500        0x7000
+#define PCI_DEVICE_ID_ADDIDATA_APCI7420        0x7001
+#define PCI_DEVICE_ID_ADDIDATA_APCI7300        0x7002
+#define PCI_DEVICE_ID_ADDIDATA_APCI7800        0x818E
+#define PCI_DEVICE_ID_ADDIDATA_APCI7500_2      0x7009
+#define PCI_DEVICE_ID_ADDIDATA_APCI7420_2      0x700A
+#define PCI_DEVICE_ID_ADDIDATA_APCI7300_2      0x700B
+#define PCI_DEVICE_ID_ADDIDATA_APCI7500_3      0x700C
+#define PCI_DEVICE_ID_ADDIDATA_APCI7420_3      0x700D
+#define PCI_DEVICE_ID_ADDIDATA_APCI7300_3      0x700E
+#define PCI_DEVICE_ID_ADDIDATA_APCI7800_3      0x700F
+
 #define PCI_VENDOR_ID_PDC  0x15e9
 
 #define PCI_VENDOR_ID_FARSITE   0x1619


--- linux-2.6.23.9-vanilla/drivers/serial/8250_pci.c  2007-11-26 18:51:43.0 +0100
+++ linux-2.6.23.9/drivers/serial/8250_pci.c  2007-11-30 08:33:50.0 +0100
@@ -106,6 +106,32 @@ setup_port(struct serial_private *priv, 
 }
 
 /*
+ * ADDI-DATA GmbH communication cards <[EMAIL PROTECTED]>
+ */
+static int
+addidata_apci7800_setup(struct serial_private *priv, struct pciserial_board *board,
+   struct uart_port *port, int idx)
+{
+   unsigned int bar = 0, offset = board->first_offset;
+   bar = FL_GET_BASE(board->flags);
+
+   if (idx < 2 ) {
+   offset += idx * board->uart_offset;
+   } else if ((idx >= 2) && (idx < 4)) {
+   bar += 1;
+   offset += ((idx - 2) * board->uart_offset);
+   } else if ((idx >= 4) && (idx < 6 )) {
+   bar += 2;
+   offset += ((idx - 4) * board->uart_offset);
+   } else if (idx >= 6) {
+   bar += 3;
+   offset += ((idx - 6) * board->uart_offset);
+   }
+
+   return setup_port(priv, port, bar, offset, board->reg_shift);
+}
+
+/*
  * AFAVLAB uses a different mixture of BARs and offsets
  * Not that ugly ;) -- HW
  */
@@ -752,6 +778,16 @@ pci_default_setup(struct serial_private 
  */
 static struct pci_serial_quirk pci_serial_quirks[] = {
/*
+   * ADDI-DATA GmbH communication cards <[EMAIL PROTECTED]>
+   */
+   {
+   .vendor = PCI_VENDOR_ID_ADDIDATA_OLD,
+   .device = PCI_DEVICE_ID_ADDIDATA_APCI7800,
+   .subvendor  = PCI_ANY_ID,
+   .subdevice  = PCI_ANY_ID,
+   .setup  = addidata_apci7800_setup,
+   },
+   /*
 * AFAVLAB cards - these may be called via parport_serial
 *  It is not clear whether this applies to all products.
 */
@@ -1036,6 +1072,7 @@ enum pci_board_num_t {
pbn_b0_2_115200,
pbn_b0_4_115200,
pbn_b0_5_115200,
+   pbn_b0_8_115200,
 
pbn_b0_1_921600,
pbn_b0_2_921600,
@@ -1172,6 +1209,12 @@ static struct pciserial_board pci_boards
.base_baud  = 115200,
.uart_offset= 8,
},
+   [pbn_b0_8_115200] = {
+   .flags   = FL_BASE0,
+   .num_ports   = 8,
+   .base_baud   = 115200,
+   .uart_offset = 8,
+   },
 
[pbn_b0_1_921600] = {
.flags  = FL_BASE0,
@@ -2574,6 +2617,97 @@ static struct pci_device_id serial_pci_t
pbn_pasemi_1682M },
 
/*
+   * ADDI-DATA GmbH communication cards <[EMAIL PROTECTED]>
+   */
+   {   PCI_VENDOR_ID_ADDIDATA,
+   PCI_DEVICE_ID_ADDIDATA_APCI7500,
+   PCI_ANY_ID,
+   PCI_
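
The port-to-BAR mapping in addidata_apci7800_setup() above amounts to "two ports per BAR, with every index from 6 upward in the fourth BAR". A minimal userspace sketch of that arithmetic follows; struct port_location and apci7800_locate() are hypothetical names used only for illustration, and the index is taken as unsigned here.

```c
/* Result of mapping a port index to a BAR and an offset inside it. */
struct port_location {
	unsigned int bar;
	unsigned int offset;
};

/*
 * Same mapping as the if/else chain in addidata_apci7800_setup():
 * two ports per BAR, every index from 6 upward lands in the fourth BAR.
 */
static struct port_location apci7800_locate(unsigned int base_bar,
					    unsigned int first_offset,
					    unsigned int uart_offset,
					    unsigned int idx)
{
	unsigned int group = idx / 2;
	struct port_location loc;

	if (group > 3)
		group = 3;	/* clamp: the patch uses at most base BAR + 3 */
	loc.bar = base_bar + group;
	loc.offset = first_offset + (idx - 2 * group) * uart_offset;
	return loc;
}
```

With a base BAR of 0, a first offset of 0 and a UART offset of 8, port 5 lands in BAR 2 at offset 8 and port 7 in BAR 3 at offset 8, matching the branches of the patch.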

Re: PS3: trouble with SPARSEMEM_VMEMMAP and kexec

2007-12-05 Thread Geert Uytterhoeven

On Wed, 5 Dec 2007, Geoff Levand wrote:
> Andrew Morton wrote:
> > On Wed, 5 Dec 2007 10:52:48 +0100 (CET)
> > Geert Uytterhoeven <[EMAIL PROTECTED]> wrote:
> > 
> >> 
> >> Subject: sparsemem: sparse_add_one_section() may fail to allocate memory
> >> 
> > >> sparsemem: sparse_add_one_section() may fail to allocate memory, and must
> > >> check whether the allocation succeeded before proceeding to touch the
> > >> allocated memory.
> >> 
> >> From: Geert Uytterhoeven <[EMAIL PROTECTED]>
> >> 
> >> Signed-off-by: Geert Uytterhoeven <[EMAIL PROTECTED]>
> >> ---
> > >> FIXME There are still some possible memory leaks in sparse_add_one_section():
> >>   - usemap is never deallocated
> >>   - __kfree_section_memmap() is a not yet implemented dummy
> > 
> > I already had
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/broken-out/mm-sparsec-improve-the-error-handling-for-sparse_add_one_section.patch
> > and
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/broken-out/mm-sparsec-check-the-return-value-of-sparse_index_alloc.patch
> > 
> > queued.  Do they fix the problem, and should they be merged in 2.6.24?
> 
> No, a quick test shows it just panics in a different place.  Geert's
> patch does also.

What do you mean, that it still panicked after my patch?

The kernel did boot successfully for me when passing ps3fb=48M. Userspace saw 58
MiB (128 MiB - kernelsize - 48 MiB(ps3fb)).

I did not try kexec, though.

With kind regards,
 
Geert Uytterhoeven
Software Architect

Sony Network and Software Technology Center Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium
 
Phone:+32 (0)2 700 8453 
Fax:  +32 (0)2 700 8622 
E-mail:   [EMAIL PROTECTED] 
Internet: http://www.sony-europe.com/

Sony Network and Software Technology Center Europe  
A division of Sony Service Centre (Europe) N.V. 
Registered office: Technologielaan 7 · B-1840 Londerzeel · Belgium  
VAT BE 0413.825.160 · RPR Brussels  
Fortis Bank Zaventem · Swift GEBABEBB08A · IBAN BE39001382358619

Re: 2.6.24-rc4-mm1

2007-12-05 Thread Andrew Morton

On Thu, 06 Dec 2007 17:59:37 +1100 Reuben Farrelly <[EMAIL PROTECTED]> wrote:

> On 5/12/2007 4:17 PM, Andrew Morton wrote:
> > Temporarily at
> > 
> >   http://userweb.kernel.org/~akpm/2.6.24-rc4-mm1/
> > 
> > Will appear later at
> > 
> >   ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/
> > 
> > 
> > - Lots of device IDs have been removed from the e1000 driver and moved over
> >   to e1000e.  So if your e1000 stops working, you forgot to set CONFIG_E1000E.
> 
> This non fatal oops which I have just noticed may be related to this change then
> - certainly looks networking related.

yep, but it isn't e1000.  It's core TCP.

> WARNING: at net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert()
> Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #1
> 
> Call Trace:
> [] tcp_fastretrans_alert+0x229/0xe63
>   [] tcp_ack+0xa3f/0x127d
>   [] tcp_rcv_established+0x55f/0x7f8
>   [] tcp_v4_do_rcv+0xdb/0x3a7
>   [] :nf_conntrack:nf_ct_deliver_cached_events+0x75/0x99
>   [] :nf_conntrack_ipv4:ipv4_confirm+0x29/0x51
>   [] tcp_v4_rcv+0x9be/0xaed
>   [] nf_hook_slow+0x60/0xdf
>   [] ip_local_deliver_finish+0xd3/0x253
>   [] ip_local_deliver+0x3b/0x85
>   [] ip_rcv_finish+0x119/0x3b8
>   [] ip_rcv+0x231/0x30c
>   [] netif_receive_skb+0x215/0x299
>   [] :e1000e:e1000_receive_skb+0x4d/0x1db
>   [] :e1000e:e1000_clean_rx_irq+0x12c/0x341
>   [] :e1000e:e1000_clean+0x306/0x58f
>   [] rebalance_domains+0xec/0x423
>   [] handle_edge_irq+0x97/0x13b
>   [] net_rx_action+0xb8/0x11d
>   [] __do_softirq+0x71/0xdd
>   [] call_softirq+0x1c/0x30
>   [] do_softirq+0x3d/0x8d
>   [] irq_exit+0x84/0x86
>   [] do_IRQ+0x7e/0xe4
>   [] mwait_idle+0x0/0x58
>   [] default_idle+0x0/0x43
>   [] ret_from_intr+0x0/0xa
> [] mwait_idle+0x48/0x58
>   [] enter_idle+0x22/0x24
>   [] cpu_idle+0x63/0x88
>   [] rest_init+0x55/0x60
>   [] start_kernel+0x2a4/0x32a
>   [] _sinittext+0x10b/0x120
> 
> tornado home #
> 
> I have posted a full dmesg up as well as my .config and an lspci at 
> http://www.reub.net/files/kernel/2.6.24-rc4-mm1/ .
> 

Ilpo, Reuben's kernel is talking to you ;)


Re: [PATCH] pci: Fix bus resource assignment on 32 bits with 64b resources

2007-12-05 Thread Greg KH

On Thu, Dec 06, 2007 at 02:22:27PM +1100, Benjamin Herrenschmidt wrote:
> 
> On Wed, 2007-12-05 at 17:40 +1100, Benjamin Herrenschmidt wrote:
> > The current pci_assign_unassigned_resources() code doesn't work properly
> > on 32 bits platforms with 64 bits resources. The main reason is the use
> > of unsigned long in various places instead of resource_size_t.
> > 
> > This fixes it, along with some tricks to avoid casting to 64 bits on
> > platforms that don't need it in every printk around.
> > 
> > This is a pre-requisite for making powerpc use the generic code instead of
> > its own half-useful implementation.
> > 
> > Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> > ---
> > 
> > This version fixes some stupid warnings when using 32 bits resources
> 
>  ... and has warnings on 64 bits platforms... G
> 
> This whole issue of printk vs. resource_size_t is a terrible mess :-(
> 
> Part of the problem is that resource_size_t can be either u32 or u64..
> 
> that is  it can be either unsigned int, unsigned long or unsigned long
> long... and we have no way to reliably printk that.

We do this already just fine.  Take a look in the kernel, I think we
just always cast it to long long to be uniform.
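
A minimal userspace sketch of the cast-to-long-long approach follows. The typedef and the RESOURCES_64BIT switch are stand-ins invented for this illustration (the kernel selects the width through its own configuration), and format_range() is a hypothetical helper, not kernel API.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Stand-in for the kernel typedef: 32 or 64 bits depending on configuration.
 * RESOURCES_64BIT is a hypothetical switch for this userspace sketch.
 */
#ifdef RESOURCES_64BIT
typedef uint64_t resource_size_t;
#else
typedef uint32_t resource_size_t;
#endif

/* Format a resource range; the cast makes "%llx" correct for either width. */
static int format_range(char *buf, size_t len,
			resource_size_t start, resource_size_t end)
{
	return snprintf(buf, len, "[0x%llx-0x%llx]",
			(unsigned long long)start,
			(unsigned long long)end);
}
```

The format string stays the same whichever width is configured, which is the uniformity argued for here; the cost raised elsewhere in the thread is that 32-bit builds then format 64-bit values they would not otherwise need.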

> Any clever idea before I start pushing filthy macros up linux/types.h ?

I don't think any macros are needed.

thanks,

greg k-h

Re: [RFC] [PATCH] A clean approach to writeout throttling

2007-12-05 Thread Andrew Morton

On Wed, 5 Dec 2007 22:21:44 -0800 Daniel Phillips <[EMAIL PROTECTED]> wrote:

> On Wednesday 05 December 2007 17:24, Andrew Morton wrote:
> > On Wed, 5 Dec 2007 16:03:01 -0800 Daniel Phillips <[EMAIL PROTECTED]> wrote:
> > > ...a block device these days may not be just a single 
> > > device, but may be a stack of devices connected together by a generic 
> > > mechanism such as device mapper, or a hardcoded stack such as 
> > > multi-disk or network block device.  It is necessary to consider the 
> > > resource requirements of the stack as a whole _before_ letting a 
> > > transfer proceed into any layer of the stack, otherwise deadlock on 
> > > many partially completed transfers becomes a possibility.  For this 
> > > reason, the bio throttling is only implemented at the initial, highest 
> > > level submission of the bio to the block layer and not for any recursive 
> > > submission of the same bio to a lower level block device in a stack.
> > > 
> > > This in turn has rather far reaching implications: the top level device 
> > > in a stack must take care of inspecting the entire stack in order to 
> > > determine how to calculate its resource requirements, thus becoming
> > > the boss device for the entire stack.  Though this intriguing idea could 
> > > easily become the cause of endless design work and many thousands of 
> > > lines of fancy code, today I sidestep the question entirely using 
> > > the "just provide lots of reserve" strategy.  Horrifying as it may seem 
> > > to some, this is precisely the strategy that Linux has used in the 
> > > context of resource management in general, from the very beginning and 
> > > > likely continuing for quite some time into the future.  My strongly held 
> > > opinion in this matter is that we need to solve the real, underlying 
> > > problems definitively with nice code before declaring the opening of 
> > > fancy patch season.  So I am leaving further discussion of automatic 
> > > resource discovery algorithms and the like out of this post.
> > 
> > Rather than asking the stack "how much memory will this request consume"
> > you could instead ask "how much memory are you currently using".
> > 
> > ie: on entry to the stack, do 
> > 
> > current->account_block_allocations = 1;
> > make_request(...);
> > rq->used_memory += current->pages_used_for_block_allocations;
> > 
> > and in the page allocator do
> > 
> > if (!in_interrupt() && current->account_block_allocations)
> > current->pages_used_for_block_allocations++;
> > 
> > and then somehow handle deallocation too ;)
> 
> Ah, and how do you ensure that you do not deadlock while making this
> inquiry?

It isn't an inquiry - it's a plain old submit_bio() and it runs to
completion in the usual fashion.

Thing is, we wouldn't have called it at all if this queue was already over
its allocation limit.  IOW, we know that it's below its allocation limit,
so we know it won't deadlock.  Given, of course, reasonably pessimistic
error margins.

Which margins can even be observed at runtime: keep a running "max" of this
stack's most-ever memory consumption (for a single call), and only submit a
bio into it when its current allocation is less than (limit - that-max).
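
As a rough illustration of that admission rule, here is a userspace sketch. struct stack_budget and the three helpers are hypothetical names invented for this example; nothing here is existing block-layer API, and pages are just counters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-stack accounting; all names are illustrative. */
struct stack_budget {
	size_t limit;		/* pages the stack may hold in total */
	size_t in_use;		/* pages currently held by in-flight requests */
	size_t max_per_call;	/* largest single-submission usage seen so far */
};

/* Admit a new request only if a worst-case request would still fit. */
static bool may_submit(const struct stack_budget *b)
{
	if (b->max_per_call >= b->limit)
		return false;
	return b->in_use < b->limit - b->max_per_call;
}

/* Record what one submission actually used and update the running max. */
static void account_submit(struct stack_budget *b, size_t pages_used)
{
	b->in_use += pages_used;
	if (pages_used > b->max_per_call)
		b->max_per_call = pages_used;
}

/* Deallocation side: pages returned when a request completes. */
static void account_complete(struct stack_budget *b, size_t pages_freed)
{
	b->in_use -= (pages_freed < b->in_use) ? pages_freed : b->in_use;
}
```

A caller would check may_submit() before handing a bio to the stack and wait for completions otherwise. The sketch ignores locking and how the per-submission page count is gathered, both of which the real thing would have to solve.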

>  Perhaps send a dummy transaction down the pipe?  Even so,
> deadlock is possible, quite evidently so in the real life example I have
> at hand.
> 
> Yours is essentially one of the strategies I had in mind, the other major
> one being simply to examine the whole stack, which presupposes some
> as-yet-nonexistent kernel wide method of representing block device
> stacks in all their glorious possible topology variations.

We already have that, I think: blk_run_backing_dev().  One could envisage a
similar thing which runs up and down the stack accumulating "how much
memory do you need for this request" data, but I think that would be hard to
implement and plain dumb.

> > The basic idea being to know in real time how much memory a particular
> > block stack is presently using.  Then, on entry to that stack, if the
> > stack's current usage is too high, wait for it to subside.
> 
> We do not wait for high block device resource usage to subside before
> submitting more requests.  The improvement you suggest is aimed at
> automatically determining resource requirements by sampling a
> running system, rather than requiring a programmer to determine them
> arduously by hand.  Something like automatically determining a
> workable locking strategy by analyzing running code, wouldn't that be
> a treat?  I will hope for one of those under my tree at Christmas.

I don't see any unviability.

> More practically, I can see a debug mode implemented along the lines
> you describe where we automatically detect that a writeout path has
> violated its covenant as expressed by its throttle_metric.
>  
> > otoh we already have mechanisms for limiting the number of requests in
> > flight.  This is approximately proportional to the amount of memory

Re: 2.6.24-rc4-mm1 Kernel build fails on S390x

2007-12-05 Thread Andrew Morton

On Thu, 06 Dec 2007 08:45:37 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> Hi Andrew,
> 
> The 2.6.24-rc4-mm1 kernel build fails on s390x,
> 
>   CC  arch/s390/kernel/traps.o
> In file included from include/asm/thread_info.h:39,
>  from include/linux/thread_info.h:21,
>  from include/linux/preempt.h:9,
>  from include/linux/spinlock.h:49,
>  from include/linux/seqlock.h:29,
>  from include/linux/time.h:8,
>  from include/linux/timex.h:57,
>  from include/linux/sched.h:53,
>  from arch/s390/kernel/traps.c:17:
> include/asm/processor.h:191: warning: "struct seq_file" declared inside parameter list
> include/asm/processor.h:191: warning: its scope is only this definition or declaration, which is probably not what you want
> arch/s390/kernel/traps.c: In function `task_show_regs':
> arch/s390/kernel/traps.c:226: error: implicit declaration of function `seq_printf'
> make[1]: *** [arch/s390/kernel/traps.o] Error 1
> make: *** [arch/s390/kernel] Error 2

thanks.

--- a/arch/s390/kernel/traps.c~proc-seqfile-convert-proc_pid_status-to-properly-handle-pid-namespaces-fix-2
+++ a/arch/s390/kernel/traps.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff -puN include/asm-s390/processor.h~proc-seqfile-convert-proc_pid_status-to-properly-handle-pid-namespaces-fix-2 include/asm-s390/processor.h
--- a/include/asm-s390/processor.h~proc-seqfile-convert-proc_pid_status-to-properly-handle-pid-namespaces-fix-2
+++ a/include/asm-s390/processor.h
@@ -165,6 +165,7 @@ struct stack_frame {
 /* Forward declaration, a strange C thing */
 struct task_struct;
 struct mm_struct;
+struct seq_file;
 
 /* Free all resources held by a thread. */
 extern void release_thread(struct task_struct *);
_


Unfortunately the current greg-versus-git-s390 snafu means that I'm not
cross-building s390.

Re: [PATCH] Reduce stack used by lib/hexdump.c

2007-12-05 Thread Andrew Morton

On Thu, 6 Dec 2007 00:58:38 -0500 Kyle Moffett <[EMAIL PROTECTED]> wrote:

> On Dec 05, 2007, at 21:42:35, Joe Perches wrote:
> > On Wed, 2007-12-05 at 18:18 -0800, Randy Dunlap wrote:
> >> Joe Perches wrote:
> >>> Maybe just eliminate the 16 or 32 byte width option and force it  
> >>> to only 16 byte widths.
> >> Have you checked users (callers)?  I'm pretty sure that one of the  
> >> callers wanted 32 and that's why it's there.
> >
> > I did.  There is only 1 subsystem.  That's easy to change.
> >
> > drivers/mtd/ubi/debug.c:  print_hex_dump(KERN_DEBUG, "",  
> > DUMP_PREFIX_OFFSET, 32, 1,
> > drivers/mtd/ubi/io.c: print_hex_dump(KERN_DEBUG, "",  
> > DUMP_PREFIX_OFFSET, 32, 1,
> >
> > Long lines in the log file are not too easy to read anyway.  Using  
> > 16 byte dumps per line instead of 32 isn't painful.
> >
> > It gets rid of the allocation, reduces the argument count and makes  
> > the kernel smaller.  I think it's all good.
> >
> > Every current caller would have to change though.
> 
> Alternatively, since print_hex_dump is not a performance-critical  
> path (and usually indicates an error/debug condition), you could  
> probably just make a static "hexdump_lock" spinlock and  
> spin_lock_irqsave()/spin_unlock_irqrestore().  It would always nest  
> inside any other lock (except during crash, where we break locks  
> already for printk()), and I doubt any of the callers would notice  
> the serialization since they're already serialized on the printk buffer.
> 

Yup, that'd work.
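
A userspace analogue of the lock-plus-shared-buffer idea might look like the sketch below. It rests on assumptions: the thread only names the "hexdump_lock" spinlock, so the static line buffer is a reading of the suggestion rather than something stated; a C11 atomic_flag stands in for the kernel spinlock; interrupts are not disabled as spin_lock_irqsave() would do; and hex_line() is a hypothetical helper.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Userspace stand-in for a kernel spinlock (no interrupt masking here). */
static atomic_flag hexdump_lock = ATOMIC_FLAG_INIT;

/* Shared line buffer: 16 bytes as "xx " groups plus the terminator. */
static char hexdump_line[16 * 3 + 1];

/* Format up to 16 bytes as hex into out, using the shared buffer under the lock. */
static void hex_line(const unsigned char *buf, size_t len,
		     char *out, size_t outlen)
{
	size_t i, pos = 0;

	if (len > 16)
		len = 16;

	while (atomic_flag_test_and_set_explicit(&hexdump_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */

	hexdump_line[0] = '\0';
	for (i = 0; i < len; i++)
		pos += (size_t)snprintf(hexdump_line + pos,
					sizeof(hexdump_line) - pos,
					"%s%02x", i ? " " : "", buf[i]);
	snprintf(out, outlen, "%s", hexdump_line);

	atomic_flag_clear_explicit(&hexdump_lock, memory_order_release);
}
```

The buffer lives in static storage rather than on the caller's stack, and the lock serializes its use, which is the trade being proposed: a little contention on a debug path in exchange for less stack.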

Re: 2.6.24-rc4-mm1

2007-12-05 Thread David Miller

From: Reuben Farrelly <[EMAIL PROTECTED]>
Date: Thu, 06 Dec 2007 17:59:37 +1100

> On 5/12/2007 4:17 PM, Andrew Morton wrote:
> > - Lots of device IDs have been removed from the e1000 driver and moved over
> >   to e1000e.  So if your e1000 stops working, you forgot to set CONFIG_E1000E.
> 
> This non fatal oops which I have just noticed may be related to this change then
> - certainly looks networking related.
> 
> WARNING: at net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert()
> Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #1
> 
> Call Trace:
> [] tcp_fastretrans_alert+0x229/0xe63
>   [] tcp_ack+0xa3f/0x127d
>   [] tcp_rcv_established+0x55f/0x7f8
>   [] tcp_v4_do_rcv+0xdb/0x3a7
>   [] :nf_conntrack:nf_ct_deliver_cached_events+0x75/0x99

No, it's from TCP assertions and changes added by Ilpo to the
net-2.6.25 tree recently.

Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)

2007-12-05 Thread Andrew Morton

On Thu, 06 Dec 2007 14:49:37 +0900 FUJITA Tomonori <[EMAIL PROTECTED]> wrote:

> > >  drivers/scsi/dpt_i2o.c |  132 ++-
> > >  drivers/scsi/dpti.h|9 ++
> > >  2 files changed, 68 insertions(+), 73 deletions(-)
> > 
> > I've done the following:
> > 
> > > -untarred a clean 2.6.24-rc4 and compiled it with my 2.6.23.1-settings in order
> > >  to verify that the driver is still broken: checked, the box still won't boot.
> > > 
> > > -patched the just compiled kernel source with your patch, "make dist-clean"
> > >  (by means of "make-kpkg clean") and recompile: box boots fine.
> >
> > I've put the captured console logs to
> > http://w.sysiphus.de/dpt_i2o/bootlog.2624-rc4-pristine
> > http://w.sysiphus.de/dpt_i2o/bootlog.2624-rc4-patched
> > ... and the kernelconfig (which shouldn't matter) to
> > http://w.sysiphus.de/dpt_i2o/kernelconfig.2624-rc4
> 
> Thanks for testing. So reverting Matthew's hotplug patch fixes the
> problem though I have no idea how the patch leads to this. Seems that
> nobody has any clue on that. We need to revert that patch for the
> moment.

OK, thanks.  Let's leave it a couple of days for people to register objections,
have bright ideas, etc.


Re: 2.6.24-rc4-mm1

2007-12-05 Thread Reuben Farrelly


On 5/12/2007 4:17 PM, Andrew Morton wrote:
> Temporarily at
> 
>   http://userweb.kernel.org/~akpm/2.6.24-rc4-mm1/
> 
> Will appear later at
> 
>   ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/
> 
> - Lots of device IDs have been removed from the e1000 driver and moved over
>   to e1000e.  So if your e1000 stops working, you forgot to set CONFIG_E1000E.

This non fatal oops which I have just noticed may be related to this change then
- certainly looks networking related.


WARNING: at net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert()
Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #1

Call Trace:
   [] tcp_fastretrans_alert+0x229/0xe63
 [] tcp_ack+0xa3f/0x127d
 [] tcp_rcv_established+0x55f/0x7f8
 [] tcp_v4_do_rcv+0xdb/0x3a7
 [] :nf_conntrack:nf_ct_deliver_cached_events+0x75/0x99
 [] :nf_conntrack_ipv4:ipv4_confirm+0x29/0x51
 [] tcp_v4_rcv+0x9be/0xaed
 [] nf_hook_slow+0x60/0xdf
 [] ip_local_deliver_finish+0xd3/0x253
 [] ip_local_deliver+0x3b/0x85
 [] ip_rcv_finish+0x119/0x3b8
 [] ip_rcv+0x231/0x30c
 [] netif_receive_skb+0x215/0x299
 [] :e1000e:e1000_receive_skb+0x4d/0x1db
 [] :e1000e:e1000_clean_rx_irq+0x12c/0x341
 [] :e1000e:e1000_clean+0x306/0x58f
 [] rebalance_domains+0xec/0x423
 [] handle_edge_irq+0x97/0x13b
 [] net_rx_action+0xb8/0x11d
 [] __do_softirq+0x71/0xdd
 [] call_softirq+0x1c/0x30
 [] do_softirq+0x3d/0x8d
 [] irq_exit+0x84/0x86
 [] do_IRQ+0x7e/0xe4
 [] mwait_idle+0x0/0x58
 [] default_idle+0x0/0x43
 [] ret_from_intr+0x0/0xa
   [] mwait_idle+0x48/0x58
 [] enter_idle+0x22/0x24
 [] cpu_idle+0x63/0x88
 [] rest_init+0x55/0x60
 [] start_kernel+0x2a4/0x32a
 [] _sinittext+0x10b/0x120

tornado home #

I have posted a full dmesg up as well as my .config and an lspci at 
http://www.reub.net/files/kernel/2.6.24-rc4-mm1/ .


Thanks,
Reuben

Re: Why does reading from /dev/urandom deplete entropy so much?

2007-12-05 Thread Eric Dumazet


Matt Mackall a écrit :

On Tue, Dec 04, 2007 at 07:17:58PM +0100, Eric Dumazet wrote:

Alan Cox a ?crit :
No matter what you consider as being better, changing a 12 years old and 
widely used userspace interface like /dev/urandom is simply not an 
option.
   

Fixing it to be more efficient in its use of entropy and also fixing the
fact its not actually a good random number source would be worth looking
at however.
 

Yes, since current behavior on network irq is very pessimistic.


No, it's very optimistic. The network should not be trusted.


You keep saying that. I am refering to your previous attempts last year to 
remove net drivers from sources of entropy. No real changes were done.


If the network should not be trusted, then a patch should make sure network 
interrupts feed /dev/urandom but not /dev/random at all. (ie not calling 
credit_entropy_store() at all)




The distinction between /dev/random and /dev/urandom boils down to one
word: paranoia. If you are not paranoid enough to mistrust your
network, then /dev/random IS NOT FOR YOU. Use /dev/urandom. Do not
send patches to make /dev/random less paranoid, kthxbye.


I have many tg3 adapters on my servers, receiving thousand of interrupts per 
second, and calling add_timer_randomness(). I would like to either :


- Make sure this stuff is doing usefull job.
- Make improvements to reduce cpu time used.

I do not use /dev/urandom or/and /dev/random, but I know David wont accept a 
patch to remove IRQF_SAMPLE_RANDOM from tg3.c


Currently, I see that current implementation is suboptimal because it calls 
credit_entropy_store( nbits=0) forever.




If you have some trafic, (ie more than HZ/2  interrupts per second), 
then add_timer_randomness() feeds
some entropy but gives no credit (calling credit_entropy_store() with 
nbits=0)


This is because we take into account only the jiffies difference, and 
not the get_cycles() that should give

us more entropy on most plaforms.


If we cannot measure a difference, we should nonetheless assume there
is one?


There is a big difference on get_cycles() and jiffies. You should try to 
measure it on a typical x86_64 platform.


 
In this patch, I suggest that we feed only one u32 word of entropy, 
combination of the previous distinct
words (with some of them being constant or so), so that the nbits 
estimation is less pessimistic, but also to

avoid injecting false entropy.


Umm.. no, that's not how it works at all.

Also, for future reference, patches for /dev/random go through me, not
through Dave.



Why? David is the network maintainer, and he was the one who rejected your 
previous patches.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: New Address Family: Inter Process Networking (IPN)

2007-12-05 Thread David Newall


Kyle Moffett wrote:

On Dec 06, 2007, at 00:30:16, Renzo Davoli wrote:
AF_IPN is different.  AF_IPN is the broadcast and peer-to-peer 
extension of AF_UNIX. It supports communication among *user* processes.


Ok, you say it's different, but then you describe how IP unicast and 
broadcast work.


Renzo also described something new (in the socket() arena): the 
multi-reader, multi-writer is just not available in IP.


I wonder if this solves the same problem as d-bus?


So if you really think this is something that belongs in the kernel 
you need to provide much more detailed descriptions and use-cases for 
why it cannot be implemented in user-space or with small modifications 
to existing UDP/TCP networking. 


I would strengthen this sentiment: If you think something belongs in the 
kernel, you need to argue your case (provide much more detailed 
descriptions and use-cases.)


[PATCH -mm] x86_64 EFI runtime service support : Calling convention fix (resend, cc LKML)

2007-12-05 Thread Huang, Ying

In the EFI calling convention, %xmm0 - %xmm5 are specified as scratch
registers (UEFI Specification 2.1, 2.3.4.2). To conform to the EFI
specification, this patch saves/restores the %xmm0 - %xmm5 registers
before/after invoking an EFI runtime service. At the same time, the stack
is aligned to 16 bytes, and TS in CR0 is cleared/restored to make it
possible to use SSE2 in the EFI runtime service.

This patch is based on 2.6.24-rc4-mm1. And it has been tested on Intel
platforms with 64-bit UEFI 2.0 firmware.

Signed-off-by: Huang Ying <[EMAIL PROTECTED]>

---
 arch/x86/kernel/efi_stub_64.S |   71 +-
 1 file changed, 56 insertions(+), 15 deletions(-)

--- a/arch/x86/kernel/efi_stub_64.S
+++ b/arch/x86/kernel/efi_stub_64.S
@@ -8,61 +8,102 @@
 
 #include 
 
+#define SAVE_XMM   \
+   mov %rsp, %rax; \
+   subq $0x70, %rsp;   \
+   and $~0xf, %rsp;\
+   mov %rax, (%rsp);   \
+   mov %cr0, %rax; \
+   clts;   \
+   mov %rax, 0x8(%rsp);\
+   movaps %xmm0, 0x60(%rsp);   \
+   movaps %xmm1, 0x50(%rsp);   \
+   movaps %xmm2, 0x40(%rsp);   \
+   movaps %xmm3, 0x30(%rsp);   \
+   movaps %xmm4, 0x20(%rsp);   \
+   movaps %xmm5, 0x10(%rsp)
+
+#define RESTORE_XMM\
+   movaps 0x60(%rsp), %xmm0;   \
+   movaps 0x50(%rsp), %xmm1;   \
+   movaps 0x40(%rsp), %xmm2;   \
+   movaps 0x30(%rsp), %xmm3;   \
+   movaps 0x20(%rsp), %xmm4;   \
+   movaps 0x10(%rsp), %xmm5;   \
+   mov 0x8(%rsp), %rsi;\
+   mov %rsi, %cr0; \
+   mov (%rsp), %rsp
+
 ENTRY(efi_call0)
-   subq $40, %rsp
+   SAVE_XMM
+   subq $32, %rsp
call *%rdi
-   addq $40, %rsp
+   addq $32, %rsp
+   RESTORE_XMM
ret
 
 ENTRY(efi_call1)
-   subq $40, %rsp
+   SAVE_XMM
+   subq $32, %rsp
mov  %rsi, %rcx
call *%rdi
-   addq $40, %rsp
+   addq $32, %rsp
+   RESTORE_XMM
ret
 
 ENTRY(efi_call2)
-   subq $40, %rsp
+   SAVE_XMM
+   subq $32, %rsp
mov  %rsi, %rcx
call *%rdi
-   addq $40, %rsp
+   addq $32, %rsp
+   RESTORE_XMM
ret
 
 ENTRY(efi_call3)
-   subq $40, %rsp
+   SAVE_XMM
+   subq $32, %rsp
mov  %rcx, %r8
mov  %rsi, %rcx
call *%rdi
-   addq $40, %rsp
+   addq $32, %rsp
+   RESTORE_XMM
ret
 
 ENTRY(efi_call4)
-   subq $40, %rsp
+   SAVE_XMM
+   subq $32, %rsp
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
call *%rdi
-   addq $40, %rsp
+   addq $32, %rsp
+   RESTORE_XMM
ret
 
 ENTRY(efi_call5)
-   subq $40, %rsp
+   SAVE_XMM
+   subq $48, %rsp
mov %r9, 32(%rsp)
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
call *%rdi
-   addq $40, %rsp
+   addq $48, %rsp
+   RESTORE_XMM
ret
 
 ENTRY(efi_call6)
-   subq $56, %rsp
-   mov 56+8(%rsp), %rax
+   SAVE_XMM
+   mov (%rsp), %rax
+   mov 8(%rax), %rax
+   subq $48, %rsp
mov %r9, 32(%rsp)
mov %rax, 40(%rsp)
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
call *%rdi
-   addq $56, %rsp
+   addq $48, %rsp
+   RESTORE_XMM
ret
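
The frame arithmetic done by SAVE_XMM above can be sketched in plain C as
follows. This is only an illustration of the "subq $0x70, %rsp; and $~0xf,
%rsp" step and of the slot offsets used by the movaps instructions; the
helper names save_xmm_frame() and xmm_slot_offset() are hypothetical and do
not exist in the kernel.

```c
#include <stdint.h>

/* 16 bytes for the saved %rsp and %cr0, plus 6 * 16 bytes for %xmm0-%xmm5. */
#define XMM_FRAME_SIZE 0x70u

/* New %rsp after reserving the frame and aligning down to 16 bytes. */
static uint64_t save_xmm_frame(uint64_t rsp)
{
    return (rsp - XMM_FRAME_SIZE) & ~(uint64_t)0xf;
}

/* Offset of the save slot for %xmmN (N = 0..5) inside the frame. */
static unsigned int xmm_slot_offset(unsigned int n)
{
    return 0x60u - 0x10u * n;
}
```

Because the frame base is 16-byte aligned and every slot offset is a multiple
of 16, each movaps access is aligned. The later "subq $32" and "subq $48" are
also multiples of 16, so %rsp stays 16-byte aligned at the call.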

Re: Please revert: PCI: fix IDE legacy mode resources

2007-12-05 Thread Benjamin Herrenschmidt

On Thu, 2007-12-06 at 14:58 +0900, Yoichi Yuasa wrote:
> > What I don't understand is thus why you are calling resource_to_bus
> on 0x1f0
> > which is -not- a resource value, but is already a BAR value...
> 
> 0x1f0 is resource value on MIPS Cobalt.
> All RAW BAR values contain the offset(0x1000) on it.
> 
> Do the BAR values on your target contain the offset?

No, and I don't understand... raw BAR values don't contain such offset.
The physical address where the PIO is mapped might, but that's not what
we put in struct resource for IO resources and definitely not the BAR
value.

The legacy IDE device will decode cycles at 0x1f0, not cycles at
0x11f0.

Take for example PowerPC. Imagine that I have a bus whose IO space is
mapped at 0xf0000000 in the processor physical address space (this is a
real example, my powermac does that for the x16 PCI-Express slot though
other slots use other offsets).

Now, the kernel on ppc64 will map that virtually at some allocated
virtual address that we'll call bus_io_virt for this
explanation. In addition, inb() and outb() will apply an offset (which
can be different) that we call _IO_BASE to the port numbers. In general,
bus_io_virt of the first bus == _IO_BASE on ppc32 but that's not a
strict rule.

Let's say for the sake of the example, that _IO_BASE is 0xd0000000 and
our bus has been mapped at 0xd0010000 (bus_io_virt). So 0xd0010000 maps
to 0xf0000000 via the MMU.

When we scan the bus, we read the BAR content. So for example, a device
whose IOs have been assigned at 0x1000 will read that as a RAW bar value
and pci_scan_slot() (or whoever does the reading) will put 0x1000 in the
struct resource. In a similar vein, such a legacy controller would thus
be expected to have 0x1f0 in the resource.

Later, when we fixup (in a head quirk on ppc32 and in pcibios_fixup_bus
on ppc64, though that's changing with 2.6.25 to use the same mechanism),
we see an IO resource, and we fixup by adding to it basically
(bus_io_virt - _IO_BASE).

That is, for our device that has a 0x1000 BAR value, we'll do:

resource = 0x1000 + (0xd0010000 - 0xd0000000) = 0x11000

And for the legacy IDE:

resource = 0x1f0 + (0xd0010000 - 0xd0000000) = 0x101f0

Now, if you do an inb or outb to one of these, the inb() and outb()
accessors will add _IO_BASE, which is 0xd0000000 in our example, so
you'll end up doing accesses to respectively:

access = 0xd0011000 for the example device or
access = 0xd00101f0 for the IDE controller

That translates via the MMU to

phys = 0xf0001000 for the example device or
phys = 0xf00001f0 for the IDE controller

Which translates on to the bus into an IO cycle at the raw BAR
address (which is what is in the BAR content or hard-decoded in the case
of the legacy IDE):

bus = 0x1000 for the example device
bus = 0x01f0 for the IDE controller

which ... is what we started with.
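
The round trip above can be written down as a small C sketch. It is only a
model of the arithmetic in this mail: the constants assume _IO_BASE is
0xd0000000, bus_io_virt is 0xd0010000 and the bus IO window sits at physical
0xf0000000, and the example_* helpers are hypothetical names, not kernel
functions.

```c
#include <stdint.h>

#define EXAMPLE_IO_BASE      0xd0000000ULL  /* _IO_BASE in the example */
#define EXAMPLE_BUS_IO_VIRT  0xd0010000ULL  /* bus_io_virt in the example */
#define EXAMPLE_BUS_IO_PHYS  0xf0000000ULL  /* physical base of the IO window */

/* Fixup step: raw BAR value -> value stored in struct resource. */
static uint64_t example_bar_to_resource(uint64_t bar)
{
    return bar + (EXAMPLE_BUS_IO_VIRT - EXAMPLE_IO_BASE);
}

/* What inb()/outb() do: add _IO_BASE to the port number. */
static uint64_t example_port_to_virt(uint64_t port)
{
    return EXAMPLE_IO_BASE + port;
}

/* MMU translation of the IO window: virtual -> physical. */
static uint64_t example_virt_to_phys(uint64_t virt)
{
    return virt - EXAMPLE_BUS_IO_VIRT + EXAMPLE_BUS_IO_PHYS;
}

/* Host bridge: physical address -> IO cycle address on the bus. */
static uint64_t example_phys_to_bus(uint64_t phys)
{
    return phys - EXAMPLE_BUS_IO_PHYS;
}
```

Feeding 0x1f0 through all four steps gives 0x101f0, 0xd00101f0, 0xf00001f0
and finally 0x1f0 again, which is the point: the value stored at scan time
has to be the raw bus address.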

Now I don't understand how MIPS does things differently, but I can't see
how it can be legal to call pci_resource_to_bus() on 0x1f0 in
pci_setup_device(), because at this stage, we are putting raw BAR values
in the struct resource (that is PCI bus addresses) and 0x1f0 _is_ such a
value.

Calling pci_resource_to_bus() would somewhat mean that 0x1f0 is not, and
instead is some kind of already fixed up resource value that we want
converted back into a BAR value, which is not the case.

So I suspect your patch works by accident more than by design, or I am
missing something...

Cheers,
Ben.



Re: [RFC] [PATCH] A clean approach to writeout throttling

2007-12-05 Thread Daniel Phillips

On Wednesday 05 December 2007 17:24, Andrew Morton wrote:
> On Wed, 5 Dec 2007 16:03:01 -0800 Daniel Phillips <[EMAIL PROTECTED]> wrote:
> > ...a block device these days may not be just a single 
> > device, but may be a stack of devices connected together by a generic 
> > mechanism such as device mapper, or a hardcoded stack such as 
> > multi-disk or network block device.  It is necessary to consider the 
> > resource requirements of the stack as a whole _before_ letting a 
> > transfer proceed into any layer of the stack, otherwise deadlock on 
> > many partially completed transfers becomes a possibility.  For this 
> > reason, the bio throttling is only implemented at the initial, highest 
> > level submission of the bio to the block layer and not for any recursive 
> > submission of the same bio to a lower level block device in a stack.
> > 
> > This in turn has rather far reaching implications: the top level device 
> > in a stack must take care of inspecting the entire stack in order to 
> > determine how to calculate its resource requirements, thus becoming
> > the boss device for the entire stack.  Though this intriguing idea could 
> > easily become the cause of endless design work and many thousands of 
> > lines of fancy code, today I sidestep the question entirely using 
> > the "just provide lots of reserve" strategy.  Horrifying as it may seem 
> > to some, this is precisely the strategy that Linux has used in the 
> > context of resource management in general, from the very beginning and 
> > likely continuing for quite some time into the future.  My strongly held 
> > opinion in this matter is that we need to solve the real, underlying 
> > problems definitively with nice code before declaring the opening of 
> > fancy patch season.  So I am leaving further discussion of automatic 
> > resource discovery algorithms and the like out of this post.
> 
> Rather than asking the stack "how much memory will this request consume"
> you could instead ask "how much memory are you currently using".
> 
> ie: on entry to the stack, do 
> 
>   current->account_block_allocations = 1;
>   make_request(...);
>   rq->used_memory += current->pages_used_for_block_allocations;
> 
> and in the page allocator do
> 
>   if (!in_interrupt() && current->account_block_allocations)
>   current->pages_used_for_block_allocations++;
> 
> and then somehow handle deallocation too ;)

Ah, and how do you ensure that you do not deadlock while making this
inquiry?  Perhaps send a dummy transaction down the pipe?  Even so,
deadlock is possible, quite evidently so in the real life example I have
at hand.

Yours is essentially one of the strategies I had in mind, the other major
one being simply to examine the whole stack, which presupposes some
as-yet-nonexistent kernel wide method of representing block device
stacks in all their glorious possible topology variations.

> The basic idea being to know in real time how much memory a particular
> block stack is presently using.  Then, on entry to that stack, if the
> stack's current usage is too high, wait for it to subside.

We do not wait for high block device resource usage to subside before
submitting more requests.  The improvement you suggest is aimed at
automatically determining resource requirements by sampling a
running system, rather than requiring a programmer to determine them
arduously by hand.  Something like automatically determining a
workable locking strategy by analyzing running code, wouldn't that be
a treat?  I will hope for one of those under my tree at Christmas.

More practically, I can see a debug mode implemented along the lines
you describe where we automatically detect that a writeout path has
violated its covenant as expressed by its throttle_metric.

> otoh we already have mechanisms for limiting the number of requests in
> flight.  This is approximately proportional to the amount of memory which
> was allocated to service those requests.  Why not just use that?

Two reasons.  The minor one is that device mapper bypasses that
mechanism (no elevator) and the major one is that number of requests
does not map well to the amount of resources consumed.  In ddsnap for
example, the amount of memory used by the userspace ddsnapd is
roughly linear vs the number of pages transferred, not the number of
requests.
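
As a rough userspace model of charging by pages rather than by requests,
something like the sketch below captures the idea. It is not the code from
the patch: struct throttle, pages_metric(), throttle_try_reserve() and
throttle_release() are hypothetical names, and where this model simply
refuses a reservation the kernel code would sleep until enough units are
released.

```c
/* Hypothetical pool of resource units shared by a whole device stack. */
struct throttle {
    long available;
};

/* Metric: charge one unit per page touched, not one per request. */
static long pages_metric(unsigned long bytes, unsigned long page_size)
{
    return (long)((bytes + page_size - 1) / page_size);
}

/* Take 'need' units if they are available; return 1 on success, 0 if not. */
static int throttle_try_reserve(struct throttle *t, long need)
{
    if (t->available < need)
        return 0;
    t->available -= need;
    return 1;
}

/* Give units back when the transfer completes. */
static void throttle_release(struct throttle *t, long units)
{
    t->available += units;
}
```

With 8 units available, a 16KB transfer on 4KB pages takes 4 units, and a
following 20KB transfer (5 units) has to wait until the first one completes,
even though only two requests are involved.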

> > @@ -3221,6 +3221,13 @@ static inline void __generic_make_reques
> > if (bio_check_eod(bio, nr_sectors))
> > goto end_io;
> >  
> > +   if (q && q->metric && !bio->bi_queue) {
> > +   int need = bio->bi_throttle = q->metric(bio);
> > +   bio->bi_queue = q;
> > +   /* FIXME: potential race if atomic_sub is called in the middle 
> > of condition check */
> > +   wait_event_interruptible(q->throttle_wait, 
> > atomic_read(&q->available) >= need);
> 
> This will fall straight through if signal_pending() and (I assume) bad
> stuff will happen.  uninterruptible sleep, methinks.

Yes

Re: New Address Family: Inter Process Networking (IPN)

2007-12-05 Thread Kyle Moffett


On Dec 06, 2007, at 00:30:16, Renzo Davoli wrote:
AF_IPN is different.  AF_IPN is the broadcast and peer-to-peer  
extension of AF_UNIX. It supports communication among *user*  
processes.


Ok, you say it's different, but then you describe how IP unicast and  
broadcast work.  Both are frequently used for communication among  
"*user* processes".  Please provide significantly more details about  
exactly *how* it's different.




Example:

Qemu, User-Mode Linux, Kvm, our umview machines can use IPN as an  
Ethernet Hub and communicate among themselves with the hosting  
computer and the world by a tap like interface.


You say "tap like" interface, but people do this already with  
existing infrastructure.  You can connect Qemu, UML, and KVM to a  
standard Linux "tap" interface, and then use the standard Linux  
bridging code to connect the "tap" interface to your existing network  
interfaces.  Alternatively you could use the standard and well-tested  
IP routing/firewalling/NAT code to move your packets around.  None of  
this requires new network infrastructure in the slightest.  If you  
have problems with the existing code, please improve it instead of  
creating a slightly incompatible replacement which has different bugs  
and workarounds.



You can also grab an interface (say eth1) and use eth0 for your  
hosting computer and eth1 for the IPN network of virtual machines.


You can do that already with the bridging code.


If you load the kvde_switch submodule IPN can be a virtual Ethernet  
switch.


As I described above, this can be done with the existing bridging and  
tun/tap code.




Another Example:

You have a continuous stream of data packets generated by a  
process, and you want to send this data to many processes.  Maybe  
the set of processes is not known in advance, you want to send the  
data to any interested process. Some kind of publish&subscribe  
communication service (among unix processes not on TCP-IP). Without  
IPN you need a server. With IPN the sender creates the socket  
connects to it and feed it with data packets. All the interested  
receivers connects to it and start reading. That's all.


This is already done frequently in userspace.  Just register a port  
number with IANA on which to implement a "registration" server and  
write a little daemon to listen on 127.0.0.1:${YOUR_PORT}.  Your  
interconnecting programs then use either unicast or multicast sockets  
to bind, then report to the registration server what service you are  
offering and what port it's on.  Your "receivers" then connect to the  
registration server, ask what port a given service is on, and then  
multicast-listen or unicast-connect to access that service.  The best  
part is that all of the performance implications are already  
thoroughly understood.  Furthermore, if you want to extend your  
communication protocol to other hosts as well, you just have to  
replace the 127.0.0.1 bind with a global bind.  This is exactly how  
the standard-specified multiple-participant "SIP" protocol works, for  
example.
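
The bookkeeping side of such a registration daemon is small. The following
is a minimal in-memory sketch, assuming a fixed-size table keyed by service
name; struct registry, registry_add() and registry_lookup() are names
invented for this illustration, and the socket code that would carry the
requests over 127.0.0.1 is left out.

```c
#include <string.h>

#define MAX_SERVICES 16

/* One advertised service: a name and the port it can be reached on. */
struct service {
    char name[32];
    unsigned short port;
};

struct registry {
    struct service entries[MAX_SERVICES];
    int count;
};

/* Record that 'name' is offered on 'port'; return 0 on success, -1 if full
 * or if the name does not fit. */
static int registry_add(struct registry *r, const char *name,
                        unsigned short port)
{
    if (r->count >= MAX_SERVICES ||
        strlen(name) >= sizeof(r->entries[0].name))
        return -1;
    strcpy(r->entries[r->count].name, name);
    r->entries[r->count].port = port;
    r->count++;
    return 0;
}

/* Return the port for 'name', or -1 if nobody registered it. */
static int registry_lookup(const struct registry *r, const char *name)
{
    int i;

    for (i = 0; i < r->count; i++)
        if (strcmp(r->entries[i].name, name) == 0)
            return r->entries[i].port;
    return -1;
}
```

A sender would register its service once after binding, and each receiver
would look the name up before doing its multicast-listen or unicast-connect.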



So if you really think this is something that belongs in the kernel  
you need to provide much more detailed descriptions and use-cases for  
why it cannot be implemented in user-space or with small  
modifications to existing UDP/TCP networking.


Cheers,
Kyle Moffett


Re: PS3: trouble with SPARSEMEM_VMEMMAP and kexec

2007-12-05 Thread Yasunori Goto


> I'll try Milton's suggestion to pre-allocate the memory early.  It seems
> that should work as long as nothing else before the hot-plug mem is added
> needs a large chunk.

Hello, Geoff-san. Sorry for the late response.

Could you tell me the value of the following page_size calculation
in vmemmap_populate()? I think this page_size may be too big a value.

--
int __meminit vmemmap_populate(struct page *start_page,
   unsigned long nr_pages, int node)
   :
   :
unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
   :
---


In addition, I remember that the current add_memory() is designed for
adding only 1 section. (See memory_probe_store() and
sparse_mem_map_populate(): they ask only for 1 section's mem_map by
specifying PAGES_PER_SECTION.)
The size of 1 section on a normal powerpc box is only 16MB.
(IA64 -> 1GB, x86-64 -> 128MB).

But, if my understanding is correct, PS3's add_memory() requires all
of the total memory. I'm afraid some other problems might still be hidden
in this issue.

(However, I think Milton-san's suggestion is very desirable. 
 If preallocation of hotadd works on ia64 too, I'm very glad.)

Thanks.

-- 
Yasunori Goto 



Re: New Address Family: Inter Process Networking (IPN)

2007-12-05 Thread Stephen Hemminger

On Thu, 6 Dec 2007 06:38:21 +0100
[EMAIL PROTECTED] (Renzo Davoli) wrote:

> On Wed, Dec 05, 2007 at 04:55:52PM -0500, Stephen Hemminger wrote:
> > On Wed, 5 Dec 2007 17:40:55 +0100
> > [EMAIL PROTECTED] (Renzo Davoli) wrote:
> > > 0- (Constructive) comments.
> > > 1- The "official" assignment of an Address Family.
> > > 2- Another "grabbing hook" for interfaces (like the ones already
> > > We are studying some way to register/deregister grabbing services,
> > > I feel this would be the cleanest way. 
> > 
> > Post complete source code for kernel part to [EMAIL PROTECTED]
> I'll do it as soon as possible.
> > If you want the hooks, you need to include the full source code for 
> > inclusion
> > in mainline. All the Documentation/SubmittingPatches rules apply;
> > you can't just ask for "facilitators" and expect to keep your stuff out of 
> > tree.
> I am sorry if I was misunderstood.
> I did not want any "facilitator", nor I wanted to keep my code outside
> the kernel, on the contrary.

Great

> It is perfectly okay for me to provide the entire code for inclusion.
> The purposes of my message were the following:
> - I wanted to introduce the idea and say to the linux kernel community
>   that a team is working on it.
> - Address family: is it okay to send a patch that add a new AF?
> is there a "AF registry" somewhere? (like the device major/minor
> registry or the well-known port assignment for TCP-IP).

The usual process is to just add the value as part of the patchset.
You then need to tell the glibc maintainers so it gets included appropriately
in userspace.

> - Hook: we have two different options. We can add another grabbing
> inline function like those used by the bridge and macvlan or we can
> design a grabbing service registration facility. Which one is preferable?

The problems with making it a registration facility are:
 * risk of making it easier for non-GPL out of tree abuse
 * possible ordering issues: ie. by hardcoding each hook, the
behaviour is defined in the case of multiple usages on the same
machine.

> The former is simpler, the latter is more elegant but it requires some 
> changes in the kernel bridge code.

Not a big deal, but see above

> So the former choice is between less-invasive,safer,inelegant, the
> latter is more-invasive,less safe,elegant.

 
> We need a bit of time to stabilize the code: deeply testing the existing
> features and implementing some more ideas we have on it.
> In the meanwhile we would be grateful if the community could kindly ask to the
> questions above.

I am a believer in review early and often. It is easier to just deal with
the nuisance issues (style, naming, configuration) at the beginning rather
than the final stage of the project.

[PATCH] fix br_fdb_fini() section mismatch

2007-12-05 Thread Harald Welte

When compiling a kernel (current linus git or 2.6.24-rc4) with built-in
CONFIG_BRIDGE, I get the following error:

  LD  .tmp_vmlinux1
`br_fdb_fini' referenced in section `.init.text' of net/built-in.o: defined in 
discarded section `.exit.text' of net/built-in.o
make: *** [.tmp_vmlinux1] Error 1

This patch fixes it.

Signed-off-by: Harald Welte <[EMAIL PROTECTED]>

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index eb57502..bc40377 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -44,7 +44,7 @@ int __init br_fdb_init(void)
return 0;
 }
 
-void __exit br_fdb_fini(void)
+void br_fdb_fini(void)
 {
kmem_cache_destroy(br_fdb_cache);
 }
-- 
- Harald Welte <[EMAIL PROTECTED]>  http://gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH] Reduce stack used by lib/hexdump.c

2007-12-05 Thread Kyle Moffett


On Dec 05, 2007, at 21:42:35, Joe Perches wrote:

On Wed, 2007-12-05 at 18:18 -0800, Randy Dunlap wrote:

Joe Perches wrote:
Maybe just eliminate the 16 or 32 byte width option and force it  
to only 16 byte widths.
Have you checked users (callers)?  I'm pretty sure that one of the  
callers wanted 32 and that's why it's there.


I did.  There is only 1 subsystem.  That's easy to change.

drivers/mtd/ubi/debug.c:  print_hex_dump(KERN_DEBUG, "",  
DUMP_PREFIX_OFFSET, 32, 1,
drivers/mtd/ubi/io.c: print_hex_dump(KERN_DEBUG, "",  
DUMP_PREFIX_OFFSET, 32, 1,


Long lines in the log file are not too easy to read anyway.  Using  
16 byte dumps per line instead of 32 isn't painful.


It gets rid of the allocation, reduces the argument count and makes  
the kernel smaller.  I think it's all good.


Every current caller would have to change though.


Alternatively, since print_hex_dump is not a performance-critical  
path (and usually indicates an error/debug condition), you could  
probably just make a static "hexdump_lock" spinlock and  
spin_lock_irqsave()/spin_unlock_irqrestore().  It would always nest  
inside any other lock (except during crash, where we break locks  
already for printk()), and I doubt any of the callers would notice  
the serialization since they're already serialized on the printk buffer.
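
A userspace analogue of that suggestion might look like the sketch below. It
assumes a pthread mutex standing in for the spinlock and a fixed 16-byte row;
hex_line() is a hypothetical helper, not the lib/hexdump.c interface, and it
copies the row out only so the result can be inspected (the kernel version
would printk the shared buffer while still holding the lock).

```c
#include <pthread.h>
#include <stdio.h>

/* One shared row buffer instead of a per-call buffer on the stack. */
static pthread_mutex_t hexdump_lock = PTHREAD_MUTEX_INITIALIZER;
static char hexdump_line[16 * 3 + 1];

/* Format up to 16 bytes as "aa bb cc ..." and copy the row into 'out'.
 * Returns the length of the formatted row. */
static size_t hex_line(const unsigned char *buf, size_t len,
                       char *out, size_t outsz)
{
    size_t pos = 0;
    size_t i;

    if (len > 16)
        len = 16;

    pthread_mutex_lock(&hexdump_lock);
    hexdump_line[0] = '\0';
    for (i = 0; i < len; i++)
        pos += (size_t)snprintf(hexdump_line + pos,
                                sizeof(hexdump_line) - pos,
                                "%s%02x", i ? " " : "", buf[i]);
    snprintf(out, outsz, "%s", hexdump_line);
    pthread_mutex_unlock(&hexdump_lock);

    return pos;
}
```

Callers are serialized only for the time it takes to format and emit one
row, which matches the observation that they already serialize on the printk
buffer.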


Cheers,
Kyle Moffett


Re: Please revert: PCI: fix IDE legacy mode resources

2007-12-05 Thread Yoichi Yuasa

On Thu, 06 Dec 2007 16:04:07 +1100
Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:

> 
> On Thu, 2007-12-06 at 13:34 +0900, Yoichi Yuasa wrote:
> > > I don't understand how his fix can work on MIPS nor why the previous
> > > code didn't, but I don't know how MIPS does its remapping tricks,
> > > however it will definitely -not- work on powerpc (and will break a
> > > couple of machines out there).
> > 
> > MIPS pcibios_fixup_bus() converts RAW BAR values(including offset) to
> > resource values. How does it fix up on powerpc?
> 
> Same thing. We expect resources to contain raw values before .

Same.

> What I don't understand is thus why you are calling resource_to_bus on 0x1f0
> which is -not- a resource value, but is already a BAR value...

0x1f0 is resource value on MIPS Cobalt.
All RAW BAR values contain the offset(0x1000) on it.

Do the BAR values on your target contain the offset?

Yoichi




Re: broken dpt_i2o in 2.6.23 (was: ext2_check_page: bad entry in directory)

2007-12-05 Thread FUJITA Tomonori

On Wed, 5 Dec 2007 11:14:41 +0100
Anders Henke <[EMAIL PROTECTED]> wrote:

> On Tue, 4 Dec 2007 Andrew Morton wrote:
> > On Wed, 05 Dec 2007 10:30:54 +0900 FUJITA Tomonori <[EMAIL PROTECTED]> 
> > wrote:
> > 
> > > On Tue, 4 Dec 2007 17:11:55 -0800
> > > Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > 
> > > > On Wed, 05 Dec 2007 10:04:03 +0900
> > > > FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > > > 
> > > > > On Tue, 4 Dec 2007 16:57:38 -0800
> > > > > Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > > > 
> > > > > > On Thu, 29 Nov 2007 13:31:50 +0100
> > > > > > Anders Henke <[EMAIL PROTECTED]> wrote:
> > > > > > 
> > > > > > > On November 28 2007, Anders Henke wrote:
> > > > > > > > As "everything is reported as being zero" is quite odd and Jan 
> > > > > > > > took a
> > > > > > > > guess that it might be block-layer or driver-related, I've 
> > > > > > > > assumed
> > > > > > > > that the driver is responsible for this; just out of the 
> > > > > > > > curiousity, 
> > > > > > > > I've manually replaced the dpt_i2o driver by the 2.6.19 one by 
> > > > > > > > copying 
> > > > > > > > driver/scsi/dpt_i2o.c driver/scsi/dpti.h and driver/scsi/dpt/ 
> > > > > > > > into a 
> > > > > > > > vanilla 2.6.23.1. kernel; using this kernel fixed the issue for 
> > > > > > > > me.
> > > > > > > > 
> > > > > > > > I haven't yet fine-tested from which kernel release on the 
> > > > > > > > dpt_i2o driver 
> > > > > > > > behaves like this and spews out zeroed blocks when trying to 
> > > > > > > > mount
> > > > > > > > the rootfs. Maybe this is just some timing issue.
> > > > > > > 
> > > > > > > I've started the fine-tests and can say so far that dpt_i2o from 
> > > > > > > 2.6.22 is still fine. Test is simple:
> > > > > > > 
> > > > > > > [EMAIL PROTECTED]:/usr/src/linux-2.6.22/drivers/scsi/dpt$ cp -r 
> > > > > > > dpt/ dpt_i2o.c dpti.h /usr/src/linux-2.6.23.1/drivers/scsi/
> > > > > > > 
> > > > > > > ... recompile the kernel, reboot: works.
> > > > > > > 
> > > > > > > 2.6.22 and 2.6.23 differ in terms of the dpt_i2o driver by two 
> > > > > > > different
> > > > > > > patch sets:
> > > > > > > -one 2 Kb small set of patches from 2.6.22 to 2.6.22-rc1
> > > > > > > -one 7 Kb set of patches from 2.6.23-rc2 to 2.6.23-rc3
> > > > > > > -one 162 Kb set of patches from 2.6.23-rc9 to 2.6.23-rc10.
> > > > > > > 
> > > > > > > When applying the 2.6.23-rc1-based driver to "my" 2.6.31.1 kernel,
> > > > > > > the "zero blocks"-symptom show up, so it's the "lucky" situation
> > > > > > > that the smallest patch actually seems to be the broken one.
> > > > > > > 
> > > > > > > According to the 2.6.23-rc1 short-form changelog, there is
> > > > > > > one major edit on the dpt_i2o driver:
> > > > > > > 
> > > > > > > FUJITA Tomonori 
> > > > > > > 
> > > > > > >   [SCSI] dpt_i2o: convert to use the data buffer accessors
> > > > > > > 
> > > > > > > Stephen Rothwell 
> > > > > > >   dpt_i2o depends on virt_to_bus
> > > > > > > 
> > > > > > > Fujita, would you please take a look at this?
> > > > > > 
> > > > > > He won't have seen this.  cc's added.
> > > > > > 
> > > > > > > I think that something's broken in there, leading to the dpt_i2o 
> > > > > > > sending out blocks of zeroes right after initialization, at least 
> > > > > > > on
> > > > > > > some specific controllers (in this case, Adaptec 2010S on Intel
> > > > > > > SE7501WV2S-based boxes).
> > > > > > > 
> > > > > > > I don't have insight kernel driver development knowledge, so I'm
> > > > > > > quite out of help right now. Nevertheless, I'll add the diff
> > > > > > > from 2.6.22 to 2.6.23-rc1 in terms of dpt_i2o:
> > > > > > > 
> > > > > > 
> > > > > > Can you please confirm that this revert (against 2.6.24-rc4) fixes 
> > > > > > the data
> > > > > > corruption problems?
> > > > > 
> > > > > Anders said that my patch is fine and seems that Matthew's hotplug
> > > > > conversion patch leads to the problem:
> > > > > 
> > > > > http://marc.info/?l=linux-kernel&m=119641892129732&w=2
> > > > 
> > > > Oh.  Jan broke message threading :(
> > > > 
> > > > So it's been nearly a week and nothing has happened?  Do we revert that
> > > > change?
> > > 
> > > SCSI people really want this conversion...
> > > 
> > > Matthew, did you have a chance to look at it?
> > 
> > It seems pretty improbable that a change of that nature could cause data
> > corruption.  Anders, are you able to determine whether the revert (against
> > current Linus mainline or 2.6.24-rc4) fixes things?  Because it would be
> > very strange...
> > 
> > This is a grave bug.  It's really quite urgent...
> > 
> > Thanks.
> > 
> >  drivers/scsi/dpt_i2o.c |  132 ++-
> >  drivers/scsi/dpti.h|9 ++
> >  2 files changed, 68 insertions(+), 73 deletions(-)
> 
> I've done the following:
> 
> -untared a clean 2.6.24-rc4 and compiled it with my 2.6.23.1-settings in order
>  to verify that the driver is still broken: checked, the box still won't 
>  boot.
> 
> -patched the

Re: PS3: trouble with SPARSEMEM_VMEMMAP and kexec

2007-12-05 Thread Geoff Levand

Andrew Morton wrote:
> On Wed, 5 Dec 2007 10:52:48 +0100 (CET)
> Geert Uytterhoeven <[EMAIL PROTECTED]> wrote:
> 
>> 
>> Subject: sparsemem: sparse_add_one_section() may fail to allocate memory
>> 
>> sparsemem: sparse_add_one_section() may fail to allocate memory, and must 
>> check
>> whether the allocation succeeded before proceeding to touch the allocated
>> memory.
>> 
>> From: Geert Uytterhoeven <[EMAIL PROTECTED]>
>> 
>> Signed-off-by: Geert Uytterhoeven <[EMAIL PROTECTED]>
>> ---
>> FIXME There are still some possible memory leaks in sparse_add_one_section():
>>   - usemap is never deallocated
>>   - __kfree_section_memmap() is a not yet implemented dummy
> 
> I already had
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/broken-out/mm-sparsec-improve-the-error-handling-for-sparse_add_one_section.patch


This one has an error in it.  A patch to fix it is below.


> and
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/broken-out/mm-sparsec-check-the-return-value-of-sparse_index_alloc.patch
> 
> queued.  Do they fix the problem, and should they be merged in 2.6.24?


These two plus my fix below allow the hot plug add_memory() call to fail
gracefully and for the platform code to continue to boot on the
128MB of boot mem.

With ps3_defconfig the condition is only hit by the second stage
kexec'ed (kboot) kernel, which is not generally built by end users,
but there is a chance this condition would be hit by a custom kernel
config, so I think they should go in for 2.6.24.

I'll continue to work on a fix for the memory allocation failure.

-Geoff


--
Subject: sparsemem: Fix sparse_index_init return check

sparse_index_init() returns -EEXIST to indicate the index
has already been created.  Exclude this from the error check
on the return value.

Signed-off-by: Geoff Levand <[EMAIL PROTECTED]>
---
 mm/sparse.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -392,7 +392,7 @@ int sparse_add_one_section(struct zone *
 	 * plus, it does a kmalloc
 	 */
 	ret = sparse_index_init(section_nr, pgdat->node_id);
-	if (ret < 0)
+	if (ret < 0 && ret != -EEXIST)
 		return ret;
 	memmap = kmalloc_section_memmap(section_nr, pgdat->node_id, nr_pages);
 	if (!memmap)




Re: Please revert: PCI: fix IDE legacy mode resources

2007-12-05 Thread Yoichi Yuasa

On Thu, 06 Dec 2007 11:10:18 +1100
Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:

> The commit below that was merged in October looks bogus to me.
> 
> At this stage in the PCI probe, the pci_dev->resource's contain RAW bar
> values, that is bus values..
> 
> A PCI legacy IDE controller that hard decodes 0x1f0 etc...  does so as
> bus values as well. That is, the resources should contain 0x1f0...0x1f7
> etc... -not- some kind of transformed values, because that's exactly
> what a BAR would contain if it had been read from the device by
> pci_read_bases() and we haven't performed any fixup yet.
> 
> If the platform offsets resources, like powerpc does, it should do so
> later on in a fixup pass (on ppc, we use either a header quirk or
> fixup_bus depending on the phase of the moon) and that should work
> fine. 
> 
> I don't understand how his fix can work on MIPS nor why the previous
> code didn't, but I don't know how MIPS does its remapping tricks,
> however it will definitely -not- work on powerpc (and will break a
> couple of machines out there).

MIPS pcibios_fixup_bus() converts RAW BAR values (including offset) to
resource values.
How does it fix up on powerpc?

Yoichi 

Re: New Address Family: Inter Process Networking (IPN)

2007-12-05 Thread Renzo Davoli

> In the meanwhile we would be grateful if the community could kindly ask to the
> questions above.
Obviously I meant:
In the meanwhile we would be grateful if the community could kindly *answer*
to the questions above

sorry (it is early morning here, it happens ;-)

renzo

Re: [PATCH] capabilities: introduce per-process capability bounding set (v10)

2007-12-05 Thread Andrew Morgan

KaiGai Kohei wrote:
> BTW, could you tell me why pam_cap.c is implemented with
> pam_sm_authenticate() and pam_sm_setcred()?
> I think it can be done with pam_sm_open_session(), and this approach
> would avoid reading /etc/security/capability.conf more than once.
> 
> What do you think of the idea?

Good question! If you want to add session support you can. I'd prefer it
if you retained support for the auth/cred API too: admin choice and all
that. To remove the second read of the file, you can use a PAM data item
to cache the desired capability info after the first read of the file.

I implemented it as a credential module (which has to get the
authentication return code right to make the credential stack execute
correctly) because I think of capabilities as credentials.

That being said, the credentials vs. session thing is not well
delineated by many applications, so it is arguably useful to provide
both interfaces for the admin to make use of on a per application basis.

Cheers

Andrew

Re: New Address Family: Inter Process Networking (IPN)

2007-12-05 Thread Renzo Davoli

On Wed, Dec 05, 2007 at 04:55:52PM -0500, Stephen Hemminger wrote:
> On Wed, 5 Dec 2007 17:40:55 +0100
> [EMAIL PROTECTED] (Renzo Davoli) wrote:
> > 0- (Constructive) comments.
> > 1- The "official" assignment of an Address Family.
> > 2- Another "grabbing hook" for interfaces (like the ones already
> > We are studying some way to register/deregister grabbing services,
> > I feel this would be the cleanest way. 
> 
> Post complete source code for kernel part to [EMAIL PROTECTED]
I'll do it as soon as possible.
> If you want the hooks, you need to include the full source code for inclusion
> in mainline. All the Documentation/SubmittingPatches rules apply;
> you can't just ask for "facilitators" and expect to keep your stuff out of 
> tree.
I am sorry if I was misunderstood.
I did not want any "facilitator", nor did I want to keep my code outside
the kernel; on the contrary.
It is perfectly okay for me to provide the entire code for inclusion.
The purposes of my message were the following:
- I wanted to introduce the idea and tell the Linux kernel community
  that a team is working on it.
- Address family: is it okay to send a patch that adds a new AF?
  Is there an "AF registry" somewhere (like the device major/minor
  registry or the well-known port assignment for TCP/IP)?
- Hook: we have two different options. We can add another grabbing
  inline function like those used by the bridge and macvlan, or we can
  design a grabbing service registration facility. Which one is preferable?
  The former is simpler; the latter is more elegant but requires some
  changes in the kernel bridge code.
  So the choice is between the former (less invasive, safer, inelegant)
  and the latter (more invasive, less safe, elegant).
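
To make the second option concrete, a grabbing service registration
facility could look roughly like the userspace sketch below. Every name
in it (grab_hook_t, grab_register, grab_unregister, grab_dispatch,
example_hook) is hypothetical and is not an existing kernel interface;
real kernel code would also need locking or RCU around the table.

```c
#include <assert.h>
#include <stddef.h>

#define GRAB_MAX_HOOKS 4

/* A hook returns nonzero when it has consumed ("grabbed") the frame. */
typedef int (*grab_hook_t)(const void *frame, size_t len);

/* Hypothetical table of registered grabbing services. */
static grab_hook_t grab_hooks[GRAB_MAX_HOOKS];

/* Register a hook; returns 0 on success, -1 if the table is full. */
int grab_register(grab_hook_t hook)
{
    assert(hook != NULL);
    for (size_t i = 0; i < GRAB_MAX_HOOKS; i++) {
        if (grab_hooks[i] == NULL) {
            grab_hooks[i] = hook;
            return 0;
        }
    }
    return -1;
}

/* Remove a previously registered hook; returns -1 if it was not found. */
int grab_unregister(grab_hook_t hook)
{
    if (hook == NULL)
        return -1;
    for (size_t i = 0; i < GRAB_MAX_HOOKS; i++) {
        if (grab_hooks[i] == hook) {
            grab_hooks[i] = NULL;
            return 0;
        }
    }
    return -1;
}

/* Offer a frame to each hook in turn; stop at the first one that grabs it. */
int grab_dispatch(const void *frame, size_t len)
{
    for (size_t i = 0; i < GRAB_MAX_HOOKS; i++) {
        if (grab_hooks[i] != NULL && grab_hooks[i](frame, len))
            return 1;
    }
    return 0;
}

/* Example service: grabs every non-empty frame. */
int example_hook(const void *frame, size_t len)
{
    (void)frame;
    return len > 0;
}
```

The first option (one more inline check next to the bridge and macvlan
ones) needs no table at all, which is why it is the less invasive of
the two.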

We need a bit of time to stabilize the code: deeply testing the existing
features and implementing some more ideas we have on it.
In the meanwhile we would be grateful if the community could kindly ask to the
questions above.

renzo

[PATCH] pid: Fix mips irix emulation pid usage

2007-12-05 Thread Eric W. Biederman


Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 arch/mips/kernel/irixelf.c |   14 +++---
 arch/mips/kernel/irixsig.c |   16 ++--
 arch/mips/kernel/sysirix.c |   12 ++--
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/arch/mips/kernel/irixelf.c b/arch/mips/kernel/irixelf.c
index 7852c7c..290d8e3 100644
--- a/arch/mips/kernel/irixelf.c
+++ b/arch/mips/kernel/irixelf.c
@@ -591,9 +591,9 @@ static void irix_map_prda_page(void)
return;
 
pp = (struct prda *) v;
-   pp->prda_sys.t_pid  = current->pid;
+   pp->prda_sys.t_pid  = task_pid_vnr(current);
pp->prda_sys.t_prid = read_c0_prid();
-   pp->prda_sys.t_rpid = current->pid;
+   pp->prda_sys.t_rpid = task_pid_vnr(current);
 
/* We leave the rest set to zero */
 }
@@ -1170,11 +1170,11 @@ static int irix_core_dump(long signr, struct pt_regs *regs, struct file *file, u
prstatus.pr_info.si_signo = prstatus.pr_cursig = signr;
prstatus.pr_sigpend = current->pending.signal.sig[0];
prstatus.pr_sighold = current->blocked.sig[0];
-   psinfo.pr_pid = prstatus.pr_pid = current->pid;
-   psinfo.pr_ppid = prstatus.pr_ppid = current->parent->pid;
-   psinfo.pr_pgrp = prstatus.pr_pgrp = task_pgrp_nr(current);
-   psinfo.pr_sid = prstatus.pr_sid = task_session_nr(current);
-   if (current->pid == current->tgid) {
+   psinfo.pr_pid = prstatus.pr_pid = task_pid_vnr(current);
+   psinfo.pr_ppid = prstatus.pr_ppid = task_pid_vnr(current->parent);
+   psinfo.pr_pgrp = prstatus.pr_pgrp = task_pgrp_vnr(current);
+   psinfo.pr_sid = prstatus.pr_sid = task_session_vnr(current);
+   if (thread_group_leader(current)) {
/*
 * This is the record for the group leader.  Add in the
 * cumulative times of previous dead threads.  This total
diff --git a/arch/mips/kernel/irixsig.c b/arch/mips/kernel/irixsig.c
index 5b10ac1..0215c80 100644
--- a/arch/mips/kernel/irixsig.c
+++ b/arch/mips/kernel/irixsig.c
@@ -578,10 +578,11 @@ out:
 
 #define W_MASK  (W_EXITED | W_TRAPPED | W_STOPPED | W_CONT | W_NOHANG)
 
-asmlinkage int irix_waitsys(int type, int pid,
+asmlinkage int irix_waitsys(int type, int upid,
struct irix5_siginfo __user *info, int options,
struct rusage __user *ru)
 {
+   struct pid *pid = NULL;
int flag, retval;
DECLARE_WAITQUEUE(wait, current);
struct task_struct *tsk;
@@ -604,6 +605,8 @@ asmlinkage int irix_waitsys(int type, int pid,
if (type != IRIX_P_PID && type != IRIX_P_PGID && type != IRIX_P_ALL)
return -EINVAL;
 
+   if (type != IRIX_P_ALL)
+   pid = find_get_pid(upid);
add_wait_queue(&current->signal->wait_chldexit, &wait);
 repeat:
flag = 0;
@@ -612,9 +615,9 @@ repeat:
tsk = current;
list_for_each(_p, &tsk->children) {
p = list_entry(_p, struct task_struct, sibling);
-   if ((type == IRIX_P_PID) && p->pid != pid)
+   if ((type == IRIX_P_PID) && task_pid(p) != pid)
continue;
-   if ((type == IRIX_P_PGID) && task_pgrp_nr(p) != pid)
+   if ((type == IRIX_P_PGID) && task_pgrp(p) != pid)
continue;
if ((p->exit_signal != SIGCHLD))
continue;
@@ -639,7 +642,7 @@ repeat:
 
retval = __put_user(SIGCHLD, &info->sig);
retval |= __put_user(0, &info->code);
-   retval |= __put_user(p->pid, &info->stuff.procinfo.pid);
+   retval |= __put_user(task_pid_vnr(p), &info->stuff.procinfo.pid);
retval |= __put_user((p->exit_code >> 8) & 0xff,
   &info->stuff.procinfo.procdata.child.status);
retval |= __put_user(p->utime, &info->stuff.procinfo.procdata.child.utime);
@@ -657,7 +660,7 @@ repeat:
getrusage(p, RUSAGE_BOTH, ru);
retval = __put_user(SIGCHLD, &info->sig);
retval |= __put_user(1, &info->code);  /* CLD_EXITED */
-   retval |= __put_user(p->pid, &info->stuff.procinfo.pid);
+   retval |= __put_user(task_pid_vnr(p), &info->stuff.procinfo.pid);
retval |= __put_user((p->exit_code >> 8) & 0xff,
   &info->stuff.procinfo.procdata.child.status);
retval |= __put_user(p->utime,
@@ -665,7 +668,7 @@ repeat:
retval |= __put_user(p->stime,
   &info->stuff.procinfo.procdata.child.stime);
if (retval)
-   return retval;
+   goto end_waitsys;
 
if (p->real_parent != p->parent) {

Re: [PATCH] fix br_fdb_fini() section mismatch

2007-12-05 Thread David Miller

From: Harald Welte <[EMAIL PROTECTED]>
Date: Thu, 6 Dec 2007 10:56:58 +0530

> When compiling a kernel (current linus git or 2.6.24-rc4) with built-in
> CONFIG_BRIDGE, I get the following error:
> 
>   LD  .tmp_vmlinux1
> `br_fdb_fini' referenced in section `.init.text' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o
> make: *** [.tmp_vmlinux1] Error 1
> 
> This patch fixes it.
> 
> Signed-off-by: Harald Welte <[EMAIL PROTECTED]>

Thanks, I already have this in my net-2.6.25 tree and will propagate
it to 2.6.24-rcX as well.

Bloggoo.com: create a weblog quickly, for free, easily, right now

2007-12-05 Thread Bloggoo.com

Dear  linux-kernel@vger.kernel.org,

[EMAIL PROTECTED] has sent you an invite to sign up at Bloggoo.com - 
http://bloggoo.com.

"BlogGoo (www.bloggoo.com) was created to give users a personal space
where they can freely create their own writing: telling personal stories
and daily events, sharing information and articles, posting pictures, video
and audio, or exchanging opinions and news, as each user wishes.

In addition, BlogGoo is an online community where blog owners can get in
touch and build relationships with other blog owners, form good friendships
on the Internet, and broaden their horizons.

BlogGoo is currently in a phase where the system needs testing before real
use, and it will open officially soon.
We are looking for people who are interested in taking part in this testing.
If you are interested, you can sign up and create your blog for free right
away at http://bloggoo.com/wp-signup.php to try out blog creation immediately.

You can also send feedback or suggestions about the BlogGoo website to
[EMAIL PROTECTED]

Finally, we would like to thank everyone for their support,
and we hope you enjoy using our BlogGoo service."

You can create your account here:
http://bloggoo.com/wp-signup.php

We are looking forward to seeing you on the site.

Cheers,

--The Team @ Bloggoo.com


Re: New Address Family: Inter Process Networking (IPN)

2007-12-05 Thread Renzo Davoli

On Thu, Dec 06, 2007 at 12:39:22AM +0100, Andi Kleen wrote:
> [EMAIL PROTECTED] (Renzo Davoli) writes:
> 
> > Berkeley socket have been designed for client server or point to point
> > communication. All existing Address Families implement this idea.
> Netlink is multicast/broadcast by default for once. And BC/MC certainly
> works for IPv[46] and a couple of other protocols too.
> 
> > IPN is an Inter Process Communication paradigm where all the processes
> > appear as they were connected by a networking bus.
> 
> Sounds like netlink. See also RFC 3549

RFC 3549 says:
"This document describes Linux Netlink, which is used in Linux both as
   an intra-kernel messaging system as well as between kernel and user
   space."

We know AF_NETLINK, our user-space stack lwipv6 supports it.

AF_IPN is different. 
AF_IPN is the broadcast and peer-to-peer extension of AF_UNIX.
It supports communication among *user* processes. 

Example:

Qemu, User-Mode Linux, Kvm, and our umview machines can use IPN as an
Ethernet hub and communicate among themselves, with the hosting computer,
and with the world through a tap-like interface.

You can also grab an interface (say eth1) and use eth0 for your hosting
computer and eth1 for the IPN network of virtual machines.

If you load the kvde_switch submodule IPN can be a virtual Ethernet switch.

This example is already working using the svn versions of ipn and
vdeplug.

Another Example:

You have a continuous stream of data packets generated by a process,
and you want to send this data to many processes.
Maybe the set of processes is not known in advance, and you want to send
the data to any interested process: some kind of publish & subscribe
communication service (among unix processes, not over TCP/IP).
Without IPN you need a server. With IPN the sender creates the socket,
connects to it and feeds it with data packets. All the interested
receivers connect to it and start reading. That's all.
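
For comparison, the "you need a server" case can be sketched in plain
userspace C. This is not IPN code: it only illustrates the relay loop
that a conventional server process has to run, using ordinary AF_UNIX
sockets. The function name hub_fan_out is made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Relay one message to every subscriber socket, as a conventional
 * publish/subscribe server would have to do for each packet.
 * Returns the number of subscribers that were sent the full message.
 */
size_t hub_fan_out(const int *subs, size_t nsubs, const void *msg, size_t len)
{
    size_t delivered = 0;

    assert(subs != NULL || nsubs == 0);
    for (size_t i = 0; i < nsubs; i++) {
        ssize_t n = send(subs[i], msg, len, 0);
        if (n == (ssize_t)len)
            delivered++;
    }
    return delivered;
}
```

A server built this way also has to accept connections and track the
subscriber list itself; that bookkeeping is the part the message above
says IPN would take over.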

I hope that this message can give a better understanding of what IPN is.

renzo

Re: [PATCH] esp_scsi: Fix reset cleanup spinlock recursion

2007-12-05 Thread David Miller

From: "Maciej W. Rozycki" <[EMAIL PROTECTED]>
Date: Wed, 5 Dec 2007 16:10:54 + (GMT)

>  The esp_reset_cleanup() function is called with the host lock held and 
> invokes starget_for_each_device() which wants to take it too.  Here is a 
> fix along the lines of shost_for_each_device()/__shost_for_each_device() 
> adding a __starget_for_each_device() counterpart which assumes the lock 
> has already been taken.
> 
>  Eventually, I think the driver should get modified so that more work is 
> done as a softirq rather than in the interrupt context, but for now it 
> fixes a bug that causes the spinlock debugger to fire.
> 
>  While at it, it fixes a small number of cosmetic problems with 
> starget_for_each_device() too.
> 
> Signed-off-by: Maciej W. Rozycki <[EMAIL PROTECTED]>

Acked-by: David S. Miller <[EMAIL PROTECTED]>

Thanks for finding and fixing this bug.
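
The locked/unlocked pairing described in the quoted text can be modelled
with a toy userspace example. This is only a sketch: the toy_ names are
invented, the "lock" is a flag that asserts on recursion instead of a
real spinlock, and the kernel's convention of a "__" prefix is replaced
by a "_locked" suffix because double-underscore names are reserved in
userspace C.

```c
#include <assert.h>
#include <stddef.h>

/* Toy non-recursive lock: taking it twice trips the assertion, much as
 * the spinlock debugger flags recursion. */
struct toy_lock { int held; };

void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

struct toy_host {
    struct toy_lock lock;
    int devices[4];
    size_t ndevices;
};

/* Variant for callers that already hold host->lock. */
void toy_for_each_device_locked(struct toy_host *host, void (*fn)(int *dev))
{
    assert(host->lock.held);
    for (size_t i = 0; i < host->ndevices; i++)
        fn(&host->devices[i]);
}

/* Variant that takes host->lock itself; must not be called with it held. */
void toy_for_each_device(struct toy_host *host, void (*fn)(int *dev))
{
    toy_lock_acquire(&host->lock);
    toy_for_each_device_locked(host, fn);
    toy_lock_release(&host->lock);
}

/* Example callback: mark a device as reset. */
void toy_mark_reset(int *dev) { *dev = 0; }

/* Cleanup path entered with host->lock held, so it must use the
 * _locked variant; calling toy_for_each_device() here would recurse. */
void toy_reset_cleanup(struct toy_host *host)
{
    toy_for_each_device_locked(host, toy_mark_reset);
}
```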


Re: [PATCH] Documentation about unaligned memory access

2007-12-05 Thread Brandon Philips

On 16:11 Mon 03 Dec 2007, Daniel Drake wrote:
> Here's a document I wrote after figuring out what unaligned memory access
> is all about. I've tried to cover the information I was looking for when
> trying to learn about this, without producing a hopelessly detailed/complex
> spew. I hope it is useful to others.
> 
> Signed-off-by: Daniel Drake <[EMAIL PROTECTED]>
> 
> ---

> +These macros work work for memory accesses of any length (not just 32 bits as

Remove a "work" here.

> +in the examples above). Be aware that when compared to standard access of
> +aligned memory, using these macros to access unaligned memory can be costly in
> +terms of performance.
> +
> +If use of such macros is not convenient, another option is to use memcpy(),
> +where the source or destination (or both) are of type u8* or unsigned char*.
> +Due to the byte-wise nature of this operation, unaligned accesses are avoided.
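
A minimal userspace sketch of the memcpy() approach from the quoted
paragraph follows. The helper names are made up for illustration and
are not the kernel's unaligned-access macros; values are read and
written in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value (host byte order) from a possibly unaligned address. */
uint32_t read_u32_unaligned(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));   /* byte-wise copy, no aligned load needed */
    return v;
}

/* Store a 32-bit value (host byte order) at a possibly unaligned address. */
void write_u32_unaligned(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof(v));
}
```

Compilers typically turn such a fixed-size memcpy() into whatever load
or store sequence is safe on the target, so the cost depends on the
architecture.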

Cheers,

Brandon

Re: Please revert: PCI: fix IDE legacy mode resources

2007-12-05 Thread Benjamin Herrenschmidt


On Thu, 2007-12-06 at 13:34 +0900, Yoichi Yuasa wrote:
> > I don't understand how his fix can work on MIPS nor why the previous
> > code didn't, but I don't know how MIPS does its remapping tricks,
> > however it will definitely -not- work on powerpc (and will break a
> > couple of machines out there).
> 
> MIPS pcibios_fixup_bus() converts RAW BAR values(including offset) to
> resource values. How does it fix up on powerpc?

Same thing. We expect resources to contain raw values before the fixup. What I
don't understand is thus why you are calling resource_to_bus on 0x1f0
which is -not- a resource value, but is already a BAR value...
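
The difference between the two kinds of value can be shown with a toy
userspace model. The offset and the toy_ helpers below are invented for
illustration only; they are not the kernel's resource/bus conversion
functions nor any real platform's offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical platform: resource address = bus (BAR) address + offset. */
#define TOY_IO_OFFSET 0x10000000u

/* Fixup direction: turn a raw BAR value into a resource value. */
uint32_t toy_bus_to_resource(uint32_t bus)
{
    assert(bus <= UINT32_MAX - TOY_IO_OFFSET);
    return bus + TOY_IO_OFFSET;
}

/* Reverse direction: only meaningful for values that already had the
 * offset applied. Feeding it a raw BAR value wraps around. */
uint32_t toy_resource_to_bus(uint32_t resource)
{
    return (uint32_t)(resource - TOY_IO_OFFSET);
}
```

In this model the legacy port 0x1f0 becomes 0x100001f0 after the fixup
and converts back cleanly, whereas calling toy_resource_to_bus() on the
raw 0x1f0 yields 0xf00001f0, which is neither a valid bus nor resource
address.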

Ben.



[PATCH 2.6.21 through 2.6.23] ACPICA: fix acpi-cpufreq boot crash due to _PSD return-by-reference

2007-12-05 Thread Len Brown

From: Bob Moore <[EMAIL PROTECTED]>

This bug has been in the ACPICA interpreter since the beginning of time.
It is a reference-after-free bug due to the interpreter doing a
"return by reference" on local objects instead of a "return by value"
when the objects are part of a package.
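
As a rough illustration of "return by value" (a toy model, not the
ACPICA data structures or the actual fix), a package that copies each
element stays valid after the source object is freed, whereas one that
merely kept a pointer to it would not. All toy_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PKG_MAX 4

/* Toy stand-in for an interpreter object. */
struct toy_obj { long value; };

struct toy_package {
    struct toy_obj elems[TOY_PKG_MAX];   /* elements stored by value */
    size_t count;
};

/* Copy the object's current value into the package instead of keeping a
 * pointer to it. Returns 0 on success, -1 if the package is full. */
int toy_package_add_by_value(struct toy_package *pkg, const struct toy_obj *src)
{
    assert(pkg != NULL && src != NULL);
    if (pkg->count >= TOY_PKG_MAX)
        return -1;
    pkg->elems[pkg->count++] = *src;
    return 0;
}
```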

It could technically happen for any AML code, but slab debugging
exposed it for the first time (we think) on some new hardware
when we were interpreting some AML on behalf of acpi-cpufreq initialization.

This patch shipped in linux-2.6.24-rc4,
but it applies cleanly back through 2.6.21.
http://bugzilla.kernel.org/show_bug.cgi?id=9429
contains backports through 2.6.16.

-Len

Changed resolution of named references in packages

Fixed a problem with the Package operator where all named
references were created as object references and left otherwise
unresolved. According to the ACPI specification, a Package can
only contain Data Objects or references to control methods. The
implication is that named references to Data Objects (Integer,
Buffer, String, Package, BufferField, Field) should be resolved
immediately upon package creation. This is the approach taken
with this change. References to all other named objects (Methods,
Devices, Scopes, etc.) are all now properly created as reference objects.

http://bugzilla.kernel.org/show_bug.cgi?id=5328
http://bugzilla.kernel.org/show_bug.cgi?id=9429

Signed-off-by: Bob Moore <[EMAIL PROTECTED]>
Signed-off-by: Len Brown <[EMAIL PROTECTED]>

 drivers/acpi/dispatcher/dsobject.c |   91 +--
 1 files changed, 85 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/dispatcher/dsobject.c b/drivers/acpi/dispatcher/dsobject.c
index a474ca2..954ac8c 100644
--- a/drivers/acpi/dispatcher/dsobject.c
+++ b/drivers/acpi/dispatcher/dsobject.c
@@ -137,6 +137,71 @@ acpi_ds_build_internal_object(struct acpi_walk_state *walk_state,
return_ACPI_STATUS(status);
}
}
+
+   /* Special object resolution for elements of a package */
+
+   if ((op->common.parent->common.aml_opcode == AML_PACKAGE_OP) ||
+   (op->common.parent->common.aml_opcode ==
+AML_VAR_PACKAGE_OP)) {
+   /*
+* Attempt to resolve the node to a value before we insert it into
+* the package. If this is a reference to a common data type,
+* resolve it immediately. According to the ACPI spec, package
+* elements can only be "data objects" or method references.
+* Attempt to resolve to an Integer, Buffer, String or Package.
+* If cannot, return the named reference (for things like Devices,
+* Methods, etc.) Buffer Fields and Fields will resolve to simple
+* objects (int/buf/str/pkg).
+*
+* NOTE: References to things like Devices, Methods, Mutexes, etc.
+* will remain as named references. This behavior is not described
+* in the ACPI spec, but it appears to be an oversight.
+*/
+   obj_desc = (union acpi_operand_object *)op->common.node;
+
+   status =
+   acpi_ex_resolve_node_to_value(ACPI_CAST_INDIRECT_PTR
+ (struct
+  acpi_namespace_node,
+  &obj_desc),
+ walk_state);
+   if (ACPI_FAILURE(status)) {
+   return_ACPI_STATUS(status);
+   }
+
+   switch (op->common.node->type) {
+   /*
+* For these types, we need the actual node, not the subobject.
+* However, the subobject got an extra reference count above.
+*/
+   case ACPI_TYPE_MUTEX:
+   case ACPI_TYPE_METHOD:
+   case ACPI_TYPE_POWER:
+   case ACPI_TYPE_PROCESSOR:
+   case ACPI_TYPE_EVENT:
+   case ACPI_TYPE_REGION:
+   case ACPI_TYPE_DEVICE:
+   case ACPI_TYPE_THERMAL:
+
+   obj_desc =
+   (union acpi_operand_object *)op->common.
+   node;
+   break;
+
+   default:
+   break;
+   }
+
+   /*
+* If above resolved to an op

[PATCH Latency Tracer] don't panic on failed bootmem alloc

2007-12-05 Thread Steven Rostedt

Ingo,

This patch prevents a panic on a failed bootmem alloc in the
initialization of the tracer buffers.

Signed-off-by: Steven Rostedt <[EMAIL PROTECTED]>

Index: linux-2.6-latency/kernel/latency_trace.c
===
--- linux-2.6-latency.orig/kernel/latency_trace.c
+++ linux-2.6-latency/kernel/latency_trace.c
@@ -2720,10 +2720,11 @@ void * __init tracer_alloc_bootmem(unsig
 {
void * ret;
 
-   ret =__alloc_bootmem(size, SMP_CACHE_BYTES, ARCH_LOW_ADDRESS_LIMIT);
+   ret =__alloc_bootmem_nopanic(size, SMP_CACHE_BYTES,
+ARCH_LOW_ADDRESS_LIMIT);
if (ret != NULL && ((unsigned long)ret) < ARCH_LOW_ADDRESS_LIMIT) {
free_bootmem(__pa(ret), size);
-   ret = __alloc_bootmem(size,
+   ret = __alloc_bootmem_nopanic(size,
SMP_CACHE_BYTES,
__pa(MAX_DMA_ADDRESS));
}



Re: solid state drive access and context switching

2007-12-05 Thread Kyungmin Park

Hi,

On Dec 6, 2007 7:01 AM, Jared Hulbert <[EMAIL PROTECTED]> wrote:
> > Probably about 1000 clocks but its always going to depend upon the
> > workload and whether any other work can be done usefully.
>
> Yeah.  Sounds right, in the microsecond range.  Be interesting to see data.
>
> Anybody have ideas on what kind of experiments could confirm this
> estimate is right?

Is this the right place to force writes to be synchronous?
For now it only concerns SATA.

Thank you,
Kyungmin Park

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 3b927be..cce0618 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -3221,6 +3221,13 @@ static inline void __generic_make_request(struct bio *bio
if (bio_check_eod(bio, nr_sectors))
goto end_io;

+#if 1
+   /* FIXME simple hack */
+   if (MAJOR(bio->bi_bdev->bd_dev) == 8 && bio_data_dir(bio) == WRITE) {
+   /* WRITE_SYNC */
+   bio->bi_rw |= (1 << BIO_RW_SYNC);
+   }
+#endif
/*
 * Resolve the mapping until finished. (drivers are
 * still free to implement/resolve their own stacking

Re: drivers/net/iseries_veth.c dubious sysfs usage

2007-12-05 Thread Michael Ellerman

On Wed, 2007-12-05 at 13:41 -0800, Greg KH wrote:
> On Wed, Dec 05, 2007 at 10:10:31PM +1100, Michael Ellerman wrote:
> > On Wed, 2007-12-05 at 01:30 -0800, Greg KH wrote:
> > > In doing a massive kobject cleanup of the kernel tree, I ran across the
> > > iseries_veth.c driver.
> > > 
> > > It looks like the driver is creating a number of subdirectories under
> > > the driver sysfs directory.  This is odd and probably wrong.  You want
> > > these virtual connections to show up in the main sysfs device tree, not
> > > under the driver directory.
> > > 
> > > I'll be glad to totally guess and try to move it around in the sysfs
> > > tree, but odds are I'll get it all wrong as I can't really test this
> > > out :)
> > > 
> > > Any hints on what this driver is trying to do in this sysfs directories?
> > 
> > I wrote the code, I think, but it's been a while - I'll have a look at
> > it tomorrow.
> 
> Yes, can you send me the sysfs tree output of the driver directory, and
> what exactly the different files in there are supposed to be used for?

Sure. My version of tar (1.15.1) doesn't seem to be able to tar up /sys,
so hopefully this is sufficient:

igoeast:~# cd /sys/class/net/eth1/
igoeast:/sys/class/net/eth1# ls -la
total 0
drwxr-xr-x  4 root root    0 Dec  6 10:22 .
drwxr-xr-x  6 root root    0 Dec  6 10:21 ..
-r--r--r--  1 root root 4096 Dec  6 10:30 addr_len
-r--r--r--  1 root root 4096 Dec  6 10:30 address
-r--r--r--  1 root root 4096 Dec  6 10:30 broadcast
-r--r--r--  1 root root 4096 Dec  6 10:30 carrier
lrwxrwxrwx  1 root root    0 Dec  6 10:22 device -> ../../../devices/vio/3
-r--r--r--  1 root root 4096 Dec  6 10:30 dormant
-r--r--r--  1 root root 4096 Dec  6 10:30 features
-rw-r--r--  1 root root 4096 Dec  6 10:30 flags
-r--r--r--  1 root root 4096 Dec  6 10:30 ifindex
-r--r--r--  1 root root 4096 Dec  6 10:30 iflink
-r--r--r--  1 root root 4096 Dec  6 10:30 link_mode
-rw-r--r--  1 root root 4096 Dec  6 10:30 mtu
-r--r--r--  1 root root 4096 Dec  6 10:30 operstate
drwxr-xr-x  2 root root    0 Dec  6 10:30 statistics
lrwxrwxrwx  1 root root    0 Dec  6 10:30 subsystem -> ../../../class/net
-rw-r--r--  1 root root 4096 Dec  6 10:30 tx_queue_len
-r--r--r--  1 root root 4096 Dec  6 10:30 type
-rw-r--r--  1 root root 4096 Dec  6 10:30 uevent
drwxr-xr-x  2 root root    0 Dec  6 10:30 veth_port

Each net device has a port structure associated with it, the fields
should be fairly self explanatory, they're all read only I think.

igoeast:/sys/class/net/eth1# find veth_port/
veth_port/
veth_port/mac_addr
veth_port/lpar_map
veth_port/stopped_map
veth_port/promiscuous
veth_port/num_mcast


igoeast:/sys/class/net/eth1# cd device/driver

igoeast:/sys/class/net/eth1/device/driver# ls -l
total 0
lrwxrwxrwx  1 root root    0 Dec  6 10:21 2 -> ../../../../devices/vio/2
lrwxrwxrwx  1 root root    0 Dec  6 10:21 3 -> ../../../../devices/vio/3
--w-------  1 root root 4096 Dec  6 10:21 bind
drwxr-xr-x  2 root root    0 Dec  6 10:21 cnx00
drwxr-xr-x  2 root root    0 Dec  6 10:21 cnx02
drwxr-xr-x  2 root root    0 Dec  6 10:21 cnx03
drwxr-xr-x  2 root root    0 Dec  6 10:21 cnx04
lrwxrwxrwx  1 root root    0 Dec  6 10:21 module -> ../../../../module/iseries_veth
--w-------  1 root root 4096 Dec  6 10:21 uevent
--w-------  1 root root 4096 Dec  6 10:21 unbind

The driver has a connection to all the other lpars, this is entirely
independent of the net devices.

igoeast:/sys/class/net/eth1/device/driver# find cnx00/
cnx00/
cnx00/outstanding_tx
cnx00/remote_lp
cnx00/num_events
cnx00/reset_timeout
cnx00/last_contact
cnx00/state
cnx00/src_inst
cnx00/dst_inst
cnx00/num_pending_acks
cnx00/num_ack_events
cnx00/ack_timeout


> > Why is it "odd and probably wrong" to create subdirectories under the
> > driver in sysfs?
> 
> Because a driver does not have "devices" under it in the sysfs tree.
> All devices live in the /sys/devices/ tree so we can properly manage
> them that way.  A driver will then bind to a device, and the driver core
> will set up the linkages in sysfs properly so that everything looks
> uniform.

OK. They're not "devices" that we create under the driver, they're just
attributes of the driver, and they happen to be in groups so I put them
in subdirectories.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person


Re: Allow (O=...) from file

2007-12-05 Thread Jay Cliburn

On Wed, 5 Dec 2007 22:00:03 +0100
Sam Ravnborg <[EMAIL PROTECTED]> wrote:

> On Tue, Dec 04, 2007 at 09:04:33PM -0600, Jay Cliburn wrote:
> > Sam,
> > 
> > This piece of the top-level Makefile in current git causes an
> > out-of-tree driver Makefile to fail.
> > 
> > ifdef O
> >   ifeq ("$(origin O)", "command line")
> >     KBUILD_OUTPUT := $(O)
> >   endif
> > endif
> > 
> > The out-of-tree driver Makefile contains an O=... directive that
> > (correctly) does _not_ specify the kernel source dir, and apparently
> > isn't overridden by the command line either. If in the above
> > Makefile snippet I change "command line" to "file", my out-of-tree
> > make succeeds. What do you think about allowing O= to come from a
> > file in addition to the command line?
> 
> When you change "command line" to "file" you actually make kbuild
> ignore the O=... value which is why it succeeds.

I'm puzzled by your statement.  Isn't the opposite true?  When using
"command line", doesn't the following happen?

1. My makefile sets O=/foo
2. My makefile invokes your makefile with O=/foo
3. Your makefile ignores my O=/foo because it requires O=/foo to
originate from the command line
4. KBUILD_OUTPUT never gets set to /foo and we hit the error

OTOH, if I use "file":
1. My makefile sets O=/foo
2. My makefile invokes your makefile with O=/foo
3. Your makefile accepts my O=/foo because it requires O=/foo to
originate from another makefile
4. KBUILD_OUTPUT gets set to /foo and my make succeeds

This all used to work the last time I tried it, which was sometime
during 2.6.23 development, IIRC.  Isn't the current structure going to
break just about all out-of-tree driver builds?

Jay

Re: [PATCH] pci: Fix bus resource assignment on 32 bits with 64b resources

2007-12-05 Thread Benjamin Herrenschmidt


On Wed, 2007-12-05 at 17:40 +1100, Benjamin Herrenschmidt wrote:
> The current pci_assign_unassigned_resources() code doesn't work properly
> on 32 bits platforms with 64 bits resources. The main reason is the use
> of unsigned long in various places instead of resource_size_t.
> 
> This fixes it, along with some tricks to avoid casting to 64 bits on
> platforms that don't need it in every printk around.
> 
> This is a pre-requisite for making powerpc use the generic code instead of
> its own half-useful implementation.
> 
> Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> ---
> 
> This version fixes some stupid warnings when using 32 bits resources

 ... and has warnings on 64 bits platforms... G

This whole issue of printk vs. resource_size_t is a terrible mess :-(

Part of the problem is that resource_size_t can be either u32 or u64;
that is, it can be unsigned int, unsigned long or unsigned long
long, and we have no way to reliably printk that.

Any clever idea before I start pushing filthy macros up linux/types.h ?
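
For comparison, the cast-based alternative to such macros can be sketched
in user-space C. This is only an illustration under stated assumptions:
the typedef below is a stand-in for the kernel's resource_size_t,
DEMO_64BIT_RESOURCES is a hypothetical switch, and format_range() is a
made-up helper, not anything from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Stand-in for the kernel typedef: its width depends on the build
 * configuration.  DEMO_64BIT_RESOURCES is a hypothetical switch used
 * only in this sketch.
 */
#ifdef DEMO_64BIT_RESOURCES
typedef uint64_t resource_size_t;
#else
typedef uint32_t resource_size_t;
#endif

/*
 * Format a resource range identically for either width by widening
 * both ends to unsigned long long and always using %llx.
 */
static int format_range(char *buf, size_t len,
			resource_size_t start, resource_size_t end)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "[0x%llx-0x%llx]",
			(unsigned long long)start,
			(unsigned long long)end);
}
```

The cost is the one raised in this thread: every call site needs the
casts, and 32-bit builds end up formatting 64-bit values they would not
otherwise need.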

Cheers,
Ben.



[PATCH] pid: sys_wait... fixes

2007-12-05 Thread Eric W. Biederman


This modifies do_wait and eligible_child to take a pair of
enum pid_type and struct pid *pid to precisely specify what
set of processes are eligible to be waited for,  instead of the
raw pid_t value from sys_wait4.

This fixes a bug in sys_waitid where you could not wait for children
in just process group 1.

This fixes a pid namespace crossing case in eligible_child, allowing
us to wait for a process in our current process group even if
our current process group == 0.

This allows the "no child with this pid" case to be optimized.
This also allows the pid membership test in eligible_child
to be optimized.

This even closes a theoretical pid wraparound race: in a threaded
parent, if two threads are waiting for the same child, one thread
picks up the child, and the pid numbers wrap around and generate
another child with that same pid before the other thread is
scheduled (terribly, insanely unlikely), we could end up waiting on
the second child with the same pid# and not discover that the
specific child we were waiting for has exited.
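
The (type, pid) pair described above can be sketched in user-space C.
This is a rough model under stated assumptions, not the patch itself:
the demo_* names are hypothetical, plain numbers stand in for struct
pid pointers, and the mapping of the wait4-style argument follows the
pid > 0, pid == 0 and pid < -1 branches that the diff below removes
from eligible_child.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Hypothetical mirror of the kernel's enum pid_type; only the ordering
 * "real types < DEMO_PIDTYPE_MAX" matters for this sketch.
 */
enum demo_pid_type { DEMO_PIDTYPE_PID, DEMO_PIDTYPE_PGID, DEMO_PIDTYPE_MAX };

struct demo_wait_target {
	enum demo_pid_type type;	/* DEMO_PIDTYPE_MAX means "any child" */
	pid_t nr;			/* pid or pgrp number, 0 if unused */
};

/*
 * Translate a wait4()-style pid argument into a (type, number) pair.
 * caller_pgrp is the caller's own process group, used when upid == 0.
 */
static struct demo_wait_target classify_wait4_pid(pid_t upid, pid_t caller_pgrp)
{
	struct demo_wait_target t = { DEMO_PIDTYPE_MAX, 0 };

	if (upid > 0) {
		t.type = DEMO_PIDTYPE_PID;
		t.nr = upid;
	} else if (upid == 0) {
		t.type = DEMO_PIDTYPE_PGID;
		t.nr = caller_pgrp;
	} else if (upid != -1) {
		t.type = DEMO_PIDTYPE_PGID;
		t.nr = -upid;
	}
	return t;
}

/* Membership test: one comparison once the target has been resolved. */
static int demo_eligible(struct demo_wait_target t,
			 pid_t child_pid, pid_t child_pgrp)
{
	assert(t.type <= DEMO_PIDTYPE_MAX);
	if (t.type == DEMO_PIDTYPE_PID)
		return child_pid == t.nr;
	if (t.type == DEMO_PIDTYPE_PGID)
		return child_pgrp == t.nr;
	return 1;	/* any child matches */
}
```

Resolving the target once, up front, is what lets the per-child test
shrink to a single comparison.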

Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
---
 kernel/exit.c |   72 ++---
 1 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index aede730..9e4e22a 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1106,20 +1106,13 @@ asmlinkage void sys_exit_group(int error_code)
do_group_exit((error_code & 0xff) << 8);
 }
 
-static int eligible_child(pid_t pid, int options, struct task_struct *p)
+static int eligible_child(enum pid_type type, struct pid *pid, int options,
+ struct task_struct *p)
 {
int err;
-   struct pid_namespace *ns;
 
-   ns = current->nsproxy->pid_ns;
-   if (pid > 0) {
-   if (task_pid_nr_ns(p, ns) != pid)
-   return 0;
-   } else if (!pid) {
-   if (task_pgrp_nr_ns(p, ns) != task_pgrp_vnr(current))
-   return 0;
-   } else if (pid != -1) {
-   if (task_pgrp_nr_ns(p, ns) != -pid)
+   if (type < PIDTYPE_MAX) {
+   if (p->pids[type].pid != pid)
return 0;
}
 
@@ -1143,7 +1136,7 @@ static int eligible_child(pid_t pid, int options, struct task_struct *p)
if (likely(!err))
return 1;
 
-   if (pid <= 0)
+   if (type != PIDTYPE_PID)
return 0;
/* This child was explicitly requested, abort */
read_unlock(&tasklist_lock);
@@ -1463,8 +1456,9 @@ static int wait_task_continued(struct task_struct *p, int noreap,
return retval;
 }
 
-static long do_wait(pid_t pid, int options, struct siginfo __user *infop,
-   int __user *stat_addr, struct rusage __user *ru)
+static long do_wait(enum pid_type type, struct pid *pid, int options,
+   struct siginfo __user *infop, int __user *stat_addr,
+   struct rusage __user *ru)
 {
DECLARE_WAITQUEUE(wait, current);
struct task_struct *tsk;
@@ -1472,6 +1466,11 @@ static long do_wait(pid_t pid, int options, struct siginfo __user *infop,
 
add_wait_queue(&current->signal->wait_chldexit,&wait);
 repeat:
+   /* If there is nothing that can match our criteria just get out */
+   retval = -ECHILD;
+   if ((type < PIDTYPE_MAX) && (!pid || hlist_empty(&pid->tasks[type])))
+   goto end;
+
/*
 * We will set this flag if we see any child that might later
 * match our criteria, even if we are not able to reap it yet.
@@ -1484,7 +1483,7 @@ repeat:
struct task_struct *p;
 
list_for_each_entry(p, &tsk->children, sibling) {
-   int ret = eligible_child(pid, options, p);
+   int ret = eligible_child(type, pid, options, p);
if (!ret)
continue;
 
@@ -1531,7 +1530,7 @@ repeat:
if (!flag) {
list_for_each_entry(p, &tsk->ptrace_children,
ptrace_list) {
-   flag = eligible_child(pid, options, p);
+   flag = eligible_child(type, pid, options, p);
if (!flag)
continue;
if (likely(flag > 0))
@@ -1586,10 +1585,12 @@ end:
return retval;
 }
 
-asmlinkage long sys_waitid(int which, pid_t pid,
+asmlinkage long sys_waitid(int which, pid_t upid,
   struct siginfo __user *infop, int options,
   struct rusage __user *ru)
 {
+   struct pid *pid = NULL;
+   enum pid_type type;
long ret;
 
if (options & ~(WNOHANG|WNOWAIT|WEXITED|WSTOPPED|WCONTINUED))
@@ -1599,37 +1600,60 @@ asmlinkage long sys_waitid(int which, pid_t pid,
 
switch (

Re: 2.6.24-rc4-mm1 Kernel build fails on S390x

2007-12-05 Thread Kamalesh Babulal

Hi Andrew,

The 2.6.24-rc4-mm1 kernel build fails on s390x,

  CC  arch/s390/kernel/traps.o
In file included from include/asm/thread_info.h:39,
 from include/linux/thread_info.h:21,
 from include/linux/preempt.h:9,
 from include/linux/spinlock.h:49,
 from include/linux/seqlock.h:29,
 from include/linux/time.h:8,
 from include/linux/timex.h:57,
 from include/linux/sched.h:53,
 from arch/s390/kernel/traps.c:17:
include/asm/processor.h:191: warning: "struct seq_file" declared inside parameter list
include/asm/processor.h:191: warning: its scope is only this definition or declaration, which is probably not what you want
arch/s390/kernel/traps.c: In function `task_show_regs':
arch/s390/kernel/traps.c:226: error: implicit declaration of function `seq_printf'
make[1]: *** [arch/s390/kernel/traps.o] Error 1
make: *** [arch/s390/kernel] Error 2

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

Re: [patch/rfc 2/4] pcf857x I2C GPIO expander driver

2007-12-05 Thread David Brownell

On Friday 30 November 2007, David Brownell wrote:
> Thanks for the review.  I'll snip out typos and similar trivial
> comments (and fix them!), using responses here for the more
> substantive feedback.

Here's the current version of this patch ... updated to put the
driver into drivers/gpio (separate patch setting that up) and
the header into include/linux/i2c/pcf857x.h.

Note that after looking at the GPIO expanders listed at the NXP
website, I updated this to accept a few more of these chips.
Other than reset pins and addressing options, the key difference
between these seems to be the top I2C clock speed supported:

 pcf857x ...  100 KHz
 pca857x ...  400 KHz
 pca967x ... 1000 KHz

Otherwise they're equivalent at the level of just swapping parts.

- Dave

=   SNIP!
This is a new-style I2C driver for most common 8 and 16 bit I2C based
"quasi-bidirectional" GPIO expanders:  pcf8574 or pcf8575, and several
compatible models (mostly faster, supporting I2C at up to 1 MHz).

Since it's a new-style driver, these devices must be configured as
part of board-specific init.  That eliminates the need for error-prone
manual configuration of module parameters, and makes compatibility
with legacy drivers (pcf8574.c, pcf8575.c) for these chips easier.

The driver exposes the GPIO signals using the platform-neutral GPIO
programming interface, so they are easily accessed by other kernel
code.  The lack of such a flexible kernel API is what has ensured
the proliferation of board-specific drivers for these chips... stuff
that rarely makes it upstream since it's so ugly.  This driver will
let them use standard calls.

Signed-off-by: David Brownell <[EMAIL PROTECTED]>
---
 drivers/gpio/Kconfig|   23 +++
 drivers/gpio/Makefile   |2 
 drivers/gpio/pcf857x.c  |  331 
 include/linux/i2c/pcf857x.h |   45 +
 4 files changed, 401 insertions(+)

--- a/drivers/gpio/Kconfig  2007-12-05 15:13:27.0 -0800
+++ b/drivers/gpio/Kconfig  2007-12-05 15:14:12.0 -0800
@@ -5,4 +5,27 @@
 menu "GPIO Support"
depends on GPIO_LIB
 
+config GPIO_PCF857X
+   tristate "PCF857x, PCA857x, and PCA967x I2C GPIO expanders"
+   depends on I2C
+   help
+ Say yes here to provide access to most "quasi-bidirectional" I2C
+ GPIO expanders used for additional digital outputs or inputs.
+ Most of these parts are from NXP, though TI is a second source for
+ some of them.  Compatible models include:
+
+ 8 bits:   pcf8574, pcf8574a, pca8574, pca8574a,
+   pca9670, pca9672, pca9674, pca9674a
+
+ 16 bits:  pcf8575, pcf8575c, pca8575,
+   pca9671, pca9673, pca9675
+
+ Your board setup code will need to declare the expanders in
+ use, and assign numbers to the GPIOs they expose.  Those GPIOs
+ can then be used from drivers and other kernel code, just like
+ other GPIOs, but only accessible from task contexts.
+
+ This driver provides an in-kernel interface to those GPIOs using
+ platform-neutral GPIO calls.
+
 endmenu
--- a/drivers/gpio/Makefile 2007-12-05 15:14:03.0 -0800
+++ b/drivers/gpio/Makefile 2007-12-05 15:14:12.0 -0800
@@ -1 +1,3 @@
 # gpio support: dedicated expander chips, etc
+
+obj-$(CONFIG_GPIO_PCF857X) += pcf857x.o
--- /dev/null   1970-01-01 00:00:00.0 +
+++ b/drivers/gpio/pcf857x.c2007-12-05 15:15:18.0 -0800
@@ -0,0 +1,331 @@
+/*
+ * pcf857x - driver for pcf857x, pca857x, and pca967x I2C GPIO expanders
+ *
+ * Copyright (C) 2007 David Brownell
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+
+/*
+ * The pcf857x, pca857x, and pca967x chips only expose one read and one
+ * write register.  Writing a "one" bit (to match the reset state) lets
+ * that pin be used as an input; it's not an open-drain model, but acts
+ * a bit like one.  This is described as "quasi-bidirectional"; read the
+ * chip documentation for details.
+ *
+ * Some other I2C GPIO expander chips (like the pca953{4,5,6,7,9}, pca9555,
+ * pca9698, mcp23008, and mc23017) have more complex register models with
+ * more conventional input circuitry, often using 0x20..0x27 addresses.
+ */
+struct pcf857x {
+
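
The "quasi-bidirectional" behaviour described in the comment above can be
modelled in a few lines of user-space C. This is a simplified sketch, not
the driver: the demo_* names are hypothetical, the chip is reduced to an
output latch plus whatever outside circuitry drives onto the pins, and a
pin is assumed to read high only when its latch bit is one and nothing
external pulls it low.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simulated 8-bit expander: 'latch' is the last byte written to the
 * chip, 'external' is the level outside circuitry puts on each pin.
 * Both would start as 0xff to match the reset state.
 */
struct demo_expander {
	uint8_t latch;
	uint8_t external;
};

/* "Input" direction: write a one so the pin is only weakly held high. */
static void demo_direction_input(struct demo_expander *x, unsigned pin)
{
	assert(pin < 8);
	x->latch |= (uint8_t)(1u << pin);
}

/* Output: update the latch bit; a zero drives the pin low. */
static void demo_set(struct demo_expander *x, unsigned pin, int value)
{
	assert(pin < 8);
	if (value)
		x->latch |= (uint8_t)(1u << pin);
	else
		x->latch &= (uint8_t)~(1u << pin);
}

/* Read: a pin is high only if the latch allows it and nothing pulls it low. */
static int demo_get(const struct demo_expander *x, unsigned pin)
{
	assert(pin < 8);
	return ((x->latch & x->external) >> pin) & 1;
}
```

Under this model there is no separate direction register: selecting
"input" and writing a one are the same operation.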

[patch-early-RFC 09/10] LTTng - x86_64 instrumentation

2007-12-05 Thread Mathieu Desnoyers

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86/ia32/ia32entry.S |6 +++---
 arch/x86/ia32/ipc32.c |2 ++
 arch/x86/kernel/apic_64.c |   21 +
 arch/x86/kernel/cpu/mcheck/mce_intel_64.c |6 ++
 arch/x86/kernel/entry_64.S|   12 ++--
 arch/x86/kernel/process_64.c  |   11 +++
 arch/x86/kernel/ptrace_64.c   |5 +
 arch/x86/kernel/setup64.c |1 +
 arch/x86/kernel/smp_64.c  |   18 ++
 arch/x86/kernel/traps_64.c|   29 +
 arch/x86/mm/fault_64.c|7 +++
 11 files changed, 105 insertions(+), 13 deletions(-)

Index: linux-2.6-lttng/arch/x86/ia32/ia32entry.S
===
--- linux-2.6-lttng.orig/arch/x86/ia32/ia32entry.S  2007-12-05 21:05:48.0 -0500
+++ linux-2.6-lttng/arch/x86/ia32/ia32entry.S   2007-12-05 21:48:46.0 -0500
@@ -125,7 +125,7 @@ ENTRY(ia32_sysenter_target)
.previous   
GET_THREAD_INFO(%r10)
orl$TS_COMPAT,threadinfo_status(%r10)
-   testl  $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%r10)
+   testl  $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_KERNEL_TRACE|_TIF_SECCOMP),threadinfo_flags(%r10)
CFI_REMEMBER_STATE
jnz  sysenter_tracesys
 sysenter_do_call:  
@@ -230,7 +230,7 @@ ENTRY(ia32_cstar_target)
.previous   
GET_THREAD_INFO(%r10)
orl   $TS_COMPAT,threadinfo_status(%r10)
-   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%r10)
+   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_KERNEL_TRACE|_TIF_SECCOMP),threadinfo_flags(%r10)
CFI_REMEMBER_STATE
jnz   cstar_tracesys
 cstar_do_call: 
@@ -322,7 +322,7 @@ ENTRY(ia32_syscall)
SAVE_ARGS 0,0,1
GET_THREAD_INFO(%r10)
orl   $TS_COMPAT,threadinfo_status(%r10)
-   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%r10)
+   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_KERNEL_TRACE|_TIF_SECCOMP),threadinfo_flags(%r10)
jnz ia32_tracesys
 ia32_do_syscall:   
cmpl $(IA32_NR_syscalls-1),%eax
Index: linux-2.6-lttng/arch/x86/ia32/ipc32.c
===
--- linux-2.6-lttng.orig/arch/x86/ia32/ipc32.c  2007-12-05 21:05:48.0 -0500
+++ linux-2.6-lttng/arch/x86/ia32/ipc32.c   2007-12-05 21:48:46.0 -0500
@@ -18,6 +18,8 @@ sys32_ipc(u32 call, int first, int secon
version = call >> 16; /* hack for backward compatibility */
call &= 0x;
 
+   trace_mark(kernel_arch_ipc_call, "call %u first %d", call, first);
+
switch (call) {
  case SEMOP:
/* struct sembuf is the same on 32 and 64bit :)) */
Index: linux-2.6-lttng/arch/x86/kernel/entry_64.S
===
--- linux-2.6-lttng.orig/arch/x86/kernel/entry_64.S 2007-12-05 21:48:05.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/entry_64.S  2007-12-05 21:48:46.0 -0500
@@ -161,7 +161,7 @@ ENTRY(ret_from_fork)
CFI_ADJUST_CFA_OFFSET -4
call schedule_tail
GET_THREAD_INFO(%rcx)
-   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT),threadinfo_flags(%rcx)
+   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_KERNEL_TRACE),threadinfo_flags(%rcx)
jnz rff_trace
 rff_action:
RESTORE_REST
@@ -229,7 +229,7 @@ ENTRY(system_call)
movq  %rcx,RIP-ARGOFFSET(%rsp)
CFI_REL_OFFSET rip,RIP-ARGOFFSET
GET_THREAD_INFO(%rcx)
-   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),threadinfo_flags(%rcx)
+   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_KERNEL_TRACE|_TIF_SECCOMP),threadinfo_flags(%rcx)
jnz tracesys
cmpq $__NR_syscall_max,%rax
ja badsys
@@ -268,7 +268,7 @@ sysret_check:   
/* Handle reschedules */
/* edx: work, edi: workmask */  
 sysret_careful:
-   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),%edx
+   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_KERNEL_TRACE|_TIF_SECCOMP),%edx
jnz ret_from_sys_call_trace
bt $TIF_NEED_RESCHED,%edx
jnc sysret_signal
@@ -377,7 +377,7 @@ int_very_careful:
sti
SAVE_REST
/* Check for syscall exit trace */  
-   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP),%edx
+   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_KERNEL_TRACE|_TIF_SINGLESTEP),%edx
jz int_signal
pushq %rdi
CFI_ADJUST_CFA_OFFSET 8
@@ -385,7 +385,7

Re: [PATCH] kbuild: implement modules.order

2007-12-05 Thread Rusty Russell

On Wednesday 05 December 2007 18:11:49 Tejun Heo wrote:
> WANG Cong wrote:
> >>> I think, you forgot to free(3) the memory you calloc(3)'ed and
> >>> malloc(3)'ed above.
> >>
> >> It's a simple program where whole body is in main().  Why bother?
> >> What's the benefit of adding hash-table iterating free logic?
> >
> > Personally, I think memory leaks are bugs. And we hate bugs. ;)
>
> Trust me.  As a person buried alive in bug reports, I hate bugs too.  I
> just don't agree that this type of programs should free all its
> resources before exiting.  How about adding a comment saying /* we're
> going out anyway, don't bother freeing hashtable */?

I too once battled with the moral dilemma of freeing in programs that exit.

Then in 2001, I was moving out of a house which was to be demolished.  The 
landlord insisted that we pay for the carpets to be cleaned.  My wife still 
uses it as a canonical example of wasteful idiocy.

So I hope this has contributed to your enlightenment, as it did to mine.
Rusty.

[patch-early-RFC 01/10] LTTng - ARM instrumentation

2007-12-05 Thread Mathieu Desnoyers

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/arm/kernel/entry-common.S |8 
 arch/arm/kernel/process.c  |6 +-
 arch/arm/kernel/ptrace.c   |6 ++
 arch/arm/kernel/sys_arm.c  |2 ++
 arch/arm/kernel/traps.c|7 +++
 5 files changed, 24 insertions(+), 5 deletions(-)

Index: linux-2.6-lttng/arch/arm/kernel/entry-common.S
===
--- linux-2.6-lttng.orig/arch/arm/kernel/entry-common.S 2007-10-11 11:44:13.0 -0400
+++ linux-2.6-lttng/arch/arm/kernel/entry-common.S  2007-10-11 11:44:57.0 -0400
@@ -85,8 +85,8 @@ ENTRY(ret_from_fork)
get_thread_info tsk
ldr r1, [tsk, #TI_FLAGS]@ check for syscall tracing
mov why, #1
-   tst r1, #_TIF_SYSCALL_TRACE @ are we tracing syscalls?
-   beq ret_slow_syscall
+   tst r1, #_TIF_SYSCALL_TRACE | _TIF_KERNEL_TRACE
+   beq ret_slow_syscall@ are we tracing syscalls?
mov r1, sp
mov r0, #1  @ trace exit [IP = 1]
bl  syscall_trace
@@ -205,8 +205,8 @@ ENTRY(vector_swi)
 #endif
 
stmdb   sp!, {r4, r5}   @ push fifth and sixth args
-   tst ip, #_TIF_SYSCALL_TRACE @ are we tracing syscalls?
-   bne __sys_trace
+   tst ip, #_TIF_SYSCALL_TRACE | _TIF_KERNEL_TRACE
+   bne __sys_trace @ are we tracing syscalls?
 
cmp scno, #NR_syscalls  @ check upper syscall limit
adr lr, ret_fast_syscall@ return address
Index: linux-2.6-lttng/arch/arm/kernel/process.c
===
--- linux-2.6-lttng.orig/arch/arm/kernel/process.c  2007-10-11 11:44:13.0 -0400
+++ linux-2.6-lttng/arch/arm/kernel/process.c   2007-10-11 14:46:50.0 -0400
@@ -418,6 +418,7 @@ asm(".section .text\n"
 pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
 {
struct pt_regs regs;
+   long pid;
 
memset(&regs, 0, sizeof(regs));
 
@@ -427,7 +428,10 @@ pid_t kernel_thread(int (*fn)(void *), v
regs.ARM_pc = (unsigned long)kernel_thread_helper;
regs.ARM_cpsr = SVC_MODE;
 
-   return do_fork(flags|CLONE_VM|CLONE_UNTRACED, 0, &regs, 0, NULL, NULL);
+   pid = do_fork(flags|CLONE_VM|CLONE_UNTRACED, 0, &regs, 0, NULL, NULL);
+
+   trace_mark(kernel_arch_kthread_create, "pid %ld fn %p", pid, fn);
+   return pid;
 }
 EXPORT_SYMBOL(kernel_thread);
 
Index: linux-2.6-lttng/arch/arm/kernel/ptrace.c
===
--- linux-2.6-lttng.orig/arch/arm/kernel/ptrace.c   2007-10-11 11:44:13.0 -0400
+++ linux-2.6-lttng/arch/arm/kernel/ptrace.c  2007-10-11 14:48:21.0 -0400
@@ -789,6 +789,12 @@ asmlinkage int syscall_trace(int why, st
 {
unsigned long ip;
 
+   if (!why)
+   trace_mark(kernel_arch_syscall_entry, "syscall_id %d ip #p%ld",
+   scno, instruction_pointer(regs));
+   else
+   trace_mark(kernel_arch_syscall_exit, MARK_NOARGS);
+
if (!test_thread_flag(TIF_SYSCALL_TRACE))
return scno;
if (!(current->ptrace & PT_PTRACED))
Index: linux-2.6-lttng/arch/arm/kernel/sys_arm.c
===
--- linux-2.6-lttng.orig/arch/arm/kernel/sys_arm.c  2007-10-11 11:44:13.0 -0400
+++ linux-2.6-lttng/arch/arm/kernel/sys_arm.c   2007-10-11 14:47:23.0 -0400
@@ -161,6 +161,8 @@ asmlinkage int sys_ipc(uint call, int fi
version = call >> 16; /* hack for backward compatibility */
call &= 0x;
 
+   trace_mark(kernel_arch_ipc_call, "call %u first %d", call, first);
+
switch (call) {
case SEMOP:
return sys_semtimedop (first, (struct sembuf __user *)ptr, second, NULL);
Index: linux-2.6-lttng/arch/arm/kernel/traps.c
===
--- linux-2.6-lttng.orig/arch/arm/kernel/traps.c  2007-10-11 11:44:13.0 -0400
+++ linux-2.6-lttng/arch/arm/kernel/traps.c 2007-10-11 14:48:11.0 -0400
@@ -269,7 +269,14 @@ void arm_notify_die(const char *str, str
current->thread.error_code = err;
current->thread.trap_no = trap;
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %ld ip #p%ld",
+   current->thread.trap_no,
+   instruction_pointer(regs));
+
force_sig_info(info->si_signo, info, current);
+
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
+
} else {
die(str, regs, err);
}

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3

[patch-early-RFC 02/10] LTTng - x86_32 instrumentation

2007-12-05 Thread Mathieu Desnoyers

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86/kernel/apic_32.c   |   21 +
 arch/x86/kernel/cpu/mcheck/p4.c |7 +++
 arch/x86/kernel/entry_32.S  |6 +++---
 arch/x86/kernel/process_32.c|6 +-
 arch/x86/kernel/ptrace_32.c |6 ++
 arch/x86/kernel/smp_32.c|   18 ++
 arch/x86/kernel/sys_i386_32.c   |2 ++
 arch/x86/kernel/traps_32.c  |   38 +++---
 arch/x86/mm/fault_32.c  |7 +++
 9 files changed, 100 insertions(+), 11 deletions(-)

Index: linux-2.6-lttng/arch/x86/kernel/process_32.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/process_32.c   2007-12-05 21:05:49.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/process_32.c  2007-12-05 21:48:05.0 -0500
@@ -374,6 +374,7 @@ extern void kernel_thread_helper(void);
 int kernel_thread(int (*fn)(void *), void * arg, unsigned long flags)
 {
struct pt_regs regs;
+   long pid;
 
memset(&regs, 0, sizeof(regs));
 
@@ -389,7 +390,10 @@ int kernel_thread(int (*fn)(void *), voi
regs.eflags = X86_EFLAGS_IF | X86_EFLAGS_SF | X86_EFLAGS_PF | 0x2;
 
/* Ok, create the new process.. */
-   return do_fork(flags | CLONE_VM | CLONE_UNTRACED, 0, &regs, 0, NULL, NULL);
+   pid = do_fork(flags | CLONE_VM | CLONE_UNTRACED,
+   0, &regs, 0, NULL, NULL);
+   trace_mark(kernel_arch_kthread_create, "pid %ld fn %p", pid, fn);
+   return pid;
 }
 EXPORT_SYMBOL(kernel_thread);
 
Index: linux-2.6-lttng/arch/x86/kernel/ptrace_32.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/ptrace_32.c  2007-12-05 21:05:49.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/ptrace_32.c 2007-12-05 21:48:05.0 -0500
@@ -650,6 +650,12 @@ int do_syscall_trace(struct pt_regs *reg
int is_singlestep = !is_sysemu && test_thread_flag(TIF_SINGLESTEP);
int ret = 0;
 
+   if (!entryexit)
+   trace_mark(kernel_arch_syscall_entry, "syscall_id %d ip #p%ld",
+   (int)regs->orig_eax, instruction_pointer(regs));
+   else
+   trace_mark(kernel_arch_syscall_exit, "ret %ld", regs->eax);
+
/* do the secure computing check first */
if (!entryexit)
secure_computing(regs->orig_eax);
Index: linux-2.6-lttng/arch/x86/kernel/sys_i386_32.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/sys_i386_32.c  2007-12-05 21:05:49.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/sys_i386_32.c   2007-12-05 21:48:05.0 -0500
@@ -128,6 +128,8 @@ asmlinkage int sys_ipc (uint call, int f
version = call >> 16; /* hack for backward compatibility */
call &= 0x;
 
+   trace_mark(kernel_arch_ipc_call, "call %u first %d", call, first);
+
switch (call) {
case SEMOP:
return sys_semtimedop (first, (struct sembuf __user *)ptr, second, NULL);
Index: linux-2.6-lttng/arch/x86/kernel/traps_32.c
===
--- linux-2.6-lttng.orig/arch/x86/kernel/traps_32.c 2007-12-05 21:48:04.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/traps_32.c  2007-12-05 21:48:05.0 -0500
@@ -455,6 +455,9 @@ static void __kprobes do_trap(int trapnr
 {
struct task_struct *tsk = current;
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %d ip #p%ld", trapnr,
+   instruction_pointer(regs));
+
if (regs->eflags & VM_MASK) {
if (vm86)
goto vm86_trap;
@@ -481,7 +484,7 @@ static void __kprobes do_trap(int trapnr
force_sig_info(signr, info, tsk);
else
force_sig(signr, tsk);
-   return;
+   goto end;
}
 
kernel_trap: {
@@ -490,14 +493,16 @@ static void __kprobes do_trap(int trapnr
tsk->thread.trap_no = trapnr;
die(str, regs, error_code);
}
-   return;
+   goto end;
}
 
vm86_trap: {
int ret = handle_vm86_trap((struct kernel_vm86_regs *) regs, error_code, trapnr);
if (ret) goto trap_signal;
-   return;
+   goto end;
}
+end:
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
 }
 
 #define DO_ERROR(trapnr, signr, str, name) \
@@ -611,7 +616,10 @@ fastcall void __kprobes do_general_prote
current->comm, task_pid_nr(current),
regs->eip, regs->esp, error_code);
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %d ip #p%ld", 13,
+

[patch-early-RFC 10/10] LTTng - s390 instrumentation

2007-12-05 Thread Mathieu Desnoyers

Changelog :
- added syscall entry/exit instrumentation.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/s390/kernel/entry.S|   10 --
 arch/s390/kernel/entry64.S  |   10 --
 arch/s390/kernel/ptrace.c   |6 ++
 arch/s390/kernel/sys_s390.c |2 ++
 arch/s390/kernel/traps.c|   17 +
 arch/s390/mm/fault.c|6 ++
 6 files changed, 47 insertions(+), 4 deletions(-)

Index: linux-2.6-lttng/arch/s390/kernel/traps.c
===
--- linux-2.6-lttng.orig/arch/s390/kernel/traps.c   2007-11-28 09:27:27.0 -0500
+++ linux-2.6-lttng/arch/s390/kernel/traps.c  2007-11-28 09:33:55.0 -0500
@@ -5,6 +5,7 @@
  *Copyright (C) 1999,2000 IBM Deutschland Entwicklung GmbH, IBM Corporation
  *Author(s): Martin Schwidefsky ([EMAIL PROTECTED]),
  *   Denis Joseph Barrow ([EMAIL PROTECTED],[EMAIL PROTECTED]),
+ *  Portions added by T. Halloran: (C) Copyright 2002 IBM Poughkeepsie, IBM Corporation
  *
  *  Derived from "arch/i386/kernel/traps.c"
  *Copyright (C) 1991, 1992 Linus Torvalds
@@ -307,6 +308,9 @@ static void __kprobes inline do_trap(lon
interruption_code, signr) == NOTIFY_STOP)
return;
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %ld ip #p%lu",
+   interruption_code & 0x, instruction_pointer(regs));
+
 if (regs->psw.mask & PSW_MASK_PSTATE) {
 struct task_struct *tsk = current;
 
@@ -327,6 +331,7 @@ static void __kprobes inline do_trap(lon
die(str, regs, interruption_code);
}
 }
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
 }
 
 static inline void __user *get_check_address(struct pt_regs *regs)
@@ -437,6 +442,9 @@ static void illegal_op(struct pt_regs * 
if (regs->psw.mask & PSW_MASK_PSTATE)
local_irq_enable();
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %ld ip #p%lu",
+   interruption_code & 0x, instruction_pointer(regs));
+
if (regs->psw.mask & PSW_MASK_PSTATE) {
if (get_user(*((__u16 *) opcode), (__u16 __user *) location))
return;
@@ -501,6 +509,7 @@ static void illegal_op(struct pt_regs * 
do_trap(interruption_code, signal,
"illegal operation", regs, &info);
}
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
 }
 
 
@@ -521,6 +530,9 @@ specification_exception(struct pt_regs *
 if (regs->psw.mask & PSW_MASK_PSTATE)
local_irq_enable();
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %ld ip #p%lu",
+   interruption_code & 0x, instruction_pointer(regs));
+
 if (regs->psw.mask & PSW_MASK_PSTATE) {
get_user(*((__u16 *) opcode), location);
switch (opcode[0]) {
@@ -565,6 +577,7 @@ specification_exception(struct pt_regs *
do_trap(interruption_code, signal, 
"specification exception", regs, &info);
}
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
 }
 #else
 DO_ERROR_INFO(SIGILL, "specification exception", specification_exception,
@@ -585,6 +598,9 @@ static void data_exception(struct pt_reg
if (regs->psw.mask & PSW_MASK_PSTATE)
local_irq_enable();
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %ld ip #p%lu",
+   interruption_code & 0x, instruction_pointer(regs));
+
if (MACHINE_HAS_IEEE)
asm volatile("stfpc %0" : "=m" (current->thread.fp_regs.fpc));
 
@@ -659,6 +675,7 @@ static void data_exception(struct pt_reg
do_trap(interruption_code, signal, 
"data exception", regs, &info);
}
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
 }
 
 static void space_switch_exception(struct pt_regs * regs, long int_code)
Index: linux-2.6-lttng/arch/s390/kernel/sys_s390.c
===
--- linux-2.6-lttng.orig/arch/s390/kernel/sys_s390.c  2007-11-28 09:27:27.0 -0500
+++ linux-2.6-lttng/arch/s390/kernel/sys_s390.c 2007-11-28 09:33:55.0 -0500
@@ -150,6 +150,8 @@ asmlinkage long sys_ipc(uint call, int f
 struct ipc_kludge tmp;
int ret;
 
+trace_mark(ipc_call, "call %u first %d", call, first);
+
 switch (call) {
 case SEMOP:
return sys_semtimedop(first, (struct sembuf __user *)ptr,
Index: linux-2.6-lttng/arch/s390/mm/fault.c
===
--- linux-2.6-lttng.orig/arch/s390/mm/fault.c   2007-11-28 09:27:27.0 -0500
+++ linux-2.6-lttng/arch/s390/mm/fault.c  2007-11-28 09:33:55.0 -0500
@@ -5,6 +5,7 @@
  *Copyright (C) 1999 IBM Deutschland Entwicklung GmbH, IBM Corporation
  *Author(s): Hartmut Penner ([EMA

[patch-early-RFC 08/10] LTTng Sparc instrumentation

2007-12-05 Thread Mathieu Desnoyers

syscall trace missing
traps missing

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/sparc/kernel/entry.S |   10 +-
 arch/sparc/kernel/process.c   |1 +
 arch/sparc/kernel/sys_sparc.c |2 ++
 3 files changed, 8 insertions(+), 5 deletions(-)

Index: linux-2.6-lttng/arch/sparc/kernel/entry.S
===
--- linux-2.6-lttng.orig/arch/sparc/kernel/entry.S  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/sparc/kernel/entry.S   2007-11-13 09:50:18.0 -0500
@@ -1231,7 +1231,7 @@ sys_ptrace:
 add%sp, STACKFRAME_SZ, %o0
 
ld  [%curptr + TI_FLAGS], %l5
-   andcc   %l5, _TIF_SYSCALL_TRACE, %g0
+   andcc   %l5, (_TIF_SYSCALL_TRACE|_TIF_KERNEL_TRACE), %g0
be  1f
 nop
 
@@ -1280,7 +1280,7 @@ sys_sigreturn:
 add%sp, STACKFRAME_SZ, %o0
 
ld  [%curptr + TI_FLAGS], %l5
-   andcc   %l5, _TIF_SYSCALL_TRACE, %g0
+   andcc   %l5, (_TIF_SYSCALL_TRACE|_TIF_KERNEL_TRACE), %g0
be  1f
 nop
 
@@ -1300,7 +1300,7 @@ sys_rt_sigreturn:
 add%sp, STACKFRAME_SZ, %o0
 
ld  [%curptr + TI_FLAGS], %l5
-   andcc   %l5, _TIF_SYSCALL_TRACE, %g0
+   andcc   %l5, (_TIF_SYSCALL_TRACE|_TIF_KERNEL_TRACE), %g0
be  1f
 nop
 
@@ -1436,7 +1436,7 @@ syscall_is_too_hard:
 
ld  [%curptr + TI_FLAGS], %l5
mov %i3, %o3
-   andcc   %l5, _TIF_SYSCALL_TRACE, %g0
+   andcc   %l5, (_TIF_SYSCALL_TRACE|_TIF_KERNEL_TRACE), %g0
mov %i4, %o4
bne linux_syscall_trace
 mov%i0, %l5
@@ -1453,7 +1453,7 @@ ret_sys_call:
ld  [%sp + STACKFRAME_SZ + PT_PSR], %g3
set PSR_C, %g2
bgeu1f
-andcc  %l6, _TIF_SYSCALL_TRACE, %g0
+andcc  %l6, (_TIF_SYSCALL_TRACE|_TIF_KERNEL_TRACE), %g0
 
/* System call success, clear Carry condition code. */
andn%g3, %g2, %g3
Index: linux-2.6-lttng/arch/sparc/kernel/sys_sparc.c
===
--- linux-2.6-lttng.orig/arch/sparc/kernel/sys_sparc.c  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/sparc/kernel/sys_sparc.c   2007-11-13 09:50:18.0 -0500
@@ -120,6 +120,8 @@ asmlinkage int sys_ipc (uint call, int f
version = call >> 16; /* hack for backward compatibility */
call &= 0x;
 
+   trace_mark(kernel_arch_ipc_call, "call %u first %d", call, first);
+
if (call <= SEMCTL)
switch (call) {
case SEMOP:
Index: linux-2.6-lttng/arch/sparc/kernel/process.c
===
--- linux-2.6-lttng.orig/arch/sparc/kernel/process.c  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/sparc/kernel/process.c 2007-11-13 09:50:18.0 -0500
@@ -709,6 +709,7 @@ pid_t kernel_thread(int (*fn)(void *), v
 "i" (__NR_clone), "r" (flags | CLONE_VM | 
CLONE_UNTRACED),
 "i" (__NR_exit),  "r" (fn), "r" (arg) :
 "g1", "g2", "g3", "o0", "o1", "memory", "cc");
+   trace_mark(kernel_arch_kthread_create, "pid %ld fn %p", retval, fn);
return retval;
 }
 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-early-RFC 07/10] LTTng instrumentation SH64

2007-12-05 Thread Mathieu Desnoyers

traps are missing.
syscall trace missing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/sh64/kernel/entry.S|2 +-
 arch/sh64/kernel/process.c  |5 -
 arch/sh64/kernel/sys_sh64.c |2 ++
 3 files changed, 7 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/arch/sh64/kernel/entry.S
===
--- linux-2.6-lttng.orig/arch/sh64/kernel/entry.S  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/sh64/kernel/entry.S  2007-11-13 09:50:15.0 -0500
@@ -1282,7 +1282,7 @@ syscall_allowed:
 
getcon  KCR0, r2
ld.l    r2, TI_FLAGS, r4
-   movi    (1 << TIF_SYSCALL_TRACE), r6
+   movi    (_TIF_SYSCALL_TRACE|_TIF_KERNEL_TRACE), r6
and r6, r4, r6
beq/l   r6, ZERO, tr0
 
Index: linux-2.6-lttng/arch/sh64/kernel/sys_sh64.c
===
--- linux-2.6-lttng.orig/arch/sh64/kernel/sys_sh64.c  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/sh64/kernel/sys_sh64.c  2007-11-13 09:50:15.0 -0500
@@ -187,6 +187,8 @@ asmlinkage int sys_ipc(uint call, int fi
version = call >> 16; /* hack for backward compatibility */
call &= 0xffff;
 
+   trace_mark(kernel_arch_ipc_call, "call %u first %d", call, first);
+
if (call <= SEMCTL)
switch (call) {
case SEMOP:
Index: linux-2.6-lttng/arch/sh64/kernel/process.c
===
--- linux-2.6-lttng.orig/arch/sh64/kernel/process.c  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/sh64/kernel/process.c  2007-11-13 09:50:15.0 -0500
@@ -393,6 +393,7 @@ ATTRIB_NORET void kernel_thread_helper(v
  */
 int kernel_thread(int (*fn)(void *), void * arg, unsigned long flags)
 {
+   unsigned long pid;
struct pt_regs regs;
 
memset(&regs, 0, sizeof(regs));
@@ -402,8 +403,10 @@ int kernel_thread(int (*fn)(void *), voi
regs.pc = (unsigned long)kernel_thread_helper;
regs.sr = (1 << 30);
 
-   return do_fork(flags | CLONE_VM | CLONE_UNTRACED, 0,
+   pid = do_fork(flags | CLONE_VM | CLONE_UNTRACED, 0,
   &regs, 0, NULL, NULL);
+   trace_mark(kernel_arch_kthread_create, "pid %ld fn %p", pid, fn);
+   return pid;
 }
 
 /*

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-early-RFC 04/10] LTTng instrumentation Powerpc

2007-12-05 Thread Mathieu Desnoyers

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/powerpc/kernel/misc_32.S  |2 +-
 arch/powerpc/kernel/misc_64.S  |2 +-
 arch/powerpc/kernel/process.c  |   11 +++
 arch/powerpc/kernel/ptrace.c   |4 
 arch/powerpc/kernel/syscalls.c |2 ++
 arch/powerpc/kernel/time.c |5 +
 arch/powerpc/kernel/traps.c|   11 +++
 arch/powerpc/mm/fault.c|3 +++
 8 files changed, 38 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/arch/powerpc/kernel/misc_32.S
===
--- linux-2.6-lttng.orig/arch/powerpc/kernel/misc_32.S  2007-11-28 09:27:28.0 -0500
+++ linux-2.6-lttng/arch/powerpc/kernel/misc_32.S  2007-11-28 09:33:42.0 -0500
@@ -766,7 +766,7 @@ _GLOBAL(abs)
  * Create a kernel thread
  *   kernel_thread(fn, arg, flags)
  */
-_GLOBAL(kernel_thread)
+_GLOBAL(original_kernel_thread)
stwu    r1,-16(r1)
stw r30,8(r1)
stw r31,12(r1)
Index: linux-2.6-lttng/arch/powerpc/kernel/misc_64.S
===
--- linux-2.6-lttng.orig/arch/powerpc/kernel/misc_64.S  2007-11-28 09:27:28.0 -0500
+++ linux-2.6-lttng/arch/powerpc/kernel/misc_64.S  2007-11-28 09:33:42.0 -0500
@@ -427,7 +427,7 @@ _GLOBAL(scom970_write)
  * Create a kernel thread
  *   kernel_thread(fn, arg, flags)
  */
-_GLOBAL(kernel_thread)
+_GLOBAL(original_kernel_thread)
std r29,-24(r1)
std r30,-16(r1)
stdu    r1,-STACK_FRAME_OVERHEAD(r1)
Index: linux-2.6-lttng/arch/powerpc/kernel/process.c
===
--- linux-2.6-lttng.orig/arch/powerpc/kernel/process.c  2007-11-28 09:27:28.0 -0500
+++ linux-2.6-lttng/arch/powerpc/kernel/process.c  2007-11-28 09:33:42.0 -0500
@@ -488,6 +488,17 @@ void show_regs(struct pt_regs * regs)
show_instructions(regs);
 }
 
+long original_kernel_thread(int (*fn) (void*), void* arg, unsigned long flags);
+
+long kernel_thread(int (*fn) (void *), void* arg, unsigned long flags)
+{
+   long retval;
+
+   retval = original_kernel_thread(fn, arg, flags);
+   trace_mark(kernel_arch_kthread_create, "pid %ld fn %p", retval, fn);
+   return retval;
+}
+
 void exit_thread(void)
 {
discard_lazy_cpu_state();
Index: linux-2.6-lttng/arch/powerpc/kernel/ptrace.c
===
--- linux-2.6-lttng.orig/arch/powerpc/kernel/ptrace.c  2007-11-28 09:27:28.0 -0500
+++ linux-2.6-lttng/arch/powerpc/kernel/ptrace.c  2007-11-28 09:33:42.0 -0500
@@ -624,6 +624,8 @@ static void do_syscall_trace(void)
 
 void do_syscall_trace_enter(struct pt_regs *regs)
 {
+   trace_mark(kernel_arch_syscall_entry, "syscall_id %d ip #p%ld",
+   (int)regs->gpr[0], instruction_pointer(regs));
secure_computing(regs->gpr[0]);
 
if (test_thread_flag(TIF_SYSCALL_TRACE)
@@ -650,6 +652,8 @@ void do_syscall_trace_enter(struct pt_re
 
 void do_syscall_trace_leave(struct pt_regs *regs)
 {
+   trace_mark(kernel_arch_syscall_exit, "ret %ld", regs->result);
+
if (unlikely(current->audit_context))
audit_syscall_exit((regs->ccr&0x10000000)?AUDITSC_FAILURE:AUDITSC_SUCCESS,
   regs->result);
Index: linux-2.6-lttng/arch/powerpc/kernel/syscalls.c
===
--- linux-2.6-lttng.orig/arch/powerpc/kernel/syscalls.c  2007-11-28 09:27:28.0 -0500
+++ linux-2.6-lttng/arch/powerpc/kernel/syscalls.c  2007-11-28 09:33:42.0 -0500
@@ -56,6 +56,8 @@ int sys_ipc(uint call, int first, unsign
version = call >> 16; /* hack for backward compatibility */
call &= 0xffff;
 
+   trace_mark(kernel_arch_ipc_call, "call %u first %d", call, first);
+
ret = -ENOSYS;
switch (call) {
case SEMOP:
Index: linux-2.6-lttng/arch/powerpc/kernel/time.c
===
--- linux-2.6-lttng.orig/arch/powerpc/kernel/time.c  2007-11-28 09:27:28.0 -0500
+++ linux-2.6-lttng/arch/powerpc/kernel/time.c  2007-11-28 09:33:42.0 -0500
@@ -564,6 +564,9 @@ void timer_interrupt(struct pt_regs * re
 * some CPUs will continuue to take decrementer exceptions */
set_dec(DECREMENTER_MAX);
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %ld ip #p%ld", regs->trap,
+   instruction_pointer(regs));
+
 #ifdef CONFIG_PPC32
if (atomic_read(&ppc_n_lost_interrupts) != 0)
do_IRQ(regs);
@@ -605,6 +608,8 @@ void timer_interrupt(struct pt_regs * re
 
irq_exit();
set_irq_regs(old_regs);
+
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
 }
 
 void wakeup_decrementer(void)
Index: linux-2.6-lttng/arch/powerpc/kernel/traps.c

[patch-early-RFC 05/10] LTTng instrumentation PPC

2007-12-05 Thread Mathieu Desnoyers

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/ppc/kernel/misc.S  |2 +-
 arch/ppc/kernel/time.c  |4 
 arch/ppc/kernel/traps.c |3 +++
 arch/ppc/mm/fault.c |3 +++
 4 files changed, 11 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/arch/ppc/kernel/misc.S
===
--- linux-2.6-lttng.orig/arch/ppc/kernel/misc.S  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/ppc/kernel/misc.S  2007-11-13 09:50:11.0 -0500
@@ -868,7 +868,7 @@ _GLOBAL(_get_SP)
  * Create a kernel thread
  *   kernel_thread(fn, arg, flags)
  */
-_GLOBAL(kernel_thread)
+_GLOBAL(original_kernel_thread)
stwu    r1,-16(r1)
stw r30,8(r1)
stw r31,12(r1)
Index: linux-2.6-lttng/arch/ppc/kernel/time.c
===
--- linux-2.6-lttng.orig/arch/ppc/kernel/time.c  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/ppc/kernel/time.c  2007-11-13 09:50:11.0 -0500
@@ -139,7 +139,10 @@ void timer_interrupt(struct pt_regs * re
if (atomic_read(&ppc_n_lost_interrupts) != 0)
do_IRQ(regs);
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %ld ip #p%ld",
+   regs->trap, instruction_pointer(regs));
old_regs = set_irq_regs(regs);
+
irq_enter();
 
while ((next_dec = tb_ticks_per_jiffy - tb_delta(&jiffy_stamp)) <= 0) {
@@ -192,6 +195,7 @@ void timer_interrupt(struct pt_regs * re
 
irq_exit();
set_irq_regs(old_regs);
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
 }
 
 /*
Index: linux-2.6-lttng/arch/ppc/kernel/traps.c
===
--- linux-2.6-lttng.orig/arch/ppc/kernel/traps.c  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/ppc/kernel/traps.c  2007-11-13 09:50:11.0 -0500
@@ -108,11 +108,14 @@ void _exception(int signr, struct pt_reg
debugger(regs);
die("Exception in kernel mode", regs, signr);
}
+   trace_mark(kernel_arch_trap_entry, "trap_id %ld ip #p%ld", regs->trap,
+   instruction_pointer(regs));
info.si_signo = signr;
info.si_errno = 0;
info.si_code = code;
info.si_addr = (void __user *) addr;
force_sig_info(signr, &info, current);
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
 
/*
 * Init gets no signals that it doesn't have a handler for.
Index: linux-2.6-lttng/arch/ppc/mm/fault.c
===
--- linux-2.6-lttng.orig/arch/ppc/mm/fault.c  2007-11-13 09:25:23.0 -0500
+++ linux-2.6-lttng/arch/ppc/mm/fault.c  2007-11-13 09:50:11.0 -0500
@@ -250,7 +250,10 @@ good_area:
 * the fault.
 */
  survive:
+   trace_mark(kernel_arch_trap_entry, "trap_id %ld ip #p%ld",
+   regs->trap, instruction_pointer(regs));
fault = handle_mm_fault(mm, vma, address, is_write);
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-early-RFC 03/10] LTTng - MIPS instrumentation

2007-12-05 Thread Mathieu Desnoyers

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/mips/kernel/entry.S |2 +-
 arch/mips/kernel/process.c   |6 +-
 arch/mips/kernel/ptrace.c|7 +++
 arch/mips/kernel/syscall.c   |2 ++
 arch/mips/kernel/traps.c |   16 
 arch/mips/kernel/unaligned.c |   11 ++-
 arch/mips/mm/fault.c |3 +++
 include/asm-mips/mipsregs.h  |1 +
 8 files changed, 41 insertions(+), 7 deletions(-)

Index: linux-2.6-lttng/arch/mips/kernel/process.c
===
--- linux-2.6-lttng.orig/arch/mips/kernel/process.c  2007-11-20 13:13:42.0 -0500
+++ linux-2.6-lttng/arch/mips/kernel/process.c  2007-11-20 13:19:04.0 -0500
@@ -226,6 +226,7 @@ static void __noreturn kernel_thread_hel
 long kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
 {
struct pt_regs regs;
+   long pid;
 
memset(&regs, 0, sizeof(regs));
 
@@ -241,7 +242,10 @@ long kernel_thread(int (*fn)(void *), vo
 #endif
 
/* Ok, create the new process.. */
-   return do_fork(flags | CLONE_VM | CLONE_UNTRACED, 0, &regs, 0, NULL, NULL);
+   pid = do_fork(flags | CLONE_VM | CLONE_UNTRACED,
+   0, &regs, 0, NULL, NULL);
+   trace_mark(kernel_arch_kthread_create, "pid %ld fn %p", pid, fn);
+   return pid;
 }
 
 /*
Index: linux-2.6-lttng/arch/mips/kernel/ptrace.c
===
--- linux-2.6-lttng.orig/arch/mips/kernel/ptrace.c  2007-11-20 13:13:42.0 -0500
+++ linux-2.6-lttng/arch/mips/kernel/ptrace.c  2007-11-20 13:19:52.0 -0500
@@ -466,6 +466,13 @@ static inline int audit_arch(void)
  */
 asmlinkage void do_syscall_trace(struct pt_regs *regs, int entryexit)
 {
+   if (!entryexit)
+   trace_mark(kernel_arch_syscall_entry, "syscall_id %d ip #p%ld",
+   (int)regs->regs[2], instruction_pointer(regs));
+   else
+   trace_mark(kernel_arch_syscall_exit, "ret %ld",
+   regs->regs[2]);
+
/* do the secure computing check first */
if (!entryexit)
secure_computing(regs->regs[0]);
Index: linux-2.6-lttng/arch/mips/kernel/syscall.c
===
--- linux-2.6-lttng.orig/arch/mips/kernel/syscall.c  2007-11-20 13:13:42.0 -0500
+++ linux-2.6-lttng/arch/mips/kernel/syscall.c  2007-11-20 13:19:04.0 -0500
@@ -327,6 +327,8 @@ asmlinkage int sys_ipc(unsigned int call
version = call >> 16; /* hack for backward compatibility */
call &= 0xffff;
 
+   trace_mark(kernel_arch_ipc_call, "call %u first %d", call, first);
+
switch (call) {
case SEMOP:
return sys_semtimedop(first, (struct sembuf __user *)ptr,
Index: linux-2.6-lttng/arch/mips/kernel/traps.c
===
--- linux-2.6-lttng.orig/arch/mips/kernel/traps.c  2007-11-20 13:13:42.0 -0500
+++ linux-2.6-lttng/arch/mips/kernel/traps.c  2007-11-20 13:19:04.0 -0500
@@ -293,7 +293,7 @@ static void __show_regs(const struct pt_
 
printk("Cause : %08x\n", cause);
 
-   cause = (cause & CAUSEF_EXCCODE) >> CAUSEB_EXCCODE;
+   cause = CAUSE_EXCCODE(cause);
if (1 <= cause && cause <= 5)
printk("BadVA : %0*lx\n", field, regs->cp0_badvaddr);
 
@@ -587,6 +587,8 @@ asmlinkage void do_fpe(struct pt_regs *r
 
die_if_kernel("FP exception in kernel code", regs);
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %lu ip #p%ld",
+   CAUSE_EXCCODE(regs->cp0_cause), instruction_pointer(regs));
if (fcr31 & FPU_CSR_UNI_X) {
int sig;
 
@@ -800,6 +802,9 @@ asmlinkage void do_cpu(struct pt_regs *r
unsigned int cpid;
int status;
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %lu ip #p%ld",
+   CAUSE_EXCCODE(regs->cp0_cause), instruction_pointer(regs));
+
die_if_kernel("do_cpu invoked from kernel context!", regs);
 
cpid = (regs->cp0_cause >> CAUSEB_CE) & 3;
@@ -811,8 +816,10 @@ asmlinkage void do_cpu(struct pt_regs *r
opcode = 0;
status = -1;
 
-   if (unlikely(compute_return_epc(regs) < 0))
+   if (unlikely(compute_return_epc(regs) < 0)) {
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
return;
+   }
 
if (unlikely(get_user(opcode, epc) < 0))
status = SIGSEGV;
@@ -830,7 +837,7 @@ asmlinkage void do_cpu(struct pt_regs *r
regs->cp0_epc = old_epc; /* Undo skip-over.  */
force_sig(status, current);
}
-
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
return;
 
case 1:
@@ -850,7 +857,7 @@ a

[patch-early-RFC 06/10] LTTng - instrumentation SH

2007-12-05 Thread Mathieu Desnoyers

Changelog:
- fix do_fork instrumentation

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/sh/kernel/entry-common.S |   10 ++
 arch/sh/kernel/process.c  |5 -
 arch/sh/kernel/ptrace.c   |8 +++-
 arch/sh/kernel/sys_sh.c   |2 ++
 arch/sh/kernel/traps.c|   10 --
 arch/sh/mm/fault.c|   12 
 6 files changed, 39 insertions(+), 8 deletions(-)

Index: linux-2.6-lttng/arch/sh/kernel/entry-common.S
===
--- linux-2.6-lttng.orig/arch/sh/kernel/entry-common.S  2007-11-26 13:36:40.0 -0500
+++ linux-2.6-lttng/arch/sh/kernel/entry-common.S  2007-11-26 13:37:12.0 -0500
@@ -224,7 +224,7 @@ work_resched:
 syscall_exit_work:
! r0: current_thread_info->flags
! r8: current_thread_info
-   tst #_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP, r0
+   tst #_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP | _TIF_KERNEL_TRACE, r0
bt/s    work_pending
 tst    #_TIF_NEED_RESCHED, r0
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -233,7 +233,8 @@ syscall_exit_work:
 nop
 #endif
sti
-   ! XXX setup arguments...
+   mov r15,r4  ! pass stacked regs as arg
+   mov #0, r5  ! trace entry [0]
mov.l   4f, r0  ! do_syscall_trace
jsr @r0
 nop
@@ -243,7 +244,8 @@ syscall_exit_work:
.align  2
 syscall_trace_entry:
!   Yes it is traced.
-   ! XXX setup arguments...
+   mov r15,r4  ! pass stacked regs as arg
+   mov #1, r5  ! trace entry [1]
mov.l   4f, r11 ! Call do_syscall_trace which notifies
jsr @r11    ! superior (will chomp R[0-7])
 nop
@@ -366,7 +368,7 @@ ENTRY(system_call)
!
get_current_thread_info r8, r10
mov.l   @(TI_FLAGS,r8), r8
-   mov #_TIF_SYSCALL_TRACE, r10
+   mov #(_TIF_SYSCALL_TRACE | _TIF_KERNEL_TRACE), r10
tst r10, r8
bf  syscall_trace_entry
!
Index: linux-2.6-lttng/arch/sh/kernel/process.c
===
--- linux-2.6-lttng.orig/arch/sh/kernel/process.c  2007-11-26 13:36:40.0 -0500
+++ linux-2.6-lttng/arch/sh/kernel/process.c  2007-11-28 08:29:11.0 -0500
@@ -172,6 +172,7 @@ __asm__(".align 5\n"
 /* Don't use this in BL=1(cli).  Or else, CPU resets! */
 int kernel_thread(int (*fn)(void *), void * arg, unsigned long flags)
 {
+   unsigned long pid;
struct pt_regs regs;
 
memset(&regs, 0, sizeof(regs));
@@ -182,8 +183,10 @@ int kernel_thread(int (*fn)(void *), voi
regs.sr = (1 << 30);
 
/* Ok, create the new process.. */
-   return do_fork(flags | CLONE_VM | CLONE_UNTRACED, 0,
+   pid =  do_fork(flags | CLONE_VM | CLONE_UNTRACED, 0,
   &regs, 0, NULL, NULL);
+   trace_mark(kernel_arch_kthread_create, "pid %ld fn %p", pid, fn);
+   return pid;
 }
 
 /*
Index: linux-2.6-lttng/arch/sh/kernel/sys_sh.c
===
--- linux-2.6-lttng.orig/arch/sh/kernel/sys_sh.c  2007-11-26 13:36:40.0 -0500
+++ linux-2.6-lttng/arch/sh/kernel/sys_sh.c  2007-11-26 13:37:12.0 -0500
@@ -192,6 +192,8 @@ asmlinkage int sys_ipc(uint call, int fi
version = call >> 16; /* hack for backward compatibility */
call &= 0xffff;
 
+   trace_mark(kernel_arch_ipc_call, "call %u first %d", call, first);
+
if (call <= SEMCTL)
switch (call) {
case SEMOP:
Index: linux-2.6-lttng/arch/sh/kernel/traps.c
===
--- linux-2.6-lttng.orig/arch/sh/kernel/traps.c  2007-11-26 13:36:40.0 -0500
+++ linux-2.6-lttng/arch/sh/kernel/traps.c  2007-11-26 13:37:12.0 -0500
@@ -548,6 +548,9 @@ asmlinkage void do_address_error(struct 
lookup_exception_vector(error_code);
 #endif
 
+   trace_mark(kernel_arch_trap_entry, "trap_id %lu ip #p%ld",
+   (error_code >> 5), instruction_pointer(regs));
+
oldfs = get_fs();
 
if (user_mode(regs)) {
@@ -574,8 +577,10 @@ asmlinkage void do_address_error(struct 
tmp = handle_unaligned_access(instruction, regs);
set_fs(oldfs);
 
-   if (tmp==0)
-   return; /* sorted */
+   if (tmp==0) {
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
+   return; /* sorted */
+   }
 #endif
 
 uspace_segv:
@@ -611,6 +616,7 @@ uspace_segv:
force_sig(SIGSEGV, current);
 #endif
}
+   trace_mark(kernel_arch_trap_exit, MARK_NOARGS);
 }
 
 #ifdef CONFIG_SH_DSP
Index: linux-2.6-lttng/arch/sh/mm/fault.c
=

[patch-early-RFC 00/10] LTTng architecture dependent instrumentation

2007-12-05 Thread Mathieu Desnoyers

Hi,

Here is the architecture dependent instrumentation for LTTng. Not all
architectures are supported, and some of them have missing instrumentation
points.

The most complete should be :
x86_32, x86_64, powerpc, mips and arm.

It depends on the kernel trace thread flag patchset.

It instruments :
- traps/faults
- system calls
- kernel thread creation
- IPC calls
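
For readers who have not seen the individual patches, here is a minimal userspace sketch of what one of these instrumentation points looks like, modelled on the sys_ipc hunks in this series. The demo_mark() helper, the last_event buffer and the trace_mark() macro below are stand-ins written for this illustration (they only format the event into a buffer); they are not the kernel marker implementation.

```c
#include <stdarg.h>
#include <stdio.h>

/* Illustration only: holds the most recently recorded event as text. */
static char last_event[128];

/* Hypothetical stand-in for the marker backend: format the arguments and
 * store "name: args" in last_event. */
static void demo_mark(const char *name, const char *fmt, ...)
{
	char args[96];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(args, sizeof(args), fmt, ap);
	va_end(ap);
	snprintf(last_event, sizeof(last_event), "%s: %s", name, args);
}

/* Same call shape as in the patches: marker name, format string, args. */
#define trace_mark(name, ...) demo_mark(#name, __VA_ARGS__)

/* Mirrors the sys_ipc instrumentation: split the call word, then record
 * the event before the operation would be dispatched. */
static unsigned int demo_sys_ipc(unsigned int call, int first)
{
	unsigned int version = call >> 16;	/* upper half: ABI version */

	call &= 0xffff;				/* lower half: IPC operation */
	trace_mark(kernel_arch_ipc_call, "call %u first %d", call, first);
	return version;
}
```

Calling demo_sys_ipc(0x10003, 7) records the event with call 3 and first 7, and returns version 1.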

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Re: [PATCH] Reduce stack used by lib/hexdump.c

2007-12-05 Thread Joe Perches

On Wed, 2007-12-05 at 18:18 -0800, Randy Dunlap wrote:
> Joe Perches wrote:
> > Maybe just eliminate the 16 or 32 byte width option and
> > force it to only 16 byte widths.
> Have you checked users (callers)?  I'm pretty sure that one of the
> callers wanted 32 and that's why it's there.

I did.  There is only 1 subsystem.  That's easy to change.

drivers/mtd/ubi/debug.c:  print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 32, 1,
drivers/mtd/ubi/io.c: print_hex_dump(KERN_DEBUG, "", DUMP_PREFIX_OFFSET, 32, 1,

Long lines in the log file are not too easy to read anyway.
Using 16 byte dumps per line instead of 32 isn't painful.

It gets rid of the allocation, reduces the argument count
and makes the kernel smaller.  I think it's all good.

Every current caller would have to change though.
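
To make the trade-off concrete, here is a rough userspace sketch of a dump routine with the row width fixed at 16 bytes. It is not the lib/hexdump.c code; demo_hex_row() and demo_hex_dump() are names invented for this example. The point is only that a fixed width lets the row buffer be a small compile-time-sized array and removes the width argument.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_ROW_BYTES 16	/* fixed width, as proposed above */

/* Format up to DEMO_ROW_BYTES bytes as "xx xx ..." into out and return the
 * number of characters written (excluding the terminating NUL).
 * out must hold at least DEMO_ROW_BYTES * 3 bytes. */
static size_t demo_hex_row(const unsigned char *buf, size_t len, char *out)
{
	static const char digits[] = "0123456789abcdef";
	size_t i, pos = 0;

	if (len > DEMO_ROW_BYTES)
		len = DEMO_ROW_BYTES;
	for (i = 0; i < len; i++) {
		if (i)
			out[pos++] = ' ';
		out[pos++] = digits[buf[i] >> 4];
		out[pos++] = digits[buf[i] & 0x0f];
	}
	out[pos] = '\0';
	return pos;
}

/* Dump a buffer one fixed-width row at a time. The row buffer is 48 bytes
 * on the stack: 16 * 2 hex digits, 15 separators and one NUL. */
static void demo_hex_dump(FILE *fp, const void *data, size_t len)
{
	const unsigned char *p = data;
	char row[DEMO_ROW_BYTES * 3];
	size_t off;

	for (off = 0; off < len; off += DEMO_ROW_BYTES) {
		demo_hex_row(p + off, len - off, row);
		fprintf(fp, "%08zx: %s\n", off, row);
	}
}
```

With a 32-byte option the same buffer would need to be twice as large or sized at run time, which is the cost the fixed width avoids.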


[patch-RFC 08/26] LTTng Linux Kernel Trace Thread Flags x86_32

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 include/asm-i386/thread_info.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-x86/thread_info_32.h
===
--- linux-2.6-lttng.orig/include/asm-x86/thread_info_32.h  2007-07-30 19:41:44.0 -0400
+++ linux-2.6-lttng/include/asm-x86/thread_info_32.h  2007-07-30 20:07:03.0 -0400
@@ -132,6 +132,7 @@ static inline struct thread_info *curren
 #define TIF_SYSCALL_AUDIT  6   /* syscall auditing active */
 #define TIF_SECCOMP  7   /* secure computing */
 #define TIF_RESTORE_SIGMASK  8   /* restore signal mask in do_signal() */
+#define TIF_KERNEL_TRACE   9   /* kernel trace active */
 #define TIF_MEMDIE 16
 #define TIF_DEBUG  17  /* uses debug registers */
 #define TIF_IO_BITMAP  18  /* uses I/O bitmap */
@@ -147,6 +148,7 @@ static inline struct thread_info *curren
 #define _TIF_SYSCALL_AUDIT (1

[patch-RFC 23/26] Prepare x86_64 for TIF_SYSCALL_TRACE async flag set in entry.S

2007-12-05 Thread Mathieu Desnoyers

When the flag is inactive upon syscall entry and concurrently activated before
exit, we seem to reach a state where the top of stack is incorrect upon return
to user space.

Fix this by fixing the top of stack and jumping to int_ret_from_sys_call if we
detect that thread flags has been modified.

We make sure that the thread flag read is coherent between our new test and the 
ALLWORK_MASK test by first saving it in a register used for both comparisons.
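
The single-read idea can be sketched in C as follows. This is an illustration of the reasoning only, not a translation of the entry_64.S code: the DEMO_* bit values, the enum and demo_pick_exit() are made up for the example, and a C11 atomic stands in for the thread flags word that another CPU may modify.

```c
#include <stdatomic.h>

/* Hypothetical bit layout for illustration; not the x86_64 values. */
#define DEMO_TIF_SYSCALL_TRACE	(1u << 0)
#define DEMO_TIF_NEED_RESCHED	(1u << 3)
#define DEMO_ALLWORK_MASK	(DEMO_TIF_SYSCALL_TRACE | DEMO_TIF_NEED_RESCHED)

enum demo_exit_path {
	DEMO_FAST_RETURN,	/* nothing pending, fast return to user space */
	DEMO_TRACE_RETURN,	/* trace flag seen: fix up frame, slow return */
	DEMO_WORK_RETURN	/* other pending work (reschedule, signals) */
};

static enum demo_exit_path demo_pick_exit(atomic_uint *flags)
{
	/* One load only: both tests below look at the same snapshot, so a
	 * flag set concurrently cannot be seen by one test and missed by
	 * the other. */
	unsigned int snap = atomic_load(flags);

	if (!(snap & DEMO_ALLWORK_MASK))
		return DEMO_FAST_RETURN;
	if (snap & DEMO_TIF_SYSCALL_TRACE)
		return DEMO_TRACE_RETURN;
	return DEMO_WORK_RETURN;
}
```

Reading the flags twice instead would allow the work-mask test and the trace test to disagree when the flag is set between the two reads.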

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Andi Kleen <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 arch/x86/kernel/entry_64.S |   12 
 1 file changed, 12 insertions(+)

Index: linux-2.6-lttng/arch/x86/kernel/entry_64.S
===
--- linux-2.6-lttng.orig/arch/x86/kernel/entry_64.S  2007-11-13 09:25:25.0 -0500
+++ linux-2.6-lttng/arch/x86/kernel/entry_64.S  2007-11-13 09:49:48.0 -0500
@@ -268,6 +268,8 @@ sysret_check:   
/* Handle reschedules */
/* edx: work, edi: workmask */  
 sysret_careful:
+   testl $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),%edx
+   jnz ret_from_sys_call_trace
bt $TIF_NEED_RESCHED,%edx
jnc sysret_signal
TRACE_IRQS_ON
@@ -279,6 +281,16 @@ sysret_careful:
CFI_ADJUST_CFA_OFFSET -8
jmp sysret_check
 
+ret_from_sys_call_trace:
+   TRACE_IRQS_ON
+   sti
+   SAVE_REST
+   FIXUP_TOP_OF_STACK %rdi
+   movq %rsp,%rdi
+   LOAD_ARGS ARGOFFSET  /* reload args from stack in case ptrace changed it */
+   RESTORE_REST
+   jmp int_ret_from_sys_call
+
/* Handle a signal */ 
 sysret_signal:
TRACE_IRQS_ON

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-RFC 13/26] LTTng Kernel Trace Thread Flag MIPS

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-mips/thread_info.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-mips/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-mips/thread_info.h  2007-08-09 15:19:01.0 -0400
+++ linux-2.6-lttng/include/asm-mips/thread_info.h  2007-08-09 15:28:15.0 -0400
@@ -122,9 +122,11 @@ register struct thread_info *__current_t
 #define TIF_32BIT_REGS 22  /* also implies 16/32 fprs */
 #define TIF_32BIT_ADDR 23  /* 32-bit address space (o32/n32) */
 #define TIF_FPUBOUND   24  /* thread bound to FPU-full CPU set */
+#define TIF_KERNEL_TRACE   30  /* kernel trace active */
 #define TIF_SYSCALL_TRACE  31  /* syscall trace active */
 
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 24/26] LTTng Linux Kernel Trace Thread Flag x86_64

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
 include/asm-x86/thread_info_64.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-x86/thread_info_64.h
===
--- linux-2.6-lttng.orig/include/asm-x86/thread_info_64.h  2007-11-02 11:06:22.0 -0400
+++ linux-2.6-lttng/include/asm-x86/thread_info_64.h  2007-11-02 12:02:57.0 -0400
@@ -107,6 +107,7 @@ static inline struct thread_info *stack_
  * Warning: layout of LSW is hardcoded in entry.S
  */
 #define TIF_SYSCALL_TRACE  0   /* syscall trace active */
+#define TIF_KERNEL_TRACE   1   /* kernel trace active */
 #define TIF_SIGPENDING 2   /* signal pending */
 #define TIF_NEED_RESCHED   3   /* rescheduling necessary */
 #define TIF_SINGLESTEP 4   /* reenable singlestep on user return*/
@@ -125,6 +126,7 @@ static inline struct thread_info *stack_
 #define TIF_FREEZE 23  /* is freezing for suspend */
 
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 14/26] LTTng Kernel Trace Thread Flag parisc

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-parisc/thread_info.h |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/include/asm-parisc/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-parisc/thread_info.h  2007-07-30 18:18:19.0 -0400
+++ linux-2.6-lttng/include/asm-parisc/thread_info.h  2007-07-30 18:18:48.0 -0400
@@ -62,6 +62,7 @@ struct thread_info {
 #define TIF_32BIT   4   /* 32 bit binary */
 #define TIF_MEMDIE 5
 #define TIF_RESTORE_SIGMASK  6   /* restore saved signal mask */
+#define TIF_KERNEL_TRACE   7   /* kernel trace active */
 
 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
 #define _TIF_SIGPENDING  (1 << TIF_SIGPENDING)
@@ -69,6 +70,7 @@ struct thread_info {
 #define _TIF_POLLING_NRFLAG  (1 << TIF_POLLING_NRFLAG)
 #define _TIF_32BIT (1 << TIF_32BIT)
 #define _TIF_RESTORE_SIGMASK   (1 << TIF_RESTORE_SIGMASK)
+#define _TIF_KERNEL_TRACE  (1 << TIF_KERNEL_TRACE)
 
 #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | \
  _TIF_NEED_RESCHED | _TIF_RESTORE_SIGMASK)
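
The relation between the TIF_* bit numbers and the _TIF_* masks used by the entry code can be sketched like this. It is a userspace illustration with invented DEMO_* names; bit 7 is the value the hunk above adds, while bit 0 for the syscall trace flag is an assumption, since that definition is outside the hunk.

```c
#include <stdbool.h>

/* Bit numbers (position of each flag in the thread flags word). */
#define DEMO_TIF_SYSCALL_TRACE	0	/* assumed, not shown in the hunk */
#define DEMO_TIF_KERNEL_TRACE	7	/* value added by the hunk */

/* Masks derived from the bit numbers, as the _TIF_* definitions do. */
#define DEMO_MASK_SYSCALL_TRACE	(1UL << DEMO_TIF_SYSCALL_TRACE)
#define DEMO_MASK_KERNEL_TRACE	(1UL << DEMO_TIF_KERNEL_TRACE)

/* Return the flags word with the given bit number set. */
static unsigned long demo_set_flag(unsigned long flags, int bit)
{
	return flags | (1UL << bit);
}

/* The test the instrumented entry paths perform: take the slow syscall
 * path when either ptrace-style tracing or kernel tracing is active. */
static bool demo_syscall_needs_tracing(unsigned long flags)
{
	return (flags & (DEMO_MASK_SYSCALL_TRACE | DEMO_MASK_KERNEL_TRACE)) != 0;
}
```

Keeping bit numbers and masks separate lets C code use bit-number helpers while assembly tests several flags with one mask.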

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-RFC 16/26] LTTng Kernel Trace Thread Flag s390

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/s390/kernel/entry.S   |5 -
 include/asm-s390/thread_info.h |2 ++
 2 files changed, 6 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-s390/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-s390/thread_info.h  2007-11-20 13:10:50.0 -0500
+++ linux-2.6-lttng/include/asm-s390/thread_info.h  2007-11-20 13:12:20.0 -0500
@@ -96,6 +96,7 @@
 #define TIF_SYSCALL_AUDIT  5   /* syscall auditing active */
 #define TIF_SINGLE_STEP  6   /* deliver sigtrap on return to user */
 #define TIF_MCCK_PENDING   7   /* machine check handling is pending */
+#define TIF_KERNEL_TRACE   8   /* kernel trace active */
 #define TIF_USEDFPU  16  /* FPU was used by this task this quantum (SMP) */
 #define TIF_POLLING_NRFLAG 17  /* true if poll_idle() is polling 
   TIF_NEED_RESCHED */
@@ -110,6 +111,7 @@
 #define _TIF_SYSCALL_AUDIT (1

[patch-RFC 11/26] LTTng Kernel Trace Thread Flag m68k

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-m68k/thread_info.h |1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6-lttng/include/asm-m68k/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-m68k/thread_info.h  2007-07-30 18:14:49.0 -0400
+++ linux-2.6-lttng/include/asm-m68k/thread_info.h  2007-07-30 18:15:23.0 -0400
@@ -55,6 +55,7 @@ struct thread_info {
  */
 #define TIF_SIGPENDING 6   /* signal pending */
 #define TIF_NEED_RESCHED   7   /* rescheduling necessary */
+#define TIF_KERNEL_TRACE   13  /* kernel trace active */
 #define TIF_DELAYED_TRACE  14  /* single step a syscall */
 #define TIF_SYSCALL_TRACE  15  /* syscall trace active */
 #define TIF_MEMDIE 16

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-RFC 07/26] LTTng Kernel Trace Thread Flag H8300

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-h8300/thread_info.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-h8300/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-h8300/thread_info.h  2007-08-07 11:03:21.0 -0400
+++ linux-2.6-lttng/include/asm-h8300/thread_info.h  2007-08-07 11:47:27.0 -0400
@@ -92,6 +92,7 @@ static inline struct thread_info *curren
   TIF_NEED_RESCHED */
 #define TIF_MEMDIE 4
 #define TIF_RESTORE_SIGMASK  5   /* restore signal mask in do_signal() */
+#define TIF_KERNEL_TRACE   6   /* kernel trace active */
 
 /* as above, but as bit values */
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 19/26] LTTng Kernel Trace Thread Flag sparc

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-sparc/thread_info.h |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-sparc/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-sparc/thread_info.h  2007-07-30 18:23:47.0 -0400
+++ linux-2.6-lttng/include/asm-sparc/thread_info.h  2007-07-30 18:24:06.0 -0400
@@ -128,7 +128,7 @@ BTFIXUPDEF_CALL(void, free_thread_info, 
  * thread information flag bit numbers
  */
 #define TIF_SYSCALL_TRACE  0   /* syscall trace active */
-/* flag bit 1 is available */
+#define TIF_KERNEL_TRACE   1   /* kernel trace active */
 #define TIF_SIGPENDING 2   /* signal pending */
 #define TIF_NEED_RESCHED   3   /* rescheduling necessary */
 #define TIF_RESTORE_SIGMASK  4   /* restore signal mask in do_signal() */
@@ -140,6 +140,7 @@ BTFIXUPDEF_CALL(void, free_thread_info, 
 
 /* as above, but as bit values */
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 12/26] LTTng Kernel Trace Thread Flag m68knommu

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-m68knommu/thread_info.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-m68knommu/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-m68knommu/thread_info.h  2007-07-30 18:49:42.0 -0400
+++ linux-2.6-lttng/include/asm-m68knommu/thread_info.h  2007-07-30 19:08:02.0 -0400
@@ -88,14 +88,16 @@ static inline struct thread_info *curren
 #define TIF_POLLING_NRFLAG 3   /* true if poll_idle() is polling
   TIF_NEED_RESCHED */
 #define TIF_MEMDIE 4
+#define TIF_KERNEL_TRACE   5   /* kernel trace active */
 
 /* as above, but as bit values */
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 18/26] LTTng Kernel Trace Thread Flag sh64

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-sh64/thread_info.h |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/include/asm-sh64/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-sh64/thread_info.h 2007-07-30 18:23:10.0 -0400
+++ linux-2.6-lttng/include/asm-sh64/thread_info.h  2007-07-30 18:23:34.0 -0400
@@ -75,11 +75,13 @@ static inline struct thread_info *curren
 
 /* thread information flags */
 #define TIF_SYSCALL_TRACE  0   /* syscall trace active */
+#define TIF_KERNEL_TRACE   1   /* kernel trace active */
 #define TIF_SIGPENDING 2   /* signal pending */
 #define TIF_NEED_RESCHED   3   /* rescheduling necessary */
 #define TIF_MEMDIE 4
 #define TIF_RESTORE_SIGMASK 5   /* Restore signal mask in do_signal */
 
+#define _TIF_KERNEL_TRACE  (1 << TIF_KERNEL_TRACE)
 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
 #define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED  (1 << TIF_NEED_RESCHED)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[patch-RFC 15/26] LTTng Kernel Trace Thread Flag powerpc

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-powerpc/thread_info.h |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

Index: linux-2.6-lttng/include/asm-powerpc/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-powerpc/thread_info.h  2007-08-10 17:21:52.0 -0400
+++ linux-2.6-lttng/include/asm-powerpc/thread_info.h   2007-08-10 18:26:46.0 -0400
@@ -112,7 +112,7 @@ static inline struct thread_info *curren
 #define TIF_POLLING_NRFLAG 3   /* true if poll_idle() is polling
   TIF_NEED_RESCHED */
 #define TIF_32BIT  4   /* 32 bit binary */
-#define TIF_PERFMON_WORK   5   /* work for pfm_handle_work() */
+#define TIF_KERNEL_TRACE   5   /* kernel trace active */
 #define TIF_PERFMON_CTXSW  6   /* perfmon needs ctxsw calls */
 #define TIF_SYSCALL_AUDIT  7   /* syscall auditing active */
 #define TIF_SINGLESTEP 8   /* singlestepping active */
@@ -124,6 +124,7 @@ static inline struct thread_info *curren
 #define TIF_FREEZE 14  /* Freezing for suspend */
 #define TIF_RUNLATCH   15  /* Is the runlatch enabled? */
 #define TIF_ABI_PENDING 16  /* 32/64 bit switch needed */
+#define TIF_PERFMON_WORK   17  /* work for pfm_handle_work() */
 
 /* as above, but as bit values */
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 17/26] LTTng Kernel Trace Thread Flag SH

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-sh/thread_info.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-sh/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-sh/thread_info.h   2007-07-30 18:46:17.0 -0400
+++ linux-2.6-lttng/include/asm-sh/thread_info.h 2007-07-30 19:11:39.0 -0400
@@ -111,6 +111,7 @@ static inline struct thread_info *curren
 #define TIF_NEED_RESCHED   2   /* rescheduling necessary */
 #define TIF_RESTORE_SIGMASK 3   /* restore signal mask in do_signal() */
 #define TIF_SINGLESTEP 4   /* singlestepping active */
+#define TIF_KERNEL_TRACE   5   /* kernel trace active */
 #define TIF_USEDFPU 16  /* FPU was used by this task this quantum (SMP) */
 #define TIF_POLLING_NRFLAG 17  /* true if poll_idle() is polling TIF_NEED_RESCHED */
 #define TIF_MEMDIE 18
@@ -121,11 +122,12 @@ static inline struct thread_info *curren
 #define _TIF_NEED_RESCHED  (1

[patch-RFC 09/26] LTTng Kernel Trace Thread Flag ia64

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 arch/ia64/kernel/entry.S   |6 --
 include/asm-ia64/thread_info.h |   13 +
 2 files changed, 13 insertions(+), 6 deletions(-)

Index: linux-2.6-lttng/include/asm-ia64/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-ia64/thread_info.h 2007-11-20 10:43:04.0 -0500
+++ linux-2.6-lttng/include/asm-ia64/thread_info.h  2007-11-20 10:43:28.0 -0500
@@ -86,6 +86,7 @@
 #define TIF_SINGLESTEP 4   /* restore singlestep on return to user mode */
 #define TIF_RESTORE_SIGMASK 5   /* restore signal mask in do_signal() */
 #define TIF_PERFMON_WORK   6   /* work for pfm_handle_work() */
+#define TIF_KERNEL_TRACE   7   /* kernel trace active */
 #define TIF_POLLING_NRFLAG 16  /* true if poll_idle() is polling TIF_NEED_RESCHED */
 #define TIF_MEMDIE 17
 #define TIF_MCA_INIT   18  /* this task is processing MCA or INIT */
@@ -95,7 +96,8 @@
 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
 #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
-#define _TIF_SYSCALL_TRACEAUDIT (_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP)
+#define _TIF_KERNEL_TRACE  (1 << TIF_KERNEL_TRACE)
+#define _TIF_SYSCALL_TRACEAUDIT (_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP|_TIF_KERNEL_TRACE)
 #define _TIF_RESTORE_SIGMASK   (1 << TIF_RESTORE_SIGMASK)
 #define _TIF_PERFMON_WORK  (1 << TIF_PERFMON_WORK)
 #define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
@@ -108,9 +110,12 @@
 /* "work to do on user-return" bits */
 #define TIF_ALLWORK_MASK   (_TIF_SIGPENDING|_TIF_PERFMON_WORK|_TIF_SYSCALL_AUDIT|\
 _TIF_NEED_RESCHED| _TIF_SYSCALL_TRACE|\
-_TIF_RESTORE_SIGMASK)
-/* like TIF_ALLWORK_BITS but sans TIF_SYSCALL_TRACE or TIF_SYSCALL_AUDIT */
-#define TIF_WORK_MASK  (TIF_ALLWORK_MASK&~(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT))
+_TIF_RESTORE_SIGMASK|_TIF_KERNEL_TRACE)
+/*
+ * like TIF_ALLWORK_BITS but sans TIF_SYSCALL_TRACE, TIF_KERNEL_TRACE
+ * or TIF_SYSCALL_AUDIT
+ */
+#define TIF_WORK_MASK  (TIF_ALLWORK_MASK&~(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_KERNEL_TRACE))
 
 #define TS_POLLING 1   /* true if in idle loop and not sleeping */
 
Index: linux-2.6-lttng/arch/ia64/kernel/entry.S
===
--- linux-2.6-lttng.orig/arch/ia64/kernel/entry.S   2007-11-20 10:43:04.0 -0500
+++ linux-2.6-lttng/arch/ia64/kernel/entry.S 2007-11-20 10:50:48.0 -0500
@@ -621,9 +621,11 @@
;;
ld4 r2=[r2]
;;
+   movl r8=_TIF_SYSCALL_TRACEAUDIT
+   ;;  // added stop bits to prevent r8 dependency
+   and r2=r8,r2
mov r8=0
-   and r2=_TIF_SYSCALL_TRACEAUDIT,r2
-   ;;
+   ;;  // added stop bits to prevent r2 dependency
cmp.ne p6,p0=r2,r0
 (p6)   br.cond.spnt .strace_check_retval
;;  // added stop bits to prevent r8 dependency

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
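
Editorial illustration, not part of the original patch: the ia64 hunk adds _TIF_KERNEL_TRACE both to TIF_ALLWORK_MASK and to the set stripped out of it to form TIF_WORK_MASK. The sketch below checks, in plain userspace C, that this leaves TIF_WORK_MASK unchanged while TIF_ALLWORK_MASK gains exactly one bit. Bit numbers 5 to 7 are taken from the hunk; bit numbers 0 to 3 do not appear in it and are assumed here, as are the function names `allwork_mask` and `work_mask`.

```c
/*
 * Bit numbers. 5, 6 and 7 appear in the hunk; 0 to 3 are ASSUMED
 * values used only to make the sketch self-contained.
 */
enum {
	TIF_SIGPENDING      = 0,	/* assumed */
	TIF_NEED_RESCHED    = 1,	/* assumed */
	TIF_SYSCALL_TRACE   = 2,	/* assumed */
	TIF_SYSCALL_AUDIT   = 3,	/* assumed */
	TIF_RESTORE_SIGMASK = 5,
	TIF_PERFMON_WORK    = 6,
	TIF_KERNEL_TRACE    = 7,
};

#define BIT(n) (1u << (n))

/* TIF_ALLWORK_MASK before (0) or after (1) the patch. */
unsigned int allwork_mask(int with_kernel_trace)
{
	unsigned int mask = BIT(TIF_SIGPENDING) | BIT(TIF_PERFMON_WORK) |
			    BIT(TIF_SYSCALL_AUDIT) | BIT(TIF_NEED_RESCHED) |
			    BIT(TIF_SYSCALL_TRACE) | BIT(TIF_RESTORE_SIGMASK);

	if (with_kernel_trace)
		mask |= BIT(TIF_KERNEL_TRACE);
	return mask;
}

/* TIF_WORK_MASK: the all-work mask minus the syscall tracing bits. */
unsigned int work_mask(int with_kernel_trace)
{
	unsigned int strip = BIT(TIF_SYSCALL_TRACE) | BIT(TIF_SYSCALL_AUDIT);

	if (with_kernel_trace)
		strip |= BIT(TIF_KERNEL_TRACE);
	return allwork_mask(with_kernel_trace) & ~strip;
}
```

Under these assumptions `work_mask(0)` and `work_mask(1)` are equal, so only the paths that test the full all-work mask would react to the new flag.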

[patch-RFC 05/26] LTTng Kernel Trace Thread Flag Cris

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-cris/thread_info.h |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/include/asm-cris/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-cris/thread_info.h 2007-11-15 21:18:41.0 -0500
+++ linux-2.6-lttng/include/asm-cris/thread_info.h  2007-11-15 21:32:07.0 -0500
@@ -83,6 +83,7 @@ struct thread_info {
 #define TIF_NOTIFY_RESUME  1   /* resumption notification requested */
 #define TIF_SIGPENDING 2   /* signal pending */
 #define TIF_NEED_RESCHED   3   /* rescheduling necessary */
+#define TIF_KERNEL_TRACE   4   /* kernel trace active */
 #define TIF_RESTORE_SIGMASK 9   /* restore signal mask in do_signal() */
 #define TIF_POLLING_NRFLAG 16  /* true if poll_idle() is polling TIF_NEED_RESCHED */
 #define TIF_MEMDIE 17
@@ -91,11 +92,15 @@ struct thread_info {
 #define _TIF_NOTIFY_RESUME (1

[patch-RFC 06/26] LTTng Kernel Trace Thread Flag Frv

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-frv/thread_info.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-frv/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-frv/thread_info.h  2007-09-04 11:52:56.0 -0400
+++ linux-2.6-lttng/include/asm-frv/thread_info.h   2007-09-04 12:19:57.0 -0400
@@ -112,6 +112,7 @@ register struct thread_info *__current_t
 #define TIF_SINGLESTEP 3   /* restore singlestep on return to user mode */
 #define TIF_IRET   4   /* return with iret */
 #define TIF_RESTORE_SIGMASK 5   /* restore signal mask in do_signal() */
+#define TIF_KERNEL_TRACE   6   /* kernel trace active */
 #define TIF_POLLING_NRFLAG 16  /* true if poll_idle() is polling TIF_NEED_RESCHED */
 #define TIF_MEMDIE 17  /* OOM killer killed process */
 #define TIF_FREEZE 18  /* freezing for suspend */
@@ -122,10 +123,11 @@ register struct thread_info *__current_t
 #define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
 #define _TIF_IRET  (1 << TIF_IRET)
 #define _TIF_RESTORE_SIGMASK   (1 << TIF_RESTORE_SIGMASK)
+#define _TIF_KERNEL_TRACE  (1 << TIF_KERNEL_TRACE)
 #define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG)
 #define _TIF_FREEZE (1 << TIF_FREEZE)
 
-#define _TIF_WORK_MASK 0xFFFE  /* work to do on interrupt/exception return */
+#define _TIF_WORK_MASK 0xFFBE  /* work to do on interrupt/exception return */
 #define _TIF_ALLWORK_MASK  0x  /* work to do on any return to u-space */
 
 /*

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
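
Editorial illustration, not part of the original patch: the Frv hunk changes _TIF_WORK_MASK from 0xFFFE to 0xFFBE (values as they appear in this archive copy, which may have lost leading zeros). The two constants differ in exactly bit 6, the number the hunk assigns to TIF_KERNEL_TRACE, so the new flag is excluded from the interrupt/exception return work. The helper name `mask_drop_bit` is hypothetical.

```c
#include <stdint.h>

#define TIF_KERNEL_TRACE 6	/* bit number assigned by the Frv hunk */

/* Return mask with one bit number cleared. */
static inline uint32_t mask_drop_bit(uint32_t mask, unsigned int bit)
{
	return mask & ~(UINT32_C(1) << bit);
}
```

Calling `mask_drop_bit(0xFFFE, TIF_KERNEL_TRACE)` reproduces the new constant, which is a quick way to sanity-check a hand-edited hexadecimal mask against the bit number it is meant to exclude.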

[patch-RFC 04/26] LTTng Kernel Trace Thread Flag Blackfin

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-blackfin/thread_info.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-blackfin/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-blackfin/thread_info.h 2007-07-30 18:49:43.0 -0400
+++ linux-2.6-lttng/include/asm-blackfin/thread_info.h  2007-07-30 19:01:13.0 -0400
@@ -125,6 +125,7 @@ static inline struct thread_info *curren
 #define TIF_MEMDIE  4
 #define TIF_RESTORE_SIGMASK 5   /* restore signal mask in do_signal() */
 #define TIF_FREEZE  6   /* is freezing for suspend */
+#define TIF_KERNEL_TRACE   7   /* kernel trace active */
 
 /* as above, but as bit values */
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 00/26] LTTng Kernel Trace Thread Flag

2007-12-05 Thread Mathieu Desnoyers

Hi,

This is an RFC for the addition of a new thread flag, TIF_KERNEL_TRACE, to each
architecture, to activate system-wide system call tracing.

This is needed by the LTTng architecture-dependent instrumentation.

It applies on 2.6.24-rc4-git3.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
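
Editorial illustration, not part of the original series: a minimal userspace sketch of the thread flag convention the patches rely on. It assumes a single 32-bit flags word; the bit numbers, the `struct thread_flags` stand-in and the helpers `flag_set`, `flag_clear`, `flag_test` and `syscall_needs_trace` are hypothetical names chosen for illustration. It shows how a TIF_* bit number relates to its _TIF_* mask, and how an entry path could test the per-task and the system-wide tracing flags in one operation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit numbers; each architecture in the series picks
 * its own free bit. */
#define TIF_SYSCALL_TRACE  0
#define TIF_KERNEL_TRACE   1

/* Mask forms, following the kernel's _TIF_* naming convention. */
#define _TIF_SYSCALL_TRACE (1u << TIF_SYSCALL_TRACE)
#define _TIF_KERNEL_TRACE  (1u << TIF_KERNEL_TRACE)

/* Stand-in for the flags word of struct thread_info. */
struct thread_flags {
	uint32_t flags;
};

static inline void flag_set(struct thread_flags *ti, int bit)
{
	ti->flags |= 1u << bit;
}

static inline void flag_clear(struct thread_flags *ti, int bit)
{
	ti->flags &= ~(1u << bit);
}

static inline bool flag_test(const struct thread_flags *ti, int bit)
{
	return (ti->flags >> bit) & 1u;
}

/* Take the traced path if either the per-task ptrace flag or the
 * system-wide flag is set; one AND covers both. */
static inline bool syscall_needs_trace(const struct thread_flags *ti)
{
	return (ti->flags & (_TIF_SYSCALL_TRACE | _TIF_KERNEL_TRACE)) != 0;
}
```

Keeping the new flag in the same word as TIF_SYSCALL_TRACE is what lets an architecture's entry code test both with a single mask, which is the pattern the ia64 patch in this series extends.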

[patch-RFC 02/26] LTTng Kernel Trace Thread Flag ARM

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-arm/thread_info.h |3 +++
 1 file changed, 3 insertions(+)

Index: linux-2.6-lttng/include/asm-arm/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-arm/thread_info.h  2007-09-04 11:50:11.0 -0400
+++ linux-2.6-lttng/include/asm-arm/thread_info.h   2007-09-04 12:19:51.0 -0400
@@ -134,6 +134,7 @@ extern void iwmmxt_task_switch(struct th
 /*
  * thread information flags:
  *  TIF_SYSCALL_TRACE  - syscall trace active
+ *  TIF_KERNEL_TRACE   - kernel trace active
  *  TIF_SIGPENDING - signal pending
  *  TIF_NEED_RESCHED   - rescheduling necessary
  *  TIF_USEDFPU - FPU was used by this task this quantum (SMP)
@@ -141,6 +142,7 @@ extern void iwmmxt_task_switch(struct th
  */
 #define TIF_SIGPENDING 0
 #define TIF_NEED_RESCHED   1
+#define TIF_KERNEL_TRACE   7
 #define TIF_SYSCALL_TRACE  8
 #define TIF_POLLING_NRFLAG 16
 #define TIF_USING_IWMMXT   17
@@ -149,6 +151,7 @@ extern void iwmmxt_task_switch(struct th
 
 #define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED  (1 << TIF_NEED_RESCHED)
+#define _TIF_KERNEL_TRACE  (1 << TIF_KERNEL_TRACE)
 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
 #define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG)
 #define _TIF_USING_IWMMXT  (1 << TIF_USING_IWMMXT)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-RFC 10/26] LTTng Kernel Trace Thread Flag m32r

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-m32r/thread_info.h |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/include/asm-m32r/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-m32r/thread_info.h 2007-11-28 08:40:56.0 -0500
+++ linux-2.6-lttng/include/asm-m32r/thread_info.h  2007-11-28 08:47:42.0 -0500
@@ -149,6 +149,7 @@ static inline unsigned int get_thread_fa
 #define TIF_NEED_RESCHED   2   /* rescheduling necessary */
 #define TIF_SINGLESTEP 3   /* restore singlestep on return to user mode */
 #define TIF_IRET   4   /* return with iret */
+#define TIF_KERNEL_TRACE   5   /* kernel trace active */
 #define TIF_RESTORE_SIGMASK 8   /* restore signal mask in do_signal() */
 #define TIF_USEDFPU 16  /* FPU was used by this task this quantum (SMP) */
 #define TIF_POLLING_NRFLAG 17  /* true if poll_idle() is polling TIF_NEED_RESCHED */
@@ -160,13 +161,17 @@ static inline unsigned int get_thread_fa
 #define _TIF_NEED_RESCHED  (1

[patch-RFC 03/26] LTTng Kernel Trace Thread Flag AVR32

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-avr32/thread_info.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-avr32/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-avr32/thread_info.h 2007-07-30 18:49:43.0 -0400
+++ linux-2.6-lttng/include/asm-avr32/thread_info.h 2007-07-30 18:59:15.0 -0400
@@ -83,6 +83,7 @@ static inline struct thread_info *curren
 #define TIF_MEMDIE 6
 #define TIF_RESTORE_SIGMASK 7   /* restore signal mask in do_signal */
 #define TIF_CPU_GOING_TO_SLEEP 8   /* CPU is entering sleep 0 mode */
+#define TIF_KERNEL_TRACE 9   /* kernel trace active */
 #define TIF_USERSPACE  31  /* true if FS sets userspace */
 
 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
@@ -94,12 +95,13 @@ static inline struct thread_info *curren
 #define _TIF_MEMDIE (1 << TIF_MEMDIE)
 #define _TIF_RESTORE_SIGMASK   (1 << TIF_RESTORE_SIGMASK)
 #define _TIF_CPU_GOING_TO_SLEEP (1 << TIF_CPU_GOING_TO_SLEEP)
+#define _TIF_KERNEL_TRACE  (1 << TIF_KERNEL_TRACE)
 
 /* XXX: These two masks must never span more than 16 bits! */
 /* work to do on interrupt/exception return */
 #define _TIF_WORK_MASK 0x013e
 /* work to do on any return to userspace */
-#define _TIF_ALLWORK_MASK  0x013f
+#define _TIF_ALLWORK_MASK  0x033f
 /* work to do on return from debug mode */
 #define _TIF_DBGWORK_MASK  0x017e
 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
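
Editorial illustration, not part of the original patch: the AVR32 hunk raises _TIF_ALLWORK_MASK from 0x013f to 0x033f, which corresponds to setting bit 9 (TIF_KERNEL_TRACE), and the header warns that the work masks must never span more than 16 bits. The sketch below checks both properties in userspace C. The helper names `mask_add_bit` and `fits_in_16_bits` are hypothetical, and the constants are used as they appear in this archive copy.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIF_KERNEL_TRACE 9	/* bit number added by the AVR32 hunk */

/* Return mask with one bit number set. */
static inline uint32_t mask_add_bit(uint32_t mask, unsigned int bit)
{
	return mask | (UINT32_C(1) << bit);
}

/* True if no bit above bit 15 is set, i.e. the mask respects the
 * "never span more than 16 bits" comment in the header. */
static inline bool fits_in_16_bits(uint32_t mask)
{
	return (mask & ~UINT32_C(0xFFFF)) == 0;
}
```

Bit 9 still satisfies the 16-bit constraint; a flag placed at bit 16 or above could not be added to these masks without violating it.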

[patch-RFC 26/26] LTTng Kernel Trace Thread Flag API

2007-12-05 Thread Mathieu Desnoyers

Add an API to set/clear the kernel-wide tracing thread flags, implemented in
kernel/sched.c. It updates the thread flags *asynchronously* while holding the
tasklist lock.

Upon fork, the flag must be re-copied while the tasklist lock is held.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/linux/sched.h |3 ++
 kernel/fork.c |9 
 kernel/sched.c|   55 ++
 3 files changed, 67 insertions(+)

Index: linux-2.6-lttng/include/linux/sched.h
===
--- linux-2.6-lttng.orig/include/linux/sched.h  2007-12-05 20:50:29.0 -0500
+++ linux-2.6-lttng/include/linux/sched.h   2007-12-05 20:54:32.0 -0500
@@ -2000,6 +2000,9 @@ static inline void migration_init(void)
 }
 #endif
 
+extern void clear_kernel_trace_flag_all_tasks(void);
+extern void set_kernel_trace_flag_all_tasks(void);
+
 #endif /* __KERNEL__ */
 
 #endif
Index: linux-2.6-lttng/kernel/fork.c
===
--- linux-2.6-lttng.orig/kernel/fork.c  2007-12-05 20:54:00.0 -0500
+++ linux-2.6-lttng/kernel/fork.c   2007-12-05 20:54:32.0 -0500
@@ -1241,6 +1241,15 @@ static struct task_struct *copy_process(
!cpu_online(task_cpu(p
set_task_cpu(p, smp_processor_id());
 
+   /*
+* The state of the parent's TIF_KERNEL_TRACE flag may have changed
+* since it was copied in dup_task_struct() so we re-copy it here.
+*/
+   if (test_thread_flag(TIF_KERNEL_TRACE))
+   set_tsk_thread_flag(p, TIF_KERNEL_TRACE);
+   else
+   clear_tsk_thread_flag(p, TIF_KERNEL_TRACE);
+
/* CLONE_PARENT re-uses the old parent */
if (clone_flags & (CLONE_PARENT|CLONE_THREAD))
p->real_parent = current->real_parent;
Index: linux-2.6-lttng/kernel/sched.c
===
--- linux-2.6-lttng.orig/kernel/sched.c 2007-12-05 20:54:00.0 -0500
+++ linux-2.6-lttng/kernel/sched.c  2007-12-05 20:54:32.0 -0500
@@ -7394,3 +7394,58 @@ struct cgroup_subsys cpuacct_subsys = {
.subsys_id = cpuacct_subsys_id,
 };
 #endif /* CONFIG_CGROUP_CPUACCT */
+
+static DEFINE_MUTEX(kernel_trace_mutex);
+static int kernel_trace_refcount;
+
+/**
+ * clear_kernel_trace_flag_all_tasks - clears all TIF_KERNEL_TRACE thread flags.
+ *
+ * This function iterates on all threads in the system to clear their
+ * TIF_KERNEL_TRACE flag. Setting the TIF_KERNEL_TRACE flag with the
+ * tasklist_lock held in copy_process() makes sure that once we finish clearing
+ * the thread flags, all threads have their flags cleared.
+ */
+void clear_kernel_trace_flag_all_tasks(void)
+{
+   struct task_struct *p;
+   struct task_struct *t;
+
+   mutex_lock(&kernel_trace_mutex);
+   if (--kernel_trace_refcount)
+   goto end;
+   read_lock(&tasklist_lock);
+   do_each_thread(p, t) {
+   clear_tsk_thread_flag(t, TIF_KERNEL_TRACE);
+   } while_each_thread(p, t);
+   read_unlock(&tasklist_lock);
+end:
+   mutex_unlock(&kernel_trace_mutex);
+}
+EXPORT_SYMBOL_GPL(clear_kernel_trace_flag_all_tasks);
+
+/**
+ * set_kernel_trace_flag_all_tasks - sets all TIF_KERNEL_TRACE thread flags.
+ *
+ * This function iterates on all threads in the system to set their
+ * TIF_KERNEL_TRACE flag. Setting the TIF_KERNEL_TRACE flag with the
+ * tasklist_lock held in copy_process() makes sure that once we finish setting
+ * the thread flags, all threads have their flags set.
+ */
+void set_kernel_trace_flag_all_tasks(void)
+{
+   struct task_struct *p;
+   struct task_struct *t;
+
+   mutex_lock(&kernel_trace_mutex);
+   if (kernel_trace_refcount++)
+   goto end;
+   read_lock(&tasklist_lock);
+   do_each_thread(p, t) {
+   set_tsk_thread_flag(t, TIF_KERNEL_TRACE);
+   } while_each_thread(p, t);
+   read_unlock(&tasklist_lock);
+end:
+   mutex_unlock(&kernel_trace_mutex);
+}
+EXPORT_SYMBOL_GPL(set_kernel_trace_flag_all_tasks);

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
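
Editorial illustration, not part of the original patch: the reference counting in patch 26/26 can be modelled in plain C. This is a single-threaded sketch under stated assumptions. The task table and the names `trace_get`, `trace_put`, `task_fork` and `set_flag_all_tasks` are hypothetical, and the kernel_trace_mutex and tasklist_lock taken by the real code appear only as comments. It shows that only the first user sets the flag on every task, only the last user clears it, and a forked child inherits the parent's current state.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8

/* Hypothetical stand-in for a task and its TIF_KERNEL_TRACE bit. */
struct task {
	bool in_use;
	bool kernel_trace;
};

static struct task tasks[MAX_TASKS];
static int kernel_trace_refcount;

/* Walk every live task, as do_each_thread()/while_each_thread()
 * does in the patch (there, under read_lock(&tasklist_lock)). */
static void set_flag_all_tasks(bool value)
{
	for (size_t i = 0; i < MAX_TASKS; i++)
		if (tasks[i].in_use)
			tasks[i].kernel_trace = value;
}

/* Models set_kernel_trace_flag_all_tasks(): only the 0 -> 1
 * transition touches the tasks. Kernel: under kernel_trace_mutex. */
void trace_get(void)
{
	if (kernel_trace_refcount++ == 0)
		set_flag_all_tasks(true);
}

/* Models clear_kernel_trace_flag_all_tasks(): only the 1 -> 0
 * transition touches the tasks. Kernel: under kernel_trace_mutex. */
void trace_put(void)
{
	if (--kernel_trace_refcount == 0)
		set_flag_all_tasks(false);
}

/* Models the copy_process() hunk: the child re-copies the parent's
 * current flag. Returns the child's slot, or -1 if the table is full. */
int task_fork(int parent)
{
	for (int i = 0; i < MAX_TASKS; i++) {
		if (!tasks[i].in_use) {
			tasks[i].in_use = true;
			tasks[i].kernel_trace = tasks[parent].kernel_trace;
			return i;
		}
	}
	return -1;
}
```

The model leaves out the concurrency the real patch has to handle; its only purpose is to make the get/put pairing and the fork-time copy easy to follow.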

[patch-RFC 01/26] LTTng Kernel Trace Thread Flag Alpha

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-alpha/thread_info.h |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/include/asm-alpha/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-alpha/thread_info.h 2007-11-13 09:25:26.0 -0500
+++ linux-2.6-lttng/include/asm-alpha/thread_info.h 2007-11-13 09:49:39.0 -0500
@@ -76,19 +76,21 @@ register struct thread_info *__current_t
 #define TIF_UAC_SIGBUS 7
 #define TIF_MEMDIE 8
 #define TIF_RESTORE_SIGMASK 9   /* restore signal mask in do_signal */
+#define TIF_KERNEL_TRACE   10  /* Kernel tracing of syscalls */
 
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 25/26] LTTng Kernel Trace Thread Flag xtensa

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-xtensa/thread_info.h |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-xtensa/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-xtensa/thread_info.h   2007-09-04 11:50:18.0 -0400
+++ linux-2.6-lttng/include/asm-xtensa/thread_info.h 2007-09-04 12:20:19.0 -0400
@@ -116,6 +116,7 @@ static inline struct thread_info *curren
 #define TIF_IRET   4   /* return with iret */
 #define TIF_MEMDIE 5
 #define TIF_RESTORE_SIGMASK 6   /* restore signal mask in do_signal() */
+#define TIF_KERNEL_TRACE   7   /* kernel trace active */
 #define TIF_POLLING_NRFLAG 16  /* true if poll_idle() is polling TIF_NEED_RESCHED */
 
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 21/26] LTTng Kernel Trace Thread Flag UML

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-um/thread_info.h |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/include/asm-um/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-um/thread_info.h   2007-07-30 18:26:11.0 -0400
+++ linux-2.6-lttng/include/asm-um/thread_info.h 2007-07-30 18:26:39.0 -0400
@@ -82,6 +82,7 @@ static inline struct thread_info *curren
 #define TIF_MEMDIE 5
 #define TIF_SYSCALL_AUDIT  6
 #define TIF_RESTORE_SIGMASK 7
+#define TIF_KERNEL_TRACE   8   /* kernel trace active */
 
 #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
 #define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
@@ -90,5 +91,6 @@ static inline struct thread_info *curren
 #define _TIF_MEMDIE (1 << TIF_MEMDIE)
 #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
 #define _TIF_RESTORE_SIGMASK   (1 << TIF_RESTORE_SIGMASK)
+#define _TIF_KERNEL_TRACE  (1 << TIF_KERNEL_TRACE)
 
 #endif

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-RFC 22/26] LTTng Kernel Trace Thread Flag v850

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-v850/thread_info.h |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6-lttng/include/asm-v850/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-v850/thread_info.h 2007-07-30 18:26:48.0 -0400
+++ linux-2.6-lttng/include/asm-v850/thread_info.h  2007-07-30 18:27:19.0 -0400
@@ -82,12 +82,14 @@ struct thread_info {
 #define TIF_POLLING_NRFLAG 3   /* true if poll_idle() is polling
   TIF_NEED_RESCHED */
 #define TIF_MEMDIE 4
+#define TIF_KERNEL_TRACE   5   /* kernel trace active */
 
 /* as above, but as bit values */
 #define _TIF_SYSCALL_TRACE (1

[patch-RFC 20/26] LTTng Kernel Trace Thread Flag sparc64

2007-12-05 Thread Mathieu Desnoyers

Add a thread flag to activate system-wide syscall tracing.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/asm-sparc64/thread_info.h |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/include/asm-sparc64/thread_info.h
===
--- linux-2.6-lttng.orig/include/asm-sparc64/thread_info.h  2007-07-30 18:24:15.0 -0400
+++ linux-2.6-lttng/include/asm-sparc64/thread_info.h   2007-07-30 18:25:43.0 -0400
@@ -225,7 +225,7 @@ register struct thread_info *current_thr
 #define TIF_UNALIGNED  5   /* allowed to do unaligned accesses */
 #define TIF_NEWSIGNALS 6   /* wants new-style signals */
 #define TIF_32BIT  7   /* 32-bit binary */
-/* flag bit 8 is available */
+#define TIF_KERNEL_TRACE   8   /* kernel trace active */
 #define TIF_SECCOMP 9   /* secure computing */
 #define TIF_SYSCALL_AUDIT  10  /* syscall auditing active */
 /* flag bit 11 is available */
@@ -244,6 +244,7 @@ register struct thread_info *current_thr
 #define _TIF_UNALIGNED (1

Re: laptop reboots right after hibernation

2007-12-05 Thread Tejun Heo

Thanks.  Almost there.  Can you please try the attached two patches and
report the boot log?

-- 
tejun
Index: work/include/linux/libata.h
===
--- work.orig/include/linux/libata.h
+++ work/include/linux/libata.h
@@ -1013,18 +1013,18 @@ extern void ata_do_eh(struct ata_port *a
  * printk helpers
  */
 #define ata_port_printk(ap, lv, fmt, args...) \
-	printk(lv"ata%u: "fmt, (ap)->print_id , ##args)
+	printk("%sata%u: "fmt, lv, (ap)->print_id , ##args)
 
 #define ata_link_printk(link, lv, fmt, args...) do { \
 	if ((link)->ap->nr_pmp_links) \
-		printk(lv"ata%u.%02u: "fmt, (link)->ap->print_id, \
+		printk("%sata%u.%02u: "fmt, lv, (link)->ap->print_id,	\
 		   (link)->pmp , ##args); \
 	else \
-		printk(lv"ata%u: "fmt, (link)->ap->print_id , ##args); \
+		printk("%sata%u: "fmt, lv, (link)->ap->print_id , ##args); \
 	} while(0)
 
 #define ata_dev_printk(dev, lv, fmt, args...) \
-	printk(lv"ata%u.%02u: "fmt, (dev)->link->ap->print_id, \
+	printk("%sata%u.%02u: "fmt, lv, (dev)->link->ap->print_id,	\
 	   (dev)->link->pmp + (dev)->devno , ##args)
 
 /*
Index: work/include/linux/ata.h
===
--- work.orig/include/linux/ata.h
+++ work/include/linux/ata.h
@@ -190,6 +190,8 @@ enum {
 	ATA_CMD_READ_LOG_EXT	= 0x2f,
 	ATA_CMD_PMP_READ	= 0xE4,
 	ATA_CMD_PMP_WRITE	= 0xE8,
+	ATA_CMD_CONF_OVERLAY	= 0xB1,
+	ATA_CMD_SEC_FREEZE_LOCK	= 0xF5,
 
 	/* READ_LOG_EXT pages */
 	ATA_LOG_SATA_NCQ	= 0x10,
@@ -239,6 +241,19 @@ enum {
 	SATA_AN			= 0x05,  /* Asynchronous Notification */
 	SATA_DIPM		= 0x03,  /* Device Initiated Power Management */
 
+	/* feature values for SET_MAX */
+	ATA_SET_MAX_ADDR	= 0x00,
+	ATA_SET_MAX_PASSWD	= 0x01,
+	ATA_SET_MAX_LOCK	= 0x02,
+	ATA_SET_MAX_UNLOCK	= 0x03,
+	ATA_SET_MAX_FREEZE_LOCK	= 0x04,
+
+	/* feature values for DEVICE CONFIGURATION OVERLAY */
+	ATA_DCO_RESTORE		= 0xC0,
+	ATA_DCO_FREEZE_LOCK	= 0xC1,
+	ATA_DCO_IDENTIFY	= 0xC2,
+	ATA_DCO_SET		= 0xC3,
+
 	/* ATAPI stuff */
 	ATAPI_PKT_DMA		= (1 << 0),
 	ATAPI_DMADIR		= (1 << 2),	/* ATAPI data dir:
Index: work/drivers/ata/libata-acpi.c
===
--- work.orig/drivers/ata/libata-acpi.c
+++ work/drivers/ata/libata-acpi.c
@@ -311,8 +311,8 @@ EXPORT_SYMBOL_GPL(ata_acpi_stm);
  * EH context.
  *
  * RETURNS:
- * Number of taskfiles on success, 0 if _GTF doesn't exist or doesn't
- * contain valid data.
+ * Number of taskfiles on success, 0 if _GTF doesn't exist.  -EINVAL
+ * if _GTF is invalid.
  */
 static int ata_dev_get_GTF(struct ata_device *dev, struct ata_acpi_gtf **gtf,
 			   void **ptr_to_free)
@@ -339,6 +339,7 @@ static int ata_dev_get_GTF(struct ata_de
 			ata_dev_printk(dev, KERN_WARNING,
    "_GTF evaluation failed (AE 0x%x)\n",
    status);
+			rc = -EINVAL;
 		}
 		goto out_free;
 	}
@@ -350,6 +351,7 @@ static int ata_dev_get_GTF(struct ata_de
 __FUNCTION__,
 (unsigned long long)output.length,
 output.pointer);
+		rc = -EINVAL;
 		goto out_free;
 	}
 
@@ -358,6 +360,7 @@ static int ata_dev_get_GTF(struct ata_de
 		ata_dev_printk(dev, KERN_WARNING,
 			   "_GTF unexpected object type 0x%x\n",
 			   out_obj->type);
+		rc = -EINVAL;
 		goto out_free;
 	}
 
@@ -365,6 +368,7 @@ static int ata_dev_get_GTF(struct ata_de
 		ata_dev_printk(dev, KERN_WARNING,
 			   "unexpected _GTF length (%d)\n",
 			   out_obj->buffer.length);
+		rc = -EINVAL;
 		goto out_free;
 	}
 
@@ -397,7 +401,7 @@ int ata_acpi_cbl_80wire(struct ata_port 
 	int valid = 0;
 
 	/* No _GTM data, no information */
-	if (ata_acpi_gtm(ap, &gtm) < 0)
+	if (!ap->acpi_handle || ata_acpi_gtm(ap, &gtm) < 0)
 		return 0;
 
 	/* Split timing, DMA enabled */
@@ -422,7 +426,7 @@ int ata_acpi_cbl_80wire(struct ata_port 
 EXPORT_SYMBOL_GPL(ata_acpi_cbl_80wire);
 
 /**
- * taskfile_load_raw - send taskfile registers to host controller
+ * ata_acpi_run_tf - send taskfile registers to host controller
  * @dev: target ATA device
  * @gtf: raw ATA taskfile register set (0x1f1 - 0x1f7)
  *
@@ -441,14 +445,17 @@ EXPORT_SYMBOL_GPL(ata_acpi_cbl_80wire);
  * EH context.
  *
  * RETURNS:
- * 0 on success, -errno on failure.
+ * 1 if command is executed successfully.  0 if ignored or rejected,
+ * -errno on other errors.
  */
-static int taskfile_load_raw(struct ata_device *dev,
-			  const struct ata_acpi_gtf *gtf)
+static int ata_acpi_run_tf(struct ata_device *dev,
+			   const struct ata_acpi_gtf *gtf)
 {
-	struct ata_port *ap = dev->link->ap;
 	struct ata_taskfile tf, rtf;
 	unsigned int err_mask;
+	const char *level;
+	char msg[60];
+	int rc;
 
 	if ((gtf->tf[0] == 0) && (gtf->tf[1] == 0) && (gtf->tf[2] == 0)
 	&& (gtf->tf[3] == 0) && (gtf->tf[4] == 0) && (gtf->tf[5] == 0)
@@ -468,29 +475,45 @@ static int taskfile_load_raw(struct ata_
 	tf.device  = gtf->tf[5];	/* 0x1f6 */
 	tf.command = gtf->tf[6];	/* 0x1f7 */
 
-	if (ata_msg_probe(ap))
-		ata_dev_printk(dev, KERN_
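
Editorial illustration, not part of the original patch: the libata.h hunk above changes the printk helpers from pasting the log level onto the format string (`lv"ata%u: "fmt`) to passing it through a `%s` conversion. The first form only compiles when the level is a string literal; the second also accepts a run-time `const char *`, which is presumably why the same patch adds a `const char *level` variable to ata_acpi_run_tf(). The userspace sketch below uses snprintf in place of printk. The macro names `port_fmt_old` and `port_fmt_new`, the function `report`, and the level strings "<4>" and "<7>" are assumptions made for the example, and `##__VA_ARGS__` is a GNU C extension, as is the `args...` syntax in the patch.

```c
#include <stdio.h>

/* Assumed level prefixes for illustration. */
#define KERN_WARNING "<4>"
#define KERN_DEBUG   "<7>"

/* Old form: lv is pasted by string-literal concatenation, so it
 * must itself be a string literal. */
#define port_fmt_old(buf, len, id, lv, fmt, ...) \
	snprintf(buf, len, lv "ata%u: " fmt, id, ##__VA_ARGS__)

/* New form: lv is an ordinary %s argument, so any const char *
 * expression works. */
#define port_fmt_new(buf, len, id, lv, fmt, ...) \
	snprintf(buf, len, "%sata%u: " fmt, lv, id, ##__VA_ARGS__)

/* Choose the level at run time; this only compiles with the new form. */
static int report(char *buf, size_t len, unsigned int id, int failed)
{
	const char *level = failed ? KERN_WARNING : KERN_DEBUG;

	return port_fmt_new(buf, len, id, level, "command %s\n",
			    failed ? "failed" : "ok");
}
```

Passing `level` to `port_fmt_old` instead would be a compile error, since a variable cannot take part in string-literal concatenation.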

[PATCH 3/3] printer port driver: semaphore to mutex

2007-12-05 Thread Daniel Walker

The port_mutex is actually a semaphore, so easily converted to
a struct mutex.

Signed-off-by: Daniel Walker <[EMAIL PROTECTED]>

---
 drivers/char/lp.c  |   11 ++-
 include/linux/lp.h |2 +-
 2 files changed, 7 insertions(+), 6 deletions(-)

Index: linux-2.6.23/drivers/char/lp.c
===
--- linux-2.6.23.orig/drivers/char/lp.c
+++ linux-2.6.23/drivers/char/lp.c
@@ -126,6 +126,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #undef LP_STATS
@@ -312,7 +313,7 @@ static ssize_t lp_write(struct file * fi
if (copy_size > LP_BUFFER_SIZE)
copy_size = LP_BUFFER_SIZE;
 
-   if (down_interruptible (&lp_table[minor].port_mutex))
+   if (mutex_lock_interruptible(&lp_table[minor].port_mutex))
return -EINTR;
 
if (copy_from_user (kbuf, buf, copy_size)) {
@@ -399,7 +400,7 @@ static ssize_t lp_write(struct file * fi
lp_release_parport (&lp_table[minor]);
}
 out_unlock:
-   up (&lp_table[minor].port_mutex);
+   mutex_unlock(&lp_table[minor].port_mutex);
 
return retv;
 }
@@ -421,7 +422,7 @@ static ssize_t lp_read(struct file * fil
if (count > LP_BUFFER_SIZE)
count = LP_BUFFER_SIZE;
 
-   if (down_interruptible (&lp_table[minor].port_mutex))
+   if (mutex_lock_interruptible(&lp_table[minor].port_mutex))
return -EINTR;
 
lp_claim_parport_or_block (&lp_table[minor]);
@@ -479,7 +480,7 @@ static ssize_t lp_read(struct file * fil
if (retval > 0 && copy_to_user (buf, kbuf, retval))
retval = -EFAULT;
 
-   up (&lp_table[minor].port_mutex);
+   mutex_unlock(&lp_table[minor].port_mutex);
 
return retval;
 }
@@ -888,7 +889,7 @@ static int __init lp_init (void)
lp_table[i].last_error = 0;
init_waitqueue_head (&lp_table[i].waitq);
init_waitqueue_head (&lp_table[i].dataq);
-   init_MUTEX (&lp_table[i].port_mutex);
+   mutex_init(&lp_table[i].port_mutex);
lp_table[i].timeout = 10 * HZ;
}
 
Index: linux-2.6.23/include/linux/lp.h
===
--- linux-2.6.23.orig/include/linux/lp.h
+++ linux-2.6.23/include/linux/lp.h
@@ -145,7 +145,7 @@ struct lp_struct {
 #endif
wait_queue_head_t waitq;
unsigned int last_error;
-   struct semaphore port_mutex;
+   struct mutex port_mutex;
wait_queue_head_t dataq;
long timeout;
unsigned int best_mode;


[PATCH 1/3] stopmachine semaphore to mutex

2007-12-05 Thread Daniel Walker

It's called stopmachine_mutex now, but it's a semaphore. So make it
a "struct mutex".

Signed-off-by: Daniel Walker <[EMAIL PROTECTED]>

---
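One reason the distinction is worth making: a semaphore has no owner, so an
unmatched "up" is silently accepted, while a mutex is expected to be released
only by the task that took it. A rough userspace illustration with POSIX
primitives follows; it is an analogue only, the function names are
hypothetical, and the error-checking mutex type is used here to make the
ownership rule visible.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Returns the count after an unmatched post on a semaphore that was
 * initialised to 1 (the state a semaphore-as-mutex starts in). */
static int unmatched_up_on_semaphore(void)
{
	sem_t sem;
	int value = -1;

	sem_init(&sem, 0, 1);
	sem_post(&sem);			/* no matching wait: count grows */
	sem_getvalue(&sem, &value);
	sem_destroy(&sem);
	return value;
}

/* Returns the error from unlocking an error-checking mutex that the
 * caller never locked. */
static int unmatched_unlock_on_mutex(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t lock;
	int err;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&lock, &attr);
	err = pthread_mutex_unlock(&lock);	/* rejected: caller is not the owner */
	pthread_mutex_destroy(&lock);
	pthread_mutexattr_destroy(&attr);
	return err;
}
```

The first helper leaves the semaphore at 2, so two tasks could then enter the
"critical section" at once; the second gets an error back instead.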
 kernel/stop_machine.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: linux-2.6.23/kernel/stop_machine.c
===
--- linux-2.6.23.orig/kernel/stop_machine.c
+++ linux-2.6.23/kernel/stop_machine.c
@@ -29,7 +29,7 @@ enum stopmachine_state {
 static enum stopmachine_state stopmachine_state;
 static unsigned int stopmachine_num_threads;
 static atomic_t stopmachine_thread_ack;
-static DECLARE_MUTEX(stopmachine_mutex);
+static DEFINE_MUTEX(stopmachine_mutex);
 
 static int stopmachine(void *cpu)
 {
@@ -177,7 +177,7 @@ struct task_struct *__stop_machine_run(i
smdata.data = data;
init_completion(&smdata.done);
 
-   down(&stopmachine_mutex);
+   mutex_lock(&stopmachine_mutex);
 
/* If they don't care which CPU fn runs on, bind to any online one. */
if (cpu == NR_CPUS)
@@ -193,7 +193,7 @@ struct task_struct *__stop_machine_run(i
wake_up_process(p);
wait_for_completion(&smdata.done);
}
-   up(&stopmachine_mutex);
+   mutex_unlock(&stopmachine_mutex);
return p;
 }
 


[patch-RFC 4/7] LTTng instrumentation kernel

2007-12-05 Thread Mathieu Desnoyers

Core kernel events.

*not* present in this patch because they are architecture specific :
- syscall entry/exit
- traps
- kernel thread creation

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
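The call sites in this patch all have the same shape: a bare marker name, a
format string describing the fields, then the values. A minimal userspace
sketch of that calling convention is given here. It is not the kernel's
marker implementation; sketch_trace_mark(), emit_event(), event_equals() and
last_event are hypothetical names, and the sketch only renders the event as
text.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink: the most recent event, rendered as text. */
static char last_event[128];

/* Render "name: <formatted fields>" into last_event.  The format
 * attribute keeps the compiler's printf-style argument checking. */
__attribute__((format(printf, 2, 3)))
static void emit_event(const char *name, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(name != NULL && fmt != NULL);
	n = snprintf(last_event, sizeof(last_event), "%s: ", name);
	if (n < 0 || (size_t)n >= sizeof(last_event))
		return;

	va_start(ap, fmt);
	vsnprintf(last_event + n, sizeof(last_event) - (size_t)n, fmt, ap);
	va_end(ap);
}

/* The marker name is passed as a bare identifier and stringified. */
#define sketch_trace_mark(name, ...) emit_event(#name, __VA_ARGS__)

/* Small helper for checking what the last call produced. */
static int event_equals(const char *expected)
{
	return strcmp(last_event, expected) == 0;
}
```

For example, sketch_trace_mark(kernel_kthread_stop, "pid %d", 42) leaves
"kernel_kthread_stop: pid 42" in last_event.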
 kernel/exit.c   |6 ++
 kernel/fork.c   |4 
 kernel/irq/handle.c |6 ++
 kernel/itimer.c |   11 +++
 kernel/kthread.c|4 
 kernel/lockdep.c|   19 +++
 kernel/module.c |4 
 kernel/printk.c |   26 ++
 kernel/sched.c  |   11 +++
 kernel/signal.c |2 ++
 kernel/softirq.c|   22 ++
 kernel/timer.c  |   12 +++-
 12 files changed, 126 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/kernel/irq/handle.c
===
--- linux-2.6-lttng.orig/kernel/irq/handle.c 2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/kernel/irq/handle.c 2007-12-05 20:54:00.0 -0500
@@ -130,6 +130,10 @@ irqreturn_t handle_IRQ_event(unsigned in
 {
irqreturn_t ret, retval = IRQ_NONE;
unsigned int status = 0;
+   struct pt_regs *regs = get_irq_regs();
+
+   trace_mark(kernel_irq_entry, "irq_id %u kernel_mode %u", irq,
+   (regs)?(!user_mode(regs)):(1));
 
handle_dynamic_tick(action);
 
@@ -148,6 +152,8 @@ irqreturn_t handle_IRQ_event(unsigned in
add_interrupt_randomness(irq);
local_irq_disable();
 
+   trace_mark(kernel_irq_exit, MARK_NOARGS);
+
return retval;
 }
 
Index: linux-2.6-lttng/kernel/itimer.c
===
--- linux-2.6-lttng.orig/kernel/itimer.c 2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/kernel/itimer.c 2007-12-05 20:54:00.0 -0500
@@ -132,6 +132,8 @@ enum hrtimer_restart it_real_fn(struct h
struct signal_struct *sig =
container_of(timer, struct signal_struct, real_timer);
 
+   trace_mark(kernel_timer_itimer_expired, "pid %d", sig->tsk->pid);
+
send_group_sig_info(SIGALRM, SEND_SIG_PRIV, sig->tsk);
 
return HRTIMER_NORESTART;
@@ -157,6 +159,15 @@ int do_setitimer(int which, struct itime
!timeval_valid(&value->it_interval))
return -EINVAL;
 
+   trace_mark(kernel_timer_itimer_set,
+   "which %d interval_sec %ld interval_usec %ld "
+   "value_sec %ld value_usec %ld",
+   which,
+   value->it_interval.tv_sec,
+   value->it_interval.tv_usec,
+   value->it_value.tv_sec,
+   value->it_value.tv_usec);
+
switch (which) {
case ITIMER_REAL:
 again:
Index: linux-2.6-lttng/kernel/kthread.c
===
--- linux-2.6-lttng.orig/kernel/kthread.c 2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/kernel/kthread.c 2007-12-05 20:54:00.0 -0500
@@ -195,6 +195,8 @@ int kthread_stop(struct task_struct *k)
/* It could exit after stop_info.k set, but before wake_up_process. */
get_task_struct(k);
 
+   trace_mark(kernel_kthread_stop, "pid %d", k->pid);
+
/* Must init completion *before* thread sees kthread_stop_info.k */
init_completion(&kthread_stop_info.done);
smp_wmb();
@@ -210,6 +212,8 @@ int kthread_stop(struct task_struct *k)
ret = kthread_stop_info.err;
mutex_unlock(&kthread_stop_lock);
 
+   trace_mark(kernel_kthread_stop_ret, "ret %d", ret);
+
return ret;
 }
 EXPORT_SYMBOL(kthread_stop);
Index: linux-2.6-lttng/kernel/lockdep.c
===
--- linux-2.6-lttng.orig/kernel/lockdep.c 2007-12-05 20:52:28.0 -0500
+++ linux-2.6-lttng/kernel/lockdep.c 2007-12-05 20:54:00.0 -0500
@@ -2014,6 +2014,9 @@ void trace_hardirqs_on(void)
struct task_struct *curr = current;
unsigned long ip;
 
+   _trace_mark(locking_hardirqs_on, "ip #p%lu",
+   (unsigned long) __builtin_return_address(0));
+
if (unlikely(!debug_locks || current->lockdep_recursion))
return;
 
@@ -2061,6 +2064,9 @@ void trace_hardirqs_off(void)
 {
struct task_struct *curr = current;
 
+   _trace_mark(locking_hardirqs_off, "ip #p%lu",
+   (unsigned long) __builtin_return_address(0));
+
if (unlikely(!debug_locks || current->lockdep_recursion))
return;
 
@@ -2088,6 +2094,9 @@ void trace_softirqs_on(unsigned long ip)
 {
struct task_struct *curr = current;
 
+   _trace_mark(locking_softirqs_on, "ip #p%lu",
+   (unsigned long) __builtin_return_address(0));
+
if (unlikely(!debug_locks))
return;
 
@@ -2122,6 +2131,9 @@ void trace_softirqs_off(unsigned long ip
 {
struct task_struct *curr = cu

[PATCH 2/3] Amiga serial driver: port_write_mutex fixup

2007-12-05 Thread Daniel Walker

The port_write_mutex was converted from a semaphore to a mutex, 
but there was still this ifdef'd init_MUTEX reference remaining.

Signed-off-by: Daniel Walker <[EMAIL PROTECTED]>

---
 drivers/char/ser_a2232.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.23/drivers/char/ser_a2232.c
===
--- linux-2.6.23.orig/drivers/char/ser_a2232.c
+++ linux-2.6.23/drivers/char/ser_a2232.c
@@ -653,7 +653,7 @@ static void a2232_init_portstructs(void)
port->gs.closing_wait = 30 * HZ;
port->gs.rd = &a2232_real_driver;
 #ifdef NEW_WRITE_LOCKING
-   init_MUTEX(&(port->gs.port_write_mutex));
+   mutex_init(&(port->gs.port_write_mutex));
 #endif
init_waitqueue_head(&port->gs.open_wait);
init_waitqueue_head(&port->gs.close_wait);


[patch-RFC 5/7] LTTng instrumentation mm

2007-12-05 Thread Mathieu Desnoyers

Memory management core events.

Changelog:
- Use page_to_pfn for swap out instrumentation, wait_on_page_bit, do_swap_page,
  page alloc/free.
- add missing free_hot_cold_page instrumentation.
- add hugetlb page_alloc page_free instrumentation.
- Add write_access to mm fault.
- Add page bit_nr waited for by wait_on_page_bit.
- Move page alloc instrumentation to __alloc_pages so we cover the alloc zeroed
  page path.
- Add swap file used for swap in and swap out events.
- Dump the swap files, instrument swapon and swapoff.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: Dave Hansen <[EMAIL PROTECTED]>
---
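The handle_mm_fault() hunk in this patch turns every early return into a jump
to a single exit label, so that the exit marker fires on all paths and stays
paired with the entry marker. A stripped-down userspace sketch of that
pattern is shown here; handle_fault_sketch() and the two counters are
hypothetical stand-ins, with the counters taking the place of the markers.

```c
#include <assert.h>

enum fault_result { FAULT_DONE = 0, FAULT_OOM = 1 };

/* Counters stand in for the entry and exit markers. */
static unsigned int entry_events;
static unsigned int exit_events;

/* Each flag says whether the corresponding allocation step succeeded.
 * Single-threaded sketch: the counters are not protected. */
static int handle_fault_sketch(int have_pud, int have_pmd, int have_pte)
{
	int res;

	entry_events++;			/* entry marker */

	if (!have_pud) {
		res = FAULT_OOM;
		goto end;
	}
	if (!have_pmd) {
		res = FAULT_OOM;
		goto end;
	}
	if (!have_pte) {
		res = FAULT_OOM;
		goto end;
	}

	res = FAULT_DONE;
end:
	exit_events++;			/* exit marker, reached on every path */
	assert(entry_events == exit_events);
	return res;
}
```

With plain early returns the failure paths would skip the exit event, and a
trace consumer pairing entries with exits would lose track.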
 include/linux/swapops.h |8 
 mm/filemap.c|6 ++
 mm/hugetlb.c|2 ++
 mm/memory.c |   38 +-
 mm/page_alloc.c |6 ++
 mm/page_io.c|5 +
 mm/swapfile.c   |   22 ++
 7 files changed, 78 insertions(+), 9 deletions(-)

Index: linux-2.6-lttng/mm/filemap.c
===
--- linux-2.6-lttng.orig/mm/filemap.c   2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/mm/filemap.c 2007-12-05 20:54:04.0 -0500
@@ -514,9 +514,15 @@ void fastcall wait_on_page_bit(struct pa
 {
DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
 
+   trace_mark(mm_filemap_wait_start, "pfn %lu bit_nr %d",
+   page_to_pfn(page), bit_nr);
+
if (test_bit(bit_nr, &page->flags))
__wait_on_bit(page_waitqueue(page), &wait, sync_page,
TASK_UNINTERRUPTIBLE);
+
+   trace_mark(mm_filemap_wait_end, "pfn %lu bit_nr %d",
+   page_to_pfn(page), bit_nr);
 }
 EXPORT_SYMBOL(wait_on_page_bit);
 
Index: linux-2.6-lttng/mm/memory.c
===
--- linux-2.6-lttng.orig/mm/memory.c 2007-12-05 20:53:30.0 -0500
+++ linux-2.6-lttng/mm/memory.c 2007-12-05 20:54:04.0 -0500
@@ -2090,6 +2090,10 @@ static int do_swap_page(struct mm_struct
/* Had to read the page from swap area: Major fault */
ret = VM_FAULT_MAJOR;
count_vm_event(PGMAJFAULT);
+   trace_mark(mm_swap_in, "pfn %lu filp %p offset %lu",
+   page_to_pfn(page),
+   get_swap_info_struct(swp_type(entry))->swap_file,
+   swp_offset(entry));
}
 
mark_page_accessed(page);
@@ -2526,30 +2530,46 @@ unlock:
 int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long address, int write_access)
 {
+   int res;
pgd_t *pgd;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
+   trace_mark(mm_handle_fault_entry,
+   "address %lu ip #p%ld write_access %d",
+   address, KSTK_EIP(current), write_access);
+
__set_current_state(TASK_RUNNING);
 
count_vm_event(PGFAULT);
 
-   if (unlikely(is_vm_hugetlb_page(vma)))
-   return hugetlb_fault(mm, vma, address, write_access);
+   if (unlikely(is_vm_hugetlb_page(vma))) {
+   res = hugetlb_fault(mm, vma, address, write_access);
+   goto end;
+   }
 
pgd = pgd_offset(mm, address);
pud = pud_alloc(mm, pgd, address);
-   if (!pud)
-   return VM_FAULT_OOM;
+   if (!pud) {
+   res = VM_FAULT_OOM;
+   goto end;
+   }
pmd = pmd_alloc(mm, pud, address);
-   if (!pmd)
-   return VM_FAULT_OOM;
+   if (!pmd) {
+   res = VM_FAULT_OOM;
+   goto end;
+   }
pte = pte_alloc_map(mm, pmd, address);
-   if (!pte)
-   return VM_FAULT_OOM;
+   if (!pte) {
+   res = VM_FAULT_OOM;
+   goto end;
+   }
 
-   return handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+   res = handle_pte_fault(mm, vma, address, pte, pmd, write_access);
+end:
+   trace_mark(mm_handle_fault_exit, MARK_NOARGS);
+   return res;
 }
 
 #ifndef __PAGETABLE_PUD_FOLDED
Index: linux-2.6-lttng/mm/page_alloc.c
===
--- linux-2.6-lttng.orig/mm/page_alloc.c 2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/mm/page_alloc.c 2007-12-05 20:54:04.0 -0500
@@ -518,6 +518,9 @@ static void __free_pages_ok(struct page 
int i;
int reserved = 0;
 
+   trace_mark(mm_page_free, "order %u pfn %lu",
+   order, page_to_pfn(page));
+
for (i = 0 ; i < (1 << order) ; ++i)
reserved += free_pages_check(page + i);
if (reserved)
@@ -980,6 +983,8 @@ static void fastcall free_hot_cold_page(
struct per_cpu_pages *pcp;
unsigned long flags;
 
+   trace_mark(mm_page_free, "order %u pfn

[patch-RFC 2/7] LTTng instrumentation fs

2007-12-05 Thread Mathieu Desnoyers

Core filesystem event markers.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: Alexander Viro <[EMAIL PROTECTED]>
---
 fs/buffer.c |2 ++
 fs/compat.c |1 +
 fs/exec.c   |1 +
 fs/ioctl.c  |2 ++
 fs/open.c   |2 ++
 fs/read_write.c |   22 --
 fs/select.c |4 
 7 files changed, 32 insertions(+), 2 deletions(-)

Index: linux-2.6-lttng/fs/buffer.c
===
--- linux-2.6-lttng.orig/fs/buffer.c 2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/fs/buffer.c 2007-12-05 20:53:59.0 -0500
@@ -89,7 +89,9 @@ void fastcall unlock_buffer(struct buffe
  */
 void __wait_on_buffer(struct buffer_head * bh)
 {
+   trace_mark(fs_buffer_wait_start, "bh %p", bh);
wait_on_bit(&bh->b_state, BH_Lock, sync_buffer, TASK_UNINTERRUPTIBLE);
+   trace_mark(fs_buffer_wait_end, "bh %p", bh);
 }
 
 static void
Index: linux-2.6-lttng/fs/compat.c
===
--- linux-2.6-lttng.orig/fs/compat.c 2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/fs/compat.c 2007-12-05 20:53:59.0 -0500
@@ -1408,6 +1408,7 @@ int compat_do_execve(char * filename,
 
retval = search_binary_handler(bprm, regs);
if (retval >= 0) {
+   trace_mark(fs_exec, "filename %s", filename);
/* execve success */
security_bprm_free(bprm);
acct_update_integrals(current);
Index: linux-2.6-lttng/fs/ioctl.c
===
--- linux-2.6-lttng.orig/fs/ioctl.c 2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/fs/ioctl.c  2007-12-05 20:53:59.0 -0500
@@ -164,6 +164,8 @@ asmlinkage long sys_ioctl(unsigned int f
if (!filp)
goto out;
 
+   trace_mark(fs_ioctl, "fd %u cmd %u arg %lu", fd, cmd, arg);
+
error = security_file_ioctl(filp, cmd, arg);
if (error)
goto out_fput;
Index: linux-2.6-lttng/fs/open.c
===
--- linux-2.6-lttng.orig/fs/open.c  2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/fs/open.c   2007-12-05 20:53:59.0 -0500
@@ -1043,6 +1043,7 @@ long do_sys_open(int dfd, const char __u
fsnotify_open(f->f_path.dentry);
fd_install(fd, f);
}
+   trace_mark(fs_open, "fd %d filename %s", fd, tmp);
}
putname(tmp);
}
@@ -1133,6 +1134,7 @@ asmlinkage long sys_close(unsigned int f
filp = fdt->fd[fd];
if (!filp)
goto out_unlock;
+   trace_mark(fs_close, "fd %u", fd);
rcu_assign_pointer(fdt->fd[fd], NULL);
FD_CLR(fd, fdt->close_on_exec);
__put_unused_fd(files, fd);
Index: linux-2.6-lttng/fs/read_write.c
===
--- linux-2.6-lttng.orig/fs/read_write.c 2007-12-05 20:50:32.0 -0500
+++ linux-2.6-lttng/fs/read_write.c 2007-12-05 20:58:35.0 -0500
@@ -146,6 +146,9 @@ asmlinkage off_t sys_lseek(unsigned int 
if (res != (loff_t)retval)
retval = -EOVERFLOW; /* LFS: should only happen on 32 bit platforms */
}
+
+   trace_mark(fs_lseek, "fd %u offset %ld origin %u", fd, offset, origin);
+
fput_light(file, fput_needed);
 bad:
return retval;
@@ -173,6 +176,10 @@ asmlinkage long sys_llseek(unsigned int 
offset = vfs_llseek(file, ((loff_t) offset_high << 32) | offset_low,
origin);
 
+   trace_mark(fs_llseek, "fd %u offset %llu origin %u", fd,
+   (unsigned long long)offset,
+   origin);
+
retval = (int)offset;
if (offset >= 0) {
retval = -EFAULT;
@@ -363,6 +370,7 @@ asmlinkage ssize_t sys_read(unsigned int
file = fget_light(fd, &fput_needed);
if (file) {
loff_t pos = file_pos_read(file);
+   trace_mark(fs_read, "fd %u count %zu", fd, count);
ret = vfs_read(file, buf, count, &pos);
file_pos_write(file, pos);
fput_light(file, fput_needed);
@@ -381,6 +389,7 @@ asmlinkage ssize_t sys_write(unsigned in
file = fget_light(fd, &fput_needed);
if (file) {
loff_t pos = file_pos_read(file);
+   trace_mark(fs_write, "fd %u count %zu", fd, count);
ret = vfs_write(file, buf, count, &pos);
file_pos_write(file, pos);
fput_light(file, fput_needed);
@@ -402,8 +411,12 @@ asmlinkage ssize_t sys_pread64(unsigned 
file = fget_light(fd, &fput_needed);
if (file) {
ret = -ESPIPE;
-   if (file->f_mode & FMODE_PREAD)
+

[patch-RFC 7/7] Add Markers Into Semaphore Primitives

2007-12-05 Thread Mathieu Desnoyers

This patch adds several markers around semaphore primitives.
Along with a tracing application this patch can be useful for measuring
kernel semaphore usage and contention.

Signed-off-by: Mike Mason <[EMAIL PROTECTED]>
Signed-off-by: David Wilder <[EMAIL PROTECTED]>
Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
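As a rough idea of what a tracing application could do with these markers,
the sketch below keeps one hit counter per marker name. It is a userspace
illustration only: marker_hit(), marker_hits() and the table are hypothetical
and do not reflect the probe registration interface of the markers
infrastructure.

```c
#include <assert.h>
#include <string.h>

#define MAX_MARKERS 16

/* Hypothetical probe-side table: one hit counter per marker name.
 * Names are stored by pointer, so they must outlive the table
 * (string literals do). */
struct marker_count {
	const char *name;
	unsigned long hits;
};

static struct marker_count counts[MAX_MARKERS];
static size_t nr_counts;

/* Find the counter for a name, creating it on first use. */
static struct marker_count *lookup(const char *name)
{
	size_t i;

	for (i = 0; i < nr_counts; i++)
		if (strcmp(counts[i].name, name) == 0)
			return &counts[i];

	assert(nr_counts < MAX_MARKERS);
	counts[nr_counts].name = name;
	counts[nr_counts].hits = 0;
	return &counts[nr_counts++];
}

/* What a probe attached to a marker would call on each hit. */
static void marker_hit(const char *name)
{
	lookup(name)->hits++;
}

static unsigned long marker_hits(const char *name)
{
	return lookup(name)->hits;
}
```

Comparing the count for sem_down with the count for sem_down_sched would then
give a rough picture of how often a task entering __down() actually had to
sleep.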
 lib/semaphore-sleepers.c |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/lib/semaphore-sleepers.c b/lib/semaphore-sleepers.c
index 1281805..5343a96 100644
--- a/lib/semaphore-sleepers.c
+++ b/lib/semaphore-sleepers.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -50,6 +51,7 @@
 
 fastcall void __up(struct semaphore *sem)
 {
+   trace_mark(sem_up, "%p", sem);
wake_up(&sem->wait);
 }
 
@@ -59,6 +61,7 @@ fastcall void __sched __down(struct semaphore * sem)
DECLARE_WAITQUEUE(wait, tsk);
unsigned long flags;
 
+   trace_mark(sem_down, "%p", sem);
tsk->state = TASK_UNINTERRUPTIBLE;
spin_lock_irqsave(&sem->wait.lock, flags);
add_wait_queue_exclusive_locked(&sem->wait, &wait);
@@ -73,12 +76,14 @@ fastcall void __sched __down(struct semaphore * sem)
 * the wait_queue_head.
 */
if (!atomic_add_negative(sleepers - 1, &sem->count)) {
+   trace_mark(sem_down_resume, "%p", sem);
sem->sleepers = 0;
break;
}
sem->sleepers = 1;  /* us - see -1 above */
spin_unlock_irqrestore(&sem->wait.lock, flags);
 
+   trace_mark(sem_down_sched, "%p", sem);
schedule();
 
spin_lock_irqsave(&sem->wait.lock, flags);
@@ -97,6 +102,7 @@ fastcall int __sched __down_interruptible(struct semaphore * sem)
DECLARE_WAITQUEUE(wait, tsk);
unsigned long flags;
 
+   trace_mark(sem_down_intr, "%p", sem);
tsk->state = TASK_INTERRUPTIBLE;
spin_lock_irqsave(&sem->wait.lock, flags);
add_wait_queue_exclusive_locked(&sem->wait, &wait);
@@ -113,6 +119,7 @@ fastcall int __sched __down_interruptible(struct semaphore * sem)
 * and exit.
 */
if (signal_pending(current)) {
+   trace_mark(sem_down_intr_fail, "%p", sem);
retval = -EINTR;
sem->sleepers = 0;
atomic_add(sleepers, &sem->count);
@@ -126,12 +133,14 @@ fastcall int __sched __down_interruptible(struct semaphore * sem)
 * still hoping to get the semaphore.
 */
if (!atomic_add_negative(sleepers - 1, &sem->count)) {
+   trace_mark(sem_down_intr_resume, "%p", sem);
sem->sleepers = 0;
break;
}
sem->sleepers = 1;  /* us - see -1 above */
spin_unlock_irqrestore(&sem->wait.lock, flags);
 
+   trace_mark(sem_down_intr_sched, "%p", sem);
schedule();
 
spin_lock_irqsave(&sem->wait.lock, flags);



-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-RFC 3/7] LTTng instrumentation ipc

2007-12-05 Thread Mathieu Desnoyers

Interprocess communication, core events.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 ipc/msg.c |5 -
 ipc/sem.c |5 -
 ipc/shm.c |5 -
 3 files changed, 12 insertions(+), 3 deletions(-)

Index: linux-2.6-lttng/ipc/msg.c
===
--- linux-2.6-lttng.orig/ipc/msg.c  2007-11-13 09:49:27.0 -0500
+++ linux-2.6-lttng/ipc/msg.c   2007-11-13 09:49:31.0 -0500
@@ -315,6 +315,7 @@ asmlinkage long sys_msgget(key_t key, in
struct ipc_namespace *ns;
struct ipc_ops msg_ops;
struct ipc_params msg_params;
+   long ret;
 
ns = current->nsproxy->ipc_ns;
 
@@ -325,7 +326,9 @@ asmlinkage long sys_msgget(key_t key, in
msg_params.key = key;
msg_params.flg = msgflg;
 
-   return ipcget(ns, &msg_ids(ns), &msg_ops, &msg_params);
+   ret = ipcget(ns, &msg_ids(ns), &msg_ops, &msg_params);
+   trace_mark(ipc_msg_create, "id %ld flags %d", ret, msgflg);
+   return ret;
 }
 
 static inline unsigned long
Index: linux-2.6-lttng/ipc/sem.c
===
--- linux-2.6-lttng.orig/ipc/sem.c  2007-11-13 09:49:27.0 -0500
+++ linux-2.6-lttng/ipc/sem.c   2007-11-13 09:49:31.0 -0500
@@ -334,6 +334,7 @@ asmlinkage long sys_semget(key_t key, in
struct ipc_namespace *ns;
struct ipc_ops sem_ops;
struct ipc_params sem_params;
+   long err;
 
ns = current->nsproxy->ipc_ns;
 
@@ -348,7 +349,9 @@ asmlinkage long sys_semget(key_t key, in
sem_params.flg = semflg;
sem_params.u.nsems = nsems;
 
-   return ipcget(ns, &sem_ids(ns), &sem_ops, &sem_params);
+   err = ipcget(ns, &sem_ids(ns), &sem_ops, &sem_params);
+   trace_mark(ipc_sem_create, "id %ld flags %d", err, semflg);
+   return err;
 }
 
 /* Manage the doubly linked list sma->sem_pending as a FIFO:
Index: linux-2.6-lttng/ipc/shm.c
===
--- linux-2.6-lttng.orig/ipc/shm.c  2007-11-13 09:49:27.0 -0500
+++ linux-2.6-lttng/ipc/shm.c   2007-11-13 09:49:31.0 -0500
@@ -497,6 +497,7 @@ asmlinkage long sys_shmget (key_t key, s
struct ipc_namespace *ns;
struct ipc_ops shm_ops;
struct ipc_params shm_params;
+   long err;
 
ns = current->nsproxy->ipc_ns;
 
@@ -508,7 +509,9 @@ asmlinkage long sys_shmget (key_t key, s
shm_params.flg = shmflg;
shm_params.u.size = size;
 
-   return ipcget(ns, &shm_ids(ns), &shm_ops, &shm_params);
+   err = ipcget(ns, &shm_ids(ns), &shm_ops, &shm_params);
+   trace_mark(ipc_shm_create, "id %ld flags %d", err, shmflg);
+   return err;
 }
 
 static inline unsigned long copy_shmid_to_user(void __user *buf, struct shmid64_ds *in, int version)

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-RFC 0/7] LTTng instrumentation (arch independent)

2007-12-05 Thread Mathieu Desnoyers

Hi,

This is the second RFC post for the LTTng architecture independent
instrumentation. I have mostly received interesting ideas from the mm group,
which I have added here.

It applies on top of 2.6.24-rc4-git3, on top of the Linux Kernel Markers with
Immediate Values patchset.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-RFC 1/7] Include marker.h in kernel.h -- temporary, for code readability

2007-12-05 Thread Mathieu Desnoyers

This patch is a hack to make my life easier: it lessens the conflicts due to
header includes that change between kernel versions.

The proper way to do this is to include <linux/marker.h> in every file using the
markers.

NOT FOR UPSTREAM.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
---
 include/linux/kernel.h |1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6-lttng/include/linux/kernel.h
===
--- linux-2.6-lttng.orig/include/linux/kernel.h 2007-06-15 16:13:48.0 -0400
+++ linux-2.6-lttng/include/linux/kernel.h 2007-06-15 16:14:28.0 -0400
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[patch-RFC 6/7] LTTng instrumentation net

2007-12-05 Thread Mathieu Desnoyers

Network core events.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
---
 net/core/dev.c |5 +
 net/ipv4/devinet.c |5 +
 net/socket.c   |   18 ++
 3 files changed, 28 insertions(+)

Index: linux-2.6-lttng/net/core/dev.c
===
--- linux-2.6-lttng.orig/net/core/dev.c 2007-11-19 08:07:04.0 -0500
+++ linux-2.6-lttng/net/core/dev.c  2007-11-19 08:07:11.0 -0500
@@ -1639,6 +1639,8 @@ int dev_queue_xmit(struct sk_buff *skb)
}
 
 gso:
+   trace_mark(net_dev_xmit, "skb %p protocol #2u%hu", skb, skb->protocol);
+
spin_lock_prefetch(&dev->queue_lock);
 
/* Disable soft irqs for various locks below. Also
@@ -2039,6 +2041,9 @@ int netif_receive_skb(struct sk_buff *sk
 
__get_cpu_var(netdev_rx_stat).total++;
 
+   trace_mark(net_dev_receive, "skb %p protocol #2u%hu",
+   skb, skb->protocol);
+
skb_reset_network_header(skb);
skb_reset_transport_header(skb);
skb->mac_len = skb->network_header - skb->mac_header;
Index: linux-2.6-lttng/net/ipv4/devinet.c
===
--- linux-2.6-lttng.orig/net/ipv4/devinet.c 2007-11-19 08:07:04.0 -0500
+++ linux-2.6-lttng/net/ipv4/devinet.c  2007-11-19 08:07:11.0 -0500
@@ -262,6 +262,8 @@ static void __inet_del_ifa(struct in_dev
struct in_ifaddr **ifap1 = &ifa1->ifa_next;
 
while ((ifa = *ifap1) != NULL) {
+   trace_mark(net_del_ifa_ipv4, "label %s",
+   ifa->ifa_label);
if (!(ifa->ifa_flags & IFA_F_SECONDARY) &&
ifa1->ifa_scope <= ifa->ifa_scope)
last_prim = ifa;
@@ -368,6 +370,9 @@ static int __inet_insert_ifa(struct in_i
}
ifa->ifa_flags |= IFA_F_SECONDARY;
}
+   trace_mark(net_insert_ifa_ipv4, "label %s address #4u%lu",
+   ifa->ifa_label,
+   (unsigned long)ifa->ifa_address);
}
 
if (!(ifa->ifa_flags & IFA_F_SECONDARY)) {
Index: linux-2.6-lttng/net/socket.c
===
--- linux-2.6-lttng.orig/net/socket.c   2007-11-19 08:07:04.0 -0500
+++ linux-2.6-lttng/net/socket.c 2007-11-19 08:07:11.0 -0500
@@ -563,6 +563,11 @@ int sock_sendmsg(struct socket *sock, st
struct sock_iocb siocb;
int ret;
 
+   trace_mark(net_socket_sendmsg,
+   "sock %p family %d type %d protocol %d size %zu",
+   sock, sock->sk->sk_family, sock->sk->sk_type,
+   sock->sk->sk_protocol, size);
+
init_sync_kiocb(&iocb, NULL);
iocb.private = &siocb;
ret = __sock_sendmsg(&iocb, sock, msg, size);
@@ -646,7 +651,13 @@ int sock_recvmsg(struct socket *sock, st
struct sock_iocb siocb;
int ret;
 
+   trace_mark(net_socket_recvmsg,
+   "sock %p family %d type %d protocol %d size %zu",
+   sock, sock->sk->sk_family, sock->sk->sk_type,
+   sock->sk->sk_protocol, size);
+
init_sync_kiocb(&iocb, NULL);
+
iocb.private = &siocb;
ret = __sock_recvmsg(&iocb, sock, msg, size, flags);
if (-EIOCBQUEUED == ret)
@@ -1212,6 +1223,11 @@ asmlinkage long sys_socket(int family, i
if (retval < 0)
goto out_release;
 
+   trace_mark(net_socket_create,
+   "sock %p family %d type %d protocol %d fd %d",
+   sock, sock->sk->sk_family, sock->sk->sk_type,
+   sock->sk->sk_protocol, retval);
+
 out:
/* It may be already another descriptor 8) Not kernel problem. */
return retval;
@@ -2021,6 +2037,8 @@ asmlinkage long sys_socketcall(int call,
a0 = a[0];
a1 = a[1];
 
+   trace_mark(net_socket_call, "call %d a0 %lu", call, a0);
+
switch (call) {
case SYS_SOCKET:
err = sys_socket(a0, a1, a[2]);

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

Re: sockets affected by IPsec always block (2.6.23)

2007-12-05 Thread David Miller

From: Stefan Rompf <[EMAIL PROTECTED]>
Date: Wed, 5 Dec 2007 19:39:07 +0100

> I'd strongly suggest doing so. AFAIK, behaviour of connect() on nonblocking 
> sockets is quite well defined in POSIX.

You are entitled to your opinion.

POSIX says nothing about the semantics of route resolution.
Non-blocking doesn't mean "cannot sleep no matter what".

> If this is changed for some IP sockets, event-driven applications
> will randomly and subtly break.

If this was such a clear cut case we'd have changed things
a long time ago, but it isn't so don't pretend this is the
case.

Re: [PATCH] i386 IOAPIC: de-fang IRQ compression

2007-12-05 Thread Eric W. Biederman

"Natalie Protasevich" <[EMAIL PROTECTED]> writes:

> I think we counted them in the order of 1400 external IRQs (actual
> ioapics/slots plus possible on-card bridges), and yes numbers for used
> IRQs were close to 250. Actual customer configurations could've big
> bigger, I don't have such data.
>
>> In particular is a large NR_IRQS plus dynamic vector allocation
>> sufficient for all cases you know about?
>
> Yes, since x86_64 boxes never had a problem once dynamic vectors were
> incorporated.

I was wondering if we could avoid making the vectors per cpu and still be
in good shape on x86_32.  From your description it looks like we can't
quite support everything on x86_32 if we don't do the per cpu vector
thing.  However we will likely have everything interesting supported.

Eric

[patch 1/6] Immediate Values - Move Kprobes x86 restore_interrupt to kdebug.h

2007-12-05 Thread Mathieu Desnoyers

Since the breakpoint handler is useful both to kprobes and immediate values, it
makes sense to make the required restore_interrupts() available through
asm-x86/kdebug.h.

Signed-off-by: Mathieu Desnoyers <[EMAIL PROTECTED]>
Acked-by: Ananth N Mavinakayanahalli <[EMAIL PROTECTED]>
CC: Christoph Hellwig <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
CC: Thomas Gleixner <[EMAIL PROTECTED]>
CC: Ingo Molnar <[EMAIL PROTECTED]>
CC: H. Peter Anvin <[EMAIL PROTECTED]>
---
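The helper being moved only tests the interrupt flag in the saved flags
register and re-enables interrupts if it was set when the trap was taken. The
decision itself can be sketched in userspace as below; struct saved_regs,
should_reenable_irqs() and SKETCH_IF_MASK are hypothetical names, and the
actual local_irq_enable() call is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* EFLAGS.IF is bit 9 on x86. */
#define SKETCH_IF_MASK 0x00000200UL

/* Hypothetical stand-in for the one struct pt_regs field used here. */
struct saved_regs {
	unsigned long eflags;
};

/* True if interrupts were enabled in the interrupted context, i.e. the
 * handler should turn them back on before running the original trap
 * handler. */
static bool should_reenable_irqs(const struct saved_regs *regs)
{
	assert(regs != NULL);
	return (regs->eflags & SKETCH_IF_MASK) != 0;
}
```

A saved value of 0x202 has IF set and yields true; 0x002 yields false.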
 include/asm-x86/kdebug.h |   12 
 include/asm-x86/kprobes_32.h |9 -
 include/asm-x86/kprobes_64.h |9 -
 3 files changed, 12 insertions(+), 18 deletions(-)

Index: linux-2.6-lttng/include/asm-x86/kdebug.h
===
--- linux-2.6-lttng.orig/include/asm-x86/kdebug.h 2007-11-02 15:01:53.0 -0400
+++ linux-2.6-lttng/include/asm-x86/kdebug.h 2007-11-02 15:02:00.0 -0400
@@ -3,6 +3,9 @@
 
 #include 
 
+#include 
+#include 
+
 struct pt_regs;
 
 /* Grossly misnamed. */
@@ -30,4 +33,13 @@ extern void dump_pagetable(unsigned long
 extern unsigned long oops_begin(void);
 extern void oops_end(unsigned long);
 
+/* trap3/1 are intr gates for kprobes.  So, restore the status of IF,
+ * if necessary, before executing the original int3/1 (trap) handler.
+ */
+static inline void restore_interrupts(struct pt_regs *regs)
+{
+   if (regs->eflags & IF_MASK)
+   local_irq_enable();
+}
+
 #endif
Index: linux-2.6-lttng/include/asm-x86/kprobes_32.h
===
--- linux-2.6-lttng.orig/include/asm-x86/kprobes_32.h 2007-11-02 15:01:53.0 -0400
+++ linux-2.6-lttng/include/asm-x86/kprobes_32.h 2007-11-02 15:02:00.0 -0400
@@ -79,15 +79,6 @@ struct kprobe_ctlblk {
struct prev_kprobe prev_kprobe;
 };
 
-/* trap3/1 are intr gates for kprobes.  So, restore the status of IF,
- * if necessary, before executing the original int3/1 (trap) handler.
- */
-static inline void restore_interrupts(struct pt_regs *regs)
-{
-   if (regs->eflags & IF_MASK)
-   local_irq_enable();
-}
-
 extern int kprobe_exceptions_notify(struct notifier_block *self,
unsigned long val, void *data);
 extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
Index: linux-2.6-lttng/include/asm-x86/kprobes_64.h
===
--- linux-2.6-lttng.orig/include/asm-x86/kprobes_64.h 2007-11-02 15:02:10.0 -0400
+++ linux-2.6-lttng/include/asm-x86/kprobes_64.h 2007-11-02 15:02:22.0 -0400
@@ -72,15 +72,6 @@ struct kprobe_ctlblk {
struct prev_kprobe prev_kprobe;
 };
 
-/* trap3/1 are intr gates for kprobes.  So, restore the status of IF,
- * if necessary, before executing the original int3/1 (trap) handler.
- */
-static inline void restore_interrupts(struct pt_regs *regs)
-{
-   if (regs->eflags & IF_MASK)
-   local_irq_enable();
-}
-
 extern int post_kprobe_handler(struct pt_regs *regs);
 extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
 extern int kprobe_handler(struct pt_regs *regs);

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
