[PATCHv2] Smack: add support for modification of existing rules

2013-01-10 Thread Rafal Krypa
The following patch introduces in-place modification of Smack rules.
Until now Smack supported only overwriting of existing rules.
To change the permitted access for a given subject and object, the user
had to read the list of rules to get the current accesses, modify them
and write the modified rule back to the kernel. This approach was
inefficient, non-atomic and unnecessarily difficult.
The new interface is intended to ease such modifications.

Changes in v2:
- dropped patches for smackfs seq list operations
- changed modification approach to simple integer assignment

Rafal Krypa (1):
  Smack: add support for modification of existing rules

 Documentation/security/Smack.txt |   11 ++
 security/smack/smackfs.c |  249 ++
 2 files changed, 181 insertions(+), 79 deletions(-)

-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCHv2] Smack: add support for modification of existing rules

2013-01-10 Thread Rafal Krypa
Rule modifications are enabled via /smack/change-rule. Format is as follows:
Subject Object rwaxt rwaxt

First two strings are subject and object labels up to 255 characters.
Third string contains permissions to enable.
Fourth string contains permissions to disable.

All unmentioned permissions will be left unchanged.
If no rule previously existed, it will be created.
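
The enable/disable semantics described above can be sketched in plain C. This is a minimal userspace illustration only, not the kernel code: the ACC_* bit values and the helper names parse_access() and apply_change() are assumptions made for this example. It shows that the third string is OR-ed in first and the fourth string is masked out second, so a permission named in both strings ends up disabled.

```c
/* Illustrative access bits; these are NOT the kernel's MAY_* values. */
enum {
	ACC_READ      = 1 << 0,	/* r */
	ACC_WRITE     = 1 << 1,	/* w */
	ACC_APPEND    = 1 << 2,	/* a */
	ACC_EXEC      = 1 << 3,	/* x */
	ACC_TRANSMUTE = 1 << 4,	/* t */
};

/*
 * Hypothetical helper: turn an access string such as "rw" or "-" into
 * a bitmask. Returns -1 if a character outside "rwxat-" is found.
 */
static int parse_access(const char *s)
{
	int mask = 0;

	for (; *s != '\0'; s++) {
		switch (*s) {
		case 'r': mask |= ACC_READ; break;
		case 'w': mask |= ACC_WRITE; break;
		case 'a': mask |= ACC_APPEND; break;
		case 'x': mask |= ACC_EXEC; break;
		case 't': mask |= ACC_TRANSMUTE; break;
		case '-': break;	/* placeholder, sets nothing */
		default: return -1;
		}
	}
	return mask;
}

/*
 * Hypothetical helper: apply a change-rule request to the current
 * access mask. Bits not mentioned in either mask are left unchanged.
 * For a rule that did not exist before, pass current == 0.
 */
static int apply_change(int current, int enable, int disable)
{
	current |= enable;	/* third string: permissions to enable */
	current &= ~disable;	/* fourth string: permissions to disable */
	return current;
}
```

For example, starting from "rx" and writing "w" as the enable string and "x" as the disable string leaves the rule at "rw".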

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Rafal Krypa r.kr...@samsung.com
---
 Documentation/security/Smack.txt |   11 ++
 security/smack/smackfs.c |  249 ++
 2 files changed, 181 insertions(+), 79 deletions(-)

diff --git a/Documentation/security/Smack.txt b/Documentation/security/Smack.txt
index 8a177e4..7a2d30c 100644
--- a/Documentation/security/Smack.txt
+++ b/Documentation/security/Smack.txt
@@ -117,6 +117,17 @@ access2
 ambient
This contains the Smack label applied to unlabeled network
packets.
+change-rule
+   This interface allows modification of existing access control rules.
+   The format accepted on write is:
+   %s %s %s %s
+   where the first string is the subject label, the second the
+   object label, the third the access to allow and the fourth the
+   access to deny. The access strings may contain only the characters
+   rwxat-. If a rule for a given subject and object exists it will be
+   modified by enabling the permissions in the third string and disabling
+   those in the fourth string. If there is no such rule it will be
+   created using the access specified in the third and the fourth strings.
 cipso
This interface allows a specific CIPSO header to be assigned
to a Smack label. The format accepted on write is:
diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
index 337e32c..2479a41 100644
--- a/security/smack/smackfs.c
+++ b/security/smack/smackfs.c
@@ -50,12 +50,12 @@ enum smk_inos {
SMK_ACCESS2 = 16,   /* make an access check with long labels */
SMK_CIPSO2  = 17,   /* load long label - CIPSO mapping */
SMK_REVOKE_SUBJ = 18,   /* set rules with subject label to '-' */
+   SMK_CHANGE_RULE = 19,   /* change or add rules (long labels) */
 };
 
 /*
  * List locks
  */
-static DEFINE_MUTEX(smack_list_lock);
 static DEFINE_MUTEX(smack_cipso_lock);
 static DEFINE_MUTEX(smack_ambient_lock);
 static DEFINE_MUTEX(smk_netlbladdr_lock);
@@ -110,6 +110,13 @@ struct smack_master_list {
 
 LIST_HEAD(smack_rule_list);
 
+struct smack_parsed_rule {
+   char *smk_subject;
+   char *smk_object;
+   int smk_access1;
+   int smk_access2;
+};
+
 static int smk_cipso_doi_value = SMACK_CIPSO_DOI_DEFAULT;
 
 const char *smack_cipso_option = SMACK_CIPSO_OPTION;
@@ -167,25 +174,28 @@ static void smk_netlabel_audit_set(struct netlbl_audit *nap)
 #define SMK_NETLBLADDRMIN  9
 
 /**
- * smk_set_access - add a rule to the rule list
- * @srp: the new rule to add
+ * smk_set_access - add a rule to the rule list or replace an old rule
+ * @srp: the rule to add or replace
  * @rule_list: the list of rules
  * @rule_lock: the rule list lock
+ * @global: if non-zero, indicates a global rule
  *
  * Looks through the current subject/object/access list for
  * the subject/object pair and replaces the access that was
  * there. If the pair isn't found add it with the specified
  * access.
  *
- * Returns 1 if a rule was found to exist already, 0 if it is new
  * Returns 0 if nothing goes wrong or -ENOMEM if it fails
  * during the allocation of the new pair to add.
  */
-static int smk_set_access(struct smack_rule *srp, struct list_head *rule_list,
-   struct mutex *rule_lock)
+static int smk_set_access(struct smack_parsed_rule *srp,
+   struct list_head *rule_list,
+   struct mutex *rule_lock, int global)
 {
struct smack_rule *sp;
+   struct smack_master_list *smlp;
int found = 0;
+   int rc = 0;
 
mutex_lock(rule_lock);
 
@@ -197,23 +207,89 @@ static int smk_set_access(struct smack_rule *srp, struct list_head *rule_list,
if (sp->smk_object == srp->smk_object &&
sp->smk_subject == srp->smk_subject) {
found = 1;
-   sp->smk_access = srp->smk_access;
+   sp->smk_access |= srp->smk_access1;
+   sp->smk_access &= ~srp->smk_access2;
break;
}
}
-   if (found == 0)
-   list_add_rcu(&srp->list, rule_list);
 
+   if (found == 0) {
+   sp = kzalloc(sizeof(*sp), GFP_KERNEL);
+   if (sp == NULL) {
+   rc = -ENOMEM;
+   goto out;
+   }
+
+   sp->smk_subject = srp->smk_subject;
+   

Re: [PATCH] gpio: vt8500: Export dedicated GPIO before multifunction pins.

2013-01-10 Thread Tony Prisk
On Thu, 2013-01-10 at 11:49 +0100, Linus Walleij wrote:
> On Sun, Dec 30, 2012 at 9:29 PM, Tony Prisk li...@prisktech.co.nz wrote:
>
> > The vendor does not provide numbering for gpio pins. Vendor source
> > exports dedicated gpio pins first, followed by multifunction pins.
> > As this is what end users expect, this patch changes vt8500 and wm8505
> > to do the same.
> >
> > Signed-off-by: Tony Prisk li...@prisktech.co.nz
>
> So how many existing userspace applications does this patch
> break? Has this system been widely deployed so a kernel
> upgrade will cause problems for people?
>
> But applied anyway, unless someone screams about it real
> soon now. That seems to be the only way to get people to tell
> us about their use cases.
>
> Could you consider adding names to the exported GPIO pins
> on the vt8500 series please? Then userspace can at least
> try to locate the right pin.
>
> Yours,
> Linus Walleij

In terms of userspace apps, my best guess would be 'I dunno'. This was
requested by the only end-user to ask a question since mainline support
was added - he couldn't find the external GPIOs among the 200+ that were
listed.

This also makes all the platforms the same now - external GPIOs are now
exported first (0..x), which is better in the long term for userspace.

The names are a bit of a problem, but I will try my best. We have limited
datasheets etc. from the vendor, so knowing what things do is a bit of a
mystery sometimes.

Regards
Tony P



Re: [PATCH] gpio: vt8500: memory cleanup missing

2013-01-10 Thread Tony Prisk
On Thu, 2013-01-10 at 13:02 +0100, Linus Walleij wrote:
> On Thu, Jan 10, 2013 at 11:57 AM, Russell King - ARM Linux
> li...@arm.linux.org.uk wrote:
> > On Thu, Jan 03, 2013 at 10:47:20AM +1300, Tony Prisk wrote:
>
> > > +static int vt8500_gpio_remove(struct platform_device *pdev)
> > > +{
> > > + int i;
> > > + int ret;
> > > + const struct vt8500_gpio_data *data;
> > > + void __iomem *gpio_base = vtchip[0].base;
> > > + const struct of_device_id *of_id =
> > > + of_match_device(vt8500_gpio_dt_ids, &pdev->dev);
> > > +
>
> > You can get at the vtchip pointer if you put it into the platform device's
> > driver data pointer.  That way, you're not artificially limiting this
> > driver to just one device, and, with your changes it will go wrong if DT
> > ever lists more than one device.
>
> Good point, I'm sloppy today :-(
>
> Patch dropped.
>
> Tony pls proceed as indicated by Russell.
>
> Yours,
> Linus Walleij

I must have been having a 'stupid' day to not realise that. Will fix.
Thanks Russell.

Regards
Tony P



Re: [PATCH 05/14] lib: Add I/O map cache implementation

2013-01-10 Thread Thierry Reding
On Thu, Jan 10, 2013 at 11:20:07AM -0700, Jason Gunthorpe wrote:
> On Thu, Jan 10, 2013 at 11:25:44AM +0100, Thierry Reding wrote:
> > On Thu, Jan 10, 2013 at 09:17:19AM +0000, Arnd Bergmann wrote:
> > > On Thursday 10 January 2013, Thierry Reding wrote:
> > > > On Wed, Jan 09, 2013 at 04:17:58PM -0700, Jason Gunthorpe wrote:
> > > > > On Wed, Jan 09, 2013 at 04:12:31PM -0700, Stephen Warren wrote:
> > > > > You could decrease the size of the mapping to only span the bus
> > > > > numbers that are configured for use via DT.
> > > >
> > > > That won't work, unfortunately. The mapping is such that the bus number
> > > > is not encoded in the uppermost bits, the extended register number is.
> > > > So the only thing that we could do is decrease the size of the extended
> > > > register space for *all* devices.
> > >
> > > But you could still use a method to map 16 separate areas per bus, each 65536
> > > bytes long, which results in 1MB per bus. That is probably ok, since
> > > very few systems have more than a handful of buses in practice.
> > >
> > > In theory, doing static mappings on a per-page base would let you
> > > do 16 devices at a time, but it's probably worth doing at this fine
> > > granularity.
> >
> > I don't understand how this would help. The encoding is like this:
> >
> > [27:24] extended register number
> > [23:16] bus number
> > [15:11] device number
> > [10: 8] function number
> > [ 7: 0] register number
> >
> > So it doesn't matter whether I use separate areas per bus or not. As
> > soon as the whole extended configuration space needs to be accessed a
> > whopping 28 bits (256 MiB) are required.
>
> You'd piece a mapping together, each bus requires 16 64k mappings, a
> simple 2d array of busnr*16 of pointers would do the trick. A more
> clever solution would be to allocate contiguous virtual memory and
> split that up..

Oh, I see. I'm not very familiar with the internals of remapping, so
I'll need to do some more reading. Thanks for the hints.
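
The segmented mapping suggested above follows directly from the address encoding quoted in this thread. The following is a minimal sketch in plain C, assuming only that encoding; the helper names (cfg_offset, cfg_window, cfg_window_base, cfg_window_offset) are hypothetical and are not part of any driver. Because the bus number sits below the extended register bits, each bus occupies 16 separate 64 KiB windows rather than one contiguous range.

```c
#include <stdint.h>

#define CFG_WINDOW_SIZE     0x10000u	/* 64 KiB: bits [15:0] */
#define CFG_WINDOWS_PER_BUS 16u		/* one per extended register value */

/*
 * Build the offset into the configuration aperture for a 12-bit
 * register number (0..4095), using the layout quoted in the thread:
 * [27:24] ext reg, [23:16] bus, [15:11] device, [10:8] function,
 * [7:0] register.
 */
static uint32_t cfg_offset(uint32_t bus, uint32_t dev, uint32_t fn,
			   uint32_t reg)
{
	return (((reg >> 8) & 0xfu) << 24) |
	       ((bus & 0xffu) << 16) |
	       ((dev & 0x1fu) << 11) |
	       ((fn & 0x7u) << 8) |
	       (reg & 0xffu);
}

/* Which of the 16 per-bus windows an offset falls into. */
static uint32_t cfg_window(uint32_t offset)
{
	return (offset >> 24) & 0xfu;
}

/* Start of window 'window' for bus 'bus' within the aperture. */
static uint32_t cfg_window_base(uint32_t bus, uint32_t window)
{
	return ((window & 0xfu) << 24) | ((bus & 0xffu) << 16);
}

/* Position of an offset inside its 64 KiB window. */
static uint32_t cfg_window_offset(uint32_t offset)
{
	return offset & (CFG_WINDOW_SIZE - 1u);
}
```

With this layout a driver would keep a table of 16 mapped windows per bus and index it with cfg_window(), then add cfg_window_offset() to the mapped pointer. Mapping all windows of one bus costs 16 * 64 KiB = 1 MiB of virtual address space.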

> > > Actually, AER probably needs this, and I believe some broken devices
> > > need to mask interrupts using the PCI command word in the config space,
> > > so it can happen.
> >
> > Ugh... that would kill any such dynamic mapping approach. Perhaps if we
> > could mark a device as requiring a static mapping we could pin that
> > cache entry. But that doesn't sound very encouraging.
>
> AER applies to pretty much every PCI-E device these days.

So given that, there's no way around a segmented static mapping as you
suggested, right?

Thierry




Re: [PATCH 05/14] lib: Add I/O map cache implementation

2013-01-10 Thread Thierry Reding
On Thu, Jan 10, 2013 at 06:26:55PM +0000, Arnd Bergmann wrote:
> On Thursday 10 January 2013, Thierry Reding wrote:
> > I don't understand how this would help. The encoding is like this:
> >
> > [27:24] extended register number
> > [23:16] bus number
> > [15:11] device number
> > [10: 8] function number
> > [ 7: 0] register number
> >
> > So it doesn't matter whether I use separate areas per bus or not. As
> > soon as the whole extended configuration space needs to be accessed a
> > whopping 28 bits (256 MiB) are required.
> >
> > What you propose would work if only regular configuration space is
> > supported. I'm not sure if that's an option.
>
> I mean something like:
>
> struct tegra_bus_private {
>    ...
>    void __iomem *config_space[16];
> };
>
>
> void tegra_scan_bus(struct pci_bus *bus)
> {
>    int i;
>    struct tegra_bus_private *priv = bus->dev->private;
>
>    for (i=0; i<16; i++)
>    priv->config_space[i] = ioremap(config_space_phys +
> 65536 * bus->primary + i * SZ_1M, 65536);
>
>    ...
> }

Okay, I see. It's a bit kludgy, but I guess so was the I/O map cache.
It'll take some more time to work this out and test, but I'll give it
a shot.

Thierry




[PATCH v2 0/3] dmaengine: add per channel capabilities api

2013-01-10 Thread Matt Porter
Changes since v1:
- Use the existing dma_transaction_type enums instead of
  adding the mostly duplicated dmaengine_apis enums

This series adds a new dmaengine api, dma_get_channel_caps(), which
may be used by a driver to get channel-specific capabilities. This is
based on a starting point suggested by Vinod Koul, but only implements
the minimal sets of channel capabilities to fulfill the needs of the
EDMA DMA Engine driver at this time.

Specifically, this implementation supports reporting of a set of
transfer type operations, maximum number of SG segments, and the
maximum size of an SG segment that a channel can support.

The call is implemented as follows:

struct dmaengine_chan_caps
*dma_get_channel_caps(struct dma_chan *chan,
  enum dma_transfer_direction dir);

The dma transfer direction parameter may appear a bit out of place
but it is necessary since the direction field in struct
dma_slave_config was deprecated. In some cases, EDMA for one, it
is necessary for the dmaengine driver to have the burst and address
width slave configuration parameters available in order to compute
the maximum segment size that can be handled. Due to this requirement,
the calling order of this api is as follows:

1. Allocate a DMA slave channel
1a. [Optionally] Get channel capabilities
2. Set slave and controller specific parameters
3. Get a descriptor for transaction
4. Submit the transaction
5. Issue pending requests and wait for callback notification

Along with the API implementation, this series implements the
backend device_channel_caps() in the EDMA DMA Engine driver and
converts the davinci_mmc driver to use dma_get_channel_caps() to
replace hardcoded limits.

This is tested on the AM1808-EVM.

Matt Porter (3):
  dmaengine: add dma_get_channel_caps()
  dma: edma: add device_channel_caps() support
  mmc: davinci: get SG segment limits with dma_get_channel_caps()

 drivers/dma/edma.c|   27 
 drivers/mmc/host/davinci_mmc.c|   66 +
 include/linux/dmaengine.h |   40 +
 include/linux/platform_data/mmc-davinci.h |3 --
 4 files changed, 88 insertions(+), 48 deletions(-)

-- 
1.7.9.5



[PATCH v2 2/3] dma: edma: add device_channel_caps() support

2013-01-10 Thread Matt Porter
Implement device_channel_caps().

EDMA has a finite set of PaRAM slots available for linking
a multi-segment SG transfer. In order to prevent any one
channel from consuming all PaRAM slots to fulfill a large SG
transfer, the driver reports a static per-channel max number
of SG segments it will handle.

The maximum size of SG segment is limited by the slave config
maxburst and addr_width for the channel in question. These values
are used from the current channel config to calculate and return
the max segment length cap.
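
As a worked example of the calculation described above, here is a minimal standalone sketch in plain C. The function name max_seg_len() and the EDMA_MAX_COUNT macro are assumptions made for this illustration; the formula itself, (SZ_64K - 1) * width * burst, is the one used in the diff below.

```c
#include <stdint.h>

/* Stand-in for (SZ_64K - 1) from the patch; name is hypothetical. */
#define EDMA_MAX_COUNT 65535u

/*
 * Maximum SG segment length in bytes for a given slave address width
 * (bytes) and maximum burst (in units of that width).
 */
static uint32_t max_seg_len(uint32_t addr_width, uint32_t maxburst)
{
	return EDMA_MAX_COUNT * addr_width * maxburst;
}
```

With the davinci_mmc settings from this series (4-byte address width and a burst of rw_threshold / 4 = 32 / 4 = 8), this gives 65535 * 4 * 8 = 2097120 bytes. If the slave configuration has not been set yet, width and burst are both 0 and the result is 0, which the dmaengine_chan_caps documentation in patch 1/3 defines as "no maximum"; that appears to be why the cover letter asks for the slave config to be applied before the caps are queried.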

Signed-off-by: Matt Porter mpor...@ti.com
---
 drivers/dma/edma.c |   27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/dma/edma.c b/drivers/dma/edma.c
index 82c8672..fc4b9db 100644
--- a/drivers/dma/edma.c
+++ b/drivers/dma/edma.c
@@ -70,6 +70,7 @@ struct edma_chan {
bool alloced;
int slot[EDMA_MAX_SLOTS];
struct dma_slave_config cfg;
+   struct dmaengine_chan_caps  caps;
 };
 
 struct edma_cc {
@@ -462,6 +463,28 @@ static void edma_issue_pending(struct dma_chan *chan)
spin_unlock_irqrestore(&echan->vchan.lock, flags);
 }
 
+static struct dmaengine_chan_caps
+*edma_get_channel_caps(struct dma_chan *chan, enum dma_transfer_direction dir)
+{
+   struct edma_chan *echan;
+   enum dma_slave_buswidth width = 0;
+   u32 burst = 0;
+
+   if (chan) {
+   echan = to_edma_chan(chan);
+   if (dir == DMA_MEM_TO_DEV) {
+   width = echan->cfg.dst_addr_width;
+   burst = echan->cfg.dst_maxburst;
+   } else if (dir == DMA_DEV_TO_MEM) {
+   width = echan->cfg.src_addr_width;
+   burst = echan->cfg.src_maxburst;
+   }
+   echan->caps.seg_len = (SZ_64K - 1) * width * burst;
+   return &echan->caps;
+   }
+   return NULL;
+}
+
 static size_t edma_desc_size(struct edma_desc *edesc)
 {
int i;
@@ -521,6 +544,9 @@ static void __init edma_chan_init(struct edma_cc *ecc,
echan->ch_num = EDMA_CTLR_CHAN(ecc->ctlr, i);
echan->ecc = ecc;
echan->vchan.desc_free = edma_desc_free;
+   dma_cap_set(DMA_SLAVE, echan->caps.cap_mask);
+   dma_cap_set(DMA_SG, echan->caps.cap_mask);
+   echan->caps.seg_nr = MAX_NR_SG;
 
vchan_init(&echan->vchan, dma);
 
@@ -537,6 +563,7 @@ static void edma_dma_init(struct edma_cc *ecc, struct dma_device *dma,
dma->device_alloc_chan_resources = edma_alloc_chan_resources;
dma->device_free_chan_resources = edma_free_chan_resources;
dma->device_issue_pending = edma_issue_pending;
+   dma->device_channel_caps = edma_get_channel_caps;
dma->device_tx_status = edma_tx_status;
dma->device_control = edma_control;
dma->dev = dev;
-- 
1.7.9.5



[PATCH v2 1/3] dmaengine: add dma_get_channel_caps()

2013-01-10 Thread Matt Porter
Add a dmaengine API to retrieve per channel capabilities.
Currently, the only implemented caps are channel ops and SG
segment limitations.

The API is optionally implemented by drivers and when
unimplemented will return a NULL pointer. It is intended
to be executed after a channel has been requested and, if
the channel is intended to be used with slave SG transfers,
then it may only be called after dmaengine_slave_config()
has executed. The slave driver provides parameters such as
burst size and address width which may be necessary for
the dmaengine driver to use in order to properly return SG
segment limit caps.
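
Since the API may return a NULL pointer, and a value of 0 in seg_nr or seg_len means "no maximum", a caller needs a fallback path. A minimal userspace sketch of that decision logic follows. The types chan_caps and seg_limits and the function pick_limits() are stand-ins invented for this example; they only mirror the seg_nr and seg_len fields of dmaengine_chan_caps and are not kernel code.

```c
#include <stddef.h>

/* Stand-in for the seg_nr/seg_len fields of dmaengine_chan_caps. */
struct chan_caps {
	int seg_nr;	/* max number of SG segments, 0 = no maximum */
	int seg_len;	/* max length of one SG segment, 0 = no maximum */
};

/* Limits a client driver would program into its own host structure. */
struct seg_limits {
	int max_segs;
	int max_seg_size;
};

/*
 * Hypothetical helper: choose the limits to use. 'caps' is what the
 * capability query returned (NULL if the dmaengine driver does not
 * implement it); 'fallback' holds the client's own defaults.
 */
static struct seg_limits pick_limits(const struct chan_caps *caps,
				     struct seg_limits fallback)
{
	struct seg_limits out = fallback;

	if (caps == NULL)
		return out;	/* driver reports nothing: keep defaults */
	if (caps->seg_nr > 0)
		out.max_segs = caps->seg_nr;
	if (caps->seg_len > 0)
		out.max_seg_size = caps->seg_len;
	return out;
}
```

Keeping the client's defaults when a field is 0 is one possible policy; a client could equally treat 0 as permission to use its own hardware maximum.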

Suggested-by: Vinod Koul vinod.k...@intel.com
Signed-off-by: Matt Porter mpor...@ti.com
---
 include/linux/dmaengine.h |   40 
 1 file changed, 40 insertions(+)

diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index c88f302..9fd0c5b 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -371,6 +371,26 @@ struct dma_slave_config {
unsigned int slave_id;
 };
 
+/* struct dmaengine_chan_caps - expose capability of a channel
+ * Note: each channel can have same or different capabilities
+ *
+ * This primarily classifies capabilities into
+ * a) APIs/ops supported
+ * b) channel physical capabilities
+ *
+ * @cap_mask: api/ops capability (DMA_INTERRUPT and DMA_PRIVATE
+ *are invalid api/ops and will never be set)
+ * @seg_nr: maximum number of SG segments supported on a SG/SLAVE
+ * channel (0 for no maximum or not a SG/SLAVE channel)
+ * @seg_len: maximum length of SG segments supported on a SG/SLAVE
+ *  channel (0 for no maximum or not a SG/SLAVE channel)
+ */
+struct dmaengine_chan_caps {
+   dma_cap_mask_t cap_mask;
+   int seg_nr;
+   int seg_len;
+};
+
 static inline const char *dma_chan_name(struct dma_chan *chan)
 {
return dev_name(&chan->dev->device);
@@ -534,6 +554,7 @@ struct dma_tx_state {
  * struct with auxiliary transfer status information, otherwise the call
  * will just return a simple status code
  * @device_issue_pending: push pending transactions to hardware
+ * @device_channel_caps: return the channel capabilities
  */
 struct dma_device {
 
@@ -602,6 +623,8 @@ struct dma_device {
dma_cookie_t cookie,
struct dma_tx_state *txstate);
void (*device_issue_pending)(struct dma_chan *chan);
+   struct dmaengine_chan_caps *(*device_channel_caps)(
+   struct dma_chan *chan, enum dma_transfer_direction direction);
 };
 
 static inline int dmaengine_device_control(struct dma_chan *chan,
@@ -969,6 +992,23 @@ dma_set_tx_state(struct dma_tx_state *st, dma_cookie_t last, dma_cookie_t used,
}
 }
 
+/**
+ * dma_get_channel_caps - get per-channel capabilities
+ * @chan: target DMA channel
+ * @dir: direction of transfer
+ *
+ * Get the channel-specific capabilities. If the dmaengine
+ * driver does not implement per channel capabilities then
+ * NULL is returned.
+ */
+static inline struct dmaengine_chan_caps
+*dma_get_channel_caps(struct dma_chan *chan, enum dma_transfer_direction dir)
+{
+   if (chan->device->device_channel_caps)
+   return chan->device->device_channel_caps(chan, dir);
+   return NULL;
+}
+
 enum dma_status dma_sync_wait(struct dma_chan *chan, dma_cookie_t cookie);
 #ifdef CONFIG_DMA_ENGINE
 enum dma_status dma_wait_for_async_tx(struct dma_async_tx_descriptor *tx);
-- 
1.7.9.5



reproducible w1 oops on recent kernels (at least since 3.2.x)

2013-01-10 Thread Sven Geggus
Hello,

I first thought this to be a Raspberry Pi thing, but it's not. It looks
like the w1 driver is broken in some platform- and bus-master-independent
way, at least since kernel 3.2.x (which the Raspberry Pi uses).

Here is what to do to reproduce the bug on x86:

Get owfs from owfs.org and compile with w1 support or just install
owserver from your favourite Linux distribution.  I'm using version
2.8p15-1 from debian testing.

1. connect a 1-wire device to your computer and load the appropriate
   kernel module (I'm using a DS9490, so the module is ds2490.ko, but
   the bug also happens with other modules like w1-gpio)
2. run owserver --error_print 2 --error_level 99 --foreground --w1
3. run owdir on another terminal
4. system crashes with the following oops:

--cut--
Driver for 1-wire Dallas network protocol.
usbcore: registered new interface driver DS9490R
w1_master_driver w1_bus_master1: Family 81 for 81.00247ca7.41 is not 
registered.
PGD 16ff067 PUD 1700067 PMD 0 
Oops:  [#1] PREEMPT SMP 
Modules linked in: ds2490 wire cn sha256_generic bluetooth crc16 binfmt_misc 
nfsd coretemp kvm_intel kvm snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_intel 
snd_hda_codec snd_hwdep microcode i2c_i801 uhci_hcd
CPU 1 
Pid: 4631, comm: owserver Not tainted 3.7.1 #1  /DG45ID
RIP: 0010:[8104baf0]  [8104baf0] 
kthread_should_stop+0x10/0x1b
RSP: 0018:880223d79b00  EFLAGS: 00010202
RAX:  RBX: 88022f144000 RCX: 
RDX: 0001 RSI: 0286 RDI: 
RBP:  R08: 880223d78000 R09: 
R10: 0001 R11: dead00100100 R12: 88022f1440b0
R13: 0040 R14: a006f7fa R15: 
FS:  7fdf7fd80700() GS:88023bc8() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: ffc8 CR3: 00021c83c000 CR4: 000407e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process owserver (pid: 4631, threadinfo 880223d78000, task 880232f4e740)
Stack:
 a006ee9c 880232cc60c0 0100 
 00f1 ea000744b070 0001 88021c8a3824
 88022f144000 88021c8a3810 88022f144038 
Call Trace:
 [a006ee9c] ? w1_search+0x11d/0x188 [wire]
 [a006ef3e] ? w1_search_process_cb+0x37/0x91 [wire]
 [a006fbbc] ? w1_cn_callback+0x2fd/0x42e [wire]
 [a0034585] ? cn_rx_skb+0xb7/0xea [cn]
 [81458e29] ? netlink_unicast+0x123/0x1ae
 [814591a7] ? netlink_sendmsg+0x27d/0x2ed
 [81428229] ? sock_sendmsg+0x98/0xb5
 [8142a7db] ? sys_sendto+0xdb/0x104
 [810ef7cd] ? vfs_write+0xfa/0x141
 [810efa27] ? sys_write+0x60/0x77
 [8150e0a9] ? system_call_fastpath+0x16/0x1b
Code: ff c6 05 93 71 73 00 01 eb 06 48 89 df 5b ff e0 48 c7 c0 ea ff ff ff 5b 
c3 90 90 65 48 8b 04 25 c0 b7 00 00 48 8b 80 88 02 00 00 48 8b 40 c8 48 d1 e8 
83 e0 01 c3 f0 ff 47 10 48 8b 87 88 02 00 
 RSP 880223d79b00
CR2: ffc8
---[ end trace 3131d23f4378d60e ]---
--cut--

Regards

Sven

P.S.: Looks like this is the same bug, as the one reported at
https://bugzilla.redhat.com/show_bug.cgi?id=857954

-- 
Those who do not understand Unix are condemned to reinvent it, poorly
(Henry Spencer)

/me is giggls@ircnet, http://sven.gegg.us/ on the Web


[PATCH v2 3/3] mmc: davinci: get SG segment limits with dma_get_channel_caps()

2013-01-10 Thread Matt Porter
Replace the hardcoded values used to set max_segs/max_seg_size with
a dma_get_channel_caps() query to the dmaengine driver.

Signed-off-by: Matt Porter mpor...@ti.com
---
 drivers/mmc/host/davinci_mmc.c|   66 +
 include/linux/platform_data/mmc-davinci.h |3 --
 2 files changed, 21 insertions(+), 48 deletions(-)

diff --git a/drivers/mmc/host/davinci_mmc.c b/drivers/mmc/host/davinci_mmc.c
index 2063677..17e186d 100644
--- a/drivers/mmc/host/davinci_mmc.c
+++ b/drivers/mmc/host/davinci_mmc.c
@@ -144,18 +144,6 @@
 /* MMCSD Init clock in Hz in opendrain mode */
 #define MMCSD_INIT_CLOCK   200000
 
-/*
- * One scatterlist dma segment is at most MAX_CCNT rw_threshold units,
- * and we handle up to MAX_NR_SG segments.  MMC_BLOCK_BOUNCE kicks in only
- * for drivers with max_segs == 1, making the segments bigger (64KB)
- * than the page or two that's otherwise typical. nr_sg (passed from
- * platform data) == 16 gives at least the same throughput boost, using
- * EDMA transfer linkage instead of spending CPU time copying pages.
- */
-#define MAX_CCNT   ((1 << 16) - 1)
-
-#define MAX_NR_SG  16
-
 static unsigned rw_threshold = 32;
 module_param(rw_threshold, uint, S_IRUGO);
 MODULE_PARM_DESC(rw_threshold,
@@ -216,8 +204,6 @@ struct mmc_davinci_host {
u8 version;
/* for ns in one cycle calculation */
unsigned ns_in_one_cycle;
-   /* Number of sg segments */
-   u8 nr_sg;
 #ifdef CONFIG_CPU_FREQ
struct notifier_block   freq_transition;
 #endif
@@ -421,16 +407,7 @@ static int mmc_davinci_send_dma_request(struct mmc_davinci_host *host,
int ret = 0;
 
if (host->data_dir == DAVINCI_MMC_DATADIR_WRITE) {
-   struct dma_slave_config dma_tx_conf = {
-   .direction = DMA_MEM_TO_DEV,
-   .dst_addr = host->mem_res->start + DAVINCI_MMCDXR,
-   .dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
-   .dst_maxburst =
-   rw_threshold / DMA_SLAVE_BUSWIDTH_4_BYTES,
-   };
chan = host->dma_tx;
-   dmaengine_slave_config(host->dma_tx, &dma_tx_conf);
-
desc = dmaengine_prep_slave_sg(host->dma_tx,
data->sg,
host->sg_len,
@@ -443,16 +420,7 @@ static int mmc_davinci_send_dma_request(struct mmc_davinci_host *host,
goto out;
}
} else {
-   struct dma_slave_config dma_rx_conf = {
-   .direction = DMA_DEV_TO_MEM,
-   .src_addr = host->mem_res->start + DAVINCI_MMCDRR,
-   .src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
-   .src_maxburst =
-   rw_threshold / DMA_SLAVE_BUSWIDTH_4_BYTES,
-   };
chan = host->dma_rx;
-   dmaengine_slave_config(host->dma_rx, &dma_rx_conf);
-
desc = dmaengine_prep_slave_sg(host->dma_rx,
data->sg,
host->sg_len,
@@ -1165,6 +1133,7 @@ static int __init davinci_mmcsd_probe(struct platform_device *pdev)
struct resource *r, *mem = NULL;
int ret = 0, irq = 0;
size_t mem_size;
+   struct dmaengine_chan_caps *dma_chan_caps;
 
/* REVISIT:  when we're fully converted, fail if pdata is NULL */
 
@@ -1214,12 +1183,6 @@ static int __init davinci_mmcsd_probe(struct platform_device *pdev)
 
init_mmcsd_host(host);
 
-   if (pdata->nr_sg)
-   host->nr_sg = pdata->nr_sg - 1;
-
-   if (host->nr_sg > MAX_NR_SG || !host->nr_sg)
-   host->nr_sg = MAX_NR_SG;
-
host->use_dma = use_dma;
host->mmc_irq = irq;
host->sdio_irq = platform_get_irq(pdev, 1);
@@ -1248,14 +1211,27 @@ static int __init davinci_mmcsd_probe(struct platform_device *pdev)
mmc->caps |= pdata->caps;
mmc->ocr_avail = MMC_VDD_32_33 | MMC_VDD_33_34;
 
-   /* With no iommu coalescing pages, each phys_seg is a hw_seg.
-* Each hw_seg uses one EDMA parameter RAM slot, always one
-* channel and then usually some linked slots.
-*/
-   mmc->max_segs   = MAX_NR_SG;
+   {
+   struct dma_slave_config dma_txrx_conf = {
+   .src_addr = host->mem_res->start + DAVINCI_MMCDRR,
+   .src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
+   .src_maxburst =
+   rw_threshold / DMA_SLAVE_BUSWIDTH_4_BYTES,
+   .dst_addr = host->mem_res->start + DAVINCI_MMCDXR,
+   .dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
+   .dst_maxburst =
+   rw_threshold / DMA_SLAVE_BUSWIDTH_4_BYTES,
+   };
+   dmaengine_slave_config(host->dma_tx, &dma_txrx_conf);
+   

Re: [PATCH 05/14] lib: Add I/O map cache implementation

2013-01-10 Thread Thierry Reding
On Thu, Jan 10, 2013 at 07:55:05PM +0100, Thierry Reding wrote:
> On Thu, Jan 10, 2013 at 11:20:07AM -0700, Jason Gunthorpe wrote:
> > On Thu, Jan 10, 2013 at 11:25:44AM +0100, Thierry Reding wrote:
> > > On Thu, Jan 10, 2013 at 09:17:19AM +0000, Arnd Bergmann wrote:
> > > > On Thursday 10 January 2013, Thierry Reding wrote:
> > > > > On Wed, Jan 09, 2013 at 04:17:58PM -0700, Jason Gunthorpe wrote:
> > > > > > On Wed, Jan 09, 2013 at 04:12:31PM -0700, Stephen Warren wrote:
> > > > > > You could decrease the size of the mapping to only span the bus
> > > > > > numbers that are configured for use via DT.
> > > > >
> > > > > That won't work, unfortunately. The mapping is such that the bus number
> > > > > is not encoded in the uppermost bits, the extended register number is.
> > > > > So the only thing that we could do is decrease the size of the extended
> > > > > register space for *all* devices.
> > > >
> > > > But you could still use a method to map 16 separate areas per bus, each 65536
> > > > bytes long, which results in 1MB per bus. That is probably ok, since
> > > > very few systems have more than a handful of buses in practice.
> > > >
> > > > In theory, doing static mappings on a per-page base would let you
> > > > do 16 devices at a time, but it's probably worth doing at this fine
> > > > granularity.
> > >
> > > I don't understand how this would help. The encoding is like this:
> > >
> > > [27:24] extended register number
> > > [23:16] bus number
> > > [15:11] device number
> > > [10: 8] function number
> > > [ 7: 0] register number
> > >
> > > So it doesn't matter whether I use separate areas per bus or not. As
> > > soon as the whole extended configuration space needs to be accessed a
> > > whopping 28 bits (256 MiB) are required.
> >
> > You'd piece a mapping together, each bus requires 16 64k mappings, a
> > simple 2d array of busnr*16 of pointers would do the trick. A more
> > clever solution would be to allocate contiguous virtual memory and
> > split that up..
>
> Oh, I see. I'm not very familiar with the internals of remapping, so
> I'll need to do some more reading. Thanks for the hints.

I forgot to ask. What's the advantage of having a contiguous virtual
memory area and splitting it up versus remapping each chunk separately?

Thierry




[PATCH v2] gpio: vt8500: memory cleanup missing

2013-01-10 Thread Tony Prisk
This driver is missing a .remove callback, and the fail path on
probe is incomplete.

If an error occurs in vt8500_add_chips, gpio_base is not unmapped.
The driver is also ignoring the return value from this function so
if a chip fails to register it completes as successful.

Replaced pr_err with dev_err in vt8500_add_chips since the device is
available.

There is also no .remove callback defined. To allow removing the
registered chips, the vtchip pointer is now stored in the platform
device's driver data.

Signed-off-by: Tony Prisk li...@prisktech.co.nz
---
v2:
Remove global variable and use platform_set_drvdata instead.

 drivers/gpio/gpio-vt8500.c |   51 +++-
 1 file changed, 41 insertions(+), 10 deletions(-)

diff --git a/drivers/gpio/gpio-vt8500.c b/drivers/gpio/gpio-vt8500.c
index b53320a..87e59b5 100644
--- a/drivers/gpio/gpio-vt8500.c
+++ b/drivers/gpio/gpio-vt8500.c
@@ -233,10 +233,12 @@ static int vt8500_add_chips(struct platform_device *pdev, void __iomem *base,
sizeof(struct vt8500_gpio_chip) * data->num_banks,
GFP_KERNEL);
if (!vtchip) {
-   pr_err("%s: failed to allocate chip memory\n", __func__);
+   dev_err(&pdev->dev, "failed to allocate chip memory\n");
return -ENOMEM;
}
 
+   platform_set_drvdata(pdev, vtchip);
+
for (i = 0; i < data->num_banks; i++) {
vtchip[i].base = base;
vtchip[i].regs = &data->banks[i];
@@ -261,6 +263,7 @@ static int vt8500_add_chips(struct platform_device *pdev, 
void __iomem *base,
 
gpiochip_add(chip);
}
+
return 0;
 }
 
@@ -273,36 +276,64 @@ static struct of_device_id vt8500_gpio_dt_ids[] = {
 
 static int vt8500_gpio_probe(struct platform_device *pdev)
 {
+   int ret;
void __iomem *gpio_base;
-   struct device_node *np;
+   struct device_node *np = pdev->dev.of_node;
const struct of_device_id *of_id =
of_match_device(vt8500_gpio_dt_ids, &pdev->dev);
 
-   if (!of_id) {
-   dev_err(&pdev->dev, "Failed to find gpio controller\n");
+   if (!np) {
+   dev_err(&pdev->dev, "GPIO node missing in devicetree\n");
return -ENODEV;
}
 
-   np = pdev->dev.of_node;
-   if (!np) {
-   dev_err(&pdev->dev, "Missing GPIO description in devicetree\n");
-   return -EFAULT;
+   if (!of_id) {
+   dev_err(&pdev->dev, "No matching driver data\n");
+   return -ENODEV;
}
 
gpio_base = of_iomap(np, 0);
if (!gpio_base) {
dev_err(&pdev->dev, "Unable to map GPIO registers\n");
-   of_node_put(np);
return -ENOMEM;
}
 
-   vt8500_add_chips(pdev, gpio_base, of_id->data);
+   ret = vt8500_add_chips(pdev, gpio_base, of_id->data);
+   if (ret) {
+   iounmap(gpio_base);
+   return ret;
+   }
+
+   return 0;
+}
+
+static int vt8500_gpio_remove(struct platform_device *pdev)
+{
+   int i;
+   int ret;
+   const struct vt8500_gpio_data *data;
+   struct vt8500_gpio_chip *vtchip = platform_get_drvdata(pdev);
+   void __iomem *gpio_base = vtchip[0].base;
+   const struct of_device_id *of_id =
+   of_match_device(vt8500_gpio_dt_ids, &pdev->dev);
+
+   data = of_id->data;
+
+   for (i = 0; i < data->num_banks; i++) {
+   ret = gpiochip_remove(&vtchip[i].chip);
+   if (ret)
+   dev_warn(&pdev->dev, "gpiochip_remove returned %d\n",
+ret);
+   }
+
+   iounmap(gpio_base);
 
return 0;
 }
 
 static struct platform_driver vt8500_gpio_driver = {
.probe  = vt8500_gpio_probe,
+   .remove = vt8500_gpio_remove,
.driver = {
.name   = "vt8500-gpio",
.owner  = THIS_MODULE,
-- 
1.7.9.5



[PATCH] Remove __dev* markings from init.h

2013-01-10 Thread Greg Kroah-Hartman
Note, the below patch is now in my driver-core.next tree, and should go
to Linus before 3.8-final is out.  Just a heads-up for anyone who is
wondering what happened to the __dev* markings in the kernel.


From: Greg Kroah-Hartman gre...@linuxfoundation.org
Subject: [PATCH] Remove __dev* markings from init.h

Now that all in-kernel users of __dev* are gone, let's remove them from
init.h to keep them from popping up again and again.

Thanks to Bill Pemberton for doing all of the hard work to make removal
of this possible.

Cc: Bill Pemberton wf...@virginia.edu
Cc: Stephen Rothwell s...@canb.auug.org.au
Signed-off-by: Greg Kroah-Hartman gre...@linuxfoundation.org
---
 include/linux/init.h | 20 
 1 file changed, 20 deletions(-)



diff --git a/include/linux/init.h b/include/linux/init.h
index a799273..10ed4f4 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -93,14 +93,6 @@
 
 #define __exit  __section(.exit.text) __exitused __cold notrace
 
-/* Used for HOTPLUG, but that is always enabled now, so just make them noops */
-#define __devinit
-#define __devinitdata
-#define __devinitconst
-#define __devexit
-#define __devexitdata
-#define __devexitconst
-
 /* Used for HOTPLUG_CPU */
 #define __cpuinit        __section(.cpuinit.text) __cold notrace
 #define __cpuinitdata    __section(.cpuinit.data)
@@ -337,18 +329,6 @@ void __init parse_early_options(char *cmdline);
 #define __INITRODATA_OR_MODULE __INITRODATA
 #endif /*CONFIG_MODULES*/
 
-/* Functions marked as __devexit may be discarded at kernel link time, depending
-   on config options.  Newer versions of binutils detect references from
-   retained sections to discarded sections and flag an error.  Pointers to
-   __devexit functions must use __devexit_p(function_name), the wrapper will
-   insert either the function_name or NULL, depending on the config options.
- */
-#if defined(MODULE) || defined(CONFIG_HOTPLUG)
-#define __devexit_p(x) x
-#else
-#define __devexit_p(x) NULL
-#endif
-
 #ifdef MODULE
 #define __exit_p(x) x
 #else
-- 
1.8.1.rc1.5.g7e0651a



Re: [PATCH v6 00/12] iommu/exynos: Fixes and Enhancements of System MMU driver with DT

2013-01-10 Thread 'Joerg Roedel'
On Thu, Jan 10, 2013 at 10:34:48AM -0800, Kukjin Kim wrote:
 Hmm, I think, just one [7/12] patch does matter so if you could create topic
 branch and apply [7/12] patch firstly before other drivers/ changes would be
 better to me. It's OK on both trees if I just _merge_ the first [7/12]
 commit in the topic branch you provided for exynos-iommu. But you know, you
 don't have to rebase it once I merge it in Samsung tree.
 
 How about?

Sounds good. Should I take the patch directly from this post? I would
add your Acked-by then and make it the base for the arm/exynos branch.
Once the branch is set-up, as usual it will not be rebased.


Joerg




Re: [PATCH V3 1/2] virtio-net: fix the set affinity bug when CPU IDs are not consecutive

2013-01-10 Thread Ben Hutchings
On Thu, 2013-01-10 at 11:19 +1030, Rusty Russell wrote:
 Wanlong Gao gaowanl...@cn.fujitsu.com writes:
  On 01/09/2013 07:31 AM, Rusty Russell wrote:
  Wanlong Gao gaowanl...@cn.fujitsu.com writes:
*/
   static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
   {
  - int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
  -   smp_processor_id();
  + int txq = 0;
  +
  + if (skb_rx_queue_recorded(skb))
  + txq = skb_get_rx_queue(skb);
  + else if ((txq = per_cpu(vq_index, smp_processor_id())) == -1)
  + txq = 0;
  
  You should use __get_cpu_var() instead of smp_processor_id() here, ie:
  
  else if ((txq = __get_cpu_var(vq_index)) == -1)
  
  And AFAICT, no reason to initialize txq to 0 to start with.
  
  So:
  
  int txq;
  
  if (skb_rx_queue_recorded(skb))
 txq = skb_get_rx_queue(skb);
  else {
  txq = __get_cpu_var(vq_index);
  if (txq == -1)
  txq = 0;
  }
 
  Got it, thank you.
 
  
  Now, just to confirm, I assume this can happen even if we use vq_index,
  right, because of races with virtnet_set_channels?
 
  I still can't understand this race, could you explain more? thank you.
 
 I assume that someone can call virtnet_set_channels() while we are
 inside virtnet_select_queue(), so they reduce dev->real_num_tx_queues,
 causing virtnet_set_channels to do:
 
   while (unlikely(txq >= dev->real_num_tx_queues))
   txq -= dev->real_num_tx_queues;
 
 Otherwise, when is this loop called?

In fact, this race can result in the TX scheduler using a queue that has
been disabled, or other weirdness (consider what happens if
real_num_tx_queues increases between those two uses).

virtnet_set_channels() really must disable TX temporarily:

netif_tx_lock(dev);
netif_device_detach(dev);
netif_tx_unlock(dev);
...
netif_device_attach(dev);
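
For reference, the queue selection being discussed can be modelled in plain C. This is only a minimal userspace sketch: select_txq() and its parameters are hypothetical stand-ins for the skb, per-cpu vq_index and dev->real_num_tx_queues values, not the driver's actual interface. It shows the fallback to queue 0 when the per-cpu index is -1 and the wrap-around loop that runs when the chosen index is no longer below the current queue count.

```c
#include <stdbool.h>

/*
 * Userspace model of the TX queue choice from the thread.
 * All names are hypothetical; real_num_tx_queues must be > 0,
 * otherwise the wrap-around loop never terminates.
 */
static int select_txq(bool rx_recorded, int rx_queue,
                      int cpu_vq_index, int real_num_tx_queues)
{
    int txq;

    if (rx_recorded) {
        /* Reuse the queue the packet was received on. */
        txq = rx_queue;
    } else {
        /* Fall back to the per-cpu index, or 0 if none is assigned. */
        txq = cpu_vq_index;
        if (txq == -1)
            txq = 0;
    }

    /* Wrap the index if the number of active queues has shrunk. */
    while (txq >= real_num_tx_queues)
        txq -= real_num_tx_queues;

    return txq;
}
```

The model is single-threaded, so it cannot show the race itself: it only illustrates why the loop exists. A concurrent change of the queue count between the comparison and the subtraction is what the detach/attach sequence above is meant to rule out.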

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



Re: [RFC PATCH v3 04/16] ARM: edma: add DT and runtime PM support for AM33XX

2013-01-10 Thread Matt Porter
On Sun, Oct 28, 2012 at 04:33:39PM +0530, Sekhar Nori wrote:
 On 10/18/2012 6:56 PM, Matt Porter wrote:
  Adds support for parsing the TI EDMA DT data into the required
  EDMA private API platform data.
  
  Calls runtime PM API only in the DT case in order to unidle the
  associated hwmods on AM33XX.
 
 Runtime PM is supported on DaVinci now, so if that was the reason for
 this choice, then it doesn't need to be that way.

Thanks, fixed this for v4.

  
  Signed-off-by: Matt Porter mpor...@ti.com
  ---
   arch/arm/common/edma.c  |  255 
  +--
   arch/arm/mach-davinci/board-da830-evm.c |4 +-
   arch/arm/mach-davinci/board-da850-evm.c |8 +-
   arch/arm/mach-davinci/board-dm646x-evm.c|4 +-
   arch/arm/mach-davinci/board-omapl138-hawk.c |8 +-
   arch/arm/mach-davinci/devices-da8xx.c   |8 +-
   arch/arm/mach-davinci/devices-tnetv107x.c   |4 +-
   arch/arm/mach-davinci/dm355.c   |4 +-
   arch/arm/mach-davinci/dm365.c   |4 +-
   arch/arm/mach-davinci/dm644x.c  |4 +-
   arch/arm/mach-davinci/dm646x.c  |4 +-
   include/linux/platform_data/edma.h  |8 +-
   12 files changed, 272 insertions(+), 43 deletions(-)
  
  diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
  index a3d189d..6d2a590 100644
  --- a/arch/arm/common/edma.c
  +++ b/arch/arm/common/edma.c
  @@ -24,6 +24,13 @@
   #include <linux/platform_device.h>
   #include <linux/io.h>
   #include <linux/slab.h>
  +#include <linux/edma.h>
  +#include <linux/err.h>
  +#include <linux/of_address.h>
  +#include <linux/of_device.h>
  +#include <linux/of_dma.h>
  +#include <linux/of_irq.h>
  +#include <linux/pm_runtime.h>
   
   #include <linux/platform_data/edma.h>
   
  @@ -1366,31 +1373,237 @@ void edma_clear_event(unsigned channel)
   EXPORT_SYMBOL(edma_clear_event);
   
   /*---*/
  +static int edma_of_read_u32_to_s8_array(const struct device_node *np,
  +const char *propname, s8 *out_values,
  +size_t sz)
  +{
  +   struct property *prop = of_find_property(np, propname, NULL);
  +   const __be32 *val;
  +
  +   if (!prop)
  +   return -EINVAL;
  +   if (!prop->value)
  +   return -ENODATA;
  +   if ((sz * sizeof(u32)) > prop->length)
  +   return -EOVERFLOW;
  +
  +   val = prop->value;
  +
  +   while (sz--)
  +   *out_values++ = (s8)(be32_to_cpup(val++) & 0xff);
  +
  +   /* Terminate it */
  +   *out_values++ = -1;
  +   *out_values++ = -1;
  +
  +   return 0;
  +}
  +
  +static int edma_of_read_u32_to_s16_array(const struct device_node *np,
  +const char *propname, s16 *out_values,
  +size_t sz)
  +{
  +   struct property *prop = of_find_property(np, propname, NULL);
  +   const __be32 *val;
  +
  +   if (!prop)
  +   return -EINVAL;
  +   if (!prop->value)
  +   return -ENODATA;
  +   if ((sz * sizeof(u32)) > prop->length)
  +   return -EOVERFLOW;
  +
  +   val = prop->value;
  +
  +   while (sz--)
  +   *out_values++ = (s16)(be32_to_cpup(val++) & 0xffff);
  +
  +   /* Terminate it */
  +   *out_values++ = -1;
  +   *out_values++ = -1;
  +
  +   return 0;
  +}
 
 I think these helper functions will have some general use beyond EDMA
 and can be kept in drivers/of/base.c. Grant/Rob need to agree though.

I expect these to go away before too long as I expect to rewrite all of
these data structures when the private edma api code is folded into
drivers/dma/edma.c. When that happens, I would like to use some data
structures that lend themselves to the DT property value. Given that,
let's wait until there is another user to move these helpers to
drivers/of/.
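
For anyone reading along without the tree at hand, here is a minimal userspace sketch of what the s8 helper quoted above does: it walks big-endian 32-bit cells, keeps the low byte of each, and appends two -1 terminators. The names be32_cell() and cells_to_s8() are made up for this illustration, and the plain -1 error return stands in for the kernel's error codes.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit cell, the way DT property data is stored. */
static uint32_t be32_cell(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical analogue of edma_of_read_u32_to_s8_array():
 * copy the low byte of each of sz cells into out, then append
 * two -1 terminators. out must have room for sz + 2 entries.
 * Returns 0 on success, -1 if the property is too short.
 */
static int cells_to_s8(const uint8_t *prop, size_t prop_len,
                       int8_t *out, size_t sz)
{
    if (sz * 4 > prop_len)
        return -1;

    while (sz--) {
        *out++ = (int8_t)(be32_cell(prop) & 0xff);
        prop += 4;
    }

    /* Terminate the list the same way the quoted helper does. */
    *out++ = -1;
    *out++ = -1;

    return 0;
}
```

The s16 variant differs only in the output type and the mask.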

  diff --git a/arch/arm/mach-davinci/board-da830-evm.c 
  b/arch/arm/mach-davinci/board-da830-evm.c
  index 95b5e10..ffcbec1 100644
  --- a/arch/arm/mach-davinci/board-da830-evm.c
  +++ b/arch/arm/mach-davinci/board-da830-evm.c
  @@ -512,7 +512,7 @@ static struct davinci_i2c_platform_data 
  da830_evm_i2c_0_pdata = {
* example: Timer, GPIO, UART events etc) on da830/omap-l137 EVM, hence
* they are being reserved for codecs on the DSP side.
*/
  -static const s16 da830_dma_rsv_chans[][2] = {
  +static s16 da830_dma_rsv_chans[][2] = {
 
 I wonder why you had to remove const here and in other places. You seem
 to be allocating new memory for DT case anyway. It's also not a good idea
 to modify the passed platform data.

Good catch. Although I was never modifying pdata itself, at one point
the DT parsing code was modifying pointers that were const and I removed
the const. I've since fixed that issue since we'd like them to be const
as you point out. As a result, I've removed all these changes from v4 so
all those files aren't touched.

-Matt

Re: linux-next: Tree for Jan 10 (vmci)

2013-01-10 Thread Randy Dunlap
On 01/09/13 19:32, Stephen Rothwell wrote:
 Hi all,
 
 Changes since 20130109:
 


on i386, when CONFIG_PCI is not enabled:

  CC [M]  drivers/misc/vmw_vmci/vmci_guest.o
drivers/misc/vmw_vmci/vmci_guest.c:58:20: error: array type has incomplete element type
drivers/misc/vmw_vmci/vmci_guest.c: In function 'vmci_enable_msix':
drivers/misc/vmw_vmci/vmci_guest.c:384:2: error: implicit declaration of function 'pci_enable_msix' [-Werror=implicit-function-declaration]
drivers/misc/vmw_vmci/vmci_guest.c: In function 'vmci_guest_probe_device':
drivers/misc/vmw_vmci/vmci_guest.c:466:2: error: implicit declaration of function 'pcim_enable_device' [-Werror=implicit-function-declaration]
drivers/misc/vmw_vmci/vmci_guest.c:592:2: error: implicit declaration of function 'pci_enable_msi' [-Werror=implicit-function-declaration]
drivers/misc/vmw_vmci/vmci_guest.c:654:3: error: implicit declaration of function 'pci_disable_msix' [-Werror=implicit-function-declaration]
drivers/misc/vmw_vmci/vmci_guest.c:656:3: error: implicit declaration of function 'pci_disable_msi' [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
make[4]: *** [drivers/misc/vmw_vmci/vmci_guest.o] Error 1


Should all of vmci depend on PCI ??



-- 
~Randy


checkpatch.pl error: Use of uninitialized value $max (line 3407,3410)

2013-01-10 Thread Guenter Roeck
Hi all,

I get the following error when running checkpatch.pl:

scripts/checkpatch.pl -f drivers/hwmon/da9055-hwmon.c
Use of uninitialized value $max in string eq at scripts/checkpatch.pl line 3407.
Use of uninitialized value $max in pattern match (m//) at scripts/checkpatch.pl line 3410.
total: 0 errors, 0 warnings, 336 lines checked

drivers/hwmon/da9055-hwmon.c has no obvious style problems and is ready for
submission.

The checkpatch code creating the error was introduced with commit 4a273195
(checkpatch: check usleep_range() arguments).

Is this a bug in checkpatch, or is something wrong in my environment?

Thanks,
Guenter


Re: [RFC PATCH v3 08/16] ARM: dts: add AM33XX EDMA support

2013-01-10 Thread Matt Porter
On Sun, Oct 28, 2012 at 04:46:36PM +0530, Sekhar Nori wrote:
 On 10/18/2012 6:56 PM, Matt Porter wrote:
  Adds AM33XX EDMA support to the am33xx.dtsi as documented in
  Documentation/devicetree/bindings/dma/ti-edma.txt
  
  Signed-off-by: Matt Porter mpor...@ti.com
  ---
   arch/arm/boot/dts/am33xx.dtsi |   31 +++
   1 file changed, 31 insertions(+)
  
  diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
  index bb31bff..ab9c78f 100644
  --- a/arch/arm/boot/dts/am33xx.dtsi
  +++ b/arch/arm/boot/dts/am33xx.dtsi
  @@ -62,6 +62,37 @@
  reg = 0x4820 0x1000;
  };
   
  +   edma: edma@4900 {
  +   compatible = ti,edma3;
  +   ti,hwmods = tpcc, tptc0, tptc1, tptc2;
  +   reg =   0x4900 0x1,
  +   0x44e10f90 0x10;
  +   interrupt-parent = intc;
  +   interrupts = 12 13 14;
  +   #dma-cells = 1;
  +   dma-channels = 64;
  +   ti,edma-regions = 4;
  +   ti,edma-slots = 256;
  +   ti,edma-reserved-channels = 0  2
  +14 2
  +26 6
  +48 4
  +56 8;
  +   ti,edma-reserved-slots = 0  2
  + 14 2
  + 26 6
  + 48 4
  + 56 8
  + 64 127;
 
 No need to reserve any channels or slots on AM335x, I think. This is
 used on DaVinci devices to share channels with DSP. I am not sure the
 cortex-M3 or PRU on the AM335x need to (or even can) have EDMA access.

I agree. I'm dropping this from the .dtsi in v4 as it is board/application
specific. The PRU, at least, can use the EDMA and I've seen examples as
such, but we can't hardcode this. The feature is there and documented in
the binding if somebody needs to reserve channels in their .dts.

-Matt


Re: [PATCH 05/14] lib: Add I/O map cache implementation

2013-01-10 Thread Jason Gunthorpe
On Thu, Jan 10, 2013 at 08:03:27PM +0100, Thierry Reding wrote:

   You'd piece a mapping together, each bus requires 16 64k mappings, a
   simple 2d array of busnr*16 of pointers would do the trick. A more
   clever solution would be to allocate contiguous virtual memory and
   split that up..
 
  Oh, I see. I'm not very familiar with the internals of remapping, so
  I'll need to do some more reading. Thanks for the hints.
 
 I forgot to ask. What's the advantage of having a contiguous virtual
 memory area and splitting it up versus remapping each chunk separately?

Not a lot, really, but it saves you from the pointer array and
associated overhead. IIRC it is fairly easy to do in the kernel.

Arnd's version is good too, but you would be restricted to aligned
powers of two for the bus number range in the DT, which is probably
not that big a deal either?
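
To make the two options concrete, here is a rough sketch of the index arithmetic. It assumes the upper address fields are a 4-bit extended register number in bits [27:24] and an 8-bit bus number in bits [23:16]; those fields are an assumption based on the "28 bits" and "16 64k mappings" figures, since only the device, function and register fields are quoted here. All function names are invented for the illustration.

```c
#include <stdint.h>

#define CHUNK_SHIFT    16   /* each mapping covers 64 KiB */
#define CHUNKS_PER_BUS 16   /* "16 64k mappings" per bus */

/* Index into a busnr*16 table of separately remapped chunks. */
static unsigned int chunk_index(unsigned int bus, unsigned int ext_reg)
{
    return bus * CHUNKS_PER_BUS + ext_reg;
}

/* Offset inside one 64 KiB chunk: [15:11] dev, [10:8] fn, [7:0] reg. */
static uint32_t chunk_offset(unsigned int dev, unsigned int fn,
                             unsigned int reg)
{
    return ((dev & 0x1f) << 11) | ((fn & 0x7) << 8) | (reg & 0xff);
}

/*
 * With one contiguous virtual area split into 64 KiB pieces, the
 * pointer table goes away and the lookup becomes pure arithmetic.
 */
static uint32_t contiguous_offset(unsigned int bus, unsigned int ext_reg,
                                  unsigned int dev, unsigned int fn,
                                  unsigned int reg)
{
    return ((uint32_t)chunk_index(bus, ext_reg) << CHUNK_SHIFT) |
           chunk_offset(dev, fn, reg);
}
```

With separate mappings, chunk_index() selects a pointer and chunk_offset() is added to it; with the contiguous area, contiguous_offset() is added to a single base address.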

Jason


Re: [RFC PATCH v3 11/16] mmc: omap_hsmmc: limit max_segs with the EDMA DMAC

2013-01-10 Thread Matt Porter
On Mon, Oct 29, 2012 at 01:48:46PM +0530, Sekhar Nori wrote:
 On 10/18/2012 6:56 PM, Matt Porter wrote:
  The EDMA DMAC has a hardware limitation that prevents supporting
  scatter gather lists with any number of segments. Since the EDMA
  DMA Engine driver sets the maximum segments to 16, we do the
  same.
  
  TODO: this will be replaced once the DMA Engine API supports an
  API to query the DMAC's segment size limit.
  
  Signed-off-by: Matt Porter mpor...@ti.com
  ---
   drivers/mmc/host/omap_hsmmc.c |   10 ++
   1 file changed, 10 insertions(+)
  
  diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
  index b327cd0..52bab01 100644
  --- a/drivers/mmc/host/omap_hsmmc.c
  +++ b/drivers/mmc/host/omap_hsmmc.c
  @@ -1828,6 +1828,16 @@ static int __devinit omap_hsmmc_probe(struct 
  platform_device *pdev)
   * as we want. */
  mmc->max_segs = 1024;
   
  +   /* Eventually we should get our max_segs limitation for EDMA by
  +* querying the dmaengine API */
 
 Nit picking: This is not as per multi-line comment style in
 Documentation/CodingStyle.

Thanks :). This is dropped from v4 anyway, as I now use a call to
dma_get_channel_caps() to determine the SG limits.

-Matt


Re: linux-next: Tree for Jan 10 (vmci)

2013-01-10 Thread Randy Dunlap
On 01/10/13 11:17, Randy Dunlap wrote:
 On 01/09/13 19:32, Stephen Rothwell wrote:
 Hi all,

 Changes since 20130109:

 
 
 on i386, when CONFIG_PCI is not enabled:
 
   CC [M]  drivers/misc/vmw_vmci/vmci_guest.o
 drivers/misc/vmw_vmci/vmci_guest.c:58:20: error: array type has incomplete element type
 drivers/misc/vmw_vmci/vmci_guest.c: In function 'vmci_enable_msix':
 drivers/misc/vmw_vmci/vmci_guest.c:384:2: error: implicit declaration of function 'pci_enable_msix' [-Werror=implicit-function-declaration]
 drivers/misc/vmw_vmci/vmci_guest.c: In function 'vmci_guest_probe_device':
 drivers/misc/vmw_vmci/vmci_guest.c:466:2: error: implicit declaration of function 'pcim_enable_device' [-Werror=implicit-function-declaration]
 drivers/misc/vmw_vmci/vmci_guest.c:592:2: error: implicit declaration of function 'pci_enable_msi' [-Werror=implicit-function-declaration]
 drivers/misc/vmw_vmci/vmci_guest.c:654:3: error: implicit declaration of function 'pci_disable_msix' [-Werror=implicit-function-declaration]
 drivers/misc/vmw_vmci/vmci_guest.c:656:3: error: implicit declaration of function 'pci_disable_msi' [-Werror=implicit-function-declaration]
 cc1: some warnings being treated as errors
 make[4]: *** [drivers/misc/vmw_vmci/vmci_guest.o] Error 1
 
 
 Should all of vmci depend on PCI ??
 
 


Also on i386, when CONFIG_BLOCK is not enabled:

  CC [M]  drivers/misc/vmw_vmci/vmci_queue_pair.o
In file included from drivers/misc/vmw_vmci/vmci_queue_pair.c:16:0:
include/linux/device-mapper.h:48:56: warning: 'struct bio' declared inside 
parameter list [enabled by default]
include/linux/device-mapper.h:48:56: warning: its scope is only this definition 
or declaration, which is probably not what you want [enabled by default]
include/linux/device-mapper.h:50:13: warning: 'struct request' declared inside 
parameter list [enabled by default]
include/linux/device-mapper.h:61:15: warning: 'struct bio' declared inside 
parameter list [enabled by default]
include/linux/device-mapper.h:64:15: warning: 'struct request' declared inside 
parameter list [enabled by default]
include/linux/device-mapper.h:80:15: warning: 'struct bvec_merge_data' declared 
inside parameter list [enabled by default]
include/linux/device-mapper.h:92:12: warning: 'struct queue_limits' declared 
inside parameter list [enabled by default]
include/linux/device-mapper.h:265:13: error: field 'clone' has incomplete type
include/linux/device-mapper.h: In function 'dm_bio_get_target_request_nr':
include/linux/device-mapper.h:280:9: warning: initialization from incompatible 
pointer type [enabled by default]
include/linux/device-mapper.h: At top level:
include/linux/device-mapper.h:376:42: warning: 'struct request' declared inside 
parameter list [enabled by default]
include/linux/device-mapper.h:563:33: warning: 'struct request' declared inside 
parameter list [enabled by default]
include/linux/device-mapper.h:564:41: warning: 'struct request' declared inside 
parameter list [enabled by default]
include/linux/device-mapper.h:565:38: warning: 'struct request' declared inside 
parameter list [enabled by default]
drivers/misc/vmw_vmci/vmci_queue_pair.c: In function 'qp_alloc_ppn_set':
drivers/misc/vmw_vmci/vmci_queue_pair.c:476:6: error: implicit declaration of 
function 'kmalloc' [-Werror=implicit-function-declaration]
drivers/misc/vmw_vmci/vmci_queue_pair.c:475:15: warning: assignment makes 
pointer from integer without a cast [enabled by default]
drivers/misc/vmw_vmci/vmci_queue_pair.c:480:15: warning: assignment makes 
pointer from integer without a cast [enabled by default]
drivers/misc/vmw_vmci/vmci_queue_pair.c:483:3: error: implicit declaration of 
function 'kfree' [-Werror=implicit-function-declaration]
drivers/misc/vmw_vmci/vmci_queue_pair.c: In function 'qp_host_alloc_queue':
drivers/misc/vmw_vmci/vmci_queue_pair.c:619:2: error: implicit declaration of 
function 'kzalloc' [-Werror=implicit-function-declaration]
drivers/misc/vmw_vmci/vmci_queue_pair.c:619:8: warning: assignment makes 
pointer from integer without a cast [enabled by default]
drivers/misc/vmw_vmci/vmci_queue_pair.c: In function 'qp_release_pages':
drivers/misc/vmw_vmci/vmci_queue_pair.c:717:3: error: implicit declaration of 
function 'page_cache_release' [-Werror=implicit-function-declaration]
drivers/misc/vmw_vmci/vmci_queue_pair.c: In function 'qp_guest_endpoint_create':
drivers/misc/vmw_vmci/vmci_queue_pair.c:975:8: warning: assignment makes 
pointer from integer without a cast [enabled by default]
drivers/misc/vmw_vmci/vmci_queue_pair.c: In function 'qp_alloc_hypercall':
drivers/misc/vmw_vmci/vmci_queue_pair.c:1033:12: warning: assignment makes 
pointer from integer without a cast [enabled by default]
drivers/misc/vmw_vmci/vmci_queue_pair.c: In function 'qp_broker_create':
drivers/misc/vmw_vmci/vmci_queue_pair.c:1396:8: warning: assignment makes 
pointer from integer without a cast [enabled by default]
drivers/misc/vmw_vmci/vmci_queue_pair.c:1450:3: error: implicit declaration 

Re: [PATCH 0/5] x86,smp: make ticket spinlock proportional backoff w/ auto tuning

2013-01-10 Thread Mike Galbraith
On Thu, 2013-01-10 at 10:31 -0500, Rik van Riel wrote: 
 On 01/10/2013 10:19 AM, Mike Galbraith wrote:
  On Tue, 2013-01-08 at 17:26 -0500, Rik van Riel wrote:
 
  Please let me know if you manage to break this code in any way,
  so I can fix it...
 
  I didn't break it, but did let it play with rq->lock contention.  Using
  cyclictest -Smp99 -i 100 -d 0, with 3 rt tasks for pull_rt_task() to
  pull around appears to have been a ~dead heat.
 
 Good to hear that the code seems to be robust. It seems to
 help prevent performance degradation in some workloads, and
 nobody seems to have found regressions yet.

I had hoped for a bit of positive, but a wash isn't surprising given the
profile.  I tried tbench too, didn't expect to see anything at all
there, and got that.. so both results are positive in that respect.

-Mike



Re: [PATCH v3 1/2] pstore: Avoid deadlock in panic and emergency-restart path

2013-01-10 Thread Tony Luck
On Thu, Jan 10, 2013 at 10:23 AM, Seiji Aguchi seiji.agu...@hds.com wrote:
 Please apply these to your tree.

Ok. Applied and pushed to my next branch. Should show up in linux-next
in the next day or two.

-Tony


Re: [RFC PATCH v3 16/16] ARM: dts: add AM33XX SPI support

2013-01-10 Thread Matt Porter
On Sun, Oct 28, 2012 at 05:01:29PM +0530, Sekhar Nori wrote:
 On 10/18/2012 6:56 PM, Matt Porter wrote:
  Adds AM33XX SPI support for am335x-bone and am335x-evm.
  
  Signed-off-by: Matt Porter mpor...@ti.com
  ---
   arch/arm/boot/dts/am335x-bone.dts |   17 +++
   arch/arm/boot/dts/am335x-evm.dts  |9 
   arch/arm/boot/dts/am33xx.dtsi |   43 
  +
   3 files changed, 69 insertions(+)
  
  diff --git a/arch/arm/boot/dts/am335x-bone.dts 
  b/arch/arm/boot/dts/am335x-bone.dts
  index 5510979..23edfd8 100644
  --- a/arch/arm/boot/dts/am335x-bone.dts
  +++ b/arch/arm/boot/dts/am335x-bone.dts
  @@ -18,6 +18,17 @@
  reg = 0x8000 0x1000; /* 256 MB */
  };
   
  +   am3358_pinmux: pinmux@44e10800 {
  +   spi1_pins: pinmux_spi1_pins {
  +   pinctrl-single,pins = 
  +   0x190 0x13  /* mcasp0_aclkx.spi1_sclk, 
  OUTPUT_PULLUP | MODE3 */
  +   0x194 0x33  /* mcasp0_fsx.spi1_d0, 
  INPUT_PULLUP | MODE3 */
  +   0x198 0x13  /* mcasp0_axr0.spi1_d1, 
  OUTPUT_PULLUP | MODE3 */
  +   0x19c 0x13  /* mcasp0_ahclkr.spi1_cs0, 
  OUTPUT_PULLUP | MODE3 */
  +   ;
 
 Is there a single pinmux setting that provides SPI functionality on the
 bone headers? Or this is specific to a cape you tested with?

No, there are two usable settings for spi1 and one setting for spi0.
I'm dropping this from the series since it's specific to how I wired up
the homebrew cape I use for spi testing on the Bone. The cover letter
points to the branch that holds all of these extra test-specific patches
(which aren't intended to be merged). Anybody that needs context of
how/what worked and was tested can grab them there.

-Matt


RE: [PATCH v6 00/12] iommu/exynos: Fixes and Enhancements of System MMU driver with DT

2013-01-10 Thread Kukjin Kim
'Joerg Roedel' wrote:
 
 On Thu, Jan 10, 2013 at 10:34:48AM -0800, Kukjin Kim wrote:
  Hmm, I think, just one [7/12] patch does matter so if you could create topic
  branch and apply [7/12] patch firstly before other drivers/ changes would be
  better to me. It's OK on both trees if I just _merge_ the first [7/12]
  commit in the topic branch you provided for exynos-iommu. But you know, you
  don't have to rebase it once I merge it in Samsung tree.
 
  How about?
 
 Sounds good. Should I take the patch directly from this post? I would
 add your Acked-by then and make it the base for the arm/exynos branch.

Sure, keep going on with

Acked-by: Kukjin Kim kgene@samsung.com

 Once the branch is set-up, as usual it will not be rebased.

OK, good. Let me know the name of the branch when you set it up :-)

Thanks.

- Kukjin



srat: harsh hot-pluggable memory check?

2013-01-10 Thread Davidlohr Bueso
When parsing the memory affinity mappings in arch/x86/mm/srat.c:
acpi_numa_memory_affinity_init(), I'm wondering if the hot-pluggable check is
too harsh, as we treat it as an error if the hot-pluggable bit is set and
CONFIG_MEMORY_HOTPLUG is not.

Based on the ACPI specs (v5):

If the Enabled bit is set and the Hot Pluggable bit is also set. The
system hardware supports hot-add and hot-remove of this memory
region.

This only mentions that the system supports hot-plugging, and IMHO if the
user decides not to use CONFIG_MEMORY_HOTPLUG, it shouldn't be considered an
error. Therefore would it be ok to drop the check? Or am I missing something?
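
To spell out the two policies being compared, here is a small standalone sketch. It is a model of the question, not the code in srat.c: accept_strict() and accept_relaxed() are hypothetical names, and have_memory_hotplug stands in for whether CONFIG_MEMORY_HOTPLUG is set. The flag bit positions (bit 0 Enabled, bit 1 Hot Pluggable) follow the SRAT Memory Affinity Structure in the ACPI spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of the SRAT Memory Affinity Structure. */
#define SRAT_MEM_ENABLED        (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE  (1u << 1)

/*
 * Strict policy, as described above: a hot-pluggable region is
 * refused when the kernel was built without memory hotplug.
 */
static bool accept_strict(uint32_t flags, bool have_memory_hotplug)
{
    if (!(flags & SRAT_MEM_ENABLED))
        return false;
    if ((flags & SRAT_MEM_HOT_PLUGGABLE) && !have_memory_hotplug)
        return false;
    return true;
}

/*
 * Relaxed policy, as proposed: the hot-pluggable bit only says what
 * the hardware can do, so it does not affect acceptance.
 */
static bool accept_relaxed(uint32_t flags)
{
    return (flags & SRAT_MEM_ENABLED) != 0;
}
```

The two only differ for an enabled, hot-pluggable region on a kernel without hotplug support, which is exactly the case in question.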

Thanks,
Davidlohr



Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Mel Gorman
On Thu, Jan 10, 2013 at 09:25:11AM +, Eric Wong wrote:
 Mel Gorman mgor...@suse.de wrote:
  page->pfmemalloc can be left set for captured pages so try this but as
  capture is rarely used I'm strongly favouring a partial revert even if
  this works for you. I haven't reproduced this using your workload yet
  but I have found that high-order allocation stress tests for 3.8-rc2 are
  completely screwed. 71% success rates at rest in 3.7 and 6% in 3.8-rc2 so
  I have to chase that down too.
  
  diff --git a/mm/page_alloc.c b/mm/page_alloc.c
  index 9d20c13..c242d21 100644
  --- a/mm/page_alloc.c
  +++ b/mm/page_alloc.c
  @@ -2180,8 +2180,10 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
  current->flags &= ~PF_MEMALLOC;
   
  /* If compaction captured a page, prep and use it */
  -   if (page && !prep_new_page(page, order, gfp_mask))
  +   if (page && !prep_new_page(page, order, gfp_mask)) {
  +   page->pfmemalloc = false;
  goto got_page;
  +   }
   
  if (*did_some_progress != COMPACT_SKIPPED) {
  /* Page migration frees to the PCP lists but we want merging */
 
 This (on top of your previous patch) seems to work great after several
 hours of testing on both my VM and real machine.  I haven't tried your
 partial revert, yet.  Will try that in a bit on the VM.

Thanks Eric, it's much appreciated. However, I'm still very much in favour
of a partial revert as in retrospect the implementation of capture took the
wrong approach. Could you confirm the following patch works for you?
It should functionally have the same effect as the first revert and
there are only minor changes from the last revert prototype I sent you
but there is no harm in being sure.

---8<---
mm: compaction: Partially revert capture of suitable high-order page

Eric Wong reported on 3.7 and 3.8-rc2 that ppoll() got stuck when waiting
for POLLIN on a local TCP socket. It was easier to trigger if there was disk
IO and dirty pages at the same time and he bisected it to commit 1fb3f8ca
"mm: compaction: capture a suitable high-order page immediately when it
is made available".

The intention of that patch was to improve high-order allocations under
memory pressure after changes made to reclaim in 3.6 drastically hurt
THP allocations but the approach was flawed. For Eric, the problem was
that page->pfmemalloc was not being cleared for captured pages leading to
a poor interaction with swap-over-NFS support causing the packets to be
dropped. However, I identified a few more problems with the patch including
the fact that it can increase contention on zone->lock in some cases which
could result in async direct compaction being aborted early.

In retrospect the capture patch took the wrong approach. What it should
have done is mark the pageblock being migrated as MIGRATE_ISOLATE if it
was allocating for THP and avoided races that way. While the patch was
shown to improve allocation success rates at the time, the benefit is
marginal given the relative complexity and it should be revisited from
scratch in the context of the other reclaim-related changes that have taken
place since the patch was first written and tested. This patch partially
reverts commit 1fb3f8ca "mm: compaction: capture a suitable high-order
page immediately when it is made available".

Reported-by: Eric Wong <normalper...@yhbt.net>
Cc: sta...@vger.kernel.org
Signed-off-by: Mel Gorman <mgor...@suse.de>
---
 include/linux/compaction.h |4 +-
 include/linux/mm.h |1 -
 mm/compaction.c|   92 +++-
 mm/internal.h  |1 -
 mm/page_alloc.c|   35 -
 5 files changed, 23 insertions(+), 110 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 6ecb6dc..cc7bdde 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -22,7 +22,7 @@ extern int sysctl_extfrag_handler(struct ctl_table *table, 
int write,
 extern int fragmentation_index(struct zone *zone, unsigned int order);
 extern unsigned long try_to_compact_pages(struct zonelist *zonelist,
int order, gfp_t gfp_mask, nodemask_t *mask,
-   bool sync, bool *contended, struct page **page);
+   bool sync, bool *contended);
 extern int compact_pgdat(pg_data_t *pgdat, int order);
 extern void reset_isolation_suitable(pg_data_t *pgdat);
 extern unsigned long compaction_suitable(struct zone *zone, int order);
@@ -75,7 +75,7 @@ static inline bool compaction_restarting(struct zone *zone, 
int order)
 #else
 static inline unsigned long try_to_compact_pages(struct zonelist *zonelist,
int order, gfp_t gfp_mask, nodemask_t *nodemask,
-   bool sync, bool *contended, struct page **page)
+   bool sync, bool *contended)
 {
return COMPACT_CONTINUE;
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 

Re: Oops in sound/usb/pcm.c:match_endpoint_audioformats() in current -git

2013-01-10 Thread Eldad Zack

On Thu, 10 Jan 2013, Takashi Iwai wrote:

> At Thu, 10 Jan 2013 13:49:22 +0100,
> Jens Axboe wrote:
> > 
> > Here it is, it's from the one introducing the audioformat lookup.
> > Confirmed that 3.8-rc3 with this backed out works fine, too. So should
> > be fairly confident in that result.
> 
> From: Takashi Iwai <ti...@suse.de>
> Subject: [PATCH] ALSA: usb-audio: Fix NULL dereference by access to
>  non-existing substream
> 
> The commit [0d9741c0: ALSA: usb-audio: sync ep init fix for
> audioformat mismatch] introduced the correction of parameters to be
> set for sync EP.  But since the new code assumes that the sync EP is
> always paired with the data EP of another direction, it triggers Oops
> when a device only with a single direction is used.

Yes - sorry, I didn't consider this at all.

> This patch adds a proper check of sync EP type and the presence of the
> paired substream for avoiding the crash.
> 
> Reported-by: Jens Axboe <ax...@kernel.dk>
> Signed-off-by: Takashi Iwai <ti...@suse.de>
> ---
>  sound/usb/pcm.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
> index c659310..21c0001 100644
> --- a/sound/usb/pcm.c
> +++ b/sound/usb/pcm.c
> @@ -511,6 +511,17 @@ static int configure_sync_endpoint(struct snd_usb_substream *subs)
>  	struct snd_usb_substream *sync_subs =
>  		subs->stream->substream[subs->direction ^ 1];
>  
> +	if (subs->sync_endpoint->type != SND_USB_ENDPOINT_TYPE_DATA ||
> +	    !subs->stream) {
> +		ret = snd_usb_endpoint_set_params(subs->sync_endpoint,
> +						  subs->pcm_format,
> +						  subs->channels,
> +						  subs->period_bytes,
> +						  subs->cur_rate,
> +						  subs->cur_audiofmt,
> +						  NULL);
> +	}
> +

I think you want to return here, no?
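
The control-flow concern can be shown in isolation. The following is only a
sketch with made-up types and names (struct substream, set_params,
configure_sync), not the usb-audio code: when there is no paired substream,
the function applies its own parameters and returns, so the code that follows
never dereferences the missing partner.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver structures; only the control
 * flow matters here. */
struct substream {
	int rate;
	int params_set;		/* how many times parameters were applied */
};

static int set_params(struct substream *s, int rate)
{
	s->rate = rate;
	s->params_set++;
	return 0;
}

/* paired may be NULL when the device has only one direction. */
static int configure_sync(struct substream *sync, struct substream *paired,
			  int rate)
{
	if (paired == NULL) {
		/* Nothing to match against: use our own parameters and
		 * return, so the code below never touches paired. */
		return set_params(sync, rate);
	}

	/* Paired case: follow the other direction's rate. */
	return set_params(sync, paired->rate);
}
```

Without the early return, the single-direction case would fall through to the
paired branch and read through a NULL pointer.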



Jens, could you please send me the device's descriptors (lsusb -v)?
I'd like to take a closer look at this.

Cheers,
Eldad



Re: checkpatch.pl error: Use of uninitialized value $max (line 3407,3410)

2013-01-10 Thread Joe Perches
On Thu, 2013-01-10 at 11:18 -0800, Guenter Roeck wrote:
> Hi all,

Hi Guenter.

> I get the following error when running checkpatch.pl:
> 
> scripts/checkpatch.pl -f drivers/hwmon/da9055-hwmon.c
> Use of uninitialized value $max in string eq at scripts/checkpatch.pl line 3407.
> Use of uninitialized value $max in pattern match (m//) at scripts/checkpatch.pl line 3410.

[]

> The checkpatch code creating the error was introduced with commit 4a273195
> (checkpatch: check usleep_range() arguments).
>
> Is this a bug in checkpatch, or is something wrong in my environment ?

Yes, it's a checkpatch bug, ignore it for now.
You'd also get it if you had a memset.

It's a problem with the $Float test

I believe this should fix it.

I haven't tested it much though.
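
One plausible reading of the failure, offered as an assumption rather than a
confirmed diagnosis, is that a capture group inside a sub-pattern shifts the
numbering of every group after it, so a hard-coded group index ends up
pointing at a group that took no part in the match. The sketch below shows
that effect with POSIX extended regexes; the f(arg,arg) pattern and the names
ARG, CALL_RE and capture are invented for illustration and are unrelated to
checkpatch's own patterns.

```c
#include <regex.h>
#include <string.h>

/* An "argument" fragment that carries two inner capture groups. */
#define ARG "((0x[0-9a-f]+)|([0-9]+))"

/* Groups: 1 = first arg, 2/3 = its hex/decimal alternatives,
 *         4 = second arg, 5/6 = its hex/decimal alternatives. */
static const char CALL_RE[] = "^f\\(" ARG "," ARG "\\)$";

/* Copy capture group `group` of the match into out.
 * Returns 1 if the group took part in the match, 0 if it was unset,
 * -1 if the pattern fails to compile or the subject does not match. */
static int capture(const char *pattern, const char *subject,
		   size_t group, char *out, size_t outlen)
{
	regex_t re;
	regmatch_t m[8];
	int ret = -1;

	if (group >= 8 || outlen == 0)
		return -1;
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	if (regexec(&re, subject, 8, m, 0) == 0) {
		if (m[group].rm_so < 0) {
			out[0] = '\0';
			ret = 0;	/* group exists but did not match */
		} else {
			size_t len = (size_t)(m[group].rm_eo - m[group].rm_so);

			if (len >= outlen)
				len = outlen - 1;
			memcpy(out, subject + m[group].rm_so, len);
			out[len] = '\0';
			ret = 1;
		}
	}
	regfree(&re);
	return ret;
}
```

For the subject "f(10,20)", a caller that assumes the second argument is
group 2 gets an unset group, while the real second argument sits in group 4.
Removing capture groups from a shared fragment, as the patch does for
$Float_dec, changes such indexes for every user of the fragment.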

 scripts/checkpatch.pl | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9de3a69..3d0f577 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -230,12 +230,12 @@ our $Inline   = qr{inline|__always_inline|noinline};
 our $Member= qr{->$Ident|\.$Ident|\[[^]]*\]};
 our $Lval  = qr{$Ident(?:$Member)*};
 
-our $Float_hex = qr{(?i:0x[0-9a-f]+p-?[0-9]+[fl]?)};
-our $Float_dec = qr{(?i:((?:[0-9]+\.[0-9]*|[0-9]*\.[0-9]+)(?:e-?[0-9]+)?[fl]?))};
-our $Float_int = qr{(?i:[0-9]+e-?[0-9]+[fl]?)};
+our $Float_hex = qr{(?i)0x[0-9a-f]+p-?[0-9]+[fl]?};
+our $Float_dec = qr{(?i)(?:[0-9]+\.[0-9]*|[0-9]*\.[0-9]+)(?:e-?[0-9]+)?[fl]?};
+our $Float_int = qr{(?i)[0-9]+e-?[0-9]+[fl]?};
 our $Float = qr{$Float_hex|$Float_dec|$Float_int};
-our $Constant  = qr{(?:$Float|(?i:(?:0x[0-9a-f]+|[0-9]+)[ul]*))};
-our $Assignment= qr{(?:\*\=|/=|%=|\+=|-=|<<=|>>=|&=|\^=|\|=|=)};
+our $Constant  = qr{$Float|(?i)(?:0x[0-9a-f]+|[0-9]+)[ul]*};
+our $Assignment= qr{\*\=|/=|%=|\+=|-=|<<=|>>=|&=|\^=|\|=|=};
 our $Compare= qr{<=|>=|==|!=|<|>};
 our $Operators = qr{
 <=|>=|==|!=|




Re: [ 000/123] 3.7.2-stable review

2013-01-10 Thread Jonathan Nieder
Shuah Khan wrote:

> Patches applied cleanly to 3.0.y, 3.4.y, and 3.7.y.
> Compiled and booted on the following systems:
> HP EliteBook 6930p Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz
> HP ProBook 6475b AMD A10-4600M APU with Radeon(tm) HD Graphics
>
> Cross-compile tests:

Thanks much for these sanity checks.


Re: [RFC PATCH v3 16/16] ARM: dts: add AM33XX SPI support

2013-01-10 Thread Nishanth Menon
On 14:35-20130110, Matt Porter wrote:
> On Sun, Oct 28, 2012 at 05:01:29PM +0530, Sekhar Nori wrote:
> > On 10/18/2012 6:56 PM, Matt Porter wrote:
> > > Adds AM33XX SPI support for am335x-bone and am335x-evm.
> > > 
> > > Signed-off-by: Matt Porter <mpor...@ti.com>
> > > ---
> > >  arch/arm/boot/dts/am335x-bone.dts |   17 +++
> > >  arch/arm/boot/dts/am335x-evm.dts  |    9 
> > >  arch/arm/boot/dts/am33xx.dtsi     |   43 +
> > >  3 files changed, 69 insertions(+)
> > > 
> > > diff --git a/arch/arm/boot/dts/am335x-bone.dts b/arch/arm/boot/dts/am335x-bone.dts
> > > index 5510979..23edfd8 100644
> > > --- a/arch/arm/boot/dts/am335x-bone.dts
> > > +++ b/arch/arm/boot/dts/am335x-bone.dts
> > > @@ -18,6 +18,17 @@
> > >  		reg = <0x80000000 0x10000000>; /* 256 MB */
> > >  	};
> > >  
> > > +	am3358_pinmux: pinmux@44e10800 {
> > > +		spi1_pins: pinmux_spi1_pins {
> > > +			pinctrl-single,pins = <
> > > +				0x190 0x13	/* mcasp0_aclkx.spi1_sclk, OUTPUT_PULLUP | MODE3 */
> > > +				0x194 0x33	/* mcasp0_fsx.spi1_d0, INPUT_PULLUP | MODE3 */
> > > +				0x198 0x13	/* mcasp0_axr0.spi1_d1, OUTPUT_PULLUP | MODE3 */
minor comment:
Using 0x33 for both d0 and d1 is better, as D0 and D1 can be switched between
SDI and SDO as needed with ti,pindir-d0-out-d1-in.
> > > +				0x19c 0x13	/* mcasp0_ahclkr.spi1_cs0, OUTPUT_PULLUP | MODE3 */
> > > +			>;
> > 
> > Is there a single pinmux setting that provides SPI functionality on the
> > bone headers? Or this is specific to a cape you tested with?
> 
> No, there are two usable settings for spi1 and one setting for spi0.
> I'm dropping this from the series since it's specific to how I wired up
> the homebrew cape I use for spi testing on the Bone. I publish the
> branch where all these extra test-specific patches (that aren't intended
> to be merged) are at in the cover letter.  Anybody that needs context of
> how/what worked and was tested can grab them there.
Possibly dumb question:
Can't we have predefined, ready-to-use SPI configurations? Like
spi1_configuration1_pins, spi2_configuration1_pins, spi0_configuration1_pins?
If documented with P9 pin names in the bone dts, it saves a bit of effort in
looking up pad offsets when dealing with capes.
-- 
Regards,
Nishanth Menon
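
For reference, the difference between the 0x13 and 0x33 values discussed in
the inline comment can be decoded with a small sketch. The field layout used
here (bits 2:0 mux mode, bit 3 pull disable, bit 4 pull-up select, bit 5
receiver enable) is an assumption based on the AM335x pad control register
description and on the OUTPUT_PULLUP / INPUT_PULLUP comments in the quoted
patch; the macro and function names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed AM33xx pad control register fields. */
#define PAD_MODE_MASK     0x07u		/* bits 2:0: mux mode */
#define PAD_PULL_DISABLE  (1u << 3)	/* 1 = pull resistor disabled */
#define PAD_PULL_UP       (1u << 4)	/* 1 = pull-up, 0 = pull-down */
#define PAD_RX_ACTIVE     (1u << 5)	/* 1 = input buffer enabled */

struct pad_config {
	unsigned int mode;
	bool pull_enabled;
	bool pull_up;
	bool input_enabled;
};

/* Split a pinctrl-single value into its individual settings. */
static struct pad_config decode_pad(uint32_t val)
{
	struct pad_config c;

	c.mode = val & PAD_MODE_MASK;
	c.pull_enabled = !(val & PAD_PULL_DISABLE);
	c.pull_up = (val & PAD_PULL_UP) != 0;
	c.input_enabled = (val & PAD_RX_ACTIVE) != 0;
	return c;
}
```

Under that layout 0x13 and 0x33 both select mode 3 with a pull-up, and differ
only in the receiver-enable bit, which is why 0x33 keeps a data pin usable in
either direction.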


[PATCH RESEND V5 3/6] perf, amd: Use proper naming scheme for AMD bit field definitions

2013-01-10 Thread Jacob Shin
Update these AMD bit field names to be consistent with naming
convention followed by the rest of the file.

Signed-off-by: Jacob Shin <jacob.s...@amd.com>
---
 arch/x86/include/asm/perf_event.h|4 ++--
 arch/x86/kernel/cpu/perf_event_amd.c |8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/perf_event.h 
b/arch/x86/include/asm/perf_event.h
index 4fabcdf..2234eaaec 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -29,8 +29,8 @@
 #define ARCH_PERFMON_EVENTSEL_INV  (1ULL << 23)
 #define ARCH_PERFMON_EVENTSEL_CMASK    0xFF000000ULL
 
-#define AMD_PERFMON_EVENTSEL_GUESTONLY (1ULL << 40)
-#define AMD_PERFMON_EVENTSEL_HOSTONLY  (1ULL << 41)
+#define AMD64_EVENTSEL_GUESTONLY   (1ULL << 40)
+#define AMD64_EVENTSEL_HOSTONLY    (1ULL << 41)
 
 #define AMD64_EVENTSEL_EVENT   \
    (ARCH_PERFMON_EVENTSEL_EVENT | (0x0FULL << 32))
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c 
b/arch/x86/kernel/cpu/perf_event_amd.c
index 9541fe5..0c2cc51 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -156,9 +156,9 @@ static int amd_pmu_hw_config(struct perf_event *event)
event->hw.config &= ~(ARCH_PERFMON_EVENTSEL_USR |
  ARCH_PERFMON_EVENTSEL_OS);
else if (event->attr.exclude_host)
-   event->hw.config |= AMD_PERFMON_EVENTSEL_GUESTONLY;
+   event->hw.config |= AMD64_EVENTSEL_GUESTONLY;
else if (event->attr.exclude_guest)
-   event->hw.config |= AMD_PERFMON_EVENTSEL_HOSTONLY;
+   event->hw.config |= AMD64_EVENTSEL_HOSTONLY;
 
if (event-attr.type != PERF_TYPE_RAW)
return 0;
@@ -336,7 +336,7 @@ static void amd_pmu_cpu_starting(int cpu)
struct amd_nb *nb;
int i, nb_id;
 
-   cpuc->perf_ctr_virt_mask = AMD_PERFMON_EVENTSEL_HOSTONLY;
+   cpuc->perf_ctr_virt_mask = AMD64_EVENTSEL_HOSTONLY;
 
if (boot_cpu_data.x86_max_cores < 2)
return;
@@ -669,7 +669,7 @@ void amd_pmu_disable_virt(void)
 * SVM is disabled the Guest-only bits still gets set and the counter
 * will not count anything.
 */
-   cpuc->perf_ctr_virt_mask = AMD_PERFMON_EVENTSEL_HOSTONLY;
+   cpuc->perf_ctr_virt_mask = AMD64_EVENTSEL_HOSTONLY;
 
/* Reload all events */
x86_pmu_disable_all();
-- 
1.7.9.5




[PATCH RESEND V5 4/6] perf, x86: Move MSR address offset calculation to architecture specific files

2013-01-10 Thread Jacob Shin
Move counter index to MSR address offset calculation to architecture
specific files. This prepares the way for perf_event_amd to enable
counter addresses that are not contiguous -- for example AMD Family
15h processors have 6 core performance counters starting at 0xc0010200
and 4 northbridge performance counters starting at 0xc0010240.

Signed-off-by: Jacob Shin <jacob.s...@amd.com>
---
 arch/x86/kernel/cpu/perf_event.h |   21 -
 arch/x86/kernel/cpu/perf_event_amd.c |   42 ++
 2 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 115c1ea..4440218 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -325,6 +325,7 @@ struct x86_pmu {
int (*schedule_events)(struct cpu_hw_events *cpuc, int n, int *assign);
unsigned    eventsel;
unsigned    perfctr;
+   int (*addr_offset)(int index, int eventsel);
u64 (*event_map)(int);
int max_events;
int num_counters;
@@ -446,28 +447,16 @@ extern u64 __read_mostly hw_cache_extra_regs
 
 u64 x86_perf_event_update(struct perf_event *event);
 
-static inline int x86_pmu_addr_offset(int index)
-{
-   int offset;
-
-   /* offset = X86_FEATURE_PERFCTR_CORE ? index << 1 : index */
-   alternative_io(ASM_NOP2,
-  "shll $1, %%eax",
-  X86_FEATURE_PERFCTR_CORE,
-  "=a" (offset),
-  "a"  (index));
-
-   return offset;
-}
-
 static inline unsigned int x86_pmu_config_addr(int index)
 {
-   return x86_pmu.eventsel + x86_pmu_addr_offset(index);
+   return x86_pmu.eventsel +
+   (x86_pmu.addr_offset ? x86_pmu.addr_offset(index, 1) : index);
 }
 
 static inline unsigned int x86_pmu_event_addr(int index)
 {
-   return x86_pmu.perfctr + x86_pmu_addr_offset(index);
+   return x86_pmu.perfctr +
+   (x86_pmu.addr_offset ? x86_pmu.addr_offset(index, 0) : index);
 }
 
 int x86_setup_perfctr(struct perf_event *event);
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c 
b/arch/x86/kernel/cpu/perf_event_amd.c
index 0c2cc51..ef1df38 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -132,6 +132,47 @@ static u64 amd_pmu_event_map(int hw_event)
return amd_perfmon_event_map[hw_event];
 }
 
+/*
+ * Previously calculated offsets
+ */
+static unsigned int event_offsets[X86_PMC_IDX_MAX] __read_mostly;
+static unsigned int count_offsets[X86_PMC_IDX_MAX] __read_mostly;
+
+/*
+ * Legacy CPUs:
+ *   4 counters starting at 0xc0010000 each offset by 1
+ *
+ * CPUs with core performance counter extensions:
+ *   6 counters starting at 0xc0010200 each offset by 2
+ */
+static inline int amd_pmu_addr_offset(int index, int eventsel)
+{
+   int offset;
+
+   if (!index)
+   return index;
+
+   if (eventsel)
+   offset = event_offsets[index];
+   else
+   offset = count_offsets[index];
+
+   if (offset)
+   return offset;
+
+   if (!cpu_has_perfctr_core)
+   offset = index;
+   else
+   offset = index << 1;
+
+   if (eventsel)
+   event_offsets[index] = offset;
+   else
+   count_offsets[index] = offset;
+
+   return offset;
+}
+
 static int amd_pmu_hw_config(struct perf_event *event)
 {
int ret;
@@ -578,6 +619,7 @@ static __initconst const struct x86_pmu amd_pmu = {
.schedule_events= x86_schedule_events,
.eventsel   = MSR_K7_EVNTSEL0,
.perfctr= MSR_K7_PERFCTR0,
+   .addr_offset= amd_pmu_addr_offset,
.event_map  = amd_pmu_event_map,
.max_events = ARRAY_SIZE(amd_perfmon_event_map),
.num_counters   = AMD64_NUM_COUNTERS,
-- 
1.7.9.5
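
As a reading aid, the hook-with-fallback arrangement introduced above can be
sketched in userspace as follows. This is a cut-down illustration, not the
kernel code: struct pmu_sketch and the helper names are invented, the cache
arrays are left out, and the legacy MSR_K7_* values are assumed to be
0xc0010000 and 0xc0010004.

```c
#include <stddef.h>

#define MSR_K7_EVNTSEL0    0xc0010000u	/* assumed legacy base */
#define MSR_K7_PERFCTR0    0xc0010004u	/* assumed legacy base */
#define MSR_F15H_PERF_CTL  0xc0010200u
#define MSR_F15H_PERF_CTR  0xc0010201u

/* Cut-down stand-in for struct x86_pmu: base MSRs plus an optional hook. */
struct pmu_sketch {
	unsigned int eventsel;
	unsigned int perfctr;
	int (*addr_offset)(int index, int eventsel);
};

/* Core performance counter extensions: control and counter MSRs are
 * interleaved, so consecutive counters are two addresses apart. */
static int core_ext_addr_offset(int index, int eventsel)
{
	(void)eventsel;
	return index << 1;
}

/* Without a hook, the offset is simply the counter index. */
static unsigned int config_addr(const struct pmu_sketch *pmu, int index)
{
	return pmu->eventsel +
		(pmu->addr_offset ? pmu->addr_offset(index, 1) : index);
}

static unsigned int event_addr(const struct pmu_sketch *pmu, int index)
{
	return pmu->perfctr +
		(pmu->addr_offset ? pmu->addr_offset(index, 0) : index);
}
```

A PMU description that leaves the hook NULL keeps the old contiguous layout,
while one that supplies it can place counters at arbitrary offsets.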




[PATCH RESEND V5 2/6] perf, amd: Generalize northbridge constraints code for family 15h

2013-01-10 Thread Jacob Shin
From: Robert Richter <r...@kernel.org>

Generalize northbridge constraints code for family 10h so that later
we can reuse the same code path with other AMD processor families that
have the same northbridge event constraints.

Signed-off-by: Robert Richter <r...@kernel.org>
Signed-off-by: Jacob Shin <jacob.s...@amd.com>
---
 arch/x86/kernel/cpu/perf_event_amd.c |   43 --
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd.c 
b/arch/x86/kernel/cpu/perf_event_amd.c
index e7963c7..9541fe5 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -188,20 +188,13 @@ static inline int amd_has_nb(struct cpu_hw_events *cpuc)
return nb && nb->nb_id != -1;
 }
 
-static void amd_put_event_constraints(struct cpu_hw_events *cpuc,
- struct perf_event *event)
+static void __amd_put_nb_event_constraints(struct cpu_hw_events *cpuc,
+  struct perf_event *event)
 {
-   struct hw_perf_event *hwc = &event->hw;
struct amd_nb *nb = cpuc->amd_nb;
int i;
 
/*
-* only care about NB events
-*/
-   if (!(amd_has_nb(cpuc) && amd_is_nb_event(hwc)))
-   return;
-
-   /*
 * need to scan whole list because event may not have
 * been assigned during scheduling
 *
@@ -247,12 +240,13 @@ static void amd_put_event_constraints(struct 
cpu_hw_events *cpuc,
   *
   * Given that resources are allocated (cmpxchg), they must be
   * eventually freed for others to use. This is accomplished by
-  * calling amd_put_event_constraints().
+  * calling __amd_put_nb_event_constraints()
   *
   * Non NB events are not impacted by this restriction.
   */
 static struct event_constraint *
-amd_get_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event)
+__amd_get_nb_event_constraints(struct cpu_hw_events *cpuc, struct perf_event 
*event,
+  struct event_constraint *c)
 {
struct hw_perf_event *hwc = &event->hw;
struct amd_nb *nb = cpuc->amd_nb;
@@ -260,12 +254,6 @@ amd_get_event_constraints(struct cpu_hw_events *cpuc, 
struct perf_event *event)
int idx, new = -1;
 
/*
-* if not NB event or no NB, then no constraints
-*/
-   if (!(amd_has_nb(cpuc) && amd_is_nb_event(hwc)))
-   return &unconstrained;
-
-   /*
 * detect if already present, if so reuse
 *
 * cannot merge with actual allocation
@@ -275,7 +263,7 @@ amd_get_event_constraints(struct cpu_hw_events *cpuc, 
struct perf_event *event)
 * because of successive calls to x86_schedule_events() from
 * hw_perf_group_sched_in() without hw_perf_enable()
 */
-   for (idx = 0; idx < x86_pmu.num_counters; idx++) {
+   for_each_set_bit(idx, c->idxmsk, X86_PMC_IDX_MAX) {
if (new == -1 || hwc->idx == idx)
/* assign free slot, prefer hwc->idx */
old = cmpxchg(nb->owners + idx, NULL, event);
@@ -391,6 +379,25 @@ static void amd_pmu_cpu_dead(int cpu)
}
 }
 
+static struct event_constraint *
+amd_get_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+   /*
+* if not NB event or no NB, then no constraints
+*/
+   if (!(amd_has_nb(cpuc) && amd_is_nb_event(&event->hw)))
+   return &unconstrained;
+
+   return __amd_get_nb_event_constraints(cpuc, event, &unconstrained);
+}
+
+static void amd_put_event_constraints(struct cpu_hw_events *cpuc,
+ struct perf_event *event)
+{
+   if (amd_has_nb(cpuc) && amd_is_nb_event(&event->hw))
+   __amd_put_nb_event_constraints(cpuc, event);
+}
+
 PMU_FORMAT_ATTR(event, "config:0-7,32-35");
 PMU_FORMAT_ATTR(umask, "config:8-15"   );
 PMU_FORMAT_ATTR(edge,  "config:18" );
-- 
1.7.9.5
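
The switch from a 0..num_counters loop to for_each_set_bit() means the helper
visits only the counter indexes permitted by the constraint's index mask. A
minimal userspace equivalent is sketched below; the function name set_bits
and the 64-bit mask type are illustrative, and the example mask with bits 6
to 9 set is a hypothetical layout, not a value taken from the patch.

```c
#include <stdint.h>

/* Collect the indexes of the set bits in mask, lowest first.
 * Returns how many indexes were written to out (at most max). */
static int set_bits(uint64_t mask, int *out, int max)
{
	int n = 0;

	for (int idx = 0; idx < 64 && n < max; idx++)
		if (mask & (UINT64_C(1) << idx))
			out[n++] = idx;
	return n;
}
```

With a mask of 0xF the walk covers slots 0 to 3, as the old loop did for four
counters; with a mask such as 0x3C0 it would cover only slots 6 to 9, which
is what lets the same code serve counters that do not start at index 0.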




[PATCH RESEND V5 5/6] perf, x86: Allow for architecture specific RDPMC indexes

2013-01-10 Thread Jacob Shin
Similar to config_base and event_base, allow architecture specific
RDPMC ECX values.

Signed-off-by: Jacob Shin <jacob.s...@amd.com>
---
 arch/x86/kernel/cpu/perf_event.c |2 +-
 arch/x86/kernel/cpu/perf_event.h |6 ++
 arch/x86/kernel/cpu/perf_event_amd.c |6 ++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 4428fd1..b63982b 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -835,7 +835,7 @@ static inline void x86_assign_hw_event(struct perf_event 
*event,
} else {
hwc->config_base = x86_pmu_config_addr(hwc->idx);
hwc->event_base  = x86_pmu_event_addr(hwc->idx);
-   hwc->event_base_rdpmc = hwc->idx;
+   hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx);
}
 }
 
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 4440218..c910657 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -326,6 +326,7 @@ struct x86_pmu {
unsigned    eventsel;
unsigned    perfctr;
int (*addr_offset)(int index, int eventsel);
+   int (*rdpmc_index)(int index);
u64 (*event_map)(int);
int max_events;
int num_counters;
@@ -459,6 +460,11 @@ static inline unsigned int x86_pmu_event_addr(int index)
(x86_pmu.addr_offset ? x86_pmu.addr_offset(index, 0) : index);
 }
 
+static inline int x86_pmu_rdpmc_index(int index)
+{
+   return x86_pmu.rdpmc_index ? x86_pmu.rdpmc_index(index) : index;
+}
+
 int x86_setup_perfctr(struct perf_event *event);
 
 int x86_pmu_hw_config(struct perf_event *event);
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c 
b/arch/x86/kernel/cpu/perf_event_amd.c
index ef1df38..faf9072 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -173,6 +173,11 @@ static inline int amd_pmu_addr_offset(int index, int 
eventsel)
return offset;
 }
 
+static inline int amd_pmu_rdpmc_index(int index)
+{
+   return index;
+}
+
 static int amd_pmu_hw_config(struct perf_event *event)
 {
int ret;
@@ -620,6 +625,7 @@ static __initconst const struct x86_pmu amd_pmu = {
.eventsel   = MSR_K7_EVNTSEL0,
.perfctr= MSR_K7_PERFCTR0,
.addr_offset= amd_pmu_addr_offset,
+   .rdpmc_index= amd_pmu_rdpmc_index,
.event_map  = amd_pmu_event_map,
.max_events = ARRAY_SIZE(amd_perfmon_event_map),
.num_counters   = AMD64_NUM_COUNTERS,
-- 
1.7.9.5




[PATCH RESEND V5 6/6] perf, amd: Enable northbridge performance counters on AMD family 15h

2013-01-10 Thread Jacob Shin
On AMD family 15h processors, there are 4 new performance counters
(in addition to 6 core performance counters) that can be used for
counting northbridge events (e.g. DRAM accesses). Their bit fields are
almost identical to the core performance counters. However, unlike the
core performance counters, these MSRs are shared between multiple
cores (that share the same northbridge). We will reuse the same code
path as existing family 10h northbridge event constraints handler
logic to enforce this sharing.

Signed-off-by: Jacob Shin <jacob.s...@amd.com>
---
 arch/x86/include/asm/cpufeature.h |2 +
 arch/x86/include/asm/perf_event.h |9 ++
 arch/x86/include/uapi/asm/msr-index.h |2 +
 arch/x86/kernel/cpu/perf_event_amd.c  |  167 +
 4 files changed, 160 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index 2d9075e..93fe929 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -167,6 +167,7 @@
 #define X86_FEATURE_TBM          (6*32+21) /* trailing bit manipulations */
 #define X86_FEATURE_TOPOEXT      (6*32+22) /* topology extensions CPUID leafs */
 #define X86_FEATURE_PERFCTR_CORE (6*32+23) /* core performance counter extensions */
+#define X86_FEATURE_PERFCTR_NB  (6*32+24) /* NB performance counter extensions */
 
 /*
  * Auxiliary flags: Linux defined - For features scattered in various
@@ -309,6 +310,7 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_hypervisor boot_cpu_has(X86_FEATURE_HYPERVISOR)
 #define cpu_has_pclmulqdq  boot_cpu_has(X86_FEATURE_PCLMULQDQ)
 #define cpu_has_perfctr_core   boot_cpu_has(X86_FEATURE_PERFCTR_CORE)
+#define cpu_has_perfctr_nb boot_cpu_has(X86_FEATURE_PERFCTR_NB)
 #define cpu_has_cx8    boot_cpu_has(X86_FEATURE_CX8)
 #define cpu_has_cx16   boot_cpu_has(X86_FEATURE_CX16)
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
diff --git a/arch/x86/include/asm/perf_event.h 
b/arch/x86/include/asm/perf_event.h
index 2234eaaec..57cb634 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -29,9 +29,14 @@
 #define ARCH_PERFMON_EVENTSEL_INV  (1ULL << 23)
 #define ARCH_PERFMON_EVENTSEL_CMASK    0xFF000000ULL
 
+#define AMD64_EVENTSEL_INT_CORE_ENABLE (1ULL << 36)
 #define AMD64_EVENTSEL_GUESTONLY   (1ULL << 40)
 #define AMD64_EVENTSEL_HOSTONLY    (1ULL << 41)
 
+#define AMD64_EVENTSEL_INT_CORE_SEL_SHIFT  37
+#define AMD64_EVENTSEL_INT_CORE_SEL_MASK   \
+   (0xFULL << AMD64_EVENTSEL_INT_CORE_SEL_SHIFT)
+
 #define AMD64_EVENTSEL_EVENT   \
    (ARCH_PERFMON_EVENTSEL_EVENT | (0x0FULL << 32))
 #define INTEL_ARCH_EVENT_MASK  \
@@ -46,8 +51,12 @@
 #define AMD64_RAW_EVENT_MASK   \
(X86_RAW_EVENT_MASK  |  \
 AMD64_EVENTSEL_EVENT)
+#define AMD64_RAW_EVENT_MASK_NB    \
+   (AMD64_EVENTSEL_EVENT|  \
+ARCH_PERFMON_EVENTSEL_UMASK)
 #define AMD64_NUM_COUNTERS 4
 #define AMD64_NUM_COUNTERS_CORE    6
+#define AMD64_NUM_COUNTERS_NB  4
 
 #define ARCH_PERFMON_UNHALTED_CORE_CYCLES_SEL  0x3c
 #define ARCH_PERFMON_UNHALTED_CORE_CYCLES_UMASK    (0x00 << 8)
diff --git a/arch/x86/include/uapi/asm/msr-index.h 
b/arch/x86/include/uapi/asm/msr-index.h
index 433a59f..075a402 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -194,6 +194,8 @@
 /* Fam 15h MSRs */
 #define MSR_F15H_PERF_CTL  0xc0010200
 #define MSR_F15H_PERF_CTR  0xc0010201
+#define MSR_F15H_NB_PERF_CTL   0xc0010240
+#define MSR_F15H_NB_PERF_CTR   0xc0010241
 
 /* Fam 10h MSRs */
 #define MSR_FAM10H_MMIO_CONF_BASE  0xc0010058
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c 
b/arch/x86/kernel/cpu/perf_event_amd.c
index faf9072..1a80e05 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -132,11 +132,14 @@ static u64 amd_pmu_event_map(int hw_event)
return amd_perfmon_event_map[hw_event];
 }
 
+static struct event_constraint *amd_nb_event_constraint;
+
 /*
  * Previously calculated offsets
  */
 static unsigned int event_offsets[X86_PMC_IDX_MAX] __read_mostly;
 static unsigned int count_offsets[X86_PMC_IDX_MAX] __read_mostly;
+static unsigned int rdpmc_indexes[X86_PMC_IDX_MAX] __read_mostly;
 
 /*
  * Legacy CPUs:
@@ -144,10 +147,14 @@ static unsigned int count_offsets[X86_PMC_IDX_MAX] 
__read_mostly;
  *
  * CPUs with core performance counter extensions:
  *   6 counters starting at 0xc0010200 each offset by 2
+ *
+ * CPUs with north bridge performance counter extensions:
+ *   4 additional counters starting at 0xc0010240 each offset by 2
+ *   (indexed right above either one of the 

[PATCH RESEND V5 1/6] perf, amd: Rework northbridge event constraints handler

2013-01-10 Thread Jacob Shin
From: Robert Richter <r...@kernel.org>

Code simplification. No functional changes.

Signed-off-by: Robert Richter <r...@kernel.org>
Signed-off-by: Jacob Shin <jacob.s...@amd.com>
---
 arch/x86/kernel/cpu/perf_event_amd.c |   68 +-
 1 file changed, 26 insertions(+), 42 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd.c 
b/arch/x86/kernel/cpu/perf_event_amd.c
index c93bc4e..e7963c7 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -256,9 +256,8 @@ amd_get_event_constraints(struct cpu_hw_events *cpuc, 
struct perf_event *event)
 {
struct hw_perf_event *hwc = &event->hw;
struct amd_nb *nb = cpuc->amd_nb;
-   struct perf_event *old = NULL;
-   int max = x86_pmu.num_counters;
-   int i, j, k = -1;
+   struct perf_event *old;
+   int idx, new = -1;
 
/*
 * if not NB event or no NB, then no constraints
@@ -276,48 +275,33 @@ amd_get_event_constraints(struct cpu_hw_events *cpuc, 
struct perf_event *event)
 * because of successive calls to x86_schedule_events() from
 * hw_perf_group_sched_in() without hw_perf_enable()
 */
-   for (i = 0; i < max; i++) {
-   /*
-* keep track of first free slot
-*/
-   if (k == -1 && !nb->owners[i])
-   k = i;
+   for (idx = 0; idx < x86_pmu.num_counters; idx++) {
+   if (new == -1 || hwc->idx == idx)
+   /* assign free slot, prefer hwc->idx */
+   old = cmpxchg(nb->owners + idx, NULL, event);
+   else if (nb->owners[idx] == event)
+   /* event already present */
+   old = event;
+   else
+   continue;
+
+   if (old && old != event)
+   continue;
+
+   /* reassign to this slot */
+   if (new != -1)
+   cmpxchg(nb->owners + new, event, NULL);
+   new = idx;
 
/* already present, reuse */
-   if (nb->owners[i] == event)
-   goto done;
-   }
-   /*
-* not present, so grab a new slot
-* starting either at:
-*/
-   if (hwc->idx != -1) {
-   /* previous assignment */
-   i = hwc->idx;
-   } else if (k != -1) {
-   /* start from free slot found */
-   i = k;
-   } else {
-   /*
-* event not found, no slot found in
-* first pass, try again from the
-* beginning
-*/
-   i = 0;
-   }
-   j = i;
-   do {
-   old = cmpxchg(nb->owners+i, NULL, event);
-   if (!old)
+   if (old == event)
break;
-   if (++i == max)
-   i = 0;
-   } while (i != j);
-done:
-   if (!old)
-   return &nb->event_constraints[i];
-
-   return &emptyconstraint;
+   }
+
+   if (new == -1)
+   return &emptyconstraint;
+
+   return &nb->event_constraints[new];
 }
 
 static struct amd_nb *amd_alloc_nb(int cpu)
-- 
1.7.9.5
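
The reworked loop can be tried out in userspace with C11 atomics standing in
for the kernel's cmpxchg(). The sketch below follows the loop in the patch
step by step, but the surrounding types are simplified assumptions: struct
event, NUM_SLOTS, the global owners array and the helper names are invented
for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NUM_SLOTS 4

struct event {
	int idx;	/* previously assigned slot, or -1 */
};

static _Atomic(struct event *) owners[NUM_SLOTS];

/* cmpxchg-like helper: returns the value seen in *slot before the attempt. */
static struct event *cmpxchg_owner(_Atomic(struct event *) *slot,
				   struct event *expected,
				   struct event *desired)
{
	/* On failure, expected is overwritten with the current owner. */
	atomic_compare_exchange_strong(slot, &expected, desired);
	return expected;
}

/* Returns the slot claimed for ev, or -1 if every slot is taken. */
static int claim_slot(struct event *ev)
{
	struct event *old;
	int idx, new = -1;

	for (idx = 0; idx < NUM_SLOTS; idx++) {
		if (new == -1 || ev->idx == idx)
			/* assign free slot, prefer ev->idx */
			old = cmpxchg_owner(&owners[idx], NULL, ev);
		else if (atomic_load(&owners[idx]) == ev)
			/* event already present */
			old = ev;
		else
			continue;

		if (old && old != ev)
			continue;	/* slot owned by someone else */

		/* reassign to this slot, releasing the earlier pick */
		if (new != -1)
			cmpxchg_owner(&owners[new], ev, NULL);
		new = idx;

		/* already present, reuse */
		if (old == ev)
			break;
	}
	return new;
}
```

An event whose preferred slot comes later in the array first takes the first
free slot, then moves to the preferred one and releases the earlier pick, so
at most one slot is held when the function returns.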




[PATCH RESEND V5 0/6] perf, amd: Enable AMD family 15h northbridge counters

2013-01-10 Thread Jacob Shin
The following patchset enables 4 additional performance counters in
AMD family 15h processors that count northbridge events -- such as
number of DRAM accesses.

This patchset is based on previous work done by Robert Richter
<r...@kernel.org>:

https://lkml.org/lkml/2012/6/19/324

The main differences are:

* The northbridge counters are indexed contiguously right above the
  core performance counters.

* MSR address offset calculations are moved to architecture specific
  files.

* Interrupts are set up to be delivered only to a single core.

V5:
Rebased against latest tip

V4:
* Moved interrupt core select set up back to event constraints
  function, since during ->hw_config time we do not yet know on which
  CPU the event will run.
* Tested on and made minor revisions to make sure that the patchset is
  compatible with upcoming AMD Family 16h processors, and will support
  core and NB counters without any further patches.

V3:
Addressed the following feedback/comments from Robert's review
* https://lkml.org/lkml/2012/11/16/484
* https://lkml.org/lkml/2012/11/26/162

V2:
Separate out Robert's patches, and add properly ordered certificate of
origins.

Jacob Shin (4):
  perf, amd: Use proper naming scheme for AMD bit field definitions
  perf, x86: Move MSR address offset calculation to architecture
specific files
  perf, x86: Allow for architecture specific RDPMC indexes
  perf, amd: Enable northbridge performance counters on AMD family 15h

Robert Richter (2):
  perf, amd: Rework northbridge event constraints handler
  perf, amd: Generalize northbridge constraints code for family 15h

 arch/x86/include/asm/cpufeature.h |2 +
 arch/x86/include/asm/perf_event.h |   13 +-
 arch/x86/include/uapi/asm/msr-index.h |2 +
 arch/x86/kernel/cpu/perf_event.c  |2 +-
 arch/x86/kernel/cpu/perf_event.h  |   25 ++-
 arch/x86/kernel/cpu/perf_event_amd.c  |  318 +
 6 files changed, 268 insertions(+), 94 deletions(-)

-- 
1.7.9.5




Re: [PATCH 1/7] clk: add common of_clk_init() function

2013-01-10 Thread Josh Cartwright
On Fri, Jan 04, 2013 at 12:30:52PM +0530, Prashant Gaikwad wrote:
> Modify of_clk_init function so that it will determine which
> driver to initialize based on device tree instead of each driver
> registering to it.
> 
> Based on a similar patch for drivers/irqchip by Thomas Petazzoni and
> drivers/clocksource by Stephen Warren.
> 
> Signed-off-by: Prashant Gaikwad pgaik...@nvidia.com
> ---

Prashant-

Sorry for the late response, but I finally got a chance to give this
patchset a spin on Zynq.  For patches 1 and 6:

Reviewed-by: Josh Cartwright josh.cartwri...@ni.com
Tested-by: Josh Cartwright josh.cartwri...@ni.com

  Josh




Re: [PATCH v5 5/5] KVM: x86: improve reexecute_instruction

2013-01-10 Thread Marcelo Tosatti
On Fri, Jan 11, 2013 at 02:05:33AM +0800, Xiao Guangrong wrote:
 On 01/11/2013 01:26 AM, Marcelo Tosatti wrote:
  On Tue, Jan 08, 2013 at 02:38:36PM +0800, Xiao Guangrong wrote:
  The current reexecute_instruction can not well detect the failed 
  instruction
  emulation. It allows guest to retry all the instructions except it accesses
  on error pfn
 
  For example, some cases are nested-write-protect - if the page we want to
  write is used as PDE but it chains to itself. Under this case, we should
  stop the emulation and report the case to userspace
 
  Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
  ---
   arch/x86/include/asm/kvm_host.h |7 +++
   arch/x86/kvm/paging_tmpl.h  |   27 ---
   arch/x86/kvm/x86.c  |8 +++-
   3 files changed, 34 insertions(+), 8 deletions(-)
 
  diff --git a/arch/x86/include/asm/kvm_host.h 
  b/arch/x86/include/asm/kvm_host.h
  index c431b33..d6ab8d2 100644
  --- a/arch/x86/include/asm/kvm_host.h
  +++ b/arch/x86/include/asm/kvm_host.h
  @@ -502,6 +502,13 @@ struct kvm_vcpu_arch {
 u64 msr_val;
 struct gfn_to_hva_cache data;
 } pv_eoi;
  +
  +  /*
  +   * Indicate whether the access faults on its page table in guest
  +   * which is set when fix page fault and used to detect unhandeable
  +   * instruction.
  +   */
  +  bool write_fault_to_shadow_pgtable;
   };
 
   struct kvm_lpage_info {
  diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
  index 67b390d..df50560 100644
  --- a/arch/x86/kvm/paging_tmpl.h
  +++ b/arch/x86/kvm/paging_tmpl.h
  @@ -497,26 +497,34 @@ out_gpte_changed:
* created when kvm establishes shadow page table that stop kvm using 
  large
* page size. Do it early can avoid unnecessary #PF and emulation.
*
  + * @write_fault_to_shadow_pgtable will return true if the fault gfn is
  + * currently used as its page table.
  + *
* Note: the PDPT page table is not checked for PAE-32 bit guest. It is ok
* since the PDPT is always shadowed, that means, we can not use large 
  page
* size to map the gfn which is used as PDPT.
*/
   static bool
   FNAME(is_self_change_mapping)(struct kvm_vcpu *vcpu,
  -struct guest_walker *walker, int user_fault)
  +struct guest_walker *walker, int user_fault,
  +bool *write_fault_to_shadow_pgtable)
   {
 int level;
 gfn_t mask = ~(KVM_PAGES_PER_HPAGE(walker->level) - 1);
  +  bool self_changed = false;
 
 if (!(walker->pte_access & ACC_WRITE_MASK ||
   (!is_write_protection(vcpu) && !user_fault)))
 return false;
 
  -  for (level = walker->level; level <= walker->max_level; level++)
  -  if (!((walker->gfn ^ walker->table_gfn[level - 1]) & mask))
  -  return true;
  +  for (level = walker->level; level <= walker->max_level; level++) {
  +  gfn_t gfn = walker->gfn ^ walker->table_gfn[level - 1];
  +
  +  self_changed |= !(gfn & mask);
  +  *write_fault_to_shadow_pgtable |= !gfn;
  +  }
 
  -  return false;
  +  return self_changed;
   }
 
   /*
  @@ -544,7 +552,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, 
  gva_t addr, u32 error_code,
 int level = PT_PAGE_TABLE_LEVEL;
 int force_pt_level;
 unsigned long mmu_seq;
  -  bool map_writable;
  +  bool map_writable, is_self_change_mapping;
 
 pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code);
 
  @@ -572,9 +580,14 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, 
  gva_t addr, u32 error_code,
 return 0;
 }
 
  +  vcpu->arch.write_fault_to_shadow_pgtable = false;
  +
  +  is_self_change_mapping = FNAME(is_self_change_mapping)(vcpu,
  +&walker, user_fault, &vcpu->arch.write_fault_to_shadow_pgtable);
  +
 if (walker.level >= PT_DIRECTORY_LEVEL)
 force_pt_level = mapping_level_dirty_bitmap(vcpu, walker.gfn)
  - || FNAME(is_self_change_mapping)(vcpu, &walker, user_fault);
  + || is_self_change_mapping;
 else
 force_pt_level = 1;
 if (!force_pt_level) {
  diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
  index 6f13e03..2957012 100644
  --- a/arch/x86/kvm/x86.c
  +++ b/arch/x86/kvm/x86.c
  @@ -4810,7 +4810,13 @@ static bool reexecute_instruction(struct kvm_vcpu 
  *vcpu, gva_t cr2)
  * guest to let CPU execute the instruction.
  */
 kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(gpa));
  -  return true;
  +
  +  /*
  +   * If the access faults on its page table, it can not
  +   * be fixed by unprotecting shadow page and it should
  +   * be reported to userspace.
  +   */
  +  return !vcpu->arch.write_fault_to_shadow_pgtable;
   }
  
  This sounds wrong: only reporting emulation failure in case 
  of a write fault to shadow pagetable? 
 
 We suppose unprotecting target-gfn can avoid emulation, the same
 as current code. :(

Current code treats access to non-mapped guest address as 

Re: kernel BUG at kernel/sched_rt.c:493!

2013-01-10 Thread Shawn Bohrer
On Thu, Jan 10, 2013 at 05:13:11AM +0100, Mike Galbraith wrote:
 On Tue, 2013-01-08 at 09:01 -0600, Shawn Bohrer wrote: 
  On Tue, Jan 08, 2013 at 09:36:05AM -0500, Steven Rostedt wrote:

I've also managed to reproduce this on 3.8.0-rc2 so it appears the bug
is still present in the latest kernel.
   
   Shawn,
   
   Can you send me your .config file.
  
  I've attached the 3.8.0-rc2 config that I used to reproduce this in an
  8 core kvm image.  Let me know if you need anything else.
 
 I tried beating on my little Q6600 with no success.  I even tried
 setting the entire box rt, GUI and all, nada.
 
 Hm, maybe re-installing systemd..

I don't know if Steve has had any success.  I can reproduce this easily
now so I'm happy to do some debugging if anyone has some things they
want me to try.

Here is some info on my setup at the moment.  I'm using an 8 core KVM
image now with an xfs file system.  We do use systemd if that is
relevant.  My cpuset controller is mounted on /cgroup/cpuset and we
use libcgroup-tools to move everything on the system that can be moved
into /cgroup/cpuset/sysdefault/  I've also boosted all kworker threads
to run as SCHED_FIFO with a priority of 51.  From there I just drop
the three attached shell scripts (burn.sh, sched_domain_bug.sh and
sched_domain_burn.sh) in /root/ and run /root/sched_domain_bug.sh as
root.  Usually the bug triggers in less than a minute.  You may need
to tweak my shell scripts if your setup is different but they are very
rudimentary.

In order to try digging up some more info I applied the following
patch, and triggered the bug a few times.  The results are always
essentially the same:

---
 kernel/sched/rt.c |9 -
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 418feb0..fba7f01 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -650,6 +650,8 @@ static void __disable_runtime(struct rq *rq)
 * we lend and now have to reclaim.
 */
want = rt_b->rt_runtime - rt_rq->rt_runtime;
+   printk(KERN_INFO "Initial want: %lld rt_b->rt_runtime: %llu rt_rq->rt_runtime: %llu\n",
+  want, rt_b->rt_runtime, rt_rq->rt_runtime);
 
/*
 * Greedy reclaim, take back as much as we can.
@@ -684,7 +686,12 @@ static void __disable_runtime(struct rq *rq)
 * We cannot be left wanting - that would mean some runtime
 * leaked out of the system.
 */
-   BUG_ON(want);
+   if (want) {
+   printk(KERN_ERR "BUG triggered, want: %lld\n", want);
+   for_each_cpu(i, rd->span) {
+   print_rt_stats(NULL, i);
+   }
+   }
 balanced:
/*
 * Disable all the borrow logic by pretending we have inf
---

Here is the output:

[   81.278842] SysRq : Changing Loglevel
[   81.279027] Loglevel set to 9
[   83.285456] Initial want: 5000 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 9
[   85.286452] Initial want: 5000 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 9
[   85.289625] Initial want: 5000 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 9
[   87.287435] Initial want: 1 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 85000
[   87.290718] Initial want: 5000 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 9
[   89.288469] Initial want: -5000 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 10
[   89.291550] Initial want: 15000 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 8
[   89.292940] Initial want: 1 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 85000
[   89.294082] Initial want: 1 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 85000
[   89.295194] Initial want: 5000 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 9
[   89.296274] Initial want: 5000 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 9
[   90.959004] [sched_delayed] sched: RT throttling activated
[   91.289470] Initial want: 2 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 75000
[   91.292767] Initial want: 2 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 75000
[   91.294037] Initial want: 2 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 75000
[   91.295364] Initial want: 2 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 75000
[   91.296355] BUG triggered, want: 2
[   91.296355] 
[   91.296355] rt_rq[7]:
[   91.296355]   .rt_nr_running : 0
[   91.296355]   .rt_throttled  : 0
[   91.296355]   .rt_time   : 0.00
[   91.296355]   .rt_runtime: 750.00
[   91.307332] Initial want: -5000 rt_b->rt_runtime: 95000 rt_rq->rt_runtime: 10
[   91.308440] Initial want: -1 rt_b->rt_runtime: 95000 

Re: srat: harsh hot-pluggable memory check?

2013-01-10 Thread Andi Kleen
> This only mentions that the system supports hot-plugging, and IMHO if the
> user decides not to use CONFIG_MEMORY_HOTPLUG, it shouldn't be considered an
> error.
> Therefore would it be ok to drop the check? Or am I missing something?

The very strict checks were originally implemented because various early
BIOS had largely fictional SRATs, and trusting them blindly caused
boot failures or a lot of wasted memory for unnecessary hotplug zones. 
The wasted memory was mainly a problem with the old memory hotplug
implementation that pre-allocated memmaps, that's not a problem anymore.
However there may be still some other failure cases.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.


Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Wong
Mel Gorman mgor...@suse.de wrote:
> Thanks Eric, it's much appreciated. However, I'm still very much in favour
> of a partial revert as in retrospect the implementation of capture took the
> wrong approach. Could you confirm the following patch works for you?
> It should functionally have the same effect as the first revert and
> there are only minor changes from the last revert prototype I sent you
> but there is no harm in being sure.

Thanks, I was just about to report back on the last partial revert
being successful :)  Will start testing this one, now.


Re: Pull Load keys from signed PE binaries branch into linux-next

2013-01-10 Thread Josh Boyer
On Thu, Jan 03, 2013 at 01:05:19PM +, David Howells wrote:
> David Howells (23):
>   KEYS: Rename public key parameter name arrays
>   KEYS: Move the algorithm pointer array from x509 to public_key.c
>   KEYS: Store public key algo ID in public_key struct
>   KEYS: Split public_key_verify_signature() and make available
>   KEYS: Store public key algo ID in public_key_signature struct
>   X.509: struct x509_certificate needs struct tm declaring
>   X.509: Add bits needed for PKCS#7
>   X.509: Embed public_key_signature struct and create filler function
>   X.509: Handle certificates that lack an authorityKeyIdentifier field
>   X.509: Export certificate parse and free functions

The above patches are mostly cleanups and I can follow them fairly well.
The only real comment I had was the same one Kees already made about the
public_key_verify_signature_2 function.

Feel free to add a:

Reviewed-by: Josh Boyer jwbo...@redhat.com

josh


[PATCH RESEND 1/5] perf: Add hardware breakpoint address mask

2013-01-10 Thread Jacob Shin
Some architectures (for us, AMD Family 16h) allow a "don't care" bit
mask to further qualify a hardware breakpoint address, in order to
trap on a range of addresses. Update perf uapi to add a bp_addr_mask
field and define HAVE_HW_BREAKPOINT_ADDR_MASK.
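
As an illustration of the uapi change only (not part of the patch), the
sketch below mirrors the modified union in a standalone type. The names
bp_config2 and from_legacy_len() are hypothetical. It assumes the usual C
layout rules, under which the two 32-bit fields occupy the same 8 bytes as
the old 64-bit value; on a little-endian machine such as x86 a small length
written through the 64-bit view lands in bp_len and leaves bp_addr_mask at
zero, while a big-endian machine would see the halves swapped.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for the union touched in struct perf_event_attr. */
union bp_config2 {
	struct {
		uint32_t bp_len;
		uint32_t bp_addr_mask;
	};
	uint64_t config2;	/* the pre-existing 64-bit view of the slot */
};

/* The split must not change the size of the slot. */
static_assert(sizeof(union bp_config2) == sizeof(uint64_t),
	      "bp_len/bp_addr_mask must fit the old 64-bit slot");

/* What a caller that still writes a 64-bit length would produce. */
static union bp_config2 from_legacy_len(uint64_t legacy_bp_len)
{
	union bp_config2 u;

	u.config2 = legacy_bp_len;
	return u;
}
```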

Signed-off-by: Jacob Shin jacob.s...@amd.com
---
 arch/Kconfig|4 
 include/linux/hw_breakpoint.h   |6 ++
 include/uapi/linux/perf_event.h |5 -
 kernel/events/hw_breakpoint.c   |3 +++
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 7f8f281..9ca606c 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -221,6 +221,10 @@ config HAVE_HW_BREAKPOINT
bool
depends on PERF_EVENTS
 
+config HAVE_HW_BREAKPOINT_ADDR_MASK
+   bool
+   depends on HAVE_HW_BREAKPOINT
+
 config HAVE_MIXED_BREAKPOINTS_REGS
bool
depends on HAVE_HW_BREAKPOINT
diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index 0464c85..9384201 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -84,6 +84,12 @@ static inline struct arch_hw_breakpoint 
*counter_arch_bp(struct perf_event *bp)
 return &bp->hw.info;
 }
 
+#ifdef CONFIG_HAVE_HW_BREAKPOINT_ADDR_MASK
+extern int arch_has_hw_breakpoint_addr_mask(void);
+#else
+static inline int arch_has_hw_breakpoint_addr_mask(void) { return 0; }
+#endif /* CONFIG_HAVE_HW_BREAKPOINT_ADDR_MASK */
+
 #else /* !CONFIG_HAVE_HW_BREAKPOINT */
 
 static inline int __init init_hw_breakpoint(void) { return 0; }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 4f63c05..067d315 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -284,7 +284,10 @@ struct perf_event_attr {
__u64   config1; /* extension of config */
};
union {
-   __u64   bp_len;
+   struct {
+   __u32   bp_len;
+   __u32   bp_addr_mask;
+   };
__u64   config2; /* extension of config1 */
};
__u64   branch_sample_type; /* enum perf_branch_sample_type */
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index fe8a916..6f10baa 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -612,6 +612,9 @@ static int hw_breakpoint_add(struct perf_event *bp, int 
flags)
if (!(flags & PERF_EF_START))
bp->hw.state = PERF_HES_STOPPED;
 
+   if (bp->attr.bp_addr_mask && !arch_has_hw_breakpoint_addr_mask())
+   return -EOPNOTSUPP;
+
return arch_install_hw_breakpoint(bp);
 }
 
-- 
1.7.9.5




[PATCH RESEND 4/5] perf tools: Add breakpoint address mask syntax to perf list and documentation

2013-01-10 Thread Jacob Shin
From: Suravee Suthikulpanit suravee.suthikulpa...@amd.com


Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
Signed-off-by: Jacob Shin jacob.s...@amd.com
---
 tools/perf/Documentation/perf-record.txt |   14 ++
 tools/perf/util/parse-events.c   |2 +-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt 
b/tools/perf/Documentation/perf-record.txt
index 938e890..9a6fafa 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -33,13 +33,19 @@ OPTIONS
 - a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
  hexadecimal event descriptor.
 
-- a hardware breakpoint event in the form of '\mem:addr[:access]'
-  where addr is the address in memory you want to break in.
-  Access is the memory access type (read, write, execute) it can
-  be passed as follows: '\mem:addr[:[r][w][x]]'.
+- a hardware breakpoint event in the form of
+  '\mem:addr[:access][:addr_mask]' where addr is the address in
+  memory you want to break in. Access is the memory access type (read,
+  write, execute) it can be passed as follows: '\mem:addr[:[r][w][x]]'.
+  addr_mask is the don't care bit mask to further qualify the given
+  addr, to break in on accesses to an address range.
+
   If you want to profile read-write accesses in 0x1000, just set
   'mem:0x1000:rw'.
 
+  If you want to profile write accesses in [0x1000 ~ 0x1010), just set
+  'mem:0x1000:w:0xf'. (Only supported on some hardware)
+
 --filter=filter::
 Event filter.
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 2679e48..6443735 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1114,7 +1114,7 @@ void print_events(const char *event_glob, bool name_only)
printf("\n");
 
printf("  %-50s [%s]\n",
-  "mem:<addr>[:access]",
+  "mem:<addr>[:access][:addr_mask]",
event_type_descriptors[PERF_TYPE_BREAKPOINT]);
printf("\n");
}
-- 
1.7.9.5




[PATCH RESEND 5/5] perf tools: Add breakpoint address mask test case to tests/parse-events

2013-01-10 Thread Jacob Shin
From: Suravee Suthikulpanit suravee.suthikulpa...@amd.com


Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
Signed-off-by: Jacob Shin jacob.s...@amd.com
---
 tools/perf/tests/parse-events.c |   34 ++
 1 file changed, 34 insertions(+)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 32ee478..394e16c 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -369,6 +369,32 @@ static int test__checkevent_breakpoint_rw_modifier(struct 
perf_evlist *evlist)
return test__checkevent_breakpoint_rw(evlist);
 }
 
+static int test__checkevent_breakpoint_rw_addrmsk(struct perf_evlist *evlist)
+{
+   struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+   TEST_ASSERT_VAL("wrong bp_addr_mask",
+   0 == evsel->attr.bp_addr_mask);
+
+   return test__checkevent_breakpoint_rw(evlist);
+}
+
+static int test__checkevent_breakpoint_x_addrmsk_modifier(struct perf_evlist 
*evlist)
+{
+   struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+   TEST_ASSERT_VAL("wrong exclude_user", evsel->attr.exclude_user);
+   TEST_ASSERT_VAL("wrong exclude_kernel", !evsel->attr.exclude_kernel);
+   TEST_ASSERT_VAL("wrong exclude_hv", evsel->attr.exclude_hv);
+   TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
+   TEST_ASSERT_VAL("wrong name",
+   !strcmp(perf_evsel__name(evsel), "mem:0:x:0:k"));
+   TEST_ASSERT_VAL("wrong bp_addr_mask",
+   0 == evsel->attr.bp_addr_mask);
+
+   return test__checkevent_breakpoint_x(evlist);
+}
+
 static int test__checkevent_pmu(struct perf_evlist *evlist)
 {
 
@@ -921,6 +947,14 @@ static struct test__event_st test__events[] = {
.name  = "{cycles,instructions}:G,{cycles:G,instructions:G},cycles",
.check = test__group5,
},
+   [33] = {
+   .name  = "mem:0:rw:0",
+   .check = test__checkevent_breakpoint_rw_addrmsk,
+   },
+   [34] = {
+   .name  = "mem:0:x:0:k",
+   .check = test__checkevent_breakpoint_x_addrmsk_modifier,
+   },
 };
 
 static struct test__event_st test__events_pmu[] = {
-- 
1.7.9.5




[PATCH RESEND 3/5] perf tools: Add breakpoint address mask to the mem event parser

2013-01-10 Thread Jacob Shin
From: Suravee Suthikulpanit suravee.suthikulpa...@amd.com

Allow perf tool to pass in breakpoint address mask to match an address
range, i.e.:

  $ perf stat -e mem:0x1000:w:0xf a.out

Will count writes to [0x1000 ~ 0x1010)
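
To make the extended event syntax concrete, here is a small standalone
sketch that splits a string such as mem:0x1000:w:0xf into its parts. It is
an illustration only: perf itself parses events with the flex/bison grammar
changed below, and struct mem_event and parse_mem_event() are hypothetical
names that do not exist in the perf tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the pieces of a mem:addr[:access[:mask]] event. */
struct mem_event {
	unsigned long addr;		/* breakpoint address */
	char access[4];			/* any of "r", "w", "x", empty if omitted */
	unsigned long addr_mask;	/* don't-care bits, 0 if omitted */
};

/*
 * Returns 0 if at least the address could be read, -1 otherwise.
 * Trailing text after the recognised fields is not checked.
 */
static int parse_mem_event(const char *s, struct mem_event *ev)
{
	int n;

	assert(s && ev);
	memset(ev, 0, sizeof(*ev));

	/* %lx accepts an optional 0x prefix; %3[rwx] reads up to 3 access chars */
	n = sscanf(s, "mem:%lx:%3[rwx]:%lx",
		   &ev->addr, ev->access, &ev->addr_mask);

	return n >= 1 ? 0 : -1;
}
```

Omitted fields stay zeroed, which matches the grammar rules in this patch
that pass NULL for a missing access type or mask.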

Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
Signed-off-by: Jacob Shin jacob.s...@amd.com
---
 tools/perf/util/parse-events.c |3 ++-
 tools/perf/util/parse-events.h |2 +-
 tools/perf/util/parse-events.y |   14 --
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 2d8d53be..2679e48 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -466,12 +466,13 @@ do {  \
 }
 
 int parse_events_add_breakpoint(struct list_head **list, int *idx,
-   void *ptr, char *type)
+   void *ptr, char *type, void *msk)
 {
struct perf_event_attr attr;
 
memset(&attr, 0, sizeof(attr));
attr.bp_addr = (unsigned long) ptr;
+   attr.bp_addr_mask = (unsigned long) msk;
 
if (parse_breakpoint_type(type, &attr))
return -EINVAL;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index b7af80b..5b7905c 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -92,7 +92,7 @@ int parse_events_add_numeric(struct list_head **list, int 
*idx,
 int parse_events_add_cache(struct list_head **list, int *idx,
   char *type, char *op_result1, char *op_result2);
 int parse_events_add_breakpoint(struct list_head **list, int *idx,
-   void *ptr, char *type);
+   void *ptr, char *type, void *msk);
 int parse_events_add_pmu(struct list_head **list, int *idx,
 char *pmu , struct list_head *head_config);
 void parse_events__set_leader(char *name, struct list_head *list);
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 0f9914a..e8ac05d 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -254,13 +254,23 @@ PE_NAME_CACHE_TYPE
 }
 
 event_legacy_mem:
+PE_PREFIX_MEM PE_VALUE ':' PE_MODIFIER_BP ':' PE_VALUE sep_dc
+{
+   struct parse_events_data__events *data = _data;
+   struct list_head *list = NULL;
+
+   ABORT_ON(parse_events_add_breakpoint(&list, &data->idx,
+(void *) $2, $4, (void *) $6));
+   $$ = list;
+}
+|
 PE_PREFIX_MEM PE_VALUE ':' PE_MODIFIER_BP sep_dc
 {
struct parse_events_data__events *data = _data;
struct list_head *list = NULL;
 
ABORT_ON(parse_events_add_breakpoint(&list, &data->idx,
-(void *) $2, $4));
+(void *) $2, $4, NULL));
$$ = list;
 }
 |
@@ -270,7 +280,7 @@ PE_PREFIX_MEM PE_VALUE sep_dc
struct list_head *list = NULL;
 
ABORT_ON(parse_events_add_breakpoint(&list, &data->idx,
-(void *) $2, NULL));
+(void *) $2, NULL, NULL));
$$ = list;
 }
 
-- 
1.7.9.5




[PATCH RESEND 2/5] perf, x86: AMD implementation for hardware breakpoint address mask

2013-01-10 Thread Jacob Shin
Implement the hardware breakpoint address mask for AMD Family 16h (and
any future) processors. A CPUID feature bit indicates hardware
support for the DRn_ADDR_MASK MSRs.
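
The mask MSRs are not laid out contiguously: per the definitions added
below, DR0 has its own address while DR1 to DR3 follow one another. A
minimal sketch of that mapping, with the hypothetical helper name
dr_addr_mask_msr() and assuming four debug address registers as on x86:

```c
#include <assert.h>
#include <stdint.h>

#define MSR_F16H_DR0_ADDR_MASK	0xc0011027u
#define MSR_F16H_DR1_ADDR_MASK	0xc0011019u
#define HBP_NUM			4	/* assumed: x86 has DR0..DR3 */

/* Return the address-mask MSR that belongs to debug register 'dr'. */
static uint32_t dr_addr_mask_msr(int dr)
{
	assert(dr >= 0 && dr < HBP_NUM);

	if (dr == 0)
		return MSR_F16H_DR0_ADDR_MASK;

	/* DR1..DR3 map to consecutive MSRs starting at the DR1 mask */
	return MSR_F16H_DR1_ADDR_MASK - 1 + (uint32_t)dr;
}
```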

Signed-off-by: Jacob Shin jacob.s...@amd.com
---
 arch/x86/Kconfig  |1 +
 arch/x86/include/asm/cpufeature.h |2 ++
 arch/x86/include/asm/hw_breakpoint.h  |6 ++
 arch/x86/include/asm/processor.h  |7 +++
 arch/x86/include/uapi/asm/msr-index.h |6 ++
 arch/x86/kernel/cpu/amd.c |   21 +
 arch/x86/kernel/hw_breakpoint.c   |5 +
 7 files changed, 48 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 50b1b1b..90b2a3b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -64,6 +64,7 @@ config X86
select HAVE_KERNEL_XZ
select HAVE_KERNEL_LZO
select HAVE_HW_BREAKPOINT
+   select HAVE_HW_BREAKPOINT_ADDR_MASK
select HAVE_MIXED_BREAKPOINTS_REGS
select PERF_EVENTS
select HAVE_PERF_EVENTS_NMI
diff --git a/arch/x86/include/asm/cpufeature.h 
b/arch/x86/include/asm/cpufeature.h
index 2d9075e..cf04936 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -167,6 +167,7 @@
 #define X86_FEATURE_TBM(6*32+21) /* trailing bit manipulations 
*/
 #define X86_FEATURE_TOPOEXT(6*32+22) /* topology extensions CPUID leafs */
 #define X86_FEATURE_PERFCTR_CORE (6*32+23) /* core performance counter 
extensions */
+#define X86_FEATURE_BPEXT  (6*32+26) /* data breakpoint extension */
 
 /*
  * Auxiliary flags: Linux defined - For features scattered in various
@@ -313,6 +314,7 @@ extern const char * const x86_power_flags[32];
 #define cpu_has_cx16   boot_cpu_has(X86_FEATURE_CX16)
 #define cpu_has_eager_fpu  boot_cpu_has(X86_FEATURE_EAGER_FPU)
 #define cpu_has_topoextboot_cpu_has(X86_FEATURE_TOPOEXT)
+#define cpu_has_bpext  boot_cpu_has(X86_FEATURE_BPEXT)
 
 #ifdef CONFIG_X86_64
 
diff --git a/arch/x86/include/asm/hw_breakpoint.h 
b/arch/x86/include/asm/hw_breakpoint.h
index ef1c4d2..c939415 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -14,6 +14,7 @@ struct arch_hw_breakpoint {
unsigned long   address;
u8  len;
u8  type;
+   u32 mask;
 };
 
 #include <linux/kdebug.h>
@@ -72,4 +73,9 @@ extern int arch_bp_generic_fields(int x86_len, int x86_type,
 
 extern struct pmu perf_ops_bp;
 
+static inline int arch_has_hw_breakpoint_addr_mask(void)
+{
+   return cpu_has_bpext;
+}
+
 #endif /* _I386_HW_BREAKPOINT_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 888184b..876aacd 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -990,8 +990,15 @@ extern bool cpu_has_amd_erratum(const int *);
 #define AMD_MODEL_RANGE_START(range)   (((range) >> 12) & 0xfff)
 #define AMD_MODEL_RANGE_END(range) ((range) & 0xfff)
 
+extern void set_dr_addr_mask(u32 mask, int dr);
+
 #else
 #define cpu_has_amd_erratum(x) (false)
+
+static inline void set_dr_addr_mask(u32 mask, int dr)
+{
+}
+
 #endif /* CONFIG_CPU_SUP_AMD */
 
 extern unsigned long arch_align_stack(unsigned long sp);
diff --git a/arch/x86/include/uapi/asm/msr-index.h 
b/arch/x86/include/uapi/asm/msr-index.h
index 433a59f..cfc6aa4 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -191,6 +191,12 @@
 #define MSR_AMD64_IBSBRTARGET  0xc001103b
 #define MSR_AMD64_IBS_REG_COUNT_MAX8 /* includes MSR_AMD64_IBSBRTARGET */
 
+/* Fam 16h MSRs */
+#define MSR_F16H_DR0_ADDR_MASK 0xc0011027
+#define MSR_F16H_DR1_ADDR_MASK 0xc0011019
+#define MSR_F16H_DR2_ADDR_MASK 0xc001101a
+#define MSR_F16H_DR3_ADDR_MASK 0xc001101b
+
 /* Fam 15h MSRs */
 #define MSR_F15H_PERF_CTL  0xc0010200
 #define MSR_F15H_PERF_CTR  0xc0010201
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 15239ff..5b0f676 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -906,3 +906,24 @@ bool cpu_has_amd_erratum(const int *erratum)
 }
 
 EXPORT_SYMBOL_GPL(cpu_has_amd_erratum);
+
+void set_dr_addr_mask(u32 mask, int dr)
+{
+   if (!cpu_has_bpext)
+   return;
+
+   BUG_ON(dr >= HBP_NUM);
+
+   switch (dr) {
+   case 0:
+   wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
+   break;
+   case 1:
+   case 2:
+   case 3:
+   wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
+   break;
+   default:
+   break;
+   }
+}
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 02f0763..f8bf2df 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -121,6 +121,8 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
if 

Re: [PATCH] posix-timers: Fix clock_adjtime to return timex data on success

2013-01-10 Thread Richard Cochran
On Thu, Jan 10, 2013 at 06:12:02PM +0100, Miroslav Lichvar wrote:
> Copy the modified timex data back to the user also with positive return
> values. This fixes reading of the CLOCK_REALTIME timex data when the
> clock is in a non-zero state.
> 
> Signed-off-by: Miroslav Lichvar mlich...@redhat.com

Acked-by: Richard Cochran richardcoch...@gmail.com

(Adding John Stultz on CC)

> ---
>  kernel/posix-timers.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
> index 69185ae..10349d5 100644
> --- a/kernel/posix-timers.c
> +++ b/kernel/posix-timers.c
> @@ -997,7 +997,7 @@ SYSCALL_DEFINE2(clock_adjtime, const clockid_t, which_clock,
>  
>   err = kc->clock_adj(which_clock, &ktx);
>  
> - if (!err && copy_to_user(utx, &ktx, sizeof(ktx)))
> + if (err >= 0 && copy_to_user(utx, &ktx, sizeof(ktx)))
>   return -EFAULT;
>  
>   return err;
> -- 
> 1.7.11.7
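
For illustration, the effect of the one-line change can be modelled outside
the kernel. The sketch assumes that a successful clock_adj call returns the
clock state, where 0 is TIME_OK and positive values such as TIME_INS (1)
denote other states, and that errors are negative. The helper names and the
CLOCK_STATE_TIME_INS constant are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define CLOCK_STATE_TIME_INS 1	/* assumed value of TIME_INS in <sys/timex.h> */

static_assert(CLOCK_STATE_TIME_INS > 0, "non-OK clock states are positive");

/* Old behaviour: timex data is copied back only when the call returned 0. */
static bool copies_back_old(int err)
{
	return !err;
}

/* New behaviour: timex data is copied back for every non-negative return. */
static bool copies_back_new(int err)
{
	return err >= 0;
}
```

With the old check a caller whose clock was in, say, the TIME_INS state got
a successful return but no updated timex data; with the new check only
negative error codes skip the copy.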
 


[PATCH RESEND 0/5] perf: Add support for hardware breakpoint address masks

2013-01-10 Thread Jacob Shin
The following patchset adds address masks to existing perf hardware
breakpoint mechanism to allow trapping on an address range (currently
only single address) on supported architectures.

perf uapi is updated, x86 AMD implementation (for AMD Family 16h and
beyond) is provided, and perf tool has been extended to do:

  $ perf stat -e mem:0x1000:w:0xf a.out
  ^^^
  don't care bit mask

  which will count writes to [0x1000 ~ 0x1010)
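
  The range semantics in the example can be written down as a short
  sketch. It assumes the mask simply marks address bits that are ignored
  in the comparison, which is consistent with 0x1000 and mask 0xf covering
  [0x1000 ~ 0x1010); the breakpoint length is left out, and the helper
  names bp_addr_matches() and bp_range_size() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr equals bp_addr once the don't-care bits in mask are ignored. */
static bool bp_addr_matches(uint64_t addr, uint64_t bp_addr, uint32_t mask)
{
	uint64_t care = ~(uint64_t)mask;

	return (addr & care) == (bp_addr & care);
}

/* Number of addresses covered by a contiguous low-bit mask such as 0xf. */
static uint64_t bp_range_size(uint32_t mask)
{
	/* only masks of the form 2^n - 1 describe one contiguous range */
	assert((mask & ((uint64_t)mask + 1)) == 0);

	return (uint64_t)mask + 1;
}
```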

Jacob Shin (2):
  perf: Add hardware breakpoint address mask
  perf, x86: AMD implementation for hardware breakpoint address mask

Suravee Suthikulpanit (3):
  perf tools: Add breakpoint address mask to the mem event parser
  perf tools: Add breakpoint address mask syntax to perf list and
documentation
  perf tools: Add breakpoint address mask test case to
tests/parse-events

 arch/Kconfig |4 
 arch/x86/Kconfig |1 +
 arch/x86/include/asm/cpufeature.h|2 ++
 arch/x86/include/asm/hw_breakpoint.h |6 ++
 arch/x86/include/asm/processor.h |7 ++
 arch/x86/include/uapi/asm/msr-index.h|6 ++
 arch/x86/kernel/cpu/amd.c|   21 ++
 arch/x86/kernel/hw_breakpoint.c  |5 +
 include/linux/hw_breakpoint.h|6 ++
 include/uapi/linux/perf_event.h  |5 -
 kernel/events/hw_breakpoint.c|3 +++
 tools/perf/Documentation/perf-record.txt |   14 
 tools/perf/tests/parse-events.c  |   34 ++
 tools/perf/util/parse-events.c   |5 +++--
 tools/perf/util/parse-events.h   |2 +-
 tools/perf/util/parse-events.y   |   14 ++--
 16 files changed, 125 insertions(+), 10 deletions(-)

-- 
1.7.9.5




Re: [PATCH v2] cgroup: use new hashtable implementation

2013-01-10 Thread Tejun Heo
On Thu, Jan 10, 2013 at 11:49:27AM +0800, Li Zefan wrote:
 Switch cgroup to use the new hashtable implementation. No functional changes.
 
 Signed-off-by: Li Zefan lize...@huawei.com

Applied to cgroup/for-3.9.  Thanks!

-- 
tejun


Re: regression, bisected: openpty fails from 3.7 onwards without devpts

2013-01-10 Thread Jiri Slaby
On 01/10/2013 06:23 PM, Jiri Slaby wrote:
 getptsname expects EINVAL on failure to fall back to /dev/ttyp*... The
 same as unlockpt. We should definitely revert now

Maybe not that strictly. It would be enough to revert TIOCGPTN and
TIOCSPTLCK to return EINVAL as suggested by Alan.
-- 
js
suse labs


[PATCH v5 1/3] mmc: core: Add support for idle time BKOPS

2013-01-10 Thread Maya Erez
Devices have various maintenance operations they need to perform internally.
In order to reduce latencies during time-critical operations like read
and write, it is better to execute maintenance operations at other
times, when the host is not being serviced. Such operations are called
Background operations (BKOPS).
The device notifies the status of the BKOPS need by updating BKOPS_STATUS
(EXT_CSD byte [246]).

According to the standard a host that supports BKOPS shall check the
status periodically and start background operations as needed, so that
the device has enough time for its maintenance operations.

This patch adds support for this periodic check of the BKOPS status.
Since foreground operations are of higher priority than background
operations the host will check the need for BKOPS when it is idle,
and in case of an incoming request the BKOPS operation will be
interrupted.

When the mmcqd thread is idle, a delayed work is created to check the
need for BKOPS. The time to start the delayed work can be set by the host
controller. If this time is not set, a default time is used.
If the card raised an exception with need for urgent BKOPS (level 2/3)
a flag will be set to indicate MMC to start the BKOPS activity when it
becomes idle.

Since running BKOPS too often can impact the eMMC endurance, the card's
need for BKOPS is not checked every time MMC is idle (except when an
exception was raised). In order to estimate the best time to check
for the BKOPS need, the host takes into account the card capacity and the
percentage of changed sectors in the card. A future enhancement could be to
check the card's need for BKOPS only in case of random activity.
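
The threshold estimate described above could look roughly like this. It is a userspace sketch under stated assumptions: the function names are hypothetical, the card size is expressed in sectors, and the percentage is the value the patch exposes as bkops_check_threshold.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: minimum number of changed sectors (written or
 * discarded) before the BKOPS status is worth checking. */
uint64_t bkops_min_changed_sectors(uint64_t card_sectors, unsigned int percent)
{
    return card_sectors * percent / 100;
}

/* Hypothetical helper: true once enough sectors changed since the last
 * BKOPS run to justify reading the status from the card. */
bool bkops_should_check(uint64_t sectors_changed, uint64_t card_sectors,
                        unsigned int percent)
{
    return sectors_changed >= bkops_min_changed_sectors(card_sectors, percent);
}
```

For a card of 1000000 sectors and a threshold of 5 percent, the status would be checked only after at least 50000 sectors have changed.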

Signed-off-by: Maya Erez me...@codeaurora.org
---
 Documentation/mmc/mmc-dev-attrs.txt |9 ++
 drivers/mmc/card/block.c|   96 +-
 drivers/mmc/card/queue.c|2 +
 drivers/mmc/core/core.c |  155 +++
 drivers/mmc/core/mmc.c  |   17 
 include/linux/mmc/card.h|   47 ++-
 include/linux/mmc/core.h|2 +
 7 files changed, 291 insertions(+), 37 deletions(-)

diff --git a/Documentation/mmc/mmc-dev-attrs.txt 
b/Documentation/mmc/mmc-dev-attrs.txt
index 0d98fac..8d33b80 100644
--- a/Documentation/mmc/mmc-dev-attrs.txt
+++ b/Documentation/mmc/mmc-dev-attrs.txt
@@ -8,6 +8,15 @@ The following attributes are read/write.
 
force_roEnforce read-only access even if write protect 
switch is off.
 
+   bkops_check_threshold   This attribute is used to determine whether
+   the status bit that indicates the need for BKOPS should be checked.
+   The value should be given in percentages of the card size.
+   This value is used to calculate the minimum number of sectors that
+   needs to be changed in the device (written or discarded) in order to
+   require the status-bit of BKOPS to be checked.
+   The value can modified via sysfs by writing the required value to:
+   /sys/block/block_dev_name/bkops_check_threshold
+
 SD and MMC Device Attributes
 
 
diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 21056b9..a4d4b7e 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -108,6 +108,7 @@ struct mmc_blk_data {
unsigned intpart_curr;
struct device_attribute force_ro;
struct device_attribute power_ro_lock;
+   struct device_attribute bkops_check_threshold;
int area_type;
 };
 
@@ -268,6 +269,65 @@ out:
return ret;
 }
 
+static ssize_t
+bkops_check_threshold_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+   struct mmc_blk_data *md = mmc_blk_get(dev_to_disk(dev));
+   struct mmc_card *card = md->queue.card;
+   int ret;
+
+   if (!card)
+   ret = -EINVAL;
+   else
+   ret = snprintf(buf, PAGE_SIZE, "%d\n",
+   card->bkops_info.size_percentage_to_queue_delayed_work);
+
+   mmc_blk_put(md);
+   return ret;
+}
+
+static ssize_t
+bkops_check_threshold_store(struct device *dev,
+struct device_attribute *attr,
+const char *buf, size_t count)
+{
+   int value;
+   struct mmc_blk_data *md = mmc_blk_get(dev_to_disk(dev));
+   struct mmc_card *card = md->queue.card;
+   unsigned int card_size;
+   int ret = count;
+
+   if (!card) {
+   ret = -EINVAL;
+   goto exit;
+   }
+
+   sscanf(buf, "%d", &value);
+   if ((value <= 0) || (value >= 100)) {
+   ret = -EINVAL;
+   goto exit;
+   }
+
+   card_size = (unsigned int)get_capacity(md->disk);
+   if (card_size <= 0) {
+   ret = -EINVAL;
+   goto exit;
+   }
+   card->bkops_info.size_percentage_to_queue_delayed_work = value;
+   

[PATCH v5 2/3] mmc: allow the host controller to poll for BKOPS completion

2013-01-10 Thread Maya Erez
In order to allow the card to perform the required BKOPS and prevent
the need for critical BKOPS, we would like to prevent BKOPS interruption
when possible.
In case the controller calls mmc_suspend_host when runtime suspend is
idle, the BKOPS operation will be interrupted. To prevent this we would
like to prevent the runtime suspend idle until BKOPS is completed.
This patch adds a flag to allow the controller to mark if the polling is
required or not.
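
The polling idea can be sketched in userspace C as a loop with a fixed interval and an overall timeout. This is an illustration only: the names are hypothetical, time is simulated by a counter instead of sleeping, and the completion check is a callback standing in for reading the card status.

```c
#include <stdbool.h>

/* Hypothetical completion predicate, standing in for a card status read. */
typedef bool (*bkops_done_fn)(void *ctx);

/* Poll done() every interval_ms until it reports completion or timeout_ms
 * has elapsed. Returns 0 on completion, -1 on timeout. */
int poll_for_completion(bkops_done_fn done, void *ctx,
                        unsigned int interval_ms, unsigned int timeout_ms)
{
    unsigned int elapsed_ms = 0;

    do {
        if (done(ctx))
            return 0;
        /* A real implementation would sleep for interval_ms here. */
        elapsed_ms += interval_ms ? interval_ms : 1;
    } while (elapsed_ms < timeout_ms);

    return -1;
}

/* Example predicate: reports completion on the N-th status read, where N
 * is the counter that ctx points to. */
bool done_after_n_polls(void *ctx)
{
    int *remaining = ctx;

    return --(*remaining) <= 0;
}
```

On timeout the caller would stop the background operation, as the patch does; on completion it would clear its "doing BKOPS" state.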

Signed-off-by: Maya Erez me...@codeaurora.org
---
 drivers/mmc/core/core.c  |   82 +-
 drivers/mmc/core/mmc.c   |3 ++
 include/linux/mmc/card.h |5 +++
 include/linux/mmc/core.h |1 +
 include/linux/mmc/host.h |2 +-
 5 files changed, 91 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index c8cb98e..e22584a 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -364,7 +364,15 @@ void mmc_start_bkops(struct mmc_card *card, bool 
from_exception)
}
mmc_card_clr_need_bkops(card);
mmc_card_set_doing_bkops(card);
-   card->bkops_info.sectors_changed = 0;
+
+   if (card->host->caps2 & MMC_CAP2_POLL_FOR_BKOPS_COMP) {
+   pr_debug("%s: %s: starting the polling thread\n",
+mmc_hostname(card->host), __func__);
+   queue_work(system_nrt_wq,
+  &card->bkops_info.poll_for_completion);
+   } else {
+   card->bkops_info.sectors_changed = 0;
+   }
 
 out:
mmc_release_host(card-host);
@@ -372,6 +380,78 @@ out:
 EXPORT_SYMBOL(mmc_start_bkops);
 
 /**
+ * mmc_bkops_completion_polling() - Poll on the card status to
+ * wait for the non-blocking BKOPS completion
+ * @work:  The completion polling work
+ *
+ * The on-going reading of the card status will prevent the card
+ * from getting into suspend while it is in the middle of
+ * performing BKOPS.
+ * Since the non blocking BKOPS can be interrupted by a fetched
+ * request we also check IF mmc_card_doing_bkops in each
+ * iteration.
+ */
+void mmc_bkops_completion_polling(struct work_struct *work)
+{
+   struct mmc_card *card = container_of(work, struct mmc_card,
+   bkops_info.poll_for_completion);
+   unsigned long timeout_jiffies = jiffies +
+   msecs_to_jiffies(BKOPS_COMPLETION_POLLING_TIMEOUT_MS);
+   u32 status;
+   int err;
+
+   /*
+* Wait for the BKOPs to complete. Keep reading the status to prevent
+* the host from getting into suspend
+*/
+   do {
+   mmc_claim_host(card->host);
+
+   if (!mmc_card_doing_bkops(card))
+   goto out;
+
+   err = mmc_send_status(card, &status);
+   if (err) {
+   pr_err("%s: error %d requesting status\n",
+  mmc_hostname(card->host), err);
+   goto out;
+   }
+
+   /*
+* Some cards mishandle the status bits, so make sure to check
+* both the busy indication and the card state.
+*/
+   if ((status & R1_READY_FOR_DATA) &&
+   (R1_CURRENT_STATE(status) != R1_STATE_PRG)) {
+   pr_debug("%s: %s: completed BKOPs, exit polling\n",
+mmc_hostname(card->host), __func__);
+   mmc_card_clr_doing_bkops(card);
+   card->bkops_info.sectors_changed = 0;
+   goto out;
+   }
+
+   mmc_release_host(card->host);
+
+   /*
+* Sleep before checking the card status again to allow the
+* card to complete the BKOPs operation
+*/
+   msleep(BKOPS_COMPLETION_POLLING_INTERVAL_MS);
+   } while (time_before(jiffies, timeout_jiffies));
+
+   pr_err("%s: %s: exit polling due to timeout, stop bkops\n",
+  mmc_hostname(card->host), __func__);
+   err = mmc_stop_bkops(card);
+   if (err)
+   pr_err("%s: %s: mmc_stop_bkops failed, err=%d\n",
+  mmc_hostname(card->host), __func__, err);
+
+   return;
+out:
+   mmc_release_host(card->host);
+}
+
+/**
  * mmc_start_idle_time_bkops() - check if a non urgent BKOPS is
  * needed
  * @work:  The idle time BKOPS work
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index 2f25488..61bfb8f 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -1550,6 +1550,9 @@ int mmc_attach_mmc(struct mmc_host *host)
INIT_DELAYED_WORK(host-card-bkops_info.dw,
  mmc_start_idle_time_bkops);
 
+   INIT_WORK(host-card-bkops_info.poll_for_completion,
+ mmc_bkops_completion_polling);
+
/*
 * The host controller can set the time to start the BKOPS in
   

[PATCH v5 3/3] mmc: core: Add MMC BKOPS statistics and debugfs ability to print them

2013-01-10 Thread Maya Erez
The BKOPS statistics are used by BKOPS unit tests and APT tests
to determine test success or failure.
The BKOPS statistics provide the following information:
The number of times BKOPS were issued, according to their severity level.
The number of times BKOPS were interrupted by HPI.
The number of times the host went into suspend.
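
The counters listed above can be sketched as a small userspace structure. This is an illustration under stated assumptions: the type and function names are hypothetical, the number of severity levels is assumed to be 3, and the locking that the patch performs around each update is left out.

```c
#include <stdbool.h>

/* Assumption: three severity levels, numbered 1..3. */
#define NUM_SEVERITY_LEVELS 3

struct bkops_stats {
    bool enabled;
    unsigned int level[NUM_SEVERITY_LEVELS]; /* BKOPS issued per level */
    unsigned int hpi;                        /* BKOPS interrupted by HPI */
    unsigned int suspend;                    /* host suspend events */
};

/* Clear all counters and start collecting. */
void bkops_stats_reset(struct bkops_stats *s)
{
    int i;

    for (i = 0; i < NUM_SEVERITY_LEVELS; i++)
        s->level[i] = 0;
    s->hpi = 0;
    s->suspend = 0;
    s->enabled = true;
}

/* Count one BKOPS start at the given severity level; levels outside
 * 1..NUM_SEVERITY_LEVELS are ignored. */
void bkops_stats_count_level(struct bkops_stats *s, int level)
{
    if (!s->enabled || level <= 0 || level > NUM_SEVERITY_LEVELS)
        return;
    s->level[level - 1]++;
}
```

A real implementation shared between contexts would need to serialize these updates, which is what the spinlock in the patch is for.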

Signed-off-by: Yaniv Gardi yga...@codeaurora.org
---
 drivers/mmc/core/bus.c |2 +
 drivers/mmc/core/core.c|   53 
 drivers/mmc/core/debugfs.c |  114 
 include/linux/mmc/card.h   |   12 +
 include/linux/mmc/core.h   |2 +
 5 files changed, 183 insertions(+), 0 deletions(-)

diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
index 420cb67..47f883b 100644
--- a/drivers/mmc/core/bus.c
+++ b/drivers/mmc/core/bus.c
@@ -250,6 +250,8 @@ struct mmc_card *mmc_alloc_card(struct mmc_host *host, 
struct device_type *type)
card-dev.release = mmc_release_card;
card-dev.type = type;
 
+   spin_lock_init(card-bkops_info.bkops_stats.lock);
+
return card;
 }
 
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index e22584a..7405243 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -79,6 +79,30 @@ MODULE_PARM_DESC(
removable,
MMC/SD cards are removable and may be removed during suspend);
 
+#define MMC_UPDATE_BKOPS_STATS_HPI(stats)  \
+   do {\
+   spin_lock(&stats.lock); \
+   if (stats.enabled)  \
+   stats.hpi++;\
+   spin_unlock(&stats.lock);   \
+   } while (0);
+#define MMC_UPDATE_BKOPS_STATS_SUSPEND(stats)  \
+   do {\
+   spin_lock(&stats.lock); \
+   if (stats.enabled)  \
+   stats.suspend++;\
+   spin_unlock(&stats.lock);   \
+   } while (0);
+#define MMC_UPDATE_STATS_BKOPS_SEVERITY_LEVEL(stats, level)\
+   do {\
+   if (level <= 0 || level > BKOPS_NUM_OF_SEVERITY_LEVELS) \
+   break;  \
+   spin_lock(&stats.lock); \
+   if (stats.enabled)  \
+   stats.bkops_level[level-1]++;   \
+   spin_unlock(&stats.lock);   \
+   } while (0);
+
 /*
  * Internal function. Schedule delayed work in the MMC work queue.
  */
@@ -255,6 +279,29 @@ mmc_start_request(struct mmc_host *host, struct 
mmc_request *mrq)
host-ops-request(host, mrq);
 }
 
+void mmc_blk_init_bkops_statistics(struct mmc_card *card)
+{
+   int i;
+   struct mmc_bkops_stats *bkops_stats;
+
+   if (!card)
+   return;
+
+   bkops_stats = &card->bkops_info.bkops_stats;
+
+   spin_lock(&bkops_stats->lock);
+
+   for (i = 0 ; i < BKOPS_NUM_OF_SEVERITY_LEVELS ; ++i)
+   bkops_stats->bkops_level[i] = 0;
+
+   bkops_stats->suspend = 0;
+   bkops_stats->hpi = 0;
+   bkops_stats->enabled = true;
+
+   spin_unlock(&bkops_stats->lock);
+}
+EXPORT_SYMBOL(mmc_blk_init_bkops_statistics);
+
 /**
  * mmc_start_delayed_bkops() - Start a delayed work to check for
  *  the need of non urgent BKOPS
@@ -362,6 +409,8 @@ void mmc_start_bkops(struct mmc_card *card, bool 
from_exception)
mmc_hostname(card-host), err);
goto out;
}
+   MMC_UPDATE_STATS_BKOPS_SEVERITY_LEVEL(card-bkops_info.bkops_stats,
+   card-ext_csd.raw_bkops_status);
mmc_card_clr_need_bkops(card);
mmc_card_set_doing_bkops(card);
 
@@ -762,6 +811,8 @@ int mmc_stop_bkops(struct mmc_card *card)
err = 0;
}
 
+   MMC_UPDATE_BKOPS_STATS_HPI(card-bkops_info.bkops_stats);
+
 out:
return err;
 }
@@ -2614,6 +2665,8 @@ int mmc_suspend_host(struct mmc_host *host)
err = mmc_stop_bkops(host-card);
if (err)
goto out;
+   MMC_UPDATE_BKOPS_STATS_SUSPEND(host-
+   card-bkops_info.bkops_stats);
}
err = host-bus_ops-suspend(host);
}
diff --git a/drivers/mmc/core/debugfs.c b/drivers/mmc/core/debugfs.c
index 35c2f85..30738cb 100644
--- a/drivers/mmc/core/debugfs.c
+++ b/drivers/mmc/core/debugfs.c
@@ -334,6 +334,114 @@ static const struct file_operations mmc_dbg_ext_csd_fops 
= {
.llseek = default_llseek,
 };
 
+static int mmc_bkops_stats_open(struct inode *inode, struct file *filp)
+{
+   struct mmc_card *card = 

Re: [PATCH v5 5/5] KVM: x86: improve reexecute_instruction

2013-01-10 Thread Xiao Guangrong
On 01/11/2013 03:48 AM, Marcelo Tosatti wrote:
 On Fri, Jan 11, 2013 at 02:05:33AM +0800, Xiao Guangrong wrote:
 On 01/11/2013 01:26 AM, Marcelo Tosatti wrote:
 On Tue, Jan 08, 2013 at 02:38:36PM +0800, Xiao Guangrong wrote:
 The current reexecute_instruction can not well detect the failed 
 instruction
 emulation. It allows guest to retry all the instructions except it accesses
 on error pfn

 For example, some cases are nested-write-protect - if the page we want to
 write is used as PDE but it chains to itself. Under this case, we should
 stop the emulation and report the case to userspace
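
The two conditions that the reworked is_self_change_mapping() distinguishes can be sketched in userspace C. This is an illustration only: the helper names are hypothetical, and pages_per_hpage() assumes 512 entries per paging level (x86 with 4 KiB pages).

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Assumption: 512 entries per level, so level 1 covers 1 page,
 * level 2 covers 512 pages, and so on. */
static gfn_t pages_per_hpage(int level)
{
    return (gfn_t)1 << ((level - 1) * 9);
}

/* True when the faulting gfn and the page-table gfn fall into the same
 * (huge) page at the given mapping level. */
bool same_hpage(gfn_t gfn, gfn_t table_gfn, int level)
{
    gfn_t mask = ~(pages_per_hpage(level) - 1);

    return !((gfn ^ table_gfn) & mask);
}

/* True when the faulting gfn is itself the page-table page, which is the
 * case that unprotecting the shadow page cannot fix. */
bool writes_own_table(gfn_t gfn, gfn_t table_gfn)
{
    return !(gfn ^ table_gfn);
}
```

The first predicate only forces a smaller mapping size; the second is the case the patch reports to userspace instead of retrying the instruction.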

 Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
 ---
  arch/x86/include/asm/kvm_host.h |7 +++
  arch/x86/kvm/paging_tmpl.h  |   27 ---
  arch/x86/kvm/x86.c  |8 +++-
  3 files changed, 34 insertions(+), 8 deletions(-)

 diff --git a/arch/x86/include/asm/kvm_host.h 
 b/arch/x86/include/asm/kvm_host.h
 index c431b33..d6ab8d2 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -502,6 +502,13 @@ struct kvm_vcpu_arch {
u64 msr_val;
struct gfn_to_hva_cache data;
} pv_eoi;
 +
 +  /*
 +   * Indicate whether the access faults on its page table in guest
 +   * which is set when fix page fault and used to detect unhandeable
 +   * instruction.
 +   */
 +  bool write_fault_to_shadow_pgtable;
  };

  struct kvm_lpage_info {
 diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
 index 67b390d..df50560 100644
 --- a/arch/x86/kvm/paging_tmpl.h
 +++ b/arch/x86/kvm/paging_tmpl.h
 @@ -497,26 +497,34 @@ out_gpte_changed:
   * created when kvm establishes shadow page table that stop kvm using 
 large
   * page size. Do it early can avoid unnecessary #PF and emulation.
   *
 + * @write_fault_to_shadow_pgtable will return true if the fault gfn is
 + * currently used as its page table.
 + *
   * Note: the PDPT page table is not checked for PAE-32 bit guest. It is ok
   * since the PDPT is always shadowed, that means, we can not use large 
 page
   * size to map the gfn which is used as PDPT.
   */
  static bool
  FNAME(is_self_change_mapping)(struct kvm_vcpu *vcpu,
 -struct guest_walker *walker, int user_fault)
 +struct guest_walker *walker, int user_fault,
 +bool *write_fault_to_shadow_pgtable)
  {
int level;
gfn_t mask = ~(KVM_PAGES_PER_HPAGE(walker->level) - 1);
 +  bool self_changed = false;

if (!(walker->pte_access & ACC_WRITE_MASK ||
  (!is_write_protection(vcpu) && !user_fault)))
return false;

 -  for (level = walker->level; level <= walker->max_level; level++)
 -  if (!((walker->gfn ^ walker->table_gfn[level - 1]) & mask))
 -  return true;
 +  for (level = walker->level; level <= walker->max_level; level++) {
 +  gfn_t gfn = walker->gfn ^ walker->table_gfn[level - 1];
 +
 +  self_changed |= !(gfn & mask);
 +  *write_fault_to_shadow_pgtable |= !gfn;
 +  }

 -  return false;
 +  return self_changed;
  }

  /*
 @@ -544,7 +552,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, 
 gva_t addr, u32 error_code,
int level = PT_PAGE_TABLE_LEVEL;
int force_pt_level;
unsigned long mmu_seq;
 -  bool map_writable;
 +  bool map_writable, is_self_change_mapping;

pgprintk(%s: addr %lx err %x\n, __func__, addr, error_code);

 @@ -572,9 +580,14 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, 
 gva_t addr, u32 error_code,
return 0;
}

 +  vcpu->arch.write_fault_to_shadow_pgtable = false;
 +
 +  is_self_change_mapping = FNAME(is_self_change_mapping)(vcpu,
 +&walker, user_fault, &vcpu->arch.write_fault_to_shadow_pgtable);
 +
if (walker.level >= PT_DIRECTORY_LEVEL)
force_pt_level = mapping_level_dirty_bitmap(vcpu, walker.gfn)
 - || FNAME(is_self_change_mapping)(vcpu, &walker, user_fault);
 + || is_self_change_mapping;
else
force_pt_level = 1;
if (!force_pt_level) {
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 6f13e03..2957012 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -4810,7 +4810,13 @@ static bool reexecute_instruction(struct kvm_vcpu 
 *vcpu, gva_t cr2)
 * guest to let CPU execute the instruction.
 */
kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(gpa));
 -  return true;
 +
 +  /*
 +   * If the access faults on its page table, it can not
 +   * be fixed by unprotecting shadow page and it should
 +   * be reported to userspace.
 +   */
 +  return !vcpu->arch.write_fault_to_shadow_pgtable;
  }

 This sounds wrong: only reporting emulation failure in case 
 of a write fault to shadow pagetable? 

 We suppose unprotecting target-gfn can avoid emulation, the same
 as current code. :(
 
 Current code treats access to non-mapped guest address as indication to
 exit reporting emulation failure.
 
 The patch above restricts 

Re: [PATCH tip/core/rcu 1/1] Tiny RCU changes for 3.9

2013-01-10 Thread Paul E. McKenney
On Mon, Jan 07, 2013 at 08:22:50PM -0800, Josh Triplett wrote:
 On Mon, Jan 07, 2013 at 02:19:15PM -0800, Paul E. McKenney wrote:
  On Mon, Jan 07, 2013 at 09:56:06AM -0800, Josh Triplett wrote:
   On Mon, Jan 07, 2013 at 08:57:48AM -0800, Paul E. McKenney wrote:
On Mon, Jan 07, 2013 at 07:58:10AM -0800, Josh Triplett wrote:
 This patch seems reasonable to me, but the repeated use of #if
 defined(CONFIG_SMP) || defined(CONFIG_RCU_TRACE) seems somewhat
 annoying, and fragile if you ever decide to change the conditions.
 How about defining an appropriate symbol in Kconfig for stall
 warnings, and using that?

But I only just removed the config option for SMP RCU stall warnings.  ;-)

But I must agree that defined(CONFIG_SMP) || defined(CONFIG_RCU_TRACE)
is a bit obscure.  The rationale is that RCU stall warnings are
unconditionally enabled in SMP kernels, but don't want to be in
TINY_RCU kernels due to size constraints.  I therefore put it under
CONFIG_RCU_TRACE, which also contains other TINY_RCU debugging-style
options.  Would adding a comment to this effect help?
   
   I understand the rationale; I just think it would become clearer if you
   added an internal-only Kconfig symbol selected in both cases and change
   the conditionals to use that.
  
  My concern was that this would confuse people into thinking that the
  code under those #ifdefs was all the stall-warning code that there was.
  
  I suppose this could be forestalled with a suitably clever name...
  CONFIG_RCU_CPU_STALL_TINY_TOO?  Better names?
 
 How about CONFIG_RCU_STALL_COMMON, with associated help text saying
 include the stall-detection code common to both rcutree and rcutiny?

Sold!!!  Especially given that I am creating the commit to allow
TREE_PREEMPT_RCU to be used on UP systems with an eye towards getting
rid of TINY_PREEMPT_RCU.  ;-)

Thanx, Paul



Re: Oops in sound/usb/pcm.c:match_endpoint_audioformats() in current -git

2013-01-10 Thread Takashi Iwai
At Thu, 10 Jan 2013 20:45:02 +0100 (CET),
Eldad Zack wrote:
 
 
 On Thu, 10 Jan 2013, Takashi Iwai wrote:
 
  At Thu, 10 Jan 2013 13:49:22 +0100,
  Jens Axboe wrote:
   
   Here it is, it's from the one introducing the audioformat lookup.
   Confirmed that 3.8-rc3 with this backed out works fine, too. So should
   be fairly confident in that result.
 
  From: Takashi Iwai ti...@suse.de
  Subject: [PATCH] ALSA: usb-audio: Fix NULL dereference by access to
   non-existing substream
  
  The commit [0d9741c0: ALSA: usb-audio: sync ep init fix for
  audioformat mismatch] introduced the correction of parameters to be
  set for sync EP.  But since the new code assumes that the sync EP is
  always paired with the data EP of another direction, it triggers Oops
  when a device only with a single direction is used.
 
 Yes - sorry, I didn't consider this at all.
 
  This patch adds a proper check of sync EP type and the presence of the
  paired substream for avoiding the crash.
  
  Reported-by: Jens Axboe ax...@kernel.dk
  Signed-off-by: Takashi Iwai ti...@suse.de
  ---
   sound/usb/pcm.c | 11 +++
   1 file changed, 11 insertions(+)
  
  diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
  index c659310..21c0001 100644
  --- a/sound/usb/pcm.c
  +++ b/sound/usb/pcm.c
  @@ -511,6 +511,17 @@ static int configure_sync_endpoint(struct 
  snd_usb_substream *subs)
  struct snd_usb_substream *sync_subs =
  subs->stream->substream[subs->direction ^ 1];
   
  +   if (subs->sync_endpoint->type != SND_USB_ENDPOINT_TYPE_DATA ||
  +   !subs->stream) {
  +   ret = snd_usb_endpoint_set_params(subs->sync_endpoint,
  + subs->pcm_format,
  + subs->channels,
  + subs->period_bytes,
  + subs->cur_rate,
  + subs->cur_audiofmt,
  + NULL);
  +   }
  +
 
 I think you want to return here, no?

Ah, yes, good catch.  It was dropped during rebasing and rewriting.
Below is the revised patch.


thanks,

Takashi

---
From: Takashi Iwai ti...@suse.de
Subject: [PATCH v2] ALSA: usb-audio: Fix NULL dereference by access to
 non-existing substream

The commit [0d9741c0: ALSA: usb-audio: sync ep init fix for
audioformat mismatch] introduced the correction of parameters to be
set for sync EP.  But since the new code assumes that the sync EP is
always paired with the data EP of another direction, it triggers Oops
when a device only with a single direction is used.

This patch adds a proper check of sync EP type and the presence of the
paired substream for avoiding the crash.

Reported-by: Jens Axboe ax...@kernel.dk
Signed-off-by: Takashi Iwai ti...@suse.de
---
 sound/usb/pcm.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index c659310..d82e378 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -511,6 +511,16 @@ static int configure_sync_endpoint(struct 
snd_usb_substream *subs)
struct snd_usb_substream *sync_subs =
subs->stream->substream[subs->direction ^ 1];
 
+   if (subs->sync_endpoint->type != SND_USB_ENDPOINT_TYPE_DATA ||
+   !subs->stream)
+   return snd_usb_endpoint_set_params(subs->sync_endpoint,
+  subs->pcm_format,
+  subs->channels,
+  subs->period_bytes,
+  subs->cur_rate,
+  subs->cur_audiofmt,
+  NULL);
+
/* Try to find the best matching audioformat. */
list_for_each_entry(fp, &sync_subs->fmt_list, list) {
int score = match_endpoint_audioformats(fp, subs->cur_audiofmt,
-- 
1.8.1



Re: [PATCH 05/14] lib: Add I/O map cache implementation

2013-01-10 Thread Thierry Reding
On Thu, Jan 10, 2013 at 12:24:17PM -0700, Jason Gunthorpe wrote:
 On Thu, Jan 10, 2013 at 08:03:27PM +0100, Thierry Reding wrote:
 
You'd piece a mapping together, each bus requires 16 64k mappings, a
simple 2d array of busnr*16 of pointers would do the trick. A more
clever solution would be to allocate contiguous virtual memory and
split that up..
  
   Oh, I see. I'm not very familiar with the internals of remapping, so
   I'll need to do some more reading. Thanks for the hints.
  
  I forgot to ask. What's the advantage of having a contiguous virtual
  memory area and splitting it up versus remapping each chunk separately?
 
 Not alot, really, but it saves you from the pointer array and
 associated overhead. IIRC it is fairly easy to do in the kernel.
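
The "busnr*16 array of pointers" layout mentioned above can be sketched as simple index arithmetic. This is an illustration only, with hypothetical names; it assumes each bus occupies 16 consecutive 64 KiB chunks and that the offset is relative to the start of that bus's window.

```c
#include <stddef.h>

#define CHUNKS_PER_BUS 16
#define CHUNK_SIZE     (64 * 1024)

/* Slot in a flat table of busnr * 16 mapping pointers that covers the
 * given offset within the bus's window. */
size_t map_slot(unsigned int busnr, size_t offset)
{
    return (size_t)busnr * CHUNKS_PER_BUS + offset / CHUNK_SIZE;
}

/* Offset inside the 64 KiB chunk selected by map_slot(). */
size_t map_chunk_offset(size_t offset)
{
    return offset % CHUNK_SIZE;
}
```

Each slot would hold one remapped 64 KiB region, created on first use; the contiguous-virtual-memory variant replaces the table with a single base address plus the same arithmetic.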

I've been investigating this a bit, and one problem is that it will
prevent the driver from ever building as a module because the necessary
functions aren't exported and I'm not sure exporting them would be
acceptable. Currently PCI host controller drivers with MSI support can't
be built as modules because the MSI infrastructure requires it, but I
briefly discussed this with Bjorn at some point and it should be easy to
remove that requirement.

 Arnd's version is good too, but you would be restricted to aligned
 powers of two for the bus number range in the DT, which is probably
 not that big a deal either?

Stephen suggested on IRC that we could try to keep a bit of dynamicity
in the allocation scheme if we create the bus mapping when the first
device on the bus is probed and discard the mapping if no devices are
found.

Sounds like a good plan to me. Does anybody see any potential pitfalls?

Thierry




Re: [PATCH 11/14] ARM: tegra: tamonten: Add PCIe support

2013-01-10 Thread Thierry Reding
On Wed, Jan 09, 2013 at 09:23:37PM +, Arnd Bergmann wrote:
 On Wednesday 09 January 2013, Thierry Reding wrote:
  Signed-off-by: Thierry Reding thierry.red...@avionic-design.de
  ---
   arch/arm/boot/dts/tegra20-tamonten.dtsi | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
 It's not clear how this one-line patch adds PCIe support to the
 platform. Are you missing the actual change here?

This is really just required to associate a name with the regulator, but
I think I can achieve the same by adding an alias to the TEC DTSI.

Thierry




Re: [RFC PATCH v3 13/16] ARM: dts: add AM33XX MMC support

2013-01-10 Thread Matt Porter
On Tue, Oct 30, 2012 at 05:33:40AM +, AnilKumar wrote:
 On Thu, Oct 18, 2012 at 18:56:52, Porter, Matt wrote:
  Adds AM33XX MMC support for am335x-bone and am335x-evm.
  
  Signed-off-by: Matt Porter mpor...@ti.com
  ---
   arch/arm/boot/dts/am335x-bone.dts |6 ++
   arch/arm/boot/dts/am335x-evm.dts  |6 ++
   arch/arm/boot/dts/am33xx.dtsi |   27 +++
   3 files changed, 39 insertions(+)
  
  diff --git a/arch/arm/boot/dts/am335x-bone.dts 
  b/arch/arm/boot/dts/am335x-bone.dts
  index c634f87..5510979 100644
  --- a/arch/arm/boot/dts/am335x-bone.dts
  +++ b/arch/arm/boot/dts/am335x-bone.dts
  @@ -70,6 +70,8 @@
  };
   
  ldo3_reg: regulator@5 {
  +   regulator-min-microvolt = 180;
  +   regulator-max-microvolt = 330;
 
 I think these min  max limits are regulator limits. Are these fields
 required? Add details of these additions. AFAIK fine-tuned (board
 specific) min/max limits should be add here(like mpu and core
 regulator nodes)

This is required as the mmc driver builds the ocr mask from the
regulator range..and won't run without it. However, with the additional
updates since 3.7-rc1 to the am33xx release dts support, this is already
there so you won't see this hunk in v4.

  regulator-always-on;
  };
   
  @@ -78,3 +80,7 @@
  };
  };
   };
  +
  +mmc1 {
  +   vmmc-supply = ldo3_reg;
  +};
  diff --git a/arch/arm/boot/dts/am335x-evm.dts 
  b/arch/arm/boot/dts/am335x-evm.dts
  index 185d632..d63fce8 100644
  --- a/arch/arm/boot/dts/am335x-evm.dts
  +++ b/arch/arm/boot/dts/am335x-evm.dts
  @@ -114,7 +114,13 @@
  };
   
  vmmc_reg: regulator@12 {
  +   regulator-min-microvolt = 180;
  +   regulator-max-microvolt = 330;
 
 =same=

as above.

 
  regulator-always-on;
  };
  };
   };
  +
  +mmc1 {
  +   vmmc-supply = vmmc_reg;
  +};
  diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
  index ab9c78f..26a6af7 100644
  --- a/arch/arm/boot/dts/am33xx.dtsi
  +++ b/arch/arm/boot/dts/am33xx.dtsi
  @@ -234,6 +234,33 @@
  status = disabled;
  };
   
  +   mmc1: mmc@4806 {
  +   compatible = ti,omap3-hsmmc;
  +   ti,hwmods = mmc1;
  +   ti,dual-volt;
  +   ti,needs-special-reset;
  +   dmas = edma 24
  +   edma 25;
  +   dma-names = tx, rx;
 
 Add status = disabled here and okay in corresponding
 .dts file

yeah, I originally decided to avoid fixing non-dma related items, but
I'll fix this up in v4 while I'm there...to match the other mmc nodes.

  +   };
  +
  +   mmc2: mmc@481d8000 {
  +   compatible = ti,omap3-hsmmc;
  +   ti,hwmods = mmc2;
  +   ti,needs-special-reset;
  +   dmas = edma 2
  +   edma 3;
  +   dma-names = tx, rx;
  +   status = disabled;
  +   };
  +
  +   mmc3: mmc@4781 {
  +   compatible = ti,omap3-hsmmc;
  +   ti,hwmods = mmc3;
  +   ti,needs-special-reset;
 
 What about DMA resources for mmc3?
 
DMA resources for mmc3 are special in that mmc3 (actually MMC2 due
to the hwmod fortran style numbering) is on the crossbar. Since
dmaengine has no concept of a mux in front of dmac channels, we handle
our mux with h/w specific properties. What this means is that we can't
hardcode DMA resources for mmc3 (MMC2) or any other peripheral that sits
on the crossbar as they aren't a fixed EDMA channel.

Since the only peripheral sitting on mmc3 (or any crossbar based DMA
event) on one of the am33xx boards is wl12xx, I can't provide an example
of how this is done within this series...as wl12xx has no DT support and
can't be used.

However, for testing, I did a simple gpio event driver using a GPIO
instance on the crossbar. This is purely an out-of-tree testing thing wired
up on the BeagleBone but it looks like this:

&edma {
ti,edma-xbar-event-map = <32 12>;
};

gpevt {
compatible = "gpevt";
dmas = <&edma 12>;
dma-names = "gpioevt";
gpio-evt = <&gpio3 2 0>;
};

The first node adds a crossbar event mapping (application-specific)
which maps GPIOEVT2 to EDMA channel 12 (an open channel with no fixed
peripheral use).

The gpevt device node then configures the board specific dma resources.

I don't see any reason to configure board specific dma resources for a
driver that can't use them until the driver is converted to DT...at that
time it makes sense to add mmc3 dma support for the evm and evmsk dts
files.

-Matt

  +   status = "disabled";
  +   };
  +

RE: [PATCH 1/2] tools: hv: Fix how ifcfg-* file is created

2013-01-10 Thread KY Srinivasan


 -Original Message-
 From: Tomas Hozza [mailto:tho...@redhat.com]
 Sent: Tuesday, January 08, 2013 6:27 AM
 To: gre...@linuxfoundation.org
 Cc: KY Srinivasan; jasow...@redhat.com; Haiyang Zhang; linux-
 ker...@vger.kernel.org; Hashir Abdi; Tomas Hozza
 Subject: [PATCH 1/2] tools: hv: Fix how ifcfg-* file is created
 
 Fix for the daemon code and for hv_set_ifconfig.sh script, so
 that the created ifcfg-* file is consistent with initscripts
 documentation.
 
 Signed-off-by: Tomas Hozza tho...@redhat.com
Acked-by: K. Y. Srinivasan k...@microsoft.com

 ---
  tools/hv/hv_kvp_daemon.c| 73 ++-
 --
  tools/hv/hv_set_ifconfig.sh | 22 ++
  2 files changed, 44 insertions(+), 51 deletions(-)
 
 diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
 index d25a469..6b56b75 100644
 --- a/tools/hv/hv_kvp_daemon.c
 +++ b/tools/hv/hv_kvp_daemon.c
 @@ -1162,16 +1162,13 @@ static int process_ip_string(FILE *f, char 
 *ip_string, int
 type)
   snprintf(str, sizeof(str), %s, DNS);
   break;
   }
 - if (i != 0) {
 - if (type != DNS) {
 - snprintf(sub_str, sizeof(sub_str),
 - "_%d", i++);
 - } else {
 - snprintf(sub_str, sizeof(sub_str),
 - "%d", ++i);
 - }
 - } else if (type == DNS) {
 +
 + if (type == DNS) {
   snprintf(sub_str, sizeof(sub_str), "%d", ++i);
 + } else if (type == GATEWAY && i == 0) {
 + ++i;
 + } else {
 + snprintf(sub_str, sizeof(sub_str), "%d", i++);
   }
 
 
 @@ -1191,17 +1188,13 @@ static int process_ip_string(FILE *f, char 
 *ip_string, int
 type)
   snprintf(str, sizeof(str), %s,  DNS);
   break;
   }
 - if ((j != 0) || (type == DNS)) {
 - if (type != DNS) {
 - snprintf(sub_str, sizeof(sub_str),
 - "_%d", j++);
 - } else {
 - snprintf(sub_str, sizeof(sub_str),
 - "%d", ++i);
 - }
 - } else if (type == DNS) {
 - snprintf(sub_str, sizeof(sub_str),
 - "%d", ++i);
 +
 + if (type == DNS) {
 + snprintf(sub_str, sizeof(sub_str), "%d", ++i);
 + } else if (j == 0) {
 + ++j;
 + } else {
 + snprintf(sub_str, sizeof(sub_str), "_%d", j++);
   }
   } else {
   return  HV_INVALIDARG;
 @@ -1244,18 +1237,19 @@ static int kvp_set_ip_info(char *if_name, struct
 hv_kvp_ipaddr_value *new_val)
* Here is the format of the ip configuration file:
*
* HWADDR=macaddr
 -  * IF_NAME=interface name
 -  * DHCP=yes (This is optional; if yes, DHCP is configured)
 +  * DEVICE=interface name
 +  * BOOTPROTO=protocol (where protocol is dhcp if DHCP is
 configured
 +  *   or none if no boot-time protocol should be 
 used)
*
 -  * IPADDR=ipaddr1
 -  * IPADDR_1=ipaddr2
 -  * IPADDR_x=ipaddry (where y = x + 1)
 +  * IPADDR0=ipaddr1
 +  * IPADDR1=ipaddr2
 +  * IPADDRx=ipaddry (where y = x + 1)
*
 -  * NETMASK=netmask1
 -  * NETMASK_x=netmasky (where y = x + 1)
 +  * NETMASK0=netmask1
 +  * NETMASKx=netmasky (where y = x + 1)
*
* GATEWAY=ipaddr1
 -  * GATEWAY_x=ipaddry (where y = x + 1)
 +  * GATEWAYx=ipaddry (where y = x + 1)
*
* DNSx=ipaddrx (where first DNS address is tagged as DNS1 etc)
*
 @@ -1294,20 +1288,23 @@ static int kvp_set_ip_info(char *if_name, struct
 hv_kvp_ipaddr_value *new_val)
   if (error)
   goto setval_error;
 
 - error = kvp_write_file(file, "IF_NAME", "", if_name);
 + error = kvp_write_file(file, "DEVICE", "", if_name);
   if (error)
   goto setval_error;
 
 - if (new_val->dhcp_enabled) {
 - error = kvp_write_file(file, "DHCP", "", "yes");
 - if (error)
 - goto setval_error;
 + if (new_val->dhcp_enabled)
 + error = kvp_write_file(file, "BOOTPROTO", "", "dhcp");
 + else
 + error = kvp_write_file(file, "BOOTPROTO", "", "none");
 +
 + if (error)
 + goto setval_error;
 +
 + 
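
The key naming produced by the new IPv4 branch of process_ip_string() can be sketched in isolation. This is a standalone illustration and not the daemon's code: ifcfg_key() and enum addr_type are hypothetical names, and n is assumed to be the 0-based position of a value in its list.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical value types, mirroring the cases handled in the patch. */
enum addr_type { IPADDR, NETMASK, GATEWAY, DNS };

/*
 * Build the ifcfg key for the n-th (0-based) IPv4 value of a type:
 *   IPADDR0, IPADDR1, ...   NETMASK0, NETMASK1, ...
 *   GATEWAY, GATEWAY1, ...  (the first gateway carries no suffix)
 *   DNS1, DNS2, ...         (DNS entries are numbered from 1)
 */
int ifcfg_key(char *buf, size_t len, enum addr_type type, int n)
{
	static const char *const names[] = {
		"IPADDR", "NETMASK", "GATEWAY", "DNS"
	};

	if (type == DNS)
		return snprintf(buf, len, "%s%d", names[type], n + 1);
	if (type == GATEWAY && n == 0)
		return snprintf(buf, len, "%s", names[type]);
	return snprintf(buf, len, "%s%d", names[type], n);
}
```

Calling ifcfg_key(buf, sizeof(buf), GATEWAY, 0) fills buf with the unsuffixed key, which matches the "GATEWAY=ipaddr1" line in the format comment of the patch.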

RE: [PATCH 2/2] tools: hv: Use CLOEXEC when opening kvp_pool files

2013-01-10 Thread KY Srinivasan


 -Original Message-
 From: Tomas Hozza [mailto:tho...@redhat.com]
 Sent: Tuesday, January 08, 2013 6:27 AM
 To: gre...@linuxfoundation.org
 Cc: KY Srinivasan; jasow...@redhat.com; Haiyang Zhang; linux-
 ker...@vger.kernel.org; Hashir Abdi; Tomas Hozza
 Subject: [PATCH 2/2] tools: hv: Use CLOEXEC when opening kvp_pool files
 
 Use CLOEXEC flag when opening kvp_pool_x files to prevent file
 descriptor leakage. Not using it was causing a problem when
 SELinux was enabled.
 
 Signed-off-by: Tomas Hozza tho...@redhat.com
Acked-by: K. Y. Srinivasan k...@microsoft.com

 ---
  tools/hv/hv_kvp_daemon.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
 index 6b56b75..31f839cc 100644
 --- a/tools/hv/hv_kvp_daemon.c
 +++ b/tools/hv/hv_kvp_daemon.c
 @@ -151,7 +151,7 @@ static void kvp_update_file(int pool)
*/
   kvp_acquire_lock(pool);
 
 - filep = fopen(kvp_file_info[pool].fname, "w");
 + filep = fopen(kvp_file_info[pool].fname, "we");
   if (!filep) {
   kvp_release_lock(pool);
   syslog(LOG_ERR, "Failed to open file, pool: %d", pool);
 @@ -182,7 +182,7 @@ static void kvp_update_mem_state(int pool)
 
   kvp_acquire_lock(pool);
 
 - filep = fopen(kvp_file_info[pool].fname, "r");
 + filep = fopen(kvp_file_info[pool].fname, "re");
   if (!filep) {
   kvp_release_lock(pool);
   syslog(LOG_ERR, "Failed to open file, pool: %d", pool);
 @@ -246,13 +246,13 @@ static int kvp_file_init(void)
   records_read = 0;
   num_blocks = 1;
   sprintf(fname, "/var/opt/hyperv/.kvp_pool_%d", i);
 - fd = open(fname, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR |
 S_IROTH);
 + fd = open(fname, O_RDWR | O_CREAT | O_CLOEXEC, S_IRUSR |
 S_IWUSR | S_IROTH);
 
   if (fd == -1)
   return 1;
 
 
 - filep = fopen(fname, "r");
 + filep = fopen(fname, "re");
   if (!filep)
   return 1;
 
 --
 1.7.11.7
 
 




Re: Oops in sound/usb/pcm.c:match_endpoint_audioformats() in current -git

2013-01-10 Thread Jens Axboe
On 2013-01-10 21:19, Takashi Iwai wrote:
 At Thu, 10 Jan 2013 20:45:02 +0100 (CET),
 Eldad Zack wrote:


 On Thu, 10 Jan 2013, Takashi Iwai wrote:

 At Thu, 10 Jan 2013 13:49:22 +0100,
 Jens Axboe wrote:

 Here it is, it's from the one introducing the audioformat lookup.
 Confirmed that 3.8-rc3 with this backed out works fine, too. So should
 be fairly confident in that result.

 From: Takashi Iwai ti...@suse.de
 Subject: [PATCH] ALSA: usb-audio: Fix NULL dereference by access to
  non-existing substream

 The commit [0d9741c0: ALSA: usb-audio: sync ep init fix for
 audioformat mismatch] introduced the correction of parameters to be
 set for sync EP.  But since the new code assumes that the sync EP is
 always paired with the data EP of another direction, it triggers Oops
 when a device only with a single direction is used.

 Yes - sorry, I didn't consider this at all.

 This patch adds a proper check of sync EP type and the presence of the
 paired substream for avoiding the crash.

 Reported-by: Jens Axboe ax...@kernel.dk
 Signed-off-by: Takashi Iwai ti...@suse.de
 ---
  sound/usb/pcm.c | 11 +++
  1 file changed, 11 insertions(+)

 diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
 index c659310..21c0001 100644
 --- a/sound/usb/pcm.c
 +++ b/sound/usb/pcm.c
 @@ -511,6 +511,17 @@ static int configure_sync_endpoint(struct 
 snd_usb_substream *subs)
 struct snd_usb_substream *sync_subs =
 subs->stream->substream[subs->direction ^ 1];
  
 +   if (subs->sync_endpoint->type != SND_USB_ENDPOINT_TYPE_DATA ||
 +   !subs->stream) {
 +   ret = snd_usb_endpoint_set_params(subs->sync_endpoint,
 + subs->pcm_format,
 + subs->channels,
 + subs->period_bytes,
 + subs->cur_rate,
 + subs->cur_audiofmt,
 + NULL);
 +   }
 +

 I think you want to return here, no?
 
 Ah, yes, good catch.  It was dropped during rebasing and rewriting.
 Below is the revised patch.

Thanks, I'll give it a go and report back.

-- 
Jens Axboe



Re: [PATCH v7u1 08/31] x86, 64bit: early #PF handler set page table

2013-01-10 Thread Borislav Petkov
On Thu, Jan 10, 2013 at 09:05:46AM -0800, Yinghai Lu wrote:
 On Thu, Jan 10, 2013 at 4:19 AM, Borislav Petkov b...@alien8.de wrote:
  This is not how SOB chaining works:
 
  SOB: Author
  SOB: Handler - this is you, who has added it to the patchset
  SOB: Committer - maintainer
 
  You need to read Documentation/SubmittingPatches if there's still things
  unclear.
 
 Really don't know what you are doing here.
 
 We did that before for a long time.
 
 During reviewing some patches, Linus or HPA or Eric has better idea
 and drafted some patch,
 without their Signed-offs.
 
 then first version submitter will continue the debugging and testing
 and make the patch working.
 
 At last they submit the patch with authorship from Linus or HPA or Eric.
 
 So at that time how can the Signed-off from them?
 
 And there are commits in the upstream does not have Signed-off from the 
 Author.

I certainly hope those are a very very small number, if any.

In any case, if you've taken hpa's (or anyone's, for that matter) patch,
it should have SOB from the original author. Then, no matter whether you
do modifications to it or not, if it goes upstream through you, then it
has to have your SOB. And then, the upstream maintainer adds his/her own
because he/she is the one committing it.

This way, the chain of patch handling is clear when you look at it and
you can trace the path back to this patch's origin and how it came
upstream.

Here's the relevant portion of SubmittingPatches:

Rule (b) allows you to adjust the code, but then it is very impolite
to change one submitter's code and make him endorse your bugs. To
solve this problem, it is recommended that you add a line between the
last Signed-off-by header and yours, indicating the nature of your
changes. While there is nothing mandatory about this, it seems like
prepending the description with your mail and/or name, all enclosed in
square brackets, is noticeable enough to make it obvious that you are
responsible for last-minute changes. Example :

Signed-off-by: Random J Developer ran...@developer.example.org
[lu...@maintainer.example.org: struct foo moved from foo.c to foo.h]
Signed-off-by: Lucky K Maintainer lu...@maintainer.example.org

In your case, the second SOB should be Lucky K Developer 2 :-)

This way the SOB chain tells you exactly who did what.

HTH.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH] posix-timers: Fix clock_adjtime to return timex data on success

2013-01-10 Thread John Stultz

On 01/10/2013 12:12 PM, Richard Cochran wrote:

On Thu, Jan 10, 2013 at 06:12:02PM +0100, Miroslav Lichvar wrote:

Copy the modified timex data back to the user also with positive return
values. This fixes reading of the CLOCK_REALTIME timex data when the
clock is in a non-zero state.

Signed-off-by: Miroslav Lichvar mlich...@redhat.com

Acked-by: Richard Cochran richardcoch...@gmail.com

(Adding John Stultz on CC)


Just to clarify (the commit message makes it pretty hard to understand 
exactly what's going wrong), this is to handle the case where 
clock_adj() returns a non-zero time_state value (such as TIME_INS) ?


thanks
-john
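
From the caller's side, the case being asked about can be sketched as follows. This assumes Linux with glibc's <sys/timex.h>; clock_state_name() and read_timex() are hypothetical helpers written for illustration. The point is that adjtimex() returns the clock state on success, so a positive value such as TIME_INS is not an error and the timex data is still expected to be valid.

```c
#include <stddef.h>
#include <string.h>
#include <sys/timex.h>

/* Map a successful adjtimex() return value to its symbolic name. */
const char *clock_state_name(int ret)
{
	switch (ret) {
	case TIME_OK:    return "TIME_OK";
	case TIME_INS:   return "TIME_INS";
	case TIME_DEL:   return "TIME_DEL";
	case TIME_OOP:   return "TIME_OOP";
	case TIME_WAIT:  return "TIME_WAIT";
	case TIME_ERROR: return "TIME_ERROR";
	default:         return NULL;	/* negative: the call itself failed */
	}
}

/* Read-only query (modes = 0): returns 0 and stores the state on success. */
int read_timex(struct timex *tx, int *state)
{
	int ret;

	memset(tx, 0, sizeof(*tx));
	ret = adjtimex(tx);
	if (ret < 0)
		return -1;
	*state = ret;	/* may be non-zero, e.g. TIME_INS */
	return 0;
}
```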



Re: radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec

2013-01-10 Thread Borislav Petkov
On Thu, Jan 10, 2013 at 11:21:21AM -0500, Alex Deucher wrote:
 I'm assuming you didn't also update your userspace gfx stack?

By that you mean x.org etc, right? Or GPU microcode too? In any case, I
haven't touched any of those deliberately, AFAICR at least.

 Does disabling the new DMA ring for ttm bo moves avoid the issue?

How do I do that?

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH] hardlockup: detect hard lockups without NMIs using secondary cpus

2013-01-10 Thread Tony Lindgren
* Colin Cross ccr...@android.com [130109 18:05]:
 +static void watchdog_check_hardlockup_other_cpu(void)
 +{
 + int cpu;
 + cpumask_t cpus = watchdog_cpus;
 +
 + /*
 +  * Test for hardlockups every 3 samples.  The sample period is
 +  *  watchdog_thresh * 2 / 5, so 3 samples gets us back to slightly over
 +  *  watchdog_thresh (over by 20%).
 +  */
 + if (__this_cpu_read(hrtimer_interrupts) % 3 != 0)
 + return;
 +
 + /* check for a hardlockup on the next cpu */
 + cpu = cpumask_next(smp_processor_id(), cpus);

Hmm don't you want to check cpu_online_mask here and
return if the other CPU is offline?

 + if (cpu >= nr_cpu_ids)
 + cpu = cpumask_first(cpus);
 + if (cpu == smp_processor_id())
 + return;

Regards,

Tony
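
The wrap-around selection in the quoted hunk, combined with the online check suggested above, can be modelled in plain C. This is a userspace sketch under stated assumptions: CPU masks are 64-bit integers, and mask_next() and next_watched_cpu() are hypothetical stand-ins for the kernel's cpumask helpers.

```c
#include <stdint.h>

#define NR_CPUS 64

/* Next set bit strictly after 'cpu', or NR_CPUS if there is none. */
int mask_next(int cpu, uint64_t mask)
{
	for (int i = cpu + 1; i < NR_CPUS; i++)
		if (mask & (UINT64_C(1) << i))
			return i;
	return NR_CPUS;
}

/*
 * Pick the CPU that 'self' should check: the next watched and online CPU,
 * wrapping to the first one. Returns -1 if there is no other candidate.
 */
int next_watched_cpu(int self, uint64_t watchdog_cpus, uint64_t online)
{
	uint64_t mask = watchdog_cpus & online;
	int cpu = mask_next(self, mask);

	if (cpu >= NR_CPUS)
		cpu = mask_next(-1, mask);	/* wrap to the first set bit */
	if (cpu >= NR_CPUS || cpu == self)
		return -1;
	return cpu;
}
```

Masking with the online set up front means an offline CPU is never selected, so the caller does not need a separate early return for that case.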


Re: [PATCH] iommu: moving initialization earlier

2013-01-10 Thread Shuah Khan
On Thu, Jan 10, 2013 at 10:09 AM, Joerg Roedel j...@8bytes.org wrote:
 On Mon, Jan 07, 2013 at 06:51:52PM +1100, Alexey Kardashevskiy wrote:
 The iommu_init() initializes IOMMU internal structures and data
 required for the IOMMU API as iommu_group_alloc().
 It is registered as a subsys_initcall now.

 One of the IOMMU users is going to be a PCI subsystem on POWER.
 It discovers new IOMMU tables during the PCI scan so the logical
 place to call iommu_group_alloc() is the moment when a new group
 is discovered. However PCI scan is done from subsys_initcall hook
 as IOMMU does so PCI hook can be (and is) called before the IOMMU one.

 The patch moves IOMMU subsystem initialization one step earlier
 to make sure that IOMMU is initialized before PCI scan begins.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru

 Applied, thanks.

Joerg,

Could you please consider this patch for stable releases?

I am currently debugging IO_PAGE_FAULTS on 3.6.11 (happens on all
pre-3.7 releases). I root-caused why 3.7 works: in 3.7 the AMD IOMMU
driver moved the early iommu initialization up, so that it is done from
irq_remap_ops as part of the irq remapping feature.

I looked into back-porting a small sub-set of the changes from 3.7. In
the process, I back-ported the following change, which fixes the
IO_PAGE_FAULTS problem to some extent.

33f28c59e18d83fd2aeef258d211be66b9b80eb3
[PATCH] iommu/amd: Split device table initialization into irq and
 dma part

My back-port reduced the IO_PAGE_FAULTS, but they still happen. I applied
Alexey's patch on top of my back-ported change, and I no longer see
any IO_PAGE_FAULTS. So my second question/request is: would you
consider my back-ported patch, which I will send out, for stable as well?

Here is the snippet from dmesg from 3.7 and 3.6.11 that illustrates
the change in early initialization during kernel boot process:

Snippet from 3.7:
[0.009980] Freeing SMP alternatives: 24k freed
[0.011486] ACPI: Core revision 20120913
[0.017291] ftrace: allocating 24968 entries in 98 pages
[0.025792] SK: before iommu_init_flags()
[1.015392] SK: before iommu_set_device_table()
[2.004989] SK: before iommu_enable_command_buffer()
[2.994600] SK: before iommu_enable_event_buffer()
[3.984469] SK: before iommu_set_exclusion_range()
[4.974066] SK: before iommu_enable()
[5.963662] SK: after iommu_flush_all_caches()
[8.024776] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[8.064424] smpboot: CPU0: AMD Opteron(tm) Processor 6328 (fam: 15,
model: 02, stepping: 00)

Snippet from 3.6.11:
[3.305383] PCI: CLS 64 bytes, default 64
[3.305424] Trying to unpack rootfs image as initramfs...
[5.137815] Freeing initrd memory: 118756k freed
[5.169817] SK: before iommu_init_flags()
[6.159717] SK: before iommu_set_device_table()
[7.149435] SK: before iommu_enable_command_buffer()
[8.139149] SK: before iommu_enable_event_buffer()
[9.128868] SK: before iommu_set_exclusion_range()
[   10.118581] SK: before iommu_enable()
[   11.108460] SK: before iommu_flush_all_caches()
[   13.141141] SK: after iommu_flush_all_caches()
[   15.120755] pci :00:00.2: can't derive routing for PCI INT A
[   15.120819] pci :00:00.2: PCI INT A: no GSI
[   15.120880]

In 3.6, early iommu initialization occurs way later.

-- Shuah


[PATCH] staging/omapdrm: garbage collect OMAP_DSS_DISPLAY_SUSPENDED

2013-01-10 Thread Paolo Pisati
Compilation fix - leftover from:

commit 998c336d4c7183301ed6a6ca93952f63e3cf694f
Author: Tomi Valkeinen tomi.valkei...@ti.com
Date:   Wed May 30 13:26:00 2012 +0300

OMAPDSS: remove omap_dss_device's suspend/resume

Cc: stable sta...@vger.kernel.org # v3.7
Signed-off-by: Paolo Pisati paolo.pis...@canonical.com
---
 drivers/staging/omapdrm/omap_connector.c |3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/staging/omapdrm/omap_connector.c 
b/drivers/staging/omapdrm/omap_connector.c
index 91edb3f..ff25467 100644
--- a/drivers/staging/omapdrm/omap_connector.c
+++ b/drivers/staging/omapdrm/omap_connector.c
@@ -113,9 +113,6 @@ static void omap_connector_dpms(struct drm_connector 
*connector, int mode)
if (mode == DRM_MODE_DPMS_ON) {
/* store resume info for suspended displays */
switch (dssdev->state) {
-   case OMAP_DSS_DISPLAY_SUSPENDED:
-   dssdev->activate_after_resume = true;
-   break;
case OMAP_DSS_DISPLAY_DISABLED: {
int ret = dssdev->driver->enable(dssdev);
if (ret) {
-- 
1.7.9.5



Re: [PATCH v2] drivers/pinctrl: grab default handles from device core

2013-01-10 Thread Stephen Warren
On 12/12/2012 01:25 PM, Linus Walleij wrote:
 From: Linus Walleij linus.wall...@linaro.org
 
 This makes the device core auto-grab the pinctrl handle and set
 the default (PINCTRL_STATE_DEFAULT) state for every device
 that is present in the device model right before probe. This will
 account for the lion's share of embedded silicon devices.

There are quite a few problems with this patch, and they end up
completely breaking at least Tegra in next-20130110.

 diff --git a/drivers/base/pinctrl.c b/drivers/base/pinctrl.c

 +int pinctrl_bind_pins(struct device *dev)
 +{
 + struct dev_pin_info *dpi;
 + int ret;
 +
 + /* Allocate a pin state container on-the-fly */
 + if (!dev->pins) {
 + dpi = devm_kzalloc(dev, sizeof(*dpi), GFP_KERNEL);

This is allocated using a devm_ function. If -EPROBE_DEFER is returned
below after the assignment to dev->pins or if the driver's own probe()
returns -EPROBE_DEFER, this allocation will be freed by the driver core.
This can leave dev->pins pointing to something non-NULL, yet invalid.

I haven't fully verified this, but I believe this issue is causing
crashes on Tegra. Certainly if I force this code to follow the path that
always allocates new structs or performs new gets, then the crashes go away.

 + if (!dpi)
 + return -ENOMEM;
 + } else
 + dpi = dev->pins;
 +
 + /*
 +  * Check if we already have a pinctrl handle, as we may arrive here
 +  * after a deferral in the state selection below
 +  */
 + if (!dpi->p) {
 + dpi->p = devm_pinctrl_get(dev);

That won't succeed for a pinctrl device that has a default state in
order to implement hogs. This will then cause the pin controller device
to always defer probe and never activate. This will leave HW
unconfigured and/or prevent other devices from successfully calling
pinctrl_get().

This issue also happens on Tegra.

 diff --git a/drivers/pinctrl/core.c b/drivers/pinctrl/core.c

 @@ -734,9 +734,16 @@ static struct pinctrl *pinctrl_get_locked(struct device 
 *dev)
   if (WARN_ON(!dev))
   return ERR_PTR(-EINVAL);
  
 + /*
 +  * See if somebody else (such as the device core) has already
 +  * obtained a handle to the pinctrl for this device. In that case,
 +  * return another pointer to it.
 +  */
   p = find_pinctrl(dev);
 - if (p != NULL)
 - return ERR_PTR(-EBUSY);

I deliberately returned an error here, because there's no reference
counting on the struct pinctrl objects. If a driver calls pinctrl_get(),
with the new code below, it will retrieve the same struct. If it later
calls pinctrl_put(), the put will immediately free the structure. This
will invalidate the pointers that reference it in struct device's pins
field.

This issue will probably trigger on Tegra, since we at least have a
pinctrl-based I2C mux that calls pinctrl_get().

 + if (p != NULL) {
 + dev_dbg(dev, "obtain a copy of previously claimed pinctrl\n");
 + return p;
 + }
  
   return create_pinctrl(dev);
  }

Perhaps we could remove this patch from linux-next, and have a V3?
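
The lifetime problem with handing out the same struct pinctrl twice can be illustrated with a generic reference-counted handle. This is a userspace sketch only; struct handle and the handle_*() functions are hypothetical and do not correspond to the pinctrl API. It shows the behaviour that sharing would need: a put from one user must not free an object another user still points to.

```c
#include <stdlib.h>

/* Hypothetical shared object with an explicit reference count. */
struct handle {
	int refs;
};

/* Create a handle owned by one user. */
struct handle *handle_create(void)
{
	struct handle *h = calloc(1, sizeof(*h));

	if (h)
		h->refs = 1;
	return h;
}

/* A second user takes its own reference to the same object. */
struct handle *handle_get(struct handle *h)
{
	h->refs++;
	return h;
}

/* Drop one reference; returns 1 only when the object was actually freed. */
int handle_put(struct handle *h)
{
	if (--h->refs > 0)
		return 0;
	free(h);
	return 1;
}
```

Without the counter, the first put would free the object immediately, which is the situation described above for a driver calling pinctrl_put() on a handle also referenced from struct device.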


[PATCH 2/2] perf tools: Fix building from 'make perf-*-src-pkg' tarballs

2013-01-10 Thread Arnaldo Carvalho de Melo
From: Sebastian Andrzej Siewior bige...@linutronix.de

Thanks (mostly) to uapi the package created from perf-*-src-pkg FTBFS:

|CC perf.o
|In file included from util/../perf.h:8:0,
| from util/cache.h:7,
| from perf.c:12:
|arch/x86/include/asm/unistd.h:4:29: fatal error: uapi/asm/unistd.h: No such 
file or directory
|
|CC perf.o
|In file included from util/../perf.h:106:0,
| from util/cache.h:7,
| from perf.c:12:
|include/linux/perf_event.h:17:35: fatal error: uapi/linux/perf_event.h: No 
such file or directory
|
|CC perf.o
|In file included from include/uapi/linux/perf_event.h:19:0,
| from util/../perf.h:106,
| from util/cache.h:7,
| from perf.c:12:
|util/include/asm/byteorder.h:2:49: fatal error: 
../../../../include/uapi/linux/swab.h: No such file or directory
|
|CC perf.o
|In file included from util/include/../../../../include/linux/list.h:7:0,
| from util/include/linux/list.h:4,
| from util/parse-events.h:7,
| from perf.c:15:
|util/include/linux/const.h:1:50: fatal error: 
../../../../include/uapi/linux/const.h: No such file or directory
|
|In file included from builtin-kvm.c:26:0:
|arch/x86/include/asm/svm.h:4:26: fatal error: uapi/asm/svm.h: No such file or 
directory
|
|In file included from util/evsel.c:21:0:
|include/linux/hw_breakpoint.h:5:38: fatal error: uapi/linux/hw_breakpoint.h: 
No such file or directory
|
|CC util/evsel.o
|In file included from util/perf_regs.h:5:0,
| from util/evsel.c:23:
|arch/x86/include/perf_regs.h:6:27: fatal error: asm/perf_regs.h: No such file 
or directory
|
|   CC util/rbtree.o
|In file included from ../../lib/rbtree.c:24:0:
|util/include/linux/rbtree_augmented.h:2:56: fatal error: 
../../../../include/linux/rbtree_augmented.h: No such file or directory

This patch adds the missing files.

Signed-off-by: Sebastian Andrzej Siewior bige...@linutronix.de
Cc: Ingo Molnar mi...@redhat.com
Cc: Paul Mackerras pau...@samba.org
Cc: Peter Zijlstra a.p.zijls...@chello.nl
Link: 
http://lkml.kernel.org/r/1357654134-28538-1-git-send-email-bige...@linutronix.de
Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com
---
 tools/perf/MANIFEST |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/MANIFEST b/tools/perf/MANIFEST
index 80db3f4..39d4106 100644
--- a/tools/perf/MANIFEST
+++ b/tools/perf/MANIFEST
@@ -11,11 +11,21 @@ lib/rbtree.c
 include/linux/swab.h
 arch/*/include/asm/unistd*.h
 arch/*/include/asm/perf_regs.h
+arch/*/include/uapi/asm/unistd*.h
+arch/*/include/uapi/asm/perf_regs.h
 arch/*/lib/memcpy*.S
 arch/*/lib/memset*.S
 include/linux/poison.h
 include/linux/magic.h
 include/linux/hw_breakpoint.h
+include/linux/rbtree_augmented.h
+include/uapi/linux/perf_event.h
+include/uapi/linux/const.h
+include/uapi/linux/swab.h
+include/uapi/linux/hw_breakpoint.h
 arch/x86/include/asm/svm.h
 arch/x86/include/asm/vmx.h
 arch/x86/include/asm/kvm_host.h
+arch/x86/include/uapi/asm/svm.h
+arch/x86/include/uapi/asm/vmx.h
+arch/x86/include/uapi/asm/kvm.h
-- 
1.7.9.2.358.g22243



[PATCH 1/2] perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel side

2013-01-10 Thread Arnaldo Carvalho de Melo
From: David Ahern dsah...@gmail.com

This patch is brought to you by the letter 'H'.

Commit 20b279 breaks compatibility with older perf binaries when run with
precise modifier (:p or :pp) by requiring the exclude_guest attribute to be
set. Older binaries default exclude_guest to 0 (ie., wanting guest-based
samples) unless host only profiling is requested (:H modifier). The workaround
for older binaries is to add H to the modifier list (e.g., -e cycles:ppH -
toggles exclude_guest to 1). This was deemed unacceptable by Linus:

https://lkml.org/lkml/2012/12/12/570

Between family in town and the fresh snow in Breckenridge there is no time left
to be working on the proper fix for this over the holidays. In the New Year I
have more pressing problems to resolve -- like some memory leaks in perf which
are proving to be elusive -- although the aforementioned snow is probably why
they are proving to be elusive. Either way I do not have any spare time to work
on this and from the time I have managed to spend on it the solution is more
difficult than just moving to a new exclude_guest flag (does not work) or
flipping the logic to include_guest (which is not as trivial as one would
think).

So, two options: silently force exclude_guest on as suggested by Gleb which
means no impact to older perf binaries or revert the original patch which
caused the breakage.

This patch does the latter -- reverts the original patch that introduced the
regression. The problem can be revisited in the future as time allows.

Signed-off-by: David Ahern dsah...@gmail.com
Cc: Avi Kivity a...@redhat.com
Cc: Gleb Natapov g...@redhat.com
Cc: Ingo Molnar mi...@kernel.org
Cc: Jiri Olsa jo...@redhat.com
Cc: Linus Torvalds torva...@linux-foundation.org
Cc: Peter Zijlstra pet...@infradead.org
Cc: Robert Richter robert.rich...@amd.com
Link: 
http://lkml.kernel.org/r/1356749767-17322-1-git-send-email-dsah...@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com
---
 arch/x86/kernel/cpu/perf_event.c |6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 4428fd1..6774c17 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -340,9 +340,6 @@ int x86_setup_perfctr(struct perf_event *event)
/* BTS is currently only allowed for user-mode. */
if (!attr->exclude_kernel)
return -EOPNOTSUPP;
-
-   if (!attr->exclude_guest)
-   return -EOPNOTSUPP;
}
 
hwc->config |= config;
@@ -385,9 +382,6 @@ int x86_pmu_hw_config(struct perf_event *event)
if (event->attr.precise_ip) {
int precise = 0;
 
-   if (!event->attr.exclude_guest)
-   return -EOPNOTSUPP;
-
/* Support for constant skid */
if (x86_pmu.pebs_active && !x86_pmu.pebs_broken) {
precise++;
-- 
1.7.9.2.358.g22243
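
For context, the attribute combination discussed in the commit message looks roughly like this from userspace. This is a sketch assuming the uapi header <linux/perf_event.h> is available; setup_precise_cycles() is a hypothetical helper, and the mapping of ":pp" to precise_ip = 2 and ":H" to exclude_guest = 1 follows the description above.

```c
#include <string.h>
#include <linux/perf_event.h>

/*
 * Fill in an attr for a precise cycles event.
 * host_only = 0 models an older binary running "cycles:pp"
 * host_only = 1 models the "cycles:ppH" workaround
 */
void setup_precise_cycles(struct perf_event_attr *attr, int host_only)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->precise_ip = 2;				/* ":pp" */
	attr->exclude_guest = host_only ? 1 : 0;	/* ":H"  */
}
```

With the checks removed by this revert, the kernel no longer rejects the host_only = 0 variant with -EOPNOTSUPP.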



[GIT PULL 0/2] perf/urgent fixes

2013-01-10 Thread Arnaldo Carvalho de Melo

Hi Ingo,

Please consider pulling,

Regards,

- Arnaldo

-- 
1.7.9.2.358.g22243

The following changes since commit 5c49985c21bba4d2f899e3a97121868a5c58a876:

  Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm (2013-01-09 
08:58:57 -0800)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux 
tags/perf-urgent-for-mingo

for you to fetch changes up to acb8fb04b74e1c26117b89945dc058b52b28ccb9:

  perf tools: Fix building from 'make perf-*-src-pkg' tarballs (2013-01-10 
16:03:26 -0300)


perf/urgent fixes:

. revert 20b279 - require exclude_guest to use PEBS - kernel side,
  now older binaries will continue working for things like cycles:pp
  without needing to pass extra modifiers, from David Ahern.

. Fix building from 'make perf-*-src-pkg' tarballs, broken by UAPI, from
  Sebastian Andrzej Siewior

Signed-off-by: Arnaldo Carvalho de Melo a...@redhat.com


David Ahern (1):
  perf x86: revert 20b279 - require exclude_guest to use PEBS - kernel side

Sebastian Andrzej Siewior (1):
  perf tools: Fix building from 'make perf-*-src-pkg' tarballs

 arch/x86/kernel/cpu/perf_event.c |6 --
 tools/perf/MANIFEST  |   10 ++
 2 files changed, 10 insertions(+), 6 deletions(-)


Re: [RFC PATCH v3 16/16] ARM: dts: add AM33XX SPI support

2013-01-10 Thread Matt Porter
On Thu, Jan 10, 2013 at 01:46:53PM -0600, Nishanth Menon wrote:
 On 14:35-20130110, Matt Porter wrote:
  On Sun, Oct 28, 2012 at 05:01:29PM +0530, Sekhar Nori wrote:
   On 10/18/2012 6:56 PM, Matt Porter wrote:
Adds AM33XX SPI support for am335x-bone and am335x-evm.

Signed-off-by: Matt Porter mpor...@ti.com
---
 arch/arm/boot/dts/am335x-bone.dts |   17 +++
 arch/arm/boot/dts/am335x-evm.dts  |9 
 arch/arm/boot/dts/am33xx.dtsi |   43 
+
 3 files changed, 69 insertions(+)

diff --git a/arch/arm/boot/dts/am335x-bone.dts 
b/arch/arm/boot/dts/am335x-bone.dts
index 5510979..23edfd8 100644
--- a/arch/arm/boot/dts/am335x-bone.dts
+++ b/arch/arm/boot/dts/am335x-bone.dts
@@ -18,6 +18,17 @@
reg = 0x8000 0x1000; /* 256 MB */
};
 
+   am3358_pinmux: pinmux@44e10800 {
+   spi1_pins: pinmux_spi1_pins {
+   pinctrl-single,pins = 
+   0x190 0x13  /* 
mcasp0_aclkx.spi1_sclk, OUTPUT_PULLUP | MODE3 */
+   0x194 0x33  /* mcasp0_fsx.spi1_d0, 
INPUT_PULLUP | MODE3 */
+   0x198 0x13  /* mcasp0_axr0.spi1_d1, 
OUTPUT_PULLUP | MODE3 */
 minor comment:
 doing as a 0x33 is better for both d1, d0 as D0,D1 can be switched between 
 SDI and SDO
 as needed with ti,pindir-d0-out-d1-in

Yes, thanks. Forgot about the new property and feature even though I
told somebody about it yesterday. :)

+   0x19c 0x13  /* 
mcasp0_ahclkr.spi1_cs0, OUTPUT_PULLUP | MODE3 */
+   ;
   
   Is there a single pinmux setting that provides SPI functionality on the
   bone headers? Or this is specific to a cape you tested with?
  
  No, there are two usable settings for spi1 and one setting for spi0.
  I'm dropping this from the series since it's specific to how I wired up
  the homebrew cape I use for spi testing on the Bone. I publish the
  branch where all these extra test-specific patches (that aren't intended
  to be merged) are at in the cover letter.  Anybody that needs context of
  how/what worked and was tested can grab them there.
 Possibly dumb question:
 Can't we have pre-usable spi configurations?  Like spi1_configuration1_pins,
 spi2_configuration1_pins, spi0_configuration1_pins? If documented with
 P9 pin names in the bone dts, it saves a bit of effort in looking up
 pad offset when dealing with capes.

Yes, let's introduce these things separately. I plan to reintroduce
patches to fix the incorrect 1-based numbering on many of the AM33xx as
previous ones were dropped...and I think this makes sense on top of
that.

-Matt
 -- 
 Regards,
 Nishanth Menon


Re: radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec

2013-01-10 Thread Alex Deucher
On Thu, Jan 10, 2013 at 3:32 PM, Borislav Petkov b...@alien8.de wrote:
 On Thu, Jan 10, 2013 at 11:21:21AM -0500, Alex Deucher wrote:
 I'm assuming you didn't also update your userspace gfx stack?

 By that you mean x.org etc, right? Or GPU microcode too? In any case, I
 haven't touched any of those deliberately, AFAICR at least.

Right.  Xorg drivers or mesa drivers.


 Does disabling the new DMA ring for ttm bo moves avoid the issue?

 How do I do that?

diff --git a/drivers/gpu/drm/radeon/radeon_asic.c b/drivers/gpu/drm/radeon/radeon_asic.c
index 9056faf..b0cc46d 100644
--- a/drivers/gpu/drm/radeon/radeon_asic.c
+++ b/drivers/gpu/drm/radeon/radeon_asic.c
@@ -974,8 +974,8 @@ static struct radeon_asic r600_asic = {
.blit_ring_index = RADEON_RING_TYPE_GFX_INDEX,
.dma = r600_copy_dma,
.dma_ring_index = R600_RING_TYPE_DMA_INDEX,
-   .copy = r600_copy_dma,
-   .copy_ring_index = R600_RING_TYPE_DMA_INDEX,
+   .copy = r600_copy_blit,
+   .copy_ring_index = RADEON_RING_TYPE_GFX_INDEX,
},
.surface = {
.set_reg = r600_set_surface_reg,
@@ -1058,8 +1058,8 @@ static struct radeon_asic rs780_asic = {
.blit_ring_index = RADEON_RING_TYPE_GFX_INDEX,
.dma = r600_copy_dma,
.dma_ring_index = R600_RING_TYPE_DMA_INDEX,
-   .copy = r600_copy_dma,
-   .copy_ring_index = R600_RING_TYPE_DMA_INDEX,
+   .copy = r600_copy_blit,
+   .copy_ring_index = RADEON_RING_TYPE_GFX_INDEX,
},
.surface = {
.set_reg = r600_set_surface_reg,



 Thanks.

 --
 Regards/Gruss,
 Boris.

 Sent from a fat crate under my desk. Formatting is fine.
 --


Re: [PATCH 07/14] usb: ehci-omap: Instantiate PHY devices if required

2013-01-10 Thread Alan Stern
On Thu, 10 Jan 2013, Roger Quadros wrote:

 If the OMAP's Host controller is in PHY mode then we instantiate
 a platform device for the PHY (one for each port in PHY mode) and
 hold a reference to it so that we can use the usb_phy API, e.g.
 while suspend/resume.
 
 The platform data for the PHY must be supplied in the newly added
 .phy_config parameter in struct usbhs_omap_platform_data.
 
 The end goal is to move the PHY's reset and power handling code
 out of the ehci-omap driver and into the phy driver.

As mentioned in another thread, I would prefer to have these changes to 
ehci-omap.c made after the driver is converted to the new "ehci-hcd is
a library" scheme.  The patch below does the conversion; it is meant to
apply on top of the similar patch for ehci-mxc posted recently on the 
linux-usb mailing list.

After this conversion, the omap_ehci_hcd private data structure doesn't 
have to be allocated specifically.  It can be handled in the same way 
as the private data structure in the ehci-mxc patch.

I haven't even tried to compile this.  Please let me know how it works.

Alan Stern



Index: usb-3.7/drivers/usb/host/Kconfig
===
--- usb-3.7.orig/drivers/usb/host/Kconfig
+++ usb-3.7/drivers/usb/host/Kconfig
@@ -155,7 +155,7 @@ config USB_EHCI_MXC
  Variation of ARC USB block used in some Freescale chips.
 
 config USB_EHCI_HCD_OMAP
-   bool "EHCI support for OMAP3 and later chips"
+   tristate "EHCI support for OMAP3 and later chips"
    depends on USB_EHCI_HCD && ARCH_OMAP
    default y
    ---help---
Index: usb-3.7/drivers/usb/host/Makefile
===
--- usb-3.7.orig/drivers/usb/host/Makefile
+++ usb-3.7/drivers/usb/host/Makefile
@@ -27,6 +27,7 @@ obj-$(CONFIG_USB_EHCI_HCD)+= ehci-hcd.o
 obj-$(CONFIG_USB_EHCI_PCI) += ehci-pci.o
 obj-$(CONFIG_USB_EHCI_HCD_PLATFORM)+= ehci-platform.o
 obj-$(CONFIG_USB_EHCI_MXC) += ehci-mxc.o
+obj-$(CONFIG_USB_EHCI_HCD_OMAP)+= ehci-omap.o
 
 obj-$(CONFIG_USB_OXU210HP_HCD) += oxu210hp-hcd.o
 obj-$(CONFIG_USB_ISP116X_HCD)  += isp116x-hcd.o
Index: usb-3.7/drivers/usb/host/ehci-hcd.c
===
--- usb-3.7.orig/drivers/usb/host/ehci-hcd.c
+++ usb-3.7/drivers/usb/host/ehci-hcd.c
@@ -1255,11 +1255,6 @@ MODULE_LICENSE ("GPL");
 #define PLATFORM_DRIVER         ehci_hcd_sh_driver
 #endif
 
-#ifdef CONFIG_USB_EHCI_HCD_OMAP
-#include "ehci-omap.c"
-#define PLATFORM_DRIVER         ehci_hcd_omap_driver
-#endif
-
 #ifdef CONFIG_PPC_PS3
 #include "ehci-ps3.c"
 #define PS3_SYSTEM_BUS_DRIVER   ps3_ehci_driver
@@ -1349,6 +1344,7 @@ MODULE_LICENSE ("GPL");
    !IS_ENABLED(CONFIG_USB_EHCI_HCD_PLATFORM) && \
    !defined(CONFIG_USB_CHIPIDEA_HOST) && \
    !defined(CONFIG_USB_EHCI_MXC) && \
+   !defined(CONFIG_USB_EHCI_HCD_OMAP) && \
    !defined(PLATFORM_DRIVER) && \
    !defined(PS3_SYSTEM_BUS_DRIVER) && \
    !defined(OF_PLATFORM_DRIVER) && \
Index: usb-3.7/drivers/usb/host/ehci-omap.c
===
--- usb-3.7.orig/drivers/usb/host/ehci-omap.c
+++ usb-3.7/drivers/usb/host/ehci-omap.c
@@ -36,6 +36,9 @@
  * - convert to use hwmod and runtime PM
  */
 
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/io.h>
 #include <linux/platform_device.h>
 #include <linux/slab.h>
 #include <linux/usb/ulpi.h>
@@ -44,6 +47,10 @@
 #include <linux/pm_runtime.h>
 #include <linux/gpio.h>
 #include <linux/clk.h>
+#include <linux/usb.h>
+#include <linux/usb/hcd.h>
+
+#include "ehci.h"
 
 /* EHCI Register Set */
 #define EHCI_INSNREG04 (0xA0)
@@ -56,11 +63,13 @@
 #defineEHCI_INSNREG05_ULPI_EXTREGADD_SHIFT 8
 #defineEHCI_INSNREG05_ULPI_WRDATA_SHIFT0
 
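-/*-------------------------------------------------------------------------*/
+#define DRIVER_DESC "OMAP-EHCI Host Controller driver"
 
-static const struct hc_driver ehci_omap_hc_driver;
+static const char hcd_name[] = "ehci-omap";
 
 
+/*-------------------------------------------------------------------------*/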
+
 static inline void ehci_write(void __iomem *base, u32 reg, u32 val)
 {
__raw_writel(val, base + reg);
@@ -165,6 +174,12 @@ static void disable_put_regulator(
 /* configure so an HC device and id are always provided */
 /* always called with process context; sleeping is OK */
 
+static struct hc_driver __read_mostly ehci_omap_hc_driver;
+
+static const struct ehci_driver_overrides ehci_omap_overrides __initdata = {
+   .reset =omap_ehci_init,
+};
+
 /**
  * ehci_hcd_omap_probe - initialize TI-based HCDs
  *
@@ -322,56 +337,33 @@ static struct platform_driver ehci_hcd_o
/*.suspend  = ehci_hcd_omap_suspend, */
/*.resume   = ehci_hcd_omap_resume, */
.driver = {
-   .name   = "ehci-omap",
+ 

Re: [RFC PATCH v3 16/16] ARM: dts: add AM33XX SPI support

2013-01-10 Thread Nishanth Menon
On 15:49-20130110, Matt Porter wrote:
 On Thu, Jan 10, 2013 at 01:46:53PM -0600, Nishanth Menon wrote:
  On 14:35-20130110, Matt Porter wrote:
   On Sun, Oct 28, 2012 at 05:01:29PM +0530, Sekhar Nori wrote:
On 10/18/2012 6:56 PM, Matt Porter wrote:
[...]
 
 + 0x19c 0x13  /* 
 mcasp0_ahclkr.spi1_cs0, OUTPUT_PULLUP | MODE3 */
 + ;

Is there a single pinmux setting that provides SPI functionality on the
bone headers? Or this is specific to a cape you tested with?
   
   No, there are two usable settings for spi1 and one setting for spi0.
   I'm dropping this from the series since it's specific to how I wired up
   the homebrew cape I use for spi testing on the Bone. I publish the
   branch where all these extra test-specific patches (that aren't intended
   to be merged) are at in the cover letter.  Anybody that needs context of
   how/what worked and was tested can grab them there.
  Possibly dumb question:
  Cant we have pre-usable spi configurations?  Like spi1_configuration1_pins,
  spi2_configuration1_pins, spi0_configuration1_pins? If documented with
  P9 pin names in the bone dts, it saves a bit of effort in looking up
  pad offset when dealing with capes.
 
 Yes, let's introduce these things separately. I plan to reintroduce
 patches to fix the incorrect 1-based numbering on many of the AM33xx as
 previous ones were dropped...and I think this makes sense on top of
 that.
sounds good to me.

-- 
Regards,
Nishanth Menon


Re: linux-next: Tree for Jan 10 (staging/sb105x)

2013-01-10 Thread Steven Rostedt
On Thu, 2013-01-10 at 11:08 -0800, Randy Dunlap wrote:
 On 01/09/13 19:32, Stephen Rothwell wrote:
  Hi all,
  
  Changes since 20130109:
  
 
 on x86_64, when CONFIG_PARPORT_PC is not enabled:
 
 drivers/built-in.o: In function `multi_init':
 sb_pci_mp.c:(.init.text+0x15684): undefined reference to 
 `parport_pc_probe_port'
 
 
 Full randconfig file is attached.

Ug, so https://lkml.org/lkml/2012/12/14/250 should have been
CONFIG_PARPORT_PC and not just PARPORT :-P

Can you try this patch:

diff --git a/drivers/staging/sb105x/sb_pci_mp.c b/drivers/staging/sb105x/sb_pci_mp.c
index 131afd0c..9464f38 100644
--- a/drivers/staging/sb105x/sb_pci_mp.c
+++ b/drivers/staging/sb105x/sb_pci_mp.c
@@ -3054,7 +3054,7 @@ static int init_mp_dev(struct pci_dev *pcidev, mppcibrd_t brd)
                sbdev->nr_ports = ((portnum_hex/16)*10) + (portnum_hex % 16);
            }
            break;
-#ifdef CONFIG_PARPORT
+#ifdef CONFIG_PARPORT_PC
        case PCI_DEVICE_ID_MP2S1P :
            sbdev->nr_ports = 2;
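
To illustrate why the guard has to name the option that actually provides the symbol, here is a small userspace sketch. The two CONFIG_ macros are hand-written stand-ins for Kconfig output, and parport_pc_probe_port_mock() and probe_parallel_ports() are hypothetical names, not functions from the sb105x driver.

```c
/*
 * Hypothetical stand-ins for Kconfig symbols: generic parport support is
 * enabled, but the PC-style low-level driver is not.
 */
#define CONFIG_PARPORT 1
/* #define CONFIG_PARPORT_PC 1 */

#ifdef CONFIG_PARPORT_PC
/* Mock of a function that only exists when the PC driver is built. */
static int parport_pc_probe_port_mock(void)
{
	return 1;
}
#endif

/* Returns the number of parallel ports that were probed. */
static int probe_parallel_ports(void)
{
#ifdef CONFIG_PARPORT_PC
	/* Guarded by the same option that provides the callee. */
	return parport_pc_probe_port_mock();
#else
	return 0;
#endif
}
```

If the call site were guarded by CONFIG_PARPORT instead, this configuration would reference a function that is never compiled, which is the same kind of undefined reference as in the randconfig report.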
 




Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-10 Thread Eric Dumazet
On Thu, 2013-01-10 at 19:42 +, Mel Gorman wrote:

 Thanks Eric, it's much appreciated. However, I'm still very much in favour
 of a partial revert as in retrospect the implementation of capture took the
 wrong approach. Could you confirm the following patch works for you?
 It should functionally have the same effect as the first revert and
 there are only minor changes from the last revert prototype I sent you
 but there is no harm in being sure.
 
 ---8---
 mm: compaction: Partially revert capture of suitable high-order page
 
 Eric Wong reported on 3.7 and 3.8-rc2 that ppoll() got stuck when waiting
 for POLLIN on a local TCP socket. It was easier to trigger if there was disk
 IO and dirty pages at the same time and he bisected it to commit 1fb3f8ca
 "mm: compaction: capture a suitable high-order page immediately when it
 is made available".
 
 The intention of that patch was to improve high-order allocations under
 memory pressure after changes made to reclaim in 3.6 drastically hurt
 THP allocations but the approach was flawed. For Eric, the problem was
 that page->pfmemalloc was not being cleared for captured pages leading to
 a poor interaction with swap-over-NFS support causing the packets to be
 dropped. However, I identified a few more problems with the patch including
 the fact that it can increase contention on zone->lock in some cases which
 could result in async direct compaction being aborted early.
 
 In retrospect the capture patch took the wrong approach. What it should
 have done is mark the pageblock being migrated as MIGRATE_ISOLATE if it
 was allocating for THP and avoided races that way. While the patch was
 showing to improve allocation success rates at the time, the benefit is
 marginal given the relative complexity and it should be revisited from
 scratch in the context of the other reclaim-related changes that have taken
 place since the patch was first written and tested. This patch partially
 reverts commit 1fb3f8ca "mm: compaction: capture a suitable high-order
 page immediately when it is made available".
 
 Reported-by: Eric Wong normalper...@yhbt.net
 Cc: sta...@vger.kernel.org
 Signed-off-by: Mel Gorman mgor...@suse.de
 ---

It seems to solve the problem on my kvm testbed

(512 MB of ram, 2 vcpus)

Tested-by: Eric Dumazet eduma...@google.com




Re: [PATCH 1/2] clk: Add composite clock type

2013-01-10 Thread Stephen Boyd
On 01/04/13 18:49, Prashant Gaikwad wrote:
 On Saturday 05 January 2013 03:48 AM, Stephen Boyd wrote:
 On 01/03/13 21:51, Prashant Gaikwad wrote:
 diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile
 index f0b269a..baf7608 100644
 --- a/drivers/clk/Makefile
 +++ b/drivers/clk/Makefile
 @@ -2,7 +2,8 @@
   obj-$(CONFIG_HAVE_CLK)+= clk-devres.o
   obj-$(CONFIG_CLKDEV_LOOKUP)+= clkdev.o
   obj-$(CONFIG_COMMON_CLK)+= clk.o clk-fixed-rate.o clk-gate.o \
 -   clk-mux.o clk-divider.o clk-fixed-factor.o
 +   clk-mux.o clk-divider.o clk-fixed-factor.o \
 +   clk-composite.o
 This list is getting a little out of hand. Should we sort it
 alphabetically and put each file on one line?

 Do you want me to do it in this patch?



No.

 +static u8 clk_composite_get_parent(struct clk_hw *hw)
 +{
 +    struct clk_composite *composite = to_clk_composite(hw);
 +    const struct clk_ops *mux_ops = composite->mux_ops;
 +    struct clk_hw *mux_hw = composite->mux_hw;
 +
 +    mux_hw->clk = hw->clk;
 Looks like this is already done down in the register function. Why are
 we doing it again here and in each op?

 Some ops gets called during clk_init which is before clk_register
 returns.



Hmm. Ok.
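
The ordering problem described above can be shown with a toy model. This is only a sketch under stated assumptions: mock_clk, mock_hw, composite and mock_register() are invented for the example and are not the common clock framework API. The only point is that a callback run from inside the register function cannot rely on anything the caller does after the register function returns.

```c
#include <stddef.h>

struct mock_clk { int unused; };          /* framework-owned handle */
struct mock_hw  { struct mock_clk *clk; };/* driver-side hook */

struct composite {
	struct mock_hw hw;      /* the object the framework knows about */
	struct mock_hw mux_hw;  /* inner component, invisible to the framework */
};

/* Inner op: only works if its hw already points at the clock. */
static int mux_get_parent(struct mock_hw *mux_hw)
{
	return mux_hw->clk ? 0 : -1;
}

/* Outer op: propagates the clock pointer before delegating. */
static int composite_get_parent(struct mock_hw *hw)
{
	struct composite *c = (struct composite *)hw; /* hw is the first member */

	c->mux_hw.clk = hw->clk;
	return mux_get_parent(&c->mux_hw);
}

/*
 * Mock registration: links hw to clk, then runs get_parent as part of
 * initialisation, before the caller sees the return value.
 */
static struct mock_clk *mock_register(struct mock_clk *clk, struct mock_hw *hw,
				      int (*get_parent)(struct mock_hw *),
				      int *init_result)
{
	hw->clk = clk;
	*init_result = get_parent(hw);
	return clk;
}
```

In this model, dropping the assignment in composite_get_parent() makes the inner op see a NULL clock during registration, even if the caller fixes up mux_hw.clk right after mock_register() returns.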

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



Re: [PATCH 05/14] lib: Add I/O map cache implementation

2013-01-10 Thread Jason Gunthorpe
On Thu, Jan 10, 2013 at 09:20:07PM +0100, Thierry Reding wrote:

  Arnd's version is good too, but you would be restricted to aligned
  powers of two for the bus number range in the DT, which is probably
  not that big a deal either?
 
 Stephen suggested on IRC that we could try to keep a bit of dynamicity
 in the allocation scheme if we create the bus mapping when the first
 device on the bus is probed and discard the mapping if no devices are
 found.

You probably don't need to mess around with 'discard on empty' the
kernel should only access bus numbers that are in the range of the
subordinate busses of the bridges. So if you establish a mapping on a
bus-by-bus basis at first access, it should be fine and very close to
minimal..
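
A minimal userspace sketch of the map-on-first-access idea, assuming one fixed-size window per bus number. The names bus_map_cache, bus_map_get() and bus_map_release() are hypothetical, the window size is a placeholder, and calloc() stands in for whatever the real code would use to map the bus window.

```c
#include <stdlib.h>

#define NUM_BUSES        256   /* bus numbers 0..255 */
#define BUS_WINDOW_SIZE  4096  /* placeholder size, an assumption */

struct bus_map_cache {
	void *window[NUM_BUSES];   /* NULL until the bus is first accessed */
	unsigned int nr_mapped;    /* how many windows are currently mapped */
};

/* Return the window for a bus, creating the mapping on first access. */
static void *bus_map_get(struct bus_map_cache *c, unsigned int bus)
{
	if (bus >= NUM_BUSES)
		return NULL;

	if (!c->window[bus]) {
		c->window[bus] = calloc(1, BUS_WINDOW_SIZE); /* stand-in for a real mapping */
		if (c->window[bus])
			c->nr_mapped++;
	}
	return c->window[bus];
}

/* Drop every mapping, e.g. when the host bridge goes away. */
static void bus_map_release(struct bus_map_cache *c)
{
	unsigned int bus;

	for (bus = 0; bus < NUM_BUSES; bus++) {
		free(c->window[bus]);
		c->window[bus] = NULL;
	}
	c->nr_mapped = 0;
}
```

Since only bus numbers that are actually accessed get a window, the number of mappings tracks the buses in use without a separate discard-on-empty step.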

Jason


Re: [PATCH v3 3/6] ACPI/pci_slot: update PCI slot information when PCI hotplug event happens

2013-01-10 Thread Myron Stowe
[+cc Yinghai]

On Wed, 2013-01-09 at 13:44 -0700, Bjorn Helgaas wrote:
 [+cc Myron]
 
 On Wed, Jan 9, 2013 at 1:19 PM, Rafael J. Wysocki r...@sisk.pl wrote:
  On Thursday, January 10, 2013 12:58:25 AM Jiang Liu wrote:
  Hi Rafael,
Thanks for your great efforts to review the patch.
 
  On 01/09/2013 08:01 AM, Rafael J. Wysocki wrote:
   Hi,
  
   On Wednesday, January 09, 2013 12:52:22 AM Jiang Liu wrote:
  snip
  
   +static void acpi_pci_slot_notify_add(struct pci_dev *dev)
   +{
   +  acpi_handle handle;
   +  struct callback_args context;
   +
   +  if (!dev->subordinate)
   +  return;
   +
   +  mutex_lock(&slot_list_lock);
   +  handle = DEVICE_ACPI_HANDLE(&dev->dev);
   +  context.root_handle = acpi_find_root_bridge_handle(dev);
  
   There's a patch under discussion that removes this function.
  
   Isn't there any other way to do this?
I will try to find a way to get rid of calling 
  acpi_find_root_bridge_handle,
  and it seems doable.
 
  
   +  if (handle && context.root_handle) {
   +  context.pci_bus = dev->subordinate;
   +  context.user_function = register_slot;
   +  acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, (u32)1,
  
   You can just pass 1 here I think.  Does the compiler complain?
  Thanks for reminder, the (u32) is unnecessary.
 
  
   +  register_slot, NULL, &context, NULL);
   +  }
   +  mutex_unlock(&slot_list_lock);
   +}
   +
   +static void acpi_pci_slot_notify_del(struct pci_dev *dev)
   +{
   +  struct acpi_pci_slot *slot, *tmp;
   +  struct pci_bus *bus = dev->subordinate;
   +
   +  if (!bus)
   +  return;
   +
   +  mutex_lock(&slot_list_lock);
   +  list_for_each_entry_safe(slot, tmp, &slot_list, list)
   +  if (slot->pci_slot && slot->pci_slot->bus == bus) {
   +  list_del(&slot->list);
   +  pci_destroy_slot(slot->pci_slot);
   +  put_device(&bus->dev);
   +  kfree(slot);
   +  }
   +  mutex_unlock(&slot_list_lock);
   +}
   +
   +static int acpi_pci_slot_notify_fn(struct notifier_block *nb,
   + unsigned long event, void *data)
   +{
   +  struct device *dev = data;
   +
   +  switch (event) {
   +  case BUS_NOTIFY_ADD_DEVICE:
   +  acpi_pci_slot_notify_add(to_pci_dev(dev));
   +  break;
  
   Do I think correctly that this is going to be called for every PCI device
   added to the system, even if it's not a bridge?
  You are right. Function acpi_pci_slot_notify_add() and 
  acpi_pci_slot_notify_del()
  will check whether it's a bridge. If preferred, I will move the check up 
  into
  acpi_pci_slot_notify_fn().
 
  
   +  case BUS_NOTIFY_DEL_DEVICE:
   +  acpi_pci_slot_notify_del(to_pci_dev(dev));
   +  break;
   +  default:
   +  return NOTIFY_DONE;
   +  }
   +
   +  return NOTIFY_OK;
   +}
   +
   +static struct notifier_block acpi_pci_slot_notifier = {
   +  .notifier_call = acpi_pci_slot_notify_fn,
   +};
   +
static int __init
acpi_pci_slot_init(void)
{
  dmi_check_system(acpi_pci_slot_dmi_table);
  acpi_pci_register_driver(&acpi_pci_slot_driver);
   +  bus_register_notifier(&pci_bus_type, &acpi_pci_slot_notifier);
  
   I wonder if/why this has to be so convoluted?
  
   We have found a PCI bridge in the ACPI namespace, so we've created a 
   struct
   acpi_device for it and we've walked the namespace below it already.
  
   Now we're creating a struct pci_dev for it and while registering it we're
   going to walk the namespace below the bridge again to find and register 
   its
   slots and that is done indirectly from a bus type notifier.
  
   Why can't we enumerate the slots directly upfront?
  Do you mean to create the PCI slot devices when creating the ACPI devices?
  I think there are two factors prevent us from doing that.
  The first is that the ACPI pci_slot driver could be built as a module, so
  we can't call into it from the ACPI core.
 
  I didn't say about calling the pci_slot driver from the ACPI core, but about
  enumerating slots in a way suitable for consumption by the pci_slot driver
  when it's ready.
 
  That said I really don't see a value in having a modular pci_slot driver.
  It is part of the hotplug infrastructure and should always be present for
  this reason, so we don't need to worry about the pci_slot driver not present
  case.
 
 I agree that there's no value in supporting CONFIG_ACPI_PCI_SLOT=m.  I
 think Myron has some patches that remove that case.
 
 I'm not sure what the best way to merge them is.  We have a bunch of
 stuff this cycle that touches both ACPI and PCI.

Rafael:

The series Bjorn mentions is at https://lkml.org/lkml/2012/12/7/11  It
converts both the ACPI Hot Plug PCI Controller Driver (acpiphp) and
ACPI PCI Slot Detection Driver (pci_slot) sub-drivers to built-in
drivers (i.e. no longer supported as modules).

Yinghai commented back - https://lkml.org/lkml/2012/12/7/29 - indicating
a 

Re: [PATCH v3 3/6] ACPI/pci_slot: update PCI slot information when PCI hotplug event happens

2013-01-10 Thread Rafael J. Wysocki
On Thursday, January 10, 2013 02:24:23 PM Myron Stowe wrote:
 [+cc Yinghai]
 
 On Wed, 2013-01-09 at 13:44 -0700, Bjorn Helgaas wrote:
  [+cc Myron]
  
  On Wed, Jan 9, 2013 at 1:19 PM, Rafael J. Wysocki r...@sisk.pl wrote:
   On Thursday, January 10, 2013 12:58:25 AM Jiang Liu wrote:
   Hi Rafael,
 Thanks for your great efforts to review the patch.
  
   On 01/09/2013 08:01 AM, Rafael J. Wysocki wrote:
Hi,
   
On Wednesday, January 09, 2013 12:52:22 AM Jiang Liu wrote:
   snip
   
+static void acpi_pci_slot_notify_add(struct pci_dev *dev)
+{
+  acpi_handle handle;
+  struct callback_args context;
+
+  if (!dev->subordinate)
+  return;
+
+  mutex_lock(&slot_list_lock);
+  handle = DEVICE_ACPI_HANDLE(&dev->dev);
+  context.root_handle = acpi_find_root_bridge_handle(dev);
   
There's a patch under discussion that removes this function.
   
Isn't there any other way to do this?
 I will try to find a way to get rid of calling 
   acpi_find_root_bridge_handle,
   and it seems doable.
  
   
+  if (handle && context.root_handle) {
+  context.pci_bus = dev->subordinate;
+  context.user_function = register_slot;
+  acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, (u32)1,
   
You can just pass 1 here I think.  Does the compiler complain?
   Thanks for reminder, the (u32) is unnecessary.
  
   
+  register_slot, NULL, &context, NULL);
+  }
+  mutex_unlock(&slot_list_lock);
+}
+
+static void acpi_pci_slot_notify_del(struct pci_dev *dev)
+{
+  struct acpi_pci_slot *slot, *tmp;
+  struct pci_bus *bus = dev->subordinate;
+
+  if (!bus)
+  return;
+
+  mutex_lock(&slot_list_lock);
+  list_for_each_entry_safe(slot, tmp, &slot_list, list)
+  if (slot->pci_slot && slot->pci_slot->bus == bus) {
+  list_del(&slot->list);
+  pci_destroy_slot(slot->pci_slot);
+  put_device(&bus->dev);
+  kfree(slot);
+  }
+  mutex_unlock(&slot_list_lock);
+}
+
+static int acpi_pci_slot_notify_fn(struct notifier_block *nb,
+ unsigned long event, void *data)
+{
+  struct device *dev = data;
+
+  switch (event) {
+  case BUS_NOTIFY_ADD_DEVICE:
+  acpi_pci_slot_notify_add(to_pci_dev(dev));
+  break;
   
Do I think correctly that this is going to be called for every PCI 
device
added to the system, even if it's not a bridge?
   You are right. Function acpi_pci_slot_notify_add() and 
   acpi_pci_slot_notify_del()
   will check whether it's a bridge. If preferred, I will move the check up 
   into
   acpi_pci_slot_notify_fn().
  
   
+  case BUS_NOTIFY_DEL_DEVICE:
+  acpi_pci_slot_notify_del(to_pci_dev(dev));
+  break;
+  default:
+  return NOTIFY_DONE;
+  }
+
+  return NOTIFY_OK;
+}
+
+static struct notifier_block acpi_pci_slot_notifier = {
+  .notifier_call = acpi_pci_slot_notify_fn,
+};
+
 static int __init
 acpi_pci_slot_init(void)
 {
   dmi_check_system(acpi_pci_slot_dmi_table);
   acpi_pci_register_driver(&acpi_pci_slot_driver);
+  bus_register_notifier(&pci_bus_type, &acpi_pci_slot_notifier);
   
I wonder if/why this has to be so convoluted?
   
We have found a PCI bridge in the ACPI namespace, so we've created a 
struct
acpi_device for it and we've walked the namespace below it already.
   
Now we're creating a struct pci_dev for it and while registering it 
we're
going to walk the namespace below the bridge again to find and 
register its
slots and that is done indirectly from a bus type notifier.
   
Why can't we enumerate the slots directly upfront?
   Do you mean to create the PCI slot devices when creating the ACPI 
   devices?
   I think there are two factors prevent us from doing that.
   The first is that the ACPI pci_slot driver could be built as a module, so
   we can't call into it from the ACPI core.
  
   I didn't say about calling the pci_slot driver from the ACPI core, but 
   about
   enumerating slots in a way suitable for consumption by the pci_slot driver
   when it's ready.
  
   That said I really don't see a value in having a modular pci_slot driver.
   It is part of the hotplug infrastructure and should always be present for
   this reason, so we don't need to worry about the pci_slot driver not present
   case.
  
  I agree that there's no value in supporting CONFIG_ACPI_PCI_SLOT=m.  I
  think Myron has some patches that remove that case.
  
  I'm not sure what the best way to merge them is.  We have a bunch of
  stuff this cycle that touches both ACPI and PCI.
 
 Rafael:
 
 The series Bjorn mentions is at https://lkml.org/lkml/2012/12/7/11  It
 converts both the ACPI Hot 

Re: [PATCH 05/11] spi/pxa2xx: make clock rate configurable from platform data

2013-01-10 Thread Rafael J. Wysocki
On Thursday, January 10, 2013 03:58:52 PM Mika Westerberg wrote:
 On Thu, Jan 10, 2013 at 01:33:19PM +, Mark Brown wrote:
  On Thu, Jan 10, 2013 at 02:18:08PM +0100, Rafael J. Wysocki wrote:
   On Thursday, January 10, 2013 12:51:59 PM Mark Brown wrote:
  
Sounds sensible, yes - about what I'd expect.  Is it possible to match
on CPUID or similar information (given that this is all in the SoC)
instead of ACPI, that might be more robust I guess?
  
   This particular part may be present in different SoCs.
  
  Right, but I'd expect you could enumerate the SoCs?  Someone might
  decide to change the clock configuration for future SoCs anyway.
 
 Well, they can use the same LPSS block with a different CPU but then we
 expect the ACPI IDs to change as well (so we can then make another set of
 clocks for those).

Also, as I said in another message, even if the LPSS block is used, the SPI
may not be exposed to us by the BIOS, so SoC enumeration is not sufficient
in general.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [PATCH v5 11/18] perf tools: add mem access sampling core support

2013-01-10 Thread Stephane Eranian
On Wed, Jan 9, 2013 at 5:55 PM, Jiri Olsa jo...@redhat.com wrote:
 On Mon, Jan 07, 2013 at 07:27:50PM +0100, Stephane Eranian wrote:

 SNIP


 +static void ip__resolve_data(struct machine *self, struct thread *thread,
 +  u8 m,
 + struct addr_map_symbol *ams,
 + u64 addr)
 +{
 + struct addr_location al;
 +
 + memset(&al, 0, sizeof(al));
 +
 + thread__find_addr_location(thread, self, m, MAP__VARIABLE, addr, &al,
 +NULL);
 + ams->addr = addr;
 + ams->al_addr = al.addr;
 + ams->sym = al.sym;
 + ams->map = al.map;
 +}
 +
 +struct mem_info *machine__resolve_mem(struct machine *self,
 +   struct thread *thr,
 +   struct perf_sample *sample,
 +   u8 cpumode)
 +{
 + struct mem_info *mi;
 +
 + mi = calloc(1, sizeof(struct mem_info));
 + if (!mi)
 + return NULL;
 +
 + ip__resolve_ams(self, thr, &mi->iaddr, sample->ip);
 + ip__resolve_data(self, thr, cpumode, &mi->daddr, sample->addr);
 + mi->cost = sample->weight;
 + mi->dsrc.val = sample->dsrc;
 +
 + return mi;
 +}
 +

 The crash I reported happens because some maps can be removed
 via map_groups__fixup_overlappings.

 The attached patch makes the code work for me, but we might
 want to have some global unified fix for that, since this
 is not the only place suffering from that.

 Like globally setting map->referenced in the add_hist_entry or
 hist_entry__new functions..


Would something like that work for you (untested)?

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 7034500..fc05b9f 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -274,6 +274,12 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template)
 
        if (he->ms.map)
            he->ms.map->referenced = true;
+       if (he->mem_info) {
+           if (he->mem_info->iaddr.map)
+               he->mem_info->iaddr.map->referenced = true;
+           if (he->mem_info->daddr.map)
+               he->mem_info->daddr.map->referenced = true;
+       }
        if (symbol_conf.use_callchain)
            callchain_init(he->callchain);


[git pull] drm intel fixes

2013-01-10 Thread Dave Airlie

Hi Linus,

Just intel fixes, including getting the Ironlake systems back to the state 
they were in for 3.6.

Dave.

The following changes since commit ecf02a607bd801e742d7bb35c6e40f7ca15edf03:

  Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86 
(2013-01-10 09:09:41 -0800)

are available in the git repository at:


  git://people.freedesktop.org/~airlied/linux drm-fixes

for you to fetch changes up to 82ba789f48de669fd0bbc84c326f07571d078572:

  Merge branch 'drm-intel-fixes' of 
git://people.freedesktop.org/~danvet/drm-intel (2013-01-11 07:52:48 +1000)



Chris Wilson (6):
  drm/i915; Only increment the user-pin-count after successfully pinning the bo
  drm/i915: Treat crtc->mode.clock == 0 as disabled
  drm: Only evict the blocks required to create the requested hole
  drm/i915: The sprite scaler on Ironlake also support YUV planes
  drm/i915: Add DEBUG messages to all intel_create_user_framebuffer error paths
  drm/i915: Use pixel size for computing linear offsets into a sprite

Daniel Vetter (2):
  Revert "drm/i915: no lvds quirk for Zotac ZDBOX SD ID12/ID13"
  drm/i915: Revert shrinker changes from "Track unbound pages"

Dave Airlie (1):
  Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel

 drivers/gpu/drm/drm_mm.c | 45 ++--
 drivers/gpu/drm/i915/i915_gem.c  | 25 ++--
 drivers/gpu/drm/i915/intel_display.c | 33 +++---
 drivers/gpu/drm/i915/intel_lvds.c|  8 ---
 drivers/gpu/drm/i915/intel_pm.c  | 25 
 drivers/gpu/drm/i915/intel_sprite.c  | 10 
 include/drm/drm_mm.h |  2 +-
 7 files changed, 81 insertions(+), 67 deletions(-)


Re: [PATCH 2/2] mm: forcely swapout when we are out of page cache

2013-01-10 Thread Andrew Morton
On Thu, 10 Jan 2013 11:23:06 +0900
Minchan Kim minc...@kernel.org wrote:

  I have a feeling that laptop mode has bitrotted and these patches are
  kinda hacking around as-yet-not-understood failures...
 
 Absolutely, this patch is a last guard against unexpected behavior.
 As I mentioned in the cover letter, Luigi's problem could be solved by either
 [1/2] or [2/2], but I wanted to add this as a last resort in case of an
 unexpected emergency. But you're right. It's not good to hide the problem
 like this patch does, so let's drop [2/2].
 
 Also, I absolutely agree it has bitrotted, so to correct it we need a
 volunteer who can investigate it with long-running power-saving experiments.
 So [1/2] would be a band-aid until then.

I'm inclined to hold off on 1/2 as well, really.

The point of laptop_mode isn't to save power btw - it is to minimise
the frequency with which the disk drive is spun up.  By deferring and
then batching writeout operations, basically.
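
As a rough illustration of that point, here is a toy model (not kernel code) that counts disk spin-ups for a series of write timestamps, once with immediate writeout and once with deferred, batched writeout. The function names and the time units are assumptions for the example, and the batched case assumes the disk spins down again between flushes.

```c
/*
 * Immediate writeout: every write goes to disk when it is issued.
 * The disk needs a spin-up for the first write and whenever it has been
 * idle for at least idle_timeout since the previous write.
 * t[] holds write times in ascending order.
 */
static int spinups_immediate(const int *t, int n, int idle_timeout)
{
	int spinups = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (i == 0 || t[i] - t[i - 1] >= idle_timeout)
			spinups++;
	}
	return spinups;
}

/*
 * Deferred writeout: dirty data is held back for up to max_defer time
 * units after the oldest pending write, then flushed in one batch.
 * Each batch costs one spin-up.
 */
static int spinups_batched(const int *t, int n, int max_defer)
{
	int spinups = 0;
	int i = 0;

	while (i < n) {
		int flush_at = t[i] + max_defer; /* oldest dirty data sets the deadline */

		while (i < n && t[i] <= flush_at)
			i++;                     /* later writes join the same batch */
		spinups++;
	}
	return spinups;
}
```

For writes at times 0, 10, 20, 30, 40 and 50 with an idle timeout of 5, the immediate variant spins the disk up for every write, while deferring by up to 60 units needs a single spin-up.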


[RFC] Reproducible OOM with partial workaround

2013-01-10 Thread paul . szabo
Dear Linux-MM,

On a machine with i386 kernel and over 32GB RAM, an OOM condition is
reliably obtained simply by writing a few files to some local disk
e.g. with:
  n=0; while [ $n -lt 99 ]; do dd bs=1M count=1024 if=/dev/zero of=x$n; ((n=$n+1)); done
Crash usually occurs after 16 or 32 files written. Seems that the
problem may be avoided by using mem=32G on the kernel boot, and that
it occurs with any amount of RAM over 32GB.

I developed a workaround patch for this particular OOM demo, dropping
filesystem caches when about to exhaust lowmem. However, subsequently
I observed OOM when running many processes (as yet I do not have an
easy-to-reproduce demo of this); so as I suspected, the essence of the
problem is not with FS caches.

Could you please help in finding the cause of this OOM bug?

Please see
http://bugs.debian.org/695182
for details, in particular my workaround patch
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=101;att=1;bug=695182

(Please reply to me directly, as I am not a subscriber to the linux-mm
mailing list.)

Thanks, Paul

Paul Szabo   p...@maths.usyd.edu.au   http://www.maths.usyd.edu.au/u/psz/
School of Mathematics and Statistics   University of SydneyAustralia


Re: linux-next: Tree for Jan 10 (staging/sb105x)

2013-01-10 Thread Randy Dunlap
On 01/10/13 12:50, Steven Rostedt wrote:
 On Thu, 2013-01-10 at 11:08 -0800, Randy Dunlap wrote:
 On 01/09/13 19:32, Stephen Rothwell wrote:
 Hi all,

 Changes since 20130109:


 on x86_64, when CONFIG_PARPORT_PC is not enabled:

 drivers/built-in.o: In function `multi_init':
 sb_pci_mp.c:(.init.text+0x15684): undefined reference to `parport_pc_probe_port'


 Full randconfig file is attached.
 
 Ug, so https://lkml.org/lkml/2012/12/14/250 should have been
 CONFIG_PARPORT_PC and not just PARPORT :-P
 
 Can you try this patch:

That's good.  Thanks.

Acked-by: Randy Dunlap rdun...@infradead.org


 diff --git a/drivers/staging/sb105x/sb_pci_mp.c b/drivers/staging/sb105x/sb_pci_mp.c
 index 131afd0c..9464f38 100644
 --- a/drivers/staging/sb105x/sb_pci_mp.c
 +++ b/drivers/staging/sb105x/sb_pci_mp.c
 @@ -3054,7 +3054,7 @@ static int init_mp_dev(struct pci_dev *pcidev, mppcibrd_t brd)
   sbdev->nr_ports = ((portnum_hex/16)*10) + (portnum_hex % 16);
   }
   break;
 -#ifdef CONFIG_PARPORT
 +#ifdef CONFIG_PARPORT_PC
   case PCI_DEVICE_ID_MP2S1P :
   sbdev->nr_ports = 2;
  
 
 
 --


-- 
~Randy


Re: [PATCH v2] drivers/pinctrl: grab default handles from device core

2013-01-10 Thread Stephen Warren
On 01/10/2013 01:42 PM, Stephen Warren wrote:
 On 12/12/2012 01:25 PM, Linus Walleij wrote:
 From: Linus Walleij linus.wall...@linaro.org

 This makes the device core auto-grab the pinctrl handle and set
 the default (PINCTRL_STATE_DEFAULT) state for every device
 that is present in the device model right before probe. This will
 account for the lion's share of embedded silicon devices.
 
 There are quite a few problems with this patch, and they end up
 completely breaking at least Tegra in next-20130110.
 
 diff --git a/drivers/base/pinctrl.c b/drivers/base/pinctrl.c
 
 +int pinctrl_bind_pins(struct device *dev)
 +{
 +	struct dev_pin_info *dpi;
 +	int ret;
 +
 +	/* Allocate a pin state container on-the-fly */
 +	if (!dev->pins) {
 +		dpi = devm_kzalloc(dev, sizeof(*dpi), GFP_KERNEL);
 +		if (!dpi)
 +			return -ENOMEM;
 +	} else
 +		dpi = dev->pins;
 +
 +	/*
 +	 * Check if we already have a pinctrl handle, as we may arrive here
 +	 * after a deferral in the state selection below
 +	 */
 +	if (!dpi->p) {
 +		dpi->p = devm_pinctrl_get(dev);
 
 That won't succeed for a pinctrl device that has a default state in
 order to implement hogs. This will then cause the pin controller device
 to always defer probe and never activate. This will leave HW
 unconfigured and/or prevent other devices from successfully calling
 pinctrl_get().

I see that an attempt was made to solve this problem, in the patch
immediately preceding this one (at least, as applied in the pinctrl
tree). However, that patch only addresses the case where the pin
controller is being looked up in the map, and not the case when
converting device tree to the map in the first place. The patch below
solves this:

diff --git a/drivers/pinctrl/devicetree.c b/drivers/pinctrl/devicetree.c
index fe2d1af..fd40a11 100644
--- a/drivers/pinctrl/devicetree.c
+++ b/drivers/pinctrl/devicetree.c
@@ -141,6 +141,11 @@ static int dt_to_map_one_config(struct pinctrl *p, const char *statename,
 		pctldev = find_pinctrl_by_of_node(np_pctldev);
 		if (pctldev)
 			break;
+		/* Do not defer probing of hogs (circular loop) */
+		if (np_pctldev == p->dev->of_node) {
+			of_node_put(np_pctldev);
+			return -ENODEV;
+		}
 	}
 	of_node_put(np_pctldev);




RE: [PATCHv2 8/9] zswap: add to mm/

2013-01-10 Thread Dan Magenheimer
 From: Seth Jennings [mailto:sjenn...@linux.vnet.ibm.com]
 Subject: [PATCHv2 8/9] zswap: add to mm/
 
 zswap is a thin compression backend for frontswap. It receives
 pages from frontswap and attempts to store them in a compressed
 memory pool, resulting in an effective partial memory reclaim and
 dramatically reduced swap device I/O.
 
 Additionally, in most cases, pages can be retrieved from this
 compressed store much more quickly than reading from traditional
 swap devices, resulting in faster performance for many workloads.
 
 This patch adds the zswap driver to mm/
 
 Signed-off-by: Seth Jennings sjenn...@linux.vnet.ibm.com

I've implemented the equivalent of zswap_flush_*
in zcache.  It looks much better than my earlier
attempt at similar code to move zpages to swap.
Nice work and thanks!

But... (isn't there always a but;-)...

 +/*
 + * This limit is arbitrary for now until a better
 + * policy can be implemented. This is so we don't
 + * eat all of RAM decompressing pages for writeback.
 + */
 +#define ZSWAP_MAX_OUTSTANDING_FLUSHES 64
 +	if (atomic_read(&zswap_outstanding_flushes) >
 +		ZSWAP_MAX_OUTSTANDING_FLUSHES)
 +		return;

From what I can see, zcache is in some ways more aggressive in
some circumstances in flushing (zcache calls it unuse),
and in some ways less aggressive.  But with significant exercise,
I can always cause the kernel to OOM when it is under heavy
memory pressure and the flush/unuse code is being used.

Have you given any further thought to a better policy
(see the comment in the snippet above)?  I'm going
to try a smaller number than 64 to see if the OOMs
go away, but choosing a random number for this throttling
doesn't seem like a good plan for moving forward.
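
For reference, the throttle in the quoted snippet can be modeled in
userspace. This is only a sketch under assumptions: flush_try_start() and
flush_done() are made-up names, the counter stands in for
zswap_outstanding_flushes, and the compare-and-swap loop is a substitution
for the quoted read-then-return check, so that the check and the increment
happen as one step. It shows the mechanism, not a better policy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Same arbitrary cap as the quoted snippet. */
#define MAX_OUTSTANDING_FLUSHES 64

/* Hypothetical stand-in for zswap_outstanding_flushes. */
static atomic_int outstanding_flushes;

/* Try to reserve a flush slot; returns false when the cap is reached. */
static bool flush_try_start(void)
{
	int cur = atomic_load(&outstanding_flushes);

	while (cur < MAX_OUTSTANDING_FLUSHES) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&outstanding_flushes,
						 &cur, cur + 1))
			return true;
	}
	return false;
}

/* Release a slot once the writeback for that page has completed. */
static void flush_done(void)
{
	int prev = atomic_fetch_sub(&outstanding_flushes, 1);

	assert(prev > 0);
}
```

A caller would skip the flush when flush_try_start() returns false and
call flush_done() from the writeback completion path. Lowering the cap
only changes how much memory can be tied up in decompressed pages at
once; it does not make the number any less arbitrary.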

Thanks,
Dan

P.S. I know you, like I, often use something kernbench-ish to
exercise your code.  I've found that compiling a kernel,
then switching to another kernel directory, doing a git pull,
and compiling that kernel, causes a lot of flushes/unuses
and the OOMs.  (This with 1GB RAM booting RHEL6 with a full GUI.)


Re: [PATCH v2] atkbd: Fix multi-char scancode handling on reconnect.

2013-01-10 Thread Shawn Nematbakhsh
Hi Dmitry,

On Sun, Jan 6, 2013 at 1:10 AM, Dmitry Torokhov
dmitry.torok...@gmail.com wrote:
 Hi Shawn,

 On Thu, Dec 20, 2012 at 06:33:11PM -0800, Shawn Nematbakhsh wrote:
 On resume from suspend there is a possibility for multi-byte scancodes
 to be handled incorrectly. atkbd_reconnect disables the processing of
 scancodes in software by calling atkbd_disable, but the keyboard may
 still be active because no disconnect command was sent. Later, software
 handling is re-enabled. If a multi-byte scancode sent from the keyboard
 straddles the re-enable, only the latter byte(s) will be handled.

 In practice, this leads to cases where multi-byte break codes (ex. e0
 4d - break code for right-arrow) are misread as make codes (4d - make
 code for numeric 6), leading to one or more unwanted, untyped characters
 being interpreted.

 The solution implemented here involves sending command f5 (reset
 disable) to the keyboard prior to disabling software handling of codes.
 Later, the command to re-enable the keyboard is sent only after we are
 prepared to handle scancodes.

 The core tries to avoid disturbing devices that are not keyboards, so I
 believe we should check the device ID first and if it is keyboard, then
 do the reset. We should also reset the internal state (emul and xl_bit)
 when re-enabling the device.

 Does the version of the patch below work for you?

 Thanks.

 --
 Dmitry

Thanks for the revision. This version of the patch tests good. I agree
with your changes, it makes sense to deactivate the keyboard from both
connect and reconnect, just in case the keyboard is able to send
scancodes at this point.
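
The straddle described in the commit message can be reproduced with a toy
decoder. A sketch only: struct decoder, decoder_feed() and the enabled
flag are hypothetical and stand in for the driver's software handling;
the byte values e0 and 4d are the ones from the example in the commit
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct key_event {
	bool extended;	/* true if the code was preceded by the e0 prefix */
	uint8_t code;
};

struct decoder {
	bool enabled;	/* software handling on/off */
	bool saw_e0;	/* prefix byte seen, waiting for the next byte */
};

/*
 * Feed one byte. Returns true and fills *ev when a complete code has been
 * decoded. Bytes that arrive while the decoder is disabled are dropped.
 */
static bool decoder_feed(struct decoder *d, uint8_t byte, struct key_event *ev)
{
	assert(d && ev);

	if (!d->enabled)
		return false;
	if (byte == 0xe0) {
		d->saw_e0 = true;
		return false;
	}
	ev->extended = d->saw_e0;
	ev->code = byte;
	d->saw_e0 = false;
	return true;
}
```

If e0 arrives while handling is disabled and 4d arrives after the
re-enable, the decoder reports a plain 4d instead of an extended one.
Keeping the keyboard quiet until software is ready avoids ever seeing
half of a sequence.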

--
Shawn


 Input: atkbd - fix multi-byte scancode handling on reconnect

 From: Shawn Nematbakhsh sha...@chromium.org

 On resume from suspend there is a possibility for multi-byte scancodes
 to be handled incorrectly. atkbd_reconnect disables the processing of
 scancodes in software by calling atkbd_disable, but the keyboard may
 still be active because no disconnect command was sent. Later, software
 handling is re-enabled. If a multi-byte scancode sent from the keyboard
 straddles the re-enable, only the latter byte(s) will be handled.

 In practice, this leads to cases where multi-byte break codes (ex. e0
 4d - break code for right-arrow) are misread as make codes (4d - make
 code for numeric 6), leading to one or more unwanted, untyped characters
 being interpreted.

 The solution implemented here involves sending command f5 (reset
 disable) to the keyboard prior to disabling software handling of codes.
 Later, the command to re-enable the keyboard is sent only after we are
 prepared to handle scancodes.

 Signed-off-by: Shawn Nematbakhsh sha...@chromium.org
 Signed-off-by: Dmitry Torokhov dmitry.torok...@gmail.com
 ---
  drivers/input/keyboard/atkbd.c |   72 
 
  1 file changed, 51 insertions(+), 21 deletions(-)

 diff --git a/drivers/input/keyboard/atkbd.c b/drivers/input/keyboard/atkbd.c
 index 33d0fcd..2626773 100644
 --- a/drivers/input/keyboard/atkbd.c
 +++ b/drivers/input/keyboard/atkbd.c
 @@ -676,6 +676,39 @@ static inline void atkbd_disable(struct atkbd *atkbd)
  	serio_continue_rx(atkbd->ps2dev.serio);
  }

 +static int atkbd_activate(struct atkbd *atkbd)
 +{
 +	struct ps2dev *ps2dev = &atkbd->ps2dev;
 +
 +/*
 + * Enable the keyboard to receive keystrokes.
 + */
 +
 +	if (ps2_command(ps2dev, NULL, ATKBD_CMD_ENABLE)) {
 +		dev_err(&ps2dev->serio->dev,
 +			"Failed to enable keyboard on %s\n",
 +			ps2dev->serio->phys);
 +		return -1;
 +	}
 +
 +	return 0;
 +}
 +
 +/*
 + * atkbd_deactivate() resets and disables the keyboard from sending
 + * keystrokes.
 + */
 +
 +static void atkbd_deactivate(struct atkbd *atkbd)
 +{
 +	struct ps2dev *ps2dev = &atkbd->ps2dev;
 +
 +	if (ps2_command(ps2dev, NULL, ATKBD_CMD_RESET_DIS))
 +		dev_err(&ps2dev->serio->dev,
 +			"Failed to deactivate keyboard on %s\n",
 +			ps2dev->serio->phys);
 +}
 +
  /*
   * atkbd_probe() probes for an AT keyboard on a serio port.
   */
 @@ -731,6 +764,12 @@ static int atkbd_probe(struct atkbd *atkbd)
  		return -1;
  	}

 +/*
 + * Make sure nothing is coming from the keyboard and disturbs our
 + * internal state.
 + */
 +	atkbd_deactivate(atkbd);
 +
  	return 0;
  }

 @@ -825,24 +864,6 @@ static int atkbd_reset_state(struct atkbd *atkbd)
  	return 0;
  }

 -static int atkbd_activate(struct atkbd *atkbd)
 -{
 -	struct ps2dev *ps2dev = &atkbd->ps2dev;
 -
 -/*
 - * Enable the keyboard to receive keystrokes.
 - */
 -
 -	if (ps2_command(ps2dev, NULL, ATKBD_CMD_ENABLE)) {
 -		dev_err(&ps2dev->serio->dev,
 -			"Failed to enable keyboard on %s\n",
 -			ps2dev->serio->phys);
 -		return -1;
 -	}
 -
 -	return 0;
 -}
 -
  /*
   * 

Re: [PATCH 0/5] x86,smp: make ticket spinlock proportional backoff w/ auto tuning

2013-01-10 Thread Chegu Vinod

On 1/8/2013 2:26 PM, Rik van Riel wrote:
...

Performance is within the margin of error of v2, so the graph
has not been updated.

Please let me know if you manage to break this code in any way,
so I can fix it...



Attached below is some preliminary data with one of the AIM7 micro-benchmark
workloads (i.e. high_systime). This is a kernel-intensive workload which
does tons of forks/execs etc. and stresses quite a few of the same set
of spinlocks and semaphores.

Observed a drop in performance as we go to 40way and 80way. Wondering
if the backoff keeps increasing to such an extent that it actually starts
to hurt given the nature of this workload? Also, in the case of 80way,
observed quite a bit of variation from run to run...

Also ran it inside a single KVM guest. There were some perf. dips but
interestingly didn't observe the same level of drop (compared to the
drop in the native case) as the guest size was scaled up to 40vcpu or
80vcpu.

FYI
Vinod



---

Platform : 8 socket (80 Core) Westmere with 1TB RAM.

Workload: AIM7-highsystime microbenchmark - 2000 users and 100 jobs per user.

Values reported are Jobs Per Minute (higher is better). The values
are the average of 3 runs.

1) Native run:
--------------

Config 1:  3.7 kernel
Config 2:  3.7 + Rik's 1-4 patches

            20way    40way    80way

Config 1    ~179K    ~159K    ~146K

Config 2    ~180K    ~134K    ~21K-43K  (high variation!)

(Note: Used numactl to restrict workload to
2 sockets (20way) and 4 sockets (40way))

--

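2) KVM run:
-----------

Single guest of different sizes (no overcommit, NUMA enabled in the guest).

Note: This kernel-intensive micro-benchmark exposes the PLE handler issue,
      esp. for large guests. Since Raghu's PLE changes are not yet upstream,
      I have just run with the current PLE handler and then with PLE
      disabled (ple_gap=0).

Config 1 : Host and Guest at 3.7
Config 2 : Host and Guest at 3.7 + Rik's 1-4 patches

------------------------------------------------------------
            20vcpu/128G       40vcpu/256G       80vcpu/512G
            (on 2 sockets)    (on 4 sockets)    (on 8 sockets)
------------------------------------------------------------
Config 1    ~144K             ~39K              ~10K
------------------------------------------------------------
Config 2    ~143K             ~37.5K            ~11K
------------------------------------------------------------

Config 3 : Host and Guest at 3.7 AND ple_gap=0
Config 4 : Host and Guest at 3.7 + Rik's 1-4 patches AND ple_gap=0

------------------------------------------------------------
Config 3    ~154K             ~131K             ~116K
------------------------------------------------------------
Config 4    ~151K             ~130K             ~115K
------------------------------------------------------------


(Note: Used numactl to restrict qemu to
2 sockets (20way) and 4 sockets (40way))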


Re: regression, bisected: openpty fails from 3.7 onwards without devpts

2013-01-10 Thread Alan Cox
 getptsname expects EINVAL on failure to fall back to /dev/ttyp*... The
 same as unlockpt. We should definitely revert now and can teach glibc
 to accept also ENOTTY. After some years, we can try again :).

Strongly disagree for two reasons

1. We don't want to leave the other ioctls broken and non-compliant if
we can avoid it.

2. The userspace code here has quite reasonable expectations which are
that TIOCGPTN is not an unknown ioctl but is invalid to use in this
situation.

So we should just fix TIOCGPTN on a pty with no suitable name answer to
return -EINVAL (even though it would be fun to break userspace and make
Linus teach more rude words to non English speakers).

In this case I think the userspace expectations are actually perfectly
fair anyway (even without considering the rather important criterion
of 'it worked last week').
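
The userspace expectation being discussed can be written down as a small
decision function. This is a sketch based only on the behaviour described
in the quoted text (EINVAL from TIOCGPTN triggers the fallback to
/dev/ttyp*); pts_decide() and the enum are hypothetical names, not glibc
code.

```c
#include <assert.h>
#include <errno.h>

enum pts_action {
	PTS_USE_DEVPTS,		/* ioctl succeeded: name comes from the pty number */
	PTS_FALLBACK_BSD,	/* EINVAL: fall back to legacy /dev/ttyp* naming */
	PTS_FAIL		/* anything else: report the error to the caller */
};

/* Decide what to do from the ioctl return value and the saved errno. */
static enum pts_action pts_decide(int ioctl_ret, int err)
{
	/* ioctl() returns 0 on success and -1 on failure here. */
	assert(ioctl_ret == 0 || ioctl_ret == -1);

	if (ioctl_ret == 0)
		return PTS_USE_DEVPTS;
	return err == EINVAL ? PTS_FALLBACK_BSD : PTS_FAIL;
}
```

With this logic a kernel that answers ENOTTY sends the caller down the
failure path instead of the fallback, which is the regression being
reported; returning -EINVAL from the kernel keeps existing binaries
working.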

Alan


Re: [PATCH] hardlockup: detect hard lockups without NMIs using secondary cpus

2013-01-10 Thread Colin Cross
On Thu, Jan 10, 2013 at 12:38 PM, Tony Lindgren t...@atomide.com wrote:

 * Colin Cross ccr...@android.com [130109 18:05]:
  +static void watchdog_check_hardlockup_other_cpu(void)
  +{
  +	int cpu;
  +	cpumask_t cpus = watchdog_cpus;
  +
  +	/*
  +	 * Test for hardlockups every 3 samples.  The sample period is
  +	 *  watchdog_thresh * 2 / 5, so 3 samples gets us back to slightly over
  +	 *  watchdog_thresh (over by 20%).
  +	 */
  +	if (__this_cpu_read(hrtimer_interrupts) % 3 != 0)
  +		return;
  +
  +	/* check for a hardlockup on the next cpu */
  +	cpu = cpumask_next(smp_processor_id(), &cpus);

 Hmm, don't you want to check cpu_online_mask here and
 return if the other CPU is offline?

watchdog_cpus is effectively a local copy of cpu_online_mask, but
updated after the watchdog_nmi_touch in watchdog_nmi_enable.  This
avoids a false positive after hotplugging in a cpu, when the cpu is
already set in cpu_online_mask but hasn't yet run its first
hrtimer.

  +	if (cpu >= nr_cpu_ids)
  +		cpu = cpumask_first(&cpus);
  +	if (cpu == smp_processor_id())
  +		return;
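
The selection logic and the timing comment in the quoted hunk can be
tried out in userspace. A sketch under assumptions: a 32-bit mask stands
in for watchdog_cpus, and mask_next(), cpu_to_check() and
check_interval_ds() are hypothetical helpers that mimic how
cpumask_next()/cpumask_first() are used above.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPU_IDS 32

/* Next set bit strictly after 'cpu', or NR_CPU_IDS if there is none. */
static int mask_next(int cpu, uint32_t mask)
{
	for (int i = cpu + 1; i < NR_CPU_IDS; i++)
		if (mask & (UINT32_C(1) << i))
			return i;
	return NR_CPU_IDS;
}

/* CPU that 'self' should check, or -1 if the mask holds no other CPU. */
static int cpu_to_check(int self, uint32_t mask)
{
	int cpu;

	assert(self >= 0 && self < NR_CPU_IDS);

	cpu = mask_next(self, mask);
	if (cpu >= NR_CPU_IDS)
		cpu = mask_next(-1, mask);	/* wrap around to the first set bit */
	if (cpu >= NR_CPU_IDS || cpu == self)
		return -1;
	return cpu;
}

/* Time between checks in tenths of a second, for a threshold in seconds. */
static int check_interval_ds(int watchdog_thresh)
{
	/* sample period = thresh * 2 / 5; a check runs on every 3rd sample */
	return watchdog_thresh * 10 * 2 / 5 * 3;
}
```

With watchdog_thresh = 10 the interval comes out at 120 tenths of a
second, i.e. 12 seconds, which is the 20% overshoot mentioned in the
comment. Each CPU checks its successor in the mask, so every CPU in the
mask is watched by exactly one other CPU as long as at least two are
present.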

 Regards,

 Tony


Re: [PATCH] fs: Disable preempt when acquire i_size_seqcount write lock

2013-01-10 Thread Andrew Morton
On Wed, 9 Jan 2013 11:34:19 +0800
Fan Du fan...@windriver.com wrote:

 Two rt tasks bind to one CPU core.
 
 The higher priority rt task A preempts a lower priority rt task B which
 has already taken the write seq lock, and then the higher priority
 rt task A tries to acquire the read seq lock; it's doomed to lock up.
 
 rt task A with lower priority:          rt task B with higher priority:
 call write                              call sync, and preempt task A
   i_size_write                            i_size_read
   write_seqcount_begin(&inode->i_size_seqcount);
   inode->i_size = i_size;                 read_seqcount_begin <-- lockup here...
 

Ouch.

And even if the preempting task is preemptible, it will spend an entire
timeslice pointlessly spinning, which isn't very good.

 So disabling preemption when acquiring every i_size_seqcount *write* lock
 will cure the problem.
 
 ...

 --- a/include/linux/fs.h
 +++ b/include/linux/fs.h
 @@ -758,9 +758,11 @@ static inline loff_t i_size_read(const struct inode *inode)
  static inline void i_size_write(struct inode *inode, loff_t i_size)
  {
  #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
 +	preempt_disable();
  	write_seqcount_begin(&inode->i_size_seqcount);
  	inode->i_size = i_size;
  	write_seqcount_end(&inode->i_size_seqcount);
 +	preempt_enable();
  #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
  	preempt_disable();
  	inode->i_size = i_size;

afaict all write_seqcount_begin()/read_seqretry() sites are vulnerable
to this problem.  Would it not be better to do the preempt_disable() in
write_seqcount_begin()?


Possible problems:

- mm/filemap_xip.c does disk I/O under write_seqcount_begin().

- dev_change_name() does GFP_KERNEL allocations under write_seqcount_begin()

- I didn't review u64_stats_update_begin() callers.

But I think calling schedule() under preempt_disable() is OK anyway?
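
The lockup in the report can be modeled with a userspace sequence
counter. A sketch only: struct seqcount here is a simplified stand-in
built on C11 atomics, and read_try_begin() is a hypothetical single
iteration of the reader's begin loop, so the stuck case can be observed
without actually spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct seqcount {
	atomic_uint sequence;	/* odd while a write is in progress */
};

static void write_begin(struct seqcount *s)
{
	atomic_fetch_add(&s->sequence, 1);
}

static void write_end(struct seqcount *s)
{
	atomic_fetch_add(&s->sequence, 1);
}

/*
 * One iteration of the reader's begin loop. Returns false while a write is
 * in progress; a real reader would keep looping until the writer finishes.
 */
static bool read_try_begin(struct seqcount *s, unsigned int *start)
{
	assert(s && start);

	*start = atomic_load(&s->sequence);
	return (*start & 1) == 0;
}

/* True if a write happened since read_try_begin(); the read must be redone. */
static bool read_retry(struct seqcount *s, unsigned int start)
{
	return atomic_load(&s->sequence) != start;
}
```

If the writer is preempted between write_begin() and write_end() by a
higher priority reader on the same CPU, read_try_begin() keeps returning
false and the writer never gets to run write_end(). Disabling preemption
around the write section closes that window.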

