[PATCH 1/3] mlx4_core: Stash PCI ID driver_data in mlx4_priv structure

2012-09-27 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com That way we can check flags later on, when we've finished with the pci_device_id structure. Also convert the is VF flag to an enum: Never do in the preprocessor what can be done in C. Signed-off-by: Roland Dreier rol...@purestorage.com --- drivers/net
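
A minimal sketch of the idea in this patch description, not the patch itself. The names EXAMPLE_PCI_DEV_IS_VF, struct example_priv and its pci_dev_data field are assumptions made up for illustration. It shows a flag defined as an enum constant rather than a #define, and a copy of driver_data kept in a private structure so the flag can still be tested once probe has finished with the pci_device_id entry.

```c
#include <stdbool.h>

/*
 * Hypothetical flag, defined in C instead of the preprocessor.
 * The macro equivalent would be "#define EXAMPLE_PCI_DEV_IS_VF 1".
 */
enum {
	EXAMPLE_PCI_DEV_IS_VF = 1 << 0,
};

/* Hypothetical private structure holding a copy of driver_data. */
struct example_priv {
	unsigned long pci_dev_data;
};

/* Stash driver_data at probe time so it stays available afterwards. */
static void example_stash(struct example_priv *priv, unsigned long driver_data)
{
	priv->pci_dev_data = driver_data;
}

/* Check the stashed flags later, without the pci_device_id structure. */
static bool example_is_vf(const struct example_priv *priv)
{
	return (priv->pci_dev_data & EXAMPLE_PCI_DEV_IS_VF) != 0;
}
```

Unlike a macro, the enum constant is a named value the compiler knows about, so it obeys normal scoping rules.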

Re: [PATCH] mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem

2012-09-26 Thread Roland Dreier
On Tue, Sep 25, 2012 at 9:46 PM, Roland Dreier rol...@kernel.org wrote: By the way, I still get a steady stream of mlx4_core :05:00.0: Unknown command:0x4d accepted from slave:0 mlx4_core :05:00.0: Sense command failed for port: 1 once I load the driver... it seems SENSE_PORT

Quick mlx4 IB SR-IOV howto?

2012-09-26 Thread Roland Dreier
So I have SR-IOV enabled on a ConnectX-3 adapter, and I loaded the driver with num_vfs=1 probe_vf=1, so in the host I see: # The master device $ ibv_devinfo -d mlx4_1 hca_id: mlx4_1 transport: InfiniBand (0) fw_ver: 2.11.500

[PATCH] mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem

2012-09-25 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com On an SR-IOV master device, __mlx4_init_one() calls mlx4_init_hca() before mlx4_multi_func_init(). However, for unlucky configurations, mlx4_init_hca() might call mlx4_SENSE_PORT() (via mlx4_dev_cap()), and that calls mlx4_cmd_imm

Re: [PATCH] mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem

2012-09-25 Thread Roland Dreier
By the way, I still get a steady stream of mlx4_core :05:00.0: Unknown command:0x4d accepted from slave:0 mlx4_core :05:00.0: Sense command failed for port: 1 once I load the driver... it seems SENSE_PORT is a wrapped command but there's no entry in cmd_info[] for it? - R. -- To

Re: linux-next: build failure after merge of the akpm tree

2012-09-24 Thread Roland Dreier
On Mon, Sep 24, 2012 at 7:02 AM, Stephen Rothwell wrote: > After merging the akpm tree, today's linux-next build (powerpc > ppc64_defconfig) failed like this: > > drivers/infiniband/hw/mlx4/cm.c: In function 'id_map_alloc': > drivers/infiniband/hw/mlx4/cm.c:228:36: error: 'MAX_ID_MASK' undeclared

Re: linux-next: build failure after merge of the akpm tree

2012-09-24 Thread Roland Dreier
On Mon, Sep 24, 2012 at 7:02 AM, Stephen Rothwell s...@canb.auug.org.au wrote: After merging the akpm tree, today's linux-next build (powerpc ppc64_defconfig) failed like this: drivers/infiniband/hw/mlx4/cm.c: In function 'id_map_alloc': drivers/infiniband/hw/mlx4/cm.c:228:36: error:

Re: [PATCH for-next V2 11/22] IB/mlx4: Add CM paravirtualization

2012-09-24 Thread Roland Dreier
On Fri, Aug 3, 2012 at 1:40 AM, Jack Morgenstein ja...@dev.mellanox.co.il wrote: +static struct id_map_entry * +id_map_alloc(struct ib_device *ibdev, int slave_id, u32 sl_cm_id) +{ + int ret, id; + static int next_id; + struct id_map_entry *ent; + struct

Re: [PATCH for-next V2 01/22] IB/core: Reserve bits in enum ib_qp_create_flags for low-level driver use

2012-09-24 Thread Roland Dreier
So I applied this whole series, with the plan to merge this for 3.7. Please send any changes as patches on top of what's already merged. Thanks, Roland -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo

Re: [PATCH for-next V2 21/22] {NET,IB}/mlx4: Modify proxy/tunnel QP mechanism so that guests do no calculations

2012-09-22 Thread Roland Dreier
On Fri, Aug 3, 2012 at 1:40 AM, Jack Morgenstein ja...@dev.mellanox.co.il wrote: Previously, the structure of a guest's proxy QPs followed the structure of the PPF special qps (qp0 port 1, qp0 port 2, qp1 port 1, qp1 port 2, ...). The guest then did offset calculations on the sqp_base qp

Re: [PATCH] qla2xxx: Fix endianness of task management response code

2012-09-21 Thread Roland Dreier
On Fri, Sep 21, 2012 at 1:02 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: The data in status1 appears to get used a word at a time ... what about the other three bytes you don't set; are they guaranteed to be zero? (in which case this works, it just looks wrong from the way

Re: [PATCH] qla2xxx: Fix endianness of task management response code

2012-09-19 Thread Roland Dreier
On Wed, Sep 19, 2012 at 12:59 AM, James Bottomley james.bottom...@hansenpartnership.com wrote: Is this also true on Big Endian Hardware? Because the fix you have assumes that the TIO IOCB with SCSI status mode 1 should be CPU endian ... that doesn't look right since this is passed directly

Re: [PATCH] IB/ipoib: Fix crash resulted as of use after free for multicast object

2012-09-18 Thread Roland Dreier
On Thu, Aug 30, 2012 at 12:01 AM, Patrick McHardy ka...@trash.net wrote: Fix a crash in ipoib_mcast_join_task(). With help from Or Gerlitz. Thanks, applied. But srsly, read over the changelog in that attachment and think about a bit more proofreading next time around. - R.

[PATCH] qla2xxx: Fix endianness of task management response code

2012-09-18 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com The qla2xxx firmware actually expects the task management response code in a CTIO IOCB with SCSI status mode 1 to be in little-endian byte order, ie the response code should be the first byte in the sense_data[] array. The old code erroneously byte
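
To illustrate the byte-order point above: storing a value in little-endian order puts its least significant byte at index 0 of the array, whatever the host CPU's endianness. The helper put_le32() below is a hypothetical stand-in written for this example, not the driver's code.

```c
#include <stdint.h>

/*
 * Store a 32-bit value into a byte array in little-endian order.
 * Working byte by byte makes the result independent of host endianness.
 */
static void put_le32(uint8_t *dst, uint32_t val)
{
	dst[0] = (uint8_t)(val & 0xff);          /* least significant byte first */
	dst[1] = (uint8_t)((val >> 8) & 0xff);
	dst[2] = (uint8_t)((val >> 16) & 0xff);
	dst[3] = (uint8_t)((val >> 24) & 0xff);  /* most significant byte last */
}
```

A small response code written this way lands in the first byte, with the remaining three bytes zero.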

[GIT PULL] please pull infiniband.git

2012-09-17 Thread Roland Dreier
of unsignaled WQE Roland Dreier (1): Merge branches 'cxgb4', 'ipoib', 'mlx4', 'ocrdma' and 'qib' into for-next Shlomo Pongratz (2): IPoIB: Fix memory leak in the neigh table deletion flow IPoIB: Fix AB-BA deadlock when deleting neighbours Wei Yongjun (1): RDMA/cxgb4: Move

Re: [PATCH] ib_srp: Fix use-after-free in srp_reset_req()

2012-09-17 Thread Roland Dreier
On Mon, Sep 17, 2012 at 12:31 PM, David Dillow dillo...@ornl.gov wrote: Roland, are you planning to apply this one and the one Bart has that fixes the error handling (srp_aport)? I didn't see them in your pull request. Sorry, I guess I missed them. I think I'll probably just put it into 3.7

Re: [PATCH fixes/for-3.6 3/3] net/mlx4_core: Enable 8TB memory registration

2012-09-13 Thread Roland Dreier
thanks, applied

Re: [PATCH fixes/for-3.6 2/3] IB/ipoib: Fix AB-BA deadlock when deleting neighbours

2012-09-12 Thread Roland Dreier
thanks, applied this and 1/3 On Wed, Aug 29, 2012 at 8:14 AM, Or Gerlitz ogerl...@mellanox.com wrote: while ((neigh = rcu_dereference_protected(*np, - lockdep_is_held(&ntbl->rwlock))) != NULL) { +

Re: [PATCH for-next V2 03/22] IB/core: Add ib_find_exact_cached_pkey() to search for 16-bit pkey match

2012-09-11 Thread Roland Dreier
On Tue, Sep 11, 2012 at 10:12 AM, Doug Ledford dledf...@redhat.com wrote: As a second note, I would like to know why Intel (previously QLogic) does not use these functions in their driver and what it would take to get all drivers to use the functions. Do we need to add more to them? In my

Re: [PATCH] RDMA/cxgb4: move the dereference below the NULL test

2012-09-11 Thread Roland Dreier
applied, thanks

Re: [PATCH for-next V2 03/22] IB/core: Add ib_find_exact_cached_pkey() to search for 16-bit pkey match

2012-09-11 Thread Roland Dreier
On Tue, Sep 11, 2012 at 1:34 PM, Doug Ledford dledf...@redhat.com wrote: Well, at this point, the mlx4 driver uses them, the rdmacm kernel driver uses them, and both QLogic/Intel drivers have their own internal pkey table implementation. So, it isn't so much upper layer as it is drivers.

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-09-09 Thread Roland Dreier
On Sun, Sep 9, 2012 at 7:09 AM, Or Gerlitz or.gerl...@gmail.com wrote: I honestly do see how - the height of the root node is updated independently of the slots, so if someone managed to get the updated height there is nothing stopping radix_tree_lookup from going too deep into the chain of

Re: [PATCH RFC for-next] net/mlx4_core: Fix racy flow in the driver CQ completion handler

2012-08-30 Thread Roland Dreier
On Thu, Aug 30, 2012 at 3:17 PM, Or Gerlitz or.gerl...@gmail.com wrote: Roland Dreier rol...@kernel.org wrote: Can you be explicit about the race you're worried about? few 1. on the time CQ A is deleted an interrupt that relates to CQ B takes place and a radix tree lookup is running

Re: rsockets and fork

2012-08-24 Thread Roland Dreier
On Wed, Aug 22, 2012 at 4:35 PM, Hefty, Sean sean.he...@intel.com wrote: I haven't identified the specific problem with fork support, but I did see this in libmlx4: mlx4_alloc_context() { ... context->uar = mmap(NULL, to_mdev(ibdev)->page_size, PROT_WRITE,

[Bug 1037107] Re: Mellanox ConnectX-3 HCA's are not supported (MT27500 Family)

2012-08-19 Thread Roland Dreier
And the fix is to update to a new libmlx4 version, which is already in 12.10 / Quantal. I'm not sure if there's anything that can be done in the context of continuing maintenance to 12.04. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to

Re: [PATCH] target: Remove unused se_cmd.cmd_spdtl

2012-08-18 Thread Roland Dreier
On Fri, Aug 17, 2012 at 6:02 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: No, or at least that is not what happens anymore with current target core + iscsi-target code.. The se_cmd->data_length re-assignment here is what will be used by iscsi-target fabric code for all iSCSI

[GIT PULL] please pull infiniband.git

2012-08-17 Thread Roland Dreier
Roland Dreier (3): RDMA/ocrdma: Don't call vlan_dev_real_dev() for non-VLAN netdevs mlx4_core: Clean up buddy bitmap allocation Merge branches 'cma', 'ipoib', 'misc', 'mlx4', 'ocrdma', 'qib' and 'srp' into for-next Shlomo Pongratz (2): IB/ipoib: Add missing locking when

Re: Two more ib_srp patches

2012-08-17 Thread Roland Dreier
On Fri, Aug 17, 2012 at 8:07 AM, David Dillow dillo...@ornl.gov wrote: On Fri, 2012-08-17 at 05:50 -0400, Bart Van Assche wrote: Hello Dave, I think I have found two additional (longstanding) ib_srp issues. Do the patches below make sense to you ? If so, do you prefer that I post these as

Re: [infiniband:for-next 9/19] drivers/net/ethernet/mellanox/mlx4/mr.c:134:4: error: implicit declaration of function 'vmalloc'

2012-08-16 Thread Roland Dreier
On Wed, Aug 15, 2012 at 5:35 PM, Fengguang Wu fengguang...@intel.com wrote: drivers/net/ethernet/mellanox/mlx4/mr.c: In function 'mlx4_buddy_init': drivers/net/ethernet/mellanox/mlx4/mr.c:134:4: error: implicit declaration of function 'vmalloc' [-Werror=implicit-function-declaration]

Re: linux-next: build failure after merge of the infiniband tree

2012-08-15 Thread Roland Dreier
On Wed, Aug 15, 2012 at 6:44 PM, Stephen Rothwell wrote: > After merging the infiniband tree, today's linux-next build (powerpc > ppc64_defconfig) failed like this: > > drivers/net/ethernet/mellanox/mlx4/mr.c: In function 'mlx4_buddy_init': > drivers/net/ethernet/mellanox/mlx4/mr.c:134:4: error:

Re: linux-next: build failure after merge of the infiniband tree

2012-08-15 Thread Roland Dreier
On Wed, Aug 15, 2012 at 6:44 PM, Stephen Rothwell s...@canb.auug.org.au wrote: After merging the infiniband tree, today's linux-next build (powerpc ppc64_defconfig) failed like this: drivers/net/ethernet/mellanox/mlx4/mr.c: In function 'mlx4_buddy_init':

Re: [PATCHv1] RDMA/ocrdma: Fixed CONFIG_VLAN_8021Q.

2012-08-15 Thread Roland Dreier
On Sat, Aug 11, 2012 at 6:28 AM, Parav Pandit parav.pan...@emulex.com wrote: +static struct net_device *ocrdma_get_real_netdev(struct net_device *netdev) +{ +#if IS_ENABLED(CONFIG_VLAN_8021Q) + return vlan_dev_real_dev(netdev); +#else + return netdev; +#endif +} As I said

Re: [PATCH] RDMA/ucma.c: Different fix for ucma context uid=0, causing iWarp RDMA applications to fail in connection establishment

2012-08-10 Thread Roland Dreier
On Sat, Aug 4, 2012 at 11:48 PM, Hefty, Sean sean.he...@intel.com wrote: Roland, there's a race here where ucma_set_event_context() copies ctx->uid to the event structure outside of the mutex. Once the mutex is acquired, ctx->uid is checked. However, the uid could have changed between saving

Re: IB softirq race

2012-08-10 Thread Roland Dreier
On Fri, Aug 10, 2012 at 6:03 AM, Sebastian Riemer sebastian.rie...@profitbricks.com wrote: we've got a gateway machine which is connected to the internet via ethernet and is connected with our KVM VMs-providing cloud infrastructure via IB. There must have been a race with softirqs. We've got a

[Bug 1030156] Re: long-running compile leads to laptop overheat shutdown

2012-08-03 Thread Roland Dreier
I'm pretty sure things have gotten worse for me since updating precise -> quantal. I used to be able to build kernels with -j2, and since the update, I get thermal shutdowns sometimes even running a single-threaded (ie no -j) build. Maybe this is just coincidence and my fan has gotten more

Re: [PATCH] RDMA/ucma.c: Fix for ucma context uid=0, causing iWarp RDMA applications to fail in connection establishment

2012-08-03 Thread Roland Dreier
On Thu, Aug 2, 2012 at 10:05 PM, Hefty, Sean sean.he...@intel.com wrote: The file->mut should protect against an event being reported to user space with the uid set to 0. Tatyana, can you give more details of how you hit a librdmacm crash?

Re: [PATCH v3 26/32] PCI/mthca: use PCIe capabilities access functions to simplify implementation

2012-08-02 Thread Roland Dreier
> Use PCIe capabilities access functions to simplify mthca driver's > implementation. Acked-by: Roland Dreier -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kern

Re: [PATCH v3 26/32] PCI/mthca: use PCIe capabilities access functions to simplify implementation

2012-08-02 Thread Roland Dreier
Use PCIe capabilities access functions to simplify mthca driver's implementation. Acked-by: Roland Dreier rol...@purestorage.com

Re: FDR HCA only doing QDR?

2012-08-02 Thread Roland Dreier
On Thu, Aug 2, 2012 at 1:18 AM, Albert Strasheim full...@gmail.com wrote: active_speed: 10.0 Gbps (8) You actually seem to be getting FDR-10 (raw speed undecoded of 8 instead of QDR==4). There may be some software/firmware issue but I would double check that
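
As a small illustration of the raw values mentioned in this message, here is a lookup that maps the two active_speed encodings named above to their link-speed names. The function name is hypothetical, and every other encoding is deliberately reported as unknown rather than guessed.

```c
#include <stdint.h>

/*
 * Map a raw active_speed value to a link-speed name.
 * Only the two encodings discussed in the message are covered.
 */
static const char *example_speed_name(uint8_t active_speed)
{
	switch (active_speed) {
	case 4:
		return "QDR";
	case 8:
		return "FDR-10";
	default:
		return "unknown";
	}
}
```

This is why a "10.0 Gbps (8)" line is not the same thing as QDR: the decoded per-lane rate looks alike, but the raw value in parentheses differs.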

Re: memory region limit at 32 GB?

2012-08-02 Thread Roland Dreier
On Thu, Aug 2, 2012 at 8:03 AM, Albert Strasheim full...@gmail.com wrote: Did something happen with this discussion? possible bug when scaling MTT table size with system ram http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg11728.html I thought I had merged a patch to use vmalloc for

Re: mellanox mlx4_core and SR-IOV

2012-08-01 Thread Roland Dreier
On Wed, Aug 1, 2012 at 6:38 AM, Lukas Hejtmanek wrote: > [3.558296] mlx4_core :02:00.0: not enough MMIO resources for SR-IOV > (nres: 0, iov->nres: 1) This comes from the core sriov_enable() function, not anything in mlx4. (although my kernel doesn't have the print of nres in that

Re: mellanox mlx4_core and SR-IOV

2012-08-01 Thread Roland Dreier
On Wed, Aug 1, 2012 at 6:38 AM, Lukas Hejtmanek xhejt...@ics.muni.cz wrote: [3.558296] mlx4_core :02:00.0: not enough MMIO resources for SR-IOV (nres: 0, iov->nres: 1) This comes from the core sriov_enable() function, not anything in mlx4. (although my kernel doesn't have the print of

[GIT PULL] please pull infiniband.git

2012-07-31 Thread Roland Dreier
size of cc_supported_table_entries Roland Dreier (3): RDMA/ocrdma: Fix check of GSI CQs RDMA/ucma: Convert open-coded equivalent to memdup_user() Merge branches 'cma', 'ipoib', 'ocrdma' and 'qib' into for-next Shlomo Pongratz (1): IPoIB: Use a private hash table for path

Re: [for-next PATCH] IB/IPoIB: correct typo errors

2012-07-30 Thread Roland Dreier
thanks, rolled these fixes into the main patch.

Re: [for-next PATCH] IB/IPoIB: correct typo errors

2012-07-30 Thread Roland Dreier
On Mon, Jul 30, 2012 at 2:44 AM, Shlomo Pongratz shlo...@mellanox.com wrote: - lockdep_is_held(&ntbl->lock))) != NULL) { + is_held(&ntbl->rwlock))) != NULL) { By the way, I assume this is a typo -- there is no plain is_held(), is there?

Re: [PATCH] IB/qib: correct smatch issue in qib_init.c

2012-07-29 Thread Roland Dreier
Thanks, applied. Additionally, Ram Vepa should have been credited as the author for 36a8f01c. unfortunately too late to fix, the patch is upstream.

[Bug 1030156] [NEW] long-running compile leads to laptop overheat shutdown

2012-07-27 Thread Roland Dreier
Public bug reported: Since I've updated to Quantal, I notice that if I kick off a long-running compile on my laptop (eg building a whole kernel), then I often shutdown with messages like Jul 27 15:07:25 roland-t410s kernel: [145070.016141] CPU0: Core temperature above threshold, cpu clock

[Bug 1030156] Re: long-running compile leads to laptop overheat shutdown

2012-07-27 Thread Roland Dreier
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1030156 Title: long-running compile leads to laptop overheat shutdown To manage notifications about this bug go to:

Re: Work completion error: transport retry counter exceeded

2012-07-27 Thread Roland Dreier
On Fri, Jul 27, 2012 at 9:50 AM, Paul Grun pg...@systemfabricworks.com wrote: Note that both an RNR-NAK retry count exceeded and a timeout error are reported in the same way, as a locally detected error. Not quite right. There are two different work completion statuses:

[PATCH] RDMA/ocrdma: Fix check of GSI CQs

2012-07-27 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com It looks like one check was accidentally duplicated, and the other 3 checks were left out. This was detected by scripts/coccinelle/tests/doubletest.cocci: drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:895:6-54: duplicated argument

[PATCH] RDMA/ucma: Convert open-coded equivalent to memdup_user()

2012-07-27 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com Suggested by scripts/coccinelle/api/memdup_user.cocci. Reported-by: Fengguang Wu fengguang...@intel.com Signed-off-by: Roland Dreier rol...@purestorage.com --- drivers/infiniband/core/ucma.c | 19 +++ 1 file changed, 7 insertions

Re: Work completion error: transport retry counter exceeded

2012-07-26 Thread Roland Dreier
On Wed, Jul 25, 2012 at 7:07 PM, Ira Weiny wei...@llnl.gov wrote: attr.timeout = 14; Is this timeout sufficient to account for the round trip on the fabric and the ack delay on the remote HCA? I don't think there are any other attributes that would affect getting transport

Re: Work completion error: transport retry counter exceeded

2012-07-26 Thread Roland Dreier
I wonder if I might be seeing the same thing... How does one choose a good value for this setting? Apparently it maps to 4.096 x 2 ^ attr.timeout microseconds. What's the maximum value one can set here? What can go wrong if one goes for the maximum value? In theory you want a timeout of
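
The formula quoted above can be worked through directly. This is a sketch that assumes the 4.096 x 2 ^ attr.timeout microseconds mapping stated in the message. The helper name is hypothetical. Nanoseconds keep the arithmetic in integers, since 4.096 us is exactly 4096 ns.

```c
#include <stdint.h>

/*
 * Timeout in nanoseconds for a given attr.timeout exponent:
 * 4.096 us * 2^timeout == 4096 ns << timeout.
 * The formula is applied literally; any special meaning of particular
 * values is not modelled here. The exponent is assumed to be small
 * enough (for example 0..31) that the shift cannot overflow 64 bits.
 */
static uint64_t example_ack_timeout_ns(unsigned int timeout)
{
	return (uint64_t)4096 << timeout;
}
```

With attr.timeout = 14, the value discussed earlier in this thread, the result is 67108864 ns, or roughly 67 ms. Each increment of the exponent doubles the timeout.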

Re: EINVAL when INIT -> RTR on RoCE

2012-07-26 Thread Roland Dreier
On Thu, Jul 26, 2012 at 10:38 AM, Xavier R. Guérin guer...@gmail.com wrote: is there any overhead associated with using GIDs? yes, every packet is 40 bytes longer. man ibv_post_recv tells me that the GRH is written into the first 40 bytes of the receive buffer for UD QPs. Is that behavior
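
A sketch of what the 40-byte GRH means for buffer sizing on a UD QP, following the ibv_post_recv description quoted above. The constant and helper names are hypothetical, and no libibverbs calls are used. The example only shows the offset arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of the GRH that occupies the start of a UD receive buffer. */
#define EXAMPLE_GRH_BYTES 40

/* A UD receive buffer needs room for the GRH plus the largest payload. */
static size_t example_ud_recv_buf_len(size_t max_payload)
{
	return EXAMPLE_GRH_BYTES + max_payload;
}

/* The message payload starts right after the first 40 bytes. */
static const uint8_t *example_ud_payload(const uint8_t *recv_buf)
{
	return recv_buf + EXAMPLE_GRH_BYTES;
}
```

For a 2048-byte maximum payload, the receive buffer would therefore be 2088 bytes, with the application data starting at offset 40.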

[Bug 1026964] Re: Lenovo T410s laptop suspends fine, won't resume any more

2012-07-24 Thread Roland Dreier
Yes, 3.5.0-6.6 is fine, just like upstream 3.5. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1026964 Title: Lenovo T410s laptop suspends fine, won't resume any more To manage notifications about

Fwd: circular lockdep problem

2012-07-24 Thread Roland Dreier
[resending because I forgot to reply-all] On Tue, Jul 24, 2012 at 9:39 AM, Steve Wise sw...@opengridcomputing.com wrote: Can anyone help me understand how I can resolve this? It's saying there is some circular dependency problem with the cxgb4 uld_mutex, the networking rtnl_mutex, and ib_core's

[GIT PULL] please pull infiniband.git

2012-07-23 Thread Roland Dreier
contention IB/qib: Add congestion control agent implementation IB/qib: checkpatch fixes Roland Dreier (4): RDMA/ocrdma: Fix assignment of max_srq_sge in device query RDMA/cxgb4: Fix endianness of addition to mpa->private_data_size IB: Use IS_ENABLED(CONFIG_I

[GIT PULL] please pull infiniband.git

2012-07-23 Thread Roland Dreier
contention IB/qib: Add congestion control agent implementation IB/qib: checkpatch fixes Roland Dreier (4): RDMA/ocrdma: Fix assignment of max_srq_sge in device query RDMA/cxgb4: Fix endianness of addition to mpa->private_data_size IB: Use IS_ENABLED(CONFIG_IPV6

Re: 3.5-rc7 - can no longer wake up from suspend to RAM

2012-07-21 Thread Roland Dreier
Thanks Hugh. I just went ahead and built 3.5 final, and suspend/resume look to be working again. I'm not even going to try to understand how a timekeeping bug broke resume... - R.

Re: 3.5-rc7 - can no longer wake up from suspend to RAM

2012-07-21 Thread Roland Dreier
On Wed, Jul 18, 2012 at 9:46 PM, Tomasz Chmielewski wrote: > After upgrading to 3.5-rc7, my laptop no longer wakes up reliably from > suspend to RAM. 3.4.x worked fine. FWIW, I've been having similar problems with 3.5-rc7. With 3.5-rc6 my laptop resumed fine, but since updating to -rc7, it

Re: 3.5-rc7 - can no longer wake up from suspend to RAM

2012-07-21 Thread Roland Dreier
On Wed, Jul 18, 2012 at 9:46 PM, Tomasz Chmielewski t...@wpkg.org wrote: After upgrading to 3.5-rc7, my laptop no longer wakes up reliably from suspend to RAM. 3.4.x worked fine. FWIW, I've been having similar problems with 3.5-rc7. With 3.5-rc6 my laptop resumed fine, but since updating to

Re: 3.5-rc7 - can no longer wake up from suspend to RAM

2012-07-21 Thread Roland Dreier
Thanks Hugh. I just went ahead and built 3.5 final, and suspend/resume look to be working again. I'm not even going to try to understand how a timekeeping bug broke resume... - R.

[Bug 1026964] Re: Lenovo T410s laptop suspends fine, won't resume any more

2012-07-21 Thread Roland Dreier
Looks like this was fixed upstream by 3e997130bd2e (timekeeping: Add missing update call in timekeeping_resume()) in 3.5 final. I built 3.5 from source and suspend/resume seems good again. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to

[Bug 1026964] [NEW] Lenovo T410s laptop suspends fine, won't resume any more

2012-07-20 Thread Roland Dreier
Public bug reported: This is a weird one, but... I have a Lenovo T410s laptop, that I've run every release from 11.04 to current 12.10 development code on. Suspend/resume has pretty much always worked, except apparently with the latest Quantal kernel there is a regression. The computer

[Bug 1026964] Re: Lenovo T410s laptop suspends fine, won't resume any more

2012-07-20 Thread Roland Dreier
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1026964 Title: Lenovo T410s laptop suspends fine, won't resume any more To manage notifications about this bug go to:

[Bug 1026964] Re: Lenovo T410s laptop suspends fine, won't resume any more

2012-07-20 Thread Roland Dreier
Huh, the mere act of installing the mainline 3.5.0-rc7 kernel seems to have stopped this from happening ... I reproduced it a few times in a row but now it is back to working fine. I'll reopen and reinvestigate if it happens again. ** Changed in: linux (Ubuntu) Status: New => Invalid --

[Bug 1026964] Re: Lenovo T410s laptop suspends fine, won't resume any more

2012-07-20 Thread Roland Dreier
Happened again just now. It appears if I suspend and then resume within 30 seconds or so, I'm ok. If I suspend and leave the system for a while, it may refuse to wake up. I'm running the 3.5.0-030500rc7-generic mainline kernel now, will let you know if I see this again or not. ** Changed in:

[Bug 1026964] Re: Lenovo T410s laptop suspends fine, won't resume any more

2012-07-20 Thread Roland Dreier
This happened to me with mainline: Linux roland-t410s 3.5.0-030500rc7-generic #201207142035 SMP Sun Jul 15 00:35:57 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux so looks like a mainline regression. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to

Re: mlx4_ib_create_qp failed - OOM with call trace

2012-07-20 Thread Roland Dreier
On Fri, Jul 20, 2012 at 12:47 AM, Sebastian Riemer sebastian.rie...@profitbricks.com wrote: This is at least something we can implement and test for us as we only have modern server systems. Definitely, just replacing the kmallocs with vmallocs for those wrid arrays and replacing the

Re: [PATCH] [Trivial] qib: fix an incorrect message

2012-07-19 Thread Roland Dreier
thanks, applied.

Re: [PATCH] [Trivial] qib: fix an incorrect message

2012-07-19 Thread Roland Dreier
thanks, applied.

[Desktop-packages] [Bug 1025498] Re: network-manager segfaulting after bringing up Wi-Fi link

2012-07-19 Thread Roland Dreier
OK, here's the output of sudo gdb --args /usr/sbin/NetworkManager --no-daemon --log-level=debug | tee nm.gdb.txt looks like we crash in the dns management code because priv->last_iface is NULL. I haven't tried to understand what NM is doing here yet. ** Attachment added: debug log / backtrace

[Desktop-packages] [Bug 1025498] Re: network-manager segfaulting after bringing up Wi-Fi link

2012-07-19 Thread Roland Dreier
** Changed in: network-manager (Ubuntu) Status: Incomplete => Confirmed -- You received this bug notification because you are a member of Desktop Packages, which is subscribed to network-manager in Ubuntu. https://bugs.launchpad.net/bugs/1025498 Title: network-manager segfaulting after

[Bug 1025498] Re: network-manager segfaulting after bringing up Wi-Fi link

2012-07-19 Thread Roland Dreier
OK, here's the output of sudo gdb --args /usr/sbin/NetworkManager --no-daemon --log-level=debug | tee nm.gdb.txt looks like we crash in the dns management code because priv->last_iface is NULL. I haven't tried to understand what NM is doing here yet. ** Attachment added: debug log / backtrace

[Bug 1025498] Re: network-manager segfaulting after bringing up Wi-Fi link

2012-07-19 Thread Roland Dreier
** Changed in: network-manager (Ubuntu) Status: Incomplete => Confirmed -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1025498 Title: network-manager segfaulting after bringing up Wi-Fi link

Re: [PATCH] [Trivial] qib: fix an incorrect message

2012-07-19 Thread Roland Dreier
thanks, applied.

Re: [PATCH 5/5] IB/qib: checkpatch fixes

2012-07-19 Thread Roland Dreier
thanks, applied 1, 2, 5. see my comments about #3

Re: [PATCH 3/5] IB/qib: Add support for per-device/per-port parameters

2012-07-19 Thread Roland Dreier
[resending because I forgot to cc linux-rdma] On Thu, Jul 19, 2012 at 6:04 AM, Mike Marciniszyn mike.marcinis...@intel.com wrote: Add support for per-device/per-port driver parameters allowing users to specify different values for different ports/devices. All converted parameters will behave

Re: mlx4_ib_create_qp failed - OOM with call trace

2012-07-19 Thread Roland Dreier
[5416523.203047] ib_srpt: Received SRP_LOGIN_REQ with i_port_id 0x0:0x2c903004ecf8b, t_port_id 0x2c903004ecf82:0x2c903004ecf82 and it_iu_len 260 on port 1 (guid=0xfe80:0x2c903004ecf83) [5416523.204736] kworker/0:4: page allocation failure: order:4, mode:0x40d0 [5416523.204738]

Re: [PATCH 5/7] target: Check sess_tearing_down in target_get_sess_cmd()

2012-07-17 Thread Roland Dreier
On Mon, Jul 16, 2012 at 6:56 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: Do you have a plan for how to handle this? Do we really want to plumb through another callback to tell the fabric driver to free the command in this case? I need to think more about this ahead of changing it

[Desktop-packages] [Bug 1025498] [NEW] network-manager crashing after bringing up wifi

2012-07-16 Thread Roland Dreier
Public bug reported: On my system after updating today, I see network manager bring up my wifi link and then immediately start reconnecting. In my dmesg, I see many sequences of messages like [ 2027.559748] NetworkManager[7265]: segfault at 0 ip 7f03d9939f7d sp 7fff1f62cbb0 error 4 in

[Desktop-packages] [Bug 1025498] Re: network-manager crashing after bringing up wifi

2012-07-16 Thread Roland Dreier
-- You received this bug notification because you are a member of Desktop Packages, which is subscribed to network-manager in Ubuntu. https://bugs.launchpad.net/bugs/1025498 Title: network-manager crashing after bringing up wifi Status in “network-manager” package in Ubuntu: New Bug

[Bug 1025498] [NEW] network-manager crashing after bringing up wifi

2012-07-16 Thread Roland Dreier
Public bug reported: On my system after updating today, I see network manager bring up my wifi link and then immediately start reconnecting. In my dmesg, I see many sequences of messages like [ 2027.559748] NetworkManager[7265]: segfault at 0 ip 7f03d9939f7d sp 7fff1f62cbb0 error 4 in

[Bug 1025498] Re: network-manager crashing after bringing up wifi

2012-07-16 Thread Roland Dreier
-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1025498 Title: network-manager crashing after bringing up wifi To manage notifications about this bug go to:

[PATCH 0/7] series to fix qla2xxx use-after-free

2012-07-16 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com Hi Nic, Here's a series that's fundamentally about fixing a use-after-free in qla_target code. It ends up being seven patches because I wanted to make each step easy to review, and several of these are just cleanups that stand on their own. We have

[PATCH 1/7] qla2xxx: Get rid of redundant qla_tgt_sess.tearing_down

2012-07-16 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com The only place that sets qla_tgt_sess.tearing_down calls target_splice_sess_cmd_list() immediately afterwards, without dropping the lock it holds. That function sets se_session.sess_tearing_down, so we can get rid of the qla_target-specific flag

[PATCH 2/7] target: Un-export target_get_sess_cmd()

2012-07-16 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com There are no in-tree users of target_get_sess_cmd() outside of target_core_transport.c. Any new code should use the higher-level target_submit_cmd() interface. So let's un-export target_get_sess_cmd() and make it static to the one file where it's

[PATCH 3/7] sbp-target: Consolidate duplicated error path code in sbp_handle_command()

2012-07-16 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com Cc: Chris Boot bo...@bootc.net Cc: Stefan Richter stef...@s5r6.in-berlin.de Signed-off-by: Roland Dreier rol...@purestorage.com --- drivers/target/sbp/sbp_target.c | 28 1 file changed, 12 insertions(+), 16 deletions

[PATCH 6/7] qla2xxx: Remove racy, now-redundant check of sess_tearing_down

2012-07-16 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com Now that target_submit_cmd() / target_get_sess_cmd() check sess_tearing_down before adding commands to the list, we no longer need the check in qlt_do_work(). In fact this check is racy anyway (and that race is what inspired the change to add the check
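The snippet above argues that a tearing-down check is only reliable when it happens at the point where the command is added to the session's list, under the same lock. A minimal userspace sketch of that idea follows. It is an analogy, not the kernel code: struct demo_session, demo_add_cmd() and demo_begin_teardown() are hypothetical names, a pthread mutex stands in for the session lock, and a counter stands in for the command list.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a session; not the real se_session layout. */
struct demo_session {
    pthread_mutex_t lock;   /* protects tearing_down and ncmds */
    bool tearing_down;      /* set once teardown has started */
    size_t ncmds;           /* number of commands currently tracked */
};

static void demo_session_init(struct demo_session *s)
{
    int rc = pthread_mutex_init(&s->lock, NULL);

    assert(rc == 0);
    (void)rc;
    s->tearing_down = false;
    s->ncmds = 0;
}

/*
 * Track a new command. The flag is tested and the command is counted
 * inside one critical section, so teardown cannot slip in between the
 * check and the add. Returns 0 on success, -1 if teardown has started.
 */
static int demo_add_cmd(struct demo_session *s)
{
    int ret = 0;

    pthread_mutex_lock(&s->lock);
    if (s->tearing_down)
        ret = -1;
    else
        s->ncmds++;
    pthread_mutex_unlock(&s->lock);

    return ret;
}

/*
 * Start teardown. After this returns, no new command can be added, so
 * the returned count is the complete set teardown has to wait for.
 */
static size_t demo_begin_teardown(struct demo_session *s)
{
    size_t pending;

    pthread_mutex_lock(&s->lock);
    s->tearing_down = true;
    pending = s->ncmds;
    pthread_mutex_unlock(&s->lock);

    return pending;
}
```

A caller that tested tearing_down first and called an unchecked add later would leave a window in which demo_begin_teardown() could run between the two steps; doing both under the lock closes that window.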

[PATCH 4/7] target: Allow for target_submit_cmd() returning errors

2012-07-16 Thread Roland Dreier
From: Roland Dreier rol...@purestorage.com We want it to be possible for target_submit_cmd() to return errors up to its fabric module callers. For now just update the prototype to return an int, and update all callers to handle non-zero return values as an error. Cc: Chad Dupuis chad.dup
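The calling convention described above (an int return, with callers treating any non-zero value as an error) might look roughly like the sketch below. Everything in it is hypothetical: demo_submit_cmd() and demo_handle_request() are illustrative names rather than the target core API, and the choice of -ESHUTDOWN as the error code is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical command descriptor, for illustration only. */
struct demo_cmd {
    int tag;
};

/*
 * Hypothetical submit function: returns 0 on success or a negative
 * errno value when the command cannot be accepted.
 */
static int demo_submit_cmd(struct demo_cmd *cmd, int session_shutting_down)
{
    assert(cmd != NULL);

    if (session_shutting_down)
        return -ESHUTDOWN;

    return 0;
}

/*
 * Caller side: any non-zero return is an error, and the caller stays
 * responsible for releasing the command it allocated.
 */
static int demo_handle_request(int tag, int session_shutting_down)
{
    struct demo_cmd *cmd = malloc(sizeof(*cmd));
    int ret;

    if (!cmd)
        return -ENOMEM;
    cmd->tag = tag;

    ret = demo_submit_cmd(cmd, session_shutting_down);
    if (ret) {
        free(cmd);      /* submission failed: clean up and report */
        return ret;
    }

    /* In this sketch the command completes immediately, so free it here. */
    free(cmd);
    return 0;
}
```

The point of the sketch is only the shape of the error path: the failure is reported to the caller, which then decides how to release its own resources.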

Re: [PATCH 4/7] target: Allow for target_submit_cmd() returning errors

2012-07-16 Thread Roland Dreier
On Mon, Jul 16, 2012 at 4:00 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: Mmmm. The original target_submit_cmd() code had been propagating up a return value, but then we decided (via Agrover's patch) that it made more sense for target_submit_cmd() to always handle exceptions via

Re: [PATCH 5/7] target: Check sess_tearing_down in target_get_sess_cmd()

2012-07-16 Thread Roland Dreier
On Mon, Jul 16, 2012 at 4:08 PM, Nicholas A. Bellinger n...@linux-iscsi.org wrote: However, I'm still leaning towards a way to do this for tcm_qla2xxx that does not require all fabric callers to handle target_submit_cmd() exceptions directly.. How about a special target_get_sess_cmd() failure

Re: [PATCH 5/7] target: Check sess_tearing_down in target_get_sess_cmd()

2012-07-16 Thread Roland Dreier
OK, I'll take a look at how you handle this... So looking at commit bc187ea6c3b3 in the tree you just pushed out (target: Check sess_tearing_down in target_get_sess_cmd()) it looks like you just return from target_submit_cmd() if we fail to add the command to sess_cmd_list -- doesn't this mean
