which list_del do you mean? in ipoib_cm_tx_start?
On Mon, May 20, 2013 at 11:05 AM, Or Gerlitz ogerl...@mellanox.com wrote:
On 19/05/2013 12:17, Jack Wang wrote:
we added inject_bug sysfs node to make function run into error case, like
something below. Yes, you are right, we want to speedup
On 20/05/2013 12:10, Jinpu Wang wrote:
which list_del do you mean? in ipoib_cm_tx_start?
Yes, but not only that: you can start with a 5 kg hammer and convert all
these hits to list_del_init
linux-2.6]# grep list_del drivers/infiniband/ulp/ipoib/*.c | grep neigh
drivers/infiniband/ulp/ipoib/ipoib_cm.c:
Hi Roland,
Following what we discussed last week during the Linux Foundation EU
summit, I think it would be good to follow what you said and have a
point release for libibverbs and libmlx4 before we pull in the verbs
extensions framework and features that use it (XRC, Flow-Steering, etc
more
A quick test shows the list_corruption warning is gone after I converted
all list_del(&neigh->list) to list_del_init(&neigh->list).
Test is still running, will update status if anything wrong.
Thanks Or.
On Mon, May 20, 2013 at 12:58 PM, Or Gerlitz ogerl...@mellanox.com wrote:
On 20/05/2013
On 20/05/2013 15:46, Jinpu Wang wrote:
A quick test shows the list_corruption warning is gone after I converted
all list_del(&neigh->list) to list_del_init(&neigh->list).
yes, but this wasn't your original problem or was it?
--
To unsubscribe from this list: send the line unsubscribe linux-rdma
On 5/20/2013 3:58 PM, Jack Wang wrote:
I haven't reproduced the original bug we saw in our production
environment
BUG: unable to handle kernel
at 0008
IP: [a0206c30] ipoib_cm_tx_reap+0xe0/0x5a0 [ib_ipoib]
...
RIP: 0010:[a0206c30] [a0206c30]
Hi Jack,
I don't understand the current status, that is, what you see now
after applying the patches.
If you don't get the original bug, why did you give the trace of it? Or
is it a new trace? It is not clear from your mail.
Please add only the trace of the current issue.
On Saturday 18 May 2013 00:37, Roland Dreier wrote:
On Fri, May 17, 2013 at 12:25 PM, Tom Tucker t...@opengridcomputing.com
wrote:
I'm looking at the Linux MLX4 net driver and found something that confuses
me mightily. In particular in the file net/ethernet/mellanox/mlx4/cq.c, the
On Mon, May 20, 2013 at 7:53 AM, Jack Morgenstein
ja...@dev.mellanox.co.il wrote:
This is racy and can cause use-after-free, null pointer dereference, etc.,
which result in kernel crashes.
Sounds fine and I'd be happy to apply your final patch, but I'd be
curious to know what the race is in
On Mon, May 20, 2013 at 5:37 AM, Or Gerlitz ogerl...@mellanox.com wrote:
Following what we discussed last week during the Linux Foundation EU summit,
I think it would be good to follow what you said and have a point release for
libibverbs and libmlx4 before we pull in the verbs extensions
On 3/26/2013 4:14 PM, sean.he...@intel.com wrote:
From: Sean Hefty sean.he...@intel.com
XRC receive QPs are shareable across multiple processes. Allow
any process with access to the xrc domain to open an existing
QP. After opening the QP, the process will receive events
related to the QP and
So is it intended that cooperating processes sharing an xrc domain will
choose one process to create the xrc qp, and the rest will open it?
yes - The QPN of the shared QP must be known between the cooperating processes.
On 05/20/2013 12:49 PM, Roland Dreier wrote:
On Mon, May 20, 2013 at 5:37 AM, Or Gerlitz ogerl...@mellanox.com wrote:
Following what we discussed last week during the Linux Foundation EU summit,
I think it would be good to follow what you said and have a point release
for libibverbs and
On Mon, May 20, 2013 at 5:36 PM, Jack Wang jinpu.w...@profitbricks.com wrote:
Sorry for the confusion. The current list corruption is gone in my preliminary
test, after I changed list_del to list_del_init as Or suggested.
Or asked for the original bug, so I just wanted to show him the whole story.
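For readers following the thread, here is a minimal userspace sketch of why switching from list_del to list_del_init can make a list corruption warning go away. The helpers below are simplified stand-ins written for this example, not the kernel's implementation: in particular, NULL is used as the "poison" value where the kernel uses LIST_POISON1/LIST_POISON2. Whether a double delete is the actual cause in ipoib is not established in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static int list_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

static void unlink_entry(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* Like list_del(): the entry is left poisoned (NULL is an assumption here). */
static void list_del(struct list_head *entry)
{
    unlink_entry(entry);
    entry->next = NULL;
    entry->prev = NULL;
}

/* Like list_del_init(): the entry is left as an empty, self-linked list. */
static void list_del_init(struct list_head *entry)
{
    unlink_entry(entry);
    INIT_LIST_HEAD(entry);
}

/* Returns 1 when deleting the same entry twice with list_del_init()
 * leaves the rest of the list intact. */
static int double_del_init_is_safe(void)
{
    struct list_head head, a, b;

    INIT_LIST_HEAD(&head);
    list_add(&a, &head);
    list_add(&b, &head);   /* list is now head -> b -> a */

    list_del_init(&a);
    list_del_init(&a);     /* a is self-linked, so this only touches a */
    assert(list_empty(&a));

    list_del(&b);          /* fine once; a second list_del(&b) would
                              dereference the poisoned pointers */
    return list_empty(&head) && b.next == NULL;
}
```

A side benefit of the _init variant is that list_empty() on the entry itself tells a later code path whether the entry is still queued.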
Hi Sean,
Do we have some public quoted usages/feedback for rsockets? I think
you've mentioned something during the panel at the Linux EU summit
last week but I am not sure...
Or.
On 2013-05-20 21:00, Or Gerlitz wrote:
On Mon, May 20, 2013 at 5:36 PM, Jack Wang jinpu.w...@profitbricks.com
wrote:
Sorry for the confusion. The current list corruption is gone in my preliminary
test, after I changed list_del to list_del_init as Or suggested.
As Or asked for the original bug, so
On Mon, May 20, 2013 at 10:38 PM, Jack Wang jinpu.w...@profitbricks.com wrote:
The bug in our production environment was introduced by our backport
of ipoib fixes from mainline. When we hit that bug we reverted
to the old kernel without the backport patch, and the bug didn't happen for
Hi Guys,
One other quick one. I've received conflicting claims on the validity of
the wc.opcode when wc.status != 0 for mlx4 hardware.
My reading of the code (i.e. hw/mlx4/cq.c) is that the hardware cqe
owner_sr_opcode field contains MLX4_CQE_OPCODE_ERROR when there is an
error and
Do we have some public quoted usages/feedback for rsockets? I think
you've mentioned something during the panel at the Linux EU summit
last week but I am not sure...
Most feedback I can think of has come via private emails or personal
interactions, especially specific details of various usage
On 2013-05-20 21:50, Or Gerlitz wrote:
On Mon, May 20, 2013 at 10:38 PM, Jack Wang jinpu.w...@profitbricks.com
wrote:
The bug in our production environment was introduced by our backport
of ipoib fixes from mainline. When we hit that bug we reverted
to the old kernel without the
My reading of the code (i.e. hw/mlx4/cq.c) is that the hardware cqe
owner_sr_opcode field contains MLX4_CQE_OPCODE_ERROR when there is an
error and therefore, the only way to recover what the opcode was is
through the wr_id you used when submitting the WR.
Is my reading of the code correct?
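One way to act on that reading, sketched under assumptions: if the opcode cannot be trusted on an error completion, the application can encode its own operation code in the wr_id when posting and decode it when polling. The names below (app_op, app_wr_id_pack, and so on) are hypothetical helpers for this example and not part of any verbs API. The 8/56-bit split is an arbitrary choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical application-level operation codes; not part of any verbs API. */
enum app_op {
    APP_OP_SEND = 1,
    APP_OP_RDMA_WRITE = 2,
    APP_OP_RDMA_READ = 3
};

#define APP_COOKIE_BITS 56
#define APP_COOKIE_MASK ((UINT64_C(1) << APP_COOKIE_BITS) - 1)

/* Build a 64-bit wr_id: top 8 bits carry the operation, low 56 a cookie. */
static uint64_t app_wr_id_pack(enum app_op op, uint64_t cookie)
{
    return ((uint64_t)op << APP_COOKIE_BITS) | (cookie & APP_COOKIE_MASK);
}

/* Recover the operation from a completion's wr_id. */
static enum app_op app_wr_id_op(uint64_t wr_id)
{
    return (enum app_op)(wr_id >> APP_COOKIE_BITS);
}

/* Recover the application cookie from a completion's wr_id. */
static uint64_t app_wr_id_cookie(uint64_t wr_id)
{
    return wr_id & APP_COOKIE_MASK;
}

/* Usage check: what was packed at post time is recovered at poll time. */
static void app_wr_id_demo(void)
{
    uint64_t wr_id = app_wr_id_pack(APP_OP_RDMA_READ, 42);

    assert(app_wr_id_op(wr_id) == APP_OP_RDMA_READ);
    assert(app_wr_id_cookie(wr_id) == 42);
}
```

An alternative with the same effect is to store a pointer to a per-request context structure in wr_id and keep the operation type in that structure.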
On 5/20/13 2:58 PM, Hefty, Sean wrote:
My reading of the code (i.e. hw/mlx4/cq.c) is that the hardware cqe
owner_sr_opcode field contains MLX4_CQE_OPCODE_ERROR when there is an
error and therefore, the only way to recover what the opcode was is
through the wr_id you used when submitting the WR.
On Mon, May 20, 2013 at 10:52 PM, Hefty, Sean sean.he...@intel.com wrote:
Do we have some public quoted usages/feedback for rsockets? I think
you've mentioned something during the panel at the Linux EU summit
last week but I am not sure...
Most feedback I can think of has come via private
So if you had pushed these private conversations to linux-rdma, more would
have been known about rsockets, for the benefit of all... oh well. I think
you mentioned something re the Intel HPC group, or am I wrong?
rsockets will continue to be supported by myself and Intel going forward. The
rsocket work
Hi,
Please find three patches to protect libibverbs from using invalid,
insecure configuration files.
Those configuration files are usually located in
/etc/libibverbs.d/ and contain the name of a shared library
to dlopen().
Only legitimate shared libraries should be loaded by libibverbs,
so
Files beginning with a dot are mostly current and parent directories or,
by convention, hidden files.
Those paths are skipped in find_sysfs_dev().
Signed-off-by: Yann Droneaud ydrone...@opteya.com
---
src/init.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/init.c b/src/init.c
index
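The diff itself is cut off in this listing, so the following is only a sketch of the idea described in the commit message, not the patch. The function names is_dot_entry and count_visible_entries are made up for this example.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* True for ".", ".." and, by convention, hidden files. */
static int is_dot_entry(const char *name)
{
    assert(name != NULL);
    return name[0] == '.';
}

/* Count directory entries whose name does not start with a dot.
 * Returns -1 if the directory cannot be opened. */
static int count_visible_entries(const char *dir_path)
{
    DIR *dir = opendir(dir_path);
    struct dirent *ent;
    int n = 0;

    if (dir == NULL)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        if (is_dot_entry(ent->d_name))
            continue;   /* skip current/parent directory and hidden files */
        n++;
    }

    closedir(dir);
    return n;
}
```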
Try to protect libibverbs from hand modified configuration files.
Signed-off-by: Yann Droneaud ydrone...@opteya.com
---
src/init.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/init.c b/src/init.c
index c880b68..1981da7 100644
--- a/src/init.c
+++ b/src/init.c
@@ -311,6 +311,9 @@
libibverbs must refuse to load arbitrary shared objects.
This patch checks the configuration directory and files for
- being owned by root;
- not being writable by others.
Signed-off-by: Yann Droneaud ydrone...@opteya.com
---
src/init.c | 23 +--
1 file changed, 21
On Tue, Jan 17, 2012 at 10:21:28AM -0600, Steve Wise wrote:
On 01/17/2012 09:59 AM, Or Gerlitz wrote:
On 1/17/2012 5:08 PM, Steve Wise wrote:
I think this series should add some new send flags for HW that
does checksum offload [...] also, on ingress, most hardware can
do INET checksum
Following the advice in Autotools Mythbuster [1], the subdir-objects
option can be used to have Makefiles create object files in the
same directory as their source files.
It reduces clobbering in the build directory.
[1] Autotools Mythbuster, by Diego Elio "Flameeyes" Pettenò
'autoupdate' is a tool that helps developers update configure.ac.
This patch applies a few fixes as suggested by autoupdate.
It was tested on Debian 6.0.7 (Squeeze) and Fedora 17 (Beefy Miracle).
Signed-off-by: Yann Droneaud ydrone...@opteya.com
---
configure.ac | 17 -
1 file
Files opened by libibverbs are not supposed to be inherited
across exec*(): most of the files are of no use to another program,
and others cannot be used without the associated memory mapping.
This patch changes open() and fopen() to always set the close-on-exec flag.
This patch also adds checks to
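A minimal sketch of the close-on-exec idea, assuming a POSIX.1-2008 system where open() accepts O_CLOEXEC. The helper names open_cloexec and fd_is_cloexec are invented for this example and are not taken from the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open a file so that the descriptor is closed automatically on exec*(). */
static int open_cloexec(const char *path, int flags)
{
    return open(path, flags | O_CLOEXEC);
}

/* Returns 1 if the close-on-exec flag is set on fd, 0 otherwise. */
static int fd_is_cloexec(int fd)
{
    int fd_flags = fcntl(fd, F_GETFD);

    return fd_flags != -1 && (fd_flags & FD_CLOEXEC) != 0;
}

/* Usage check: a descriptor opened through the helper carries the flag. */
static void cloexec_demo(void)
{
    int fd = open_cloexec("/dev/null", O_RDONLY);

    assert(fd >= 0);
    assert(fd_is_cloexec(fd));
    close(fd);
}
```

Setting the flag atomically at open() time avoids the window that a separate fcntl(fd, F_SETFD, FD_CLOEXEC) call would leave in a multithreaded program. For stdio streams, glibc accepts an "e" character in the fopen() mode string for the same purpose; that is a glibc extension rather than standard C.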
On Wed, Apr 24, 2013 at 04:58:43PM +0300, Or Gerlitz wrote:
Hi Roland, all
The first five patches in the series are mlx4 DMFS (Device Managed Flow
Steering) pre-patches needed for flow steering access from the mlx4 IB driver.
net/mlx4_core: Move DMFS HW structs to common header file