On Fri, Dec 25, 2015 at 5:56 PM, Hal Rosenstock <h...@dev.mellanox.co.il> wrote:
> On 12/25/2015 9:50 AM, Hal Rosenstock wrote:
>> On 12/24/2015 11:09 AM, Matan Barak wrote:
>>> On Thu, Dec 24, 2015 at 4:07 PM, Matan Barak <mat...@dev.mellanox.co.il> 
>>> wrote:
>>>> On Thu, Dec 24, 2015 at 2:38 PM, Or Gerlitz <ogerl...@mellanox.com> wrote:
>>>>> On 12/24/2015 12:42 PM, Sagi Grimberg wrote:
>>>>>>
>>>>>>
>>>>>>>> This patch seems to generate a list corruption [1] when I test
>>>>>>>> with Doug's for-4.5 tree. Eran, care to take a look at this?
>>>>>>>
>>>>>>>
>>>>>>> This patch is part from a series that was introduced in 4.3-rc1 [1],
>>>>>>
>>>>>>
>>>>>> Then something else broke it. Can people check their patches on doug's
>>>>>> tree? At the moment it's unusable...
>>>>>
>>>>
>>>> Leon and I have checked Doug's tree with mlx4_ib disabled and we
>>>> didn't encounter any error.
>>>> We ran ucmatose over IB connection (in mlx5) and it worked flawlessly.
>>>>
>>>>>
>>>>> Yes, I checked the branch up to commit 882f3b3 "Merge branches
>>>>> '4.5/Or-cleanup' and '4.5/rdma-cq' into k.o/for-4.5" and it works (rping,
>>>>> ibv_rc_pingpong over top of mlx4 VPI)
>>>>>
>>>
>>> Regarding mlx4, Eran and I analyzed it. We didn't test that, but it
>>> seems like the bug is introduced in the 64bit counters test. Here's a
>>> proposal:
>>>
>>> diff --git a/drivers/infiniband/core/sysfs.c 
>>> b/drivers/infiniband/core/sysfs.c
>>> index 539040f..8da3c83 100644
>>> --- a/drivers/infiniband/core/sysfs.c
>>> +++ b/drivers/infiniband/core/sysfs.c
>>> @@ -714,11 +714,12 @@ err:
>>>   * Figure out which counter table to use depending on
>>>   * the device capabilities.
>>>   */
>>> -static struct attribute_group *get_counter_table(struct ib_device *dev)
>>> +static struct attribute_group *get_counter_table(struct ib_device *dev,
>>> +                                          int port_num)
>>>  {
>>>         struct ib_class_port_info cpi;
>>>
>>> -   if (get_perf_mad(dev, 0, IB_PMA_CLASS_PORT_INFO,
>>> + if (get_perf_mad(dev, port_num, IB_PMA_CLASS_PORT_INFO,
>>>                                 &cpi, 40, sizeof(cpi)) >= 0) {
>>
>> Your proposal is similar to earlier version of Christoph's patch but was
>> changed since ClassPortInfo attribute does not have PortSelect field
>> like other PerfMgt attributes which is where this port num would be
>> placed. In ClassPortInfo attribute, that location would be the
>> ClassVersion field that would be set to port number in PerfMgt Get query.
>
> In actuality, I don't think it really matters as this is a Get not a Set
> and the PMA would do the right thing even if some field in the CPI were
> stepped on.
>

Well, it does matter as it calls the vendor driver with port_num = 0.
Since the kernel is trusted, the vendor driver expects a valid port number.
Giving it an invalid number might result in memory corruptions, as
demonstrated in this case.

>> -- Hal

Matan

>>
>>>
>>>                 if (cpi.capability_mask && IB_PMA_CLASS_CAP_EXT_WIDTH)
>>> @@ -776,7 +777,7 @@ static int add_port(struct ib_device *device, int 
>>> port_num,
>>>                 goto err_put;
>>>         }
>>>
>>> -   p->pma_table = get_counter_table(device);
>>> + p->pma_table = get_counter_table(device, port_num);
>>>         ret = sysfs_create_group(&p->kobj, p->pma_table);
>>>         if (ret)
>>>                 goto err_put_gid_attrs;
>>>
>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>>> the body of a message to majord...@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to