Re: [PATCH] crypto, qat, use generic numa functions

2014-10-08 Thread Prarit Bhargava


On 10/08/2014 11:50 AM, Tadeusz Struk wrote:
> Hi Prarit,
> On 10/07/2014 05:12 PM, Prarit Bhargava wrote:
>> The method in which the qat code determines the numa node for memory
>> allocations is a bit clunky.  On 2 socket, single node systems it is
>> possible that adf_get_dev_node_id() returns node 1, even though node 1
>> doesn't exist.
>>
>> This code transitions the qat code to the generic numa functions.
>> Changing adf_get_dev_node_id() to a simple call to dev_get_node() results
>> in a change to the adf_accel_dev struct as well.
> 
> The problem with that is we don't want to use any valid numa node, but
> the node we are connected to or we don't want to use the accelerator at
> all. Otherwise, when the first valid numa node happens to be the remote
> node the dma transactions we be slow and instead of accelerating we will
> slow things down.
> A patch that enforces this is on it's way.

Yeah, I was actually wondering if

dev_get_node() returns NO_NODE, then we should just default to 0?

I'll wait for your patch ...

P.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] crypto, qat, use generic numa functions

2014-10-08 Thread Tadeusz Struk
Hi Prarit,
On 10/07/2014 05:12 PM, Prarit Bhargava wrote:
> The method in which the qat code determines the numa node for memory
> allocations is a bit clunky.  On 2 socket, single node systems it is
> possible that adf_get_dev_node_id() returns node 1, even though node 1
> doesn't exist.
> 
> This code transitions the qat code to the generic numa functions.
> Changing adf_get_dev_node_id() to a simple call to dev_get_node() results
> in a change to the adf_accel_dev struct as well.

The problem with that is we don't want to use any valid numa node, but
the node we are connected to or we don't want to use the accelerator at
all. Otherwise, when the first valid numa node happens to be the remote
node the dma transactions we be slow and instead of accelerating we will
slow things down.
A patch that enforces this is on it's way.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] crypto, qat, use generic numa functions

2014-10-07 Thread Prarit Bhargava
While testing, the following panic was seen:

IP: [] __alloc_pages_nodemask+0x97/0x420
PGD 0
Oops:  [#1] SMP
Modules linked in: aesni_intel ptp lrw qat_dh895xcc(+) intel_qat pps_core 
i2c_algo_bit authenc gf128mul iTCO_wdt ioatdma glue_helper sb_edac i2c_i801 
ablk_helper serio_raw iTCO_vendor_support pcspkr edac_core shpchp i2c_core 
cryptd dca lpc_ich mfd_core wmi xfs libcrc32c sd_mod crc_t10dif 
crct10dif_common ahci libahci libata dm_mirror dm_region_hash dm_log dm_mod
CPU: 0 PID: 1235 Comm: systemd-udevd Not tainted 3.10.0-165.el7.x86_64 #1
Hardware name: Intel Corporation SandyBridge Platform/To be filled by O.E.M., 
BIOS CCFRCLC0.019.1308201516 08/20/2013
task: 88006d068000 ti: 88006ca0c000 task.ti: 88006ca0c000
RIP: 0010:[]  [] 
__alloc_pages_nodemask+0x97/0x420
RSP: 0018:88006ca0f928  EFLAGS: 00010246
RAX: 2000 RBX:  RCX: 88006ca0ffd8
RDX:  RSI: 0002 RDI: 002052d0
RBP: 88006ca0f9c8 R08: 0008 R09: 0002
R10: 0068 R11: ffc4 R12: 002052d0
R13:  R14: 0002 R15: 
FS:  7f999a6f9880() GS:880076a0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 2008 CR3: 6c916000 CR4: 001407f0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Stack:
 88007ac07700 88006ca0f940 811a43d9 88006ca0fa00
 811a4a0a 88007ac00e30 88007ac00e10 880076a17
  8802 2000 000180d0
Call Trace:
 [] ? discard_slab+0x39/0x50
 [] ? deactivate_slab+0x35a/0x3c0
 [] new_slab+0x91/0x300
 [] __slab_alloc+0x2bb/0x482
 [] ? native_sched_clock+0x13/0x80
 [] ? sched_clock+0x9/0x10
 [] ? adf_probe+0xb7/0x5a0 [qat_dh895xcc]
 [] ? idr_get_empty_slot+0x16f/0x3c0
 [] ? idr_get_empty_slot+0x16f/0x3c0
 [] kmem_cache_alloc_node_trace+0x9b/0x220
 [] adf_probe+0xb7/0x5a0 [qat_dh895xcc]
 [] ? sysfs_addrm_finish+0x42/0xe0
 [] ? __sysfs_add_one+0x61/0x100
 [] local_pci_probe+0x45/0xa0
 [] ? pci_match_device+0xc5/0xd0
 [] pci_device_probe+0xf9/0x150
 [] driver_probe_device+0x87/0x390
 [] __driver_attach+0x93/0xa0
 [] ? __device_attach+0x40/0x40
 [] bus_for_each_dev+0x73/0xc0
 [] driver_attach+0x1e/0x20
 [] bus_add_driver+0x200/0x2d0
 [] driver_register+0x64/0xf0
 [] __pci_register_driver+0xa5/0xc0
 [] ? 0xa01bdfff
 [] adfdrv_init+0x3a/0x1000 [qat_dh895xcc]
 [] do_one_initcall+0xb8/0x230
 [] load_module+0x131a/0x1b20
 [] ? ddebug_proc_write+0xf0/0xf0
 [] ? copy_module_from_fd.isra.43+0x53/0x150
 [] SyS_finit_module+0xa6/0xd0
 [] system_call_fastpath+0x16/0x1b
Code: c1 eb 02 c1 e8 13 83 e3 02 83 e0 01 09 c3 44 23 25 cf 22 8a 00 48 c7 45 
c0 00 00 00 00 41 f6 c4 10 0f 85 55 02 00 00 48 8b 45 b0 <48> 83 78 08 00 0f 84 
a3 01 00 00 0f 1f 44 00 00 48 8b 45 b0 44

The method in which the qat code determines the numa node for memory
allocations is a bit clunky.  On 2 socket, single node systems it is
possible that adf_get_dev_node_id() returns node 1, even though node 1
doesn't exist.

This code transitions the qat code to the generic numa functions.
Changing adf_get_dev_node_id() to a simple call to dev_get_node() results
in a change to the adf_accel_dev struct as well.  In addition to that
change, qat_crypto_get_instance_node() must check for "any node" as a
valid numa_node value.

Cc: Tadeusz Struk 
Cc: Herbert Xu 
Cc: "David S. Miller" 
Cc: Bruce Allan 
Cc: Prarit Bhargava 
Cc: John Griffin 
Cc: qat-li...@intel.com
Cc: linux-cry...@vger.kernel.org
Signed-off-by: Prarit Bhargava 
---
 drivers/crypto/qat/qat_common/adf_accel_devices.h |2 +-
 drivers/crypto/qat/qat_common/qat_algs.c  |7 +--
 drivers/crypto/qat/qat_common/qat_crypto.c|4 +++-
 drivers/crypto/qat/qat_dh895xcc/adf_drv.c |   19 ++-
 4 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/drivers/crypto/qat/qat_common/adf_accel_devices.h 
b/drivers/crypto/qat/qat_common/adf_accel_devices.h
index 9282381..025f52f 100644
--- a/drivers/crypto/qat/qat_common/adf_accel_devices.h
+++ b/drivers/crypto/qat/qat_common/adf_accel_devices.h
@@ -199,7 +199,7 @@ struct adf_accel_dev {
struct list_head list;
struct module *owner;
uint8_t accel_id;
-   uint8_t numa_node;
+   int numa_node;
struct adf_accel_pci accel_pci_dev;
 } __packed;
 #endif
diff --git a/drivers/crypto/qat/qat_common/qat_algs.c 
b/drivers/crypto/qat/qat_common/qat_algs.c
index 59df488..d887074 100644
--- a/drivers/crypto/qat/qat_common/qat_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_algs.c
@@ -127,11 +127,6 @@ struct qat_alg_session_ctx {
spinlock_t lock;/* protects qat_alg_session_ctx struct */
 };
 
-static int get_current_node(void)
-{
-   return cpu_data(current_thread_info()->cpu).phys_pro