[PATCH] staging/rdma/hfi1: Change num_rcv_contexts to num_user_contexts and its meaning
From: Sebastian Sanchez num_rcv_contexts sets the number of user contexts, both receive and send. Renaming it to num_user_contexts makes sense to reflect its true meaning. When num_rcv_contexts is 0, the default behavior is the number of CPU cores instead of 0 contexts. This commit changes the variable num_rcv_contexts to num_user_contexts, and it also makes any negative value for this variable default to the number of CPU cores, so if num_user_contexts is set >= 0, the value will number of contexts. Reviewed-by: Mike Marciniszyn Signed-off-by: Sebastian Sanchez --- drivers/staging/rdma/hfi1/chip.c |9 +++-- drivers/staging/rdma/hfi1/hfi.h |2 +- drivers/staging/rdma/hfi1/init.c |6 +++--- 3 files changed, 7 insertions(+), 10 deletions(-) diff --git a/drivers/staging/rdma/hfi1/chip.c b/drivers/staging/rdma/hfi1/chip.c index dc69159..510e5f6 100644 --- a/drivers/staging/rdma/hfi1/chip.c +++ b/drivers/staging/rdma/hfi1/chip.c @@ -9197,7 +9197,6 @@ fail: static int set_up_context_variables(struct hfi1_devdata *dd) { int num_kernel_contexts; - int num_user_contexts; int total_contexts; int ret; unsigned ngroups; @@ -9234,12 +9233,10 @@ static int set_up_context_variables(struct hfi1_devdata *dd) } /* * User contexts: (to be fixed later) -* - set to num_rcv_contexts if non-zero -* - default to 1 user context per CPU +* - default to 1 user context per CPU if num_user_contexts is +*negative */ - if (num_rcv_contexts) - num_user_contexts = num_rcv_contexts; - else + if (num_user_contexts < 0) num_user_contexts = num_online_cpus(); total_contexts = num_kernel_contexts + num_user_contexts; diff --git a/drivers/staging/rdma/hfi1/hfi.h b/drivers/staging/rdma/hfi1/hfi.h index 54ed6b3..434a89b 100644 --- a/drivers/staging/rdma/hfi1/hfi.h +++ b/drivers/staging/rdma/hfi1/hfi.h @@ -1629,7 +1629,7 @@ void update_sge(struct hfi1_sge_state *ss, u32 length); extern unsigned int hfi1_max_mtu; extern unsigned int hfi1_cu; extern unsigned int user_credit_return_threshold; -extern uint num_rcv_contexts; +extern int num_user_contexts; extern unsigned n_krcvqs; extern u8 krcvqs[]; extern int krcvqsset; diff --git a/drivers/staging/rdma/hfi1/init.c b/drivers/staging/rdma/hfi1/init.c index 1c8286f..a091b6c 100644 --- a/drivers/staging/rdma/hfi1/init.c +++ b/drivers/staging/rdma/hfi1/init.c @@ -82,10 +82,10 @@ * Number of user receive contexts we are configured to use (to allow for more * pio buffers per ctxt, etc.) Zero means use one user context per CPU. */ -uint num_rcv_contexts; -module_param_named(num_rcv_contexts, num_rcv_contexts, uint, S_IRUGO); +int num_user_contexts = -1; +module_param_named(num_user_contexts, num_user_contexts, uint, S_IRUGO); MODULE_PARM_DESC( - num_rcv_contexts, "Set max number of user receive contexts to use"); + num_user_contexts, "Set max number of user contexts to use"); u8 krcvqs[RXE_NUM_DATA_VL]; int krcvqsset; ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] staging/rdma/hfi1: Fix for module parameter hdrq_entsize when it's 0
From: Sebastian Sanchez If driver is loaded with parameter hdrq_entsize=0, then there's a NULL dereference when the driver gets unloaded. This causes a kernel Oops and prevents the module from being unloaded. This patch fixes this issue by making sure -EINVAL gets returned when hdrq_entsize=0. Reviewed-by: Chegondi, Harish Reviewed-by: Haralanov, Mitko Signed-off-by: Sebastian Sanchez --- drivers/staging/rdma/hfi1/init.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/staging/rdma/hfi1/init.c b/drivers/staging/rdma/hfi1/init.c index 1c8286f..b61a854 100644 --- a/drivers/staging/rdma/hfi1/init.c +++ b/drivers/staging/rdma/hfi1/init.c @@ -1350,6 +1350,7 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent) if (!encode_rcv_header_entry_size(hfi1_hdrq_entsize)) { hfi1_early_err(&pdev->dev, "Invalid HdrQ Entry size %u\n", hfi1_hdrq_entsize); + ret = -EINVAL; goto bail; } ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] staging/rdma/hfi1: Fix a possible null pointer dereference
From: Easwar Hariharan A code inspection pointed out that kmalloc_array may return NULL and memset doesn't check the input pointer for NULL, resulting in a possible NULL dereference. This patch fixes this. Reviewed-by: Mike Marciniszyn Signed-off-by: Easwar Hariharan --- drivers/staging/rdma/hfi1/chip.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/staging/rdma/hfi1/chip.c b/drivers/staging/rdma/hfi1/chip.c index dc69159..49d49b2 100644 --- a/drivers/staging/rdma/hfi1/chip.c +++ b/drivers/staging/rdma/hfi1/chip.c @@ -10129,6 +10129,8 @@ static void init_qos(struct hfi1_devdata *dd, u32 first_ctxt) if (num_vls * qpns_per_vl > dd->chip_rcv_contexts) goto bail; rsmmap = kmalloc_array(NUM_MAP_REGS, sizeof(u64), GFP_KERNEL); + if (!rsmmap) + goto bail; memset(rsmmap, rxcontext, NUM_MAP_REGS * sizeof(u64)); /* init the local copy of the table */ for (i = 0, ctxt = first_ctxt; i < num_vls; i++) { ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 0/2] Convert IBTA traps to OPA traps
This two patch series gets rid of the vestigal use of IBTA traps in the hfi1 driver. --- Erik E. Kahn (1): staging/rdma/hfi1: HFI now sends OPA Traps instead of IBTA Jubin John (1): staging/rdma/hfi1: add definitions for OPA traps drivers/staging/rdma/hfi1/mad.c | 121 +++-- drivers/staging/rdma/hfi1/mad.h | 114 +++ drivers/staging/rdma/hfi1/ruc.c | 10 ++- drivers/staging/rdma/hfi1/ud.c| 21 +++--- drivers/staging/rdma/hfi1/verbs.h |2 - 5 files changed, 193 insertions(+), 75 deletions(-) -- Mike ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/2] staging/rdma/hfi1: HFI now sends OPA Traps instead of IBTA
From: Erik E. Kahn send_trap() was still using old ib_smp instead of opa_smp for formatting and sending traps. Reviewed-by: Arthur Kepner Reviewed-by: Ira Weiny Reviewed-by: Dennis Dalessandro Signed-off-by: Jubin John Signed-off-by: Erik E. Kahn --- drivers/staging/rdma/hfi1/mad.c | 121 +++-- drivers/staging/rdma/hfi1/ruc.c | 10 ++- drivers/staging/rdma/hfi1/ud.c| 21 +++--- drivers/staging/rdma/hfi1/verbs.h |2 - 4 files changed, 79 insertions(+), 75 deletions(-) diff --git a/drivers/staging/rdma/hfi1/mad.c b/drivers/staging/rdma/hfi1/mad.c index a122565..28c3ad2 100644 --- a/drivers/staging/rdma/hfi1/mad.c +++ b/drivers/staging/rdma/hfi1/mad.c @@ -84,7 +84,7 @@ static void send_trap(struct hfi1_ibport *ibp, void *data, unsigned len) { struct ib_mad_send_buf *send_buf; struct ib_mad_agent *agent; - struct ib_smp *smp; + struct opa_smp *smp; int ret; unsigned long flags; unsigned long timeout; @@ -117,15 +117,15 @@ static void send_trap(struct hfi1_ibport *ibp, void *data, unsigned len) return; smp = send_buf->mad; - smp->base_version = IB_MGMT_BASE_VERSION; + smp->base_version = OPA_MGMT_BASE_VERSION; smp->mgmt_class = IB_MGMT_CLASS_SUBN_LID_ROUTED; - smp->class_version = 1; + smp->class_version = OPA_SMI_CLASS_VERSION; smp->method = IB_MGMT_METHOD_TRAP; ibp->tid++; smp->tid = cpu_to_be64(ibp->tid); smp->attr_id = IB_SMP_ATTR_NOTICE; /* o14-1: smp->mkey = 0; */ - memcpy(smp->data, data, len); + memcpy(smp->route.lid.data, data, len); spin_lock_irqsave(&ibp->lock, flags); if (!ibp->sm_ah) { @@ -164,11 +164,16 @@ static void send_trap(struct hfi1_ibport *ibp, void *data, unsigned len) * Send a bad [PQ]_Key trap (ch. 14.3.8). */ void hfi1_bad_pqkey(struct hfi1_ibport *ibp, __be16 trap_num, u32 key, u32 sl, - u32 qp1, u32 qp2, __be16 lid1, __be16 lid2) + u32 qp1, u32 qp2, u16 lid1, u16 lid2) { - struct ib_mad_notice_attr data; + struct opa_mad_notice_attr data; + u32 lid = ppd_from_ibp(ibp)->lid; + u32 _lid1 = lid1; + u32 _lid2 = lid2; - if (trap_num == IB_NOTICE_TRAP_BAD_PKEY) + memset(&data, 0, sizeof(data)); + + if (trap_num == OPA_TRAP_BAD_P_KEY) ibp->pkey_violations++; else ibp->qkey_violations++; @@ -176,17 +181,15 @@ void hfi1_bad_pqkey(struct hfi1_ibport *ibp, __be16 trap_num, u32 key, u32 sl, /* Send violation trap */ data.generic_type = IB_NOTICE_TYPE_SECURITY; - data.prod_type_msb = 0; data.prod_type_lsb = IB_NOTICE_PROD_CA; data.trap_num = trap_num; - data.issuer_lid = cpu_to_be16(ppd_from_ibp(ibp)->lid); - data.toggle_count = 0; - memset(&data.details, 0, sizeof(data.details)); - data.details.ntc_257_258.lid1 = lid1; - data.details.ntc_257_258.lid2 = lid2; - data.details.ntc_257_258.key = cpu_to_be32(key); - data.details.ntc_257_258.sl_qp1 = cpu_to_be32((sl << 28) | qp1); - data.details.ntc_257_258.qp2 = cpu_to_be32(qp2); + data.issuer_lid = cpu_to_be32(lid); + data.ntc_257_258.lid1 = cpu_to_be32(_lid1); + data.ntc_257_258.lid2 = cpu_to_be32(_lid2); + data.ntc_257_258.key = cpu_to_be32(key); + data.ntc_257_258.sl = sl << 3; + data.ntc_257_258.qp1 = cpu_to_be32(qp1); + data.ntc_257_258.qp2 = cpu_to_be32(qp2); send_trap(ibp, &data, sizeof(data)); } @@ -197,32 +200,30 @@ void hfi1_bad_pqkey(struct hfi1_ibport *ibp, __be16 trap_num, u32 key, u32 sl, static void bad_mkey(struct hfi1_ibport *ibp, struct ib_mad_hdr *mad, __be64 mkey, __be32 dr_slid, u8 return_path[], u8 hop_cnt) { - struct ib_mad_notice_attr data; + struct opa_mad_notice_attr data; + u32 lid = ppd_from_ibp(ibp)->lid; + memset(&data, 0, sizeof(data)); /* Send violation trap */ data.generic_type = IB_NOTICE_TYPE_SECURITY; - data.prod_type_msb = 0; data.prod_type_lsb = IB_NOTICE_PROD_CA; - data.trap_num = IB_NOTICE_TRAP_BAD_MKEY; - data.issuer_lid = cpu_to_be16(ppd_from_ibp(ibp)->lid); - data.toggle_count = 0; - memset(&data.details, 0, sizeof(data.details)); - data.details.ntc_256.lid = data.issuer_lid; - data.details.ntc_256.method = mad->method; - data.details.ntc_256.attr_id = mad->attr_id; - data.details.ntc_256.attr_mod = mad->attr_mod; - data.details.ntc_256.mkey = mkey; + data.trap_num = OPA_TRAP_BAD_M_KEY; + data.issuer_lid = cpu_to_be32(lid); + data.ntc_256.lid = data.issuer_lid; + data.ntc_256.method = mad->method; + data.ntc_256.attr_id = mad->attr_id; + data.ntc_256.attr_mod = mad->attr_mod; + data.ntc_256.mkey = mkey; if (mad->mgmt_cla
[PATCH 1/2] staging/rdma/hfi1: add definitions for OPA traps
From: Jubin John These new definitions will be used by follow-on patches for formating and sending OPA traps. Reviewed-by: Ira Weiny Signed-off-by: Jubin John --- drivers/staging/rdma/hfi1/mad.h | 114 +++ 1 file changed, 114 insertions(+) diff --git a/drivers/staging/rdma/hfi1/mad.h b/drivers/staging/rdma/hfi1/mad.h index 4745750..f031775 100644 --- a/drivers/staging/rdma/hfi1/mad.h +++ b/drivers/staging/rdma/hfi1/mad.h @@ -60,7 +60,121 @@ #endif #include "opa_compat.h" +/* + * OPA Traps + */ +#define OPA_TRAP_GID_NOW_IN_SERVICE cpu_to_be16(64) +#define OPA_TRAP_GID_OUT_OF_SERVICE cpu_to_be16(65) +#define OPA_TRAP_ADD_MULTICAST_GROUPcpu_to_be16(66) +#define OPA_TRAL_DEL_MULTICAST_GROUPcpu_to_be16(67) +#define OPA_TRAP_UNPATH cpu_to_be16(68) +#define OPA_TRAP_REPATH cpu_to_be16(69) +#define OPA_TRAP_PORT_CHANGE_STATE cpu_to_be16(128) +#define OPA_TRAP_LINK_INTEGRITY cpu_to_be16(129) +#define OPA_TRAP_EXCESSIVE_BUFFER_OVERRUN cpu_to_be16(130) +#define OPA_TRAP_FLOW_WATCHDOG cpu_to_be16(131) +#define OPA_TRAP_CHANGE_CAPABILITY cpu_to_be16(144) +#define OPA_TRAP_CHANGE_SYSGUID cpu_to_be16(145) +#define OPA_TRAP_BAD_M_KEY cpu_to_be16(256) +#define OPA_TRAP_BAD_P_KEY cpu_to_be16(257) +#define OPA_TRAP_BAD_Q_KEY cpu_to_be16(258) +#define OPA_TRAP_SWITCH_BAD_PKEYcpu_to_be16(259) +#define OPA_SMA_TRAP_DATA_LINK_WIDTHcpu_to_be16(2048) +/* + * Generic trap/notice other local changes flags (trap 144). + */ +#defineOPA_NOTICE_TRAP_LWDE_CHG0x08 /* Link Width Downgrade Enable + * changed + */ +#define OPA_NOTICE_TRAP_LSE_CHG 0x04 /* Link Speed Enable changed */ +#define OPA_NOTICE_TRAP_LWE_CHG 0x02 /* Link Width Enable changed */ +#define OPA_NOTICE_TRAP_NODE_DESC_CHG 0x01 + +struct opa_mad_notice_attr { + u8 generic_type; + u8 prod_type_msb; + __be16 prod_type_lsb; + __be16 trap_num; + __be16 toggle_count; + __be32 issuer_lid; + __be32 reserved1; + union ib_gid issuer_gid; + + union { + struct { + u8 details[64]; + } raw_data; + + struct { + union ib_gidgid; + } __packed ntc_64_65_66_67; + + struct { + __be32 lid; + } __packed ntc_128; + + struct { + __be32 lid;/* where violation happened */ + u8 port_num; /* where violation happened */ + } __packed ntc_129_130_131; + + struct { + __be32 lid;/* LID where change occurred */ + __be32 new_cap_mask; /* new capability mask */ + __be16 reserved2; + __be16 cap_mask; + __be16 change_flags; /* low 4 bits only */ + } __packed ntc_144; + + struct { + __be64 new_sys_guid; + __be32 lid;/* lid where sys guid changed */ + } __packed ntc_145; + + struct { + __be32 lid; + __be32 dr_slid; + u8 method; + u8 dr_trunc_hop; + __be16 attr_id; + __be32 attr_mod; + __be64 mkey; + u8 dr_rtn_path[30]; + } __packed ntc_256; + + struct { + __be32 lid1; + __be32 lid2; + __be32 key; + u8 sl; /* SL: high 5 bits */ + u8 reserved3[3]; + union ib_gidgid1; + union ib_gidgid2; + __be32 qp1;/* high 8 bits reserved */ + __be32 qp2;/* high 8 bits reserved */ + } __packed ntc_257_258; + + struct { + __be16 flags; /* low 8 bits reserved */ + __be16 pkey; + __be32 lid1; + __be32 lid2; + u8 sl; /* SL: high 5 bits */ + u8 reserved4[3]; + union ib_gidgid1; + union ib_gidgid2; + __be32 qp1;/* high 8 bits reserved
[PATCH] staging/rdma/hfi1: convert buffers allocated atomic to per cpu
Profiling has shown the the atomic is a performance issue for the pio hot path. If multiple cpus allocated an sc's buffer, the cacheline containing the atomic will bounce from L0 to L0. Convert the atomic to a percpu variable. Reviewed-by: Jubin John Signed-off-by: Mike Marciniszyn --- drivers/staging/rdma/hfi1/pio.c | 40 ++ drivers/staging/rdma/hfi1/pio.h |2 +- drivers/staging/rdma/hfi1/pio_copy.c |6 +++-- 3 files changed, 40 insertions(+), 8 deletions(-) diff --git a/drivers/staging/rdma/hfi1/pio.c b/drivers/staging/rdma/hfi1/pio.c index eab58c1..ed8043b 100644 --- a/drivers/staging/rdma/hfi1/pio.c +++ b/drivers/staging/rdma/hfi1/pio.c @@ -660,6 +660,24 @@ void set_pio_integrity(struct send_context *sc) write_kctxt_csr(dd, hw_context, SC(CHECK_ENABLE), reg); } +static u32 get_buffers_allocated(struct send_context *sc) +{ + int cpu; + u32 ret = 0; + + for_each_possible_cpu(cpu) + ret += *per_cpu_ptr(sc->buffers_allocated, cpu); + return ret; +} + +static void reset_buffers_allocated(struct send_context *sc) +{ + int cpu; + + for_each_possible_cpu(cpu) + (*per_cpu_ptr(sc->buffers_allocated, cpu)) = 0; +} + /* * Allocate a NUMA relative send context structure of the given type along * with a HW context. @@ -668,7 +686,7 @@ struct send_context *sc_alloc(struct hfi1_devdata *dd, int type, uint hdrqentsize, int numa) { struct send_context_info *sci; - struct send_context *sc; + struct send_context *sc = NULL; dma_addr_t pa; unsigned long flags; u64 reg; @@ -686,10 +704,20 @@ struct send_context *sc_alloc(struct hfi1_devdata *dd, int type, if (!sc) return NULL; + sc->buffers_allocated = alloc_percpu(u32); + if (!sc->buffers_allocated) { + kfree(sc); + dd_dev_err(dd, + "Cannot allocate buffers_allocated per cpu counters\n" + ); + return NULL; + } + spin_lock_irqsave(&dd->sc_lock, flags); ret = sc_hw_alloc(dd, type, &sw_index, &hw_context); if (ret) { spin_unlock_irqrestore(&dd->sc_lock, flags); + free_percpu(sc->buffers_allocated); kfree(sc); return NULL; } @@ -705,7 +733,6 @@ struct send_context *sc_alloc(struct hfi1_devdata *dd, int type, spin_lock_init(&sc->credit_ctrl_lock); INIT_LIST_HEAD(&sc->piowait); INIT_WORK(&sc->halt_work, sc_halted); - atomic_set(&sc->buffers_allocated, 0); init_waitqueue_head(&sc->halt_wait); /* grouping is always single context for now */ @@ -866,6 +893,7 @@ void sc_free(struct send_context *sc) spin_unlock_irqrestore(&dd->sc_lock, flags); kfree(sc->sr); + free_percpu(sc->buffers_allocated); kfree(sc); } @@ -1029,7 +1057,7 @@ int sc_restart(struct send_context *sc) /* kernel context */ loop = 0; while (1) { - count = atomic_read(&sc->buffers_allocated); + count = get_buffers_allocated(sc); if (count == 0) break; if (loop > 100) { @@ -1197,7 +1225,8 @@ int sc_enable(struct send_context *sc) sc->sr_head = 0; sc->sr_tail = 0; sc->flags = 0; - atomic_set(&sc->buffers_allocated, 0); + /* the alloc lock insures no fast path allocation */ + reset_buffers_allocated(sc); /* * Clear all per-context errors. Some of these will be set when @@ -1373,7 +1402,8 @@ retry: /* there is enough room */ - atomic_inc(&sc->buffers_allocated); + preempt_disable(); + this_cpu_inc(*sc->buffers_allocated); /* read this once */ head = sc->sr_head; diff --git a/drivers/staging/rdma/hfi1/pio.h b/drivers/staging/rdma/hfi1/pio.h index 0bb885c..53d3e0a 100644 --- a/drivers/staging/rdma/hfi1/pio.h +++ b/drivers/staging/rdma/hfi1/pio.h @@ -130,7 +130,7 @@ struct send_context { spinlock_t credit_ctrl_lock cacheline_aligned_in_smp; u64 credit_ctrl;/* cache for credit control */ u32 credit_intr_count; /* count of credit intr users */ - atomic_t buffers_allocated; /* count of buffers allocated */ + u32 __percpu *buffers_allocated;/* count of buffers allocated */ wait_queue_head_t halt_wait;/* wait until kernel sees interrupt */ }; diff --git a/drivers/staging/rdma/hfi1/pio_copy.c b/drivers/staging/rdma/hfi1/pio_copy.c index 8972bbc..ebb0baf 100644 --- a/drivers/staging/rdma/hfi1/pio_
[PATCH] staging/rdma/hfi1: fix sdma build failures to always clean up
There are holes in the sdma build support routines that do not clean any partially built sdma descriptors after mapping or allocate failures. This patch corrects these issues. Reviewed-by: Dennis Dalessandro Signed-off-by: Mike Marciniszyn --- drivers/staging/rdma/hfi1/sdma.c | 10 ++ drivers/staging/rdma/hfi1/sdma.h |7 +-- 2 files changed, 11 insertions(+), 6 deletions(-) diff --git a/drivers/staging/rdma/hfi1/sdma.c b/drivers/staging/rdma/hfi1/sdma.c index 90b7072..8023eeb 100644 --- a/drivers/staging/rdma/hfi1/sdma.c +++ b/drivers/staging/rdma/hfi1/sdma.c @@ -2724,22 +2724,21 @@ static int _extend_sdma_tx_descs(struct hfi1_devdata *dd, struct sdma_txreq *tx) tx->coalesce_buf = kmalloc(tx->tlen + sizeof(u32), GFP_ATOMIC); if (!tx->coalesce_buf) - return -ENOMEM; - + goto enomem; tx->coalesce_idx = 0; } return 0; } if (unlikely(tx->num_desc == MAX_DESC)) - return -ENOMEM; + goto enomem; tx->descp = kmalloc_array( MAX_DESC, sizeof(struct sdma_desc), GFP_ATOMIC); if (!tx->descp) - return -ENOMEM; + goto enomem; /* reserve last descriptor for coalescing */ tx->desc_limit = MAX_DESC - 1; @@ -2747,6 +2746,9 @@ static int _extend_sdma_tx_descs(struct hfi1_devdata *dd, struct sdma_txreq *tx) for (i = 0; i < tx->num_desc; i++) tx->descp[i] = tx->descs[i]; return 0; +enomem: + sdma_txclean(dd, tx); + return -ENOMEM; } /* diff --git a/drivers/staging/rdma/hfi1/sdma.h b/drivers/staging/rdma/hfi1/sdma.h index 85701ee..da89e64 100644 --- a/drivers/staging/rdma/hfi1/sdma.h +++ b/drivers/staging/rdma/hfi1/sdma.h @@ -774,10 +774,13 @@ static inline int _sdma_txadd_daddr( tx->tlen -= len; /* special cases for last */ if (!tx->tlen) { - if (tx->packet_len & (sizeof(u32) - 1)) + if (tx->packet_len & (sizeof(u32) - 1)) { rval = _pad_sdma_tx_descs(dd, tx); - else + if (rval) + return rval; + } else { _sdma_close_tx(dd, tx); + } } tx->num_desc++; return rval; ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] staging/rdma/hfi1: fix pio progress routine race with allocator
The allocation code assumes that the shadow ring cannot be overrun because the credits will limit the allocation. Unfortuately, the progress mechanism in sc_release_update() updates the free count prior to processing the shadow ring, allowing the shadow ring to be overrun by an allocation. Reviewed-by: Mark Debbage Signed-off-by: Mike Marciniszyn --- drivers/staging/rdma/hfi1/pio.c |9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/staging/rdma/hfi1/pio.c b/drivers/staging/rdma/hfi1/pio.c index eab58c1..8e10857 100644 --- a/drivers/staging/rdma/hfi1/pio.c +++ b/drivers/staging/rdma/hfi1/pio.c @@ -1565,6 +1565,7 @@ void sc_release_update(struct send_context *sc) u64 hw_free; u32 head, tail; unsigned long old_free; + unsigned long free; unsigned long extra; unsigned long flags; int code; @@ -1579,7 +1580,7 @@ void sc_release_update(struct send_context *sc) extra = (((hw_free & CR_COUNTER_SMASK) >> CR_COUNTER_SHIFT) - (old_free & CR_COUNTER_MASK)) & CR_COUNTER_MASK; - sc->free = old_free + extra; + free = old_free + extra; trace_hfi1_piofree(sc, extra); /* call sent buffer callbacks */ @@ -1589,7 +1590,7 @@ void sc_release_update(struct send_context *sc) while (head != tail) { pbuf = &sc->sr[tail].pbuf; - if (sent_before(sc->free, pbuf->sent_at)) { + if (sent_before(free, pbuf->sent_at)) { /* not sent yet */ break; } @@ -1603,8 +1604,10 @@ void sc_release_update(struct send_context *sc) if (tail >= sc->sr_size) tail = 0; } - /* update tail, in case we moved it */ sc->sr_tail = tail; + /* make sure tail is updated before free */ + smp_wmb(); + sc->free = free; spin_unlock_irqrestore(&sc->release_lock, flags); sc_piobufavail(sc); } ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 2/2] staging/rdma/hfi1: add ACK coalescing logic
Implement ACK coalesing logic using a 8 bit counter. The algorithm is send pio ack when: - fecn present - this is the first packet in an interrupt session - counter is >= HFI1_PSN_CREDIT Otherwise the ack is defered. Reviewed-by: Dennis Dalessandro Signed-off-by: Mike Marciniszyn --- drivers/staging/rdma/hfi1/qp.c|1 + drivers/staging/rdma/hfi1/rc.c| 29 +++-- drivers/staging/rdma/hfi1/verbs.h |7 --- 3 files changed, 32 insertions(+), 5 deletions(-) diff --git a/drivers/staging/rdma/hfi1/qp.c b/drivers/staging/rdma/hfi1/qp.c index df1fa56..8f867ba 100644 --- a/drivers/staging/rdma/hfi1/qp.c +++ b/drivers/staging/rdma/hfi1/qp.c @@ -378,6 +378,7 @@ static void reset_qp(struct hfi1_qp *qp, enum ib_qp_type type) } qp->s_ack_state = IB_OPCODE_RC_ACKNOWLEDGE; qp->r_nak_state = 0; + qp->r_adefered = 0; qp->r_aflags = 0; qp->r_flags = 0; qp->s_head = 0; diff --git a/drivers/staging/rdma/hfi1/rc.c b/drivers/staging/rdma/hfi1/rc.c index 6fe0104..0193497 100644 --- a/drivers/staging/rdma/hfi1/rc.c +++ b/drivers/staging/rdma/hfi1/rc.c @@ -1618,6 +1618,17 @@ static inline void rc_defered_ack(struct hfi1_ctxtdata *rcd, } } +static inline void rc_cancel_ack(struct hfi1_qp *qp) +{ + qp->r_adefered = 0; + if (list_empty(&qp->rspwait)) + return; + list_del_init(&qp->rspwait); + qp->r_flags &= ~HFI1_R_RSP_DEFERED_ACK; + if (atomic_dec_and_test(&qp->refcount)) + wake_up(&qp->wait); +} + /** * rc_rcv_error - process an incoming duplicate or error RC packet * @ohdr: the other headers for this packet @@ -2335,8 +2346,22 @@ send_last: qp->r_ack_psn = psn; qp->r_nak_state = 0; /* Send an ACK if requested or required. */ - if (psn & (1 << 31)) - goto send_ack; + if (psn & IB_BTH_REQ_ACK) { + if (packet->numpkt == 0) { + rc_cancel_ack(qp); + goto send_ack; + } + if (qp->r_adefered >= HFI1_PSN_CREDIT) { + rc_cancel_ack(qp); + goto send_ack; + } + if (unlikely(is_fecn)) { + rc_cancel_ack(qp); + goto send_ack; + } + qp->r_adefered++; + rc_defered_ack(rcd, qp); + } return; rnr_nak: diff --git a/drivers/staging/rdma/hfi1/verbs.h b/drivers/staging/rdma/hfi1/verbs.h index c5e6f47..6d2012b 100644 --- a/drivers/staging/rdma/hfi1/verbs.h +++ b/drivers/staging/rdma/hfi1/verbs.h @@ -120,9 +120,9 @@ struct hfi1_packet; #define HFI1_VENDOR_IPGcpu_to_be16(0xFFA0) -#define IB_BTH_REQ_ACK (1 << 31) -#define IB_BTH_SOLICITED (1 << 23) -#define IB_BTH_MIG_REQ (1 << 22) +#define IB_BTH_REQ_ACK BIT(31) +#define IB_BTH_SOLICITED BIT(23) +#define IB_BTH_MIG_REQ BIT(22) #define IB_GRH_VERSION 6 #define IB_GRH_VERSION_MASK0xF @@ -484,6 +484,7 @@ struct hfi1_qp { u32 r_psn; /* expected rcv packet sequence number */ u32 r_msn; /* message sequence number */ + u8 r_adefered; /* number of acks defered */ u8 r_state; /* opcode of last packet received */ u8 r_flags; u8 r_head_ack_queue;/* index into s_ack_queue[] */ ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH 1/2] staging/rdma/hfi1: add common routine for queuing acks
This patch is a prelimary patch required to coalesce acks. The routine to "schedule" a QP for sending a NAK is now centralized in rc_defer_ack(). The flag is changed for clarity since the all acks will potentially use the deferral mechanism. Reviewed-by: Dennis Dalessandro Signed-off-by: Mike Marciniszyn --- drivers/staging/rdma/hfi1/driver.c |4 ++- drivers/staging/rdma/hfi1/rc.c | 42 +--- drivers/staging/rdma/hfi1/verbs.h | 12 ++ 3 files changed, 24 insertions(+), 34 deletions(-) diff --git a/drivers/staging/rdma/hfi1/driver.c b/drivers/staging/rdma/hfi1/driver.c index ce69141..e8e0f55 100644 --- a/drivers/staging/rdma/hfi1/driver.c +++ b/drivers/staging/rdma/hfi1/driver.c @@ -721,8 +721,8 @@ static inline void process_rcv_qp_work(struct hfi1_packet *packet) */ list_for_each_entry_safe(qp, nqp, &rcd->qp_wait_list, rspwait) { list_del_init(&qp->rspwait); - if (qp->r_flags & HFI1_R_RSP_NAK) { - qp->r_flags &= ~HFI1_R_RSP_NAK; + if (qp->r_flags & HFI1_R_RSP_DEFERED_ACK) { + qp->r_flags &= ~HFI1_R_RSP_DEFERED_ACK; hfi1_send_rc_ack(rcd, qp, 0); } if (qp->r_flags & HFI1_R_RSP_SEND) { diff --git a/drivers/staging/rdma/hfi1/rc.c b/drivers/staging/rdma/hfi1/rc.c index 0b19206..6fe0104 100644 --- a/drivers/staging/rdma/hfi1/rc.c +++ b/drivers/staging/rdma/hfi1/rc.c @@ -1608,6 +1608,16 @@ bail: return; } +static inline void rc_defered_ack(struct hfi1_ctxtdata *rcd, + struct hfi1_qp *qp) +{ + if (list_empty(&qp->rspwait)) { + qp->r_flags |= HFI1_R_RSP_DEFERED_ACK; + atomic_inc(&qp->refcount); + list_add_tail(&qp->rspwait, &rcd->qp_wait_list); + } +} + /** * rc_rcv_error - process an incoming duplicate or error RC packet * @ohdr: the other headers for this packet @@ -1650,11 +1660,7 @@ static noinline int rc_rcv_error(struct hfi1_other_headers *ohdr, void *data, * in the receive queue have been processed. * Otherwise, we end up propagating congestion. */ - if (list_empty(&qp->rspwait)) { - qp->r_flags |= HFI1_R_RSP_NAK; - atomic_inc(&qp->refcount); - list_add_tail(&qp->rspwait, &rcd->qp_wait_list); - } + rc_defered_ack(rcd, qp); } goto done; } @@ -2337,11 +2343,7 @@ rnr_nak: qp->r_nak_state = IB_RNR_NAK | qp->r_min_rnr_timer; qp->r_ack_psn = qp->r_psn; /* Queue RNR NAK for later */ - if (list_empty(&qp->rspwait)) { - qp->r_flags |= HFI1_R_RSP_NAK; - atomic_inc(&qp->refcount); - list_add_tail(&qp->rspwait, &rcd->qp_wait_list); - } + rc_defered_ack(rcd, qp); return; nack_op_err: @@ -2349,11 +2351,7 @@ nack_op_err: qp->r_nak_state = IB_NAK_REMOTE_OPERATIONAL_ERROR; qp->r_ack_psn = qp->r_psn; /* Queue NAK for later */ - if (list_empty(&qp->rspwait)) { - qp->r_flags |= HFI1_R_RSP_NAK; - atomic_inc(&qp->refcount); - list_add_tail(&qp->rspwait, &rcd->qp_wait_list); - } + rc_defered_ack(rcd, qp); return; nack_inv_unlck: @@ -2363,11 +2361,7 @@ nack_inv: qp->r_nak_state = IB_NAK_INVALID_REQUEST; qp->r_ack_psn = qp->r_psn; /* Queue NAK for later */ - if (list_empty(&qp->rspwait)) { - qp->r_flags |= HFI1_R_RSP_NAK; - atomic_inc(&qp->refcount); - list_add_tail(&qp->rspwait, &rcd->qp_wait_list); - } + rc_defered_ack(rcd, qp); return; nack_acc_unlck: @@ -2421,13 +2415,7 @@ void hfi1_rc_hdrerr( * Otherwise, we end up * propagating congestion. */ - if (list_empty(&qp->rspwait)) { - qp->r_flags |= HFI1_R_RSP_NAK; - atomic_inc(&qp->refcount); - list_add_tail( - &qp->rspwait, - &rcd->qp_wait_list); - } + rc_defered_ack(rcd, qp); } /* Out of sequence NAK */ } /* QP Request NAKs */ } diff --git a/drivers/staging/rdma/hfi1/verbs.h b/drivers/staging/rdma
[PATCH] IB/hfi1: fix sdma_descq_cnt parameter parsing
The boolean tests should have been or-ed. Reported-by: David Binderman Reviewed-by: Jubin John Signed-off-by: Mike Marciniszyn --- drivers/staging/rdma/hfi1/sdma.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/rdma/hfi1/sdma.c b/drivers/staging/rdma/hfi1/sdma.c index a8c903c..7e01f30 100644 --- a/drivers/staging/rdma/hfi1/sdma.c +++ b/drivers/staging/rdma/hfi1/sdma.c @@ -737,7 +737,7 @@ u16 sdma_get_descq_cnt(void) */ if (!is_power_of_2(count)) return SDMA_DESCQ_CNT; - if (count < 64 && count > 32768) + if (count < 64 || count > 32768) return SDMA_DESCQ_CNT; return count; } ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] Kconfig: add temporary PCI dependency
The move from infiniband to staging requires a temporary PCI dependency to fix 0-day build issues. The drivers/infiniband/hw/Kconfig gratuitously added it for all drivers. Signed-off-by: Mike Marciniszyn --- drivers/staging/hfi1/Kconfig |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/hfi1/Kconfig b/drivers/staging/hfi1/Kconfig index 78bc89e..8a079ba 100644 --- a/drivers/staging/hfi1/Kconfig +++ b/drivers/staging/hfi1/Kconfig @@ -1,6 +1,6 @@ config INFINIBAND_HFI1 tristate "Intel OPA Gen1 support" - depends on X86_64 && INFINIBAND + depends on X86_64 && INFINIBAND && PCI default m ---help--- This is a low-level driver for Intel OPA Gen1 adapter. ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH] MAINTAINERS: Add maintainer section for hfi1
Signed-off-by: Mike Marciniszyn --- 0 files changed diff --git a/MAINTAINERS b/MAINTAINERS index b3c1a56..45953e9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9892,6 +9892,13 @@ M: Arnaud Patard S: Odd Fixes F: drivers/staging/xgifb/ +HFI1 DRIVER +M: Mike Marciniszyn +L: linux-r...@vger.kernel.org +L: de...@driverdev.osuosl.org +S: Supported +F: drivers/staging/hfi1 + STARFIRE/DURALAN NETWORK DRIVER M: Ion Badulescu S: Odd Fixes ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel