Re: [PATCH][V2][target-devel-next] tcmu: make array tcmu_attrib_attrs static const
On Tue, 2017-06-13 at 14:29 +0100, Colin King wrote: > From: Colin Ian King > > The array tcmu_attrib_attrs does not need to be in global scope, so make > it static. > > Cleans up sparse warning: > "symbol 'tcmu_attrib_attrs' was not declared. Should it be static?" > > Signed-off-by: Colin Ian King > --- > drivers/target/target_core_user.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/target/target_core_user.c > b/drivers/target/target_core_user.c > index afc1fd6bacaf..04fb3f720895 100644 > --- a/drivers/target/target_core_user.c > +++ b/drivers/target/target_core_user.c > @@ -1672,7 +1672,7 @@ static ssize_t tcmu_emulate_write_cache_store(struct > config_item *item, > } > CONFIGFS_ATTR(tcmu_, emulate_write_cache); > > -struct configfs_attribute *tcmu_attrib_attrs[] = { > +static struct configfs_attribute *tcmu_attrib_attrs[] = { > &tcmu_attr_cmd_time_out, > &tcmu_attr_dev_path, > &tcmu_attr_dev_size, Applied. Thanks Colin.
Re: [PATCH] tcmu: Fix module removal due to stuck unmap_thread thread again
On Thu, 2017-06-15 at 15:05 +0800, lixi...@cmss.chinamobile.com wrote: > From: Xiubo Li > > Because the unmap code just after the schdule() returned may take > a long time and if the kthread_stop() is fired just when in this > routine, the module removal maybe stuck too. > > Signed-off-by: Xiubo Li > --- > drivers/target/target_core_user.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/target/target_core_user.c > b/drivers/target/target_core_user.c > index beb5f09..203bff1 100644 > --- a/drivers/target/target_core_user.c > +++ b/drivers/target/target_core_user.c > @@ -1573,7 +1573,7 @@ static int unmap_thread_fn(void *data) > struct page *page; > int i; > > - while (1) { > + while (!kthread_should_stop()) { > DEFINE_WAIT(__wait); > > prepare_to_wait(&unmap_wait, &__wait, TASK_INTERRUPTIBLE); Applied to target-pending/for-next. Thanks Xiubo + MNC.
[Bug 196191] New: NULL pointer dereference at beiscsi_process_cq+0x6f6
https://bugzilla.kernel.org/show_bug.cgi?id=196191 Bug ID: 196191 Summary: NULL pointer dereference at beiscsi_process_cq+0x6f6 Product: SCSI Drivers Version: 2.5 Kernel Version: 4.9.30-2~bpo8+1 (Debian) Hardware: All OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Other Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org Reporter: wf...@niif.hu Regression: No An iSCSI-booted server occasionally halts with following console log during bootup: [ OK ] Reached target Network is Online. Starting LSB: Starts and stops the iSCSI initiator s...ault targets... [ 214.044354] iscsi: registered transport (tcp) [ 214.316086] iscsi: registered transport (iser) [ 217.293303] hb: link status definitely up for interface enp5s0f5, 200 Mbps full duplex [ 217.709239] cloud: link status definitely up for interface enp5s0f7, 1 Mbps full duplex [ 218.438152] BUG: unable to handle kernel NULL pointer dereference at (null) [ 218.463932] IP: [] beiscsi_process_cq+0x6f6/0xa00 [be2iscsi] [ 218.487703] PGD 0 [ 218.493722] [ 218.498607] Oops: [#1] SMP [ 218.508917] Modules linked in: ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp bonding ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate joydev iTCO_wdt iTCO_vendor_support evdev intel_uncore mgag200 ttm drm_kms_helper pcspkr mei_me intel_rapl_perf drm lpc_ich i2c_algo_bit mei mfd_core shpchp wmi ac button acpi_pad acpi_power_meter ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ipmi_watchdog ip6_tables ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_recent xt_multiport xt_conntrack ipmi_si nf_conntrack ipmi_poweroff iptable_filter ipmi_devintf ip_tables ipmi_msghandler x_tables configfs autofs4 xfs libcrc32c dm_service_time dm_multipath dm_mod sg sd_mod hid_generic usbhid hid be2iscsi crc32c_intel libiscsi ehci_pci aesni_intel aes_x86_64 scsi_transport_iscsi glue_helper ehci_hcd iscsi_boot_sysfs lrw gf128mul ablk_helper i2c_i801 cryptd usbcore i2c_smbus usb_common scsi_mod be2net [ 218.847642] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-0.bpo.3-amd64 #1 Debian 4.9.30-2~bpo8+1 [ 218.877399] Hardware name: FUJITSU PRIMERGY BX924 S4/D3143-B1, BIOS V4.6.5.4 R1.13.0 for D3143-B1x 07/16/2015 [ 218.910016] task: b640e540 task.stack: b640 [ 218.929476] RIP: 0010:[] [] beiscsi_process_cq+0x6f6/0xa00 [be2iscsi] [ 218.961249] RSP: :95bfbfa03e48 EFLAGS: 00010297 [ 218.978707] RAX: 95bfb5100e00 RBX: 0001 RCX: 0005 [ 219.002171] RDX: 0001 RSI: RDI: 95bfb532d504 [ 219.025636] RBP: 0001 R08: 03fb0001 R09: 00011bcd [ 219.049097] R10: 594e5a87 R11: 95bfb551fd18 R12: 95bfb2822208 [ 219.072562] R13: 0005 R14: 95bfb2818810 R15: 95bfb2557d40 [ 219.096026] FS: () GS:95bfbfa0() knlGS: [ 219.122634] CS: 0010 DS: ES: CR0: 80050033 [ 219.141521] CR2: CR3: 0017c5a07000 CR4: 001406f0 [ 219.164986] Stack: [ 219.171579] 80202f00 000ab86fc200 95bfb2818d70 95bfb87000e2 [ 219.195943] 95bfb532d504 95bfb5100e00 95bfb551fd18 01ff95bf00011bcd [ 219.220310] 00fe 95bf8100 95bfb86fa018 [ 219.244675] Call Trace: [ 219.252699] [ 219.259008] [] ? be_iopoll+0xc8/0x170 [be2iscsi] [ 219.279627] [] ? irq_poll_softirq+0xa4/0xd0 [ 219.298813] [] ? __do_softirq+0x106/0x292 [ 219.317430] [] ? irq_exit+0x98/0xa0 [ 219.334317] [] ? do_IRQ+0x4f/0xd0 [ 219.334317] [] ? do_IRQ+0x4f/0xd0 [ 219.350635] [] ? common_interrupt+0x82/0x82 [ 219.369810] [ 219.376123] [] ? cpuidle_enter_state+0x113/0x260 [ 219.396741] [] ? cpu_startup_entry+0x17e/0x260 [ 219.416785] [] ? start_kernel+0x46d/0x48d [ 219.435392] [] ? early_idt_handler_array+0x120/0x120 [ 219.457141] [] ? x86_64_start_kernel+0x152/0x176 [ 219.477745] Code: 00 0f b6 44 24 50 c6 06 22 88 56 01 88 46 02 44 89 c8 0f c8 89 46 1c 0f b6 44 24 40 41 8d 44 01 ff 0f c8 89 46 20 eb ac 48 8b 30 06 3f 0f 85 bb 00 00 00 49 8b 3b e9 74 ff ff ff 41 f6 86 10 [ 219.540521] RIP [] beiscsi_process_cq+0x6f6/0xa00 [be2iscsi] [ 219.564571] RSP [ 219.576023] CR2: [ 219.586917] ---[ end trace f9ec7ddab1cda260 ]--- [ 219.7222
Re: [PATCH v2 0/5] g_NCR5380: PDMA fixes and cleanup
On Saturday 24 June 2017 08:37:36 Finn Thain wrote: > Ondrej, would you please test this new series? > > Changed since v1: > - PDMA transfer residual is calculated earlier. > - End of DMA flag check is now polled (if there is any residual). > > > Finn Thain (2): > g_NCR5380: Limit sg_tablesize to avoid PDMA read overruns on DTC436 > g_NCR5380: Cleanup comments and whitespace > > Ondrej Zary (3): > g_NCR5380: Fix PDMA transfer size > g_NCR5380: End PDMA transfer correctly on target disconnection > g_NCR5380: Re-work PDMA loops > > drivers/scsi/g_NCR5380.c | 231 > ++- 1 file changed, 107 > insertions(+), 124 deletions(-) It mostly works, but there are some problems: It's not reliable - we continue the data transfer after poll_politely2 returns zero but we don't know if it returned because of host buffer being ready of because of an IRQ. So if a device disconnects during write, we continue to fill the buffer and only then find out that wait for 53c80 registers timed out. Then PDMA gets disabled: [ 3458.774251] sd 2:0:1:0: [sdb] tag#0 53c80 registers not accessible, device will be reset [ 3458.774251] sd 2:0:1:0: [sdb] tag#0 switching to slow handshake We can just reset and continue with a new PDMA transfer. Found no problems with reads. But when this happens during a write, we might have lost some data buffers that we need to transfer again. The chip's PDMA block counter does not seem to be very helpful here - testing shows that either one buffer is missing in the file or is duplicated. That's why my code had separate host buffer ready and IRQ checks. Host buffer first - if it's ready, transfer the data. If not, check for IRQ - if it was an error, rollback 2 buffers (the same if the host buffer is not ready in time). There's also a performance regression on DTC436 - the sg_tablesize limit affects performance badly. Before: # hdparm -t --direct /dev/sdb /dev/sdb: Timing O_DIRECT disk reads: 4 MB in 3.21 seconds = 1.25 MB/sec Now: # hdparm -t --direct /dev/sdb /dev/sdb: Timing O_DIRECT disk reads: 4 MB in 4.89 seconds = 837.69 kB/sec We should limit the transfer size instead: --- a/drivers/scsi/g_NCR5380.c +++ b/drivers/scsi/g_NCR5380.c @@ -45,7 +45,8 @@ int c400_blk_cnt; \ int c400_host_buf; \ int io_width; \ - int pdma_residual + int pdma_residual; \ + int board; #define NCR5380_dma_xfer_lengeneric_NCR5380_dma_xfer_len #define NCR5380_dma_recv_setup generic_NCR5380_pread @@ -247,7 +248,6 @@ static int generic_NCR5380_init_one(struct scsi_host_template *tpnt, case BOARD_DTC3181E: ports = dtc_3181e_ports; magic = ncr_53c400a_magic; - tpnt->sg_tablesize = 1; break; } @@ -317,6 +317,7 @@ static int generic_NCR5380_init_one(struct scsi_host_template *tpnt, } hostdata = shost_priv(instance); + hostdata->board = board; hostdata->io = iomem; hostdata->region_size = region_size; @@ -625,6 +626,9 @@ static int generic_NCR5380_dma_xfer_len(struct NCR5380_hostdata *hostdata, /* 53C400 datasheet: non-modulo-128-byte transfers should use PIO */ if (transfersize % 128) transfersize = 0; + /* Limit transfers to 512B to prevent random write corruption on DTC */ + if (hostdata->board == BOARD_DTC3181E && transfersize > 512) + transfersize = 512; return min(transfersize, DMA_MAX_SIZE); } No data corruption and no performance regression: # hdparm -t --direct /dev/sdb /dev/sdb: Timing O_DIRECT disk reads: 4 MB in 3.25 seconds = 1.23 MB/sec As the data corruption affects only writes, we could keep transfersize unlimited for reads: + /* Limit write transfers to 512B to prevent random corruption on DTC */ + if (hostdata->board == BOARD_DTC3181E && + cmd->sc_data_direction == DMA_TO_DEVICE && transfersize > 512) + transfersize = 512; So we can get some performance gain at least for reads: # hdparm -t --direct /dev/sdb /dev/sdb: Timing O_DIRECT disk reads: 6 MB in 4.17 seconds = 1.44 MB/sec -- Ondrej Zary
Re: [PATCH 17/35] NCR5380: Move bus reset to host reset
On 06/24/2017 09:24 AM, Finn Thain wrote: > On Fri, 23 Jun 2017, Hannes Reinecke wrote: > >> The bus reset handler really is a host reset, so move it to >> eh_bus_reset_handler. >> >> Signed-off-by: Hannes Reinecke >> --- >> drivers/scsi/NCR5380.c | 13 - >> drivers/scsi/arm/cumana_1.c | 2 +- >> drivers/scsi/arm/oak.c | 2 +- >> drivers/scsi/atari_scsi.c | 6 +++--- >> drivers/scsi/dmx3191d.c | 2 +- >> drivers/scsi/g_NCR5380.c| 4 ++-- >> drivers/scsi/mac_scsi.c | 4 ++-- >> drivers/scsi/sun3_scsi.c| 4 ++-- >> 8 files changed, 20 insertions(+), 17 deletions(-) >> >> diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c >> index acc3344..e877fb9 100644 >> --- a/drivers/scsi/NCR5380.c >> +++ b/drivers/scsi/NCR5380.c >> @@ -2296,24 +2296,24 @@ static int NCR5380_abort(struct scsi_cmnd *cmd) >> >> >> /** >> - * NCR5380_bus_reset - reset the SCSI bus >> + * NCR5380_host_reset - reset the SCSI host >> * @cmd: SCSI command undergoing EH >> * >> * Returns SUCCESS >> */ >> >> -static int NCR5380_bus_reset(struct scsi_cmnd *cmd) >> +static int NCR5380_host_reset(struct scsi_cmnd *cmd) >> { >> struct Scsi_Host *instance = cmd->device->host; >> struct NCR5380_hostdata *hostdata = shost_priv(instance); >> int i; >> unsigned long flags; >> -struct NCR5380_cmd *ncmd; >> +struct NCR5380_cmd *ncmd, *tmp; >> > > Do you need to introduce another temporary command pointer for this? > >> spin_lock_irqsave(&hostdata->lock, flags); >> >> #if (NDEBUG & NDEBUG_ANY) >> -scmd_printk(KERN_INFO, cmd, __func__); >> +shost_printk(KERN_INFO, instance, __func__); >> #endif >> NCR5380_dprint(NDEBUG_ANY, instance); >> NCR5380_dprint_phase(NDEBUG_ANY, instance); >> @@ -2331,7 +2331,10 @@ static int NCR5380_bus_reset(struct scsi_cmnd *cmd) >> * commands! >> */ >> >> -if (list_del_cmd(&hostdata->unissued, cmd)) { >> +list_for_each_entry_safe(ncmd, tmp, &hostdata->unissued, list) { >> +struct scsi_cmnd *cmd = NCR5380_to_scmd(ncmd); >> + >> +list_del_init(&ncmd->list); >> cmd->result = DID_RESET << 16; >> cmd->scsi_done(cmd); >> } > > For the sake of consistency, why didn't you use the same style that is > used later in this routine? I.e. > > list_for_each_entry(ncmd, &hostdata->unissued, list) { > struct scsi_cmnd *cmd = NCR5380_to_scmd(ncmd); > > cmd->result = DID_RESET << 16; > cmd->scsi_done(cmd); > } > INIT_LIST_HEAD(&hostdata->unissued); > > Either way, > > Acked-by: Finn Thain > > Thanks. > As the driver was switch to using embedded private data area we need to ensure that we're not touching the data area anymore once we call ->done(); the command might be reused after that. Once the command is reused the 'list' structure in the embedded data area will be reset, resulting in a list corruption for the 'unissued' list. But as we're running under EH here where we don't have asynchronous I/O submissions I guess we should be fine with your suggestion. Will be updating the patch. Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)