[PATCH] target: fix potential memory leak in option parsing
When specifying same string type option several times, current option parsing will cause memory leak. Hence, call kfree for previous one in this case. Signed-off-by: Chengguang Xu --- drivers/target/target_core_configfs.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/target/target_core_configfs.c b/drivers/target/target_core_configfs.c index 3f4bf12..ee1a3a8 100644 --- a/drivers/target/target_core_configfs.c +++ b/drivers/target/target_core_configfs.c @@ -1659,6 +1659,7 @@ static ssize_t target_pr_res_aptpl_metadata_store(struct config_item *item, token = match_token(ptr, tokens, args); switch (token) { case Opt_initiator_fabric: + kfree(i_fabric); i_fabric = match_strdup(args); if (!i_fabric) { ret = -ENOMEM; @@ -1666,6 +1667,7 @@ static ssize_t target_pr_res_aptpl_metadata_store(struct config_item *item, } break; case Opt_initiator_node: + kfree(i_port); i_port = match_strdup(args); if (!i_port) { ret = -ENOMEM; @@ -1680,6 +1682,7 @@ static ssize_t target_pr_res_aptpl_metadata_store(struct config_item *item, } break; case Opt_initiator_sid: + kfree(isid); isid = match_strdup(args); if (!isid) { ret = -ENOMEM; @@ -1737,6 +1740,7 @@ static ssize_t target_pr_res_aptpl_metadata_store(struct config_item *item, * PR APTPL Metadata for Target Port */ case Opt_target_fabric: + kfree(t_fabric); t_fabric = match_strdup(args); if (!t_fabric) { ret = -ENOMEM; @@ -1744,6 +1748,7 @@ static ssize_t target_pr_res_aptpl_metadata_store(struct config_item *item, } break; case Opt_target_node: + kfree(t_port); t_port = match_strdup(args); if (!t_port) { ret = -ENOMEM; -- 1.8.3.1
Re: MPT Fusion - ext4 delayed allocation failed errors on 4.14!
Nikola, > thanks for explanation. but disabling write same for now is safe, > right? I was hoping we'd be able to disable it for SATA devices only. > root@siv-70140:~ # sg_vpd /dev/sda > Supported VPD pages VPD page: > Supported VPD pages [sv] > Device identification [di] > Supported VPD pages [sv] > Supported VPD pages [sv] However, it doesn't look like we have much to work with since there is no ATA Information VPD page exposed. -- Martin K. Petersen Oracle Linux Engineering
Re: usercopy whitelist woe in scsi_sense_cache
On Mon, Apr 16, 2018 at 1:44 PM, Kees Cook wrote: > On Thu, Apr 12, 2018 at 8:02 PM, Kees Cook wrote: >> On Thu, Apr 12, 2018 at 3:47 PM, Kees Cook wrote: >>> After fixing up some build issues in the middle of the 4.16 cycle, I >>> get an unhelpful bisect result of commit 0a4b6e2f80aa ("Merge branch >>> 'for-4.16/block'"). Instead of letting the test run longer, I'm going >>> to switch to doing several shorter test boots per kernel and see if >>> that helps. One more bisect coming... >> >> Okay, so I can confirm the bisect points at the _merge_ itself, not a >> specific patch. I'm not sure how to proceed here. It looks like some >> kind of interaction between separate trees? Jens, do you have >> suggestions on how to track this down? > > Turning off HARDENED_USERCOPY and turning on KASAN, I see the same report: > > [ 38.274106] BUG: KASAN: slab-out-of-bounds in _copy_to_user+0x42/0x60 > [ 38.274841] Read of size 22 at addr 8800122b8c4b by task smartctl/1064 > [ 38.275630] > [ 38.275818] CPU: 2 PID: 1064 Comm: smartctl Not tainted 4.17.0-rc1-ARCH+ > #266 > [ 38.276631] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), > BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 > [ 38.277690] Call Trace: > [ 38.277988] dump_stack+0x71/0xab > [ 38.278397] ? _copy_to_user+0x42/0x60 > [ 38.278833] print_address_description+0x6a/0x270 > [ 38.279368] ? _copy_to_user+0x42/0x60 > [ 38.279800] kasan_report+0x243/0x360 > [ 38.280221] _copy_to_user+0x42/0x60 > [ 38.280635] sg_io+0x459/0x660 > ... > > Though we get slightly more details (some we already knew): > > [ 38.301330] Allocated by task 329: > [ 38.301734] kmem_cache_alloc_node+0xca/0x220 > [ 38.302239] scsi_mq_init_request+0x64/0x130 [scsi_mod] > [ 38.302821] blk_mq_alloc_rqs+0x2cf/0x370 > [ 38.303265] blk_mq_sched_alloc_tags.isra.4+0x7d/0xb0 > [ 38.303820] blk_mq_init_sched+0xf0/0x220 > [ 38.304268] elevator_switch+0x17a/0x2c0 > [ 38.304705] elv_iosched_store+0x173/0x220 > [ 38.305171] queue_attr_store+0x72/0xb0 > [ 38.305602] kernfs_fop_write+0x188/0x220 > [ 38.306049] __vfs_write+0xb6/0x330 > [ 38.306436] vfs_write+0xe9/0x240 > [ 38.306804] ksys_write+0x98/0x110 > [ 38.307181] do_syscall_64+0x6d/0x1d0 > [ 38.307590] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 38.308142] > [ 38.308316] Freed by task 0: > [ 38.308652] (stack is not available) > [ 38.309060] > [ 38.309243] The buggy address belongs to the object at 8800122b8c00 > [ 38.309243] which belongs to the cache scsi_sense_cache of size 96 > [ 38.310625] The buggy address is located 75 bytes inside of > [ 38.310625] 96-byte region [8800122b8c00, 8800122b8c60) With a hardware watchpoint, I've isolated the corruption to here: bfq_dispatch_request+0x2be/0x1610: __bfq_dispatch_request at block/bfq-iosched.c:3902 3900if (rq) { 3901inc_in_driver_start_rq: 3902bfqd->rq_in_driver++; 3903start_rq: 3904rq->rq_flags |= RQF_STARTED; 3905} Through some race condition(?), rq_in_driver is also sense_buffer, and it can get incremented. Specifically, I am doing: diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index 25c14c58385c..f50d5053d732 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -6,6 +6,8 @@ #include #include #include +#include +#include #include @@ -428,6 +430,18 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *const hctx) } } +static void sample_hbp_handler(struct perf_event *bp, + struct perf_sample_data *data, + struct pt_regs *regs) +{ +printk(KERN_INFO "sense_buffer value is changed\n"); +dump_stack(); +printk(KERN_INFO "Dump stack from sample_hbp_handler\n"); +} + +struct perf_event * __percpu *sample_hbp; +int ok_to_go; + void blk_mq_sched_insert_request(struct request *rq, bool at_head, bool run_queue, bool async) { @@ -435,6 +449,16 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head, struct elevator_queue *e = q->elevator; struct blk_mq_ctx *ctx = rq->mq_ctx; struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu); + struct perf_event_attr attr; + struct scsi_request *req = scsi_req(rq); + + if (ok_to_go) { + hw_breakpoint_init(&attr); + attr.bp_addr = (uintptr_t)&(req->sense); + attr.bp_len = HW_BREAKPOINT_LEN_8; + attr.bp_type = HW_BREAKPOINT_W; + sample_hbp = register_wide_hw_breakpoint(&attr, sample_hbp_handler, NULL); + } /* flush rq in flush machinery need to be dispatched directly */ if (!(rq->rq_flags & RQF_FLUSH_SEQ) && op_is_flush(rq->cmd_flags)) { @@ -461,6 +485,9 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head, run: if (run_queue) blk_mq_run_hw_queue(hctx
[PATCH 1/3] sd_zbc: Change the type of the ZBC fields into u32
This patch does not change any functionality but makes it clear that it is on purpose that these fields are 32 bits wide. Signed-off-by: Bart Van Assche Cc: Damien Le Moal Cc: Christoph Hellwig Cc: Hannes Reinecke --- drivers/scsi/sd.h | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h index 0d663b5e45bb..392c7d078ae3 100644 --- a/drivers/scsi/sd.h +++ b/drivers/scsi/sd.h @@ -74,12 +74,12 @@ struct scsi_disk { struct gendisk *disk; struct opal_dev *opal_dev; #ifdef CONFIG_BLK_DEV_ZONED - unsigned intnr_zones; - unsigned intzone_blocks; - unsigned intzone_shift; - unsigned intzones_optimal_open; - unsigned intzones_optimal_nonseq; - unsigned intzones_max_open; + u32 nr_zones; + u32 zone_blocks; + u32 zone_shift; + u32 zones_optimal_open; + u32 zones_optimal_nonseq; + u32 zones_max_open; #endif atomic_topeners; sector_tcapacity; /* size in logical blocks */ -- 2.16.3
[PATCH 0/3] sd_zbc: Avoid that resetting a zone fails sporadically
Hello Martin, This patch series, when combined with the block layer patch that prevents that the report and reset zone ioctls are executed while a queue is frozen, prevents that these ioctls fail sporadically with -EIO. Since this race is relatively easy to trigger with the SMR support for fio that I'm working on, please consider this patch series for kernel v4.17. Thanks, Bart. Bart Van Assche (3): sd_zbc: Change the type of the ZBC fields into u32 sd_zbc: Let the SCSI core handle ILLEGAL REQUEST / ASC 0x21 sd_zbc: Avoid that resetting a zone fails sporadically drivers/scsi/sd.h | 12 ++-- drivers/scsi/sd_zbc.c | 150 +++-- include/linux/blkdev.h | 5 ++ 3 files changed, 93 insertions(+), 74 deletions(-) -- 2.16.3
[PATCH 3/3] sd_zbc: Avoid that resetting a zone fails sporadically
Since SCSI scanning occurs asynchronously, since sd_revalidate_disk() is called from sd_probe_async() and since sd_revalidate_disk() calls sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called concurrently with blkdev_report_zones() and/or blkdev_reset_zones(). That can cause these functions to fail with -EIO because sd_zbc_read_zones() e.g. sets q->nr_zones to zero before restoring it to the actual value, even if no drive characteristics have changed. Avoid that this can happen by making the following changes: - Protect the code that updates zone information with blk_queue_enter() and blk_queue_exit(). - Modify sd_zbc_setup_seq_zones_bitmap() and sd_zbc_setup() such that these functions do not modify struct scsi_disk before all zone information has been obtained. Note: since commit 055f6e18e08f ("block: Make q_usage_counter also track legacy requests"; kernel v4.15) the request queue freezing mechanism also affects legacy request queues. Fixes: 89d947561077 ("sd: Implement support for ZBC devices") Signed-off-by: Bart Van Assche Cc: Jens Axboe Cc: Damien Le Moal Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: sta...@vger.kernel.org # v4.10 --- drivers/scsi/sd_zbc.c | 140 + include/linux/blkdev.h | 5 ++ 2 files changed, 87 insertions(+), 58 deletions(-) diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c index 2d0c06f7db3e..323e3dc4bc59 100644 --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -390,8 +390,10 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf) * * Check that all zones of the device are equal. The last zone can however * be smaller. The zone size must also be a power of two number of LBAs. + * + * Returns the zone size in bytes upon success or an error code upon failure. */ -static int sd_zbc_check_zone_size(struct scsi_disk *sdkp) +static s64 sd_zbc_check_zone_size(struct scsi_disk *sdkp) { u64 zone_blocks = 0; sector_t block = 0; @@ -402,8 +404,6 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp) int ret; u8 same; - sdkp->zone_blocks = 0; - /* Get a buffer */ buf = kmalloc(SD_ZBC_BUF_SIZE, GFP_KERNEL); if (!buf) @@ -435,16 +435,17 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp) /* Parse zone descriptors */ while (rec < buf + buf_len) { - zone_blocks = get_unaligned_be64(&rec[8]); - if (sdkp->zone_blocks == 0) { - sdkp->zone_blocks = zone_blocks; - } else if (zone_blocks != sdkp->zone_blocks && - (block + zone_blocks < sdkp->capacity - || zone_blocks > sdkp->zone_blocks)) { - zone_blocks = 0; + u64 this_zone_blocks = get_unaligned_be64(&rec[8]); + + if (zone_blocks == 0) { + zone_blocks = this_zone_blocks; + } else if (this_zone_blocks != zone_blocks && + (block + this_zone_blocks < sdkp->capacity + || this_zone_blocks > zone_blocks)) { + this_zone_blocks = 0; goto out; } - block += zone_blocks; + block += this_zone_blocks; rec += 64; } @@ -457,8 +458,6 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp) } while (block < sdkp->capacity); - zone_blocks = sdkp->zone_blocks; - out: if (!zone_blocks) { if (sdkp->first_scan) @@ -478,8 +477,7 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp) "Zone size too large\n"); ret = -ENODEV; } else { - sdkp->zone_blocks = zone_blocks; - sdkp->zone_shift = ilog2(zone_blocks); + ret = zone_blocks; } out_free: @@ -490,15 +488,14 @@ static int sd_zbc_check_zone_size(struct scsi_disk *sdkp) /** * sd_zbc_alloc_zone_bitmap - Allocate a zone bitmap (one bit per zone). - * @sdkp: The disk of the bitmap + * @nr_zones: Number of zones to allocate space for. + * @numa_node: NUMA node to allocate the memory from. */ -static inline unsigned long *sd_zbc_alloc_zone_bitmap(struct scsi_disk *sdkp) +static inline unsigned long * +sd_zbc_alloc_zone_bitmap(u32 nr_zones, int numa_node) { - struct request_queue *q = sdkp->disk->queue; - - return kzalloc_node(BITS_TO_LONGS(sdkp->nr_zones) - * sizeof(unsigned long), - GFP_KERNEL, q->node); + return kzalloc_node(BITS_TO_LONGS(nr_zones) * sizeof(unsigned long), + GFP_KERNEL, numa_node); }
[PATCH 2/3] sd_zbc: Let the SCSI core handle ILLEGAL REQUEST / ASC 0x21
scsi_io_completion() translates the sense key ILLEGAL REQUEST / ASC 0x21 into ACTION_FAIL. That means that setting cmd->allowed to zero in sd_zbc_complete() for this sense code / ASC combination is not necessary. Hence remove the code that resets cmd->allowed from sd_zbc_complete(). Signed-off-by: Bart Van Assche Cc: Damien Le Moal Cc: Christoph Hellwig Cc: Hannes Reinecke --- drivers/scsi/sd_zbc.c | 10 -- 1 file changed, 10 deletions(-) diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c index 41df75eea57b..2d0c06f7db3e 100644 --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -299,16 +299,6 @@ void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes, case REQ_OP_WRITE: case REQ_OP_WRITE_ZEROES: case REQ_OP_WRITE_SAME: - - if (result && - sshdr->sense_key == ILLEGAL_REQUEST && - sshdr->asc == 0x21) - /* -* INVALID ADDRESS FOR WRITE error: It is unlikely that -* retrying write requests failed with any kind of -* alignement error will result in success. So don't. -*/ - cmd->allowed = 0; break; case REQ_OP_ZONE_REPORT: -- 2.16.3
[Bug 199419] New: mpt3sas triggers KASAN complaint during reboot
https://bugzilla.kernel.org/show_bug.cgi?id=199419 Bug ID: 199419 Summary: mpt3sas triggers KASAN complaint during reboot Product: SCSI Drivers Version: 2.5 Kernel Version: v4.17-rc1 Hardware: x86-64 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Other Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org Reporter: bvanass...@acm.org Regression: No Created attachment 275411 --> https://bugzilla.kernel.org/attachment.cgi?id=275411&action=edit KASAN complaint Rebooting a system with an mpt3sas adapter causes the following complaint to be reported on the serial console: BUG: KASAN: use-after-free in mpt3sas_scsih_scsi_lookup_get+0xbd/0x120 [mpt3sas] Read of size 1 at addr 880807f4030a by task systemd-shutdow/1 CPU: 26 PID: 1 Comm: systemd-shutdow Not tainted 4.17.0-rc1-dbg+ #2 Hardware name: ASUSTeK COMPUTER INC. Z10PE-D16 WS/Z10PE-D16 WS, BIOS 3407 03/10/2017 Call Trace: dump_stack+0x7c/0xbb print_address_description+0x65/0x270 kasan_report+0x232/0x350 mpt3sas_scsih_scsi_lookup_get+0xbd/0x120 [mpt3sas] _scsih_flush_running_cmds+0x85/0x130 [mpt3sas] scsih_shutdown+0x4f/0xe0 [mpt3sas] pci_device_shutdown+0x42/0x80 device_shutdown+0x1af/0x2f0 kernel_restart+0x9/0x50 __do_sys_reboot+0x24e/0x2a0 do_syscall_64+0x5d/0x200 entry_SYSCALL_64_after_hwframe+0x49/0xbe (gdb) list *(mpt3sas_scsih_scsi_lookup_get+0xbd) 0x1fb2d is in mpt3sas_scsih_scsi_lookup_get (drivers/scsi/mpt3sas/mpt3sas_scsih.c:1468). 1463u32 unique_tag = smid - 1; 1464 1465scmd = scsi_host_find_tag(ioc->shost, unique_tag); 1466if (scmd) { 1467st = scsi_cmd_priv(scmd); 1468if (st->cb_idx == 0xFF) 1469scmd = NULL; 1470} 1471} 1472return scmd; -- You are receiving this mail because: You are watching the assignee of the bug.
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On Mon, 16 Apr 2018 21:30:54 + Bart Van Assche wrote: > Hello Steve, > > The tool I'm most concerned about is blktrace. I'm not sure though how this > tool receives event data from the block layer core. Yeah, blktrace is "special", it looks like it registers its callbacks from the tracepoints, and writes the data to its own relay buffer. As it's not relying on the output from the tracing directory, additional fields being added shouldn't affect it. Looking at the trace event "block_rq_requeue" we have in the blktrace kernel code: static void blk_register_tracepoints(void) { int ret; ret = register_trace_block_rq_insert(blk_add_trace_rq_insert, NULL); Where the callback blk_add_trace_rq_insert() gets called when the trace event is hit. static void blk_add_trace_rq_insert(void *ignore, struct request_queue *q, struct request *rq) { blk_add_trace_rq(rq, 0, blk_rq_bytes(rq), BLK_TA_INSERT, blk_trace_request_get_cgid(q, rq)); } Where: static void blk_add_trace_rq(struct request *rq, int error, unsigned int nr_bytes, u32 what, union kernfs_node_id *cgid) { calls __blk_add_trace(bt, blk_rq_trace_sector(rq), nr_bytes, req_op(rq), rq->cmd_flags, what, error, 0, NULL, cgid); Which calls either the ftrace tracing file or its own relay buffer. Looking at the code from git://git.kernel.org/pub/scm/linux/kernel/git/axboe/blktrace.git It appears that it does not rely on the ftrace ring buffers. So I'm guessing blktrace is not affected by this patch. -- Steve
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On Mon, 2018-04-16 at 17:24 -0400, Steven Rostedt wrote: > On Mon, 16 Apr 2018 20:49:12 + > Bart Van Assche wrote: > > > Which tools process these strings? Has it been verified whether or not > > the tools that process these strings still work fine with this patch > > applied? > > Ideally, tools shouldn't process trace event strings, but I'm sure some > do. :-/ > > Getting libtraceevent out as a library is a high priority of mine, and > hopefully I should get something out in a couple of months. > > Ideally, tools should parse the raw data and then new fields will not > affect them. Hello Steve, The tool I'm most concerned about is blktrace. I'm not sure though how this tool receives event data from the block layer core. Thanks, Bart.
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On Mon, 16 Apr 2018 20:49:12 + Bart Van Assche wrote: > Which tools process these strings? Has it been verified whether or not > the tools that process these strings still work fine with this patch > applied? Ideally, tools shouldn't process trace event strings, but I'm sure some do. :-/ Getting libtraceevent out as a library is a high priority of mine, and hopefully I should get something out in a couple of months. Ideally, tools should parse the raw data and then new fields will not affect them. -- Steve
Proposal
Hello Greeetings to you please did you get my previous email regarding my investment proposal last week friday ? MS.Zeliha ömer faruk zeliha.omer.fa...@gmail.com
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On Mon, 2018-04-16 at 21:10 +, Bean Huo (beanhuo) wrote: > Hi, Bart > > > mi...@redhad.com; linux-bl...@vger.kernel.org; raja...@google.com > > Subject: [EXT] Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in > > SCSI > > trace events > > > > On Mon, 2018-04-16 at 14:31 +, Bean Huo (beanhuo) wrote: > > > TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u > > > > prot_sgl=%u" \ > > > - " prot_op=%s cmnd=(%s %s raw=%s)", > > > + " prot_op=%s tag=%d cmnd=(%s %s raw=%s)", > > > > > > [ ... ] > > > TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u > > > > prot_sgl=%u" \ > > > - " prot_op=%s cmnd=(%s %s raw=%s) rtn=%d", > > > + " prot_op=%s tag=%d cmnd=(%s %s raw=%s) rtn=%d", > > > [ ... ] > > > TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u " \ > > > - "prot_sgl=%u prot_op=%s cmnd=(%s %s raw=%s) > > > > result=(driver=" \ > > > - "%s host=%s message=%s status=%s)", > > > + "prot_sgl=%u prot_op=%s tag=%d cmnd=(%s %s raw=%s) " \ > > > + "result=(driver=%s host=%s message=%s status=%s)", > > > > Which tools process these strings? Has it been verified whether or not the > > tools that process these strings still work fine with this patch applied? > > > > I don't use one certain special tool to analyze this string, I am using > ftrace with event. > With tag information, I can see how many tasks in storage device and easily > to trace each request > under multiple thread workload. > Event there is someone who using certain tool to analyze this string, after > adding additional > tag printout, in my opinion, they are happy to see it there. Since you want to change these strings it is your job to determine which user space tools parse these strings and also whether or not this change will break any of these tools. Thanks, Bart.
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
Hi, Bart >mi...@redhad.com; linux-bl...@vger.kernel.org; raja...@google.com >Subject: [EXT] Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI >trace events > >On Mon, 2018-04-16 at 14:31 +, Bean Huo (beanhuo) wrote: >> TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u >prot_sgl=%u" \ >> - " prot_op=%s cmnd=(%s %s raw=%s)", >> + " prot_op=%s tag=%d cmnd=(%s %s raw=%s)", >> >> [ ... ] >> TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u >prot_sgl=%u" \ >> - " prot_op=%s cmnd=(%s %s raw=%s) rtn=%d", >> + " prot_op=%s tag=%d cmnd=(%s %s raw=%s) rtn=%d", >> [ ... ] >> TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u " \ >> - "prot_sgl=%u prot_op=%s cmnd=(%s %s raw=%s) >result=(driver=" \ >> - "%s host=%s message=%s status=%s)", >> + "prot_sgl=%u prot_op=%s tag=%d cmnd=(%s %s raw=%s) " \ >> + "result=(driver=%s host=%s message=%s status=%s)", > >Which tools process these strings? Has it been verified whether or not the >tools that process these strings still work fine with this patch applied? > I don't use one certain special tool to analyze this string, I am using ftrace with event. With tag information, I can see how many tasks in storage device and easily to trace each request under multiple thread workload. Event there is someone who using certain tool to analyze this string, after adding additional tag printout, in my opinion, they are happy to see it there. Again, you said if we add new feature in legacy block, we will also add new feature in blk-mq. I still don't understand here. "include/trace/event/block.h ... scsi.h" will be changed? If yes, how? Because blk-mq is still using the events defined in include/trace/event/block.h. Bean Huo >Thanks, > >Bart. >
Re: [PATCH v9] scsi: new zorro_esp.c for Amiga Zorro NCR53C9x boards
Thanks guys! Anything else needed for linux-scsi? Cheers, Michael On Mon, Apr 16, 2018 at 10:34 PM, John Paul Adrian Glaubitz wrote: > On 04/16/2018 11:26 AM, Christian T. Steigies wrote: >> >> On Thu, Apr 12, 2018 at 01:53:26PM +1200, Michael Schmitz wrote: >>> >>> From: Michael Schmitz >>> >>> New combined SCSI driver for all ESP based Zorro SCSI boards for >>> m68k Amiga. >>> >>> Code largely based on board specific parts of the old drivers (blz1230.c, >>> blz2060.c, cyberstorm.c, cyberstormII.c, fastlane.c which were removed >>> after the 2.6 kernel series for lack of maintenance) with contributions >>> by Tuomas Vainikka (TCQ bug tests and workaround) and Finn Thain (TCQ >>> bugfix by use of PIO in extended message in transfer). >>> >>> New Kconfig option and Makefile entries for new Amiga Zorro ESP SCSI >>> driver included in this patch. >>> >>> Use DMA transfers wherever possible, with board-specific DMA set-up >>> functions copied from the old driver code. Three byte reselection >>> messages >>> do appear to cause DMA timeouts. So wire up a PIO transfer routine for >>> these instead. esp_reselect_with_tag explicitly sets esp->cmd_block_dma >>> as >>> target address for the message bytes but PIO requires a virtual address. >>> Substiute kernel virtual address esp->cmd_block in PIO transfer call if >>> DMA address is esp->cmd_block_dma and phase is message in. >>> >>> PIO code taken from mac_esp.c where the reselection timeout issue was >>> debugged and fixed first, with minor macro and function rename. >>> >>> Signed-off-by: Michael Schmitz >>> Reviewed-by: Finn Thain >>> Reviewed-by: Christoph Hellwig >> >> Tested-by: Christian T. Steigies >> > > Also here on an Amiga 4000 with Cyberstorm I: > > Tested-by: John Paul Adrian Glaubitz > > -- > .''`. John Paul Adrian Glaubitz > : :' : Debian Developer - glaub...@debian.org > `. `' Freie Universitaet Berlin - glaub...@physik.fu-berlin.de > `-GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: [RESEND PATCH v1 2/2] trace: events: block: Add tag in block trace events
On Mon, 2018-04-16 at 14:33 +, Bean Huo (beanhuo) wrote: > [ ... ] > - TP_printk("%d,%d %s (%s) %llu + %u [%d]", > + TP_printk("%d,%d %s (%s) %llu + %u tag=%d [%d]", > [ ... ] > - TP_printk("%d,%d %s (%s) %llu + %u [%d]", > + TP_printk("%d,%d %s (%s) %llu + %u tag=%d [%d]", > [ ... ] Which tools process these strings? Has it been verified whether or not the tools that process these strings still work fine with this patch applied? Thanks, Bart.
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On Mon, 2018-04-16 at 14:31 +, Bean Huo (beanhuo) wrote: > TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u prot_sgl=%u" \ > - " prot_op=%s cmnd=(%s %s raw=%s)", > + " prot_op=%s tag=%d cmnd=(%s %s raw=%s)", > > [ ... ] > TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u prot_sgl=%u" \ > - " prot_op=%s cmnd=(%s %s raw=%s) rtn=%d", > + " prot_op=%s tag=%d cmnd=(%s %s raw=%s) rtn=%d", > [ ... ] > TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u " \ > - "prot_sgl=%u prot_op=%s cmnd=(%s %s raw=%s) result=(driver=" \ > - "%s host=%s message=%s status=%s)", > + "prot_sgl=%u prot_op=%s tag=%d cmnd=(%s %s raw=%s) " \ > + "result=(driver=%s host=%s message=%s status=%s)", Which tools process these strings? Has it been verified whether or not the tools that process these strings still work fine with this patch applied? Thanks, Bart.
Re: usercopy whitelist woe in scsi_sense_cache
On Thu, Apr 12, 2018 at 8:02 PM, Kees Cook wrote: > On Thu, Apr 12, 2018 at 3:47 PM, Kees Cook wrote: >> After fixing up some build issues in the middle of the 4.16 cycle, I >> get an unhelpful bisect result of commit 0a4b6e2f80aa ("Merge branch >> 'for-4.16/block'"). Instead of letting the test run longer, I'm going >> to switch to doing several shorter test boots per kernel and see if >> that helps. One more bisect coming... > > Okay, so I can confirm the bisect points at the _merge_ itself, not a > specific patch. I'm not sure how to proceed here. It looks like some > kind of interaction between separate trees? Jens, do you have > suggestions on how to track this down? Turning off HARDENED_USERCOPY and turning on KASAN, I see the same report: [ 38.274106] BUG: KASAN: slab-out-of-bounds in _copy_to_user+0x42/0x60 [ 38.274841] Read of size 22 at addr 8800122b8c4b by task smartctl/1064 [ 38.275630] [ 38.275818] CPU: 2 PID: 1064 Comm: smartctl Not tainted 4.17.0-rc1-ARCH+ #266 [ 38.276631] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 38.277690] Call Trace: [ 38.277988] dump_stack+0x71/0xab [ 38.278397] ? _copy_to_user+0x42/0x60 [ 38.278833] print_address_description+0x6a/0x270 [ 38.279368] ? _copy_to_user+0x42/0x60 [ 38.279800] kasan_report+0x243/0x360 [ 38.280221] _copy_to_user+0x42/0x60 [ 38.280635] sg_io+0x459/0x660 ... Though we get slightly more details (some we already knew): [ 38.301330] Allocated by task 329: [ 38.301734] kmem_cache_alloc_node+0xca/0x220 [ 38.302239] scsi_mq_init_request+0x64/0x130 [scsi_mod] [ 38.302821] blk_mq_alloc_rqs+0x2cf/0x370 [ 38.303265] blk_mq_sched_alloc_tags.isra.4+0x7d/0xb0 [ 38.303820] blk_mq_init_sched+0xf0/0x220 [ 38.304268] elevator_switch+0x17a/0x2c0 [ 38.304705] elv_iosched_store+0x173/0x220 [ 38.305171] queue_attr_store+0x72/0xb0 [ 38.305602] kernfs_fop_write+0x188/0x220 [ 38.306049] __vfs_write+0xb6/0x330 [ 38.306436] vfs_write+0xe9/0x240 [ 38.306804] ksys_write+0x98/0x110 [ 38.307181] do_syscall_64+0x6d/0x1d0 [ 38.307590] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 38.308142] [ 38.308316] Freed by task 0: [ 38.308652] (stack is not available) [ 38.309060] [ 38.309243] The buggy address belongs to the object at 8800122b8c00 [ 38.309243] which belongs to the cache scsi_sense_cache of size 96 [ 38.310625] The buggy address is located 75 bytes inside of [ 38.310625] 96-byte region [8800122b8c00, 8800122b8c60) -Kees -- Kees Cook Pixel Security
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On Mon, 2018-04-16 at 20:27 +, Bean Huo (beanhuo) wrote: > By the way, these patches are not to add new feature, they are just to > add print tag along with the other exist printed request parameters. Are you aware that there are two tag fields in struct request, namely "tag" and "internal_tag"? Are you aware that how these fields are used depends on whether or not a scheduler is attached to a request queue? Have you verified that the "tag" field contains a meaningful value for blk-mq for every blk-mq tracepoint? Thanks, Bart.
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
>>> This patch is not acceptable because it adds support for tag tracing >>> to the legacy block layer only. Any patch that adds a new feature to >>> the legacy block layer must also add it to blk-mq. >>> >> To be honest, I don't understand your point, can you give me more >explanation? > >The legacy block layer will be removed as soon as blk-mq provides all the >functionality of the legacy block layer and as soon as it performs at least as >well as the legacy block layer for all use cases. If new features are added to >the legacy block layer but not to blk-mq that prevents removal of the legacy >block layer. Hence the requirement I explained in my previous e-mail. > >Bart. Thanks for your information. I have several questions again. When the legacy block layer will be replaced by blk-mq? And "include/trece/event/block.h .. scsi.h" will also be changed? Do you have the related git rep or mail list about this topic? Maybe this is great big change, I am very interested in that. And want to have a look at. By the way, these patches are not to add new feature, they are just to add print tag along with the other exist Printed request parameters. The blk-mq is now still using "include/trace/evet/block.h" defined trace events. For example: void blk_mq_start_request(struct request *rq) { ... ... trace_block_rq_issue(q, rq); ... ... } Do you mean that this will also be removed/replaced by someone else? Bean Huo
Re: [PATCH 08/12] mmc: reduce use of block bounce buffers (fwd)
There is a duplicated test on line 360. julia -- Forwarded message -- Date: Mon, 16 Apr 2018 23:04:18 +0800 From: kbuild test robot To: kbu...@01.org Cc: Julia Lawall Subject: Re: [PATCH 08/12] mmc: reduce use of block bounce buffers CC: kbuild-...@01.org In-Reply-To: <20180416085032.7367-9-...@lst.de> References: <20180416085032.7367-9-...@lst.de> TO: Christoph Hellwig CC: io...@lists.linux-foundation.org, linux-a...@vger.kernel.org, linux-bl...@vger.kernel.org, linux-...@vger.kernel.org, linux-scsi@vger.kernel.org, net...@vger.kernel.org CC: linux-ker...@vger.kernel.org Hi Christoph, I love your patch! Perhaps something to improve: [auto build test WARNING on linus/master] [also build test WARNING on v4.17-rc1 next-20180416] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Christoph-Hellwig/iscsi_tcp-don-t-set-a-bounce-limit/20180416-172618 :: branch date: 6 hours ago :: commit date: 6 hours ago >> drivers/mmc/core/queue.c:360:5-29: duplicated argument to && or || # https://github.com/0day-ci/linux/commit/6620a69f0eea8e8b7586f08f721c95a336022497 git remote add linux-review https://github.com/0day-ci/linux git remote update linux-review git checkout 6620a69f0eea8e8b7586f08f721c95a336022497 vim +360 drivers/mmc/core/queue.c 81196976 Adrian Hunter 2017-11-29 350 c8b5fd03 Adrian Hunter 2017-09-22 351 static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card) c8b5fd03 Adrian Hunter 2017-09-22 352 { c8b5fd03 Adrian Hunter 2017-09-22 353 struct mmc_host *host = card->host; c8b5fd03 Adrian Hunter 2017-09-22 354 8b904b5b Bart Van Assche 2018-03-07 355 blk_queue_flag_set(QUEUE_FLAG_NONROT, mq->queue); 8b904b5b Bart Van Assche 2018-03-07 356 blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, mq->queue); c8b5fd03 Adrian Hunter 2017-09-22 357 if (mmc_can_erase(card)) c8b5fd03 Adrian Hunter 2017-09-22 358 mmc_queue_setup_discard(mq->queue, card); c8b5fd03 Adrian Hunter 2017-09-22 359 6620a69f Christoph Hellwig 2018-04-16 @360 if (!mmc_dev(host)->dma_mask || !mmc_dev(host)->dma_mask) 6620a69f Christoph Hellwig 2018-04-16 361 blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_HIGH); c8b5fd03 Adrian Hunter 2017-09-22 362 blk_queue_max_hw_sectors(mq->queue, c8b5fd03 Adrian Hunter 2017-09-22 363 min(host->max_blk_count, host->max_req_size / 512)); c8b5fd03 Adrian Hunter 2017-09-22 364 blk_queue_max_segments(mq->queue, host->max_segs); c8b5fd03 Adrian Hunter 2017-09-22 365 blk_queue_max_segment_size(mq->queue, host->max_seg_size); c8b5fd03 Adrian Hunter 2017-09-22 366 1e8e55b6 Adrian Hunter 2017-11-29 367 INIT_WORK(&mq->recovery_work, mmc_mq_recovery_handler); 81196976 Adrian Hunter 2017-11-29 368 INIT_WORK(&mq->complete_work, mmc_blk_mq_complete_work); 81196976 Adrian Hunter 2017-11-29 369 81196976 Adrian Hunter 2017-11-29 370 mutex_init(&mq->complete_lock); 81196976 Adrian Hunter 2017-11-29 371 81196976 Adrian Hunter 2017-11-29 372 init_waitqueue_head(&mq->wait); 81196976 Adrian Hunter 2017-11-29 373 } 81196976 Adrian Hunter 2017-11-29 374 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: Add udev-md-raid-safe-timeouts.rules
On Mon, 16 Apr 2018, Chris Murphy wrote: > Adding linux-usb@ and linux-scsi@ > (This email does contain the thread initiating email, but some replies > are on the other lists.) > > On Mon, Apr 16, 2018 at 5:43 AM, Austin S. Hemmelgarn > wrote: > > On 2018-04-15 21:04, Chris Murphy wrote: > >> > >> I just ran into this: > >> > >> https://github.com/neilbrown/mdadm/pull/32/commits/af1ddca7d5311dfc9ed60a5eb6497db1296f1bec > >> > >> This solution is inadequate, can it be made more generic? This isn't > >> an md specific problem, it affects Btrfs and LVM as well. And in fact > >> raid0, and even none raid setups. > >> > >> There is no good reason to prevent deep recovery, which is what > >> happens with the default command timer of 30 seconds, with this class > >> of drive. Basically that value is going to cause data loss for the > >> single device and also raid0 case, where the reset happens before deep > >> recovery has a chance. And even if deep recovery fails to return user > >> data, what we need to see is the proper error message: read error UNC, > >> rather than a link reset message which just obfuscates the problem. > > > > > > This has been discussed at least once here before (probably more times, hard > > to be sure since it usually comes up as a side discussion in an only > > marginally related thread). Last I knew, the consensus here was that it > > needs to be changed upstream in the kernel, not by adding a udev rule > > because while the value is technically system policy, the default policy is > > brain-dead for anything but the original disks it was i9ntended for (30 > > seconds works perfectly fine for actual SCSI devices because they behave > > sanely in the face of media errors, but it's horribly inadequate for ATA > > devices). > > > > To re-iterate what I've said before on the subject: > > > > For ATA drives it should probably be 150 seconds. That's 30 seconds beyond > > the typical amount of time most consumer drives will keep retrying a sector, > > so even if it goes the full time to try and recover a sector this shouldn't > > trigger. The only people this change should negatively impact are those who > > have failing drives which support SCT ERC and have it enabled, but aren't > > already adjusting this timeout. > > > > For physical SCSI devices, it should continue to be 30 seconds. SCSI disks > > are sensible here and don't waste your time trying to recover a sector. For > > PV-SCSI devices, it should probably be adjusted too, but I don't know what a > > reasonable value is. > > > > For USB devices it should probably be higher than 30 seconds, but again I > > have no idea what a reasonable value is. > > I don't know how all of this is designed but it seems like there's > only one location for the command timer, and the SCSI driver owns it, > and then everyone else (ATA and USB and for all I know SAN) are on top > of that and lack any ability to have separate timeouts. As far as mass-storage is concerned, USB is merely a transport. It doesn't impose any timeout rules; the appropriate timeout value is whatever the device at the end of the USB link needs. Thus, a SCSI drive connected over USB could use a 30-second timeout, an ATA drive could use 150 seconds, and so on. Unfortunately, the only way to tell what sort of drive you've got is by looking at the Vendor/Product IDs or other information provided by the drive itself. You can't tell anything just from knowing what sort of bus it's on. Alan Stern > The nice thing about the udev rule is that it tests for SCT ERC before > making a change. There certainly are enterprise and almost enterprise > "NAS" SATA drives that have short SCT ERC times enabled out of the box > - and the udev method makes them immune to the change.
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On 04/16/18 10:05, Bean Huo (beanhuo) wrote: On Mon, 2018-04-16 at 09:41 -0700, Rajat Jain wrote: On Mon, Apr 16, 2018 at 8:28 AM, Steven Rostedt wrote: On Mon, 16 Apr 2018 14:31:49 + "Bean Huo (beanhuo)" wrote: Print the request tag along with other information while tracing a command. Signed-off-by: Bean Huo Acked-by: Rajat Jain This patch is not acceptable because it adds support for tag tracing to the legacy block layer only. Any patch that adds a new feature to the legacy block layer must also add it to blk-mq. To be honest, I don't understand your point, can you give me more explanation? The legacy block layer will be removed as soon as blk-mq provides all the functionality of the legacy block layer and as soon as it performs at least as well as the legacy block layer for all use cases. If new features are added to the legacy block layer but not to blk-mq that prevents removal of the legacy block layer. Hence the requirement I explained in my previous e-mail. Bart.
Re: Add udev-md-raid-safe-timeouts.rules
Adding linux-usb@ and linux-scsi@ (This email does contain the thread initiating email, but some replies are on the other lists.) On Mon, Apr 16, 2018 at 5:43 AM, Austin S. Hemmelgarn wrote: > On 2018-04-15 21:04, Chris Murphy wrote: >> >> I just ran into this: >> >> https://github.com/neilbrown/mdadm/pull/32/commits/af1ddca7d5311dfc9ed60a5eb6497db1296f1bec >> >> This solution is inadequate, can it be made more generic? This isn't >> an md specific problem, it affects Btrfs and LVM as well. And in fact >> raid0, and even none raid setups. >> >> There is no good reason to prevent deep recovery, which is what >> happens with the default command timer of 30 seconds, with this class >> of drive. Basically that value is going to cause data loss for the >> single device and also raid0 case, where the reset happens before deep >> recovery has a chance. And even if deep recovery fails to return user >> data, what we need to see is the proper error message: read error UNC, >> rather than a link reset message which just obfuscates the problem. > > > This has been discussed at least once here before (probably more times, hard > to be sure since it usually comes up as a side discussion in an only > marginally related thread). Last I knew, the consensus here was that it > needs to be changed upstream in the kernel, not by adding a udev rule > because while the value is technically system policy, the default policy is > brain-dead for anything but the original disks it was i9ntended for (30 > seconds works perfectly fine for actual SCSI devices because they behave > sanely in the face of media errors, but it's horribly inadequate for ATA > devices). > > To re-iterate what I've said before on the subject: > > For ATA drives it should probably be 150 seconds. That's 30 seconds beyond > the typical amount of time most consumer drives will keep retrying a sector, > so even if it goes the full time to try and recover a sector this shouldn't > trigger. The only people this change should negatively impact are those who > have failing drives which support SCT ERC and have it enabled, but aren't > already adjusting this timeout. > > For physical SCSI devices, it should continue to be 30 seconds. SCSI disks > are sensible here and don't waste your time trying to recover a sector. For > PV-SCSI devices, it should probably be adjusted too, but I don't know what a > reasonable value is. > > For USB devices it should probably be higher than 30 seconds, but again I > have no idea what a reasonable value is. I don't know how all of this is designed but it seems like there's only one location for the command timer, and the SCSI driver owns it, and then everyone else (ATA and USB and for all I know SAN) are on top of that and lack any ability to have separate timeouts. The nice thing about the udev rule is that it tests for SCT ERC before making a change. There certainly are enterprise and almost enterprise "NAS" SATA drives that have short SCT ERC times enabled out of the box - and the udev method makes them immune to the change. -- Chris Murphy
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
Hi, Bart >On Mon, 2018-04-16 at 09:41 -0700, Rajat Jain wrote: >> On Mon, Apr 16, 2018 at 8:28 AM, Steven Rostedt >wrote: >> > On Mon, 16 Apr 2018 14:31:49 + >> > "Bean Huo (beanhuo)" wrote: >> > >> > > Print the request tag along with other information while tracing a >> > > command. >> > > >> > > Signed-off-by: Bean Huo >> >> Acked-by: Rajat Jain > >This patch is not acceptable because it adds support for tag tracing to the >legacy block layer only. Any patch that adds a new feature to the legacy block >layer must also add it to blk-mq. > To be honest, I don't understand your point, can you give me more explanation? >Bart. > >
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On Mon, 2018-04-16 at 09:41 -0700, Rajat Jain wrote: > On Mon, Apr 16, 2018 at 8:28 AM, Steven Rostedt wrote: > > On Mon, 16 Apr 2018 14:31:49 + > > "Bean Huo (beanhuo)" wrote: > > > > > Print the request tag along with other information > > > while tracing a command. > > > > > > Signed-off-by: Bean Huo > > Acked-by: Rajat Jain This patch is not acceptable because it adds support for tag tracing to the legacy block layer only. Any patch that adds a new feature to the legacy block layer must also add it to blk-mq. Bart.
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On Mon, Apr 16, 2018 at 8:28 AM, Steven Rostedt wrote: > On Mon, 16 Apr 2018 14:31:49 + > "Bean Huo (beanhuo)" wrote: > >> Print the request tag along with other information >> while tracing a command. >> >> Signed-off-by: Bean Huo Acked-by: Rajat Jain >> --- > > I don't see any issue with the tracing part. > > Acked-by: Steven Rostedt (VMware) > > Others need to check the content. > > -- Steve
Re: [RESEND PATCH v1 2/2] trace: events: block: Add tag in block trace events
On Mon, 16 Apr 2018 14:33:29 + "Bean Huo (beanhuo)" wrote: > Print the request tag along with other information in block trace events > when tracing request , and unplug type (Sync / Async). > > Signed-off-by: Bean Huo I don't see any issue with the tracing part. Acked-by: Steven Rostedt (VMware) Others need to check the content. -- Steve > --- > include/trace/events/block.h | 36 +--- > 1 file changed, 25 insertions(+), 11 deletions(-) >
Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
On Mon, 16 Apr 2018 14:31:49 + "Bean Huo (beanhuo)" wrote: > Print the request tag along with other information > while tracing a command. > > Signed-off-by: Bean Huo > --- I don't see any issue with the tracing part. Acked-by: Steven Rostedt (VMware) Others need to check the content. -- Steve
[RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events
Print the request tag along with other information while tracing a command. Signed-off-by: Bean Huo --- include/trace/events/scsi.h | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/include/trace/events/scsi.h b/include/trace/events/scsi.h index f624969..a4ada90 100644 --- a/include/trace/events/scsi.h +++ b/include/trace/events/scsi.h @@ -210,6 +210,7 @@ TRACE_EVENT(scsi_dispatch_cmd_start, __field( unsigned int, lun ) __field( unsigned int, opcode ) __field( unsigned int, cmd_len ) + __field( int, tag ) __field( unsigned int, data_sglen ) __field( unsigned int, prot_sglen ) __field( unsigned char, prot_op ) @@ -223,6 +224,7 @@ TRACE_EVENT(scsi_dispatch_cmd_start, __entry->lun= cmd->device->lun; __entry->opcode = cmd->cmnd[0]; __entry->cmd_len= cmd->cmd_len; + __entry->tag= cmd->request->tag; __entry->data_sglen = scsi_sg_count(cmd); __entry->prot_sglen = scsi_prot_sg_count(cmd); __entry->prot_op= scsi_get_prot_op(cmd); @@ -230,10 +232,10 @@ TRACE_EVENT(scsi_dispatch_cmd_start, ), TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u prot_sgl=%u" \ - " prot_op=%s cmnd=(%s %s raw=%s)", + " prot_op=%s tag=%d cmnd=(%s %s raw=%s)", __entry->host_no, __entry->channel, __entry->id, __entry->lun, __entry->data_sglen, __entry->prot_sglen, - show_prot_op_name(__entry->prot_op), + show_prot_op_name(__entry->prot_op), __entry->tag, show_opcode_name(__entry->opcode), __parse_cdb(__get_dynamic_array(cmnd), __entry->cmd_len), __print_hex(__get_dynamic_array(cmnd), __entry->cmd_len)) @@ -253,6 +255,7 @@ TRACE_EVENT(scsi_dispatch_cmd_error, __field( int, rtn ) __field( unsigned int, opcode ) __field( unsigned int, cmd_len ) + __field( int, tag ) __field( unsigned int, data_sglen ) __field( unsigned int, prot_sglen ) __field( unsigned char, prot_op ) @@ -267,6 +270,7 @@ TRACE_EVENT(scsi_dispatch_cmd_error, __entry->rtn= rtn; __entry->opcode = cmd->cmnd[0]; __entry->cmd_len= cmd->cmd_len; + __entry->tag= cmd->request->tag; __entry->data_sglen = scsi_sg_count(cmd); __entry->prot_sglen = scsi_prot_sg_count(cmd); __entry->prot_op= scsi_get_prot_op(cmd); @@ -274,10 +278,10 @@ TRACE_EVENT(scsi_dispatch_cmd_error, ), TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u prot_sgl=%u" \ - " prot_op=%s cmnd=(%s %s raw=%s) rtn=%d", + " prot_op=%s tag=%d cmnd=(%s %s raw=%s) rtn=%d", __entry->host_no, __entry->channel, __entry->id, __entry->lun, __entry->data_sglen, __entry->prot_sglen, - show_prot_op_name(__entry->prot_op), + show_prot_op_name(__entry->prot_op), __entry->tag, show_opcode_name(__entry->opcode), __parse_cdb(__get_dynamic_array(cmnd), __entry->cmd_len), __print_hex(__get_dynamic_array(cmnd), __entry->cmd_len), @@ -298,6 +302,7 @@ DECLARE_EVENT_CLASS(scsi_cmd_done_timeout_template, __field( int, result ) __field( unsigned int, opcode ) __field( unsigned int, cmd_len ) + __field( int, tag ) __field( unsigned int, data_sglen ) __field( unsigned int, prot_sglen ) __field( unsigned char, prot_op ) @@ -312,6 +317,7 @@ DECLARE_EVENT_CLASS(scsi_cmd_done_timeout_template, __entry->result = cmd->result; __entry->opcode = cmd->cmnd[0]; __entry->cmd_len= cmd->cmd_len; + __entry->tag= cmd->request->tag; __entry->data_sglen = scsi_sg_count(cmd); __entry->prot_sglen = scsi_prot_sg_count(cmd); __entry->prot_op= scsi_get_prot_op(cmd); @@ -319,11 +325,11 @@ DECLARE_EVENT_CLASS(scsi_cmd_done_timeout_template, ), TP_printk("host_no=%u channel=%u id=%u lun=%u data_sgl=%u " \ - "prot_sgl=%u prot_op=%s cmnd=(%s %s raw=%s) result=(driver=" \ - "%s host=%s message=%s status=%s)", + "prot_sgl=%u prot_op=%s tag=%d cmnd=(%s %s raw=%s) " \ + "result=(driver=%s host=%s message=%s status=%s)",
[RESEND PATCH v1 0/2] Print the request tag in Block/SCSI trace events
These patches are to add the printout of the request tag in Block/SCSI trace events when tracing one request or command, this is very useful for tracing the task running status in the storage device which supports multiple command queue. As for the first patch " Add tag in SCSI trace events", copied from Rajat Jain [1]. I am just re-sending here. [1]https://patchwork.kernel.org/patch/6399661/ -- Resend since patches need to go through the block and SCSI subsystem, and need related maintainer's confirmation. Bean Huo (2): trace: events: scsi: Add tag in SCSI trace events trace: events: block: Add tag in block trace events include/trace/events/block.h | 36 +--- include/trace/events/scsi.h | 20 +--- 2 files changed, 38 insertions(+), 18 deletions(-) -- 2.7.4
[RESEND PATCH v1 2/2] trace: events: block: Add tag in block trace events
Print the request tag along with other information in block trace events when tracing request , and unplug type (Sync / Async). Signed-off-by: Bean Huo --- include/trace/events/block.h | 36 +--- 1 file changed, 25 insertions(+), 11 deletions(-) diff --git a/include/trace/events/block.h b/include/trace/events/block.h index 81b43f5..f8c0b9e 100644 --- a/include/trace/events/block.h +++ b/include/trace/events/block.h @@ -81,6 +81,7 @@ TRACE_EVENT(block_rq_requeue, __field( dev_t,dev ) __field( sector_t, sector ) __field( unsigned int, nr_sector ) + __field( int, tag ) __array( char, rwbs, RWBS_LEN) __dynamic_array( char, cmd,1 ) ), @@ -89,16 +90,17 @@ TRACE_EVENT(block_rq_requeue, __entry->dev = rq->rq_disk ? disk_devt(rq->rq_disk) : 0; __entry->sector= blk_rq_trace_sector(rq); __entry->nr_sector = blk_rq_trace_nr_sectors(rq); + __entry->tag = rq->tag; blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, blk_rq_bytes(rq)); __get_str(cmd)[0] = '\0'; ), - TP_printk("%d,%d %s (%s) %llu + %u [%d]", + TP_printk("%d,%d %s (%s) %llu + %u tag=%d [%d]", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs, __get_str(cmd), (unsigned long long)__entry->sector, - __entry->nr_sector, 0) + __entry->nr_sector, __entry->tag, 0) ); /** @@ -124,6 +126,7 @@ TRACE_EVENT(block_rq_complete, __field( sector_t, sector ) __field( unsigned int, nr_sector ) __field( int, error ) + __field( int, tag ) __array( char, rwbs, RWBS_LEN) __dynamic_array( char, cmd,1 ) ), @@ -133,16 +136,17 @@ TRACE_EVENT(block_rq_complete, __entry->sector= blk_rq_pos(rq); __entry->nr_sector = nr_bytes >> 9; __entry->error = error; + __entry->tag = rq->tag; blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, nr_bytes); __get_str(cmd)[0] = '\0'; ), - TP_printk("%d,%d %s (%s) %llu + %u [%d]", + TP_printk("%d,%d %s (%s) %llu + %u tag=%d [%d]", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs, __get_str(cmd), (unsigned long long)__entry->sector, - __entry->nr_sector, __entry->error) + __entry->nr_sector, __entry->tag, __entry->error) ); DECLARE_EVENT_CLASS(block_rq, @@ -156,6 +160,7 @@ DECLARE_EVENT_CLASS(block_rq, __field( sector_t, sector ) __field( unsigned int, nr_sector ) __field( unsigned int, bytes ) + __field( int, tag ) __array( char, rwbs, RWBS_LEN) __array( char, comm, TASK_COMM_LEN ) __dynamic_array( char, cmd,1 ) @@ -166,17 +171,18 @@ DECLARE_EVENT_CLASS(block_rq, __entry->sector= blk_rq_trace_sector(rq); __entry->nr_sector = blk_rq_trace_nr_sectors(rq); __entry->bytes = blk_rq_bytes(rq); + __entry->tag = rq->tag; blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, blk_rq_bytes(rq)); __get_str(cmd)[0] = '\0'; memcpy(__entry->comm, current->comm, TASK_COMM_LEN); ), - TP_printk("%d,%d %s %u (%s) %llu + %u [%s]", + TP_printk("%d,%d %s %u (%s) %llu + %u tag=%d [%s]", MAJOR(__entry->dev), MINOR(__entry->dev), __entry->rwbs, __entry->bytes, __get_str(cmd), (unsigned long long)__entry->sector, - __entry->nr_sector, __entry->comm) + __entry->nr_sector, __entry->tag, __entry->comm) ); /** @@ -297,6 +303,7 @@ DECLARE_EVENT_CLASS(block_bio_merge, __field( dev_t, dev ) __field( sector_t, sector ) __field( unsigned int, nr_sector ) + __field( int, tag ) __array( char, rwbs, RWBS_LEN) __array( char, comm, TASK_COMM_LEN ) ), @@ -305,14 +312,15 @@ DECLARE_EVENT_CLASS(block_bio_merge, __entry->dev= bio_dev(bio);
[PATCHv3] tcmu: allow userspace to reset netlink
From: Xiubo Li This patch adds 1 tcmu attr to reset and complete all the blocked netlink waiting threads. It's used when the userspace daemon like tcmu-runner has crashed or forced to shutdown just before the netlink requests be replied to the kernel, then the netlink requeting threads will get stuck forever. We must reboot the machine to recover from it and by this the rebootng is not a must then. The Call Trace will be something like: == INFO: task targetctl:22655 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. targetctl D 880169718fd0 0 22655 17249 0x0080 Call Trace: [] schedule+0x29/0x70 [] schedule_timeout+0x239/0x2c0 [] ? skb_release_data+0xf2/0x140 [] wait_for_completion+0xfd/0x140 [] ? wake_up_state+0x20/0x20 [] tcmu_netlink_event+0x26a/0x3a0 [target_core_user] [] ? wake_up_atomic_t+0x30/0x30 [] tcmu_configure_device+0x236/0x350 [target_core_user] [] target_configure_device+0x3f/0x3b0 [target_core_mod] [] target_core_store_dev_enable+0x2c/0x60 [target_core_mod] [] target_core_dev_store+0x24/0x40 [target_core_mod] [] configfs_write_file+0xc4/0x130 [] vfs_write+0xbd/0x1e0 [] SyS_write+0x7f/0xe0 [] system_call_fastpath+0x16/0x1b == Signed-off-by: Xiubo Li --- Changes since v1(suggested by Mike Christie): v2: - Makes the reset per device. v3: - Remove nl_cmd->complete, use status instead - Fix lock issue - Check if a nl command is even waiting before trying to wake up drivers/target/target_core_user.c | 60 +-- 1 file changed, 51 insertions(+), 9 deletions(-) diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c index 4ad89ea..a831f5b7 100644 --- a/drivers/target/target_core_user.c +++ b/drivers/target/target_core_user.c @@ -104,8 +104,8 @@ struct tcmu_hba { #define TCMU_CONFIG_LEN 256 struct tcmu_nl_cmd { - /* wake up thread waiting for reply */ - struct completion complete; + unsigned int waiter; + int cmd; int status; }; @@ -159,9 +159,12 @@ struct tcmu_dev { spinlock_t nl_cmd_lock; struct tcmu_nl_cmd curr_nl_cmd; - /* wake up threads waiting on curr_nl_cmd */ + /* wake up threads waiting on nl_cmd_wq */ wait_queue_head_t nl_cmd_wq; + /* complete thread waiting complete_wq */ + wait_queue_head_t complete_wq; + char dev_config[TCMU_CONFIG_LEN]; int nl_reply_supported; @@ -307,11 +310,13 @@ static int tcmu_genl_cmd_done(struct genl_info *info, int completed_cmd) nl_cmd->status = rc; } - spin_unlock(&udev->nl_cmd_lock); if (!is_removed) target_undepend_item(&dev->dev_group.cg_item); - if (!ret) - complete(&nl_cmd->complete); + if (!ret && nl_cmd->waiter) { + nl_cmd->waiter--; + wake_up(&udev->complete_wq); + } + spin_unlock(&udev->nl_cmd_lock); return ret; } @@ -1258,6 +1263,7 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name) timer_setup(&udev->cmd_timer, tcmu_cmd_timedout, 0); init_waitqueue_head(&udev->nl_cmd_wq); + init_waitqueue_head(&udev->complete_wq); spin_lock_init(&udev->nl_cmd_lock); INIT_RADIX_TREE(&udev->data_blocks, GFP_KERNEL); @@ -1554,8 +1560,8 @@ static void tcmu_init_genl_cmd_reply(struct tcmu_dev *udev, int cmd) } memset(nl_cmd, 0, sizeof(*nl_cmd)); + nl_cmd->status = 1; nl_cmd->cmd = cmd; - init_completion(&nl_cmd->complete); spin_unlock(&udev->nl_cmd_lock); } @@ -1572,13 +1578,16 @@ static int tcmu_wait_genl_cmd_reply(struct tcmu_dev *udev) if (udev->nl_reply_supported <= 0) return 0; + spin_lock(&udev->nl_cmd_lock); + nl_cmd->waiter++; + spin_unlock(&udev->nl_cmd_lock); + pr_debug("sleeping for nl reply\n"); - wait_for_completion(&nl_cmd->complete); + wait_event(udev->complete_wq, nl_cmd->status != 1); spin_lock(&udev->nl_cmd_lock); nl_cmd->cmd = TCMU_CMD_UNSPEC; ret = nl_cmd->status; - nl_cmd->status = 0; spin_unlock(&udev->nl_cmd_lock); wake_up_all(&udev->nl_cmd_wq); @@ -2323,6 +2332,38 @@ static ssize_t tcmu_block_dev_store(struct config_item *item, const char *page, } CONFIGFS_ATTR(tcmu_, block_dev); +static ssize_t tcmu_reset_netlink_store(struct config_item *item, const char *page, + size_t count) +{ + struct se_device *se_dev = container_of(to_config_group(item), + struct se_device, + dev_action_group); + struct tcmu_dev *udev = TCMU_DEV(se_dev); + struct tcmu_nl_cmd *nl_cmd = &udev->curr_nl_cmd; + u8 val; + int ret; + + ret = kstrtou
Re: [PATCH 08/12] mmc: reduce use of block bounce buffers
On 16/04/18 09:50, Christoph Hellwig wrote: We can rely on the dma-mapping code to handle any DMA limits that is bigger than the ISA DMA mask for us (either using an iommu or swiotlb), so remove setting the block layer bounce limit for anything but bouncing for highmem pages. Signed-off-by: Christoph Hellwig --- drivers/mmc/core/queue.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c index 56e9a803db21..60a02a763d01 100644 --- a/drivers/mmc/core/queue.c +++ b/drivers/mmc/core/queue.c @@ -351,17 +351,14 @@ static const struct blk_mq_ops mmc_mq_ops = { static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card) { struct mmc_host *host = card->host; - u64 limit = BLK_BOUNCE_HIGH; - - if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask) - limit = (u64)dma_max_pfn(mmc_dev(host)) << PAGE_SHIFT; blk_queue_flag_set(QUEUE_FLAG_NONROT, mq->queue); blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, mq->queue); if (mmc_can_erase(card)) mmc_queue_setup_discard(mq->queue, card); - blk_queue_bounce_limit(mq->queue, limit); + if (!mmc_dev(host)->dma_mask || !mmc_dev(host)->dma_mask) I'm almost surprised that GCC doesn't warn about "x || x", but nevertheless I think you've lost a "*" here... Robin. + blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_HIGH); blk_queue_max_hw_sectors(mq->queue, min(host->max_blk_count, host->max_req_size / 512)); blk_queue_max_segments(mq->queue, host->max_segs);
Re: [PATCH v9] scsi: new zorro_esp.c for Amiga Zorro NCR53C9x boards
On 04/16/2018 11:26 AM, Christian T. Steigies wrote: On Thu, Apr 12, 2018 at 01:53:26PM +1200, Michael Schmitz wrote: From: Michael Schmitz New combined SCSI driver for all ESP based Zorro SCSI boards for m68k Amiga. Code largely based on board specific parts of the old drivers (blz1230.c, blz2060.c, cyberstorm.c, cyberstormII.c, fastlane.c which were removed after the 2.6 kernel series for lack of maintenance) with contributions by Tuomas Vainikka (TCQ bug tests and workaround) and Finn Thain (TCQ bugfix by use of PIO in extended message in transfer). New Kconfig option and Makefile entries for new Amiga Zorro ESP SCSI driver included in this patch. Use DMA transfers wherever possible, with board-specific DMA set-up functions copied from the old driver code. Three byte reselection messages do appear to cause DMA timeouts. So wire up a PIO transfer routine for these instead. esp_reselect_with_tag explicitly sets esp->cmd_block_dma as target address for the message bytes but PIO requires a virtual address. Substiute kernel virtual address esp->cmd_block in PIO transfer call if DMA address is esp->cmd_block_dma and phase is message in. PIO code taken from mac_esp.c where the reselection timeout issue was debugged and fixed first, with minor macro and function rename. Signed-off-by: Michael Schmitz Reviewed-by: Finn Thain Reviewed-by: Christoph Hellwig Tested-by: Christian T. Steigies Also here on an Amiga 4000 with Cyberstorm I: Tested-by: John Paul Adrian Glaubitz -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaub...@debian.org `. `' Freie Universitaet Berlin - glaub...@physik.fu-berlin.de `-GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: [PATCH v9] scsi: new zorro_esp.c for Amiga Zorro NCR53C9x boards
On Thu, Apr 12, 2018 at 01:53:26PM +1200, Michael Schmitz wrote: > From: Michael Schmitz > > New combined SCSI driver for all ESP based Zorro SCSI boards for > m68k Amiga. > > Code largely based on board specific parts of the old drivers (blz1230.c, > blz2060.c, cyberstorm.c, cyberstormII.c, fastlane.c which were removed > after the 2.6 kernel series for lack of maintenance) with contributions > by Tuomas Vainikka (TCQ bug tests and workaround) and Finn Thain (TCQ > bugfix by use of PIO in extended message in transfer). > > New Kconfig option and Makefile entries for new Amiga Zorro ESP SCSI > driver included in this patch. > > Use DMA transfers wherever possible, with board-specific DMA set-up > functions copied from the old driver code. Three byte reselection messages > do appear to cause DMA timeouts. So wire up a PIO transfer routine for > these instead. esp_reselect_with_tag explicitly sets esp->cmd_block_dma as > target address for the message bytes but PIO requires a virtual address. > Substiute kernel virtual address esp->cmd_block in PIO transfer call if > DMA address is esp->cmd_block_dma and phase is message in. > > PIO code taken from mac_esp.c where the reselection timeout issue was > debugged and fixed first, with minor macro and function rename. > > Signed-off-by: Michael Schmitz > Reviewed-by: Finn Thain > Reviewed-by: Christoph Hellwig Tested-by: Christian T. Steigies Christian
[PATCH] target: fix crash with iscsi target and dvd
When the current page can't be added to bio, one new bio should be created for adding this page again, instead of ignoring this page. This patch fixes kernel crash with iscsi target and dvd, as reported by Wakko. Cc: Wakko Warner Cc: Bart Van Assche Cc: target-de...@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: "Nicholas A. Bellinger" Cc: Christoph Hellwig Fixes: 84c8590646d5b35804 ("target: avoid accessing .bi_vcnt directly") Signed-off-by: Ming Lei --- drivers/target/target_core_pscsi.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c index 0d99b242e82e..6cb933ecc084 100644 --- a/drivers/target/target_core_pscsi.c +++ b/drivers/target/target_core_pscsi.c @@ -890,6 +890,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, bytes = min(bytes, data_len); if (!bio) { +new_bio: nr_vecs = min_t(int, BIO_MAX_PAGES, nr_pages); nr_pages -= nr_vecs; /* @@ -931,6 +932,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, * be allocated with pscsi_get_bio() above. */ bio = NULL; + goto new_bio; } data_len -= bytes; -- 2.9.5
Re: [PATCH v2] block: ratelimite pr_err on IO path
On Mon, Apr 16, 2018 at 1:46 PM, Jinpu Wang wrote: > On Fri, Apr 13, 2018 at 6:59 PM, Martin K. Petersen > wrote: >> >> Jinpu, >> >> [CC:ed the mpt3sas maintainers] >> >> The ratelimit patch is just an attempt to treat the symptom, not the >> cause. > Agree. If we can fix the root cause, it will be great. >> >>> Thanks for asking, we updated mpt3sas driver which enables DIX support >>> (prot_mask=0x7f), all disks are SATA SSDs, no DIF support. >>> After reboot, kernel reports the IO errors from all the drives behind >>> HBA, seems for almost every read IO, which turns the system unusable: >>> [ 13.079375] sda: ref tag error at location 0 (rcvd 143196159) >>> [ 13.079989] sda: ref tag error at location 937702912 (rcvd 143196159) >>> [ 13.080233] sda: ref tag error at location 937703072 (rcvd 143196159) >>> [ 13.080407] sda: ref tag error at location 0 (rcvd 143196159) >>> [ 13.080594] sda: ref tag error at location 8 (rcvd 143196159) >> >> That sounds like a bug in the mpt3sas driver or firmware. I guess the >> HBA could conceivably be operating a SATA device as DIX Type 0 and strip >> the PI on the drive side. But that doesn't seem to be a particularly >> useful mode of operation. >> >> Jinpu: Which firmware are you running? Also, please send us the output >> of: >> >> sg_readcap -l /dev/sda >> sg_inq -x /dev/sda >> sg_vpd /dev/sda >> > Disks are INTEL SSDSC2BX48, directly attached to HBA. > LSISAS3008: FWVersion(13.00.00.00), ChipRevision(0x02), > BiosVersion(08.11.00.00) > mpt3sas_cm2: Protocol=(Initiator,Target), > Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set > Full,NCQ) > > jwang@x:~$ sudo sg_vpd /dev/sdz > Supported VPD pages VPD page: > Supported VPD pages [sv] > Unit serial number [sn] > Device identification [di] > Mode page policy [mpp] > ATA information (SAT) [ai] > Block limits (SBC) [bl] > Block device characteristics (SBC) [bdc] > Logical block provisioning (SBC) [lbpv] > jwang@x:~$ sudo sg_inq -x /dev/sdz > VPD INQUIRY: extended INQUIRY data page > inquiry: field in cdb illegal (page not supported) > jwang@x:~$ sudo sg_readcap -l /dev/sdz > Read Capacity results: >Protection: prot_en=0, p_type=0, p_i_exponent=0 >Logical block provisioning: lbpme=1, lbprz=1 >Last logical block address=937703087 (0x37e436af), Number of > logical blocks=937703088 >Logical block length=512 bytes >Logical blocks per physical block exponent=3 [so physical block > length=4096 bytes] >Lowest aligned logical block address=0 > Hence: >Device size: 480103981056 bytes, 457862.8 MiB, 480.10 GB > > >> Broadcom: How is DIX supposed to work for SATA drives behind an mpt3sas >> controller? [Sreekanth] Current Upstream mpt3sas driver doesn't have DIX support capabilities, it supports only DIF feature. Thanks, Sreekanth >> >> -- >> Martin K. Petersen Oracle Linux Engineering > > > Thanks! > > -- > Jack Wang > Linux Kernel Developer > > ProfitBricks GmbH > Greifswalder Str. 207 > D - 10405 Berlin
[PATCH 01/12] iscsi_tcp: don't set a bounce limit
The default already is to never bounce, so the call is a no-op. Signed-off-by: Christoph Hellwig --- drivers/scsi/iscsi_tcp.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c index 2ba4b68fdb73..b025a0b74341 100644 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -962,7 +962,6 @@ static int iscsi_sw_tcp_slave_configure(struct scsi_device *sdev) if (conn->datadgst_en) sdev->request_queue->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES; - blk_queue_bounce_limit(sdev->request_queue, BLK_BOUNCE_ANY); blk_queue_dma_alignment(sdev->request_queue, 0); return 0; } -- 2.17.0
[PATCH 02/12] storsvc: don't set a bounce limit
The default already is to never bounce, so the call is a no-op. Signed-off-by: Christoph Hellwig --- drivers/scsi/storvsc_drv.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index 8c51d628b52e..5f2d177c3bd9 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1382,9 +1382,6 @@ static int storvsc_device_alloc(struct scsi_device *sdevice) static int storvsc_device_configure(struct scsi_device *sdevice) { - - blk_queue_bounce_limit(sdevice->request_queue, BLK_BOUNCE_ANY); - blk_queue_rq_timeout(sdevice->request_queue, (storvsc_timeout * HZ)); /* Ensure there are no gaps in presented sgls */ -- 2.17.0
remove PCI_DMA_BUS_IS_PHYS
Hi all, this series tries to get rid of the global and PCI_DMA_BUS_IS_PHYS flag, which causes the block layer and networking code to bounce buffer memory above the dma mask in some cases. It is a leftover from i386 + highmem days and is obsolete now that we have swiotlb or iommus so that the dma ops implementations can always (well minus the ISA DMA case which will require further attention) handle memory passed to them.
[PATCH 03/12] mtip32xx: don't use block layer bounce buffers
mtip32xx just sets the block bounce limit to the dma mask, which means that the iommu or swiotlb already take care of the bounce buffering, and the block bouncing can be removed. Signed-off-by: Christoph Hellwig --- drivers/block/mtip32xx/mtip32xx.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c index 769c551e3d71..b03bb27dcc58 100644 --- a/drivers/block/mtip32xx/mtip32xx.c +++ b/drivers/block/mtip32xx/mtip32xx.c @@ -3862,7 +3862,6 @@ static int mtip_block_initialize(struct driver_data *dd) blk_queue_max_hw_sectors(dd->queue, 0x); blk_queue_max_segment_size(dd->queue, 0x40); blk_queue_io_min(dd->queue, 4096); - blk_queue_bounce_limit(dd->queue, dd->pdev->dma_mask); /* Signal trim support */ if (dd->trim_supp == true) { -- 2.17.0
[PATCH 04/12] DAC960: don't use block layer bounce buffers
DAC960 just sets the block bounce limit to the dma mask, which means that the iommu or swiotlb already take care of the bounce buffering, and the block bouncing can be removed. Signed-off-by: Christoph Hellwig --- drivers/block/DAC960.c | 9 ++--- drivers/block/DAC960.h | 1 - 2 files changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/block/DAC960.c b/drivers/block/DAC960.c index f781eff7d23e..c9ba48519d0f 100644 --- a/drivers/block/DAC960.c +++ b/drivers/block/DAC960.c @@ -1179,7 +1179,6 @@ static bool DAC960_V1_EnableMemoryMailboxInterface(DAC960_Controller_T if (pci_set_dma_mask(Controller->PCIDevice, DMA_BIT_MASK(32))) return DAC960_Failure(Controller, "DMA mask out of range"); - Controller->BounceBufferLimit = DMA_BIT_MASK(32); if ((hw_type == DAC960_PD_Controller) || (hw_type == DAC960_P_Controller)) { CommandMailboxesSize = 0; @@ -1380,11 +1379,8 @@ static bool DAC960_V2_EnableMemoryMailboxInterface(DAC960_Controller_T dma_addr_t CommandMailboxDMA; DAC960_V2_CommandStatus_T CommandStatus; - if (!pci_set_dma_mask(Controller->PCIDevice, DMA_BIT_MASK(64))) - Controller->BounceBufferLimit = DMA_BIT_MASK(64); - else if (!pci_set_dma_mask(Controller->PCIDevice, DMA_BIT_MASK(32))) - Controller->BounceBufferLimit = DMA_BIT_MASK(32); - else + if (pci_set_dma_mask(Controller->PCIDevice, DMA_BIT_MASK(64)) && + pci_set_dma_mask(Controller->PCIDevice, DMA_BIT_MASK(32))) return DAC960_Failure(Controller, "DMA mask out of range"); /* This is a temporary dma mapping, used only in the scope of this function */ @@ -2540,7 +2536,6 @@ static bool DAC960_RegisterBlockDevice(DAC960_Controller_T *Controller) continue; } Controller->RequestQueue[n] = RequestQueue; - blk_queue_bounce_limit(RequestQueue, Controller->BounceBufferLimit); RequestQueue->queuedata = Controller; blk_queue_max_segments(RequestQueue, Controller->DriverScatterGatherLimit); blk_queue_max_hw_sectors(RequestQueue, Controller->MaxBlocksPerCommand); diff --git a/drivers/block/DAC960.h b/drivers/block/DAC960.h index 21aff470d268..1439e651928b 100644 --- a/drivers/block/DAC960.h +++ b/drivers/block/DAC960.h @@ -2295,7 +2295,6 @@ typedef struct DAC960_Controller unsigned short MaxBlocksPerCommand; unsigned short ControllerScatterGatherLimit; unsigned short DriverScatterGatherLimit; - u64 BounceBufferLimit; unsigned int CombinedStatusBufferLength; unsigned int InitialStatusLength; unsigned int CurrentStatusLength; -- 2.17.0
[PATCH 05/12] sata_nv: don't use block layer bounce buffers
sata_nv sets the block bounce limit to the reduce dma mask for ATAPI devices, which means that the iommu or swiotlb already take care of the bounce buffering, and the block bouncing can be removed. Signed-off-by: Christoph Hellwig --- drivers/ata/sata_nv.c | 62 +-- 1 file changed, 24 insertions(+), 38 deletions(-) diff --git a/drivers/ata/sata_nv.c b/drivers/ata/sata_nv.c index 8c683ddd0f58..b6e9ad6d33c9 100644 --- a/drivers/ata/sata_nv.c +++ b/drivers/ata/sata_nv.c @@ -740,32 +740,16 @@ static int nv_adma_slave_config(struct scsi_device *sdev) sdev1 = ap->host->ports[1]->link.device[0].sdev; if ((port0->flags & NV_ADMA_ATAPI_SETUP_COMPLETE) || (port1->flags & NV_ADMA_ATAPI_SETUP_COMPLETE)) { - /** We have to set the DMA mask to 32-bit if either port is in - ATAPI mode, since they are on the same PCI device which is - used for DMA mapping. If we set the mask we also need to set - the bounce limit on both ports to ensure that the block - layer doesn't feed addresses that cause DMA mapping to - choke. If either SCSI device is not allocated yet, it's OK - since that port will discover its correct setting when it - does get allocated. - Note: Setting 32-bit mask should not fail. */ - if (sdev0) - blk_queue_bounce_limit(sdev0->request_queue, - ATA_DMA_MASK); - if (sdev1) - blk_queue_bounce_limit(sdev1->request_queue, - ATA_DMA_MASK); - - dma_set_mask(&pdev->dev, ATA_DMA_MASK); + /* +* We have to set the DMA mask to 32-bit if either port is in +* ATAPI mode, since they are on the same PCI device which is +* used for DMA mapping. If either SCSI device is not allocated +* yet, it's OK since that port will discover its correct +* setting when it does get allocated. +*/ + rc = dma_set_mask(&pdev->dev, ATA_DMA_MASK); } else { - /** This shouldn't fail as it was set to this value before */ - dma_set_mask(&pdev->dev, pp->adma_dma_mask); - if (sdev0) - blk_queue_bounce_limit(sdev0->request_queue, - pp->adma_dma_mask); - if (sdev1) - blk_queue_bounce_limit(sdev1->request_queue, - pp->adma_dma_mask); + rc = dma_set_mask(&pdev->dev, pp->adma_dma_mask); } blk_queue_segment_boundary(sdev->request_queue, segment_boundary); @@ -1131,12 +1115,11 @@ static int nv_adma_port_start(struct ata_port *ap) VPRINTK("ENTER\n"); - /* Ensure DMA mask is set to 32-bit before allocating legacy PRD and - pad buffers */ - rc = dma_set_mask(&pdev->dev, DMA_BIT_MASK(32)); - if (rc) - return rc; - rc = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32)); + /* +* Ensure DMA mask is set to 32-bit before allocating legacy PRD and +* pad buffers. +*/ + rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)); if (rc) return rc; @@ -1156,13 +1139,16 @@ static int nv_adma_port_start(struct ata_port *ap) pp->notifier_clear_block = pp->gen_block + NV_ADMA_NOTIFIER_CLEAR + (4 * ap->port_no); - /* Now that the legacy PRD and padding buffer are allocated we can - safely raise the DMA mask to allocate the CPB/APRD table. - These are allowed to fail since we store the value that ends up - being used to set as the bounce limit in slave_config later if - needed. */ - dma_set_mask(&pdev->dev, DMA_BIT_MASK(64)); - dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64)); + /* +* Now that the legacy PRD and padding buffer are allocated we can +* try to raise the DMA mask to allocate the CPB/APRD table. +*/ + rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); + if (rc) { + rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)); + if (rc) + return rc; + } pp->adma_dma_mask = *dev->dma_mask; mem = dmam_alloc_coherent(dev, NV_ADMA_PORT_PRIV_DMA_SZ, -- 2.17.0
[PATCH 07/12] scsi: reduce use of block bounce buffers
We can rely on the dma-mapping code to handle any DMA limits that is bigger than the ISA DMA mask for us (either using an iommu or swiotlb), so remove setting the block layer bounce limit for anything but the unchecked_isa_dma case, or the bouncing for highmem pages. Signed-off-by: Christoph Hellwig --- drivers/scsi/scsi_lib.c | 24 ++-- 1 file changed, 2 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 0dfec0dedd5e..ebe2cbb48b80 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2142,27 +2142,6 @@ static int scsi_map_queues(struct blk_mq_tag_set *set) return blk_mq_map_queues(set); } -static u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost) -{ - struct device *host_dev; - u64 bounce_limit = 0x; - - if (shost->unchecked_isa_dma) - return BLK_BOUNCE_ISA; - /* -* Platforms with virtual-DMA translation -* hardware have no practical limit. -*/ - if (!PCI_DMA_BUS_IS_PHYS) - return BLK_BOUNCE_ANY; - - host_dev = scsi_get_device(shost); - if (host_dev && host_dev->dma_mask) - bounce_limit = (u64)dma_max_pfn(host_dev) << PAGE_SHIFT; - - return bounce_limit; -} - void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q) { struct device *dev = shost->dma_dev; @@ -2182,7 +2161,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q) } blk_queue_max_hw_sectors(q, shost->max_sectors); - blk_queue_bounce_limit(q, scsi_calculate_bounce_limit(shost)); + if (shost->unchecked_isa_dma) + blk_queue_bounce_limit(q, BLK_BOUNCE_ISA); blk_queue_segment_boundary(q, shost->dma_boundary); dma_set_seg_boundary(dev, shost->dma_boundary); -- 2.17.0
[PATCH 06/12] memstick: don't call blk_queue_bounce_limit
All in-tree host drivers set up a proper dma mask and use the dma-mapping helpers. This means they will be able to deal with any address that we are throwing at them. Signed-off-by: Christoph Hellwig --- drivers/memstick/core/ms_block.c| 5 - drivers/memstick/core/mspro_block.c | 5 - 2 files changed, 10 deletions(-) diff --git a/drivers/memstick/core/ms_block.c b/drivers/memstick/core/ms_block.c index 57b13dfbd21e..b2d025f42d14 100644 --- a/drivers/memstick/core/ms_block.c +++ b/drivers/memstick/core/ms_block.c @@ -2096,12 +2096,8 @@ static int msb_init_disk(struct memstick_dev *card) struct msb_data *msb = memstick_get_drvdata(card); struct memstick_host *host = card->host; int rc; - u64 limit = BLK_BOUNCE_HIGH; unsigned long capacity; - if (host->dev.dma_mask && *(host->dev.dma_mask)) - limit = *(host->dev.dma_mask); - mutex_lock(&msb_disk_lock); msb->disk_id = idr_alloc(&msb_disk_idr, card, 0, 256, GFP_KERNEL); mutex_unlock(&msb_disk_lock); @@ -2123,7 +2119,6 @@ static int msb_init_disk(struct memstick_dev *card) msb->queue->queuedata = card; - blk_queue_bounce_limit(msb->queue, limit); blk_queue_max_hw_sectors(msb->queue, MS_BLOCK_MAX_PAGES); blk_queue_max_segments(msb->queue, MS_BLOCK_MAX_SEGS); blk_queue_max_segment_size(msb->queue, diff --git a/drivers/memstick/core/mspro_block.c b/drivers/memstick/core/mspro_block.c index 8897962781bb..a2fadc605750 100644 --- a/drivers/memstick/core/mspro_block.c +++ b/drivers/memstick/core/mspro_block.c @@ -1175,12 +1175,8 @@ static int mspro_block_init_disk(struct memstick_dev *card) struct mspro_sys_info *sys_info = NULL; struct mspro_sys_attr *s_attr = NULL; int rc, disk_id; - u64 limit = BLK_BOUNCE_HIGH; unsigned long capacity; - if (host->dev.dma_mask && *(host->dev.dma_mask)) - limit = *(host->dev.dma_mask); - for (rc = 0; msb->attr_group.attrs[rc]; ++rc) { s_attr = mspro_from_sysfs_attr(msb->attr_group.attrs[rc]); @@ -1219,7 +1215,6 @@ static int mspro_block_init_disk(struct memstick_dev *card) msb->queue->queuedata = card; - blk_queue_bounce_limit(msb->queue, limit); blk_queue_max_hw_sectors(msb->queue, MSPRO_BLOCK_MAX_PAGES); blk_queue_max_segments(msb->queue, MSPRO_BLOCK_MAX_SEGS); blk_queue_max_segment_size(msb->queue, -- 2.17.0
[PATCH 08/12] mmc: reduce use of block bounce buffers
We can rely on the dma-mapping code to handle any DMA limits that is bigger than the ISA DMA mask for us (either using an iommu or swiotlb), so remove setting the block layer bounce limit for anything but bouncing for highmem pages. Signed-off-by: Christoph Hellwig --- drivers/mmc/core/queue.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c index 56e9a803db21..60a02a763d01 100644 --- a/drivers/mmc/core/queue.c +++ b/drivers/mmc/core/queue.c @@ -351,17 +351,14 @@ static const struct blk_mq_ops mmc_mq_ops = { static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card) { struct mmc_host *host = card->host; - u64 limit = BLK_BOUNCE_HIGH; - - if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask) - limit = (u64)dma_max_pfn(mmc_dev(host)) << PAGE_SHIFT; blk_queue_flag_set(QUEUE_FLAG_NONROT, mq->queue); blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, mq->queue); if (mmc_can_erase(card)) mmc_queue_setup_discard(mq->queue, card); - blk_queue_bounce_limit(mq->queue, limit); + if (!mmc_dev(host)->dma_mask || !mmc_dev(host)->dma_mask) + blk_queue_bounce_limit(mq->queue, BLK_BOUNCE_HIGH); blk_queue_max_hw_sectors(mq->queue, min(host->max_blk_count, host->max_req_size / 512)); blk_queue_max_segments(mq->queue, host->max_segs); -- 2.17.0
[PATCH 10/12] ide: remove the PCI_DMA_BUS_IS_PHYS check
We now have ways to deal with drainage in the block layer, and libata has been using it for ages. We also want to get rid of PCI_DMA_BUS_IS_PHYS now, so just reduce the PCI transfer size for ide - anyone who cares for performance on PCI controllers should have switched to libata long ago. Signed-off-by: Christoph Hellwig --- drivers/ide/ide-probe.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c index 8d8ed036ca0a..56d7bc228cb3 100644 --- a/drivers/ide/ide-probe.c +++ b/drivers/ide/ide-probe.c @@ -796,8 +796,7 @@ static int ide_init_queue(ide_drive_t *drive) * This will be fixed once we teach pci_map_sg() about our boundary * requirements, hopefully soon. *FIXME* */ - if (!PCI_DMA_BUS_IS_PHYS) - max_sg_entries >>= 1; + max_sg_entries >>= 1; #endif /* CONFIG_PCI */ blk_queue_max_segments(q, max_sg_entries); -- 2.17.0
[PATCH 11/12] net: remove the PCI_DMA_BUS_IS_PHYS check in illegal_highdma
These days the dma mapping routines must be able to handle any address supported by the device, be that using an iommu, or swiotlb if none is supported. With that the PCI_DMA_BUS_IS_PHYS check in illegal_highdma is not needed and can be removed. Signed-off-by: Christoph Hellwig --- net/core/dev.c | 20 +--- 1 file changed, 1 insertion(+), 19 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 969462ebb296..5802cf177b07 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2884,11 +2884,7 @@ void netdev_rx_csum_fault(struct net_device *dev) EXPORT_SYMBOL(netdev_rx_csum_fault); #endif -/* Actually, we should eliminate this check as soon as we know, that: - * 1. IOMMU is present and allows to map all the memory. - * 2. No high memory really exists on this machine. - */ - +/* XXX: check that highmem exists at all on the given machine. */ static int illegal_highdma(struct net_device *dev, struct sk_buff *skb) { #ifdef CONFIG_HIGHMEM @@ -2902,20 +2898,6 @@ static int illegal_highdma(struct net_device *dev, struct sk_buff *skb) return 1; } } - - if (PCI_DMA_BUS_IS_PHYS) { - struct device *pdev = dev->dev.parent; - - if (!pdev) - return 0; - for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { - skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; - dma_addr_t addr = page_to_phys(skb_frag_page(frag)); - - if (!pdev->dma_mask || addr + PAGE_SIZE - 1 > *pdev->dma_mask) - return 1; - } - } #endif return 0; } -- 2.17.0
[PATCH 09/12] ide: kill ide_toggle_bounce
ide_toggle_bounce did select various strange block bounce limits, including not bouncing at all as soon as an iommu is present in the system. Given that the dma_map routines now handle any required bounce buffering except for ISA DMA, and the ide code already must handle either ISA DMA or highmem at least for iommu equipped systems we can get rid of the block layer bounce limit setting entirely. Signed-off-by: Christoph Hellwig --- drivers/ide/ide-dma.c | 2 -- drivers/ide/ide-lib.c | 26 -- drivers/ide/ide-probe.c | 3 --- include/linux/ide.h | 2 -- 4 files changed, 33 deletions(-) diff --git a/drivers/ide/ide-dma.c b/drivers/ide/ide-dma.c index 54d4d78ca46a..6f344654ef22 100644 --- a/drivers/ide/ide-dma.c +++ b/drivers/ide/ide-dma.c @@ -180,7 +180,6 @@ EXPORT_SYMBOL_GPL(ide_dma_unmap_sg); void ide_dma_off_quietly(ide_drive_t *drive) { drive->dev_flags &= ~IDE_DFLAG_USING_DMA; - ide_toggle_bounce(drive, 0); drive->hwif->dma_ops->dma_host_set(drive, 0); } @@ -211,7 +210,6 @@ EXPORT_SYMBOL(ide_dma_off); void ide_dma_on(ide_drive_t *drive) { drive->dev_flags |= IDE_DFLAG_USING_DMA; - ide_toggle_bounce(drive, 1); drive->hwif->dma_ops->dma_host_set(drive, 1); } diff --git a/drivers/ide/ide-lib.c b/drivers/ide/ide-lib.c index e1180fa46196..78cb79eddc8b 100644 --- a/drivers/ide/ide-lib.c +++ b/drivers/ide/ide-lib.c @@ -6,32 +6,6 @@ #include #include -/** - * ide_toggle_bounce - handle bounce buffering - * @drive: drive to update - * @on: on/off boolean - * - * Enable or disable bounce buffering for the device. Drives move - * between PIO and DMA and that changes the rules we need. - */ - -void ide_toggle_bounce(ide_drive_t *drive, int on) -{ - u64 addr = BLK_BOUNCE_HIGH; /* dma64_addr_t */ - - if (!PCI_DMA_BUS_IS_PHYS) { - addr = BLK_BOUNCE_ANY; - } else if (on && drive->media == ide_disk) { - struct device *dev = drive->hwif->dev; - - if (dev && dev->dma_mask) - addr = *dev->dma_mask; - } - - if (drive->queue) - blk_queue_bounce_limit(drive->queue, addr); -} - u64 ide_get_lba_addr(struct ide_cmd *cmd, int lba48) { struct ide_taskfile *tf = &cmd->tf; diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c index 2019e66eada7..8d8ed036ca0a 100644 --- a/drivers/ide/ide-probe.c +++ b/drivers/ide/ide-probe.c @@ -805,9 +805,6 @@ static int ide_init_queue(ide_drive_t *drive) /* assign drive queue */ drive->queue = q; - /* needs drive->queue to be set */ - ide_toggle_bounce(drive, 1); - return 0; } diff --git a/include/linux/ide.h b/include/linux/ide.h index ca9d34feb572..11f0dd03a4b4 100644 --- a/include/linux/ide.h +++ b/include/linux/ide.h @@ -1508,8 +1508,6 @@ static inline void ide_set_hwifdata (ide_hwif_t * hwif, void *data) hwif->hwif_data = data; } -extern void ide_toggle_bounce(ide_drive_t *drive, int on); - u64 ide_get_lba_addr(struct ide_cmd *, int); u8 ide_dump_status(ide_drive_t *, const char *, u8); -- 2.17.0
[PATCH 12/12] PCI: remove PCI_DMA_BUS_IS_PHYS
This was used by the ide, scsi and networking code in the past to determine if they should bounce payloads. Now that the dma mapping always have to support dma to all physical memory (thanks to swiotlb for non-iommu systems) there is no need to this crude hack any more. Signed-off-by: Christoph Hellwig --- arch/alpha/include/asm/pci.h | 5 - arch/arc/include/asm/pci.h| 6 -- arch/arm/include/asm/pci.h| 7 --- arch/arm64/include/asm/pci.h | 5 - arch/h8300/include/asm/pci.h | 2 -- arch/hexagon/kernel/dma.c | 1 - arch/ia64/hp/common/sba_iommu.c | 3 --- arch/ia64/include/asm/pci.h | 17 - arch/ia64/kernel/setup.c | 12 arch/ia64/sn/kernel/io_common.c | 5 - arch/m68k/include/asm/pci.h | 6 -- arch/microblaze/include/asm/pci.h | 6 -- arch/mips/include/asm/pci.h | 7 --- arch/parisc/include/asm/pci.h | 23 --- arch/parisc/kernel/setup.c| 5 - arch/powerpc/include/asm/pci.h| 18 -- arch/riscv/include/asm/pci.h | 3 --- arch/s390/include/asm/pci.h | 2 -- arch/s390/pci/pci_dma.c | 2 -- arch/sh/include/asm/pci.h | 6 -- arch/sh/kernel/dma-nommu.c| 1 - arch/sparc/include/asm/pci_32.h | 4 arch/sparc/include/asm/pci_64.h | 6 -- arch/x86/include/asm/pci.h| 3 --- arch/x86/kernel/pci-nommu.c | 1 - arch/xtensa/include/asm/pci.h | 2 -- drivers/parisc/ccio-dma.c | 2 -- drivers/parisc/sba_iommu.c| 2 -- include/asm-generic/pci.h | 8 include/linux/dma-mapping.h | 1 - lib/dma-direct.c | 1 - tools/virtio/linux/dma-mapping.h | 2 -- 32 files changed, 174 deletions(-) diff --git a/arch/alpha/include/asm/pci.h b/arch/alpha/include/asm/pci.h index b9ec55351924..cf6bc1e64d66 100644 --- a/arch/alpha/include/asm/pci.h +++ b/arch/alpha/include/asm/pci.h @@ -56,11 +56,6 @@ struct pci_controller { /* IOMMU controls. */ -/* The PCI address space does not equal the physical memory address space. - The networking and block device layers use this boolean for bounce buffer - decisions. */ -#define PCI_DMA_BUS_IS_PHYS 0 - /* TODO: integrate with include/asm-generic/pci.h ? */ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel) { diff --git a/arch/arc/include/asm/pci.h b/arch/arc/include/asm/pci.h index ba56c23c1b20..4ff53c041c64 100644 --- a/arch/arc/include/asm/pci.h +++ b/arch/arc/include/asm/pci.h @@ -16,12 +16,6 @@ #define PCIBIOS_MIN_MEM 0x10 #define pcibios_assign_all_busses()1 -/* - * The PCI address space does equal the physical memory address space. - * The networking and block device layers use this boolean for bounce - * buffer decisions. - */ -#define PCI_DMA_BUS_IS_PHYS1 #endif /* __KERNEL__ */ diff --git a/arch/arm/include/asm/pci.h b/arch/arm/include/asm/pci.h index 1f0de808d111..0abd389cf0ec 100644 --- a/arch/arm/include/asm/pci.h +++ b/arch/arm/include/asm/pci.h @@ -19,13 +19,6 @@ static inline int pci_proc_domain(struct pci_bus *bus) } #endif /* CONFIG_PCI_DOMAINS */ -/* - * The PCI address space does equal the physical memory address space. - * The networking and block device layers use this boolean for bounce - * buffer decisions. - */ -#define PCI_DMA_BUS_IS_PHYS (1) - #define HAVE_PCI_MMAP #define ARCH_GENERIC_PCI_MMAP_RESOURCE diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h index 8747f7c5e0e7..9e690686e8aa 100644 --- a/arch/arm64/include/asm/pci.h +++ b/arch/arm64/include/asm/pci.h @@ -18,11 +18,6 @@ #define pcibios_assign_all_busses() \ (pci_has_flag(PCI_REASSIGN_ALL_BUS)) -/* - * PCI address space differs from physical memory address space - */ -#define PCI_DMA_BUS_IS_PHYS(0) - #define ARCH_GENERIC_PCI_MMAP_RESOURCE 1 extern int isa_dma_bridge_buggy; diff --git a/arch/h8300/include/asm/pci.h b/arch/h8300/include/asm/pci.h index 7c9e55d62215..d4d345a52092 100644 --- a/arch/h8300/include/asm/pci.h +++ b/arch/h8300/include/asm/pci.h @@ -15,6 +15,4 @@ static inline void pcibios_penalize_isa_irq(int irq, int active) /* We don't do dynamic PCI IRQ allocation */ } -#define PCI_DMA_BUS_IS_PHYS(1) - #endif /* _ASM_H8300_PCI_H */ diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c index ad8347c29dcf..77459df34e2e 100644 --- a/arch/hexagon/kernel/dma.c +++ b/arch/hexagon/kernel/dma.c @@ -208,7 +208,6 @@ const struct dma_map_ops hexagon_dma_ops = { .sync_single_for_cpu = hexagon_sync_single_for_cpu, .sync_single_for_device = hexagon_sync_single_for_device, .mapping_error = hexagon_mapping_error, - .is_phys= 1, }; void __init hexagon_dma_init(void) diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c index aec4a3354abe..6f05aba9012f 100644 --- a/arch/ia64/hp/com
Re: [PATCH v2] block: ratelimite pr_err on IO path
On Fri, Apr 13, 2018 at 6:59 PM, Martin K. Petersen wrote: > > Jinpu, > > [CC:ed the mpt3sas maintainers] > > The ratelimit patch is just an attempt to treat the symptom, not the > cause. Agree. If we can fix the root cause, it will be great. > >> Thanks for asking, we updated mpt3sas driver which enables DIX support >> (prot_mask=0x7f), all disks are SATA SSDs, no DIF support. >> After reboot, kernel reports the IO errors from all the drives behind >> HBA, seems for almost every read IO, which turns the system unusable: >> [ 13.079375] sda: ref tag error at location 0 (rcvd 143196159) >> [ 13.079989] sda: ref tag error at location 937702912 (rcvd 143196159) >> [ 13.080233] sda: ref tag error at location 937703072 (rcvd 143196159) >> [ 13.080407] sda: ref tag error at location 0 (rcvd 143196159) >> [ 13.080594] sda: ref tag error at location 8 (rcvd 143196159) > > That sounds like a bug in the mpt3sas driver or firmware. I guess the > HBA could conceivably be operating a SATA device as DIX Type 0 and strip > the PI on the drive side. But that doesn't seem to be a particularly > useful mode of operation. > > Jinpu: Which firmware are you running? Also, please send us the output > of: > > sg_readcap -l /dev/sda > sg_inq -x /dev/sda > sg_vpd /dev/sda > Disks are INTEL SSDSC2BX48, directly attached to HBA. LSISAS3008: FWVersion(13.00.00.00), ChipRevision(0x02), BiosVersion(08.11.00.00) mpt3sas_cm2: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ) jwang@x:~$ sudo sg_vpd /dev/sdz Supported VPD pages VPD page: Supported VPD pages [sv] Unit serial number [sn] Device identification [di] Mode page policy [mpp] ATA information (SAT) [ai] Block limits (SBC) [bl] Block device characteristics (SBC) [bdc] Logical block provisioning (SBC) [lbpv] jwang@x:~$ sudo sg_inq -x /dev/sdz VPD INQUIRY: extended INQUIRY data page inquiry: field in cdb illegal (page not supported) jwang@x:~$ sudo sg_readcap -l /dev/sdz Read Capacity results: Protection: prot_en=0, p_type=0, p_i_exponent=0 Logical block provisioning: lbpme=1, lbprz=1 Last logical block address=937703087 (0x37e436af), Number of logical blocks=937703088 Logical block length=512 bytes Logical blocks per physical block exponent=3 [so physical block length=4096 bytes] Lowest aligned logical block address=0 Hence: Device size: 480103981056 bytes, 457862.8 MiB, 480.10 GB > Broadcom: How is DIX supposed to work for SATA drives behind an mpt3sas > controller? > > -- > Martin K. Petersen Oracle Linux Engineering Thanks! -- Jack Wang Linux Kernel Developer ProfitBricks GmbH Greifswalder Str. 207 D - 10405 Berlin