On Sun, 02 Mar 2008 15:17:16 +0200 Erez Zilber <[EMAIL PROTECTED]> wrote:
> Pete Wyckoff wrote:
> > [EMAIL PROTECTED] wrote on Sun, 10 Feb 2008 14:54 +0200:
> >> I'm trying to run some performance tests on iSER. I'm using the latest
> >> stgt build (tgt-20071227) against open-iscsi r865.15. When I run sgp_dd
> >> on a single LUN directly from the target machine, I get ~400 MB/sec
> >> (which is ok because it's the max throughput of that device). When I run
> >> it from the initiator, I get only ~240 MB/sec.
> >>
> >> Did you use any specific configuration when you did your measurements?
> >> Pete - according to your numbers, you got wire speed. I wonder why I
> >> get 40% less than the physical storage throughput. BTW - when I use
> >> another iSCSI over iSER target, I get 400 MB/sec from the initiator.
> >
> > Here's a patch that will log timestamps. I use these to pinpoint
> > where the time is going. First check for anomalies: plot it up with
> > line number on the x axis and timestamp on the y axis, and make sure
> > the curve is smooth.
>
> It is almost smooth. From time to time, you can see that it takes time
> before iser_rx_progress is called. I guess that the cq is empty for a
> short time. I don't think that this is the real problem.
>
> > Then pick one operation: zoom in with the plot and your
> > editor to look at the times for a single read/write, from when the
> > request first appears to when the response goes out.
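As an aside, the "plot and zoom" step can be automated. Below is a minimal sketch (my own, not part of Pete's patch) that parses the `[HH:MM:SS.usec]` stamps from a saved tgtd debug log and prints the largest gaps between consecutive lines, which points straight at where the time goes:

```python
import re
import sys

# Matches the "[16:40:38.907547]" timestamps in the tgtd debug log.
STAMP = re.compile(r"\[(\d+):(\d+):(\d+)\.(\d+)\]")

def to_usec(h, m, s, us):
    """Convert an HH:MM:SS.usec stamp to microseconds since midnight."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1_000_000 + int(us)

def gaps(lines):
    """Yield (delta_usec, line) for each stamped line vs. the previous one.
    Lines without a timestamp (e.g. wrapped continuations) are skipped."""
    prev = None
    for line in lines:
        m = STAMP.search(line)
        if not m:
            continue
        t = to_usec(*m.groups())
        if prev is not None:
            yield t - prev, line.rstrip()
        prev = t

if __name__ == "__main__":
    # Print the ten largest inter-line gaps, worst first.
    for delta, line in sorted(gaps(sys.stdin), reverse=True)[:10]:
        print(f"{delta:8d} us  {line}")
```

Run as `tgtd-gaps.py < tgtd.log`; in the trace below it would flag the ~370 us RDMA-write wait and the bs round trip immediately.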
> > Here's an example of a single command execution:
>
> 27892 tgtd: [16:40:38.907547] iser_rx_progress(1101) entry
> 27893 tgtd: [16:40:38.907558] handle_wc(897) incoming cmd, len 76
> 27894 tgtd: [16:40:38.907568] iscsi_iser_read(1301) buf 0x529fc8 nbytes 48
> 27895 tgtd: [16:40:38.907578] iser_parse_hdr(1265) control type PDU
> 27896 tgtd: [16:40:38.907587] iser_parse_hdr(1271) rstag 9b320086 va 3f1fc000
> 27897 tgtd: [16:40:38.907597] iscsi_scsi_cmd_rx_start(1466) 1 28 0 0 524288 1 b
> 27898 tgtd: [16:40:38.907607] iscsi_rdma_alloc_data_buf(1659) malloc 0x2acf61554000 sz 524288
> 27899 tgtd: [16:40:38.907617] iscsi_task_queue(1411) 388b 388b 1
> 27900 tgtd: [16:40:38.907627] target_cmd_queue(763) 0x535de8 28 1
> 27901 tgtd: [16:40:38.907637] target_cmd_queue(783) 0x535de8 28 1 1
>
> this is where bs work starts:
>
> 27902 tgtd: [16:40:38.907649] tgtd: [16:40:38.907650] bs_thread_worker_fn(151) got 0x535de8
> 27903 target_cmd_queue(794) b 28 (nil) 0x2acf61554000 262144 0 524288 0 4
> 27904 tgtd: [16:40:38.907676] iscsi_rdma_event_modify(1628) tx ready adding 0x529eb0
> 27905 tgtd: [16:40:38.907686] handle_wc(922) incoming cmd proc done, repost
> 27906 tgtd: [16:40:38.907697] iser_tx_progress(1017) entry
> 27907 tgtd: [16:40:38.907706] iser_tx_progress(1028) trying tx
> 27908 tgtd: [16:40:38.907716] iscsi_task_tx_start(1784) no more data
> 27909 tgtd: [16:40:38.907725] iscsi_rdma_event_modify(1633) tx ready removing 0x529eb0
>
> this is where bs work ends (130 us):
>
> 27910 tgtd: [16:40:38.907779] bs_rdwr_request(114) io done 0x535de8 28 524288 524288
> 27911 tgtd: [16:40:38.907796] bs_thread_ack_fn(82) found 0x535de8
> 27912 tgtd: [16:40:38.907813] bs_thread_request_done(122) back to tgtd, 0x535de8
> 27913 tgtd: [16:40:38.907825] iscsi_rdma_event_modify(1628) tx ready adding 0x529eb0
> 27914 tgtd: [16:40:38.907837] iser_tx_progress(1017) entry
> 27915 tgtd: [16:40:38.907846] iser_tx_progress(1028) trying tx
> 27916 tgtd: [16:40:38.907856] iscsi_task_tx_start(1759) found a task b 524288 0 0
> 27917 tgtd: [16:40:38.907865] iscsi_data_rsp_build(974) 524288 524288 524288 524288 b
> 27918 tgtd: [16:40:38.907875] iscsi_rdma_rdma_write(1491) offset 0 len 524288 stag 9b320086 va 3f1fc000
> 27919 tgtd: [16:40:38.907885] iscsi_rdma_event_modify(1633) tx ready removing 0x529eb0
> 27920 tgtd: [16:40:38.907895] iscsi_scsi_cmd_tx_done(1707) more data or sense or bidir b
> 27921 tgtd: [16:40:38.908269] iser_rx_progress(1101) entry
> 27922 tgtd: [16:40:38.908280] handle_wc(932) RDMA write done
> 27923 tgtd: [16:40:38.908289] iscsi_rdma_event_modify(1628) tx ready adding 0x529eb0
> 27924 tgtd: [16:40:38.908298] iser_tx_progress(1017) entry
> 27925 tgtd: [16:40:38.908308] iser_tx_progress(1028) trying tx
> 27926 tgtd: [16:40:38.908317] iscsi_task_tx_start(1759) found a task b 524288 524288 0
> 27927 tgtd: [16:40:38.908327] iscsi_cmd_rsp_build(901) 0x535d28 28
> 27928 tgtd: [16:40:38.908336] iscsi_iser_write_begin(1352) new sendl 0x2aaab45f7090 len 48
> 27929 tgtd: [16:40:38.908347] iscsi_iser_write_end(1387) sendl 0x2aaab45f7090 len 76
> 27930 tgtd: [16:40:38.908356] iscsi_iser_write_end(1402) inc progress to finish cmd
> 27931 tgtd: [16:40:38.908366] __cmd_done(855) 8 (nil) 0x2acf61554000 0 524288 0
> 27932 tgtd: [16:40:38.908376] iscsi_rdma_free_data_buf(1678) free 0x2acf61554000
>
> It seems that the bs work takes a lot of time. For example, if I run
> disktest on the same device directly on the target machine (i.e. tgt
> is not involved) like this:
>
> disktest -PT -T100 -h1 -K64 -B256k -ID /dev/sdc -r
>
> I get ~13000 iops, i.e. each IO takes ~77 us on average (compared to
> 130 us for the bs work alone under tgt). I'm not familiar with the bs
> code in tgt, so I'm not sure what should be done here.

The problem is that Linux lacks a nice event notification mechanism.
The main thread uses epoll to wait on events. I/O threads (pthreads)
use a pipe to notify the main thread of I/O completion.
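For anyone not familiar with that part of tgtd, here is a toy Python sketch of the pattern just described (not tgt code; `bs_worker` and the tag value are made up): a backing-store worker thread signals completion by writing one byte down a pipe, and the main thread multiplexes the pipe's read end alongside its other fds. tgtd calls epoll directly; `selectors.DefaultSelector` picks epoll on Linux.

```python
import os
import selectors
import threading

def main():
    rfd, wfd = os.pipe()
    done = {}

    def bs_worker(tag):
        # ... the actual read/write against the backing store goes here ...
        done[tag] = b"result"
        os.write(wfd, b"\0")   # wake the main thread: one syscall per completion

    sel = selectors.DefaultSelector()
    sel.register(rfd, selectors.EVENT_READ)

    t = threading.Thread(target=bs_worker, args=(0x535de8,))
    t.start()

    for key, _ in sel.select():   # main loop blocks here until the worker signals
        os.read(key.fd, 1)        # drain the notification byte
    t.join()

    sel.unregister(rfd)
    os.close(rfd)
    os.close(wfd)
    return done

if __name__ == "__main__":
    print(main())
```

The cost is a write/wakeup/read round trip through the kernel for every completion, on top of the pthread queue handoff, which is part of why the bs path looks slow in the trace above.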
It's not efficient at all.

> Another thing that looks strange: from the logs, I don't see that work
> is done in parallel. A command is received, iSCSI processing is done,
> and it is handed to bs processing (which seems to be idle until that
> point in time). Then nothing happens until bs processing is complete.
> I will try to look more into the code and try to understand it, but
> can someone elaborate on this?

sgp_dd issues only one SCSI command at a time?

_______________________________________________
Stgt-devel mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/stgt-devel
