Re: Specify SCSI ID

2008-10-01 Thread Vladislav Bolkhovitin

Nick wrote:
> On Sep 30, 8:16 am, Konrad Rzeszutek <[EMAIL PROTECTED]> wrote:
>> On Mon, Sep 29, 2008 at 05:50:17PM -0700, Nick wrote:
>>
>>> On Sep 29, 3:49 pm, Konrad Rzeszutek <[EMAIL PROTECTED]> wrote:
> Here's the output:
> [EMAIL PROTECTED] ~]# lsscsi
> [20:0:0:0]   tape    SEAGATE  ULTRIUM06242-XXX 1613  /dev/st0
> [21:0:0:0]   mediumx STK      L20              0215  /dev/sch0
> [22:0:0:0]   tape    SEAGATE  ULTRIUM06242-XXX 1613  /dev/st1
> As you can see, it does not maintain devids but each one becomes id
> 0.  As a result, I get the following message when the changer loads:
 Yeah, this is the fault of your storage. It converts the SCSI IDs
 in each separate target.
>>> I'm not sure what you mean by this being a fault of my storage.  It is
>>> normal for libraries and tape drives to have different SCSI IDs.  I'm
>>> using SCST for the iSCSI Target Server, and that either allows me to
>>> specify all of the SCSI devices under the same target as different
>>> LUNs, or specify them under different targets.  SCST does not allow me
>> Aha! Do it as different LUNs.
>>
>>> to specify the ID the devices are presented as - maybe that's the
>>> fault to which you're referring?
>> I was thinking that it represented them as different targets - which is
>> what you don't want in your case.
> 
> It does allow me to present them as different targets, but it also
> allows me to present them as different LUNs on the same target.  I'm
> not sure why I can't specify different IDs on the same target though -
> that seems like a "missing feature" for SCST.

SCSI ID is a purely parallel SCSI concept. Other SCSI transports, 
including iSCSI, as well as SAM as a whole, have no equivalent. So, if 
your tape library application requires SCSI IDs and can't use LUNs 
instead, it is limited to working only with parallel SCSI libraries. 
Neither open-iscsi nor SCST can change that.
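
For reference, the single-target, multi-LUN layout suggested above could 
be declared roughly like this in a later, scstadmin-style /etc/scst.conf 
(the handler names, local SCSI addresses and IQN are illustrative 
guesses, and the exact syntax varies between SCST releases):

    # Hypothetical sketch only; local device addresses and IQN are made up
    HANDLER dev_changer {
            DEVICE 1:0:0:0          # the STK L20 changer on the target host
    }
    HANDLER dev_tape {
            DEVICE 2:0:0:0          # first ULTRIUM drive
            DEVICE 3:0:0:0          # second ULTRIUM drive
    }
    TARGET_DRIVER iscsi {
            TARGET iqn.2008-10.example.com:tapelib {
                    LUN 0 1:0:0:0
                    LUN 1 2:0:0:0
                    LUN 2 3:0:0:0
                    enabled 1
            }
    }

On the initiator side, all three devices then show up under one SCSI 
host/target and differ only in their LUN.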

> I tried out the
> different LUNs method last night and it seems to be somewhat
> functional, although using the "mtx unload" command doesn't seem to
> trigger an "eject" on the tape drive - I have to run an "mt -f /dev/
> st0 eject" and then do the unload.  Maybe this is normal, but it seems
> like usually you can just do the unload and that will trigger an eject
> on the drive, then move the media.  Anyway, I'll keep playing with
> that...
> 
> -Nick





Re: kernel oops in iscsi_tcp_recv

2008-12-18 Thread Vladislav Bolkhovitin

Mike Christie, on 12/17/2008 11:31 PM wrote:
> Mike Christie wrote:
>> Erez Zilber wrote:
>>> Mike,
>>>
>>> I got a kernel oops while logging in from v870-1 to an iSCSI-SCST
>>> target. it happens before 'iscsiadm -m node -L all' returns. Is this a
>>> known bug? Here's the log:
>>>
>> I think this is a new bug. I will give scst a try and try to replicate.
> 
> I am trying to setup scst but got hung up.

What hung up? The target? That shouldn't happen. Can you provide the details?

> Do I just need to fill out 
> /etc/scst.conf then start up the initd.redhat script or do I also need 
> to write some stuff to /proc/scsi_tgt?

In addition to /etc/scst.conf, you also need to define your targets in 
/etc/iscsi-scst.conf.
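
A minimal /etc/iscsi-scst.conf just names the target(s), roughly like 
this (the IQN is made up; the available per-target options are listed in 
the iscsi-scst README):

    Target iqn.2008-12.com.example:storage.test
            # IncomingUser joe secret    # optional CHAP credentials

The LUNs behind that target are still defined on the SCST side 
(scst.conf / the /proc/scsi_tgt interface).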






open-iscsi@googlegroups.com

2009-02-24 Thread Vladislav Bolkhovitin

Hi,

StorageSolutionGroup, on 02/24/2009 02:12 AM wrote:
> Hi,
> 
> Windows does not support disks that have been formatted to anything
> other than  a 512byte block size. Block size refers to the low level
> formatting of the disk and not the cluster or allocation size used by
> NTFS. Be aware that using a disk with a block size larger than 512
> bytes will cause applications not to function correctly.  You should
> check with your iSCSI target manufacture to ensure that their default
> block size is set to 512 bytes or problems will likely occur.For
> further reference and knowledge you can consult www.microsoft.com/document
> www.Stonefly.com and www.DNFstorage.com.

Not really correct; see, e.g., http://support.microsoft.com/kb/923332/en-us. 
On Vista and later, non-512-byte blocks are officially supported by MS; 
on Windows 2000 and later they "just work" (I can't recall any problems).

> Storage Solution Group.





Re: Q: "- PDU header Digest" feature

2009-02-26 Thread Vladislav Bolkhovitin

Mike Christie, on 02/25/2009 08:38 PM wrote:
> Another reason a lot of distros do not support it is because a common 
> problem we always hit is that users will write out some data, then start 
> modifying it again. But the kernel will normally not do a sync write 
> when you do a write. So once the write() returns, the kernel is still 
> sending it through the caches, block, scsi, and iscsi layers. If you are 
> writing to the data while it is working its way through the iscsi 
> layers, the iscsi layer could have done the digest calculation, then you 
> could modify it and now when the target checks it the digest check will 
> fail. And so this happens over and over and you get digest errors all 
> over the place and the iscsi layers fire their error handling and retry 
> and retry, and in the end they just say forget it and do not support 
> data digests.

While testing iSCSI-SCST with data digests enabled against the open-iscsi 
initiator, I regularly saw data digest errors once every several hours. 
I was going to investigate, but had no time. Now I know the reason.
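
(For reference, digest use on the open-iscsi side is negotiated per 
connection; the relevant iscsid.conf knob looks roughly like the lines 
below. Whether a corresponding DataDigest setting is available and 
honoured depends on the open-iscsi version, for exactly the reasons 
described above.)

    # Sketch of /etc/iscsi/iscsid.conf digest settings (default shown)
    node.conn[0].iscsi.HeaderDigest = None
    # node.conn[0].iscsi.HeaderDigest = CRC32C,None  # offer CRC32C, allow None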

Thanks for the explanation!
Vlad





Re: Low IOPs for certain block sizes

2009-04-14 Thread Vladislav Bolkhovitin

Hello,

Bart Van Assche, on 04/12/2009 10:09 PM wrote:
> Hello,
> 
> While running iSCSI performance tests I noticed that the performance
> for certain block sizes deviated significantly (more than ten times)
> from the performance for other block sizes, both larger and smaller.
> This surprised me.
> 
> The test I ran was as follows:
> * A file of 1 GB residing on a tmpfs filesystem was exported via iSCSI
> target software. The test has been repeated with both SCST and STGT.
> * On the initiator system open-iscsi version 2.0.870 was used for
> performing reads and writes with dd via direct I/O. Read-ahead was set
> to zero.
> * Both systems were running kernel 2.6.29.1 in run level 3 (no X
> server) and the 1 GbE interfaces in the two systems were connected via
> a crossed cable. The MTU has been left to its default value, 1500
> bytes. Netperf reported a throughput of 600 Mbit/s = 75 MB/s for the
> TCP/IP stream test on this setup.
> * 128 MB of data has been transferred during each test.
> * Each measurement has been repeated three times.
> * All caches were flushed before each test.
> * The ratio of standard deviation to average was 2% or lower for all
> measurements.
> * The measurement result are as follows (transfer speeds in MB/s):
> 
> Block    SCST     STGT     SCST     STGT
>  size    writing  writing  reading  reading
> -------  -------  -------  -------  -------
>  64 MB     71.7     63.3     62.1     58.4
>  32 MB     71.9     63.4     61.7     58.1
>  16 MB     72.4     63.0     61.7     57.1
>   8 MB     72.7     63.3     61.7     56.9
>   4 MB     72.9     63.5     61.3     57.0
>   2 MB     72.8     59.5     60.3     56.9
>   1 MB     72.1     38.7     59.4     56.0
> 512 KB     67.3     21.4     58.0     54.4
> 256 KB     67.4     22.8     55.5     53.4
> 128 KB     60.9     22.6     53.3     51.7
>  64 KB     53.2     22.2     53.0     45.7
>  32 KB     48.9     21.6     40.0     40.0
>  16 KB     40.0     20.8      0.6      1.3
>   8 KB     20.0     19.9     19.9     20.0
>   4 KB      0.6      1.6     18.9     10.3
> 
> All results look normal to me, except the write throughput for a block
> size of 4 KB and the read throughput for a block size of 16 KB.
> 
> Regarding CPU load: during the 4 KB write test, the CPU load was 0.9
> on the initiator system and 0.1 on the target.

I would suggest making sure that hardware interrupt coalescing is 
disabled on both hosts. You can check it with "ethtool -c" and change it 
with "ethtool -C".
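
For example (the interface name is a placeholder; which coalescing 
parameters can actually be changed depends on the NIC driver):

    # Show the current interrupt coalescing settings
    ethtool -c eth0
    # Turn coalescing off, where the driver allows it
    ethtool -C eth0 rx-usecs 0 rx-frames 1
    ethtool -C eth0 adaptive-rx off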

> Has anyone observed similar behavior before ?
> 
> Bart.





Re: MC/S support

2009-10-13 Thread Vladislav Bolkhovitin

benoit plessis, on 10/13/2009 01:25 PM wrote:
> Strange, because this is a common optimisation option that SAN vendors
> are using/recommending.
> 
> Was the problem really the feature or the implementation ?
> Do you have a pointer explaining the reasons ?

http://scst.sourceforge.net/mc_s.html

> 2009/10/8 Mike Christie <micha...@cs.wisc.edu>
> 
> 
> On 10/07/2009 04:22 PM, Kun Huang wrote:
>  > usr/config.h
>  > /* number of possible connections per session */
>  > #define ISCSI_CONN_MAX  1
>  >
> 
> The original code supported it (that is where the max conn and some
> other settings are from), but it does not anymore. The upstream kernel
> developers did not like the feature so it was dropped so open-iscsi
> could get merged.
> 





Re: iSCSI latency issue

2009-11-25 Thread Vladislav Bolkhovitin
Shachar f, on 11/25/2009 07:57 PM wrote:
> I'm running open-iscsi with scst on Broadcom 10Gig network and facing 
> write latency issues.
> When using netperf over an idle network the latency for a single block 
> round trip transfer is 30 usec and with open-iscsi it is 90-100 usec.
>  
> I see that Nagle (TCP_NODELAY) is disabled when opening the socket on the 
> initiator side and I'm not sure about the target side. 
> Vlad, Can you elaborate on this?

TCP_NODELAY is always enabled in iSCSI-SCST. You can get latency 
statistics on the target side at any time by enabling 
CONFIG_SCST_MEASURE_LATENCY (see the README). It is also better to enable 
CONFIG_PREEMPT_NONE so that CPU scheduler latency is not counted.
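
It can also help to measure the raw TCP round-trip time separately from 
the iSCSI stack, e.g. with a netperf request/response test (the host name 
below is a placeholder), and compare that with the per-command latency 
SCST reports:

    # 1-byte request / 1-byte response round trips to the target host
    netperf -H target-host -t TCP_RR -- -r 1,1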

> Are others on the mailing list aware of possible environment changes 
> that affect latency?
>  
> more info -
> I'm running this test with Centos5.3 machines with almost latest open-iscsi.
>  
> Thanks,
>   Shachar





Re: Over one million IOPS using software iSCSI and 10 Gbit Ethernet

2010-02-05 Thread Vladislav Bolkhovitin

Pasi Kärkkäinen, on 01/28/2010 03:36 PM wrote:

Hello list,

Please check these news items:
http://blog.fosketts.net/2010/01/14/microsoft-intel-push-million-iscsi-iops/
http://communities.intel.com/community/openportit/server/blog/2010/01/19/100-iops-with-iscsi--thats-not-a-typo
http://www.infostor.com/index/blogs_new/dave_simpson_storage/blogs/infostor/dave_simpon_storage/post987_37501094375591341.html

"1,030,000 IOPS over a single 10 Gb Ethernet link"

"Specifically, Intel and Microsoft clocked 1,030,000 IOPS (with 512-byte blocks), 
and more than 2,250MBps with large block sizes (16KB to 256KB) using the Iometer benchmark"


So.. who wants to beat that using Linux + open-iscsi? :)


Personally, I don't like such tests and don't trust them at all. They 
are pure marketing. Their only goal is to create the impression that X 
(Microsoft and Windows in this case) is far ahead of the rest of the 
world. I've seen a good article on the Web about the usual tricks 
vendors use to game benchmarks for marketing material, but, 
unfortunately, I can't find the link at the moment.


The problem is that such tests can't tell you whether X will also be 
"ahead of the world" in real-life usage, because they are always heavily 
optimized for the particular benchmark used, and such optimizations 
almost always hurt real-life cases. You will also rarely find 
descriptions of those optimizations, or a scientific description of the 
tests themselves; the results are published practically only in 
marketing documents.


Anyway, as far as I can see, Linux supports all the hardware used as 
well as all of its advanced performance modes, so anyone repeating this 
test on the same setup should get results that are no worse.


Personally, I found it funny to see MS claim in the WinHEC presentation 
(http://download.microsoft.com/download/5/E/6/5E66B27B-988B-4F50-AF3A-C2FF1E62180F/COR-T586_WH08.pptx) 
that they got 1.1 GB/s from 4 connections. At the beginning of 2008 I 
saw a *single* dd pushing data at that rate over a *single* connection 
from a Linux initiator to an iSCSI-SCST target using regular Myricom 
hardware without any special acceleration. I didn't realize how proud of 
Linux I should have been :).


Vlad




Re: Over one million IOPS using software iSCSI and 10 Gbit Ethernet

2010-02-05 Thread Vladislav Bolkhovitin

Joe Landman, on 01/28/2010 06:01 PM wrote:

Pasi Kärkkäinen wrote:

Hello list,

Please check these news items:
http://blog.fosketts.net/2010/01/14/microsoft-intel-push-million-iscsi-iops/
http://communities.intel.com/community/openportit/server/blog/2010/01/19/100-iops-with-iscsi--thats-not-a-typo
http://www.infostor.com/index/blogs_new/dave_simpson_storage/blogs/infostor/dave_simpon_storage/post987_37501094375591341.html

"1,030,000 IOPS over a single 10 Gb Ethernet link"


This is less than 1us per IOP.  Interesting.  Their hardware may not 
actually support this.  10GbE typically is 7-10us, though ConnectX and 
some others get down to 2ish.


You can't derive any latency figure for this test as 1/IOPS. Otherwise 
you could "measure" that a satellite link with half a second of latency 
has lower latency than your local 100 Mbps network ;).


Each link has two properties: latency and bandwidth. They are 
*independent* (orthogonal); this is fundamental. Knowing one of them 
tells you nothing about the other.


So, if you send a single 1-byte IO at a time, 1/IOPS means latency 
(assuming 1/bandwidth << 1/IOPS; otherwise latency is 1/IOPS - 
1/bandwidth). But if you fully fill the link, i.e. keep at least 
latency*bandwidth of data in flight, then with 1-byte IOs 1/IOPS means 
the time to transfer each IO, i.e. 1/bandwidth.
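
A back-of-the-envelope illustration (my own numbers, ignoring protocol 
overhead): on a fully filled 10 GbE link moving 512-byte IOs, 1/IOPS is 
fixed by the bandwidth alone:

    # ~1.25 GB/s of raw 10 GbE bandwidth divided into 512-byte IOs; the
    # result is the same whether the round trip takes 50 us or 500 ms.
    $ echo $((10 * 1000 * 1000 * 1000 / 8 / 512))
    2441406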


Hence, tests that fully fill the link, as in this case, are 
latency-insensitive and only measure bandwidth, so you can't infer any 
latency from them. You will get the same IOPS results (millions, 
possibly) for any link latency, even if it's hundreds of seconds.


The initiator and target systems can also be considered "links" with the 
same two independent parameters: latency (time to process each IO) and 
bandwidth (how many IOs can be processed at a time), where bandwidth 
does not simply scale with the number of CPUs, because SMP systems don't 
scale linearly. Plus, in this test there are many targets working in 
parallel.


Thus, in this test IOs were sent through many links, including the above 
"links", and fully filled them. So for this test 1/IOPS is close to 
IO_size / worst_link_bandwidth.


Vlad




Re: Over one million IOPS using software iSCSI and 10 Gbit Ethernet

2010-02-05 Thread Vladislav Bolkhovitin

Nicholas A. Bellinger, on 01/29/2010 07:25 PM wrote:

On Thu, 2010-01-28 at 20:45 +0200, Pasi Kärkkäinen wrote:

On Thu, Jan 28, 2010 at 07:38:28PM +0100, Bart Van Assche wrote:

On Thu, Jan 28, 2010 at 4:01 PM, Joe Landman
 wrote:

Pasi Kärkkäinen wrote:

Please check these news items:

http://blog.fosketts.net/2010/01/14/microsoft-intel-push-million-iscsi-iops/

http://communities.intel.com/community/openportit/server/blog/2010/01/19/100-iops-with-iscsi--thats-not-a-typo

http://www.infostor.com/index/blogs_new/dave_simpson_storage/blogs/infostor/dave_simpon_storage/post987_37501094375591341.html

"1,030,000 IOPS over a single 10 Gb Ethernet link"

This is less than 1us per IOP.  Interesting.  Their hardware may not
actually support this.  10GbE typically is 7-10us, though ConnectX and some
others get down to 2ish.

Which I/O depth has been used in the test ? Latency matters most with
an I/O depth of one and is almost irrelevant for high I/O depth
values.


iirc outstanding I/Os was 20 in that benchmark.


Also of interest, according to the following link

http://gestaltit.com/featured/top/stephen/wirespeed-10-gb-iscsi/

is that I/Os are being multiplexed across multiple TCP connections using
RFC-3720 defined Multiple Connection per Session (MC/S) logic between
the MSFT Initiator Nehalem machine and Netapp Target array:

"The configuration tested (on the initiator side) was an IBM x3550 with
dual 2 GHz CPUs, 4 GB of RAM, and an Intel 82598 adapter. This is not a
special server – in fact, it’s pretty low-end! The connection was tuned
with RSS, NetDMA, LRO, LSO, and jumbo frames and maxed out over 4 MCS
connections per second. I’m not sure what kind of access they were doing
(I’ll ask Suzanne), but it’s pretty impressive that the NetApp Filer
could push 1,174 megabytes per second!'

It just goes to show that software iSCSI MC/S can really scale to some
very impressive results with enough x86_64 horsepower behind it..


Well, if MC/S scales better than MPIO on Windows in random IO tests, as 
seen in the WinHEC presentation, that must mean Windows MPIO has serious 
scalability problems. It would explain well why Microsoft is the only OS 
vendor pushing an MC/S-capable initiator. I'm sure that if they fix 
those scalability problems in the next version, it will be presented as 
a great achievement ;). That doesn't mean Linux necessarily has the same 
problems.


(MC/S requires serializing all commands across all connections according 
to their command SNs, even when preserving the command delivery order 
isn't needed. This is a known scalability limitation. MPIO doesn't 
require any such command serialization, hence doesn't have this 
limitation. The difference shows up especially in high-IOPS tests. With 
MPIO you can pin each IO thread to a particular CPU and have the network 
hardware deliver the data for the corresponding connection to that CPU. 
This allows the best use of the CPU caches and avoids "cache ping-pong" 
between CPUs. With MC/S such a setup isn't possible, because all 
commands from all connections must pass through a single serialization 
point.)
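
As an illustration of that MPIO-side tuning (the IRQ number, CPU mask, 
device and workload below are made-up placeholders; real setups would 
normally use the NIC's RSS/flow steering features for the same effect):

    # Pin one IO stream to CPU 2
    taskset -c 2 dd if=/dev/sdb of=/dev/null bs=4k iflag=direct &
    # Steer the interrupts of the NIC queue carrying that connection to
    # the same CPU (mask 0x4 = CPU 2); "42" is a made-up IRQ number
    echo 4 > /proc/irq/42/smp_affinity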


Vlad




Re: Over one million IOPS using software iSCSI and 10 Gbit Ethernet

2010-02-09 Thread Vladislav Bolkhovitin

Pasi Kärkkäinen, on 02/08/2010 02:58 PM wrote:

On Fri, Feb 05, 2010 at 02:10:32PM +0300, Vladislav Bolkhovitin wrote:

Pasi Kärkkäinen, on 01/28/2010 03:36 PM wrote:

Hello list,

Please check these news items:
http://blog.fosketts.net/2010/01/14/microsoft-intel-push-million-iscsi-iops/
http://communities.intel.com/community/openportit/server/blog/2010/01/19/100-iops-with-iscsi--thats-not-a-typo
http://www.infostor.com/index/blogs_new/dave_simpson_storage/blogs/infostor/dave_simpon_storage/post987_37501094375591341.html

"1,030,000 IOPS over a single 10 Gb Ethernet link"

"Specifically, Intel and Microsoft clocked 1,030,000 IOPS (with 
512-byte blocks), and more than 2,250MBps with large block sizes (16KB 
to 256KB) using the Iometer benchmark"


So.. who wants to beat that using Linux + open-iscsi? :)
I personally, don't like such tests and don't trust them at all. They  
are pure marketing. The only goal of them is to create impression that X  
(Microsoft and Windows in this case) is a super-puper ahead of the  
world. I've seen on the Web a good article about usual tricks used by  
vendors to cheat benchmarks to get good marketing material, but,  
unfortunately, can't find link on it at the moment.


The problem is that you can't say from such tests if X will also "ahead  
of the world" on real life usages, because such tests always heavily  
optimized for particular used benchmarks and such optimizations almost  
always hurt real life cases. And you hardly find descriptions of those  
optimizations as well as a scientific description of the tests themself.  
The results published practically only in marketing documents.


Anyway, as far as I can see Linux supports all the used hardware as well  
as all advance performance modes of it, so if one repeats this test in  
the same setup, he/she should get not worse results.


For me personally it was funny to see how MS presents in the WinHEC  
presentation  
(http://download.microsoft.com/download/5/E/6/5E66B27B-988B-4F50-AF3A-C2FF1E62180F/COR-T586_WH08.pptx) 
that they have 1.1GB/s from 4 connections. In the beginning of 2008 I  
saw a *single* dd pushing data on that rate over a *single* connection  
from Linux initiator to iSCSI-SCST target using regular Myricom hardware  
without any special acceleration. I didn't know how proud I must have  
been for Linux :).




Hehe, congrats :)

Did you ever benchmark/measure what kind of IOPS numbers you can get? 


No. The task I was solving was getting maximum sequential throughput 
from a single initiator, target, link and connection, with HDD RAID6 
backstorage on the target.


Vlad




Re: Over one million IOPS using software iSCSI and 10 Gbit Ethernet

2010-04-13 Thread Vladislav Bolkhovitin

Pasi Kärkkäinen, on 04/12/2010 11:54 PM wrote:

On Fri, Feb 05, 2010 at 02:10:32PM +0300, Vladislav Bolkhovitin wrote:
For me personally it was funny to see how MS presents in the WinHEC  
presentation  
(http://download.microsoft.com/download/5/E/6/5E66B27B-988B-4F50-AF3A-C2FF1E62180F/COR-T586_WH08.pptx) 
that they have 1.1GB/s from 4 connections. In the beginning of 2008 I  
saw a *single* dd pushing data on that rate over a *single* connection  
from Linux initiator to iSCSI-SCST target using regular Myricom hardware  
without any special acceleration. I didn't know how proud I must have  
been for Linux :).




Btw was this over 10 Gig Ethernet? 


Did you have to tweak something special to achieve this, either on the 
initiator,
or on the target? 


Nothing special. It was for plain dd writes.

Vlad




Re: iscsi performance via 10 Gig

2010-05-26 Thread Vladislav Bolkhovitin

Boaz Harrosh, on 05/26/2010 10:45 PM wrote:

On 05/26/2010 09:42 PM, Vladislav Bolkhovitin wrote:

Taylor, on 05/26/2010 09:32 PM wrote:

I'm curious what kind of performance numbers people can get from their
iscsi setup, specifically via 10 Gig.

We are running with Linux servers connected to Dell Equallogic 10 Gig
arrays on Suse.

Recently we were running under SLES 11, and with multipath were seeing
about 2.5 Gig per NIC, or 5.0 Gbit/sec total IO throughput, but we
were getting a large number of iscsi connection errors.  We are using
10 Gig NICs with jumbo frames.

We reimaged the server to OpenSuse, same hardware and configs
otherwise, and since then we are getting about half, or 1.2 to 1.3
Gbit per NIC, or 2.5 to 3.0 Gbit total IO throughput, but we've not
had any iscsi connection errors.

What are other people seeing?  Doesn't need to be an equallogic, just
any 10 Gig connection to an iscsi array and single host throughput
numbers.
ISCSI-SCST/open-iscsi on a decent hardware can fully saturate 10GbE 
link. On writes even with a single stream, i.e. something like a single 
dd writing data to a single device.




Off topic question:
That's a fast disk. A sata HD? the best I got for single sata was like
90 MB/s. Did you mean a RAM device of sorts.


The single-stream figures came from both a SAS RAID and RAMFS. The 
multi-stream figures came from RAMFS, because I don't have any reports 
of iSCSI-SCST tests on sufficiently fast SSDs.


Vlad




Re: iscsi performance via 10 Gig

2010-05-26 Thread Vladislav Bolkhovitin

Taylor, on 05/26/2010 09:32 PM wrote:

I'm curious what kind of performance numbers people can get from their
iscsi setup, specifically via 10 Gig.

We are running with Linux servers connected to Dell Equallogic 10 Gig
arrays on Suse.

Recently we were running under SLES 11, and with multipath were seeing
about 2.5 Gig per NIC, or 5.0 Gbit/sec total IO throughput, but we
were getting a large number of iscsi connection errors.  We are using
10 Gig NICs with jumbo frames.

We reimaged the server to OpenSuse, same hardware and configs
otherwise, and since then we are getting about half, or 1.2 to 1.3
Gbit per NIC, or 2.5 to 3.0 Gbit total IO throughput, but we've not
had any iscsi connection errors.

What are other people seeing?  Doesn't need to be an equallogic, just
any 10 Gig connection to an iscsi array and single host throughput
numbers.


iSCSI-SCST/open-iscsi on decent hardware can fully saturate a 10 GbE 
link; for writes, even with a single stream, i.e. something like a 
single dd writing data to a single device.
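
The kind of single-stream write test meant here looks roughly like this 
(device name and size are placeholders; direct IO keeps the initiator's 
page cache out of the measurement):

    # One sequential write stream to the iSCSI-attached device
    dd if=/dev/zero of=/dev/sdX bs=1M count=16384 oflag=direct

On a 10 GbE link, a sustained rate around 1.1-1.2 GB/s roughly 
corresponds to link saturation.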


Vlad






Re: iscsi performance via 10 Gig

2010-05-27 Thread Vladislav Bolkhovitin

Boaz Harrosh, on 05/26/2010 10:58 PM wrote:

On 05/26/2010 09:52 PM, Vladislav Bolkhovitin wrote:

Boaz Harrosh, on 05/26/2010 10:45 PM wrote:

On 05/26/2010 09:42 PM, Vladislav Bolkhovitin wrote:

Taylor, on 05/26/2010 09:32 PM wrote:

I'm curious what kind of performance numbers people can get from their
iscsi setup, specifically via 10 Gig.

We are running with Linux servers connected to Dell Equallogic 10 Gig
arrays on Suse.

Recently we were running under SLES 11, and with multipath were seeing
about 2.5 Gig per NIC, or 5.0 Gbit/sec total IO throughput, but we
were getting a large number of iscsi connection errors.  We are using
10 Gig NICs with jumbo frames.

We reimaged the server to OpenSuse, same hardware and configs
otherwise, and since then we are getting about half, or 1.2 to 1.3
Gbit per NIC, or 2.5 to 3.0 Gbit total IO throughput, but we've not
had any iscsi connection errors.

What are other people seeing?  Doesn't need to be an equallogic, just
any 10 Gig connection to an iscsi array and single host throughput
numbers.
ISCSI-SCST/open-iscsi on a decent hardware can fully saturate 10GbE 
link. On writes even with a single stream, i.e. something like a single 
dd writing data to a single device.



Off topic question:
That's a fast disk. A sata HD? the best I got for single sata was like
90 MB/s. Did you mean a RAM device of sorts.
The single stream data were both from a SAS RAID and RAMFS. The 
multi-stream data were from RAMFS, because I don't have any reports 
about any tests of iSCSI-SCST on fast enough SSDs.




Right thanks. So the SAS RAID had what? like 12-15 spindles?


If I remember correctly, it was 10 spindles, each capable of 150+ MB/s. 
The RAID was MD RAID0.


Vlad




Re: mc/s - not yet in open-iscsi?

2010-06-10 Thread Vladislav Bolkhovitin

Christopher Barry, on 06/10/2010 03:09 AM wrote:

Greetings everyone,

Had a question about implementing mc/s using open-iscsi today. Wasn't
really sure exactly what it was. From googling about, I can't find any
references of people doing it with open-iscsi, although I see a few
references to people asking about it. Anyone know the status on that?


See http://scst.sourceforge.net/mc_s.html. In short, there's nothing in 
it worth the implementation and maintenance effort.


Vlad




Re: Over one million IOPS using software iSCSI and 10 Gbit Ethernet, 1.25 million IOPS update

2010-06-14 Thread Vladislav Bolkhovitin

Pasi Kärkkäinen, on 06/11/2010 11:26 AM wrote:

On Fri, Feb 05, 2010 at 02:10:32PM +0300, Vladislav Bolkhovitin wrote:

Pasi Kärkkäinen, on 01/28/2010 03:36 PM wrote:

Hello list,

Please check these news items:
http://blog.fosketts.net/2010/01/14/microsoft-intel-push-million-iscsi-iops/
http://communities.intel.com/community/openportit/server/blog/2010/01/19/100-iops-with-iscsi--thats-not-a-typo
http://www.infostor.com/index/blogs_new/dave_simpson_storage/blogs/infostor/dave_simpon_storage/post987_37501094375591341.html

"1,030,000 IOPS over a single 10 Gb Ethernet link"

"Specifically, Intel and Microsoft clocked 1,030,000 IOPS (with 
512-byte blocks), and more than 2,250MBps with large block sizes (16KB 
to 256KB) using the Iometer benchmark"


So.. who wants to beat that using Linux + open-iscsi? :)
I personally, don't like such tests and don't trust them at all. They  
are pure marketing. The only goal of them is to create impression that X  
(Microsoft and Windows in this case) is a super-puper ahead of the  
world. I've seen on the Web a good article about usual tricks used by  
vendors to cheat benchmarks to get good marketing material, but,  
unfortunately, can't find link on it at the moment.


The problem is that you can't say from such tests if X will also "ahead  
of the world" on real life usages, because such tests always heavily  
optimized for particular used benchmarks and such optimizations almost  
always hurt real life cases. And you hardly find descriptions of those  
optimizations as well as a scientific description of the tests themself.  
The results published practically only in marketing documents.


Anyway, as far as I can see Linux supports all the used hardware as well  
as all advance performance modes of it, so if one repeats this test in  
the same setup, he/she should get not worse results.


For me personally it was funny to see how MS presents in the WinHEC  
presentation  
(http://download.microsoft.com/download/5/E/6/5E66B27B-988B-4F50-AF3A-C2FF1E62180F/COR-T586_WH08.pptx) 
that they have 1.1GB/s from 4 connections. In the beginning of 2008 I  
saw a *single* dd pushing data on that rate over a *single* connection  
from Linux initiator to iSCSI-SCST target using regular Myricom hardware  
without any special acceleration. I didn't know how proud I must have  
been for Linux :).




It seems they've described the setup here:
http://communities.intel.com/community/wired/blog/2010/04/20/1-million-iop-article-explained

And today they seem to have a demo which produces 1.3 million IOPS!

"1 Million IOPS? How about 1.25 Million!":
http://communities.intel.com/community/wired/blog/2010/04/22/1-million-iops-how-about-125-million


I'm glad for them. The only thing that surprises me is that none of the 
Linux vendors, including Intel itself, is interested in repeating this 
test on Linux and fixing whatever problems turn up, if any. Ten years 
ago a similar test comparing Linux TCP scalability limitations with 
Windows caused a massive reaction and great TCP improvements.


How to do the test is quite straightforward, starting with writing a 
test tool for Linux as efficient as IOMeter is on Windows [1]. Maybe the 
lack of such a tool scares the vendors away?


Vlad

[1] None of the performance measurement tools for Linux I've seen so 
far, including disktest (although I haven't looked at versions from the 
last 1-1.5 years) and fio, has satisfied me, for various reasons.





Re: mc/s - not yet in open-iscsi?

2010-06-14 Thread Vladislav Bolkhovitin

Nicholas A. Bellinger, on 06/11/2010 12:45 AM wrote:

On Thu, 2010-06-10 at 13:35 -0700, Nicholas A. Bellinger wrote:

On Thu, 2010-06-10 at 13:36 +0400, Vladislav Bolkhovitin wrote:

Christopher Barry, on 06/10/2010 03:09 AM wrote:

Greetings everyone,

Had a question about implementing mc/s using open-iscsi today. Wasn't
really sure exactly what it was. From googling about, I can't find any
references of people doing it with open-iscsi, although I see a few
references to people asking about it. Anyone know the status on that?
http://scst.sourceforge.net/mc_s.html. In short, there's no point in it 
worth implementation and maintenance effort.

Heh, this URL is a bunch of b*llshit handwaving because the iscsi-scst
target does not support the complete set of features defined by
RFC-3720, namely MC/S and ErrorRecoveryLevel=2, let alone asymmeteric
logical unit access (ALUA) MPIO.   Vlad, if you are so sure that MC/S is
so awful, why don't you put your money where your mouth is and start
asking these questions on the IETF IPS list and see what Julian Satran
(the RFC editor) has to say about them..?  


H...?



Btw, just for those following along, here is what MC/S and ERL=2 when
used in combination (yes, they are complementary) really do:

http://linux-iscsi.org/builds/user/nab/Inter.vs.OuterNexus.Multiplexing.pdf

Also, I should mention in all fairness that my team was the first to
implement both a Target and Initiator capable of MC/S and
ErrorRecoveryLevel=2 running on Linux, and the first target capable of
running MC/S from multiple initiator implementations.

Unfortuately Vlad has never implemented any of these features in either
a target or initiator, so really he is not in a position to say what is
'good' or what is 'bad' about MC/S.


Another personal attack and misleading (read: deceiving) half-truth? My 
article is a technical article, so if you see anything wrong in it, you 
are welcome to point it out and correct me. But instead you prefer 
personal attacks.


If you want to call me an ignorant idiot who talks about things he 
completely doesn't understand, don't forget to say the same about the 
Linux SCSI maintainers, who dislike MC/S for the same reasons (I've 
basically just elaborated on them) and who also haven't implemented MC/S 
anywhere (although I almost did in iSCSI-SCST, but stopped in time). One 
doesn't have to jump out of a fifth-floor window to know the 
consequences of that move. (Incidentally, if I'm not in a position to 
say anything about MC/S, who is?)


What's funny is that your link basically says the same thing as my 
article, and rather supports it.


Regarding your team being "the first", don't forget to also mention that 
your implementation was later rejected by the Linux community, and 
open-iscsi was preferred instead.


Vlad




Re: mc/s - not yet in open-iscsi?

2010-06-14 Thread Vladislav Bolkhovitin

Raj, on 06/12/2010 03:17 AM wrote:

Nicholas A. Bellinger  writes:


Btw, just for those following along, here is what MC/S and ERL=2 when
used in combination (yes, they are complementary) really do:

http://linux-iscsi.org/builds/user/nab/Inter.vs.OuterNexus.Multiplexing.pdf

Also, I should mention in all fairness that my team was the first to
implement both a Target and Initiator capable of MC/S and
ErrorRecoveryLevel=2 running on Linux, and the first target capable of
running MC/S from multiple initiator implementations.



But the end result is what? open-iSCSI still doesn't have the MC/S even though 
it is useful? 


The end result is that any driver-level multipath, including MC/S, is 
forbidden in Linux, to encourage developers and vendors to improve MPIO 
rather than work around its problems with home-brewed multipath 
solutions [1]. As a result of this very smart policy, Linux MPIO is in 
very good shape now. In particular, it scales quite well with more 
links. In contrast, according to Microsoft's data linked on this list 
recently, Windows MPIO scales quite badly, while Linux MPIO scales about 
as well as Windows MC/S does [2]. (BTW, this is good evidence that MC/S 
has no inherent performance advantage over MPIO.)


But this is a Linux list, so we don't care about Windows' problems. 
Everybody is encouraged to use MPIO and, if they have any problems with 
it, to report them on the appropriate mailing lists.


Vlad

[1] Yes, MC/S is just a workaround, apparently introduced by the IETF 
committee to eliminate the multipath problems they saw in SCSI inside 
their _own_ protocol, instead of pushing the T10 committee to make the 
necessary changes in SAM. Or perhaps because of a lack of acceptance of 
those problems by the T10 committee. But I'm not familiar with history 
that deep, so I can only speculate.


[2] The Windows MPIO limitations may well explain why Microsoft is the 
only OS vendor pushing MC/S: for them it's simpler to implement MC/S 
than to fix those MPIO scalability problems. Additionally, it could have 
future marketing value: improved MPIO scalability would be a big selling 
point to push customers to migrate to the next Windows version. But this 
is, again, just vague speculation. We will see.





Re: mc/s - not yet in open-iscsi?

2010-06-14 Thread Vladislav Bolkhovitin

Nicholas A. Bellinger, on 06/12/2010 07:22 AM wrote:

On Fri, 2010-06-11 at 23:17 +, Raj wrote:

Nicholas A. Bellinger  writes:


Btw, just for those following along, here is what MC/S and ERL=2 when
used in combination (yes, they are complementary) really do:

http://linux-iscsi.org/builds/user/nab/Inter.vs.OuterNexus.Multiplexing.pdf

Also, I should mention in all fairness that my team was the first to
implement both a Target and Initiator capable of MC/S and
ErrorRecoveryLevel=2 running on Linux, and the first target capable of
running MC/S from multiple initiator implementations.

But the end result is what? open-iSCSI still doesn't have the MC/S even though 
it is useful? 


So without going into a multi-year history lesson as to why MC/S is not
currently supported in Open-iSCSI, what it boils down to is this:

MC/S (or InterNexus multiplexing)


Not quite right: MC/S is "InterConnection" multiplexing inside a single 
nexus (session).


Vlad




Re: Over one million IOPS using software iSCSI and 10 Gbit Ethernet, 1.25 million IOPS update

2010-06-15 Thread Vladislav Bolkhovitin

guy keren, on 06/15/2010 01:46 AM wrote:

Vladislav Bolkhovitin wrote:

Pasi Kärkkäinen, on 06/11/2010 11:26 AM wrote:

On Fri, Feb 05, 2010 at 02:10:32PM +0300, Vladislav Bolkhovitin wrote:

Pasi Kärkkäinen, on 01/28/2010 03:36 PM wrote:

Hello list,

Please check these news items:
http://blog.fosketts.net/2010/01/14/microsoft-intel-push-million-iscsi-iops/ 

http://communities.intel.com/community/openportit/server/blog/2010/01/19/100-iops-with-iscsi--thats-not-a-typo 

http://www.infostor.com/index/blogs_new/dave_simpson_storage/blogs/infostor/dave_simpon_storage/post987_37501094375591341.html 



"1,030,000 IOPS over a single 10 Gb Ethernet link"

"Specifically, Intel and Microsoft clocked 1,030,000 IOPS (with 
512-byte blocks), and more than 2,250MBps with large block sizes 
(16KB to 256KB) using the Iometer benchmark"


So.. who wants to beat that using Linux + open-iscsi? :)
I personally, don't like such tests and don't trust them at all. 
They  are pure marketing. The only goal of them is to create 
impression that X  (Microsoft and Windows in this case) is a 
super-puper ahead of the  world. I've seen on the Web a good article 
about usual tricks used by  vendors to cheat benchmarks to get good 
marketing material, but,  unfortunately, can't find link on it at the 
moment.


The problem is that you can't say from such tests if X will also 
"ahead  of the world" on real life usages, because such tests always 
heavily  optimized for particular used benchmarks and such 
optimizations almost  always hurt real life cases. And you hardly 
find descriptions of those  optimizations as well as a scientific 
description of the tests themself.  The results published practically 
only in marketing documents.


Anyway, as far as I can see Linux supports all the used hardware as 
well  as all advance performance modes of it, so if one repeats this 
test in  the same setup, he/she should get not worse results.


For me personally it was funny to see how MS presents in the WinHEC  
presentation  
(http://download.microsoft.com/download/5/E/6/5E66B27B-988B-4F50-AF3A-C2FF1E62180F/COR-T586_WH08.pptx) 
that they have 1.1GB/s from 4 connections. In the beginning of 2008 
I  saw a *single* dd pushing data on that rate over a *single* 
connection  from Linux initiator to iSCSI-SCST target using regular 
Myricom hardware  without any special acceleration. I didn't know how 
proud I must have  been for Linux :).



It seems they've described the setup here:
http://communities.intel.com/community/wired/blog/2010/04/20/1-million-iop-article-explained 



And today they seem to have a demo which produces 1.3 million IOPS!

"1 Million IOPS? How about 1.25 Million!":
http://communities.intel.com/community/wired/blog/2010/04/22/1-million-iops-how-about-125-million 

I'm glad for them. The only thing surprises me that none of the Linux 
vendors, including Intel itself, interested to repeat this test for 
Linux and fix possible found problems, if any. Ten years ago similar 
test about Linux TCP scalability limitations comparing with Windows 
caused massive reaction and great TCP improvements.


The way how to do the test is quite straightforward, starting from 
making for Linux similarly effective test tool as IOMeter on Windows 
[1]. Maybe, the lack of such tool scares the vendors away?


Vlad

[1] None of the performance measurement tools for Linux I've seen so 
far, including disktest (although I've not looked at newer (1-1.5 years) 
versions) and fio satisfied me for various reasons.


there's an iometer agent (dynamo) for linux (but the official version 
has one fundamental flaw which should be fixed - it doesn't use AIO 
properly) - you just need a windows desktop to launch the test, and run 
the dynamo agent on a linux machine.


there is also vdbench from sun.


vdbench is Java-based, so it isn't suitable for this particular case, 
where low CPU overhead is key.



--guy






Re: Over one million IOPS using software iSCSI and 10 Gbit Ethernet, 1.25 million IOPS update

2010-06-22 Thread Vladislav Bolkhovitin

Pasi Kärkkäinen, on 06/18/2010 12:23 PM wrote:
[1] None of the performance measurement tools for Linux I've seen so  
far, including disktest (although I've not looked at newer (1-1.5 years)  
versions) and fio satisfied me for various reasons.


What's missing from ltp disktest?


It can't do the I/O pattern needed for this test: asynchronous 512-byte 
sequential zero-copy SG/BSG in FIFO order. Also, I have some questions 
about how it calculates results and how effectively it implements 
multithreading, so it would need to be audited to clear those questions 
up.


Vlad




Re: mc/s - not yet in open-iscsi?

2010-06-22 Thread Vladislav Bolkhovitin

Nicholas A. Bellinger, on 06/15/2010 04:46 AM wrote:
 As the result of this very smart policy, Linux 
MPIO is in a very good shape now. Particularly, it scales with more 
links quite well. In contrast, according to Microsoft's data linked in 
this list recently, Windows MPIO scales quite badly, but Linux MPIO 
scales similarly as Windows MC/S does [2]. (BTW, this is a good evidence 
that MC/S doesn't have any inherent performance advantage over MPIO.)




Then why can't you produce any numbers for Linux or MSFT, hmmm..?


I wonder, have you actually read my article, or are you just blindly 
rejecting it because you have put too much effort into MC/S to accept 
it? If you read it, you will find links to the measurements in it.



Just as a matter of record, back in 2005 it was proved that Linux/iSCSI
running with both MC/S *and* MPIO where complementary and improved
throughput by ~1 Gb/sec using the 1st generation (single core) AMD
Operton x86_64 chips on PCI-X 133 Mhz 10 Gb/sec with stateless TCP
offload:

http://www.mail-archive.com/linux-s...@vger.kernel.org/msg02225.html


Your measurements did not prove anything, because you didn't find the 
exact cause of the effect you saw. There could have been too many 
possible causes, from a deeper queue depth with more connections to 
problem(s) in your initiator implementation making it perform worse with 
a single connection, with none of them directly related to MC/S vs. 
MPIO. It's just sane engineering practice to find the exact cause before 
making any conclusions and declarations. Otherwise, such results are 
only good for helping to sell your stuff.


Actually, you are illustrating why the decision to forbid any 
driver-level multipath in the kernel is so wise. It's too easy for 
driver-level multipath developers to claim that the common MPIO is 
fundamentally flawed instead of improving it.



Just because you have not done the work yourself to implement the
interesting RFC-3720 features does not mean you get to dictate (or
dictate to others on this list) what the future of Linux/iSCSI will be.


Dictating? So far I have only been analyzing the current state of 
affairs. You don't have to agree.


[1] Yes, MC/S is just a workaround apparently introduced by IETF 
committee to eliminate multipath problems they see in SCSI inside their 
_own_ protocol instead of to push T10 committee to make the necessary 
changes in SAM. Or, because a lack of acceptance of those problem from 
T10 committee. But I'm not familiar with so deep history, so can only 
speculate about it.


Complete and utterly wrong.  You might want to check your copy of SAM,
because a SCSI fabric is allowed to have multipath communication paths
and ports as long as the ordering is enforced for the I_T Nexus at the
Target port.


Would you mind pointing me to the _exact_ chapters in SAM that describe 
facilities to (1) reassign tasks between different I_T nexuses and (2) 
maintain the order of commands across different I_T nexuses?


Vlad




Re: Over one million IOPS using software iSCSI and 10 Gbit Ethernet, 1.25 million IOPS update

2010-06-22 Thread Vladislav Bolkhovitin

Vladislav Bolkhovitin, on 06/22/2010 11:04 PM wrote:

Pasi Kärkkäinen, on 06/18/2010 12:23 PM wrote:
[1] None of the performance measurement tools for Linux I've seen so  
far, including disktest (although I've not looked at newer (1-1.5 years)  
versions) and fio satisfied me for various reasons.

What's missing from ltp disktest?


It can't do the I/O method needed in this test: async 512b sequential 
zero copy SG/BSG in FIFO order + I have some questions to how it 
calculates results and how effectively it implements multithreading, so 
it should be audited to clear those questions.


I mean, before I can trust it, I need to audit how it works to make sure 
it does everything correctly. Otherwise it would be hard to narrow down 
the exact bottleneck of the test.


Vlad




Re: Expose Generic SCSI Device over iSCSI (1x8 G2 Autoloader from HP)

2010-10-08 Thread Vladislav Bolkhovitin
Raimund Sacherer, on 10/08/2010 03:55 PM wrote:
> Hello,
> 
> 
> is it possible to expose an HP 1x8 G2 Autoloader with is connected
> with SCSI over iSCSI to another Server?
> 
> 
> Our problem is that the HP ProLiant DL380 G3 does not support the
> autoloader on it's external SCSI port (due to the problem that the
> SmartArray 5i only supports one LUN and the Autloader needs 2). So if
> it would be possible to expose the autoloader from another server
> which has the proper SCSI support (but we can not use to install the
> backup software) via iSCSI and connect to it from the DL380 we could
> spare some money for an additional SCSI Card 

Try the iSCSI-SCST target (http://iscsi-scst.sourceforge.net/) in
pass-through mode.

Vlad




Re: on-off periodic hangs in scsi commands

2011-01-06 Thread Vladislav Bolkhovitin
Spelic, on 01/05/2011 09:56 PM wrote:
> I am cc-ing the open-iscsi and scst people to see if they can suggest a 
> way to debug the iscsi one. In particular it would be very interesting 
> to get the number of commands received and replied from target side, and 
> match it to the number of commands submitted from initiator side. (but I 
> suspect these status variables do not exist) If target side says it has 
> already replied, but initiator side sees the reply after 1 second... 
> that would be very meaningful. The bad thing is that I do not have much 
> time to debug this :-(

For SCST you can enable "scsi" logging. Then you will see all incoming
SCSI commands and the responses to them in the kernel log. To handle the
large volume of logs this facility can produce, you will need to switch
your logging daemon (syslogd?) to async mode (no fsync after each logged
line).
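
A sketch of both sides of that (the trace_level path is from the 
/proc-based SCST versions' README, so adjust to your release; the log 
file name is a placeholder, and the leading "-" in classic 
syslogd/rsyslog syntax means "don't fsync after each line"):

    # On the target: add "scsi" to the active SCST trace levels
    echo "add scsi" > /proc/scsi_tgt/trace_level

    # In /etc/syslog.conf (or rsyslog.conf): log kernel messages async
    kern.*          -/var/log/kern.log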

Vlad




Full duplex

2011-09-01 Thread Vladislav Bolkhovitin
Hi,

I've done some tests, and it looks like open-iscsi doesn't reach full
duplex speed on bidirectional data transfers from a single drive.

My test is simple: two dd's doing big transfers in parallel over a 1 GbE
link against a ramdisk or nullio iSCSI device, one dd reading and the
other writing, while I watch throughput with vmstat. When either dd runs
alone, I get full single-direction link utilization (~120 MB/s) in either
direction, but when both transfers run in parallel, the throughput of
each immediately drops by a factor of 2 to 55-60 MB/s (the sum stays the
same 120 MB/s).
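
Roughly, the test looks like this (the device name is a placeholder):

    # Read stream and write stream against the same iSCSI device, in parallel
    dd if=/dev/sdX of=/dev/null bs=1M iflag=direct &
    dd if=/dev/zero of=/dev/sdX bs=1M oflag=direct &
    vmstat 1    # the bi/bo columns show per-direction throughput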

To be sure, I tested the bidirectional capability of a single TCP
connection, and it does provide nearly a 2x aggregate throughput increase
(~200 MB/s).

Interestingly, running the other direction's transfer against the same
device imported from a second iSCSI target does provide the expected
full-duplex 2x aggregate throughput increase.

I tried several iSCSI targets, and I'm pretty confident that iSCSI-SCST
is capable of full-duplex transfers, but from a look at the open-iscsi
code I can't see a serialization point in it. It looks like open-iscsi
receives and sends data in different threads (the requesting process and
the per-connection iscsi_q_X workqueue, respectively), so it should be
capable of full duplex.

Does anyone have idea what could be the serialization point preventing full 
duplex
speed?

Thanks,
Vlad




Re: Full duplex

2011-09-08 Thread Vladislav Bolkhovitin
Mike Christie, on 09/02/2011 12:15 PM wrote:
> On 09/01/2011 10:04 PM, Vladislav Bolkhovitin wrote:
>> Hi,
>>
>> I've done some tests and looks like open-iscsi doesn't support full duplex 
>> speed
>> on bidirectional data transfers from a single drive.
>>
>> My test is simple: 2 dd's doing big transfers in parallel over 1 GbE link 
>> from a
>> ramdisk or nullio iSCSI device. One dd is reading and another one is 
>> writing. I'm
>> watching throughput using vmstat. When any of the dd's working alone, I have 
>> full
>> single direction link utilization (~120 MB/s) in both directions, but when 
>> both
>> transfers working in parallel, throughput on any of them immediately drops 
>> in 2
>> times to 55-60 MB/s (sum is the same 120 MB/s).
>>
>> For sure, I tested bidirectional possibility of a single TCP connection and 
>> it
>> does provide near 2 times throughput increase (~200 MB/s).
>>
>> Interesting, that doing another direction transfer from the same device 
>> imported
>> from another iSCSI target provides expected full duplex 2x aggregate 
>> throughput
>> increase.
>>
>> I tried several iSCSI targets + I'm pretty confident that iSCSI-SCST is 
>> capable to
>> provide full duplex transfers, but from some look on the open-iscsi code I 
>> can't
>> see the serialization point in it. Looks like open-iscsi receives and sends 
>> data
>> in different threads (the requester process and per connection iscsi_q_X 
>> workqueue
>> correspondingly), so should be capable to have full duplex.
> 
> Yeah, we send from the iscsi_q workqueue and receive from the network
> softirq if the net driver supports NAPI.
> 
>>
>> Does anyone have idea what could be the serialization point preventing full 
>> duplex
>> speed?
>>
> 
> Did you do any lock profiliing and is the session->lock look the
> problem? It is taken in both the receive and xmit paths and also the
> queuecommand path.

Just done it. /proc/lock_stat says that there is no significant contention for
session->lock.
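
For reference, this is roughly how I collected the numbers; a sketch assuming a
kernel built with CONFIG_LOCK_STAT (the /proc paths below are the standard
lock_stat interface):

import time

def poke(path, value):
    # write a value to a procfs control file (needs root)
    with open(path, "w") as f:
        f.write(value + "\n")

poke("/proc/lock_stat", "0")               # clear accumulated statistics
poke("/proc/sys/kernel/lock_stat", "1")    # start collection
time.sleep(60)                             # ... run the bidirectional dd load here ...
poke("/proc/sys/kernel/lock_stat", "0")    # stop collection

# Lines with large "contentions"/"waittime" values point at the hot locks.
with open("/proc/lock_stat") as f:
    for line in f:
        if "session" in line:
            print(line.rstrip())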

On the other hand, session->lock is a spinlock, so if it were the serialization
point we would see big CPU consumption on the initiator, and there is plenty of
free CPU time there.

So there must be some other serialization point.

Thanks,
Vlad






Re: Full duplex

2011-09-10 Thread Vladislav Bolkhovitin
Vladislav Bolkhovitin, on 09/08/2011 09:55 PM wrote:
> Mike Christie, on 09/02/2011 12:15 PM wrote:
>> On 09/01/2011 10:04 PM, Vladislav Bolkhovitin wrote:
>>> Hi,
>>>
>>> I've done some tests and looks like open-iscsi doesn't support full duplex 
>>> speed
>>> on bidirectional data transfers from a single drive.
>>>
>>> My test is simple: 2 dd's doing big transfers in parallel over 1 GbE link 
>>> from a
>>> ramdisk or nullio iSCSI device. One dd is reading and another one is 
>>> writing. I'm
>>> watching throughput using vmstat. When any of the dd's working alone, I 
>>> have full
>>> single direction link utilization (~120 MB/s) in both directions, but when 
>>> both
>>> transfers working in parallel, throughput on any of them immediately drops 
>>> in 2
>>> times to 55-60 MB/s (sum is the same 120 MB/s).
>>>
>>> For sure, I tested bidirectional possibility of a single TCP connection and 
>>> it
>>> does provide near 2 times throughput increase (~200 MB/s).
>>>
>>> Interesting, that doing another direction transfer from the same device 
>>> imported
>>> from another iSCSI target provides expected full duplex 2x aggregate 
>>> throughput
>>> increase.
>>>
>>> I tried several iSCSI targets + I'm pretty confident that iSCSI-SCST is 
>>> capable to
>>> provide full duplex transfers, but from some look on the open-iscsi code I 
>>> can't
>>> see the serialization point in it. Looks like open-iscsi receives and sends 
>>> data
>>> in different threads (the requester process and per connection iscsi_q_X 
>>> workqueue
>>> correspondingly), so should be capable to have full duplex.
>>
>> Yeah, we send from the iscsi_q workqueue and receive from the network
>> softirq if the net driver supports NAPI.
>>>
>>> Does anyone have idea what could be the serialization point preventing full 
>>> duplex
>>> speed?
>>
>> Did you do any lock profiliing and is the session->lock look the
>> problem? It is taken in both the receive and xmit paths and also the
>> queuecommand path.
> 
> Just done it. /proc/lock_stat says that there is no significant contention for
> session->lock.
> 
> From other side, session->lock is a spinlock, so, if it was the serialization
> point, we would see big CPU consumption on the initiator. But we have a 
> plenty of
> CPU time there.
> 
> So, there must be other serialization point.

Update: using sg_dd with blk_sgio=1 (SG_IO) instead of dd, I was able to achieve
a bidirectional speed of 92 MB/s in each direction.
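
The SG_IO variant of the test looked roughly like this (a sketch; /dev/sdX is a
placeholder for the imported iSCSI disk, which gets overwritten, and a 512-byte
logical block size is assumed):

import subprocess

DEV = "/dev/sdX"    # placeholder: the imported iSCSI disk (will be overwritten!)
BLOCKS = "2097152"  # 1 GiB at 512-byte blocks

# Two sg_dd transfers in opposite directions, both driving the device via SG_IO.
reader = subprocess.Popen(
    ["sg_dd", "if=" + DEV, "of=/dev/null", "blk_sgio=1",
     "bs=512", "bpt=2048", "count=" + BLOCKS])
writer = subprocess.Popen(
    ["sg_dd", "if=/dev/zero", "of=" + DEV, "blk_sgio=1",
     "bs=512", "bpt=2048", "count=" + BLOCKS])
reader.wait()
writer.wait()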

Thus, the iSCSI stack itself works as expected, and the serialization point must
be somewhere higher in the block stack. Both buffered and O_DIRECT dd demonstrate
the same serialized behavior described above.

Vlad




Re: Full duplex

2011-09-10 Thread Vladislav Bolkhovitin
Vladislav Bolkhovitin, on 09/10/2011 05:44 PM wrote:
> Vladislav Bolkhovitin, on 09/08/2011 09:55 PM wrote:
>> Mike Christie, on 09/02/2011 12:15 PM wrote:
>>> On 09/01/2011 10:04 PM, Vladislav Bolkhovitin wrote:
>>>> Hi,
>>>>
>>>> I've done some tests and looks like open-iscsi doesn't support full duplex 
>>>> speed
>>>> on bidirectional data transfers from a single drive.
>>>>
>>>> My test is simple: 2 dd's doing big transfers in parallel over 1 GbE link 
>>>> from a
>>>> ramdisk or nullio iSCSI device. One dd is reading and another one is 
>>>> writing. I'm
>>>> watching throughput using vmstat. When any of the dd's working alone, I 
>>>> have full
>>>> single direction link utilization (~120 MB/s) in both directions, but when 
>>>> both
>>>> transfers working in parallel, throughput on any of them immediately drops 
>>>> in 2
>>>> times to 55-60 MB/s (sum is the same 120 MB/s).
>>>>
>>>> For sure, I tested bidirectional possibility of a single TCP connection 
>>>> and it
>>>> does provide near 2 times throughput increase (~200 MB/s).
>>>>
>>>> Interesting, that doing another direction transfer from the same device 
>>>> imported
>>>> from another iSCSI target provides expected full duplex 2x aggregate 
>>>> throughput
>>>> increase.
>>>>
>>>> I tried several iSCSI targets + I'm pretty confident that iSCSI-SCST is 
>>>> capable to
>>>> provide full duplex transfers, but from some look on the open-iscsi code I 
>>>> can't
>>>> see the serialization point in it. Looks like open-iscsi receives and 
>>>> sends data
>>>> in different threads (the requester process and per connection iscsi_q_X 
>>>> workqueue
>>>> correspondingly), so should be capable to have full duplex.
>>>
>>> Yeah, we send from the iscsi_q workqueue and receive from the network
>>> softirq if the net driver supports NAPI.
>>>>
>>>> Does anyone have idea what could be the serialization point preventing 
>>>> full duplex
>>>> speed?
>>>
>>> Did you do any lock profiliing and is the session->lock look the
>>> problem? It is taken in both the receive and xmit paths and also the
>>> queuecommand path.
>>
>> Just done it. /proc/lock_stat says that there is no significant contention 
>> for
>> session->lock.
>>
>> From other side, session->lock is a spinlock, so, if it was the serialization
>> point, we would see big CPU consumption on the initiator. But we have a plenty of
>> CPU time there.
>>
>> So, there must be other serialization point.
> 
> Update. Using sg_dd with blk_sgio=1 (SG_IO) instead of dd I was able to 
> achieve
> bidi speed 92 MB/s in each direction.
> 
> Thus, the iSCSI stack works as expected well and the serialization point must 
> be
> somewhere higher in the block stack. Both buffered and direct dd demonstrate 
> the
> same serialized behavior described above.

...even when the corresponding iSCSI device is formatted with ext4 and the two
dd's work on 2 _separate_ files.

In other words, it appears that user space applications not smart enough to use
the sg interface have no way to use the full duplex capability of the link. No
tricks will give them the doubled throughput.

I tried the same with Fibre Channel and see the same behavior, with the only
difference that if the same device from the same target is imported as 2 LUNs
(i.e. as multipath), both of those LUNs can work bidirectionally. With iSCSI you
need to import the device from 2 separate iSCSI targets to achieve that.

Vlad




Re: bulk writes in iSCSI

2014-11-06 Thread Vladislav Bolkhovitin
Paul Koning wrote on 11/03/2014 04:54 AM:
> On Nov 3, 2014, at 6:10 AM, shivraj dongawe  wrote:
> 
>> Hi all, 
>>
>> Suppose I have information about some lba's and lengths. 
>>I want to send more than one write command as a part of single pdu. 
>>I want to know whether I could perform this activity using iSCSI?
> 
> That would require SCSI to have such a mechanism, and it does not.  Why 
> bother?  There’s no reason to expect that to have any performance benefits. 

No, the benefits are obvious: batching N writes into one request turns X I/Os
into X/N I/Os, at least on the initiator-target link, hence a huge boost under
high-IOPS loads.
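
Just to make the arithmetic concrete, a toy illustration (no real API implied) of
what batching N small writes per request does to the number of PDUs on the wire:

# Toy arithmetic only: PDUs on the wire for X small writes when up to N of them
# can be packed into one (hypothetical) scattered-write request.
def pdus_on_wire(total_writes, writes_per_pdu):
    return -(-total_writes // writes_per_pdu)   # ceiling division

X, N = 100_000, 8
print(pdus_on_wire(X, 1))   # 100000 write PDUs: one per write, as today
print(pdus_on_wire(X, N))   # 12500 PDUs: X/N with 8 writes batched per request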

A WRITE SCATTERED SCSI command has been pending for quite some time, so requests
for this command should go to the T10 committee to get it accepted into the
official standard as soon as possible.

Vlad



Re: [LSF/MM TOPIC] iSCSI MQ adoption via MCS discussion

2015-01-13 Thread Vladislav Bolkhovitin
Sagi Grimberg wrote on 01/08/2015 05:45 AM:
>> RFC 3720 namely requires that iSCSI numbering is
>> session-wide. This means maintaining a single counter for all MC/S
>> sessions. Such a counter would be a contention point. I'm afraid that
>> because of that counter performance on a multi-socket initiator system
>> with a scsi-mq implementation based on MC/S could be worse than with the
>> approach with multiple iSER targets. Hence my preference for an approach
>> based on multiple independent iSER connections instead of MC/S.
> 
> So this comment is spot on the pros/cons of the discussion (we might want to 
> leave
> something for LSF ;)).
> MCS would not allow a completely lockless data-path due to command
> ordering. On the other hand implementing some kind of multiple sessions
> solution feels somewhat like a mis-fit (at least in my view).
> 
> One of my thoughts about how to overcome the contention on commands
> sequence numbering was to suggest some kind of negotiable "relaxed
> ordering" mode but of course I don't have anything figured out yet.

The Linux SCSI/block stack neither uses nor guarantees any command ordering.
Applications that require ordering enforce it by queue draining (i.e. waiting
until all previous commands have finished). Hence the command ordering enforced
by MC/S is overkill, and it additionally comes with a non-zero performance cost.
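
To illustrate the cost, a toy sketch (not open-iscsi code) of the difference
between one session-wide CmdSN counter shared by all MC/S connections and an
independent counter per session:

import threading

class McsSession:
    """Toy model: all connections of one MC/S session share one CmdSN counter."""
    def __init__(self):
        self.cmdsn = 0
        self.lock = threading.Lock()    # every submission path serializes here

    def next_cmdsn(self):
        with self.lock:                 # the cross-CPU contention point
            self.cmdsn += 1
            return self.cmdsn

class IndependentSession:
    """Toy model: one session per connection/queue, numbering commands alone."""
    def __init__(self):
        self.cmdsn = 0                  # touched by a single submission context only

    def next_cmdsn(self):
        self.cmdsn += 1
        return self.cmdsn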

Don't do MC/S, do independent connections. You know the KISS principle. The
memory overhead of setting up the extra iSCSI sessions should be negligible.

Vlad



Re: iSCSI Multiqueue

2020-01-24 Thread Vladislav Bolkhovitin


On 1/23/20 1:51 PM, The Lee-Man wrote:
> On Wednesday, January 15, 2020 at 7:16:48 AM UTC-8, Bobby wrote:
> 
> 
> Hi all,
> 
> I have a question regarding multi-queue in iSCSI. AFAIK, *scsi-mq*
> has been functional in kernel since kernel 3.17. Because earlier,
> the block layer was updated to multi-queue *blk-mq* from
> single-queue. So the current kernel has full-fledged *multi-queues*.
> 
> The question is:
> 
> How an iSCSI initiator uses multi-queue? Does it mean having
> multiple connections? I would like 
> to see where exactly that is achieved in the code, if someone can
> please me give me a hint. Thanks in advance :)
> 
> Regards
> 
> 
> open-iscsi does not use multi-queue specifically, though all of the
> block layer is now converted to using multi-queue. If I understand
> correctly, there is no more single-queue, but there is glue that allows
> existing single-queue drivers to continue on, mapping their use to
> multi-queue. (Someone please correct me if I'm wrong.)
> 
> The only time multi-queue might be useful for open-iscsi to use would be
> for MCS -- multiple connections per session. But the implementation of
> multi-queue makes using it for MCS problematic. Because each queue is on
> a different CPU, open-iscsi would have to coordinate the multiple
> connections across multiple CPUs, making things like ensuring correct
> sequence numbers difficult.
> 
> Hope that helps. I _believe_ there is still an effort to map open-iscsi
> MCS to multi-queue, but nobody has tried to actually do it yet that I
> know of. The goal, of course, is better throughput using MCS.

From my old iSCSI target development days, MS is fundamentally not
friendly to multi-queue, because the iSCSI spec requires the order of
commands inside a session to be preserved across its multiple
connections. Command serialization => shared lock or atomic => no
multi-queue benefits.

Hence, using MS for multi-queue would be beneficial only if this iSCSI
spec requirement were dropped (aka violated).

Just a small reminder: I have not looked at the updated iSCSI spec for a
while, but I don't remember this requirement being eased in any way there.

In any case, multiple iSCSI sessions per block-level "session" would
always be another alternative that requires virtually zero changes
in open-iscsi and the in-kernel iSCSI driver[1], as opposed to the complex
changes required to start supporting MS in them as well as in the many iSCSI
targets around that currently do not[2]. If I were working on iSCSI
MQ, I would consider this the first and MUCH more preferable option.

Vlad

1. Most likely, completely zero.
2. Where the requirement to preserve command order would similarly kill all
the MQ performance benefits.



Re: iSCSI Multiqueue

2020-01-24 Thread Vladislav Bolkhovitin



On 1/24/20 12:43 AM, Vladislav Bolkhovitin wrote:
> 
> On 1/23/20 1:51 PM, The Lee-Man wrote:
>> On Wednesday, January 15, 2020 at 7:16:48 AM UTC-8, Bobby wrote:
>>
>>
>> Hi all,
>>
>> I have a question regarding multi-queue in iSCSI. AFAIK, *scsi-mq*
>> has been functional in kernel since kernel 3.17. Because earlier,
>> the block layer was updated to multi-queue *blk-mq* from
>> single-queue. So the current kernel has full-fledged *multi-queues*.
>>
>> The question is:
>>
>> How an iSCSI initiator uses multi-queue? Does it mean having
>> multiple connections? I would like 
>> to see where exactly that is achieved in the code, if someone can
>> please me give me a hint. Thanks in advance :)
>>
>> Regards
>>
>>
>> open-iscsi does not use multi-queue specifically, though all of the
>> block layer is now converted to using multi-queue. If I understand
>> correctly, there is no more single-queue, but there is glue that allows
>> existing single-queue drivers to continue on, mapping their use to
>> multi-queue. (Someone please correct me if I'm wrong.)
>>
>> The only time multi-queue might be useful for open-iscsi to use would be
>> for MCS -- multiple connections per session. But the implementation of
>> multi-queue makes using it for MCS problematic. Because each queue is on
>> a different CPU, open-iscsi would have to coordinate the multiple
>> connections across multiple CPUs, making things like ensuring correct
>> sequence numbers difficult.
>>
>> Hope that helps. I _believe_ there is still an effort to map open-iscsi
>> MCS to multi-queue, but nobody has tried to actually do it yet that I
>> know of. The goal, of course, is better throughput using MCS.
> 
> From my old iSCSI target development days, MS is fundamentally not
> friendly to multi-queue, because it requires by the iSCSI spec to
> preserve order of commands inside the session across multiple
> connections. Commands serialization => shared lock or atomic => no
> multi-queue benefits.
> 
> Hence, usage of MS for multi-queue would be beneficial only if to drop
> (aka violate) this iSCSI spec requirement.
> 
> Just a small reminder. I have not looked in the updated iSCSI spec for a
> while, but don't remember this requirement was anyhow eased there.
> 
> In any case, multiple iSCSI sessions per block level "session" would
> always be another alternative that would require virtually zero changes
> in open-iscsi and in-kernel iSCSI driver[1] as opposed to complex
> changes required to start supporting MS in it as well as in many iSCSI
> targets around that currently do not[2]. If I would be working on iSCSI
> MQ, I would consider this as the first and MUCH more preferable option.
> 
> Vlad
> 
> 1. Most likely, completely zero.
> 2. Where requirement to preserve commands order would similarly kill all
> the MQ performance benefits.

Oops, that should read 'MCS' everywhere instead of 'MS'. Something "corrected"
this for me behind my back.

Sorry,
Vlad
