Re: RFC: FMR support in SRP

2014-09-20 Thread Jack Wang
2014-09-20 8:24 GMT+02:00 Bart Van Assche :
>
> On 09/19/14 18:04, Jinpu Wang wrote:
>>
>> Another question: in srp_map_finish_fmr the descriptor's va is set to 0.
>> Could you point me to how SRP protects against multiple RDMA operations
>> writing to the same address? Or do the different FMR rkeys protect this?
>
>
> Hello Jack,
>
> As you can see in srpt_map_sg_to_ib_sge() in the SRP target driver the 
> virtual address (db->va) and rkey (db->key) are always used in combination - 
> the virtual address is never used alone. Please note that this not only holds 
> for FMR but also for FR. More information can be found in paragraph "10.6.2 
> MEMORY REGISTRATION" in the InfiniBand Architecture Specification.
>
> Bart.
>

Hello Bart,

Thanks for your informative reply. I will go through the InfiniBand
Architecture Specification to understand memory registration better.

Have a nice weekend!

Jack




--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: FMR support in SRP

2014-09-19 Thread Bart Van Assche

On 09/19/14 18:04, Jinpu Wang wrote:

> Another question: in srp_map_finish_fmr the descriptor's va is set to 0.
> Could you point me to how SRP protects against multiple RDMA operations
> writing to the same address? Or do the different FMR rkeys protect this?


Hello Jack,

As you can see in srpt_map_sg_to_ib_sge() in the SRP target driver the 
virtual address (db->va) and rkey (db->key) are always used in 
combination - the virtual address is never used alone. Please note that 
this not only holds for FMR but also for FR. More information can be 
found in paragraph "10.6.2 MEMORY REGISTRATION" in the InfiniBand 
Architecture Specification.


Bart.
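
To make the point about the (va, rkey) pair concrete, here is a toy model in C++. It is only a sketch under stated assumptions, not code from ib_srp or ib_srpt: the names ToyRegion, ToyHca, register_region and rdma_write are invented for illustration, and a real HCA performs this lookup in hardware. It shows why two registrations that both advertise va 0 do not collide: the rkey selects the region, and the va is only interpreted relative to that region.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Toy stand-in for a registered memory region (hypothetical, for illustration).
struct ToyRegion {
    uint64_t base_va;            // virtual address advertised to the peer
    std::vector<uint8_t> bytes;  // backing memory of the region
};

// Toy stand-in for the HCA's protection table, indexed by rkey.
class ToyHca {
public:
    // Register `len` bytes that the peer will address starting at `base_va`.
    // Returns the rkey the peer must present together with the address.
    uint32_t register_region(uint64_t base_va, size_t len) {
        uint32_t rkey = m_next_rkey++;
        m_regions[rkey] = ToyRegion{base_va, std::vector<uint8_t>(len, 0)};
        return rkey;
    }

    // Model of an incoming RDMA write: the rkey picks the region and the va
    // is checked against that region's bounds. Returns false if denied.
    bool rdma_write(uint32_t rkey, uint64_t va, const void *src, size_t len) {
        auto it = m_regions.find(rkey);
        if (it == m_regions.end())
            return false;  // unknown rkey
        ToyRegion &r = it->second;
        if (va < r.base_va)
            return false;  // below the registered range
        uint64_t off = va - r.base_va;
        if (off > r.bytes.size() || len > r.bytes.size() - off)
            return false;  // beyond the registered range
        std::memcpy(r.bytes.data() + off, src, len);
        return true;
    }

    // Inspect the memory behind one registration.
    const std::vector<uint8_t> &contents(uint32_t rkey) const {
        return m_regions.at(rkey).bytes;
    }

private:
    uint32_t m_next_rkey = 1;
    std::map<uint32_t, ToyRegion> m_regions;
};
```

In this model, registering two regions with base_va 0 yields two different rkeys, and a write to va 0 with the first rkey leaves the memory behind the second rkey untouched.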



Re: RFC: FMR support in SRP

2014-09-19 Thread Jinpu Wang
On Fri, Sep 19, 2014 at 5:35 PM, Jinpu Wang  wrote:
>>
>>
>> Hello Jack,
>>
>> Did you know that file descriptor 0 corresponds to stdin ? With command-line
>> option -w the test program reads data from stdin and sends that data to a
>> SCSI device. I think the test program is waiting for you to provide input
>> data :-)
>>
>> Bart.
>>
> Thanks Bart,
>
> Now I know it:)
>
> This command works and generated 128 descriptors:
> ./discontiguous-io -l 512  -s /dev/sdb
>
>
>
> Thanks a lot.
>
>
> --
> Mit freundlichen Grüßen,
> Best Regards,
>
> Jack Wang

Another question: in srp_map_finish_fmr the descriptor's va is set to 0.
Could you point me to how SRP protects against multiple RDMA operations
writing to the same address? Or do the different FMR rkeys protect this?

-- 

Best Regards,

Jack Wang


Re: RFC: FMR support in SRP

2014-09-19 Thread Jinpu Wang
>
>
> Hello Jack,
>
> Did you know that file descriptor 0 corresponds to stdin ? With command-line
> option -w the test program reads data from stdin and sends that data to a
> SCSI device. I think the test program is waiting for you to provide input
> data :-)
>
> Bart.
>
Thanks Bart,

Now I know it:)

This command works and generated 128 descriptors:
./discontiguous-io -l 512  -s /dev/sdb



Thanks a lot.


-- 
Mit freundlichen Grüßen,
Best Regards,

Jack Wang


Re: RFC: FMR support in SRP

2014-09-19 Thread Bart Van Assche

On 09/19/14 17:08, Jinpu Wang wrote:

> Thank you very much for the quick reply. I tried your test program with:
>
> ./discontiguous-io -l 4096 -o 1024 -s -w /dev/sdb
>
> The program just waits there, and strace shows it hanging in read():
>
> munmap(0x7fb074c1, 24430)   = 0
> brk(0)  = 0x1896000
> brk(0x18b8000)  = 0x18b8000
> open("/dev/sdb", O_RDONLY)  = 3
> ioctl(3, BLKSSZGET, 0x6066e0)   = 0
> fstat(0, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb074c16000
> read(0,
>
> ib_srp is loaded with cmd_sg_entries=255; the target side uses ib_srpt
> from svn as of one month ago.
>
> The kernel is 3.14.13.
>
> lsscsi
> [0:0:0:0]    disk    ATA      ST3750528AS      CC38  /dev/sda
> [1:0:0:0]    disk    SCST_BIO vol0              310  /dev/sdb
> [1:0:0:1]    disk    SCST_BIO vol1              310  /dev/sdc

Hello Jack,

Did you know that file descriptor 0 corresponds to stdin ? With 
command-line option -w the test program reads data from stdin and sends 
that data to a SCSI device. I think the test program is waiting for you 
to provide input data :-)


Bart.
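
The blocking read(0, ...) seen in the strace output can be reproduced with a few lines of C++. This is a minimal sketch, not code taken from discontiguous-io: read_all is a hypothetical helper that reads a file descriptor until end-of-file. When the descriptor is 0 and stdin is a terminal, read() simply waits for input, which looks like a hang.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>

// Read from `fd` until end-of-file and return everything that was read.
// With fd == 0 (stdin) attached to a terminal, read() blocks until the user
// types input or signals end-of-file, which shows up in strace as "read(0,".
static std::string read_all(int fd)
{
    std::string data;
    char buf[4096];

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            data.append(buf, static_cast<size_t>(n));
        } else if (n == 0) {
            break;  // end-of-file: the writer closed its end
        } else if (errno != EINTR) {
            break;  // real error: return what was read so far
        }
        // n < 0 with EINTR: interrupted by a signal, retry the read
    }
    return data;
}
```

Feeding the program from a pipe or a file instead of the terminal (for example by redirecting stdin from a file) lets read() return data and eventually end-of-file, so the loop terminates.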



Re: RFC: FMR support in SRP

2014-09-19 Thread Jinpu Wang
Hello Bart,

Thank you very much for the quick reply. I tried your test program with:

./discontiguous-io -l 4096 -o 1024 -s -w /dev/sdb

The program just waits there, and strace shows it hanging in read():

munmap(0x7fb074c1, 24430)   = 0
brk(0)  = 0x1896000
brk(0x18b8000)  = 0x18b8000
open("/dev/sdb", O_RDONLY)  = 3
ioctl(3, BLKSSZGET, 0x6066e0)   = 0
fstat(0, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb074c16000
read(0,

ib_srp is loaded with cmd_sg_entries=255; the target side uses ib_srpt
from svn as of one month ago.

The kernel is 3.14.13.

lsscsi
[0:0:0:0]    disk    ATA      ST3750528AS      CC38  /dev/sda
[1:0:0:0]    disk    SCST_BIO vol0              310  /dev/sdb
[1:0:0:1]    disk    SCST_BIO vol1              310  /dev/sdc

Regards,
Jack



On Fri, Sep 19, 2014 at 4:04 PM, Bart Van Assche  wrote:
> On 09/19/14 15:11, Jinpu Wang wrote:
>>
>> While going through the SRP FMR support, I found that ib_srp
>> pre-allocates fmr_list/map_page in each request; fmr_list is allocated
>> with as many entries as target->cmd_sg_cnt.
>>
>> I added some debug messages and ran fio tests with different settings.
>> The results show that state.ndesc and state.nmdesc are 1 and
>> state.npages is 0 after srp_map_sg.
>> My question is: do we really need as many fmr_list entries as
>> cmd_sg_cnt, or am I missing something? Can ndesc and nmdesc be bigger
>> than 1?
>
>
> Hello Jack,
>
> The limitations for FMR / FR memory registration are more restrictive than
> those imposed by the SCSI core on S/G-list layout. A few examples:
> * Memory registered via FMR must be aligned on an FMR page boundary.
> * Memory registered via a single FMR / FR registration must be a contiguous
> virtual memory region.
> * The maximum size for a memory region registered via FMR or FR
> (dev->mr_max_size) can be below the maximum size of an S/G-list that can be
> passed by the SCSI core.
>
> Hence the need for multiple memory descriptors. In case you are wondering
> how I tested memory registration involving multiple memory descriptors: I
> wrote a test program that causes the SCSI core to send an S/G-list to the
> SRP initiator that consists of 128 elements of four bytes each. None of
> these elements are aligned on a page boundary and no two S/G-list elements
> are contiguous in virtual memory. This test program causes the SRP initiator
> to allocate 128 memory descriptors. Please note that I/O requests submitted
> by the attached test program will only be accepted by the SRP initiator if
> the cmd_sg_entries kernel module parameter has been set to a value >= 128.
>
> Bart.
>



-- 
Mit freundlichen Grüßen,
Best Regards,

Jack Wang

Linux Kernel Developer Storage
ProfitBricks GmbH  The IaaS-Company.

ProfitBricks GmbH
Greifswalder Str. 207
D - 10405 Berlin
Tel: +49 30 5770083-42
Fax: +49 30 5770085-98
Email: jinpu.w...@profitbricks.com
URL: http://www.profitbricks.de

Sitz der Gesellschaft: Berlin.
Registergericht: Amtsgericht Charlottenburg, HRB 125506 B.
Geschäftsführer: Andreas Gauger, Achim Weiss.


Re: RFC: FMR support in SRP

2014-09-19 Thread Bart Van Assche

On 09/19/14 15:11, Jinpu Wang wrote:

> While going through the SRP FMR support, I found that ib_srp
> pre-allocates fmr_list/map_page in each request; fmr_list is allocated
> with as many entries as target->cmd_sg_cnt.
>
> I added some debug messages and ran fio tests with different settings.
> The results show that state.ndesc and state.nmdesc are 1 and
> state.npages is 0 after srp_map_sg.
> My question is: do we really need as many fmr_list entries as
> cmd_sg_cnt, or am I missing something? Can ndesc and nmdesc be bigger
> than 1?


Hello Jack,

The limitations for FMR / FR memory registration are more restrictive 
than those imposed by the SCSI core on S/G-list layout. A few examples:

* Memory registered via FMR must be aligned on an FMR page boundary.
* Memory registered via a single FMR / FR registration must be a 
contiguous virtual memory region.
* The maximum size for a memory region registered via FMR or FR 
(dev->mr_max_size) can be below the maximum size of an S/G-list that can 
be passed by the SCSI core.


Hence the need for multiple memory descriptors. In case you are 
wondering how I tested memory registration involving multiple memory 
descriptors: I wrote a test program that causes the SCSI core to send an 
S/G-list to the SRP initiator that consists of 128 elements of four 
bytes each. None of these elements is aligned on a page boundary and 
no two S/G-list elements are contiguous in virtual memory. This test 
program causes the SRP initiator to allocate 128 memory descriptors. 
Please note that I/O requests submitted by the attached test program 
will only be accepted by the SRP initiator if the cmd_sg_entries kernel 
module parameter has been set to a value >= 128.


Bart.
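
The effect of the restrictions listed above on the descriptor count can be illustrated with a rough model. This is a sketch only and not the logic of srp_map_sg: SgElem, count_descriptors and make_discontiguous are hypothetical names, and the model only checks contiguity and a maximum region size (the page-alignment rules of real FMR/FR mapping are left out). It shows how an S/G-list of 128 small, non-contiguous elements ends up needing 128 descriptors, while a contiguous list can collapse into one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One scatter/gather element: start address and length in bytes.
struct SgElem {
    uint64_t addr;
    uint32_t len;
};

// Count the memory descriptors needed under a simplified rule set:
// an element joins the current descriptor only if it starts exactly where
// the previous element ended and the descriptor stays within max_region_size.
// (Page-alignment requirements of real FMR/FR mapping are not modeled.)
static size_t count_descriptors(const std::vector<SgElem> &sg,
                                uint64_t max_region_size)
{
    size_t ndesc = 0;
    uint64_t cur_end = 0;  // end address of the current descriptor
    uint64_t cur_len = 0;  // bytes covered by the current descriptor

    for (const SgElem &e : sg) {
        bool can_extend = ndesc > 0 && e.addr == cur_end &&
                          cur_len + e.len <= max_region_size;
        if (!can_extend) {
            ++ndesc;  // start a new descriptor
            cur_len = 0;
        }
        cur_len += e.len;
        cur_end = e.addr + e.len;
    }
    return ndesc;
}

// Build n elements of elem_len bytes each in a toy address space, with a gap
// after every element so that no two elements are contiguous.
static std::vector<SgElem> make_discontiguous(size_t n, uint32_t elem_len)
{
    std::vector<SgElem> sg;
    uint64_t addr = 0x10004;  // arbitrary base address, not page aligned
    for (size_t i = 0; i < n; ++i) {
        sg.push_back(SgElem{addr, elem_len});
        addr += 2 * static_cast<uint64_t>(elem_len);  // leave a gap
    }
    return sg;
}
```

Under these assumptions, count_descriptors(make_discontiguous(128, 4), 1 << 20) needs one descriptor per element, whereas four back-to-back 4096-byte elements fit in a single descriptor as long as max_region_size allows it.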

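#include <cassert>
#include <cstring>     // memset()
#include <fcntl.h>     // O_RDONLY
#include <iomanip>
#include <iostream>
#include <linux/fs.h>  // BLKSSZGET
#include <scsi/sg.h>   // sg_io_hdr_t
#include <stdint.h>    // uint8_t, uintptr_t
#include <sys/ioctl.h>
#include <sys/stat.h>  // open()
#include <unistd.h>    // close()
#include <vector>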

class file_descriptor {
public:
	file_descriptor(int fd = -1)
		: m_fd(fd)
	{ }
	~file_descriptor()
	{ if (m_fd >= 0) close(m_fd); }
	operator int() const
	{ return m_fd; }

private:
	file_descriptor(const file_descriptor &);
	file_descriptor &operator=(const file_descriptor &);

	int m_fd;
};

class iovec_t {
public:
	iovec_t()
	{ }
	~iovec_t()
	{ }
	size_t size() const
	{ return m_v.size(); }
	const sg_iovec_t& operator[](const int i) const
	{ return m_v[i]; }
	sg_iovec_t& operator[](const int i)
	{ return m_v[i]; }
	// Append one S/G element.
	void append(void *addr, size_t len) {
		m_v.resize(m_v.size() + 1);
		decltype(m_v)::iterator p = m_v.end() - 1;
		p->iov_base = addr;
		p->iov_len = len;
	}
	const void *address() const {
		return &*m_v.begin();
	}
	// Total number of data bytes described by all elements.
	size_t data_len() const {
		size_t len = 0;
		for (decltype(m_v)::const_iterator p = m_v.begin(); p != m_v.end(); ++p)
			len += p->iov_len;
		return len;
	}
	// Shorten the vector so that it describes exactly len bytes.
	void trunc(size_t len) {
		size_t s = 0;
		for (decltype(m_v)::iterator p = m_v.begin(); p != m_v.end(); ++p) {
			s += p->iov_len;
			if (s >= len) {
				p->iov_len -= s - len;
				assert(p->iov_len > 0 || (p->iov_len == 0 && len == 0));
				m_v.resize(p - m_v.begin() + 1);
				break;
			}
		}
	}
	std::ostream& write(std::ostream& os) const {
		for (decltype(m_v)::const_iterator p = m_v.begin(); p != m_v.end(); ++p)
			os.write((const char *)p->iov_base, p->iov_len);
		return os;
	}

private:
	iovec_t(const iovec_t &);
	iovec_t &operator=(const iovec_t &);

	std::vector<sg_iovec_t> m_v;
};

static unsigned block_size;

static void dumphex(std::ostream &os, const void *a, size_t len)
{
	// Print 16 bytes per line: address, hex bytes in groups of four, ASCII.
	for (size_t i = 0; i < len; i += 16) {
		os << std::hex << std::setfill('0') << std::setw(16)
		   << (uintptr_t)a + i << ':';
		for (size_t j = i; j < i + 16 && j < len; j++) {
			if (j % 4 == 0)
				os << ' ';
			os << std::hex << std::setfill('0') << std::setw(2)
			   << (unsigned)((uint8_t*)a)[j];
		}
		os << "  ";
		for (size_t j = i; j < i + 16 && j < len; j++) {
			unsigned char c = ((uint8_t*)a)[j];
			os << (c >= ' ' && c < 128 ? (char)c : '.');
		}
		os << '\n';
	}
}

enum {
	MAX_READ_WRITE_6_LBA = 0x1fffff,	// READ(6)/WRITE(6) carry a 21-bit LBA
	MAX_READ_WRITE_6_LENGTH = 0xff,
};

static ssize_t sg_read(const file_descriptor &fd, uint32_t lba, const iovec_t &v)
{
	if (lba > MAX_READ_WRITE_6_LBA)
		return -1;

	// The transfer must be a non-zero whole number of blocks.
	if (v.data_len() == 0 || (v.data_len() % block_size) != 0)
		return -1;

	if (v.data_len() / block_size > MAX_READ_WRITE_6_LENGTH)
		return -1;

	int sg_version;
	if (ioctl(fd, SG_GET_VERSION_NUM, &sg_version) < 0 || sg_version < 3)
		return -1;

	// READ(6) CDB: opcode 0x08, LBA, transfer length in blocks, control.
	uint8_t read6[6] = { 0x08, (uint8_t)(lba >> 16), (uint8_t)(lba >> 8),
			     (uint8_t)(lba), (uint8_t)(v.data_len() / block_size),
			     0 };
	unsigned char sense_buffer[32];
	sg_io_hdr_t h = { .interface_id = 'S' };

	h.cmdp = read6;
	h.cmd_len = sizeof(read6);
	h.dxfer_direction = SG_DXFER_FROM_DEV;
	h.iovec_count = v.size();
	h.dxfer_len = v.data_len();
	h.dxferp = const_cast<void *>(v.address());
	h.sbp = sense_buffer;
	h.mx_sb_len = sizeof(sense_buffer);
	h.timeout = 1000; /* 1000 millisecs == 1 se

RFC: FMR support in SRP

2014-09-19 Thread Jinpu Wang
Hi Bart and all,

While going through the SRP FMR support, I found that ib_srp
pre-allocates fmr_list/map_page in each request; fmr_list is allocated
with as many entries as target->cmd_sg_cnt.

I added some debug messages and ran fio tests with different settings.
The results show that state.ndesc and state.nmdesc are 1 and
state.npages is 0 after srp_map_sg.
My question is: do we really need as many fmr_list entries as
cmd_sg_cnt, or am I missing something? Can ndesc and nmdesc be bigger
than 1?



-- 

Mit freundlichen Grüßen,
Best Regards,

Jack Wang

Linux Kernel Developer Storage
ProfitBricks GmbH  The IaaS-Company.