RE: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer to receive data
> -Original Message- > From: Long Li > Sent: Monday, August 14, 2017 7:25 PM > To: Tom Talpey ; Steve French ; > linux-c...@vger.kernel.org; samba-techni...@lists.samba.org; linux- > ker...@vger.kernel.org > Subject: RE: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer to > receive data > > > > > -Original Message- > > From: Tom Talpey > > Sent: Monday, August 14, 2017 1:57 PM > > To: Long Li ; Steve French ; > > linux-c...@vger.kernel.org; samba-techni...@lists.samba.org; linux- > > ker...@vger.kernel.org > > Subject: RE: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer > > to receive data > > > > > -Original Message- > > > From: linux-cifs-ow...@vger.kernel.org [mailto:linux-cifs- > > > ow...@vger.kernel.org] On Behalf Of Long Li > > > Sent: Wednesday, August 2, 2017 4:11 PM > > > To: Steve French ; linux-c...@vger.kernel.org; > > > samba- techni...@lists.samba.org; linux-kernel@vger.kernel.org > > > Cc: Long Li > > > Subject: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer > > > to receive data > > > > > > /* > > > + * Read data from receive reassembly queue > > > + * All the incoming data packets are placed in reassembly queue > > > + * buf: the buffer to read data into > > > + * size: the length of data to read > > > + * return value: actual data read > > > + */ > > > +int cifs_rdma_read(struct cifs_rdma_info *info, char *buf, unsigned > > > +int size) { > > >... > > > + spin_lock_irqsave(&info->reassembly_queue_lock, flags); > > > + log_cifs_read("size=%d info->reassembly_data_length=%d\n", size, > > > + atomic_read(&info->reassembly_data_length)); > > > + if (atomic_read(&info->reassembly_data_length) >= size) { > > > > If the reassembly queue is protected by a lock, why is an atomic_read() of > > its > > length needed? > > Will change this to non-atomic. > > > > > > + // this is for reading rfc1002 length > > > + if (response->first_segment && size==4) { > > > + unsigned int rfc1002_len = > > > + data_length + > > > remaining_data_length; > > > + *((__be32*)buf) = > > > cpu_to_be32(rfc1002_len); > > > + data_read = 4; > > > + response->first_segment = false; > > > + log_cifs_read("returning rfc1002 length > > > %d\n", > > > + rfc1002_len); > > > + goto read_rfc1002_done; > > > + } > > > > I am totally confused. What does RFC1002 framing have to do with receiving > > an SMB Direct packet??? > > The upper layer expects RFC1002 length at the beginning of the payload. A lot > of protocol processing logic check and act on this value. Returning this value > will avoid changes to lots of other upper layer code. > > This will be eventually fixed when a transport layer is added to upper layer > code. I recommend we do it in another patch. It's totally non-obvious that you are *inserting* an RFC1002 length into the received message. Need to state that, and include the above explanation. OK on deferring that work. So far so good! > > > + to_copy = min_t(int, data_length - offset, > > > to_read); > > > + memcpy( > > > + buf + data_read, > > > + (char*)data_transfer + data_offset + > > > offset, > > > + to_copy); > > > > Is it really necessary to perform all these data copies, especially under > > the > > reassembly_queue spinlock? This seems quite inefficient. Can the receive > > buffers not be loaned out and chained logically? > > This will require upper layer code changes to move to use new buffers > allocated/loaned this way, and also deal with packet boundaries. > > This code is not used to actually carry file data, which are normally done > through RDMA read/write. Disagree - RDMA will only be used when the size exceeds a threshold, perhaps 4KB or even 8KB. That means you'll be performing several memcpy()s for such workloads. RDMA is only used for bulk data. At the very least, try to rearrange the code to avoid holding the reassembly lock for so long, definitely not when memcpy'ing. It will significantly single-thread all your receives. > If we want to do it, I suggest do another patch since more changes other than > transport are involved. Sure, that's fine but 1) we will definitely want to do it and 2) this needs a comment. Tom.
RE: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer to receive data
> -Original Message- > From: Tom Talpey > Sent: Monday, August 14, 2017 1:57 PM > To: Long Li ; Steve French ; > linux-c...@vger.kernel.org; samba-techni...@lists.samba.org; linux- > ker...@vger.kernel.org > Subject: RE: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer > to receive data > > > -Original Message- > > From: linux-cifs-ow...@vger.kernel.org [mailto:linux-cifs- > > ow...@vger.kernel.org] On Behalf Of Long Li > > Sent: Wednesday, August 2, 2017 4:11 PM > > To: Steve French ; linux-c...@vger.kernel.org; > > samba- techni...@lists.samba.org; linux-kernel@vger.kernel.org > > Cc: Long Li > > Subject: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer > > to receive data > > > > /* > > + * Read data from receive reassembly queue > > + * All the incoming data packets are placed in reassembly queue > > + * buf: the buffer to read data into > > + * size: the length of data to read > > + * return value: actual data read > > + */ > > +int cifs_rdma_read(struct cifs_rdma_info *info, char *buf, unsigned > > +int size) { > >... > > + spin_lock_irqsave(&info->reassembly_queue_lock, flags); > > + log_cifs_read("size=%d info->reassembly_data_length=%d\n", size, > > + atomic_read(&info->reassembly_data_length)); > > + if (atomic_read(&info->reassembly_data_length) >= size) { > > If the reassembly queue is protected by a lock, why is an atomic_read() of its > length needed? Will change this to non-atomic. > > > + // this is for reading rfc1002 length > > + if (response->first_segment && size==4) { > > + unsigned int rfc1002_len = > > + data_length + remaining_data_length; > > + *((__be32*)buf) = cpu_to_be32(rfc1002_len); > > + data_read = 4; > > + response->first_segment = false; > > + log_cifs_read("returning rfc1002 length > > %d\n", > > + rfc1002_len); > > + goto read_rfc1002_done; > > + } > > I am totally confused. What does RFC1002 framing have to do with receiving > an SMB Direct packet??? The upper layer expects RFC1002 length at the beginning of the payload. A lot of protocol processing logic check and act on this value. Returning this value will avoid changes to lots of other upper layer code. This will be eventually fixed when a transport layer is added to upper layer code. I recommend we do it in another patch. > > > + > > + to_copy = min_t(int, data_length - offset, to_read); > > + memcpy( > > + buf + data_read, > > + (char*)data_transfer + data_offset + offset, > > + to_copy); > > Is it really necessary to perform all these data copies, especially under the > reassembly_queue spinlock? This seems quite inefficient. Can the receive > buffers not be loaned out and chained logically? This will require upper layer code changes to move to use new buffers allocated/loaned this way, and also deal with packet boundaries. This code is not used to actually carry file data, which are normally done through RDMA read/write. If we want to do it, I suggest do another patch since more changes other than transport are involved. > > Tom.
RE: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer to receive data
> -Original Message- > From: linux-cifs-ow...@vger.kernel.org [mailto:linux-cifs- > ow...@vger.kernel.org] On Behalf Of Long Li > Sent: Wednesday, August 2, 2017 4:11 PM > To: Steve French ; linux-c...@vger.kernel.org; samba- > techni...@lists.samba.org; linux-kernel@vger.kernel.org > Cc: Long Li > Subject: [[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer to > receive data > > /* > + * Read data from receive reassembly queue > + * All the incoming data packets are placed in reassembly queue > + * buf: the buffer to read data into > + * size: the length of data to read > + * return value: actual data read > + */ > +int cifs_rdma_read(struct cifs_rdma_info *info, char *buf, unsigned int size) > +{ >... > + spin_lock_irqsave(&info->reassembly_queue_lock, flags); > + log_cifs_read("size=%d info->reassembly_data_length=%d\n", size, > + atomic_read(&info->reassembly_data_length)); > + if (atomic_read(&info->reassembly_data_length) >= size) { If the reassembly queue is protected by a lock, why is an atomic_read() of its length needed? > + // this is for reading rfc1002 length > + if (response->first_segment && size==4) { > + unsigned int rfc1002_len = > + data_length + remaining_data_length; > + *((__be32*)buf) = cpu_to_be32(rfc1002_len); > + data_read = 4; > + response->first_segment = false; > + log_cifs_read("returning rfc1002 length %d\n", > + rfc1002_len); > + goto read_rfc1002_done; > + } I am totally confused. What does RFC1002 framing have to do with receiving an SMB Direct packet??? > + > + to_copy = min_t(int, data_length - offset, to_read); > + memcpy( > + buf + data_read, > + (char*)data_transfer + data_offset + offset, > + to_copy); Is it really necessary to perform all these data copies, especially under the reassembly_queue spinlock? This seems quite inefficient. Can the receive buffers not be loaned out and chained logically? Tom.
[[PATCH v1] 21/37] [CIFS] SMBD: Implement API for upper layer to receive data
From: Long Li With reassembly queue in place, implement the API for upper layer to recevie data. This call may sleep if there is not enough data in the reassembly queue. Signed-off-by: Long Li --- fs/cifs/cifsrdma.c | 110 + fs/cifs/cifsrdma.h | 5 +++ 2 files changed, 115 insertions(+) diff --git a/fs/cifs/cifsrdma.c b/fs/cifs/cifsrdma.c index 1d3fd26..e5f6300 100644 --- a/fs/cifs/cifsrdma.c +++ b/fs/cifs/cifsrdma.c @@ -1316,6 +1316,116 @@ struct cifs_rdma_info* cifs_create_rdma_session( } /* + * Read data from receive reassembly queue + * All the incoming data packets are placed in reassembly queue + * buf: the buffer to read data into + * size: the length of data to read + * return value: actual data read + */ +int cifs_rdma_read(struct cifs_rdma_info *info, char *buf, unsigned int size) +{ + struct cifs_rdma_response *response; + struct smbd_data_transfer *data_transfer; + unsigned long flags; + int to_copy, to_read, data_read, offset; + u32 data_length, remaining_data_length, data_offset; + +again: + // the transport is disconnected? + if (info->transport_status != CIFS_RDMA_CONNECTED) { + log_cifs_read("disconnected\n"); + + /* +* If upper layer code is reading SMB packet length +* return 0 to indicate transport is disconnected and +* trigger a reconnect. +*/ + spin_lock_irqsave(&info->reassembly_queue_lock, flags); + response = _get_first_reassembly(info); + if (response && response->first_segment && size==4) { + memset(buf, 0, size); + spin_unlock_irqrestore(&info->reassembly_queue_lock, flags); + return size; + } + spin_unlock_irqrestore(&info->reassembly_queue_lock, flags); + return 0; + } + + spin_lock_irqsave(&info->reassembly_queue_lock, flags); + log_cifs_read("size=%d info->reassembly_data_length=%d\n", size, + atomic_read(&info->reassembly_data_length)); + if (atomic_read(&info->reassembly_data_length) >= size) { + data_read = 0; + to_read = size; + offset = info->first_entry_offset; + while(data_read < size) { + response = _get_first_reassembly(info); + data_transfer = (struct smbd_data_transfer *) response->packet; + + data_length = le32_to_cpu(data_transfer->data_length); + remaining_data_length = + le32_to_cpu(data_transfer->remaining_data_length); + data_offset = le32_to_cpu(data_transfer->data_offset); + + // this is for reading rfc1002 length + if (response->first_segment && size==4) { + unsigned int rfc1002_len = + data_length + remaining_data_length; + *((__be32*)buf) = cpu_to_be32(rfc1002_len); + data_read = 4; + response->first_segment = false; + log_cifs_read("returning rfc1002 length %d\n", + rfc1002_len); + goto read_rfc1002_done; + } + + to_copy = min_t(int, data_length - offset, to_read); + memcpy( + buf + data_read, + (char*)data_transfer + data_offset + offset, + to_copy); + + // move on to the next buffer? + if (to_copy == data_length - offset) { + list_del(&response->list); + info->count_reassembly_queue--; + info->count_dequeue_reassembly_queue++; + put_receive_buffer(info, response); + offset = 0; + log_cifs_read("put_receive_buffer offset=0\n"); + } else + offset += to_copy; + + to_read -= to_copy; + data_read += to_copy; + + log_cifs_read("_get_first_reassembly memcpy %d bytes " + "data_transfer_length-offset=%d after that " + "to_read=%d data_read=%d offset=%d\n", + to_copy, data_length - offset, + to_read, data_read, offset); + } + atomic_sub(data_read, &info->reassembly_data_length); + info->first_entry_offset = offset; + log_cifs_rea