RE: [[PATCH v1] 15/37] [CIFS] SMBD: Post a SMBD data transfer message with data payload

2017-08-30 Thread Long Li
> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Wednesday, August 30, 2017 1:52 AM
> To: Long Li 
> Cc: Christoph Hellwig ; Steve French
> ; linux-c...@vger.kernel.org; samba-
> techni...@lists.samba.org; linux-kernel@vger.kernel.org
> Subject: Re: [[PATCH v1] 15/37] [CIFS] SMBD: Post a SMBD data transfer
> message with data payload
> 
> On Wed, Aug 30, 2017 at 02:17:56AM +, Long Li wrote:
> > I partially addressed this issue in the V3 patch. Most of the
> > duplicate code on sending path is merged.
> >
> > The difficulty with translating the buffer to pages is that: I don't
> > know how many pages will be translated, and how many struct page I
> > need to allocate in advance to hold them. I try to avoid memory
> > allocation in the I/O path as much as possible. So I keep two
> > functions of sending data: one for buffer and one for pages.
> 
> You do: you'll always need space for (len + PAGE_SIZE - 1) >> PAGE_SHIFT
> pages.
> 
> That being said: what callers even send you buffers?  In general we should
> aim to work with pages for all allocations that aren't tiny.

I'll look through the code allocating the buffers; it can probably fit into one
page. Will fix this.
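
For reference, a minimal sketch of the point being made here (the helper
buf_to_pages() is hypothetical, not something in the patch): for a kmalloc'ed
buffer the page count follows directly from the length and starting offset, so
the struct page array can be sized before, or outside of, the I/O path.

    #include <linux/mm.h>
    #include <linux/slab.h>

    /* Sketch only, assuming "buf" comes from kmalloc (linear mapping);
     * vmalloc memory would need vmalloc_to_page() instead.
     */
    static struct page **buf_to_pages(char *buf, size_t len, unsigned int *npages)
    {
            struct page **pages;
            unsigned int i, n;

            n = (offset_in_page(buf) + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
            pages = kmalloc_array(n, sizeof(*pages), GFP_KERNEL);
            if (!pages)
                    return NULL;

            for (i = 0; i < n; i++)
                    pages[i] = virt_to_page(buf + i * PAGE_SIZE);

            *npages = n;
            return pages;
    }

If the buffer fits in a single page, as suggested above, the array collapses
to one pointer and needs no allocation at all.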


Re: [[PATCH v1] 15/37] [CIFS] SMBD: Post a SMBD data transfer message with data payload

2017-08-30 Thread Christoph Hellwig
On Wed, Aug 30, 2017 at 02:17:56AM +, Long Li wrote:
> I partially addressed this issue in the V3 patch. Most of the duplicate
> code on sending path is merged.
> 
> The difficulty with translating the buffer to pages is that: I don't
> know how many pages will be translated, and how many struct page I need
> to allocate in advance to hold them. I try to avoid memory allocation
> in the I/O path as much as possible. So I keep two functions of 
> sending data: one for buffer and one for pages.

You do: you'll always need space for (len + PAGE_SIZE - 1) >> PAGE_SHIFT
pages.

That being said: what callers even send you buffers?  In general we
should aim to work with pages for all allocations that aren't tiny.


RE: [[PATCH v1] 15/37] [CIFS] SMBD: Post a SMBD data transfer message with data payload

2017-08-29 Thread Long Li
> -Original Message-
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Sunday, August 13, 2017 3:24 AM
> To: Long Li 
> Cc: Steve French ; linux-c...@vger.kernel.org; samba-
> techni...@lists.samba.org; linux-kernel@vger.kernel.org; Long Li
> 
> Subject: Re: [[PATCH v1] 15/37] [CIFS] SMBD: Post a SMBD data transfer
> message with data payload
> 
> You can always get the struct page for kernel allocations using virt_to_page
> (or vmalloc_to_page, but this code would not handle the vmalloc case either),
> so I don't think you need this helper and can always use the one added in the
> previous patch.

I partially addressed this issue in the V3 patch. Most of the duplicate code on 
sending path is merged.

The difficulty with translating the buffer to pages is that: I don't know how 
many pages will be translated, and how many struct page I need to allocate in 
advance to hold them. I try to avoid memory allocation in the I/O path as much 
as possible. So I keep two functions of sending data: one for buffer and one 
for pages.


RE: [[PATCH v1] 15/37] [CIFS] SMBD: Post a SMBD data transfer message with data payload

2017-08-14 Thread Tom Talpey
> -Original Message-
> From: linux-cifs-ow...@vger.kernel.org [mailto:linux-cifs-
> ow...@vger.kernel.org] On Behalf Of Long Li
> Sent: Wednesday, August 2, 2017 4:10 PM
> To: Steve French ; linux-c...@vger.kernel.org; samba-
> techni...@lists.samba.org; linux-kernel@vger.kernel.org
> Cc: Long Li 
> Subject: [[PATCH v1] 15/37] [CIFS] SMBD: Post a SMBD data transfer message
> with data payload
> 
 
> Similar to sending a transfer message with a page payload, this function
> creates an SMBD data packet and sends it over RDMA, from the iov passed in by
> the upper layer.

The following routine is heavily redundant with cifs_rdma_post_send_page()
in 14/37. Because they share quite a bit of protocol and DMA mapping logic,
I strongly suggest they be merged.

Tom.

> +static int cifs_rdma_post_send_data(
> +   struct cifs_rdma_info *info,
> +   struct kvec *iov, int n_vec, int remaining_data_length);
>  static int cifs_rdma_post_send_page(struct cifs_rdma_info *info,
> struct page *page, unsigned long offset,
> size_t size, int remaining_data_length);
> @@ -671,6 +674,122 @@ static int cifs_rdma_post_send_page(struct
> cifs_rdma_info *info, struct page *pa
>  }
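
One possible shape for the merge suggested above (a sketch only; the helper
cifs_rdma_post_send_sgl() is hypothetical and the bodies are abbreviated): a
single core routine owns the SMBD header, the credit accounting and the
ib_post_send(), while the page and iov entry points only differ in how they
DMA-map their payload into the scatter/gather list.

    /* Hypothetical shared core: header fields, credits and posting live here. */
    static int cifs_rdma_post_send_sgl(struct cifs_rdma_info *info,
            struct ib_sge *payload_sge, int num_payload_sge,
            u32 data_length, int remaining_data_length);

    /* The page variant becomes a thin wrapper that only maps its payload. */
    static int cifs_rdma_post_send_page(struct cifs_rdma_info *info,
            struct page *page, unsigned long offset,
            size_t size, int remaining_data_length)
    {
            struct ib_sge sge = {
                    .addr = ib_dma_map_page(info->id->device, page, offset,
                                            size, DMA_TO_DEVICE),
                    .length = size,
                    .lkey = info->pd->local_dma_lkey,
            };

            if (ib_dma_mapping_error(info->id->device, sge.addr))
                    return -EIO;

            return cifs_rdma_post_send_sgl(info, &sge, 1, size,
                                           remaining_data_length);
    }

The iov variant would do the same with its ib_dma_map_single() loop and then
call into the shared core.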



Re: [[PATCH v1] 15/37] [CIFS] SMBD: Post a SMBD data transfer message with data payload

2017-08-13 Thread Christoph Hellwig
You can always get the struct page for kernel allocations using
virt_to_page (or vmalloc_to_page, but this code would not handle the
vmalloc case either), so I don't think you need this helper and can
always use the one added in the previous patch.
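
To make the distinction concrete, a tiny sketch (the helper is hypothetical,
not something the patch defines): vmalloc memory is not covered by the
kernel's linear mapping, so it needs vmalloc_to_page(); any other
directly-mapped kernel address can go through virt_to_page().

    #include <linux/mm.h>
    #include <linux/vmalloc.h>

    /* Hypothetical helper: resolve a kernel virtual address to its page. */
    static struct page *kaddr_to_page(const void *addr)
    {
            if (is_vmalloc_addr(addr))
                    return vmalloc_to_page(addr);
            return virt_to_page(addr);
    }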


[[PATCH v1] 15/37] [CIFS] SMBD: Post a SMBD data transfer message with data payload

2017-08-02 Thread Long Li
From: Long Li 

Similar to sending a transfer message with a page payload, this function
creates an SMBD data packet and sends it over RDMA, from the iov passed in by
the upper layer.

Signed-off-by: Long Li 
---
 fs/cifs/cifsrdma.c | 119 +
 1 file changed, 119 insertions(+)

diff --git a/fs/cifs/cifsrdma.c b/fs/cifs/cifsrdma.c
index b3ec109..989cad8 100644
--- a/fs/cifs/cifsrdma.c
+++ b/fs/cifs/cifsrdma.c
@@ -66,6 +66,9 @@ static int cifs_rdma_post_recv(
struct cifs_rdma_info *info,
struct cifs_rdma_response *response);
 
+static int cifs_rdma_post_send_data(
+   struct cifs_rdma_info *info,
+   struct kvec *iov, int n_vec, int remaining_data_length);
 static int cifs_rdma_post_send_page(struct cifs_rdma_info *info,
struct page *page, unsigned long offset,
size_t size, int remaining_data_length);
@@ -671,6 +674,122 @@ static int cifs_rdma_post_send_page(struct cifs_rdma_info 
*info, struct page *pa
 }
 
 /*
+ * Send a data buffer
+ * iov: the iov array describing the data buffers
+ * n_vec: number of iov array
+ * remaining_data_length: remaining data to send in this payload
+ */
+static int cifs_rdma_post_send_data(
+   struct cifs_rdma_info *info, struct kvec *iov, int n_vec,
+   int remaining_data_length)
+{
+   struct cifs_rdma_request *request;
+   struct smbd_data_transfer *packet;
+   struct ib_send_wr send_wr, *send_wr_fail;
+   int rc = -ENOMEM, i;
+   u32 data_length;
+
+   request = mempool_alloc(info->request_mempool, GFP_KERNEL);
+   if (!request)
+   return rc;
+
+   request->info = info;
+
+   wait_event(info->wait_send_queue, atomic_read(&info->send_credits) > 0);
+   atomic_dec(&info->send_credits);
+
+   packet = (struct smbd_data_transfer *) request->packet;
+   packet->credits_requested = cpu_to_le16(info->send_credit_target);
+   packet->flags = cpu_to_le16(0);
+   packet->reserved = cpu_to_le16(0);
+
+   packet->data_offset = cpu_to_le32(24);
+
+   data_length = 0;
+   for (i=0; i<n_vec; i++)
+           data_length += iov[i].iov_len;
+   packet->data_length = cpu_to_le32(data_length);
+
+   packet->remaining_data_length = cpu_to_le32(remaining_data_length);
+   packet->padding = cpu_to_le32(0);
+
+   log_rdma_send("credits_requested=%d credits_granted=%d data_offset=%d "
+ "data_length=%d remaining_data_length=%d\n",
+   le16_to_cpu(packet->credits_requested),
+   le16_to_cpu(packet->credits_granted),
+   le32_to_cpu(packet->data_offset),
+   le32_to_cpu(packet->data_length),
+   le32_to_cpu(packet->remaining_data_length));
+
+   request->sge = kzalloc(sizeof(struct ib_sge)*(n_vec+1), GFP_KERNEL);
+   if (!request->sge)
+   goto allocate_sge_failed;
+
+   request->num_sge = n_vec+1;
+
+   request->sge[0].addr = ib_dma_map_single(
+   info->id->device, (void *)packet,
+   sizeof(*packet), DMA_BIDIRECTIONAL);
+   if(ib_dma_mapping_error(info->id->device, request->sge[0].addr)) {
+   rc = -EIO;
+   goto dma_mapping_failure;
+   }
+   request->sge[0].length = sizeof(*packet);
+   request->sge[0].lkey = info->pd->local_dma_lkey;
+   ib_dma_sync_single_for_device(info->id->device, request->sge[0].addr,
+   request->sge[0].length, DMA_TO_DEVICE);
+
+   for (i=0; i<n_vec; i++) {
+           request->sge[i+1].addr = ib_dma_map_single(info->id->device,
+                   iov[i].iov_base, iov[i].iov_len, DMA_BIDIRECTIONAL);
+           if (ib_dma_mapping_error(info->id->device, request->sge[i+1].addr)) {
+                   rc = -EIO;
+                   goto dma_mapping_failure;
+           }
+           request->sge[i+1].length = iov[i].iov_len;
+           request->sge[i+1].lkey = info->pd->local_dma_lkey;
+           ib_dma_sync_single_for_device(info->id->device,
+                   request->sge[i+1].addr,
+                   request->sge[i+1].length, DMA_TO_DEVICE);
+   }
+
+   log_rdma_send("rdma_request sge[0] addr=%llu legnth=%u lkey=%u\n",
+   request->sge[0].addr, request->sge[0].length, 
request->sge[0].lkey);
+   for (i=0; isge[i+1].addr,
+   request->sge[i+1].length, request->sge[i+1].lkey);
+
+   request->cqe.done = send_done;
+
+   send_wr.next = NULL;
+   send_wr.wr_cqe = &request->cqe;
+   send_wr.sg_list = request->sge;
+   send_wr.num_sge = request->num_sge;
+   send_wr.opcode = IB_WR_SEND;
+   send_wr.send_flags = IB_SEND_SIGNALED;
+
+   rc = ib_post_send(info->id->qp, &send_wr, &send_wr_fail);
+   if (!rc)
+   return 0;
+
+   // post send failed
+   log_rdma_send("ib_post_send failed rc=%d\n", rc);
+
+dma_mapping_failure:
+   for (i=0; i<n_vec+1; i++)
+           if (request->sge[i].addr)
+                   ib_dma_unma