[PATCH v2] blkmq: Fix NULL pointer deref when all reserved tags in use

2015-03-18 Thread Sam Bradshaw
When allocating from the reserved tags pool, bt_get() is called with
a NULL hctx.  If all tags are in use, the hw queue is kicked to push
out any pending IO, potentially freeing tags, and tag allocation is
retried.  The problem is that blk_mq_run_hw_queue() doesn't check for
a NULL hctx.  So we avoid it with a simple NULL hctx test.  

This issue was introduced by:
b32232073e80: blk-mq: fix hang in bt_get()

Tested by hammering mtip32xx with concurrent smartctl/hdparm.

Signed-off-by: Sam Bradshaw 
Signed-off-by: Selvan Mani 
---
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index d53a764..9d7dd64 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -280,7 +280,8 @@ static int bt_get(struct blk_mq_alloc_data *data,
 * pending IO submits before going to sleep waiting for
 * some to complete.
 */
-   blk_mq_run_hw_queue(hctx, false);
+   if (hctx)
+   blk_mq_run_hw_queue(hctx, false);
 
/*
 * Retry tag allocation after running the hardware queue,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] blkmq: Fix NULL pointer deref when all reserved tags in use

2015-03-18 Thread Sam Bradshaw



Good catch! But why not put the hctx == NULL check in as a conditional
in bt_get() before running the queue? I can't imagine other cases where
calling blk_mq_run_hw_queue() with hctx == NULL would be a valid scenario.


The change was meant to be broad in scope.  A runtime NULL deref is a 
rather unfortunate way to determine that there are other invalid scenarios.


But given that both approaches fix the immediate problem, I'd be happy 
to change the patch as you suggest.




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] blkmq: Fix NULL pointer deref when all reserved tags in use

2015-03-18 Thread Sam Bradshaw
When allocating from the reserved tags pool, bt_get() is called with 
a NULL hctx.  If all tags are in use, the hw queue is kicked to push 
out any pending IO, potentially freeing tags, and tag allocation is 
retried.  The problem is that blk_mq_run_hw_queue() doesn't check for 
a NULL hctx.  This patch fixes that bug.

An alternative implementation might skip kicking the queue for reserved 
tags and go right to io_schedule() but we chose to keep it simple.

Tested by hammering mtip32xx with concurrent smartctl/hdparm.

Signed-off-by: Sam Bradshaw 
Signed-off-by: Selvan Mani 
---
 block/blk-mq.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 59fa239..0471af6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -887,7 +887,7 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 
 void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 {
-   if (unlikely(test_bit(BLK_MQ_S_STOPPED, >state) ||
+   if (unlikely(!hctx || test_bit(BLK_MQ_S_STOPPED, >state) ||
!blk_mq_hw_queue_mapped(hctx)))
return;
 
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] blkmq: Fix NULL pointer deref when all reserved tags in use

2015-03-18 Thread Sam Bradshaw
When allocating from the reserved tags pool, bt_get() is called with 
a NULL hctx.  If all tags are in use, the hw queue is kicked to push 
out any pending IO, potentially freeing tags, and tag allocation is 
retried.  The problem is that blk_mq_run_hw_queue() doesn't check for 
a NULL hctx.  This patch fixes that bug.

An alternative implementation might skip kicking the queue for reserved 
tags and go right to io_schedule() but we chose to keep it simple.

Tested by hammering mtip32xx with concurrent smartctl/hdparm.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
Signed-off-by: Selvan Mani sm...@micron.com
---
 block/blk-mq.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 59fa239..0471af6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -887,7 +887,7 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 
 void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 {
-   if (unlikely(test_bit(BLK_MQ_S_STOPPED, hctx-state) ||
+   if (unlikely(!hctx || test_bit(BLK_MQ_S_STOPPED, hctx-state) ||
!blk_mq_hw_queue_mapped(hctx)))
return;
 
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2] blkmq: Fix NULL pointer deref when all reserved tags in use

2015-03-18 Thread Sam Bradshaw
When allocating from the reserved tags pool, bt_get() is called with
a NULL hctx.  If all tags are in use, the hw queue is kicked to push
out any pending IO, potentially freeing tags, and tag allocation is
retried.  The problem is that blk_mq_run_hw_queue() doesn't check for
a NULL hctx.  So we avoid it with a simple NULL hctx test.  

This issue was introduced by:
b32232073e80: blk-mq: fix hang in bt_get()

Tested by hammering mtip32xx with concurrent smartctl/hdparm.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
Signed-off-by: Selvan Mani sm...@micron.com
---
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index d53a764..9d7dd64 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -280,7 +280,8 @@ static int bt_get(struct blk_mq_alloc_data *data,
 * pending IO submits before going to sleep waiting for
 * some to complete.
 */
-   blk_mq_run_hw_queue(hctx, false);
+   if (hctx)
+   blk_mq_run_hw_queue(hctx, false);
 
/*
 * Retry tag allocation after running the hardware queue,
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] blkmq: Fix NULL pointer deref when all reserved tags in use

2015-03-18 Thread Sam Bradshaw



Good catch! But why not put the hctx == NULL check in as a conditional
in bt_get() before running the queue? I can't imagine other cases where
calling blk_mq_run_hw_queue() with hctx == NULL would be a valid scenario.


The change was meant to be broad in scope.  A runtime NULL deref is a 
rather unfortunate way to determine that there are other invalid scenarios.


But given that both approaches fix the immediate problem, I'd be happy 
to change the patch as you suggest.




--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] block: pass correct prot_buf pointer to integrity metadata processing function

2015-01-14 Thread Sam Bradshaw
The prot_buf pointer passed to the generate/verify functions is 
incorrect for the second and subsequent range, making it impossible 
to verify the guard tag.  The patch correctly increments the prot_buf 
pointer by the tuple size for each pass.

Signed-off-by: Sam Bradshaw 
---
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 5cbd5d9..c7cbf60 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -226,13 +226,13 @@ static int bio_integrity_process(struct bio *bio,
iter.disk_name = bio->bi_bdev->bd_disk->disk_name;
iter.interval = bi->interval;
iter.seed = bip_get_seed(bip);
-   iter.prot_buf = prot_buf;
 
bio_for_each_segment(bv, bio, bviter) {
void *kaddr = kmap_atomic(bv.bv_page);
 
iter.data_buf = kaddr + bv.bv_offset;
iter.data_size = bv.bv_len;
+   iter.prot_buf = prot_buf;
 
ret = proc_fn();
if (ret) {
@@ -241,6 +241,7 @@ static int bio_integrity_process(struct bio *bio,
}
 
kunmap_atomic(kaddr);
+   prot_buf += bi->tuple_size;
}
return ret;
 }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] block: pass correct prot_buf pointer to integrity metadata processing function

2015-01-14 Thread Sam Bradshaw
The prot_buf pointer passed to the generate/verify functions is 
incorrect for the second and subsequent range, making it impossible 
to verify the guard tag.  The patch correctly increments the prot_buf 
pointer by the tuple size for each pass.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
---
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 5cbd5d9..c7cbf60 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -226,13 +226,13 @@ static int bio_integrity_process(struct bio *bio,
iter.disk_name = bio-bi_bdev-bd_disk-disk_name;
iter.interval = bi-interval;
iter.seed = bip_get_seed(bip);
-   iter.prot_buf = prot_buf;
 
bio_for_each_segment(bv, bio, bviter) {
void *kaddr = kmap_atomic(bv.bv_page);
 
iter.data_buf = kaddr + bv.bv_offset;
iter.data_size = bv.bv_len;
+   iter.prot_buf = prot_buf;
 
ret = proc_fn(iter);
if (ret) {
@@ -241,6 +241,7 @@ static int bio_integrity_process(struct bio *bio,
}
 
kunmap_atomic(kaddr);
+   prot_buf += bi-tuple_size;
}
return ret;
 }
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] block: pass correct seed to integrity metadata generation function

2015-01-07 Thread Sam Bradshaw (sbradshaw)


> -Original Message-
> From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
> Sent: Wednesday, January 07, 2015 4:19 PM
> To: Sam Bradshaw (sbradshaw)
> Cc: Martin K. Petersen; ax...@kernel.dk; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] block: pass correct seed to integrity metadata
> generation function
> 
> >>>>> "Sam" == Sam Bradshaw (sbradshaw)  writes:
> 
> Sam> Yes.
> 
> The seed is just a seed. We happen to set it to the (block layer)
> sector number if nothing else is provided but it's essentially just an
> incrementing counter starting at an arbitrary value chosen by the
> caller.
> 
> Since an I/O may get remapped many times as block devices are stacked
> (DM, MD, stripe splits, partition offset shifts, etc.) the seed is not
> expected to match the LBA on the storage device. That is almost never
> the case.
> 
> In SCSI we do a remapping pass before submitting a WRITE or upon
> completion of a READ. You will have to do the same for NVMe.
> 
> It was done this way to avoid remapping the ref tag several times. We
> adjust the seed as the I/O gets sliced and diced and only map the PI
> pages once to do a single traversal at the bottom of the stack.
> 
> For next gen devices we simply pass the seed to the hardware and let it
> handle the remapping. This saves us having to map the PI pages and pull
> them through the cache.

Got it, thanks!

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] block: pass correct seed to integrity metadata generation function

2015-01-07 Thread Sam Bradshaw (sbradshaw)


> -Original Message-
> From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
> Sent: Wednesday, January 07, 2015 3:43 PM
> To: Sam Bradshaw (sbradshaw)
> Cc: Martin K. Petersen; ax...@kernel.dk; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] block: pass correct seed to integrity metadata
> generation function
> 
> >>>>> "Sam" == Sam Bradshaw (sbradshaw)  writes:
> 
> Sam> Yes, you were CC: on the original.  Could you take a peek at the
> Sam> patch and give me feedback?  https://lkml.org/lkml/2014/12/18/521
> 
> Is the target an NVMe device?

Yes.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] block: pass correct seed to integrity metadata generation function

2015-01-07 Thread Sam Bradshaw (sbradshaw)


> Sam> Any particular objection to this patch?  It would be nice if the
> Sam> kernel supported PI on block sizes other than just 512 bytes.
> 
> I never saw the patch. Did you CC: me on it?

Yes, you were CC: on the original.  Could you take a peek at the patch 
and give me feedback?
https://lkml.org/lkml/2014/12/18/521



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] block: pass correct seed to integrity metadata generation function

2015-01-07 Thread Sam Bradshaw

On 12/18/2014 04:11 PM, Sam Bradshaw wrote:

The seed value passed to the blk integrity metadata generation function
was wrong depending on the device block size (interval) and how many
blocks comprised the bvec.  This patch converts the seed to 'interval'
units and increments it correctly for each iteration.  Tested against a
4k+8 (w/ type1 PI) sector size.


Any particular objection to this patch?  It would be nice if the kernel
supported PI on block sizes other than just 512 bytes.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] block: pass correct seed to integrity metadata generation function

2015-01-07 Thread Sam Bradshaw

On 12/18/2014 04:11 PM, Sam Bradshaw wrote:

The seed value passed to the blk integrity metadata generation function
was wrong depending on the device block size (interval) and how many
blocks comprised the bvec.  This patch converts the seed to 'interval'
units and increments it correctly for each iteration.  Tested against a
4k+8 (w/ type1 PI) sector size.


Any particular objection to this patch?  It would be nice if the kernel
supported PI on block sizes other than just 512 bytes.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] block: pass correct seed to integrity metadata generation function

2015-01-07 Thread Sam Bradshaw (sbradshaw)


 Sam Any particular objection to this patch?  It would be nice if the
 Sam kernel supported PI on block sizes other than just 512 bytes.
 
 I never saw the patch. Did you CC: me on it?

Yes, you were CC: on the original.  Could you take a peek at the patch 
and give me feedback?
https://lkml.org/lkml/2014/12/18/521



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] block: pass correct seed to integrity metadata generation function

2015-01-07 Thread Sam Bradshaw (sbradshaw)


 -Original Message-
 From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
 Sent: Wednesday, January 07, 2015 3:43 PM
 To: Sam Bradshaw (sbradshaw)
 Cc: Martin K. Petersen; ax...@kernel.dk; linux-kernel@vger.kernel.org
 Subject: Re: [PATCH] block: pass correct seed to integrity metadata
 generation function
 
  Sam == Sam Bradshaw (sbradshaw) sbrads...@micron.com writes:
 
 Sam Yes, you were CC: on the original.  Could you take a peek at the
 Sam patch and give me feedback?  https://lkml.org/lkml/2014/12/18/521
 
 Is the target an NVMe device?

Yes.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] block: pass correct seed to integrity metadata generation function

2015-01-07 Thread Sam Bradshaw (sbradshaw)


 -Original Message-
 From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
 Sent: Wednesday, January 07, 2015 4:19 PM
 To: Sam Bradshaw (sbradshaw)
 Cc: Martin K. Petersen; ax...@kernel.dk; linux-kernel@vger.kernel.org
 Subject: Re: [PATCH] block: pass correct seed to integrity metadata
 generation function
 
  Sam == Sam Bradshaw (sbradshaw) sbrads...@micron.com writes:
 
 Sam Yes.
 
 The seed is just a seed. We happen to set it to the (block layer)
 sector number if nothing else is provided but it's essentially just an
 incrementing counter starting at an arbitrary value chosen by the
 caller.
 
 Since an I/O may get remapped many times as block devices are stacked
 (DM, MD, stripe splits, partition offset shifts, etc.) the seed is not
 expected to match the LBA on the storage device. That is almost never
 the case.
 
 In SCSI we do a remapping pass before submitting a WRITE or upon
 completion of a READ. You will have to do the same for NVMe.
 
 It was done this way to avoid remapping the ref tag several times. We
 adjust the seed as the I/O gets sliced and diced and only map the PI
 pages once to do a single traversal at the bottom of the stack.
 
 For next gen devices we simply pass the seed to the hardware and let it
 handle the remapping. This saves us having to map the PI pages and pull
 them through the cache.

Got it, thanks!

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] block:mtip32xx: Add call to set bit for failing drive rebuild in mtip32xx.c for the function,mtip_hw_get_identify

2015-01-06 Thread Sam Bradshaw (sbradshaw)


> -Original Message-
> From: Nicholas Krause [mailto:xerofo...@gmail.com]
> Sent: Monday, January 05, 2015 10:45 PM
> To: ax...@fb.com
> Cc: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa); Sam Bradshaw
> (sbradshaw); h...@lst.de; Selvan Mani (smani) [CONT - Type 2];
> agord...@redhat.com; linux-kernel@vger.kernel.org
> Subject: [PATCH] block:mtip32xx: Add call to set bit for failing drive
> rebuild in mtip32xx.c for the function,mtip_hw_get_identify
> 
> This adds a call to the set_bit function for setting the bit for driver
> rebuild failures if the drive reset build fails based on checking the
> last element in the buf array for holding the value of the hardware
> registers.In additon we set this bit on the dd_flag element of the
> pointer dd of the structure type,driver_data passed to this function.

Thanks, Nicholas.  Good stuff.

Aside, if you're seeing rebuild failures, please contact Micron
support to get your drive(s) replaced under warranty.

-Sam
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] block:mtip32xx: Add call to set bit for failing drive rebuild in mtip32xx.c for the function,mtip_hw_get_identify

2015-01-06 Thread Sam Bradshaw (sbradshaw)


 -Original Message-
 From: Nicholas Krause [mailto:xerofo...@gmail.com]
 Sent: Monday, January 05, 2015 10:45 PM
 To: ax...@fb.com
 Cc: Asai Thambi Samymuthu Pattrayasamy (asamymuthupa); Sam Bradshaw
 (sbradshaw); h...@lst.de; Selvan Mani (smani) [CONT - Type 2];
 agord...@redhat.com; linux-kernel@vger.kernel.org
 Subject: [PATCH] block:mtip32xx: Add call to set bit for failing drive
 rebuild in mtip32xx.c for the function,mtip_hw_get_identify
 
 This adds a call to the set_bit function for setting the bit for driver
 rebuild failures if the drive reset build fails based on checking the
 last element in the buf array for holding the value of the hardware
 registers.In additon we set this bit on the dd_flag element of the
 pointer dd of the structure type,driver_data passed to this function.

Thanks, Nicholas.  Good stuff.

Aside, if you're seeing rebuild failures, please contact Micron
support to get your drive(s) replaced under warranty.

-Sam
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] block: pass correct seed to integrity metadata generation function

2014-12-18 Thread Sam Bradshaw
The seed value passed to the blk integrity metadata generation function 
was wrong depending on the device block size (interval) and how many 
blocks comprised the bvec.  This patch converts the seed to 'interval' 
units and increments it correctly for each iteration.  Tested against a
4k+8 (w/ type1 PI) sector size.

Signed-off-by: Sam Bradshaw 
Signed-off-by: Selvan Mani 
---
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 5cbd5d9..7114040 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -219,20 +219,22 @@ static int bio_integrity_process(struct bio *bio,
struct bvec_iter bviter;
struct bio_vec bv;
struct bio_integrity_payload *bip = bio_integrity(bio);
-   unsigned int ret = 0;
+   unsigned int len = 0, ret = 0;
void *prot_buf = page_address(bip->bip_vec->bv_page) +
bip->bip_vec->bv_offset;
+   sector_t seed, start = bio_integrity_intervals(bi, bip_get_seed(bip));
 
iter.disk_name = bio->bi_bdev->bd_disk->disk_name;
iter.interval = bi->interval;
-   iter.seed = bip_get_seed(bip);
-   iter.prot_buf = prot_buf;
+   seed = start;
 
bio_for_each_segment(bv, bio, bviter) {
void *kaddr = kmap_atomic(bv.bv_page);
 
iter.data_buf = kaddr + bv.bv_offset;
iter.data_size = bv.bv_len;
+   iter.seed = seed;
+   iter.prot_buf = prot_buf;
 
ret = proc_fn();
if (ret) {
@@ -241,6 +243,9 @@ static int bio_integrity_process(struct bio *bio,
}
 
kunmap_atomic(kaddr);
+   len += bv.bv_len;
+   seed = start + (len / bi->interval);
+   prot_buf += bi->tuple_size;
}
return ret;
 }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] block: pass correct seed to integrity metadata generation function

2014-12-18 Thread Sam Bradshaw
The seed value passed to the blk integrity metadata generation function 
was wrong depending on the device block size (interval) and how many 
blocks comprised the bvec.  This patch converts the seed to 'interval' 
units and increments it correctly for each iteration.  Tested against a
4k+8 (w/ type1 PI) sector size.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
Signed-off-by: Selvan Mani sm...@micron.com
---
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 5cbd5d9..7114040 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -219,20 +219,22 @@ static int bio_integrity_process(struct bio *bio,
struct bvec_iter bviter;
struct bio_vec bv;
struct bio_integrity_payload *bip = bio_integrity(bio);
-   unsigned int ret = 0;
+   unsigned int len = 0, ret = 0;
void *prot_buf = page_address(bip-bip_vec-bv_page) +
bip-bip_vec-bv_offset;
+   sector_t seed, start = bio_integrity_intervals(bi, bip_get_seed(bip));
 
iter.disk_name = bio-bi_bdev-bd_disk-disk_name;
iter.interval = bi-interval;
-   iter.seed = bip_get_seed(bip);
-   iter.prot_buf = prot_buf;
+   seed = start;
 
bio_for_each_segment(bv, bio, bviter) {
void *kaddr = kmap_atomic(bv.bv_page);
 
iter.data_buf = kaddr + bv.bv_offset;
iter.data_size = bv.bv_len;
+   iter.seed = seed;
+   iter.prot_buf = prot_buf;
 
ret = proc_fn(iter);
if (ret) {
@@ -241,6 +243,9 @@ static int bio_integrity_process(struct bio *bio,
}
 
kunmap_atomic(kaddr);
+   len += bv.bv_len;
+   seed = start + (len / bi-interval);
+   prot_buf += bi-tuple_size;
}
return ret;
 }
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mtip32xx: minor performance enhancements

2014-06-06 Thread Sam Bradshaw
This patch adds the following:

1) Compiler hinting in the fast path.
2) A prefetch of port->flags to eliminate moderate cpu stalling later 
in mtip_hw_submit_io().
3) Eliminate a redundant rq_data_dir().
4) Reorder members of driver_data to eliminate false cacheline sharing 
between irq_workers_active and unal_qdepth.

With some workload and topology configurations, I'm seeing ~1.5% 
throughput improvement in small block random read benchmarks as well
as improved latency std. dev.

Signed-off-by: Sam Bradshaw 
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index 74abd49..fcbac54 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -2380,6 +2380,8 @@ static void mtip_hw_submit_io(struct driver_data *dd, 
struct request *rq,
/* Map the scatter list for DMA access */
nents = dma_map_sg(>pdev->dev, command->sg, nents, dma_dir);
 
+   prefetch(>flags);
+
command->scatter_ents = nents;
 
/*
@@ -2392,7 +2394,7 @@ static void mtip_hw_submit_io(struct driver_data *dd, 
struct request *rq,
fis = command->command;
fis->type= 0x27;
fis->opts= 1 << 7;
-   if (rq_data_dir(rq) == READ)
+   if (dma_dir == DMA_FROM_DEVICE)
fis->command = ATA_CMD_FPDMA_READ;
else
fis->command = ATA_CMD_FPDMA_WRITE;
@@ -2412,7 +2414,7 @@ static void mtip_hw_submit_io(struct driver_data *dd, 
struct request *rq,
fis->res3= 0;
fill_command_sg(dd, command, nents);
 
-   if (command->unaligned)
+   if (unlikely(command->unaligned))
fis->device |= 1 << 7;
 
/* Populate the command header */
@@ -2433,7 +2435,7 @@ static void mtip_hw_submit_io(struct driver_data *dd, 
struct request *rq,
 * To prevent this command from being issued
 * if an internal command is in progress or error handling is active.
 */
-   if (port->flags & MTIP_PF_PAUSE_IO) {
+   if (unlikely(port->flags & MTIP_PF_PAUSE_IO)) {
set_bit(rq->tag, port->cmds_to_issue);
set_bit(MTIP_PF_ISSUE_CMDS_BIT, >flags);
return;
@@ -3754,7 +3756,7 @@ static bool mtip_check_unal_depth(struct blk_mq_hw_ctx 
*hctx,
struct driver_data *dd = hctx->queue->queuedata;
struct mtip_cmd *cmd = blk_mq_rq_to_pdu(rq);
 
-   if (!dd->unal_qdepth || rq_data_dir(rq) == READ)
+   if (rq_data_dir(rq) == READ || !dd->unal_qdepth)
return false;
 
/*
@@ -3776,11 +3778,11 @@ static int mtip_queue_rq(struct blk_mq_hw_ctx *hctx, 
struct request *rq)
 {
int ret;
 
-   if (mtip_check_unal_depth(hctx, rq))
+   if (unlikely(mtip_check_unal_depth(hctx, rq)))
return BLK_MQ_RQ_QUEUE_BUSY;
 
ret = mtip_submit_request(hctx, rq);
-   if (!ret)
+   if (likely(!ret))
return BLK_MQ_RQ_QUEUE_OK;
 
rq->errors = ret;
diff --git a/drivers/block/mtip32xx/mtip32xx.h 
b/drivers/block/mtip32xx/mtip32xx.h
index 4b9b554..ba1b31e 100644
--- a/drivers/block/mtip32xx/mtip32xx.h
+++ b/drivers/block/mtip32xx/mtip32xx.h
@@ -493,19 +493,19 @@ struct driver_data {
 
struct workqueue_struct *isr_workq;
 
-   struct mtip_work work[MTIP_MAX_SLOT_GROUPS];
-
atomic_t irq_workers_active;
 
+   struct mtip_work work[MTIP_MAX_SLOT_GROUPS];
+
int isr_binding;
 
struct block_device *bdev;
 
-   int unal_qdepth; /* qdepth of unaligned IO queue */
-
struct list_head online_list; /* linkage for online list */
 
struct list_head remove_list; /* linkage for removing list */
+
+   int unal_qdepth; /* qdepth of unaligned IO queue */
 };
 
 #endif

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mtip32xx: minor performance enhancements

2014-06-06 Thread Sam Bradshaw
This patch adds the following:

1) Compiler hinting in the fast path.
2) A prefetch of port-flags to eliminate moderate cpu stalling later 
in mtip_hw_submit_io().
3) Eliminate a redundant rq_data_dir().
4) Reorder members of driver_data to eliminate false cacheline sharing 
between irq_workers_active and unal_qdepth.

With some workload and topology configurations, I'm seeing ~1.5% 
throughput improvement in small block random read benchmarks as well
as improved latency std. dev.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index 74abd49..fcbac54 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -2380,6 +2380,8 @@ static void mtip_hw_submit_io(struct driver_data *dd, 
struct request *rq,
/* Map the scatter list for DMA access */
nents = dma_map_sg(dd-pdev-dev, command-sg, nents, dma_dir);
 
+   prefetch(port-flags);
+
command-scatter_ents = nents;
 
/*
@@ -2392,7 +2394,7 @@ static void mtip_hw_submit_io(struct driver_data *dd, 
struct request *rq,
fis = command-command;
fis-type= 0x27;
fis-opts= 1  7;
-   if (rq_data_dir(rq) == READ)
+   if (dma_dir == DMA_FROM_DEVICE)
fis-command = ATA_CMD_FPDMA_READ;
else
fis-command = ATA_CMD_FPDMA_WRITE;
@@ -2412,7 +2414,7 @@ static void mtip_hw_submit_io(struct driver_data *dd, 
struct request *rq,
fis-res3= 0;
fill_command_sg(dd, command, nents);
 
-   if (command-unaligned)
+   if (unlikely(command-unaligned))
fis-device |= 1  7;
 
/* Populate the command header */
@@ -2433,7 +2435,7 @@ static void mtip_hw_submit_io(struct driver_data *dd, 
struct request *rq,
 * To prevent this command from being issued
 * if an internal command is in progress or error handling is active.
 */
-   if (port-flags  MTIP_PF_PAUSE_IO) {
+   if (unlikely(port-flags  MTIP_PF_PAUSE_IO)) {
set_bit(rq-tag, port-cmds_to_issue);
set_bit(MTIP_PF_ISSUE_CMDS_BIT, port-flags);
return;
@@ -3754,7 +3756,7 @@ static bool mtip_check_unal_depth(struct blk_mq_hw_ctx 
*hctx,
struct driver_data *dd = hctx-queue-queuedata;
struct mtip_cmd *cmd = blk_mq_rq_to_pdu(rq);
 
-   if (!dd-unal_qdepth || rq_data_dir(rq) == READ)
+   if (rq_data_dir(rq) == READ || !dd-unal_qdepth)
return false;
 
/*
@@ -3776,11 +3778,11 @@ static int mtip_queue_rq(struct blk_mq_hw_ctx *hctx, 
struct request *rq)
 {
int ret;
 
-   if (mtip_check_unal_depth(hctx, rq))
+   if (unlikely(mtip_check_unal_depth(hctx, rq)))
return BLK_MQ_RQ_QUEUE_BUSY;
 
ret = mtip_submit_request(hctx, rq);
-   if (!ret)
+   if (likely(!ret))
return BLK_MQ_RQ_QUEUE_OK;
 
rq-errors = ret;
diff --git a/drivers/block/mtip32xx/mtip32xx.h 
b/drivers/block/mtip32xx/mtip32xx.h
index 4b9b554..ba1b31e 100644
--- a/drivers/block/mtip32xx/mtip32xx.h
+++ b/drivers/block/mtip32xx/mtip32xx.h
@@ -493,19 +493,19 @@ struct driver_data {
 
struct workqueue_struct *isr_workq;
 
-   struct mtip_work work[MTIP_MAX_SLOT_GROUPS];
-
atomic_t irq_workers_active;
 
+   struct mtip_work work[MTIP_MAX_SLOT_GROUPS];
+
int isr_binding;
 
struct block_device *bdev;
 
-   int unal_qdepth; /* qdepth of unaligned IO queue */
-
struct list_head online_list; /* linkage for online list */
 
struct list_head remove_list; /* linkage for removing list */
+
+   int unal_qdepth; /* qdepth of unaligned IO queue */
 };
 
 #endif

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mtip32xx: switch to vmalloc() to mitigate high order allocation failures

2014-03-13 Thread Sam Bradshaw
The mtip_port kmalloc() allocation is relatively high order, in particular 
since the recent doubling of the size of the scatterlist container.  The 
allocation has been shown to fail during SRSI under fragmented or low memory 
conditions.  Switching to vmalloc() mitigates the problem.  Converted both 
mtip_port and driver_data allocations for consistency.

Signed-off-by: Sam Bradshaw 
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index 624e9d9..f55e3e5 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3116,7 +3116,7 @@ static int mtip_free_orphan(struct driver_data *dd)
dd->queue = NULL;
}
}
-   kfree(dd);
+   vfree(dd);
return 0;
 }
 
@@ -3495,8 +3495,7 @@ static int mtip_hw_init(struct driver_data *dd)
 
hba_setup(dd);
 
-   dd->port = kzalloc_node(sizeof(struct mtip_port), GFP_KERNEL,
-   dd->numa_node);
+   dd->port = vzalloc_node(sizeof(struct mtip_port), dd->numa_node);
if (!dd->port) {
dev_err(>pdev->dev,
"Memory allocation: port structure\n");
@@ -3680,7 +3679,7 @@ out2:
 
 out1:
/* Free the memory allocated for the for structure. */
-   kfree(dd->port);
+   vfree(dd->port);
 
return rv;
 }
@@ -3724,7 +3723,7 @@ static int mtip_hw_exit(struct driver_data *dd)
mtip_dma_free(dd);
 
/* Free the memory allocated for the for structure. */
-   kfree(dd->port);
+   vfree(dd->port);
dd->port = NULL;
 
return 0;
@@ -4512,7 +4511,7 @@ static int mtip_pci_probe(struct pci_dev *pdev,
my_node, pcibus_to_node(pdev->bus), dev_to_node(>dev),
cpu_to_node(smp_processor_id()), smp_processor_id());
 
-   dd = kzalloc_node(sizeof(struct driver_data), GFP_KERNEL, my_node);
+   dd = vzalloc_node(sizeof(struct driver_data), my_node);
if (dd == NULL) {
dev_err(>dev,
"Unable to allocate memory for driver data\n");
@@ -4670,7 +4669,7 @@ setmask_err:
pcim_iounmap_regions(pdev, 1 << MTIP_ABAR);
 
 iomap_err:
-   kfree(dd);
+   vfree(dd);
pci_set_drvdata(pdev, NULL);
return rv;
 done:
@@ -4731,7 +4730,7 @@ static void mtip_pci_remove(struct pci_dev *pdev)
spin_unlock_irqrestore(_lock, flags);
 
if (!dd->sr)
-   kfree(dd);
+   vfree(dd);
else
set_bit(MTIP_DDF_REMOVE_DONE_BIT, >dd_flag);
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mtip32xx: mtip_async_complete() bug fixes

2014-03-13 Thread Sam Bradshaw
On 03/13/2014 02:02 PM, Jens Axboe wrote:
> On 03/13/2014 03:09 PM, Sam Bradshaw wrote:
>> This patch fixes 2 issues in the fast completion path:
>> 1) Possible double completions / double dma_unmap_sg() calls due to lack
>> of atomicity in the check and subsequent dereference of the upper layer
>> callback function. Fixed with cmpxchg before unmap and callback.
>> 2) Regression in unaligned IO constraining workaround for p420m devices.
>> Fixed by checking if IO is unaligned and using proper semaphore if so.
> 
> Sam, what is this patch against? It doesn't apply to -rc6 and it doesn't 
> apply to for-3.15/drivers (the latter would be preferred/required). It 
> also seems to have line break issues.
> 

It's against for-3.15/drivers.  Sorry about the breaks; if I remove them
does it apply cleanly?


diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index b2012b7..624e9d9 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -252,38 +252,45 @@ static void mtip_async_complete(struct mtip_port *port,
void *data,
int status)
 {
-   struct mtip_cmd *command;
+   struct mtip_cmd *cmd;
struct driver_data *dd = data;
-   int cb_status = status ? -EIO : 0;
+   int unaligned, cb_status = status ? -EIO : 0;
+   void (*func)(void *, int);
 
if (unlikely(!dd) || unlikely(!port))
return;
 
-   command = >commands[tag];
+   cmd = >commands[tag];
 
if (unlikely(status == PORT_IRQ_TF_ERR)) {
dev_warn(>dd->pdev->dev,
"Command tag %d failed due to TFE\n", tag);
}
 
-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(>pdev->dev,
-   command->sg,
-   command->scatter_ents,
-   command->direction);
+   /* Clear the active flag */
+   atomic_set(>commands[tag].active, 0);
 
/* Upper layer callback */
-   if (likely(command->async_callback))
-   command->async_callback(command->async_data, cb_status);
+   func = cmd->async_callback;
+   if (likely(func && cmpxchg(>async_callback, func, 0) == func)) {
 
-   command->async_callback = NULL;
-   command->comp_func = NULL;
+   /* Unmap the DMA scatter list entries */
+   dma_unmap_sg(>pdev->dev,
+   cmd->sg,
+   cmd->scatter_ents,
+   cmd->direction);
 
-   /* Clear the allocated and active bits for the command */
-   atomic_set(>commands[tag].active, 0);
-   release_slot(port, tag);
+   func(cmd->async_data, cb_status);
+   unaligned = cmd->unaligned;
 
-   up(>cmd_slot);
+   /* Clear the allocated bit for the command */
+   release_slot(port, tag);
+
+   if (unlikely(unaligned))
+   up(>cmd_slot_unal);
+   else
+   up(>cmd_slot);
+   }
 }
 
 /*
@@ -660,11 +667,12 @@ static void mtip_timeout_function(unsigned long int data)
 {
struct mtip_port *port = (struct mtip_port *) data;
struct host_to_dev_fis *fis;
-   struct mtip_cmd *command;
-   int tag, cmdto_cnt = 0;
+   struct mtip_cmd *cmd;
+   int unaligned, tag, cmdto_cnt = 0;
unsigned int bit, group;
unsigned int num_command_slots;
unsigned long to, tagaccum[SLOTBITS_IN_LONGS];
+   void (*func)(void *, int);
 
if (unlikely(!port))
return;
@@ -694,8 +702,8 @@ static void mtip_timeout_function(unsigned long int data)
group = tag >> 5;
bit = tag & 0x1F;
 
-   command = >commands[tag];
-   fis = (struct host_to_dev_fis *) command->command;
+   cmd = >commands[tag];
+   fis = (struct host_to_dev_fis *) cmd->command;
 
set_bit(tag, tagaccum);
cmdto_cnt++;
@@ -709,27 +717,30 @@ static void mtip_timeout_function(unsigned long int data)
 */
writel(1 << bit, port->completed[group]);
 
-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(>dd->pdev->dev,
-   command->sg,
-   command->scatter_ents,
-   command->direction);
+   /* Clear the active flag for the command */
+   atomic_set(>commands[tag].active, 0);
 
-   /* Call the async completion c

[PATCH] mtip32xx: mtip_async_complete() bug fixes

2014-03-13 Thread Sam Bradshaw

This patch fixes 2 issues in the fast completion path:
1) Possible double completions / double dma_unmap_sg() calls due to lack 
of atomicity in the check and subsequent dereference of the upper layer 
callback function.  Fixed with cmpxchg before unmap and callback.
2) Regression in unaligned IO constraining workaround for p420m devices. 
 Fixed by checking if IO is unaligned and using proper semaphore if so.


Bumped version to indicate presence of these fixes.

Signed-off-by: Sam Bradshaw 
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c

index b2012b7..624e9d9 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -252,38 +252,45 @@ static void mtip_async_complete(struct mtip_port 
*port,

void *data,
int status)
 {
-   struct mtip_cmd *command;
+   struct mtip_cmd *cmd;
struct driver_data *dd = data;
-   int cb_status = status ? -EIO : 0;
+   int unaligned, cb_status = status ? -EIO : 0;
+   void (*func)(void *, int);

if (unlikely(!dd) || unlikely(!port))
return;

-   command = >commands[tag];
+   cmd = >commands[tag];

if (unlikely(status == PORT_IRQ_TF_ERR)) {
dev_warn(>dd->pdev->dev,
"Command tag %d failed due to TFE\n", tag);
}

-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(>pdev->dev,
-   command->sg,
-   command->scatter_ents,
-   command->direction);
+   /* Clear the active flag */
+   atomic_set(>commands[tag].active, 0);

/* Upper layer callback */
-   if (likely(command->async_callback))
-   command->async_callback(command->async_data, cb_status);
+   func = cmd->async_callback;
+   if (likely(func && cmpxchg(>async_callback, func, 0) == func)) {

-   command->async_callback = NULL;
-   command->comp_func = NULL;
+   /* Unmap the DMA scatter list entries */
+   dma_unmap_sg(>pdev->dev,
+   cmd->sg,
+   cmd->scatter_ents,
+   cmd->direction);

-   /* Clear the allocated and active bits for the command */
-   atomic_set(>commands[tag].active, 0);
-   release_slot(port, tag);
+   func(cmd->async_data, cb_status);
+   unaligned = cmd->unaligned;

-   up(>cmd_slot);
+   /* Clear the allocated bit for the command */
+   release_slot(port, tag);
+
+   if (unlikely(unaligned))
+   up(>cmd_slot_unal);
+   else
+   up(>cmd_slot);
+   }
 }

 /*
@@ -660,11 +667,12 @@ static void mtip_timeout_function(unsigned long 
int data)

 {
struct mtip_port *port = (struct mtip_port *) data;
struct host_to_dev_fis *fis;
-   struct mtip_cmd *command;
-   int tag, cmdto_cnt = 0;
+   struct mtip_cmd *cmd;
+   int unaligned, tag, cmdto_cnt = 0;
unsigned int bit, group;
unsigned int num_command_slots;
unsigned long to, tagaccum[SLOTBITS_IN_LONGS];
+   void (*func)(void *, int);

if (unlikely(!port))
return;
@@ -694,8 +702,8 @@ static void mtip_timeout_function(unsigned long int 
data)

group = tag >> 5;
bit = tag & 0x1F;

-   command = >commands[tag];
-   fis = (struct host_to_dev_fis *) command->command;
+   cmd = >commands[tag];
+   fis = (struct host_to_dev_fis *) cmd->command;

set_bit(tag, tagaccum);
cmdto_cnt++;
@@ -709,27 +717,30 @@ static void mtip_timeout_function(unsigned long 
int data)

 */
writel(1 << bit, port->completed[group]);

-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(>dd->pdev->dev,
-   command->sg,
-   command->scatter_ents,
-   command->direction);
+   /* Clear the active flag for the command */
+   atomic_set(>commands[tag].active, 0);

-   /* Call the async completion callback. */
-   if (likely(command->async_callback))
-   command->async_callback(command->async_data,
--EIO);
-   command->async_callback = NULL;
-   command->comp_func = NULL;
+   func = cmd->async_call

[PATCH] mtip32xx: mtip_async_complete() bug fixes

2014-03-13 Thread Sam Bradshaw

This patch fixes 2 issues in the fast completion path:
1) Possible double completions / double dma_unmap_sg() calls due to lack 
of atomicity in the check and subsequent dereference of the upper layer 
callback function.  Fixed with cmpxchg before unmap and callback.
2) Regression in unaligned IO constraining workaround for p420m devices. 
 Fixed by checking if IO is unaligned and using proper semaphore if so.


Bumped version to indicate presence of these fixes.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c

index b2012b7..624e9d9 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -252,38 +252,45 @@ static void mtip_async_complete(struct mtip_port 
*port,

void *data,
int status)
 {
-   struct mtip_cmd *command;
+   struct mtip_cmd *cmd;
struct driver_data *dd = data;
-   int cb_status = status ? -EIO : 0;
+   int unaligned, cb_status = status ? -EIO : 0;
+   void (*func)(void *, int);

if (unlikely(!dd) || unlikely(!port))
return;

-   command = port-commands[tag];
+   cmd = port-commands[tag];

if (unlikely(status == PORT_IRQ_TF_ERR)) {
dev_warn(port-dd-pdev-dev,
Command tag %d failed due to TFE\n, tag);
}

-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(dd-pdev-dev,
-   command-sg,
-   command-scatter_ents,
-   command-direction);
+   /* Clear the active flag */
+   atomic_set(port-commands[tag].active, 0);

/* Upper layer callback */
-   if (likely(command-async_callback))
-   command-async_callback(command-async_data, cb_status);
+   func = cmd-async_callback;
+   if (likely(func  cmpxchg(cmd-async_callback, func, 0) == func)) {

-   command-async_callback = NULL;
-   command-comp_func = NULL;
+   /* Unmap the DMA scatter list entries */
+   dma_unmap_sg(dd-pdev-dev,
+   cmd-sg,
+   cmd-scatter_ents,
+   cmd-direction);

-   /* Clear the allocated and active bits for the command */
-   atomic_set(port-commands[tag].active, 0);
-   release_slot(port, tag);
+   func(cmd-async_data, cb_status);
+   unaligned = cmd-unaligned;

-   up(port-cmd_slot);
+   /* Clear the allocated bit for the command */
+   release_slot(port, tag);
+
+   if (unlikely(unaligned))
+   up(port-cmd_slot_unal);
+   else
+   up(port-cmd_slot);
+   }
 }

 /*
@@ -660,11 +667,12 @@ static void mtip_timeout_function(unsigned long 
int data)

 {
struct mtip_port *port = (struct mtip_port *) data;
struct host_to_dev_fis *fis;
-   struct mtip_cmd *command;
-   int tag, cmdto_cnt = 0;
+   struct mtip_cmd *cmd;
+   int unaligned, tag, cmdto_cnt = 0;
unsigned int bit, group;
unsigned int num_command_slots;
unsigned long to, tagaccum[SLOTBITS_IN_LONGS];
+   void (*func)(void *, int);

if (unlikely(!port))
return;
@@ -694,8 +702,8 @@ static void mtip_timeout_function(unsigned long int 
data)

group = tag  5;
bit = tag  0x1F;

-   command = port-commands[tag];
-   fis = (struct host_to_dev_fis *) command-command;
+   cmd = port-commands[tag];
+   fis = (struct host_to_dev_fis *) cmd-command;

set_bit(tag, tagaccum);
cmdto_cnt++;
@@ -709,27 +717,30 @@ static void mtip_timeout_function(unsigned long 
int data)

 */
writel(1  bit, port-completed[group]);

-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(port-dd-pdev-dev,
-   command-sg,
-   command-scatter_ents,
-   command-direction);
+   /* Clear the active flag for the command */
+   atomic_set(port-commands[tag].active, 0);

-   /* Call the async completion callback. */
-   if (likely(command-async_callback))
-   command-async_callback(command-async_data,
--EIO);
-   command-async_callback = NULL;
-   command-comp_func = NULL;
+   func = cmd-async_callback;
+   if (func 
+   cmpxchg(cmd-async_callback, func, 0) == func

Re: [PATCH] mtip32xx: mtip_async_complete() bug fixes

2014-03-13 Thread Sam Bradshaw
On 03/13/2014 02:02 PM, Jens Axboe wrote:
 On 03/13/2014 03:09 PM, Sam Bradshaw wrote:
 This patch fixes 2 issues in the fast completion path:
 1) Possible double completions / double dma_unmap_sg() calls due to lack
 of atomicity in the check and subsequent dereference of the upper layer
 callback function. Fixed with cmpxchg before unmap and callback.
 2) Regression in unaligned IO constraining workaround for p420m devices.
 Fixed by checking if IO is unaligned and using proper semaphore if so.
 
 Sam, what is this patch against? It doesn't apply to -rc6 and it doesn't 
 apply to for-3.15/drivers (the latter would be preferred/required). It 
 also seems to have line break issues.
 

It's against for-3.15/drivers.  Sorry about the breaks; if I remove them
does it apply cleanly?


diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index b2012b7..624e9d9 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -252,38 +252,45 @@ static void mtip_async_complete(struct mtip_port *port,
void *data,
int status)
 {
-   struct mtip_cmd *command;
+   struct mtip_cmd *cmd;
struct driver_data *dd = data;
-   int cb_status = status ? -EIO : 0;
+   int unaligned, cb_status = status ? -EIO : 0;
+   void (*func)(void *, int);
 
if (unlikely(!dd) || unlikely(!port))
return;
 
-   command = port-commands[tag];
+   cmd = port-commands[tag];
 
if (unlikely(status == PORT_IRQ_TF_ERR)) {
dev_warn(port-dd-pdev-dev,
Command tag %d failed due to TFE\n, tag);
}
 
-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(dd-pdev-dev,
-   command-sg,
-   command-scatter_ents,
-   command-direction);
+   /* Clear the active flag */
+   atomic_set(port-commands[tag].active, 0);
 
/* Upper layer callback */
-   if (likely(command-async_callback))
-   command-async_callback(command-async_data, cb_status);
+   func = cmd-async_callback;
+   if (likely(func  cmpxchg(cmd-async_callback, func, 0) == func)) {
 
-   command-async_callback = NULL;
-   command-comp_func = NULL;
+   /* Unmap the DMA scatter list entries */
+   dma_unmap_sg(dd-pdev-dev,
+   cmd-sg,
+   cmd-scatter_ents,
+   cmd-direction);
 
-   /* Clear the allocated and active bits for the command */
-   atomic_set(port-commands[tag].active, 0);
-   release_slot(port, tag);
+   func(cmd-async_data, cb_status);
+   unaligned = cmd-unaligned;
 
-   up(port-cmd_slot);
+   /* Clear the allocated bit for the command */
+   release_slot(port, tag);
+
+   if (unlikely(unaligned))
+   up(port-cmd_slot_unal);
+   else
+   up(port-cmd_slot);
+   }
 }
 
 /*
@@ -660,11 +667,12 @@ static void mtip_timeout_function(unsigned long int data)
 {
struct mtip_port *port = (struct mtip_port *) data;
struct host_to_dev_fis *fis;
-   struct mtip_cmd *command;
-   int tag, cmdto_cnt = 0;
+   struct mtip_cmd *cmd;
+   int unaligned, tag, cmdto_cnt = 0;
unsigned int bit, group;
unsigned int num_command_slots;
unsigned long to, tagaccum[SLOTBITS_IN_LONGS];
+   void (*func)(void *, int);
 
if (unlikely(!port))
return;
@@ -694,8 +702,8 @@ static void mtip_timeout_function(unsigned long int data)
group = tag  5;
bit = tag  0x1F;
 
-   command = port-commands[tag];
-   fis = (struct host_to_dev_fis *) command-command;
+   cmd = port-commands[tag];
+   fis = (struct host_to_dev_fis *) cmd-command;
 
set_bit(tag, tagaccum);
cmdto_cnt++;
@@ -709,27 +717,30 @@ static void mtip_timeout_function(unsigned long int data)
 */
writel(1  bit, port-completed[group]);
 
-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(port-dd-pdev-dev,
-   command-sg,
-   command-scatter_ents,
-   command-direction);
+   /* Clear the active flag for the command */
+   atomic_set(port-commands[tag].active, 0);
 
-   /* Call the async completion callback. */
-   if (likely(command-async_callback))
-   command-async_callback(command-async_data,
--EIO

[PATCH] mtip32xx: switch to vmalloc() to mitigate high order allocation failures

2014-03-13 Thread Sam Bradshaw
The mtip_port kmalloc() allocation is relatively high order, in particular 
since the recent doubling of the size of the scatterlist container.  The 
allocation has been shown to fail during SRSI under fragmented or low memory 
conditions.  Switching to vmalloc() mitigates the problem.  Converted both 
mtip_port and driver_data allocations for consistency.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index 624e9d9..f55e3e5 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3116,7 +3116,7 @@ static int mtip_free_orphan(struct driver_data *dd)
dd-queue = NULL;
}
}
-   kfree(dd);
+   vfree(dd);
return 0;
 }
 
@@ -3495,8 +3495,7 @@ static int mtip_hw_init(struct driver_data *dd)
 
hba_setup(dd);
 
-   dd-port = kzalloc_node(sizeof(struct mtip_port), GFP_KERNEL,
-   dd-numa_node);
+   dd-port = vzalloc_node(sizeof(struct mtip_port), dd-numa_node);
if (!dd-port) {
dev_err(dd-pdev-dev,
Memory allocation: port structure\n);
@@ -3680,7 +3679,7 @@ out2:
 
 out1:
/* Free the memory allocated for the for structure. */
-   kfree(dd-port);
+   vfree(dd-port);
 
return rv;
 }
@@ -3724,7 +3723,7 @@ static int mtip_hw_exit(struct driver_data *dd)
mtip_dma_free(dd);
 
/* Free the memory allocated for the for structure. */
-   kfree(dd-port);
+   vfree(dd-port);
dd-port = NULL;
 
return 0;
@@ -4512,7 +4511,7 @@ static int mtip_pci_probe(struct pci_dev *pdev,
my_node, pcibus_to_node(pdev-bus), dev_to_node(pdev-dev),
cpu_to_node(smp_processor_id()), smp_processor_id());
 
-   dd = kzalloc_node(sizeof(struct driver_data), GFP_KERNEL, my_node);
+   dd = vzalloc_node(sizeof(struct driver_data), my_node);
if (dd == NULL) {
dev_err(pdev-dev,
Unable to allocate memory for driver data\n);
@@ -4670,7 +4669,7 @@ setmask_err:
pcim_iounmap_regions(pdev, 1  MTIP_ABAR);
 
 iomap_err:
-   kfree(dd);
+   vfree(dd);
pci_set_drvdata(pdev, NULL);
return rv;
 done:
@@ -4731,7 +4730,7 @@ static void mtip_pci_remove(struct pci_dev *pdev)
spin_unlock_irqrestore(dev_lock, flags);
 
if (!dd-sr)
-   kfree(dd);
+   vfree(dd);
else
set_bit(MTIP_DDF_REMOVE_DONE_BIT, dd-dd_flag);
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request

2014-03-12 Thread Sam Bradshaw

On 03/12/2014 09:05 AM, Felipe Franciosi wrote:

If the buffers are unmapped after completing a request, then stale data
might be in the request.


Good find, Felipe, thank you.  I would prefer something along the lines 
of this patch to make sure to avoid double completions / dma_unmap_sg() 
calls during surprise removal and/or timeout conditions.


Jens: note that this patch also fixes a regression in the unaligned 
workaround implementation that was introduced by the SRSI patch.


Signed-off-by: Sam Bradshaw 
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c

index 5160269..390ac6f 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -252,38 +252,45 @@ static void mtip_async_complete(struct mtip_port 
*port,

void *data,
int status)
 {
-   struct mtip_cmd *command;
+   struct mtip_cmd *cmd;
struct driver_data *dd = data;
-   int cb_status = status ? -EIO : 0;
+   int unaligned, cb_status = status ? -EIO : 0;
+   void (*func)(void *, int);

if (unlikely(!dd) || unlikely(!port))
return;

-   command = >commands[tag];
+   cmd = >commands[tag];

if (unlikely(status == PORT_IRQ_TF_ERR)) {
dev_warn(>dd->pdev->dev,
"Command tag %d failed due to TFE\n", tag);
}

+   /* Clear the active flag */
+   atomic_set(>commands[tag].active, 0);
+
/* Upper layer callback */
-   if (likely(command->async_callback))
-   command->async_callback(command->async_data, cb_status);
+   func = cmd->async_callback;
+   if (likely(func && cmpxchg(>async_callback, func, 0) == func)) {

-   command->async_callback = NULL;
-   command->comp_func = NULL;
+   /* Unmap the DMA scatter list entries */
+   dma_unmap_sg(>pdev->dev,
+   cmd->sg,
+   cmd->scatter_ents,
+   cmd->direction);

-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(>pdev->dev,
-   command->sg,
-   command->scatter_ents,
-   command->direction);
+   func(cmd->async_data, cb_status);
+   unaligned = cmd->unaligned;

-   /* Clear the allocated and active bits for the command */
-   atomic_set(>commands[tag].active, 0);
-   release_slot(port, tag);
+   /* Clear the allocated bit for the command */
+   release_slot(port, tag);

-   up(>cmd_slot);
+   if (unlikely(unaligned))
+   up(>cmd_slot_unal);
+   else
+   up(>cmd_slot);
+   }
 }

 /*
@@ -660,11 +667,12 @@ static void mtip_timeout_function(unsigned long 
int data)

 {
struct mtip_port *port = (struct mtip_port *) data;
struct host_to_dev_fis *fis;
-   struct mtip_cmd *command;
-   int tag, cmdto_cnt = 0;
+   struct mtip_cmd *cmd;
+   int unaligned, tag, cmdto_cnt = 0;
unsigned int bit, group;
unsigned int num_command_slots;
unsigned long to, tagaccum[SLOTBITS_IN_LONGS];
+   void (*func)(void *, int);

if (unlikely(!port))
return;
@@ -694,8 +702,8 @@ static void mtip_timeout_function(unsigned long int 
data)

group = tag >> 5;
bit = tag & 0x1F;

-   command = >commands[tag];
-   fis = (struct host_to_dev_fis *) command->command;
+   cmd = >commands[tag];
+   fis = (struct host_to_dev_fis *) cmd->command;

set_bit(tag, tagaccum);
cmdto_cnt++;
@@ -709,27 +717,30 @@ static void mtip_timeout_function(unsigned long 
int data)

 */
writel(1 << bit, port->completed[group]);

-   /* Call the async completion callback. */
-   if (likely(command->async_callback))
-   command->async_callback(command->async_data,
--EIO);
-   command->async_callback = NULL;
-   command->comp_func = NULL;
+   /* Clear the active flag for the command */
+   atomic_set(>commands[tag].active, 0);

-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(>dd->pdev->dev,
-   command->sg,
-   command->scatter_ents,
-   command-&

Re: [PATCH 2/2] mtip32xx: Unmap the DMA segments before completing the IO request

2014-03-12 Thread Sam Bradshaw

On 03/12/2014 09:05 AM, Felipe Franciosi wrote:

If the buffers are unmapped after completing a request, then stale data
might be in the request.


Good find, Felipe, thank you.  I would prefer something along the lines 
of this patch to make sure to avoid double completions / dma_unmap_sg() 
calls during surprise removal and/or timeout conditions.


Jens: note that this patch also fixes a regression in the unaligned 
workaround implementation that was introduced by the SRSI patch.


Signed-off-by: Sam Bradshaw sbrads...@micron.com
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c

index 5160269..390ac6f 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -252,38 +252,45 @@ static void mtip_async_complete(struct mtip_port 
*port,

void *data,
int status)
 {
-   struct mtip_cmd *command;
+   struct mtip_cmd *cmd;
struct driver_data *dd = data;
-   int cb_status = status ? -EIO : 0;
+   int unaligned, cb_status = status ? -EIO : 0;
+   void (*func)(void *, int);

if (unlikely(!dd) || unlikely(!port))
return;

-   command = port-commands[tag];
+   cmd = port-commands[tag];

if (unlikely(status == PORT_IRQ_TF_ERR)) {
dev_warn(port-dd-pdev-dev,
Command tag %d failed due to TFE\n, tag);
}

+   /* Clear the active flag */
+   atomic_set(port-commands[tag].active, 0);
+
/* Upper layer callback */
-   if (likely(command-async_callback))
-   command-async_callback(command-async_data, cb_status);
+   func = cmd-async_callback;
+   if (likely(func  cmpxchg(cmd-async_callback, func, 0) == func)) {

-   command-async_callback = NULL;
-   command-comp_func = NULL;
+   /* Unmap the DMA scatter list entries */
+   dma_unmap_sg(dd-pdev-dev,
+   cmd-sg,
+   cmd-scatter_ents,
+   cmd-direction);

-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(dd-pdev-dev,
-   command-sg,
-   command-scatter_ents,
-   command-direction);
+   func(cmd-async_data, cb_status);
+   unaligned = cmd-unaligned;

-   /* Clear the allocated and active bits for the command */
-   atomic_set(port-commands[tag].active, 0);
-   release_slot(port, tag);
+   /* Clear the allocated bit for the command */
+   release_slot(port, tag);

-   up(port-cmd_slot);
+   if (unlikely(unaligned))
+   up(port-cmd_slot_unal);
+   else
+   up(port-cmd_slot);
+   }
 }

 /*
@@ -660,11 +667,12 @@ static void mtip_timeout_function(unsigned long 
int data)

 {
struct mtip_port *port = (struct mtip_port *) data;
struct host_to_dev_fis *fis;
-   struct mtip_cmd *command;
-   int tag, cmdto_cnt = 0;
+   struct mtip_cmd *cmd;
+   int unaligned, tag, cmdto_cnt = 0;
unsigned int bit, group;
unsigned int num_command_slots;
unsigned long to, tagaccum[SLOTBITS_IN_LONGS];
+   void (*func)(void *, int);

if (unlikely(!port))
return;
@@ -694,8 +702,8 @@ static void mtip_timeout_function(unsigned long int 
data)

group = tag  5;
bit = tag  0x1F;

-   command = port-commands[tag];
-   fis = (struct host_to_dev_fis *) command-command;
+   cmd = port-commands[tag];
+   fis = (struct host_to_dev_fis *) cmd-command;

set_bit(tag, tagaccum);
cmdto_cnt++;
@@ -709,27 +717,30 @@ static void mtip_timeout_function(unsigned long 
int data)

 */
writel(1  bit, port-completed[group]);

-   /* Call the async completion callback. */
-   if (likely(command-async_callback))
-   command-async_callback(command-async_data,
--EIO);
-   command-async_callback = NULL;
-   command-comp_func = NULL;
+   /* Clear the active flag for the command */
+   atomic_set(port-commands[tag].active, 0);

-   /* Unmap the DMA scatter list entries */
-   dma_unmap_sg(port-dd-pdev-dev,
-   command-sg,
-   command-scatter_ents,
-   command-direction);
+   func = cmd-async_callback;
+   if (func 
+   cmpxchg(cmd-async_callback, func, 0) == func

RE: mtip32xx blk-mq status?

2014-02-26 Thread Sam Bradshaw (sbradshaw)

> On 2014-02-26 11:29, Christoph Hellwig wrote:
> > Hi all,
> >
> > with blk-mq stabilizing in mainline and Jens using mtip32xx as tje
> major
> > example drivers in the past is there any progress on getting the
> > conversion finished and merged?
> 
> I'll pick up the pieces as soon as I get back and can test again. The
> basic conversion is easy enough, I'll just dust that off. On top of
> that
> I started splitting the issue groups up, since it actually gets you a
> good step further than just the basic conversion.

If you have a public branch for the latter, send it our way.  We can
help with the SRSI testing.
 
> The only problem area I ran into was the special case of the unaligned
> writes.

Yup, it's a painful limitation.  If you need any p420m hardware, let me 
know.

-Sam
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: mtip32xx blk-mq status?

2014-02-26 Thread Sam Bradshaw (sbradshaw)

 On 2014-02-26 11:29, Christoph Hellwig wrote:
  Hi all,
 
  with blk-mq stabilizing in mainline and Jens using mtip32xx as tje
 major
  example drivers in the past is there any progress on getting the
  conversion finished and merged?
 
 I'll pick up the pieces as soon as I get back and can test again. The
 basic conversion is easy enough, I'll just dust that off. On top of
 that
 I started splitting the issue groups up, since it actually gets you a
 good step further than just the basic conversion.

If you have a public branch for the latter, send it our way.  We can
help with the SRSI testing.
 
 The only problem area I ran into was the special case of the unaligned
 writes.

Yup, it's a painful limitation.  If you need any p420m hardware, let me 
know.

-Sam
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mtip32xx: Make SGL container per-command to eliminate high order dma allocation

2014-01-15 Thread Sam Bradshaw
The mtip32xx driver makes a high order dma memory allocation to store a command
index table, some dedicated buffers, and a command header & SGL blob.  This 
allocation can fail with a surprise insert under low & fragmented memory 
conditions.

This patch breaks these regions up into separate low order allocations and 
increases
the maximum number of segments a single command SGL can have.  We wanted to 
allow at
least 256 segments for 1 MB direct IO.  Since the command header occupies the 
first
0x80 bytes of the SGL blob, that meant we needed two 4k pages to contain the 
header 
and SGL.  The two pages allow up to 504 SGL segments.

Signed-off-by: Sam Bradshaw 
Signed-off-by: Asai Thambi S P 
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index 050c712..ae46bfd 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -41,10 +41,31 @@
 #include "mtip32xx.h"
 
 #define HW_CMD_SLOT_SZ (MTIP_MAX_COMMAND_SLOTS * 32)
-#define HW_CMD_TBL_SZ  (AHCI_CMD_TBL_HDR_SZ + (MTIP_MAX_SG * 16))
-#define HW_CMD_TBL_AR_SZ   (HW_CMD_TBL_SZ * MTIP_MAX_COMMAND_SLOTS)
-#define HW_PORT_PRIV_DMA_SZ \
-   (HW_CMD_SLOT_SZ + HW_CMD_TBL_AR_SZ + AHCI_RX_FIS_SZ)
+
+/* DMA region containing RX Fis, Identify, RLE10, and SMART buffers */
+#define AHCI_RX_FIS_SZ  0x100
+#define AHCI_RX_FIS_OFFSET  0x0
+#define AHCI_IDFY_SZATA_SECT_SIZE
+#define AHCI_IDFY_OFFSET0x400
+#define AHCI_SECTBUF_SZ ATA_SECT_SIZE
+#define AHCI_SECTBUF_OFFSET 0x800
+#define AHCI_SMARTBUF_SZATA_SECT_SIZE
+#define AHCI_SMARTBUF_OFFSET0xC00
+/* 0x100 + 0x200 + 0x200 + 0x200 is smaller than 4k but we pad it out */
+#define BLOCK_DMA_ALLOC_SZ  4096
+
+/* DMA region containing command table (should be 8192 bytes) */
+#define AHCI_CMD_SLOT_SZsizeof(struct mtip_cmd_hdr)
+#define AHCI_CMD_TBL_SZ (MTIP_MAX_COMMAND_SLOTS * AHCI_CMD_SLOT_SZ)
+#define AHCI_CMD_TBL_OFFSET 0x0
+
+/* DMA region per command (contains header and SGL) */
+#define AHCI_CMD_TBL_HDR_SZ 0x80
+#define AHCI_CMD_TBL_HDR_OFFSET 0x0
+#define AHCI_CMD_TBL_SGL_SZ (MTIP_MAX_SG * sizeof(struct mtip_cmd_sg))
+#define AHCI_CMD_TBL_SGL_OFFSET AHCI_CMD_TBL_HDR_SZ
+#define CMD_DMA_ALLOC_SZ(AHCI_CMD_TBL_SGL_SZ + AHCI_CMD_TBL_HDR_SZ)
+
 
 #define HOST_CAP_NZDMA (1 << 19)
 #define HOST_HSORG 0xFC
@@ -3313,6 +3334,118 @@ st_out:
 }
 
 /*
+ * DMA region teardown
+ *
+ * @dd Pointer to driver_data structure
+ *
+ * return value
+ *  None
+ */
+static void mtip_dma_free(struct driver_data *dd)
+{
+   int i;
+   struct mtip_port *port = dd->port;
+
+   if (port->block1)
+   dmam_free_coherent(>pdev->dev, BLOCK_DMA_ALLOC_SZ,
+   port->block1, port->block1_dma);
+
+   if (port->command_list) {
+   dmam_free_coherent(>pdev->dev, AHCI_CMD_TBL_SZ,
+   port->command_list, port->command_list_dma);
+   }
+
+   for (i = 0; i < MTIP_MAX_COMMAND_SLOTS; i++) {
+   if (port->commands[i].command)
+   dmam_free_coherent(>pdev->dev, CMD_DMA_ALLOC_SZ,
+   port->commands[i].command,
+   port->commands[i].command_dma);
+   }
+}
+
+/*
+ * DMA region setup
+ *
+ * @dd Pointer to driver_data structure
+ *
+ * return value
+ *  -ENOMEM Not enough free DMA region space to initialize driver
+ */
+static int mtip_dma_alloc(struct driver_data *dd)
+{
+   struct mtip_port *port = dd->port;
+   int i, rv = 0;
+   u32 host_cap_64 = readl(dd->mmio + HOST_CAP) & HOST_CAP_64;
+
+   /* Allocate dma memory for RX Fis, Identify, and Sector Bufffer */
+   port->block1 =
+   dmam_alloc_coherent(>pdev->dev, BLOCK_DMA_ALLOC_SZ,
+   >block1_dma, GFP_KERNEL);
+   if (!port->block1)
+   return -ENOMEM;
+   memset(port->block1, 0, BLOCK_DMA_ALLOC_SZ);
+
+   /* Allocate dma memory for command list */
+   port->command_list =
+   dmam_alloc_coherent(>pdev->dev, AHCI_CMD_TBL_SZ,
+   >command_list_dma, GFP_KERNEL);
+   if (!port->command_list) {
+   dmam_free_coherent(>pdev->dev, BLOCK_DMA_ALLOC_SZ,
+   port->block1, port->block1_dma);
+   port->block1 = NULL;
+   port->block1_dma = 0;
+   return -ENOMEM;
+   }
+   memset(port->command_list, 0, AHCI_CMD_TBL_SZ);
+
+   /* Setup all pointers into first DMA region */
+   port->rxfis = port->block1 + AHCI_RX_FIS_OFFSET;
+   port->rxfis_dma = port->block1_dma + AHCI_RX_FIS_OFFSET;
+   port-

[PATCH] mtip32xx: Make SGL container per-command to eliminate high order dma allocation

2014-01-15 Thread Sam Bradshaw
The mtip32xx driver makes a high order dma memory allocation to store a command
index table, some dedicated buffers, and a command header  SGL blob.  This 
allocation can fail with a surprise insert under low  fragmented memory 
conditions.

This patch breaks these regions up into separate low order allocations and 
increases
the maximum number of segments a single command SGL can have.  We wanted to 
allow at
least 256 segments for 1 MB direct IO.  Since the command header occupies the 
first
0x80 bytes of the SGL blob, that meant we needed two 4k pages to contain the 
header 
and SGL.  The two pages allow up to 504 SGL segments.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
Signed-off-by: Asai Thambi S P asamymuth...@micron.com
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index 050c712..ae46bfd 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -41,10 +41,31 @@
 #include mtip32xx.h
 
 #define HW_CMD_SLOT_SZ (MTIP_MAX_COMMAND_SLOTS * 32)
-#define HW_CMD_TBL_SZ  (AHCI_CMD_TBL_HDR_SZ + (MTIP_MAX_SG * 16))
-#define HW_CMD_TBL_AR_SZ   (HW_CMD_TBL_SZ * MTIP_MAX_COMMAND_SLOTS)
-#define HW_PORT_PRIV_DMA_SZ \
-   (HW_CMD_SLOT_SZ + HW_CMD_TBL_AR_SZ + AHCI_RX_FIS_SZ)
+
+/* DMA region containing RX Fis, Identify, RLE10, and SMART buffers */
+#define AHCI_RX_FIS_SZ  0x100
+#define AHCI_RX_FIS_OFFSET  0x0
+#define AHCI_IDFY_SZATA_SECT_SIZE
+#define AHCI_IDFY_OFFSET0x400
+#define AHCI_SECTBUF_SZ ATA_SECT_SIZE
+#define AHCI_SECTBUF_OFFSET 0x800
+#define AHCI_SMARTBUF_SZATA_SECT_SIZE
+#define AHCI_SMARTBUF_OFFSET0xC00
+/* 0x100 + 0x200 + 0x200 + 0x200 is smaller than 4k but we pad it out */
+#define BLOCK_DMA_ALLOC_SZ  4096
+
+/* DMA region containing command table (should be 8192 bytes) */
+#define AHCI_CMD_SLOT_SZsizeof(struct mtip_cmd_hdr)
+#define AHCI_CMD_TBL_SZ (MTIP_MAX_COMMAND_SLOTS * AHCI_CMD_SLOT_SZ)
+#define AHCI_CMD_TBL_OFFSET 0x0
+
+/* DMA region per command (contains header and SGL) */
+#define AHCI_CMD_TBL_HDR_SZ 0x80
+#define AHCI_CMD_TBL_HDR_OFFSET 0x0
+#define AHCI_CMD_TBL_SGL_SZ (MTIP_MAX_SG * sizeof(struct mtip_cmd_sg))
+#define AHCI_CMD_TBL_SGL_OFFSET AHCI_CMD_TBL_HDR_SZ
+#define CMD_DMA_ALLOC_SZ(AHCI_CMD_TBL_SGL_SZ + AHCI_CMD_TBL_HDR_SZ)
+
 
 #define HOST_CAP_NZDMA (1  19)
 #define HOST_HSORG 0xFC
@@ -3313,6 +3334,118 @@ st_out:
 }
 
 /*
+ * DMA region teardown
+ *
+ * @dd Pointer to driver_data structure
+ *
+ * return value
+ *  None
+ */
+static void mtip_dma_free(struct driver_data *dd)
+{
+   int i;
+   struct mtip_port *port = dd-port;
+
+   if (port-block1)
+   dmam_free_coherent(dd-pdev-dev, BLOCK_DMA_ALLOC_SZ,
+   port-block1, port-block1_dma);
+
+   if (port-command_list) {
+   dmam_free_coherent(dd-pdev-dev, AHCI_CMD_TBL_SZ,
+   port-command_list, port-command_list_dma);
+   }
+
+   for (i = 0; i  MTIP_MAX_COMMAND_SLOTS; i++) {
+   if (port-commands[i].command)
+   dmam_free_coherent(dd-pdev-dev, CMD_DMA_ALLOC_SZ,
+   port-commands[i].command,
+   port-commands[i].command_dma);
+   }
+}
+
+/*
+ * DMA region setup
+ *
+ * @dd Pointer to driver_data structure
+ *
+ * return value
+ *  -ENOMEM Not enough free DMA region space to initialize driver
+ */
+static int mtip_dma_alloc(struct driver_data *dd)
+{
+   struct mtip_port *port = dd-port;
+   int i, rv = 0;
+   u32 host_cap_64 = readl(dd-mmio + HOST_CAP)  HOST_CAP_64;
+
+   /* Allocate dma memory for RX Fis, Identify, and Sector Bufffer */
+   port-block1 =
+   dmam_alloc_coherent(dd-pdev-dev, BLOCK_DMA_ALLOC_SZ,
+   port-block1_dma, GFP_KERNEL);
+   if (!port-block1)
+   return -ENOMEM;
+   memset(port-block1, 0, BLOCK_DMA_ALLOC_SZ);
+
+   /* Allocate dma memory for command list */
+   port-command_list =
+   dmam_alloc_coherent(dd-pdev-dev, AHCI_CMD_TBL_SZ,
+   port-command_list_dma, GFP_KERNEL);
+   if (!port-command_list) {
+   dmam_free_coherent(dd-pdev-dev, BLOCK_DMA_ALLOC_SZ,
+   port-block1, port-block1_dma);
+   port-block1 = NULL;
+   port-block1_dma = 0;
+   return -ENOMEM;
+   }
+   memset(port-command_list, 0, AHCI_CMD_TBL_SZ);
+
+   /* Setup all pointers into first DMA region */
+   port-rxfis = port-block1 + AHCI_RX_FIS_OFFSET;
+   port-rxfis_dma = port-block1_dma + AHCI_RX_FIS_OFFSET;
+   port-identify  = port-block1 + AHCI_IDFY_OFFSET;
+   port-identify_dma  = port-block1_dma + AHCI_IDFY_OFFSET;
+   port

[PATCH] mtip32xx: Correctly handle security locked condition

2013-10-03 Thread Sam Bradshaw
If power is removed during a secure erase, the drive will end up in a
security locked condition.  This patch causes the driver to identify,
log, and flag the security lock state.  IOs are prevented from
submission to the drive until the locked state is addressed with a
secure erase.

Bumped version number to reflect this capability.

Signed-off-by: Sam Bradshaw 
Signed-off-by: Asai Thambi S P 
---
 drivers/block/mtip32xx/mtip32xx.c |   16 ++--
 drivers/block/mtip32xx/mtip32xx.h |2 +-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index aab28a4..a710f51 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -899,8 +899,9 @@ static void mtip_handle_tfe(struct driver_data *dd)
fail_reason = "thermal shutdown";
}
if (buf[288] == 0xBF) {
+   set_bit(MTIP_DDF_SEC_LOCK_BIT, >dd_flag);
dev_info(>pdev->dev,
-   "Drive indicates rebuild has failed.\n");
+   "Drive indicates rebuild has failed. Secure 
erase required.\n");
fail_all_ncq_cmds = 1;
fail_reason = "rebuild failed";
}
@@ -1566,6 +1567,12 @@ static int mtip_get_identify(struct mtip_port *port, 
void __user *user_buffer)
}
 #endif
 
+   /* Check security locked state */
+   if (port->identify[128] & 0x4)
+   set_bit(MTIP_DDF_SEC_LOCK_BIT, >dd->dd_flag);
+   else
+   clear_bit(MTIP_DDF_SEC_LOCK_BIT, >dd->dd_flag);
+
 #ifdef MTIP_TRIM /* Disabling TRIM support temporarily */
/* Demux ID.DRAT & ID.RZAT to determine trim support */
if (port->identify[69] & (1 << 14) && port->identify[69] & (1 << 5))
@@ -1887,6 +1894,10 @@ static void mtip_dump_identify(struct mtip_port *port)
strlcpy(cbuf, (char *)(port->identify+27), 41);
dev_info(>dd->pdev->dev, "Model: %s\n", cbuf);
 
+   dev_info(>dd->pdev->dev, "Security: %04x %s\n",
+   port->identify[128],
+   port->identify[128] & 0x4 ? "(LOCKED)" : "");
+
if (mtip_hw_get_capacity(port->dd, ))
dev_info(>dd->pdev->dev,
"Capacity: %llu sectors (%llu MB)\n",
@@ -3594,7 +3605,8 @@ static int mtip_hw_exit(struct driver_data *dd)
 * saves its state.
 */
if (!dd->sr) {
-   if (!test_bit(MTIP_DDF_REBUILD_FAILED_BIT, >dd_flag))
+   if (!test_bit(MTIP_PF_REBUILD_BIT, >port->flags) &&
+   !test_bit(MTIP_DDF_SEC_LOCK_BIT, >dd_flag))
if (mtip_standby_immediate(dd->port))
dev_warn(>pdev->dev,
"STANDBY IMMEDIATE failed\n");
diff --git a/drivers/block/mtip32xx/mtip32xx.h 
b/drivers/block/mtip32xx/mtip32xx.h
index 9be7a15..f73444a 100644
--- a/drivers/block/mtip32xx/mtip32xx.h
+++ b/drivers/block/mtip32xx/mtip32xx.h
@@ -92,7 +92,7 @@
 
 /* Driver name and version strings */
 #define MTIP_DRV_NAME  "mtip32xx"
-#define MTIP_DRV_VERSION   "1.2.6os3"
+#define MTIP_DRV_VERSION   "1.3.0"
 
 /* Maximum number of minor device numbers per device. */
 #define MTIP_MAX_MINORS16
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mtip32xx: Correctly handle security locked condition

2013-10-03 Thread Sam Bradshaw
If power is removed during a secure erase, the drive will end up in a
security locked condition.  This patch causes the driver to identify,
log, and flag the security lock state.  IOs are prevented from
submission to the drive until the locked state is addressed with a
secure erase.

Bumped version number to reflect this capability.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
Signed-off-by: Asai Thambi S P asamymuth...@micron.com
---
 drivers/block/mtip32xx/mtip32xx.c |   16 ++--
 drivers/block/mtip32xx/mtip32xx.h |2 +-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index aab28a4..a710f51 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -899,8 +899,9 @@ static void mtip_handle_tfe(struct driver_data *dd)
fail_reason = thermal shutdown;
}
if (buf[288] == 0xBF) {
+   set_bit(MTIP_DDF_SEC_LOCK_BIT, dd-dd_flag);
dev_info(dd-pdev-dev,
-   Drive indicates rebuild has failed.\n);
+   Drive indicates rebuild has failed. Secure 
erase required.\n);
fail_all_ncq_cmds = 1;
fail_reason = rebuild failed;
}
@@ -1566,6 +1567,12 @@ static int mtip_get_identify(struct mtip_port *port, 
void __user *user_buffer)
}
 #endif
 
+   /* Check security locked state */
+   if (port-identify[128]  0x4)
+   set_bit(MTIP_DDF_SEC_LOCK_BIT, port-dd-dd_flag);
+   else
+   clear_bit(MTIP_DDF_SEC_LOCK_BIT, port-dd-dd_flag);
+
 #ifdef MTIP_TRIM /* Disabling TRIM support temporarily */
/* Demux ID.DRAT  ID.RZAT to determine trim support */
if (port-identify[69]  (1  14)  port-identify[69]  (1  5))
@@ -1887,6 +1894,10 @@ static void mtip_dump_identify(struct mtip_port *port)
strlcpy(cbuf, (char *)(port-identify+27), 41);
dev_info(port-dd-pdev-dev, Model: %s\n, cbuf);
 
+   dev_info(port-dd-pdev-dev, Security: %04x %s\n,
+   port-identify[128],
+   port-identify[128]  0x4 ? (LOCKED) : );
+
if (mtip_hw_get_capacity(port-dd, sectors))
dev_info(port-dd-pdev-dev,
Capacity: %llu sectors (%llu MB)\n,
@@ -3594,7 +3605,8 @@ static int mtip_hw_exit(struct driver_data *dd)
 * saves its state.
 */
if (!dd-sr) {
-   if (!test_bit(MTIP_DDF_REBUILD_FAILED_BIT, dd-dd_flag))
+   if (!test_bit(MTIP_PF_REBUILD_BIT, dd-port-flags) 
+   !test_bit(MTIP_DDF_SEC_LOCK_BIT, dd-dd_flag))
if (mtip_standby_immediate(dd-port))
dev_warn(dd-pdev-dev,
STANDBY IMMEDIATE failed\n);
diff --git a/drivers/block/mtip32xx/mtip32xx.h 
b/drivers/block/mtip32xx/mtip32xx.h
index 9be7a15..f73444a 100644
--- a/drivers/block/mtip32xx/mtip32xx.h
+++ b/drivers/block/mtip32xx/mtip32xx.h
@@ -92,7 +92,7 @@
 
 /* Driver name and version strings */
 #define MTIP_DRV_NAME  mtip32xx
-#define MTIP_DRV_VERSION   1.2.6os3
+#define MTIP_DRV_VERSION   1.3.0
 
 /* Maximum number of minor device numbers per device. */
 #define MTIP_MAX_MINORS16
-- 
1.7.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mtip32xx: Correctly handle bio->bi_idx != 0 conditions

2013-05-14 Thread Sam Bradshaw

Stacking drivers may append bvecs to existing bio's, resulting
in non-zero bi_idx conditions.  This patch counts the loops of
bio_for_each_segment() rather than inheriting the bi_idx value
to pass as a segment count to the hardware submission routine.

Signed-off-by: Sam Bradshaw 
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c
b/drivers/block/mtip32xx/mtip32xx.c
index 847107e..7c77ae1 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3863,7 +3863,7 @@ static void mtip_make_request(struct request_queue
*queue, struct bio *bio)
struct driver_data *dd = queue->queuedata;
struct scatterlist *sg;
struct bio_vec *bvec;
-   int nents = 0;
+   int i, nents = 0;
int tag = 0, unaligned = 0;

if (unlikely(dd->dd_flag & MTIP_DDF_STOP_IO)) {
@@ -3921,11 +3921,12 @@ static void mtip_make_request(struct
request_queue *queue, struct bio *bio)
}

/* Create the scatter list for this bio. */
-   bio_for_each_segment(bvec, bio, nents) {
+   bio_for_each_segment(bvec, bio, i) {
sg_set_page([nents],
bvec->bv_page,
bvec->bv_len,
bvec->bv_offset);
+   nents++;
}

/* Issue the read/write. */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH][TRIVIAL] mtip32xx: Fix NULL pointer dereference during module unload

2013-05-14 Thread Sam Bradshaw
An open file-handle to one or more of the driver exported debugfs
nodes causes raciness in recursive removal during module unload;
sometimes a stale parent dentry is dereferenced when more than 1
pci device is present.

Signed-off-by: Sam Bradshaw 
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c
b/drivers/block/mtip32xx/mtip32xx.c
index 847107e..e366c74 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3002,7 +3002,8 @@ static int mtip_hw_debugfs_init(struct driver_data
*dd)

 static void mtip_hw_debugfs_exit(struct driver_data *dd)
 {
-   debugfs_remove_recursive(dd->dfs_node);
+   if (dd->dfs_node)
+   debugfs_remove_recursive(dd->dfs_node);
 }


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH][TRIVIAL] mtip32xx: Fix NULL pointer dereference during module unload

2013-05-14 Thread Sam Bradshaw
An open file-handle to one or more of the driver exported debugfs
nodes causes raciness in recursive removal during module unload;
sometimes a stale parent dentry is dereferenced when more than 1
pci device is present.

Signed-off-by: Sam Bradshaw 
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c
b/drivers/block/mtip32xx/mtip32xx.c
index 847107e..e366c74 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3002,7 +3002,8 @@ static int mtip_hw_debugfs_init(struct driver_data
*dd)

 static void mtip_hw_debugfs_exit(struct driver_data *dd)
 {
-   debugfs_remove_recursive(dd->dfs_node);
+   if (dd->dfs_node)
+   debugfs_remove_recursive(dd->dfs_node);
 }



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH][TRIVIAL] mtip32xx: Fix NULL pointer dereference during module unload

2013-05-14 Thread Sam Bradshaw
An open file-handle to one or more of the driver exported debugfs
nodes causes raciness in recursive removal during module unload;
sometimes a stale parent dentry is dereferenced when more than 1
pci device is present.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c
b/drivers/block/mtip32xx/mtip32xx.c
index 847107e..e366c74 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3002,7 +3002,8 @@ static int mtip_hw_debugfs_init(struct driver_data
*dd)

 static void mtip_hw_debugfs_exit(struct driver_data *dd)
 {
-   debugfs_remove_recursive(dd-dfs_node);
+   if (dd-dfs_node)
+   debugfs_remove_recursive(dd-dfs_node);
 }



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH][TRIVIAL] mtip32xx: Fix NULL pointer dereference during module unload

2013-05-14 Thread Sam Bradshaw
An open file-handle to one or more of the driver exported debugfs
nodes causes raciness in recursive removal during module unload;
sometimes a stale parent dentry is dereferenced when more than 1
pci device is present.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c
b/drivers/block/mtip32xx/mtip32xx.c
index 847107e..e366c74 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3002,7 +3002,8 @@ static int mtip_hw_debugfs_init(struct driver_data
*dd)

 static void mtip_hw_debugfs_exit(struct driver_data *dd)
 {
-   debugfs_remove_recursive(dd-dfs_node);
+   if (dd-dfs_node)
+   debugfs_remove_recursive(dd-dfs_node);
 }


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mtip32xx: Correctly handle bio-bi_idx != 0 conditions

2013-05-14 Thread Sam Bradshaw

Stacking drivers may append bvecs to existing bio's, resulting
in non-zero bi_idx conditions.  This patch counts the loops of
bio_for_each_segment() rather than inheriting the bi_idx value
to pass as a segment count to the hardware submission routine.

Signed-off-by: Sam Bradshaw sbrads...@micron.com
---
diff --git a/drivers/block/mtip32xx/mtip32xx.c
b/drivers/block/mtip32xx/mtip32xx.c
index 847107e..7c77ae1 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3863,7 +3863,7 @@ static void mtip_make_request(struct request_queue
*queue, struct bio *bio)
struct driver_data *dd = queue-queuedata;
struct scatterlist *sg;
struct bio_vec *bvec;
-   int nents = 0;
+   int i, nents = 0;
int tag = 0, unaligned = 0;

if (unlikely(dd-dd_flag  MTIP_DDF_STOP_IO)) {
@@ -3921,11 +3921,12 @@ static void mtip_make_request(struct
request_queue *queue, struct bio *bio)
}

/* Create the scatter list for this bio. */
-   bio_for_each_segment(bvec, bio, nents) {
+   bio_for_each_segment(bvec, bio, i) {
sg_set_page(sg[nents],
bvec-bv_page,
bvec-bv_len,
bvec-bv_offset);
+   nents++;
}

/* Issue the read/write. */
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 2/2] mtip32xx: mtip32xx: Disable TRIM support

2013-04-15 Thread Sam Bradshaw (sbradshaw)

> On Fri, Apr 12 2013, Asai Thambi S P wrote:
> >
> > Temporarily disabling TRIM support until TRIM related issues
> > are addressed in the firmware.
> 
> How serious is this? We do have released kernels out there with the
> driver, you might want to consider a stable backport too.

It's a performance issue, not data integrity.  Since the ATA trim command
is non-queueable, IO traffic must halt while the trim is outstanding.  A 
filesystem that is actively trimming sectors is effectively duty cycling 
the IO path on this part.  We are working to shrink the trim command 
latency while also investigating whether some sort of offline trim is more
appropriate.

Given the incremental endurance gain for trim on 34nm SLC vs the severe 
throughput degradation under some workloads, it seemed prudent to disable
the feature.

If you think that warrants a stable backport, just let me or Asai know.

-Sam


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 2/2] mtip32xx: mtip32xx: Disable TRIM support

2013-04-15 Thread Sam Bradshaw (sbradshaw)

 On Fri, Apr 12 2013, Asai Thambi S P wrote:
 
  Temporarily disabling TRIM support until TRIM related issues
  are addressed in the firmware.
 
 How serious is this? We do have released kernels out there with the
 driver, you might want to consider a stable backport too.

It's a performance issue, not data integrity.  Since the ATA trim command
is non-queueable, IO traffic must halt while the trim is outstanding.  A 
filesystem that is actively trimming sectors is effectively duty cycling 
the IO path on this part.  We are working to shrink the trim command 
latency while also investigating whether some sort of offline trim is more
appropriate.

Given the incremental endurance gain for trim on 34nm SLC vs the severe 
throughput degradation under some workloads, it seemed prudent to disable
the feature.

If you think that warrants a stable backport, just let me or Asai know.

-Sam


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: block: Add driver for Micron RealSSD pcie flash cards

2012-10-08 Thread Sam Bradshaw (sbradshaw)
> -Original Message-
> From: Dan Carpenter [mailto:dan.carpen...@oracle.com]
> Sent: Monday, October 08, 2012 12:13 AM
> To: Sam Bradshaw (sbradshaw)
> Cc: Jens Axboe; linux-kernel@vger.kernel.org
> Subject: re: block: Add driver for Micron RealSSD pcie flash cards
> 
> Hello Sam Bradshaw,
> 
> The patch 88523a61558a: "block: Add driver for Micron RealSSD pcie
> flash cards" from Aug 30, 2011, creates an endian bug:
> drivers/block/mtip32xx/mtip32xx.c:2468 mtip_hw_submit_io()
> 
> drivers/block/mtip32xx/mtip32xx.c
>   2465  fis->opts= 1 << 7;
>   2466  fis->command =
>   2467  (dir == READ ? ATA_CMD_FPDMA_READ : 
> ATA_CMD_FPDMA_WRITE);
>   2468  *((unsigned int *) >lba_low) = (start & 0xFF);
> 
>   2469  *((unsigned int *) >lba_low_ex) = ((start >> 24) & 
> 0xFF);
> ^^^
> 
> These addresses point to a series of four unions inside the
> host_to_dev_fis struct.  The unions will be set to different values
> depending on if the CPU is big or little endian.
> 
> If you knew which endianness this works on, then you could probably add
> a cpu_to_be32() or cpu_to_le32().

Thanks for the notification; we'll get a patch out asap.

-Sam
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: block: Add driver for Micron RealSSD pcie flash cards

2012-10-08 Thread Sam Bradshaw (sbradshaw)
 -Original Message-
 From: Dan Carpenter [mailto:dan.carpen...@oracle.com]
 Sent: Monday, October 08, 2012 12:13 AM
 To: Sam Bradshaw (sbradshaw)
 Cc: Jens Axboe; linux-kernel@vger.kernel.org
 Subject: re: block: Add driver for Micron RealSSD pcie flash cards
 
 Hello Sam Bradshaw,
 
 The patch 88523a61558a: block: Add driver for Micron RealSSD pcie
 flash cards from Aug 30, 2011, creates an endian bug:
 drivers/block/mtip32xx/mtip32xx.c:2468 mtip_hw_submit_io()
 
 drivers/block/mtip32xx/mtip32xx.c
   2465  fis-opts= 1  7;
   2466  fis-command =
   2467  (dir == READ ? ATA_CMD_FPDMA_READ : 
 ATA_CMD_FPDMA_WRITE);
   2468  *((unsigned int *) fis-lba_low) = (start  0xFF);
 
   2469  *((unsigned int *) fis-lba_low_ex) = ((start  24)  
 0xFF);
 ^^^
 
 These addresses point to a series of four unions inside the
 host_to_dev_fis struct.  The unions will be set to different values
 depending on if the CPU is big or little endian.
 
 If you knew which endianness this works on, then you could probably add
 a cpu_to_be32() or cpu_to_le32().

Thanks for the notification; we'll get a patch out asap.

-Sam
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/