Re: 3.12: raid-1 mismatch_cnt question

2013-11-12 Thread Justin Piszcz
On Mon, Nov 11, 2013 at 7:39 PM, Brad Campbell
<lists2...@fnarfbargle.com> wrote:
> On 11/07/2013 06:54 PM, Justin Piszcz wrote:
>>
>> On Mon, Nov 4, 2013 at 5:25 AM, Justin Piszcz <jpis...@lucidpixels.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> I run two SSDs in a RAID-1 configuration and I have a swap partition on a
>>> third SSD.  Over time, the mismatch_cnt between the two devices grows
>>> higher
>>> and higher.
>>>
>
> Are both SSDs identical? Do you have discard enabled on the filesystem?
Yes (2 x Intel SSDSC2CW240A3), and yes (/dev/root on / type ext4
(rw,relatime,discard,data=ordered))

>
> The reason I ask is I have a RAID10 comprised of 3 Intel and 3 Samsung
> SSDs. The Intel drives return zeros after TRIM while the Samsung drives
> don't, so I _always_ have a massive mismatch_cnt after I run fstrim. I never
> use a repair operation as it's just going to re-write the already trimmed
> sectors.
Very interesting and good to know!

Justin.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 3.12: raid-1 mismatch_cnt question

2013-11-11 Thread Brad Campbell

On 11/07/2013 06:54 PM, Justin Piszcz wrote:
>
> On Mon, Nov 4, 2013 at 5:25 AM, Justin Piszcz <jpis...@lucidpixels.com> wrote:
>>
>> Hi,
>>
>> I run two SSDs in a RAID-1 configuration and I have a swap partition on a
>> third SSD.  Over time, the mismatch_cnt between the two devices grows higher
>> and higher.
>>

Are both SSDs identical? Do you have discard enabled on the filesystem?

The reason I ask is I have a RAID10 comprised of 3 Intel and 3 Samsung
SSDs. The Intel drives return zeros after TRIM while the Samsung drives
don't, so I _always_ have a massive mismatch_cnt after I run fstrim. I never
use a repair operation as it's just going to re-write the already trimmed
sectors.



Just a thought.
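
The effect described above can be illustrated with a toy model. This is only
a sketch under stated assumptions: ToySSD, simulate_mirror and the 4096-byte
block size are hypothetical names and values made up for illustration, not
anything from md or from the drives mentioned in this thread. A drive that
does not return zeros after TRIM is modelled here as simply leaving the old
data readable, which is one possible behaviour among several. The toy counts
differing blocks; the real mismatch_cnt has its own units and granularity.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


class ToySSD:
    """Hypothetical in-memory stand-in for one mirror member."""

    def __init__(self, n_blocks, zero_after_trim):
        self.blocks = [bytes(BLOCK_SIZE)] * n_blocks
        self.zero_after_trim = zero_after_trim

    def write(self, index, data):
        self.blocks[index] = data

    def trim(self, index):
        # A drive that returns zeros after TRIM reads back zero-filled.
        # The other kind is modelled as still returning the stale data.
        if self.zero_after_trim:
            self.blocks[index] = bytes(BLOCK_SIZE)


def count_mismatches(a, b):
    """Count blocks whose contents differ between the two members."""
    return sum(1 for x, y in zip(a.blocks, b.blocks) if x != y)


def simulate_mirror(n_blocks, trimmed, zero_a, zero_b):
    """Write identical data to both members, TRIM some blocks, compare."""
    a = ToySSD(n_blocks, zero_a)
    b = ToySSD(n_blocks, zero_b)
    for i in range(n_blocks):
        data = bytes([i % 255 + 1]) * BLOCK_SIZE  # never all zeros
        a.write(i, data)
        b.write(i, data)
    for i in trimmed:
        a.trim(i)
        b.trim(i)
    return count_mismatches(a, b)
```

With two members that both zero on TRIM, simulate_mirror(8, [1, 3, 5], True,
True) reports no differing blocks. With one member of each kind, every
trimmed block differs, even though no live data is affected.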




Re: 3.12: raid-1 mismatch_cnt question

2013-11-07 Thread Justin Piszcz
On Mon, Nov 4, 2013 at 5:25 AM, Justin Piszcz <jpis...@lucidpixels.com> wrote:
> Hi,
>
> I run two SSDs in a RAID-1 configuration and I have a swap partition on a
> third SSD.  Over time, the mismatch_cnt between the two devices grows higher
> and higher.
>
> Once a week, I run a check and repair against the md devices to help bring
> the mismatch_cnt down.  When I run the check and repair, the system is live
> so there are various logs/processes writing to disk.  The system also has
> ECC memory and there are no errors reported.
>
> The following graph is the mismatch_cnt from June 2013 to current; each drop
> represents a check+repair.  In September, I dropped the kernel/vm caches
> before running check/repair and that seemed to help a bit.
> http://home.comcast.net/~jpiszcz/20131104/md_raid_mismatch_cnt.png
>
> My question is: is this normal or should the mismatch_cnt always be 0 unless
> there is a HW or md/driver issue?
>
> Justin.
>

Hi,

Could anyone please comment on whether this is normal/expected behavior?

Thanks,

Justin.


3.12: raid-1 mismatch_cnt question

2013-11-04 Thread Justin Piszcz
Hi,

I run two SSDs in a RAID-1 configuration and I have a swap partition on a
third SSD.  Over time, the mismatch_cnt between the two devices grows higher
and higher.

Once a week, I run a check and repair against the md devices to help bring
the mismatch_cnt down.  When I run the check and repair, the system is live
so there are various logs/processes writing to disk.  The system also has
ECC memory and there are no errors reported.
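
For readers who want to reproduce the weekly routine described above, here
is a minimal sketch of reading the counter and requesting a pass through the
md sysfs files mismatch_cnt and sync_action. The function names, the
sysfs_root parameter and the array name "md0" are assumptions for
illustration; the thread does not say how the check and repair were actually
invoked. Writing to sync_action on a real system needs root.

```python
from pathlib import Path


def read_mismatch_cnt(md_name, sysfs_root="/sys/block"):
    """Return the current mismatch_cnt of an md array such as 'md0'."""
    path = Path(sysfs_root) / md_name / "md" / "mismatch_cnt"
    return int(path.read_text().strip())


def request_sync_action(md_name, action, sysfs_root="/sys/block"):
    """Ask md to start a 'check' or 'repair' pass on the array."""
    if action not in ("check", "repair"):
        raise ValueError("action must be 'check' or 'repair'")
    path = Path(sysfs_root) / md_name / "md" / "sync_action"
    path.write_text(action + "\n")
```

A typical use would be to call request_sync_action("md0", "check"), wait for
the pass to finish, and then log read_mismatch_cnt("md0"). The sysfs_root
parameter exists only so the functions can be pointed at a scratch directory
for testing.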

The following graph is the mismatch_cnt from June 2013 to current; each drop
represents a check+repair.  In September, I dropped the kernel/vm caches
before running check/repair and that seemed to help a bit.
http://home.comcast.net/~jpiszcz/20131104/md_raid_mismatch_cnt.png

My question is: is this normal or should the mismatch_cnt always be 0 unless
there is a HW or md/driver issue?

Justin.


