Re: 3.12: raid-1 mismatch_cnt question
On Mon, Nov 11, 2013 at 7:39 PM, Brad Campbell wrote:
> On 11/07/2013 06:54 PM, Justin Piszcz wrote:
>> On Mon, Nov 4, 2013 at 5:25 AM, Justin Piszcz wrote:
>>> Hi,
>>>
>>> I run two SSDs in a RAID-1 configuration and I have a swap partition on a
>>> third SSD. Over time, the mismatch_cnt between the two devices grows
>>> higher and higher.
>
> Are both SSDs identical? Do you have discard enabled on the filesystem?

Yes (2 x Intel SSDSC2CW240A3) & yes
(/dev/root on / type ext4 (rw,relatime,discard,data=ordered))

> The reason I ask is I have a RAID10 comprised of 3 Intel and 3 Samsung
> SSDs. The Intel return 0 after TRIM while the Samsung don't, so I _always_
> have a massive mismatch_cnt after I run fstrim. I never use a repair
> operation as it's just going to re-write the already trimmed sectors.

Very interesting and good to know!

Justin.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 3.12: raid-1 mismatch_cnt question
On 11/07/2013 06:54 PM, Justin Piszcz wrote:
> On Mon, Nov 4, 2013 at 5:25 AM, Justin Piszcz wrote:
>> Hi,
>>
>> I run two SSDs in a RAID-1 configuration and I have a swap partition on a
>> third SSD. Over time, the mismatch_cnt between the two devices grows
>> higher and higher.

Are both SSDs identical? Do you have discard enabled on the filesystem?

The reason I ask is I have a RAID10 comprised of 3 Intel and 3 Samsung
SSDs. The Intel return 0 after TRIM while the Samsung don't, so I _always_
have a massive mismatch_cnt after I run fstrim. I never use a repair
operation as it's just going to re-write the already trimmed sectors.

Just a thought.
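One way to check the TRIM behavior described above is to read back a region that is known to have been trimmed on each member device and test whether it is all zeros. The following is only a minimal Python sketch: `region_is_zero` is a hypothetical helper name, and the caller has to supply the device path, offset and length. Reading a raw block device normally needs root, and the result may be served from the page cache rather than from the drive, so treat the answer as a hint and not as proof.

```python
def region_is_zero(path, offset, length, chunk_size=1 << 20):
    """Return True if `length` bytes starting at `offset` in `path` are all zero.

    `path` may be a regular file or a block device node. This is an
    illustrative helper and not part of any md or kernel tooling.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            data = f.read(min(chunk_size, remaining))
            if not data:
                # End of file or device reached before `length` bytes were read.
                break
            if any(data):
                # At least one non-zero byte in this chunk.
                return False
            remaining -= len(data)
    return True
```

Running it over the same trimmed range on an Intel and on a Samsung member would show whether the two drives disagree about what a trimmed sector contains, which is the condition that makes a check report mismatches.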
Re: 3.12: raid-1 mismatch_cnt question
On Mon, Nov 4, 2013 at 5:25 AM, Justin Piszcz wrote:
> Hi,
>
> I run two SSDs in a RAID-1 configuration and I have a swap partition on a
> third SSD. Over time, the mismatch_cnt between the two devices grows higher
> and higher.
>
> Once a week, I run a check and repair against the md devices to help bring
> the mismatch_cnt down. When I run the check and repair, the system is live
> so there are various logs/processes writing to disk. The system also has
> ECC memory and there are no errors reported.
>
> The following graph is the mismatch_cnt from June 2013 to current; each drop
> represents a check+repair. In September, I dropped the kernel/vm caches
> before running check/repair and that seemed to help a bit.
> http://home.comcast.net/~jpiszcz/20131104/md_raid_mismatch_cnt.png
>
> My question is: is this normal or should the mismatch_cnt always be 0 unless
> there is a HW or md/driver issue?
>
> Justin.

Hi,

Could anyone please comment if this is normal/expected behavior?

Thanks,

Justin.
3.12: raid-1 mismatch_cnt question
Hi,

I run two SSDs in a RAID-1 configuration and I have a swap partition on a
third SSD. Over time, the mismatch_cnt between the two devices grows higher
and higher.

Once a week, I run a check and repair against the md devices to help bring
the mismatch_cnt down. When I run the check and repair, the system is live
so there are various logs/processes writing to disk. The system also has
ECC memory and there are no errors reported.

The following graph is the mismatch_cnt from June 2013 to current; each drop
represents a check+repair. In September, I dropped the kernel/vm caches
before running check/repair and that seemed to help a bit.
http://home.comcast.net/~jpiszcz/20131104/md_raid_mismatch_cnt.png

My question is: is this normal or should the mismatch_cnt always be 0 unless
there is a HW or md/driver issue?

Justin.
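For readers who want to reproduce the weekly check and repair described above, md exposes both the counter and the scrub trigger through sysfs. The following is a minimal Python sketch and not the script used in this thread: it assumes the usual layout `/sys/block/<dev>/md/mismatch_cnt` and `/sys/block/<dev>/md/sync_action`, and the function names and the `sysfs_root` parameter are hypothetical, added so the helpers can be exercised against a scratch directory.

```python
from pathlib import Path


def md_attr(dev, name, sysfs_root="/sys"):
    """Build the path of an md sysfs attribute, e.g. /sys/block/md0/md/mismatch_cnt."""
    return Path(sysfs_root) / "block" / dev / "md" / name


def read_mismatch_cnt(dev, sysfs_root="/sys"):
    """Return the current mismatch_cnt of the given md device as an integer."""
    return int(md_attr(dev, "mismatch_cnt", sysfs_root).read_text().strip())


def request_sync_action(dev, action, sysfs_root="/sys"):
    """Ask md to start a scrub by writing 'check' or 'repair' to sync_action.

    Writing to the real sysfs file requires root. The call returns as soon
    as the request is written; it does not wait for the scrub to finish.
    """
    if action not in ("check", "repair"):
        raise ValueError("action must be 'check' or 'repair'")
    md_attr(dev, "sync_action", sysfs_root).write_text(action + "\n")
```

On a live system, and assuming the array is md0, `request_sync_action("md0", "check")` followed by `read_mismatch_cnt("md0")` once the scrub has finished corresponds to one sample on the graph linked above.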