Re: ZFS, SSDs, and TRIM performance

2015-11-03 Thread Steven Hartland
This is something we've already done in FreeBSD; both myself and others have
iterated a few times on this very thing. There's currently nothing outstanding that
I'm aware of, so it's important to capture the details as people experience them to see
if there is any more work to do in this area.

On 03/11/2015 09:12, Nicolas Gilles wrote:

Not sure about the Samsung XS1715, but lots of SSDs seem to suck at
large amounts of TRIM in general, leading to a "let me pause everything
for a while" symptom. In fact I think there is work in ZFS to make
TRIMs work better, and to throttle them in case large amounts are
freed to avoid this kind of starvation.

-- Nicolas


On Thu, Oct 29, 2015 at 7:22 PM, Steven Hartland wrote:

If you're running NVMe, are you running a version which has this:
https://svnweb.freebsd.org/base?view=revision&revision=285767

I'm pretty sure 10.2 does have that, so you should be good, but best to
check.

Other questions:
1. What does "gstat -d -p" show during the stalls?
2. Do you have any other zfs tuning in place?

On 29/10/2015 16:54, Sean Kelly wrote:

Me again. I have a new issue and I’m not sure if it is hardware or
software. I have nine servers running 10.2-RELEASE-p5 with Dell OEM’d
Samsung XS1715 NVMe SSDs. They are paired up in a single mirrored zpool on
each server. They perform great most of the time. However, I have a problem
when ZFS fires off TRIMs. Not during vdev creation, but like if I delete a
20GB snapshot.

If I destroy a 20GB snapshot or delete large files, ZFS fires off tons of
TRIMs to the disks. I can see the kstat.zfs.misc.zio_trim.success and
kstat.zfs.misc.zio_trim.bytes sysctls skyrocket. While this is happening,
any synchronous writes seem to block. For example, we’re running PostgreSQL
which does fsync()s all the time. While these TRIMs happen, Postgres just
hangs on writes. This causes reads to block due to lock contention as well.

If I change sync=disabled on my tank/pgsql dataset while this is
happening, it unblocks for the most part. But obviously this is not an ideal
way to run PostgreSQL.

I’m working with my vendor to get some Intel SSDs to test, but any ideas
if this could somehow be a software issue? Or does the Samsung XS1715 just
suck at TRIM and SYNC?

We’re thinking of just setting the vfs.zfs.trim.enabled=0 tunable for now
since WAL segment turnover actually causes TRIM operations a lot, but
unfortunately this requires a reboot. But disabling TRIM does seem to fix the
issue on other servers I’ve tested with the same hardware config.


___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: ZFS, SSDs, and TRIM performance

2015-11-03 Thread Nicolas Gilles
Not sure about the Samsung XS1715, but lots of SSDs seem to suck at
large amounts of TRIM in general, leading to a "let me pause everything
for a while" symptom. In fact I think there is work in ZFS to make
TRIMs work better, and to throttle them in case large amounts are
freed to avoid this kind of starvation.

-- Nicolas


On Thu, Oct 29, 2015 at 7:22 PM, Steven Hartland wrote:
> If you're running NVMe, are you running a version which has this:
> https://svnweb.freebsd.org/base?view=revision&revision=285767
>
> I'm pretty sure 10.2 does have that, so you should be good, but best to
> check.
>
> Other questions:
> 1. What does "gstat -d -p" show during the stalls?
> 2. Do you have any other zfs tuning in place?
>
> On 29/10/2015 16:54, Sean Kelly wrote:
>>
>> Me again. I have a new issue and I’m not sure if it is hardware or
>> software. I have nine servers running 10.2-RELEASE-p5 with Dell OEM’d
>> Samsung XS1715 NVMe SSDs. They are paired up in a single mirrored zpool on
>> each server. They perform great most of the time. However, I have a problem
>> when ZFS fires off TRIMs. Not during vdev creation, but like if I delete a
>> 20GB snapshot.
>>
>> If I destroy a 20GB snapshot or delete large files, ZFS fires off tons of
>> TRIMs to the disks. I can see the kstat.zfs.misc.zio_trim.success and
>> kstat.zfs.misc.zio_trim.bytes sysctls skyrocket. While this is happening,
>> any synchronous writes seem to block. For example, we’re running PostgreSQL
>> which does fsync()s all the time. While these TRIMs happen, Postgres just
>> hangs on writes. This causes reads to block due to lock contention as well.
>>
>> If I change sync=disabled on my tank/pgsql dataset while this is
>> happening, it unblocks for the most part. But obviously this is not an ideal
>> way to run PostgreSQL.
>>
>> I’m working with my vendor to get some Intel SSDs to test, but any ideas
>> if this could somehow be a software issue? Or does the Samsung XS1715 just
>> suck at TRIM and SYNC?
>>
>> We’re thinking of just setting the vfs.zfs.trim.enabled=0 tunable for now
>> since WAL segment turnover actually causes TRIM operations a lot, but
>> unfortunately this requires a reboot. But disabling TRIM does seem to fix the
>> issue on other servers I’ve tested with the same hardware config.
>>
>

Re: ZFS, SSDs, and TRIM performance

2015-10-29 Thread Steven Hartland

If you're running NVMe, are you running a version which has this:
https://svnweb.freebsd.org/base?view=revision&revision=285767

I'm pretty sure 10.2 does have that, so you should be good, but best to 
check.


Other questions:
1. What does "gstat -d -p" show during the stalls?
2. Do you have any other zfs tuning in place?

On 29/10/2015 16:54, Sean Kelly wrote:

Me again. I have a new issue and I’m not sure if it is hardware or software. I 
have nine servers running 10.2-RELEASE-p5 with Dell OEM’d Samsung XS1715 NVMe 
SSDs. They are paired up in a single mirrored zpool on each server. They 
perform great most of the time. However, I have a problem when ZFS fires off 
TRIMs. Not during vdev creation, but like if I delete a 20GB snapshot.

If I destroy a 20GB snapshot or delete large files, ZFS fires off tons of TRIMs 
to the disks. I can see the kstat.zfs.misc.zio_trim.success and 
kstat.zfs.misc.zio_trim.bytes sysctls skyrocket. While this is happening, any 
synchronous writes seem to block. For example, we’re running PostgreSQL which 
does fsync()s all the time. While these TRIMs happen, Postgres just hangs on 
writes. This causes reads to block due to lock contention as well.

If I change sync=disabled on my tank/pgsql dataset while this is happening, it 
unblocks for the most part. But obviously this is not an ideal way to run 
PostgreSQL.

I’m working with my vendor to get some Intel SSDs to test, but any ideas if 
this could somehow be a software issue? Or does the Samsung XS1715 just suck at 
TRIM and SYNC?

We’re thinking of just setting the vfs.zfs.trim.enabled=0 tunable for now since 
WAL segment turnover actually causes TRIM operations a lot, but unfortunately 
this requires a reboot. But disabling TRIM does seem to fix the issue on other 
servers I’ve tested with the same hardware config.




ZFS, SSDs, and TRIM performance

2015-10-29 Thread Sean Kelly
Me again. I have a new issue and I’m not sure if it is hardware or software. I 
have nine servers running 10.2-RELEASE-p5 with Dell OEM’d Samsung XS1715 NVMe 
SSDs. They are paired up in a single mirrored zpool on each server. They 
perform great most of the time. However, I have a problem when ZFS fires off 
TRIMs. Not during vdev creation, but like if I delete a 20GB snapshot.

If I destroy a 20GB snapshot or delete large files, ZFS fires off tons of TRIMs 
to the disks. I can see the kstat.zfs.misc.zio_trim.success and 
kstat.zfs.misc.zio_trim.bytes sysctls skyrocket. While this is happening, any 
synchronous writes seem to block. For example, we’re running PostgreSQL which 
does fsync()s all the time. While these TRIMs happen, Postgres just hangs on 
writes. This causes reads to block due to lock contention as well.
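
To put numbers on "skyrocket", the two counters above can be sampled and turned into per-second rates. The following is only a rough sketch: it assumes `sysctl kstat.zfs.misc.zio_trim` prints one `name: value` line per counter, and the helper names (`parse_trim_counters`, `trim_rates`, `watch`) are made up for illustration.

```python
import subprocess
import time


def parse_trim_counters(sysctl_output):
    """Parse 'name: value' lines from sysctl into {short_name: int}."""
    counters = {}
    for line in sysctl_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        try:
            # Keep only the last component, e.g. 'bytes' or 'success'.
            counters[name.strip().rsplit(".", 1)[-1]] = int(value.strip())
        except ValueError:
            continue  # non-numeric value, skip it
    return counters


def trim_rates(before, after, interval_s):
    """Per-second increase of every counter present in both samples."""
    return {k: (after[k] - before[k]) / interval_s
            for k in before if k in after}


def sample():
    """Read the current TRIM counters (assumes a FreeBSD host with ZFS)."""
    out = subprocess.run(["sysctl", "kstat.zfs.misc.zio_trim"],
                         capture_output=True, text=True, check=True).stdout
    return parse_trim_counters(out)


def watch(interval_s=1.0, samples=10):
    """Print TRIM rates once per interval for a fixed number of samples."""
    prev = sample()
    for _ in range(samples):
        time.sleep(interval_s)
        cur = sample()
        print(trim_rates(prev, cur, interval_s))
        prev = cur
```

Running `watch()` in one terminal while destroying a snapshot in another should show whether the stalls line up with bursts in the `bytes` and `success` rates.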

If I change sync=disabled on my tank/pgsql dataset while this is happening, it 
unblocks for the most part. But obviously this is not an ideal way to run 
PostgreSQL.

I’m working with my vendor to get some Intel SSDs to test, but any ideas if 
this could somehow be a software issue? Or does the Samsung XS1715 just suck at 
TRIM and SYNC?

We’re thinking of just setting the vfs.zfs.trim.enabled=0 tunable for now since 
WAL segment turnover actually causes TRIM operations a lot, but unfortunately 
this requires a reboot. But disabling TRIM does seem to fix the issue on other 
servers I’ve tested with the same hardware config.
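
Since the tunable only takes effect at boot, it has to be put in the loader configuration on each of the nine servers. As a hedged sketch of doing that consistently, the function below rewrites loader.conf-style text so the tunable is set exactly once. The function name `set_loader_tunable` is hypothetical, and it assumes the usual `name="value"` line format; it only transforms text and leaves reading and writing `/boot/loader.conf` to the caller.

```python
def set_loader_tunable(conf_text, name, value):
    """Return loader.conf text with name="value" set exactly once."""
    new_line = f'{name}="{value}"'
    out = []
    replaced = False
    for line in conf_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key == name:
            # Replace the first assignment, drop any later duplicates.
            if not replaced:
                out.append(new_line)
                replaced = True
        else:
            # Unrelated settings and commented-out lines are kept as-is.
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

For example, `set_loader_tunable(text, "vfs.zfs.trim.enabled", "0")` leaves other settings alone and adds or updates just that one line.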

-- 
Sean Kelly
smke...@smkelly.org
http://smkelly.org
