Re: svn commit: r356755 - in head/sys: net netinet netinet6 netpfil/ipfw/nat64 sys
Aren’t the current and suggested the same there or do I need more coffee this morning?

On Wed, 15 Jan 2020 at 06:10, Gleb Smirnoff wrote:

> Hi,
>
> On Wed, Jan 15, 2020 at 06:05:20AM +, Gleb Smirnoff wrote:
> T> Log:
> T>   Introduce NET_EPOCH_CALL() macro and use it everywhere where we free
> T>   data based on the network epoch. The macro reverses the argument
> T>   order of epoch_call(9) - first function, then its argument. NFC
>
> I really want to reverse the argument order of epoch_call() as well.
> The current order is really backwards:
>
> void
> epoch_call(epoch_t epoch, epoch_context_t ctx,
>     void (*callback)(epoch_context_t));
>
> Suggested declaration is:
>
> void
> epoch_call(epoch_t epoch, epoch_context_t ctx,
>     void (*callback)(epoch_context_t));
>
> This will be a very easy change, since today function is
> used just in few places.
>
> Before branching stable/12 we intentionally put this
> note in epoch.9 manual page:
>
> NOTES
>      The epoch kernel programming interface is under development and is
>      subject to change.
>
> Any objections?
>
> --
> Gleb Smirnoff
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
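For anyone following along, the argument reversal described in the commit log can be sketched in user space. This is only an illustration under stated assumptions: the types are stand-ins, `net_epoch` and `route_entry` are hypothetical names, and the mock `epoch_call()` runs the callback immediately, whereas the kernel one defers it.

```c
#include <stddef.h>

/* Stand-in types; the real ones live in the kernel headers. */
struct epoch_context { int unused; };
typedef struct epoch_context *epoch_context_t;
typedef struct epoch { int calls; } *epoch_t;

/*
 * Mock with the argument order quoted in the thread: context first,
 * then callback. The kernel version defers the callback; this mock
 * simply runs it right away.
 */
static void
epoch_call(epoch_t epoch, epoch_context_t ctx,
    void (*callback)(epoch_context_t))
{
	epoch->calls++;
	callback(ctx);
}

static struct epoch net_epoch_storage;
static epoch_t net_epoch = &net_epoch_storage;	/* hypothetical name */

/* Function first, then its argument, as the commit log describes. */
#define	NET_EPOCH_CALL(f, c)	epoch_call(net_epoch, (c), (f))

/* Hypothetical object that embeds a context as its first member. */
struct route_entry {
	struct epoch_context ctx;
	int freed;
};

static void
route_entry_destroy(epoch_context_t ctx)
{
	/* ctx is the first member, so the cast recovers the container. */
	struct route_entry *re = (struct route_entry *)ctx;

	re->freed = 1;
}
```

A caller would write `NET_EPOCH_CALL(route_entry_destroy, &re.ctx);`, which reads in the usual "function, then argument" order even though the wrapped call still takes them the other way round.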
Re: svn commit: r355831 - head/sys/cam/nvme
Thanks for all the feedback Warner, some more comments inline below, would be interested in your thoughts.

On 17/12/2019 02:53, Warner Losh wrote:

On Mon, Dec 16, 2019, 5:28 PM Steven Hartland <steven.hartl...@multiplay.co.uk> wrote:

Be aware that ZFS already does a pretty decent job of this, so the statement about upper layers isn't true for all. It even has different priorities for different request types, so I'm a little concerned that doing it at both layers could cause issues.

ZFS' BIO_DELETE scheduling works well for enterprise drives, but needs tuning the further away you get from enterprise performance. I don't anticipate any effect on performance here since this is not enabled by default, unless I've messed something up (and if I have screwed this up, please let me know). I've honestly not tried to enable these things on ZFS.

In addition to this, if it's anything like SSDs, the number of requests is only a small part of the story, with total trim size being the other one. In this case you could hit the total desired size with just one BIO_DELETE request. With this code what's the impact of this?

You're correct. It tends to be the number of segments and/or the size of the segment. This steers cases where the number of segments dominates. For cases where total size dominates, you're often better off using the I/O scheduler to rate limit the size of the trims.

This is also one of the reasons I introduced kern.geom.dev.delete_max_sectors.

It would be worth at some time writing up a guide to all the logic in the various layers with regard to how we treat TRIM requests. There are quite a few elements now and I don't believe it's clear where they all are and what they are trying to achieve, which makes it easy for them to start fighting against each other.

This feature is designed to allow a large number of files to be deleted at once while doing the trims from them a little at a time to even the load out.

That's pretty similar in concept to our current ZFS TRIM code; only time will tell, once the new upstream gets merged, if this is still the case.

Regards
Steve
Re: svn commit: r355837 - head/sys/cam
Sticky keyboard there Warner?

On a more serious note, because controllers lie about the underlying location of data, skipping TRIM requests can have a much more serious impact than one might think, depending on the drive, so this type of optimisation can significantly harm performance instead of increasing it.

This was one of the main reasons we sponsored the initial ZFS TRIM implementation: drive performance got so bad with no TRIM that SSDs performed worse than HDDs.

Now obviously this was some time ago, but I wouldn't be surprised if there's bad hardware / firmware like this still being produced.

Given that, it might be a good idea to make this optional, possibly even opt in rather than opt out?

Regards
Steve

On 17/12/2019 00:13, Warner Losh wrote:

Author: imp
Date: Tue Dec 17 00:13:45 2019
New Revision: 355837
URL: https://svnweb.freebsd.org/changeset/base/355837

Log:
  Implement bio_speedup

  React to the BIO_SPEED command in the cam io scheduler by completing
  as successful BIO_DELETE commands that are pending, up to the length
  passed down in the BIO_SPEEDUP cmomand. The length passed down is a
  hint for how much space on the drive needs to be recovered. By
  completing the BIO_DELETE comomands, this allows the upper layers to
  allocate and write to the blocks that were about to be trimmed. Since
  FreeBSD implements TRIMSs as advisory, we can eliminliminate them and
  go directly to writing.

  The biggest benefit from TRIMS coomes ffrom the drive being able t
  ooptimize its free block pool inthe log run. There's little nto no
  bene3efit in the shoort term. , sepeciall whn the trim is followed by
  a write. Speedup lets us make this tradeoff.
  Reviewed by: kirk, kib
  Sponsored by: Netflix
  Differential Revision: https://reviews.freebsd.org/D18351

Modified:
  head/sys/cam/cam_iosched.c

Modified: head/sys/cam/cam_iosched.c
==
--- head/sys/cam/cam_iosched.c	Tue Dec 17 00:13:40 2019	(r355836)
+++ head/sys/cam/cam_iosched.c	Tue Dec 17 00:13:45 2019	(r355837)
@@ -1534,6 +1534,41 @@ cam_iosched_queue_work(struct cam_iosched_softc *isc,
 {
 
 	/*
+	 * A BIO_SPEEDUP from the uppper layers means that they have a block
+	 * shortage. At the present, this is only sent when we're trying to
+	 * allocate blocks, but have a shortage before giving up. bio_length is
+	 * the size of their shortage. We will complete just enough BIO_DELETEs
+	 * in the queue to satisfy the need. If bio_length is 0, we'll complete
+	 * them all. This allows the scheduler to delay BIO_DELETEs to improve
+	 * read/write performance without worrying about the upper layers. When
+	 * it's possibly a problem, we respond by pretending the BIO_DELETEs
+	 * just worked. We can't do anything about the BIO_DELETEs in the
+	 * hardware, though. We have to wait for them to complete.
+	 */
+	if (bp->bio_cmd == BIO_SPEEDUP) {
+		off_t len;
+		struct bio *nbp;
+
+		len = 0;
+		while (bioq_first(&isc->trim_queue) &&
+		    (bp->bio_length == 0 || len < bp->bio_length)) {
+			nbp = bioq_takefirst(&isc->trim_queue);
+			len += nbp->bio_length;
+			nbp->bio_error = 0;
+			biodone(nbp);
+		}
+		if (bp->bio_length > 0) {
+			if (bp->bio_length > len)
+				bp->bio_resid = bp->bio_length - len;
+			else
+				bp->bio_resid = 0;
+		}
+		bp->bio_error = 0;
+		biodone(bp);
+		return;
+	}
+
+	/*
 	 * If we get a BIO_FLUSH, and we're doing delayed BIO_DELETEs then we
 	 * set the last tick time to one less than the current ticks minus the
 	 * delay to force the BIO_DELETEs to be presented to the client driver.
@@ -1919,8 +1954,8 @@ DB_SHOW_COMMAND(iosched, cam_iosched_db_show)
 	db_printf("Trim Q len %d\n", biolen(&isc->trim_queue));
 	db_printf("read_bias: %d\n", isc->read_bias);
 	db_printf("current_read_bias: %d\n", isc->current_read_bias);
-	db_printf("Trims active %d\n", isc->pend_trim);
-	db_printf("Max trims active %d\n", isc->max_trim);
+	db_printf("Trims active %d\n", isc->pend_trims);
+	db_printf("Max trims active %d\n", isc->max_trims);
 }
 #endif
 #endif
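The BIO_SPEEDUP handling in the diff above boils down to "complete queued deletes until the requested length is covered, then report any shortfall". A minimal user-space sketch of just that arithmetic follows; `struct trim_queue` and `speedup_complete()` are hypothetical names, and nothing here touches real bios.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a queue of pending BIO_DELETE requests. */
struct trim_queue {
	const int64_t *lengths;	/* length in bytes of each queued delete */
	size_t count;		/* number of entries in lengths */
	size_t head;		/* index of the next delete to complete */
};

/*
 * Complete queued deletes until at least `want` bytes are covered, or
 * until the queue is empty. A `want` of 0 drains the whole queue.
 * Returns the shortfall, the analogue of bio_resid in the diff.
 */
static int64_t
speedup_complete(struct trim_queue *q, int64_t want)
{
	int64_t len = 0;

	while (q->head < q->count && (want == 0 || len < want)) {
		len += q->lengths[q->head];
		q->head++;	/* the kernel code calls biodone() here */
	}
	if (want > 0 && want > len)
		return (want - len);
	return (0);
}
```

Note that whole requests are completed, so the covered length can overshoot the hint; the sketch, like the diff, only reports a residual when the queue ran dry first.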
Re: svn commit: r355832 - head/sys/cam
What, if any, is the impact on request ordering with this new delayed TRIM?

On 17/12/2019 00:13, Warner Losh wrote:

Author: imp
Date: Tue Dec 17 00:13:21 2019
New Revision: 355832
URL: https://svnweb.freebsd.org/changeset/base/355832

Log:
  Add rate limiters to TRIM.

  Add rate limiters to trims. Trims are a bit different than reads or
  writes in that they can be combined, so some care needs to be taken
  where we rate limit them. Additional work will be needed to push the
  working rate limit below the I/O quanta rate for things like IOPS.

  Sponsored by: Netflix

Modified:
  head/sys/cam/cam_iosched.c

Modified: head/sys/cam/cam_iosched.c
==
--- head/sys/cam/cam_iosched.c	Tue Dec 17 00:11:48 2019	(r355831)
+++ head/sys/cam/cam_iosched.c	Tue Dec 17 00:13:21 2019	(r355832)
@@ -755,7 +755,20 @@ cam_iosched_has_io(struct cam_iosched_softc *isc)
 static inline bool
 cam_iosched_has_more_trim(struct cam_iosched_softc *isc)
 {
+	struct bio *bp;
 
+	bp = bioq_first(&isc->trim_queue);
+#ifdef CAM_IOSCHED_DYNAMIC
+	if (do_dynamic_iosched) {
+		/*
+		 * If we're limiting trims, then defer action on trims
+		 * for a bit.
+		 */
+		if (bp == NULL || cam_iosched_limiter_caniop(&isc->trim_stats, bp) != 0)
+			return false;
+	}
+#endif
+
 	/*
 	 * If we've set a trim_goal, then if we exceed that allow trims
 	 * to be passed back to the driver. If we've also set a tick timeout
@@ -771,8 +784,8 @@ cam_iosched_has_more_trim(struct cam_iosched_softc *is
 		return false;
 	}
 
-	return !(isc->flags & CAM_IOSCHED_FLAG_TRIM_ACTIVE) &&
-	    bioq_first(&isc->trim_queue);
+	/* NB: Should perhaps have a max trim active independent of I/O limiters */
+	return !(isc->flags & CAM_IOSCHED_FLAG_TRIM_ACTIVE) && bp != NULL;
 }
 
 #define cam_iosched_sort_queue(isc)	((isc)->sort_io_queue >= 0 ?	\
@@ -1389,10 +1402,17 @@ cam_iosched_next_trim(struct cam_iosched_softc *isc)
 struct bio *
 cam_iosched_get_trim(struct cam_iosched_softc *isc)
 {
+#ifdef CAM_IOSCHED_DYNAMIC
+	struct bio *bp;
+#endif
 
 	if (!cam_iosched_has_more_trim(isc))
 		return NULL;
 #ifdef CAM_IOSCHED_DYNAMIC
+	bp = bioq_first(&isc->trim_queue);
+	if (bp == NULL)
+		return NULL;
+
 	/*
 	 * If pending read, prefer that based on current read bias setting. The
 	 * read bias is shared for both writes and TRIMs, but on TRIMs the bias
@@ -1414,6 +1434,26 @@ cam_iosched_get_trim(struct cam_iosched_softc *isc)
 		 */
 		isc->current_read_bias = isc->read_bias;
 	}
+
+	/*
+	 * See if our current limiter allows this I/O. Because we only call this
+	 * here, and not in next_trim, the 'bandwidth' limits for trims won't
+	 * work, while the iops or max queued limits will work. It's tricky
+	 * because we want the limits to be from the perspective of the
+	 * "commands sent to the device." To make iops work, we need to check
+	 * only here (since we want all the ops we combine to count as one). To
+	 * make bw limits work, we'd need to check in next_trim, but that would
+	 * have the effect of limiting the iops as seen from the upper layers.
+	 */
+	if (cam_iosched_limiter_iop(&isc->trim_stats, bp) != 0) {
+		if (iosched_debug)
+			printf("Can't trim because limiter says no.\n");
+		isc->trim_stats.state_flags |= IOP_RATE_LIMITED;
+		return NULL;
+	}
+	isc->current_read_bias = isc->read_bias;
+	isc->trim_stats.state_flags &= ~IOP_RATE_LIMITED;
+	/* cam_iosched_next_trim below keeps proper book */
 #endif
 	return cam_iosched_next_trim(isc);
 }
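The diff uses two limiter entry points: a "can I?" check in `cam_iosched_has_more_trim()` and a committing call in `cam_iosched_get_trim()`. The split between peeking and consuming might look roughly like the sketch below. Everything here is an assumption made for illustration: `struct iops_limiter` and the three helpers are hypothetical and far simpler than the kernel's limiter machinery.

```c
#include <errno.h>

/* Hypothetical per-quantum IOPS budget; not the kernel's iop_stats. */
struct iops_limiter {
	int max;	/* operations allowed per quantum */
	int left;	/* operations still available in this quantum */
};

/* Peek: would one more operation be allowed? Does not consume budget. */
static int
limiter_caniop(const struct iops_limiter *l)
{
	return (l->left > 0 ? 0 : EAGAIN);
}

/* Commit: consume one operation from the budget, or refuse. */
static int
limiter_iop(struct iops_limiter *l)
{
	if (l->left <= 0)
		return (EAGAIN);
	l->left--;
	return (0);
}

/* Called once per quantum to refill the budget. */
static void
limiter_tick(struct iops_limiter *l)
{
	l->left = l->max;
}
```

Charging the budget only at the commit point is what lets several combined BIO_DELETEs count as one operation from the device's perspective, which is the trade-off the comment in the diff describes.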
Re: svn commit: r355831 - head/sys/cam/nvme
Be aware that ZFS already does a pretty decent job of this, so the statement about upper layers isn't true for all. It even has different priorities for different request types, so I'm a little concerned that doing it at both layers could cause issues.

In addition to this, if it's anything like SSDs, the number of requests is only a small part of the story, with total trim size being the other one. In this case you could hit the total desired size with just one BIO_DELETE request. With this code what's the impact of this?

On 17/12/2019 00:11, Warner Losh wrote:

Author: imp
Date: Tue Dec 17 00:11:48 2019
New Revision: 355831
URL: https://svnweb.freebsd.org/changeset/base/355831

Log:
  NVME trim stuff.

  Add two sysctls to control pacing of nvme trims.
  kern.cam.nda.X.goal_trim is the number of upper layer BIO_DELETE
  requests to try to collect before sending TRIM down to the nvme
  drive. trim_ticks is the number of ticks, at most, to wait for at
  least goal_trim BIO_DELETE requests to come in.

  Trim pacing is useful when a large number of disjoint trims are
  coming in from the upper layers. Since we have no way to chain
  together trims from the upper layers that are sent down, this acts as
  a heuristic to group trims into reasonable sized chunks. What's
  reasonable varies from drive to drive.
  Sponsored by: Netflix

Modified:
  head/sys/cam/nvme/nvme_da.c

Modified: head/sys/cam/nvme/nvme_da.c
==
--- head/sys/cam/nvme/nvme_da.c	Tue Dec 17 00:10:19 2019	(r355830)
+++ head/sys/cam/nvme/nvme_da.c	Tue Dec 17 00:11:48 2019	(r355831)
@@ -177,6 +177,14 @@ static int nda_max_trim_entries = NDA_MAX_TRIM_ENTRIES
 SYSCTL_INT(_kern_cam_nda, OID_AUTO, max_trim, CTLFLAG_RDTUN,
     &nda_max_trim_entries, NDA_MAX_TRIM_ENTRIES,
     "Maximum number of BIO_DELETE to send down as a DSM TRIM.");
+static int nda_goal_trim_entries = NDA_MAX_TRIM_ENTRIES / 2;
+SYSCTL_INT(_kern_cam_nda, OID_AUTO, goal_trim, CTLFLAG_RDTUN,
+    &nda_goal_trim_entries, NDA_MAX_TRIM_ENTRIES / 2,
+    "Number of BIO_DELETE to try to accumulate before sending a DSM TRIM.");
+static int nda_trim_ticks = 50;	/* 50ms ~ 1000 Hz */
+SYSCTL_INT(_kern_cam_nda, OID_AUTO, trim_ticks, CTLFLAG_RDTUN,
+    &nda_trim_ticks, 50,
+    "Number of ticks to hold BIO_DELETEs before sending down a trim");
 
 /*
  * All NVMe media is non-rotational, so all nvme device instances
@@ -741,6 +749,9 @@ ndaregister(struct cam_periph *periph, void *arg)
 		free(softc, M_DEVBUF);
 		return(CAM_REQ_CMP_ERR);
 	}
+	/* Statically set these for the moment */
+	cam_iosched_set_trim_goal(softc->cam_iosched, nda_goal_trim_entries);
+	cam_iosched_set_trim_ticks(softc->cam_iosched, nda_trim_ticks);
 
 	/* ident_data parsing */
 
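One plausible reading of the goal_trim / trim_ticks pairing is "issue a TRIM once enough deletes have been collected, or once the oldest one has waited long enough". The sketch below shows that decision in isolation; `struct trim_pacer` and its helpers are hypothetical, and the scheduler's real bookkeeping may well differ.

```c
#include <stdbool.h>

/* Hypothetical pacing state; field names are illustrative only. */
struct trim_pacer {
	int goal;		/* deletes to accumulate before issuing a TRIM */
	int max_ticks;		/* longest time to hold queued deletes */
	int queued;		/* deletes currently held back */
	int first_tick;		/* tick at which the oldest delete arrived */
};

/* Record a newly queued delete that arrived at tick `now`. */
static void
pacer_enqueue(struct trim_pacer *p, int now)
{
	if (p->queued == 0)
		p->first_tick = now;
	p->queued++;
}

/* Decide whether the held deletes should be sent down at tick `now`. */
static bool
pacer_should_issue(const struct trim_pacer *p, int now)
{
	if (p->queued == 0)
		return (false);
	if (p->queued >= p->goal)
		return (true);	/* enough work collected */
	/* Otherwise issue only once the oldest delete has waited long enough. */
	return (now - p->first_tick >= p->max_ticks);
}
```

This counts requests only, which is exactly the limitation raised in the reply: a single very large BIO_DELETE looks the same to the pacer as a tiny one.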
Re: svn commit: r355430 - head/sys/cam/scsi
If the illegal chars were removed or replaced, would the result be useful? If so, might that be a better approach?

On Fri, 6 Dec 2019 at 00:06, Alan Somers wrote:

> Author: asomers
> Date: Fri Dec 6 00:06:05 2019
> New Revision: 355430
> URL: https://svnweb.freebsd.org/changeset/base/355430
>
> Log:
>   ses: sanitize illegal strings in SES element descriptors
>
>   The SES4r3 standard requires that element descriptors may only contain
>   ASCII characters in the range 0x20 to 0x7e. Some SuperMicro expanders
>   violate that rule. This patch adds a sanity check to ses(4).
>   Descriptors in violation will be replaced by "".
>
>   This patch fixes "sesutil --libxo xml" on such systems. Previously it
>   would generate non-well-formed XML output.
>
>   PR: 241929
>   Reviewed by: allanjude
>   MFC after: 2 weeks
>   Sponsored by: Axcient
>
> Modified:
>   head/sys/cam/scsi/scsi_enc_ses.c
>
> Modified: head/sys/cam/scsi/scsi_enc_ses.c
> ==
> --- head/sys/cam/scsi/scsi_enc_ses.c	Thu Dec 5 19:39:51 2019	(r355429)
> +++ head/sys/cam/scsi/scsi_enc_ses.c	Fri Dec 6 00:06:05 2019	(r355430)
> @@ -110,7 +110,7 @@ typedef struct ses_addl_status {
>  typedef struct ses_element {
>  	uint8_t eip;			/* eip bit is set */
>  	uint16_t descr_len;		/* length of the descriptor */
> -	char *descr;			/* descriptor for this object */
> +	const char *descr;		/* descriptor for this object */
>  	struct ses_addl_status addl;	/* additional status info */
>  } ses_element_t;
>
> @@ -1977,6 +1977,35 @@ ses_publish_cache(enc_softc_t *enc, struct enc_fsm_sta
>  	return (0);
>  }
>
> +/*
> + * \brief Sanitize an element descriptor
> + *
> + * The SES4r3 standard, sections 3.1.2 and 6.1.10, specifies that element
> + * descriptors may only contain ASCII characters in the range 0x20 to 0x7e.
> + * But some vendors violate that rule. Ensure that we only expose compliant
> + * descriptors to userland.
> + *
> + * \param desc	SES element descriptor as reported by the hardware
> + * \param len	Length of desc in bytes, not necessarily including
> + *		trailing NUL. It will be modified if desc is invalid.
> + */
> +static const char*
> +ses_sanitize_elm_desc(const char *desc, uint16_t *len)
> +{
> +	const char *invalid = "";
> +	int i;
> +
> +	for (i = 0; i < *len; i++) {
> +		if (desc[i] < 0x20 || desc[i] > 0x7e) {
> +			*len = strlen(invalid);
> +			return (invalid);
> +		} else if (desc[i] == 0) {
> +			break;
> +		}
> +	}
> +	return (desc);
> +}
> +
>  /**
>   * \brief Parse the descriptors for each object.
>   *
> @@ -2061,7 +2090,8 @@ ses_process_elm_descs(enc_softc_t *enc, struct enc_fsm
>  		if (length > 0) {
>  			elmpriv = element->elm_private;
>  			elmpriv->descr_len = length;
> -			elmpriv->descr = [offset];
> +			elmpriv->descr = ses_sanitize_elm_desc([offset],
> +			    >descr_len);
>  		}
>
>  		/* skip over the descriptor itself */
>
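The alternative raised in the reply, replacing only the offending bytes rather than the whole descriptor, could look something like the sketch below. It is not what the commit does: `sanitize_desc_inplace()` is a hypothetical helper, and it assumes a writable copy of the descriptor, whereas the committed function takes a `const char *` and returns a pointer.

```c
#include <stddef.h>

/*
 * Replace every byte outside 0x20..0x7e with `fill`, stopping at the
 * first NUL or after `len` bytes. Returns the number of bytes replaced.
 */
static size_t
sanitize_desc_inplace(char *desc, size_t len, char fill)
{
	size_t i, replaced = 0;

	for (i = 0; i < len && desc[i] != '\0'; i++) {
		/* Compare as unsigned so bytes above 0x7f are handled explicitly. */
		unsigned char c = (unsigned char)desc[i];

		if (c < 0x20 || c > 0x7e) {
			desc[i] = fill;
			replaced++;
		}
	}
	return (replaced);
}
```

Whether the partially cleaned string is actually useful depends on what the expander put there in the first place, which is the open question in the reply.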
Re: svn commit: r354283 - in head: stand/libsa/zfs sys/cddl/boot/zfs
Pretty sure we had at least two systems using root with log just fine, so would be interested to know why this isn’t supported anymore? On Sun, 3 Nov 2019 at 13:26, Toomas Soome wrote: > Author: tsoome > Date: Sun Nov 3 13:25:47 2019 > New Revision: 354283 > URL: https://svnweb.freebsd.org/changeset/base/354283 > > Log: > loader: we do not support booting from pool with log device > > If pool has log device, stop there and tell about it. > > Modified: > head/stand/libsa/zfs/zfs.c > head/stand/libsa/zfs/zfsimpl.c > head/sys/cddl/boot/zfs/zfsimpl.h > > Modified: head/stand/libsa/zfs/zfs.c > > == > --- head/stand/libsa/zfs/zfs.c Sun Nov 3 13:03:47 2019(r354282) > +++ head/stand/libsa/zfs/zfs.c Sun Nov 3 13:25:47 2019(r354283) > @@ -668,6 +668,11 @@ zfs_dev_open(struct open_file *f, ...) > spa = spa_find_by_guid(dev->pool_guid); > if (!spa) > return (ENXIO); > + if (spa->spa_with_log) { > + printf("Reading pool %s is not supported due to log > device.\n", > + spa->spa_name); > + return (ENXIO); > + } > mount = malloc(sizeof(*mount)); > if (mount == NULL) > return (ENOMEM); > > Modified: head/stand/libsa/zfs/zfsimpl.c > > == > --- head/stand/libsa/zfs/zfsimpl.c Sun Nov 3 13:03:47 2019 > (r354282) > +++ head/stand/libsa/zfs/zfsimpl.c Sun Nov 3 13:25:47 2019 > (r354283) > @@ -1109,6 +1109,7 @@ vdev_init_from_nvlist(const unsigned char *nvlist, > vde > const unsigned char *kids; > int nkids, i, is_new; > uint64_t is_offline, is_faulted, is_degraded, is_removed, > isnt_present; > + uint64_t is_log; > > if (nvlist_find(nvlist, ZPOOL_CONFIG_GUID, DATA_TYPE_UINT64, > NULL, ) > @@ -1132,17 +1133,20 @@ vdev_init_from_nvlist(const unsigned char *nvlist, > vde > } > > is_offline = is_removed = is_faulted = is_degraded = isnt_present > = 0; > + is_log = 0; > > nvlist_find(nvlist, ZPOOL_CONFIG_OFFLINE, DATA_TYPE_UINT64, NULL, > - _offline); > + _offline); > nvlist_find(nvlist, ZPOOL_CONFIG_REMOVED, DATA_TYPE_UINT64, NULL, > - _removed); > + _removed); > nvlist_find(nvlist, 
ZPOOL_CONFIG_FAULTED, DATA_TYPE_UINT64, NULL, > - _faulted); > + _faulted); > nvlist_find(nvlist, ZPOOL_CONFIG_DEGRADED, DATA_TYPE_UINT64, NULL, > - _degraded); > + _degraded); > nvlist_find(nvlist, ZPOOL_CONFIG_NOT_PRESENT, DATA_TYPE_UINT64, > NULL, > - _present); > + _present); > + nvlist_find(nvlist, ZPOOL_CONFIG_IS_LOG, DATA_TYPE_UINT64, NULL, > + _log); > > vdev = vdev_find(guid); > if (!vdev) { > @@ -1217,6 +1221,7 @@ vdev_init_from_nvlist(const unsigned char *nvlist, > vde > return (ENOMEM); > vdev->v_name = name; > } > + vdev->v_islog = is_log == 1; > } else { > is_new = 0; > } > @@ -1433,6 +1438,12 @@ vdev_status(vdev_t *vdev, int indent) > { > vdev_t *kid; > int ret; > + > + if (vdev->v_islog) { > + (void)pager_output("logs\n"); > + indent++; > + } > + > ret = print_state(indent, vdev->v_name, vdev->v_state); > if (ret != 0) > return (ret); > @@ -1737,6 +1748,12 @@ vdev_probe(vdev_phys_read_t *_read, void > *read_priv, s > printf("ZFS: inconsistent nvlist contents\n"); > return (EIO); > } > + > + /* > +* We do not support reading pools with log device. > +*/ > + if (vdev->v_islog) > + spa->spa_with_log = vdev->v_islog; > > /* > * Re-evaluate top-level vdev state. > > Modified: head/sys/cddl/boot/zfs/zfsimpl.h > > == > --- head/sys/cddl/boot/zfs/zfsimpl.hSun Nov 3 13:03:47 2019 > (r354282) > +++ head/sys/cddl/boot/zfs/zfsimpl.hSun Nov 3 13:25:47 2019 > (r354283) > @@ -1670,6 +1670,7 @@ typedef struct vdev { > vdev_phys_read_t *v_phys_read; /* read from raw leaf vdev */ > vdev_read_t *v_read;/* read from vdev */ > void*v_read_priv; /* private data for read function > */ > + boolean_t v_islog; > struct spa *spa; /* link to spa */ > /* > * Values stored in the config for an indirect or removing vdev. > @@ -1694,6 +1695,7 @@ typedef struct spa { > zio_cksum_salt_t spa_cksum_salt;/* secret salt for cksum */ > void
svn commit: r346594 - head/sbin/camcontrol
Author: smh Date: Tue Apr 23 07:46:38 2019 New Revision: 346594 URL: https://svnweb.freebsd.org/changeset/base/346594 Log: Add ATA power mode support to camcontrol Add the ability to report ATA device power mode with the cmmand 'powermode' to compliment the existing ability to set it using idle, standby and sleep commands. MFC after:2 weeks Sponsored by: Multiplay Modified: head/sbin/camcontrol/camcontrol.8 head/sbin/camcontrol/camcontrol.c Modified: head/sbin/camcontrol/camcontrol.8 == --- head/sbin/camcontrol/camcontrol.8 Tue Apr 23 06:36:32 2019 (r346593) +++ head/sbin/camcontrol/camcontrol.8 Tue Apr 23 07:46:38 2019 (r346594) @@ -27,7 +27,7 @@ .\" .\" $FreeBSD$ .\" -.Dd March 12, 2019 +.Dd April 22, 2019 .Dt CAMCONTROL 8 .Os .Sh NAME @@ -243,6 +243,10 @@ .Op device id .Op generic args .Nm +.Ic powermode +.Op device id +.Op generic args +.Nm .Ic apm .Op device id .Op generic args @@ -1388,6 +1392,8 @@ Value 0 disables timer. Put ATA device into SLEEP state. Note that the only way get device out of this state may be reset. +.It Ic powermode +Report ATA device power mode. 
.It Ic apm It optional parameter .Pq Fl l Modified: head/sbin/camcontrol/camcontrol.c == --- head/sbin/camcontrol/camcontrol.c Tue Apr 23 06:36:32 2019 (r346593) +++ head/sbin/camcontrol/camcontrol.c Tue Apr 23 07:46:38 2019 (r346594) @@ -109,7 +109,8 @@ typedef enum { CAM_CMD_ZONE= 0x0026, CAM_CMD_EPC = 0x0027, CAM_CMD_TIMESTAMP = 0x0028, - CAM_CMD_MMCSD_CMD = 0x0029 + CAM_CMD_MMCSD_CMD = 0x0029, + CAM_CMD_POWER_MODE = 0x002a, } cam_cmdmask; typedef enum { @@ -236,6 +237,7 @@ static struct camcontrol_opts option_table[] = { {"idle", CAM_CMD_IDLE, CAM_ARG_NONE, "t:"}, {"standby", CAM_CMD_STANDBY, CAM_ARG_NONE, "t:"}, {"sleep", CAM_CMD_SLEEP, CAM_ARG_NONE, ""}, + {"powermode", CAM_CMD_POWER_MODE, CAM_ARG_NONE, ""}, {"apm", CAM_CMD_APM, CAM_ARG_NONE, "l:"}, {"aam", CAM_CMD_AAM, CAM_ARG_NONE, "l:"}, {"fwdownload", CAM_CMD_DOWNLOAD_FW, CAM_ARG_NONE, "f:qsy"}, @@ -8885,6 +8887,61 @@ bailout: } static int +atapm_proc_resp(struct cam_device *device, union ccb *ccb) +{ +struct ata_res *res; + +res = >ataio.res; +if (res->status & ATA_STATUS_ERROR) { +if (arglist & CAM_ARG_VERBOSE) { +cam_error_print(device, ccb, CAM_ESF_ALL, +CAM_EPF_ALL, stderr); +printf("error = 0x%02x, sector_count = 0x%04x, " + "device = 0x%02x, status = 0x%02x\n", + res->error, res->sector_count, + res->device, res->status); +} + +return (1); +} + +if (arglist & CAM_ARG_VERBOSE) { +fprintf(stdout, "%s%d: Raw native check power data:\n", +device->device_name, device->dev_unit_num); +/* res is 4 byte aligned */ +dump_data((uint16_t*)(uintptr_t)res, sizeof(struct ata_res)); + +printf("error = 0x%02x, sector_count = 0x%04x, device = 0x%02x, " + "status = 0x%02x\n", res->error, res->sector_count, + res->device, res->status); +} + +printf("%s%d: ", device->device_name, device->dev_unit_num); +switch (res->sector_count) { +case 0x00: + printf("Standby mode\n"); + break; +case 0x40: + printf("NV Cache Power Mode and the spindle is spun down or spinning down\n"); + break; +case 0x41: + printf("NV Cache Power 
Mode and the spindle is spun up or spinning up\n"); + break; +case 0x80: + printf("Idle mode\n"); + break; +case 0xff: + printf("Active or Idle mode\n"); + break; +default: + printf("Unknown mode 0x%02x\n", res->sector_count); + break; +} + +return (0); +} + +static int atapm(struct cam_device *device, int argc, char **argv, char *combinedopt, int retry_count, int timeout) { @@ -8892,6 +8949,7 @@ atapm(struct cam_device *device, int argc, char **argv int retval = 0; int t = -1; int c; + u_int8_t ata_flags = 0; u_char cmd, sc; ccb = cam_getccb(device); @@ -8920,6 +8978,10 @@ atapm(struct cam_device *device, int argc, char **argv cmd = ATA_STANDBY_IMMEDIATE; else cmd = ATA_STANDBY_CMD; + } else if (strcmp(argv[1], "powermode") == 0) { + cmd = ATA_CHECK_POWER_MODE; + ata_flags = AP_FLAG_CHK_COND; + t = -1; } else { cmd = ATA_SLEEP; t = -1; @@ -8937,11 +8999,12 @@ atapm(struct cam_device *device, int argc, char **argv else
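The decoding step in `atapm_proc_resp()` is a plain lookup from the returned sector count to a label. As a standalone sketch, using the same values and strings as the switch in the diff (`ata_power_mode_name()` is a hypothetical helper and not part of camcontrol):

```c
#include <stdint.h>

/*
 * Map the sector count returned by ATA CHECK POWER MODE to a label.
 * The values and strings mirror the switch in the quoted diff.
 */
static const char *
ata_power_mode_name(uint16_t sector_count)
{
	switch (sector_count) {
	case 0x00:
		return ("Standby mode");
	case 0x40:
		return ("NV Cache Power Mode and the spindle is spun down or spinning down");
	case 0x41:
		return ("NV Cache Power Mode and the spindle is spun up or spinning up");
	case 0x80:
		return ("Idle mode");
	case 0xff:
		return ("Active or Idle mode");
	default:
		return ("Unknown mode");
	}
}
```

Unlike the committed code, the default case here does not include the raw value in the string; a caller wanting that would print `sector_count` alongside the label.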
Re: svn commit: r348255 - head/sys/kern
Just wanted to say I really appreciate the details in this commit message.

It's often the case that the message gets overlooked when it comes to the time needed to write something truly useful to others, and this is a great example of the quality we should all try to follow.

Regards
Steve

On Fri, 24 May 2019 at 23:33, Conrad Meyer wrote:

> Author: cem
> Date: Fri May 24 22:33:14 2019
> New Revision: 348255
> URL: https://svnweb.freebsd.org/changeset/base/348255
>
> Log:
>   Disable intr_storm_threshold mechanism by default
>
>   The ixl.4 manual page has documented that the threshold falsely detects
>   interrupt storms on 40Gbit NICs as long ago as 2015, and we have seen
>   similar false positives with the ioat(4) DMA device (which can push
>   GB/s).
>
>   For example, synthetic load can be generated with tools/tools/ioat
>   'ioatcontrol 0 200 8192 1 1000' (allocate 200x8kB buffers, generate an
>   interrupt for each one, and do this for 1000 milliseconds). With
>   storm-detection disabled, the Broadwell-EP version of this device is
>   capable of generating ~350k real interrupts per second.
>
>   The following historical context comes from jhb@: Originally, the
>   threshold worked around incorrect routing of PCI INTx interrupts on
>   single-CPU systems which would end up in a hard hang during boot.
>   Since the threshold was added, our PCI interrupt routing was improved,
>   most PCI interrupts use edge-triggered MSI instead of level-triggered
>   INTx, and typical systems have multiple CPUs available to service
>   interrupts.
>
>   On the off chance that the threshold is useful in the future, it remains
>   available as a tunable and sysctl.
>
>   Reviewed by: jhb
>   Sponsored by: Dell EMC Isilon
>   Differential Revision: https://reviews.freebsd.org/D20401
>
> Modified:
>   head/sys/kern/kern_intr.c
>
> Modified: head/sys/kern/kern_intr.c
> ==
> --- head/sys/kern/kern_intr.c	Fri May 24 22:30:40 2019	(r348254)
> +++ head/sys/kern/kern_intr.c	Fri May 24 22:33:14 2019	(r348255)
> @@ -91,7 +91,7 @@ struct proc *intrproc;
>
>  static MALLOC_DEFINE(M_ITHREAD, "ithread", "Interrupt Threads");
>
> -static int intr_storm_threshold = 1000;
> +static int intr_storm_threshold = 0;
>  SYSCTL_INT(_hw, OID_AUTO, intr_storm_threshold, CTLFLAG_RWTUN,
>      &intr_storm_threshold, 0,
>      "Number of consecutive interrupts before storm protection is enabled");
>
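The sysctl description ("number of consecutive interrupts before storm protection is enabled") together with the new default of 0 suggests a counter that is simply ignored when the threshold is zero. A rough user-space sketch of that idea follows; `storm_threshold`, `struct intr_source`, and both helpers are hypothetical, and the kernel's actual bookkeeping in kern_intr.c is more involved.

```c
#include <stdbool.h>

/* Hypothetical tunable; 0 disables detection, matching the new default. */
static int storm_threshold = 0;

/* Hypothetical per-source state. */
struct intr_source {
	int consecutive;	/* interrupts seen without an idle gap */
};

/* Note one interrupt; returns true if the source now looks like a storm. */
static bool
intr_note(struct intr_source *s)
{
	s->consecutive++;
	return (storm_threshold != 0 && s->consecutive >= storm_threshold);
}

/* Called when the source goes quiet, which resets the run length. */
static void
intr_idle(struct intr_source *s)
{
	s->consecutive = 0;
}
```

With a device that legitimately raises hundreds of thousands of interrupts per second, any small fixed threshold trips almost immediately, which is the false positive the commit message describes.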
svn commit: r347384 - stable/12/sbin/camcontrol
Author: smh Date: Thu May 9 08:35:50 2019 New Revision: 347384 URL: https://svnweb.freebsd.org/changeset/base/347384 Log: MFC r346594: Add ATA power mode support to camcontrol Sponsored by: Multiplay Modified: stable/12/sbin/camcontrol/camcontrol.8 stable/12/sbin/camcontrol/camcontrol.c Directory Properties: stable/12/ (props changed) Modified: stable/12/sbin/camcontrol/camcontrol.8 == --- stable/12/sbin/camcontrol/camcontrol.8 Thu May 9 07:57:33 2019 (r347383) +++ stable/12/sbin/camcontrol/camcontrol.8 Thu May 9 08:35:50 2019 (r347384) @@ -27,7 +27,7 @@ .\" .\" $FreeBSD$ .\" -.Dd May 3, 2017 +.Dd May 9, 2019 .Dt CAMCONTROL 8 .Os .Sh NAME @@ -242,6 +242,10 @@ .Op device id .Op generic args .Nm +.Ic powermode +.Op device id +.Op generic args +.Nm .Ic apm .Op device id .Op generic args @@ -1382,6 +1386,8 @@ Value 0 disables timer. Put ATA device into SLEEP state. Note that the only way get device out of this state may be reset. +.It Ic powermode +Report ATA device power mode. .It Ic apm It optional parameter .Pq Fl l Modified: stable/12/sbin/camcontrol/camcontrol.c == --- stable/12/sbin/camcontrol/camcontrol.c Thu May 9 07:57:33 2019 (r347383) +++ stable/12/sbin/camcontrol/camcontrol.c Thu May 9 08:35:50 2019 (r347384) @@ -109,7 +109,8 @@ typedef enum { CAM_CMD_ZONE= 0x0026, CAM_CMD_EPC = 0x0027, CAM_CMD_TIMESTAMP = 0x0028, - CAM_CMD_MMCSD_CMD = 0x0029 + CAM_CMD_MMCSD_CMD = 0x0029, + CAM_CMD_POWER_MODE = 0x002a, } cam_cmdmask; typedef enum { @@ -236,6 +237,7 @@ static struct camcontrol_opts option_table[] = { {"idle", CAM_CMD_IDLE, CAM_ARG_NONE, "t:"}, {"standby", CAM_CMD_STANDBY, CAM_ARG_NONE, "t:"}, {"sleep", CAM_CMD_SLEEP, CAM_ARG_NONE, ""}, + {"powermode", CAM_CMD_POWER_MODE, CAM_ARG_NONE, ""}, {"apm", CAM_CMD_APM, CAM_ARG_NONE, "l:"}, {"aam", CAM_CMD_AAM, CAM_ARG_NONE, "l:"}, {"fwdownload", CAM_CMD_DOWNLOAD_FW, CAM_ARG_NONE, "f:qsy"}, @@ -8876,6 +8878,61 @@ bailout: } static int +atapm_proc_resp(struct cam_device *device, union ccb *ccb) +{ +struct ata_res *res; 
+ +res = >ataio.res; +if (res->status & ATA_STATUS_ERROR) { +if (arglist & CAM_ARG_VERBOSE) { +cam_error_print(device, ccb, CAM_ESF_ALL, +CAM_EPF_ALL, stderr); +printf("error = 0x%02x, sector_count = 0x%04x, " + "device = 0x%02x, status = 0x%02x\n", + res->error, res->sector_count, + res->device, res->status); +} + +return (1); +} + +if (arglist & CAM_ARG_VERBOSE) { +fprintf(stdout, "%s%d: Raw native check power data:\n", +device->device_name, device->dev_unit_num); +/* res is 4 byte aligned */ +dump_data((uint16_t*)(uintptr_t)res, sizeof(struct ata_res)); + +printf("error = 0x%02x, sector_count = 0x%04x, device = 0x%02x, " + "status = 0x%02x\n", res->error, res->sector_count, + res->device, res->status); +} + +printf("%s%d: ", device->device_name, device->dev_unit_num); +switch (res->sector_count) { +case 0x00: + printf("Standby mode\n"); + break; +case 0x40: + printf("NV Cache Power Mode and the spindle is spun down or spinning down\n"); + break; +case 0x41: + printf("NV Cache Power Mode and the spindle is spun up or spinning up\n"); + break; +case 0x80: + printf("Idle mode\n"); + break; +case 0xff: + printf("Active or Idle mode\n"); + break; +default: + printf("Unknown mode 0x%02x\n", res->sector_count); + break; +} + +return (0); +} + +static int atapm(struct cam_device *device, int argc, char **argv, char *combinedopt, int retry_count, int timeout) { @@ -8883,6 +8940,7 @@ atapm(struct cam_device *device, int argc, char **argv int retval = 0; int t = -1; int c; + u_int8_t ata_flags = 0; u_char cmd, sc; ccb = cam_getccb(device); @@ -8911,6 +8969,10 @@ atapm(struct cam_device *device, int argc, char **argv cmd = ATA_STANDBY_IMMEDIATE; else cmd = ATA_STANDBY_CMD; + } else if (strcmp(argv[1], "powermode") == 0) { + cmd = ATA_CHECK_POWER_MODE; + ata_flags = AP_FLAG_CHK_COND; + t = -1; } else { cmd = ATA_SLEEP; t = -1; @@ -8928,11 +8990,12 @@ atapm(struct cam_device *device, int argc, char **argv else sc = 253; - retval = ata_do_28bit_cmd(device, + retval
svn commit: r346594 - head/sbin/camcontrol
Author: smh Date: Tue Apr 23 07:46:38 2019 New Revision: 346594 URL: https://svnweb.freebsd.org/changeset/base/346594 Log: Add ATA power mode support to camcontrol Add the ability to report ATA device power mode with the cmmand 'powermode' to compliment the existing ability to set it using idle, standby and sleep commands. MFC after:2 weeks Sponsored by: Multiplay Modified: head/sbin/camcontrol/camcontrol.8 head/sbin/camcontrol/camcontrol.c Modified: head/sbin/camcontrol/camcontrol.8 == --- head/sbin/camcontrol/camcontrol.8 Tue Apr 23 06:36:32 2019 (r346593) +++ head/sbin/camcontrol/camcontrol.8 Tue Apr 23 07:46:38 2019 (r346594) @@ -27,7 +27,7 @@ .\" .\" $FreeBSD$ .\" -.Dd March 12, 2019 +.Dd April 22, 2019 .Dt CAMCONTROL 8 .Os .Sh NAME @@ -243,6 +243,10 @@ .Op device id .Op generic args .Nm +.Ic powermode +.Op device id +.Op generic args +.Nm .Ic apm .Op device id .Op generic args @@ -1388,6 +1392,8 @@ Value 0 disables timer. Put ATA device into SLEEP state. Note that the only way get device out of this state may be reset. +.It Ic powermode +Report ATA device power mode. 
.It Ic apm It optional parameter .Pq Fl l Modified: head/sbin/camcontrol/camcontrol.c == --- head/sbin/camcontrol/camcontrol.c Tue Apr 23 06:36:32 2019 (r346593) +++ head/sbin/camcontrol/camcontrol.c Tue Apr 23 07:46:38 2019 (r346594) @@ -109,7 +109,8 @@ typedef enum { CAM_CMD_ZONE= 0x0026, CAM_CMD_EPC = 0x0027, CAM_CMD_TIMESTAMP = 0x0028, - CAM_CMD_MMCSD_CMD = 0x0029 + CAM_CMD_MMCSD_CMD = 0x0029, + CAM_CMD_POWER_MODE = 0x002a, } cam_cmdmask; typedef enum { @@ -236,6 +237,7 @@ static struct camcontrol_opts option_table[] = { {"idle", CAM_CMD_IDLE, CAM_ARG_NONE, "t:"}, {"standby", CAM_CMD_STANDBY, CAM_ARG_NONE, "t:"}, {"sleep", CAM_CMD_SLEEP, CAM_ARG_NONE, ""}, + {"powermode", CAM_CMD_POWER_MODE, CAM_ARG_NONE, ""}, {"apm", CAM_CMD_APM, CAM_ARG_NONE, "l:"}, {"aam", CAM_CMD_AAM, CAM_ARG_NONE, "l:"}, {"fwdownload", CAM_CMD_DOWNLOAD_FW, CAM_ARG_NONE, "f:qsy"}, @@ -8885,6 +8887,61 @@ bailout: } static int +atapm_proc_resp(struct cam_device *device, union ccb *ccb) +{ +struct ata_res *res; + +res = >ataio.res; +if (res->status & ATA_STATUS_ERROR) { +if (arglist & CAM_ARG_VERBOSE) { +cam_error_print(device, ccb, CAM_ESF_ALL, +CAM_EPF_ALL, stderr); +printf("error = 0x%02x, sector_count = 0x%04x, " + "device = 0x%02x, status = 0x%02x\n", + res->error, res->sector_count, + res->device, res->status); +} + +return (1); +} + +if (arglist & CAM_ARG_VERBOSE) { +fprintf(stdout, "%s%d: Raw native check power data:\n", +device->device_name, device->dev_unit_num); +/* res is 4 byte aligned */ +dump_data((uint16_t*)(uintptr_t)res, sizeof(struct ata_res)); + +printf("error = 0x%02x, sector_count = 0x%04x, device = 0x%02x, " + "status = 0x%02x\n", res->error, res->sector_count, + res->device, res->status); +} + +printf("%s%d: ", device->device_name, device->dev_unit_num); +switch (res->sector_count) { +case 0x00: + printf("Standby mode\n"); + break; +case 0x40: + printf("NV Cache Power Mode and the spindle is spun down or spinning down\n"); + break; +case 0x41: + printf("NV Cache Power 
Mode and the spindle is spun up or spinning up\n"); + break; +case 0x80: + printf("Idle mode\n"); + break; +case 0xff: + printf("Active or Idle mode\n"); + break; +default: + printf("Unknown mode 0x%02x\n", res->sector_count); + break; +} + +return (0); +} + +static int atapm(struct cam_device *device, int argc, char **argv, char *combinedopt, int retry_count, int timeout) { @@ -8892,6 +8949,7 @@ atapm(struct cam_device *device, int argc, char **argv int retval = 0; int t = -1; int c; + u_int8_t ata_flags = 0; u_char cmd, sc; ccb = cam_getccb(device); @@ -8920,6 +8978,10 @@ atapm(struct cam_device *device, int argc, char **argv cmd = ATA_STANDBY_IMMEDIATE; else cmd = ATA_STANDBY_CMD; + } else if (strcmp(argv[1], "powermode") == 0) { + cmd = ATA_CHECK_POWER_MODE; + ata_flags = AP_FLAG_CHK_COND; + t = -1; } else { cmd = ATA_SLEEP; t = -1; @@ -8937,11 +8999,12 @@ atapm(struct cam_device *device, int argc, char **argv else
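For anyone reading the archive without the tree to hand, the decoding done by the switch statement in the diff above can be sketched stand-alone as below. This is only an illustration: `power_mode_name()` is a hypothetical helper, not something camcontrol provides, and the value-to-string mapping is copied from the diff rather than checked against the ATA specification.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of camcontrol): map the sector_count
 * value returned for a check power mode request to the strings printed
 * in the diff. Returns NULL for any value the diff would report as
 * "Unknown mode".
 */
static const char *
power_mode_name(uint8_t sector_count)
{
	switch (sector_count) {
	case 0x00:
		return ("Standby mode");
	case 0x40:
		return ("NV Cache Power Mode and the spindle is spun down or spinning down");
	case 0x41:
		return ("NV Cache Power Mode and the spindle is spun up or spinning up");
	case 0x80:
		return ("Idle mode");
	case 0xff:
		return ("Active or Idle mode");
	default:
		return (NULL);
	}
}
```

A caller would print the returned string, or fall back to printing the raw value in hex when the helper returns NULL, which mirrors the default case in the diff.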
svn commit: r345129 - stable/12/stand/libsa/zfs
Author: smh Date: Thu Mar 14 10:06:46 2019 New Revision: 345129 URL: https://svnweb.freebsd.org/changeset/base/345129 Log: Revert zfsimpl.c accidentally committed in r345128 Revert an unrelated change to zfsimpl.c accidentally committed in r345128. Sponsored by: Multiplay Modified: stable/12/stand/libsa/zfs/zfsimpl.c Modified: stable/12/stand/libsa/zfs/zfsimpl.c == --- stable/12/stand/libsa/zfs/zfsimpl.c Thu Mar 14 10:03:04 2019 (r345128) +++ stable/12/stand/libsa/zfs/zfsimpl.c Thu Mar 14 10:06:46 2019 (r345129) @@ -2076,7 +2076,6 @@ zfs_mount_dataset(const spa_t *spa, uint64_t objnum, o { dnode_phys_t dataset; dsl_dataset_phys_t *ds; - int err; if (objset_get_dnode(spa, >spa_mos, objnum, )) { printf("ZFS: can't find dataset %ju\n", (uintmax_t)objnum); @@ -2084,9 +2083,9 @@ zfs_mount_dataset(const spa_t *spa, uint64_t objnum, o } ds = (dsl_dataset_phys_t *) _bonus; - if ((err = zio_read(spa, >ds_bp, objset)) != 0) { - printf("ZFS: can't read object set for dataset %ju (error %d)\n", - (uintmax_t)objnum, err); + if (zio_read(spa, >ds_bp, objset)) { + printf("ZFS: can't read object set for dataset %ju\n", + (uintmax_t)objnum); return (EIO); } ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r345128 - in stable/12: sbin/camcontrol stand/libsa/zfs
Author: smh Date: Thu Mar 14 10:03:04 2019 New Revision: 345128 URL: https://svnweb.freebsd.org/changeset/base/345128 Log: MFC r344701: Fix incorrect / unused sector_count for identify requests Fix unused sector_count for identify requests from camcontrol by changing to zero which is a more appropriate value when the parameter is unused. Sponsored by: Multiplay Modified: stable/12/sbin/camcontrol/camcontrol.c stable/12/stand/libsa/zfs/zfsimpl.c Directory Properties: stable/12/ (props changed) Modified: stable/12/sbin/camcontrol/camcontrol.c == --- stable/12/sbin/camcontrol/camcontrol.c Thu Mar 14 09:18:54 2019 (r345127) +++ stable/12/sbin/camcontrol/camcontrol.c Thu Mar 14 10:03:04 2019 (r345128) @@ -2292,7 +2292,7 @@ ata_do_identify(struct cam_device *device, int retry_c /*command*/command, /*features*/0, /*lba*/0, -/*sector_count*/(u_int8_t)sizeof(struct ata_params), +/*sector_count*/0, /*data_ptr*/(u_int8_t *)ptr, /*dxfer_len*/sizeof(struct ata_params), /*timeout*/timeout ? timeout : 30 * 1000, @@ -2312,8 +2312,7 @@ ata_do_identify(struct cam_device *device, int retry_c /*command*/retry_command, /*features*/0, /*lba*/0, -/*sector_count*/(u_int8_t) -sizeof(struct ata_params), +/*sector_count*/0, /*data_ptr*/(u_int8_t *)ptr, /*dxfer_len*/sizeof(struct ata_params), /*timeout*/timeout ? 
timeout : 30 * 1000, Modified: stable/12/stand/libsa/zfs/zfsimpl.c == --- stable/12/stand/libsa/zfs/zfsimpl.c Thu Mar 14 09:18:54 2019 (r345127) +++ stable/12/stand/libsa/zfs/zfsimpl.c Thu Mar 14 10:03:04 2019 (r345128) @@ -2076,6 +2076,7 @@ zfs_mount_dataset(const spa_t *spa, uint64_t objnum, o { dnode_phys_t dataset; dsl_dataset_phys_t *ds; + int err; if (objset_get_dnode(spa, >spa_mos, objnum, )) { printf("ZFS: can't find dataset %ju\n", (uintmax_t)objnum); @@ -2083,9 +2084,9 @@ zfs_mount_dataset(const spa_t *spa, uint64_t objnum, o } ds = (dsl_dataset_phys_t *) _bonus; - if (zio_read(spa, >ds_bp, objset)) { - printf("ZFS: can't read object set for dataset %ju\n", - (uintmax_t)objnum); + if ((err = zio_read(spa, >ds_bp, objset)) != 0) { + printf("ZFS: can't read object set for dataset %ju (error %d)\n", + (uintmax_t)objnum, err); return (EIO); } ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r344701 - head/sbin/camcontrol
Not really much more to say that isn't explained by that and the code. Sure, I could have used a different sentence structure for the body, but it wouldn't add anything IMO. Thoughts?

On 02/03/2019 10:49, Alexey Dokuchaev wrote:
> On Fri, Mar 01, 2019 at 02:39:15PM +, Steven Hartland wrote:
>> New Revision: 344701
>> URL: https://svnweb.freebsd.org/changeset/base/344701
>>
>> Log:
>>   Fix incorrect / unused sector_count for identify requests
>>
>>   Fix incorrect / unused sector_count for identify requests from
>>   camcontrol.
>>
>>   Submitted by: Alexey Dokuchaev
>
> Thanks, although commit message is a bit scarce. Also, for some reason,
> it consists of two nearly identical lines -- unnoticed copy paste error?
>
> ./danfe
svn commit: r344701 - head/sbin/camcontrol
Author: smh Date: Fri Mar 1 14:39:15 2019 New Revision: 344701 URL: https://svnweb.freebsd.org/changeset/base/344701 Log: Fix incorrect / unused sector_count for identify requests Fix incorrect / unused sector_count for identify requests from camcontrol. Submitted by: Alexey Dokuchaev Reported by: Alexey Dokuchaev MFC after:1 week Sponsored by: Multiplay Differential Revision:https://reviews.freebsd.org/D19408 Modified: head/sbin/camcontrol/camcontrol.c Modified: head/sbin/camcontrol/camcontrol.c == --- head/sbin/camcontrol/camcontrol.c Fri Mar 1 14:33:20 2019 (r344700) +++ head/sbin/camcontrol/camcontrol.c Fri Mar 1 14:39:15 2019 (r344701) @@ -2292,7 +2292,7 @@ ata_do_identify(struct cam_device *device, int retry_c /*command*/command, /*features*/0, /*lba*/0, -/*sector_count*/(u_int8_t)sizeof(struct ata_params), +/*sector_count*/0, /*data_ptr*/(u_int8_t *)ptr, /*dxfer_len*/sizeof(struct ata_params), /*timeout*/timeout ? timeout : 30 * 1000, @@ -2312,8 +2312,7 @@ ata_do_identify(struct cam_device *device, int retry_c /*command*/retry_command, /*features*/0, /*lba*/0, -/*sector_count*/(u_int8_t) -sizeof(struct ata_params), +/*sector_count*/0, /*data_ptr*/(u_int8_t *)ptr, /*dxfer_len*/sizeof(struct ata_params), /*timeout*/timeout ? timeout : 30 * 1000, ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
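A small sketch of what the removed expression may have evaluated to, for readers wondering why the old value counted as "incorrect / unused". It assumes the identify structure is 512 bytes (ATA IDENTIFY data is 256 16-bit words); `struct identify_data` is a hypothetical stand-in, not the real `struct ata_params`, whose size is not stated in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct ata_params: 256 16-bit words,
 * i.e. 512 bytes. The real structure's size is an assumption here.
 */
struct identify_data {
	uint16_t words[256];
};

/*
 * The value the old expression would have passed as sector_count:
 * the size is narrowed to 8 bits, so only the low byte survives.
 */
static uint8_t
old_sector_count(void)
{
	return ((uint8_t)sizeof(struct identify_data));
}
```

Under that assumption the cast keeps only the low byte of 512, which is 0, so writing 0 explicitly would not change what is sent and only makes the intent visible. If the real structure has a different size, the old expression would have passed some other truncated value.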
Re: svn commit: r343745 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
On 04/02/2019 16:13, Alexander Motin wrote: Author: mav Date: Mon Feb 4 16:13:41 2019 New Revision: 343745 URL: https://svnweb.freebsd.org/changeset/base/343745 Log: Add missed tunables/sysctls for some new vdev variables. While there, make few existing sysctls writeable, since there is no reason not to. MFC after: 1 week Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c == --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Mon Feb 4 16:02:03 2019(r343744) +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Mon Feb 4 16:13:41 2019(r343745) @@ -165,29 +165,38 @@ static vdev_ops_t *vdev_ops_table[] = { /* target number of metaslabs per top-level vdev */ int vdev_max_ms_count = 200; -SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, max_ms_count, CTLFLAG_RDTUN, +SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, max_ms_count, CTLFLAG_RWTUN, _max_ms_count, 0, -"Maximum number of metaslabs per top-level vdev"); +"Target number of metaslabs per top-level vdev"); /* minimum number of metaslabs per top-level vdev */ int vdev_min_ms_count = 16; -SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, min_ms_count, CTLFLAG_RDTUN, +SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, min_ms_count, CTLFLAG_RWTUN, _min_ms_count, 0, "Minimum number of metaslabs per top-level vdev"); /* practical upper limit of total metaslabs per top-level vdev */ int vdev_ms_count_limit = 1ULL << 17; +SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, max_ms_count_limit, CTLFLAG_RWTUN, +_ms_count_limit, 0, +"Maximum number of metaslabs per top-level vdev"); /* lower limit for metaslab size (512M) */ int vdev_default_ms_shift = 29; -SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, default_ms_shift, CTLFLAG_RDTUN, +SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, default_ms_shift, CTLFLAG_RWTUN, _default_ms_shift, 0, -"Shift between vdev size and number of metaslabs"); +"Default shift between vdev size and number of metaslabs"); /* upper limit for metaslab size (256G) */ int vdev_max_ms_shift 
= 38; +SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, max_ms_shift, CTLFLAG_RWTUN, +_max_ms_shift, 0, +"Maximal shift between vdev size and number of metaslabs");

It's just a nit, but I believe this should be "Maximum", like the others, instead of "Maximal".

boolean_t vdev_validate_skip = B_FALSE; +SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, validate_skip, CTLFLAG_RWTUN, +_validate_skip, 0, +"Bypass vdev validation"); /* * Since the DTL space map of a vdev is not expected to have a lot of
Re: svn: head/usr.bin: . trim
On 30/11/2018 22:09, Eugene Grosbein wrote:
> 01.12.2018 4:29, Steven Hartland wrote:
>> On 30/11/2018 21:16, Eugene Grosbein wrote:
>>> 30.11.2018 21:23, Warner Losh wrote:
>>>> So I'm back to my point: we should just put it into dd and move on
>>>> with our lives. It's really the right place for it.
>>>
>>> Why can't we have two implementations? Diversity is good thing.
>>> I can imagine erasing a partition with ZFS Cache or ZIL inside
>>> and "trim /dev/da0p2 /dev/da0p3" looks much better :-)
>>
>> ZFS already does that no need for a separate tool
>
> Think of media taken out of (possibly already dead) ZFS-based to
> UFS-only system. By the way, how exactly do you trim previously ZIL
> partition within working ZFS-based system?

You could use camcontrol, which can perform a secure erase on the device, but that's obviously device-wide rather than limited to a specific partition.

What I was referring to is that ZFS performs a delete of blocks when it initializes a volume, so there's usually no need to perform a manual step there.

For reference, this behavior can be disabled by setting vfs.zfs.vdev.trim_on_init=0

Regards
Steve
Re: svn: head/usr.bin: . trim
ZFS already does that, so there's no need for a separate tool.

On 30/11/2018 21:16, Eugene Grosbein wrote:
> 30.11.2018 21:23, Warner Losh wrote:
>> So I'm back to my point: we should just put it into dd and move on
>> with our lives. It's really the right place for it.
>
> Why can't we have two implementations? Diversity is good thing.
> I can imagine erasing a partition with ZFS Cache or ZIL inside
> and "trim /dev/da0p2 /dev/da0p3" looks much better :-)
Re: svn: head/usr.bin: . trim
Personally I disagree: the chances of people finding that option in dd are slim, and a dedicated trim utility makes much more sense to me. Sure, have both, that's cool, but keeping trim would be my vote.

On 30/11/2018 01:17, Cy Schubert wrote:
> Agreed.
>
> ---
> Sent using a tiny phone keyboard.
> Apologies for any typos and autocorrect.
> Also, this old phone only supports top post. Apologies.
>
> Cy Schubert or
> The need of the many outweighs the greed of the few.
> ---
>
> From: Alexey Dokuchaev
> Sent: 29/11/2018 17:01
> To: Maxim Sobolev
> Cc: eu...@freebsd.org; svn-src-h...@freebsd.org; svn-src-all@freebsd.org; src-committers
> Subject: Re: svn: head/usr.bin: . trim
>
> On Thu, Nov 29, 2018 at 10:36:02AM -0800, Maxim Sobolev wrote:
> > Interesting. I have a similar functionality implemented as an option for
> > the dd utility in my pipeline (conv=erase).
>
> Which probably makes a better place rather than adding 4-letter program,
> commonly named ("trim" is a simple word), into global namespace. :-/
>
> ./danfe
svn commit: r339035 - stable/11/sys/netinet
Author: smh Date: Mon Oct 1 07:49:16 2018 New Revision: 339035 URL: https://svnweb.freebsd.org/changeset/base/339035 Log: MFC r336165: Removed pointless NULL check in rip_pcblist. Sponsored by: Multiplay Modified: stable/11/sys/netinet/raw_ip.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/netinet/raw_ip.c == --- stable/11/sys/netinet/raw_ip.c Mon Oct 1 04:08:47 2018 (r339034) +++ stable/11/sys/netinet/raw_ip.c Mon Oct 1 07:49:16 2018 (r339035) @@ -1053,8 +1053,6 @@ rip_pcblist(SYSCTL_HANDLER_ARGS) return (error); inp_list = malloc(n * sizeof *inp_list, M_TEMP, M_WAITOK); - if (inp_list == NULL) - return (ENOMEM); INP_INFO_RLOCK(_ripcbinfo); for (inp = LIST_FIRST(V_ripcbinfo.ipi_listhead), i = 0; inp && i < n; ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r336165 - head/sys/netinet
Author: smh Date: Tue Jul 10 08:05:32 2018 New Revision: 336165 URL: https://svnweb.freebsd.org/changeset/base/336165 Log: Removed pointless NULL check Removed pointless NULL check after malloc with M_WAITOK which can never return NULL. Sponsored by: Multiplay Modified: head/sys/netinet/raw_ip.c Modified: head/sys/netinet/raw_ip.c == --- head/sys/netinet/raw_ip.c Tue Jul 10 07:29:51 2018(r336164) +++ head/sys/netinet/raw_ip.c Tue Jul 10 08:05:32 2018(r336165) @@ -1069,8 +1069,6 @@ rip_pcblist(SYSCTL_HANDLER_ARGS) return (error); inp_list = malloc(n * sizeof *inp_list, M_TEMP, M_WAITOK); - if (inp_list == NULL) - return (ENOMEM); INP_INFO_RLOCK_ET(_ripcbinfo, et); for (inp = CK_LIST_FIRST(V_ripcbinfo.ipi_listhead), i = 0; inp && i < n; ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r335856 - in head/sys: netinet sys
Sorry guys, I didn't spot that it was just a revert, as that was tagged on to the end of the description; I would have expected it to be in the subject.

What do others think, is there a recommended style for revert commit messages?

Regards
Steve

On 02/07/2018 17:30, Rodney W. Grimes wrote:
>> On Mon, Jul 2, 2018 at 10:44 AM Steven Hartland <
>> steven.hartl...@multiplay.co.uk> wrote:
>>
>>> You have M_WAITOK and a null check in this change
>>
>> And, that's the same as the way it was before his commits. So, he did
>> exactly what he said he was doing and reverted his commits.
>>
>> I don't think it is good practice to mix reverts with other changes.
>
> It is a very bad practice to mix a revert with anything.
>
>> Since you've noticed this, I think you should feel free to make the change.
>>
>> Jonathan
Re: svn commit: r335856 - in head/sys: netinet sys
You have M_WAITOK and a null check in this change On Mon, 2 Jul 2018 at 06:20, Matt Macy wrote: > Author: mmacy > Date: Mon Jul 2 05:19:44 2018 > New Revision: 335856 > URL: https://svnweb.freebsd.org/changeset/base/335856 > > Log: > inpcb: don't gratuitously defer frees > > Don't defer frees in sysctl handlers. It isn't necessary > and it just confuses things. > revert: r333911, r334104, and r334125 > > Requested by: jtl > > Modified: > head/sys/netinet/ip_divert.c > head/sys/netinet/raw_ip.c > head/sys/netinet/tcp_subr.c > head/sys/netinet/udp_usrreq.c > head/sys/sys/malloc.h > > Modified: head/sys/netinet/ip_divert.c > > == > --- head/sys/netinet/ip_divert.cMon Jul 2 01:30:33 2018 > (r335855) > +++ head/sys/netinet/ip_divert.cMon Jul 2 05:19:44 2018 > (r335856) > @@ -552,7 +552,6 @@ div_detach(struct socket *so) > KASSERT(inp != NULL, ("div_detach: inp == NULL")); > INP_INFO_WLOCK(_divcbinfo); > INP_WLOCK(inp); > - /* XXX defer destruction to epoch_call */ > in_pcbdetach(inp); > in_pcbfree(inp); > INP_INFO_WUNLOCK(_divcbinfo); > @@ -632,7 +631,6 @@ static int > div_pcblist(SYSCTL_HANDLER_ARGS) > { > int error, i, n; > - struct in_pcblist *il; > struct inpcb *inp, **inp_list; > inp_gen_t gencnt; > struct xinpgen xig; > @@ -672,8 +670,9 @@ div_pcblist(SYSCTL_HANDLER_ARGS) > if (error) > return error; > > - il = malloc(sizeof(struct in_pcblist) + n * sizeof(struct inpcb > *), M_TEMP, M_WAITOK|M_ZERO_INVARIANTS); > - inp_list = il->il_inp_list; > + inp_list = malloc(n * sizeof *inp_list, M_TEMP, M_WAITOK); > + if (inp_list == NULL) > + return ENOMEM; > > INP_INFO_RLOCK(_divcbinfo); > for (inp = CK_LIST_FIRST(V_divcbinfo.ipi_listhead), i = 0; inp && > i < n; > @@ -702,9 +701,14 @@ div_pcblist(SYSCTL_HANDLER_ARGS) > } else > INP_RUNLOCK(inp); > } > - il->il_count = n; > - il->il_pcbinfo = _divcbinfo; > - epoch_call(net_epoch_preempt, >il_epoch_ctx, > in_pcblist_rele_rlocked); > + INP_INFO_WLOCK(_divcbinfo); > + for (i = 0; i < n; i++) { > + inp = inp_list[i]; > + 
INP_RLOCK(inp); > + if (!in_pcbrele_rlocked(inp)) > + INP_RUNLOCK(inp); > + } > + INP_INFO_WUNLOCK(_divcbinfo); > > if (!error) { > /* > @@ -721,6 +725,7 @@ div_pcblist(SYSCTL_HANDLER_ARGS) > INP_INFO_RUNLOCK(_divcbinfo); > error = SYSCTL_OUT(req, , sizeof xig); > } > + free(inp_list, M_TEMP); > return error; > } > > @@ -800,7 +805,6 @@ div_modevent(module_t mod, int type, void *unused) > break; > } > ip_divert_ptr = NULL; > - /* XXX defer to epoch_call ? */ > err = pf_proto_unregister(PF_INET, IPPROTO_DIVERT, > SOCK_RAW); > INP_INFO_WUNLOCK(_divcbinfo); > #ifndef VIMAGE > > Modified: head/sys/netinet/raw_ip.c > > == > --- head/sys/netinet/raw_ip.c Mon Jul 2 01:30:33 2018(r335855) > +++ head/sys/netinet/raw_ip.c Mon Jul 2 05:19:44 2018(r335856) > @@ -863,7 +863,6 @@ rip_detach(struct socket *so) > ip_rsvp_force_done(so); > if (so == V_ip_rsvpd) > ip_rsvp_done(); > - /* XXX defer to epoch_call */ > in_pcbdetach(inp); > in_pcbfree(inp); > INP_INFO_WUNLOCK(_ripcbinfo); > @@ -1033,7 +1032,6 @@ static int > rip_pcblist(SYSCTL_HANDLER_ARGS) > { > int error, i, n; > - struct in_pcblist *il; > struct inpcb *inp, **inp_list; > inp_gen_t gencnt; > struct xinpgen xig; > @@ -1068,8 +1066,9 @@ rip_pcblist(SYSCTL_HANDLER_ARGS) > if (error) > return (error); > > - il = malloc(sizeof(struct in_pcblist) + n * sizeof(struct inpcb > *), M_TEMP, M_WAITOK|M_ZERO_INVARIANTS); > - inp_list = il->il_inp_list; > + inp_list = malloc(n * sizeof *inp_list, M_TEMP, M_WAITOK); > + if (inp_list == NULL) > + return (ENOMEM); > > INP_INFO_RLOCK(_ripcbinfo); > for (inp = CK_LIST_FIRST(V_ripcbinfo.ipi_listhead), i = 0; inp && > i < n; > @@ -1098,9 +1097,14 @@ rip_pcblist(SYSCTL_HANDLER_ARGS) > } else > INP_RUNLOCK(inp); > } > - il->il_count = n; > - il->il_pcbinfo = _ripcbinfo; > - epoch_call(net_epoch_preempt, >il_epoch_ctx, > in_pcblist_rele_rlocked); > + INP_INFO_WLOCK(_ripcbinfo); > + for (i = 0; i < n; i++) { > + inp = inp_list[i]; > + INP_RLOCK(inp); > + if
Re: svn commit: r335171 - head/sys/vm
On 15/06/2018 00:07, Alan Cox wrote:
>> On Jun 14, 2018, at 5:54 PM, Steven Hartland <steven.hartl...@multiplay.co.uk> wrote:
>>
>> Out of interest, how would this exhibit itself?
>
> A panic in vm_page_insert_after().

So just to confirm, this couldn't cause random memory corruption of the parent process?

Regards
Steve
Re: svn commit: r335171 - head/sys/vm
Out of interest, how would this exhibit itself? On 14/06/2018 20:41, Konstantin Belousov wrote: Author: kib Date: Thu Jun 14 19:41:02 2018 New Revision: 335171 URL: https://svnweb.freebsd.org/changeset/base/335171 Log: Handle the race between fork/vm_object_split() and faults. If fault started before vmspace_fork() locked the map, and then during fork, vm_map_copy_entry()->vm_object_split() is executed, it is possible that the fault instantiate the page into the original object when the page was already copied into the new object (see vm_map_split() for the orig/new objects terminology). This can happen if split found a busy page (e.g. from the fault) and slept dropping the objects lock, which allows the swap pager to instantiate read-behind pages for the fault. Then the restart of the scan can see a page in the scanned range, where it was already copied to the upper object. Fix it by instantiating the read-ahead pages before swap_pager_getpages() method drops the lock to allocate pbuf. The object scan would see the whole range prefilled with the busy pages and not proceed the range. Note that vm_fault rechecks the map generation count after the object unlock, so that it restarts the handling if raced with split, and re-lookups the right page from the upper object. 
In collaboration with: alc Tested by: pho Sponsored by:The FreeBSD Foundation MFC after: 1 week Modified: head/sys/vm/swap_pager.c Modified: head/sys/vm/swap_pager.c == --- head/sys/vm/swap_pager.cThu Jun 14 19:01:40 2018(r335170) +++ head/sys/vm/swap_pager.cThu Jun 14 19:41:02 2018(r335171) @@ -1096,21 +1096,24 @@ swap_pager_getpages(vm_object_t object, vm_page_t *ma, int *rahead) { struct buf *bp; - vm_page_t mpred, msucc, p; + vm_page_t bm, mpred, msucc, p; vm_pindex_t pindex; daddr_t blk; - int i, j, maxahead, maxbehind, reqcount, shift; + int i, maxahead, maxbehind, reqcount; reqcount = count; - VM_OBJECT_WUNLOCK(object); - bp = getpbuf(_rcount); - VM_OBJECT_WLOCK(object); - - if (!swap_pager_haspage(object, ma[0]->pindex, , )) { - relpbuf(bp, _rcount); + /* +* Determine the final number of read-behind pages and +* allocate them BEFORE releasing the object lock. Otherwise, +* there can be a problematic race with vm_object_split(). +* Specifically, vm_object_split() might first transfer pages +* that precede ma[0] in the current object to a new object, +* and then this function incorrectly recreates those pages as +* read-behind pages in the current object. +*/ + if (!swap_pager_haspage(object, ma[0]->pindex, , )) return (VM_PAGER_FAIL); - } /* * Clip the readahead and readbehind ranges to exclude resident pages. @@ -1132,35 +1135,31 @@ swap_pager_getpages(vm_object_t object, vm_page_t *ma, *rbehind = pindex - mpred->pindex - 1; } + bm = ma[0]; + for (i = 0; i < count; i++) + ma[i]->oflags |= VPO_SWAPINPROG; + /* * Allocate readahead and readbehind pages. */ - shift = rbehind != NULL ? *rbehind : 0; - if (shift != 0) { - for (i = 1; i <= shift; i++) { + if (rbehind != NULL) { + for (i = 1; i <= *rbehind; i++) { p = vm_page_alloc(object, ma[0]->pindex - i, VM_ALLOC_NORMAL); - if (p == NULL) { - /* Shift allocated pages to the left. 
*/ - for (j = 0; j < i - 1; j++) - bp->b_pages[j] = - bp->b_pages[j + shift - i + 1]; + if (p == NULL) break; - } - bp->b_pages[shift - i] = p; + p->oflags |= VPO_SWAPINPROG; + bm = p; } - shift = i - 1; - *rbehind = shift; + *rbehind = i - 1; } - for (i = 0; i < reqcount; i++) - bp->b_pages[i + shift] = ma[i]; if (rahead != NULL) { for (i = 0; i < *rahead; i++) { p = vm_page_alloc(object, ma[reqcount - 1]->pindex + i + 1, VM_ALLOC_NORMAL); if (p == NULL) break; - bp->b_pages[shift + reqcount + i] = p; + p->oflags |= VPO_SWAPINPROG; } *rahead = i; } @@ -1171,15 +1170,18 @@ swap_pager_getpages(vm_object_t object, vm_page_t *ma,
Re: svn commit: r333267 - head/sys/kern
Again, why?

On Fri, 4 May 2018 at 23:48, Mateusz Guzik wrote:
> Author: mjg
> Date: Fri May 4 22:48:10 2018
> New Revision: 333267
> URL: https://svnweb.freebsd.org/changeset/base/333267
>
> Log:
>   tc: bcopy -> memcpy
>
> Modified:
>   head/sys/kern/kern_tc.c
>
> Modified: head/sys/kern/kern_tc.c
> ==
> --- head/sys/kern/kern_tc.c Fri May 4 22:41:12 2018 (r333266)
> +++ head/sys/kern/kern_tc.c Fri May 4 22:48:10 2018 (r333267)
> @@ -1352,7 +1352,7 @@ tc_windup(struct bintime *new_boottimebin)
>         ogen = th->th_generation;
>         th->th_generation = 0;
>         atomic_thread_fence_rel();
> -       bcopy(tho, th, offsetof(struct timehands, th_generation));
> +       memcpy(th, tho, offsetof(struct timehands, th_generation));
>         if (new_boottimebin != NULL)
>                 th->th_boottime = *new_boottimebin;
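Whatever the motivation turns out to be, the mechanical part of the diff above is that the first two arguments swap places: bcopy() takes the source first, memcpy() takes the destination first. A minimal userland sketch of the two call shapes follows; `struct sample` and both wrapper functions are hypothetical names used only for illustration and say nothing about why the kernel change was made.

```c
#include <string.h>
#include <strings.h>

/* Hypothetical payload type, used only for this illustration. */
struct sample {
	int a;
	int b;
};

/* bcopy(): source first, destination second. */
static void
copy_with_bcopy(const struct sample *src, struct sample *dst)
{
	bcopy(src, dst, sizeof(*dst));
}

/* memcpy(): destination first, source second. */
static void
copy_with_memcpy(const struct sample *src, struct sample *dst)
{
	memcpy(dst, src, sizeof(*dst));
}
```

Both wrappers leave `*dst` equal to `*src` for distinct objects. The semantic difference worth keeping in mind is that bcopy() is specified to handle overlapping buffers while memcpy() is not, so a conversion like this is only safe where source and destination cannot overlap.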
Re: svn commit: r333266 - head/sys/amd64/amd64
Can we get the why in commit messages, please?

This sort of message doesn't provide anything more than can be obtained from reading the diff, which just leaves us wondering why. I'm sure there is a good reason, but without confirmation we're just left guessing.

The knock-on effect is that if some assumption behind the change later stops holding, anyone looking at this will not be able to make an informed decision about whether that is the case.

On Fri, 4 May 2018 at 23:41, Mateusz Guzik wrote:
> Author: mjg
> Date: Fri May 4 22:41:12 2018
> New Revision: 333266
> URL: https://svnweb.freebsd.org/changeset/base/333266
>
> Log:
>   amd64: syscall path bcopy -> memcpy
>
> Modified:
>   head/sys/amd64/amd64/trap.c
>
> Modified: head/sys/amd64/amd64/trap.c
> ==
> --- head/sys/amd64/amd64/trap.c Fri May 4 22:33:54 2018 (r333265)
> +++ head/sys/amd64/amd64/trap.c Fri May 4 22:41:12 2018 (r333266)
> @@ -908,7 +908,7 @@ cpu_fetch_syscall_args(struct thread *td)
>         error = 0;
>         argp = >tf_rdi;
>         argp += reg;
> -       bcopy(argp, sa->args, sizeof(sa->args[0]) * 6);
> +       memcpy(sa->args, argp, sizeof(sa->args[0]) * 6);
>         if (sa->narg > regcnt) {
>                 KASSERT(params != NULL, ("copyin args with no params!"));
>                 error = copyin(params, >args[regcnt],
Re: svn commit: r332523 - in head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys
Hey Mav, this seems like an important one to get in for 11.2 so just wanted to check if that was your intention as there's no MFC tag on the commit? On 16/04/2018 01:54, Alexander Motin wrote: Author: mav Date: Mon Apr 16 00:54:58 2018 New Revision: 332523 URL: https://svnweb.freebsd.org/changeset/base/332523 Log: 9433 Fix ARC hit rate When the compressed ARC feature was added in commit d3c2ae1 the method of reference counting in the ARC was modified. As part of this accounting change the arc_buf_add_ref() function was removed entirely. This would have be fine but the arc_buf_add_ref() function served a second undocumented purpose of updating the ARC access information when taking a hold on a dbuf. Without this logic in place a cached dbuf would not migrate its associated arc_buf_hdr_t to the MFU list. This would negatively impact the ARC hit rate, particularly on systems with a small ARC. This change reinstates the missing call to arc_access() from dbuf_hold() by implementing a new arc_buf_access() function. Reviewed-by: Giuseppe Di NataleReviewed-by: Tony Hutter Reviewed-by: Tim Chase Reviewed by: George Wilson Reviewed-by: George Melikov Signed-off-by: Brian Behlendorf Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c == --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Mon Apr 16 00:42:45 2018(r332522) +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Mon Apr 16 00:54:58 2018(r332523) @@ -540,8 +540,13 @@ typedef struct arc_stats { */ kstat_named_t arcstat_mutex_miss; /* +* Number of buffers skipped when updating the access state due to the +* header having already been released after acquiring the hash lock. 
+*/ + kstat_named_t arcstat_access_skip; + /* * Number of buffers skipped because they have I/O in progress, are -* indrect prefetch buffers that have not lived long enough, or are +* indirect prefetch buffers that have not lived long enough, or are * not from the spa we're trying to evict from. */ kstat_named_t arcstat_evict_skip; @@ -796,6 +801,7 @@ static arc_stats_t arc_stats = { { "allocated",KSTAT_DATA_UINT64 }, { "deleted", KSTAT_DATA_UINT64 }, { "mutex_miss", KSTAT_DATA_UINT64 }, + { "access_skip", KSTAT_DATA_UINT64 }, { "evict_skip", KSTAT_DATA_UINT64 }, { "evict_not_enough", KSTAT_DATA_UINT64 }, { "evict_l2_cached", KSTAT_DATA_UINT64 }, @@ -5063,6 +5069,51 @@ arc_access(arc_buf_hdr_t *hdr, kmutex_t *hash_lock) } else { ASSERT(!"invalid arc state"); } +} + +/* + * This routine is called by dbuf_hold() to update the arc_access() state + * which otherwise would be skipped for entries in the dbuf cache. + */ +void +arc_buf_access(arc_buf_t *buf) +{ + mutex_enter(>b_evict_lock); + arc_buf_hdr_t *hdr = buf->b_hdr; + + /* +* Avoid taking the hash_lock when possible as an optimization. +* The header must be checked again under the hash_lock in order +* to handle the case where it is concurrently being released. 
+*/ + if (hdr->b_l1hdr.b_state == arc_anon || HDR_EMPTY(hdr)) { + mutex_exit(>b_evict_lock); + ARCSTAT_BUMP(arcstat_access_skip); + return; + } + + kmutex_t *hash_lock = HDR_LOCK(hdr); + mutex_enter(hash_lock); + + if (hdr->b_l1hdr.b_state == arc_anon || HDR_EMPTY(hdr)) { + mutex_exit(hash_lock); + mutex_exit(>b_evict_lock); + ARCSTAT_BUMP(arcstat_access_skip); + return; + } + + mutex_exit(>b_evict_lock); + + ASSERT(hdr->b_l1hdr.b_state == arc_mru || + hdr->b_l1hdr.b_state == arc_mfu); + + DTRACE_PROBE1(arc__hit, arc_buf_hdr_t *, hdr); + arc_access(hdr, hash_lock); + mutex_exit(hash_lock); + + ARCSTAT_BUMP(arcstat_hits); + ARCSTAT_CONDSTAT(!HDR_PREFETCH(hdr), + demand, prefetch, !HDR_ISTYPE_METADATA(hdr), data, metadata, hits); } /* a generic arc_done_func_t which you can use */ Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c == --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c
svn commit: r332318 - in stable/11: . sys/net
Author: smh
Date: Mon Apr 9 08:25:29 2018
New Revision: 332318
URL: https://svnweb.freebsd.org/changeset/base/332318

Log:
  MFC r327559: Disabled the use of flowid for lagg by default

  Sponsored by: Multiplay

Modified:
  stable/11/UPDATING
  stable/11/sys/net/if_lagg.c
Directory Properties:
  stable/11/ (props changed)

Modified: stable/11/UPDATING
==
--- stable/11/UPDATING	Mon Apr 9 05:48:12 2018	(r332317)
+++ stable/11/UPDATING	Mon Apr 9 08:25:29 2018	(r332318)
@@ -16,6 +16,14 @@ from older versions of FreeBSD, try WITHOUT_CLANG and
 the tip of head, and then rebuild without this option. The bootstrap process from
 older version of current across the gcc/clang cutover is a bit fragile.
 
+20180409:
+	The use of RSS hash from the network card aka flowid has been
+	disabled by default for lagg(4) as it's currently incompatible with
+	the lacp and loadbalance protocols.
+
+	This can be re-enabled by setting the following in loader.conf:
+	net.link.lagg.default_use_flowid="1"
+
 20180331:
 	Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to
 	6.0.0. Please see the 20141231 entry below for information about

Modified: stable/11/sys/net/if_lagg.c
==
--- stable/11/sys/net/if_lagg.c	Mon Apr 9 05:48:12 2018	(r332317)
+++ stable/11/sys/net/if_lagg.c	Mon Apr 9 08:25:29 2018	(r332318)
@@ -238,7 +238,7 @@ SYSCTL_INT(_net_link_lagg, OID_AUTO, failover_rx_all,
     "Accept input from any interface in a failover lagg");
 
 /* Default value for using flowid */
-static VNET_DEFINE(int, def_use_flowid) = 1;
+static VNET_DEFINE(int, def_use_flowid) = 0;
 #define	V_def_use_flowid	VNET(def_use_flowid)
 SYSCTL_INT(_net_link_lagg, OID_AUTO, default_use_flowid, CTLFLAG_RWTUN,
     &VNET_NAME(def_use_flowid), 0,
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r332285 - head/sys/kern
Worth making these sysctls so they can be tuned to the HW / use case?

On 08/04/2018 17:34, Mateusz Guzik wrote:
> Author: mjg
> Date: Sun Apr 8 16:34:10 2018
> New Revision: 332285
> URL: https://svnweb.freebsd.org/changeset/base/332285
>
> Log:
>   locks: tweak backoff a little bit
>
>   Previous limits were chosen when locking primitives had spurious lock
>   accesses.
>
>   Flipping the starting point to 1 (or rather 2 as the first call shifts it)
>   provides a modest win when mild contention is seen while not hurting
>   worse cases.
>
>   Tested on a bunch of one, two and four socket old and new systems (Westmere,
>   Skylake, Threadreaper and others) by doing concurrent page faults,
>   buildkernel/buildworld and other stuff (although not all systems got all
>   the tests).
>
>   Another thing is the upper limit. It is semi-arbitrarily chosen as it was
>   getting out of hand for slightly less small systems (e.g. a 128-thread one).
>
>   Note that backoff is fundamentally a speculative bandaid and this change just
>   makes it fit a little bit better. It remains completely oblivious to the
>   hardware topology or the contention pattern. This is being experimented with.
>
> Modified:
>   head/sys/kern/subr_lock.c
>
> Modified: head/sys/kern/subr_lock.c
> ==
> --- head/sys/kern/subr_lock.c	Sun Apr 8 16:29:24 2018	(r332284)
> +++ head/sys/kern/subr_lock.c	Sun Apr 8 16:34:10 2018	(r332285)
> @@ -156,8 +156,10 @@ void
>  lock_delay_default_init(struct lock_delay_config *lc)
>  {
>  
> -	lc->base = lock_roundup_2(mp_ncpus) / 4;
> -	lc->max = lc->base * 1024;
> +	lc->base = 1;
> +	lc->max = lock_roundup_2(mp_ncpus) * 256;
> +	if (lc->max > 32678)
> +		lc->max = 32678;
>  }
>  
>  #ifdef DDB
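To put numbers on the new defaults, here is a minimal userland sketch of the arithmetic in the patched lock_delay_default_init(). The names roundup_pow2(), delay_config, delay_default_init() and delay_max_for() are made up for illustration, and roundup_pow2() is only an assumption about what lock_roundup_2() does (rounding up to a power of two); the kernel helper's exact rounding at power-of-two inputs may differ. The 32678 cap is copied from the diff as committed.

```c
/*
 * Hypothetical stand-in for the kernel's lock_roundup_2(): smallest
 * power of two that is >= val. The real helper may round differently.
 */
static unsigned int
roundup_pow2(unsigned int val)
{
	unsigned int res;

	for (res = 1; res < val; res <<= 1)
		continue;
	return (res);
}

/* Illustrative subset of struct lock_delay_config. */
struct delay_config {
	unsigned int base;
	unsigned int max;
};

/* Same arithmetic as the new defaults in r332285. */
static void
delay_default_init(struct delay_config *lc, unsigned int ncpus)
{
	lc->base = 1;
	lc->max = roundup_pow2(ncpus) * 256;
	if (lc->max > 32678)	/* cap value as it appears in the commit */
		lc->max = 32678;
}

/* Convenience wrapper: the computed upper limit for a given CPU count. */
static unsigned int
delay_max_for(unsigned int ncpus)
{
	struct delay_config lc;

	delay_default_init(&lc, ncpus);
	return (lc.max);
}
```

Under these assumptions a 4-CPU box gets max = 1024, a 64-CPU box gets 16384, and a 128-thread system hits the cap. Exposing base and max as tunables would let those values be adjusted per machine instead of being fixed at boot.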
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 13:11, Slawa Olhovchenkov wrote:
> On Fri, Jan 05, 2018 at 03:50:31AM +0700, Eugene Grosbein wrote:
> > 05.01.2018 3:05, Steven Hartland wrote:
> > > Author: smh
> > > Date: Thu Jan 4 20:05:47 2018
> > > New Revision: 327559
> > > URL: https://svnweb.freebsd.org/changeset/base/327559
> > >
> > > Log:
> > >   Disabled the use of flowid for lagg by default
> > >
> > >   Disabled the use of RSS hash from the network card aka flowid for
> > >   lagg(4) interfaces by default as it's currently incompatible with the
> > >   lacp and loadbalance protocols.
> > >
> > >   The incompatibility is due to the fact that the flowid isn't known for
> > >   the first packet of a new outbound stream which can result in the hash
> > >   calculation method changing and hence a stream being incorrectly split
> > >   across multiple interfaces during normal operation.
> > >
> > >   This can be re-enabled by setting the following in loader.conf:
> > >   net.link.lagg.default_use_flowid="1"
> > >
> > >   Discussed with: kmacy
> > >   Sponsored by: Multiplay
> >
> > RSS by definition has meaning to received stream. What is "outbound"
> > stream in this context, why can the hash calculation method change
> > and what exactly does it mean "a stream being incorrectly split"?
> >
> > Defaults should not be changed so easily just because they are not
> > optimal for some specific case. Each lagg has its own setting for
> > flowid usage and why one cannot just use "ifconfig lagg0 -use_flowid"
> > for such cases?
>
> Irrelevant to RSS and etc. flowid distribution in lacp case work very bad.
> This is good and must be MFC (IMHO).

There was no concrete conclusion to this thread, and I've not had time to look into it further; it's on my open list to MFC to stable/11 in time for 11.2.

Even given the drop in performance, I think we should prefer correctness over increased performance, and since the new default can still be overridden in loader.conf I'm looking to MFC this shortly unless I get any strong objections with a clear path forward.
Regards
Steve
svn commit: r331851 - stable/11/usr.sbin/bsdinstall/scripts
Author: smh Date: Sat Mar 31 19:21:57 2018 New Revision: 331851 URL: https://svnweb.freebsd.org/changeset/base/331851 Log: MFC r320138: Fixed bsdinstall location of vfs.zfs.min_auto_ashift Sponsored by: Multiplay Modified: stable/11/usr.sbin/bsdinstall/scripts/config stable/11/usr.sbin/bsdinstall/scripts/zfsboot Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.sbin/bsdinstall/scripts/config == --- stable/11/usr.sbin/bsdinstall/scripts/configSat Mar 31 19:19:22 2018(r331850) +++ stable/11/usr.sbin/bsdinstall/scripts/configSat Mar 31 19:21:57 2018(r331851) @@ -32,7 +32,7 @@ cat $BSDINSTALL_TMPETC/rc.conf.* >> $BSDINSTALL_TMPETC/rc.conf rm $BSDINSTALL_TMPETC/rc.conf.* -cat $BSDINSTALL_CHROOT/etc/sysctl.conf $BSDINSTALL_TMPETC/sysctl.conf.hardening >> $BSDINSTALL_TMPETC/sysctl.conf +cat $BSDINSTALL_CHROOT/etc/sysctl.conf $BSDINSTALL_TMPETC/sysctl.conf.* >> $BSDINSTALL_TMPETC/sysctl.conf rm $BSDINSTALL_TMPETC/sysctl.conf.* cp $BSDINSTALL_TMPETC/* $BSDINSTALL_CHROOT/etc Modified: stable/11/usr.sbin/bsdinstall/scripts/zfsboot == --- stable/11/usr.sbin/bsdinstall/scripts/zfsboot Sat Mar 31 19:19:22 2018(r331850) +++ stable/11/usr.sbin/bsdinstall/scripts/zfsboot Sat Mar 31 19:21:57 2018(r331851) @@ -1446,7 +1446,7 @@ zfs_create_boot() if [ "$ZFSBOOT_FORCE_4K_SECTORS" ]; then f_eval_catch $funcname echo "$ECHO_APPEND" \ 'vfs.zfs.min_auto_ashift=12' \ -$BSDINSTALL_TMPBOOT/loader.conf.zfs || return $FAILURE +$BSDINSTALL_TMPETC/sysctl.conf.zfs || return $FAILURE fi if [ "$ZFSBOOT_SWAP_MIRROR" ]; then ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r331850 - stable/11/sys/net
Author: smh
Date: Sat Mar 31 19:19:22 2018
New Revision: 331850
URL: https://svnweb.freebsd.org/changeset/base/331850

Log:
  MFC r328321: Added missing CTLFLAG_VNET to lacp default_strict_mode

  Sponsored by: Multiplay

Modified:
  stable/11/sys/net/ieee8023ad_lacp.c
Directory Properties:
  stable/11/ (props changed)

Modified: stable/11/sys/net/ieee8023ad_lacp.c
==
--- stable/11/sys/net/ieee8023ad_lacp.c	Sat Mar 31 19:18:07 2018	(r331849)
+++ stable/11/sys/net/ieee8023ad_lacp.c	Sat Mar 31 19:19:22 2018	(r331850)
@@ -197,8 +197,8 @@ SYSCTL_INT(_net_link_lagg_lacp, OID_AUTO, debug, CTLFL
     &VNET_NAME(lacp_debug), 0, "Enable LACP debug logging (1=debug, 2=trace)");
 
 static VNET_DEFINE(int, lacp_default_strict_mode) = 1;
-SYSCTL_INT(_net_link_lagg_lacp, OID_AUTO, default_strict_mode, CTLFLAG_RWTUN,
-    &VNET_NAME(lacp_default_strict_mode), 0,
+SYSCTL_INT(_net_link_lagg_lacp, OID_AUTO, default_strict_mode,
+    CTLFLAG_RWTUN | CTLFLAG_VNET, &VNET_NAME(lacp_default_strict_mode), 0,
     "LACP strict protocol compliance default");
 
 #define LACP_DPRINTF(a) if (V_lacp_debug & 0x01) { lacp_dprintf a ; }
svn commit: r331849 - stable/11/sys/dev/mps
Author: smh Date: Sat Mar 31 19:18:07 2018 New Revision: 331849 URL: https://svnweb.freebsd.org/changeset/base/331849 Log: MFC r330951: Fix mps deadlock when handling panic Sponsored by: Multiplay Modified: stable/11/sys/dev/mps/mps_sas_lsi.c stable/11/sys/dev/mps/mpsvar.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/mps/mps_sas_lsi.c == --- stable/11/sys/dev/mps/mps_sas_lsi.c Sat Mar 31 19:16:25 2018 (r331848) +++ stable/11/sys/dev/mps/mps_sas_lsi.c Sat Mar 31 19:18:07 2018 (r331849) @@ -50,6 +50,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include @@ -124,7 +125,7 @@ int mpssas_get_sas_address_for_sata_disk(struct mps_so u64 *sas_address, u16 handle, u32 device_info, u8 *is_SATA_SSD); static int mpssas_volume_add(struct mps_softc *sc, u16 handle); -static void mpssas_SSU_to_SATA_devices(struct mps_softc *sc); +static void mpssas_SSU_to_SATA_devices(struct mps_softc *sc, int howto); static void mpssas_stop_unit_done(struct cam_periph *periph, union ccb *done_ccb); @@ -1112,7 +1113,7 @@ out: * Return nothing. */ static void -mpssas_SSU_to_SATA_devices(struct mps_softc *sc) +mpssas_SSU_to_SATA_devices(struct mps_softc *sc, int howto) { struct mpssas_softc *sassc = sc->sassc; union ccb *ccb; @@ -1120,7 +1121,7 @@ mpssas_SSU_to_SATA_devices(struct mps_softc *sc) target_id_t targetid; struct mpssas_target *target; char path_str[64]; - struct timeval cur_time, start_time; + int timeout; /* * For each target, issue a StartStopUnit command to stop the device. @@ -1183,17 +1184,23 @@ mpssas_SSU_to_SATA_devices(struct mps_softc *sc) } /* -* Wait until all of the SSU commands have completed or time has -* expired (60 seconds). Pause for 100ms each time through. If any -* command times out, the target will be reset in the SCSI command -* timeout routine. +* Timeout after 60 seconds by default or 10 seconds if howto has +* RB_NOSYNC set which indicates we're likely handling a panic. 
*/ - getmicrotime(_time); - while (sc->SSU_refcount) { + timeout = 600; + if (howto & RB_NOSYNC) + timeout = 100; + + /* +* Wait until all of the SSU commands have completed or timeout has +* expired. Pause for 100ms each time through. If any command +* times out, the target will be reset in the SCSI command timeout +* routine. +*/ + while (sc->SSU_refcount > 0) { pause("mpswait", hz/10); - getmicrotime(_time); - if ((cur_time.tv_sec - start_time.tv_sec) > 60) { + if (--timeout == 0) { mps_dprint(sc, MPS_FAULT, "Time has expired waiting " "for SSU commands to complete.\n"); break; @@ -1235,7 +1242,7 @@ mpssas_stop_unit_done(struct cam_periph *periph, union * Return nothing. */ void -mpssas_ir_shutdown(struct mps_softc *sc) +mpssas_ir_shutdown(struct mps_softc *sc, int howto) { u16 volume_mapping_flags; u16 ioc_pg8_flags = le16toh(sc->ioc_pg8.Flags); @@ -1340,5 +1347,5 @@ out: } } } - mpssas_SSU_to_SATA_devices(sc); + mpssas_SSU_to_SATA_devices(sc, howto); } Modified: stable/11/sys/dev/mps/mpsvar.h == --- stable/11/sys/dev/mps/mpsvar.h Sat Mar 31 19:16:25 2018 (r331848) +++ stable/11/sys/dev/mps/mpsvar.h Sat Mar 31 19:18:07 2018 (r331849) @@ -722,7 +722,7 @@ int mps_config_get_volume_wwid(struct mps_softc *sc, u int mps_config_get_raid_pd_pg0(struct mps_softc *sc, Mpi2ConfigReply_t *mpi_reply, Mpi2RaidPhysDiskPage0_t *config_page, u32 page_address); -void mpssas_ir_shutdown(struct mps_softc *sc); +void mpssas_ir_shutdown(struct mps_softc *sc, int howto); int mps_reinit(struct mps_softc *sc); void mpssas_handle_reinit(struct mps_softc *sc); ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r331848 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Author: smh Date: Sat Mar 31 19:16:25 2018 New Revision: 331848 URL: https://svnweb.freebsd.org/changeset/base/331848 Log: MFC r330950: Prevent ZFS TRIM breaking VTOC8 partitions Sponsored by: Multiplay Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c == --- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Sat Mar 31 17:28:30 2018(r331847) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Sat Mar 31 19:16:25 2018(r331848) @@ -728,7 +728,9 @@ vdev_label_init(vdev_t *vd, uint64_t crtxg, vdev_label } /* -* TRIM the whole thing so that we start with a clean slate. +* TRIM the whole thing, excluding the blank space and boot header +* as specified by ZFS On-Disk Specification (section 1.3), so that +* we start with a clean slate. * It's just an optimization, so we don't care if it fails. * Don't TRIM if removing so that we don't interfere with zpool * disaster recovery. @@ -736,7 +738,8 @@ vdev_label_init(vdev_t *vd, uint64_t crtxg, vdev_label if (zfs_trim_enabled && vdev_trim_on_init && !vd->vdev_notrim && (reason == VDEV_LABEL_CREATE || reason == VDEV_LABEL_SPARE || reason == VDEV_LABEL_L2CACHE)) - zio_wait(zio_trim(NULL, spa, vd, 0, vd->vdev_psize)); + zio_wait(zio_trim(NULL, spa, vd, VDEV_SKIP_SIZE, + vd->vdev_psize - VDEV_SKIP_SIZE)); /* * Initialize its label. ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r331209 - head
I think it would be worth specifically detailing the steps to achieve this, as it's not immediately obvious how this would be done.

On 19/03/2018 15:27, Kyle Evans wrote:
> Author: kevans
> Date: Mon Mar 19 15:27:53 2018
> New Revision: 331209
> URL: https://svnweb.freebsd.org/changeset/base/331209
>
> Log:
>   Add note to UPDATING about UEFI changes requiring loader(8) update
>
>   These problems have only been observed with boards using U-Boot (e.g. ARM)
>   where virtual addresses are already set in the memory map by the firmware
>   and the firmware is expecting a call to SetVirtualAddressMap to be made.
>
>   I refrain from mentioning this in the note because this could also be the
>   case on some not-yet-tested firmware on amd64 and it's not a bad
>   recommendation for the general case.
>
> Modified:
>   head/UPDATING
>
> Modified: head/UPDATING
> ==
> --- head/UPDATING	Mon Mar 19 15:11:10 2018	(r331208)
> +++ head/UPDATING	Mon Mar 19 15:27:53 2018	(r331209)
> @@ -51,6 +51,13 @@ NOTE TO PEOPLE WHO THINK THAT FreeBSD 12.x IS SLOW:
>  
>  ** SPECIAL WARNING: **
>  
> +20180319:
> +	For UEFI systems: the UEFI loader(8), loader.efi, should be updated in
> +	conjunction with installing a new kernel after r330868. The kernel,
> +	after this revision, will be more lenient when mapping addresses for
> +	UEFI Runtime Services and this may result in a kernel panic without the
> +	corresponding loader(8) update.
> +
>  20180212:
>  	FreeBSD boot loader enhanced with Lua scripting. It's purely opt-in for
>  	now by building WITH_LOADER_LUA and WITHOUT_FORTH in /etc/src.conf.
svn commit: r330951 - head/sys/dev/mps
Author: smh Date: Wed Mar 14 21:32:23 2018 New Revision: 330951 URL: https://svnweb.freebsd.org/changeset/base/330951 Log: Fix mps deadlock when handling panic During shutdown mps waits for its SSU requests to complete however when performing a reboot after handling a panic the scheduler is stopped so getmicrotime which is used can be non-functional. Switch to using the same method as shutdown_panic to ensure we actually complete. In addition reduce the timeout when RB_NOSYNC is set in howto as we expect this to fail. Reviewed by: slm MFC after:1 week Sponsored by: Multiplay Differential Revision:https://reviews.freebsd.org/D12776 Modified: head/sys/dev/mps/mps_sas_lsi.c head/sys/dev/mps/mpsvar.h Modified: head/sys/dev/mps/mps_sas_lsi.c == --- head/sys/dev/mps/mps_sas_lsi.c Wed Mar 14 21:21:03 2018 (r330950) +++ head/sys/dev/mps/mps_sas_lsi.c Wed Mar 14 21:32:23 2018 (r330951) @@ -52,6 +52,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include @@ -126,7 +127,7 @@ int mpssas_get_sas_address_for_sata_disk(struct mps_so u64 *sas_address, u16 handle, u32 device_info, u8 *is_SATA_SSD); static int mpssas_volume_add(struct mps_softc *sc, u16 handle); -static void mpssas_SSU_to_SATA_devices(struct mps_softc *sc); +static void mpssas_SSU_to_SATA_devices(struct mps_softc *sc, int howto); static void mpssas_stop_unit_done(struct cam_periph *periph, union ccb *done_ccb); @@ -1122,7 +1123,7 @@ out: * Return nothing. */ static void -mpssas_SSU_to_SATA_devices(struct mps_softc *sc) +mpssas_SSU_to_SATA_devices(struct mps_softc *sc, int howto) { struct mpssas_softc *sassc = sc->sassc; union ccb *ccb; @@ -1130,7 +1131,7 @@ mpssas_SSU_to_SATA_devices(struct mps_softc *sc) target_id_t targetid; struct mpssas_target *target; char path_str[64]; - struct timeval cur_time, start_time; + int timeout; /* * For each target, issue a StartStopUnit command to stop the device. 
@@ -1193,17 +1194,23 @@ mpssas_SSU_to_SATA_devices(struct mps_softc *sc) } /* -* Wait until all of the SSU commands have completed or time has -* expired (60 seconds). Pause for 100ms each time through. If any -* command times out, the target will be reset in the SCSI command -* timeout routine. +* Timeout after 60 seconds by default or 10 seconds if howto has +* RB_NOSYNC set which indicates we're likely handling a panic. */ - getmicrotime(_time); - while (sc->SSU_refcount) { + timeout = 600; + if (howto & RB_NOSYNC) + timeout = 100; + + /* +* Wait until all of the SSU commands have completed or timeout has +* expired. Pause for 100ms each time through. If any command +* times out, the target will be reset in the SCSI command timeout +* routine. +*/ + while (sc->SSU_refcount > 0) { pause("mpswait", hz/10); - getmicrotime(_time); - if ((cur_time.tv_sec - start_time.tv_sec) > 60) { + if (--timeout == 0) { mps_dprint(sc, MPS_FAULT, "Time has expired waiting " "for SSU commands to complete.\n"); break; @@ -1245,7 +1252,7 @@ mpssas_stop_unit_done(struct cam_periph *periph, union * Return nothing. 
*/ void -mpssas_ir_shutdown(struct mps_softc *sc) +mpssas_ir_shutdown(struct mps_softc *sc, int howto) { u16 volume_mapping_flags; u16 ioc_pg8_flags = le16toh(sc->ioc_pg8.Flags); @@ -1350,5 +1357,5 @@ out: } } } - mpssas_SSU_to_SATA_devices(sc); + mpssas_SSU_to_SATA_devices(sc, howto); } Modified: head/sys/dev/mps/mpsvar.h == --- head/sys/dev/mps/mpsvar.h Wed Mar 14 21:21:03 2018(r330950) +++ head/sys/dev/mps/mpsvar.h Wed Mar 14 21:32:23 2018(r330951) @@ -772,7 +772,7 @@ int mps_config_get_volume_wwid(struct mps_softc *sc, u int mps_config_get_raid_pd_pg0(struct mps_softc *sc, Mpi2ConfigReply_t *mpi_reply, Mpi2RaidPhysDiskPage0_t *config_page, u32 page_address); -void mpssas_ir_shutdown(struct mps_softc *sc); +void mpssas_ir_shutdown(struct mps_softc *sc, int howto); int mps_reinit(struct mps_softc *sc); void mpssas_handle_reinit(struct mps_softc *sc); ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
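The counter-based wait above can be exercised outside the kernel. The sketch below is a userland model, not driver code: RB_NOSYNC_FLAG, ssu_poll_budget() and simulate_wait() are hypothetical names, the flag value is illustrative, and the decrement of pending stands in for pause("mpswait", hz/10) plus whatever completions arrive during that 100ms. What it shows is that the timeout no longer depends on reading a clock, only on counting polls.

```c
/* Hypothetical stand-in for RB_NOSYNC; the value is illustrative only. */
#define	RB_NOSYNC_FLAG	0x004

/*
 * Number of 100ms polls allowed: 600 (about 60 seconds) by default,
 * 100 (about 10 seconds) when the no-sync flag suggests a panic reboot.
 */
static int
ssu_poll_budget(int howto)
{
	return ((howto & RB_NOSYNC_FLAG) ? 100 : 600);
}

/*
 * Model of the wait loop: 'pending' plays the role of SSU_refcount and
 * 'done_per_poll' is how many commands complete during each 100ms pause.
 * Returns the number of polls used, or -1 if the budget ran out.
 * As in the patch, the budget check follows the pause, so a completion
 * that lands on the very last poll is still reported as a timeout.
 */
static int
simulate_wait(int pending, int done_per_poll, int howto)
{
	int timeout = ssu_poll_budget(howto);
	int polls = 0;

	while (pending > 0) {
		pending -= done_per_poll;	/* stand-in for pause() */
		polls++;
		if (--timeout == 0)
			return (-1);
	}
	return (polls);
}
```

With these assumptions, three outstanding commands that each take one poll finish after three polls, while a device that never answers gives up after 600 polls normally and 100 polls on the no-sync path.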
Re: svn commit: r330950 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Missed the differential review: https://reviews.freebsd.org/D14695 On 14/03/2018 21:21, Steven Hartland wrote: Author: smh Date: Wed Mar 14 21:21:03 2018 New Revision: 330950 URL: https://svnweb.freebsd.org/changeset/base/330950 Log: Prevent ZFS TRIM breaking VTOC8 partitions Update the ZFS TRIM code to ensure it respects VTOC8 partition headers as documented by the ZFS On-Disk Specification section 1.3 Before this a zpool create on a VTOC8 partitioned device would overwrite the partition metadata. Reported by: marius Reviewed by: marius agv MFC after: 1 week Sponsored by:Multiplay Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c == --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.cWed Mar 14 21:11:41 2018(r330949) +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.cWed Mar 14 21:21:03 2018(r330950) @@ -802,7 +802,9 @@ vdev_label_init(vdev_t *vd, uint64_t crtxg, vdev_label } /* -* TRIM the whole thing so that we start with a clean slate. +* TRIM the whole thing, excluding the blank space and boot header +* as specified by ZFS On-Disk Specification (section 1.3), so that +* we start with a clean slate. * It's just an optimization, so we don't care if it fails. * Don't TRIM if removing so that we don't interfere with zpool * disaster recovery. @@ -810,7 +812,8 @@ vdev_label_init(vdev_t *vd, uint64_t crtxg, vdev_label if (zfs_trim_enabled && vdev_trim_on_init && !vd->vdev_notrim && (reason == VDEV_LABEL_CREATE || reason == VDEV_LABEL_SPARE || reason == VDEV_LABEL_L2CACHE)) - zio_wait(zio_trim(NULL, spa, vd, 0, vd->vdev_psize)); + zio_wait(zio_trim(NULL, spa, vd, VDEV_SKIP_SIZE, + vd->vdev_psize - VDEV_SKIP_SIZE)); /* * Initialize its label. ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r330950 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Author: smh Date: Wed Mar 14 21:21:03 2018 New Revision: 330950 URL: https://svnweb.freebsd.org/changeset/base/330950 Log: Prevent ZFS TRIM breaking VTOC8 partitions Update the ZFS TRIM code to ensure it respects VTOC8 partition headers as documented by the ZFS On-Disk Specification section 1.3 Before this a zpool create on a VTOC8 partitioned device would overwrite the partition metadata. Reported by: marius Reviewed by: marius agv MFC after:1 week Sponsored by: Multiplay Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c == --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.cWed Mar 14 21:11:41 2018(r330949) +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.cWed Mar 14 21:21:03 2018(r330950) @@ -802,7 +802,9 @@ vdev_label_init(vdev_t *vd, uint64_t crtxg, vdev_label } /* -* TRIM the whole thing so that we start with a clean slate. +* TRIM the whole thing, excluding the blank space and boot header +* as specified by ZFS On-Disk Specification (section 1.3), so that +* we start with a clean slate. * It's just an optimization, so we don't care if it fails. * Don't TRIM if removing so that we don't interfere with zpool * disaster recovery. @@ -810,7 +812,8 @@ vdev_label_init(vdev_t *vd, uint64_t crtxg, vdev_label if (zfs_trim_enabled && vdev_trim_on_init && !vd->vdev_notrim && (reason == VDEV_LABEL_CREATE || reason == VDEV_LABEL_SPARE || reason == VDEV_LABEL_L2CACHE)) - zio_wait(zio_trim(NULL, spa, vd, 0, vd->vdev_psize)); + zio_wait(zio_trim(NULL, spa, vd, VDEV_SKIP_SIZE, + vd->vdev_psize - VDEV_SKIP_SIZE)); /* * Initialize its label. ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
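For anyone checking the offsets, here is a small sketch of the range the patched zio_trim() call covers. The constants are assumptions rather than values quoted in this commit: it takes the blank space and the boot header from section 1.3 of the on-disk specification to be 8K each, so the skipped region is 16K. trim_range and initial_trim_range() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Assumed sizes: 8K pad, and a skip region of two pads (16K). */
#define	SKETCH_VDEV_PAD_SIZE	(8ULL << 10)
#define	SKETCH_VDEV_SKIP_SIZE	(SKETCH_VDEV_PAD_SIZE * 2)

/* Illustrative (offset, size) pair as passed to zio_trim(). */
struct trim_range {
	uint64_t offset;
	uint64_t size;
};

/*
 * Range trimmed at label init after the fix: start past the skipped
 * region and stop at the end of the device, so the partition metadata
 * at the front is left alone. Before the fix the range was (0, psize).
 */
static struct trim_range
initial_trim_range(uint64_t psize)
{
	struct trim_range r;

	r.offset = SKETCH_VDEV_SKIP_SIZE;
	r.size = psize - SKETCH_VDEV_SKIP_SIZE;
	return (r);
}
```

For a 1 GiB vdev this gives offset 16384 and size 1073725440, which still ends exactly at vdev_psize.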
Re: svn commit: r329812 - head/sys/cam
In our experience this is very device dependent; what led you to this conclusion?

On 22/02/2018 05:43, Warner Losh wrote:
> Author: imp
> Date: Thu Feb 22 05:43:20 2018
> New Revision: 329812
> URL: https://svnweb.freebsd.org/changeset/base/329812
>
> Log:
>   Don't sort TRIMs.
>
>   While the code for ada and da both assume that the trim list is ordered
>   when doing the coalescing the TRIMs, it turns out that creating the
>   sorted list uses more resources than are saved by having slightly fewer
>   trims sent to the device.
>
>   Sponsored by: Netflix
>
> Modified:
>   head/sys/cam/cam_iosched.c
>
> Modified: head/sys/cam/cam_iosched.c
> ==
> --- head/sys/cam/cam_iosched.c	Thu Feb 22 04:30:52 2018	(r329811)
> +++ head/sys/cam/cam_iosched.c	Thu Feb 22 05:43:20 2018	(r329812)
> @@ -1392,7 +1392,7 @@ cam_iosched_queue_work(struct cam_iosched_softc *isc,
>  	 * the work on the bio queue.
>  	 */
>  	if (bp->bio_cmd == BIO_DELETE) {
> -		bioq_disksort(&isc->trim_queue, bp);
> +		bioq_insert_tail(&isc->trim_queue, bp);
>  #ifdef CAM_IOSCHED_DYNAMIC
>  		isc->trim_stats.in++;
>  		isc->trim_stats.queued++;
Re: svn commit: r328996 - head/sys/kern
What would be the expected behavior if this was triggered, app crash or kernel panic...? On 07/02/2018 21:52, Andriy Gapon wrote: Author: avg Date: Wed Feb 7 21:51:59 2018 New Revision: 328996 URL: https://svnweb.freebsd.org/changeset/base/328996 Log: exec_map_first_page: fix an inverse condition introduced in r254138 While the bug itself was serious, as we could either pass a non-busied page to vm_pager_get_pages() or leak a busy page, it could only be triggered under a very rare condition where the page is already inserted into the object, but it is not valid yet. Reviewed by: kib MFC after: 2 weeks Modified: head/sys/kern/kern_exec.c Modified: head/sys/kern/kern_exec.c == --- head/sys/kern/kern_exec.c Wed Feb 7 20:36:37 2018(r328995) +++ head/sys/kern/kern_exec.c Wed Feb 7 21:51:59 2018(r328996) @@ -1009,7 +1009,7 @@ exec_map_first_page(imgp) if ((ma[i] = vm_page_next(ma[i - 1])) != NULL) { if (ma[i]->valid) break; - if (vm_page_tryxbusy(ma[i])) + if (!vm_page_tryxbusy(ma[i])) break; } else { ma[i] = vm_page_alloc(object, i, ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
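To make the inverted test easier to see, here is a toy model of the read-ahead loop. It is not the kernel VM code: struct page, page_tryxbusy(), collect_run() and the demo helpers are hypothetical, and the only assumption carried over is the try-lock convention implied by the fix, namely that the busy attempt returns nonzero on success. It does not say what the kernel does once the bad state is reached, only how that state arises.

```c
#include <stdbool.h>

#define	NPAGES	4

/* Toy page: 'valid' means already populated, 'busy' means exclusively held. */
struct page {
	bool valid;
	bool busy;
};

/* Try-lock convention: returns true when we managed to busy the page. */
static bool
page_tryxbusy(struct page *p)
{
	if (p->busy)
		return (false);
	p->busy = true;
	return (true);
}

/*
 * Walk the pages and return how many can join the run.
 * fixed == true breaks when the busy attempt FAILS (r328996);
 * fixed == false breaks when it SUCCEEDS (the inverted condition).
 */
static int
collect_run(struct page *pages, int n, bool fixed)
{
	int i;

	for (i = 0; i < n; i++) {
		if (pages[i].valid)
			break;
		bool got = page_tryxbusy(&pages[i]);
		if (fixed ? !got : got)
			break;
	}
	return (i);
}

/* Run length over NPAGES pages that are all invalid and not busy. */
static int
demo_run(bool fixed)
{
	struct page pages[NPAGES] = {{false, false}};

	return (collect_run(pages, NPAGES, fixed));
}

/* Pages left busy although they are not part of the returned run. */
static int
demo_leaked(bool fixed)
{
	struct page pages[NPAGES] = {{false, false}};
	int run = collect_run(pages, NPAGES, fixed);
	int leaked = 0;

	for (int i = run; i < NPAGES; i++)
		if (pages[i].busy)
			leaked++;
	return (leaked);
}
```

In this model the fixed condition collects all four pages and leaves nothing behind, while the inverted one stops at the first page it successfully busied and leaves it busy outside the run, which corresponds to the "leak a busy page" case in the log.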
Re: svn commit: r328625 - in head/sys: amd64/amd64 amd64/ia32 amd64/include dev/cpuctl i386/i386 x86/include x86/x86
Pretty sure I’ve seen that too

On Wed, 31 Jan 2018 at 18:05, Rodney W. Grimes <free...@pdx.rh.cn85.dnsmgr.net> wrote:

> > On Wed, Jan 31, 2018 at 02:56:24PM +, Bjoern A. Zeeb wrote:
> > > On 31 Jan 2018, at 14:36, Konstantin Belousov wrote:
> > >
> > > > Author: kib
> > > > Date: Wed Jan 31 14:36:27 2018
> > > > New Revision: 328625
> > > > URL: https://svnweb.freebsd.org/changeset/base/328625
> > > >
> > > > Log:
> > > >   IBRS support, AKA Spectre hardware mitigation.
> > >
> > > >   For existing processors, you need a microcode update which adds IBRS
> > > >   CPU features, and to manually enable it by setting the tunable/sysctl
> > > >   hw.ibrs_disable to 0. Current status can be checked in sysctl
> > > >   hw.ibrs_active. The mitigation might be inactive if the CPU feature
> > >
> > > Can you change the tunable/sysctl to hw.ibrs_enable[d] (and toggle the
> > > default setting along).
> > This is done consistently with the hw.clflush_disable.
> > Anyway, the intent is that the knob will be used for disabling,
> > since defaults are going to be changed in the near future.
>
> I thought we had something some place that said negative assertions
> should be avoided if possible.
>
> > > I find it highly confusing to have two different sysctls "disable"
> > > and "active" and a lot
> > > of people (and cultures) have trouble with the double negative.
> > > Also the "enable[d]" variant seems to be pre-dominant in the kernel.
> > >
> > > Also can we spell IBRS in the sysctl description as "Indirect Branch
> > > Restricted Speculation (IBRS)
> > Will do in half a hour.
>
> --
> Rod Grimes
> rgri...@freebsd.org
svn commit: r328321 - head/sys/net
Author: smh
Date: Wed Jan 24 10:13:14 2018
New Revision: 328321
URL: https://svnweb.freebsd.org/changeset/base/328321

Log:
  Added missing CTLFLAG_VNET to lacp default_strict_mode

  Added CTLFLAG_VNET to net.link.lagg.lacp.default_strict_mode which was
  missed in r290450.

  Reported by: julian@
  MFC after: 1 week
  Sponsored by: Multiplay

Modified:
  head/sys/net/ieee8023ad_lacp.c

Modified: head/sys/net/ieee8023ad_lacp.c
==
--- head/sys/net/ieee8023ad_lacp.c	Wed Jan 24 07:54:05 2018	(r328320)
+++ head/sys/net/ieee8023ad_lacp.c	Wed Jan 24 10:13:14 2018	(r328321)
@@ -201,8 +201,8 @@ SYSCTL_INT(_net_link_lagg_lacp, OID_AUTO, debug, CTLFL
     &VNET_NAME(lacp_debug), 0, "Enable LACP debug logging (1=debug, 2=trace)");
 
 static VNET_DEFINE(int, lacp_default_strict_mode) = 1;
-SYSCTL_INT(_net_link_lagg_lacp, OID_AUTO, default_strict_mode, CTLFLAG_RWTUN,
-    &VNET_NAME(lacp_default_strict_mode), 0,
+SYSCTL_INT(_net_link_lagg_lacp, OID_AUTO, default_strict_mode,
+    CTLFLAG_RWTUN | CTLFLAG_VNET, &VNET_NAME(lacp_default_strict_mode), 0,
     "LACP strict protocol compliance default");
 
 #define LACP_DPRINTF(a) if (V_lacp_debug & 0x01) { lacp_dprintf a ; }
Re: svn commit: r328136 - in head/etc: defaults rc.d
Did you intend to add the growfs option at the same time? It wasn’t mentioned in the commit msg. On Thu, 18 Jan 2018 at 20:46, Brad Davis wrote: > Author: brd (doc,ports committer) > Date: Thu Jan 18 20:45:41 2018 > New Revision: 328136 > URL: https://svnweb.freebsd.org/changeset/base/328136 > > Log: > Teach the resolv startup script to respect its enable flag. > > Reviewed by: will, imp > Approved by: imp > > Modified: > head/etc/defaults/rc.conf > head/etc/rc.d/resolv > > Modified: head/etc/defaults/rc.conf > > == > --- head/etc/defaults/rc.conf Thu Jan 18 20:12:12 2018(r328135) > +++ head/etc/defaults/rc.conf Thu Jan 18 20:45:41 2018(r328136) > @@ -96,6 +96,7 @@ fsck_y_enable="NO"# Set to YES to do fsck -y if the i > fsck_y_flags="-T ffs:-R -T ufs:-R" # Additional flags for fsck -y > background_fsck="YES" # Attempt to run fsck in the background where > possible. > background_fsck_delay="60" # Time to wait (seconds) before starting the > fsck. > +growfs_enable="NO" # Set to YES to attempt to grow the root > filesystem on boot > netfs_types="nfs:NFS smbfs:SMB" # Net filesystems. > extra_netfs_types="NO" # List of network extra filesystem types for > delayed > # mount at startup (or NO). > @@ -276,6 +277,7 @@ ctld_enable="NO"# CAM Target Layer / iSCSI > target da > local_unbound_enable="NO" # local caching resolver > blacklistd_enable="NO" # Run blacklistd daemon (YES/NO). > blacklistd_flags=""# Optional flags for blacklistd(8). > +resolv_enable="YES"# Enable resolv / resolvconf > > # > # kerberos.
Do not run the admin daemons on slave servers > > Modified: head/etc/rc.d/resolv > > == > --- head/etc/rc.d/resolvThu Jan 18 20:12:12 2018(r328135) > +++ head/etc/rc.d/resolvThu Jan 18 20:45:41 2018(r328136) > @@ -35,6 +35,7 @@ > > name="resolv" > desc="Create /etc/resolv.conf from kenv" > +start_cmd="${name}_start" > stop_cmd=':' > > load_rc_config $name > @@ -42,17 +43,20 @@ load_rc_config $name > # if the info is available via dhcp/kenv > # build the resolv.conf > # > -if [ -n "`/bin/kenv dhcp.domain-name-servers 2> /dev/null`" ]; then > - interface="`/bin/kenv boot.netif.name`" > - ( > - if [ -n "`/bin/kenv dhcp.domain-name 2> /dev/null`" ]; then > - echo domain `/bin/kenv dhcp.domain-name` > +resolv_start() > +{ > + if [ -n "`/bin/kenv dhcp.domain-name-servers 2> /dev/null`" ]; then > + interface="`/bin/kenv boot.netif.name`" > + ( > + if [ -n "`/bin/kenv dhcp.domain-name 2> /dev/null`" ]; then > + echo domain `/bin/kenv dhcp.domain-name` > + fi > + > + set -- `/bin/kenv dhcp.domain-name-servers` > + for ns in `IFS=','; echo $*`; do > + echo nameserver $ns > + done > + ) | /sbin/resolvconf -a ${interface}:dhcp4 > fi > - > - set -- `/bin/kenv dhcp.domain-name-servers` > - for ns in `IFS=','; echo $*`; do > - echo nameserver $ns > - done > - ) | /sbin/resolvconf -a ${interface}:dhcp4 > -fi > +} > > > ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 23:30, Scott Long wrote: On Jan 5, 2018, at 11:20 AM, Eugene Grosbeinwrote: CC'ng scottl@ as author of the change in question. 06.01.2018 0:39, Matt Joras wrote: For what it's worth, this was the conclusion I came to, and at Isilon we've made the same change being discussed here. For the case of drivers that end up using a queue index for the flowid, you end up with pathological behavior on the lagg; the flowid ends up getting right shifted by (default) 16. So in the case of e.g. two bxe(4) interfaces with 4 queues, you always end up choosing the interface in the lagg at index 0. Then why does if_lagg shifts 16 bits by default? Is seems senseless. This was introduced with r260070 by scottl: At the time, we were using cxgbe interfaces which inserted a reasonable RSS hash into the flowid field. The shift was done to expose different bits to the index/modulo scheme used by the LACP module vs the cxgbe module. In hindsight, I should not have set a default value of 16, I should have left it at zero so that default behavior would be preserved. Multi-queue NIC drivers and multi-port lagg tend to use the same lower bits of the flowid as each other, resulting in a poor distribution of packets among queues in certain cases. Work around this by adding a set of sysctls for controlling a bit-shift on the flowid when doing multi-port aggrigation in lagg and lacp. By default, lagg/lacp will now use bits 16 and higher instead of 0 and higher. Reviewed by:max Obtained from: Netflix MFC after: 3 days This commit message does not point to an example of NIC driver that would set non-zero bits 16 and higher for flowid so that shift result would be non-zero. Do we really have such a driver? Yes. Anyway, this lagg's default seems to be very driver-centric. For example, Intel driver family also do not use such high bits for flowid just like mentioned bxe(4). 
scottl@moe:~/svn/head/sys/dev % grep -r iri_flowid * bnxt/bnxt_txrx.c: ri->iri_flowid = le32toh(rcp->rss_hash); bnxt/bnxt_txrx.c: ri->iri_flowid = le32toh(tpas->low.rss_hash); e1000/em_txrx.c:ri->iri_flowid = le32toh(rxd->wb.lower.hi_dword.rss); e1000/igb_txrx.c: ri->iri_flowid = ixgbe/ix_txrx.c:ri->iri_flowid = le32toh(rxd->wb.lower.hi_dword.rss); The number of drivers that set m_pkhhdr.flowid directly to an RSS hash looks to be: cxgb cxgbe mlx4 mlx5 qlnx qlxgbe qlxge vmxnet3 Maybe the hardware doesn’t do a great job with generating a useful RSS hash, but that’s tangential to this conversation. We should change flowid_shift default to 0 for if_lagg(4), shouldn't we? In the short term, yes. What I see is that it’s too expensive to do a hash calculation on every TX packet in LACP (for anything resembling line rate), and flowid is unreliable when a connection is initiated without an RX packet triggering it. I don’t understand the consequences of the TX initiation problem well enough to offer a solution. For the problem of flowid being used inconsistently by drivers (i.e. not populating all 32 bits or using a weak RSS), that’s really a driver problem. What I’d recommend is that the LACP code check for M_HASHTYPE_NONE and M_HASHTYPE_OPAQUE and calculate a new hash if either are set (effectively ignoring the flowid). It’s then up to the drivers to set the correct hash type that corresponds with what they’re putting into the flowid field. An opaque type would mean that the value is useful to the driver but should not be considered useful anywhere else. You’ll get more correct and less surprising behavior from that, at the penalty of every TX packet requiring a hash calculation for hardware/drivers that are crummy. 
Mixing the hash methods causes problems with out-of-order packets, even when only the first packet is affected. Using a hash which is not what's configured by lagghash is confusing at best, so that could be fixed to say "flowid" if that's what's going to happen, or perhaps updated to the hash type that flowid represents if that's possible. LACP already checks for M_HASHTYPE_NONE if use_flowid is enabled and manually calculates a hash, which is what leads to the first-packet port selection inconsistency. It's not clear what all the implications of that inconsistency are; they will likely depend on a large number of factors including network hardware, layout, and destination host and config. When we've seen it in the 3rd and 4th packets of a stream, it leads to the destination sending RST, dropping the stream on the floor for 2% of all streams on our test box with 2 x ixgbe interfaces.

Questions:
1. Is it possible to determine the hash method used by the HW and use that for all first packets?
2. Is it possible to significantly improve the performance of the CPU hashing?
3. Is it possible to tell from the mbuf that it's part of a connection instigated from the current host?

Regards
Steve
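To make the recommendation above concrete (check for M_HASHTYPE_NONE and M_HASHTYPE_OPAQUE and calculate a new hash if either is set), here is a minimal Python sketch of the decision only. It is not the lagg or LACP code: the string constants are stand-ins for the mbuf hash types, HASHTYPE_RSS is a placeholder for "any real RSS type", and software_hash() uses CRC32 purely as a deterministic substitute for whatever lagghash is configured to compute.

```python
import zlib

# Stand-ins for the mbuf hash types named in the thread (assumed values).
M_HASHTYPE_NONE = "none"
M_HASHTYPE_OPAQUE = "opaque"
HASHTYPE_RSS = "rss"  # hypothetical placeholder for a genuine RSS hash type


def software_hash(flow_tuple):
    """Deterministic software hash of the flow tuple (CRC32 as a stand-in)."""
    return zlib.crc32(repr(flow_tuple).encode())


def hash_for_tx(hashtype, flowid, flow_tuple):
    """Trust flowid only when the driver says it is a real RSS hash."""
    if hashtype in (M_HASHTYPE_NONE, M_HASHTYPE_OPAQUE):
        # flowid is absent or only meaningful to the driver: ignore it.
        return software_hash(flow_tuple)
    return flowid


flow = ("192.0.2.10", 50000, "198.51.100.20", 443)
opaque_choice = hash_for_tx(M_HASHTYPE_OPAQUE, 3, flow)
rss_choice = hash_for_tx(HASHTYPE_RSS, 0x12345678, flow)
```

The cost being traded is the one described above: every TX packet from a driver that reports NONE or OPAQUE pays for a software hash.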
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 17:39, Matt Joras wrote: On Fri, Jan 5, 2018 at 9:32 AM, Eugene Grosbein wrote: 06.01.2018 0:28, Matt Joras wrote: For what it's worth, this was the conclusion I came to, and at Isilon we've made the same change being discussed here. For the case of drivers that end up using a queue index for the flowid, you end up with pathological behavior on the lagg; the flowid ends up getting right shifted by (default) 16. So in the case of e.g. two bxe(4) interfaces with 4 queues, you always end up choosing the interface in the lagg at index 0. Not all drivers have this bug. These are drivers that need to be fixed to not shift by 16, not lagg. I don't follow. It is if_lagg that does the shifting. For loadbalance it is done directly in lagg_snd_tag_alloc, and for LACP it is done in a separate function, lacp_select_tx_port_by_hash. For both it shifts the flowid by the flowid_shift set on the lagg sc, which defaults to 16. For reference lacp_select_tx_port is the normal method, lacp_select_tx_port_by_hash is only used if RATELIMIT is enabled. They both do the same shift though, so ... You could make the argument that we should fix every driver that sets a queue index to instead use an RSS hash, but that seems like more work than simply disabling the use of flowid in if_lagg by default. For cases where this has an appreciable impact on forwarding performance the sysctl can be flipped back. That seems more reasonable to me than making laggs effectively useless for anyone using any one of a random set of drivers that set the flowid to a queue index (grep for "flowid =" and you can see which drivers do this). Matt
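The pathological case described above is easy to reproduce on paper. The sketch below assumes the port is chosen as "shift the flowid right, then take it modulo the port count", which matches the shift and index/modulo description in this thread; the function name lagg_port_index is made up, and the flowid values are the queue indexes of a hypothetical 4-queue driver.

```python
def lagg_port_index(flowid, num_ports, flowid_shift=16):
    """Shift a 32-bit flowid right, then reduce it modulo the port count."""
    return ((flowid & 0xFFFFFFFF) >> flowid_shift) % num_ports


# A driver that stores a queue index (0..3) in flowid never sets bits 16+.
queue_index_flowids = [0, 1, 2, 3]

# Default shift of 16: every flow collapses onto port index 0.
shifted = [lagg_port_index(f, num_ports=2) for f in queue_index_flowids]

# Shift of 0: the low bits are used and both ports see traffic.
unshifted = [lagg_port_index(f, num_ports=2, flowid_shift=0)
             for f in queue_index_flowids]
```

Here shifted comes out as [0, 0, 0, 0] and unshifted as [0, 1, 0, 1], which is the "always index 0" behaviour in miniature.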
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 17:06, Eugene Grosbein wrote: 05.01.2018 23:11, Steven Hartland wrote: What do others think, am I missing something? You still consider only the TCP case, missing the IP forwarding case where all IP packets are transit, coming from lagg0 and going out via lagg1. Just going out via a laggX. The IP forwarding case has benefited from pre-computed RSS flowid since 8.0-RELEASE and your change breaks it. Is there a way to determine if the mbuf is a forwarded mbuf or not? I know I've said it before but just to be totally clear, changing the default was done to prevent broken behavior; if you're not concerned about the issue or you know you're not affected, you can enable use_flowid to restore the original behavior. This doesn't have to be the final fix. If there are improvements that can be made, for example making the default more intelligent so that it uses flowid when it's known to be good, then that can be looked into. In the meantime the new "default" will prevent others from configuring lagg(4) with LACP or loadbalance and ending up with problems. Yes, this may mean that IP forwarding in HEAD will use manual hashing and hence perform a little worse for now, but that's the lesser of two evils. Regards Steve
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 17:16, Eugene Grosbein wrote: That is, there is no guarantee of persistence of flowid of incoming packets as they can be received with distinct ports of lagg being distinct hardware computing flowid differently. Some ports may not support RSS at all. We should not use incoming hardware flowid for anything by default in case of TCP. I don't believe your statement about persistence of flowid due to the use of variant ports is correct as LACP states that packets from the same flow "should" under normal conditions (no failure) be received on the same port. It still does not guarantee that and you miss the possibility of network failures that can easily change the flowid of incoming packets. Correct, but that's not the normal behavior so the chances of seeing any impact of out-of-order packets are very small. In the case where the HW doesn't support RSS, then flowid should either always be unset or be set by the OS to a consistent value, hence that should function as expected. That said I don't disagree that all hostA -> hostB should use Manual hash, as I can't see any way to use the HW hash, however the ports in your example are wrong. Yes, I stand corrected (just copied your example and adjusted it incompletely). Why do you mix flowid of incoming stream with flowid of outgoing stream? I expect this was done so we don't have the overhead of calculating a packet hash for every outgoing packet, i.e. it's an optimization, however I believe this is only possible for the destination host which always has a valid flowid, and not for the source host. How do you know that flowid of incoming packet is preserved on outgoing path? It should not. https://github.com/freebsd/freebsd/blob/master/sys/netinet/ip_output.c#L234 Regards Steve
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 17:02, Eugene Grosbein wrote: 05.01.2018 22:13, Steven Hartland wrote: I hope there's some improvements that can be made, for example if we can determine the stream was instigated remotely then flowid would always be valid hence we can use it assuming it matches the requested spec or if we can make it clear to the user that laggproto is not the one they requested, I'm open to ideas? We just need to clear flow id from incoming TCP segments and always generate new flow id for responses keeping old flow id for IP forwarding case. Please back out your change to not degrade IP forwarding performance. Sorry I don't follow you. You seem to be inferring that we can easily generate a flowid without involving the sending hardware RSS has nothing to do with sending hardware. It's operating system's job to choose outgoing port, not hardware's job. The OS is deciding which outgoing, however its using the hash based on the flowid to do so It should use flowid for transit forwarding IP packet only. It should not use flowid from incoming TCP segment. Not sure I follow your meaning, LACP has nothing to do with incoming TCP, its balancing and hence hashing is performed on outbound (tx) traffic only. I can't see how that is possible as that's chicken and egg i.e. you can't get the HW interface to generate the flowid without sending a packet and you can't send a packet until you have a the flowid to decide which interface to send it from. Outgoing packet flow does not and should not depend on incoming flow, they are independent things in case of LACP. There is no "chicken and egg" problem at all. But this is how it works ATM, it uses the flowid which is only valid after the first rx. Then this is a bug that should be fixed to solve your problem, instead of change of lagg defaults that degrades IP forwarding performance. You seem to be confusing IP forwarding with choice of port in the lagg interface? 
Once lagg (lacp in this case) has chosen the port, the stack continues as it always has; if this means using flowid to balance queues then that's fine. This change only alters the hash calculation which is used to determine the port that's used. Regards Steve
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 09:41, hiren panchasara wrote: IIRC, with 'RSS' in kernconf, most NIC drivers and stack should do the right thing. Look at drivers and also conn startup code in TCP as I recall it doing the flowid mapping correctly when stream originated from the other side and had flowid assigned to it by the NIC. I am mostly concerned about the overhead of manual calculation but my knowledge is a bit rusty right now and lagg has always been special so please try this out and see.

I've not been able to find any such option:

head:src> grep -ri rss sys/amd64/conf/
head:src>

Any other ideas on where it might be or is it just the default on HEAD?

That said, the more I think / talk about this the more I believe manual calculation is the right option for LACP. The reasons I believe this are:

* When configuring LACP in a network, knowing the hash method is important, so using an unknown "flowid" based hash could produce unexpected results.
* There's no easy way (possibly no way at all) to determine the flowid from the HW for the first packet of a new outbound connection.
* Having the hash algorithm vary for inbound and outbound connections increases the chance of unexpected results.
* LACP combines NICs of even speed, however they can be different HW so there's no guarantee that the participating ports use the same flowid calculation, again increasing the chance of a problem.

So as mentioned in a previous reply, the more I think about it the more I believe flowid can't be successfully used as a hash source for LACP or loadbalance.

What do others think, am I missing something?

Regards
Steve
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 14:38, Slawa Olhovchenkov wrote: On Fri, Jan 05, 2018 at 08:36:48PM +0700, Eugene Grosbein wrote: 05.01.2018 20:11, Slawa Olhovchenkov wrote: Irrespective of RSS etc., flowid distribution in the lacp case works very badly. This is good and must be MFC'd (IMHO). It may work badly depending on NIC and/or traffic type. It works just fine in the common case of IP forwarding for packets with TCP/UDP inside. It can be easily disabled locally for specific cases when it does not work. Packet distribution on network equipment (lacp case) with flowid enabled causes uneven queue distribution. Yes, this may be disabled locally, but diagnosing this root cause needs uncommon skills. Indeed, the same goes for the packet ordering issue; it took a good amount of effort here from multiple parties to determine that there was a bug in the FreeBSD LACP implementation due to the use of flowid, which is why I opted to disable it by default. Regards Steve
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 13:49, Eugene Grosbein wrote: 05.01.2018 16:26, Steven Hartland пишет: On 05/01/2018 02:01, Eugene Grosbein wrote: 05.01.2018 4:52, Steven Hartland wrote: RSS by definition has meaning to received stream. What is "outbound" stream in this context, why can the hash calculatiom method change and what exactly does it mean "a stream being incorrectly split"? Yes RSS is indeed a received stream but that is used by lagg for lacp and loadbalance protocols to decide which port of the lagg to "send" the packet out of. As the flowid is not known when a new "output" stream is instigated the current code falls back to manual hash calculation to determine which port to send the initial packet from. Once a response is received a tx then uses the flowid. This change of hash calculation method can result in the initial packet being sent from a different port than the rest of the stream; this is what I meant by "incorrectly split". See the following: https://github.com/freebsd/freebsd/blob/master/sys/net/if_lagg.c#L2066 https://github.com/freebsd/freebsd/blob/master/sys/net/ieee8023ad_lacp.c#L846 I still do not get what is "output stream" for you. If you are talking on forwarding (routing) transit packets at IP layer, they all have flowid from the beginning and first packet does not differ from others at all. At the simplest level its a tcp stream that is started from the host. So given we have hostA (src) and hostB (dest), the output stream is one started by hostA with a destination of hostB where hostA is configured with lagg. In this case with use_flowid we've confirmed we get the following (the interfaces used vary per flow of cause): hostA - SYN (ix0) -> hostB # Manual hash calculated hostB - SYN,ACK (ix0) -> hostA# flowid used hostA - ACK (ix1) -> hostB # flowid used hostA - Data(ix1) -> hostB # flowid used hostB - ACK (ix0) -> hostA # flowid used ... Here hostA and hostB both had lagg0 comprising of ix0 and ix1. 
It should be:

hostA - SYN     (ix0) -> hostB # Manual hash (1) calculated
hostB - SYN,ACK (ix0) -> hostA # hardware flowid (2) received
hostA - ACK     (ix1) -> hostB # Manual hash (1) calculated
hostA - Data    (ix1) -> hostB # hardware flowid (2 or 3) received
hostB - ACK     (ix0) -> hostA # Manual hash (1) calculated

That is, there is no guarantee of persistence of flowid of incoming packets as they can be received with distinct ports of lagg being distinct hardware computing flowid differently. Some ports may not support RSS at all. We should not use incoming hardware flowid for anything by default in case of TCP. I don't believe your statement about persistence of flowid due to the use of variant ports is correct as LACP states that packets from the same flow "should" under normal conditions (no failure) be received on the same port. In the case where the HW doesn't support RSS, then flowid should either always be unset or be set by the OS to a consistent value, hence that should function as expected. That said I don't disagree that all hostA -> hostB should use Manual hash, as I can't see any way to use the HW hash, however the ports in your example are wrong: all hostA -> hostB should be sent from the same ixY and all hostB -> hostA should be sent from the same ixZ (under normal circumstances) of course. If you are talking about a locally originated (not transit) data stream from a local TCP socket being sent in response to corresponding incoming TCP segments, then these outgoing packets should have their own fixed flow id by default in case of LACP, and this flow id should not depend on the (possibly ever changing) flow id of incoming TCP segments. Nope, in this case we have all the information needed, but I don't believe we can tell that's the case. If you insist that the flow id of outgoing packets does depend on the ever changing flow id of incoming packets, then this is the bug that should be fixed and not lagg's defaults.
As detailed above, once the session is established the flowid remains fixed. Why do you mix flowid of incoming stream with flowid of outgoing stream? I expect this was done so we don't have the overhead of calculating a packet hash for every outgoing packet, i.e. it's an optimization, however I believe this is only possible for the destination host which always has a valid flowid, and not for the source host. My current thinking is that flowid shouldn't be used for either LACP or loadbalance protocols as doing so will almost certainly lead to unexpected behavior (the stated lagghash may not be valid). Regards Steve
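The SYN-versus-rest-of-stream split discussed in this message can be shown with a toy model. This is only a sketch under loud assumptions: manual_hash() is an XOR of the two port numbers rather than the real lagghash computation, the flowid shift is left out, and the flowid value 0x12345678 is an arbitrary number standing in for whatever the NIC would report after the first received packet.

```python
def manual_hash(src_port, dst_port):
    """Toy software hash; the real lagghash covers more than the ports."""
    return src_port ^ dst_port


def tx_port(num_ports, src_port, dst_port, flowid=None, use_flowid=True):
    """Pick the lagg port index for one outgoing packet."""
    if use_flowid and flowid is not None:
        # Hardware flowid is available (only after the first rx).
        return flowid % num_ports
    # No flowid yet, or use_flowid disabled: fall back to the manual hash.
    return manual_hash(src_port, dst_port) % num_ports


# With use_flowid: the SYN has no flowid, later packets do.
syn_port = tx_port(2, 50000, 443)
ack_port = tx_port(2, 50000, 443, flowid=0x12345678)

# With use_flowid disabled: every packet uses the same calculation.
syn_fixed = tx_port(2, 50000, 443, use_flowid=False)
ack_fixed = tx_port(2, 50000, 443, flowid=0x12345678, use_flowid=False)
```

In this toy case the SYN leaves on port index 1 and the ACK on port index 0 when flowid is used, while both leave on index 1 once flowid is ignored.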
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 13:41, Eugene Grosbein wrote: 05.01.2018 16:34, Steven Hartland wrote: I hope there's some improvements that can be made, for example if we can determine the stream was instigated remotely then flowid would always be valid hence we can use it assuming it matches the requested spec or if we can make it clear to the user that laggproto is not the one they requested, I'm open to ideas? We just need to clear flow id from incoming TCP segments and always generate new flow id for responses keeping old flow id for IP forwarding case. Please back out your change to not degrade IP forwarding performance. Sorry I don't follow you. You seem to be inferring that we can easily generate a flowid without involving the sending hardware. RSS has nothing to do with sending hardware. It's operating system's job to choose outgoing port, not hardware's job. The OS is deciding which outgoing port to use, however it's using the hash based on the flowid to do so, which is only valid after the first rx, hence the problem: this results in the hash calculation being different for the first packet. I can't see how that is possible as that's chicken and egg, i.e. you can't get the HW interface to generate the flowid without sending a packet and you can't send a packet until you have the flowid to decide which interface to send it from. Outgoing packet flow does not and should not depend on incoming flow, they are independent things in case of LACP. There is no "chicken and egg" problem at all. But this is how it works ATM, it uses the flowid which is only valid after the first rx.
Re: svn commit: r327559 - in head: . sys/net
I found https://wiki.freebsd.org/NetworkRSS but I couldn't see any options mentioned, is there a sysctl or kernel option for that Adrian? For reference our current test is on a production LB running 11.0-RELEASE. We're in the process of updating our HEAD box for additional testing. On 05/01/2018 02:55, Adrian Chadd wrote: does it also happen when you actually enable RSS in the kernel? Since like I went through a whole lot of pain to assign a flowid at connection setup time. -a On 4 January 2018 at 15:37, Steven Hartland <ste...@multiplay.co.uk> wrote: On 04/01/2018 22:42, hiren panchasara wrote: On 01/04/18 at 09:52P, Steven Hartland wrote: On 04/01/2018 20:50, Eugene Grosbein wrote: 05.01.2018 3:05, Steven Hartland wrote: Author: smh Date: Thu Jan 4 20:05:47 2018 New Revision: 327559 URL: https://svnweb.freebsd.org/changeset/base/327559 Log: Disabled the use of flowid for lagg by default Disabled the use of RSS hash from the network card aka flowid for lagg(4) interfaces by default as it's currently incompatible with the lacp and loadbalance protocols. The incompatibility is due to the fact that the flowid isn't know for the first packet of a new outbound stream which can result in the hash calculation method changing and hence a stream being incorrectly split across multiple interfaces during normal operation. This can be re-enabled by setting the following in loader.conf: net.link.lagg.default_use_flowid="1" Discussed with: kmacy Sponsored by: Multiplay RSS by definition has meaning to received stream. What is "outbound" stream in this context, why can the hash calculatiom method change and what exactly does it mean "a stream being incorrectly split"? Yes RSS is indeed a received stream but that is used by lagg for lacp and loadbalance protocols to decide which port of the lagg to "send" the packet out of. 
As the flowid is not known when a new "output" stream is instigated the current code falls back to manual hash calculation to determine which port to send the initial packet from. Once a response is received a tx then uses the flowid. This change of hash calculation method can result in the initial packet being sent from a different port than the rest of the stream; this is what I meant by "incorrectly split". For my understanding, is this just an issue for the first packet when we originate the flow? Once we have a response and if flowid is there, we'd use it, right? OR am I missing something? Initially yes, but that can cause a whole cascading set of problems. If the source machine sends from two different ports then flow can traverse across the network using different paths and hence arrive at the destination on different ports too, causing the corresponding issue on the other side. And with this change, we'd always go and do manual calculation even when we have a valid flowid (i.e. we didn't initiate a connection)? Correct, but there's potentially no easy way to correctly determine what the flowid and hence hash should be in this case, likely impossible if the lagg consists of different interface types. In addition if the hardware hash doesn't match the requested one as per laggproto then additional issues could also be triggered. Our TCP stack seems fragile during setup to out of order packets which this multipath behavior causes, we've seen this on our loadbalancers which is what triggered the investigation. The concrete result is many aborted TCP connections, over 300k ~2% on the machine I'm looking at. I hope there's some improvements that can be made, for example if we can determine the stream was instigated remotely then flowid would always be valid hence we can use it assuming it matches the requested spec or if we can make it clear to the user that laggproto is not the one they requested, I'm open to ideas? 
Regards Steve
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 02:09, Eugene Grosbein wrote: 05.01.2018 6:37, Steven Hartland wrote: Our TCP stack seems fragile during setup to out of order packets which this multipath behavior causes, we've seen this on our loadbalancers which is what triggered the investigation. The concrete result is many aborted TCP connections, over 300k ~2% on the machine I'm looking at. This is another problem that needs to be fixed in general and not hidden under the carpet. Meantime, practical problems you see can be solved locally with any settings you like. While it may seem like it, there's no denying that the problem is caused by the fact that the packets for a single flow arrive on two different interfaces in normal (non-failure) workflow, which contravenes 802.3ad, which states:

43.2.4 Frame Distributor … This standard does not mandate any particular distribution algorithm(s); however, any distribution algorithm shall ensure that, when frames are received by a Frame Collector as specified in 43.2.3, the algorithm shall not cause a) Mis-ordering of frames that are part of any given conversation, or b) Duplication of frames. The above requirement to maintain frame ordering is met by *ensuring that all frames that compose a given conversation are transmitted on a single link in the order* that they are generated by the MAC Client; hence, this requirement does not involve the addition (or modification) of any information to the MAC frame, nor any buffering or processing on the part of the corresponding Frame Collector in order to re-order frames.

I hope there's some improvements that can be made, for example if we can determine the stream was instigated remotely then flowid would always be valid hence we can use it assuming it matches the requested spec or if we can make it clear to the user that laggproto is not the one they requested, I'm open to ideas?
We just need to clear flow id from incoming TCP segments and always generate new flow id for responses keeping old flow id for IP forwarding case. Please back out your change to not degrade IP forwarding performance. Sorry I don't follow you. You seem to be inferring that we can easily generate a flowid without involving the sending hardware; I can't see how that is possible as that's chicken and egg, i.e. you can't get the HW interface to generate the flowid without sending a packet and you can't send a packet until you have the flowid to decide which interface to send it from. Regards Steve
Re: svn commit: r327559 - in head: . sys/net
On 05/01/2018 02:01, Eugene Grosbein wrote: 05.01.2018 4:52, Steven Hartland wrote: RSS by definition has meaning to received stream. What is "outbound" stream in this context, why can the hash calculatiom method change and what exactly does it mean "a stream being incorrectly split"? Yes RSS is indeed a received stream but that is used by lagg for lacp and loadbalance protocols to decide which port of the lagg to "send" the packet out of. As the flowid is not known when a new "output" stream is instigated the current code falls back to manual hash calculation to determine which port to send the initial packet from. Once a response is received a tx then uses the flowid. This change of hash calculation method can result in the initial packet being sent from a different port than the rest of the stream; this is what I meant by "incorrectly split". See the following: https://github.com/freebsd/freebsd/blob/master/sys/net/if_lagg.c#L2066 https://github.com/freebsd/freebsd/blob/master/sys/net/ieee8023ad_lacp.c#L846 I still do not get what is "output stream" for you. If you are talking on forwarding (routing) transit packets at IP layer, they all have flowid from the beginning and first packet does not differ from others at all. At the simplest level its a tcp stream that is started from the host. So given we have hostA (src) and hostB (dest), the output stream is one started by hostA with a destination of hostB where hostA is configured with lagg. In this case with use_flowid we've confirmed we get the following (the interfaces used vary per flow of cause): hostA - SYN (ix0) -> hostB # Manual hash calculated hostB - SYN,ACK (ix0) -> hostA# flowid used hostA - ACK (ix1) -> hostB # flowid used hostA - Data(ix1) -> hostB # flowid used hostB - ACK (ix0) -> hostA # flowid used ... Here hostA and hostB both had lagg0 comprising of ix0 and ix1. 
I believe you're referring to packets flowing through the physical interface; if so then this is too late, as for LACP the flowid would need to be pre-calculated for the first packet in order to make the decision on which port to send it on. Unless I'm missing something, this is a chicken and egg situation. If you are talking about a locally originated (not transit) data stream from a local TCP socket being sent in response to corresponding incoming TCP segments, then these outgoing packets should have their own fixed flow id by default in case of LACP, and this flow id should not depend on the (possibly ever changing) flow id of incoming TCP segments. Nope, in this case we have all the information needed, but I don't believe we can tell that's the case. If you insist that the flow id of outgoing packets does depend on the ever changing flow id of incoming packets, then this is the bug that should be fixed and not lagg's defaults. As detailed above, once the session is established the flowid remains fixed. Regards Steve
Re: svn commit: r327559 - in head: . sys/net
On 04/01/2018 22:42, hiren panchasara wrote: On 01/04/18 at 09:52P, Steven Hartland wrote: On 04/01/2018 20:50, Eugene Grosbein wrote: 05.01.2018 3:05, Steven Hartland wrote: Author: smh Date: Thu Jan 4 20:05:47 2018 New Revision: 327559 URL: https://svnweb.freebsd.org/changeset/base/327559 Log: Disabled the use of flowid for lagg by default Disabled the use of RSS hash from the network card aka flowid for lagg(4) interfaces by default as it's currently incompatible with the lacp and loadbalance protocols. The incompatibility is due to the fact that the flowid isn't know for the first packet of a new outbound stream which can result in the hash calculation method changing and hence a stream being incorrectly split across multiple interfaces during normal operation. This can be re-enabled by setting the following in loader.conf: net.link.lagg.default_use_flowid="1" Discussed with: kmacy Sponsored by: Multiplay RSS by definition has meaning to received stream. What is "outbound" stream in this context, why can the hash calculatiom method change and what exactly does it mean "a stream being incorrectly split"? Yes RSS is indeed a received stream but that is used by lagg for lacp and loadbalance protocols to decide which port of the lagg to "send" the packet out of. As the flowid is not known when a new "output" stream is instigated the current code falls back to manual hash calculation to determine which port to send the initial packet from. Once a response is received a tx then uses the flowid. This change of hash calculation method can result in the initial packet being sent from a different port than the rest of the stream; this is what I meant by "incorrectly split". For my understanding, is this just an issue for the first packet when we originate the flow? Once we have a response and if flowid is there, we'd use it, right? OR am I missing something? Initially yes, but that can cause a whole cascading set of problems. 
If the source machine sends from two different ports then the flow can traverse the network over different paths and hence arrive at the destination on different ports too, causing the corresponding issue on the other side.

> And with this change, we'd always go and do manual calculation even when we have a valid flowid (i.e. we didn't initiate a connection)?

Correct, but there's potentially no easy way to determine what the flowid, and hence the hash, should be in this case; it is likely impossible if the lagg consists of different interface types. In addition, if the hardware hash doesn't match the one requested as per laggproto then additional issues could also be triggered.

Our TCP stack seems fragile during connection setup when faced with the out-of-order packets this multipath behavior causes. We've seen this on our load balancers, which is what triggered the investigation. The concrete result is many aborted TCP connections: over 300k (~2%) on the machine I'm looking at.

I hope there are some improvements that can be made. For example, if we can determine that the stream was instigated remotely then the flowid would always be valid, so we could use it, assuming it matches the requested spec. Alternatively, we could make it clear to the user that laggproto is not the one they requested. I'm open to ideas.

Regards
Steve
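The split described in this thread can be sketched in a few lines of C. This is a simplified model under stated assumptions, not the code in if_lagg.c: `struct ex_pkt`, `ex_select_port` and the `sw_hash` field are hypothetical names, and the real driver derives the software hash from the packet headers rather than carrying it in the packet.

```c
#include <stdint.h>

/* Hypothetical, simplified model of a packet handed to a lagg-style driver. */
struct ex_pkt {
	uint32_t flowid;	/* hardware RSS hash, valid only if has_flowid */
	int	 has_flowid;	/* 0 for the first packet of a locally originated flow */
	uint32_t sw_hash;	/* stand-in for the software hash over the headers */
};

/*
 * Pick an output port index in [0, nports).
 * With use_flowid set, the hash source depends on whether the packet
 * already carries a flowid, so one flow can see two different hashes.
 */
static unsigned
ex_select_port(const struct ex_pkt *p, int use_flowid, unsigned nports)
{
	uint32_t h;

	if (use_flowid && p->has_flowid)
		h = p->flowid;
	else
		h = p->sw_hash;
	return (h % nports);
}
```

With `use_flowid` enabled, the first packet of an outbound flow (no flowid yet) and the later packets (flowid set) may map to different ports. With it disabled, every packet of the flow uses the same software hash and stays on one port.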
Re: svn commit: r327559 - in head: . sys/net
On 04/01/2018 20:50, Eugene Grosbein wrote: 05.01.2018 3:05, Steven Hartland wrote: Author: smh Date: Thu Jan 4 20:05:47 2018 New Revision: 327559 URL: https://svnweb.freebsd.org/changeset/base/327559 Log: Disabled the use of flowid for lagg by default Disabled the use of RSS hash from the network card aka flowid for lagg(4) interfaces by default as it's currently incompatible with the lacp and loadbalance protocols. The incompatibility is due to the fact that the flowid isn't know for the first packet of a new outbound stream which can result in the hash calculation method changing and hence a stream being incorrectly split across multiple interfaces during normal operation. This can be re-enabled by setting the following in loader.conf: net.link.lagg.default_use_flowid="1" Discussed with: kmacy Sponsored by:Multiplay RSS by definition has meaning to received stream. What is "outbound" stream in this context, why can the hash calculatiom method change and what exactly does it mean "a stream being incorrectly split"? Yes RSS is indeed a received stream but that is used by lagg for lacp and loadbalance protocols to decide which port of the lagg to "send" the packet out of. As the flowid is not known when a new "output" stream is instigated the current code falls back to manual hash calculation to determine which port to send the initial packet from. Once a response is received a tx then uses the flowid. This change of hash calculation method can result in the initial packet being sent from a different port than the rest of the stream; this is what I meant by "incorrectly split". See the following: https://github.com/freebsd/freebsd/blob/master/sys/net/if_lagg.c#L2066 https://github.com/freebsd/freebsd/blob/master/sys/net/ieee8023ad_lacp.c#L846 Defaults should not be changed so easily just because they are not optimal for some specific case. Each lagg has its own setting for flowid usage and why one cannot just use "ifconfig lagg0 -use_flowid" for such cases? 
Yes, we're already using -use_flowid to mitigate the problem, but the defaults should never result in broken behavior, hence the change, at least for now. For reference, I did look at keeping the default of 1 but only using it for protocols which weren't affected by the issue, and introducing a value of 2 to force it for those that are. However, as it's defined as acting on creation, and we always create lagg interfaces as failover and then amend them, that wasn't possible without making more invasive changes. Regards Steve
svn commit: r327559 - in head: . sys/net
Author: smh Date: Thu Jan 4 20:05:47 2018 New Revision: 327559 URL: https://svnweb.freebsd.org/changeset/base/327559 Log: Disabled the use of flowid for lagg by default Disabled the use of RSS hash from the network card aka flowid for lagg(4) interfaces by default as it's currently incompatible with the lacp and loadbalance protocols. The incompatibility is due to the fact that the flowid isn't know for the first packet of a new outbound stream which can result in the hash calculation method changing and hence a stream being incorrectly split across multiple interfaces during normal operation. This can be re-enabled by setting the following in loader.conf: net.link.lagg.default_use_flowid="1" Discussed with: kmacy Sponsored by: Multiplay Modified: head/UPDATING head/sys/net/if_lagg.c Modified: head/UPDATING == --- head/UPDATING Thu Jan 4 19:47:01 2018(r327558) +++ head/UPDATING Thu Jan 4 20:05:47 2018(r327559) @@ -51,6 +51,14 @@ NOTE TO PEOPLE WHO THINK THAT FreeBSD 12.x IS SLOW: ** SPECIAL WARNING: ** +20180104: + The use of RSS hash from the network card aka flowid has been + disabled by default for lagg(4) as it's currently incompatible with + the lacp and loadbalance protocols. 
+ + This can be re-enabled by setting the following in loader.conf: + net.link.lagg.default_use_flowid="1" + 20180102: The SW_WATCHDOG option is no longer necessary to enable the hardclock-based software watchdog if no hardware watchdog is Modified: head/sys/net/if_lagg.c == --- head/sys/net/if_lagg.c Thu Jan 4 19:47:01 2018 (r327558) +++ head/sys/net/if_lagg.c Thu Jan 4 20:05:47 2018 (r327559) @@ -244,7 +244,7 @@ SYSCTL_INT(_net_link_lagg, OID_AUTO, failover_rx_all, "Accept input from any interface in a failover lagg"); /* Default value for using flowid */ -static VNET_DEFINE(int, def_use_flowid) = 1; +static VNET_DEFINE(int, def_use_flowid) = 0; #define V_def_use_flowid VNET(def_use_flowid) SYSCTL_INT(_net_link_lagg, OID_AUTO, default_use_flowid, CTLFLAG_RWTUN, &VNET_NAME(def_use_flowid), 0,
svn commit: r327520 - stable/10/sys/netinet
Author: smh Date: Wed Jan 3 16:16:20 2018 New Revision: 327520 URL: https://svnweb.freebsd.org/changeset/base/327520 Log: MFC r322812: Avoid TCP log messages which are false positives. Sponsored by: Multiplay Modified: stable/10/sys/netinet/tcp_input.c Directory Properties: stable/10/ (props changed) Modified: stable/10/sys/netinet/tcp_input.c == --- stable/10/sys/netinet/tcp_input.c Wed Jan 3 15:01:31 2018 (r327519) +++ stable/10/sys/netinet/tcp_input.c Wed Jan 3 16:16:20 2018 (r327520) @@ -1647,25 +1647,6 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, stru to.to_tsecr = 0; } /* -* If timestamps were negotiated during SYN/ACK they should -* appear on every segment during this session and vice versa. -*/ - if ((tp->t_flags & TF_RCVD_TSTMP) && !(to.to_flags & TOF_TS)) { - if ((s = tcp_log_addrs(inc, th, NULL, NULL))) { - log(LOG_DEBUG, "%s; %s: Timestamp missing, " - "no action\n", s, __func__); - free(s, M_TCPLOG); - } - } - if (!(tp->t_flags & TF_RCVD_TSTMP) && (to.to_flags & TOF_TS)) { - if ((s = tcp_log_addrs(inc, th, NULL, NULL))) { - log(LOG_DEBUG, "%s; %s: Timestamp not expected, " - "no action\n", s, __func__); - free(s, M_TCPLOG); - } - } - - /* * Process options only when we get SYN/ACK back. The SYN case * for incoming connections is handled in tcp_syncache. * According to RFC1323 the window field in a SYN (i.e., a @@ -1693,6 +1674,25 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, stru if ((tp->t_flags & TF_SACK_PERMIT) && (to.to_flags & TOF_SACKPERM) == 0) tp->t_flags &= ~TF_SACK_PERMIT; + } + + /* +* If timestamps were negotiated during SYN/ACK they should +* appear on every segment during this session and vice versa. 
+*/ + if ((tp->t_flags & TF_RCVD_TSTMP) && !(to.to_flags & TOF_TS)) { + if ((s = tcp_log_addrs(inc, th, NULL, NULL))) { + log(LOG_DEBUG, "%s; %s: Timestamp missing, " + "no action\n", s, __func__); + free(s, M_TCPLOG); + } + } + if (!(tp->t_flags & TF_RCVD_TSTMP) && (to.to_flags & TOF_TS)) { + if ((s = tcp_log_addrs(inc, th, NULL, NULL))) { + log(LOG_DEBUG, "%s; %s: Timestamp not expected, " + "no action\n", s, __func__); + free(s, M_TCPLOG); + } } /* ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r327519 - stable/11/sys/netinet
Author: smh Date: Wed Jan 3 15:01:31 2018 New Revision: 327519 URL: https://svnweb.freebsd.org/changeset/base/327519 Log: MFC r322812: Avoid TCP log messages which are false positives. Sponsored by: Multiplay Modified: stable/11/sys/netinet/tcp_input.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/netinet/tcp_input.c == --- stable/11/sys/netinet/tcp_input.c Wed Jan 3 12:18:55 2018 (r327518) +++ stable/11/sys/netinet/tcp_input.c Wed Jan 3 15:01:31 2018 (r327519) @@ -1686,25 +1686,6 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, stru to.to_tsecr = 0; } /* -* If timestamps were negotiated during SYN/ACK they should -* appear on every segment during this session and vice versa. -*/ - if ((tp->t_flags & TF_RCVD_TSTMP) && !(to.to_flags & TOF_TS)) { - if ((s = tcp_log_addrs(inc, th, NULL, NULL))) { - log(LOG_DEBUG, "%s; %s: Timestamp missing, " - "no action\n", s, __func__); - free(s, M_TCPLOG); - } - } - if (!(tp->t_flags & TF_RCVD_TSTMP) && (to.to_flags & TOF_TS)) { - if ((s = tcp_log_addrs(inc, th, NULL, NULL))) { - log(LOG_DEBUG, "%s; %s: Timestamp not expected, " - "no action\n", s, __func__); - free(s, M_TCPLOG); - } - } - - /* * Process options only when we get SYN/ACK back. The SYN case * for incoming connections is handled in tcp_syncache. * According to RFC1323 the window field in a SYN (i.e., a @@ -1732,6 +1713,25 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, stru if ((tp->t_flags & TF_SACK_PERMIT) && (to.to_flags & TOF_SACKPERM) == 0) tp->t_flags &= ~TF_SACK_PERMIT; + } + + /* +* If timestamps were negotiated during SYN/ACK they should +* appear on every segment during this session and vice versa. 
+*/ + if ((tp->t_flags & TF_RCVD_TSTMP) && !(to.to_flags & TOF_TS)) { + if ((s = tcp_log_addrs(inc, th, NULL, NULL))) { + log(LOG_DEBUG, "%s; %s: Timestamp missing, " + "no action\n", s, __func__); + free(s, M_TCPLOG); + } + } + if (!(tp->t_flags & TF_RCVD_TSTMP) && (to.to_flags & TOF_TS)) { + if ((s = tcp_log_addrs(inc, th, NULL, NULL))) { + log(LOG_DEBUG, "%s; %s: Timestamp not expected, " + "no action\n", s, __func__); + free(s, M_TCPLOG); + } } /* ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r325092 - head/usr.bin/fortune/datfiles
I've still had to use rehash on several occasions for it to detect new apps, so removing that reference might be a mistake.

On Sun, 29 Oct 2017 at 18:51, Cy Schubert wrote:
> Warner Losh writes:
> > On Sun, Oct 29, 2017 at 8:26 AM, Ed Maste wrote:
> > > On 29 October 2017 at 00:53, Eitan Adler wrote:
> > > > Author: eadler
> > > > Date: Sun Oct 29 04:53:33 2017
> > > > New Revision: 325092
> > > > URL: https://svnweb.freebsd.org/changeset/base/325092
> > > >
> > > > Log:
> > > > Modernize freebsd-tips a bit
> > > ...
> > > > %
> > > > Want to run the same command again?
> > > > -In tcsh you can type "!!".
> > > > +Type "!!".
> > > > %
> > >
> > > $ !!
> > > sh: !!: not found
> > >
> > > I doubt many people use /bin/sh as an interactive shell, but the tip
> > > ought not lead those who do astray
> >
> > Yes. /bin/sh on FreeBSD doesn't grok it, though bash and some other shells
> > available as ports do. I think that the old text was a bit better.
>
> Or better yet, ctrl-r in bash and zsh, or up-arrow in tcsh.
>
> --
> Cheers,
> Cy Schubert
> FreeBSD UNIX: Web: http://www.FreeBSD.org
>
> The need of the many outweighs the greed of the few.
Re: svn commit: r324983 - in head: lib/libc/sys sys/sys
Personally I would expect the fallback to be reboot as without the ability to power back on remotely e.g. IPMI this could render the machine inaccessible, which is not ideal, thoughts? On 25/10/2017 16:30, Warner Losh wrote: Author: imp Date: Wed Oct 25 15:30:20 2017 New Revision: 324983 URL: https://svnweb.freebsd.org/changeset/base/324983 Log: Define RB_POWERCYCLE RB_POWERCYCLE instructs the platform to power off and then power back on a short time later, if that's possible. Otherwise, degrade to the RB_POWEROFF behavior. Sponsored by: Netflix Modified: head/lib/libc/sys/reboot.2 head/sys/sys/reboot.h Modified: head/lib/libc/sys/reboot.2 == --- head/lib/libc/sys/reboot.2 Wed Oct 25 15:28:05 2017(r324982) +++ head/lib/libc/sys/reboot.2 Wed Oct 25 15:30:20 2017(r324983) @@ -28,7 +28,7 @@ .\" @(#)reboot.2 8.1 (Berkeley) 6/4/93 .\" $FreeBSD$ .\" -.Dd September 18, 2015 +.Dd October 24, 2017 .Dt REBOOT 2 .Os .Sh NAME @@ -84,6 +84,14 @@ for more information. .It Dv RB_HALT The processor is simply halted; no reboot takes place. This option should be used with caution. +.It Dv RB_POWERCYCLE +After halting, the shutdown code will do what it can to turn +off the power and then turn the power back on. +This requires hardware support, usually an auxiliary microprocessor +that can sequence the power supply. +At present only the +.Xr ipmi 4 +driver implements this feature. .It Dv RB_POWEROFF After halting, the shutdown code will do what it can to turn off the power. 
Modified: head/sys/sys/reboot.h == --- head/sys/sys/reboot.h Wed Oct 25 15:28:05 2017 (r324982) +++ head/sys/sys/reboot.h Wed Oct 25 15:30:20 2017 (r324983) @@ -60,6 +60,7 @@ #define RB_RESERVED2 0x80000 /* reserved for internal use of boot blocks */ #define RB_PAUSE 0x100000 /* pause after each output line during probe */ #define RB_REROOT 0x200000 /* unmount the rootfs and mount it again */ +#define RB_POWERCYCLE 0x400000 /* Power cycle if possible */ #define RB_MULTIPLE 0x20000000 /* use multiple consoles */ #define RB_BOOTINFO 0x80000000 /* have `struct bootinfo *' arg */
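The fallback question raised in this reply can be made concrete with a small sketch. It is only an illustration of the two policies being discussed (degrade to power-off, as the commit log states, versus degrade to a plain reboot, as the reply suggests), not the kernel's shutdown code. The `EX_` flag values, `enum ex_fallback` and `ex_resolve_howto` are hypothetical; the authoritative flag definitions are in sys/sys/reboot.h.

```c
/* Illustrative flag values only; the real ones are defined in sys/sys/reboot.h. */
#define EX_RB_POWEROFF   0x1
#define EX_RB_POWERCYCLE 0x2

/* Hypothetical policy knob: what to do when a power cycle is not supported. */
enum ex_fallback { EX_FALLBACK_POWEROFF, EX_FALLBACK_REBOOT };

/*
 * Resolve the requested flags against what the platform can do.
 * If a power cycle was requested but is unsupported, drop that flag and
 * apply the chosen fallback.
 */
static int
ex_resolve_howto(int howto, int can_powercycle, enum ex_fallback fb)
{
	if ((howto & EX_RB_POWERCYCLE) == 0 || can_powercycle)
		return (howto);
	howto &= ~EX_RB_POWERCYCLE;
	if (fb == EX_FALLBACK_POWEROFF)
		howto |= EX_RB_POWEROFF;
	/* EX_FALLBACK_REBOOT: no power flag remains, i.e. an ordinary reboot. */
	return (howto);
}
```

On a machine without remote power control, `EX_FALLBACK_POWEROFF` leaves the box off until someone presses the button, whereas `EX_FALLBACK_REBOOT` keeps it reachable, which is the concern voiced above.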
Re: svn commit: r318751 - in head/sys: kern sys
Personally I hate that idea, as I like being able to see all the processes from the host. I have a similar hate of Linux containers, where you have to jump through hoops just to see what's really happening on the host.

On Sat, 21 Oct 2017 at 20:29, Allan Jude wrote:
> On 2017-05-23 12:59, Steve Wills wrote:
> > Author: swills (ports committer)
> > Date: Tue May 23 16:59:24 2017
> > New Revision: 318751
> > URL: https://svnweb.freebsd.org/changeset/base/318751
> >
> > Log:
> > Add security.bsd.see_jail_proc
> >
> > Add security.bsd.see_jail_proc sysctl to hide jail processes from non-root
> > users
> >
> > Reviewed by: jamie
> > Approved by: allanjude
> > Relnotes: yes
> > Differential Revision: https://reviews.freebsd.org/D10770
>
> A user was asking about this issue on IRC today.
>
> I think I have changed my mind a bit.
>
> I think we should make the default be off (so you can't see processes in
> a jail from the host) by default in 12.
>
> And that we should MFC this sysctl to stable/11, but not change the
> default behaviour there.
>
> Anyone else have thoughts?
>
> --
> Allan Jude
Re: svn commit: r324449 - in head/sys/boot: arm/uboot efi/boot1 sparc64/loader
Yer even no -j fails :( On 10/10/2017 01:01, Warner Losh wrote: Oh, killed /usr/include/stand.h and found it. I'll post a fix when I get back. On Mon, Oct 9, 2017 at 6:00 PM, Warner Losh <i...@bsdimp.com <mailto:i...@bsdimp.com>> wrote: Can you find out? A clean build works for me. Chances are good that sys/boot/efi/boot1/Makefile needs a line like CFLAGS+=-I${SASRC} or similar. I have to go out for 2 hours, but will look into when I get back if you can't make progress. I don't see one there and I had to add it a couple of other places. Warner On Mon, Oct 9, 2017 at 5:56 PM, Steven Hartland <steven.hartl...@multiplay.co.uk <mailto:steven.hartl...@multiplay.co.uk>> wrote: Not sure which of these sets of changes caused the issue but a clean build from scratch is currently failing here with: In file included from /usr/home/smh/freebsd/base/head/sys/boot/efi/boot1/ufs_module.c:41: In file included from /usr/home/smh/freebsd/base/head/sys/boot/efi/boot1/boot_module.h:35: /usr/home/smh/freebsd/base/head/sys/boot/efi/boot1/../include/efilib.h:33:10: fatal error: 'stand.h' file not found #include ^ Build was with -j24 in case it matters, going to try without -j but that will take many hours On 09/10/2017 23:11, Warner Losh wrote: Author: imp Date: Mon Oct 9 22:11:57 2017 New Revision: 324449 URL:https://svnweb.freebsd.org/changeset/base/324449 <https://svnweb.freebsd.org/changeset/base/324449> Log: Prefer ${LIBSTAND} to -lstand Sponsored by: Netflix Modified: head/sys/boot/arm/uboot/Makefile head/sys/boot/efi/boot1/Makefile head/sys/boot/sparc64/loader/Makefile Modified: head/sys/boot/arm/uboot/Makefile == --- head/sys/boot/arm/uboot/MakefileMon Oct 9 21:06:16 2017 (r324448) +++ head/sys/boot/arm/uboot/MakefileMon Oct 9 22:11:57 2017 (r324449) @@ -121,7 +121,7 @@ CFLAGS+=-fPIC NO_WERROR.clang= DPADD= ${LIBFICL} ${LIBUBOOT} ${LIBFDT} ${LIBUBOOT_FDT} ${LIBSTAND} -LDADD= ${LIBFICL} ${LIBUBOOT} ${LIBFDT} ${LIBUBOOT_FDT} -lstand +LDADD= ${LIBFICL} ${LIBUBOOT} ${LIBFDT} 
${LIBUBOOT_FDT} ${LIBSTAND} OBJS+= ${SRCS:N*.h:R:S/$/.o/g} Modified: head/sys/boot/efi/boot1/Makefile == --- head/sys/boot/efi/boot1/MakefileMon Oct 9 21:06:16 2017 (r324448) +++ head/sys/boot/efi/boot1/MakefileMon Oct 9 22:11:57 2017 (r324449) @@ -91,7 +91,7 @@ LIBEFI= ${.OBJDIR}/../libefi/libefi.a # as well as required string and memory functions for all platforms. # DPADD+= ${LIBEFI} ${LIBSTAND} -LDADD+=${LIBEFI} -lstand +LDADD+=${LIBEFI} ${LIBSTAND} DPADD+= ${LDSCRIPT} Modified: head/sys/boot/sparc64/loader/Makefile == --- head/sys/boot/sparc64/loader/Makefile Mon Oct 9 21:06:16 2017(r324448) +++ head/sys/boot/sparc64/loader/Makefile Mon Oct 9 22:11:57 2017(r324449) @@ -86,7 +86,7 @@ CFLAGS+= -I${.CURDIR}/../../../../lib/libstand/ CFLAGS+= -I${SRCTOP}/sys DPADD= ${LIBFICL} ${LIBZFSBOOT} ${LIBOFW} ${LIBSTAND} -LDADD= ${LIBFICL} ${LIBZFSBOOT} ${LIBOFW} -lstand +LDADD= ${LIBFICL} ${LIBZFSBOOT} ${LIBOFW} ${LIBSTAND} loader.help: help.common help.sparc64 cat ${.ALLSRC} | \ ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r324449 - in head/sys/boot: arm/uboot efi/boot1 sparc64/loader
Not sure which of these sets of changes caused the issue but a clean build from scratch is currently failing here with: In file included from /usr/home/smh/freebsd/base/head/sys/boot/efi/boot1/ufs_module.c:41: In file included from /usr/home/smh/freebsd/base/head/sys/boot/efi/boot1/boot_module.h:35: /usr/home/smh/freebsd/base/head/sys/boot/efi/boot1/../include/efilib.h:33:10: fatal error: 'stand.h' file not found #include ^ Build was with -j24 in case it matters, going to try without -j but that will take many hours On 09/10/2017 23:11, Warner Losh wrote: Author: imp Date: Mon Oct 9 22:11:57 2017 New Revision: 324449 URL: https://svnweb.freebsd.org/changeset/base/324449 Log: Prefer ${LIBSTAND} to -lstand Sponsored by: Netflix Modified: head/sys/boot/arm/uboot/Makefile head/sys/boot/efi/boot1/Makefile head/sys/boot/sparc64/loader/Makefile Modified: head/sys/boot/arm/uboot/Makefile == --- head/sys/boot/arm/uboot/MakefileMon Oct 9 21:06:16 2017 (r324448) +++ head/sys/boot/arm/uboot/MakefileMon Oct 9 22:11:57 2017 (r324449) @@ -121,7 +121,7 @@ CFLAGS+=-fPIC NO_WERROR.clang= DPADD= ${LIBFICL} ${LIBUBOOT} ${LIBFDT} ${LIBUBOOT_FDT} ${LIBSTAND} -LDADD= ${LIBFICL} ${LIBUBOOT} ${LIBFDT} ${LIBUBOOT_FDT} -lstand +LDADD= ${LIBFICL} ${LIBUBOOT} ${LIBFDT} ${LIBUBOOT_FDT} ${LIBSTAND} OBJS+= ${SRCS:N*.h:R:S/$/.o/g} Modified: head/sys/boot/efi/boot1/Makefile == --- head/sys/boot/efi/boot1/MakefileMon Oct 9 21:06:16 2017 (r324448) +++ head/sys/boot/efi/boot1/MakefileMon Oct 9 22:11:57 2017 (r324449) @@ -91,7 +91,7 @@ LIBEFI= ${.OBJDIR}/../libefi/libefi.a # as well as required string and memory functions for all platforms. 
# DPADD+= ${LIBEFI} ${LIBSTAND} -LDADD+=${LIBEFI} -lstand +LDADD+=${LIBEFI} ${LIBSTAND} DPADD+= ${LDSCRIPT} Modified: head/sys/boot/sparc64/loader/Makefile == --- head/sys/boot/sparc64/loader/Makefile Mon Oct 9 21:06:16 2017 (r324448) +++ head/sys/boot/sparc64/loader/Makefile Mon Oct 9 22:11:57 2017 (r324449) @@ -86,7 +86,7 @@ CFLAGS+= -I${.CURDIR}/../../../../lib/libstand/ CFLAGS+= -I${SRCTOP}/sys DPADD= ${LIBFICL} ${LIBZFSBOOT} ${LIBOFW} ${LIBSTAND} -LDADD= ${LIBFICL} ${LIBZFSBOOT} ${LIBOFW} -lstand +LDADD= ${LIBFICL} ${LIBZFSBOOT} ${LIBOFW} ${LIBSTAND} loader.help: help.common help.sparc64 cat ${.ALLSRC} | \ ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r323566 - head/sys/kern
Is this something that will be MFC'ed to 11 or is this 12 / CURRENT only? On 13/09/2017 23:11, Gleb Smirnoff wrote: Author: glebius Date: Wed Sep 13 22:11:05 2017 New Revision: 323566 URL: https://svnweb.freebsd.org/changeset/base/323566 Log: Use soref() in sendfile(2) instead fhold() to reference a socket. The problem is that fdrop() requires syscall context, as it may enter sleep in some cases. The reason to use it in the original non-blocking sendfile implementation, was to avoid use of global ACCEPT_LOCK() on every I/O completion. Now in head sorele() no longer requires this lock. Modified: head/sys/kern/kern_sendfile.c Modified: head/sys/kern/kern_sendfile.c == --- head/sys/kern/kern_sendfile.c Wed Sep 13 21:56:49 2017 (r323565) +++ head/sys/kern/kern_sendfile.c Wed Sep 13 22:11:05 2017 (r323566) @@ -80,7 +80,7 @@ struct sf_io { volatile u_int nios; u_int error; int npages; - struct file *sock_fp; + struct socket *so; struct mbuf *m; vm_page_t pa[]; }; @@ -255,7 +255,7 @@ static void sendfile_iodone(void *arg, vm_page_t *pg, int count, int error) { struct sf_io *sfio = arg; - struct socket *so; + struct socket *so = sfio->so; for (int i = 0; i < count; i++) if (pg[i] != bogus_page) @@ -267,8 +267,6 @@ sendfile_iodone(void *arg, vm_page_t *pg, int count, i if (!refcount_release(&sfio->nios)) return; - so = sfio->sock_fp->f_data; - if (sfio->error) { struct mbuf *m; @@ -296,8 +294,8 @@ sendfile_iodone(void *arg, vm_page_t *pg, int count, i CURVNET_RESTORE(); } - /* XXXGL: curthread */ - fdrop(sfio->sock_fp, curthread); + SOCK_LOCK(so); + sorele(so); free(sfio, M_TEMP); } @@ -724,6 +722,7 @@ retry_space: sfio = malloc(sizeof(struct sf_io) + npages * sizeof(vm_page_t), M_TEMP, M_WAITOK); refcount_init(&sfio->nios, 1); + sfio->so = so; sfio->error = 0; nios = sendfile_swapin(obj, sfio, off, space, npages, rhpages, @@ -858,9 +857,8 @@ prepend_header: error = (*so->so_proto->pr_usrreqs->pru_send) (so, 0, m, NULL, NULL, td); } else { - sfio->sock_fp = sock_fp; sfio->npages = 
npages; - fhold(sock_fp); + soref(so); error = (*so->so_proto->pr_usrreqs->pru_send) (so, PRUS_NOTREADY, m, NULL, NULL, td); sendfile_iodone(sfio, NULL, 0, 0); ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
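The pattern in this commit, taking a reference on the socket itself before handing work to an asynchronous completion and dropping it in the completion routine, can be sketched in userland C. This is a minimal single-threaded model and not the kernel implementation: `ex_sock`, `ex_io`, `ex_soref`, `ex_sorele`, `ex_io_start` and `ex_io_done` are hypothetical names, and the counter is a plain integer where the kernel uses locking.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted socket. */
struct ex_sock {
	unsigned refs;
};

/* Hypothetical stand-in for the per-request I/O state (cf. struct sf_io). */
struct ex_io {
	struct ex_sock *so;
};

/* Take a reference; not atomic, for illustration only. */
static void
ex_soref(struct ex_sock *so)
{
	so->refs++;
}

/* Drop a reference and return the remaining count. */
static unsigned
ex_sorele(struct ex_sock *so)
{
	return (--so->refs);
}

/* Start an asynchronous request: remember the socket and pin it. */
static struct ex_io *
ex_io_start(struct ex_sock *so)
{
	struct ex_io *io = malloc(sizeof(*io));

	if (io == NULL)
		return (NULL);
	io->so = so;
	ex_soref(so);
	return (io);
}

/* Completion routine: needs only the socket pointer stored in the request. */
static unsigned
ex_io_done(struct ex_io *io)
{
	struct ex_sock *so = io->so;

	free(io);
	return (ex_sorele(so));
}
```

The point of the design, as the log message puts it, is that the completion path no longer touches the file descriptor layer, so it does not need syscall context to release what it holds.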
svn commit: r322881 - head/usr.bin/calendar/calendars
Author: smh Date: Fri Aug 25 08:21:02 2017 New Revision: 322881 URL: https://svnweb.freebsd.org/changeset/base/322881 Log: Add myself (smh) to calendar.freebsd Sponsored by: Multiplay Modified: head/usr.bin/calendar/calendars/calendar.freebsd Modified: head/usr.bin/calendar/calendars/calendar.freebsd == --- head/usr.bin/calendar/calendars/calendar.freebsdFri Aug 25 07:49:51 2017(r322880) +++ head/usr.bin/calendar/calendars/calendar.freebsdFri Aug 25 08:21:02 2017(r322881) @@ -333,6 +333,7 @@ 09/07 Chris Rees <cr...@freebsd.org> born in Kettering, United Kingdom, 1987 09/08 Boris Samorodov <b...@freebsd.org> born in Krasnodar, Russian Federation, 1963 09/09 Yoshio Mita <m...@freebsd.org> born in Hiroshima, Japan, 1972 +09/09 Steven Hartland <s...@freebsd.org> born in Wordsley, United Kingdom, 1973 09/10 Wesley R. Peters <w...@freebsd.org> born in Hartford, Alabama, United States, 1961 09/12 Weongyo Jeong <weon...@freebsd.org> born in Haman, Korea, 1980 09/12 Benedict Christopher Reuschling <b...@freebsd.org> born in Darmstadt, Germany, 1981 ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r322619 - stable/11/usr.bin/grep
This seems a little quick considering it only hit head 8 mins ago. On 17/08/2017 14:48, Kyle Evans wrote: Author: kevans Date: Thu Aug 17 13:48:46 2017 New Revision: 322619 URL: https://svnweb.freebsd.org/changeset/base/322619 Log: bsdgrep: fix build when linking against libgnuregex MFC r322618: bsdgrep: cast pmatch.rm_so to fix build when linking against libgnuregex Approved by: emaste (mentor) Modified: stable/11/usr.bin/grep/util.c Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.bin/grep/util.c == --- stable/11/usr.bin/grep/util.c Thu Aug 17 13:40:45 2017 (r322618) +++ stable/11/usr.bin/grep/util.c Thu Aug 17 13:48:46 2017 (r322619) @@ -450,7 +450,7 @@ procline(struct parsec *pc) */ if (r == REG_NOMATCH && (retry == pc->lnstart || - pmatch.rm_so + 1 < retry)) + (unsigned int)pmatch.rm_so + 1 < retry)) retry = pmatch.rm_so + 1; if (r == REG_NOMATCH) continue; ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r320138 - head/usr.sbin/bsdinstall/scripts
Author: smh Date: Tue Jun 20 08:03:50 2017 New Revision: 320138 URL: https://svnweb.freebsd.org/changeset/base/320138 Log: Fixed bsdinstall location of vfs.zfs.min_auto_ashift vfs.zfs.min_auto_ashift is a sysctl only not a tunable so updated bsdinstall to use the correct location /etc/sysctl.conf instead of /boot/loader.conf Reported by: Aaron Caza Reviewed by: allanjude MFC after:2 days Sponsored by: Multiplay Differential Revision:https://reviews.freebsd.org/D11278 Modified: head/usr.sbin/bsdinstall/scripts/config head/usr.sbin/bsdinstall/scripts/zfsboot Modified: head/usr.sbin/bsdinstall/scripts/config == --- head/usr.sbin/bsdinstall/scripts/config Tue Jun 20 08:01:13 2017 (r320137) +++ head/usr.sbin/bsdinstall/scripts/config Tue Jun 20 08:03:50 2017 (r320138) @@ -32,7 +32,7 @@ cat $BSDINSTALL_TMPETC/rc.conf.* >> $BSDINSTALL_TMPETC/rc.conf rm $BSDINSTALL_TMPETC/rc.conf.* -cat $BSDINSTALL_CHROOT/etc/sysctl.conf $BSDINSTALL_TMPETC/sysctl.conf.hardening >> $BSDINSTALL_TMPETC/sysctl.conf +cat $BSDINSTALL_CHROOT/etc/sysctl.conf $BSDINSTALL_TMPETC/sysctl.conf.* >> $BSDINSTALL_TMPETC/sysctl.conf rm $BSDINSTALL_TMPETC/sysctl.conf.* cp $BSDINSTALL_TMPETC/* $BSDINSTALL_CHROOT/etc Modified: head/usr.sbin/bsdinstall/scripts/zfsboot == --- head/usr.sbin/bsdinstall/scripts/zfsbootTue Jun 20 08:01:13 2017 (r320137) +++ head/usr.sbin/bsdinstall/scripts/zfsbootTue Jun 20 08:03:50 2017 (r320138) @@ -1446,7 +1446,7 @@ zfs_create_boot() if [ "$ZFSBOOT_FORCE_4K_SECTORS" ]; then f_eval_catch $funcname echo "$ECHO_APPEND" \ 'vfs.zfs.min_auto_ashift=12' \ -$BSDINSTALL_TMPBOOT/loader.conf.zfs || return $FAILURE +$BSDINSTALL_TMPETC/sysctl.conf.zfs || return $FAILURE fi if [ "$ZFSBOOT_SWAP_MIRROR" ]; then ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r318438 - in stable/10: cddl/lib/libdtrace sys/netinet
Author: smh Date: Thu May 18 03:32:01 2017 New Revision: 318438 URL: https://svnweb.freebsd.org/changeset/base/318438 Log: Revert the partial MFC of r313045 which broke dtrace This removes the mbuf to ipinfo_t translator and switches tcp_autorcvbuf to use the older mtod macro. This was originally merged to stable/10 as part of r317375. Reported by: markj Reviewed by: markj, hiren Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D10769 Modified: stable/10/cddl/lib/libdtrace/ip.d stable/10/sys/netinet/in_kdtrace.c stable/10/sys/netinet/tcp_input.c Directory Properties: stable/10/ (props changed) Modified: stable/10/cddl/lib/libdtrace/ip.d == --- stable/10/cddl/lib/libdtrace/ip.d Thu May 18 01:46:30 2017 (r318437) +++ stable/10/cddl/lib/libdtrace/ip.d Thu May 18 03:32:01 2017 (r318438) @@ -240,24 +240,6 @@ translator ipinfo_t < uint8_t *p > { #pragma D binding "1.0" IFF_LOOPBACK inline int IFF_LOOPBACK = 0x8; -#pragma D binding "1.13" translator -translator ipinfo_t < struct mbuf *m > { - ip_ver =m == NULL ? 0 : ((struct ip *)m->m_data)->ip_v; - ip_plength =m == NULL ? 0 : - ((struct ip *)m->m_data)->ip_v == 4 ? - ntohs(((struct ip *)m->m_data)->ip_len) - - (((struct ip *)m->m_data)->ip_hl << 2): - ntohs(((struct ip6_hdr *)m->m_data)->ip6_ctlun.ip6_un1.ip6_un1_plen); - ip_saddr = m == NULL ? 0 : - ((struct ip *)m->m_data)->ip_v == 4 ? - inet_ntoa(&((struct ip *)m->m_data)->ip_src.s_addr) : - inet_ntoa6(&((struct ip6_hdr *)m->m_data)->ip6_src); - ip_daddr = m == NULL ? 0 : - ((struct ip *)m->m_data)->ip_v == 4 ? 
- inet_ntoa(&((struct ip *)m->m_data)->ip_dst.s_addr) : - inet_ntoa6(&((struct ip6_hdr *)m->m_data)->ip6_dst); -}; - #pragma D binding "1.0" translator translator ifinfo_t < struct ifnet *p > { if_name = p->if_xname; Modified: stable/10/sys/netinet/in_kdtrace.c == --- stable/10/sys/netinet/in_kdtrace.c Thu May 18 01:46:30 2017 (r318437) +++ stable/10/sys/netinet/in_kdtrace.c Thu May 18 03:32:01 2017 (r318438) @@ -58,28 +58,28 @@ SDT_PROBE_DEFINE6_XLATE(ip, , , send, SDT_PROBE_DEFINE5_XLATE(tcp, , , accept__established, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"struct mbuf *", "ipinfo_t *", +"uint8_t *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfoh_t *"); SDT_PROBE_DEFINE5_XLATE(tcp, , , accept__refused, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"struct mbuf *", "ipinfo_t *", +"uint8_t *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfo_t *"); SDT_PROBE_DEFINE5_XLATE(tcp, , , connect__established, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"struct mbuf *", "ipinfo_t *", +"uint8_t *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfoh_t *"); SDT_PROBE_DEFINE5_XLATE(tcp, , , connect__refused, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"struct mbuf *", "ipinfo_t *", +"uint8_t *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfoh_t *"); @@ -93,7 +93,7 @@ SDT_PROBE_DEFINE5_XLATE(tcp, , , connect SDT_PROBE_DEFINE5_XLATE(tcp, , , receive, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"struct mbuf *", "ipinfo_t *", +"uint8_t *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfoh_t *"); @@ -115,7 +115,7 @@ SDT_PROBE_DEFINE6_XLATE(tcp, , , state__ SDT_PROBE_DEFINE6_XLATE(tcp, , , receive__autoresize, "void *", "void *", "struct tcpcb *", "csinfo_t *", -"struct mbuf *", "ipinfo_t *", +"uint8_t *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr 
*", "tcpinfoh_t *", "int", "int"); Modified: stable/10/sys/netinet/tcp_input.c == --- stable/10/sys/netinet/tcp_input.c Thu May 18 01:46:30 2017 (r318437) +++ stable/10/sys/netinet/tcp_input.c Thu May 18 03:32:01 2017 (r318438) @@ -1519,7 +1519,8 @@ tcp_autorcvbuf(struct mbuf *m, struct tc newsize = min(so->so_rcv.sb_hiwat + V_tcp_autorcvbuf_inc, V_tcp_autorcvbuf_max); } - TCP_PROBE6(receive__autoresize, NULL, tp, m, tp, th, newsize); + TCP_PROBE6(receive__autoresize, NULL, tp, mtod(m, const char *), + tp, th, newsize); /* Start over with next RTT. */ tp->rfbuf_ts = 0;
svn commit: r317470 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Author: smh
Date: Wed Apr 26 22:25:01 2017
New Revision: 317470
URL: https://svnweb.freebsd.org/changeset/base/317470

Log:
  MFC r315449:

  Reduce ARC fragmentation threshold

  Sponsored by: Multiplay

Modified:
  stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
Directory Properties:
  stable/11/ (props changed)

Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
==============================================================================
--- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	Wed Apr 26 22:23:42 2017	(r317469)
+++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	Wed Apr 26 22:25:01 2017	(r317470)
@@ -3978,7 +3978,7 @@ arc_available_memory(void)
 	 * Start aggressive reclamation if too little sequential KVA left.
 	 */
 	if (lowest > 0) {
-		n = (vmem_size(heap_arena, VMEM_MAXFREE) < zfs_max_recordsize) ?
+		n = (vmem_size(heap_arena, VMEM_MAXFREE) < SPA_MAXBLOCKSIZE) ?
 		    -((int64_t)vmem_size(heap_arena, VMEM_ALLOC) >> 4) :
 		    INT64_MAX;
 		if (n < lowest) {
svn commit: r317469 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Author: smh
Date: Wed Apr 26 22:23:42 2017
New Revision: 317469
URL: https://svnweb.freebsd.org/changeset/base/317469

Log:
  MFC r316460:

  Fix expandsz 16.0E vals and vdev_min_asize of RAIDZ children

  Sponsored by: Multiplay

Modified:
  stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
Directory Properties:
  stable/11/ (props changed)

Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
==============================================================================
--- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	Wed Apr 26 22:17:54 2017	(r317468)
+++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	Wed Apr 26 22:23:42 2017	(r317469)
@@ -228,7 +228,8 @@ vdev_get_min_asize(vdev_t *vd)
 	 * so each child must provide at least 1/Nth of its asize.
 	 */
 	if (pvd->vdev_ops == &vdev_raidz_ops)
-		return (pvd->vdev_min_asize / pvd->vdev_children);
+		return ((pvd->vdev_min_asize + pvd->vdev_children - 1) /
+		    pvd->vdev_children);
 
 	return (pvd->vdev_min_asize);
 }
@@ -1376,7 +1377,7 @@ vdev_open(vdev_t *vd)
 	vd->vdev_psize = psize;
 
 	/*
-	 * Make sure the allocatable size hasn't shrunk.
+	 * Make sure the allocatable size hasn't shrunk too much.
 	 */
 	if (asize < vd->vdev_min_asize) {
 		vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN,
@@ -1416,12 +1417,21 @@ vdev_open(vdev_t *vd)
 	}
 
 	/*
-	 * If all children are healthy and the asize has increased,
-	 * then we've experienced dynamic LUN growth. If automatic
-	 * expansion is enabled then use the additional space.
-	 */
-	if (vd->vdev_state == VDEV_STATE_HEALTHY && asize > vd->vdev_asize &&
-	    (vd->vdev_expanding || spa->spa_autoexpand))
+	 * If all children are healthy we update asize if either:
+	 * The asize has increased, due to a device expansion caused by dynamic
+	 * LUN growth or vdev replacement, and automatic expansion is enabled;
+	 * making the additional space available.
+	 *
+	 * The asize has decreased, due to a device shrink usually caused by a
+	 * vdev replace with a smaller device. This ensures that calculations
+	 * based of max_asize and asize e.g. esize are always valid. It's safe
+	 * to do this as we've already validated that asize is greater than
+	 * vdev_min_asize.
+	 */
+	if (vd->vdev_state == VDEV_STATE_HEALTHY &&
+	    ((asize > vd->vdev_asize &&
+	    (vd->vdev_expanding || spa->spa_autoexpand)) ||
+	    (asize < vd->vdev_asize)))
 		vd->vdev_asize = asize;
 
 	vdev_set_min_asize(vd);
svn commit: r317375 - in stable/10: cddl/lib/libdtrace sys/netinet
Author: smh Date: Mon Apr 24 16:31:28 2017 New Revision: 317375 URL: https://svnweb.freebsd.org/changeset/base/317375 Log: Partial MFC r316676 and the required r313045 MFC r316676: Use estimated RTT for receive buffer auto resizing instead of timestamps. This is a partial MFC as stable/10 doesn't include the TCP stack modularisation. MFC r313045: Add an mbuf to ipinfo_t translator to finish cleanup of mbuf passing to TCP probes. This is a partial MFC (missing debug__output & debug__drop changes) due to the massive amount of additional dtrace changes that would be required for a full MFC. Relnotes: Yes Sponsored by: Multiplay Modified: stable/10/cddl/lib/libdtrace/ip.d stable/10/sys/netinet/in_kdtrace.c stable/10/sys/netinet/in_kdtrace.h stable/10/sys/netinet/tcp_input.c stable/10/sys/netinet/tcp_output.c stable/10/sys/netinet/tcp_var.h Directory Properties: stable/10/ (props changed) Modified: stable/10/cddl/lib/libdtrace/ip.d == --- stable/10/cddl/lib/libdtrace/ip.d Mon Apr 24 16:07:30 2017 (r317374) +++ stable/10/cddl/lib/libdtrace/ip.d Mon Apr 24 16:31:28 2017 (r317375) @@ -240,6 +240,24 @@ translator ipinfo_t < uint8_t *p > { #pragma D binding "1.0" IFF_LOOPBACK inline int IFF_LOOPBACK = 0x8; +#pragma D binding "1.13" translator +translator ipinfo_t < struct mbuf *m > { + ip_ver =m == NULL ? 0 : ((struct ip *)m->m_data)->ip_v; + ip_plength =m == NULL ? 0 : + ((struct ip *)m->m_data)->ip_v == 4 ? + ntohs(((struct ip *)m->m_data)->ip_len) - + (((struct ip *)m->m_data)->ip_hl << 2): + ntohs(((struct ip6_hdr *)m->m_data)->ip6_ctlun.ip6_un1.ip6_un1_plen); + ip_saddr = m == NULL ? 0 : + ((struct ip *)m->m_data)->ip_v == 4 ? + inet_ntoa(&((struct ip *)m->m_data)->ip_src.s_addr) : + inet_ntoa6(&((struct ip6_hdr *)m->m_data)->ip6_src); + ip_daddr = m == NULL ? 0 : + ((struct ip *)m->m_data)->ip_v == 4 ? 
+ inet_ntoa(&((struct ip *)m->m_data)->ip_dst.s_addr) : + inet_ntoa6(&((struct ip6_hdr *)m->m_data)->ip6_dst); +}; + #pragma D binding "1.0" translator translator ifinfo_t < struct ifnet *p > { if_name = p->if_xname; Modified: stable/10/sys/netinet/in_kdtrace.c == --- stable/10/sys/netinet/in_kdtrace.c Mon Apr 24 16:07:30 2017 (r317374) +++ stable/10/sys/netinet/in_kdtrace.c Mon Apr 24 16:31:28 2017 (r317375) @@ -58,28 +58,28 @@ SDT_PROBE_DEFINE6_XLATE(ip, , , send, SDT_PROBE_DEFINE5_XLATE(tcp, , , accept__established, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"uint8_t *", "ipinfo_t *", +"struct mbuf *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfoh_t *"); SDT_PROBE_DEFINE5_XLATE(tcp, , , accept__refused, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"uint8_t *", "ipinfo_t *", +"struct mbuf *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfo_t *"); SDT_PROBE_DEFINE5_XLATE(tcp, , , connect__established, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"uint8_t *", "ipinfo_t *", +"struct mbuf *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfoh_t *"); SDT_PROBE_DEFINE5_XLATE(tcp, , , connect__refused, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"uint8_t *", "ipinfo_t *", +"struct mbuf *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfoh_t *"); @@ -93,7 +93,7 @@ SDT_PROBE_DEFINE5_XLATE(tcp, , , connect SDT_PROBE_DEFINE5_XLATE(tcp, , , receive, "void *", "pktinfo_t *", "struct tcpcb *", "csinfo_t *", -"uint8_t *", "ipinfo_t *", +"struct mbuf *", "ipinfo_t *", "struct tcpcb *", "tcpsinfo_t *" , "struct tcphdr *", "tcpinfoh_t *"); @@ -112,6 +112,14 @@ SDT_PROBE_DEFINE6_XLATE(tcp, , , state__ "void *", "void *", "int", "tcplsinfo_t *"); +SDT_PROBE_DEFINE6_XLATE(tcp, , , receive__autoresize, +"void *", "void *", +"struct tcpcb *", "csinfo_t *", +"struct mbuf *", "ipinfo_t *", +"struct tcpcb *", "tcpsinfo_t 
*" , +"struct tcphdr *", "tcpinfoh_t *", +"int", "int"); + SDT_PROBE_DEFINE5_XLATE(udp, , , receive, "void *", "pktinfo_t *", "struct inpcb *", "csinfo_t *", Modified: stable/10/sys/netinet/in_kdtrace.h == --- stable/10/sys/netinet/in_kdtrace.h Mon Apr 24 16:07:30 2017 (r317374) +++ stable/10/sys/netinet/in_kdtrace.h Mon Apr 24 16:31:28 2017 (r317375) @@ -52,6 +52,7 @@ SDT_PROBE_DECLARE(tcp, , , connect__requ
svn commit: r317368 - in stable/11/sys/netinet: . tcp_stacks
Author: smh Date: Mon Apr 24 11:34:02 2017 New Revision: 317368 URL: https://svnweb.freebsd.org/changeset/base/317368 Log: MFC r316676: Use estimated RTT for receive buffer auto resizing instead of timestamps Relnotes: Yes Sponsored by: Multiplay Modified: stable/11/sys/netinet/in_kdtrace.c stable/11/sys/netinet/in_kdtrace.h stable/11/sys/netinet/tcp_input.c stable/11/sys/netinet/tcp_output.c stable/11/sys/netinet/tcp_stacks/fastpath.c stable/11/sys/netinet/tcp_var.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/netinet/in_kdtrace.c == --- stable/11/sys/netinet/in_kdtrace.c Mon Apr 24 11:22:06 2017 (r317367) +++ stable/11/sys/netinet/in_kdtrace.c Mon Apr 24 11:34:02 2017 (r317368) @@ -132,6 +132,14 @@ SDT_PROBE_DEFINE6_XLATE(tcp, , , state__ "void *", "void *", "int", "tcplsinfo_t *"); +SDT_PROBE_DEFINE6_XLATE(tcp, , , receive__autoresize, +"void *", "void *", +"struct tcpcb *", "csinfo_t *", +"struct mbuf *", "ipinfo_t *", +"struct tcpcb *", "tcpsinfo_t *" , +"struct tcphdr *", "tcpinfoh_t *", +"int", "int"); + SDT_PROBE_DEFINE5_XLATE(udp, , , receive, "void *", "pktinfo_t *", "struct inpcb *", "csinfo_t *", Modified: stable/11/sys/netinet/in_kdtrace.h == --- stable/11/sys/netinet/in_kdtrace.h Mon Apr 24 11:22:06 2017 (r317367) +++ stable/11/sys/netinet/in_kdtrace.h Mon Apr 24 11:34:02 2017 (r317368) @@ -65,6 +65,7 @@ SDT_PROBE_DECLARE(tcp, , , debug__input) SDT_PROBE_DECLARE(tcp, , , debug__output); SDT_PROBE_DECLARE(tcp, , , debug__user); SDT_PROBE_DECLARE(tcp, , , debug__drop); +SDT_PROBE_DECLARE(tcp, , , receive__autoresize); SDT_PROBE_DECLARE(udp, , , receive); SDT_PROBE_DECLARE(udp, , , send); Modified: stable/11/sys/netinet/tcp_input.c == --- stable/11/sys/netinet/tcp_input.c Mon Apr 24 11:22:06 2017 (r317367) +++ stable/11/sys/netinet/tcp_input.c Mon Apr 24 11:34:02 2017 (r317368) @@ -1473,6 +1473,68 @@ drop: return (IPPROTO_DONE); } +/* + * Automatic sizing of receive socket buffer. 
Often the send + * buffer size is not optimally adjusted to the actual network + * conditions at hand (delay bandwidth product). Setting the + * buffer size too small limits throughput on links with high + * bandwidth and high delay (eg. trans-continental/oceanic links). + * + * On the receive side the socket buffer memory is only rarely + * used to any significant extent. This allows us to be much + * more aggressive in scaling the receive socket buffer. For + * the case that the buffer space is actually used to a large + * extent and we run out of kernel memory we can simply drop + * the new segments; TCP on the sender will just retransmit it + * later. Setting the buffer size too big may only consume too + * much kernel memory if the application doesn't read() from + * the socket or packet loss or reordering makes use of the + * reassembly queue. + * + * The criteria to step up the receive buffer one notch are: + * 1. Application has not set receive buffer size with + * SO_RCVBUF. Setting SO_RCVBUF clears SB_AUTOSIZE. + * 2. the number of bytes received during the time it takes + * one timestamp to be reflected back to us (the RTT); + * 3. received bytes per RTT is within seven eighth of the + * current socket buffer size; + * 4. receive buffer size has not hit maximal automatic size; + * + * This algorithm does one step per RTT at most and only if + * we receive a bulk stream w/o packet losses or reorderings. + * Shrinking the buffer during idle times is not necessary as + * it doesn't consume any memory when idle. + * + * TODO: Only step up if the application is actually serving + * the buffer to better manage the socket buffer resources. 
+ */ +int +tcp_autorcvbuf(struct mbuf *m, struct tcphdr *th, struct socket *so, +struct tcpcb *tp, int tlen) +{ + int newsize = 0; + + if (V_tcp_do_autorcvbuf && (so->so_rcv.sb_flags & SB_AUTOSIZE) && + tp->t_srtt != 0 && tp->rfbuf_ts != 0 && + TCP_TS_TO_TICKS(tcp_ts_getticks() - tp->rfbuf_ts) > + (tp->t_srtt >> TCP_RTT_SHIFT)) { + if (tp->rfbuf_cnt > (so->so_rcv.sb_hiwat / 8 * 7) && + so->so_rcv.sb_hiwat < V_tcp_autorcvbuf_max) { + newsize = min(so->so_rcv.sb_hiwat + + V_tcp_autorcvbuf_inc, V_tcp_autorcvbuf_max); + } + TCP_PROBE6(receive__autoresize, NULL, tp, m, tp, th, newsize); + + /* Start over with next RTT. */ + tp->rfbuf_ts = 0; + tp->rfbuf_cnt = 0; + } else { + tp->rfbuf_cnt += tlen; /* add up */ + } + + return
svn commit: r316944 - in stable/11: . sys/netinet sys/netinet6
Author: smh
Date: Fri Apr 14 22:02:08 2017
New Revision: 316944
URL: https://svnweb.freebsd.org/changeset/base/316944

Log:
  MFC r316313, r316328:

  Allow explicitly assigned IPv4 & IPv6 loopback addresses to be used in jails.

  Relnotes: Yes
  Sponsored by: Multiplay

Modified:
  stable/11/UPDATING
  stable/11/sys/netinet/in_jail.c
  stable/11/sys/netinet6/in6_jail.c
Directory Properties:
  stable/11/ (props changed)

Modified: stable/11/UPDATING
==============================================================================
--- stable/11/UPDATING	Fri Apr 14 21:49:20 2017	(r316943)
+++ stable/11/UPDATING	Fri Apr 14 22:02:08 2017	(r316944)
@@ -16,6 +16,11 @@ from older versions of FreeBSD, try WITH
 the tip of head, and then rebuild without this option. The bootstrap process from
 older version of current across the gcc/clang cutover is a bit fragile.
 
+20170414:
+	Binds and sends to the loopback addresses, IPv6 and IPv4, will now
+	use any explicitly assigned loopback address available in the jail
+	instead of using the first assigned address of the jail.
+
 20170402:
 	Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 4.0.0.
 	Please see the 20141231 entry below for information about prerequisites

Modified: stable/11/sys/netinet/in_jail.c
==============================================================================
--- stable/11/sys/netinet/in_jail.c	Fri Apr 14 21:49:20 2017	(r316943)
+++ stable/11/sys/netinet/in_jail.c	Fri Apr 14 22:02:08 2017	(r316944)
@@ -306,11 +306,6 @@ prison_local_ip4(struct ucred *cred, str
 	}
 
 	ia0.s_addr = ntohl(ia->s_addr);
-	if (ia0.s_addr == INADDR_LOOPBACK) {
-		ia->s_addr = pr->pr_ip4[0].s_addr;
-		mtx_unlock(&pr->pr_mtx);
-		return (0);
-	}
 
 	if (ia0.s_addr == INADDR_ANY) {
 		/*
@@ -323,6 +318,11 @@ prison_local_ip4(struct ucred *cred, str
 	}
 
 	error = prison_check_ip4_locked(pr, ia);
+	if (error == EADDRNOTAVAIL && ia0.s_addr == INADDR_LOOPBACK) {
+		ia->s_addr = pr->pr_ip4[0].s_addr;
+		error = 0;
+	}
+
 	mtx_unlock(&pr->pr_mtx);
 	return (error);
 }
@@ -354,7 +354,8 @@ prison_remote_ip4(struct ucred *cred, st
 		return (EAFNOSUPPORT);
 	}
 
-	if (ntohl(ia->s_addr) == INADDR_LOOPBACK) {
+	if (ntohl(ia->s_addr) == INADDR_LOOPBACK &&
+	    prison_check_ip4_locked(pr, ia) == EADDRNOTAVAIL) {
 		ia->s_addr = pr->pr_ip4[0].s_addr;
 		mtx_unlock(&pr->pr_mtx);
 		return (0);
@@ -370,9 +371,8 @@ prison_remote_ip4(struct ucred *cred, st
 /*
  * Check if given address belongs to the jail referenced by cred/prison.
  *
- * Returns 0 if jail doesn't restrict IPv4 or if address belongs to jail,
- * EADDRNOTAVAIL if the address doesn't belong, or EAFNOSUPPORT if the jail
- * doesn't allow IPv4. Address passed in in NBO.
+ * Returns 0 if address belongs to jail,
+ * EADDRNOTAVAIL if the address doesn't belong to the jail.
  */
 int
 prison_check_ip4_locked(const struct prison *pr, const struct in_addr *ia)

Modified: stable/11/sys/netinet6/in6_jail.c
==============================================================================
--- stable/11/sys/netinet6/in6_jail.c	Fri Apr 14 21:49:20 2017	(r316943)
+++ stable/11/sys/netinet6/in6_jail.c	Fri Apr 14 22:02:08 2017	(r316944)
@@ -293,12 +293,6 @@ prison_local_ip6(struct ucred *cred, str
 		return (EAFNOSUPPORT);
 	}
 
-	if (IN6_IS_ADDR_LOOPBACK(ia6)) {
-		bcopy(&pr->pr_ip6[0], ia6, sizeof(struct in6_addr));
-		mtx_unlock(&pr->pr_mtx);
-		return (0);
-	}
-
 	if (IN6_IS_ADDR_UNSPECIFIED(ia6)) {
 		/*
 		 * In case there is only 1 IPv6 address, and v6only is true,
@@ -311,6 +305,11 @@ prison_local_ip6(struct ucred *cred, str
 	}
 
 	error = prison_check_ip6_locked(pr, ia6);
+	if (error == EADDRNOTAVAIL && IN6_IS_ADDR_LOOPBACK(ia6)) {
+		bcopy(&pr->pr_ip6[0], ia6, sizeof(struct in6_addr));
+		error = 0;
+	}
+
 	mtx_unlock(&pr->pr_mtx);
 	return (error);
 }
@@ -341,7 +340,8 @@ prison_remote_ip6(struct ucred *cred, st
 		return (EAFNOSUPPORT);
 	}
 
-	if (IN6_IS_ADDR_LOOPBACK(ia6)) {
+	if (IN6_IS_ADDR_LOOPBACK(ia6) &&
+	    prison_check_ip6_locked(pr, ia6) == EADDRNOTAVAIL) {
 		bcopy(&pr->pr_ip6[0], ia6, sizeof(struct in6_addr));
 		mtx_unlock(&pr->pr_mtx);
 		return (0);
@@ -357,9 +357,8 @@ prison_remote_ip6(struct ucred *cred, st
 /*
  * Check if given address belongs to the jail referenced by cred/prison.
  *
- * Returns 0 if jail doesn't restrict IPv6 or if address belongs to jail,
- * EADDRNOTAVAIL if the address doesn't belong, or EAFNOSUPPORT if the jail
- * doesn't allow IPv6.
+ * Returns 0
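
For anyone skimming the diff, the new local-address behaviour from the UPDATING entry can be modelled in user space roughly as below. This is only a sketch: jail_has_addr() and jail_local_v4() are made-up names standing in for prison_check_ip4_locked() and prison_local_ip4(), addresses are plain host-order integers, and the locking and INADDR_ANY handling of the real code are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define LOOPBACK_V4 0x7f000001u	/* 127.0.0.1, host byte order */

/* Hypothetical stand-in for prison_check_ip4_locked(): 0 if found. */
static int
jail_has_addr(const uint32_t *addrs, size_t n, uint32_t want)
{
	for (size_t i = 0; i < n; i++)
		if (addrs[i] == want)
			return (0);
	return (EADDRNOTAVAIL);
}

/*
 * Hypothetical model of the new choice: look the address up first and
 * only fall back to the jail's first address when the request was for
 * loopback and the jail has no loopback address of its own.
 */
static int
jail_local_v4(const uint32_t *addrs, size_t n, uint32_t *addr)
{
	int error;

	assert(n > 0);
	error = jail_has_addr(addrs, n, *addr);
	if (error == EADDRNOTAVAIL && *addr == LOOPBACK_V4) {
		*addr = addrs[0];
		error = 0;
	}
	return (error);
}
```

With a jail holding 10.0.0.1 and 127.0.0.1, a bind to 127.0.0.1 keeps 127.0.0.1; with only 10.0.0.1 assigned it is still rewritten to 10.0.0.1 as before.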
svn commit: r316943 - in stable/11/sys: conf kern netinet netinet6 sys
Author: smh Date: Fri Apr 14 21:49:20 2017 New Revision: 316943 URL: https://svnweb.freebsd.org/changeset/base/316943 Log: MFC r303863: Move IPv4 & IPv6 specific jail functions to netinet and netinet6 files. Sponsored by: Multiplay Added: stable/11/sys/netinet/in_jail.c - copied unchanged from r303863, head/sys/netinet/in_jail.c stable/11/sys/netinet6/in6_jail.c - copied unchanged from r303863, head/sys/netinet6/in6_jail.c Modified: stable/11/sys/conf/files stable/11/sys/kern/kern_jail.c stable/11/sys/sys/jail.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/conf/files == --- stable/11/sys/conf/filesFri Apr 14 21:42:27 2017(r316942) +++ stable/11/sys/conf/filesFri Apr 14 21:49:20 2017(r316943) @@ -3805,6 +3805,7 @@ netinet/in_fib.c optional inet netinet/in_gif.c optional gif inet | netgraph_gif inet netinet/ip_gre.c optional gre inet netinet/ip_id.coptional inet +netinet/in_jail.c optional inet netinet/in_mcast.c optional inet netinet/in_pcb.c optional inet | inet6 netinet/in_pcbgroup.c optional inet pcbgroup | inet6 pcbgroup @@ -3871,6 +3872,7 @@ netinet6/in6_cksum.c optional inet6 netinet6/in6_fib.c optional inet6 netinet6/in6_gif.c optional gif inet6 | netgraph_gif inet6 netinet6/in6_ifattach.coptional inet6 +netinet6/in6_jail.coptional inet6 netinet6/in6_mcast.c optional inet6 netinet6/in6_pcb.c optional inet6 netinet6/in6_pcbgroup.coptional inet6 pcbgroup Modified: stable/11/sys/kern/kern_jail.c == --- stable/11/sys/kern/kern_jail.c Fri Apr 14 21:42:27 2017 (r316942) +++ stable/11/sys/kern/kern_jail.c Fri Apr 14 21:49:20 2017 (r316943) @@ -130,14 +130,6 @@ static void prison_racct_attach(struct p static void prison_racct_modify(struct prison *pr); static void prison_racct_detach(struct prison *pr); #endif -#ifdef INET -static int _prison_check_ip4(const struct prison *, const struct in_addr *); -static int prison_restrict_ip4(struct prison *pr, struct in_addr *newip4); -#endif -#ifdef INET6 -static int _prison_check_ip6(struct prison 
*pr, struct in6_addr *ia6); -static int prison_restrict_ip6(struct prison *pr, struct in6_addr *newip6); -#endif /* Flags for prison_deref */ #definePD_DEREF0x01 @@ -252,54 +244,6 @@ prison0_init(void) strlcpy(prison0.pr_osrelease, osrelease, sizeof(prison0.pr_osrelease)); } -#ifdef INET -static int -qcmp_v4(const void *ip1, const void *ip2) -{ - in_addr_t iaa, iab; - - /* -* We need to compare in HBO here to get the list sorted as expected -* by the result of the code. Sorting NBO addresses gives you -* interesting results. If you do not understand, do not try. -*/ - iaa = ntohl(((const struct in_addr *)ip1)->s_addr); - iab = ntohl(((const struct in_addr *)ip2)->s_addr); - - /* -* Do not simply return the difference of the two numbers, the int is -* not wide enough. -*/ - if (iaa > iab) - return (1); - else if (iaa < iab) - return (-1); - else - return (0); -} -#endif - -#ifdef INET6 -static int -qcmp_v6(const void *ip1, const void *ip2) -{ - const struct in6_addr *ia6a, *ia6b; - int i, rc; - - ia6a = (const struct in6_addr *)ip1; - ia6b = (const struct in6_addr *)ip2; - - rc = 0; - for (i = 0; rc == 0 && i < sizeof(struct in6_addr); i++) { - if (ia6a->s6_addr[i] > ia6b->s6_addr[i]) - rc = 1; - else if (ia6a->s6_addr[i] < ia6b->s6_addr[i]) - rc = -1; - } - return (rc); -} -#endif - /* * struct jail_args { * struct jail *jail; @@ -845,7 +789,8 @@ kern_jail_set(struct thread *td, struct * address to connect from. */ if (ip4s > 1) - qsort(ip4 + 1, ip4s - 1, sizeof(*ip4), qcmp_v4); + qsort(ip4 + 1, ip4s - 1, sizeof(*ip4), + prison_qcmp_v4); /* * Check for duplicate addresses and do some simple * zero and broadcast checks. If users give other bogus @@ -893,7 +838,8 @@ kern_jail_set(struct thread *td, struct ip6 = malloc(ip6s * sizeof(*ip6), M_PRISON, M_WAITOK); bcopy(op, ip6, ip6s * sizeof(*ip6)); if (ip6s > 1) - qsort(ip6 + 1, ip6s - 1,
Re: svn commit: r316676 - in head/sys/netinet: . tcp_stacks
I don't tend to MFC 10.x now, but do agree given the impact that for this one it should be done. The fix is a little different, due to code restructuring in 11 / head, but I do have a 10.x version already. Regards Steve On 10/04/2017 15:51, Julian Elischer wrote: If possible MFC to 10 too would be nice.. thanks On 10/4/17 4:19 pm, Steven Hartland wrote: Author: smh Date: Mon Apr 10 08:19:35 2017 New Revision: 316676 URL: https://svnweb.freebsd.org/changeset/base/316676 Log: Use estimated RTT for receive buffer auto resizing instead of timestamps Switched from using timestamps to RTT estimates when performing TCP receive buffer auto resizing, as not all hosts support / enable TCP timestamps. Disabled reset of receive buffer auto scaling when not in bulk receive mode, which gives an extra 20% performance increase. Also extracted auto resizing to a common method shared between standard and fastpath modules. With this AWS S3 downloads at ~17ms latency on a 1Gbps connection jump from ~3MB/s to ~100MB/s using the default settings. 
Reviewed by:lstewart, gnn MFC after: 2 weeks Relnotes: Yes Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D9668 Modified: head/sys/netinet/in_kdtrace.c head/sys/netinet/in_kdtrace.h head/sys/netinet/tcp_input.c head/sys/netinet/tcp_output.c head/sys/netinet/tcp_stacks/fastpath.c head/sys/netinet/tcp_var.h Modified: head/sys/netinet/in_kdtrace.c == --- head/sys/netinet/in_kdtrace.cMon Apr 10 06:19:09 2017 (r316675) +++ head/sys/netinet/in_kdtrace.cMon Apr 10 08:19:35 2017 (r316676) @@ -132,6 +132,14 @@ SDT_PROBE_DEFINE6_XLATE(tcp, , , state__ "void *", "void *", "int", "tcplsinfo_t *"); +SDT_PROBE_DEFINE6_XLATE(tcp, , , receive__autoresize, +"void *", "void *", +"struct tcpcb *", "csinfo_t *", +"struct mbuf *", "ipinfo_t *", +"struct tcpcb *", "tcpsinfo_t *" , +"struct tcphdr *", "tcpinfoh_t *", +"int", "int"); + SDT_PROBE_DEFINE5_XLATE(udp, , , receive, "void *", "pktinfo_t *", "struct inpcb *", "csinfo_t *", Modified: head/sys/netinet/in_kdtrace.h == --- head/sys/netinet/in_kdtrace.hMon Apr 10 06:19:09 2017 (r316675) +++ head/sys/netinet/in_kdtrace.hMon Apr 10 08:19:35 2017 (r316676) @@ -65,6 +65,7 @@ SDT_PROBE_DECLARE(tcp, , , debug__input) SDT_PROBE_DECLARE(tcp, , , debug__output); SDT_PROBE_DECLARE(tcp, , , debug__user); SDT_PROBE_DECLARE(tcp, , , debug__drop); +SDT_PROBE_DECLARE(tcp, , , receive__autoresize); SDT_PROBE_DECLARE(udp, , , receive); SDT_PROBE_DECLARE(udp, , , send); Modified: head/sys/netinet/tcp_input.c == --- head/sys/netinet/tcp_input.cMon Apr 10 06:19:09 2017 (r316675) +++ head/sys/netinet/tcp_input.cMon Apr 10 08:19:35 2017 (r316676) @@ -1486,6 +1486,68 @@ drop: return (IPPROTO_DONE); } +/* + * Automatic sizing of receive socket buffer. Often the send + * buffer size is not optimally adjusted to the actual network + * conditions at hand (delay bandwidth product). Setting the + * buffer size too small limits throughput on links with high + * bandwidth and high delay (eg. trans-continental/oceanic links). 
+ * + * On the receive side the socket buffer memory is only rarely + * used to any significant extent. This allows us to be much + * more aggressive in scaling the receive socket buffer. For + * the case that the buffer space is actually used to a large + * extent and we run out of kernel memory we can simply drop + * the new segments; TCP on the sender will just retransmit it + * later. Setting the buffer size too big may only consume too + * much kernel memory if the application doesn't read() from + * the socket or packet loss or reordering makes use of the + * reassembly queue. + * + * The criteria to step up the receive buffer one notch are: + * 1. Application has not set receive buffer size with + * SO_RCVBUF. Setting SO_RCVBUF clears SB_AUTOSIZE. + * 2. the number of bytes received during the time it takes + * one timestamp to be reflected back to us (the RTT); + * 3. received bytes per RTT is within seven eighth of the + * current socket buffer size; + * 4. receive buffer size has not hit maximal automatic size; + * + * This algorithm does one step per RTT at most and only if + * we receive a bulk stream w/o packet losses or reorderings. + * Shrinking the buffer during idle times is not necessary as + * it doesn't consume any memory when idle. + * +
svn commit: r316676 - in head/sys/netinet: . tcp_stacks
Author: smh Date: Mon Apr 10 08:19:35 2017 New Revision: 316676 URL: https://svnweb.freebsd.org/changeset/base/316676 Log: Use estimated RTT for receive buffer auto resizing instead of timestamps Switched from using timestamps to RTT estimates when performing TCP receive buffer auto resizing, as not all hosts support / enable TCP timestamps. Disabled reset of receive buffer auto scaling when not in bulk receive mode, which gives an extra 20% performance increase. Also extracted auto resizing to a common method shared between standard and fastpath modules. With this AWS S3 downloads at ~17ms latency on a 1Gbps connection jump from ~3MB/s to ~100MB/s using the default settings. Reviewed by:lstewart, gnn MFC after: 2 weeks Relnotes: Yes Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D9668 Modified: head/sys/netinet/in_kdtrace.c head/sys/netinet/in_kdtrace.h head/sys/netinet/tcp_input.c head/sys/netinet/tcp_output.c head/sys/netinet/tcp_stacks/fastpath.c head/sys/netinet/tcp_var.h Modified: head/sys/netinet/in_kdtrace.c == --- head/sys/netinet/in_kdtrace.c Mon Apr 10 06:19:09 2017 (r316675) +++ head/sys/netinet/in_kdtrace.c Mon Apr 10 08:19:35 2017 (r316676) @@ -132,6 +132,14 @@ SDT_PROBE_DEFINE6_XLATE(tcp, , , state__ "void *", "void *", "int", "tcplsinfo_t *"); +SDT_PROBE_DEFINE6_XLATE(tcp, , , receive__autoresize, +"void *", "void *", +"struct tcpcb *", "csinfo_t *", +"struct mbuf *", "ipinfo_t *", +"struct tcpcb *", "tcpsinfo_t *" , +"struct tcphdr *", "tcpinfoh_t *", +"int", "int"); + SDT_PROBE_DEFINE5_XLATE(udp, , , receive, "void *", "pktinfo_t *", "struct inpcb *", "csinfo_t *", Modified: head/sys/netinet/in_kdtrace.h == --- head/sys/netinet/in_kdtrace.h Mon Apr 10 06:19:09 2017 (r316675) +++ head/sys/netinet/in_kdtrace.h Mon Apr 10 08:19:35 2017 (r316676) @@ -65,6 +65,7 @@ SDT_PROBE_DECLARE(tcp, , , debug__input) SDT_PROBE_DECLARE(tcp, , , debug__output); SDT_PROBE_DECLARE(tcp, , , debug__user); SDT_PROBE_DECLARE(tcp, , , 
debug__drop); +SDT_PROBE_DECLARE(tcp, , , receive__autoresize); SDT_PROBE_DECLARE(udp, , , receive); SDT_PROBE_DECLARE(udp, , , send); Modified: head/sys/netinet/tcp_input.c == --- head/sys/netinet/tcp_input.cMon Apr 10 06:19:09 2017 (r316675) +++ head/sys/netinet/tcp_input.cMon Apr 10 08:19:35 2017 (r316676) @@ -1486,6 +1486,68 @@ drop: return (IPPROTO_DONE); } +/* + * Automatic sizing of receive socket buffer. Often the send + * buffer size is not optimally adjusted to the actual network + * conditions at hand (delay bandwidth product). Setting the + * buffer size too small limits throughput on links with high + * bandwidth and high delay (eg. trans-continental/oceanic links). + * + * On the receive side the socket buffer memory is only rarely + * used to any significant extent. This allows us to be much + * more aggressive in scaling the receive socket buffer. For + * the case that the buffer space is actually used to a large + * extent and we run out of kernel memory we can simply drop + * the new segments; TCP on the sender will just retransmit it + * later. Setting the buffer size too big may only consume too + * much kernel memory if the application doesn't read() from + * the socket or packet loss or reordering makes use of the + * reassembly queue. + * + * The criteria to step up the receive buffer one notch are: + * 1. Application has not set receive buffer size with + * SO_RCVBUF. Setting SO_RCVBUF clears SB_AUTOSIZE. + * 2. the number of bytes received during the time it takes + * one timestamp to be reflected back to us (the RTT); + * 3. received bytes per RTT is within seven eighth of the + * current socket buffer size; + * 4. receive buffer size has not hit maximal automatic size; + * + * This algorithm does one step per RTT at most and only if + * we receive a bulk stream w/o packet losses or reorderings. + * Shrinking the buffer during idle times is not necessary as + * it doesn't consume any memory when idle. 
+ * + * TODO: Only step up if the application is actually serving + * the buffer to better manage the socket buffer resources. + */ +int +tcp_autorcvbuf(struct mbuf *m, struct tcphdr *th, struct socket *so, +struct tcpcb *tp, int tlen) +{ + int newsize = 0; + + if (V_tcp_do_autorcvbuf && (so->so_rcv.sb_flags & SB_AUTOSIZE) && + tp->t_srtt != 0 && tp->rfbuf_ts != 0 && + TCP_TS_TO_TICKS(tcp_ts_getticks() - tp->rfbuf_ts) > + (tp->t_srtt >> TCP_RTT_SHIFT)) { + if (tp->rfbuf_cnt > (so->so_rcv.sb_hiwat / 8 *
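
The step-up test described in the comment above (criteria 3 and 4) can be tried outside the kernel with something like the following. It is a sketch under assumptions, not the committed code: autorcvbuf_step() is a made-up name, the arguments stand in for tp->rfbuf_cnt, so_rcv.sb_hiwat, V_tcp_autorcvbuf_inc and V_tcp_autorcvbuf_max, and the "has one RTT elapsed" check is assumed to have passed already.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone model of the receive buffer step decision.
 * Returns the new buffer size, or 0 to leave the buffer alone.
 */
static uint32_t
autorcvbuf_step(uint32_t bytes_this_rtt, uint32_t hiwat, uint32_t inc,
    uint32_t max)
{
	uint64_t grown;

	assert(inc > 0 && max > 0);

	/* Received more than 7/8 of the buffer in one RTT, and not at max. */
	if (bytes_this_rtt > hiwat / 8 * 7 && hiwat < max) {
		/* Widen before adding so the sum cannot wrap. */
		grown = (uint64_t)hiwat + inc;
		return ((uint32_t)(grown < max ? grown : max));
	}
	return (0);
}
```

For example, with a 65536 byte buffer the 7/8 mark is 57344 bytes, so 60000 bytes received in one RTT steps the buffer up by one increment while 50000 bytes leaves it unchanged.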
svn commit: r316460 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Author: smh Date: Mon Apr 3 13:11:28 2017 New Revision: 316460 URL: https://svnweb.freebsd.org/changeset/base/316460 Log: Fix expandsz 16.0E vals and vdev_min_asize of RAIDZ children When a member of a RAIDZ has been replaced with a device smaller than the original, then the top level vdev can report its expand size as 16.0E. The reduced child asize causes the RAIDZ to have a vdev_asize lower than its vdev_max_asize which then results in an underflow during the calculation of the parents expand size. Fix this by updating the vdev_asize if it shrinks, which is already protected by a check against vdev_min_asize so should always be safe. Also for RAIDZ vdevs, ensure that the sum of their child vdev_min_asize is always greater than the parents vdev_min_size. Fixes: https://www.illumos.org/issues/7885 MFC after:2 weeks Sponsored by: Multiplay Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c == --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Mon Apr 3 13:06:28 2017(r316459) +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Mon Apr 3 13:11:28 2017(r316460) @@ -229,7 +229,8 @@ vdev_get_min_asize(vdev_t *vd) * so each child must provide at least 1/Nth of its asize. */ if (pvd->vdev_ops == _raidz_ops) - return (pvd->vdev_min_asize / pvd->vdev_children); + return ((pvd->vdev_min_asize + pvd->vdev_children - 1) / + pvd->vdev_children); return (pvd->vdev_min_asize); } @@ -1377,7 +1378,7 @@ vdev_open(vdev_t *vd) vd->vdev_psize = psize; /* -* Make sure the allocatable size hasn't shrunk. +* Make sure the allocatable size hasn't shrunk too much. */ if (asize < vd->vdev_min_asize) { vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, @@ -1417,12 +1418,21 @@ vdev_open(vdev_t *vd) } /* -* If all children are healthy and the asize has increased, -* then we've experienced dynamic LUN growth. If automatic -* expansion is enabled then use the additional space. 
-*/ - if (vd->vdev_state == VDEV_STATE_HEALTHY && asize > vd->vdev_asize && - (vd->vdev_expanding || spa->spa_autoexpand)) +* If all children are healthy we update asize if either: +* The asize has increased, due to a device expansion caused by dynamic +* LUN growth or vdev replacement, and automatic expansion is enabled; +* making the additional space available. +* +* The asize has decreased, due to a device shrink usually caused by a +* vdev replace with a smaller device. This ensures that calculations +* based of max_asize and asize e.g. esize are always valid. It's safe +* to do this as we've already validated that asize is greater than +* vdev_min_asize. +*/ + if (vd->vdev_state == VDEV_STATE_HEALTHY && + ((asize > vd->vdev_asize && + (vd->vdev_expanding || spa->spa_autoexpand)) || + (asize < vd->vdev_asize))) vd->vdev_asize = asize; vdev_set_min_asize(vd); ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
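Both parts of this fix come down to unsigned integer arithmetic. The sketch below uses plain uint64_t values instead of the real vdev_t fields, and all three function names are hypothetical, chosen only for illustration. It assumes the expand size is derived by subtracting asize from max_asize, as the new code comment suggests. It shows how truncating division can leave the children's minimum sizes summing to less than the parent's, and how a stale asize that is larger than max_asize wraps around to a value near 2^64 (displayed as 16.0E).

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour: truncating division; children may sum to less than parent. */
static uint64_t
child_min_asize_trunc(uint64_t parent_min_asize, uint64_t children)
{
    assert(children > 0);
    return (parent_min_asize / children);
}

/* New behaviour from the diff: round the per-child share up. */
static uint64_t
child_min_asize_roundup(uint64_t parent_min_asize, uint64_t children)
{
    assert(children > 0);
    return ((parent_min_asize + children - 1) / children);
}

/*
 * Assumed shape of the expand-size calculation (illustration only).
 * Unsigned subtraction wraps when asize is larger than max_asize.
 */
static uint64_t
expand_size(uint64_t max_asize, uint64_t asize)
{
    return (max_asize - asize);
}
```

With a parent minimum of 1000 and 3 children, the truncating version gives 333 per child (999 in total), while the rounded-up version gives 334 per child (1002 in total).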
Re: svn commit: r316311 - in head: lib/libstand sys/boot/geli sys/boot/i386/gptboot sys/boot/i386/loader sys/boot/i386/zfsboot
On 31/03/2017 16:16, Ian Lepore wrote:
> On Fri, 2017-03-31 at 00:04 +, Allan Jude wrote:
>> Add explicit_bzero() to libstand, and switch GELIBoot to using it
>
> revolution > man explicit_bzero
> No manual entry for explicit_bzero
> revolution > svn log -v explicit_bzero.c
> ...
> r272673 | delphij | 2014-10-06 22:54:11 -0600 (Mon, 06 Oct 2014) | 5 lines
>
> Add explicit_bzero(3) and its kernel counterpart.
>
> Obtained from: OpenBSD
>
> So... can anyone provide a clue what's "explicit" (or different in any
> way) between explicit_bzero() and normal bzero()?

Not sure why your system doesn't find the man page, as it works on my 11 box. Does this help?

https://www.freebsd.org/cgi/man.cgi?query=explicit_bzero=0=3=FreeBSD+11-current=html

Regards
Steve
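To make the question in this thread concrete: a compiler may remove a plain bzero() or memset() of a buffer that is never read again (dead store elimination), whereas explicit_bzero(3) is documented as not being removed by that optimisation. The C sketch below illustrates the idea only. sketch_explicit_bzero() and use_and_wipe_key() are hypothetical names, and the volatile-pointer technique is one common way to get the guarantee, not necessarily how libc or libstand implement it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for explicit_bzero(3), for illustration only.
 * Writing through a volatile pointer obliges the compiler to emit every
 * store, so the zeroing cannot be discarded as a dead store.
 */
static void
sketch_explicit_bzero(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    while (len-- > 0)
        *p++ = 0;
}

/*
 * Copy a key into a scratch buffer, pretend to use it, then wipe the
 * scratch copy. Returns the number of non-zero bytes left behind.
 */
static size_t
use_and_wipe_key(const char *key)
{
    unsigned char scratch[32];
    size_t i, left = 0, n = strlen(key);

    if (n > sizeof(scratch))
        n = sizeof(scratch);
    memcpy(scratch, key, n);
    /* ... the key would be used here ... */
    sketch_explicit_bzero(scratch, sizeof(scratch));
    for (i = 0; i < sizeof(scratch); i++)
        left += (scratch[i] != 0);
    return (left);
}
```

In real code the call would simply be explicit_bzero(scratch, sizeof(scratch)) just before the buffer goes out of scope.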
svn commit: r316328 - in head: . sys/netinet6
Author: smh Date: Fri Mar 31 09:10:05 2017 New Revision: 316328 URL: https://svnweb.freebsd.org/changeset/base/316328 Log: Allow explicitly assigned IPv6 loopback address to be used in jails If a jail has an explicitly assigned IPv6 loopback address then allow it to be used instead of remapping requests for the loopback address to the first IPv6 address assigned to the jail. This fixes issues where applications attempt to detect their bound port where they requested a loopback address, which was available, but instead the kernel remapped it to the jail's first address. This is the same fix as applied to IPv4 by r316313. Also: * Correct the description of prison_check_ip6_locked to match the code. MFC after: 2 weeks Relnotes: Yes Sponsored by: Multiplay Modified: head/UPDATING head/sys/netinet6/in6_jail.c Modified: head/UPDATING == --- head/UPDATING Fri Mar 31 08:43:07 2017(r316327) +++ head/UPDATING Fri Mar 31 09:10:05 2017(r316328) @@ -52,9 +52,9 @@ NOTE TO PEOPLE WHO THINK THAT FreeBSD 12 ** SPECIAL WARNING: ** 20170331: - Binds and sends to the IPv4 loopback address (127.0.0.1) will now + Binds and sends to the loopback addresses, IPv6 and IPv4, will now use any explicitly assigned loopback address available in the jail - instead of using the first assigned IPv4 address of the jail. + instead of using the first assigned address of the jail. 
20170329: The ctl.ko module no longer implements the iSCSI target frontend: Modified: head/sys/netinet6/in6_jail.c == --- head/sys/netinet6/in6_jail.cFri Mar 31 08:43:07 2017 (r316327) +++ head/sys/netinet6/in6_jail.cFri Mar 31 09:10:05 2017 (r316328) @@ -293,12 +293,6 @@ prison_local_ip6(struct ucred *cred, str return (EAFNOSUPPORT); } - if (IN6_IS_ADDR_LOOPBACK(ia6)) { - bcopy(>pr_ip6[0], ia6, sizeof(struct in6_addr)); - mtx_unlock(>pr_mtx); - return (0); - } - if (IN6_IS_ADDR_UNSPECIFIED(ia6)) { /* * In case there is only 1 IPv6 address, and v6only is true, @@ -311,6 +305,11 @@ prison_local_ip6(struct ucred *cred, str } error = prison_check_ip6_locked(pr, ia6); + if (error == EADDRNOTAVAIL && IN6_IS_ADDR_LOOPBACK(ia6)) { + bcopy(>pr_ip6[0], ia6, sizeof(struct in6_addr)); + error = 0; + } + mtx_unlock(>pr_mtx); return (error); } @@ -341,7 +340,8 @@ prison_remote_ip6(struct ucred *cred, st return (EAFNOSUPPORT); } - if (IN6_IS_ADDR_LOOPBACK(ia6)) { + if (IN6_IS_ADDR_LOOPBACK(ia6) && +prison_check_ip6_locked(pr, ia6) == EADDRNOTAVAIL) { bcopy(>pr_ip6[0], ia6, sizeof(struct in6_addr)); mtx_unlock(>pr_mtx); return (0); @@ -357,9 +357,8 @@ prison_remote_ip6(struct ucred *cred, st /* * Check if given address belongs to the jail referenced by cred/prison. * - * Returns 0 if jail doesn't restrict IPv6 or if address belongs to jail, - * EADDRNOTAVAIL if the address doesn't belong, or EAFNOSUPPORT if the jail - * doesn't allow IPv6. + * Returns 0 if address belongs to jail, + * EADDRNOTAVAIL if the address doesn't belong to the jail. */ int prison_check_ip6_locked(const struct prison *pr, const struct in6_addr *ia6) ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r316313 - in head: . sys/netinet
Author: smh Date: Fri Mar 31 00:41:54 2017 New Revision: 316313 URL: https://svnweb.freebsd.org/changeset/base/316313 Log: Allow explicitly assigned IPv4 loopback address to be used in jails If a jail has an explicitly assigned loopback address then allow it to be used instead of remapping requests for the loopback address to the first IPv4 address assigned to the jail. This fixes issues where applications attempt to detect their bound port where they requested a loopback address, which was available, but instead the kernel remapped it to the jail's first address. An example of this is binding nginx to 127.0.0.1 and then running "service nginx upgrade" which before this change would cause nginx to fail. Also: * Correct the description of prison_check_ip4_locked to match the code. MFC after: 2 weeks Relnotes: Yes Sponsored by: Multiplay Modified: head/UPDATING head/sys/netinet/in_jail.c Modified: head/UPDATING == --- head/UPDATING Fri Mar 31 00:07:03 2017(r316312) +++ head/UPDATING Fri Mar 31 00:41:54 2017(r316313) @@ -51,6 +51,11 @@ NOTE TO PEOPLE WHO THINK THAT FreeBSD 12 ** SPECIAL WARNING: ** +20170331: + Binds and sends to the IPv4 loopback address (127.0.0.1) will now + use any explicitly assigned loopback address available in the jail + instead of using the first assigned IPv4 address of the jail. + 20170329: The ctl.ko module no longer implements the iSCSI target frontend: cfiscsi.ko does instead. 
Modified: head/sys/netinet/in_jail.c == --- head/sys/netinet/in_jail.c Fri Mar 31 00:07:03 2017(r316312) +++ head/sys/netinet/in_jail.c Fri Mar 31 00:41:54 2017(r316313) @@ -306,11 +306,6 @@ prison_local_ip4(struct ucred *cred, str } ia0.s_addr = ntohl(ia->s_addr); - if (ia0.s_addr == INADDR_LOOPBACK) { - ia->s_addr = pr->pr_ip4[0].s_addr; - mtx_unlock(>pr_mtx); - return (0); - } if (ia0.s_addr == INADDR_ANY) { /* @@ -323,6 +318,11 @@ prison_local_ip4(struct ucred *cred, str } error = prison_check_ip4_locked(pr, ia); + if (error == EADDRNOTAVAIL && ia0.s_addr == INADDR_LOOPBACK) { + ia->s_addr = pr->pr_ip4[0].s_addr; + error = 0; + } + mtx_unlock(>pr_mtx); return (error); } @@ -354,7 +354,8 @@ prison_remote_ip4(struct ucred *cred, st return (EAFNOSUPPORT); } - if (ntohl(ia->s_addr) == INADDR_LOOPBACK) { + if (ntohl(ia->s_addr) == INADDR_LOOPBACK && + prison_check_ip4_locked(pr, ia) == EADDRNOTAVAIL) { ia->s_addr = pr->pr_ip4[0].s_addr; mtx_unlock(>pr_mtx); return (0); @@ -370,9 +371,8 @@ prison_remote_ip4(struct ucred *cred, st /* * Check if given address belongs to the jail referenced by cred/prison. * - * Returns 0 if jail doesn't restrict IPv4 or if address belongs to jail, - * EADDRNOTAVAIL if the address doesn't belong, or EAFNOSUPPORT if the jail - * doesn't allow IPv4. Address passed in in NBO. + * Returns 0 if address belongs to jail, + * EADDRNOTAVAIL if the address doesn't belong to the jail. */ int prison_check_ip4_locked(const struct prison *pr, const struct in_addr *ia) ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
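The change to prison_local_ip4() can be read as a reordering: check the requested address against the jail's own list first, and fall back to the jail's first address only when the loopback address is not assigned. Below is a simplified userland sketch of that ordering. The toy_jail structure, the TOY_LOOPBACK constant and both function names are hypothetical. Locking and the INADDR_ANY case are left out, and addresses are kept in host byte order for simplicity.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LOOPBACK 0x7f000001u    /* 127.0.0.1, host byte order */

/* Hypothetical, much reduced stand-in for the jail's IPv4 address list. */
struct toy_jail {
    const uint32_t *ip4;    /* addresses assigned to the jail */
    size_t nip4;            /* number of entries, at least one */
};

/* Returns 0 if addr is assigned to the jail, EADDRNOTAVAIL otherwise. */
static int
toy_check_ip4(const struct toy_jail *j, uint32_t addr)
{
    for (size_t i = 0; i < j->nip4; i++)
        if (j->ip4[i] == addr)
            return (0);
    return (EADDRNOTAVAIL);
}

/*
 * Ordering after the change: an explicitly assigned loopback address is
 * used as-is; only an unassigned loopback request is remapped to the
 * first address of the jail.
 */
static int
toy_local_ip4(const struct toy_jail *j, uint32_t *addr)
{
    int error;

    assert(j->nip4 > 0);
    error = toy_check_ip4(j, *addr);
    if (error == EADDRNOTAVAIL && *addr == TOY_LOOPBACK) {
        *addr = j->ip4[0];
        error = 0;
    }
    return (error);
}
```

Before the change the loopback test ran first, so a request for 127.0.0.1 was always rewritten, even in a jail that had 127.0.0.1 assigned.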
svn commit: r315855 - stable/11/lib/libsysdecode
Author: smh Date: Thu Mar 23 10:43:29 2017 New Revision: 315855 URL: https://svnweb.freebsd.org/changeset/base/315855 Log: MFC r315423: Fix libsysdecode vmprot flag decoding Sponsored by: Multiplay Modified: stable/11/lib/libsysdecode/flags.c stable/11/lib/libsysdecode/mktables Directory Properties: stable/11/ (props changed) Modified: stable/11/lib/libsysdecode/flags.c == --- stable/11/lib/libsysdecode/flags.c Thu Mar 23 10:22:06 2017 (r315854) +++ stable/11/lib/libsysdecode/flags.c Thu Mar 23 10:43:29 2017 (r315855) @@ -51,6 +51,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include Modified: stable/11/lib/libsysdecode/mktables == --- stable/11/lib/libsysdecode/mktables Thu Mar 23 10:22:06 2017 (r315854) +++ stable/11/lib/libsysdecode/mktables Thu Mar 23 10:43:29 2017 (r315855) @@ -135,7 +135,7 @@ gen_table "sockoptudp" "UDP_[[:alnu gen_table "socktype""SOCK_[A-Z]+[[:space:]]+[1-9]+[0-9]*" "sys/socket.h" gen_table "thrcreateflags" "THR_[A-Z]+[[:space:]]+0x[0-9]+" "sys/thr.h" gen_table "umtxop" "UMTX_OP_[[:alnum:]_]+[[:space:]]+[0-9]+" "sys/umtx.h" -gen_table "vmprot" "VM_PROT_[A-Z]+[[:space:]]+\(\(vm_prot_t\)\)" "vm/vm.h" +gen_table "vmprot" "VM_PROT_[A-Z]+[[:space:]]+\(\(vm_prot_t\)[[:space:]]+0x[0-9]+\)" "vm/vm.h" gen_table "vmresult""KERN_[A-Z]+[[:space:]]+[0-9]+" "vm/vm_param.h" gen_table "wait6opt""W[A-Z]+[[:space:]]+[0-9]+" "sys/wait.h" gen_table "seekwhence" "SEEK_[A-Z]+[[:space:]]+[0-9]+" "sys/unistd.h" ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r315449 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Author: smh
Date: Fri Mar 17 12:34:57 2017
New Revision: 315449
URL: https://svnweb.freebsd.org/changeset/base/315449

Log:
  Reduce ARC fragmentation threshold

  As ZFS can request memory blocks of up to SPA_MAXBLOCKSIZE, e.g. during zfs recv, update the threshold at which we start aggressive reclamation to use SPA_MAXBLOCKSIZE (16M) instead of the lower zfs_max_recordsize, which defaults to 1M.

  PR: 194513
  Reviewed by: avg, mav
  MFC after: 1 month
  Sponsored by: Multiplay
  Differential Revision: https://reviews.freebsd.org/D10012

Modified:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
==
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Fri Mar 17 12:34:56 2017 (r315448)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Fri Mar 17 12:34:57 2017 (r315449)
@@ -3978,7 +3978,7 @@ arc_available_memory(void)
 	 * Start aggressive reclamation if too little sequential KVA left.
 	 */
 	if (lowest > 0) {
-		n = (vmem_size(heap_arena, VMEM_MAXFREE) < zfs_max_recordsize) ?
+		n = (vmem_size(heap_arena, VMEM_MAXFREE) < SPA_MAXBLOCKSIZE) ?
 		    -((int64_t)vmem_size(heap_arena, VMEM_ALLOC) >> 4) :
 		    INT64_MAX;
 		if (n < lowest) {
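As a rough illustration of the expression in this diff, the sketch below takes the two vmem_size() results as plain parameters. The function name and the SKETCH_MAXBLOCKSIZE constant are hypothetical; the 16M value is taken from the commit log. When the largest free contiguous range is below the threshold, the result is a negative number (one sixteenth of the allocated size), which signals that memory should be reclaimed; otherwise it reports no pressure.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAXBLOCKSIZE (16ULL << 20)   /* 16M, per the commit log */

/*
 * largest_free: size of the largest free contiguous range
 * allocated:    total size currently allocated from the arena
 */
static int64_t
kva_fragmentation_headroom(uint64_t largest_free, uint64_t allocated)
{
    assert(allocated <= INT64_MAX);
    if (largest_free < SKETCH_MAXBLOCKSIZE)
        return (-((int64_t)allocated >> 4));
    return (INT64_MAX);
}
```

With the previous 1M threshold, a largest free range of 8M would not have triggered reclamation; with 16M it does.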
svn commit: r315423 - head/lib/libsysdecode
Author: smh Date: Thu Mar 16 20:55:00 2017 New Revision: 315423 URL: https://svnweb.freebsd.org/changeset/base/315423 Log: Fix libsysdecode vmprot flag decoding Fix the regex used to find vmprot table entries and add the missing include. This fixes kdumps output of PFLT arguments which would previously look like: 5202 101546 ktrace PFLT 0x5ae000 0x2<>2 They now display correctly: 5202 101546 ktrace PFLT 0x5ac000 0x2 MFC after:1 week Modified: head/lib/libsysdecode/flags.c head/lib/libsysdecode/mktables Modified: head/lib/libsysdecode/flags.c == --- head/lib/libsysdecode/flags.c Thu Mar 16 20:39:31 2017 (r315422) +++ head/lib/libsysdecode/flags.c Thu Mar 16 20:55:00 2017 (r315423) @@ -51,6 +51,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include Modified: head/lib/libsysdecode/mktables == --- head/lib/libsysdecode/mktables Thu Mar 16 20:39:31 2017 (r315422) +++ head/lib/libsysdecode/mktables Thu Mar 16 20:55:00 2017 (r315423) @@ -135,7 +135,7 @@ gen_table "sockoptudp" "UDP_[[:alnu gen_table "socktype""SOCK_[A-Z]+[[:space:]]+[1-9]+[0-9]*" "sys/socket.h" gen_table "thrcreateflags" "THR_[A-Z]+[[:space:]]+0x[0-9]+" "sys/thr.h" gen_table "umtxop" "UMTX_OP_[[:alnum:]_]+[[:space:]]+[0-9]+" "sys/umtx.h" -gen_table "vmprot" "VM_PROT_[A-Z]+[[:space:]]+\(\(vm_prot_t\)\)" "vm/vm.h" +gen_table "vmprot" "VM_PROT_[A-Z]+[[:space:]]+\(\(vm_prot_t\)[[:space:]]+0x[0-9]+\)" "vm/vm.h" gen_table "vmresult""KERN_[A-Z]+[[:space:]]+[0-9]+" "vm/vm_param.h" gen_table "wait6opt""W[A-Z]+[[:space:]]+[0-9]+" "sys/wait.h" gen_table "seekwhence" "SEEK_[A-Z]+[[:space:]]+[0-9]+" "sys/unistd.h" ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
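The corrected vmprot pattern can be tried outside of mktables. The sketch below assumes it is interpreted as a POSIX extended regular expression (the escaped parentheses suggest so) and that the entries in vm/vm.h look like "#define VM_PROT_READ ((vm_prot_t) 0x01)"; both are assumptions, and matches_vmprot() is a hypothetical helper, not part of libsysdecode.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* The new pattern from the diff, with backslashes doubled for C. */
#define VMPROT_RE \
    "VM_PROT_[A-Z]+[[:space:]]+\\(\\(vm_prot_t\\)[[:space:]]+0x[0-9]+\\)"

/* Returns 1 if the line contains a vmprot table entry, 0 otherwise. */
static int
matches_vmprot(const char *line)
{
    regex_t re;
    int rc;

    assert(line != NULL);
    if (regcomp(&re, VMPROT_RE, REG_EXTENDED | REG_NOSUB) != 0)
        return (0);
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return (rc == 0);
}
```

A define without the cast and hexadecimal value does not match.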
Re: svn commit: r314155 - head/sys/netinet
You might also be interested in reviewing my fix for TCP buffer scaling too, Michael.

https://reviews.freebsd.org/D9668

This fixes slow transfers due to no receive buffer scaling if TCP timestamps aren't negotiated.

It's still got debug stuff in it ATM and I'm toying with removing the different cases between estimated RTT and timestamps, as there appears to be no difference in practice.

Tests here show a jump from ~3MB/s @ 1Gbps and 17ms latency to 100MB/s, pretty much line rate, which is in line with Linux results.

Any feedback welcome.

Regards
Steve

On 23/02/2017 18:14, Michael Tuexen wrote:
> Author: tuexen
> Date: Thu Feb 23 18:14:36 2017
> New Revision: 314155
> URL: https://svnweb.freebsd.org/changeset/base/314155
>
> Log:
>   TCP window updates are only sent if the window can be increased by at least 2 * MSS. However, if the receive buffer size is small, this might be impossible. Add back a criterion to send a TCP window update if the window can be increased by at least half of the receive buffer size. This condition was removed in r242252. This patch simply brings it back.
>
>   PR: 211003
>   Reviewed by: gnn
>   MFC after: 1 week
>   Sponsored by: Netflix, Inc.
>   Differential Revision: https://reviews.freebsd.org/D9475
>
> Modified:
>   head/sys/netinet/tcp_output.c
>
> Modified: head/sys/netinet/tcp_output.c
> ==
> --- head/sys/netinet/tcp_output.c Thu Feb 23 17:56:24 2017 (r314154)
> +++ head/sys/netinet/tcp_output.c Thu Feb 23 18:14:36 2017 (r314155)
> @@ -696,6 +696,8 @@ after_sack_rexmit:
>  			     recwin <= (so->so_rcv.sb_hiwat / 8) ||
>  			     so->so_rcv.sb_hiwat <= 8 * tp->t_maxseg))
>  				goto send;
> +			if (2 * adv >= (int32_t)so->so_rcv.sb_hiwat)
> +				goto send;
>  		}
>  dontupdate:
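The criterion that r314155 brings back can be written as a small predicate. This is a deliberately simplified sketch: the function and parameter names are hypothetical, and the existing 2 * MSS test in tcp_output.c carries additional conditions (visible in the diff context) that are omitted here. Here adv stands for the amount by which the advertised window could grow.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a window update should be sent, 0 otherwise. */
static int
should_send_window_update(int32_t adv, int32_t maxseg, uint32_t rcv_hiwat)
{
    assert(maxseg > 0);
    /* Simplified form of the existing rule: room for two segments. */
    if (adv >= 2 * maxseg)
        return (1);
    /* Restored rule: at least half of the receive buffer can be opened. */
    if (2 * adv >= (int32_t)rcv_hiwat)
        return (1);
    return (0);
}
```

With a 2000-byte receive buffer and an MSS of 1460, the window can never grow by 2 * MSS (2920 bytes), so without the second test no update would be sent; with it, an update goes out once 1000 bytes can be opened.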
Re: svn commit: r313260 - head/sys/kern
On 07/02/2017 20:34, Ed Maste wrote:
> On 7 February 2017 at 10:30, Steven Hartland <steven.hartl...@multiplay.co.uk> wrote:
>> All I'm suggesting is, while one could guess this may be a performance or possibly a compatibility thing, the reason is not obvious, so a small piece of detail on why the change was done should always be included.
>>
>> For this one something like the following would be nice:
>>
>> Switch fget_unlocked to atomic_fcmpset
>>
>> Improve performance under contention by switching fget_unlocked to use atomic_fcmpset.
>
> I agree, and one of the key reasons to do this is so that there's this tiny bit of context if someone later runs "git blame" or "svn annotate" and discovers this change for the line containing atomic_fcmpset. Comments containing "eliminate memory leak" or "remove unused variable" have a self-evident reason, but I don't believe that's true for "switch to atomic_fcmpset."
>
> Repeating the "switch fget_unlocked to..." in the proposed commit message above feels redundant to me though, and I would suggest:
>
> | Switch fget_unlocked to atomic_fcmpset
> |
> | Improves performance under contention.
>
> or just:
>
> | Use atomic_fcmpset to improve performance under contention

All those work for me as they clearly state why the change was made, so I hope this is something we can try to improve moving forward :)
Re: svn commit: r313260 - head/sys/kern
On 07/02/2017 14:57, Mateusz Guzik wrote:
> On Sun, Feb 05, 2017 at 03:17:46PM +, Alexey Dokuchaev wrote:
>> On Sun, Feb 05, 2017 at 04:00:06AM +0100, Mateusz Guzik wrote:
>>> For instance, plugging an unused variable, a memory leak, doing a lockless check first etc. are all pretty standard and unless there is something unusual going on (e.g. complicated circumstances leading to a leak) there is not much to explain. In particular, I don't see why anyone would explain why leaks are bad on each commit plugging one.
>>
>> Right; these (unused variable, resource leaks) usually do not warrant elaborate explanation.
>>
>> [ Some linefeeds below were trimmed for brevity ]
>>
>>> The gist is as follows: there are plenty of cases where the kernel wants to atomically replace the value of a particular variable. Sometimes, like in this commit, we want to bump the counter by 1, but only if the current value is not 0. For that we need to read the value, see if it is 0 and if not, try to replace what we read with what we read + 1. We cannot just increment as the value could have changed to 0 in the meantime.
>>>
>>> But this also means that multiple cpus doing the same operation on the same variable will trip on each other - one will succeed while the rest will have to retry.
>>>
>>> Prior to this commit, each retry attempt would explicitly re-read the value. This induces cache coherency traffic slowing everyone down.
>>>
>>> amd64 has the nice property of giving us the value it found, eliminating the need to explicitly re-read it. There is a similar story on i386 and sparc.
>>>
>>> Other architectures may also benefit from this, but that I did not benchmark.
>>>
>>> In short[,] under contention atomic_fcmpset is going to be faster than atomic_cmpset.
>>>
>>> I did not benchmark this particular change, but a switch of the sort easily gives 10%+ in microbenchmarks on amd64.
>>>
>>> That said, while one can argue this optimizes the code, it really depessimizes it as something of the sort should have been already employed.
>>
>> Given the above, IMHO it's quite far from an obvious or manpage-lookup thing, and thus requires proper explanation in the commit log.
>
> If the aforementioned explanation is necessary, the place for it is in the man page. There are already several commits with fcmpset and there will be more to come. I don't see why any of them would convey the information.

For the details of why the performance of atomic_fcmpset under contention is better than that of atomic_cmpset, a manpage would be nice.

All I'm suggesting is, while one could guess this may be a performance or possibly a compatibility thing, the reason is not obvious, so a small piece of detail on why the change was done should always be included.

For this one something like the following would be nice:

Switch fget_unlocked to atomic_fcmpset

Improve performance under contention by switching fget_unlocked to use atomic_fcmpset.

With this small piece of additional information, it's clear the reason for the change (the why) was to improve performance, and anyone who wants more detail on why this would be the case can research it via a manpage or other resources, wouldn't you agree?

Regards
Steve
Re: svn commit: r313355 - in stable/11: lib/libstand sys/boot/common sys/boot/efi/libefi sys/boot/i386/libfirewire sys/boot/i386/libi386 sys/boot/mips/beri/loader sys/boot/ofw/libofw sys/boot/pc98/lib
IMO combining fixes from different areas (yes, mainly libstand, but also dosfs and nandfs) with style cleanups is not ideal, as it makes it much harder to find and identify problems.

I appreciate these have been in head for a while, but it's not unheard of to only identify issues once they are MFC'ed, so keeping the separation is better.

You also missed the description from the 3rd commit, e.g.
loader: nandfs calls strategy with one extra argument.

Finally, keep the "r" prefix when mentioning the revisions being MFC'ed, as that ensures they are linked in svnweb properly, e.g.
MFC r309369, r310850, r310853:

On 06/02/2017 22:03, Toomas Soome wrote:
> Author: tsoome
> Date: Mon Feb 6 22:03:07 2017
> New Revision: 313355
> URL: https://svnweb.freebsd.org/changeset/base/313355
>
> Log:
>   MFC r309369,310850,310853:
>
>   libstand: dosfs cstyle cleanup for return keyword.
>
>   dosfs support in libstand is broken since r298230
>
>   PR: 214423
>   Submitted by: Mikhail Kupchik
>   Reported by: Mikhail Kupchik
>   Approved by: imp (mentor)
>
> Modified: stable/11/lib/libstand/cd9660.c stable/11/lib/libstand/dosfs.c stable/11/lib/libstand/ext2fs.c stable/11/lib/libstand/nandfs.c stable/11/lib/libstand/read.c stable/11/lib/libstand/stand.h stable/11/lib/libstand/ufs.c stable/11/lib/libstand/write.c stable/11/sys/boot/common/bcache.c stable/11/sys/boot/common/bootstrap.h stable/11/sys/boot/common/disk.c stable/11/sys/boot/common/md.c stable/11/sys/boot/efi/libefi/efipart.c stable/11/sys/boot/i386/libfirewire/firewire.c stable/11/sys/boot/i386/libi386/bioscd.c stable/11/sys/boot/i386/libi386/biosdisk.c stable/11/sys/boot/i386/libi386/pxe.c stable/11/sys/boot/mips/beri/loader/beri_disk_cfi.c stable/11/sys/boot/mips/beri/loader/beri_disk_sdcard.c stable/11/sys/boot/ofw/libofw/ofw_disk.c stable/11/sys/boot/pc98/libpc98/bioscd.c stable/11/sys/boot/pc98/libpc98/biosdisk.c stable/11/sys/boot/powerpc/kboot/hostdisk.c stable/11/sys/boot/powerpc/ps3/ps3cdrom.c stable/11/sys/boot/powerpc/ps3/ps3disk.c stable/11/sys/boot/uboot/lib/disk.c 
stable/11/sys/boot/usb/storage/umass_loader.c stable/11/sys/boot/userboot/userboot/host.c stable/11/sys/boot/userboot/userboot/userboot_disk.c stable/11/sys/boot/zfs/zfs.c Directory Properties: stable/11/ (props changed) Modified: stable/11/lib/libstand/cd9660.c == --- stable/11/lib/libstand/cd9660.c Mon Feb 6 21:02:26 2017 (r313354) +++ stable/11/lib/libstand/cd9660.c Mon Feb 6 22:03:07 2017 (r313355) @@ -143,7 +143,7 @@ susp_lookup_record(struct open_file *f, if (bcmp(sh->type, SUSP_CONTINUATION, 2) == 0) { shc = (ISO_RRIP_CONT *)sh; error = f->f_dev->dv_strategy(f->f_devdata, F_READ, - cdb2devb(isonum_733(shc->location)), 0, + cdb2devb(isonum_733(shc->location)), ISO_DEFAULT_BLOCK_SIZE, susp_buffer, ); /* Bail if it fails. */ @@ -288,7 +288,7 @@ cd9660_open(const char *path, struct ope for (bno = 16;; bno++) { twiddle(1); rc = f->f_dev->dv_strategy(f->f_devdata, F_READ, cdb2devb(bno), - 0, ISO_DEFAULT_BLOCK_SIZE, buf, ); + ISO_DEFAULT_BLOCK_SIZE, buf, ); if (rc) goto out; if (read != ISO_DEFAULT_BLOCK_SIZE) { @@ -322,7 +322,7 @@ cd9660_open(const char *path, struct ope twiddle(1); rc = f->f_dev->dv_strategy (f->f_devdata, F_READ, -cdb2devb(bno + boff), 0, +cdb2devb(bno + boff), ISO_DEFAULT_BLOCK_SIZE, buf, ); if (rc) @@ -381,7 +381,7 @@ cd9660_open(const char *path, struct ope bno = isonum_733(rec.extent) + isonum_711(rec.ext_attr_length); twiddle(1); rc = f->f_dev->dv_strategy(f->f_devdata, F_READ, cdb2devb(bno), - 0, ISO_DEFAULT_BLOCK_SIZE, buf, ); + ISO_DEFAULT_BLOCK_SIZE, buf, ); if (rc) goto out; if (read != ISO_DEFAULT_BLOCK_SIZE) { @@ -438,7 +438,7 @@ buf_read_file(struct open_file *f, char twiddle(16); rc = f->f_dev->dv_strategy(f->f_devdata, F_READ, - cdb2devb(blkno), 0, ISO_DEFAULT_BLOCK_SIZE, + cdb2devb(blkno), ISO_DEFAULT_BLOCK_SIZE, fp->f_buf, ); if (rc) return (rc); Modified: stable/11/lib/libstand/dosfs.c
Re: svn commit: r313260 - head/sys/kern
On 05/02/2017 15:17, Alexey Dokuchaev wrote:
> On Sun, Feb 05, 2017 at 04:00:06AM +0100, Mateusz Guzik wrote:
>> For instance, plugging an unused variable, a memory leak, doing a lockless check first etc. are all pretty standard and unless there is something unusual going on (e.g. complicated circumstances leading to a leak) there is not much to explain. In particular, I don't see why anyone would explain why leaks are bad on each commit plugging one.
>
> Right; these (unused variable, resource leaks) usually do not warrant elaborate explanation.

Indeed, these are self-explanatory.

>> The gist is as follows: there are plenty of cases where the kernel wants to atomically replace the value of a particular variable. Sometimes, like in this commit, we want to bump the counter by 1, but only if the current value is not 0. For that we need to read the value, see if it is 0 and if not, try to replace what we read with what we read + 1. We cannot just increment as the value could have changed to 0 in the meantime.
>>
>> But this also means that multiple cpus doing the same operation on the same variable will trip on each other - one will succeed while the rest will have to retry.
>>
>> Prior to this commit, each retry attempt would explicitly re-read the value. This induces cache coherency traffic slowing everyone down.
>>
>> amd64 has the nice property of giving us the value it found, eliminating the need to explicitly re-read it. There is a similar story on i386 and sparc.
>>
>> Other architectures may also benefit from this, but that I did not benchmark.
>>
>> In short[,] under contention atomic_fcmpset is going to be faster than atomic_cmpset.
>>
>> I did not benchmark this particular change, but a switch of the sort easily gives 10%+ in microbenchmarks on amd64.
>>
>> That said, while one can argue this optimizes the code, it really depessimizes it as something of the sort should have been already employed.
>
> Given the above, IMHO it's quite far from an obvious or manpage-lookup thing, and thus requires proper explanation in the commit log.

Absolutely. I would encourage everyone to think not only about others making similar changes but also about providing education for those who may use similar code in other areas. If said changes were using older code as an example, without knowing otherwise they may not use the updated methodologies.

Sharing the detail as you have done above is fantastic, allowing others to take note without having to do the research that they may well not have time for, with the result being improved code quality moving forward; so thanks for that :)

>>> While on this subject are there any official guidelines to writing commit messages, if not, should we create some?
>
> I'm unaware of any. We might not have official guidelines, but 30%-what/70%-why rule would apply perfectly here. ;-)

Sounds like a good guide.

Regards
Steve
Re: svn commit: r313260 - head/sys/kern
Hi Mateusz, could you improve on the commit message? It currently describes what is changed, which can be obtained from the diff, but not why.

I hope no one feels like I'm trying to teach them to suck eggs, as I know everyone here has a wealth of experience, but I strongly believe commit messages are a very important way of improving the overall quality of the code base by sharing with others the reason for changes, which they can then learn from.

I know I for one love picking up new nuggets of knowledge from others in this way.

Also, I believe this is an area the project as a whole can improve on, so I don't mean to single out anyone here.

Anyway, I hope people find this useful:

When I write a commit message I try to stick to the following rules, which I believe help to bring clarity for others about my actions.

1. First line is a brief summary of the outcome of the change, e.g.
   Fixed compiler warnings in nvmecontrol on 32bit platforms

2. Follow-up paragraphs expand on #1 if needed, including details about not just what but why the change was made, e.g.
   Use ssize_t instead of uint32_t to prevent warnings about a comparison with different signs. Due to the promotion rules, this would only happen on 32-bit platforms.

3. When writing #2, include details that would not be obvious to non-experts in the particular area.

#2 and #3 are really important for sharing knowledge that others may not have. This is quite relevant to this commit msg: while it may be obvious to you and others familiar with the atomic ops, the rest of us are just left wondering why this change was made.

N.B. The example is based on Warner's recent commit purely as an example, which had a good why, just missing the brief summary.

While on this subject, are there any official guidelines for writing commit messages? If not, should we create some?
On 05/02/2017 01:40, Mateusz Guzik wrote: Author: mjg Date: Sun Feb 5 01:40:27 2017 New Revision: 313260 URL: https://svnweb.freebsd.org/changeset/base/313260 Log: fd: switch fget_unlocked to atomic_fcmpset Modified: head/sys/kern/kern_descrip.c Modified: head/sys/kern/kern_descrip.c == --- head/sys/kern/kern_descrip.cSun Feb 5 01:20:39 2017 (r313259) +++ head/sys/kern/kern_descrip.cSun Feb 5 01:40:27 2017 (r313260) @@ -2569,8 +2569,8 @@ fget_unlocked(struct filedesc *fdp, int if (error != 0) return (error); #endif - retry: count = fp->f_count; + retry: if (count == 0) { /* * Force a reload. Other thread could reallocate the @@ -2584,7 +2584,7 @@ fget_unlocked(struct filedesc *fdp, int * Use an acquire barrier to force re-reading of fdt so it is * refreshed for verification. */ - if (atomic_cmpset_acq_int(>f_count, count, count + 1) == 0) + if (atomic_fcmpset_acq_int(>f_count, , count + 1) == 0) goto retry; fdt = fdp->fd_files; #ifdefCAPABILITIES ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r312279 - stable/11/usr.bin/netstat
Author: smh Date: Mon Jan 16 09:16:11 2017 New Revision: 312279 URL: https://svnweb.freebsd.org/changeset/base/312279 Log: MFC r311769: Fix rstat: symbol not in namelist from netstat Sponsored by: Multiplay Modified: stable/11/usr.bin/netstat/main.c Directory Properties: stable/11/ (props changed) Modified: stable/11/usr.bin/netstat/main.c == --- stable/11/usr.bin/netstat/main.cMon Jan 16 09:12:40 2017 (r312278) +++ stable/11/usr.bin/netstat/main.cMon Jan 16 09:16:11 2017 (r312279) @@ -427,6 +427,9 @@ main(int argc, char *argv[]) if (xflag && Tflag) xo_errx(1, "-x and -T are incompatible, pick one."); + /* Load all necessary kvm symbols */ + kresolve_list(nl); + if (Bflag) { if (!live) usage(); @@ -507,9 +510,6 @@ main(int argc, char *argv[]) exit(0); } - /* Load all necessary kvm symbols */ - kresolve_list(nl); - if (tp) { xo_open_container("statistics"); printproto(tp, tp->pr_name, ); ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r312278 - stable/10/usr.bin/netstat
Author: smh Date: Mon Jan 16 09:12:40 2017 New Revision: 312278 URL: https://svnweb.freebsd.org/changeset/base/312278 Log: MFC r311769: Fix rstat: symbol not in namelist from netstat Sponsored by: Multiplay Modified: stable/10/usr.bin/netstat/main.c Directory Properties: stable/10/ (props changed) Modified: stable/10/usr.bin/netstat/main.c == --- stable/10/usr.bin/netstat/main.cMon Jan 16 08:25:33 2017 (r312277) +++ stable/10/usr.bin/netstat/main.cMon Jan 16 09:12:40 2017 (r312278) @@ -535,6 +535,9 @@ main(int argc, char *argv[]) if (xflag && Tflag) errx(1, "-x and -T are incompatible, pick one."); + /* Load all necessary kvm symbols */ + kresolve_list(nl); + if (Bflag) { if (!live) usage(); @@ -603,9 +606,6 @@ main(int argc, char *argv[]) exit(0); } - /* Load all necessary kvm symbols */ - kresolve_list(nl); - if (tp) { printproto(tp, tp->pr_name); exit(0); ___ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
svn commit: r311769 - head/usr.bin/netstat
Author: smh
Date: Mon Jan 9 09:28:03 2017
New Revision: 311769
URL: https://svnweb.freebsd.org/changeset/base/311769

Log:
  Fix rstat: symbol not in namelist from netstat

  Load kvm symbols earlier to prevent rstat: symbol not in namelist error
  when running netstat -rs.

  Submitted by: Sebastian Huber
  MFC after: 1 week
  Sponsored by: Multiplay

Modified:
  head/usr.bin/netstat/main.c

Modified: head/usr.bin/netstat/main.c
==============================================================================
--- head/usr.bin/netstat/main.c	Mon Jan 9 08:12:22 2017	(r311768)
+++ head/usr.bin/netstat/main.c	Mon Jan 9 09:28:03 2017	(r311769)
@@ -427,6 +427,9 @@ main(int argc, char *argv[])
 	if (xflag && Tflag)
 		xo_errx(1, "-x and -T are incompatible, pick one.");
 
+	/* Load all necessary kvm symbols */
+	kresolve_list(nl);
+
 	if (Bflag) {
 		if (!live)
 			usage();
@@ -507,9 +510,6 @@ main(int argc, char *argv[])
 		exit(0);
 	}
 
-	/* Load all necessary kvm symbols */
-	kresolve_list(nl);
-
 	if (tp) {
 		xo_open_container("statistics");
 		printproto(tp, tp->pr_name, );
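The ordering problem fixed by this commit can be illustrated with a small, self-contained sketch. This is not netstat's code: `toy_sym`, `toy_kernel`, `toy_resolve_list()` and `toy_symbol_addr()` are hypothetical stand-ins for the namelist and for `kresolve_list()`, and the symbol names and addresses are made up. It only shows why a lookup made before the list has been resolved fails, while the same lookup afterwards succeeds.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a namelist entry; not the real struct nlist. */
struct toy_sym {
	const char *name;
	unsigned long addr;	/* 0 until the entry has been resolved */
};

/* Made-up "kernel" symbol table used to resolve names. */
static const struct toy_sym toy_kernel[] = {
	{ "_rtstat", 0x1000 },
	{ "_ipstat", 0x2000 },
};

/* Resolve every entry in the list; returns the number left unresolved. */
static int
toy_resolve_list(struct toy_sym *list, size_t n)
{
	const size_t nkern = sizeof(toy_kernel) / sizeof(toy_kernel[0]);
	int missing = 0;

	for (size_t i = 0; i < n; i++) {
		list[i].addr = 0;
		for (size_t k = 0; k < nkern; k++) {
			if (strcmp(list[i].name, toy_kernel[k].name) == 0)
				list[i].addr = toy_kernel[k].addr;
		}
		if (list[i].addr == 0)
			missing++;
	}
	return (missing);
}

/* Returns 0 and stores the address, or -1 for "symbol not in namelist". */
static int
toy_symbol_addr(const struct toy_sym *list, size_t n, const char *name,
    unsigned long *addrp)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(list[i].name, name) == 0 && list[i].addr != 0) {
			*addrp = list[i].addr;
			return (0);
		}
	}
	return (-1);
}
```

Any code path that can call the lookup, including early-exit branches, has to run after the resolve step, which is what moving the `kresolve_list(nl)` call up in `main()` achieves.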
Re: svn commit: r311346 - in head/sys: kern sys vm
Given the use of the number of CPUs for sizing, would this play nice with hot-plug CPUs?

Regards
Steve

On 05/01/2017 01:44, Mark Johnston wrote:
Author: markj
Date: Thu Jan 5 01:44:12 2017
New Revision: 311346
URL: https://svnweb.freebsd.org/changeset/base/311346

Log:
  Add a small allocator for exec_map entries.

  Upon each execve, we allocate a KVA range for use in copying data to the
  new image. Pages must be faulted into the range, and when the range is
  freed, the backing pages are freed and their mappings are destroyed. This
  is a lot of needless overhead, and the exec_map management becomes a
  bottleneck when many CPUs are executing execve concurrently. Moreover, the
  number of available ranges is fixed at 16, which is insufficient on large
  systems and potentially excessive on 32-bit systems.

  The new allocator reduces overhead by making exec_map allocations
  persistent. When a range is freed, pages backing the range are marked clean
  and made easy to reclaim. With this change, the exec_map is sized based on
  the number of CPUs.

  Reviewed by: kib
  MFC after: 1 month
  Differential Revision: https://reviews.freebsd.org/D8921

Modified:
  head/sys/kern/kern_exec.c
  head/sys/sys/imgact.h
  head/sys/vm/vm_init.c
  head/sys/vm/vm_kern.c
  head/sys/vm/vm_kern.h

Modified: head/sys/kern/kern_exec.c
==============================================================================
--- head/sys/kern/kern_exec.c	Thu Jan 5 01:28:08 2017	(r311345)
+++ head/sys/kern/kern_exec.c	Thu Jan 5 01:44:12 2017	(r311346)
@@ -45,6 +45,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -59,6 +60,7 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1315,17 +1317,80 @@ err_exit:
 	return (error);
 }
 
+struct exec_args_kva {
+	vm_offset_t addr;
+	SLIST_ENTRY(exec_args_kva) next;
+};
+
+static DPCPU_DEFINE(struct exec_args_kva *, exec_args_kva);
+
+static SLIST_HEAD(, exec_args_kva) exec_args_kva_freelist;
+static struct mtx exec_args_kva_mtx;
+
+static void
+exec_prealloc_args_kva(void *arg __unused)
+{
+	struct exec_args_kva *argkva;
+	u_int i;
+
+	SLIST_INIT(&exec_args_kva_freelist);
+	mtx_init(&exec_args_kva_mtx, "exec args kva", NULL, MTX_DEF);
+	for (i = 0; i < exec_map_entries; i++) {
+		argkva = malloc(sizeof(*argkva), M_PARGS, M_WAITOK);
+		argkva->addr = kmap_alloc_wait(exec_map, exec_map_entry_size);
+		SLIST_INSERT_HEAD(&exec_args_kva_freelist, argkva, next);
+	}
+}
+SYSINIT(exec_args_kva, SI_SUB_EXEC, SI_ORDER_ANY, exec_prealloc_args_kva, NULL);
+
+static vm_offset_t
+exec_alloc_args_kva(void **cookie)
+{
+	struct exec_args_kva *argkva;
+
+	argkva = (void *)atomic_readandclear_ptr(
+	    (uintptr_t *)DPCPU_PTR(exec_args_kva));
+	if (argkva == NULL) {
+		mtx_lock(&exec_args_kva_mtx);
+		while ((argkva = SLIST_FIRST(&exec_args_kva_freelist)) == NULL)
+			(void)mtx_sleep(&exec_args_kva_freelist,
+			    &exec_args_kva_mtx, 0, "execkva", 0);
+		SLIST_REMOVE_HEAD(&exec_args_kva_freelist, next);
+		mtx_unlock(&exec_args_kva_mtx);
+	}
+	*(struct exec_args_kva **)cookie = argkva;
+	return (argkva->addr);
+}
+
+static void
+exec_free_args_kva(void *cookie)
+{
+	struct exec_args_kva *argkva;
+	vm_offset_t base;
+
+	argkva = cookie;
+	base = argkva->addr;
+
+	vm_map_madvise(exec_map, base, base + exec_map_entry_size, MADV_FREE);
+	if (!atomic_cmpset_ptr((uintptr_t *)DPCPU_PTR(exec_args_kva),
+	    (uintptr_t)NULL, (uintptr_t)argkva)) {
+		mtx_lock(&exec_args_kva_mtx);
+		SLIST_INSERT_HEAD(&exec_args_kva_freelist, argkva, next);
+		wakeup_one(&exec_args_kva_freelist);
+		mtx_unlock(&exec_args_kva_mtx);
+	}
+}
+
 /*
  * Allocate temporary demand-paged, zero-filled memory for the file name,
- * argument, and environment strings. Returns zero if the allocation succeeds
- * and ENOMEM otherwise.
+ * argument, and environment strings.
  */
 int
 exec_alloc_args(struct image_args *args)
 {
 
-	args->buf = (char *)kmap_alloc_wait(exec_map, PATH_MAX + ARG_MAX);
-	return (args->buf != NULL ? 0 : ENOMEM);
+	args->buf = (char *)exec_alloc_args_kva(&args->bufkva);
+	return (0);
 }
 
 void
@@ -1333,8 +1398,7 @@ exec_free_args(struct image_args *args)
 {
 
 	if (args->buf != NULL) {
-		kmap_free_wakeup(exec_map, (vm_offset_t)args->buf,
-		    PATH_MAX + ARG_MAX);
+		exec_free_args_kva(args->bufkva);
 		args->buf = NULL;
 	}
 	if (args->fname_buf != NULL) {

Modified: head/sys/sys/imgact.h
==============================================================================
---
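The allocation pattern in the quoted diff (one cached range per CPU in front of a locked free list of preallocated ranges) can be sketched in portable userland C. This is a simplified illustration under stated assumptions, not the kernel code: the names `toy_range`, `toy_alloc()`, `toy_free()` and `TOY_NCPU` are hypothetical, the CPU index is passed in by the caller instead of coming from `DPCPU_PTR()`, a C11 `atomic_flag` spin lock stands in for the mutex, and an exhausted list returns NULL where the kernel version sleeps in `mtx_sleep()`.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NCPU 4	/* assumed fixed CPU count for this sketch */

struct toy_range {
	uintptr_t addr;			/* base of the preallocated range */
	struct toy_range *next;		/* free-list linkage */
};

/* One cached range per CPU, plus a shared fallback list. */
static _Atomic(struct toy_range *) toy_percpu[TOY_NCPU];
static struct toy_range *toy_freelist;
static atomic_flag toy_lock = ATOMIC_FLAG_INIT;	/* stands in for the mutex */

static void
toy_lock_acquire(void)
{
	while (atomic_flag_test_and_set(&toy_lock))
		;	/* spin until the flag was previously clear */
}

static void
toy_lock_release(void)
{
	atomic_flag_clear(&toy_lock);
}

/* Put a preallocated range on the shared list (done once at startup). */
static void
toy_prealloc(struct toy_range *r, uintptr_t addr)
{
	r->addr = addr;
	toy_lock_acquire();
	r->next = toy_freelist;
	toy_freelist = r;
	toy_lock_release();
}

/* Take the CPU-local range if present, else fall back to the shared list. */
static struct toy_range *
toy_alloc(unsigned cpu)
{
	struct toy_range *r;

	r = atomic_exchange(&toy_percpu[cpu], (struct toy_range *)NULL);
	if (r == NULL) {
		toy_lock_acquire();
		r = toy_freelist;	/* NULL here means the list is exhausted */
		if (r != NULL)
			toy_freelist = r->next;
		toy_lock_release();
	}
	return (r);
}

/* Park the range in an empty CPU-local slot, else return it to the list. */
static void
toy_free(unsigned cpu, struct toy_range *r)
{
	struct toy_range *expected = NULL;

	if (!atomic_compare_exchange_strong(&toy_percpu[cpu], &expected, r)) {
		toy_lock_acquire();
		r->next = toy_freelist;
		toy_freelist = r;
		toy_lock_release();
	}
}
```

In the common case an alloc/free pair on the same CPU touches only the per-CPU slot and never takes the lock, which is where the reduction in contention is expected to come from. The sketch sizes the slot array at compile time, so it says nothing about how the real allocator would behave if CPUs were added at run time.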
Re: svn commit: r310112 - head/sys/conf
Thanks for doing this :)

On 15/12/2016 12:57, Ed Maste wrote:
Author: emaste
Date: Thu Dec 15 12:57:03 2016
New Revision: 310112
URL: https://svnweb.freebsd.org/changeset/base/310112

Log:
  newvers.sh: add option to eliminate kernel build metadata

  Build metadata (username, hostname, etc.) prevents the FreeBSD kernel
  from building reproducibly. Add an option to disable inclusion of that
  metadata but retain the release information and SVN/git VCS details.

  See https://reproducible-builds.org/ for additional background.

  Reviewed by: bapt
  Obtained from: NetBSD
  MFC after: 1 month
  Sponsored by: Reproducible Builds World Summit 2, Berlin
  Differential Revision: https://reviews.freebsd.org/D4347

Modified:
  head/sys/conf/newvers.sh

Modified: head/sys/conf/newvers.sh
==============================================================================
--- head/sys/conf/newvers.sh	Thu Dec 15 10:51:35 2016	(r310111)
+++ head/sys/conf/newvers.sh	Thu Dec 15 12:57:03 2016	(r310112)
@@ -30,6 +30,14 @@
 #	@(#)newvers.sh	8.1 (Berkeley) 4/20/94
 # $FreeBSD$
 
+# Command line options:
+#
+#     -r               Reproducible build. Do not embed directory names, user
+#                      names, time stamps or other dynamic information into
+#                      the outuput file. This is intended to allow two builds
+#                      done at different times and even by different people on
+#                      different hosts to produce identical output.
+
 TYPE="FreeBSD"
 REVISION="12.0"
 BRANCH="CURRENT"
@@ -250,10 +258,28 @@ if [ -n "$hg_cmd" ] ; then
 	fi
 fi
 
+include_metadata=true
+while getopts r opt; do
+	case "$opt" in
+	r)
+		include_metadata=
+		;;
+	esac
+done
+shift $((OPTIND - 1))
+
+if [ -z "${include_metadata}" ]; then
+	VERINFO="${VERSION} ${svn}${git}${hg}${p4version}"
+	VERSTR="${VERINFO}\\n"
+else
+	VERINFO="${VERSION} #${v}${svn}${git}${hg}${p4version}: ${t}"
+	VERSTR="${VERINFO}\\n${u}@${h}:${d}\\n"
+fi
+
 cat << EOF > vers.c
 $COPYRIGHT
-#define SCCSSTR "@(#)${VERSION} #${v}${svn}${git}${hg}${p4version}: ${t}"
-#define VERSTR "${VERSION} #${v}${svn}${git}${hg}${p4version}: ${t}\\n ${u}@${h}:${d}\\n"
+#define SCCSSTR "@(#)${VERINFO}"
+#define VERSTR "${VERSTR}"
 #define RELSTR "${RELEASE}"
 
 char sccs[sizeof(SCCSSTR) > 128 ? sizeof(SCCSSTR) : 128] = SCCSSTR;
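The effect of the `-r` option can be sketched in C for illustration. newvers.sh itself is a shell script, so this is a translation of the idea rather than the committed code: `toy_parse_include_metadata()` and `toy_format_verstr()` are hypothetical helpers, and the string layout is simplified (no build counter, VCS revision or directory). Only `getopt(3)` and `snprintf(3)` are real library calls.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if build metadata should be embedded, 0 if -r was given. */
static int
toy_parse_include_metadata(int argc, char *argv[])
{
	int include_metadata = 1;
	int opt;

	while ((opt = getopt(argc, argv, "r")) != -1) {
		if (opt == 'r')
			include_metadata = 0;
	}
	return (include_metadata);
}

/*
 * Format a version string. With metadata it carries the build time, user
 * and host, which differ between builds; without it only the version
 * remains, so two builds of the same source produce the same bytes.
 * Returns the snprintf() result (length of the formatted string).
 */
static int
toy_format_verstr(char *buf, size_t len, int include_metadata,
    const char *version, const char *when, const char *user,
    const char *host)
{
	if (include_metadata)
		return (snprintf(buf, len, "%s: %s\n%s@%s\n",
		    version, when, user, host));
	return (snprintf(buf, len, "%s\n", version));
}
```

A caller would run the parser once over its arguments and pass the result to the formatter; everything that varies between hosts or build times is confined to the metadata branch.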
Re: svn commit: r308782 - in head: cddl/contrib/opensolaris/cmd/ztest sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys
Thanks, looks like the PR needs a rebase before it can be merged.

On 17/11/2016 22:11, Alexander Motin wrote:
It is in OpenZFS review queue now: https://github.com/openzfs/openzfs/pull/219
Welcome to comment there to speed up the process.

On 17.11.2016 13:43, Steven Hartland wrote:
Is this something that should be upstreamed?

On 17/11/2016 21:01, Alexander Motin wrote:
Author: mav
Date: Thu Nov 17 21:01:27 2016
New Revision: 308782
URL: https://svnweb.freebsd.org/changeset/base/308782

Log:
  After some ZIL changes 6 years ago zil_slog_limit got partially broken
  due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
  Actually because of other changes about that time zl_itx_list_sz is not
  really required to implement the functionality, so this patch removes
  some unneeded broken code and variables.

  Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
  single heavy logger, that increased latency for other (more latency
  critical) loggers, by pushing heavy log out into the main pool instead of
  SLOG. Beside huge latency increase for heavy writers, this implementation
  caused double write of all data, since the log records were explicitly
  prepared for SLOG. Since we now have I/O scheduler, I've found it can be
  much more efficient to reduce priority of heavy logger SLOG writes from
  ZIO_PRIORITY_SYNC_WRITE to ZIO_PRIORITY_ASYNC_WRITE, while still leaving
  them on SLOG.

  Existing ZIL implementation had problem with space efficiency when it
  has to write large chunks of data into log blocks of limited size. In
  some cases efficiency dropped to almost as low as 50%. In case of ZIL
  stored on spinning rust, that also reduced log write speed in half, since
  head had to uselessly fly over allocated but not written areas. This
  change improves the situation by offloading problematic operations from
  z*_log_write() to zil_lwb_commit(), which knows real situation of log
  blocks allocation and can split large requests into pieces much more
  efficiently. Also as side effect it removes one of two data copy
  operations done by ZIL code WR_COPIED case.

  While there, untangle and unify code of z*_log_write() functions.
  Also zfs_log_write() like zvol_log_write() can now handle writes crossing
  block boundary, that may also improve efficiency if ZPL is made to do
  that.

  Sponsored by: iXsystems, Inc.

Modified:
  head/cddl/contrib/opensolaris/cmd/ztest/ztest.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_log.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c

Modified: head/cddl/contrib/opensolaris/cmd/ztest/ztest.c
==============================================================================
--- head/cddl/contrib/opensolaris/cmd/ztest/ztest.c	Thu Nov 17 20:44:51 2016	(r308781)
+++ head/cddl/contrib/opensolaris/cmd/ztest/ztest.c	Thu Nov 17 21:01:27 2016	(r308782)
@@ -1371,7 +1371,6 @@ ztest_log_write(ztest_ds_t *zd, dmu_tx_t
 	itx->itx_private = zd;
 	itx->itx_wr_state = write_state;
 	itx->itx_sync = (ztest_random(8) == 0);
-	itx->itx_sod += (write_state == WR_NEED_COPY ? lr->lr_length : 0);
 
 	bcopy(&lr->lr_common + 1, &itx->itx_lr + 1,
 	    sizeof (*lr) - sizeof (lr_t));

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h	Thu Nov 17 20:44:51 2016	(r308781)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h	Thu Nov 17 21:01:27 2016	(r308782)
@@ -369,7 +369,6 @@ typedef struct itx {
 	void		*itx_private;	/* type-specific opaque data */
 	itx_wr_state_t	itx_wr_state;	/* write state */
 	uint8_t		itx_sync;	/* synchronous transaction */
-	uint64_t	itx_sod;	/* record size on disk */
 	uint64_t	itx_oid;	/* object id */
 	lr_t		itx_lr;		/* common part of log record */
 	/* followed by type-specific part of lr_xx_t and its immediate data */

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h	Thu Nov 17 20:44:51 2016	(r308781)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h	Thu Nov 17 21:01:27 2016	(r308782)
@@ -42,6 +42,7 @@ extern "C" {
 typed
Re: svn commit: r308782 - in head: cddl/contrib/opensolaris/cmd/ztest sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys
Is this something that should be upstreamed?

On 17/11/2016 21:01, Alexander Motin wrote:
Author: mav
Date: Thu Nov 17 21:01:27 2016
New Revision: 308782
URL: https://svnweb.freebsd.org/changeset/base/308782

Log:
  After some ZIL changes 6 years ago zil_slog_limit got partially broken
  due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
  Actually because of other changes about that time zl_itx_list_sz is not
  really required to implement the functionality, so this patch removes
  some unneeded broken code and variables.

  Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
  single heavy logger, that increased latency for other (more latency
  critical) loggers, by pushing heavy log out into the main pool instead of
  SLOG. Beside huge latency increase for heavy writers, this implementation
  caused double write of all data, since the log records were explicitly
  prepared for SLOG. Since we now have I/O scheduler, I've found it can be
  much more efficient to reduce priority of heavy logger SLOG writes from
  ZIO_PRIORITY_SYNC_WRITE to ZIO_PRIORITY_ASYNC_WRITE, while still leaving
  them on SLOG.

  Existing ZIL implementation had problem with space efficiency when it
  has to write large chunks of data into log blocks of limited size. In
  some cases efficiency dropped to almost as low as 50%. In case of ZIL
  stored on spinning rust, that also reduced log write speed in half, since
  head had to uselessly fly over allocated but not written areas. This
  change improves the situation by offloading problematic operations from
  z*_log_write() to zil_lwb_commit(), which knows real situation of log
  blocks allocation and can split large requests into pieces much more
  efficiently. Also as side effect it removes one of two data copy
  operations done by ZIL code WR_COPIED case.

  While there, untangle and unify code of z*_log_write() functions.
  Also zfs_log_write() like zvol_log_write() can now handle writes crossing
  block boundary, that may also improve efficiency if ZPL is made to do
  that.

  Sponsored by: iXsystems, Inc.

Modified:
  head/cddl/contrib/opensolaris/cmd/ztest/ztest.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_log.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c

Modified: head/cddl/contrib/opensolaris/cmd/ztest/ztest.c
==============================================================================
--- head/cddl/contrib/opensolaris/cmd/ztest/ztest.c	Thu Nov 17 20:44:51 2016	(r308781)
+++ head/cddl/contrib/opensolaris/cmd/ztest/ztest.c	Thu Nov 17 21:01:27 2016	(r308782)
@@ -1371,7 +1371,6 @@ ztest_log_write(ztest_ds_t *zd, dmu_tx_t
 	itx->itx_private = zd;
 	itx->itx_wr_state = write_state;
 	itx->itx_sync = (ztest_random(8) == 0);
-	itx->itx_sod += (write_state == WR_NEED_COPY ? lr->lr_length : 0);
 
 	bcopy(&lr->lr_common + 1, &itx->itx_lr + 1,
 	    sizeof (*lr) - sizeof (lr_t));

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h	Thu Nov 17 20:44:51 2016	(r308781)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h	Thu Nov 17 21:01:27 2016	(r308782)
@@ -369,7 +369,6 @@ typedef struct itx {
 	void		*itx_private;	/* type-specific opaque data */
 	itx_wr_state_t	itx_wr_state;	/* write state */
 	uint8_t		itx_sync;	/* synchronous transaction */
-	uint64_t	itx_sod;	/* record size on disk */
 	uint64_t	itx_oid;	/* object id */
 	lr_t		itx_lr;		/* common part of log record */
 	/* followed by type-specific part of lr_xx_t and its immediate data */

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h	Thu Nov 17 20:44:51 2016	(r308781)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil_impl.h	Thu Nov 17 21:01:27 2016	(r308782)
@@ -42,6 +42,7 @@ extern "C" {
 typedef struct lwb {
 	zilog_t		*lwb_zilog;	/* back pointer to log struct */
 	blkptr_t	lwb_blk;	/* on disk address of this log blk */
+	boolean_t	lwb_slog;	/* lwb_blk is on SLOG device */
 	int		lwb_nused;	/* # used bytes in buffer */
 	int
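The space-efficiency point in the commit log (large records packed into log blocks of limited size, with efficiency falling towards 50%) can be made concrete with a toy model. This is not the ZIL code: `toy_pack()` and `toy_pack_result` are hypothetical, record and block headers are ignored, and the sizes used are arbitrary. It compares a packer that starts a new block whenever a record does not fit in the space left with one that splits the record to fill that space first.

```c
#include <stddef.h>

/* Result of packing a sequence of records into fixed-size log blocks. */
struct toy_pack_result {
	size_t blocks;		/* log blocks allocated */
	size_t payload;		/* bytes of record data written */
};

/*
 * Pack n records of reclen bytes each into blocks of blksz bytes.
 * With split == 0, a record that does not fit in the space left in the
 * current block starts a new block and the leftover space is wasted.
 * With split != 0, the record is cut so the leftover space is filled
 * first. Records larger than a whole block are cut in either mode.
 */
static struct toy_pack_result
toy_pack(size_t n, size_t reclen, size_t blksz, int split)
{
	struct toy_pack_result res = { 0, 0 };
	size_t space = 0;	/* free bytes left in the current block */

	for (size_t i = 0; i < n; i++) {
		size_t left = reclen;

		while (left > 0) {
			if (space == 0 || (!split && left > space)) {
				res.blocks++;	/* open a fresh block */
				space = blksz;
			}
			size_t chunk = left < space ? left : space;
			space -= chunk;
			left -= chunk;
			res.payload += chunk;
		}
	}
	return (res);
}
```

With four 65-byte records and 128-byte blocks, the non-splitting packer uses four blocks for 260 bytes of payload (about 51% of the allocated space), while the splitting packer needs three (about 68%). Records just over half a block are the worst case for the non-splitting strategy in this model.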
Re: svn commit: r307507 - head/sys/cam/scsi
On 17/10/2016 09:51, Alexander Motin wrote:
> On 17.10.2016 11:45, Steven Hartland wrote:
>> IIRC the timeout for this was intentionally lower than the default,
>> might be worth just checking.
>
> I did trace back the commit history, and it has been hardcoded to that
> value since the beginning 18 years ago. Theoretically SYNCHRONIZE CACHE
> may require even more time than WRITE, since nobody knows how big write
> caches can be or how many writes are sitting there.

Cool, must be thinking about something else that was added recently then, thanks for checking :)