> On 14-Oct-20 7:41 PM, Ananyev, Konstantin wrote:
> >> From: Liang Ma <liang.j...@intel.com>
> >>
> >> Add a simple on/off switch that will enable saving power when no
> >> packets are arriving. It is based on counting the number of empty
> >> polls and, when the number reaches a certain threshold, entering an
> >> architecture-defined optimized power state that will either wait
> >> until a TSC timestamp expires, or when packets arrive.
> >>
> >> This API mandates a core-to-single-queue mapping (that is, multiple
> >> queued per device are supported, but they have to be polled on different
> >> cores).
> >>
> >> This design is using PMD RX callbacks.
> >>
> >> 1. UMWAIT/UMONITOR:
> >>
> >>     When a certain threshold of empty polls is reached, the core will go
> >>     into a power optimized sleep while waiting on an address of next RX
> >>     descriptor to be written to.
> >>
> >> 2. Pause instruction
> >>
> >>     Instead of move the core into deeper C state, this method uses the
> >>     pause instruction to avoid busy polling.
> >>
> >> 3. Frequency scaling
> >>     Reuse existing DPDK power library to scale up/down core frequency
> >>     depending on traffic volume.
> >>
> >> Signed-off-by: Liang Ma <liang.j...@intel.com>
> >> Signed-off-by: Anatoly Burakov <anatoly.bura...@intel.com>
> >> ---
> >>
> >> Notes:
> >>      v6:
> >>      - Added wakeup mechanism for UMWAIT
> >>      - Removed memory allocation (everything is now allocated statically)
> >>      - Fixed various typos and comments
> >>      - Check for invalid queue ID
> >>      - Moved release notes to this patch
> >>
> >>      v5:
> >>      - Make error checking more robust
> >>        - Prevent initializing scaling if ACPI or PSTATE env wasn't set
> >>        - Prevent initializing UMWAIT path if PMD doesn't support 
> >> get_wake_addr
> >>      - Add some debug logging
> >>      - Replace x86-specific code path to generic path using the intrinsic 
> >> check
> >>
> 
> 
> <snip>
> 
> >
> >
> > I think you need to check state here, and _disable() have to set state with 
> > lock grabbed.
> > Otherwise this lock wouldn't protect you from race conditions.
> > As an example:
> >
> > CP@T0:
> > rte_spinlock_lock(&queue_cfg->umwait_lock);
> > if (queue_cfg->wait_addr != NULL) //wait_addr == NULL, fallthrough
> > rte_spinlock_unlock(&queue_cfg->umwait_lock);
> >
> > DP@T1:
> > rte_spinlock_lock(&queue_cfg->umwait_lock);
> > queue_cfg->wait_addr = target_addr;
> > monitor_sync(...);  // DP was put to sleep
> >
> > CP@T2:
> > queue_cfg->cur_cb = NULL;
> > queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
> > ret = 0;
> >
> > rte_power_pmd_mgmt_queue_disable() finished with success,
> > but DP core wasn't wokenup.
> >
> > To be more specific:
> > clb_umwait(...) {
> > ...
> > lock(&qcfg->lck);
> > if (qcfg->state == ENABLED)  {
> > qcfg->wake_addr = addr;
> > monitor_sync(addr, ...,&qcfg->lck);
> > }
> > unlock(&qcfg->lck);
> > ...
> > }
> >
> > _disable(...) {
> > ...
> > lock(&qcfg->lck);
> > qcfg->state = DISABLED;
> > if (qcfg->wake_addr != NULL)
> > monitor_wakeup(qcfg->wake_addr);
> > unlock(&qcfg->lock);
> > ...
> > }
> 
> True, didn't think of that. Will fix.
> 
> >> +
> >> +if (!i.power_monitor) {
> >> +RTE_LOG(DEBUG, POWER, "Monitoring intrinsics are not supported\n");
> >> +ret = -ENOTSUP;
> >> +goto end;
> >> +}
> >> +
> >> +/* check if the device supports the necessary PMD API */
> >> +if (rte_eth_get_wake_addr(port_id, queue_id,
> >> +&dummy_addr, &dummy_expected,
> >> +&dummy_mask, &dummy_sz) == -ENOTSUP) {
> >> +RTE_LOG(DEBUG, POWER, "The device does not support 
> >> rte_eth_rxq_ring_addr_get\n");
> >> +ret = -ENOTSUP;
> >> +goto end;
> >> +}
> >> +/* initialize UMWAIT spinlock */
> >> +rte_spinlock_init(&queue_cfg->umwait_lock);
> >
> > I think don't need to do that.
> > It supposed to be in valid state (otherwise you are probably in trouble 
> > anyway).
> 
> This is mostly for initialization, for when we first run the callback. I
> suppose we could do it in an RTE_INIT() function or just leave it be
> since the spinlocks are part of a statically allocated structure and
> will default to 0 anyway

Yes static initialization should be sufficient here.

 (although it wouldn't be proper usage of the
> API as that would be relying on implementation detail).

What I am trying to say - at that moment spinlock value should be zero.
If it is not, then your DP callback is still running and holding the lock.
I understand that it should be *very* rare situation, but it seems 
strange to me to silently over-write the locked value from other thread.
It shouldn't cause any major trouble, but still...

> 
> >
> >> +
> >> +/* initialize data before enabling the callback */
> >> +queue_cfg->empty_poll_stats = 0;
> >> +queue_cfg->cb_mode = mode;
> >> +queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
> >> +
> >> +queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
> >> +clb_umwait, NULL);
> >
> > Would be a bit cleaner/nicer to move add_rx_callback out of switch() {}
> > As you have to do it always anyway.
> > Same thought for disable() and remove_rx_callback().
> 
> The functions are different for each, so we can't move them out of
> switch (unless you're suggesting to unify the callback to handle all
> three modes?).
> 
> --
> Thanks,
> Anatoly

Reply via email to