On 28-Mar-19 1:13 PM, David Hunt wrote:
The distributor application is bottlenecked by the distributor core,
so if we can give more frequency to this core, then the overall
performance of the application may increase.
This patch uses the rte_power_get_capabilities() API to query the
cores provided in the core mask, and if any high frequency cores are
found (e.g. Turbo Boost is enabled), we will pin the distributor
workload to that core.
Signed-off-by: Liang Ma <liang.j...@intel.com>
Signed-off-by: David Hunt <david.h...@intel.com>
---
<...>
+ if (power_lib_initialised)
+ rte_power_exit(rte_lcore_id());
printf("\nCore %u exiting tx task.\n", rte_lcore_id());
return 0;
}
@@ -575,9 +582,35 @@ lcore_worker(struct lcore_params *p)
if (num > 0)
app_stats.worker_bursts[p->worker_id][num-1]++;
}
+ if (power_lib_initialised)
+ rte_power_exit(rte_lcore_id());
+ rte_free(p);
return 0;
}
+static int
+init_power_library(void)
+{
+ int ret = 0, lcore_id;
+ RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+ if (rte_lcore_is_enabled(lcore_id)) {
Please correct me if I'm wrong, but RTE_LCORE_FOREACH_SLAVE already
checks if the lcore is enabled.
<...>
+ if (power_lib_initialised) {
+ /*
+ * Here we'll pre-assign lcore ids to the rx, tx and
+ * distributor workloads if there's higher frequency
+ * on those cores e.g. if Turbo Boost is enabled.
+ * It's also worth mentioning that it will assign cores in a
+ * specific order, so that if there's less than three
+ * available, the higher frequency cores will go to the
+ * distributor first, then rx, then tx.
+ */
+ RTE_LCORE_FOREACH_SLAVE(lcore_id) {
+
+ rte_power_get_capabilities(lcore_id, &lcore_cap);
+
+ if (lcore_cap.turbo == 1) {
+ priority_num++;
+ switch (priority_num) {
+ case 1:
+ distr_core_id = lcore_id;
+ printf("Distributor on priority core %d\n",
+ lcore_id);
+ break;
+ case 2:
+ rx_core_id = lcore_id;
+ printf("Rx on priority core %d\n",
+ lcore_id);
+ break;
+ case 3:
+ tx_core_id = lcore_id;
+ printf("Tx on priority core %d\n",
+ lcore_id);
+ break;
+ default:
+ break;
+ }
This seems to do the same thing as the code right below it (assigning
lcore IDs in order), yet in one case you use a switch, and in the other
you use a simple loop. I don't see priority_num used anywhere else, so
you might as well simplify this loop to match what you have below, with
a "skip if not turbo; if not assigned, assign and continue" type of flow.
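For illustration only, the suggested flow could look roughly like the
untested sketch below. It is written to build without DPDK, so struct
core_caps, get_caps(), fake_turbo[] and assign_priority_cores() are
hypothetical stand-ins and not part of the patch or of any library. In
the application, the loop would presumably be RTE_LCORE_FOREACH_SLAVE and
the query would be rte_power_get_capabilities(), as in the quoted code.

```c
#include <stdio.h>

#define MAX_CORES 8

/* Hypothetical stand-in for the capabilities struct; only turbo is modelled. */
struct core_caps {
	unsigned int turbo;
};

/* Hypothetical table of which cores report turbo; filled in by the caller. */
static unsigned int fake_turbo[MAX_CORES];

/* Stub used in place of rte_power_get_capabilities(); returns 0 on success. */
static int
get_caps(unsigned int lcore_id, struct core_caps *cap)
{
	if (lcore_id >= MAX_CORES)
		return -1;
	cap->turbo = fake_turbo[lcore_id];
	return 0;
}

/* -1 means "not assigned yet", matching the "if not assigned" checks below. */
static int distr_core_id = -1, rx_core_id = -1, tx_core_id = -1;

/* Hand turbo cores to the distributor first, then rx, then tx. */
static void
assign_priority_cores(const unsigned int *lcores, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		unsigned int lcore_id = lcores[i];
		struct core_caps cap;

		/* skip if not turbo (or if the query failed) */
		if (get_caps(lcore_id, &cap) != 0 || cap.turbo != 1)
			continue;

		/* if not assigned, assign and continue */
		if (distr_core_id < 0) {
			distr_core_id = (int)lcore_id;
			printf("Distributor on priority core %u\n", lcore_id);
			continue;
		}
		if (rx_core_id < 0) {
			rx_core_id = (int)lcore_id;
			printf("Rx on priority core %u\n", lcore_id);
			continue;
		}
		if (tx_core_id < 0) {
			tx_core_id = (int)lcore_id;
			printf("Tx on priority core %u\n", lcore_id);
			continue;
		}
	}
}
```

With this shape the priority_num counter and the switch both go away, and
the ordering (distributor, then rx, then tx) is still visible from the
order of the checks.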
Once that is fixed,
Reviewed-by: Anatoly Burakov <anatoly.bura...@intel.com>
--
Thanks,
Anatoly