Bill, Barry, Bala,

I've been thinking about this problem, and maybe it's caused by the
mismatch in throughput between the PKTIOs and the CPU at high bandwidths
(e.g., localhost connections). That is, both the PKTIN and the PKTOUT are
faster than the TM thread. This would lead to a situation where the
input_work_queue fills up and tm_enqueue() rejects packets, while the
PKTOUT never gets congested and the TM thread never needs to queue packets.

So probably with a more realistic testbed, where the TM thread is faster
than the PKTIOs, it should work as expected.
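
In the meantime I'm guarding the send path against those rejections. A
minimal sketch of what I mean (assuming the usual odp_tm_enq() convention
of returning 0 on success and < 0 when the packet is not accepted; the
retry count, the back-off and the send_to_tm() helper name are just
placeholders):

/* Sketch: retry a few times while the TM input work queue is full,
 * drop the packet if it is still rejected. */
static int send_to_tm(odp_tm_queue_t tmq, odp_packet_t pkt)
{
    int i;

    for (i = 0; i < 3; i++) {
        if (odp_tm_enq(tmq, pkt) == 0)
            return 0;              /* accepted by the TM */
        odp_time_wait_ns(1000);    /* back off ~1 us and retry */
    }

    odp_packet_free(pkt);          /* still rejected: drop it */
    return -1;
}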

Anyway, the list of config calls is similar to this:

odp_tm_requirements_init(&tm_req);
tm_req.num_levels = 1;
tm_req.max_tm_queues = 4;
odp_tm_egress_init(&egress);
egress.egress_kind = ODP_TM_EGRESS_PKT_IO;
egress.pktio = pktio;
tm = odp_tm_create("TM-0", &tm_req, &egress);

odp_tm_node_params_init(&node_params);
node_params.level = 1;
node_params.max_fanin = 4;
node = odp_tm_node_create(tm, "TM-NODE-0", &node_params);

odp_tm_threshold_params_init(&th_params);
th_params.max_pkts = 1000;
th_params.enable_max_pkts = 1;
th = odp_tm_threshold_create("TM-TH-0", &th_params);

/* Queue 1, prio 0, weight 0.3 */
odp_tm_queue_params_init(&queue_params);
queue_params.priority = 0;
queue_params.threshold_profile = th;
q = odp_tm_queue_create(tm, &queue_params);
odp_tm_queue_connect(q, node);
odp_tm_sched_params_init(&sch_params);
for (i = 0; i < ODP_TM_MAX_PRIORITIES; ++i) {
    sch_params.sched_modes[i] = mode;
    sch_params.sched_weights[i] = 0.3 * 254 + 1;
}
sch = odp_tm_sched_create("TM-SCH-0", &sch_params);
odp_tm_node_sched_config(node, q, sch);

/* Queue 2, prio 0, weight 0.7 */
odp_tm_queue_params_init(&queue_params);
queue_params.priority = 0;
queue_params.threshold_profile = th;
q = odp_tm_queue_create(tm, &queue_params);
odp_tm_queue_connect(q, node);
odp_tm_sched_params_init(&sch_params);
for (i = 0; i < ODP_TM_MAX_PRIORITIES; ++i) {
    sch_params.sched_modes[i] = mode;
    sch_params.sched_weights[i] = 0.7 * 254 + 1;
}
sch = odp_tm_sched_create("TM-SCH-1", &sch_params);
odp_tm_node_sched_config(node, q, sch);

/* Queue 3, prio 1, weight 1 (unused) */
odp_tm_queue_params_init(&queue_params);
queue_params.priority = 1;
queue_params.threshold_profile = th;
q = odp_tm_queue_create(tm, &queue_params);
odp_tm_queue_connect(q, node);
odp_tm_sched_params_init(&sch_params);
for (i = 0; i < ODP_TM_MAX_PRIORITIES; ++i) {
    sch_params.sched_modes[i] = mode;
    sch_params.sched_weights[i] = 255;
}
sch = odp_tm_sched_create("TM-SCH-2", &sch_params);
odp_tm_node_sched_config(node, q, sch);

/* Queue 4, prio 2, weight 1 */
odp_tm_queue_params_init(&queue_params);
queue_params.priority = 2;
queue_params.threshold_profile = th;
q = odp_tm_queue_create(tm, &queue_params);
odp_tm_queue_connect(q, node);
odp_tm_sched_params_init(&sch_params);
for (i = 0; i < ODP_TM_MAX_PRIORITIES; ++i) {
    sch_params.sched_modes[i] = mode;
    sch_params.sched_weights[i] = 255;
}
sch = odp_tm_sched_create("TM-SCH-3", &sch_params);
odp_tm_node_sched_config(node, q, sch);
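
One thing I will probably also try is attaching a shaper profile to the
node, so the egress is rate-limited below what the taps can absorb and the
TM queues actually back up. A rough sketch of what I have in mind (the
commit_bps/commit_burst field names are taken from the ODP spec version I
have here, so please double-check them against yours; the 1 Gbps rate and
the burst size are arbitrary):

/* Sketch: cap the node output at ~1 Gbps so the PKTOUT stops draining
 * the TM faster than packets arrive, forcing WFQ/priorities to kick in. */
odp_tm_shaper_params_init(&shaper_params);
shaper_params.commit_bps = 1000000000ULL; /* committed rate: 1 Gbps */
shaper_params.commit_burst = 100000;      /* committed burst, in bits */
shaper_params.dual_rate = 0;              /* single-rate shaping */
shaper = odp_tm_shaper_create("TM-SHAPER-0", &shaper_params);
odp_tm_node_shaper_config(node, shaper);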

--
Oriol Arcas
Software Engineer
Starflow Networks

On Mon, May 1, 2017 at 8:33 PM, Bill Fischofer <bill.fischo...@linaro.org>
wrote:

> +cc Barry, who is the expert in this area, for his thoughts. Do you have
> the complete list of TM config calls you issue to set this up? I know Bala
> (+cc) was looking to propose a "streamlined" config API for TM since this
> can be tricky as there are a number of "knobs" the TM provides to
> control these things.
>
> You are correct that the TM only really gets involved when output classes
> become over-subscribed as its job is to throttle output based either on
> physical or configured transmission limits. So without seeing the complete
> config it's difficult to assess what might be going on here.
>
> On Fri, Apr 28, 2017 at 8:57 AM, Oriol Arcas <or...@starflownetworks.com>
> wrote:
>
>> Hello,
>>
>> I'm trying to use the Traffic Manager to implement priorities and weights.
>> But it is not applying these policies.
>>
>> I have observed from the traces that the TM is not queuing the packets,
>> but
>> directly outputting them. For instance, see these final stats from the TM:
>>
>>  odp_tm_stats_print - tm_system=0x56347C6E6790 tm_idx=0
>>    input_work_queue size=256 current cnt=0 peak cnt=256
>>    input_work_queue enqueues=1225767 dequeues=1225767 fail_cnt=1034
>>    green_cnt=0 yellow_cnt=0 red_cnt=0
>>    * queue_num=1 priority=0 rcvd=408650 enqueued=0 dequeued=0
>> consumed=408650
>>    * queue_num=2 priority=0 rcvd=409041 enqueued=0 dequeued=0
>> consumed=409041
>>    * queue_num=4 priority=2 rcvd=408076 enqueued=0 dequeued=0
>> consumed=408076
>>
>> I have two priorities (0 and 2) and 3 queues. Two queues go to priority 0,
>> with 30% and 70% weights. I transmit 1GB through each queue with TCP.
>> There
>> is no prioritization, nor weighted scheduling.
>>
>> My understanding is that if the TM doesn't enqueue the packets in its
>> internal queues (enqueued and dequeued stats are 0), it doesn't apply any
>> of the algorithms (priority encoder, WFQ, etc.).
>>
>> I am using taps on my laptop, not a real network, so the BW is quite high
>> (~3 Gbps). Could it be that if the network can absorb the BW processed by
>> the CPU, then the PKTOUT doesn't get congested, no queuing is done, and so
>> no TM either?
>>
>> However, to rule out this hypothesis, I reduced the main input queue to 256
>> packets. Now it does get full: the TM rejects 1034 packets. So it
>> still doesn't apply queuing (thus priorities and weights), but it is
>> clearly congested because it rejects packets.
>>
>> Any thoughts on this?
>>
>> Thank you in advance.
>>
>> --
>> Oriol Arcas
>> Software Engineer
>> Starflow Networks
>>
>
>
