On 11.08.2017 16:11, Bodireddy, Bhanuprakash wrote:
>> On 09.08.2017 15:35, Bodireddy, Bhanuprakash wrote:
>>>>>
>>>>> +static int
>>>>> +netdev_dpdk_vhost_tx_burst(struct netdev_dpdk *dev, int qid)
>>>>> +{
>>>>> +    struct dpdk_tx_queue *txq = &dev->tx_q[qid];
>>>>> +    struct rte_mbuf **cur_pkts = (struct rte_mbuf **)txq->vhost_burst_pkts;
>>>>> +
>>>>> +    int tx_vid = netdev_dpdk_get_vid(dev);
>>>>> +    int tx_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
>>>>> +    uint32_t sent = 0;
>>>>> +    uint32_t retries = 0;
>>>>> +    uint32_t sum, total_pkts;
>>>>> +
>>>>> +    total_pkts = sum = txq->vhost_pkt_cnt;
>>>>> +    do {
>>>>> +        uint32_t ret;
>>>>> +        ret = rte_vhost_enqueue_burst(tx_vid, tx_qid,
>>>>> +                                      &cur_pkts[sent], sum);
>>>>> +        if (OVS_UNLIKELY(!ret)) {
>>>>> +            /* No packets enqueued - do not retry. */
>>>>> +            break;
>>>>> +        } else {
>>>>> +            /* Packets have been sent. */
>>>>> +            sent += ret;
>>>>> +
>>>>> +            /* 'sum' packets have to be retransmitted. */
>>>>> +            sum -= ret;
>>>>> +        }
>>>>> +    } while (sum && (retries++ < VHOST_ENQ_RETRY_NUM));
>>>>> +
>>>>> +    for (int i = 0; i < total_pkts; i++) {
>>>>> +        dp_packet_delete(txq->vhost_burst_pkts[i]);
>>>>> +    }
>>>>> +
>>>>> +    /* Reset pkt count. */
>>>>> +    txq->vhost_pkt_cnt = 0;
>>>>> +
>>>>> +    /* 'sum' refers to packets dropped. */
>>>>> +    return sum;
>>>>> +}
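As a side note, the sent/sum/retries bookkeeping above is easy to sanity-check in isolation. Below is a standalone sketch of just that loop; mock_enqueue_burst() and mock_cap are stand-ins of mine for rte_vhost_enqueue_burst() and the vring capacity, not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

#define VHOST_ENQ_RETRY_NUM 8

/* Mock of rte_vhost_enqueue_burst(): accepts at most 'mock_cap'
 * packets per call, to exercise the partial-enqueue retry path. */
static uint32_t mock_cap;

static uint32_t
mock_enqueue_burst(uint32_t nb_pkts)
{
    return nb_pkts < mock_cap ? nb_pkts : mock_cap;
}

/* Same retry logic as netdev_dpdk_vhost_tx_burst(); returns the
 * number of packets left over (i.e. dropped) once the retry budget
 * is exhausted or the vring stops accepting packets. */
static uint32_t
tx_burst_retry(uint32_t total_pkts)
{
    uint32_t sent = 0;
    uint32_t retries = 0;
    uint32_t sum = total_pkts;

    do {
        uint32_t ret = mock_enqueue_burst(sum);
        if (!ret) {
            /* No packets enqueued - do not retry. */
            break;
        }
        sent += ret;    /* Index into cur_pkts[] in the real code. */
        sum -= ret;
    } while (sum && (retries++ < VHOST_ENQ_RETRY_NUM));

    (void) sent;
    return sum;    /* 'sum' refers to packets dropped. */
}
```

Note the loop body runs once unconditionally plus at most VHOST_ENQ_RETRY_NUM retries, so with a vring that accepts 4 packets per call, 64 queued packets leave 64 - 9*4 = 28 dropped.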
>>>>> +
>>>>> +/* Flush the txq if there are any packets available. */
>>>>> +static int
>>>>> +netdev_dpdk_vhost_txq_flush(struct netdev *netdev, int qid,
>>>>> +                            bool concurrent_txq OVS_UNUSED)
>>>>> +{
>>>>> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>>> +    struct dpdk_tx_queue *txq;
>>>>> +
>>>>> +    qid = dev->tx_q[qid % netdev->n_txq].map;
>>>>> +
>>>>> +    /* The queue may have been disabled in the guest, in which
>>>>> +     * case the map entry is set to OVS_VHOST_QUEUE_DISABLED.
>>>>> +     */
>>>>> +    if (OVS_UNLIKELY(qid < 0)) {
>>>>> +        return 0;
>>>>> +    }
>>>>> +
>>>>> +    txq = &dev->tx_q[qid];
>>>>> +    /* Increment the drop count and free the memory. */
>>>>> +    if (OVS_UNLIKELY(!is_vhost_running(dev) ||
>>>>> +                     !(dev->flags & NETDEV_UP))) {
>>>>> +
>>>>> +        if (txq->vhost_pkt_cnt) {
>>>>> +            rte_spinlock_lock(&dev->stats_lock);
>>>>> +            dev->stats.tx_dropped += txq->vhost_pkt_cnt;
>>>>> +            rte_spinlock_unlock(&dev->stats_lock);
>>>>> +
>>>>> +            for (int i = 0; i < txq->vhost_pkt_cnt; i++) {
>>>>> +                dp_packet_delete(txq->vhost_burst_pkts[i]);
>>>>
>>>> Spinlock (tx_lock) must be held here to avoid queue and mempool
>> breakage.
>>>
>>> I think you are right. tx_lock should be acquired when freeing the packets.
>>
>> I think that 'vhost_pkt_cnt' reads and updates also should be protected to
>> avoid races.
> 
> From the discussion in the thread 
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337133.html,
> We are going to acquire tx_lock for updating the map and flushing the queue 
> inside vring_state_changed(). 
> 
> That triggers a deadlock in the flushing function, as we have already
> acquired the same lock in netdev_dpdk_vhost_txq_flush().
> The same problem applies to freeing the memory and protecting the updates
> to vhost_pkt_cnt.
> 
>    if (OVS_LIKELY(txq->vhost_pkt_cnt)) {
>          rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>         netdev_dpdk_vhost_tx_burst(dev, qid);
>         rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>    }
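To make the deadlock concrete: rte_spinlock_t is not recursive, so the second rte_spinlock_lock() on the same thread would spin forever. A minimal standalone illustration, using a C11 atomic_flag lock as a stand-in for rte_spinlock_t (the names here are mine, not OVS's), with trylock so the failure is observable instead of hanging:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal non-recursive spinlock, same semantics as rte_spinlock_t. */
struct spinlock {
    atomic_flag locked;
};

static bool
spin_trylock(struct spinlock *sl)
{
    /* Succeeds only if the flag was clear, exactly once per unlock. */
    return !atomic_flag_test_and_set(&sl->locked);
}

static void
spin_unlock(struct spinlock *sl)
{
    atomic_flag_clear(&sl->locked);
}

/* Models the problematic path: vring_state_changed() takes tx_lock,
 * then calls the flush function, which takes the same tx_lock again.
 * With a non-recursive lock the second acquire on the same thread
 * can never succeed. */
static bool
nested_acquire_succeeds(void)
{
    struct spinlock tx_lock = { ATOMIC_FLAG_INIT };

    spin_trylock(&tx_lock);          /* vring_state_changed() path. */
    return spin_trylock(&tx_lock);   /* flush() path: always fails. */
}
```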
> 
> As the problem is triggered when the guest queues are enabled/disabled,
> and there is only a small race window where packets can get enqueued into
> the queue just after the flush and before the map value is updated in the
> callback function (vring_state_changed()), how about this?
> 
> Technically, as the queues are disabled there is no point in flushing the
> packets, so let's free the packets and reset txq->vhost_pkt_cnt in
> vring_state_changed() itself instead of calling flush().

Technically, the enabling case should also be handled, because while
enabling we're remapping the queue and, in some specific cases, I guess, the
old queue may no longer be used by the threads after remapping.
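To sketch what handling the enable path could look like: drain whatever is still pending on the previously mapped queue before updating the map, since no thread may touch that queue afterwards. This is a toy model with made-up names (mock_dev, mock_txq, vring_enable), not the actual netdev_dpdk code, and it drops the stranded packets where a real implementation might instead flush them:

```c
#include <assert.h>

#define MAX_QUEUES 4
#define QUEUE_DISABLED (-1)

/* Simplified stand-in for dev->tx_q[]: 'map' is the queue a txq is
 * currently remapped to, 'pkt_cnt' stands in for vhost_pkt_cnt. */
struct mock_txq {
    int map;
    int pkt_cnt;
};

struct mock_dev {
    struct mock_txq tx_q[MAX_QUEUES];
    int tx_dropped;
};

/* Enable path: if qid was remapped to some other queue, packets
 * buffered on that old queue may never be flushed once we remap,
 * so count and drop them before updating the map. */
static void
vring_enable(struct mock_dev *dev, int qid)
{
    int mapped_qid = dev->tx_q[qid].map;

    if (mapped_qid != qid && mapped_qid != QUEUE_DISABLED) {
        dev->tx_dropped += dev->tx_q[mapped_qid].pkt_cnt;
        dev->tx_q[mapped_qid].pkt_cnt = 0;
    }
    dev->tx_q[qid].map = qid;
}
```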

> 
> In vring_state_changed():
> ------------------------------------------------------
> rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
> 
> mapped_qid = dev->tx_q[qid].map;
> if (OVS_UNLIKELY(qid != mapped_qid)) {
>     rte_spinlock_lock(&dev->tx_q[mapped_qid].tx_lock);
> }
> 
> if (enable) {
>     dev->tx_q[qid].map = qid;
> } else {
>     struct dpdk_tx_queue *txq = &dev->tx_q[qid];
> 
>     if (txq->vhost_pkt_cnt) {
>         rte_spinlock_lock(&dev->stats_lock);
>         dev->stats.tx_dropped += txq->vhost_pkt_cnt;
>         rte_spinlock_unlock(&dev->stats_lock);
> 
>         for (int i = 0; i < txq->vhost_pkt_cnt; i++) {
>             dp_packet_delete(txq->vhost_burst_pkts[i]);
>         }
>         txq->vhost_pkt_cnt = 0;
>     }
> 
>     dev->tx_q[qid].map = OVS_VHOST_QUEUE_DISABLED;
> }
> -------------------------------------------------------------------------
> 
> Regards,
> Bhanuprakash.
> 
>>
>>> ---------------------------------------------------------------------------
>>>     rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>>>     for (int i = 0; i < txq->vhost_pkt_cnt; i++) {
>>>              dp_packet_delete(txq->vhost_burst_pkts[i]);
>>>     }
>>>     rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>>>
>>> - Bhanuprakash
>>>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
