[dpdk-dev] [PATCH v5 0/6] vhost: optimize enqueue

2016-09-12 Thread Yuanhan Liu
On Mon, Sep 12, 2016 at 03:52:12PM +0200, Maxime Coquelin wrote:
> Just tried to apply your series without success.
> Apparently, it is not based directly on master branch,
> as it lacks some SHA-1 information.
> 
> Could you rebase it against master please?

It's rebased against the dpdk-next-virtio tree [0], where all the
virtio/vhost patches are applied first.

[0]: http://dpdk.org/browse/next/dpdk-next-virtio/

--yliu


[dpdk-dev] [PATCH v5 0/6] vhost: optimize enqueue

2016-09-12 Thread Maxime Coquelin


On 09/12/2016 03:52 PM, Maxime Coquelin wrote:
> Hi,
>
> On 09/09/2016 05:39 AM, Zhihong Wang wrote:
>> This patch set optimizes the vhost enqueue function.
>>
>> It implements the vhost logic from scratch into a single function
>> designed
>> for high performance and good maintainability, and improves CPU
>> efficiency
>> significantly by optimizing cache access, which means:
>>
>>  *  Higher maximum throughput can be achieved for fast frontends like
>> DPDK
>> virtio pmd.
>>
>>  *  Better scalability can be achieved that each vhost core can support
>> more connections because it takes less cycles to handle each single
>> frontend.
>>
>> This patch set contains:
>>
>>  1. A Windows VM compatibility fix for vhost enqueue in 16.07 release.
>>
>>  2. A baseline patch to rewrite the vhost logic.
>>
>>  3. A series of optimization patches added upon the baseline.
>>
>> The main optimization techniques are:
>>
>>  1. Reorder code to reduce CPU pipeline stall cycles.
>>
>>  2. Batch update the used ring for better efficiency.
>>
>>  3. Prefetch descriptor to hide cache latency.
>>
>>  4. Remove useless volatile attribute to allow compiler optimization.
>>
>> Code reordering and batch used ring update bring most of the performance
>> improvements.
>>
>> In the existing code there're 2 callbacks for vhost enqueue:
>>
>>  *  virtio_dev_merge_rx for mrg_rxbuf turned on cases.
>>
>>  *  virtio_dev_rx for mrg_rxbuf turned off cases.
>>
>> The performance of the existing code is not optimal, especially when the
>> mrg_rxbuf feature turned on. Besides, having 2 callback paths increases
>> maintenance efforts.
>>
>> Also, there's a compatibility issue in the existing code which causes
>> Windows VM to hang when the mrg_rxbuf feature turned on.
>>
>> ---
>> Changes in v5:
>>
>>  1. Rebase to the latest branch.
>>
>>  2. Rename variables to keep consistent in naming style.
>>
>>  3. Small changes like return value adjustment and vertical alignment.
>>
>>  4. Add details in commit log.
> Just tried to apply your series without success.
> Apparently, it is not based directly on master branch,
> as it lacks some SHA-1 information.
>
> Could you rebase it against master please?

Ok, it is in fact based on top of:
git://dpdk.org/next/dpdk-next-virtio master

For v6, if any, could you add this info to the cover letter please?

Thanks,
Maxime


[dpdk-dev] [PATCH v5 0/6] vhost: optimize enqueue

2016-09-12 Thread Maxime Coquelin
Hi,

On 09/09/2016 05:39 AM, Zhihong Wang wrote:
> This patch set optimizes the vhost enqueue function.
>
> It implements the vhost logic from scratch into a single function designed
> for high performance and good maintainability, and improves CPU efficiency
> significantly by optimizing cache access, which means:
>
>  *  Higher maximum throughput can be achieved for fast frontends like DPDK
> virtio pmd.
>
>  *  Better scalability can be achieved that each vhost core can support
> more connections because it takes less cycles to handle each single
> frontend.
>
> This patch set contains:
>
>  1. A Windows VM compatibility fix for vhost enqueue in 16.07 release.
>
>  2. A baseline patch to rewrite the vhost logic.
>
>  3. A series of optimization patches added upon the baseline.
>
> The main optimization techniques are:
>
>  1. Reorder code to reduce CPU pipeline stall cycles.
>
>  2. Batch update the used ring for better efficiency.
>
>  3. Prefetch descriptor to hide cache latency.
>
>  4. Remove useless volatile attribute to allow compiler optimization.
>
> Code reordering and batch used ring update bring most of the performance
> improvements.
>
> In the existing code there're 2 callbacks for vhost enqueue:
>
>  *  virtio_dev_merge_rx for mrg_rxbuf turned on cases.
>
>  *  virtio_dev_rx for mrg_rxbuf turned off cases.
>
> The performance of the existing code is not optimal, especially when the
> mrg_rxbuf feature turned on. Besides, having 2 callback paths increases
> maintenance efforts.
>
> Also, there's a compatibility issue in the existing code which causes
> Windows VM to hang when the mrg_rxbuf feature turned on.
>
> ---
> Changes in v5:
>
>  1. Rebase to the latest branch.
>
>  2. Rename variables to keep consistent in naming style.
>
>  3. Small changes like return value adjustment and vertical alignment.
>
>  4. Add details in commit log.
Just tried to apply your series without success.
Apparently, it is not based directly on master branch,
as it lacks some SHA-1 information.

Could you rebase it against master please?

Thanks,
Maxime


[dpdk-dev] [PATCH v5 0/6] vhost: optimize enqueue

2016-09-09 Thread Zhihong Wang
This patch set optimizes the vhost enqueue function.

It implements the vhost logic from scratch into a single function designed
for high performance and good maintainability, and improves CPU efficiency
significantly by optimizing cache access, which means:

 *  Higher maximum throughput can be achieved for fast frontends like DPDK
virtio pmd.

 *  Better scalability can be achieved that each vhost core can support
more connections because it takes less cycles to handle each single
frontend.

This patch set contains:

 1. A Windows VM compatibility fix for vhost enqueue in 16.07 release.

 2. A baseline patch to rewrite the vhost logic.

 3. A series of optimization patches added upon the baseline.

The main optimization techniques are:

 1. Reorder code to reduce CPU pipeline stall cycles.

 2. Batch update the used ring for better efficiency.

 3. Prefetch descriptor to hide cache latency.

 4. Remove useless volatile attribute to allow compiler optimization.

Code reordering and batch used ring update bring most of the performance
improvements.

In the existing code there're 2 callbacks for vhost enqueue:

 *  virtio_dev_merge_rx for mrg_rxbuf turned on cases.

 *  virtio_dev_rx for mrg_rxbuf turned off cases.

The performance of the existing code is not optimal, especially when the
mrg_rxbuf feature turned on. Besides, having 2 callback paths increases
maintenance efforts.

Also, there's a compatibility issue in the existing code which causes
Windows VM to hang when the mrg_rxbuf feature turned on.

---
Changes in v5:

 1. Rebase to the latest branch.

 2. Rename variables to keep consistent in naming style.

 3. Small changes like return value adjustment and vertical alignment.

 4. Add details in commit log.

---
Changes in v4:

 1. Fix a Windows VM compatibility issue.

 2. Free shadow used ring in the right place.

 3. Add failure check for shadow used ring malloc.

 4. Refactor the code for clearer logic.

 5. Add PRINT_PACKET for debugging.

---
Changes in v3:

 1. Remove unnecessary memset which causes frontend stall on SNB & IVB.

 2. Rename variables to follow naming convention.

 3. Rewrite enqueue and delete the obsolete in the same patch.

---
Changes in v2:

 1. Split the big function into several small ones.

 2. Use multiple patches to explain each optimization.

 3. Add comments.

Zhihong Wang (6):
  vhost: fix windows vm hang
  vhost: rewrite enqueue
  vhost: remove useless volatile
  vhost: add desc prefetch
  vhost: batch update used ring
  vhost: optimize cache access

 lib/librte_vhost/vhost.c  |  20 +-
 lib/librte_vhost/vhost.h  |   6 +-
 lib/librte_vhost/vhost_user.c |  31 ++-
 lib/librte_vhost/virtio_net.c | 561 +++---
 4 files changed, 242 insertions(+), 376 deletions(-)

-- 
2.7.4