[ https://issues.apache.org/jira/browse/SINGA-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Yeung updated SINGA-512:
------------------------------
Description:
In cuDNN 7.6, a new API is introduced for fused ops, which can accelerate many
use cases in ResNet-like networks. With this new API it is now possible to
execute various fused operations, such as applying per-channel scale and bias,
performing activation, computing convolution, and generating batchnorm
statistics.
Reference:
https://docs.nvidia.com/deeplearning/cudnn/release-notes/rel_7xx.html#rel_760
The goal is to increase the image throughput of DL networks. Currently, this
task is assigned to Naili.
was:
In cuDNN 7.6, a new API is introduced for fused ops, which can accelerate many
use cases in ResNet-like networks. With this new API it is now possible to
execute various fused operations, such as applying per-channel scale and bias,
performing activation, computing convolution, and generating batchnorm
statistics.
Reference:
https://docs.nvidia.com/deeplearning/cudnn/release-notes/rel_7xx.html#rel_760
The goal is to increase the image throughput of ResNet. Currently, this task is
assigned to Naili.
> Support of fused ops to increase throughput performance
> -------------------------------------------------------
>
> Key: SINGA-512
> URL: https://issues.apache.org/jira/browse/SINGA-512
> Project: Singa
> Issue Type: Improvement
> Components: Core
> Reporter: Chris Yeung
> Priority: Major
>
> In cuDNN 7.6, a new API is introduced for fused ops, which can accelerate
> many use cases in ResNet-like networks. With this new API it is now possible
> to execute various fused operations, such as applying per-channel scale and
> bias, performing activation, computing convolution, and generating batchnorm
> statistics.
> Reference:
> https://docs.nvidia.com/deeplearning/cudnn/release-notes/rel_7xx.html#rel_760
>
> The goal is to increase the image throughput of DL networks. Currently, this
> task is assigned to Naili.
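To make the fused sequence named in the description concrete, here is a minimal unfused reference in plain Python. It is only a sketch of the arithmetic (per-channel scale and bias, ReLU activation, convolution, per-channel statistics), not the cuDNN API and not SINGA code. All function names are hypothetical. The choice of ReLU, a stride-1 "valid" convolution, and mean/biased variance as the batchnorm statistics are assumptions made for illustration.

```python
# Unfused reference for: scale/bias -> activation -> convolution -> BN statistics.
# Illustrative only; names, ReLU, and the statistics layout are assumptions.

def scale_bias_relu(x, scale, bias):
    """Per-channel y = max(0, scale[c] * x + bias[c]) for x shaped [C][H][W]."""
    return [[[max(0.0, scale[c] * v + bias[c]) for v in row] for row in x[c]]
            for c in range(len(x))]


def conv2d_valid(x, w):
    """Direct stride-1 'valid' cross-correlation. x: [C][H][W], w: [K][C][R][S]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    K, R, S = len(w), len(w[0][0]), len(w[0][0][0])
    out = []
    for k in range(K):
        plane = []
        for i in range(H - R + 1):
            row = []
            for j in range(W - S + 1):
                acc = 0.0
                for c in range(C):
                    for r in range(R):
                        for s in range(S):
                            acc += x[c][i + r][j + s] * w[k][c][r][s]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def channel_stats(y):
    """Per-channel (mean, biased variance) for y shaped [K][H][W]."""
    stats = []
    for plane in y:
        vals = [v for row in plane for v in row]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        stats.append((mean, var))
    return stats


def unfused_reference(x, scale, bias, w):
    """Run the three stages one after another, materializing each intermediate."""
    y = conv2d_valid(scale_bias_relu(x, scale, bias), w)
    return y, channel_stats(y)
```

Each stage here writes a full intermediate tensor before the next stage reads it. A fused kernel performs the same arithmetic in one pass, which avoids those extra memory round trips and is where the throughput gain would be expected to come from. A reference like this could also serve as a correctness check when comparing fused and unfused outputs.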