remotego edited a comment on pull request #9:
URL: https://github.com/apache/incubator-tvm-vta/pull/9#issuecomment-660622675


   > > The code should work for all devices supported by Intel OpenCL for FPGA, 
namely Intel Arria 10, Stratix V/10 and Cyclone V/10.
   > 
   > Just my humble opinion: given that both "Intel OpenCL for FPGA" and "VTA" 
require a large amount of logic, many Cyclone V chips that support AOCL could 
not compile this design because of the resource utilization issue.
   > As I understand @tmoreau89's comment, with which I agree, setting a proper 
target device in vta_config is what makes VTA, the Versatile Tensor 
Accelerator, "versatile". Specifically, we could take advantage of the 
properties in vta_config to define accelerators that scale from minimal designs 
up to data-center-scale ones.
   > 
   > As a side note, part of the reason we put effort into building open source 
projects is the hope that someone in the community can reproduce what we have 
done and easily start building something even better. With the target defined 
too broadly, a prospective grad student could fail to reproduce the result, 
since not every student can easily purchase a Stratix 10 board, and low-cost 
Cyclone V boards couldn't get this running. In addition, they would waste a 
large amount of valuable logic resources even if they could afford a Stratix 10 
board. Therefore, we should specify a precise target device for the vta_config.
   
   Thank you for your reply. However, could you explain in more detail why the 
design would not work on Cyclone V FPGAs that support Intel OpenCL for FPGA?
   
   Precisely as you mentioned, the VTA design is versatile: the user can 
always change the settings (i.e. LOG_BLOCK and LOG_*_BUF_SIZE) to adjust the 
resource usage to fit their own FPGA board. A low-cost Cyclone V device 
certainly cannot support 64x64 GEMV cores the way large Stratix 10 FPGAs do, 
but the user can always try 16x16 or even 4x4 GEMV cores by lowering LOG_BLOCK.
   
   With that in mind, we chose relatively small default settings for LOG_BLOCK 
(4) and the buffers (15, 15, 18, 17). The design should therefore fit into 
FPGAs comparable to the original Zynq/Zedboard platforms.
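   
   As a rough illustration of how these settings translate into resource 
usage, here is a minimal Python sketch. It is not part of the VTA code base: 
the helper names are hypothetical, and it assumes that each setting is a 
base-2 logarithm, that the buffer values are log2 sizes in bytes, and that the 
four defaults are ordered as micro-op, input, weight and accumulator buffers.

```python
# Hypothetical helpers for illustration only; not taken from the VTA sources.

def gemv_core_dim(log_block: int) -> int:
    """Side length of the GEMV core, assuming LOG_BLOCK is log2 of that length."""
    return 1 << log_block


def buffer_bytes(log_sizes: dict) -> dict:
    """Convert log2 buffer sizes to bytes (assumes the unit is bytes)."""
    return {name: 1 << log_size for name, log_size in log_sizes.items()}


# Defaults quoted above. The buffer order (uop, inp, wgt, acc) is an assumption.
DEFAULT_LOG_BLOCK = 4
DEFAULT_LOG_BUFFERS = {"uop": 15, "inp": 15, "wgt": 18, "acc": 17}

if __name__ == "__main__":
    # Compare the small, default and large core sizes mentioned in the comment.
    for log_block in (2, DEFAULT_LOG_BLOCK, 6):
        dim = gemv_core_dim(log_block)
        print(f"LOG_BLOCK={log_block}: {dim}x{dim} core, {dim * dim} multipliers")

    sizes = buffer_bytes(DEFAULT_LOG_BUFFERS)
    for name, size in sizes.items():
        print(f"{name} buffer: {size // 1024} KiB")
    print(f"total on-chip buffer: {sum(sizes.values()) // 1024} KiB")
```

   Under these assumptions, lowering LOG_BLOCK by one quarters the number of 
multipliers, and lowering a buffer setting by one halves that buffer, which is 
why small devices have some room to trade performance for fit.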
   
   We must admit that we don't have those Cyclone V boards, and the design has 
not been tested on those platforms. However, if anyone encounters problems when 
compiling the design for an AOCL-compatible Cyclone V board, we will be more 
than happy to investigate and try to solve the issue together.


