remotego edited a comment on pull request #9:
URL: https://github.com/apache/incubator-tvm-vta/pull/9#issuecomment-660622675


   > > The code should work for all devices supported by Intel OpenCL for FPGA, 
namely Intel Arria 10, Stratix V/10 and Cyclone V/10.
   > 
   > Just my humble opinion: given that both "Intel OpenCL for FPGA" and "VTA" 
require a large amount of logic, many Cyclone V chips that support AOCL 
couldn't get this compiled because of the hardware utilization issue.
   > As I understand @tmoreau89 's comment, with which I agree, setting a 
proper target device in vta_config is what makes VTA, the Versatile Tensor 
Accelerator, "versatile". Specifically, we could take advantage of the 
properties in vta_config to define an accelerator that scales from minimal 
designs up to data center scale accelerators.
   > 
   > As a side note, part of the reason we put this effort into building open 
source projects is the hope that someone in the community can reproduce what 
we have done and easily start building something even better. With the target 
defined too broadly, a prospective grad student could fail to reproduce the 
result, since not every student can easily purchase a Stratix 10 board, and 
low-cost Cyclone V boards couldn't get this running. In addition, even those 
who can afford a Stratix 10 board would waste a large amount of valuable 
logic resources. Therefore, we should specify a precise target device for the 
vta_config.
   
   Thank you for your reply. However, could you explain in more detail why 
the design would not work on Cyclone V FPGAs that support Intel OpenCL for 
FPGA?
   
   As you mentioned, the VTA design is versatile: the user can always change 
the settings (i.e. LOG_BLOCK and LOG_*_BUF_SIZE) to adjust the resource usage 
to fit their own FPGA board. Surely a low-cost Cyclone V device cannot support 
64x64 GEMV cores the way large Stratix 10 FPGAs do. But the user can always 
try 16x16 or even 4x4 GEMV cores by setting LOG_BLOCK lower.
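   
   As a rough illustration of that scaling (a sketch, not code from this PR), 
the snippet below assumes LOG_BLOCK is the base-2 logarithm of the core's side 
length, which matches the 64x64, 16x16 and 4x4 figures above. The helper name 
`core_shape` is hypothetical.

```python
from typing import Tuple


def core_shape(log_block: int) -> Tuple[int, int]:
    """Return (rows, cols) of the compute core for a LOG_BLOCK value.

    Assumption: the core is square with side 2**LOG_BLOCK.
    """
    if log_block < 0:
        raise ValueError("LOG_BLOCK must be non-negative")
    side = 1 << log_block
    return side, side


if __name__ == "__main__":
    for log_block in (6, 4, 2):
        rows, cols = core_shape(log_block)
        # rows * cols is the number of elements in the core's block
        print(f"LOG_BLOCK={log_block}: {rows}x{cols} core, {rows * cols} elements")
```

   Under this assumption, each step down in LOG_BLOCK cuts the number of 
elements in the core by a factor of four, which is why lowering it is the 
first knob to try on a small device.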
   
   With that in mind, we chose relatively small defaults for LOG_BLOCK (4) 
and the buffers (15, 15, 18, 17). The design should therefore fit into FPGAs 
comparable to the original Zynq/Zedboard platforms.
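   
   For a sense of scale, here is a small sketch of what those buffer 
defaults add up to. It rests on two assumptions that are not stated above: 
that the four values are, in order, the micro-op, input, weight and 
accumulator buffers, and that each value is the base-2 logarithm of the 
buffer size in bytes. The dictionary keys and function names are 
illustrative only.

```python
from typing import Dict

# Assumed mapping of the four defaults (15, 15, 18, 17) to buffers.
DEFAULT_LOG_BUFF_SIZES: Dict[str, int] = {
    "uop": 15,  # micro-op buffer (assumed)
    "inp": 15,  # input buffer (assumed)
    "wgt": 18,  # weight buffer (assumed)
    "acc": 17,  # accumulator buffer (assumed)
}


def buffer_bytes(log_size: int) -> int:
    """Size in bytes of a buffer whose config value is log2(bytes)."""
    return 1 << log_size


def total_buffer_kib(log_sizes: Dict[str, int]) -> int:
    """Total on-chip buffer capacity in KiB for a set of log2 sizes."""
    return sum(buffer_bytes(v) for v in log_sizes.values()) // 1024


if __name__ == "__main__":
    for name, log_size in DEFAULT_LOG_BUFF_SIZES.items():
        print(f"{name}: {buffer_bytes(log_size) // 1024} KiB")
    print(f"total: {total_buffer_kib(DEFAULT_LOG_BUFF_SIZES)} KiB")
```

   If those assumptions hold, the defaults come to 448 KiB of on-chip 
buffering in total, and lowering any one value by one halves that buffer.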
   
   We must admit that we don't have those Cyclone V boards on hand, nor has the 
design been tested on those platforms. However, if there is any issue 
compiling the design for an AOCL-compatible Cyclone V board, we will be more 
than happy to investigate and try to solve the issue together.
   
   In terms of accessibility, we know that high-end FPGA cards are very 
expensive. The good news is that many cloud service providers now offer 
high-end FPGA instances! Those FPGA instances generally cost only a few bucks 
per hour of use.
   
   In addition, we are also working on porting the design over to Amazon EC2 F1 
instances (Xilinx SDAccel). We will update again when we finish testing on the 
Amazon platforms.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

