remotego edited a comment on pull request #9:
URL: https://github.com/apache/incubator-tvm-vta/pull/9#issuecomment-660622675
> > The code should work for all devices supported by Intel OpenCL for FPGA, namely Intel Arria 10, Stratix V/10 and Cyclone V/10.
>
> Just my humble opinion: given that both "Intel OpenCL for FPGA" and "VTA" require a large amount of logic, many Cyclone V chips that support AOCL couldn't get this compiled because of the hardware utilization issue.
> As I understand @tmoreau89's comment, with which I agree, setting a proper target device in vta_config is what makes VTA, the Versatile Tensor Accelerator, "versatile". Specifically, we could use the properties in vta_config to define an accelerator that scales from minimal designs up to data-center-scale accelerators.
>
> As a side note, part of the reason we put this effort into building open source projects is the hope that someone in the community can reproduce what we have done and easily start building something even better. With the target defined too broadly, a potential grad student could fail to reproduce the result, since not every student can easily purchase a Stratix 10 board, and low-cost Cyclone V boards couldn't get this running. In addition, even those who can afford a Stratix 10 board would be wasting a large amount of valuable logic resources. Therefore, we should specify a precise target device for the vta_config.

Thank you for your reply. However, could you explain in more detail why the design would not work on Cyclone V FPGAs that support Intel OpenCL for FPGA? As you mentioned, the VTA design is versatile: the user can always change the settings (i.e. LOG_BLOCK and LOG_*_BUF_SIZE) to adjust the resource usage to fit their own FPGA board. Surely a low-cost Cyclone V device cannot support 64x64 GEMV cores the way large Stratix 10 FPGAs do, but the user can always try 16x16 or even 4x4 GEMV cores by setting LOG_BLOCK lower.
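
To make the scaling argument concrete, here is a rough sketch of how the log2-encoded settings translate into core and buffer sizes. The helper names are hypothetical and not part of VTA. The multiplier count assumes a batch of 1 and one multiplier per block element, and treating the buffer settings as log2 of a size in bytes is also an assumption. This is only meant to illustrate the orders of magnitude involved, not actual resource usage on any device.

```python
# Hypothetical helpers (not part of VTA) for reasoning about log2-encoded settings.

def block_dim(log_block: int) -> int:
    """Side length of the square compute block: 2 ** LOG_BLOCK."""
    return 1 << log_block


def multiplier_count(log_block: int) -> int:
    """Multipliers in a dim x dim block (assumption: batch of 1, one per element)."""
    dim = block_dim(log_block)
    return dim * dim


def buffer_bytes(log_buf_size: int) -> int:
    """Buffer capacity, assuming the setting is log2 of the size in bytes."""
    return 1 << log_buf_size


if __name__ == "__main__":
    # LOG_BLOCK values corresponding to the 64x64, 16x16 and 4x4 cores discussed above.
    for log_block in (6, 4, 2):
        dim = block_dim(log_block)
        print(f"LOG_BLOCK={log_block}: {dim}x{dim} core, "
              f"{multiplier_count(log_block)} multipliers")

    # Example buffer settings, printed in KiB under the bytes assumption.
    for log_size in (15, 17, 18):
        print(f"log2 size {log_size}: {buffer_bytes(log_size) // 1024} KiB")
```

Under these assumptions, each step down in LOG_BLOCK quarters the number of multipliers, which is why lowering it is the first knob to try on a smaller device.
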
Considering that, we used relatively small default settings for LOG_BLOCK (4) and the buffers (15, 15, 18, 17). Thus the design should fit into FPGAs comparable to the original Zynq/Zedboard platforms. We must admit that we do not have any Cyclone V boards, and the design has not been tested on those platforms. However, if anyone encounters problems when compiling the design for an AOCL-compatible Cyclone V board, we will be more than happy to investigate and try to solve the issue together.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org