[GitHub] [incubator-tvm-vta] remotego edited a comment on pull request #9: [Hardware][OpenCL] Intelfocl support

2020-09-19 Thread GitBox


remotego edited a comment on pull request #9:
URL: https://github.com/apache/incubator-tvm-vta/pull/9#issuecomment-695611214


   Yes. We will address those issues.
   Sorry for the delay. We are working on it and will update Monday or Tuesday.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-tvm-vta] remotego edited a comment on pull request #9: [Hardware][OpenCL] Intelfocl support

2020-07-30 Thread GitBox


remotego edited a comment on pull request #9:
URL: https://github.com/apache/incubator-tvm-vta/pull/9#issuecomment-665845575


   > thanks @remotego @liangfu @pasqoc for the insightful comments. I think 
that all opinions expressed are very valid!
   > 
   > So far our approach to naming VTA target has been to couple the VTA target 
with the FPGA board (e.g. pynq, ultra96, de10nano). These targets also contrast 
with functional simulation (sim), and cycle accurate sim (tsim).
   > 
   > We've been using the VTA target to guide the compilation process to the 
target device, given that some of the drivers had to be written in a board 
specific fashion. In the case of the Intel OpenCL FPGA support, I understand 
that this work captures a variety of hardware backends, including Arria 10, 
Stratix 10 boards, etc., and that the driver codebase would remain mostly 
identical between those boards (unlike Pynq, and Ultra96 that relied on very 
different ARM SoCs)
   > 
   > However, to find a quick resolution to this discussion we can either choose
   > 
   > * to use a specific board name (rather than FPGA family) to indicate that 
the OCL FPGA design has been tested on this device. This echoes @liangfu's 
concern that we should have concrete targets for the community to reproduce 
work on.
   > * keep the naming open as @remotego and @pasqoc are advocating, and 
classify this target as intelfocl_pcie to indicate that this applies only to 
PCIE-based OpenCL Intel FPGA devices.
   > 
   > @remotego let us know what you think
   
   Thank you for the reply. I am okay with either approach, but I would like to 
suggest decoupling the target from the backend. As we have three different 
backends right now, we will definitely run into situations where one 
particular board is supported by more than one backend. I don't think we 
need to choose a backend for the user, and it would be great if the user were 
willing to explore and experience the pros/cons of each backend.
   
   I agree with you and @liangfu: I would like to keep the current target 
definition ("specific board") so that the user has a concrete platform 
to test the design. However, using the board name to imply the backend is 
confusing, as I see it. For example, as I mentioned before, the Intel OpenCL 
code is not restricted to PCIe-based boards; it should support SoC-based boards 
as well, including Altera DE10 boards. In addition, as the Intel OpenCL platform 
also supports software emulation and cycle-accurate simulation, it could 
also have "fsim" and "tsim" targets.
   
   I propose that we place an additional backend option inside 
vta_config.json, so that the user can explicitly spell out the choice of backend 
for VTA. We are also actively extending our support to Xilinx SDAccel/SDSoC, 
so that we can support many more FPGA cards in the near future.
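
   As a rough illustration of the proposal, here is a minimal Python sketch of 
how a config loader might read a backend option separately from the board 
target. The `BACKEND` key, the backend names, and the `resolve_config` helper 
are hypothetical and are not part of the current vta_config.json schema; the 
`TARGET` key is assumed to match the existing config file.

```python
import json

# Hypothetical backend names, for illustration only.
KNOWN_BACKENDS = {"hls", "chisel", "intelfocl"}


def resolve_config(text, default_backend="hls"):
    """Parse a vta_config-style JSON string and return (target, backend)."""
    cfg = json.loads(text)
    target = cfg["TARGET"]                          # board or simulator name
    backend = cfg.get("BACKEND", default_backend)   # hypothetical key
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    return target, backend


example = '{"TARGET": "de10nano", "BACKEND": "intelfocl", "LOG_BLOCK": 4}'
print(resolve_config(example))
```

   With this split, the same board name could be paired with any backend that 
supports it, and a config without the extra key would fall back to a default.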







[GitHub] [incubator-tvm-vta] remotego edited a comment on pull request #9: [Hardware][OpenCL] Intelfocl support

2020-07-29 Thread GitBox


remotego edited a comment on pull request #9:
URL: https://github.com/apache/incubator-tvm-vta/pull/9#issuecomment-664791701


   > Hi @zhanghaohit , since 
[apache/incubator-tvm#6092](https://github.com/apache/incubator-tvm/pull/6092) 
has been merged, kindly retrigger ci testing for this PR?
   
   Hi, we are making some changes now and will push the update today. Thanks!







[GitHub] [incubator-tvm-vta] remotego edited a comment on pull request #9: [Hardware][OpenCL] Intelfocl support

2020-07-19 Thread GitBox


remotego edited a comment on pull request #9:
URL: https://github.com/apache/incubator-tvm-vta/pull/9#issuecomment-660622675


   > > The code should work for all devices supported by Intel OpenCL for FPGA, 
namely Intel Arria 10, Stratix V/10 and Cyclone V/10.
   > 
   > Just my humble opinion, given that both "Intel OpenCL for FPGA" and "VTA" 
require a large amount of logic utilization, many Cyclone V chips that 
support AOCL couldn't get this compiled because of the hardware utilization 
issue.
   > In my understanding of @tmoreau89's comment, with which I would agree, setting 
a proper target device in vta_config gives the property of being "versatile" in 
VTA - the Versatile Tensor Accelerator. Specifically, we could take 
advantage of the properties in vta_config to define an accelerator that could 
scale from minimal ones towards data center scale accelerators.
   > 
   > As a side note, part of the reason we are taking these efforts in building open 
source projects is that we are hoping someone in the community could 
reproduce what we have done, and could easily start to build something that is 
even better. With the target being defined too broadly, a potential grad student 
could fail to reproduce the result, since not all students could easily 
purchase a board with Stratix 10, and low cost Cyclone V boards couldn't get 
this running. In addition, they would be wasting a large amount of valuable logic 
resources, even if they could afford a board with Stratix 10. Therefore, we should 
specify a precise target device for the vta_config.
   
   Thank you for your reply. However, could you explain further why the 
design would not work on Cyclone V FPGAs that support Intel OpenCL for 
FPGA?
   
   Precisely as you mentioned, the VTA design is versatile: the user can 
always change the settings (i.e. LOG_BLOCK and LOG_*_BUF_SIZE) to adjust 
resource usage to fit their own FPGA board. Certainly a low-cost Cyclone 
V device could not support 64x64 GEMV cores as large Stratix 10 FPGAs do, but 
the user could always try 16x16 or even 4x4 GEMV cores by lowering 
LOG_BLOCK.
   
   With that in mind, we used relatively small default settings for LOG_BLOCK 
(4) and the buffers (15, 15, 18, 17). Thus the design should be able to fit into 
FPGAs comparable to the original Zynq/Zedboard platforms.
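
   To show how these log2 settings translate into sizes, here is a small 
Python sketch. It assumes that LOG_BLOCK is the log2 of the GEMV core side 
(consistent with the 64x64, 16x16 and 4x4 figures above), that each buffer 
setting is the log2 of a size in bytes, and that the four buffer values are 
ordered micro-op, input, weight, accumulator. The helper names are hypothetical.

```python
def core_shape(log_block):
    """Return the (rows, cols) of the compute core for a given LOG_BLOCK."""
    side = 1 << log_block
    return side, side


def buffer_bytes(log_size):
    """Return a buffer size in bytes, assuming log_size is log2(bytes)."""
    return 1 << log_size


# Core sizes mentioned in the discussion: 64x64, 16x16 and 4x4.
for log_block in (6, 4, 2):
    rows, cols = core_shape(log_block)
    print(f"LOG_BLOCK={log_block}: {rows}x{cols} core")

# Default buffer settings; the name-to-value mapping is an assumption.
defaults = {"uop": 15, "inp": 15, "wgt": 18, "acc": 17}
for name, log_size in defaults.items():
    print(f"{name}: {buffer_bytes(log_size) // 1024} KiB")
```

   Under these assumptions, each step down in LOG_BLOCK halves the core side, 
which is why lowering it is the main lever for fitting smaller devices.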
   
   We must admit that we don't have those Cyclone V boards on hand, nor has the 
design been tested on those platforms. However, if there is any issue 
compiling the design for an AOCL-compatible Cyclone V board, we will be more 
than happy to investigate and try to solve the issue together.
   
   In terms of accessibility, we know that high-end FPGA cards are very 
expensive. The good news is that nowadays many cloud service providers 
offer high-end FPGA instances. Those FPGA instances generally cost only 
a few bucks per hour of use.
   
   In addition, we are also working on porting the design over to Amazon EC2 F1 
instances (Xilinx SDAccel). We will update again when we finish testing on the 
Amazon platforms.







[GitHub] [incubator-tvm-vta] remotego edited a comment on pull request #9: [Hardware][OpenCL] Intelfocl support

2020-07-17 Thread GitBox


remotego edited a comment on pull request #9:
URL: https://github.com/apache/incubator-tvm-vta/pull/9#issuecomment-660411735


   > Thanks for the changes. Please apply the 0.0.2 and rename the vta target 
to something more specific, e.g. "arria10". Also there are some CI errors 
related to linting that could be addressed. Thanks!
   
   Thank you very much! Sure. We will apply the 0.0.2 and address the linting 
errors.
   However, I believe "arria10" is too restrictive here. The code should work 
for all devices supported by Intel OpenCL for FPGA, namely Intel Arria 10, 
Stratix V/10 and Cyclone V/10. So far we have tested it on both Arria 10 and 
Stratix 10 boards, and it worked.


