Great proposal!

A few questions from my end:

1. Will the new C-API functions be thread-safe in general? That is, can I invoke
them at any point in time from any thread without needing a lock, a
sticky thread, or a thread hierarchy? (I'm thinking of thread safety being
handled at the backend level.)
2. Will this also support the GPU use case? That is, are the parameters copied
into GPU memory only once, in the same fashion as you describe for the
CPU?
3. Do you think there's a path forward to make all inference-related C-APIs
thread-safe instead of splitting off another execution branch?

https://github.com/apache/incubator-mxnet/issues/16431#issuecomment-540828556