Hello,

We are running LR[1], GBDT[2], and similar algorithms in MP2 handlers.
For each request, about 1000 features are passed to the handler as arguments via HTTP POST. Each request waits about 100 ms for a response, because the calculation is not cheap.
My question is: how can we improve throughput through architectural optimization?
We know there are prediction frameworks such as TFS[3] and RT[4], but we don't use TensorFlow yet.
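For a rough sense of scale, here is a back-of-the-envelope sketch of what the 100 ms figure implies. It assumes each worker process serves one request at a time and that the whole 100 ms is spent in the handler. The function names, the worker count, and the target rate are hypothetical and used only for illustration; they are not our real configuration.

```python
import math


def max_throughput(workers: int, service_time_ms: int) -> float:
    """Upper bound on requests/second when each worker handles one request at a time."""
    if workers <= 0 or service_time_ms <= 0:
        raise ValueError("workers and service_time_ms must be positive")
    return workers * 1000 / service_time_ms


def workers_needed(target_rps: int, service_time_ms: int) -> int:
    """Minimum number of busy workers needed to sustain target_rps (Little's law)."""
    if target_rps <= 0 or service_time_ms <= 0:
        raise ValueError("target_rps and service_time_ms must be positive")
    return math.ceil(target_rps * service_time_ms / 1000)


# One blocking worker at ~100 ms per request (figure from this message).
print(max_throughput(1, 100))
# Hypothetical pool of 32 workers.
print(max_throughput(32, 100))
# Hypothetical target of 500 requests/second.
print(workers_needed(500, 100))
```

So under these assumptions throughput only grows with the number of workers, or by cutting the per-request time, which is why I am asking about the architecture.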


[1] https://en.wikipedia.org/wiki/Logistic_regression
[2] https://en.wikipedia.org/wiki/Gradient_boosting
[3] https://www.tensorflow.org/tfx/guide/serving
[4] https://developer.nvidia.com/tensorrt


Thanks.
