Initial python part refactoring

Kamov Sergey Mon, 23 Aug 2021 11:28:12 -0700

Below my opinion about initial python part refactoring PR -https://github.com/apache/incubator-nlpcraft/pull/18

I tried to review how it works and want to shortly describe my remarkswhich should be fixed in current PR.

After we 'll do detailed code review code, etc


If somebody has another opinion - please let's discuss.

1) python part should be optional.
 - required libs for project are just: java 11 and maven.

- `mvn clean package/verify/install` should work as in master branchwithout python stuff. - Apache actions should workhttps://github.com/apache/incubator-nlpcraft/actions

 - ctx server and spacy are disabled by default.

So, by default everything should be same as in master branch andrequires installed java and maven only.



2) python stuff usage.
there are few python stuff installations ways:
 - via scripts,
 - via maven to simplify OS depended issues.

if maven way is prefered - it should be processed with some parameter -like `mvn clean package/verify/install -some_parameter_use_python` (ifit is possible with maven)- required libs for project: java 11, maven, python, conda (? -is itpossible to install it on backstage?)

(Suggested flag -Dauto-rm-existing-python-env= - is good thinks I guess,but it is about general approach)



3) main current remarks about implementation.
 3.1) We can't keep such progress bar in out installation output
 (Look please at attached file)

How to avoid it - another question, but our final installation productcan't print it I guess


 3.2) Also we have to try to find some way to avoid warning like

>> python/fasttext_module/fasttext/pybind/fasttext_pybind.cc:346:35:warning: comparison of integers of different signs: 'int32_t' (aka'int') and 'std::__1::vector<long long>::size_type' (aka 'unsignedlong') [-Wsign-compare]

   >>         for (int32_t i = 0; i < vocab_freq.size(); i++) {

I see that some of these things are out of our control - but we have todo some workarounds, change our approaches etc - we can't provide it toour users (Especially progressbar)


 3.3) start_server.sh
 Duplicated output  - should be fixed (it is still here)
>>CUDA is not available for Torch.
>>CUDA is not available for Torch.
>>[2021-08-23 12:52:42]: 80055 DEBUG Registering ctxserver blueprint
>>[2021-08-23 12:52:42]: 80057 DEBUG Registering ctxserver blueprint

3.3) server starts too slowly comparing to previous version

I think 3.3 and 3.4 - significant issues and should be resolved in thisticket (we can't merge implementations with such degradation)



Conclusions:

I guess after these 6 points (1, 2, 3.1-3.4) resolving, we can startdeep code review and detailed tests on various OS.

Other improvements etc can be moved on next tickets.


Regards,

Sergey

Initial python part refactoring

Reply via email to