We use Google's Universal Sentence Encoder, which runs on TensorFlow:
https://tfhub.dev/google/universal-sentence-encoder/1
It's impressive because it can handle multi-word "keywords", which tend to
be forgotten during planning but are critical in actual use. It's
pretrained on several languag
How often will you access the files, and how often do they change? Is the
goal to also make them available for viewing online?
Amazon's product for this sort of thing is Glacier:
https://aws.amazon.com/glacier/
It would be somewhat expensive:
$0.004/GB/mo, so $4/TB/mo; x 200 TB = $800/mo,
or $9,600/yr
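
If you want to plug in your own numbers, here is a rough sketch of that arithmetic in Python. The function name is made up for illustration, and it assumes 1 TB = 1000 GB, the $0.004/GB/mo rate quoted above, and 200 TB of data. It covers storage only, so retrieval and request fees are not included.

```python
def storage_cost_per_month(terabytes, usd_per_gb_month=0.004, gb_per_tb=1000):
    """Estimate the monthly storage cost in USD (storage only, no retrieval fees)."""
    return terabytes * gb_per_tb * usd_per_gb_month


# Assumed inputs: 200 TB at the rate quoted above.
monthly = storage_cost_per_month(200)
yearly = monthly * 12

print(f"${monthly:,.2f}/mo, ${yearly:,.2f}/yr")
```

Change the first argument (or the rate) to see how the total scales with your actual archive size.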