Attention layer in C++ and Python
Hello,

Long-time reader, first-time poster. Has anyone implemented an attention layer in C++, similar to those used in Hierarchical Attention Networks in the NLP domain? I found one attempt in Python: https://github.com/magic282/MXNMT/blob/next/mxwrap/attention/BasicAttention.py

We have a custom implementation in Keras with the Theano backend, but it is terribly slow to train, so we are looking at implementing the same architecture directly in C++.

Thanks,
Max
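P.S. For reference, here is roughly the computation in question. It is only a minimal sketch of the forward pass, assuming the usual Hierarchical Attention Network formulation: each hidden state h_t is projected to u_t = tanh(W h_t + b), scored against a learned context vector, normalized with a softmax, and used to weight the sum of the hidden states. The names `AttentionParams`, `attention_weights`, and `attend` are hypothetical and not taken from any library, and the sketch uses plain `std::vector` with no batching, no gradients, and no BLAS, so it is not a training-ready implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: Mat[i] is row i

// Hypothetical parameter bundle (illustrative names, not from any library).
struct AttentionParams {
    Mat W;      // attention_dim x hidden_dim projection
    Vec b;      // attention_dim bias
    Vec u_ctx;  // attention_dim learned context vector
};

// Numerically stable softmax over a vector of scores.
Vec softmax(const Vec& scores) {
    Vec out(scores.size());
    if (scores.empty()) return out;
    const double max_score = *std::max_element(scores.begin(), scores.end());
    double sum = 0.0;
    for (std::size_t t = 0; t < scores.size(); ++t) {
        out[t] = std::exp(scores[t] - max_score);
        sum += out[t];
    }
    for (double& v : out) v /= sum;
    return out;
}

// alpha_t = softmax_t( tanh(W h_t + b) . u_ctx ), one weight per time step.
// H holds one hidden state per row (time_steps x hidden_dim).
Vec attention_weights(const Mat& H, const AttentionParams& p) {
    assert(p.W.size() == p.b.size() && p.W.size() == p.u_ctx.size());
    Vec scores(H.size(), 0.0);
    for (std::size_t t = 0; t < H.size(); ++t) {
        for (std::size_t a = 0; a < p.W.size(); ++a) {
            assert(p.W[a].size() == H[t].size());
            double z = p.b[a];
            for (std::size_t k = 0; k < H[t].size(); ++k) {
                z += p.W[a][k] * H[t][k];
            }
            scores[t] += std::tanh(z) * p.u_ctx[a];
        }
    }
    return softmax(scores);
}

// Attended summary s = sum_t alpha_t * h_t (length hidden_dim).
Vec attend(const Mat& H, const AttentionParams& p) {
    const Vec alpha = attention_weights(H, p);
    Vec s(H.empty() ? 0 : H[0].size(), 0.0);
    for (std::size_t t = 0; t < H.size(); ++t) {
        for (std::size_t k = 0; k < H[t].size(); ++k) {
            s[k] += alpha[t] * H[t][k];
        }
    }
    return s;
}
```

A quick sanity check for a sketch like this: with `W` and `b` set to zero, every score is zero, so the weights should be uniform and `attend` should return the mean of the hidden states.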