[ https://issues.apache.org/jira/browse/OPENNLP-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17705996#comment-17705996 ]
ASF GitHub Bot commented on OPENNLP-1442:
-----------------------------------------

jzonthemtn commented on code in PR #523:
URL: https://github.com/apache/opennlp/pull/523#discussion_r1150597785

##########
opennlp-dl/README.md:
##########
@@ -4,44 +4,50 @@ This module provides OpenNLP interface implementations for ONNX models using the

 **Important**: This does not provide the ability to train models. Model training is done outside of OpenNLP. This code provides the ability to use ONNX models from OpenNLP.

-To build with example models, download the models to the `/src/test/resources` directory. (These are the exported models described below.)
+Models used in the tests are available in the opennlp evaluation test data.

-```
-
-export OPENNLP_DATA=/tmp/
-mkdir /tmp/dl-doccat /tmp/dl-namefinder
+## NameFinderDL

-# Document categorizer model
-wget https://www.dropbox.com/s/n9uzs8r4xm9rhxb/model.onnx?dl=0 -O $OPENNLP_DATA/dl-doccat/model.onnx
-wget https://www.dropbox.com/s/aw6yjc68jw0jts6/vocab.txt?dl=0 -O $OPENNLP_DATA/dl-doccat/vocab.txt
+* Export a Huggingface NER model to ONNX, e.g.:

-# Namefinder model
-wget https://www.dropbox.com/s/zgogq65gs9tyfm1/model.onnx?dl=0 -O $OPENNLP_DATA/dl-namefinder/model.onnx
-wget https://www.dropbox.com/s/3byt1jggly1dg98/vocab.txt?dl=0 -O $OPENNLP_DATA/dl-/namefinder/vocab.txt
+```
+python -m transformers.onnx --model=dslim/bert-base-NER --feature token-classification exported
 ```

-## TokenNameFinder
+## DocumentCategorizerDL

-* Export a Huggingface NER model to ONNX, e.g.:
+* Export a Huggingface classification (e.g. sentiment) model to ONNX, e.g.:

 ```
-python -m transformers.onnx --model=dslim/bert-base-NER --feature token-classification exported
+python -m transformers.onnx --model=nlptown/bert-base-multilingual-uncased-sentiment --feature sequence-classification exported
 ```

-* Copy the exported model to `src/test/resources/namefinder/model.onnx`.
-* Copy the model's [vocab.txt](https://huggingface.co/dslim/bert-base-NER/tree/main) to `src/test/resources/namefinder/vocab.txt`.
+## SentenceVectors

-Now you can run the tests in `NameFinderDLTest`.
+* Convert a sentence vectors model to ONNX, e.g.:

-## DocumentCategorizer
-
-* Export a Huggingface classification (e.g. sentiment) model to ONNX, e.g.:
+Install dependencies:

 ```
-python -m transformers.onnx --model=nlptown/bert-base-multilingual-uncased-sentiment --feature sequence-classification exported
+python3 -m pip install optimum onnx onnxruntime
+```
+
+Convert the model:
+
 ```

Review Comment:
I added those.

> Use ONNX Runtime to support sentence-transformers
> -------------------------------------------------
>
> Key: OPENNLP-1442
> URL: https://issues.apache.org/jira/browse/OPENNLP-1442
> Project: OpenNLP
> Issue Type: Task
> Components: Deep Learning
> Reporter: Jeff Zemerick
> Assignee: Jeff Zemerick
> Priority: Major
>
> Use ONNX Runtime to support sentence-transformers. OpenNLP should be able to
> generate embeddings using an ONNX model.

--
This message was sent by Atlassian Jira (v8.20.10#820010)