Hi Jerome,

See the ref guide[1] for a writeup of how to enable uploading files larger than 1MB into ZooKeeper.
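
As a rough sketch of what that writeup describes: the limit is controlled by the jute.maxbuffer system property, which has to be raised on every ZooKeeper node and on every Solr node. The 10 MB value and the file locations below are assumptions for illustration (they are the usual defaults, but your install may differ), so please check them against the guide for your version.

```shell
# On each Solr node, in solr.in.sh (assumed location: bin/solr.in.sh).
# 10485760 bytes = 10 MB; an assumed example value, size it to your largest model file.
SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=10485760"

# On each ZooKeeper node, in conf/zookeeper-env.sh (create the file if it is missing).
# Use the same value as on the Solr side.
JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=10485760"
```

Both ZooKeeper and Solr need a restart to pick up the new value.
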
Local storage should also work - have you tried placing OpenNLP model files in ${solr.solr.home}/lib/ ? - make sure you do the same on each node.

[1] https://lucene.apache.org/solr/guide/7_4/setting-up-an-external-zookeeper-ensemble.html#increasing-the-file-size-limit

--
Steve
www.lucidworks.com

> On Jul 9, 2018, at 12:50 AM, Jerome Yang <jey...@pivotal.io> wrote:
>
> Hi guys,
>
> In Solrcloud mode, where to put the OpenNLP models?
> Upload to zookeeper?
> As I test on solr 7.3.1, seems absolute path on local host is not working.
> And can not upload into zookeeper if the model size exceed 1M.
>
> Regards,
> Jerome
>
> On Wed, Apr 18, 2018 at 9:54 AM Steve Rowe <sar...@gmail.com> wrote:
>
>> Hi Alexey,
>>
>> First, thanks for moving the conversation to the mailing list. Discussion
>> of usage problems should take place here rather than in JIRA.
>>
>> I locally set up Solr 7.3 similarly to you and was able to get things to
>> work.
>>
>> Problems with your setup:
>>
>> 1. Your update chain is missing the Log and Run update processors at the
>> end (I see these are missing from the example in the javadocs for the
>> OpenNLP NER update processor; I’ll fix that):
>>
>>     <processor class="solr.LogUpdateProcessorFactory" />
>>     <processor class="solr.RunUpdateProcessorFactory" />
>>
>> The Log update processor isn’t strictly necessary, but, from
>> <https://lucene.apache.org/solr/guide/7_3/update-request-processors.html#custom-update-request-processor-chain>:
>>
>>     Do not forget to add RunUpdateProcessorFactory at the end of any
>>     chains you define in solrconfig.xml. Otherwise update requests
>>     processed by that chain will not actually affect the indexed data.
>>
>> 2. Your example document is missing an “id” field.
>>
>> 3. For whatever reason, the pre-trained model "en-ner-person.bin" doesn’t
>> extract anything from text “This is Steve Jobs 2”. It will extract “Steve
>> Jobs” from text “This is Steve Jobs in white” e.g. though.
>>
>> 4. (Not a problem necessarily) You may want to use a multi-valued “string”
>> field for the “dest” field in your update chain, e.g. “people_str” (“*_str”
>> in the default configset is so configured).
>>
>> --
>> Steve
>> www.lucidworks.com
>>
>>> On Apr 17, 2018, at 8:23 AM, Alexey Ponomarenko <alex1989s...@gmail.com>
>>> wrote:
>>>
>>> Hi once more I am trying to implement named entities extraction using this
>>> manual
>>> https://lucene.apache.org/solr/7_3_0//solr-analysis-extras/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.html
>>>
>>> I am modified solrconfig.xml like this:
>>>
>>> <updateRequestProcessorChain name="multiple-extract">
>>>   <processor class="solr.OpenNLPExtractNamedEntitiesUpdateProcessorFactory">
>>>     <str name="modelFile">opennlp/en-ner-person.bin</str>
>>>     <str name="analyzerFieldType">text_opennlp</str>
>>>     <str name="source">description_en</str>
>>>     <str name="dest">content</str>
>>>   </processor>
>>> </updateRequestProcessorChain>
>>>
>>> But when I was trying to add data using:
>>>
>>> *request:*
>>>
>>> POST
>>> http://localhost:8983/solr/numberplate/update?version=2.2&wt=xml&update.chain=multiple-extract
>>>
>>> <add><doc><field name="description_en">This is Steve Jobs 2
>>> </field><field name="content_pos">This is text 2</field><field
>>> name="content">This is text for content 2</field></doc></add>
>>>
>>> *response*
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <response>
>>> <lst name="responseHeader">
>>> <int name="status">0</int>
>>> <int name="QTime">3</int>
>>> </lst>
>>> </response>
>>>
>>> But I don't see any data inserted to *content* field and in any other
>>> field.
>>>
>>> *If you need some additional data I can provide it.*
>>>
>>> Can you help me? What have I done wrong?
>>
>>
>
> --
> Pivotal Greenplum | Pivotal Software, Inc. <https://pivotal.io/>