Creating a phrase match feature in LTR

2020-08-24 Thread krishan goyal
Hi,

I am trying to create a phrase match feature (what "pf" does in
dismax/edismax parsers)

I've tried various ways to set it up

{
  "name": "phraseMatch",
  "class": "org.apache.solr.ltr.feature.SolrFeature",
  "params": {
"q": "{!complexphrase inOrder=true}query(fieldName:${input})"
  },
  "store": "_DEFAULT_"
}

This fails with the exception

Exception from createWeight for SolrFeature [name=phraseMatch,
params={q={!complexphrase inOrder=true}query(fieldName:${input})}] null

But a similar query works when used in the query reranking construct with
these params:

rqq: "{!complexphrase inOrder=true v=$v1}",
v1: "query(fieldName:"some text"~2^1.0,0)",

What is the problem in the LTR configuration for the feature?
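
One hedged sketch of an alternative, not a verified fix: query(...) is function
query syntax, which the complexphrase parser most likely does not interpret, and
the ${input} macro is only filled in when "input" is passed as external feature
information at rerank time (e.g. efi.input='some text'). Something along these
lines may be closer to what the parser expects:

{
  "name": "phraseMatch",
  "class": "org.apache.solr.ltr.feature.SolrFeature",
  "params": {
    "q": "{!complexphrase inOrder=true}fieldName:\"${input}\""
  },
  "store": "_DEFAULT_"
}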


RE: PDF extraction using Tika

2020-08-24 Thread Srinivas Kashyap
Hi Alexandre,

Yes, these are the same PDF files running on Windows and Linux. There are
around 30 PDF files, and I tried indexing a single file but faced the same
error. Is it related to how the PDFs are stored on Linux?

And with regard to DIH and Tika going away, can you share any program which
extracts content from PDFs and pushes it into Solr?

Thanks,
Srinivas Kashyap

-Original Message-
From: Alexandre Rafalovitch  
Sent: 24 August 2020 20:54
To: solr-user 
Subject: Re: PDF extraction using Tika

The issue seems to be more with a specific file and at the level way below 
Solr's or possibly even Tika's:
Caused by: java.io.IOException: expected='>' actual='
' at offset 2383
at
org.apache.pdfbox.pdfparser.BaseParser.readExpectedChar(BaseParser.java:1045)

Are you indexing the same files on Windows and Linux? I am guessing not. I 
would try to narrow down which of the files it is. One way could be to get a 
standalone Tika (make sure to match the version Solr
embeds) and run it over the documents by itself. It will probably complain with 
the same error.

Regards,
   Alex.
P.s. Additionally, both DIH and Embedded Tika are not recommended for 
production. And both will be going away in future Solr versions. You may have a 
much less brittle pipeline if you save the structured outputs from those Tika 
standalone runs and then index them into Solr, possibly pre-processed.

On Mon, 24 Aug 2020 at 11:09, Srinivas Kashyap 
 wrote:
>
> Hello,
>
> We are using TikaEntityProcessor to extract the content out of PDFs and make 
> the content searchable.
>
> When Jetty is run on a Windows-based machine, we are able to successfully load 
> documents using a DIH full import (Tika entity). Here the PDFs are maintained in 
> the Windows file system.
>
> But when Jetty/Solr is run on a Linux machine and we try to run DIH, we 
> are getting the below exception (here the PDFs are maintained in the Linux 
> filesystem):
>
> Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read 
> content Processing Document # 1
> at 
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:271)
> at 
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
> at 
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
> at 
> org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: 
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read 
> content Processing Document # 1
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:417)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
> ... 4 more
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
> Unable to read content Processing Document # 1
> at 
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
> at 
> org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:171)
> at 
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:517)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
> ... 6 more
> Caused by: org.apache.tika.exception.TikaException: Unable to extract PDF 
> content
> at 
> org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:139)
> at 
> org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:172)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
> at 
> org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:165)
> ... 10 more
> Caused by: java.io.IOException: expected='>' actual='
> ' at offset 2383
> at 
> org.apache.pdfbox.pdfparser.BaseParser.readExpectedChar(BaseParser.java:1045)
> at 
> org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:226)
> at 
> org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.

Re: Solr 8.6.1: Can't round-trip nested document from SolrJ

2020-08-24 Thread Alexandre Rafalovitch
I guess this gets to the question of whether "children" (or whatever
field is used for child documents) actually needs to be in the schema.
Schemaless mode creates one, but that's not a defining factor. If it
needs to be in the schema, then the code should reflect its cardinality;
but if it does not, then all bets are off.

Regards,
   Alex.
P.s. I added this question to SOLR-12298, as I don't think I know
enough about this part of Solr to judge.

On Mon, 24 Aug 2020 at 02:28, Munendra S N  wrote:
>
> >
> > Interestingly, I was forced to add children as an array even when the
> > child was alone and the field was already marked multivalued. It seems
> > the code does not do conversion to a multi-value type, which means the
> > query code has to be a lot more careful about checking field return
> > type and having multi-path handling. That's not what Solr does for
> > string class (tested). Is that a known issue?
> >
> > https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java#L88-L89
>
> Not sure about this. Maybe we need to check on the dev list or Slack.
>
>  If I switch commented/uncommented lines around, the retrieval will fail
> > part way through, because one 'children' field is returned as array, but
> > not the other one:
>
> This might be because of these checks
> https://github.com/apache/lucene-solr/blob/e1392c74400d74366982ccb796063ffdcef08047/solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformer.java#L201-L209
> but
> not sure
>
> Regards,
> Munendra S N
>
>
>
> On Sun, Aug 23, 2020 at 7:53 PM Alexandre Rafalovitch 
> wrote:
>
> > Thank you Munendra,
> >
> > That was very helpful. I am looking forward to that documentation Jira
> > to be merged into the next release.
> >
> > I was able to get the example working by switching away from anonymous
> > children to the field approach, which means the hasChildren() call also
> > did not work. It seems addChildren/hasChildren will need a
> > different schema, without _nest_path_ defined. I did not test.
> >
> > Interestingly, I was forced to add children as an array even when the
> > child was alone and the field was already marked multivalued. It seems
> > the code does not do conversion to a multi-value type, which means the
> > query code has to be a lot more careful about checking field return
> > type and having multi-path handling. That's not what Solr does for
> > string class (tested). Is that a known issue?
> >
> > https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java#L88-L89
> >
> > If I switch commented/uncommented lines around, the retrieval will
> > fail part way through, because one 'children' field is returned as
> > array, but not the other one:
> >
> > {responseHeader={status=0,QTime=0,params={q=id:p1,fl=*,[child],wt=javabin,version=2}},response={numFound=1,numFoundExact=true,start=0,docs=[SolrDocument{id=p1,
> > name=[parent1], class=[foo.bar.parent1.1, foo.bar.parent1.2],
> > _version_=1675826293154775040, children=[SolrDocument{id=c1,
> > name=[child1], class=[foo.bar.child1], _version_=1675826293154775040,
> > children=SolrDocument{id=gc1, name=[grandChild1],
> > class=[foo.bar.grandchild1], _version_=1675826293154775040}},
> > SolrDocument{id=c2, name=[child2], class=[foo.bar.child2],
> > _version_=1675826293154775040}]}]}}
> >
> > Regards,
> >Alex.
> >
> > On Sun, 23 Aug 2020 at 01:38, Munendra S N 
> > wrote:
> > >
> > > Hi Alex,
> > >
> > > Currently, Fixing the documentation for nested docs is under progress.
> > More
> > > context is available in this JIRA -
> > > https://issues.apache.org/jira/browse/SOLR-14383.
> > >
> > >
> > https://github.com/arafalov/SolrJTest/blob/master/src/com/solrstart/solrj/Main.java
> > >
> > > The child doc transformer needs to be specified as part of the fl parameter,
> > > like fl=*,[child], so that the descendants are returned for each matching
> > > doc. As the query q=* matches all the documents, they are all returned. If
> > > only the parent doc needs to be returned with descendants, then we should
> > > either use a block join query or a query clause which matches only the
> > > parent doc.
> > >
> > > Another thing I noticed in the code is that the child docs are indexed as
> > > anonymous docs (similar to the old syntax) instead of being indexed in the
> > > new syntax. With this, the nested block will be indexed, but since the schema
> > > has _nest_path_ defined, the [child] doc transformer won't return any docs.
> > > Anonymous child docs need a parentFilter, but specifying parentFilter with
> > > _nest_path_ defined will lead to an error.
> > > It is due to this check -
> > > https://github.com/apache/lucene-solr/blob/1c8f4c988a07b08f83d85e27e59b43eed5e2ca2a/solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java#L104
> > >
> > > Instead of indexing the docs this way,
> > >
> > > > SolrInputDocument parent1 = new SolrInputDocument();
> > > > parent1.addField("id", "p1");
> > > > parent1.addF
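
The quoted code is cut off above. As a hedged sketch only (not necessarily what
the original message went on to show), the "new syntax" attaches child documents
under a regular field instead of addChildDocument(), so that _nest_path_ can be
populated; "client" and the collection name below are placeholders:

SolrInputDocument parent1 = new SolrInputDocument();
parent1.addField("id", "p1");
parent1.addField("name", "parent1");

SolrInputDocument child1 = new SolrInputDocument();
child1.addField("id", "c1");
child1.addField("name", "child1");

// the child doc is nested under the "children" field of the parent
parent1.addField("children", java.util.Arrays.asList(child1));

client.add("myCollection", parent1);
client.commit("myCollection");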

Re: Solr with HDFS configuration example running in production/dev

2020-08-24 Thread Joe Obernberger

Are you running with solr.lock.type=hdfs?

Have you defined your DirectoryFactory - something like:


<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
    <bool name="solr.hdfs.blockcache.enabled">true</bool>
    <bool name="solr.hdfs.blockcache.global">true</bool>
    <int name="solr.hdfs.blockcache.slab.count">43</int>
    <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
    <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
    <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
    <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
    <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">128</int>
    <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">1024</int>
    <str name="solr.hdfs.home">hdfs://nameservice1:8020/solr8.2.0</str>
    <str name="solr.hdfs.confdir">/etc/hadoop/conf.cloudera.hdfs1</str>
</directoryFactory>

-Joe
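
For reference, the equivalent settings can also be supplied as startup system
properties instead of in solrconfig.xml. This is only a sketch based on the
"Running Solr on HDFS" documentation, reusing the placeholder values above:

bin/solr start -c \
  -Dsolr.directoryFactory=HdfsDirectoryFactory \
  -Dsolr.lock.type=hdfs \
  -Dsolr.hdfs.home=hdfs://nameservice1:8020/solr8.2.0 \
  -Dsolr.hdfs.confdir=/etc/hadoop/conf.cloudera.hdfs1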

On 8/20/2020 2:30 AM, Prashant Jyoti wrote:

Hi Joe,
These are the errors I am running into:

org.apache.solr.common.SolrException: Error CREATEing SolrCore 
'newcollsolr2_shard1_replica_n1': Unable to create core 
[newcollsolr2_shard1_replica_n1] Caused by: Illegal char <:> at index 
4: 
hdfs://hn1-pjhado.tvbhpqtgh3judk1e5ihrx2k21d.tx.internal.cloudapp.net:8020/user/solr-data/newcollsolr2/core_node3/data\ 


at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1256)
at 
org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:93)
at 
org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:362)
at 
org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:397)
at 
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:181)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:211)

at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:842)
at 
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:808)

at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:559)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:420)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:352)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1607)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1577)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
at 
org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)

at org.eclipse.jetty.server.Server.handle(Server.java:500)
at 
org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)

at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)

at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
at 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)

at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.common.SolrException: Unable to create core 
[n

Re: How to Write Autoscaling Policy changes to Zookeeper/SolrCloud using the autoscaling Java API

2020-08-24 Thread Howard Gonzalez
Good morning! To add more context to the question: I can successfully use the
Java API to build the list of new Clauses. However, the problem is that I don't
know how to "write" those changes back to Solr using the Java API. I see there's
a writeMap method in the Policy class, but I can't find how to use it.

Thanks in advance


From: Howard Gonzalez 
Sent: Friday, August 21, 2020 12:45 PM
To: solr-user@lucene.apache.org 
Subject: How to Write Autoscaling Policy changes to Zookeeper/SolrCloud using 
the autoscaling Java API

Hello. I am trying to use the autoscaling Java API to write some cluster policy 
changes to a Zookeeper/SolrCloud cluster. However, I can't find the right way 
to do it. I can get all the autoscaling cluster policy clauses using:

autoScalingConfig.getPolicy().getClusterPolicy()

However, after getting all the right List of clauses, I don't know how to write 
those changes to the Zookeeper/Solr cluster using the Java API.

Any guidance please? I know I can use the HTTP solr client to send a json 
request, but just wondering how to do it using the provided Java API.

Thanks in advance
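
One possible route, sketched here only and not verified against a specific Solr
release: the autoscaling API accepts a set-cluster-policy command, and SolrJ's
generic V2Request can POST that payload, so the rebuilt clauses can be
serialized to JSON and written back that way. The ZooKeeper address and the
example clause below are placeholders:

import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.V2Request;

public class SetClusterPolicy {
  public static void main(String[] args) throws Exception {
    // JSON body for the autoscaling API; the clause itself is only illustrative.
    String payload = "{ \"set-cluster-policy\" : ["
        + " {\"replica\": \"<2\", \"shard\": \"#EACH\", \"node\": \"#ANY\"}"
        + " ] }";

    try (CloudSolrClient client = new CloudSolrClient.Builder(
            Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
      // POST the command to the v2 autoscaling endpoint.
      new V2Request.Builder("/cluster/autoscaling")
          .withMethod(SolrRequest.METHOD.POST)
          .withPayload(payload)
          .build()
          .process(client);
    }
  }
}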


Re: Apache Solr 8.6.0 with SSL

2020-08-24 Thread Jan Høydahl
I think you’re experiencing this:

https://issues.apache.org/jira/browse/SOLR-14711

No idea why the bin/solr script won’t work with SSL...

Jan

> 24. aug. 2020 kl. 15:52 skrev Patrik Peng :
> 
> Greetings
> 
> I'm in the process of setting up a SolrCloud cluster with 3 Zookeeper
> and 3 Solr nodes on FreeBSD and wish to enable SSL between the Solr nodes.
> Before enabling SSL, everything worked as expected and I followed the
> instructions described in the Solr 8.6 docs. But after
> enabling SSL, the solr command line utility stopped working for various
> tasks.
> 
> For example:
> 
> $ /usr/local/solr/bin/solr status
> 
> Found 1 Solr nodes:
> 
> Solr process 974 from /var/db/solr/solr-8983.pid not found.
> 
> $ /usr/local/solr/bin/solr create_collection -c test
> Failed to determine the port of a local Solr instance, cannot create test!
> 
> Also the following line appears in the logfile even though SSL is enabled:
> 
> 2020-08-24 15:29:52.612 WARN  (main) [   ] o.a.s.c.CoreContainer Solr 
> authentication is enabled, but SSL is off.  Consider enabling SSL to protect 
> user credentials and data with encryption.
> 
> Apart from these oddities, the cluster is working fine and dandy. The
> dashboard is available via HTTPS and the nodes can communicate via SSL.
> 
> Does anyone have any clue what's causing this? Any help would be
> appreciated.
> 
> Regards
> Patrik
> 



Re: PDF extraction using Tika

2020-08-24 Thread Alexandre Rafalovitch
The issue seems to be more with a specific file and at the level way
below Solr's or possibly even Tika's:
Caused by: java.io.IOException: expected='>' actual='
' at offset 2383
at
org.apache.pdfbox.pdfparser.BaseParser.readExpectedChar(BaseParser.java:1045)

Are you indexing the same files on Windows and Linux? I am guessing
not. I would try to narrow down which of the files it is. One way
could be to get a standalone Tika (make sure to match the version Solr
embeds) and run it over the documents by itself. It will probably
complain with the same error.
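
As a concrete sketch of that check (paths are placeholders; pick a tika-app jar
whose version matches the Tika jars shipped under solr/contrib/extraction/lib):

for f in /path/to/pdfs/*.pdf; do
  # parse each PDF with standalone Tika; any parser stack trace lands in the matching .err file
  java -jar tika-app-<version>.jar --text "$f" > "${f%.pdf}.txt" 2> "${f%.pdf}.err"
done
# non-empty .err files point at the PDFs that PDFBox is choking on
find /path/to/pdfs -name '*.err' -size +0

The .txt outputs are also the kind of standalone extraction results, mentioned
below, that could be indexed into Solr separately.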

Regards,
   Alex.
P.s. Additionally, both DIH and Embedded Tika are not recommended for
production. And both will be going away in future Solr versions. You
may have a much less brittle pipeline if you save the structured
outputs from those Tika standalone runs and then index them into Solr,
possibly pre-processed.

On Mon, 24 Aug 2020 at 11:09, Srinivas Kashyap
 wrote:
>
> Hello,
>
> We are using TikaEntityProcessor to extract the content out of PDFs and make 
> the content searchable.
>
> When Jetty is run on a Windows-based machine, we are able to successfully load 
> documents using a DIH full import (Tika entity). Here the PDFs are maintained in 
> the Windows file system.
>
> But when Jetty/Solr is run on a Linux machine and we try to run DIH, we are 
> getting the below exception (here the PDFs are maintained in the Linux filesystem):
>
> Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read 
> content Processing Document # 1
> at 
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:271)
> at 
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
> at 
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
> at 
> org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: 
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read 
> content Processing Document # 1
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:417)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
> ... 4 more
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
> Unable to read content Processing Document # 1
> at 
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
> at 
> org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:171)
> at 
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:517)
> at 
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
> ... 6 more
> Caused by: org.apache.tika.exception.TikaException: Unable to extract PDF 
> content
> at 
> org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:139)
> at 
> org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:172)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
> at 
> org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:165)
> ... 10 more
> Caused by: java.io.IOException: expected='>' actual='
> ' at offset 2383
> at 
> org.apache.pdfbox.pdfparser.BaseParser.readExpectedChar(BaseParser.java:1045)
> at 
> org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:226)
> at 
> org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:163)
> at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:510)
> at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:477)
> at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150)
> at 
> org.apache.pdfbox.text.LegacyPDFStreamEngine.processPage(LegacyPDFStreamEngine.java:139)
> at 
> org.apache.pdfbox

PDF extraction using Tika

2020-08-24 Thread Srinivas Kashyap
Hello,

We are using TikaEntityProcessor to extract the content out of PDFs and make the 
content searchable.

When Jetty is run on a Windows-based machine, we are able to successfully load 
documents using a DIH full import (Tika entity). Here the PDFs are maintained in 
the Windows file system.

But when Jetty/Solr is run on a Linux machine and we try to run DIH, we are 
getting the below exception (here the PDFs are maintained in the Linux filesystem):

Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: 
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read 
content Processing Document # 1
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:271)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
at 
org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: 
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read 
content Processing Document # 1
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:417)
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
... 4 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
Unable to read content Processing Document # 1
at 
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
at 
org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:171)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:517)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
... 6 more
Caused by: org.apache.tika.exception.TikaException: Unable to extract PDF 
content
at 
org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:139)
at 
org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:172)
at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
at 
org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:165)
... 10 more
Caused by: java.io.IOException: expected='>' actual='
' at offset 2383
at 
org.apache.pdfbox.pdfparser.BaseParser.readExpectedChar(BaseParser.java:1045)
at 
org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:226)
at 
org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:163)
at 
org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:510)
at 
org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:477)
at 
org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150)
at 
org.apache.pdfbox.text.LegacyPDFStreamEngine.processPage(LegacyPDFStreamEngine.java:139)
at 
org.apache.pdfbox.text.PDFTextStripper.processPage(PDFTextStripper.java:391)
at 
org.apache.tika.parser.pdf.PDF2XHTML.processPage(PDF2XHTML.java:147)
at 
org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:319)
at 
org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
at 
org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:117)
... 15 more

Can you please suggest how to extract PDF content on a Linux-based file system?

Thanks,
Srinivas Kashyap


Apache Solr 8.6.0 with SSL

2020-08-24 Thread Patrik Peng
Greetings

I'm in the process of setting up a SolrCloud cluster with 3 Zookeeper
and 3 Solr nodes on FreeBSD and wish to enable SSL between the Solr nodes.
Before enabling SSL, everything worked as expected and I followed the
instructions described in the Solr 8.6 docs. But after
enabling SSL, the solr command line utility stopped working for various
tasks.

For example:

$ /usr/local/solr/bin/solr status

Found 1 Solr nodes:

Solr process 974 from /var/db/solr/solr-8983.pid not found.

$ /usr/local/solr/bin/solr create_collection -c test
Failed to determine the port of a local Solr instance, cannot create test!

Also the following line appears in the logfile even though SSL is enabled:

2020-08-24 15:29:52.612 WARN  (main) [   ] o.a.s.c.CoreContainer Solr 
authentication is enabled, but SSL is off.  Consider enabling SSL to protect 
user credentials and data with encryption.

Apart from these oddities, the cluster is working fine and dandy. The
dashboard is available via HTTPS and the nodes can communicate via SSL.

Does anyone have any clue what's causing this? Any help would be
appreciated.

Regards
Patrik



Re: How to perform keyword (exact_title) match in solr with sow=true

2020-08-24 Thread raj.yadav
Hi Community members,

I tried the following approaches, but none of them worked for my use case.

1. To achieve an exact match in Solr we have to keep sow='false' (Solr will
use field-centric matching mode) and group multiple similar fields into
one copy field. This does solve the recall problem, but we use different
boost values per field, so it hurts our precision. This approach is mentioned
in this search-relevance book
(https://livebook.manning.com/book/relevant-search/chapter-6?origin=product-toc).

2. I tried to perform exact matching using a function query but couldn't
find any supporting function.

Is there any existing solution or patch for performing an exact match with
sow='true'? Please let me know.

Regards,
Raj





Re: Simple query

2020-08-24 Thread Jayadevan Maymala
Thanks.
I just copied the config file under
solr/solr-8.6.0/server/solr/configsets/_default and made minor changes.
I tried the console - I think SKMF is doing it.

Regards,
Jayadevan

On Mon, Aug 24, 2020 at 5:45 PM Dominique Bejean 
wrote:

> Hi,
>
> We need to know how your catch_all field is analyzed at index and search
> time.
>
> I think you are using a stemming filter and "apache" is stemmed as "apach".
> So "apache" and "apach" match the document and not "apac".
> You can use the console in order to see how terms are removed or
> transformed by each filter of the analysis chain for a field or a
> fieldtype.
>
> Regards
>
> Dominique
>
>
> Le lun. 24 août 2020 à 12:01, Jayadevan Maymala 
> a écrit :
>
> > Hi all,
> > I am learning the basics of Solr querying and am not able to figure out
> > something. The first query which searches for 'apac' fetches no
> documents.
> > The second one which searches for 'apach' , i.e. add h - one more
> > character, fetches a document.
> >
> > curl -X GET "
> >
> >
> http://localhost:8983/solr/search_twitter/select?q=apac&df=catch_all&fl=catch_all,score
> > "
> > {
> >   "responseHeader":{
> > "status":0,
> > "QTime":0,
> > "params":{
> >   "q":"apac",
> >   "df":"catch_all",
> >   "fl":"catch_all,score"}},
> >
> >
> >
> "response":{"numFound":0,"start":0,"maxScore":0.0,"numFoundExact":true,"docs":[]
> >   }}
> >
> >
> > curl -X GET "
> >
> >
> http://localhost:8983/solr/search_twitter/select?q=apach&df=catch_all&fl=catch_all,score
> > "
> > {
> >   "responseHeader":{
> > "status":0,
> > "QTime":0,
> > "params":{
> >   "q":"apach",
> >   "df":"catch_all",
> >   "fl":"catch_all,score"}},
> >
> >
> >
> "response":{"numFound":1,"start":0,"maxScore":0.13076457,"numFoundExact":true,"docs":[
> >   {
> > "catch_all":["apache",
> >   "Happy searching!",
> >   "https://lucene.apache.org/solr",
> >   "https://lucene.apache.org"],
> > "score":0.13076457}]
> >   }}
> >
> > Field definition -
> > "name":"catch_all",
> > "type":"text_en",
> > "multiValued":true
> >
> >
> > Neither apac or apach is there in the data.
> >
> > Regards,
> > Jayadevan
> >
>


Re: ZooKeeper 3.4 end of life

2020-08-24 Thread Erick Erickson
I don’t think you’ll find an official EOL announcement. Here’s
a guide, but do note the phrase “on demand” for minor releases.

You should interpret “on demand” as when the developers feel
the issues in the current point-release code base are numerous
enough or critical enough to warrant the effort.

https://cwiki.apache.org/confluence/display/ZOOKEEPER/Roadmap

Best,
Erick

> On Aug 24, 2020, at 8:29 AM, h00452626  wrote:
> 
> Hey man,
> I'm wondering where the announcement is; I searched for the EOL policy of ZK
> but found nothing.
> Can you send me the link to the announcement? I would be very thankful.
> 
> 
> 
> 



Re: ZooKeeper 3.4 end of life

2020-08-24 Thread h00452626
Hey man,
I'm wondering where the announcement is; I searched for the EOL policy of ZK
but found nothing.
Can you send me the link to the announcement? I would be very thankful.






Re: All cores gone along with all solr configuration upon reboot

2020-08-24 Thread Erick Erickson
This is consistent with the data disappearing from Zookeeper due 
to misconfiguration and/or some external process removing it when
you reboot.

So here’s what I’d do next:

Go ahead and reboot. You do _not_ need to start Solr to run bin/solr
scripts, and among them are 

bin/solr zk ls -r / -z path_to_Zookeeper_ensemble

that should dump a listing of the zk tree. Is it what you expect or 
does it mysteriously disappear when you reboot? If so, you must
track down what it is about your environment that’s deleting the
ZK data on reboot.

When Solr comes back up, it’s saying “Look, there’s no collection
information in Zookeeper but there are replicas on disk. That must
mean someone deleted the collections while I was down, so I’ll
clean up”.

bin/solr zk -help

will show you a lot of unix-like commands for poking around Zookeeper
without having to start Solr. There are also GUI tools out there
you can use.
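
For example (a sketch; the ensemble address and collection name are placeholders):

# list the whole ZK tree, then pull down one collection's state for inspection
bin/solr zk ls -r / -z zk1:2181,zk2:2181,zk3:2181
bin/solr zk cp zk:/collections/mycollection/state.json ./state.json -z zk1:2181,zk2:2181,zk3:2181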

Again, it’s a near certainty that
1> your ZK data is disappearing when you reboot
2> something external to Solr is doing it.

Best,
Erick


> On Aug 24, 2020, at 1:11 AM, yaswanth kumar  wrote:
> 
> Hi Erick,
> 
> Here is the latest error that I captured, which seems to be what is actually
> deleting the cores (I did notice that the core folders under the path
> ../solr/server/solr were deleted one by one when the server came back from
> the reboot)
> 
> 2020-08-24 04:41:27.424 ERROR
> (coreContainerWorkExecutor-2-thread-1-processing-n:9.70.170.51:8080_solr) [
>  ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on$
>at
> org.apache.solr.cloud.ZkController.checkStateInZk(ZkController.java:1875)
> 
> *org.apache.solr.cloud.ZkController$NotInClusterStateException:
> coreNodeName core_node3 does not exist in shard shard1, ignore the
> exception if the replica was deleted*at
> org.apache.solr.cloud.ZkController.checkStateInZk(ZkController.java:1875)
> ~[solr-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe -
> ivera - 2019-07-19$
>at
> org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1774)
> ~[solr-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe -
> ivera - 2019-07-19 15$
>at
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1238)
> ~[solr-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe -
> ivera - 201$
>at
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:756)
> ~[solr-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe -
> ivera - 2019-07-19$
>at
> org.apache.solr.core.CoreContainer$$Lambda$343/.call(Unknown
> Source) ~[?:?]
>at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202)
> ~[metrics-core-4.0.5.jar:4.0.5]
>at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> ~[solr-solrj-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e$
>at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$142/.run(Unknown
> Source) ~[?:?]
>at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> ~[?:?]
>at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> ~[?:?]
>at java.lang.Thread.run(Thread.java:834) [?:?]
> 
> For some reason I believe Solr is not able to find the replica in the
> clusterstate and that is causing the delete activity. I am not really sure why
> it is not able to find it in the clusterstate; it looks like the clusterstate
> is getting wiped out first and then the rest of the cores slowly get
> deleted.
> 
> As you asked, I cross-checked the port numbers once again: I am
> using 2181 as the clientPort and that is what I see on the Solr dashboard
> screen for ZKHOST. I am not really sure how I can prevent this going
> forward. One thing here is that I am using the Solr basic authentication plugin,
> if it makes any difference.
> 
> On Sat, Aug 22, 2020 at 11:55 AM Erick Erickson 
> wrote:
> 
>> Autopurge shouldn’t matter, that’s just cleaning up old snapshots. That
>> is, it should be configured, but having it enabled or not should have no
>> bearing on your data disappearing.
>> 
>> Also, are you absolutely certain that you are using your external ZK?
>> Check the port on the admin screen. 9983 is the default for embededded ZK.
>> 
>> All that said, nothing in Solr just deletes all this. The fact that you
>> only saw this on reboot is highly suspicious, some external-to-Solr
>> process, anything from a startup script to restoring a disk image to…. is
>> removing that data I suspect.
>> 
>> Best,
>> Erick
>> 
>>> On Aug 22, 2020, at 9:24 AM, yaswanth kumar 
>> wrote:
>>> 
>>> Thanks Eric for looking into this..
>>> 
>>> But as I said before I confirmed that

Re: Simple query

2020-08-24 Thread Dominique Bejean
Hi,

We need to know how your catch_all field is analyzed at index and search
time.

I think you are using a stemming filter and "apache" is stemmed as "apach".
So "apache" and "apach" match the document and not "apac".
You can use the console in order to see how terms are removed or
transformed by each filter of the analysis chain for a field or a fieldtype.
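
If you prefer a request over the admin UI, the field analysis handler returns
the same information. A sketch, reusing the core name from the earlier messages:

curl "http://localhost:8983/solr/search_twitter/analysis/field?analysis.fieldtype=text_en&analysis.fieldvalue=apache&analysis.query=apac&analysis.showmatch=true"

The response lists the tokens after each stage of the index and query analysis
chains, so you can see where "apache" becomes "apach".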

Regards

Dominique


Le lun. 24 août 2020 à 12:01, Jayadevan Maymala 
a écrit :

> Hi all,
> I am learning the basics of Solr querying and am not able to figure out
> something. The first query which searches for 'apac' fetches no documents.
> The second one which searches for 'apach' , i.e. add h - one more
> character, fetches a document.
>
> curl -X GET "
>
> http://localhost:8983/solr/search_twitter/select?q=apac&df=catch_all&fl=catch_all,score
> "
> {
>   "responseHeader":{
> "status":0,
> "QTime":0,
> "params":{
>   "q":"apac",
>   "df":"catch_all",
>   "fl":"catch_all,score"}},
>
>
> "response":{"numFound":0,"start":0,"maxScore":0.0,"numFoundExact":true,"docs":[]
>   }}
>
>
> curl -X GET "
>
> http://localhost:8983/solr/search_twitter/select?q=apach&df=catch_all&fl=catch_all,score
> "
> {
>   "responseHeader":{
> "status":0,
> "QTime":0,
> "params":{
>   "q":"apach",
>   "df":"catch_all",
>   "fl":"catch_all,score"}},
>
>
> "response":{"numFound":1,"start":0,"maxScore":0.13076457,"numFoundExact":true,"docs":[
>   {
> "catch_all":["apache",
>   "Happy searching!",
>   "https://lucene.apache.org/solr",
>   "https://lucene.apache.org"],
> "score":0.13076457}]
>   }}
>
> Field definition -
> "name":"catch_all",
> "type":"text_en",
> "multiValued":true
>
>
> Neither apac or apach is there in the data.
>
> Regards,
> Jayadevan
>


Re: Solr doesn't run after editing solr.in.sh

2020-08-24 Thread Vincenzo D'Amore
Pay attention to this line:

SOLR_ULIMIT_CHECKS=falseGC_TUNE=" \

You lost a newline after "false". It should be:

SOLR_ULIMIT_CHECKS=false
GC_TUNE=" \

Ciao,
Vincenzo

--
mobile: 3498513251
skype: free.dev

> On 24 Aug 2020, at 01:41, Walter Underwood  wrote:
> 
> Also, what platform is this on and what editor did you use (especially if 
> you are on Windows)?
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Aug 23, 2020, at 4:35 PM, Erick Erickson  wrote:
>> 
>> Well, first show exactly what you uncommented. I doubt you uncommented them 
>> one by one and tried everything, so you leave us guessing. Uncommenting 
>> SOLR_HOME for instance would be shooting yourself in the foot since Solr 
>> wouldn’t know where to start. Uncommenting some of the authorization parameters 
>> without providing the proper values would cause Solr not to run. 
>> Uncommenting #SOLR_OPTS="$SOLR_OPTS -Dsolr.environment=prod” should be fine.
>> 
>> 
>> Second, show us the output when you _do_ try to run. You can use the -f 
>> option to dump logging to the console.
>> 
>> Best,
>> Erick
>> 
 On Aug 23, 2020, at 9:58 AM, Joe Doupnik  wrote:
>>> 
>>> On 22/08/2020 22:08, maciejpreg...@tutanota.com.INVALID wrote:
 Good morning.
 When I uncomment any of the commands in solr.in.sh, Solr doesn't run. What do 
 I have to do to fix the problem?
 Best regards,
 Maciej Pregiel
>>> -
>>>   My approach has been to add local configuration options to the end of the 
>>> file and leave intact the original text. Here is the end of my file, which 
>>> has no changes above this material:
>>> 
>>> #SOLR_SECURITY_MANAGER_ENABLED=false
>>> ## JRD values
>>> SOLR_ULIMIT_CHECKS=falseGC_TUNE=" \
>>> -XX:SurvivorRatio=4 \
>>> -XX:TargetSurvivorRatio=90 \
>>> -XX:MaxTenuringThreshold=8 \
>>> -XX:+UseConcMarkSweepGC \
>>> -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
>>> -XX:+CMSScavengeBeforeRemark \
>>> -XX:PretenureSizeThreshold=64m \
>>> -XX:+UseCMSInitiatingOccupancyOnly \
>>> -XX:CMSInitiatingOccupancyFraction=50 \
>>> -XX:CMSMaxAbortablePrecleanTime=6000 \
>>> -XX:+CMSParallelRemarkEnabled \
>>> -XX:+ParallelRefProcEnabled \
>>> -XX:-OmitStackTraceInFastThrow"
>>> #JRD give more memory
>>> ##SOLR_HEAP="4096m"
>>> SOLR_HEAP="2048m"
>>> ##JRD enlarge this
>>> #SOLR_OPTS="$SOLR_OPTS -Xss512k"
>>> SOLR_OPTS="$SOLR_OPTS -Xss1024k"
>>> SOLR_STOP_WAIT=30
>>> SOLR_JAVA_HOME="/usr/java/latest/"
>>> SOLR_PID_DIR="/home/search/solr"
>>> SOLR_HOME="/home/search/solr/data"
>>> SOLR_LOGS_DIR="/home/search/solr/logs"
>>> SOLR_PORT="8983"
>>> SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=3000"
>>> SOLR_OPTS="$SOLR_OPTS -Dsolr.autoCommit.maxTime=6"
>>> SOLR_OPTS="$SOLR_OPTS -Djava.io.tmpdir=/home/search/tmp"
>>> 
>>>   Thanks,
>>>   Joe D.
>> 
> 


Simple query

2020-08-24 Thread Jayadevan Maymala
Hi all,
I am learning the basics of Solr querying and am not able to figure out
something. The first query which searches for 'apac' fetches no documents.
The second one which searches for 'apach' , i.e. add h - one more
character, fetches a document.

curl -X GET "
http://localhost:8983/solr/search_twitter/select?q=apac&df=catch_all&fl=catch_all,score
"
{
  "responseHeader":{
"status":0,
"QTime":0,
"params":{
  "q":"apac",
  "df":"catch_all",
  "fl":"catch_all,score"}},

"response":{"numFound":0,"start":0,"maxScore":0.0,"numFoundExact":true,"docs":[]
  }}


curl -X GET "
http://localhost:8983/solr/search_twitter/select?q=apach&df=catch_all&fl=catch_all,score
"
{
  "responseHeader":{
"status":0,
"QTime":0,
"params":{
  "q":"apach",
  "df":"catch_all",
  "fl":"catch_all,score"}},

"response":{"numFound":1,"start":0,"maxScore":0.13076457,"numFoundExact":true,"docs":[
  {
"catch_all":["apache",
  "Happy searching!",
  "https://lucene.apache.org/solr",
  "https://lucene.apache.org"],
"score":0.13076457}]
  }}

Field definition -
"name":"catch_all",
"type":"text_en",
"multiValued":true


Neither apac or apach is there in the data.

Regards,
Jayadevan


Re: SOLR Compatibility with Oracle Enterprise Linux 7

2020-08-24 Thread Shawn Heisey

On 8/24/2020 12:46 AM, Wang, Ke wrote:
We are using Apache SOLR version 8.4.4.0. The project is planning to 
upgrade the Linux server from Oracle Enterprise Linux (Red Hat 
Enterprise Linux) 6 to OEL 7. As I was searching on the Confluence 
page and was not able to find the information, can I please confirm 
if: * Apache SOLR 8.4.4.0 is compatible with Oracle Enterprise Linux 
(Red Hat Enterprise Linux) 7? Please let me know if any further 
information is required.


There is no 8.4.4.0 version of Solr.  The closest versions to that are 
8.4.0 and 8.4.1.  If you are seeing 8.4.4.0 as the version, that must 
have come from somewhere other than this project.


The only concrete system requirement for Solr is Java. Solr 8.x has a 
requirement of Java 8 or later.  If Java is available for the OS, then 
Solr should work on that OS.  I am pretty sure that Oracle Linux has 
Java available.
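
A quick check on the OEL 7 host before installing (shown only for completeness;
how Java is packaged and installed is up to you):

java -version    # Solr 8.x needs Java 8 or newer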


Thanks,
Shawn



Re: SOLR Compatibility with Oracle Enterprise Linux 7

2020-08-24 Thread Jörn Franke
Yes, there should be no issues upgrading to RHEL 7.
I assume you mean Solr 8.4.0. You can also use the latest Solr version.

Why not RHEL 8?


> On 24.08.2020 at 09:02, Wang, Ke wrote:
> 


SOLR Compatibility with Oracle Enterprise Linux 7

2020-08-24 Thread Wang, Ke
Hi there,

We are using Apache SOLR version 8.4.4.0. The project is planning to upgrade 
the Linux server from Oracle Enterprise Linux (Red Hat Enterprise Linux) 6 to 
OEL 7. I searched the Confluence page and was not able to find this 
information, so can I please confirm:

  *   Apache SOLR 8.4.4.0 is compatible with Oracle Enterprise Linux (Red Hat 
Enterprise Linux) 7?

Please let me know if any further information is required.

Thank you in advance for your help.

Regards,

Ke



