Re: [ANNOUNCE] Apache Solr 8.7.0 released

2020-11-09 Thread David Smiley
FYI an updated Docker image was just published a few hours ago:
https://hub.docker.com/_/solr

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Nov 4, 2020 at 9:06 AM Atri Sharma  wrote:

> 3/11/2020, Apache Solr™ 8.7 available
>
> The Lucene PMC is pleased to announce the release of Apache Solr 8.7.0.
>
> Solr is the popular, blazing fast, open source NoSQL search platform
> from the Apache Lucene project. Its major features include powerful
> full-text search, hit highlighting, faceted search and analytics, rich
> document parsing, geospatial search, extensive REST APIs as well as
> parallel SQL. Solr is enterprise grade, secure and highly scalable,
> providing fault tolerant distributed search and indexing, and powers
> the search and navigation features of many of the world's largest
> internet sites.
>
>
> The release is available for immediate download at:
>
>
> https://lucene.apache.org/solr/downloads.html
>
>
> Please read CHANGES.txt for a detailed list of changes:
>
>
> https://lucene.apache.org/solr/8_7_0/changes/Changes.html
>
>
> Solr 8.7.0 Release Highlights
>
>
> SOLR-14588 -- Circuit Breakers Infrastructure and Real JVM Based Circuit
> Breaker
>
>
> SOLR-14615 -- CPU Based Circuit Breaker
>
>
> SOLR-14537 -- Improve performance of ExportWriter
>
>
> SOLR-14651 -- The MetricsHistoryHandler Can Be Disabled
>
>
> A summary of important changes is published in the Solr Reference
> Guide at https://lucene.apache.org/solr/guide/8_7/solr-upgrade-notes.html.
> For the most exhaustive list, see the full release notes at
> https://lucene.apache.org/solr/8_7_0/changes/Changes.html or by
> viewing the CHANGES.txt file accompanying the distribution.  Solr's
> release notes usually don't include Lucene layer changes.  Lucene's
> release notes are at
> https://lucene.apache.org/core/8_7_0/changes/Changes.html
>
>
> Note: The Apache Software Foundation uses an extensive mirroring network for
> distributing releases. It is possible that the mirror you are using may not
> have replicated the release yet. If that is the case, please try another
> mirror.
>
> This also applies to Maven access.
>
>
> --
> Regards,
>
> Atri
> Apache Concerted
>


Re: How to reflect changes of solrconfig.xml to all the cores without causing any conflict

2020-11-09 Thread Shawn Heisey

On 11/9/2020 5:44 AM, raj.yadav wrote:

*Question:*
Since a reload is not done, none of the replicas (including the leader) will
have the updated solrconfig. If we restart a replica and it tries to sync up
with the leader, will it pick up the latest solrconfig changes, or will it
stay the same as the leader?





Solr collection details:
A single collection with 6 shards; each VM hosts a single replica.
Collection size: 60 GB (each shard is 10 GB)
Average doc size: 1.0 KB


If you restart Solr, it is effectively the same thing as reloading all 
cores on that Solr instance.


Your description (use of the terms "collection" and "shards") suggests 
that you're running in SolrCloud mode.  If you are, then modifying 
solrconfig.xml on the disk will change nothing.  You need to modify the 
solrconfig.xml that lives in ZooKeeper, or re-upload the changes to ZK. 
 Is that what you're doing?  After that, to make any changes effective, 
you have to reload the collection or restart the correct Solr instances.
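
If it helps, here is a rough SolrJ sketch of that upload-and-reload sequence
(just a sketch; the ZK address, config directory, configset name, and
collection name below are placeholders you would replace with your own):

    import java.nio.file.Paths;
    import java.util.Collections;
    import java.util.Optional;

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;
    import org.apache.solr.common.cloud.ZkConfigManager;

    public class ReloadAfterConfigChange {
      public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
            Collections.singletonList("zkhost:2181"), Optional.empty()).build()) {
          client.connect();

          // Re-upload the edited configset (solrconfig.xml and friends) to ZooKeeper
          new ZkConfigManager(client.getZkStateReader().getZkClient())
              .uploadConfigDir(Paths.get("/path/to/conf"), "my_configset");

          // Make the change effective by reloading the collection
          CollectionAdminRequest.reloadCollection("my_collection").process(client);
        }
      }
    }

The same two steps can also be done from the command line with "bin/solr zk
upconfig" and the Collections API RELOAD action.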


I cannot tell you exactly what will happen as far as SolrCloud index 
synchronization, because I know nothing about your setup.  If the 
follower replica type is TLOG or PULL, then the index will be an exact 
copy of the leader's index.  With NRT, all replicas will independently 
index the data.


Thanks,
Shawn


intermittent log rotation bug in Solr8.5.2?

2020-11-09 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We are in the process of upgrading from Solr 7.4.0 to Solr 8.5.2, and we are
experiencing an intermittent problem with log rotation.

Our log4j2.xml file (both for Solr 7.4.0 and for Solr 8.5.2) includes the
following:

  <PatternLayout>
    <Pattern>
      %d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n
    </Pattern>
  </PatternLayout>
  ...
  <DefaultRolloverStrategy max="${sys:log.backup.index}"/>

When we start our Solr nodes, we pass in "-Dlog.backup.index=99"

Last month, we had a problem in that, on those nodes which had been upgraded
to Solr 8.5.2, only the two most recent solr.log files were being retained.
That problem went away in the middle of last month, but has reappeared over
the weekend. Log rotation has consistently been correct on those nodes which
are still running Solr 7.4.0.

Has anyone else encountered any similar phenomenon? Has Solr8 introduced some 
limitation on DefaultRolloverStrategy max which varies by the day of the month? 
(or by the phase of the moon? Or something else?)


Using Multiple collections with streaming expressions

2020-11-09 Thread ufuk yılmaz
For example the streaming expression significantTerms:

https://lucene.apache.org/solr/guide/8_4/stream-source-reference.html#significantterms


significantTerms(collection1,
 q="body:Solr",
 field="author",
 limit="50",
 minDocFreq="10",
 maxDocFreq=".20",
 minTermLength="5")

Solr supports querying multiple collections at once, but I can’t figure out
how I can do that with streaming expressions.
When I try enclosing them in quotes like:

significantTerms(“collection1, collection2”,
 q="body:Solr",
 field="author",
 limit="50",
 minDocFreq="10",
 maxDocFreq=".20",
 minTermLength="5")

It gives the error: "EXCEPTION":"java.io.IOException: Slices not found for \" 
collection1, collection2\""
I think Solr treats the quotes as part of the collection names, hence it can’t
find slices for them.

When I just use it without quotes:
significantTerms(collection1, collection2,…
It gives the error: "EXCEPTION":"invalid expression 
significantTerms(collection1, collection2, …

I tried single quotes and escaping the quotation marks, but nothing works…

Any ideas?

Best, ufuk

Sent from Mail for Windows 10



Re: Unable to finish sending updates - Solr 8.5.0

2020-11-09 Thread raj.yadav
Hey Scott,
We have also recently migrated to Solr 8.5.2 and are facing a similar issue.
Were you able to resolve it?



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: can't connect to SOLR with JDBC url

2020-11-09 Thread Vincent Bossuet
Thank you Kevin, I can now connect (I tested with DbVisualizer) with the -c
option.
Vincent


Le lun. 9 nov. 2020 à 16:30, Kevin Risden  a écrit :

> >
> > start (without option : bin/solr start)
>
>
> Solr SQL/JDBC requires SolrCloud (running with ZooKeeper), since streaming
> expressions (which back the Solr SQL support) require it.
>
> You should be able to start Solr this way to get Solr in cloud mode.
>
> bin/solr start -c
>
> If you use the above to start Solr, the embedded ZK is on localhost:9983 so
> the JDBC connection string should be:
>
> jdbc:solr://localhost:9983?collection=test
>
> Assuming your collection name is test.
>
> Kevin Risden
>
>
> On Fri, Nov 6, 2020 at 11:31 AM Vincent Bossuet  wrote:
>
> > Hi all :)
> >
> > I'm trying to connect to Solr with JDBC, but I always get
> > "java.util.concurrent.TimeoutException: Could not connect to ZooKeeper
> > localhost:9983/ within 15000 ms" (or another port, depending on which JDBC
> > URL I test).
> >
> > Here is what I did:
> >
> >    - I installed Solr 7.7.2 (I followed the install doc here
> >    ), i.e. download, extract, start (without options: bin/solr start).
> >    This version of Solr is the one I have at work, so I installed the same
> >    one to test on localhost first.
> >
> >    - I added a 'test' collection and the example XML documents; I can see
> >    them at this url 
> >
> >    - Then I installed DbVisualizer and added the driver and a connection,
> >    as explained here
> >     =>
> >    the only differences I saw from the documentation are that, in the
> >    screenshot of the jars to import, the versions are different and there
> >    is one more jar in the Solr archive (commons-math3-3.6.1.jar). Also,
> >    the JDBC URL is shown with or without a '/' in the middle (see here
> >    ), like this: jdbc:solr://localhost:9983?collection=test or
> >    jdbc:solr://localhost:9983/?collection=test. I don't know if it is
> >    important...
> >
> >    - And I tried both on an Ubuntu VM and on Windows 10.
> >
> > So everything seems to be installed correctly, as in the documentation, but
> > when I click on 'connect', I always get a timeout. Every website where I
> > found some info talks about a URL with port 9983; I tried other
> > possibilities (just in case) but with no success...
> >
> >- jdbc:solr://localhost:9983?collection=test
> >- jdbc:solr://127.0.0.1:9983?collection=test
> >- jdbc:solr://localhost:9983/?collection=test
> >- jdbc:solr://localhost:9983/solr?collection=test
> >- jdbc:solr://localhost:8983/?collection=test
> >- jdbc:solr://localhost:8983?collection=test
> >- jdbc:solr://localhost:8983/solr?collection=test
> >- jdbc:solr://localhost:2181?collection=test
> >- jdbc:solr://localhost:2181/?collection=test
> >- jdbc:solr://localhost:2181/solr?collection=test
> >
> > If you have an idea, thanks for your help!
> >
> > Vincent
> >
>


Re: can't connect to SOLR with JDBC url

2020-11-09 Thread Kevin Risden
>
> start (without option : bin/solr start)


Solr SQL/JDBC requires SolrCloud (running with ZooKeeper), since streaming
expressions (which back the Solr SQL support) require it.

You should be able to start Solr this way to get Solr in cloud mode.

bin/solr start -c

If you use the above to start Solr, the embedded ZK is on localhost:9983 so
the JDBC connection string should be:

jdbc:solr://localhost:9983?collection=test

Assuming your collection name is test.
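
For reference, the same connection from plain JDBC code looks roughly like
this (a sketch, assuming solr-solrj and its dependencies are on the classpath
and the collection is named test):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class SolrJdbcExample {
      public static void main(String[] args) throws Exception {
        // The embedded ZK started by "bin/solr start -c" listens on localhost:9983
        String url = "jdbc:solr://localhost:9983?collection=test";
        try (Connection con = DriverManager.getConnection(url);
             Statement stmt = con.createStatement();
             // any query the Parallel SQL interface supports
             ResultSet rs = stmt.executeQuery("SELECT id FROM test LIMIT 10")) {
          while (rs.next()) {
            System.out.println(rs.getString("id"));
          }
        }
      }
    }

If DbVisualizer asks for the driver class explicitly, it is
org.apache.solr.client.solrj.io.sql.DriverImpl.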

Kevin Risden


On Fri, Nov 6, 2020 at 11:31 AM Vincent Bossuet  wrote:

> Hi all :)
>
> I'm trying to connect to Solr with JDBC, but I always get
> "java.util.concurrent.TimeoutException: Could not connect to ZooKeeper
> localhost:9983/ within 15000 ms" (or another port, depending on which JDBC
> URL I test).
>
> Here is what I did:
>
>    - I installed Solr 7.7.2 (I followed the install doc here
>    ), i.e. download, extract, start (without options: bin/solr start).
>    This version of Solr is the one I have at work, so I installed the same
>    one to test on localhost first.
>
>    - I added a 'test' collection and the example XML documents; I can see
>    them at this url 
>
>    - Then I installed DbVisualizer and added the driver and a connection,
>    as explained here
>     =>
>    the only differences I saw from the documentation are that, in the
>    screenshot of the jars to import, the versions are different and there
>    is one more jar in the Solr archive (commons-math3-3.6.1.jar). Also,
>    the JDBC URL is shown with or without a '/' in the middle (see here
>    ), like this: jdbc:solr://localhost:9983?collection=test or
>    jdbc:solr://localhost:9983/?collection=test. I don't know if it is
>    important...
>
>    - And I tried both on an Ubuntu VM and on Windows 10.
>
> So everything seems to be installed correctly, as in the documentation, but
> when I click on 'connect', I always get a timeout. Every website where I
> found some info talks about a URL with port 9983; I tried other
> possibilities (just in case) but with no success...
>
>- jdbc:solr://localhost:9983?collection=test
>- jdbc:solr://127.0.0.1:9983?collection=test
>- jdbc:solr://localhost:9983/?collection=test
>- jdbc:solr://localhost:9983/solr?collection=test
>- jdbc:solr://localhost:8983/?collection=test
>- jdbc:solr://localhost:8983?collection=test
>- jdbc:solr://localhost:8983/solr?collection=test
>- jdbc:solr://localhost:2181?collection=test
>- jdbc:solr://localhost:2181/?collection=test
>- jdbc:solr://localhost:2181/solr?collection=test
>
> If you have an idea, thanks for your help!
>
> Vincent
>


How to reflect changes of solrconfig.xml to all the cores without causing any conflict

2020-11-09 Thread raj.yadav
Recently we modified the `noCFSRatio` parameter of our merge policy.

 
8
5
50.0
4000
0.0
  

This is our current merge policy. Earlier `noCFSRatio` was set to `0.1`.

Generally, to apply any solrconfig change we reload the collection. But we
stopped doing this because we observed that during the reload operation some
of the replicas go into recovery.
So instead of reloading, we restart each replica one by one.

Our restart procedure:
1. Indexing is stopped on the collection and a hard commit is issued (a
SolrJ sketch of this step is shown after this list).
2. The non-leader replicas are restarted first, and the leader replica is
restarted last.
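
For reference, a minimal SolrJ sketch of the hard-commit step in (1); the ZK
address and collection name are placeholders:

    import java.util.Collections;
    import java.util.Optional;

    import org.apache.solr.client.solrj.impl.CloudSolrClient;

    public class HardCommitBeforeRestart {
      public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
            Collections.singletonList("zkhost:2181"), Optional.empty()).build()) {
          // Hard commit: flushes and fsyncs in-memory segments so no
          // uncommitted updates are lost across the restarts
          client.commit("my_collection");
        }
      }
    }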

*Question:*
Since a reload is not done, none of the replicas (including the leader) will
have the updated solrconfig. If we restart a replica and it tries to sync up
with the leader, will it pick up the latest solrconfig changes, or will it
stay the same as the leader?

Also after this exercise, we have seen a sudden spike in CPU utilization on
a few replicas though there is not much increase in our system load. 
 

System config of each VM:
Disk size: 250 GB
CPU/memory: 8 vCPUs, 64 GiB RAM

Solr collection details:
A single collection with 6 shards; each VM hosts a single replica.
Collection size: 60 GB (each shard is 10 GB)
Average doc size: 1.0 KB




--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: SolrCloud shows cluster still healthy even the node data directory is deleted

2020-11-09 Thread Erick Erickson
Depends. *nix systems have delete-on-close semantics; that is, as
long as there’s a single file handle open, the file will still be
available to the process using it. Only when the last file handle is
closed will the file actually be deleted.

Solr (Lucene actually) has a file handle open to every file in the index
all the time.
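
For what it's worth, the same behavior is easy to see outside of Solr with a
tiny Java sketch (nothing Solr-specific; the path below is made up):

    import java.io.FileInputStream;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class DeleteOnCloseDemo {
      public static void main(String[] args) throws Exception {
        Path path = Paths.get("/tmp/delete-on-close-demo.txt");
        Files.write(path, "still readable".getBytes(StandardCharsets.UTF_8));

        try (FileInputStream in = new FileInputStream(path.toFile())) {
          // Unlink the file while a handle to it is still open
          Files.delete(path);
          System.out.println("visible in listing: " + Files.exists(path)); // false

          // The open handle can still read the data; the inode is only freed
          // once the last handle is closed (on *nix -- Windows behaves differently)
          byte[] buf = new byte[64];
          int n = in.read(buf);
          System.out.println("read from deleted file: "
              + new String(buf, 0, n, StandardCharsets.UTF_8));
        }
      }
    }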

These files aren’t visible when you do a directory listing. So if you
stop Solr, are the files gone? NOTE: When you start Solr again, if
there are existing replicas that are healthy then the entire index
should be copied from another replica….

Best,
Erick

> On Nov 9, 2020, at 3:30 AM, Amy Bai  wrote:
> 
> Hi community,
> 
> I found that SolrCloud won't check the I/O status if the SolrCloud process is
> alive.
> E.g., if I delete the SolrCloud data directory, no errors are reported, and I
> can still log in to the SolrCloud Admin UI to create/query collections.
> Is this reasonable?
> Can someone explain why SOLR handles it like this?
> Thanks so much.
> 
> 
> Regards,
> Amy



Re: Commits (with openSearcher = true) are too slow in solr 8

2020-11-09 Thread raj.yadav
Thanks, Shawn and Erick.
We are trying out the changes suggested in your posts step by step.
We will get back once we have some numbers.



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


SolrCloud shows cluster still healthy even the node data directory is deleted

2020-11-09 Thread Amy Bai
Hi community,

I found that SolrCloud won't check the I/O status if the SolrCloud process is
alive.
E.g., if I delete the SolrCloud data directory, no errors are reported, and I
can still log in to the SolrCloud Admin UI to create/query collections.
Is this reasonable?
Can someone explain why SOLR handles it like this?
Thanks so much.


Regards,
Amy