Re: Commits (with openSearcher = true) are too slow in solr 8

2020-11-03 Thread Shawn Heisey

On 11/3/2020 11:46 PM, raj.yadav wrote:

We have two parallel systems: one is Solr 8.5.2 and the other is Solr 5.4.
In solr_5.4 the commit time with openSearcher=true is 10 to 12 minutes, while in
solr_8 it's around 25 minutes.


Commits on a properly configured and sized system should take a few 
seconds, not minutes.  10 to 12 minutes for a commit is an enormous red 
flag.



This is our current caching policy of solr_8




This is probably the culprit.  Do you know how many entries the 
filterCache actually ends up with?  What you've said with this config is 
"every time I open a new searcher, I'm going to execute up to 6000 
queries against the new index."  If each query takes one second, running 
6000 of them is going to take 100 minutes.  I have seen these queries 
take a lot longer than one second.


Also, each entry in the filterCache can be enormous, depending on the 
number of docs in the index.  Let's say that you have five million 
documents in your core.  With five million documents, each entry in the 
filterCache is going to be 625000 bytes.  That means you need 20GB of 
heap memory for a full filterCache of 32768 entries -- 20GB of memory 
above and beyond everything else that Solr requires.  Your message 
doesn't say how many documents you have; it only says the index is 11GB. 
From that, it is not possible for me to figure out how many documents 
you have.
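Shawn's arithmetic can be sanity-checked directly. A rough sketch (the five million documents and the 32768-entry cache size are illustrative numbers from this thread, and the one-bit-per-document entry size is the worst case; Solr can use smaller representations for sparse sets):

```python
# Worst-case filterCache sizing: each cache entry is a bitset with one
# bit per document in the core.

def filter_cache_heap_bytes(num_docs: int, max_entries: int) -> int:
    bytes_per_entry = num_docs // 8   # one bit per document
    return bytes_per_entry * max_entries

print(5_000_000 // 8)                             # 625000 bytes per entry
print(filter_cache_heap_bytes(5_000_000, 32768))  # 20480000000 (~20 GB)

# Autowarming cost: up to 6000 queries re-executed against each new searcher.
print(6000 / 60)                                  # 100.0 minutes at 1 s/query
```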



While debugging this we came across this page.
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-Slowcommits


I wrote that wiki page.


Here one of the reasons for slow commit is mentioned as:
"Heap size issues. Problems from the heap being too big will tend to be
infrequent, while problems from the heap being too small will tend to happen
consistently."

Can anyone please help me understand the above point?


If your heap is a lot bigger than it needs to be, then what you'll see 
is slow garbage collections, but it won't happen very often.  If the 
heap is too small, then there will be garbage collections that happen 
REALLY often, leaving few system resources for actually running the 
program.  This applies to ANY Java program, not just Solr.



System config:
disk size: 250 GB
cpu: (8 vcpus, 64 GiB memory)
Index size: 11 GB
JVM heap size: 30 GB


That heap seems to be a lot larger than it needs to be.  I have run 
systems with over 100GB of index, with tens of millions of documents, on 
an 8GB heap.  My filterCache on each core had a max size of 64, with an 
autowarmCount of four ... and commits STILL would take 10 to 15 seconds, 
which I consider to be very slow.  Most of that time was spent executing 
those four queries in order to autowarm the filterCache.


What I would recommend you start with is reducing the size of the 
filterCache.  Try a size of 128 and an autowarmCount of 8, see what you 
get for a hit rate on the cache.  Adjust from there as necessary.  And I 
would reduce the heap size for Solr as well -- your heap requirements 
should drop dramatically with a reduced filterCache.
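One way to apply that without hand-editing solrconfig.xml is the Config API's set-property command. A sketch, not from this thread: the host, port and core name are placeholders, and query.filterCache.size / autowarmCount are among the properties the Config API documents as editable.

```python
# Build the Config API payload that shrinks the filterCache to the
# suggested size=128 / autowarmCount=8. Only the JSON body is built
# here; POSTing it requires a running Solr core.
import json

payload = {
    "set-property": {
        "query.filterCache.size": 128,
        "query.filterCache.autowarmCount": 8,
    }
}
body = json.dumps(payload)
print(body)

# Applied against a live core (placeholder host/core):
#   curl -X POST -H 'Content-Type: application/json' \
#        http://localhost:8983/solr/mycore/config -d "$body"
```

The hit rate Shawn mentions should then be visible in the admin UI's Plugins/Stats view, or via the metrics endpoint (the filterCache hitratio entry).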


Thanks,
Shawn


Re: Solr migration related issues.

2020-11-03 Thread Modassar Ather
Thanks Erick and Ilan.

I am using the APIs to create the core and collection and have removed all the
entries from core.properties. Currently I am facing an init failure and am
debugging it.
I will write back if I face any issues.

Best,
Modassar

On Wed, Nov 4, 2020 at 3:20 AM Erick Erickson 
wrote:

> Do note, though, that the default value for legacyCloud changed from
> true to false so even though you can get it to work by setting
> this cluster prop I wouldn’t…
>
> The change in the default value is why it’s failing for you.
>
>
> > On Nov 3, 2020, at 11:20 AM, Ilan Ginzburg  wrote:
> >
> > I second Erick's recommendation, but just for the record legacyCloud was
> > removed in (upcoming) Solr 9 and is still available in Solr 8.x. Most
> > likely this explains, Modassar, why you found it in the documentation.
> >
> > Ilan
> >
> >
> > On Tue, Nov 3, 2020 at 5:11 PM Erick Erickson 
> > wrote:
> >
> >> You absolutely need core.properties files. It’s just that they
> >> should be considered an “implementation detail” that you
> >> should rarely, if ever need to be aware of.
> >>
> >> Scripting manual creation of core.properties files in order
> >> to define your collections has never been officially supported, it
> >> just happened to work.
> >>
> >> Best,
> >> Erick
> >>
> >>> On Nov 3, 2020, at 11:06 AM, Modassar Ather 
> >> wrote:
> >>>
> >>> Thanks Erick for your response.
> >>>
> >>> I will certainly use the APIs and not rely on the core.properties. I
> was
> >>> going through the documentation on core.properties and found it to be
> >> still
> >>> there.
> >>> I have all the solr install scripts based on older Solr versions and
> >> wanted
> >>> to re-use the same as the core.properties way is still available.
> >>>
> >>> So does this mean that we do not need core.properties anymore?
> >>> How can we ensure that the core name is configurable and not
> dynamically
> >>> set?
> >>>
> >>> I will try to use the APIs to create the collection as well as the
> cores.
> >>>
> >>> Best,
> >>> Modassar
> >>>
> >>> On Tue, Nov 3, 2020 at 5:55 PM Erick Erickson  >
> >>> wrote:
> >>>
>  You’re relying on legacyMode, which is no longer supported. In
>  older versions of Solr, if a core.properties file was found on disk
> Solr
>  attempted to create the replica (and collection) on the fly. This is
> no
>  longer true.
> 
> 
>  Why are you doing this manually instead of using the collections
> API?
>  You can precisely place each replica with that API in a way that’ll
>  continue to be supported going forward.
> 
>  This really sounds like an XY problem, what is the use-case you’re
>  trying to solve?
> 
>  Best,
>  Erick
> 
> > On Nov 3, 2020, at 6:39 AM, Modassar Ather 
>  wrote:
> >
> > Hi,
> >
> > I am migrating from Solr 6.5.1 to Solr 8.6.3. As a part of the entire
> > upgrade I have the first task to install and configure the solr with
> >> the
> > core and collection. The solr is installed in SolrCloud mode.
> >
> > In Solr 6.5.1 I was using the following key values in core.properties
>  file.
> > The configuration files were uploaded to zookeeper using the upconfig
> > command.
> > The core and collection was automatically created with the setting in
> > core.properties files and the configSet uploaded in zookeeper and it
> >> used
> > to display on the Solr 6.5.1 dashboard.
> >
> > numShards=12
> >
> > name=mycore
> >
> > collection=mycore
> >
> > configSet=mycore
> >
> >
> > With the latest Solr 8.6.3 the same approach is not working. As per
> my
> > understanding the core is identified using the location of
>  core.properties
> > which is under */mycore/core.properties.*
> >
> > Can you please help me with the following?
> >
> >
> > - Is there any property I am missing to load the core and collection
> >> as
> > it used to be in Solr 6.5.1 with the help of core.properties and
>  config set
> > on zookeeper?
> > - The name of the core and collection should be configurable and not
>  the
> > dynamically generated names. How can I control that in the latest
> >> Solr?
> > - Is the core and collection API the only way to create core and
> > collection as I see that the core is also not getting listed even if
>  the
> > core.properties file is present?
> >
> > Please note that I will be doing a full indexing once the setup is
> >> done.
> >
> > Kindly help me with your suggestions.
> >
> > Best,
> > Modassar
> 
> 
> >>
> >>
>
>
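
For reference, the Collections API call that replaces the old core.properties keys from this thread might be sketched as below (hedged: host and port are placeholders, and only the request URL is built, since sending it needs a running SolrCloud cluster):

```python
# Map the old core.properties keys (numShards=12, collection=mycore,
# configSet=mycore) onto a Collections API CREATE request.
from urllib.parse import urlencode

params = {
    "action": "CREATE",
    "name": "mycore",                   # was name= / collection=mycore
    "numShards": 12,                    # was numShards=12
    "collection.configName": "mycore",  # configset uploaded via upconfig
}
url = "http://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```

Precise replica placement, per Erick's point, can then be handled with parameters such as createNodeSet on CREATE or the node parameter on ADDREPLICA.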


Commits (with openSearcher = true) are too slow in solr 8

2020-11-03 Thread raj.yadav
Hi everyone,
We have two parallel systems: one is Solr 8.5.2 and the other is Solr 5.4.
In solr_5.4 the commit time with openSearcher=true is 10 to 12 minutes, while in
solr_8 it's around 25 minutes.

This is our current caching policy of solr_8







In Solr 5 we are using FastLRUCache (instead of CaffeineCache), and the other
parameters are the same.

While debugging this we came across this page.
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-Slowcommits

Here one of the reasons for slow commit is mentioned as:
"Heap size issues. Problems from the heap being too big will tend to be
infrequent, while problems from the heap being too small will tend to happen
consistently."

Can anyone please help me understand the above point?

System config:
disk size: 250 GB
cpu: (8 vcpus, 64 GiB memory)
Index size: 11 GB
JVM heap size: 30 GB



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


docValues usage

2020-11-03 Thread Wei
Hi,

I have a couple of primitive single-value numeric fields; their
values are used in boosting functions, but not in sort/facet or in the
returned response.  Should I use docValues for them in the schema?  I can
think of the following options:

 1)  indexed=true,  stored=true,  docValues=false
 2)  indexed=true,  stored=false, docValues=true
 3)  indexed=false, stored=false, docValues=true

What would be the performance implications for these options?
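For values read only by boosting/function queries, a docValues-only definition (option 3) is a common starting point, since function queries can read per-document values from docValues. A hedged schema sketch (the field name "popularity" and type "pfloat" are made-up examples, not from this message):

```xml
<field name="popularity" type="pfloat" indexed="false" stored="false" docValues="true"/>
```

If the field ever appears in q/fq clauses, indexed=true (option 2) would typically still be wanted; stored=true mainly matters for returning the original value, though docValues fields can also be returned via useDocValuesAsStored.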

Best,
Wei


Re: download binary files will not uncompress

2020-11-03 Thread Mike Drob
Routing back to the mailing list, please do not reply directly to
individual emails.

You did not download the complete file, the releases should be
approximately 180MB, not the 30KB that you show.

Try downloading from a different mirror, or check if you are behind a proxy
or firewall preventing the downloads.
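The two checks Mike suggests (size sanity and checksum) can be scripted. A sketch assuming the .sha512 file uses the "<hexdigest>  <filename>" layout (older Apache releases used a different format), run from the download directory:

```python
# Verify a downloaded artifact before untarring: first a size sanity
# check (the 8.6.3 binaries are ~180 MB, so a 30 KB file is truncated),
# then a SHA-512 comparison against the published .sha512 file.
import hashlib
import os

def looks_complete(path: str, min_bytes: int = 100 * 1024 * 1024) -> bool:
    return os.path.getsize(path) >= min_bytes

def sha512_of(path: str, chunk: int = 1 << 20) -> str:
    h = hashlib.sha512()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def verify(artifact: str, checksum_file: str) -> bool:
    with open(checksum_file) as f:
        expected = f.read().split()[0].lower()
    return sha512_of(artifact) == expected

# verify("solr-8.6.3.tgz", "solr-8.6.3.tgz.sha512")
```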


On Tue, Nov 3, 2020 at 4:51 PM James Rome  wrote:

> jar@jarfx ~/.gnupg $ gpg --import ~/download/KEYS
> gpg: key B83EA82A0AFCEE7C: public key "Yonik Seeley "
> imported
> gpg: key E48025ED13E57FFC: public key "Upayavira "
> imported
>
> ...
>
> jar@jarfx ~/download $ ls -l solr*
> -rw-r--r-- 1 root root 30690 Nov  3 17:00 solr-8.6.3.tgz
> -rw-r--r-- 1 root root   833 Oct  3 21:44 solr-8.6.3.tgz.asc
> -rw-r--r-- 1 root root   145 Oct  3 21:44 solr-8.6.3.tgz.sha512
> -rw-r--r-- 1 root root 30718 Nov  3 17:01 solr-8.6.3.zip
>
> gpg --verify  solr-8.6.3.tgz.asc solr-8.6.3.tgz
> gpg: Signature made Sat 03 Oct 2020 06:17:01 PM EDT
> gpg: using RSA key 902CC51935C140BF820230961FD5295281436075
> gpg: BAD signature from "Jason Gerlowski (CODE SIGNING KEY)
> " [unknown]
>
> jar@jarfx ~/download $ tar xvf solr-8.6.3.tgz
>
> gzip: stdin: not in gzip format
> tar: Child returned status 1
> tar: Error is not recoverable: exiting now
>
>
> James A. Rome
> 116 Claymore Lane
> Oak Ridge, TN 37830-7674
> 865 482-5643; Cell: 865 566-7991
> jamesr...@gmail.com
> https://jamesrome.net
>
> On 11/3/20 5:20 PM, Mike Drob wrote:
> > Can you check the signatures to make sure your downloads were not
> > corrupted? I just checked and was able to download and uncompress both of
> > them.
> >
> > Also, depending on your version of tar, you don't want the - for your
> > flags... tar xf solr-8.6.3.tgz
> >
> > Mike
> >
> > On Tue, Nov 3, 2020 at 4:15 PM James Rome  wrote:
> >
> >> # Source release: solr-8.6.3-src.tgz
> >> <https://www.apache.org/dyn/closer.lua/lucene/solr/8.6.3/solr-8.6.3-src.tgz>
> >> [PGP]
> >> [SHA512 <https://downloads.apache.org/lucene/solr/8.6.3/solr-8.6.3-src.tgz.sha512>]
> >> # Binary releases: solr-8.6.3.tgz [PGP] [SHA512]
> >> / solr-8.6.3.zip [PGP] [SHA512]
> >>
> >>unzip solr-8.6.3.zip
> >> Archive:  solr-8.6.3.zip
> >> End-of-central-directory signature not found.  Either this file is
> not
> >> a zipfile, or it constitutes one disk of a multi-part archive. In
> the
> >> latter case the central directory and zipfile comment will be found
> on
> >> the last disk(s) of this archive.
> >>
> >>
> >> and
> >>
> >> # tar -xvf solr-8.6.3.tgz
> >>
> >> gzip: stdin: not in gzip format
> >> tar: Child returned status 1
> >> tar: Error is not recoverable: exiting now
> >>
> >> --
> >> James A. Rome
> >>
> >> https://jamesrome.net
> >>
> >>
>


Re: download binary files will not uncompress

2020-11-03 Thread Mike Drob
Can you check the signatures to make sure your downloads were not
corrupted? I just checked and was able to download and uncompress both of
them.

Also, depending on your version of tar, you don't want the - for your
flags... tar xf solr-8.6.3.tgz

Mike

On Tue, Nov 3, 2020 at 4:15 PM James Rome  wrote:

> # Source release: solr-8.6.3-src.tgz
> <https://www.apache.org/dyn/closer.lua/lucene/solr/8.6.3/solr-8.6.3-src.tgz>
> [PGP] [SHA512]
> # Binary releases: solr-8.6.3.tgz [PGP] [SHA512]
> / solr-8.6.3.zip [PGP] [SHA512]
>
>   unzip solr-8.6.3.zip
> Archive:  solr-8.6.3.zip
>End-of-central-directory signature not found.  Either this file is not
>a zipfile, or it constitutes one disk of a multi-part archive. In the
>latter case the central directory and zipfile comment will be found on
>the last disk(s) of this archive.
>
>
> and
>
> # tar -xvf solr-8.6.3.tgz
>
> gzip: stdin: not in gzip format
> tar: Child returned status 1
> tar: Error is not recoverable: exiting now
>
> --
> James A. Rome
>
> https://jamesrome.net
>
>


download binary files will not uncompress

2020-11-03 Thread James Rome
# Source release: solr-8.6.3-src.tgz [PGP] [SHA512]
# Binary releases: solr-8.6.3.tgz [PGP] [SHA512]
/ solr-8.6.3.zip [PGP] [SHA512]


 unzip solr-8.6.3.zip
Archive:  solr-8.6.3.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive. In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.


and

# tar -xvf solr-8.6.3.tgz

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

--
James A. Rome

https://jamesrome.net



Re: Solr migration related issues.

2020-11-03 Thread Erick Erickson
Do note, though, that the default value for legacyCloud changed from
true to false so even though you can get it to work by setting
this cluster prop I wouldn’t…

The change in the default value is why it’s failing for you.


> On Nov 3, 2020, at 11:20 AM, Ilan Ginzburg  wrote:
> 
> I second Erick's recommendation, but just for the record legacyCloud was
> removed in (upcoming) Solr 9 and is still available in Solr 8.x. Most
> likely this explains, Modassar, why you found it in the documentation.
> 
> Ilan
> 
> 
> On Tue, Nov 3, 2020 at 5:11 PM Erick Erickson 
> wrote:
> 
>> You absolutely need core.properties files. It’s just that they
>> should be considered an “implementation detail” that you
>> should rarely, if ever need to be aware of.
>> 
>> Scripting manual creation of core.properties files in order
>> to define your collections has never been officially supported, it
>> just happened to work.
>> 
>> Best,
>> Erick
>> 
>>> On Nov 3, 2020, at 11:06 AM, Modassar Ather 
>> wrote:
>>> 
>>> Thanks Erick for your response.
>>> 
>>> I will certainly use the APIs and not rely on the core.properties. I was
>>> going through the documentation on core.properties and found it to be
>> still
>>> there.
>>> I have all the solr install scripts based on older Solr versions and
>> wanted
>>> to re-use the same as the core.properties way is still available.
>>> 
>>> So does this mean that we do not need core.properties anymore?
>>> How can we ensure that the core name is configurable and not dynamically
>>> set?
>>> 
>>> I will try to use the APIs to create the collection as well as the cores.
>>> 
>>> Best,
>>> Modassar
>>> 
>>> On Tue, Nov 3, 2020 at 5:55 PM Erick Erickson 
>>> wrote:
>>> 
 You’re relying on legacyMode, which is no longer supported. In
 older versions of Solr, if a core.properties file was found on disk Solr
 attempted to create the replica (and collection) on the fly. This is no
 longer true.
 
 
 Why are you doing this manually instead of using the collections API?
 You can precisely place each replica with that API in a way that’ll
 continue to be supported going forward.
 
 This really sounds like an XY problem, what is the use-case you’re
 trying to solve?
 
 Best,
 Erick
 
> On Nov 3, 2020, at 6:39 AM, Modassar Ather 
 wrote:
> 
> Hi,
> 
> I am migrating from Solr 6.5.1 to Solr 8.6.3. As a part of the entire
> upgrade I have the first task to install and configure the solr with
>> the
> core and collection. The solr is installed in SolrCloud mode.
> 
> In Solr 6.5.1 I was using the following key values in core.properties
 file.
> The configuration files were uploaded to zookeeper using the upconfig
> command.
> The core and collection was automatically created with the setting in
> core.properties files and the configSet uploaded in zookeeper and it
>> used
> to display on the Solr 6.5.1 dashboard.
> 
> numShards=12
> 
> name=mycore
> 
> collection=mycore
> 
> configSet=mycore
> 
> 
> With the latest Solr 8.6.3 the same approach is not working. As per my
> understanding the core is identified using the location of
 core.properties
> which is under */mycore/core.properties.*
> 
> Can you please help me with the following?
> 
> 
> - Is there any property I am missing to load the core and collection
>> as
> it used to be in Solr 6.5.1 with the help of core.properties and
 config set
> on zookeeper?
> - The name of the core and collection should be configurable and not
 the
> dynamically generated names. How can I control that in the latest
>> Solr?
> - Is the core and collection API the only way to create core and
> collection as I see that the core is also not getting listed even if
 the
> core.properties file is present?
> 
> Please note that I will be doing a full indexing once the setup is
>> done.
> 
> Kindly help me with your suggestions.
> 
> Best,
> Modassar
 
 
>> 
>> 



Re: Solr tag cloud - words and counts

2020-11-03 Thread Walter Underwood
For a tag cloud, the anomalous words are what you want. If you choose the most 
common words, then every tag cloud will have the same words. It will look like:

the, be, to, it, of, and, a, in, that, have, I, it, for, not, on, with, ...
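A client-side sketch of the effect (the documents and stopword list here are illustrative; real code would page through results fetched from Solr's /select handler):

```python
# Naive tag-cloud term counting over a result set, done client-side.
# Without the stopword filter, the top terms are exactly the function
# words listed above.
from collections import Counter

docs = [
    "the quick brown fox and the lazy dog",
    "a fox in the hen house and a dog on the porch",
]
STOPWORDS = {"the", "be", "to", "of", "and", "a", "in", "that",
             "have", "i", "it", "for", "not", "on", "with"}

counts = Counter(
    tok
    for doc in docs
    for tok in doc.lower().split()
    if tok not in STOPWORDS
)
print(counts.most_common(3))  # fox and dog lead once stopwords are removed
```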

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Nov 3, 2020, at 10:04 AM, uyilmaz  wrote:
> 
> 
> I have been trying to find a way to do this in Solr for a while. Perform a 
> query, and for a text_general field in the result set, find each term's # of 
> occurrences.
> 
> - I tried the Terms Component, it doesn't have the ability to restrict the 
> result set with a query.
> 
> - Tried faceting on the field, since it's a text_general field it doesn't 
> have docValues, plus cardinality is very high (millions of documents * tens 
> of words in each field), so it works but it's very slow and sometimes times 
> out.
> 
> - Tried the significantTerms streaming expression, but it's logically not the 
> same as what I'm looking for. It gives the words occurring frequently in the 
> result set but not occurring as frequently outside it, so it's better at 
> finding frequency anomalies than at simply counting.
> 
> Do you have any suggestions?
> 
> Regards
> 
> -- 
> uyilmaz 



Solr tag cloud - words and counts

2020-11-03 Thread uyilmaz


I have been trying to find a way to do this in Solr for a while. Perform a 
query, and for a text_general field in the result set, find each term's # of 
occurrences.

- I tried the Terms Component, it doesn't have the ability to restrict the 
result set with a query.

- Tried faceting on the field, since it's a text_general field it doesn't have 
docValues, plus cardinality is very high (millions of documents * tens of words 
in each field), so it works but it's very slow and sometimes times out.

- Tried the significantTerms streaming expression, but it's logically not the 
same as what I'm looking for. It gives the words occurring frequently in the 
result set but not occurring as frequently outside it, so it's better at finding 
frequency anomalies than at simply counting.

Do you have any suggestions?

Regards

-- 
uyilmaz 


solr-exporter using string arrays - 2

2020-11-03 Thread Maximilian Renner
Sorry for the bad format of the first mail, once again:

Hello there,

While playing around with
https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml
I found a bug when trying to use string arrays like 'facet.field'. The
exception occurs on solr-8.4.1:

ERROR - 2020-11-03 17:21:07.742; org.apache.solr.prometheus.exporter.SolrExporter; Could not load scrape configuration from %s
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassCastException: class java.util.ArrayList cannot be cast to class java.lang.String (java.util.ArrayList and java.lang.String are in module java.base of loader 'bootstrap')
        at org.apache.solr.prometheus.exporter.SolrExporter.loadMetricsConfiguration(SolrExporter.java:223)
        at org.apache.solr.prometheus.exporter.SolrExporter.main(SolrExporter.java:205)

I have seen that the bug is still there on the master branch. I can provide a
patch for the fix. Can I create an issue, or do you already know about it?
(https://issues.apache.org/jira/secure/CreateIssue.jspa)

Regards,
Max

-
FreeMail powered by mail.de - MORE SECURITY, TRUSTWORTHINESS AND CONVENIENCE


solr-exporter using string arrays

2020-11-03 Thread Maximilian Renner
Hello there,

While playing around with
https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml
I found a bug when trying to use string arrays like 'facet.field':

[XML scrape configuration stripped by the mailing list; the recoverable values
were: test, /select, {!EX=PUBLICATION}PUBLICATION, {!EX=ORGUNIT}ORG_UNIT]

The exception occurs on solr-8.4.1:

ERROR - 2020-11-03 17:21:07.742; org.apache.solr.prometheus.exporter.SolrExporter; Could not load scrape configuration from %s
Exception in thread "main" java.lang.RuntimeException: java.lang.ClassCastException: class java.util.ArrayList cannot be cast to class java.lang.String (java.util.ArrayList and java.lang.String are in module java.base of loader 'bootstrap')
        at org.apache.solr.prometheus.exporter.SolrExporter.loadMetricsConfiguration(SolrExporter.java:223)
        at org.apache.solr.prometheus.exporter.SolrExporter.main(SolrExporter.java:205)

I have seen that the bug is still there on the master branch. I can provide a
patch for the fix. Can I create an issue?
(https://issues.apache.org/jira/secure/CreateIssue.jspa)

Regards,
Max



Re: how do you manage your config and schema

2020-11-03 Thread matthew sporleder
Is there a more conservative starting point that is still up to date
than _default?

On Tue, Nov 3, 2020 at 11:13 AM matthew sporleder  wrote:
>
> So _default considered unsafe?  :)
>
> On Tue, Nov 3, 2020 at 11:08 AM Erick Erickson  
> wrote:
> >
> > The caution I would add is that you should be careful
> > that you don’t enable schemaless mode without understanding
> > the consequences in detail.
> >
> > There is, in fact, some discussion of removing schemaless entirely,
> > see:
> > https://issues.apache.org/jira/browse/SOLR-14701
> >
> > Otherwise, I usually recommend that you take the stock configs and
> > overlay whatever customizations you’ve added in terms of
> > field definitions and the like.
> >
> > Do also be careful, some default field params have changed…
> >
> > Best,
> > Erick
> >
> > > On Nov 3, 2020, at 9:30 AM, matthew sporleder  
> > > wrote:
> > >
> > > Yesterday I realized that we have been carrying forward our configs
> > > since, probably, 4.x days.
> > >
> > > I ran a config set action=create (from _default) and saw files I
> > > didn't recognize, and a lot *fewer* things than I've been uploading
> > > for the last few years.
> > >
> > > Anyway my new plan is to just use _default and keep params.json,
> > > solrconfig.xml, and schema.xml in git and just use the defaults for
> > > the rest.  (modulo synonyms/etc)
> > >
> > > Did everyone move on to managed schema and use some kind of
> > > intermediate format to upload?
> > >
> > > I'm just looking for updated best practices and a little survey of usage 
> > > trends.
> > >
> > > Thanks,
> > > Matt
> >
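
One way to keep those files in git and push them on deploy is the Configsets API UPLOAD action, which takes a zip of the conf directory. A sketch (the file contents and the configset name are placeholders, and only the zip is built here, since the POST needs a running Solr):

```python
# Build an in-memory zip of a configset, as expected by
# /solr/admin/configs?action=UPLOAD&name=<configset>.
import io
import zipfile

def configset_zip(files: dict) -> bytes:
    """files maps archive names (e.g. 'solrconfig.xml') to file bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

payload = configset_zip({
    "solrconfig.xml": b"<config>...</config>",
    "schema.xml": b"<schema>...</schema>",
    "params.json": b"{}",
})

# Upload against a live node (placeholder host and name):
#   curl -X POST -H "Content-Type: application/octet-stream" \
#        --data-binary @configset.zip \
#        "http://localhost:8983/solr/admin/configs?action=UPLOAD&name=myconf"
```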


Re: Solr migration related issues.

2020-11-03 Thread Ilan Ginzburg
I second Erick's recommendation, but just for the record legacyCloud was
removed in (upcoming) Solr 9 and is still available in Solr 8.x. Most
likely this explains, Modassar, why you found it in the documentation.

Ilan


On Tue, Nov 3, 2020 at 5:11 PM Erick Erickson 
wrote:

> You absolutely need core.properties files. It’s just that they
> should be considered an “implementation detail” that you
> should rarely, if ever need to be aware of.
>
> Scripting manual creation of core.properties files in order
> to define your collections has never been officially supported, it
> just happened to work.
>
> Best,
> Erick
>
> > On Nov 3, 2020, at 11:06 AM, Modassar Ather 
> wrote:
> >
> > Thanks Erick for your response.
> >
> > I will certainly use the APIs and not rely on the core.properties. I was
> > going through the documentation on core.properties and found it to be
> still
> > there.
> > I have all the solr install scripts based on older Solr versions and
> wanted
> > to re-use the same as the core.properties way is still available.
> >
> > So does this mean that we do not need core.properties anymore?
> > How can we ensure that the core name is configurable and not dynamically
> > set?
> >
> > I will try to use the APIs to create the collection as well as the cores.
> >
> > Best,
> > Modassar
> >
> > On Tue, Nov 3, 2020 at 5:55 PM Erick Erickson 
> > wrote:
> >
> >> You’re relying on legacyMode, which is no longer supported. In
> >> older versions of Solr, if a core.properties file was found on disk Solr
> >> attempted to create the replica (and collection) on the fly. This is no
> >> longer true.
> >>
> >>
> >> Why are you doing this manually instead of using the collections API?
> >> You can precisely place each replica with that API in a way that’ll
> >> continue to be supported going forward.
> >>
> >> This really sounds like an XY problem, what is the use-case you’re
> >> trying to solve?
> >>
> >> Best,
> >> Erick
> >>
> >>> On Nov 3, 2020, at 6:39 AM, Modassar Ather 
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I am migrating from Solr 6.5.1 to Solr 8.6.3. As a part of the entire
> >>> upgrade I have the first task to install and configure the solr with
> the
> >>> core and collection. The solr is installed in SolrCloud mode.
> >>>
> >>> In Solr 6.5.1 I was using the following key values in core.properties
> >> file.
> >>> The configuration files were uploaded to zookeeper using the upconfig
> >>> command.
> >>> The core and collection was automatically created with the setting in
> >>> core.properties files and the configSet uploaded in zookeeper and it
> used
> >>> to display on the Solr 6.5.1 dashboard.
> >>>
> >>> numShards=12
> >>>
> >>> name=mycore
> >>>
> >>> collection=mycore
> >>>
> >>> configSet=mycore
> >>>
> >>>
> >>> With the latest Solr 8.6.3 the same approach is not working. As per my
> >>> understanding the core is identified using the location of
> >> core.properties
> >>> which is under */mycore/core.properties.*
> >>>
> >>> Can you please help me with the following?
> >>>
> >>>
> >>>  - Is there any property I am missing to load the core and collection
> as
> >>>  it used to be in Solr 6.5.1 with the help of core.properties and
> >> config set
> >>>  on zookeeper?
> >>>  - The name of the core and collection should be configurable and not
> >> the
> >>>  dynamically generated names. How can I control that in the latest
> Solr?
> >>>  - Is the core and collection API the only way to create core and
> >>>  collection as I see that the core is also not getting listed even if
> >> the
> >>>  core.properties file is present?
> >>>
> >>> Please note that I will be doing a full indexing once the setup is
> done.
> >>>
> >>> Kindly help me with your suggestions.
> >>>
> >>> Best,
> >>> Modassar
> >>
> >>
>
>


Re: how do you manage your config and schema

2020-11-03 Thread matthew sporleder
So _default considered unsafe?  :)

On Tue, Nov 3, 2020 at 11:08 AM Erick Erickson  wrote:
>
> The caution I would add is that you should be careful
> that you don’t enable schemaless mode without understanding
> the consequences in detail.
>
> There is, in fact, some discussion of removing schemaless entirely,
> see:
> https://issues.apache.org/jira/browse/SOLR-14701
>
> Otherwise, I usually recommend that you take the stock configs and
> overlay whatever customizations you’ve added in terms of
> field definitions and the like.
>
> Do also be careful, some default field params have changed…
>
> Best,
> Erick
>
> > On Nov 3, 2020, at 9:30 AM, matthew sporleder  wrote:
> >
> > Yesterday I realized that we have been carrying forward our configs
> > since, probably, 4.x days.
> >
> > I ran a config set action=create (from _default) and saw files I
> > didn't recognize, and a lot *fewer* things than I've been uploading
> > for the last few years.
> >
> > Anyway my new plan is to just use _default and keep params.json,
> > solrconfig.xml, and schema.xml in git and just use the defaults for
> > the rest.  (modulo synonyms/etc)
> >
> > Did everyone move on to managed schema and use some kind of
> > intermediate format to upload?
> >
> > I'm just looking for updated best practices and a little survey of usage 
> > trends.
> >
> > Thanks,
> > Matt
>


Re: Solr migration related issues.

2020-11-03 Thread Erick Erickson
You absolutely need core.properties files. It’s just that they
should be considered an “implementation detail” that you
should rarely, if ever need to be aware of.

Scripting manual creation of core.properties files in order
to define your collections has never been officially supported, it
just happened to work.

Best,
Erick

> On Nov 3, 2020, at 11:06 AM, Modassar Ather  wrote:
> 
> Thanks Erick for your response.
> 
> I will certainly use the APIs and not rely on the core.properties. I was
> going through the documentation on core.properties and found it to be still
> there.
> I have all the solr install scripts based on older Solr versions and wanted
> to re-use the same as the core.properties way is still available.
> 
> So does this mean that we do not need core.properties anymore?
> How can we ensure that the core name is configurable and not dynamically
> set?
> 
> I will try to use the APIs to create the collection as well as the cores.
> 
> Best,
> Modassar
> 
> On Tue, Nov 3, 2020 at 5:55 PM Erick Erickson 
> wrote:
> 
>> You’re relying on legacyMode, which is no longer supported. In
>> older versions of Solr, if a core.properties file was found on disk Solr
>> attempted to create the replica (and collection) on the fly. This is no
>> longer true.
>> 
>> 
>> Why are you doing this manually instead of using the collections API?
>> You can precisely place each replica with that API in a way that’ll
>> continue to be supported going forward.
>> 
>> This really sounds like an XY problem, what is the use-case you’re
>> trying to solve?
>> 
>> Best,
>> Erick
>> 
>>> On Nov 3, 2020, at 6:39 AM, Modassar Ather 
>> wrote:
>>> 
>>> Hi,
>>> 
>>> I am migrating from Solr 6.5.1 to Solr 8.6.3. As a part of the entire
>>> upgrade I have the first task to install and configure the solr with the
>>> core and collection. The solr is installed in SolrCloud mode.
>>> 
>>> In Solr 6.5.1 I was using the following key values in core.properties
>> file.
>>> The configuration files were uploaded to zookeeper using the upconfig
>>> command.
>>> The core and collection was automatically created with the setting in
>>> core.properties files and the configSet uploaded in zookeeper and it used
>>> to display on the Solr 6.5.1 dashboard.
>>> 
>>> numShards=12
>>> 
>>> name=mycore
>>> 
>>> collection=mycore
>>> 
>>> configSet=mycore
>>> 
>>> 
>>> With the latest Solr 8.6.3 the same approach is not working. As per my
>>> understanding the core is identified using the location of
>> core.properties
>>> which is under */mycore/core.properties.*
>>> 
>>> Can you please help me with the following?
>>> 
>>> 
>>>  - Is there any property I am missing to load the core and collection as
>>>  it used to be in Solr 6.5.1 with the help of core.properties and
>> config set
>>>  on zookeeper?
>>>  - The name of the core and collection should be configurable and not
>> the
>>>  dynamically generated names. How can I control that in the latest Solr?
>>>  - Is the core and collection API the only way to create core and
>>>  collection as I see that the core is also not getting listed even if
>> the
>>>  core.properties file is present?
>>> 
>>> Please note that I will be doing a full indexing once the setup is done.
>>> 
>>> Kindly help me with your suggestions.
>>> 
>>> Best,
>>> Modassar
>> 
>> 



Re: how do you manage your config and schema

2020-11-03 Thread Erick Erickson
The caution I would add is that you should be careful 
that you don’t enable schemaless mode without understanding 
the consequences in detail.

There is, in fact, some discussion of removing schemaless entirely, 
see:
https://issues.apache.org/jira/browse/SOLR-14701

Otherwise, I usually recommend that you take the stock configs and
overlay whatever customizations you’ve added in terms of
field definitions and the like.

Do also be careful, some default field params have changed…
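For reference, schemaless field guessing in the stock configs is governed by the `update.autoCreateFields` user property, which you can flip off through the Config API. A minimal sketch of building that request body (the collection name and host in the comment are assumptions, not from this thread):

```python
import json

def disable_schemaless_payload():
    """Config API body that turns off schemaless field guessing
    (the update.autoCreateFields user property)."""
    return json.dumps({"set-user-property": {"update.autoCreateFields": "false"}})

# POST this body to /solr/<collection>/config, e.g.:
#   curl -X POST -H 'Content-type: application/json' \
#        http://localhost:8983/solr/mycollection/config \
#        -d '{"set-user-property": {"update.autoCreateFields": "false"}}'
print(disable_schemaless_payload())
```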

Best,
Erick

> On Nov 3, 2020, at 9:30 AM, matthew sporleder  wrote:
> 
> Yesterday I realized that we have been carrying forward our configs
> since, probably, 4.x days.
> 
> I ran a config set action=create (from _default) and saw files i
> didn't recognize, and a lot *fewer* things than I've been uploading
> for the last few years.
> 
> Anyway my new plan is to just use _default and keep params.json,
> solrconfig.xml, and schema.xml in git and just use the defaults for
> the rest.  (modulo synonyms/etc)
> 
> Did everyone move on to managed schema and use some kind of
> intermediate format to upload?
> 
> I'm just looking for updated best practices and a little survey of usage 
> trends.
> 
> Thanks,
> Matt



Re: Solr migration related issues.

2020-11-03 Thread Modassar Ather
Thanks Erick for your response.

I will certainly use the APIs and not rely on core.properties. I was
going through the documentation on core.properties and found it is still
there.
All my Solr install scripts are based on older Solr versions, and I wanted
to re-use them since the core.properties way is still available.

So does this mean that we do not need core.properties anymore?
How can we ensure that the core name is configurable and not dynamically
set?

I will try to use the APIs to create the collection as well as the cores.

Best,
Modassar

On Tue, Nov 3, 2020 at 5:55 PM Erick Erickson 
wrote:

> You’re relying on legacyMode, which is no longer supported. In
> older versions of Solr, if a core.properties file was found on disk Solr
> attempted to create the replica (and collection) on the fly. This is no
> longer true.
>
>
> Why are you doing this manually instead of using the collections API?
> You can precisely place each replica with that API in a way that will
> continue to be supported going forward.
>
> This really sounds like an XY problem, what is the use-case you’re
> trying to solve?
>
> Best,
> Erick
>
> > On Nov 3, 2020, at 6:39 AM, Modassar Ather 
> wrote:
> >
> > Hi,
> >
> > I am migrating from Solr 6.5.1 to Solr 8.6.3. As a part of the entire
> > upgrade I have the first task to install and configure the solr with the
> > core and collection. The solr is installed in SolrCloud mode.
> >
> > In Solr 6.5.1 I was using the following key values in core.properties
> file.
> > The configuration files were uploaded to zookeeper using the upconfig
> > command.
> > The core and collection was automatically created with the setting in
> > core.properties files and the configSet uploaded in zookeeper and it used
> > to display on the Solr 6.5.1 dashboard.
> >
> > numShards=12
> >
> > name=mycore
> >
> > collection=mycore
> >
> > configSet=mycore
> >
> >
> > With the latest Solr 8.6.3 the same approach is not working. As per my
> > understanding the core is identified using the location of
> core.properties
> > which is under */mycore/core.properties.*
> >
> > Can you please help me with the following?
> >
> >
> >   - Is there any property I am missing to load the core and collection as
> >   it used to be in Solr 6.5.1 with the help of core.properties and
> config set
> >   on zookeeper?
> >   - The name of the core and collection should be configurable and not
> the
> >   dynamically generated names. How can I control that in the latest Solr?
> >   - Is the core and collection API the only way to create core and
> >   collection as I see that the core is also not getting listed even if
> the
> >   core.properties file is present?
> >
> > Please note that I will be doing a full indexing once the setup is done.
> >
> > Kindly help me with your suggestions.
> >
> > Best,
> > Modassar
>
>


how do you manage your config and schema

2020-11-03 Thread matthew sporleder
Yesterday I realized that we have been carrying forward our configs
since, probably, 4.x days.

I ran a config set action=create (from _default) and saw files I
didn't recognize, and a lot *fewer* things than I've been uploading
for the last few years.

Anyway my new plan is to just use _default and keep params.json,
solrconfig.xml, and schema.xml in git and just use the defaults for
the rest.  (modulo synonyms/etc)

Did everyone move on to managed schema and use some kind of
intermediate format to upload?

I'm just looking for updated best practices and a little survey of usage trends.

Thanks,
Matt


Frequent Index Replication Failure in solr.

2020-11-03 Thread Parshant Kumar
Hi team,

We are having solr architecture as *master->repeater-> 3 slave servers.*

We are doing incremental indexing on the master server (every 20 min).
The index is replicated from master to the repeater server (every 10 min)
and from the repeater to the 3 slave servers (every 3 hours).
*We are facing frequent replication failures between master and repeater
as well as between repeater and the slave servers.*
On checking the logs, we found that one of the exceptions below occurred
every time replication failed.

1)WARN : Error in fetching file: _4rnu_t.liv (downloaded 0 of 11505507
bytes)
java.io.EOFException: Unexpected end of ZLIB input stream
at
java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
at
java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at
org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:79)
at
org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:88)
at
org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:139)
at
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:166)
at
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:160)
at
org.apache.solr.handler.IndexFetcher$FileFetcher.fetchPackets(IndexFetcher.java:1443)
at
org.apache.solr.handler.IndexFetcher$FileFetcher.fetch(IndexFetcher.java:1409)


2)
WARN : Error getting file length for [segments_568]
java.nio.file.NoSuchFileException:
/data/solr/search/application/core-conf/im-search/data/index.20200711012319226/segments_568
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at
sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
at
sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
at
sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
at java.nio.file.Files.readAttributes(Files.java:1737)
at java.nio.file.Files.size(Files.java:2332)
at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:243)
at
org.apache.solr.handler.admin.LukeRequestHandler.getFileLength(LukeRequestHandler.java:615)
at
org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo(LukeRequestHandler.java:588)
at
org.apache.solr.handler.admin.CoreAdminOperation.getCoreStatus(CoreAdminOperation.java:335)

3)
WARN : Error in fetching file: _4nji.nvd (downloaded 507510784 of 555377795
bytes)
org.apache.http.MalformedChunkCodingException: CRLF expected at end of chunk
at
org.apache.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:255)
at
org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:227)
at
org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:186)
at
org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:137)
at
java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:238)
at
java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at
org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:79)
at
org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:128)
at
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:166)
at
org.apache.solr.handler.IndexFetcher$FileFetcher.fetchPackets(IndexFetcher.java:1458)
at
org.apache.solr.handler.IndexFetcher$FileFetcher.fetch(IndexFetcher.java:1409)
at
org.apache.solr.handler.IndexFetcher$FileFetcher.fetchFile(IndexFetcher.java:1390)
at
org.apache.solr.handler.IndexFetcher.downloadIndexFiles(IndexFetcher.java:872)
at
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:438)
at
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:254)

*The replication configuration of master, repeater, and slaves is given below:*

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="enable">${enable.master:false}</str>
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
  </lst>
  <lst name="slave">
    <str name="pollInterval">00:00:10</str>
  </lst>
</requestHandler>


*The commit configuration of master, repeater, and slaves is given below:*

<autoCommit>
  <maxDocs>10</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>


Please help in finding the root cause of the replication failures. Let me
know for any queries.

Thanks

Parshant kumar

Re: Search issue in the SOLR for few words

2020-11-03 Thread Erick Erickson
There is not nearly enough information here to begin
to help you.

At minimum we need:
1> your field definition
2> the text you index
3> the query you send

You might want to review: 
https://wiki.apache.org/solr/UsingMailingLists

Best,
Erick

> On Nov 3, 2020, at 1:08 AM, Viresh Sasalawad 
>  wrote:
> 
> Hi Sir/Madam,
> 
> I am facing an issue with a few keyword searches (like gazing, one) in Solr.
> Can you please help with why these words are not listed in Solr results?
> 
> Indexing is done properly.
> 
> 
> -- 
> Thanks and Regards
> Veeresh Sasalawad



Re: Solr migration related issues.

2020-11-03 Thread Erick Erickson
You’re relying on legacyMode, which is no longer supported. In
older versions of Solr, if a core.properties file was found on disk Solr
attempted to create the replica (and collection) on the fly. This is no
longer true.


Why are you doing this manually instead of using the collections API?
You can precisely place each replica with that API in a way that will
continue to be supported going forward.
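To make that concrete, here is a minimal sketch of building a Collections API CREATE request that pins replicas to an explicit node list via `createNodeSet` (the host names and the mapping of your old core.properties values onto the parameters are assumptions based on what was posted):

```python
from urllib.parse import urlencode

def create_collection_url(base, name, config, num_shards, nodes):
    """Build a Collections API CREATE request that names the collection
    explicitly, points it at an uploaded configSet, and restricts
    replica placement to the given nodes."""
    params = {
        "action": "CREATE",
        "name": name,                        # collection name you choose, not generated
        "collection.configName": config,     # configSet already uploaded to ZooKeeper
        "numShards": num_shards,
        "createNodeSet": ",".join(nodes),    # replicas placed only on these nodes
    }
    return f"{base}/admin/collections?{urlencode(params)}"

url = create_collection_url(
    "http://localhost:8983/solr", "mycore", "mycore", 12,
    ["host1:8983_solr", "host2:8983_solr"])
print(url)
```

Issue the resulting URL with curl or a browser; Solr then writes the core.properties files itself, so you never have to maintain them by hand.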

This really sounds like an XY problem, what is the use-case you’re
trying to solve?

Best,
Erick

> On Nov 3, 2020, at 6:39 AM, Modassar Ather  wrote:
> 
> Hi,
> 
> I am migrating from Solr 6.5.1 to Solr 8.6.3. As a part of the entire
> upgrade I have the first task to install and configure the solr with the
> core and collection. The solr is installed in SolrCloud mode.
> 
> In Solr 6.5.1 I was using the following key values in core.properties file.
> The configuration files were uploaded to zookeeper using the upconfig
> command.
> The core and collection was automatically created with the setting in
> core.properties files and the configSet uploaded in zookeeper and it used
> to display on the Solr 6.5.1 dashboard.
> 
> numShards=12
> 
> name=mycore
> 
> collection=mycore
> 
> configSet=mycore
> 
> 
> With the latest Solr 8.6.3 the same approach is not working. As per my
> understanding the core is identified using the location of core.properties
> which is under */mycore/core.properties.*
> 
> Can you please help me with the following?
> 
> 
>   - Is there any property I am missing to load the core and collection as
>   it used to be in Solr 6.5.1 with the help of core.properties and config set
>   on zookeeper?
>   - The name of the core and collection should be configurable and not the
>   dynamically generated names. How can I control that in the latest Solr?
>   - Is the core and collection API the only way to create core and
>   collection as I see that the core is also not getting listed even if the
>   core.properties file is present?
> 
> Please note that I will be doing a full indexing once the setup is done.
> 
> Kindly help me with your suggestions.
> 
> Best,
> Modassar



Solr migration related issues.

2020-11-03 Thread Modassar Ather
Hi,

I am migrating from Solr 6.5.1 to Solr 8.6.3. As part of the upgrade, my
first task is to install and configure Solr with the core and collection.
Solr is installed in SolrCloud mode.

In Solr 6.5.1 I was using the following key values in core.properties file.
The configuration files were uploaded to zookeeper using the upconfig
command.
The core and collection were automatically created from the settings in the
core.properties files and the configSet uploaded to ZooKeeper, and they
used to display on the Solr 6.5.1 dashboard.

numShards=12

name=mycore

collection=mycore

configSet=mycore


With the latest Solr 8.6.3 the same approach is not working. As per my
understanding the core is identified using the location of core.properties
which is under */mycore/core.properties.*

Can you please help me with the following?


   - Is there any property I am missing to load the core and collection as
   it used to be in Solr 6.5.1 with the help of core.properties and config set
   on zookeeper?
   - The name of the core and collection should be configurable and not the
   dynamically generated names. How can I control that in the latest Solr?
   - Is the core and collection API the only way to create core and
   collection as I see that the core is also not getting listed even if the
   core.properties file is present?

Please note that I will be doing a full indexing once the setup is done.

Kindly help me with your suggestions.

Best,
Modassar


Solr.NestPathField

2020-11-03 Thread Dawn
Hi:

We use solr.NestPathField to store data.

SPLITSHARD is then used to split the shard, and we found that the child
documents' data is lost.


Version 8.6.3
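For anyone trying to reproduce this: the split in question is the stock Collections API SPLITSHARD call; a minimal sketch of building the request (the collection and shard names are placeholders, not from the report):

```python
from urllib.parse import urlencode

def splitshard_url(base, collection, shard):
    """Collections API SPLITSHARD request. After the split completes,
    the parent shard's documents should be fully covered by the new
    sub-shards -- including child documents of nested documents."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    return f"{base}/admin/collections?{urlencode(params)}"

result = splitshard_url("http://localhost:8983/solr", "mycollection", "shard1")
print(result)
```

As far as I know, nested documents are routed by the root document's id, so parents and children should land in the same sub-shard; children going missing after a split would point to a bug rather than a configuration issue.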