RE: create collection from existing managed-schema

2018-07-26 Thread Rahul Chhiber
Hi,

If you want to share a schema and/or other configuration between collections, 
you need to create a configset, then specify that configset when creating 
the collections.

Any changes made to that configset or its schema will be reflected in all 
collections that use it.

By default, Solr uses the _default configset for any collection created 
without an explicit configset.
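For example (a sketch; the configset name, config path, ZooKeeper address, and
collection name below are all illustrative):

```shell
# Upload a config directory to ZooKeeper as a named configset:
bin/solr zk upconfig -n shared_conf -d /path/to/conf -z localhost:9983

# Create a collection that uses it via the Collections API:
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=coll2&numShards=1&collection.configName=shared_conf"
```

Every collection created with collection.configName=shared_conf then shares
the same schema.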

Regards,
Rahul Chhiber

-Original Message-
From: Chuming Chen [mailto:chumingc...@gmail.com] 
Sent: Thursday, July 26, 2018 11:35 PM
To: solr-user@lucene.apache.org
Subject: create collection from existing managed-schema

Hi All,

From the Solr Admin interface, I have created a collection and added field 
definitions. I can get its managed-schema from the Admin interface. 

Can I use this managed-schema to create a new collection? How?

Thanks,

Chuming




Re: Edismax and ShingleFilterFactory exponential term grow

2018-07-26 Thread Erick Erickson
This is doing exactly what it should. It'd be a little clearer if you
used a tokenSeparator other than the default space. Then this line:

text_shingles:word1 word2 word3+text_shingles:word4 word5

would look more like this:
text_shingles:word1_word2_word3+text_shingles:word4_word5

It's building a query from all of the 1, 2 and 3 grams. You're getting
the single tokens because outputUnigrams defaults to "true".

So of course, as the number of terms in the query grows, the number of
clauses in the parsed query grows non-linearly.
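A quick way to see the growth (a sketch, not something Solr itself runs): each
clause in the parsed queries below corresponds to one way of segmenting the
query into consecutive runs of 1 to maxShingleSize words, so the counts follow
a tribonacci-style recurrence.

```python
def clause_count(n_words, max_shingle=3):
    """Count the ways to split n_words consecutive query words into
    runs of 1..max_shingle tokens (unigrams are included because
    outputUnigrams defaults to true)."""
    if n_words == 0:
        return 1  # exactly one way to segment nothing
    return sum(clause_count(n_words - k, max_shingle)
               for k in range(1, min(max_shingle, n_words) + 1))

for n in (2, 3, 4, 5, 10, 20):
    print(n, clause_count(n))  # 2->2, 3->4, 4->7, 5->13, ...
```

The first values (2, 4, 7, 13) match the parsed queries quoted below; by 20
words there are over 120,000 clauses, which is consistent with the timeouts
and OOMs reported.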

Best,
Erick

On Thu, Jul 26, 2018 at 12:44 PM, Jokin C  wrote:
> Hi, I have a problem and I don't know if it's something that I am doing
> wrong or if it's maybe a bug. I want to query a field with shingles, the
> field and type definition are this:
>
> <field name="text_shingles" type="text_shingles" indexed="true"
> stored="false"/>
>
> <fieldType name="text_shingles" class="solr.TextField"
> positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="..."/>
>     <filter class="solr.ShingleFilterFactory" minShingleSize="..."
> maxShingleSize="3" />
>   </analyzer>
> </fieldType>
>
>
> I'm using Solr  7.2.1.
>
> I just wanted to have different min and max shingle sizes to test how it
> works, but if the query is long, Solr gives timeouts, high CPU and OOM.
>
> the query I'm using is this:
>
> http://localhost:8983/solr/ntnx/select?debugQuery=on&q={!edismax%20%20qf=%22text_shingles%22%20}%22%20word1%20word2%20word3%20word4%20word5%20word6%20word7
>
> and the parsed query grows like this with just 4 words; when I use a query
> with a lot of words it fails.
>
> 2 words:
> "parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
> +text_shingles:word2) text_shingles:word1 word2)))",
>
> 3words:
> "parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
> +text_shingles:word2 +text_shingles:word3) (+text_shingles:word1
> +text_shingles:word2 word3) (+text_shingles:word1 word2
> +text_shingles:word3) text_shingles:word1 word2 word3)))",
>
> 4 words:
> "parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
> +text_shingles:word2 +text_shingles:word3 +text_shingles:word4)
> (+text_shingles:word1 +text_shingles:word2 +text_shingles:word3 word4)
> (+text_shingles:word1 +text_shingles:word2 word3 +text_shingles:word4)
> (+text_shingles:word1 +text_shingles:word2 word3 word4)
> (+text_shingles:word1 word2 +text_shingles:word3 +text_shingles:word4)
> (+text_shingles:word1 word2 +text_shingles:word3 word4)
> (+text_shingles:word1 word2 word3 +text_shingles:word4",
>
> 5 words:
> "parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
> +text_shingles:word2 +text_shingles:word3 +text_shingles:word4
> +text_shingles:word5) (+text_shingles:word1 +text_shingles:word2
> +text_shingles:word3 +text_shingles:word4 word5) (+text_shingles:word1
> +text_shingles:word2 +text_shingles:word3 word4 +text_shingles:word5)
> (+text_shingles:word1 +text_shingles:word2 +text_shingles:word3 word4
> word5) (+text_shingles:word1 +text_shingles:word2 word3
> +text_shingles:word4 +text_shingles:word5) (+text_shingles:word1
> +text_shingles:word2 word3 +text_shingles:word4 word5)
> (+text_shingles:word1 +text_shingles:word2 word3 word4
> +text_shingles:word5) (+text_shingles:word1 word2 +text_shingles:word3
> +text_shingles:word4 +text_shingles:word5) (+text_shingles:word1 word2
> +text_shingles:word3 +text_shingles:word4 word5) (+text_shingles:word1
> word2 +text_shingles:word3 word4 +text_shingles:word5)
> (+text_shingles:word1 word2 +text_shingles:word3 word4 word5)
> (+text_shingles:word1 word2 word3 +text_shingles:word4
> +text_shingles:word5) (+text_shingles:word1 word2 word3
> +text_shingles:word4 word5",
>
>
> So, something bad is happening. Is it because I'm doing something wrong, or
> is it maybe a bug that I should report on the issue tracker?


Re: Recent configuration change to our site causes frequent index corruption

2018-07-26 Thread Erick Erickson
As for your performance warning "Overlapping onDeckSearchers":
almost certainly some external process (probably the indexing client)
is issuing the commits.
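If the client can't be changed, one option is to have Solr ignore
client-issued commits and rely on autoCommit alone, via the stock
IgnoreCommitOptimizeUpdateProcessorFactory (a sketch; the chain name here is
an assumption):

```xml
<!-- solrconfig.xml: accept but silently ignore explicit client commits -->
<updateRequestProcessorChain name="ignore-commits" default="true">
  <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
    <int name="statusCode">200</int>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```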

Best,
Erick.

On Thu, Jul 26, 2018 at 1:43 PM, Markus Jelsma
 wrote:
> Hello,
>
> Is your maximum number of open files 1024? If so, increase it to a more 
> regular 65536. Some operating systems ship with 1024 for reasons I don't 
> understand. Whenever installing Solr anywhere for the past ten years, we have 
> had to check this each and every time, and still have to!
>
> Regards,
> Markus
>
>
>
> -Original message-
>> From:cyndefromva 
>> Sent: Thursday 26th July 2018 22:18
>> To: solr-user@lucene.apache.org
>> Subject: Recent configuration change to our site causes frequent index 
>> corruption
>>
>> I have a Rails 5 application that uses Solr to index and search our site. The
>> sunspot gem is used to integrate Ruby and Solr.  It's a relatively small
>> site (no more than 100,000 records) and has moderate usage (except for the
>> googlebot).
>>
>> Until recently we regularly received 503 errors; reloading the page
>> generally cleared it up, but that was not exactly the user experience we
>> wanted, so we added the following initializer to force a retry on failures:
>>
>> Sunspot.session =
>> Sunspot::SessionProxy::Retry5xxSessionProxy.new(Sunspot.session)
>>
>> As a result, about every third day the site locks up until we rebuild the
>> data directory (stop solr, move data directory to another location, start
>> solr, reindex).
>>
>> At the point it starts failing I see a Java exception, "java.io.IOException:
>> Too many open files", in the Solr log file, and a SolrException (Error opening
>> new searcher) is returned to the user.
>>
>> In the solrconfig.xml file we have autoCommit and autoSoftCommit set as
>> follows:
>>
>>   
>>  ${solr.autoCommit.maxTime:15000}
>>  false
>>   
>>
>>   
>>  ${solr.autoSoftCommit.maxTime:-1}
>>   
>>
>> Which I believe means there should be a hard commit every 15 seconds.
>>
>> But it appears to be calling commit more frequently. In the Solr log I see
>> the following commits written milliseconds apart:
>>
>>   UpdateHandler start
>> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
>>
>> I also see the following written right below it:
>>
>> PERFORMANCE WARNING: Overlapping onDeckSearchers=2
>>
>> Note: maxWarmingSearchers is set to 2.
>>
>>
>> I would really appreciate any help I can get to resolve this issue.
>>
>> Thank you!
>>
>>
>>
>>
>>
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>


Re: Can the export handler be used with the edismax or dismax query handler

2018-07-26 Thread Joel Bernstein
The export handler doesn't allow sorting by score at this time. It only
supports sorting on fields, so the edismax qparser currently won't work
with the export handler.
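For reference, a sketch of a working /export request (the collection and
field names are made up; the sort and fl fields must have docValues):

```shell
curl "http://localhost:8983/solr/mycoll/export?q=art&sort=id+asc&fl=id,title"
```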

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Jul 26, 2018 at 5:52 PM, Tom Burton-West  wrote:

> Hello all,
>
> I am completely new to the export handler.
>
> Can the export handler be used with the edismax or dismax query handler?
>
> I tried using local params :
>
> q= _query_:"{!edismax qf='ocr^5+allfields^1+titleProper^50'
> mm='100%25'
> tie='0.9' } art"
>
> which does not seem to be working.
>
> Tom
>


Can the export handler be used with the edismax or dismax query handler

2018-07-26 Thread Tom Burton-West
Hello all,

I am completely new to the export handler.

Can the export handler be used with the edismax or dismax query handler?

I tried using local params :

q= _query_:"{!edismax qf='ocr^5+allfields^1+titleProper^50' mm='100%25'
tie='0.9' } art"

which does not seem to be working.

Tom


RealTimeGet - routing

2018-07-26 Thread Susmit
Hi,

RTG query is not able to match docs in my collection.
Collection - 2 shards
router field is not the same as uniqueKey
solr 6.3.1

Looked at the code: it uses the passed list of ids to find the shard each
one belongs to based on hash range, and fires distributed queries. But the
ids are uniqueKeys while the hash ranges are for router.field, so it does
not find the doc.

I could work around this by computing the shards on the client and sending
a shards parameter with the query.
Would it also impact replica sync?
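That workaround would look roughly like this (an untested sketch; the host,
collection, id, and shard/core names are all assumptions):

```shell
# Bypass RTG's uniqueKey hashing by explicitly naming the shard that
# the document's router.field value actually hashes to:
curl "http://localhost:8983/solr/mycoll/get?id=doc1&shards=localhost:8983/solr/mycoll_shard2_replica_n1"
```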

Thanks,
Susmit

RE: Recent configuration change to our site causes frequent index corruption

2018-07-26 Thread Markus Jelsma
Hello,

Is your maximum number of open files 1024? If so, increase it to a more regular 
65536. Some operating systems ship with 1024 for reasons I don't understand. 
Whenever installing Solr anywhere for the past ten years, we have had to check 
this each and every time, and still have to!
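To check and raise the limit (a sketch for Linux; the "solr" user name and
the values are illustrative):

```shell
# Show the current per-process open-file limit for this shell:
ulimit -n

# To raise it persistently, add lines like these to
# /etc/security/limits.conf and log in again:
#   solr  soft  nofile  65536
#   solr  hard  nofile  65536
```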

Regards,
Markus

 
 
-Original message-
> From:cyndefromva 
> Sent: Thursday 26th July 2018 22:18
> To: solr-user@lucene.apache.org
> Subject: Recent configuration change to our site causes frequent index 
> corruption
> 
> I have a Rails 5 application that uses Solr to index and search our site. The
> sunspot gem is used to integrate Ruby and Solr.  It's a relatively small
> site (no more than 100,000 records) and has moderate usage (except for the
> googlebot).
> 
> Until recently we regularly received 503 errors; reloading the page
> generally cleared it up, but that was not exactly the user experience we
> wanted, so we added the following initializer to force a retry on failures:
> 
> Sunspot.session =
> Sunspot::SessionProxy::Retry5xxSessionProxy.new(Sunspot.session)
> 
> As a result, about every third day the site locks up until we rebuild the
> data directory (stop solr, move data directory to another location, start
> solr, reindex). 
> 
> At the point it starts failing I see a Java exception, "java.io.IOException:
> Too many open files", in the Solr log file, and a SolrException (Error opening
> new searcher) is returned to the user.
> 
> In the solrconfig.xml file we have autoCommit and autoSoftCommit set as
> follows:
> 
>   
>  ${solr.autoCommit.maxTime:15000}
>  false
>   
> 
>   
>  ${solr.autoSoftCommit.maxTime:-1}
>   
> 
> Which I believe means there should be a hard commit every 15 seconds.
> 
> But it appears to be calling commit more frequently. In the Solr log I see
> the following commits written milliseconds apart:
> 
>   UpdateHandler start
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> 
> I also see the following written right below it:
> 
> PERFORMANCE WARNING: Overlapping onDeckSearchers=2
> 
> Note: maxWarmingSearchers is set to 2.
> 
> 
> I would really appreciate any help I can get to resolve this issue.
> 
> Thank you!
> 
> 
> 
> 
> 
> 


Recent configuration change to our site causes frequent index corruption

2018-07-26 Thread cyndefromva
I have a Rails 5 application that uses Solr to index and search our site. The
sunspot gem is used to integrate Ruby and Solr.  It's a relatively small
site (no more than 100,000 records) and has moderate usage (except for the
googlebot).

Until recently we regularly received 503 errors; reloading the page
generally cleared it up, but that was not exactly the user experience we
wanted, so we added the following initializer to force a retry on failures:

Sunspot.session =
Sunspot::SessionProxy::Retry5xxSessionProxy.new(Sunspot.session)

As a result, about every third day the site locks up until we rebuild the
data directory (stop solr, move data directory to another location, start
solr, reindex). 

At the point it starts failing I see a Java exception, "java.io.IOException:
Too many open files", in the Solr log file, and a SolrException (Error opening
new searcher) is returned to the user.

In the solrconfig.xml file we have autoCommit and autoSoftCommit set as
follows:

  
 ${solr.autoCommit.maxTime:15000}
 false
  

  
 ${solr.autoSoftCommit.maxTime:-1}
  

Which I believe means there should be a hard commit every 15 seconds.

But it appears to be calling commit more frequently. In the Solr log I see
the following commits written milliseconds apart:

  UpdateHandler start
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

I also see the following written right below it:

PERFORMANCE WARNING: Overlapping onDeckSearchers=2

Note: maxWarmingSearchers is set to 2.


I would really appreciate any help I can get to resolve this issue.

Thank you!







Edismax and ShingleFilterFactory exponential term grow

2018-07-26 Thread Jokin C
Hi, I have a problem and I don't know if it's something that I am doing
wrong or if it's maybe a bug. I want to query a field with shingles, the
field and type definition are this:

<field name="text_shingles" type="text_shingles" indexed="true" stored="false"/>

<fieldType name="text_shingles" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="..."/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="..." maxShingleSize="3"/>
  </analyzer>
</fieldType>


I'm using Solr  7.2.1.

I just wanted to have different min and max shingle sizes to test how it
works, but if the query is long, Solr gives timeouts, high CPU and OOM.

the query I'm using is this:

http://localhost:8983/solr/ntnx/select?debugQuery=on&q={!edismax%20%20qf=%22text_shingles%22%20}%22%20word1%20word2%20word3%20word4%20word5%20word6%20word7

and the parsed query grows like this with just 4 words; when I use a query
with a lot of words it fails.

2 words:
"parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
+text_shingles:word2) text_shingles:word1 word2)))",

3words:
"parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
+text_shingles:word2 +text_shingles:word3) (+text_shingles:word1
+text_shingles:word2 word3) (+text_shingles:word1 word2
+text_shingles:word3) text_shingles:word1 word2 word3)))",

4 words:
"parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
+text_shingles:word2 +text_shingles:word3 +text_shingles:word4)
(+text_shingles:word1 +text_shingles:word2 +text_shingles:word3 word4)
(+text_shingles:word1 +text_shingles:word2 word3 +text_shingles:word4)
(+text_shingles:word1 +text_shingles:word2 word3 word4)
(+text_shingles:word1 word2 +text_shingles:word3 +text_shingles:word4)
(+text_shingles:word1 word2 +text_shingles:word3 word4)
(+text_shingles:word1 word2 word3 +text_shingles:word4",

5 words:
"parsedquery":"+DisjunctionMaxQuery+text_shingles:word1
+text_shingles:word2 +text_shingles:word3 +text_shingles:word4
+text_shingles:word5) (+text_shingles:word1 +text_shingles:word2
+text_shingles:word3 +text_shingles:word4 word5) (+text_shingles:word1
+text_shingles:word2 +text_shingles:word3 word4 +text_shingles:word5)
(+text_shingles:word1 +text_shingles:word2 +text_shingles:word3 word4
word5) (+text_shingles:word1 +text_shingles:word2 word3
+text_shingles:word4 +text_shingles:word5) (+text_shingles:word1
+text_shingles:word2 word3 +text_shingles:word4 word5)
(+text_shingles:word1 +text_shingles:word2 word3 word4
+text_shingles:word5) (+text_shingles:word1 word2 +text_shingles:word3
+text_shingles:word4 +text_shingles:word5) (+text_shingles:word1 word2
+text_shingles:word3 +text_shingles:word4 word5) (+text_shingles:word1
word2 +text_shingles:word3 word4 +text_shingles:word5)
(+text_shingles:word1 word2 +text_shingles:word3 word4 word5)
(+text_shingles:word1 word2 word3 +text_shingles:word4
+text_shingles:word5) (+text_shingles:word1 word2 word3
+text_shingles:word4 word5",


So, something bad is happening. Is it because I'm doing something wrong, or
is it maybe a bug that I should report on the issue tracker?


create collection from existing managed-schema

2018-07-26 Thread Chuming Chen
Hi All,

From the Solr Admin interface, I have created a collection and added field 
definitions. I can get its managed-schema from the Admin interface. 

Can I use this managed-schema to create a new collection? How?

Thanks,

Chuming




Configuring ZK Timeouts

2018-07-26 Thread solrnoobie
We are having problems with zk / solr node recovery and we are encountering
this issue:

 [   ] o.a.z.ClientCnxn Client session timed out, have not heard from server
in 5003ms

We have set the solr.xml zkClientTimeout to 30 secs.


What are we missing here?
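One thing worth checking (a sketch; the values are illustrative): the session
timeout is negotiated between client and server, so the ZooKeeper server's own
limits can silently cap whatever the Solr side asks for. Both sides need to
allow 30 seconds:

```
# solr.in.sh -- what Solr's start script passes as zkClientTimeout:
ZK_CLIENT_TIMEOUT=30000

# zoo.cfg -- the server clamps requested session timeouts to
# [minSessionTimeout, maxSessionTimeout]; the defaults are
# 2*tickTime and 20*tickTime (i.e. 40 s with tickTime=2000):
tickTime=2000
maxSessionTimeout=60000
```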





Re: how to index GEO JSON

2018-07-26 Thread SolrUser1543
Of course I saw this reference.

But it is not clearly explained how exactly the GeoJSON should look.

Where do I put an item id?

A regular JSON index request looks like:
{
"id":"111",
"geo_srpt": 
}

I tried to put GeoJSON as the geo_srpt value but it does not work.

So far I have managed to index some WKT (not all types), but no GeoJSON at
all.

Any help with a detailed example will be appreciated.
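For what it's worth, a sketch of a setup that supports GeoJSON (the field type
name and tuning values are assumptions; format="GeoJSON" on an RPT field type
requires the JTS jar on Solr's classpath):

```xml
<!-- managed-schema: an RPT field type that parses GeoJSON values -->
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
           spatialContextFactory="JTS" format="GeoJSON"
           geo="true" distErrPct="0.025" maxDistErr="0.001"
           distanceUnits="kilometers"/>
```

The document then carries the geometry as an escaped JSON string, e.g.
{"id":"111","geo_srpt":"{\"type\":\"Point\",\"coordinates\":[10.0,20.0]}"} —
the item id stays a normal top-level field; only the geometry goes inside the
GeoJSON string.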


