FW: NRTCachingDirectory threads stuck

2015-02-21 Thread Moshe Recanati
Hi,
I saw the message was rejected because of the attachment.
I uploaded the data to Google Drive:
https://drive.google.com/file/d/0B0GR0M-lL5QHVDNjZlUwVTR2QTQ/view?usp=sharing

Moshe

From: Moshe Recanati [mailto:mos...@kmslh.com]
Sent: Sunday, February 22, 2015 8:37 AM
To: solr-user@lucene.apache.org
Subject: RE: NRTCachingDirectory threads stuck

From: Moshe Recanati
Sent: Sunday, February 22, 2015 8:34 AM
To: solr-user@lucene.apache.org
Subject: NRTCachingDirectory threads stuck

Hi,
We're running two Solr servers on the same machine.
One is Solr 4.0 and the second is Solr 4.7.1.
In Solr 4.7.1 we see very strange behavior: while indexing documents we get a
spike of memory from 1 GB to 4 GB in a couple of minutes and a huge number of
threads stuck on the
NRTCachingDirectory.openInput method.

Thread dump and GC log attached.

Are you familiar with this behavior? What can be the trigger for this?

Thank you,


Regards,
Moshe Recanati
SVP Engineering
Office + 972-73-2617564
Mobile  + 972-52-6194481
Skype:  recanati
More at:  www.kmslh.com | LinkedIn | FB




RE: Suspicious message with attachment

2015-02-21 Thread Moshe Recanati
Please proceed


Regards,
Moshe Recanati
SVP Engineering
Office + 972-73-2617564
Mobile  + 972-52-6194481
Skype:  recanati

More at:  www.kmslh.com | LinkedIn | FB


-Original Message-
From: postmas...@ssww.com [mailto:postmas...@ssww.com] On Behalf Of 
h...@ssww.com
Sent: Sunday, February 22, 2015 8:39 AM
To: solr-user@lucene.apache.org
Subject: Suspicious message with attachment

The following message addressed to you was quarantined because it likely 
contains a virus:

Subject: RE: NRTCachingDirectory threads stuck
From: Moshe Recanati 

However, if you know the sender and are expecting an attachment, please reply 
to this message, and we will forward the quarantined message to you.


Suspicious message with attachment

2015-02-21 Thread help
The following message addressed to you was quarantined because it likely 
contains a virus:

Subject: RE: NRTCachingDirectory threads stuck
From: Moshe Recanati 

However, if you know the sender and are expecting an attachment, please reply 
to this message, and we will forward the quarantined message to you.


Re: Performing DIH on predefined list of IDS

2015-02-21 Thread Shawn Heisey
On 2/21/2015 6:33 PM, Walter Underwood wrote:
> Never do POST for a read-only request. Never. That only guarantees that you 
> cannot reproduce the problem by looking at the logs.
> 
> If your design requires extremely long GET requests, you may need to re-think 
> your design.

I agree with those sentiments ... but those who consume the services we
provide tend to push the envelope well beyond any reasonable limits.

My Solr install deals with some Solr queries where the GET request is
pushing 2 characters.  The queries and filters constructed by the
website code for some of the more powerful users are really large.  I
had to configure haproxy and jetty to allow HTTP headers up to 32K.  I'd
like to tell development that we just can't handle it, but with the way
the system is currently structured, there's no other way to get the
results they need.
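For reference, the header-size bump described above can be sketched for the Jetty side; this is only a sketch using Jetty 8-era jetty.xml syntax (as shipped with Solr 4.x), and the exact connector class depends on your Jetty version:

```xml
<!-- jetty.xml sketch: raise the request header limit on the connector to 32K
     so very long GET query strings are not rejected with 413/414 errors -->
<Call name="addConnector">
  <Arg>
    <New class="org.eclipse.jetty.server.nio.SelectChannelConnector">
      <Set name="port">8983</Set>
      <Set name="requestHeaderSize">32768</Set>
    </New>
  </Arg>
</Call>
```

On the haproxy side the corresponding knob is the global `tune.bufsize` directive, which must also be large enough to hold the full request headers.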

If I were to make it widely known internally that the Solr config is
currently allowing POST requests up to 32 megabytes, I am really scared
to find out what sort of queries development would try to do.  I raised
that particular configuration limit (which defaults to 2MB) for my own
purposes, not for the development group.

Thanks,
Shawn



RE: Solr synonyms logic

2015-02-21 Thread steve
SEO is a fun search subject!
http://www.academia.edu/1033371/Hyponymy_extraction_and_web_search_behavior_analysis_based_on_query_reformulation
planeta terra (planet earth), planeta (planet). Conclusion: Planet earth is a
hyponym of planet.

> Date: Sat, 21 Feb 2015 08:12:33 -0800
> Subject: Re: Solr synonyms logic
> From: rjo...@gmail.com
> To: solr-user@lucene.apache.org
> 
> What you are describing is hyponymy.  Pastry is the hypernym.  You can
> accomplish this by not using expansion, for example:
> cannelloni => cannelloni, pastry
> 
> This has the result of adding pastry to the index.
> 
> Ryan
> 
> On Saturday, February 21, 2015, Mikhail Khludnev 
> wrote:
> 
> > Hello,
> >
> > usually debugQuery=true output explains a lot of such details.
> >
> > On Sat, Feb 21, 2015 at 10:52 AM, davym >
> > wrote:
> >
> > > Hi all,
> > >
> > > I'm querying a recipe database in Solr. By using synonyms, I'm trying to
> > > make my search a little smarter.
> > >
> > > What I'm trying to do here, is that a search for pastry returns all
> > > lasagne,
> > > penne & cannelloni recipes.
> > > However a search for lasagne should only return lasagne recipes.
> > >
> > > In my synonyms.txt, I have these lines:
> > > -
> > > lasagne,pastry
> > > penne,pastry
> > > cannelloni,pastry
> > > -
> > >
> > > Filter in my schema.xml looks like this:
> > > <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> > > ignoreCase="true" expand="true"
> > > tokenizerFactory="solr.WhitespaceTokenizerFactory" />
> > > Only in the index analyzer, not in the query.
> > >
> > > When using the Solr analysis tool, I can see that my index for lasagne
> > has
> > > a
> > > synonym pastry and my query only queries lasagne. Same for penne and
> > > cannelloni, they both have the synonym pastry.
> > >
> > > Currently my Solr query for lasagne also returns all penne and cannelloni
> > > recipes. I cannot understand why this is the case.
> > >
> > > Can someone explain this behaviour to me please?
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > --
> > > View this message in context:
> > > http://lucene.472066.n3.nabble.com/Solr-synonyms-logic-tp4187827.html
> > > Sent from the Solr - User mailing list archive at Nabble.com.
> > >
> >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > Principal Engineer,
> > Grid Dynamics
> >
> > 
> > >
> 
  

Re: Performing DIH on predefined list of IDS

2015-02-21 Thread Walter Underwood
Am I an expert? Not sure, but I worked on an enterprise search spider and search 
engine for about a decade (Ultraseek Server) and I’ve done customer-facing 
search for another 6+ years.

Let the server reject URLs it cannot handle. Great servers will return a 414, 
good servers will return a 400, broken servers will return a 500, and crapulous 
servers will hang. In nearly all cases, you’ll get a fast fail which won’t hurt 
other users of the site.

Manage your site for zero errors, so you can fix the queries that are too long.

At Chegg, we have people paste entire homework problems into the search for 
homework solutions, and, yes, we have a few queries longer than 8K. But we deal 
with it gracefully.

Never do POST for a read-only request. Never. That only guarantees that you 
cannot reproduce the problem by looking at the logs.

If your design requires extremely long GET requests, you may need to re-think 
your design.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

On Feb 21, 2015, at 4:45 PM, Shawn Heisey  wrote:

> On 2/21/2015 1:46 AM, steve wrote:
>> Careful with the GETs! There is a real, hard limit on the length of a GET 
>> url (in the low hundreds of characters). That's why a POST is so much better 
>> for complex queries; the limit is in the hundreds of MegaBytes.
> 
> The limit on a GET command (including the GET itself and the protocol
> specifier (usually HTTP/1.1) is normally 8K, or 8192 bytes.  That's the
> default value in Jetty, at least.
> 
> A question for the experts:  Would it be a good idea to force a POST
> request in SolrEntityProcessor?  It may be dealing with parameters that
> have been sent via POST and may exceed the header size limit.
> 
> Thanks,
> Shawn
> 



RE: edismax removes query string: (pg_int:-1) becomes ()

2015-02-21 Thread Tang, Rebecca
Thank you!  I tried (pg_int:"-1") and (pg_int:\-1) and they both worked (I got 
118 results back as expected).

The field pg_int is defined as follows:






Rebecca

From: Jack Krupansky [jack.krupan...@gmail.com]
Sent: Saturday, February 21, 2015 3:04 PM
To: solr-user@lucene.apache.org
Subject: Re: edismax removes query string: (pg_int:-1) becomes ()

I would classify this behavior as a bug, even if we can explain it somehow
- it is certainly not intuitively expected.

As a workaround, try placing the -1 in quotes: (pg_int:"-1"). Or escape the
minus with a backslash: (pg_int:\-1)

Also, what is the field and field type for pg_int?

The edismax query parser has a few too many parsing heuristics, causing way
too many odd combinations that are not exhaustively tested.


-- Jack Krupansky

On Sat, Feb 21, 2015 at 5:43 PM, Tang, Rebecca 
wrote:

> Hi there,
>
> I have a field pg_int which is number of pages stored as integer.  There
> are 118 records in my index with pg_int = -1.
>
> If I search the index with pg_int:-1, I get the correct records returned
> in the results.
> { "responseHeader": { "status": 0, "QTime": 1, "params": { "debugQuery":
> "true", "indent": "true", "q": "pg_int:-1\n", "_": "1424558304272", "wt":
> "json", "rows": "0" } }, "response": { "numFound": 118, "start": 0, "docs":
> [] }, "moreLikeThis": {}, "highlighting": {}, "debug": { "moreLikeThis":
> {}, "rawquerystring": "pg_int:-1\n", "querystring": "pg_int:-1\n",
> "parsedquery": "(+pg_int:-1)/no_coord", "parsedquery_toString":
> "+pg_int:`\u0007", "explain": {}, "QParser": "ExtendedDismaxQParser",
>
> But if I put parens around the query and send it over to solr as
> (pg_int:-1), then the query string gets completely removed by the edismax
> parser:
> { "responseHeader": { "status": 0, "QTime": 1, "params": { "debugQuery":
> "true", "indent": "true", "q": "(pg_int:-1)\n", "_": "1424558355671", "wt":
> "json", "rows": "0" } }, "response": { "numFound": 0, "start": 0, "docs":
> [] }, "moreLikeThis": {}, "highlighting": {}, "debug": { "moreLikeThis":
> {}, "rawquerystring": "(pg_int:-1)\n", "querystring": "(pg_int:-1)\n",
> "parsedquery": "(+())/no_coord", < query string is removed
> "parsedquery_toString": "+()", "explain": {}, "QParser":
> "ExtendedDismaxQParser",
>
>
> I don't understand what could be causing this.  It doesn't happen to
> positive integers.  Both pg_int:1 and (pg_int:1) work fine.
>
> Has anyone run into this issue?  How do I get around it?
>
> Thanks,
> Rebecca
>


Re: Performing DIH on predefined list of IDS

2015-02-21 Thread Shawn Heisey
On 2/21/2015 1:46 AM, steve wrote:
> Careful with the GETs! There is a real, hard limit on the length of a GET url 
> (in the low hundreds of characters). That's why a POST is so much better for 
> complex queries; the limit is in the hundreds of MegaBytes.

The limit on a GET command (including the GET itself and the protocol
specifier (usually HTTP/1.1) is normally 8K, or 8192 bytes.  That's the
default value in Jetty, at least.

A question for the experts:  Would it be a good idea to force a POST
request in SolrEntityProcessor?  It may be dealing with parameters that
have been sent via POST and may exceed the header size limit.

Thanks,
Shawn



Re: edismax removes query string: (pg_int:-1) becomes ()

2015-02-21 Thread Jack Krupansky
I would classify this behavior as a bug, even if we can explain it somehow
- it is certainly not intuitively expected.

As a workaround, try placing the -1 in quotes: (pg_int:"-1"). Or escape the
minus with a backslash: (pg_int:\-1)
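For illustration, here is a minimal sketch of building the quoted-value workaround as a properly URL-encoded query string (the collection path and extra parameters are hypothetical; only the encoding of q matters):

```python
# Sketch: URL-encode the quoted-value workaround for (pg_int:-1).
from urllib.parse import urlencode

params = {
    "q": '(pg_int:"-1")',   # quoting the -1 keeps edismax from dropping the clause
    "defType": "edismax",
    "wt": "json",
}
query_string = urlencode(params)

# The encoded string can then be appended to /solr/<collection>/select?
print(query_string)
```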

Also, what is the field and field type for pg_int?

The edismax query parser has a few too many parsing heuristics, causing way
too many odd combinations that are not exhaustively tested.


-- Jack Krupansky

On Sat, Feb 21, 2015 at 5:43 PM, Tang, Rebecca 
wrote:

> Hi there,
>
> I have a field pg_int which is number of pages stored as integer.  There
> are 118 records in my index with pg_int = -1.
>
> If I search the index with pg_int:-1, I get the correct records returned
> in the results.
> { "responseHeader": { "status": 0, "QTime": 1, "params": { "debugQuery":
> "true", "indent": "true", "q": "pg_int:-1\n", "_": "1424558304272", "wt":
> "json", "rows": "0" } }, "response": { "numFound": 118, "start": 0, "docs":
> [] }, "moreLikeThis": {}, "highlighting": {}, "debug": { "moreLikeThis":
> {}, "rawquerystring": "pg_int:-1\n", "querystring": "pg_int:-1\n",
> "parsedquery": "(+pg_int:-1)/no_coord", "parsedquery_toString":
> "+pg_int:`\u0007", "explain": {}, "QParser": "ExtendedDismaxQParser",
>
> But if I put parens around the query and send it over to solr as
> (pg_int:-1), then the query string gets completely removed by the edismax
> parser:
> { "responseHeader": { "status": 0, "QTime": 1, "params": { "debugQuery":
> "true", "indent": "true", "q": "(pg_int:-1)\n", "_": "1424558355671", "wt":
> "json", "rows": "0" } }, "response": { "numFound": 0, "start": 0, "docs":
> [] }, "moreLikeThis": {}, "highlighting": {}, "debug": { "moreLikeThis":
> {}, "rawquerystring": "(pg_int:-1)\n", "querystring": "(pg_int:-1)\n",
> "parsedquery": "(+())/no_coord", < query string is removed
> "parsedquery_toString": "+()", "explain": {}, "QParser":
> "ExtendedDismaxQParser",
>
>
> I don't understand what could be causing this.  It doesn't happen to
> positive integers.  Both pg_int:1 and (pg_int:1) work fine.
>
> Has anyone run into this issue?  How do I get around it?
>
> Thanks,
> Rebecca
>


edismax removes query string: (pg_int:-1) becomes ()

2015-02-21 Thread Tang, Rebecca
Hi there,

I have a field pg_int which is number of pages stored as integer.  There are 
118 records in my index with pg_int = -1.

If I search the index with pg_int:-1, I get the correct records returned in the 
results.
{ "responseHeader": { "status": 0, "QTime": 1, "params": { "debugQuery": 
"true", "indent": "true", "q": "pg_int:-1\n", "_": "1424558304272", "wt": 
"json", "rows": "0" } }, "response": { "numFound": 118, "start": 0, "docs": [] 
}, "moreLikeThis": {}, "highlighting": {}, "debug": { "moreLikeThis": {}, 
"rawquerystring": "pg_int:-1\n", "querystring": "pg_int:-1\n", "parsedquery": 
"(+pg_int:-1)/no_coord", "parsedquery_toString": "+pg_int:`\u0007", 
"explain": {}, "QParser": "ExtendedDismaxQParser",

But if I put parens around the query and send it over to solr as (pg_int:-1), 
then the query string gets completely removed by the edismax parser:
{ "responseHeader": { "status": 0, "QTime": 1, "params": { "debugQuery": 
"true", "indent": "true", "q": "(pg_int:-1)\n", "_": "1424558355671", "wt": 
"json", "rows": "0" } }, "response": { "numFound": 0, "start": 0, "docs": [] }, 
"moreLikeThis": {}, "highlighting": {}, "debug": { "moreLikeThis": {}, 
"rawquerystring": "(pg_int:-1)\n", "querystring": "(pg_int:-1)\n", 
"parsedquery": "(+())/no_coord", < query string is removed 
"parsedquery_toString": "+()", "explain": {}, "QParser": 
"ExtendedDismaxQParser",


I don't understand what could be causing this.  It doesn't happen to positive 
integers.  Both pg_int:1 and (pg_int:1) work fine.

Has anyone run into this issue?  How do I get around it?

Thanks,
Rebecca


RE: Performing DIH on predefined list of IDS

2015-02-21 Thread steve
Thank you! Another 4xx error that makes sense. Quoting from the Book of
StackOverflow:
http://stackoverflow.com/questions/2659952/maximum-length-of-http-get-request

"Most webservers have a limit of 8192 bytes (8KB), which is usually configureable
somewhere in the server configuration. As to the client side matter, the HTTP
1.1 specification even warns about this, here's an extract of chapter 3.2.1:

Note: Servers ought to be cautious about depending on URI lengths above
255 bytes, because some older client or proxy implementations might not
properly support these lengths.

The limit is in MSIE and Safari about 2KB, in Opera about 4KB and in Firefox
about 8KB. We may thus assume that 8KB is the maximum possible length and that
2KB is a more affordable length to rely on at the server side and that 255 bytes
is the safest length to assume that the entire URL will come in. If the limit is
exceeded in either the browser or the server, most will just truncate the
characters outside the limit without any warning. Some servers however may send
a HTTP 414 error. If you need to send large data, then better use POST instead
of GET. Its limit is much higher, but more dependent on the server used than the
client. Usually up to around 2GB is allowed by the average webserver. This is
also configureable somewhere in the server settings. The average server will
display a server-specific error/exception when the POST limit is exceeded,
usually as HTTP 500 error."
> From: wun...@wunderwood.org
> Subject: Re: Performing DIH on predefined list of IDS
> Date: Sat, 21 Feb 2015 09:50:46 -0800
> To: solr-user@lucene.apache.org
> 
> The HTTP protocol does not set a limit on GET URL size, but individual web 
> servers usually do. You should get a response code of “414 Request-URI Too 
> Long” when the URL is too long.
> 
> This limit is usually configurable.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
> 
> On Feb 21, 2015, at 12:46 AM, steve  wrote:
> 
> > Careful with the GETs! There is a real, hard limit on the length of a GET 
> > url (in the low hundreds of characters). That's why a POST is so much 
> > better for complex queries; the limit is in the hundreds of MegaBytes.
> > 
> >> Date: Sat, 21 Feb 2015 01:42:03 -0700
> >> From: osta...@gmail.com
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Performing DIH on predefined list of IDS
> >> 
> >> Yes, you're right, I am not using a DB.
> >> SolrEntityProcessor uses a GET method, so I will need to send a
> >> relatively big URL (something like hundreds of IDs); I hope it will be
> >> possible.
> >> 
> >> Anyway, I think it is the only method to perform a reindex if I want to
> >> control it and be able to continue from any point in case of failure.
> >> 
> >> 
> >> 
> >> --
> >> View this message in context: 
> >> http://lucene.472066.n3.nabble.com/Performing-DIH-on-predefined-list-of-IDS-tp4187589p4187835.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >   
> 
  

Re: Performing DIH on predefined list of IDS

2015-02-21 Thread Walter Underwood
The HTTP protocol does not set a limit on GET URL size, but individual web 
servers usually do. You should get a response code of “414 Request-URI Too 
Long” when the URL is too long.

This limit is usually configurable.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


On Feb 21, 2015, at 12:46 AM, steve  wrote:

> Careful with the GETs! There is a real, hard limit on the length of a GET url 
> (in the low hundreds of characters). That's why a POST is so much better for 
> complex queries; the limit is in the hundreds of MegaBytes.
> 
>> Date: Sat, 21 Feb 2015 01:42:03 -0700
>> From: osta...@gmail.com
>> To: solr-user@lucene.apache.org
>> Subject: Re: Performing DIH on predefined list of IDS
>> 
>> Yes, you're right, I am not using a DB.
>> SolrEntityProcessor uses a GET method, so I will need to send a
>> relatively big URL (something like hundreds of IDs); I hope it will be
>> possible.
>> 
>> Anyway, I think it is the only method to perform a reindex if I want to
>> control it and be able to continue from any point in case of failure.
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Performing-DIH-on-predefined-list-of-IDS-tp4187589p4187835.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 



Re: Clarification of locktype=single and implications of use

2015-02-21 Thread Erick Erickson
Tom:

I updated the CWiki a bit based on this conversation; does that do it?
Trying to balance between tl;dr and not enough info...

Erick

On Fri, Feb 20, 2015 at 3:11 PM, Tom Burton-West  wrote:
> Thanks Hoss,
>
> Protection from misconfiguration and/or starting separate solr instances
> pointing to the same index dir I can understand.
>
> The current documentation on the wiki and in the ref guide (along with just
> enough understanding of Solr/Lucene indexing to be dangerous)  left me
> wondering if maybe somehow a correctly configured Solr might have multiple
> processes writing to the same file.
> I'm wondering if your explanation above  might be added to the
> documentation.
>
> Tom
>
> On Fri, Feb 20, 2015 at 1:25 PM, Chris Hostetter 
> wrote:
>
>>
>> : We are using Solr.  We would not configure two different Solr instances
>> to
>> : write to the same index.  So why would a "normal" Solr set-up possibly
>> end
>> : up having more than one process writing to the same index?
>>
>> The risk here is that if you configure lockType=single, and then have some
>> unintended user error such that two distinct java processes both attempt
>> to use the same index dir, the lockType will not protect you in that
>> situation.
>>
>> For example: you normally run Solr on port 8983, but someone accidentally
>> starts a second instance of Solr on port 7574 using the exact same configs
>> with the exact same index dir -- lockType=single won't help you spot this
>> error.  lockType=native will (assuming your filesystem can handle it).
>>
>> lockType=single should protect you however if, for example, multiple
>> SolrCores w/in the same Solr java process attempted to refer to the same
>> index dir because you accidentally put an absolute path in a solrconfig.xml
>> that gets shared by multiple cores.
>>
>>
>> -Hoss
>> http://www.lucidworks.com/
>>


Re: Solr synonyms logic

2015-02-21 Thread Ryan Josal
What you are describing is hyponymy.  Pastry is the hypernym.  You can
accomplish this by not using expansion, for example:
cannelloni => cannelloni, pastry

This has the result of adding pastry to the index.
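Applied to the recipe case in this thread, the full non-expanding synonyms.txt would read:

```
lasagne => lasagne, pastry
penne => penne, pastry
cannelloni => cannelloni, pastry
```

With "=>" mappings applied at index time only, a search for pastry matches all three, while a search for lasagne matches only lasagne.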

Ryan

On Saturday, February 21, 2015, Mikhail Khludnev 
wrote:

> Hello,
>
> usually debugQuery=true output explains a lot of such details.
>
> On Sat, Feb 21, 2015 at 10:52 AM, davym >
> wrote:
>
> > Hi all,
> >
> > I'm querying a recipe database in Solr. By using synonyms, I'm trying to
> > make my search a little smarter.
> >
> > What I'm trying to do here, is that a search for pastry returns all
> > lasagne,
> > penne & cannelloni recipes.
> > However a search for lasagne should only return lasagne recipes.
> >
> > In my synonyms.txt, I have these lines:
> > -
> > lasagne,pastry
> > penne,pastry
> > cannelloni,pastry
> > -
> >
> > Filter in my schema.xml looks like this:
> > <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> > ignoreCase="true" expand="true"
> > tokenizerFactory="solr.WhitespaceTokenizerFactory" />
> > Only in the index analyzer, not in the query.
> >
> > When using the Solr analysis tool, I can see that my index for lasagne
> has
> > a
> > synonym pastry and my query only queries lasagne. Same for penne and
> > cannelloni, they both have the synonym pastry.
> >
> > Currently my Solr query for lasagne also returns all penne and cannelloni
> > recipes. I cannot understand why this is the case.
> >
> > Can someone explain this behaviour to me please?
> >
> >
> >
> >
> >
> >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Solr-synonyms-logic-tp4187827.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> 
> >
>


Re: Solr synonyms logic

2015-02-21 Thread Mikhail Khludnev
Hello,

usually debugQuery=true output explains a lot of such details.

On Sat, Feb 21, 2015 at 10:52 AM, davym  wrote:

> Hi all,
>
> I'm querying a recipe database in Solr. By using synonyms, I'm trying to
> make my search a little smarter.
>
> What I'm trying to do here, is that a search for pastry returns all
> lasagne,
> penne & cannelloni recipes.
> However a search for lasagne should only return lasagne recipes.
>
> In my synonyms.txt, I have these lines:
> -
> lasagne,pastry
> penne,pastry
> cannelloni,pastry
> -
>
> Filter in my schema.xml looks like this:
> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"
> tokenizerFactory="solr.WhitespaceTokenizerFactory" />
> Only in the index analyzer, not in the query.
>
> When using the Solr analysis tool, I can see that my index for lasagne has
> a
> synonym pastry and my query only queries lasagne. Same for penne and
> cannelloni, they both have the synonym pastry.
>
> Currently my Solr query for lasagne also returns all penne and cannelloni
> recipes. I cannot understand why this is the case.
>
> Can someone explain this behaviour to me please?
>
>
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-synonyms-logic-tp4187827.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





RE: Performing DIH on predefined list of IDS

2015-02-21 Thread steve
And I'm familiar with the setup and configuration using Python, JavaScript, and 
PHP; not at all with Java.

> Date: Sat, 21 Feb 2015 01:52:07 -0700
> From: osta...@gmail.com
> To: solr-user@lucene.apache.org
> Subject: RE: Performing DIH on predefined list of IDS
> 
> That's right, but I am not sure that, if it works with GET, I will be able to
> use POST without changing it.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Performing-DIH-on-predefined-list-of-IDS-tp4187589p4187838.html
> Sent from the Solr - User mailing list archive at Nabble.com.
  

RE: Performing DIH on predefined list of IDS

2015-02-21 Thread SolrUser1543
That's right, but I am not sure that, if it works with GET, I will be able to
use POST without changing it.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Performing-DIH-on-predefined-list-of-IDS-tp4187589p4187838.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Performing DIH on predefined list of IDS

2015-02-21 Thread steve
Careful with the GETs! There is a real, hard limit on the length of a GET url 
(in the low hundreds of characters). That's why a POST is so much better for 
complex queries; the limit is in the hundreds of MegaBytes.

> Date: Sat, 21 Feb 2015 01:42:03 -0700
> From: osta...@gmail.com
> To: solr-user@lucene.apache.org
> Subject: Re: Performing DIH on predefined list of IDS
> 
> Yes, you're right, I am not using a DB.
> SolrEntityProcessor uses a GET method, so I will need to send a
> relatively big URL (something like hundreds of IDs); I hope it will be
> possible.
> 
> Anyway, I think it is the only method to perform a reindex if I want to
> control it and be able to continue from any point in case of failure.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Performing-DIH-on-predefined-list-of-IDS-tp4187589p4187835.html
> Sent from the Solr - User mailing list archive at Nabble.com.
  

Re: Performing DIH on predefined list of IDS

2015-02-21 Thread SolrUser1543
Yes, you're right, I am not using a DB.
SolrEntityProcessor uses a GET method, so I will need to send a
relatively big URL (something like hundreds of IDs); I hope it will be
possible.

Anyway, I think it is the only method to perform a reindex if I want to
control it and be able to continue from any point in case of failure.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Performing-DIH-on-predefined-list-of-IDS-tp4187589p4187835.html
Sent from the Solr - User mailing list archive at Nabble.com.
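The ID-batch reindex discussed in this thread might be sketched as a DIH config along these lines (the source URL, core name, and ID list are placeholders, not taken from the thread):

```xml
<!-- data-config.xml sketch: SolrEntityProcessor pulls documents from another
     Solr instance; each import run can target an explicit batch of IDs,
     making it possible to resume from any batch after a failure -->
<dataConfig>
  <document>
    <entity name="reindexBatch"
            processor="SolrEntityProcessor"
            url="http://localhost:8983/solr/source_core"
            query="id:(101 OR 102 OR 103)"
            rows="100"
            fl="*"/>
  </document>
</dataConfig>
```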