Re: [Migration Solr4 to Solr5] Collection reload error

2016-03-10 Thread Dmitry Kan
Thanks Shawn, I missed the openSearcher=false setting. So another thing to check really is whether there are ever concurrent commitWithin calls to the same shard. On 10 March 2016 at 4:39 PM, "Shawn Heisey" wrote: > On 3/10/2016 3:05 AM, Dmitry Kan wrote: > > The

timeAllowed

2016-03-10 Thread Anil
Hi, is timeAllowed the maximum threshold for QTime, or for the overall time? Please clarify. Thanks, Anil

Re: Multiple custom Similarity implementations

2016-03-10 Thread Parvesh Garg
Hi Ahmet, Thanks for the pointer. I have similar thoughts on the subject. The risk assumptions are based on not testing your stuff before taking it in. That risk is still valid with similarity configuration. And sometimes, it may not be possible to use multiple similarities (custom or otherwise).

Using group.ngroups during query search

2016-03-10 Thread Zheng Lin Edwin Yeo
Hi, I would like to check: will using result grouping with group.ngroups (which includes the number of groups that have matched the query) in the search affect the performance of Solr? I found that the searching speed has slowed down quite significantly after I added in the

Re: Load pre-built index to Solr

2016-03-10 Thread Erick Erickson
bq: is there a better way to load a pre-built index quickly like before In a word (well, two) "Collection Aliasing". You have two collections and an alias. So your search URL stays constant, say 'aliasedcollection'. Then you index to collectionA and point aliasedcollection to it. That's your live
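A minimal sketch of the alias switch described above, assuming the Collections API CREATEALIAS action is used to repoint the alias. The host and the name collectionB are hypothetical; only the request URLs are built here, nothing is sent to Solr.

```python
from urllib.parse import urlencode


def create_alias_url(solr_base, alias, collection):
    """Build a Collections API CREATEALIAS URL that points `alias` at `collection`."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": collection,
        "wt": "json",
    }
    return f"{solr_base}/admin/collections?{urlencode(params)}"


# Hypothetical host; searches always go to the alias, never to a collection directly.
base = "http://localhost:8983/solr"

# Index into collectionA, then make it live.
go_live_a = create_alias_url(base, "aliasedcollection", "collectionA")

# Later, index into collectionB offline and repoint the same alias at it.
go_live_b = create_alias_url(base, "aliasedcollection", "collectionB")

print(go_live_a)
print(go_live_b)
```

Issuing CREATEALIAS again with an existing alias name replaces its target, so the search URL stays constant across the swap.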

Re: Timeout error during commit

2016-03-10 Thread Erick Erickson
Probably more than you want to know about commits, hard and soft: https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ Best, Erick On Thu, Mar 10, 2016 at 3:40 PM, Shawn Heisey wrote: > On 3/10/2016 4:06 PM, Steven White

Re: Timeout error during commit

2016-03-10 Thread Shawn Heisey
On 3/10/2016 4:06 PM, Steven White wrote: > Last question on this topic (maybe), wouldn't a commit at the very end take > too long on 1 billion items? Wouldn't a commit every, let's say, 10,000 > items be more efficient? The behavior that I have witnessed suggests that commit speed on a

Solr Stats or Analytic Query Help

2016-03-10 Thread Gopal Patwa
I am trying to write a query to get the stats below for order data: find the total order count, sum(Quantity) and sum(Cost) for a specified date range with a gap of 1 day. For example, if the date range is 10 days, then get these results for each of the 10 days. Solr Version : 5.2.1 Example Order Solr Doc

Re: Solr on AIX

2016-03-10 Thread Shawn Heisey
On 3/10/2016 9:07 AM, Stephane Bouchard wrote: > Hi, the company where I work is planning to migrate to AIX. Has anyone had > any issues running Solr 5 on AIX? Solr's start scripts were designed for the OS tools found on recent versions of free operating systems like Linux, FreeBSD, etc. When

Re: Timeout error during commit

2016-03-10 Thread Steven White
Got it. Last question on this topic (maybe), wouldn't a commit at the very end take too long on 1 billion items? Wouldn't a commit every, let's say, 10,000 items be more efficient? Steve On Thu, Mar 10, 2016 at 5:44 PM, Shawn Heisey wrote: > On 3/10/2016 3:29 PM, Steven

Load pre-built index to Solr

2016-03-10 Thread praneethvarma
I'm building an index on HDFS using the MapReduceIndexerTool which I'd later like to load into my Solr cores with minimal delay. With Solr 4.4, I was able to switch out the underlying index directory of a core (I don't need to keep any of the existing index) and reload the core, and it worked

Re: Timeout error during commit

2016-03-10 Thread Shawn Heisey
On 3/10/2016 3:29 PM, Steven White wrote: > Thank you for your insights, Shawn; they are always valuable. > > Question, if I wait until the very end to issue a commit, wouldn't that mean I > could lose everything if there was an OOM or some other server issue? I > don't have any commit setting set in

Foot, Inch: Stripping Out Special Characters: DisMax: WhitespaceTokenizer vs. Keyword Tokenizer

2016-03-10 Thread Fuad Efendi
Hello, I finally got it to work: search for 5’ 3” (5 feet 3 inches). It is strange to me that if I use WhitespaceTokenizer for the field's query-type analyzer, it receives only 5 and 3 with the special characters removed. It is also strange that EDisMax does not strip out an odd number of quotes.

Re: Timeout error during commit

2016-03-10 Thread Steven White
Thank you for your insights, Shawn; they are always valuable. Question, if I wait until the very end to issue a commit, wouldn't that mean I could lose everything if there was an OOM or some other server issue? I don't have any commit setting set in my solrconfig.xml. Steve On Wed, Mar 9, 2016 at

Re: NoSuchFileException errors common on version 5.5.0

2016-03-10 Thread Shawn Heisey
On 3/10/2016 12:18 PM, Shawn Heisey wrote: > I pulled down branch_5_5 and installed a 5.5.1 snapshot. Had to edit > lucene/version.properties to get it to be 5.5.1. I also had to edit the > SolrIdentifierValidator class to allow hyphens, since I have them in > some of my core names. The

Re: Query behavior.

2016-03-10 Thread Jack Krupansky
We probably need a Jira to investigate whether this really is an explicitly intentional feature change, or whether it really is a bug. And if it truly was intentional, how people can work around the change to get the desired, pre-5.5 behavior. Personally, I always thought it was a mistake that

Re: ngrams with position

2016-03-10 Thread Jack Krupansky
I suspect that what you really want is analogous to PF2/PF3, but based on the ngram terms that come out of query token analysis rather than using pairs/triples of source terms before analysis that are then analyzed as phrases so that all of the ngrams for a PF2/PF3 phrase must be in order rather

Re: Query result cache not getting inserted for query lasting > 5secs

2016-03-10 Thread Erick Erickson
Let's see the query. If you do anything with dates using NOW (without rounding), the queries are actually not the same since NOW resolves itself to the epoch and will change every millisecond. Best, Erick On Thu, Mar 10, 2016 at 12:18 AM, Murali TV wrote: > Hi, > > I have a
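To illustrate the point about rounding, here is a small sketch that builds two date-range filters with Solr date math: one on bare NOW, one rounded to the day. The field name timestamp and the helper names are hypothetical, and the actual query from the thread is not known.

```python
def unrounded_filter(field, days):
    """Range filter on bare NOW: the resolved bounds differ on every request."""
    return f"{field}:[NOW-{days}DAYS TO NOW]"


def day_rounded_filter(field, days):
    """Range filter rounded with NOW/DAY: the resolved bounds only change once a day."""
    return f"{field}:[NOW/DAY-{days}DAYS TO NOW/DAY+1DAY]"


# Hypothetical field name "timestamp".
fq_unrounded = unrounded_filter("timestamp", 7)
fq_rounded = day_rounded_filter("timestamp", 7)

print(fq_unrounded)
print(fq_rounded)
```

With the rounded form, repeated requests within the same day resolve to identical bounds, so they can hit the same cache entry.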

Re: Clarification on +, and in edismax parser

2016-03-10 Thread Erick Erickson
Here's a _very_ useful explanation of why the query syntax isn't pure Boolean: https://lucidworks.com/blog/2011/12/28/why-not-and-or-and-not/ Best, Erick On Thu, Mar 10, 2016 at 12:30 AM, Anil wrote: > Thank you Dikshant. > > On 10 March 2016 at 13:26, Dikshant Shahi

Re: Very slow updates

2016-03-10 Thread Erick Erickson
This really doesn't have much information to go on. Have you reviewed: http://wiki.apache.org/lucene-java/ImproveIndexingSpeed? What is "slow"? How are you updating? Are you batching updates? Are you committing often? Details matter. Best, Erick On Thu, Mar 10, 2016 at 2:41 AM, michael

Re: NoSuchFileException errors common on version 5.5.0

2016-03-10 Thread Shawn Heisey
On 3/10/2016 10:09 AM, Kevin Risden wrote: > This sounds related to SOLR-8587 and there is a fix in SOLR-8793 that isn't > out in a release since it was fixed after 5.5 went out. Thanks for that info. I pulled down branch_5_5 and installed a 5.5.1 snapshot. Had to edit lucene/version.properties

Re: Query on Highlights

2016-03-10 Thread Anil
I have tested with large documents and large values of hl.maxAnalyzedChars, and I can see highlights now. Thanks. On 10 March 2016 at 22:29, Anil wrote: > Hi, > > I have indexed large files (around 10 MB) in a text field with stored and > indexed as true. > Search of a text
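A rough sketch of the kind of request parameters involved, assuming highlighting is requested per query. The field name content, the query text, and the chosen limit (about 10 MB worth of characters) are illustrative assumptions, not values from the thread.

```python
from urllib.parse import urlencode

# Hypothetical limit: roughly the size of the ~10 MB documents mentioned above.
MAX_ANALYZED_CHARS = 10 * 1024 * 1024

# Hypothetical field name "content" and query term.
params = {
    "q": "content:solr",
    "hl": "true",
    "hl.fl": "content",
    "hl.maxAnalyzedChars": MAX_ANALYZED_CHARS,
}

query_string = urlencode(params)
print(query_string)
```

Raising the limit means the highlighter analyzes more of each document, which may cost query time on very large fields.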

Re: NoSuchFileException errors common on version 5.5.0

2016-03-10 Thread Kevin Risden
This sounds related to SOLR-8587 and there is a fix in SOLR-8793 that isn't out in a release since it was fixed after 5.5 went out. Kevin Risden Hadoop Tech Lead | Avalon Consulting, LLC

NoSuchFileException errors common on version 5.5.0

2016-03-10 Thread Shawn Heisey
I have a dev system running 5.5.0. I am seeing a lot of NoSuchFileException errors (for segments_XXX filenames). Here's a log excerpt: 2016-03-10 09:52:00.054 INFO (qtp1012570586-821) [ x:inclive] org.apache.solr.core.SolrCore.Request [inclive] webapp=/solr path=/admin/luke

Query on Highlights

2016-03-10 Thread Anil
Hi, I have indexed large files (around 10 MB) in a text field with stored and indexed as true. Searching for text against the field returns records, but highlights are empty for a few documents. Is it because of the default hl.maxAnalyzedChars? Please let me know if you need any additional

Solr debug 'explain' values differ from the Solr score

2016-03-10 Thread Rick Sullivan
Hi, I'm seeing behavior in Solr 5.5.0 where the top-level values I see in the debug response don't always correspond with the scores Solr assigns to the matched documents. For example, here is the top-level debug information for two documents matched by a query: 114628: Objectdescription: "sum

Solr on AIX

2016-03-10 Thread Stephane Bouchard
Hi, the company where I work is planning to migrate to AIX. Has anyone had any issues running Solr 5 on AIX? Thanks SB

Facets of nested docs when parent docs are grouped

2016-03-10 Thread Jhon Smith
Is it a bug or by design: if I group docs with option "=true" then facet counts are grouped and "represent" groups, including the fact that no count can be larger than the number of groups. But when the docs have nested docs and I additionally fetch the nested docs facet with option

Re: [Migration Solr4 to Solr5] Collection reload error

2016-03-10 Thread Shawn Heisey
On 3/10/2016 3:05 AM, Dmitry Kan wrote: > The only thing that I spot is that you use both auto-commit with 900 sec > frequency AND commitWithin. Solr is smart enough to skip empty commits. But > if auto-commit kicks in during the doc add / delete, there will be at least > two commits ongoing.

Re: how to force rescan of core.properties file in solr

2016-03-10 Thread Shawn Heisey
On 3/10/2016 3:00 AM, Gian Maria Ricci - aka Alkampfer wrote: > but this change in core.properties is not available until I restart > the service and Solr does core autodiscovery. Issuing a Core RELOAD > does not work. > > > > How can I force Solr to reload core.properties when I change it? >

Re: ngrams with position

2016-03-10 Thread elisabeth benoit
Oh yeah, now that you're saying it, yeah you're right, pf2 pf3 will boost proximity between words, not between ngrams. Thanks again, Elisabeth 2016-03-10 12:31 GMT+01:00 Alessandro Benedetti : > The reason pf2 and pf3 do not seem a good solution to me is the fact that the >

Re: ngrams with position

2016-03-10 Thread Alessandro Benedetti
The reason pf2 and pf3 do not seem a good solution to me is the fact that the edismax query parser calculates those grams on top of word shingles. So it takes the query as input, and produces the shingles based on the whitespace separator. i.e. if you search : "white tiger jumping" and pf2

Re: ngrams with position

2016-03-10 Thread elisabeth benoit
That's the use case, yes. Find Amsterdam with Asmtreadm. And yes, we're only doing approximative search if we get 0 results. I don't quite get why pf2 pf3 are not a good solution. We're actually testing a solution close to phonetic. Some kind of word reduction. Thanks for the suggestion (and the

Very slow updates

2016-03-10 Thread michael solomon
Hi, I have a collection with one shard in SolrCloud (for development before scaling), and when I try to update it with new documents it takes about 20 sec for 12 MB of data. What is wrong with my config? VM RAM - 28gb JVM-Memory - 10gb What else can I do? Thanks, Michael

Re: ngrams with position

2016-03-10 Thread Alessandro Benedetti
If I followed, your use case is: I type Asmtreadm and I want documents matching Amsterdam (even if the edit distance is greater than 2). First of all, this is something I hope you do only if you get 0 results; if not, the overhead can be great and you are going to lose a lot of precision causing

Re: [Migration Solr4 to Solr5] Collection reload error

2016-03-10 Thread Dmitry Kan
Hi, The only thing that I spot is that you use both auto-commit with 900 sec frequency AND commitWithin. Solr is smart enough to skip empty commits. But if auto-commit kicks in during the doc add / delete, there will be at least two commits ongoing. Could you change your Full recovery case to

how to force rescan of core.properties file in solr

2016-03-10 Thread Gian Maria Ricci - aka Alkampfer
I've set up a configuration in my solrconfig.xml to manage master or slave with settings in the core.properties file. This allows me to select whether the core is slave or master with a simple change to the core.properties file. I've set up a DNS entry for master.mysolr..xxx, this allows me to point

Re: Multiple custom Similarity implementations

2016-03-10 Thread Ahmet Arslan
Hi Parvesh, Please see the similar discussion : http://search-lucene.com/m/eHNlijx91I7etm1 Ahmet On Thursday, March 10, 2016 6:57 AM, Parvesh Garg wrote: Thanks Markus. We will look at other options. May I ask what can be the reasons for not supporting this ever?

Re: ngrams with position

2016-03-10 Thread elisabeth benoit
I am trying to do approximative search with solr. We've tried fuzzy search, and spellcheck search, it's working ok but edit distance is limited (to 2 for DirectSolrSpellChecker in solr 4.10.1). With fuzzy operator, we've had performance issues, and I don't think you can have an edit distance more

Re: Query behavior.

2016-03-10 Thread Modassar Ather
Thanks Shawn for pointing to the Jira issue. I was not sure whether it is expected behavior or a bug, or whether there could be a way to get the desired result. Best, Modassar On Thu, Mar 10, 2016 at 11:32 AM, Shawn Heisey wrote: > On 3/9/2016 10:55 PM, Shawn Heisey

Re: Clarification on +, and in edismax parser

2016-03-10 Thread Anil
Thank you Dikshant. On 10 March 2016 at 13:26, Dikshant Shahi wrote: > Hi, > > No, + and "and" don't work the same way. Even "and" and "AND" would have > different behavior (this is configurable) in edismax. > > When you put a + before a term, you specify that it's mandatory.

Query result cache not getting inserted for query lasting > 5secs

2016-03-10 Thread Murali TV
Hi, I have a query that takes about 5 secs to complete. The result count is about 250 million, and the row size is about 25. The problem is that this query result is not getting loaded into the query cache, so it takes ~5 secs every time it's issued. I also confirmed this by looking at the cache stats