Re: Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread shamik
As Shawn pointed out, if you are using the CloudSolrServer client, you are immune to the scenario where a shard and its replica(s) go down. Ideally, communication should go through ZooKeeper rather than directly to the Solr servers. One thing you need to make sure of is to add the shards.tolerant parame
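A minimal sketch of a request that carries the tolerance flag, for readers who query over HTTP rather than through CloudSolrServer. The host, port, and collection name (`collection1`) are assumptions for illustration; `shards.tolerant` is the Solr request parameter that allows partial results when a shard is unavailable.

```python
from urllib.parse import urlencode

def tolerant_select_url(base_url, collection, query):
    """Build a /select URL that asks Solr for partial results if a shard is down."""
    params = {
        "q": query,
        "shards.tolerant": "true",  # return what the live shards have
        "wt": "json",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"

# Hypothetical node and collection, used only for the example.
url = tolerant_select_url("http://localhost:8983/solr", "collection1", "*:*")
print(url)
```

The response header of a tolerant request is worth checking on the client side, since a partial result set is still a successful response.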

Re: Increasing number of SolrIndexSearcher (Leakage)?

2014-02-18 Thread Nguyen Manh Tien
I found that a custom component caused the issue: it creates a SolrQueryRequest but doesn't close it at the end, so the reference count on the SolrIndexSearcher never reaches 0 and the SolrIndexSearcher (SIS) is not released. > > On Tue, Feb 18, 2014 at 9:31 PM, Yonik Seeley wrote: > On Mon, Feb 17, 2014 at 1:34 AM, Nguyen Manh Tien > wrote
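The leak described above can be illustrated with a toy reference-counting model. This is only a sketch: `Searcher` and `Request` are hypothetical stand-ins, not Solr classes, and the real SolrQueryRequest and SolrIndexSearcher are more involved. The point is the pattern: whoever creates a request has to close it, ideally in a `finally` block.

```python
class Searcher:
    """Toy stand-in for a ref-counted index searcher (not Solr's real class)."""
    def __init__(self):
        self.refcount = 1      # the core's own reference
        self.released = False

    def incref(self):
        self.refcount += 1
        return self

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.released = True


class Request:
    """Toy request that pins a searcher until close() is called."""
    def __init__(self, searcher):
        self._searcher = searcher.incref()

    def close(self):
        self._searcher.decref()


def leaky_component(searcher):
    Request(searcher)          # never closed: the reference is leaked


def careful_component(searcher):
    request = Request(searcher)
    try:
        pass                   # use the request here
    finally:
        request.close()        # always give the reference back


leaked, clean = Searcher(), Searcher()
leaky_component(leaked)
careful_component(clean)
for s in (leaked, clean):
    s.decref()                 # the core moves to a newer searcher and drops its reference
print(leaked.released, clean.released)
```

In the leaky case the count stops at 1 instead of 0, which matches the symptom of searchers accumulating after each commit.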

Re: Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread Vineet Mishra
Thanks for all your responses, but my question is which *Server:Port* the query should be sent to, since we don't know which server has crashed or which might crash in the future (any server can go down). My only aim in asking is to get an idea of how the query format for distributed

Re: block join and atomic updates

2014-02-18 Thread Mikhail Khludnev
Colleagues, You are definitely right regarding denorm&collapse. It works fine in most cases, but look at the case more precisely. Moritz needs to update the parent's fields, if they are copied during denormalization, the price of update is the same as block join's. With q-time join updates are way

Re: Weird behavior of stopwords in search query

2014-02-18 Thread shamik
Jack, thanks for the pointer. I should have checked this closely. I'm using edismax and here's my qf entry : id^10.0 cat^1.4 text^0.5 features^1.0 name^1.2 sku^1.5 manu^1.1 title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0 As you can see, I was boosting id and
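To see at a glance which fields dominate a `qf` list like the one above, a small helper can split it into field/boost pairs. This is a sketch for inspection only, assuming the whitespace-separated `field^boost` form shown in the message; it does not reproduce how edismax itself parses the parameter.

```python
def parse_qf(qf):
    """Split an edismax-style qf string into {field: boost}; no boost means 1.0."""
    boosts = {}
    for token in qf.split():
        field, _, boost = token.partition("^")
        boosts[field] = float(boost) if boost else 1.0
    return boosts

qf = ("id^10.0 cat^1.4 text^0.5 features^1.0 name^1.2 sku^1.5 manu^1.1 "
      "title^10.0 description^5.0 keywords^5.0 author^2.0 resourcename^1.0")
boosts = parse_qf(qf)

# Fields ordered from strongest to weakest boost.
ranked = sorted(boosts, key=boosts.get, reverse=True)
print(ranked[:4])
```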

Re: Weird behavior of stopwords in search query

2014-02-18 Thread Jack Krupansky
Does "other" appear in the id, cat, or sku fields? This clause requires it to appear in at least one of those fields: +DisjunctionMaxQuery((id:other^10.0 | cat:other^1.4 | sku:other^1.5)) The "and" is treated as the "AND" operator. What query parser are you using? Without "and", the terms are

Re: Preventing multiple on-deck searchers without causing failed commits

2014-02-18 Thread Colin Bartolome
Inline quoting ahead, sorry: Colin: Stop. Back up. The automatic soft commits will make updates available to your users every second. Those documents _include_ anything from your "hard commit" jobs. What could be faster? Parenthetically I'll add that 1 second soft commits are rarely an actual r

Re: block join and atomic updates

2014-02-18 Thread Walter Underwood
Listen to that advice. Denormalize, denormalize, denormalize. Think about the results page and work backwards from that. Flat data model. wunder Search guy at Infoseek, Inktomi, Verity, Autonomy, Netflix, and Chegg On Feb 18, 2014, at 7:37 PM, Jason Hellman wrote: > Thinking in terms of norma

Re: block join and atomic updates

2014-02-18 Thread Jason Hellman
Thinking in terms of normalized data in the context of a Lucene index is dangerous. It is not a relational data model technology, and the join behaviors available to you have limited use. Each approach requires compromises that are likely impermissible for certain use cases. If it is at al

Re: query parameters

2014-02-18 Thread Erick Erickson
Solr/Lucene query language is NOT strictly boolean, see Chris's excellent blog here: http://searchhub.org/dev/2011/12/28/why-not-and-or-and-not/ Best, Erick On Tue, Feb 18, 2014 at 11:54 AM, Andreas Owen wrote: > I tried it in solr admin query and it showed me all the docs without a > value >

Re: Preventing multiple on-deck searchers without causing failed commits

2014-02-18 Thread Erick Erickson
Colin: Stop. Back up. The automatic soft commits will make updates available to your users every second. Those documents _include_ anything from your "hard commit" jobs. What could be faster? Parenthetically I'll add that 1 second soft commits are rarely an actual requirement, but that's your deci

Re: SolrJ 3.4 Client compatible with Solr 4.6 Server?

2014-02-18 Thread Shawn Heisey
On 2/18/2014 5:13 PM, Lan wrote: I'm in the process of updating from Solr 3.4 to Solr 4.6. Is the SolrJ 3.4 Client forward compatible with Solr 4.6? This isn't mentioned in the documentation http://wiki.apache.org/solr/javabin page. In a test environment, I did some indexing and querying wit

Weird behavior of stopwords in search query

2014-02-18 Thread Shamik Bandopadhyay
Hi, I'm observing weird behavior when using stopwords as part of the search query. I'm able to replicate it in a standalone Solr instance as well. The issue pops up when I try to use the "other" and "and" stopwords together in a query string. The query doesn't return any result. But it works with

Re: Slow 95th-percentile

2014-02-18 Thread Chris Hostetter
: Slowing the soft commits to every 100 seconds helped. The main culprit : was a bad query that was coming through every few seconds. Something : about the empty fq param and the q=* slowed everything else down. : : INFO: [event] webapp=/solr path=/select : params={start=0&q=*&wt=javabin&fq=&f

Re: SOLR Suggester - return matched suggestion along with other suggestions

2014-02-18 Thread bbi123
Never mind, I added a space to the end of all the field values (keywords) supplied to the suggester and it works!!! iphone is indexed as iphone (with an additional space at the end). I trim the value passed to the search after selecting the keyword from the dropdown suggestion, so it will be again passed as ip

Re: Slow 95th-percentile

2014-02-18 Thread Allan Carroll
Slowing the soft commits to every 100 seconds helped. The main culprit was a bad query that was coming through every few seconds. Something about the empty fq param and the q=* slowed everything else down. INFO: [event] webapp=/solr path=/select params={start=0&q=*&wt=javabin&fq=&fq=startTime:1

Re: Escape \\n from getting highlighted - highlighter component

2014-02-18 Thread T. Kuro Kurosaka
Your search expression means 'talk' OR 'n' OR 'text'. I think you want to do a phrase search. To do that, quote the whole thing with double-quotes "talk n text", if you are using one of the Solr standard query parsers. On 02/17/2014 03:53 PM, Developer wrote: Hi, When searching for a text l
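A small sketch of the quoting step on the client side. The helper name `as_phrase` is made up for this example; it wraps the user's text in double quotes and escapes any backslashes or double quotes already in it, so the whole input is treated as one phrase by the standard query parsers.

```python
def as_phrase(text):
    """Wrap text in double quotes for a phrase search, escaping \\ and \" inside it."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

print(as_phrase("talk n text"))
```

Without the quotes, the same three words are parsed as separate optional terms, which is the 'talk' OR 'n' OR 'text' behavior described above.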

SolrJ 3.4 Client compatible with Solr 4.6 Server?

2014-02-18 Thread Lan
I'm in the process of updating from Solr 3.4 to Solr 4.6. Is the SolrJ 3.4 Client forward compatible with Solr 4.6? This isn't mentioned in the documentation http://wiki.apache.org/solr/javabin page. In a test environment, I did some indexing and querying with a SolrJ3.4 Client and a Solr4.6

SOLR Suggester - return matched suggestion along with other suggestions

2014-02-18 Thread bbi123
Hi, Is there a way to make suggester return the matched suggestion too? http://localhost:8983/solr/core1/suggest?q=name:iphone The above query should return *iphone * iphone5c iphone4g Currently it returns only iphone5c iphone4g I can use edge N gram filter to implement the above feature b

Cluster state ranges are all null after reboot

2014-02-18 Thread Greg Pendlebury
We've got a 15 shard cluster spread across 3 hosts. This morning our puppet software rebooted them all and afterwards the 'range' for each shard has become null in zookeeper. Is there any way to restore this value short of rebuilding a fresh index? I've read various questions from people with a si

Re: Solr4 performance

2014-02-18 Thread Shawn Heisey
On 2/18/2014 2:14 PM, Joshi, Shital wrote: Thanks much for all suggestions. We're looking into reducing allocated heap size of Solr4 JVM. We're using NRTCachingDirectoryFactory. Does it use MMapDirectory internally? Can someone please confirm? In Solr, NRTCachingDirectory does indeed use MMa

RE: Solr4 performance

2014-02-18 Thread Joshi, Shital
Hi, Thanks much for all suggestions. We're looking into reducing allocated heap size of Solr4 JVM. We're using NRTCachingDirectoryFactory. Does it use MMapDirectory internally? Can someone please confirm? Would optimization help with performance? We did that in QA (took about 13 hours for 70

Re: Slow 95th-percentile

2014-02-18 Thread Shawn Heisey
On 2/18/2014 11:51 AM, Allan Carroll wrote: I was thinking GC too, but it doesn’t feel like it is. Running jstat -gcutil only shows a 10-50ms ParNew collection every 10 or 15 seconds and almost no full CMS collections. Are there any other places to look for GC activity I might be missing? I did

Using payloads for expanded query terms

2014-02-18 Thread Manuel Le Normand
Hello, I'm trying to handle a situation with taxonomy search - that is for each taxonomy I have a list of words with their boosts. These taxonomies are updated frequently so I retrieve these scored lists at query time from an external service. My expectation would be: q={!some_query_parser}Cities

RE: query parameters

2014-02-18 Thread Andreas Owen
I tried it in the Solr admin query page and it showed me all the docs without a value in organisations and roles. It didn't matter whether I used a base term; isn't that given through the q parameter? -Original Message- From: Raymond Wiker [mailto:rwi...@gmail.com] Sent: Tuesday, 18 February 2014 13:19

Caching Solr boost functions?

2014-02-18 Thread Gregg Donovan
We're testing out a new handler that uses edismax with three different "boost" functions. One has a random() function in it, so is not very cacheable, but the other two boost functions do not change from query to query. I'd like to tell Solr to cache those boost queries for the life of the Searche

Re: Slow 95th-percentile

2014-02-18 Thread Allan Carroll
Thanks for the suggestions. I was thinking GC too, but it doesn’t feel like it is. Running jstat -gcutil only shows a 10-50ms ParNew collection every 10 or 15 seconds and almost no full CMS collections. Are there any other places to look for GC activity I might be missing? I did a little investiga

JOB @ Sematext: Professional Services Lead => Head

2014-02-18 Thread Otis Gospodnetic
Hello, We have what I think is a great opening at Sematext. Ideal candidate would be in New York, but that's not an absolute must. More info below + on http://sematext.com/about/jobs.html in job-ad-speak, but I'd be happy to describe what we are looking for, what we do, and what types of companie

Re: Preventing multiple on-deck searchers without causing failed commits

2014-02-18 Thread Colin Bartolome
On 02/18/2014 10:15 AM, Shawn Heisey wrote: If you want to be completely in control like that, get rid of the automatic soft commits and just do the hard commits. I would personally choose another option for your setup -- get rid of *all* explicit commits entirely, and just configure autoCommit

Re: Additive boost function

2014-02-18 Thread Jack Krupansky
The edismax query parser "bf" parameter gives you an additive boost. See: http://wiki.apache.org/solr/ExtendedDisMax#bf_.28Boost_Function.2C_additive.29 -- Jack Krupansky -Original Message- From: Zwer Sent: Tuesday, February 18, 2014 12:52 PM To: solr-user@lucene.apache.org Subject: A
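A sketch of request parameters that use `bf` with edismax. The field names `last_name` and `first_name` come from the original question; the numeric field `popularity` used as the boost function is an assumption for illustration, as is the choice of boosts in `qf`.

```python
from urllib.parse import urlencode, parse_qs

def edismax_params(user_query, qf, bf):
    """Request parameters for an edismax query with an additive boost function."""
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": qf,   # fields searched, with per-field multipliers
        "bf": bf,   # function whose value is added to the score
    }

# `popularity` is a hypothetical numeric field, not taken from the thread.
params = edismax_params("mike t", "last_name^15 first_name^10", "popularity")
query_string = urlencode(params)
print(query_string)
```

Because the function value is added rather than multiplied, its scale matters relative to the text-match scores; checking the explain output with debugQuery=true is a reasonable way to tune it.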

Re: Solr Suggester not working in sharding (distributed search)

2014-02-18 Thread bbi123
Try this http://solr:8983/solr/select?q=*:*&spellcheck=true&spellcheck.build=true&spellcheck.q=toyata&qt=spell&shards.qt=/spell&shards=solr-shard1:8983/solr,solr-shard2:8983/solr -- View this message in context: http://lucene.472066.n3.nabble.com/using-distributed-search-with-the-suggest-com

Re: Solr Autosuggest - Strange issue with leading numbers in query

2014-02-18 Thread bbi123
Thanks a lot for your response Erik. I was trying to find if I have any suggestion starting with numbers using terms component but I couldn't find any.. Its very strange!!! Anyways, thanks again for your response. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Autosu

Re: Preventing multiple on-deck searchers without causing failed commits

2014-02-18 Thread Shawn Heisey
On 2/18/2014 10:59 AM, Colin Bartolome wrote: I'll describe a bit more about our setup, so I can say why I don't think that'll work for us: * Our web servers send update requests to Solr via a background thread, so HTTP requests don't have to wait for the request to complete. * That background

Re: Best way to copy data from SolrCloud to standalone Solr?

2014-02-18 Thread Shalin Shekhar Mangar
There's a related issue: SOLR-5340 - Add support for named snapshots. I think we'd want this in SolrCloud soon. https://issues.apache.org/jira/browse/SOLR-5340 On Tue, Feb 18, 2014 at 7:23 PM, Daniel Bryant wrote: > Hi Shawn, Michael, > > Many thanks for your responses - we're going to try the r

Re: Preventing multiple on-deck searchers without causing failed commits

2014-02-18 Thread Colin Bartolome
On 02/17/2014 09:46 PM, Shawn Heisey wrote: I think I put too much information in my reply. Apologies. Here's the most important information to deal with first: Don't send hard commits at all. Configure autoCommit in your server config, with the all-important openSearcher parameter set to fal

Additive boost function

2014-02-18 Thread Zwer
Hi Guys, I'm facing a problem with additive boosting. 2 fields: last_name and first_name. User is searching for "mike t" Query: (last_name:mike^15 last_name:mike*^7 first_name:mike^10 first_name:mike*^5) AND (last_name:t^15 last_name:t*^7 first_name:t^10 first_name:*^5) The search result does

Re: Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread Shawn Heisey
On 2/18/2014 8:32 AM, Shawn Heisey wrote: On 2/18/2014 6:05 AM, Vineet Mishra wrote: *Shard 1 Shard 2* localhost:8983 localhost:7574 localhost:8900 localhost:

Re: Boost Query Example

2014-02-18 Thread Amit Jha
I would say use dismax query parser and set boost factor in qf params. Following link may help http://wiki.apache.org/solr/DisMaxQParserPlugin#qf_.28Query_Fields.29 https://wiki.apache.org/solr/SolrRelevancyFAQ#Solr_Relevancy_FAQ Rgds AJ > On 18-Feb-2014, at 20:49, "EXTERNAL Taminidi Ravi (ETI

Re: Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread Amit Jha
Solr will complain only if you bring down both the replica and the leader of the same shard. It would be difficult to have a highly available environment if you have only a small number of physical servers. Rgds AJ > On 18-Feb-2014, at 18:35, Vineet Mishra wrote: > > Hi All, > > I want to have clear idea about the Faul

Re: Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread Per Steffensen
If localhost:8900 is down but localhost:8983 contains a replica of the same shard(s) that 8900 was running, all data/documents are still available. You cannot query the server that is shut down (port 8900), but you can query any of the other servers (8983, 7574 or 7500). If you make a distributed query to
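For a plain HTTP client that is not ZooKeeper-aware, the "query any of the other servers" advice can be sketched as a simple failover loop. This is an illustration under assumptions: `query_first_live` and the injected `fetch` callable are made-up names, and the stub below simulates the node on port 8900 being down instead of opening real connections.

```python
def query_first_live(nodes, fetch):
    """Try each node in order and return (node, response) from the first that answers."""
    last_error = None
    for node in nodes:
        try:
            return node, fetch(node)
        except OSError as err:        # connection refused, timeout, and similar
            last_error = err
    raise ConnectionError(f"no live node among {nodes}") from last_error


def fake_fetch(node):
    """Stub transport: pretends the node on port 8900 has crashed."""
    if node.endswith(":8900"):
        raise ConnectionRefusedError(node)
    return f"response from {node}"


nodes = ["localhost:8900", "localhost:8983", "localhost:7574", "localhost:7500"]
node, response = query_first_live(nodes, fake_fetch)
print(node, response)
```

A cluster-aware client such as CloudSolrServer makes this loop unnecessary, since it learns the live nodes from ZooKeeper.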

Re: Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread Shawn Heisey
On 2/18/2014 6:05 AM, Vineet Mishra wrote: > *Shard 1 Shard 2* > localhost:8983 localhost:7574 > localhost:8900 localhost:7500 > > > I Indexed some document an

Re: Indexed a new big database while the old is running?

2014-02-18 Thread Shawn Heisey
On 2/18/2014 5:28 AM, Bruno Mannina wrote: > We currently have a SOLR db with around 88 000 000 docs. > All works fine :) > > We receive each year a new backfile with the same content (but improved). > > Indexing these docs takes several days on SOLR, > So is it possible to create a new collection (re

RE: Boost Query Example

2014-02-18 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
I don't have much experience with boosting; can you explain with an example? I'd really appreciate your help. --Ravi -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Tuesday, February 18, 2014 9:58 AM To: solr-user@lucene.apache.org Subject: Re: Boost Query

Re: Limit amount of search result

2014-02-18 Thread Sameer Maggon
You are welcome! On Mon, Feb 17, 2014 at 11:07 PM, rachun wrote: > hi Samee, > > Thank you very much for your suggestion. > Now I got it worked now;) > > Chun. > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Limit-amount-of-search-result-tp4117062p4117952.html > S

Re: Boost Query Example

2014-02-18 Thread Jack Krupansky
Add debugQuery=true to your queries and look at the scoring in the "explain" section. From the intermediate scoring by field, you should be able to do the math to figure out what boost would be required to rank your exact match high enough. -- Jack Krupansky -Original Message- From:

RE: Boost Query Example

2014-02-18 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi Michael, Thanks for the information. Now I am trying with the query , but I am not getting the sequence in order. SKU with 223-CL10V3 lists first (Exact Match) ManfacturerNumber with 223-CL10V3 list Second (Exact Match) if first is available if not MangacturesNumber doc will be first in the l

Re: Increasing number of SolrIndexSearcher (Leakage)?

2014-02-18 Thread Yonik Seeley
On Mon, Feb 17, 2014 at 1:34 AM, Nguyen Manh Tien wrote: > - *But after i index some docs and run softCommit or hardCommit with > openSearcher=false, number of SolrIndexSearcher increase by 1* This is fine... it's more of an internal implementation detail (we open what is called a "real-time" sea

Re: Best way to copy data from SolrCloud to standalone Solr?

2014-02-18 Thread Daniel Bryant
Hi Shawn, Michael, Many thanks for your responses - we're going to try the replication/backup command, as we're thinking this is a 'two birds with one stone' approach which will not only allow us to copy the indexes, but also help with backups in SolrCloud as well. Thanks again to you both!

Fault Tolerant Technique of Solr Cloud

2014-02-18 Thread Vineet Mishra
Hi All, I want to have a clear idea about the fault-tolerance capability of SolrCloud. Consider that I have set up SolrCloud with an external ZooKeeper and 2 shards, each having a replica, with a single collection, as given in the official Solr documentation. https://cwiki.apache.org/confluence/display/sol

Indexed a new big database while the old is running?

2014-02-18 Thread Bruno Mannina
Dear Solr Users, We currently have a SOLR db with around 88 000 000 docs. All works fine :) We receive each year a new backfile with the same content (but improved). Indexing these docs takes several days on SOLR, So is it possible to create a new collection (restart SOLR) and Index these new 88 000

Re: block join and atomic updates

2014-02-18 Thread Mikhail Khludnev
Absolutely. On Tue, Feb 18, 2014 at 1:20 PM, wrote: > But isn't query time join much slower when it comes to a large amount of > documents? > > Quoting Mikhail Khludnev : > > > Hello, >> >> It sounds like you need to switch to query time join. >> 15.02.2014 21:57, user wrote: >> >>

Re: query parameters

2014-02-18 Thread Raymond Wiker
That could be because the second condition does not do what you think it does... have you tried running the second condition separately? You may have to add a "base term" to the second condition, like what you have for the "bq" parameter in your config file; i.e, something like (*:* -organisation
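The "base term" idea can be sketched as a small string helper. A parenthesised clause made up only of negations has nothing to subtract from, so `*:*` (all documents) is put in front of it. The field names and range syntax are copied from the query quoted in this thread; the helper name `with_base_term` is made up for the example, and whether this resolves the poster's case is not confirmed by the source.

```python
def with_base_term(negative_clauses):
    """Prefix a group of purely negative clauses with *:* so they have documents to exclude from."""
    return "(*:* " + " ".join(negative_clauses) + ")"

has_values = "(organisations:(150 OR 41) AND roles:(174))"
no_values = with_base_term(['-organisations:["" TO *]', '-roles:["" TO *]'])
fq = f"{has_values} OR {no_values}"
print(fq)
```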

Re: Facet cache issue when deleting documents from the index

2014-02-18 Thread Marius Dumitru Florea
In the end the problem was actually in my code.. sorry for the noise. The documents were deleted from my database but not from the Solr index and I have a display filter that filters out search results that correspond to documents that don't exist any more in the database, but this filter doesn't u

RE: query parameters

2014-02-18 Thread Andreas Owen
It seems that fq doesn't accept OR, because: (organisations:(150 OR 41) AND roles:(174)) OR (-organisations:["" TO *] AND -roles:["" TO *]) only returns docs that match the first condition. It doesn't return any docs with the empty fields organisations and roles. -Original Message- Fro

Re: block join and atomic updates

2014-02-18 Thread mm
But isn't query time join much slower when it comes to a large amount of documents? Quoting Mikhail Khludnev : Hello, It sounds like you need to switch to query time join. 15.02.2014 21:57, user wrote: Any suggestions? Quoting m...@preselect-media.com: Yonik Seeley : O