Performing DIH on predefined list of IDS

2015-02-20 Thread SolrUser1543
Relatively frequently (about once a month) we need to reindex the data, using DIH to copy the data from one index to another. Because we have a large index, this can take 12 to 24 hours to complete. At the same time the old index is being queried by users. Sometimes

Remove all parent docs having specific child doc

2015-02-20 Thread Lokesh Chhaparwal
Hi, I want to remove all the parent docs having a specific child doc. Eg. <doc>Employee1 <doc><field>Dept1</field></doc> <doc><field>Dept2</field></doc> </doc> <doc>Employee2 <doc><field>Dept2</field></doc> <doc><field>Dept3</field></doc> </doc> Query: Remove all

Use multiple collections having different configuration

2015-02-20 Thread Nitin Solanki
Hello, I have a scenario where I want to create and use two collections in the same Solr, named collection1 and collection2. I want to use distributed servers. Each collection has multiple shards. Each collection has a different configuration (solrconfig.xml and schema.xml). How can I do this? In

Advantage of using Java programming with Solr over Solr API

2015-02-20 Thread Nitin Solanki
Hi, What are the advantages of Java programming with Solr over the Solr API?

Re: ignoring bad documents during index

2015-02-20 Thread SolrUser1543
I want to experiment with this issue. Where exactly should I look? I want to try to fix this missing aggregation. Which class is responsible for that?

RE: Committed before 500

2015-02-20 Thread NareshJakher
Hi Shawn, I do not want to increase the timeout, as these errors are very few. Also, the current timeout of 90 seconds is good enough. Is there a way to find out why Solr times out (at times)? Could it be that Solr is busy with other activities such as re-indexing, commits, etc.? Additionally I

Re: ignoring bad documents during index

2015-02-20 Thread Gora Mohanty
On 20 February 2015 at 15:31, SolrUser1543 osta...@gmail.com wrote: I want to experiment with this issue. Where exactly should I look? I want to try to fix this missing aggregation. Which class is responsible for that? Are you indexing through SolrJ, DIH, or what? Regards,

Re: Advantage of using Java programming with Solr over Solr API

2015-02-20 Thread Shawn Heisey
On 2/20/2015 6:38 AM, Nitin Solanki wrote: I mean embedded Solr. On Fri, Feb 20, 2015 at 7:05 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: This question makes no sense. Do you mean embedded Solr vs Standalone? Regards, Alex On 20 Feb 2015 3:30 am, Nitin Solanki

Re: Advantage of using Java programming with Solr over Solr API

2015-02-20 Thread Alexandre Rafalovitch
This question makes no sense. Do you mean embedded Solr vs Standalone? Regards, Alex On 20 Feb 2015 3:30 am, Nitin Solanki nitinml...@gmail.com wrote: Hi, What are the advantages of Java programming with Solr over the Solr API?

Re: Advantage of using Java programming with Solr over Solr API

2015-02-20 Thread Nitin Solanki
I mean embedded Solr. On Fri, Feb 20, 2015 at 7:05 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: This question makes no sense. Do you mean embedded Solr vs Standalone? Regards, Alex On 20 Feb 2015 3:30 am, Nitin Solanki nitinml...@gmail.com wrote: Hi, What are the

Re: Collations are not working fine.

2015-02-20 Thread Nitin Solanki
How can I get only the best collations, those with the most hits, and sort them? On Wed, Feb 18, 2015 at 3:53 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Hi Nitin, I was trying many different options for a couple of different queries. In fact, I have collations working ok now

Re: Use multiple collections having different configuration

2015-02-20 Thread Shawn Heisey
On 2/20/2015 4:06 AM, Nitin Solanki wrote: I have a scenario where I want to create and use two collections in the same Solr, named collection1 and collection2. I want to use distributed servers. Each collection has multiple shards. Each collection contains different

Re: Performing DIH on predefined list of IDS

2015-02-20 Thread SolrUser1543
My index has about 110 million documents. The index is split over several shards. Maybe that number is not so big, but each document is relatively large. The reason to perform the reindex is something like adding new fields, or adding some update processor which can extract something

Re: Performing DIH on predefined list of IDS

2015-02-20 Thread Shawn Heisey
On 2/20/2015 3:46 PM, Shawn Heisey wrote: If the URL parameter is idlist then you can use ${dih.request.idlist} in your SELECT statement. I realized after I sent this that you are not using a database ... the list would simply go in the query you send to the other server. I don't know whether

Re: Performing DIH on predefined list of IDS

2015-02-20 Thread Shawn Heisey
On 2/20/2015 2:57 PM, SolrUser1543 wrote: That's the reason that I want to run on a predefined list of IDs. In this case I will be able to restart from any point and to know about failed IDs. You can include information in a URL parameter and then use that URL parameter inside your DIH config. If
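A minimal sketch of the approach described above: split the predefined ID list into batches and build one DIH request URL per batch, passing the IDs in a custom idlist parameter that the DIH config can read as ${dih.request.idlist}. The base URL, the /dataimport handler path, the batch size, and the " OR "-joined ID format are assumptions for illustration; the format has to match however the DIH config uses the parameter (for example inside a query such as id:(${dih.request.idlist}), which is also an assumption here).

```python
from urllib.parse import urlencode


def dih_batch_urls(base_url, ids, batch_size=100):
    """Yield one DIH full-import URL per batch of document IDs.

    base_url is a hypothetical core URL, e.g. http://localhost:8983/solr/target.
    Each batch is joined with " OR " so it can be dropped into a query
    clause on the ID field by the DIH config (assumed usage).
    """
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        params = {
            "command": "full-import",
            "clean": "false",   # do not wipe the target index before importing
            "commit": "true",
            "idlist": " OR ".join(batch),
        }
        yield f"{base_url}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    for url in dih_batch_urls("http://localhost:8983/solr/target",
                              ["doc1", "doc2", "doc3"], batch_size=2):
        print(url)
```

Because each URL covers a known set of IDs, a failed batch can be retried on its own, and the IDs it contains are known without scanning the index. Keeping batches small also keeps the URL within typical length limits.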

Re: Clarification of locktype=single and implications of use

2015-02-20 Thread Tom Burton-West
Thanks Hoss, Protection from misconfiguration and/or starting separate solr instances pointing to the same index dir I can understand. The current documentation on the wiki and in the ref guide (along with just enough understanding of Solr/Lucene indexing to be dangerous) left me wondering if

[ANNOUNCE] Apache Solr 5.0.0 and Reference Guide for Solr 5.0 released

2015-02-20 Thread Anshum Gupta
20 February 2015, Apache Solr™ 5.0.0 and Reference Guide for Solr 5.0 available The Lucene PMC is pleased to announce the release of Apache Solr 5.0.0 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful

Re: Performing DIH on predefined list of IDS

2015-02-20 Thread Mikhail Khludnev
It's a little bit hard to get the overall context, e.g. why you live with OOME as usual, what the reasoning is for pulling from one index to another, and what is added during this process. Make sure that you are aware of http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor which queries

Re: Performing DIH on predefined list of IDS

2015-02-20 Thread Erick Erickson
Personally, I much prefer indexing from an independent SolrJ client to using DIH when I have to take explicit control of errors etc. Here's an example: https://lucidworks.com/blog/indexing-with-solrj/ In your example, you seem to be assuming that the Lucene IDs (and here I'm assuming you're not

Re: Strange search behaviour when upgrading to 4.10.3

2015-02-20 Thread Rishi Easwaran
Hi Shawn, Also, the tokenizer we use is very similar to the following. ftp://zimbra.imladris.sk/src/HELIX-720.fbsd/ZimbraServer/src/java/com/zimbra/cs/index/analysis/UniversalTokenizer.java

Re: rankquery usage bug?

2015-02-20 Thread Joel Bernstein
Ryan, This looks like a good Jira ticket to me. Joel Bernstein Search Engineer at Heliosearch On Fri, Feb 20, 2015 at 6:40 PM, Ryan Josal rjo...@gmail.com wrote: Hey guys, I put an rq in defaults but I can't figure out how to override it with no rankquery. Looks like one option might be

Re: Use multiple collections having different configuration

2015-02-20 Thread Nitin Solanki
Thanks Shawn. On Fri, Feb 20, 2015 at 7:53 PM, Shawn Heisey apa...@elyograg.org wrote: On 2/20/2015 4:06 AM, Nitin Solanki wrote: I have a scenario where I want to create and use two collections in the same Solr, named collection1 and collection2. I want to use distributed servers.

Re: Strange search behaviour when upgrading to 4.10.3

2015-02-20 Thread Shawn Heisey
On 2/20/2015 4:24 PM, Rishi Easwaran wrote: Also, the tokenizer we use is very similar to the following. ftp://zimbra.imladris.sk/src/HELIX-720.fbsd/ZimbraServer/src/java/com/zimbra/cs/index/analysis/UniversalTokenizer.java

Re: ignoring bad documents during index

2015-02-20 Thread Michael Della Bitta
At the layer right before you send that XML out, add a fallback on error that sends each document one at a time if the batch fails. Michael Della Bitta
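The fallback described above can be sketched roughly as follows. The send_batch callable is hypothetical: it stands in for whatever code posts a list of documents to Solr and raises an exception when the request is rejected. This is only an illustration of the control flow, not a specific client API.

```python
def index_with_fallback(docs, send_batch):
    """Send docs as one batch; if that fails, resend them one at a time.

    send_batch is a hypothetical callable that posts a list of documents
    and raises an exception if the server rejects the request.
    Returns the list of documents that still failed when sent alone.
    """
    try:
        send_batch(docs)
        return []
    except Exception:
        failed = []
        for doc in docs:
            try:
                send_batch([doc])
            except Exception:
                # Only the genuinely bad documents end up here.
                failed.append(doc)
        return failed
```

The good documents in a bad batch still get indexed, and the caller receives exactly the documents that need attention. Resending a document that was already accepted should be harmless as long as the schema defines a unique key, since the later copy replaces the earlier one.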

Clarification of locktype=single and implications of use

2015-02-20 Thread Tom Burton-West
Hello, We don't want to use locktype=native (we are using NFS) or locktype=simple (we mount a read-only snapshot of the index on our search servers and with locktype=simple, Solr refuses to start up because it sees the lock file.) However, we don't quite understand the warnings about using

Re: Getting unique key of a document inside of a Similarity class.

2015-02-20 Thread J-Pro
from all the examples of what you've described, I'm fairly certain all you really need is a TFIDF based Similarity where coord(), idf(), tf() and queryNorm() return 1 always, and you omitNorms from all fields. Yeah, that's what I did in the very first iteration. It works only for cases #1 and

Re: Committed before 500

2015-02-20 Thread Walter Underwood
Since you are getting these failures, the 90 second timeout is not “good enough”. Try increasing it. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On Feb 20, 2015, at 5:22 AM, NareshJakher naresh.jak...@capgemini.com wrote: Hi Shawn, I do not

Re: Remove all parent docs having specific child doc

2015-02-20 Thread Mikhail Khludnev
On Fri, Feb 20, 2015 at 2:10 PM, Lokesh Chhaparwal xyzlu...@gmail.com wrote: Hi, I want to remove all the parent docs having a specific child doc. Eg. <doc>Employee1 <doc><field>Dept1</field></doc> <doc><field>Dept2</field></doc> </doc> <doc>Employee2 <doc>

Re: ignoring bad documents during index

2015-02-20 Thread SolrUser1543
I am sending a batch of XML via an HTTP request, the same way as indexing via Documents in the Solr interface.

Solr synonyms logic

2015-02-20 Thread davym
Hi all, I'm querying a recipe database in Solr. By using synonyms, I'm trying to make my search a little smarter. What I'm trying to do here is make a search for pastry return all lasagne, penne, and cannelloni recipes. However, a search for lasagne should only return lasagne recipes. In my

Re: Strange search behaviour when upgrading to 4.10.3

2015-02-20 Thread Rishi Easwaran
Yes, the analyzers and tokenizers were recompiled with the new version of Solr/Lucene. There were some errors, most of them related to using BytesRefBuilder, which I did. Can you try these links?

Strange search behaviour when upgrading to 4.10.3

2015-02-20 Thread Rishi Easwaran
Hi, We are trying to upgrade from Solr 4.6 to 4.10.3. When testing search on 4.10.3, results are not being returned; it actually looks like only the first word in a sentence is getting indexed. Ex: inserting "This is a test message" only returns results when searching for content:this*.

Re: Strange search behaviour when upgrading to 4.10.3

2015-02-20 Thread Shawn Heisey
On 2/20/2015 9:37 AM, Rishi Easwaran wrote: We are trying to upgrade from Solr 4.6 to 4.10.3. When testing search on 4.10.3, results are not being returned; it actually looks like only the first word in a sentence is getting indexed. Ex: inserting "This is a test message" only returns results

rankquery usage bug?

2015-02-20 Thread Ryan Josal
Hey guys, I put an rq in defaults but I can't figure out how to override it with no rankquery. Looks like one option might be checking for an empty string before trying to use it in QueryComponent? I can work around it in the prep method of an earlier SearchComponent for now. Ryan

Re: Clarification of locktype=single and implications of use

2015-02-20 Thread Chris Hostetter
: We are using Solr. We would not configure two different Solr instances to : write to the same index. So why would a normal Solr set-up possibly end : up having more than one process writing to the same index? The risk here is that if you configure lockType=single, and then have some

Re: Remove all parent docs having specific child doc

2015-02-20 Thread Kydryavtsev Andrey
*q= - {!parent which=employee:*} department:Dept1 *- it does not work with block join query parser. What do you mean? What does this query (no spaces, with brackets) return in your case? q=-({!parent which=employee:*}department:Dept1) 20.02.2015, 18:02, Mikhail Khludnev mkhlud...@griddynamics.com:
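When trying the bracketed query above against a select handler, the braces, colons, and parentheses need to be URL-encoded. A small sketch for building such a test URL follows; the base URL and the collection1 name are assumptions, and this says nothing about what the query returns, it only shows how to send it intact.

```python
from urllib.parse import urlencode


def select_url(base_url, query, rows=10):
    """Build a /select URL with the query string safely URL-encoded.

    base_url is a hypothetical core or collection URL.
    """
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    q = "-({!parent which=employee:*}department:Dept1)"
    print(select_url("http://localhost:8983/solr/collection1", q))
```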