Re: Index optimize runs in background.

2015-05-28 Thread Modassar Ather
I have not added any timeout in the indexer except zk client time out which is 30 seconds. I am simply calling client.close() at the end of indexing. The same code was not running in background for optimize with solr-4.10.3 and org.apache.solr.client.solrj.impl.CloudSolrServer. On Fri, May 29, 201

How To: Debugging the whole indexing process

2015-05-28 Thread Aman Tandon
Hi, I want to debug the whole indexing process, the life cycle of indexing process (each and every function call by going via function to function), from the posting of the data.xml to creation of various index files ( _fnm, _fdt, etc ). So how/what should I setup and start, please help. I will be

Re: Index optimize runs in background.

2015-05-28 Thread Erick Erickson
Are you timing out on the client request? The theory here is that it's still a synchronous call, but you're just timing out at the client level. At that point, the optimize is still running; it's just that the connection has been dropped. Shot in the dark. Erick On Thu, May 28, 2015 at 10:31 PM, Mod

Re: Index optimize runs in background.

2015-05-28 Thread Modassar Ather
I had not noticed it, but a commit that in my past experience used to take around 2 minutes is now taking around 8 seconds. I think this is also running in the background. On Fri, May 29, 2015 at 10:52 AM, Modassar Ather wrote: > The indexer takes almost 2 hours to optimize. It has a multi-threa

Re: Index optimize runs in background.

2015-05-28 Thread Modassar Ather
The indexer takes almost 2 hours to optimize. It has a multi-threaded add of batches of documents to org.apache.solr.client.solrj.impl.CloudSolrClient. Once all the documents are indexed it invokes commit and optimize. I have seen that the optimize goes into background after 10 minutes and indexer

Re: Relevancy Score and Proximity Search

2015-05-28 Thread Zheng Lin Edwin Yeo
I've tried to use the site. I saw that when I search for Matex, it actually only gives a boost of 0.8 to the word Latex as it is not the main word that is searched, but I still can't understand why the score can be so high? This is what I get from the output explanation: { - - "match": true,

Number of clustering labels to show

2015-05-28 Thread Zheng Lin Edwin Yeo
Hi, I'm trying to increase the number of cluster results shown during the search. I tried to set carrot.fragSize=20 but only 15 cluster labels are shown. Even when I tried to set carrot.fragSize=5, there are still 15 labels shown. Is this the correct way to do this? I understand that setting it

Re: Running Solr 5.1.0 as a Service on Windows

2015-05-28 Thread Zheng Lin Edwin Yeo
Hi Timothy, I don't really have much of a good recommendation. Basically I've written a batch file which will call the solr.cmd with all the settings like heap size and enable clustering, and I point the path in the NSSM to this batch file. If I just point it directly to solr.cmd, I'm not sure if the

Re: Running Solr 5.1.0 as a Service on Windows

2015-05-28 Thread Zheng Lin Edwin Yeo
Hi Miller, Yes, I managed to get the zookeeper to start as a service on Windows in Windows 8 running Java 8. However, it didn't work on Windows Server 2008 R2 (SP1) when I upgrade the Java to Java 8. It is able to work when the server is running on Java 7. Regards, Edwin On 26 May 2015 at 23:5

Re: docValues: Can we apply synonym

2015-05-28 Thread Aman Tandon
Thanks chris. Yes we are using it for handling multiword synonym problem. With Regards Aman Tandon On Fri, May 29, 2015 at 12:38 AM, Reitzel, Charles < charles.reit...@tiaa-cref.org> wrote: > Again, I would recommend using Nolan Lawson's > SynonymExpandingExtendedDismaxQParserPlugin. > > http:/

Re: When is too many fields in "qf" is too many?

2015-05-28 Thread Steven White
Hi Folks, First, thanks for taking the time to read and reply to this subject, it is much appreciated, I have yet to come up with a final solution that optimizes Solr. To give you more context, let me give you the big picture of how the application and the database is structured for which I'm try

Re: optimal shard assignment with low shard key cardinality using compositeId to enable shard splitting

2015-05-28 Thread Erick Erickson
Charles: You raise good points, and I didn't mean to say that co-locating docs due to some criteria was never a good idea. That said, it does add administrative complexity that I'd prefer to avoid unless necessary. I suppose it largely depends on what the load and response SLAs are. If there's 1 q

Re: Ignoring the Document Cache per query

2015-05-28 Thread Erick Erickson
First, there isn't that I know of. But why would you want to do this? On the face of it, it makes no sense to ignore the doc cache. One of its purposes is to hold the document (read off disk) for successive search components _in the same query_. Otherwise, each component might have to do a disk se

Re: Problem indexing, value "0.0"

2015-05-28 Thread Shawn Heisey
On 5/28/2015 3:08 PM, Shawn Heisey wrote: > Because we are planning a change on this field in the database to > decimal, I tried changing the price field on both solr servers to double > -- TrieDoubleField with precisionStep set to 0. This didn't fix the > problem. Dev was still fine, production

Re: Dynamic range on numbers

2015-05-28 Thread Chris Hostetter
: i'm not sure i follow what you're saying on #3. let me clarify in case it's : on my end. i was wanting to *eventually* set a lower bound of -10%size1 and : an upper of +10%size1. for the sake of experimentation i started with just lower bound "of what" ? write out the math equation you want to

Problem indexing, value "0.0"

2015-05-28 Thread Shawn Heisey
Here's the error I am getting on Solr 4.9.1 on a production server: ERROR - 2015-05-28 14:39:13.449; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: ERROR: [doc=getty36914013] Error adding field 'price'='0.0' msg=For input string: "0.0" On a dev server, everything is f

Re: Dynamic range on numbers

2015-05-28 Thread John Blythe
i'm not sure i follow what you're saying on #3. let me clarify in case it's on my end. i was wanting to *eventually* set a lower bound of -10%size1 and an upper of +10%size1. for the sake of experimentation i started with just the lower bound. i didn't care (at that point) about the results, just g

Re: Dynamic range on numbers

2015-05-28 Thread Chris Hostetter
: 2) lame :\ Why do you say that? ... it's a practical limitation -- for each document a function is computed, and then the result of that function is compared against the (fixed) upper and lower bounds. In situations where you want something like the lower bound of the range comparison t
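
The limitation described above (one function value per document, compared against fixed bounds) can be illustrated with a small Python sketch. This is a toy model, not Solr's implementation: the `frange` helper and the `size2` field are hypothetical, introduced only for illustration. It shows one possible way to express a bound that depends on the document: fold the varying part into the function (here a ratio) so that the bounds themselves stay constant.

```python
def frange(docs, func, l=None, u=None):
    """Keep the docs whose func(doc) lies within the fixed bounds [l, u]."""
    kept = []
    for doc in docs:
        value = func(doc)  # the function is computed once per document
        if l is not None and value < l:
            continue
        if u is not None and value > u:
            continue
        kept.append(doc)
    return kept


# Hypothetical documents; size2 is an invented field for this example.
docs = [
    {"id": 1, "size1": 10.0, "size2": 9.5},
    {"id": 2, "size1": 10.0, "size2": 12.0},
    {"id": 3, "size1": 20.0, "size2": 16.0},
]

# "size2 within +/-10% of size1" has per-document bounds, so it is rewritten
# as a ratio compared against the constant bounds 0.9 and 1.1.
within_ten_percent = frange(docs, lambda d: d["size2"] / d["size1"], l=0.9, u=1.1)
print([d["id"] for d in within_ten_percent])
```

Whether this rewrite fits the original poster's use case depends on details not shown in the thread excerpt.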

RE: optimal shard assignment with low shard key cardinality using compositeId to enable shard splitting

2015-05-28 Thread Reitzel, Charles
We have used a similar sharding strategy for exactly the reasons you say. But we are fairly certain that the # of documents per user ID is < 5000 and, typically, <500. Thus, we think the overhead of distributed searches clearly outweighs the benefits. Would you agree? We have done some l

Re: UI Velocity

2015-05-28 Thread Erick Erickson
In the velocity directory you'll find the templates that implement all this stuff, I usually copy/paste. Do understand that this is _demo_ code, not an "official" UI for Solr so it's rather a case of digging in and experimenting. I would _not_ use this (or anything else that gave direct access to

Ignoring the Document Cache per query

2015-05-28 Thread Bryan Bende
Is there a way to ignore the document cache on a per-query basis? It looks like there's {!cache=false} for preventing the filter cache from being used for a given query; I'm looking for the same thing for the document cache. Thanks, Bryan

Re: When is too many fields in "qf" is too many?

2015-05-28 Thread Jack Krupansky
I would reconsider the strategy of mashing so many different record types into one Solr collection. Sure, you get some advantage from denormalizing data, but if the downside cost gets too high, it may not make so much sense. I'd consider a collection per record type, or at least group similar reco

Re: Dynamic range on numbers

2015-05-28 Thread John Blythe
1) ooo, i see 2) lame :\ 3) right. i hadn't bothered with the upper limit yet simply for sake of less complexity / chance to fk it up. wanted to get the function working for lower before worrying about adding u= and getting the query refined 4) very good point about just doing it client side. i kno

UI Velocity

2015-05-28 Thread Sznajder ForMailingList
Hi I tried to use the UI Velocity from Solr. Could you please help in the following: - how do I define the fields from my schema that I would like to be displayed as facet in the UI? Thanks! Benjamin

RE: When is too many fields in "qf" is too many?

2015-05-28 Thread Reitzel, Charles
Still, it seems like the right direction. Does it "smell" ok to have a few hundred request handlers? Again, my logic is that if any given view requires no more than 50 fields, one request handler per view would work. This is different than a request handler per user category (which requ

RE: docValues: Can we apply synonym

2015-05-28 Thread Reitzel, Charles
Again, I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Wednesday, May 27, 2015 6:42 PM To: solr-user@luc

Re: Dynamic range on numbers

2015-05-28 Thread Chris Hostetter
: Expected identifier at pos 29 str='{!frange l=sum(size1, product(size1, : .10))}size1 : : pos 29 is the open parenthesis of product(). can i not use a function : within a function? or is there something else i'm missing in the way i'm : constructing this? 1) you're confusing the parser by tryi

Re: Dynamic range on numbers

2015-05-28 Thread John Blythe
doh! 1) silly me, i knew better but was getting tunnel visioned 2) moved to fq and am now getting this error: Expected identifier at pos 29 str='{!frange l=sum(size1, product(size1, .10))}size1 pos 29 is the open parenthesis of product(). can i not use a function within a function? or is there so

Re: Per field mm parameter

2015-05-28 Thread Doug Turnbull
You could use local params with a filter query and specify multiple mm in each local param. Here's an example for our VA State Laws Solr (you're free to poke around with). Here I only allow search results that have mm=1 on catch_line (a title field) and mm=2 for text field. http://solr.quepid.co
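
A rough Python sketch of how such a request might be assembled follows. The example URL above is truncated, so the exact form used there is not known; the `{!edismax qf=... mm=... v=$qq}` shape, the `qq` parameter name, and the `per_field_mm_params` helper are assumptions made for illustration. Only the field names `catch_line` and `text` and their mm values come from the message.

```python
from urllib.parse import urlencode, parse_qs


def per_field_mm_params(user_query, field_mm):
    """Build request params with one edismax filter query per field.

    Each fq carries its own mm in local params and reads the user's
    query text from the shared (hypothetically named) `qq` parameter.
    """
    params = [("q", "*:*"), ("qq", user_query)]
    for field, mm in field_mm.items():
        params.append(("fq", f"{{!edismax qf={field} mm={mm} v=$qq}}"))
    return params


params = per_field_mm_params("no smoking", {"catch_line": 1, "text": 2})
query_string = urlencode(params)
print(query_string)
```

Because each constraint lives in a filter query, it restricts the result set without contributing to the relevancy score.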

Re: Unsubscribe

2015-05-28 Thread Erick Erickson
Please follow instructions here: http://lucene.apache.org/solr/resources.html Be sure to use the exact e-mail address you originally subscribed with. On Thu, May 28, 2015 at 9:49 AM, Nirali Mehta wrote: > Unsubscribe

Re: Ability to load solrcore.properties from zookeeper

2015-05-28 Thread Chris Hostetter
: certainly didn't intend to write it like this!). The problem here will : be that CoreDescriptors are currently built entirely from : core.properties files, and the CoreLocators that construct them don't : have any access to zookeeper. But they do have access to the CoreContainer which is pa

Re: Unsubscribe

2015-05-28 Thread Nirali Mehta
Unsubscribe

Re: Ability to load solrcore.properties from zookeeper

2015-05-28 Thread Chris Hostetter
: Never even considered loading core.properties from ZK, so not even an : oversight on my part ;) to be very clear -- we're not talking about core.properties. we're talking about solrcore.properties -- the file that's existed for much longer than core.properties (predates both solrcloud and

Re: Dynamic range on numbers

2015-05-28 Thread Erick Erickson
fq, not fl. fq is "filter query" fl is the "field list", the stored fields to be returned to the user. Best, Erick On Thu, May 28, 2015 at 9:03 AM, John Blythe wrote: > I've set the field to be processed as such: > > > and then have this in the fl box in Solr admin UI: > *, score, {!frange l=s
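
To make the fq/fl distinction concrete, here is a minimal Python sketch that builds a query string with each parameter in its own role. The frange bounds `l=0 u=10` are placeholder values chosen for illustration, not the bounds from the thread; `size1` is the field named in the quoted message.

```python
from urllib.parse import urlencode, parse_qs

params = {
    "q": "*:*",
    # fl (field list): which stored fields are returned for each hit
    "fl": "*,score",
    # fq (filter query): which documents are kept; placeholder bounds
    "fq": "{!frange l=0 u=10}size1",
}

query_string = urlencode(params)
print(query_string)
```

The local-params expression belongs in fq because it selects documents; fl only lists what is returned for the documents that were selected.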

Re: Per field mm parameter

2015-05-28 Thread Chris Hostetter
: Subject: Per field mm parameter : : How to specify per field mm parameter in edismax query. you can't. the mm param applies to the number of minimum match clauses in the final query, where each of those clauses is a disjunction over each of the qf fields. this blog might help explain the q
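
The explanation above can be modelled in a few lines of Python. This is a deliberately simplified toy, not Lucene's or edismax's actual matching code: it assumes whitespace tokenization, an integer mm, and hypothetical field names. It shows why mm counts clauses (one per query term) rather than fields: each clause is satisfied if the term appears in any of the qf fields.

```python
def matches(doc, terms, qf, mm):
    """Return True if at least `mm` per-term clauses match.

    Each clause is a disjunction: the term may appear in any qf field.
    """
    def clause_matches(term):
        return any(term in doc.get(field, "").lower().split() for field in qf)

    matched = sum(1 for term in terms if clause_matches(term))
    return matched >= mm


# Hypothetical document and query terms for illustration.
doc = {"title": "leather bags", "body": "jute and cotton"}
terms = ["leather", "jute", "wallet"]

print(matches(doc, terms, ["title", "body"], mm=2))
print(matches(doc, terms, ["title", "body"], mm=3))
```

In this model, "leather" is found in the title and "jute" in the body, so two clauses match even though no single field contains both terms.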

Re: solr-user-unsubscribe

2015-05-28 Thread Erick Erickson
Please follow the instructions here: http://lucene.apache.org/solr/resources.html. Be sure to use the exact same e-mail you used to subscribe. Best, Erick On Thu, May 28, 2015 at 6:10 AM, Stefan Meise - SONIC Performance Support wrote: >

Re: When is too many fields in "qf" is too many?

2015-05-28 Thread Erick Erickson
Gotta agree with Jack here. This is an insane number of fields, query performance on any significant corpus will be "fraught" etc. The very first thing I'd look at is having that many fields. You have 3,500 different fields! Whatever the motivation for having that many fields is the place I'd start

Re: Dynamic range on numbers

2015-05-28 Thread John Blythe
I've set the field to be processed as such: and then have this in the fl box in Solr admin UI: *, score, {!frange l=sum(size1, product(size1, .10))}size1 I'm trying to use the size1 field as the item upon which a frange is being used, but also need to use the size1 value for the mathematical fun

Re: Ability to load solrcore.properties from zookeeper

2015-05-28 Thread Erick Erickson
Never even considered loading core.properties from ZK, so not even an oversight on my part ;) On Thu, May 28, 2015 at 3:48 AM, Alan Woodward wrote: > I think this is an oversight, rather than intentional (at least, I certainly > didn't intend to write it like this!). The problem here will

Re: Solr advanced StopFilterFactory

2015-05-28 Thread Timothy Potter
Seems like you should be able to use the ManagedStopFilterFactory with a custom StorageIO impl that pulls from your db: http://lucene.apache.org/solr/5_1_0/solr-core/index.html?org/apache/solr/rest/ManagedResourceStorage.StorageIO.html On Thu, May 28, 2015 at 7:03 AM, Alessandro Benedetti wrote:

Re: distributed search limitations via SolrCloud

2015-05-28 Thread Erick Erickson
5.x will still build a war file that you can deploy on Tomcat. But support for that is going away eventually, certainly by 6.0. But you do have to make the decision sometime before 6.0 at least. Best, Erick On Wed, May 27, 2015 at 1:24 PM, Vishal Swaroop wrote: > Thanks a lot Erick... great input

Re: HW requirements

2015-05-28 Thread Jack Krupansky
You need to translate your source data size into number of documents and document size. Document size will depend on number of fields, the type of data in each field, and the size of the data in each field. You need to think about numeric and date fields, raw string fields, and keyword text fields.

Re: Native library of plugin is loaded for every core

2015-05-28 Thread adfel70
Works as expected :) Thanks guys! -- View this message in context: http://lucene.472066.n3.nabble.com/Native-library-of-plugin-is-loaded-for-every-core-tp4207996p4208372.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Relevancy Score and Proximity Search

2015-05-28 Thread John Blythe
this site has been a great help to me in seeing how things shake out as far as the scores are concerned: http://splainer.io/ -- *John Blythe* Product Manager & Lead Developer 251.605.3071 | j...@curvolabs.com www.curvolabs.com 58 Adams Ave Evansville, IN 47713 On Thu, May 28, 2015 at 10:06 AM,

Re: Relevancy Score and Proximity Search

2015-05-28 Thread Vivek Pathak
Use the explain parameter and it should show you the scores and the calculations Sent from my Fire On May 28, 2015, at 10:06 AM, Zheng Lin Edwin Yeo wrote: Hi, Does anyone knows how Solr does its scoring with a query that has proximity search enabled. For example, when I issue a query q=Matex~1,

Relevancy Score and Proximity Search

2015-05-28 Thread Zheng Lin Edwin Yeo
Hi, Does anyone know how Solr does its scoring with a query that has proximity search enabled? For example, when I issue a query q=Matex~1, the result with the top score that came back was actually 'Latex', and with a score of 2.27. This is with the fact that there are several documents in my in

solr and uima dictionary annotator

2015-05-28 Thread hossmaa
Hi everyone I am using the UIMA DictionaryAnnotator to tag Solr documents. It seems to be working (I do get tags), but I get some strange behavior: 1. I am using the White Space Tokenizer both for the indexed text and for creating the dictionary. Most entries in my dictionary consist of multiple

Guidance needed to modify ExtendedDismaxQParserPlugin

2015-05-28 Thread Aman Tandon
Hi, *Problem Statement: *query -> "i need leather jute bags" If we are searching on the *title *field using the pf2 ( *server:8003/solr/core0/select?q=i%20need%20leather%20jute%20bags&pf2=titlex&debug=query&defType=edismax&wt=xml&rows=0*). Currently it will create the shingled phrases like "i nee

RE: HW requirements

2015-05-28 Thread Allison, Timothy B.
A classic on the importance of prototyping with your data and on the intractability of sizing in the abstract: https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/ This might be of use: https://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/s

Re: solr uima and opennlp

2015-05-28 Thread hossmaa
Hi Tommaso Thanks for the quick reply! I have another question about using the Dictionary Annotator, but I guess it's better to post it separately. Cheers Andreea -- View this message in context: http://lucene.472066.n3.nabble.com/solr-uima-and-opennlp-tp4206873p4208348.html Sent from the Sol

solr-user-unsubscribe  

2015-05-28 Thread Stefan Meise - SONIC Performance Support

Re: Solr advanced StopFilterFactory

2015-05-28 Thread Alessandro Benedetti
As Alex initially specified, the custom stop filter factory is the right way! So is it mainly related to the suggester? Anyway, with a custom stop filter it is possible, and it could be a nice contribution as well. Cheers 2015-05-28 13:01 GMT+01:00 Rupali : > sylkaalex gmail.com> write

Re: When is too many fields in "qf" is too many?

2015-05-28 Thread Jack Krupansky
This does not even pass a basic smell test for reasonability of matching the capabilities of Solr and the needs of your application. I'd like to hear from others, but I personally would be -1 on this approach to misusing qf. I'd simply say that you need to go back to the drawing board, and that you

Per field mm parameter

2015-05-28 Thread Nutch Solr User
How to specify per field mm parameter in edismax query. - Nutch Solr User "The ultimate search engine would basically understand everything in the world, and it would always give you the right thing." -- View this message in context: http://lucene.472066.n3.nabble.com/Per-field-mm-paramet

Re: When is too many fields in "qf" is too many?

2015-05-28 Thread Steven White
Hi Charles, That is what I have done. At the moment, I have 22 request handlers, some have 3490 field items in "qf" (that's the most and the qf line spans over 95,000 characters in solrconfig.xml file) and the least one has 1341 fields. I'm working on seeing if I can use copyField to copy the da

Re: Solr advanced StopFilterFactory

2015-05-28 Thread Rupali
sylkaalex gmail.com> writes: > > The main goal to allow each user use own stop words list. For example user > type "th" > now he will see next results in his terms search: > the > the one > the then > then > then and > > But user has stop word "the" and he want get next results: > then > then

Re: SolrCloud: Creating more shard at runtime will lower down the load?

2015-05-28 Thread Aman Tandon
Thank you Alessandro. With Regards Aman Tandon On Thu, May 28, 2015 at 3:57 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote: > Hi Aman, > this feature can be interesting for you : > > > Shard Splitting > > > > When you create a collection in SolrCloud, you decide on the initial > >

Re: Ability to load solrcore.properties from zookeeper

2015-05-28 Thread Alan Woodward
I think this is an oversight, rather than intentional (at least, I certainly didn't intend to write it like this!). The problem here will be that CoreDescriptors are currently built entirely from core.properties files, and the CoreLocators that construct them don't have any access to zookeeper.

Re: SolrCloud: Creating more shard at runtime will lower down the load?

2015-05-28 Thread Alessandro Benedetti
Hi Aman, this feature can be interesting for you : > Shard Splitting > > When you create a collection in SolrCloud, you decide on the initial > number shards to be used. But it can be difficult to know in advance the > number of shards that you need, particularly when organizational > requirements