RE: SlrCloud RAM requirments

2014-09-24 Thread Norgorn
Thanks again. I answered before properly reading your post, my apologies. I can't say for sure, because filter caches are out of the JVM (dat HS), but top shows 5 GB cached and no free RAM. The only question for me now is how to balance disk cache and filter cache? Do I need to worry about that,

Solr Cloud Default Document Routing

2014-09-24 Thread Susmit Shukla
Hi, I'm building out a multi-shard Solr collection as the index size is likely to grow fast. I was testing out the setup with 2 shards on 2 nodes with test data. I indexed a few documents with "id" as the unique key. collection create command - /solr/admin/collections?action=CREATE&name=multishard&num
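
For readers following this thread, here is a rough sketch of how hash-based routing on the "id" field can place documents on one of two shards. It assumes a MurmurHash3 (x86, 32-bit, seed 0) hash of the id and equal slices of the signed 32-bit hash range. The names `murmur3_32` and `shard_index` are hypothetical, and the exact range boundaries Solr assigns may differ, so treat this as an illustration rather than Solr's implementation.

```python
MASK32 = 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86 32-bit, returned as an unsigned 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    rounded = len(data) & ~3  # length rounded down to a multiple of 4
    for i in range(0, rounded, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & MASK32
        k = ((k << 15) | (k >> 17)) & MASK32
        k = (k * c2) & MASK32
        h ^= k
        h = ((h << 13) | (h >> 19)) & MASK32
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[rounded:]
    if tail:
        # Remaining 1-3 bytes are mixed in as a little-endian integer.
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = ((k << 15) | (k >> 17)) & MASK32
        k = (k * c2) & MASK32
        h ^= k
    # Finalization mix.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def shard_index(doc_id: str, num_shards: int) -> int:
    """Map an id to a slot, assuming equal slices of the signed hash range."""
    h = murmur3_32(doc_id.encode("utf-8"))
    # Shift so the most negative signed hash lands in slot 0 (an assumption).
    position = (h + 0x80000000) & MASK32
    return position * num_shards // (1 << 32)


if __name__ == "__main__":
    for doc_id in ("1", "2", "3", "4"):
        print(doc_id, "-> slot", shard_index(doc_id, 2))
```

With only a handful of test documents, an uneven split between two shards is expected under any hash-based scheme; the distribution evens out as the number of ids grows.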

Re: MRIT's morphline mapper doesn't co-locate with data

2014-09-24 Thread Wolfgang Hoschek
Based on our measurements, Lucene indexing is so CPU intensive that it wouldn’t really help much to exploit data locality on read. The overwhelming bottleneck remains the same. Having said that, we have an ingestion tool in the works that will take advantage of data locality for splittable files

Re: Does soft commit block on autowarming?

2014-09-24 Thread Yonik Seeley
On Wed, Sep 24, 2014 at 6:56 PM, Bruce Johnson wrote: > Is it reliably true that once a soft commit request returns, > any subsequent queries will hit a new (and autowarmed) searcher? Yes. The default for commit and softCommit commands is waitSearcher=true, which will not return until a new searc
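
As a small illustration of the parameters mentioned in this reply, the sketch below builds an explicit soft-commit request URL with `waitSearcher=true` spelled out rather than relying on the default. The helper name `soft_commit_url`, the host, and the collection name are placeholders, not taken from the thread.

```python
from urllib.parse import urlencode


def soft_commit_url(base_url: str, collection: str) -> str:
    """Build an update URL asking for a soft commit that waits for the new searcher."""
    params = {
        "softCommit": "true",
        # The default per the reply above, written out here for clarity.
        "waitSearcher": "true",
    }
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host and collection name.
    print(soft_commit_url("http://localhost:8983/solr", "collection1"))
```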

Does soft commit block on autowarming?

2014-09-24 Thread Bruce Johnson
I currently have an algorithm that needs to know whether query results are fresh up to a known point in time, and I'm using an explicit soft commit request to act as a latch point. I record the time T just before I issue a soft commit request, and when it returns, I assume that query results includ
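
The latch pattern described in this message could be sketched roughly as follows. The `soft_commit` callable is a hypothetical stand-in for whatever client call issues the commit, and the sketch assumes that call blocks until the new searcher is visible.

```python
import time
from typing import Callable


def commit_latch(soft_commit: Callable[[], None],
                 clock: Callable[[], float] = time.time) -> float:
    """Record T, issue a blocking soft commit, then return T.

    Once this returns, queries are assumed to reflect every update
    accepted before T. That only holds if soft_commit() blocks until
    the new searcher is registered.
    """
    t = clock()      # time T, taken just before the commit request
    soft_commit()    # hypothetical client call, assumed to block
    return t


if __name__ == "__main__":
    fresh_up_to = commit_latch(lambda: None)  # no-op commit for demonstration
    print("results assumed fresh up to", fresh_up_to)
```

Taking T before the request rather than after it keeps the guarantee conservative: updates that arrive while the commit is in flight are not counted as visible.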

Re: [ANN] Lucidworks Fusion 1.0.0

2014-09-24 Thread Sebastián Ramírez
It's good to know you'll talk about it at Lucene/Solr Revolution 2014 too. *Sebastián Ramírez* Diseñador de Algoritmos Tel: (+571) 795 7950 ext: 1012 Cel: (+57) 300 370 77 10 Calle 73 No 7 - 06 Piso 4 Linkedin: co.linkedin.com/in/tiangolo/ Email:

Scoring with wild cars

2014-09-24 Thread Pigeyre Romain
Hi, I have two records with a name_fra field. One with name_fra="un test CARREAU" and another one with name_fra="un test CARRE" { "codeBarre": "1", "name_FRA": "un test CARREAU" } { "codeBarre": "2", "name_FRA": "un test CARRE" } Configuration of these fi

Re: Solr 4.10 termsIndexInterval and termsIndexDivisor not supported with default PostingsFormat?

2014-09-24 Thread Tom Burton-West
Thanks Hoss, Just opened SOLR-6560 and attached a patch which removes the offending section from the example solrconfig.xml file. We suspect that with the much more efficient block and FST based Solr 4 default postings format that the need to mess with the parameters in order to reduce memory u

Re: Spellchecking and suggesting part numbers

2014-09-24 Thread Jorge Luis Betancourt Gonzalez
I’ve done something similar to this using the EdgeNGram filter, not the spellchecker component. I don’t know if this fits your requirements: The relevant portion of my fieldType config: class="solr.SpellCheckComponent"> > >
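
To make the edge n-gram idea concrete, here is a minimal sketch of the prefixes such a filter would emit for a part number. The function name `edge_ngrams` and the gram sizes are illustrative assumptions and are not taken from the poster's configuration.

```python
from typing import List


def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 15) -> List[str]:
    """Return the leading substrings of token with lengths min_gram..max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


if __name__ == "__main__":
    # Every emitted gram keeps the left side of the part number intact.
    print(edge_ngrams("ABCD1234", min_gram=4, max_gram=8))
```

Because every gram starts at the first character, matches on such a field always agree with the query on the left-hand side, which is the property the original question asks for.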

MRIT's morphline mapper doesn't co-locate with data

2014-09-24 Thread Tom Chen
Hi, The MRIT (MapReduceIndexerTool) uses NLineInputFormat for the morphline mapper. The mapper doesn't co-locate with the input data that it processes. Isn't this a performance hit? Ideally, the morphline mapper should run on the hosts that contain most data blocks for the input files it processes.

Memory issue in merge thread

2014-09-24 Thread Thomas Mortagne
Hi guys, I recently upgraded from Solr 4.0 to 4.8.1. I started it with a clean index (we made some changes to the Solr schema in the meantime) and after some time of indexing a very big database my instance becomes totally unusable with 99% of the heap filled. Then when I restart it, it gets stuck v

Re: Issue Adding Filter Query

2014-09-24 Thread Erick Erickson
Glad your problem isn't one any longer. Yeah, there are a lot of nooks and crannies that one gets used to with Solr! I'd estimate that between learning how to read the debug output and the analysis page 80-90% of the "my search isn't working" questions on the list can be answered, but it takes a w

Re: Solr upgrade to latest version

2014-09-24 Thread Erick Erickson
Did you look at the rest of this thread? There are some comments there. The CHANGES.txt file will guide you through each intermediate step. There's nothing going straight from 1.4 to 4.x. You could go from 1.4 -> 3.x then 3.x->4.x, but frankly I'd just start with a stock 4.x distro and transfer o

Help in selecting the appropriate feature to obtain results

2014-09-24 Thread barrybear
Hi guys, I'm still a beginner to Solr and I'm not sure whether to implement a custom Filter Query or any other available features/plugins that I am not aware of in Solr. I am using Solr v4.4.0. I have a collection as an example as below: [ { description: 'group1', group: ['G?', 'GE

RE: Spellchecking and suggesting part numbers

2014-09-24 Thread Dyer, James
Alexander, You could use a higher value for spellcheck.count, maybe 20 or so, then in your application pick out the suggestions that make changes on the right side. Another option is to use DirectSolrSpellChecker (usually a better choice anyhow) and set the "minPrefix" field. This will require
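
The first suggestion in this reply (request more suggestions, then keep the ones that only change the right side) could be done in application code along these lines. The helper names and the `min_prefix` threshold are hypothetical; the sample terms come from the original question in this thread.

```python
from typing import List


def common_prefix_len(a: str, b: str) -> int:
    """Count how many leading characters the two strings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def right_side_suggestions(term: str, suggestions: List[str], min_prefix: int) -> List[str]:
    """Keep suggestions sharing at least min_prefix leading characters with term.

    Results are ordered by longest shared prefix first.
    """
    kept = [s for s in suggestions if common_prefix_len(term, s) >= min_prefix]
    return sorted(kept, key=lambda s: common_prefix_len(term, s), reverse=True)


if __name__ == "__main__":
    print(right_side_suggestions("ABCD1234", ["ABCE1234", "ABCD1244"], min_prefix=4))
```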

Spellchecking and suggesting part numbers

2014-09-24 Thread Lochschmied, Alexander
Hello Solr Users, we are trying to get suggestions for part numbers using the spellchecker. Problem scenario: ABCD1234 // This is the search term ABCE1234 // This is what we get from spellchecker ABCD1244 // This is what we would like to get from spellchecker Characters towards the left of our

Re: Issue Adding Filter Query

2014-09-24 Thread aaguilar
Hello Erick, Just wanted to let you know that I made the change you suggested and everything works as expected. Also, thanks for letting me know about the Analysis page in Solr. I did not know about it and I have found it very useful. Thanks! On Mon, Sep 22, 2014 at 5:41 PM, Antelmo Aguilar wr

RE: SlrCloud RAM requirments

2014-09-24 Thread Toke Eskildsen
Norgorn [lsunnyd...@mail.ru] wrote: > Collection contains about billion of documents. So 3-400M documents per core. That is a challenge with frequent updates and facets, but with your simple queries it should be doable. > At the end, I want to reach several seconds per search query (for not cach

Re: [ANN] Lucidworks Fusion 1.0.0

2014-09-24 Thread Grant Ingersoll
Hi Thomas, Thanks for the question, yes, I give a brief demo of it in action during my talk and we will have demos at our booth. I will also give a demo during the Webinar, which will be recorded. As others have said as well, you can simply download it and try yourself. Cheers, Grant On Sep

RE: SlrCloud RAM requirments

2014-09-24 Thread Norgorn
Thanks for your reply. The collection contains about a billion documents. I'm mostly using simple queries with date and other filters (5 filters per query). Yup, disks are cheapest and simplest. In the end, I want to reach several seconds per search query (for not cached query =) ), so, please,

log to file and to logging tab in web interface at the same time

2014-09-24 Thread wolvverine
Is there any way to log to a file (I've done this) AND see fresh logs in the Solr web interface (Logging tab)? My configuration in the *.war file, classes/log4j.properties: # Logging level solr.log=/usr/local/inp/logs/tomcat log4j.rootLogger=WARN, File, Console # Log to console log4j.appender.Console=org.apa

Re: Solr: Boost of childs (json)

2014-09-24 Thread ku3ia
ku3ia wrote > I can't find an example to post document with child boosted documents > using json update handler. > ... > How to set the "boost" of child documents?? No ideas? Is it possible at all? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Boost-of-childs-json-tp

RE: How does KeywordRepeatFilterFactory help giving a higher score to an original term vs a stemmed term

2014-09-24 Thread Markus Jelsma
Hi - but this makes no sense, they are scored as equals, except for tiny differences in TF and IDF. What you would need is something like a stemmer that preserves the original token and gives a < 1 payload to the stemmed token. The same goes for filters like decompounders and accent folders that

RE: RE: using facet enum et fc in the same query.

2014-09-24 Thread Toke Eskildsen
jerome.dup...@bnf.fr [jerome.dup...@bnf.fr] wrote: [1 thread = 15 seconds, ∞ threads = 2 seconds] > The "slow" request corresponds to our website search query. It is for our > book catalog: some facets are for type of documents, author, title > subjects, location of the book, dates... > In this requ

Re: Accessing document stored fields in a custom function

2014-09-24 Thread Mikhail Khludnev
Actual stored fields are definitely a no-go. You can hit any kind of forward-view index. http://www.youtube.com/watch?v=T5RmMNDR5XI Look at StrFieldSource, IntFieldSource. If you wonder how to access stored fields anyway, call org.apache.lucene.index.IndexReader.document(int). Beware of difference be

Re: Combining several fields for facets.

2014-09-24 Thread lboutros
How many different values do you have in your fields, and do you know them? Is faceting by query not an option for you? Ludovic. - Jouve France.

RE: SlrCloud RAM requirments

2014-09-24 Thread Toke Eskildsen
Norgorn [lsunnyd...@mail.ru] wrote: > I have CLOUD with 3 nodes and 16 MB RAM on each. > My index is about 1 TB and search speed is awfully bad. We all have different standards with regard to search performance. What is "awfully bad" and what is "good enough" for you? Related to this: How many d

Re: Combining several fields for facets.

2014-09-24 Thread SolrUser1543
Using a copy field will require a reindex of my data; I am looking for a solution without reindexing.

Re: Loading an index (generated by map reduce) in SolrCloud

2014-09-24 Thread rulinma
Copying is not a good choice; transfer to HDFS and merge.