Using LocalParams in StatsComponent to create a price slider?
Hi, I'm using the StatsComponent to retrieve the lower and upper bounds of a price field to create a price slider. If someone sets the price range to $100-$200, I have to add a filter to the query. But then the lower and upper bounds are calculated from the filtered result. Is it possible to use LocalParams (like for facets) to ignore a specific filter? Thanks. Mark
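For what it's worth, newer Solr releases do support facet-style filter exclusion for stats via a tagged fq. A sketch of the request parameters, assuming a Solr version where {!ex=...} works on stats.field (the tag name priceFq is invented):

```
q=*:*
&fq={!tag=priceFq}price:[100 TO 200]
&stats=true
&stats.field={!ex=priceFq}price
```

With the exclusion, the min/max of price are computed as if the price filter were not applied, so the slider bounds stay stable while the user narrows the range.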
Reducing heap space consumption for large dictionaries?
Hi, in my index schema I have defined a DictionaryCompoundWordTokenFilterFactory and a HunspellStemFilterFactory. Each filter factory has a dictionary with about 100k entries. To avoid an out-of-memory error I have to set the heap space to 128m for one index. Is there a way to reduce the memory consumption when parsing the dictionary? I need to create several indexes, and 128m for each index is too much. mark
Stemming - How to add tokens instead of replacing the existing tokens?
Hi, I'd like to use the HunspellStemFilterFactory to improve my search results. Why isn't there an inject argument like in solr.PhoneticFilterFactory to add tokens instead of replacing them? I don't want to replace them, because documents with the unstemmed word should be more relevant. Thanks.
Re: Stemming - How to add tokens instead of replacing the existing tokens?
Hi Marian, thanks for your answer. Using a copyField is a good idea. Mark 2011/12/5 Marian Steinbach marian.steinb...@gmail.com: Hi Mark! You could help yourself by creating an additional field. One field would hold the stemmed version and the other one the unstemmed version. This would allow for a higher boost on the unstemmed field. Use copyField for convenience to copy the content from one field to the other. Marian 2011/12/5 Mark Schoy hei...@gmx.de Hi, I'd like to use the HunspellStemFilterFactory to improve my search results. Why isn't there an inject argument like in solr.PhoneticFilterFactory to add tokens instead of replacing them? I don't want to replace them, because documents with the unstemmed word should be more relevant. Thanks.
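Marian's suggestion, sketched as a schema fragment (all field and type names are invented; the Hunspell analyzer definition itself is omitted):

```
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="title_stemmed" type="text_hunspell" indexed="true" stored="false"/>
<copyField source="title" dest="title_stemmed"/>
```

With (e)dismax you can then search both fields and boost the unstemmed one, e.g. qf=title^2 title_stemmed, so exact-word matches rank above stemmed matches.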
Re: Facet on a field with rows=n
Hi Kashif, that is not possible in Solr. The facets are always based on all the documents matching the query. But there is a workaround: 1) Do a normal query without facets (you only need to request doc ids at this point). 2) Collect the IDs of the documents returned. 3) Do a second query for all fields and facets, adding a filter to restrict the result to the IDs collected in step 2. Mark 2011/12/5 Kashif Khan uplink2...@gmail.com: Hi all, I am looking for a solution where the facets are computed based on the paging of the Solr documents. For example: say I have the query *:* and set start=0 and rows=10, and then I want facets on one of the fields of the 10 docs obtained, not on all the docs the query matched. Can anyone solve my problem?
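The three steps above, sketched as two HTTP requests (host, ids, and the facet field are illustrative):

```
# Steps 1+2: fetch only the ids of the current page
http://localhost:8983/solr/select?q=*:*&start=0&rows=10&fl=id

# Step 3: facet over exactly those ids
http://localhost:8983/solr/select?q=*:*&rows=10&fq=id:(id1 OR id2 OR id3)&facet=true&facet.field=category
```

The second request's fq restricts the result set to the page collected in the first, so the facet counts reflect only those documents.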
Best practice to automatically change a field value for a specific period of time
Hi, I have a Solr index for an online shop with a field price which contains the standard price of a product. But in the database, the shop owner can specify a period of time with an alternative price. For example: the standard price is $20.00, but from 12/24/11 08:00am to 12/26/11 11:59pm it is $12.59. Of course I could use a cronjob to update the documents. But I think this is too unstable. I could also save all price campaigns in a field and then extract the correct price. But then I could not sort by price, or only by the standard price. What I need is a field where I can put a condition like: if [current_time is within one of the price campaigns] then [return the price of that campaign]. But (unfortunately) this is not possible. Thanks for advice.
Re: Best practice to automatically change a field value for a specific period of time
Hi Morten, thanks, this is a very good solution. I also found another solution: creating a custom ValueSourceParser for price sorting which considers the standard price and the campaign price. In my special case I think your approach isn't working, because I also need result grouping and this can't be combined with field collapsing. 2011/12/2 Morten Lied Johansen morte...@ifi.uio.no: On 02. des. 2011 12:21, Mark Schoy wrote: This is a problem that can be solved with grouping. http://wiki.apache.org/solr/FieldCollapsing For each possible price on a product, you index a document with the dates and the price. In your query, you group on the product and apply a date filter, and the price you see for each product will be from the top document within the given dates. You can also sort by price. If you have multiple overlapping campaigns, you might need to pay attention to which one you want to take precedence, as your sorting will determine which document gets shown. -- Morten We all live in a yellow subroutine.
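Morten's grouping approach, sketched (all field names are invented): index one document per price period, then group on the product and filter by validity dates.

```
# Example documents for one product:
#   id=p1-standard  product=p1  price=20.00  valid_from=1970-01-01T00:00:00Z  valid_to=2099-12-31T00:00:00Z
#   id=p1-campaign  product=p1  price=12.59  valid_from=2011-12-24T08:00:00Z  valid_to=2011-12-26T23:59:00Z

q=*:*
&fq=valid_from:[* TO NOW] AND valid_to:[NOW TO *]
&group=true
&group.field=product
&group.sort=price asc
&sort=price asc
```

During the campaign window both documents match the date filter, and group.sort decides which one represents the product; here the cheaper campaign price wins, which is one way to resolve overlaps.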
Solr Replication: relative path in confFiles Element?
Hi, is it possible to define a relative path in confFiles? For example: <str name="confFiles">../../x.xml</str> If yes, to which location will the file be copied on the slave? Thanks.
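For reference, the usual documented form lists conf files by name, relative to the master's conf/ directory (the file names here are just examples):

```
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>
```

Whether ../../ paths outside conf/ are supported is exactly the open question here, so treat this snippet only as the baseline case.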
Re: How to create a solr core if no solr cores were created before?
Thanks for your answer, but it doesn't really help me. Could you please add more information in addition to this link? Do I have to create a root core before I can create other cores? How can I create a root core? By manually adding it to the solr.xml config? 2011/7/11 Gabriele Kahlout gabri...@mysimpatico.com: have a look here [1]. [1] https://issues.apache.org/jira/browse/SOLR-2645?focusedCommentId=13062748&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13062748 On Mon, Jul 11, 2011 at 4:46 PM, Mark Schoy hei...@gmx.de wrote: Hi, I tried to create a Solr core but I always get a "No such solr core" exception. - File home = new File(pathToSolrHome); File f = new File(home, "solr.xml"); CoreContainer coreContainer = new CoreContainer(); coreContainer.load(pathToSolrHome, f); EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, ""); CoreAdminRequest.createCore(coreName, coreDir, server); - I think the problem is the "" in new EmbeddedSolrServer(coreContainer, ""). Thanks. -- Regards, K. Gabriele
How to create a solr core if no solr cores were created before?
Hi, I tried to create a Solr core but I always get a "No such solr core" exception. - File home = new File(pathToSolrHome); File f = new File(home, "solr.xml"); CoreContainer coreContainer = new CoreContainer(); coreContainer.load(pathToSolrHome, f); EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, ""); CoreAdminRequest.createCore(coreName, coreDir, server); - I think the problem is the "" in new EmbeddedSolrServer(coreContainer, ""). Thanks.
Using two repeaters to rapidly switch Master and Slave (Replication)?
Hi, I have an idea for switching master and slave in case one server crashes: set up two servers as repeaters, but disable the master and slave config on both with <str name="enable">false</str>. Now you can dynamically disable and enable the master or slave role by URL: enable / disable replication on the master: http://master_host:port/solr/replication?command=disablereplication http://master_host:port/solr/replication?command=enablereplication enable / disable polling on the slave: http://slave_host:port/solr/replication?command=disablepoll http://slave_host:port/solr/replication?command=enablepoll Does this work?
Re: why too many open files?
Hi, have you checked the maximum number of open files allowed by your OS? See: http://lj4newbies.blogspot.com/2007/04/too-many-open-files.html 2011/6/20 Jason, Kim hialo...@gmail.com Hi, All I have 12 shards and ramBufferSizeMB=512, mergeFactor=5. But Solr raises java.io.FileNotFoundException (Too many open files). mergeFactor is just 5. How can this happen? Below are the segments of one shard. That is far more segments than the mergeFactor. What's wrong, and how should I set the mergeFactor? == [root@solr solr]# ls indexData/multicore-us/usn02/data/index/ _0.fdt _gs.fdt _h5.tii _hl.nrm _i1.nrm _kn.nrm _l1.nrm _lq.tii _0.fdx _gs.fdx _h5.tis _hl.prx _i1.prx _kn.prx _l1.prx _lq.tis _3i.fdt _gs.fnm _h7.fnm _hl.tii _i1.tii _kn.tii _l1.tii lucene-2de7b31b5eabdff0b6ec7fd32eecf8c7-write.lock _3i.fdx _gs.frq _h7.frq _hl.tis _i1.tis _kn.tis _l1.tis _lu.fnm _3s.fnm _gs.nrm _h7.nrm _hn.fnm _j7.fdt _kp.fnm _l2.fnm _lu.frq _3s.frq _gs.prx _h7.prx _hn.frq _j7.fdx _kp.frq _l2.frq _lu.nrm _3s.nrm _gs.tii _h7.tii _hn.nrm _kb.fnm _kp.nrm _l2.nrm _lu.prx _3s.prx _gs.tis _h7.tis _hn.prx _kb.frq _kp.prx _l2.prx _lu.tii _3s.tii _gu.fnm _h9.fnm _hn.tii _kb.nrm _kp.tii _l2.tii _lu.tis _3s.tis _gu.frq _h9.frq _hn.tis _kb.prx _kp.tis _l2.tis _ly.fnm _48.fdt _gu.nrm _h9.nrm _hp.fnm _kb.tii _kq.fnm _l6.fnm _ly.frq _48.fdx _gu.prx _h9.prx _hp.frq _kb.tis _kq.frq _l6.frq _ly.nrm _4d.fnm _gu.tii _h9.tii _hp.nrm _kc.fnm _kq.nrm _l6.nrm _ly.prx _4d.frq _gu.tis _h9.tis _hp.prx _kc.frq _kq.prx _l6.prx _ly.tii _4d.nrm _gw.fnm _hb.fnm _hp.tii _kc.nrm _kq.tii _l6.tii _ly.tis _4d.prx _gw.frq _hb.frq _hp.tis _kc.prx _kq.tis _l6.tis _m3.fnm _4d.tii _gw.nrm _hb.nrm _hr.fnm _kc.tii _kr.fnm _la.fnm _m3.frq _4d.tis _gw.prx _hb.prx _hr.frq _kc.tis _kr.frq _la.frq _m3.nrm _5b.fdt _gw.tii _hb.tii _hr.nrm _kf.fdt _kr.nrm _la.nrm _m3.prx _5b.fdx _gw.tis _hb.tis _hr.prx _kf.fdx _kr.prx _la.prx _m3.tii _5b.fnm _gy.fnm _he.fdt _hr.tii _kf.fnm _kr.tii _la.tii _m3.tis _5b.frq _gy.frq _he.fdx _hr.tis _kf.frq _kr.tis _la.tis _m8.fnm _5b.nrm _gy.nrm _he.fnm _ht.fnm _kf.nrm _kt.fnm _le.fnm _m8.frq _5b.prx _gy.prx _he.frq _ht.frq _kf.prx _kt.frq _le.frq _m8.nrm _5b.tii _gy.tii _he.nrm _ht.nrm _kf.tii _kt.nrm _le.nrm _m8.prx _5b.tis _gy.tis _he.prx _ht.prx _kf.tis _kt.prx _le.prx _m8.tii _5m.fnm _h0.fnm _he.tii _ht.tii _kg.fnm _kt.tii _le.tii _m8.tis _5m.frq _h0.frq _he.tis _ht.tis _kg.frq _kt.tis _le.tis _md.fnm _5m.nrm _h0.nrm _hh.fnm _hv.fnm _kg.nrm _kw.fnm _li.fnm _md.frq _5m.prx _h0.prx _hh.frq _hv.frq _kg.prx _kw.frq _li.frq _md.nrm _5m.tii _h0.tii _hh.nrm _hv.nrm _kg.tii _kw.nrm _li.nrm _md.prx _5m.tis _h0.tis _hh.prx _hv.prx _kg.tis _kw.prx _li.prx _md.tii _5n.fnm _h2.fnm _hh.tii _hv.tii _kj.fdt _kw.tii _li.tii _md.tis _5n.frq _h2.frq _hh.tis _hv.tis _kj.fdx _kw.tis _li.tis _mi.fnm _5n.nrm _h2.nrm _hk.fnm _hz.fdt _kj.fnm _ky.fnm _lm.fnm _mi.frq _5n.prx _h2.prx _hk.frq _hz.fdx _kj.frq _ky.frq _lm.frq _mi.nrm _5n.tii _h2.tii _hk.nrm _hz.fnm _kj.nrm _ky.nrm _lm.nrm _mi.prx _5n.tis _h2.tis _hk.prx _hz.frq _kj.prx _ky.prx _lm.prx _mi.tii _5x.fnm _h5.fdt _hk.tii _hz.nrm _kj.tii _ky.tii _lm.tii _mi.tis _5x.frq _h5.fdx _hk.tis _hz.prx _kj.tis _ky.tis _lm.tis segments_1 _5x.nrm _h5.fnm _hl.fdt _hz.tii _kn.fdt _l1.fdt _lq.fnm segments.gen _5x.prx _h5.frq _hl.fdx _hz.tis _kn.fdx _l1.fdx _lq.frq _5x.tii _h5.nrm _hl.fnm _i1.fnm _kn.fnm _l1.fnm _lq.nrm _5x.tis _h5.prx _hl.frq _i1.frq _kn.frq _l1.frq _lq.prx == Thanks in advance.
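A quick way to check the limits mentioned in the linked article (Linux; the 65536 in the comment is only an example value):

```shell
# Per-process limit on open file descriptors for the current shell
ulimit -n

# System-wide maximum number of open files (Linux)
cat /proc/sys/fs/file-max

# To raise the per-process limit for the current session, e.g.:
#   ulimit -n 65536
```

For a permanent change, the limit is typically set in /etc/security/limits.conf for the user running Solr.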
Master Slave Replication in Solr Cloud - What happens if the master is not available?
Hi, if I use master-slave replication in Solr Cloud and the master crashes, can the slave automatically switch to master mode? Or is there another way to index documents after the master is down? Thanks.
Re: Master Slave Replication in Solr Cloud - What happens if the master is not available?
Thanks for your answer, Erick. So the easiest way would be to set up a 2-shard cluster with shard replicas ;) 2011/6/20 Erick Erickson erickerick...@gmail.com: No, there's nothing built into Solr to automatically promote a slave to a master. You have several choices here. One is to build a new master and reindex from scratch. Another is to configure your slave as a new master and then bring up a new machine and have it replicate. Now make that new machine your master (you'll have to re-configure both). The fun part is continuing to serve requests while all this is going on. It's easier if you have more than one slave so you can move things around while the remaining slave is reconfigured (or whatever). Best Erick On Mon, Jun 20, 2011 at 8:28 AM, Mark Schoy hei...@gmx.de wrote: Hi, if I use master-slave replication in Solr Cloud and the master crashes, can the slave automatically switch to master mode? Or is there another way to index documents after the master is down? Thanks.
Re: Master Slave Replication in Solr Cloud - What happens if the master is not available?
You're right, thanks! 2011/6/20 Erick Erickson erickerick...@gmail.com: Hmmm, be a little careful here with terminology. Shards may be unnecessary if you can put your whole index on a single searcher. It's preferable to simply have each slave hold a complete copy of the index, no sharding necessary. Best Erick On Mon, Jun 20, 2011 at 9:45 AM, Mark Schoy hei...@gmx.de wrote: Thanks for your answer Erick. So the easiest way will be to set up 2 shard cluster with shard replicas ;) 2011/6/20 Erick Erickson erickerick...@gmail.com: No, there's nothing built into Solr to automatically promote a slave to a master. You have several choices here. One is to build a new master and reindex from scratch. Another is to configure your slave as a new master and then bring up a new machine and have it replicate. Now make that new machine your master (you'll have to re-configure both). The fun part is continuing to serve requests while all this is going on. It's easier if you have more than one slave so you can move things around while the remaining slave is reconfigured (or whatever) Best Erick On Mon, Jun 20, 2011 at 8:28 AM, Mark Schoy hei...@gmx.de wrote: Hi, if I use a master slave replication in Solr Cloud and the master crashes, can the slave automatically switch to master mode? Or is there another way to index documents after the master is down? Thanks.
Re: Indexing-speed issues (chart included)
Sorry, here are some details: requestHandler: XmlUpdateRequestHandler protocol: http (10 concurrent threads) document: 1kb size, 15 fields cpu load: 20% memory usage: 50% But generally speaking, is that normal or must something be wrong with my configuration, ... 2011/6/17 Erick Erickson erickerick...@gmail.com Well, it's kinda hard to say anything pertinent with so little information. How are you indexing things? What kind of documents? How are you feeding docs to Solr? You might review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Fri, Jun 17, 2011 at 8:10 AM, Mark Schoy hei...@gmx.de wrote: Hi, if I start indexing documents, indexing gets slower the more documents are added, without committing and optimizing: http://imageshack.us/photo/my-images/695/solrchart.png/ I've changed the mergeFactor from 10 to 30 and changed maxDocs (100, 1000, 1) but it always gets slower the more documents are added. If I'm using elasticsearch, which is also based on Lucene, I get constant indexing rates (without committing and optimizing too). Does anybody know what's wrong?
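One configuration knob worth checking: without any explicit commits, the autoCommit settings in solrconfig.xml decide how much Solr buffers before flushing to disk. A sketch of the block (the numbers are arbitrary examples, not recommendations):

```
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
```

Periodic flushing keeps the in-memory buffers bounded, which can make the indexing rate steadier over a long run.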
Performance loss - querying more than 64 cores (randomly)
Hi, I set up a Solr instance with 512 cores. Each core has 100k documents and 15 fields. Solr is running on a CPU with 4 cores (2.7GHz) and 16GB RAM. Now I've done some benchmarks with JMeter. On each thread iteration, JMeter queries another core at random. Here are the results (duration: 180 seconds each):

Randomly queried cores | queries per second
  1 | 2016
  2 | 2001
  4 | 1978
  8 | 1958
 16 | 2047
 32 | 1959
 64 | 1879
128 | 1446
256 | 1009
512 |  428

Why are the queries per second constant up to 64 cores, while beyond that the performance decreases rapidly? Solr only uses 10GB of the 16GB memory, so I think it is not a memory issue.
Re: Performance loss - querying more than 64 cores (randomly)
Thanks for your answers. Andrzej was right with his assumption. Solr only needs about 9GB memory, but the system needs the rest of it for disk IO: 64 cores: 64*100MB index size = 6.4GB + 9GB Solr cache + about 600MB OS = 16GB Conclusion: my system can buffer the data of exactly 64 cores. Every additional core can't be buffered, and the performance decreases. 2011/6/16 François Schiettecatte fschietteca...@gmail.com I am assuming that you are running on linux here, I have found atop to be very useful to see what is going on. http://freshmeat.net/projects/atop/ dstat is also very useful too but needs a little more work to 'decode'. Obviously there is contention going on, you just need to figure out where it is, most likely it is disk I/O but it could also be the number of cores you have. Also I would not say that performance is decreasing rapidly, probably more of a gentle slope down if you plot it (you double the number of cores every time). I would be very interested in hearing about what you find. Cheers François On Jun 16, 2011, at 10:00 AM, Andrzej Bialecki wrote: On 6/16/11 3:22 PM, Mark Schoy wrote: Hi, I set up a Solr instance with 512 cores. Each core has 100k documents and 15 fields. Solr is running on a CPU with 4 cores (2.7GHz) and 16GB RAM. Now I've done some benchmarks with JMeter. On each thread iteration, JMeter queries another core at random. Here are the results (duration: 180 seconds each): Randomly queried cores | queries per second 1 | 2016 2 | 2001 4 | 1978 8 | 1958 16 | 2047 32 | 1959 64 | 1879 128 | 1446 256 | 1009 512 | 428 Why are the queries per second constant up to 64 and then the performance decreases rapidly? Solr only uses 10GB of the 16GB memory so I think it is not a memory issue. This may be an OS-level disk buffer issue.
With a limited disk buffer space, the more random IO occurs from different files, the higher the churn rate is; and if the buffers are full, the churn rate may increase dramatically (and the performance will drop). Modern OSes try to keep as much data in memory as possible, so the memory usage itself is not that informative - but check what the pagein/pageout rates are when you start hitting 32 vs 64 cores. -- Best regards, Andrzej Bialecki http://www.sigram.com Contact: info at sigram dot com
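Mark's buffer budget from earlier in the thread, as a quick back-of-the-envelope check (all numbers are the approximations quoted there, not measurements):

```java
public class CoreBufferBudget {
    // How many ~100 MB core indexes fit into the page cache that remains
    // after the Solr heap and the OS itself are accounted for.
    static long coresThatFit(long totalMb, long solrHeapMb, long osMb, long perCoreMb) {
        return (totalMb - solrHeapMb - osMb) / perCoreMb;
    }

    public static void main(String[] args) {
        // 16 GB total, ~9 GB Solr, ~600 MB OS, ~100 MB index per core
        long fit = coresThatFit(16 * 1024, 9 * 1024, 600, 100);
        System.out.println(fit + " cores can be fully buffered");
    }
}
```

The result lands right at the ~64-core knee in the benchmark table, which supports the disk-buffer explanation.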