Using LocalParams in StatsComponent to create a price slider?

2011-12-14 Thread Mark Schoy
Hi,

I'm using the StatsComponent to retrieve the lower and upper bounds of a
price field to create a price slider.
If someone sets the price range to $100-$200, I have to add a filter to
the query. But then the lower and upper bounds are calculated from the
filtered result.

Is it possible to use LocalParams (like for facets) to ignore a specific filter?
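
For facets, I could tag the filter and exclude it like this (just a sketch;
"priceFq" is a placeholder tag name):

fq={!tag=priceFq}price:[100 TO 200]
facet=true&facet.field={!ex=priceFq}price

Is there an equivalent for stats.field, e.g. stats.field={!ex=priceFq}price?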

Thanks.

Mark


Reducing heap space consumption for large dictionaries?

2011-12-07 Thread Mark Schoy
Hi,

in my index schema I have defined a
DictionaryCompoundWordTokenFilterFactory and a
HunspellStemFilterFactory. Each filter factory has a dictionary with
about 100k entries.
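
The relevant part of my field type looks roughly like this (the dictionary
file names are placeholders):

<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
          dictionary="compound-dictionary.txt" minWordSize="5"/>
  <filter class="solr.HunspellStemFilterFactory"
          dictionary="de_DE.dic" affix="de_DE.aff" ignoreCase="true"/>
</analyzer>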

To avoid an out-of-memory error I have to set the heap space to 128m
for a single index.

Is there a way to reduce the memory consumption when parsing the dictionary?
I need to create several indexes and 128m for each index is too much.

mark


Stemming - How to add tokens and not replace the existing tokens?

2011-12-05 Thread Mark Schoy
Hi,

I'd like to use the HunspellStemFilterFactory to improve my search results.
Why isn't there an "inject" argument like in solr.PhoneticFilterFactory to
add tokens instead of replacing them?

I don't want to replace them, because documents containing the unstemmed
word should be more relevant.

Thanks.


Re: Stemming - How to add tokens and not replace the existing tokens?

2011-12-05 Thread Mark Schoy
Hi Marian,

thanks for your answer.
Using a copyField is a good idea.
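
For reference, a rough sketch of the schema setup I have in mind (field and
type names are placeholders):

<field name="title" type="text_unstemmed" indexed="true" stored="true"/>
<field name="title_stemmed" type="text_hunspell" indexed="true" stored="false"/>
<copyField source="title" dest="title_stemmed"/>

With dismax, the unstemmed field can then get the higher boost, e.g.
qf="title^2 title_stemmed".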

Mark

2011/12/5 Marian Steinbach marian.steinb...@gmail.com:
 Hi Mark!

 You could help yourself by creating an additional field. One field would
 hold the stemmed version and the other one would hold the unstemmed
 version.

 This would allow for a higher boost on the unstemmed field.

 Use copyField for convenience to copy the content from one field to the
 other one.

 Marian


 2011/12/5 Mark Schoy hei...@gmx.de

 Hi,

 I like to use the HunspellStemFilterFactory to improve my search results.
 Why isn't there an arg inject like in solr.PhoneticFilterFactory to
 add tokens instead of replacing them?

 I don't want to replace them, because documents with the unstemmed
 word should be more relevant.

 Thanks.



Re: Facet on a field with rows=n

2011-12-05 Thread Mark Schoy
Hi Kashif,

that is not possible in Solr. Facets are always computed over all the
documents matching the query.

But there is a workaround:
1) Do a normal query without facets (you only need to request doc ids
at this point)
2) Collect all the IDs of the documents returned
3) Do a second query for all fields and facets, adding a filter to
restrict the result to the IDs collected in step 2.
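
A rough SolrJ sketch of that workaround (the server URL and the "id"/"category"
field names are placeholders; escape the id values if they can contain special
characters):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;

public class PagedFacetExample {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        // 1) Normal query without facets, requesting only the doc ids of the page
        SolrQuery page = new SolrQuery("*:*");
        page.setStart(0);
        page.setRows(10);
        page.setFields("id");
        SolrDocumentList pageDocs = server.query(page).getResults();

        // 2) Collect the returned ids into a filter query
        StringBuilder idFilter = new StringBuilder("id:(");
        for (int i = 0; i < pageDocs.size(); i++) {
            if (i > 0) idFilter.append(" OR ");
            idFilter.append(pageDocs.get(i).getFieldValue("id"));
        }
        idFilter.append(")");

        // 3) Second query with all fields and facets, restricted to those ids
        SolrQuery faceted = new SolrQuery("*:*");
        faceted.addFilterQuery(idFilter.toString());
        faceted.setFacet(true);
        faceted.addFacetField("category");
        QueryResponse rsp = server.query(faceted);
        System.out.println(rsp.getFacetField("category").getValues());
    }
}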

Mark

2011/12/5 Kashif Khan uplink2...@gmail.com:
 Hi all,

 i am looking for a solution where i want to obtain facets based on the
 paging of solr documents.
 For ex:-

 say i have a query *:*, set start=0 and rows=10, and then i want facets on
 any one of the fields over the 10 docs obtained, and not over all the docs
 that matched the query.

 Can any intelligent people solve my problem?



Best practice to automatically change a field value for a specific period of time

2011-12-02 Thread Mark Schoy
Hi,

I have a Solr index for an online shop with a field "price" which
contains the standard price of a product.
But in the database, the shop owner can specify a period of time with
an alternative price.

For example: standard price is $20.00, but 12/24/11 08:00am to
12/26/11 11:59pm = $12.59

Of course I could use a cronjob to update the documents. But I
think this is too unstable.
I could also save all price campaigns in a field and then extract
the correct price. But then I could not sort by price, or only by the
standard price.

What I need is a field where I can put a condition like this: if
[current_time is within one of the price campaigns] then [return the price of
that price campaign]. But (unfortunately) this is not possible.

Thanks for advice.


Re: Best practice to automatically change a field value for a specific period of time

2011-12-02 Thread Mark Schoy
Hi Morten,

thanks, this is a very good solution.
I also found another solution: creating a custom ValueSourceParser for
price sorting which considers the standard price and the campaign
price.
In my special case I think your approach won't work, because I
also need result grouping and this can't be combined with field
collapsing.

2011/12/2 Morten Lied Johansen morte...@ifi.uio.no:
 On 02. des. 2011 12:21, Mark Schoy wrote:

 This is a problem that can be solved with grouping.
 http://wiki.apache.org/solr/FieldCollapsing

 For each possible price on a product, you index a document with the dates
 and the price. In your query, you group on the product, and apply a
 date-filter, and the price you see for each product will be from the top
 document within the given dates.
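
 For example, a rough query sketch (field names like product_id, valid_from
 and valid_to are placeholders):

 q=*:*
 fq=valid_from:[* TO NOW] AND valid_to:[NOW TO *]
 group=true
 group.field=product_id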

 You can also sort by price. If you have multiple overlapping campaigns, you
 might need to pay attention to which one you want to take precedence, as
 your sorting will determine which document gets shown.

 --
 Morten
 We all live in a yellow subroutine.


Solr Replication: relative path in confFiles Element?

2011-10-25 Thread Mark Schoy
Hi,

is it possible to define a relative path in confFiles?

For example:

<str name="confFiles">../../x.xml</str>

If yes, to which location will the file be copied at the slave?

Thanks.


Re: How to create a solr core if no solr cores were created before?

2011-07-12 Thread Mark Schoy
Thanks for your answer, but your answer is a little bit useless for
me. Could you please add more information in addition to this link?

Do I have to create a root core to create other cores?
How can I create a root core? By manually adding it to the solr.xml config?
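
For example, something like this in solr.xml (core name and instanceDir are
just guesses on my part)?

<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
  </cores>
</solr>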

2011/7/11 Gabriele Kahlout gabri...@mysimpatico.com:
 have a look here [1].

 [1]
 https://issues.apache.org/jira/browse/SOLR-2645?focusedCommentId=13062748page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13062748

 On Mon, Jul 11, 2011 at 4:46 PM, Mark Schoy hei...@gmx.de wrote:

 Hi,

 I tried to create a Solr core, but I always get a "No such solr
 core" exception.

 -
 File home = new File( pathToSolrHome );
 File f = new File( home, "solr.xml" );

 CoreContainer coreContainer = new CoreContainer();
 coreContainer.load( pathToSolrHome, f );

 EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "");
 CoreAdminRequest.createCore(coreName, coreDir, server);
 -

 I think the problem is the "" in new EmbeddedSolrServer(coreContainer, "");

 Thanks.




 --
 Regards,
 K. Gabriele




How to create a solr core if no solr cores were created before?

2011-07-11 Thread Mark Schoy
Hi,

I tried to create a Solr core, but I always get a "No such solr core" exception.

-
File home = new File( pathToSolrHome );
File f = new File( home, "solr.xml" );

CoreContainer coreContainer = new CoreContainer();
coreContainer.load( pathToSolrHome, f );

EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "");
CoreAdminRequest.createCore(coreName, coreDir, server);
-

I think the problem is the "" in new EmbeddedSolrServer(coreContainer, "");

Thanks.


Using two repeaters to rapidly switch Master and Slave (Replication)?

2011-06-21 Thread Mark Schoy
Hi,

I have an idea for how to switch master and slave in case one server
crashes:

Set up two servers as repeaters, but disable the master and slave
config on both with <str name="enable">false</str>.
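
Roughly like this in solrconfig.xml on both servers (just a sketch; masterUrl
and confFiles are placeholders):

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="enable">false</str>
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
  <lst name="slave">
    <str name="enable">false</str>
    <str name="masterUrl">http://master_host:port/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>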

Now you can dynamically disable and enable the master or slave role via URL:

enable / disable replication on master:
http://master_host:port/solr/replication?command=disablereplication
http://master_host:port/solr/replication?command=enablereplication

enable / disable polling on slave:
http://slave_host:port/solr/replication?command=disablepoll
http://slave_host:port/solr/replication?command=enablepoll

Does this work?


Re: why too many open files?

2011-06-20 Thread Mark Schoy
Hi,

have you checked the max open files limit of your OS?

see: http://lj4newbies.blogspot.com/2007/04/too-many-open-files.html
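
For example, on Linux (a rough sketch; the right limit depends on your setup):

ulimit -n        # show the current open file descriptor limit for this shell
ulimit -n 8192   # raise it for the current session; make it permanent via /etc/security/limits.conf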



2011/6/20 Jason, Kim hialo...@gmail.com

 Hi, All

 I have 12 shards and ramBufferSizeMB=512, mergeFactor=5.
 But solr raises java.io.FileNotFoundException (Too many open files).
 mergeFactor is just 5. How can this happen?
 Below are the segments of one shard. That is far more segments than the mergeFactor allows.
 What's wrong, and how should I set the mergeFactor?

 ==
 [root@solr solr]# ls indexData/multicore-us/usn02/data/index/
 _0.fdt   _gs.fdt  _h5.tii  _hl.nrm  _i1.nrm  _kn.nrm  _l1.nrm  _lq.tii
 _0.fdx   _gs.fdx  _h5.tis  _hl.prx  _i1.prx  _kn.prx  _l1.prx  _lq.tis
 _3i.fdt  _gs.fnm  _h7.fnm  _hl.tii  _i1.tii  _kn.tii  _l1.tii
 lucene-2de7b31b5eabdff0b6ec7fd32eecf8c7-write.lock
 _3i.fdx  _gs.frq  _h7.frq  _hl.tis  _i1.tis  _kn.tis  _l1.tis  _lu.fnm
 _3s.fnm  _gs.nrm  _h7.nrm  _hn.fnm  _j7.fdt  _kp.fnm  _l2.fnm  _lu.frq
 _3s.frq  _gs.prx  _h7.prx  _hn.frq  _j7.fdx  _kp.frq  _l2.frq  _lu.nrm
 _3s.nrm  _gs.tii  _h7.tii  _hn.nrm  _kb.fnm  _kp.nrm  _l2.nrm  _lu.prx
 _3s.prx  _gs.tis  _h7.tis  _hn.prx  _kb.frq  _kp.prx  _l2.prx  _lu.tii
 _3s.tii  _gu.fnm  _h9.fnm  _hn.tii  _kb.nrm  _kp.tii  _l2.tii  _lu.tis
 _3s.tis  _gu.frq  _h9.frq  _hn.tis  _kb.prx  _kp.tis  _l2.tis  _ly.fnm
 _48.fdt  _gu.nrm  _h9.nrm  _hp.fnm  _kb.tii  _kq.fnm  _l6.fnm  _ly.frq
 _48.fdx  _gu.prx  _h9.prx  _hp.frq  _kb.tis  _kq.frq  _l6.frq  _ly.nrm
 _4d.fnm  _gu.tii  _h9.tii  _hp.nrm  _kc.fnm  _kq.nrm  _l6.nrm  _ly.prx
 _4d.frq  _gu.tis  _h9.tis  _hp.prx  _kc.frq  _kq.prx  _l6.prx  _ly.tii
 _4d.nrm  _gw.fnm  _hb.fnm  _hp.tii  _kc.nrm  _kq.tii  _l6.tii  _ly.tis
 _4d.prx  _gw.frq  _hb.frq  _hp.tis  _kc.prx  _kq.tis  _l6.tis  _m3.fnm
 _4d.tii  _gw.nrm  _hb.nrm  _hr.fnm  _kc.tii  _kr.fnm  _la.fnm  _m3.frq
 _4d.tis  _gw.prx  _hb.prx  _hr.frq  _kc.tis  _kr.frq  _la.frq  _m3.nrm
 _5b.fdt  _gw.tii  _hb.tii  _hr.nrm  _kf.fdt  _kr.nrm  _la.nrm  _m3.prx
 _5b.fdx  _gw.tis  _hb.tis  _hr.prx  _kf.fdx  _kr.prx  _la.prx  _m3.tii
 _5b.fnm  _gy.fnm  _he.fdt  _hr.tii  _kf.fnm  _kr.tii  _la.tii  _m3.tis
 _5b.frq  _gy.frq  _he.fdx  _hr.tis  _kf.frq  _kr.tis  _la.tis  _m8.fnm
 _5b.nrm  _gy.nrm  _he.fnm  _ht.fnm  _kf.nrm  _kt.fnm  _le.fnm  _m8.frq
 _5b.prx  _gy.prx  _he.frq  _ht.frq  _kf.prx  _kt.frq  _le.frq  _m8.nrm
 _5b.tii  _gy.tii  _he.nrm  _ht.nrm  _kf.tii  _kt.nrm  _le.nrm  _m8.prx
 _5b.tis  _gy.tis  _he.prx  _ht.prx  _kf.tis  _kt.prx  _le.prx  _m8.tii
 _5m.fnm  _h0.fnm  _he.tii  _ht.tii  _kg.fnm  _kt.tii  _le.tii  _m8.tis
 _5m.frq  _h0.frq  _he.tis  _ht.tis  _kg.frq  _kt.tis  _le.tis  _md.fnm
 _5m.nrm  _h0.nrm  _hh.fnm  _hv.fnm  _kg.nrm  _kw.fnm  _li.fnm  _md.frq
 _5m.prx  _h0.prx  _hh.frq  _hv.frq  _kg.prx  _kw.frq  _li.frq  _md.nrm
 _5m.tii  _h0.tii  _hh.nrm  _hv.nrm  _kg.tii  _kw.nrm  _li.nrm  _md.prx
 _5m.tis  _h0.tis  _hh.prx  _hv.prx  _kg.tis  _kw.prx  _li.prx  _md.tii
 _5n.fnm  _h2.fnm  _hh.tii  _hv.tii  _kj.fdt  _kw.tii  _li.tii  _md.tis
 _5n.frq  _h2.frq  _hh.tis  _hv.tis  _kj.fdx  _kw.tis  _li.tis  _mi.fnm
 _5n.nrm  _h2.nrm  _hk.fnm  _hz.fdt  _kj.fnm  _ky.fnm  _lm.fnm  _mi.frq
 _5n.prx  _h2.prx  _hk.frq  _hz.fdx  _kj.frq  _ky.frq  _lm.frq  _mi.nrm
 _5n.tii  _h2.tii  _hk.nrm  _hz.fnm  _kj.nrm  _ky.nrm  _lm.nrm  _mi.prx
 _5n.tis  _h2.tis  _hk.prx  _hz.frq  _kj.prx  _ky.prx  _lm.prx  _mi.tii
 _5x.fnm  _h5.fdt  _hk.tii  _hz.nrm  _kj.tii  _ky.tii  _lm.tii  _mi.tis
 _5x.frq  _h5.fdx  _hk.tis  _hz.prx  _kj.tis  _ky.tis  _lm.tis  segments_1
 _5x.nrm  _h5.fnm  _hl.fdt  _hz.tii  _kn.fdt  _l1.fdt  _lq.fnm  segments.gen
 _5x.prx  _h5.frq  _hl.fdx  _hz.tis  _kn.fdx  _l1.fdx  _lq.frq
 _5x.tii  _h5.nrm  _hl.fnm  _i1.fnm  _kn.fnm  _l1.fnm  _lq.nrm
 _5x.tis  _h5.prx  _hl.frq  _i1.frq  _kn.frq  _l1.frq  _lq.prx
 ==

 Thanks in advance.




Master Slave Replication in Solr Cloud - What happens if the master is not available?

2011-06-20 Thread Mark Schoy
Hi,

if I use a master slave replication in Solr Cloud and the master
crashes, can the slave automatically switch to master mode?

Or is there another way to index documents after the master is down?

Thanks.


Re: Master Slave Replication in Solr Cloud - What happens if the master is not available?

2011-06-20 Thread Mark Schoy
Thanks for your answer Erick.

So the easiest way will be to set up a 2-shard cluster with shard replicas ;)

2011/6/20 Erick Erickson erickerick...@gmail.com:
 No, there's nothing built into Solr to automatically promote a slave
 to a master.

 You have several choices here. One is to build a new master and
 reindex from scratch.

 Another is to configure your slave as a new master and then
 bring up a new machine and have it replicate. Now make that new machine
 your master (you'll have to re-configure both).

 The fun part is continuing to serve requests while all this is going
 on. It's easier
 if you have more than one slave so you can move things around while the
 remaining slave is reconfigured (or whatever)

 Best
 Erick

 On Mon, Jun 20, 2011 at 8:28 AM, Mark Schoy hei...@gmx.de wrote:
 Hi,

 if I use a master slave replication in Solr Cloud and the master
 crashes, can the slave automatically switch to master mode?

 Or is there another way to index documents after the master is down?

 Thanks.




Re: Master Slave Replication in Solr Cloud - What happens if the master is not available?

2011-06-20 Thread Mark Schoy
You're right, thanks!

2011/6/20 Erick Erickson erickerick...@gmail.com:
 Hmmm, be a little careful here with terminology.
 Shards may be unnecessary if you can put your whole index
 on a single searcher. It's preferable to simply have each
 slave hold a complete copy of the index, no sharding necessary.

 Best
 Erick

 On Mon, Jun 20, 2011 at 9:45 AM, Mark Schoy hei...@gmx.de wrote:
 Thanks for your answer Erick.

 So the easiest way will be to set up 2 shard cluster with shard replicas ;)

 2011/6/20 Erick Erickson erickerick...@gmail.com:
 No, there's nothing built into Solr to automatically promote a slave
 to a master.

 You have several choices here. One is to build a new master and
 reindex from scratch.

 Another is to configure your slave as a new master and then
 bring up a new machine and have it replicate. Now make that new machine
 your master (you'll have to re-configure both).

 The fun part is continuing to serve requests while all this is going
 on. It's easier
 if you have more than one slave so you can move things around while the
 remaining slave is reconfigured (or whatever)

 Best
 Erick

 On Mon, Jun 20, 2011 at 8:28 AM, Mark Schoy hei...@gmx.de wrote:
 Hi,

 if I use a master slave replication in Solr Cloud and the master
 crashes, can the slave automatically switch to master mode?

 Or is there another way to index documents after the master is down?

 Thanks.






Re: Indexing-speed issues (chart included)

2011-06-17 Thread Mark Schoy
Sorry, here are some details:

requestHandler: XmlUpdateRequestHandler
protocol: http (10 concurrent threads)
document: 1 KB size, 15 fields

cpu load: 20%
memory usage: 50%

But generally speaking, is this normal, or must something be wrong with my
configuration, ...?

2011/6/17 Erick Erickson erickerick...@gmail.com

 Well, it's kinda hard to say anything pertinent with so little
 information. How are you indexing things? What kind of documents?
 How are you feeding docs to Solr?

 You might review:
 http://wiki.apache.org/solr/UsingMailingLists

 Best
 Erick

 On Fri, Jun 17, 2011 at 8:10 AM, Mark Schoy hei...@gmx.de wrote:
  Hi,
 
  If I start indexing documents, indexing gets slower the more documents are
  added without committing and optimizing:

  http://imageshack.us/photo/my-images/695/solrchart.png/

  I've changed the mergeFactor from 10 to 30, changed maxDocs
 (100,1000,1)
  but it always gets slower the more documents are added.
  If I'm using elasticsearch, which is also based on Lucene, I get
  constant indexing rates (without committing and optimizing too).

  Does anybody know what's wrong?
 



Performance loss - querying more than 64 cores (randomly)

2011-06-16 Thread Mark Schoy
Hi,

I set up a Solr instance with 512 cores. Each core has 100k documents and 15
fields. Solr is running on a CPU with 4 cores (2.7 GHz) and 16 GB RAM.

Now I've done some benchmarks with JMeter. On each thread iteration, JMeter
queries another core at random. Here are the results (duration: 180 seconds
each):

Randomly queried cores | queries per second
1   | 2016
2   | 2001
4   | 1978
8   | 1958
16  | 2047
32  | 1959
64  | 1879
128 | 1446
256 | 1009
512 | 428

Why are the queries per second constant up to 64 cores, and why does
performance then decrease rapidly?

Solr only uses 10 GB of the 16 GB memory, so I think it is not a memory issue.


Re: Performance loss - querying more than 64 cores (randomly)

2011-06-16 Thread Mark Schoy
Thanks for your answers.

Andrzej was right with his assumption. Solr only needs about 9 GB of memory, but
the system needs the rest of it for disk IO:

64 cores: 64 * 100 MB index size = 6.4 GB + 9 GB Solr cache + about 600 MB OS =
16 GB

Conclusion: My system can buffer the data of exactly 64 cores. Every
additional core cannot be buffered, and the performance decreases.



2011/6/16 François Schiettecatte fschietteca...@gmail.com

 I am assuming that you are running on linux here, I have found atop to be
 very useful to see what is going on.

http://freshmeat.net/projects/atop/

 dstat is also very useful too but needs a little more work to 'decode'.

 Obviously there is contention going on, you just need to figure out where
 it is, most likely it is disk I/O but it could also be the number of cores
 you have. Also I would not say that performance is decreasing rapidly,
 probably more of a gentle slope down if you plot it (you double the number
 of cores every time).

 I would be very interested in hearing about what you find.

 Cheers

 François

 On Jun 16, 2011, at 10:00 AM, Andrzej Bialecki wrote:

  On 6/16/11 3:22 PM, Mark Schoy wrote:
  Hi,
 
  I set up a Solr instance with 512 cores. Each core has 100k documents
 and 15
  fields. Solr is running on a CPU with 4 cores (2.7Ghz) and 16GB RAM.
 
  Now I've done some benchmarks with JMeter. On each thread iteration
 JMeter
  queriing another Core by random. Here are the results (Duration:  each
 with
  180 second):
 
  Randomly queried cores | queries per second
  1| 2016
  2 | 2001
  4 | 1978
  8 | 1958
  16 | 2047
  32 | 1959
  64 | 1879
  128 | 1446
  256 | 1009
  512 | 428
 
  Why are the queries per second until 64 constant and then the
 performance is
  degreasing rapidly?
 
  Solr only uses 10GB of the 16GB memory so I think it is not a memory
 issue.
 
 
  This may be an OS-level disk buffer issue. With a limited disk buffer
 space the more random IO occurs from different files, the higher is the
 churn rate, and if the buffers are full then the churn rate may increase
 dramatically (and the performance will drop then). Modern OS-es try to keep
 as much data in memory as possible, so the memory usage itself is not that
 informative - but check what are the pagein/pageout rates when you start
 hitting the 32 vs 64 cores.
 
  --
  Best regards,
  Andrzej Bialecki 
  ___. ___ ___ ___ _ _   __
  [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
  ___|||__||  \|  ||  |  Embedded Unix, System Integration
  http://www.sigram.com  Contact: info at sigram dot com