Re: Indexing Hanging during GC?

2010-08-18 Thread Rebecca Watson
flexible indexing in Lucene) OR to increase the term index interval so I will try one/both of these and see if this means I can increase the number of documents I can index given my current hardware (6GB RAM) where these docs have a lot of unique terms! thanks :) bec On 13 August 2010 19:15, R

tii RAM usage on startup

2010-08-18 Thread Rebecca Watson
hi, I am running solr 1.4.1 and java 1.6 with 6GB heap and the following GC settings: gc_args="-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:NewSize=2g -XX:MaxNewSize=2g -XX:CMSInitiatingOccupancyFraction=60" So 6GB total heap and 2GB allocated to eden space. I have caching, autoco

Re: Indexing Hanging during GC?

2010-08-13 Thread Rebecca Watson
each file to Solr so i'll email an update about this. if this works ok i'm then going to try using only one auto-commit setting rather than two and see if this works ok. thanks :) bec On 13 August 2010 00:24, Rebecca Watson wrote: > hi, > >> 1) I assume you are doing batch

Re: Indexing Hanging during GC?

2010-08-12 Thread Rebecca Watson
ing Solr. i'm not sure if we would either if we posted as we created the index.xml format... but because we post 500+ documents a time (one article file per LCF post) and LCF can post these files quickly i'm not sure if I need to try and slow down the post rate!? thanks for your repli

Re: Indexing Hanging during GC?

2010-08-12 Thread Rebecca Watson
> the source) which is 100k documents an hour without breaking a sweat. > > > > On 8/12/10, Rebecca Watson wrote: >> Hi, >> >> When indexing large amounts of data I hit a problem whereby Solr >> becomes unresponsive >> and doesn't recover (even w

Re: Analysing SOLR logfiles

2010-08-12 Thread Rebecca Watson
we've just started using awstats - as suggested by the solr 1.4 book. it's open source!: http://awstats.sourceforge.net/ On 12 August 2010 18:18, Jay Flattery wrote: > Thanks - splunk looks overkill. > We're extremely small scale - were hoping for something open source :-) > > > - Original Me

Indexing Hanging during GC?

2010-08-12 Thread Rebecca Watson
Hi, When indexing large amounts of data I hit a problem whereby Solr becomes unresponsive and doesn't recover (even when left overnight!). I think i've hit some GC problems/tuning is required of GC and I wanted to know if anyone has ever hit this problem. I can replicate this error (albeit taking

Re: Finding distinct unique IDs in documents returned by fq -- Urgent Help Req

2010-07-16 Thread Rebecca Watson
hi, would faceting work? http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Faceted-Search-Solr if you have a field for rootId that is multivalued + facet on it -- you'll get value+count pairs back (top 100 i think by default) bec :) On 16 July 2010 16:07, Ninad Raut wrot
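
As a rough illustration of the faceting suggestion above, the sketch below only builds the request URL; it does not contact a server. The host, port and path in `SOLR_SELECT` and the example filter `type:article` are assumptions made for illustration; only the field name `rootId` and the default of 100 values come from the message. `facet`, `facet.field` and `facet.limit` are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance; this URL is hypothetical, not from the thread.
SOLR_SELECT = "http://localhost:8983/solr/select"


def facet_url(query, facet_field, fq=None, limit=100):
    """Build a select URL that returns value+count pairs for facet_field."""
    params = [
        ("q", query),
        ("rows", 0),                   # only the facet counts are wanted, not the docs
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.limit", limit),        # 100 is Solr's default; raise it to see more values
    ]
    if fq is not None:
        params.append(("fq", fq))      # restrict counts to docs matching the filter
    return SOLR_SELECT + "?" + urlencode(params)


# Hypothetical filter query, used only to show where fq goes.
print(facet_url("*:*", "rootId", fq="type:article"))
```

Each distinct `rootId` in the filtered result set then comes back once, with its document count, in the facet section of the response.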

Re: Error in building Solr-Cloud (ant example)

2010-07-15 Thread Rebecca Watson
hi mark, jayf and i are working together :) i tried to apply the patch to the trunk, but the ant tests failed... i checked out the latest trunk: svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk patched it with SOLR-1873, and put the two JARs into trunk/solr/lib ant compile in the

Re: faceting over field not in all documents

2010-07-13 Thread Rebecca Watson
brilliant! thanks very much for your help :) On 13 July 2010 21:47, Jonathan Rochkind wrote: >> i'm hoping that -- faceting simply calculates+returns the counts for docs >> that >> have the field present while results may still contain documents that don't >> have the facet field (i.e. the field

faceting over field not in all documents

2010-07-13 Thread Rebecca Watson
hi, has anyone had experience with faceting over a field where the field is not present in all documents within the index? i'm hoping that -- faceting simply calculates+returns the counts for docs that have the field present while results may still contain documents that don't have the facet fiel

Re: Locked Index files

2010-07-13 Thread Rebecca Watson
shut down your solr server first... if it's not important! :) On 13 July 2010 16:47, ZAROGKIKAS,GIORGOS wrote: > I found it but I can not delete > Any suggestion??? > > -Original Message- > From: Yuval Feinstein [mailto:yuv...@answers.com] > Sent: Tuesday, July 13, 2010 11:39 AM > To: solr

Re: Problem with Wildcard searches in Solr

2010-07-13 Thread Rebecca Watson
hi, sorry realised i had a typo: > of course, non of this is going to sort out trying to match against the query > "co?mput?r" because you've probably stemmed "computer" to "comput" or > something > at index time -- but if you add in a copyfield to an extra field that > isn't stemmed > at query

Re: Problem with Wildcard searches in Solr

2010-07-13 Thread Rebecca Watson
Hi, earlier this week i started messing with getting wildcard queries to be analysed i've got some weird analysers doing stemming/lowercasing and writing in the same rules into a custom queryparser didn't seem logical given i just want the analysers to apply as they do at index time i ca

Re: fq= "more then one" ?

2010-07-12 Thread Rebecca Watson
oops - i thought you couldn't put more than one - ignore my answer then :) On 12 July 2010 17:20, Rebecca Watson wrote: > hi, > > you shouldn't have two fq parameters -- some solr params work like > that, but fq doesn't > >> http://172.20.1.3

Re: fq= "more then one" ?

2010-07-12 Thread Rebecca Watson
hi, you shouldn't have two fq parameters -- some solr params work like that, but fq doesn't > http://172.20.1.33:8983/solr/select/?q=*:*&start=0&fq=EMAIL_HEADER_FROM:t...@mail.de&fq=EMAIL_HEADER_TO:t...@mail.de you need to combine it into a single param i.e. try putting it as an "OR" or "AND" if
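
Note that the follow-up in this thread retracts the claim above: Solr does accept repeated `fq` parameters and applies each one as a separate filter, so only documents matching all of them are returned. The sketch below builds both forms of the request without sending it. The host and field names are taken from the quoted URL; the addresses are hypothetical placeholders, since the real ones are elided in the archive.

```python
from urllib.parse import urlencode

BASE = "http://172.20.1.33:8983/solr/select/"  # host from the quoted URL


def combined_fq(clauses, operator="AND"):
    """Join several filter clauses into one fq value, e.g. (a:1) AND (b:2)."""
    return (" %s " % operator).join("(%s)" % c for c in clauses)


def select_url(q, fq_clauses, combine=True):
    """Build a select URL with either one combined fq or one fq per clause."""
    params = [("q", q), ("start", 0)]
    if combine:
        params.append(("fq", combined_fq(fq_clauses)))
    else:
        params.extend(("fq", c) for c in fq_clauses)
    return BASE + "?" + urlencode(params)


# Placeholder addresses (assumptions); the originals are truncated in the archive.
clauses = [
    'EMAIL_HEADER_FROM:"sender@example.com"',
    'EMAIL_HEADER_TO:"recipient@example.com"',
]
print(select_url("*:*", clauses, combine=True))   # single fq joined with AND
print(select_url("*:*", clauses, combine=False))  # two separate fq parameters
```

Both forms should select the same documents when the operator is AND. The separate form lets Solr cache each filter independently, while an OR between the clauses can only be expressed in the combined form.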

Re: Faceting unknown fields

2010-07-08 Thread Rebecca Watson
hi, > So, can I index and facet these fields, without describe then in my schema? > > I will first try with dynamic fields, but I'm not sure it's going to work. we do all our facet fields in this way, with just general string field for single/multivalued fields: and faceting

Re: How to manage resource out of index?

2010-07-06 Thread Rebecca Watson
hi li, i looked at doing something similar - where we only index the text but retrieve search results / highlight from files -- we ended up giving up because of the amount of customisation required in solr -- mainly because we wanted the distributed search functionality in solr which meant making