Re: Replication: abort-fetch and restarting

2011-01-10 Thread Markus Jelsma
Any thoughts on this one? Should i add a ticket? On Tuesday 04 January 2011 20:08:40 Markus Jelsma wrote: > Hi, > > It seems abort-fetch nicely removes the index directory which i'm > replicating to which is fine. Restarting, however, does not trigger the > the same featur

Re: Improving Solr performance

2011-01-10 Thread Markus Jelsma
No, it also depends on the queries you execute (sorting is a big consumer) and the number of concurrent users. > Is that a general rule of thumb? That it is best to have about the > same amount of RAM as the size of your index? > > So, with a 5GB index, I should have between 4GB and 8GB of RAM >

Re: Improving Solr performance

2011-01-10 Thread Markus Jelsma
Any sources to cite for this statement? And are you talking about RAM allocated to the JVM or available for OS cache? > Not sure if this was mentioned yet, but if you are doing slave/master > replication you'll need 2x the RAM at replication time. Just something to > keep in mind. > > -mike > >

Re: != unequal in fq !?

2011-01-11 Thread Markus Jelsma
Hi, It works just like boolean operators in the main query: fq=-status:refunded http://lucene.apache.org/java/2_9_1/queryparsersyntax.html#Boolean operators Cheers > hello. > > i need to filter a field. i want all fields are not like the given string. > > eg.: ...&fq=status!=refundend > > h
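A minimal sketch of the negated filter query from the reply above. Only the fq=-status:refunded clause comes from the thread; the base URL and the helper name build_select_url are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, filter_queries):
    """Build a /select URL with one fq parameter per filter query."""
    params = [("q", query)] + [("fq", fq) for fq in filter_queries]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical base URL; the leading minus excludes documents whose
# status field matches "refunded".
url = build_select_url("http://localhost:8983/solr", "*:*", ["-status:refunded"])
print(url)
```

The minus sign plays the same role as NOT in the main query, so no special "!=" operator is needed.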

Re: Where does admin UI visually distinguish between "master" and "slave"?

2011-01-12 Thread Markus Jelsma
> > 'slave'. (i.e. > > http://localhost:8983/solr/production/admin/replication/index.jsp) > > > > I'd like a clearer visual confirmation that the master node is indeed a > > master and the slave is a slave. > > > > Summary question: > > Does the admin UI distinguish betwen "master and slave"? > > > > thanks > > > > will -- Markus Jelsma - CTO - Openindex http://www.linkedin.com/in/markus17 050-8536620 / 06-50258350

Re: StopFilterFactory and "qf" containing some fields that use it and some that do not

2011-01-12 Thread Markus Jelsma
I haven't used edismax but i can imagine it's a feature. This is because inconsistent use of stopwords in the analyzers of the fields specified in qf can yield really unexpected results because of the mm parameter. In dismax, if one analyzer removes stopwords and the other doesn't the mm parameter

Re: verifying that an index contains ONLY utf-8

2011-01-12 Thread Markus Jelsma
This is supposed to be dealt with outside the index. All input must be UTF-8 encoded. Failing to do so will give unexpected results. > We've created an index from a number of different documents that are > supplied by third parties. We want the index to only contain UTF-8 > encoded characters. I

Re: StopFilterFactory and "qf" containing some fields that use it and some that do not

2011-01-12 Thread Markus Jelsma
> Have used edismax and Stopword filters as well. But usually use the fq > parameter e.g. fq=title:the life and never had any issues. That is because filter queries are not relevant for the mm parameter which is being used for the main query. > > Can you turn on the debugQuery and check whats

Re: StopFilterFactory and "qf" containing some fields that use it and some that do not

2011-01-12 Thread Markus Jelsma
pens, or enable stop words for all > the fields that are used in "qf" with stopword-enabled fields. > Unless...someone has a better idea?? > > James Dyer > E-Commerce Systems > Ingram Content Group > (615) 213-4311 > > -Original Message- > From: Mark

Re: basic document crud in an index

2011-01-13 Thread Markus Jelsma
that part. > I just have a job that runs in the middle of the night and runs Optimize > once each night, I don't dig deeper than that into what goes on.

Re: start value in queries zero or one based?

2011-01-13 Thread Markus Jelsma
Perhaps it would be more useful to RTFM instead of messing around on the mailing list: http://wiki.apache.org/solr/CommonQueryParameters#start Please, read every wiki page you can find and write notes. > Do I even need a body for this message? ;-) > > Dennis Gearon > > > Signature Warning >

Re: Solr: using to index large folders recursively containing lots of different documents, and querying over the web

2011-01-14 Thread Markus Jelsma
Please visit the Nutch project. It is a powerful crawler and can integrate with Solr. http://nutch.apache.org/ > Hi Solr users, > > I hope you can help. We are migrating our intranet web site management > system to Windows 2008 and need a replacement for Index Server to do the > text searching

Re: Solr: using to index large folders recursively containing lots of different documents, and querying over the web

2011-01-14 Thread Markus Jelsma
egorisation and/or text searching), so we need > something that will index all the files in a given folder, rather than > follow links like a crawler. Can Nutch do this? As well as the other > requirements below? > Regards > Cathy > > On 14 January 2011 12:09, Markus Jelsma

Re: Clustering using Carrot2 clustering componet

2011-01-17 Thread Markus Jelsma
changes are required in solr.config or anywhere else. > > Thanks! > Isha

Re: Any way to query by offset?

2011-01-17 Thread Markus Jelsma
I think Steve wants the 1000th, 2000th and 3000th document from the query. And since there's no method of doing so you're constrained to executing three queries with rows=1 and start is resp. 1000, 2000 and 3000. If you want these documents to return you will have to do multiple queries with di
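The "one query per offset" approach described above can be sketched as follows. The base URL, the query string and the helper name offset_queries are assumptions; the start values follow the post. Note that start is a zero-based offset, so depending on how one counts, the 1000th hit may be start=999.

```python
from urllib.parse import urlencode


def offset_queries(base_url, query, starts):
    """Return one rows=1 request URL per requested start offset."""
    return [
        f"{base_url}/select?{urlencode({'q': query, 'rows': 1, 'start': s})}"
        for s in starts
    ]


# Hypothetical base URL and query; three separate requests are needed
# because a single request covers only one contiguous start/rows window.
urls = offset_queries("http://localhost:8983/solr", "title:solr", [1000, 2000, 3000])
for u in urls:
    print(u)
```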

Re: Does field collapsing (with facet) reduce performance?

2011-01-17 Thread Markus Jelsma
There is always CPU and RAM involved for every nice component you use. Just how much the penalty is depends completely on your hardware, index and type of query. Under heavy load the numbers will change. Since we don't know your situation and it's hard to predict without benchmarks, you should r

Re: Is deduplication possible during Tika extract?

2011-01-17 Thread Markus Jelsma
In my opinion it should work for every update handler. If you're really sure your configuration is fine and it still doesn't work you might have to file an issue. Your configuration looks alright but don't forget you've configured overwriteDupes=false! > Hello, > > here is an excerpt of my so

Re: Solr Out of Memory Error

2011-01-18 Thread Markus Jelsma
jetty.HttpConnection$RequestHandler.content(HttpConnection.java > :938) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:755) at > org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218) at > org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at > org.mortbay.jetty.

Re: using dismax

2011-01-18 Thread Markus Jelsma
. > > When I incorporate dismax and say mm=1, no results get returned. > http://localhost:8080/solr/cs/select?q=(poi_id:3)&defType=dismax&mm=1 > > What I wanted to do when I specify mm=1 is to say at least 1 query > parameter matches. > What am I missing? > > Thank

Re: using dismax

2011-01-18 Thread Markus Jelsma
> http://localhost:8080/solr/cs/select?q=(poi_id:3) > > > > I get a row returned. > > > > When I incorporate dismax and say mm=1, no results get returned. > > http://localhost:8080/solr/cs/select?q=(poi_id:3)&defType=dismax&mm=1 > > > > What I

Re: what would cause large numbers of executeWithRetry INFO messages?

2011-01-18 Thread Markus Jelsma
Hi, This is a slave polling the master for its index version but it seems the master fails to respond. From the javadoc: > public class NoHttpResponseException > extends IOException > > Signals that the target server failed to respond with a valid HTTP > response. Cheers, > I see a large numb

Re: what would cause large numbers of executeWithRetry INFO messages?

2011-01-18 Thread Markus Jelsma
Oh, and this should not have the INFO level in my opinion. Other log lines indicating a problem with the master (such as a time out or unreachable host) are not flagged as INFO. Maybe you could file a Jira ticket? Don't forget to specify your Solr version. Also, please check the master log f

Re: Indexing and Searching Chinese with SolrNet

2011-01-18 Thread Markus Jelsma
Why create two threads for the same problem? Anyway, is your servlet container capable of accepting UTF-8 in the URL? Also, is SolrNet capable of handling those characters? To confirm, try a tool like curl. > Dear all, > > After reading some pages on the Web, I created the index with the foll

Re: Indexing and Searching Chinese with SolrNet

2011-01-18 Thread Markus Jelsma
n .NET? > > Thanks so much! > LB > > > On Wed, Jan 19, 2011 at 2:34 AM, Markus Jelsma > > wrote: > > Why creating two threads for the same problem? Anyway, is your servlet > > container capable of accepting UTF-8 in the URL? Also, is SolrNet capable > &g

Re: Local param tag voodoo ?

2011-01-18 Thread Markus Jelsma
Hi, You get an error because LocalParams need to be in the beginning of a parameter's value. So no parenthesis first. The second query should not give an error because it's a valid query. Anyway, i assume you're looking for: http://wiki.apache.org/solr/SimpleFacetParameters#Multi-Select_Facet

Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?

2011-01-18 Thread Markus Jelsma
> [X] ASF Mirrors (linked in our release announcements or via the Lucene > website) > > [] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) > > [X] I/we build them from source via an SVN/Git checkout. > > [] Other (someone in your company mirrors them internally or via a > downst

Re: How to find Master & Slave are in sync

2011-01-19 Thread Markus Jelsma
TP APIs? > > http://master_host:port/solr/replication?command=indexversion > http://slave_host:port/solr/replication?command=details

Re: Replication: abort-fetch and restarting

2011-01-19 Thread Markus Jelsma
Issue created: https://issues.apache.org/jira/browse/SOLR-2323 On Tuesday 04 January 2011 20:08:40 Markus Jelsma wrote: > Hi, > > It seems abort-fetch nicely removes the index directory which i'm > replicating to which is fine. Restarting, however, does not trigger the > the

Re: Switching existing solr indexes from Segment to Compound Style index files

2011-01-19 Thread Markus Jelsma
bit more info > we may be able to suggest other alternatives.

Re: Documentaion: For newbies and recent newbies

2011-01-19 Thread Markus Jelsma
That someone should just visit the wiki: http://wiki.apache.org/solr/SolrResources > If someone is looking for good documentation and getting started guides, I > am putting this in the newsgroups to be searched upon. I recommend: > > A/ The Wikis: (FREE) >http://wiki.apache.org/solr/Front

Re: How to index my users info

2011-01-19 Thread Markus Jelsma
http://lucene.apache.org/solr/#getstarted > I would like to index the information of my employees to be able to get > through some fields such as: e-mail, registration, ID, cell phone, name. > > I am very new to SOLR and would like to know how to index these fields this > way and how to search fi

Re: Mem allocation - SOLR vs OS

2011-01-19 Thread Markus Jelsma
You only need so much for Solr so it can do its thing. Faceting can take quite some memory on a large index but sorting can be a really big RAM consumer. As Erick pointed out, inspect and tune the cache settings and adjust RAM allocated to the JVM if required. Using tools like JConsole you can m

Re: facet or filter based on user's history

2011-01-19 Thread Markus Jelsma
Hi, I've never seen Solr's behaviour with a huge amount of values in a multi-valued field but i think it should work alright. Then you can store a list of user IDs along with each book document and use filter queries to include or exclude the book from the result set. Cheers, > Hi, > > I'm look

Re: No system property or default value specified for...

2011-01-19 Thread Markus Jelsma
Hi, I'm unsure if i completely understand but you first had the error for local.code and then set the property in solr.xml? Then of course it will give an error for the next undefined property that has no default set. If you use a property without default it _must_ be defined in solr.xml or so

Re: Mem allocation - SOLR vs OS

2011-01-19 Thread Markus Jelsma
ually can be found. > We do have sorting but not faceting. OK so I guess there is no 'hard and > fast rule' as such so I will play with it and see. > > Thanks for the help > > On Wed, Jan 19, 2011 at 11:48 PM, Markus Jelsma > > wrote: > > You only

Re: dataDir in solr.xml

2011-01-19 Thread Markus Jelsma
You have set the property already but i haven't seen you use that same property for the dataDir setting in solrconfig. > I've checked the archive, and plenty of people have suggested an > arrangement where you can have two cores which share a configuration but > maintain separate data paths. But

Re: performance during index switch

2011-01-19 Thread Markus Jelsma
> Hi, > > Are there performance issues during the index switch? What do you mean by index switch? > > As the size of index gets bigger, response time slows down? Are there any > studies on this? I haven't seen any studies as of yet but response time will slow down for some components. Sor

Re: No system property or default value specified for...

2011-01-19 Thread Markus Jelsma
alues for the dataimport.delta values? that > doesn't seem right > > On Wed, Jan 19, 2011 at 11:57 AM, Markus Jelsma > > wrote: > > Hi, > > > > I'm unsure if i completely understand but you first had the error for > > local.code and then set the property in

Re: No system property or default value specified for...

2011-01-19 Thread Markus Jelsma
#System_property_substitution > there error I am getting is that I have no default value > for ${dataimporter.last_index_time} > > should I just define -00-00 00:00:00 as the default for that field? > > On Wed, Jan 19, 2011 at 12:45 PM, Markus Jelsma > > wrote: > > No, you only need

Re: using dismax

2011-01-20 Thread Markus Jelsma
Did i write wt? Oh dear. The q and w are too close =) > Markus, > > Its not wt its qt, wt for response type, > Also qt is not for Query Parser its for Request Handler ,In solrconfig.xml > there are many Request Handlers can be Defined using "dismax" Query Parser > Or Using "lucene" Query Parser. >

Re: Multicore Search "Map size must not be negative"

2011-01-20 Thread Markus Jelsma
> > When i search > http://192.168.105.59:8080/solr/mail/select?wt=php&q=*:*&shards=192.168.105.59:8080/solr/mail,192.168.105.59:8080/solr/mail11 > > it works but i need wt=phps it is important! > > but i dont understand the Problem!!! > > > Jörg

Re: Master and Slaves

2011-01-21 Thread Markus Jelsma
the slaves i want to override the "dataDir" value of the > solrconfig.xml, but it get overrided by the one replicated. > Is there a way to have the slaves having their solrconfig replicated, but > with some "special" configurations? > > I want to avoid havi

Re: Master and Slaves

2011-01-21 Thread Markus Jelsma
d also... > > So i don't understand why isn't figuring as replicated. > Maybe i'm doing something wrong. Don't know > > On Fri, Jan 21, 2011 at 10:16 AM, Ezequiel Calderara wrote: > > Thanks!, thats what i needed! > > > > There is

Re: Indexing FTP Documents through SOLR??

2011-01-21 Thread Markus Jelsma
Hi, Please take a look at Apache Nutch. It can crawl through a file system over FTP. After crawling, it can use Tika to extract the content from your PDF files and others. Finally you can then send the data to your Solr server for indexing. http://nutch.apache.org/ > Hi All, > Is there is any

Re: Is solr 4.0 ready for prime time? (or other ways to use geo distance in search)

2011-01-21 Thread Markus Jelsma
Hi, You can use Solr 1.4.1 and a third party plugin [1]. It does a pretty good job in spatial search. You could also try the Solr 3.1 branch which also has some spatial features on-board. It, however, does not return computed distances but can filter and sort using the great circle algorithm or

Re: fieldType textgen. tokens > 2

2011-01-24 Thread Markus Jelsma
; catenateWords="0" catenateNumbers="0" > catenateAll="0" splitOnCaseChange="0"/> > > > > > - > --- System > -------- > > One Server, 12 GB RAM, 2 Solr Ins

Re: How data is replicating from Master to Slave?

2011-01-24 Thread Markus Jelsma
o slave? > This is my first work with Solr. So I'm not sure how to tackle this issue. > > Regds > dhanesh s.r

Re: Migrating from 1.4.0 to 1.4.1 solr

2011-01-24 Thread Markus Jelsma
included in solr.xml do show up on the > browser. > > Pl do let me know the reason. Is there anything I need to do for the core > migration? I dont have any data in these cores. Also if there was data is > there a nice way of migrating from 1.4.0 to 1.4.1 (Which does not involve > re

Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene

2011-01-24 Thread Markus Jelsma
Are you using 3rd-party plugins? > We have two slaves replicating off one master every 2 minutes. > > Both using the CMS + ParNew Garbage collector. Specifically > > -server -XX:+UseConcMarkSweepGC -XX:+UseParNewGC > -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing > > but periodically they bo

Re: Solr set up issues with Magento

2011-01-24 Thread Markus Jelsma
Hi, You haven't defined the field in Solr's schema.xml configuration so it needs to be added first. Perhaps following the tutorial might be a good idea. http://lucene.apache.org/solr/tutorial.html Cheers. > Hello Team: > > > I am in the process of setting up Solr 1.4 with Magento ENterpris

Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene

2011-01-25 Thread Markus Jelsma
> > Or am I barking up completely the wrong tree? I'm trawling through heap > logs and gc logs at the moment trying to to see what other tuning I can > do but any other hints, tips, tricks or cluebats gratefully received. > Even if it's just "Yeah, we had that

Re: Recommendation on RAM-/Cache configuration

2011-01-25 Thread Markus Jelsma
> > Would you say it is ok to reduce the cache sizes? Would this increase disk > i/o, or would the index be hold in the OS's disk cache? Yes! If you also allocate less RAM to the JVM then there is more for the OS to cache. > > Do have other recommendations to follow / questions? > > Thanx && cheers, > Martin

Re: List of indexed or stored fields

2011-01-25 Thread Markus Jelsma
mbers for "version", so I don't think it's the version of Luke.

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Markus Jelsma
Then you don't need NGrams at all. A wildcard will suffice or you can use the TermsComponent. If these strings are indexed as single tokens (KeywordTokenizer with LowercaseFilter) you can simply do field:app* to retrieve the "apple milk shake". You can also use the string field type but then yo

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Markus Jelsma
Oh, i should perhaps mention that EdgeNGrams will yield results a lot quicker than using wildcards at the cost of a larger index. You should, of course, use EdgeNGrams if you worry about performance and have a huge index and a number of queries per second. > Then you don't need NGrams at all. A
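To make the index-size trade-off above concrete, here is a toy illustration of what an edge n-gram filter emits for a single token at index time. It is a sketch, not Solr's implementation; the function name and the gram sizes are assumptions, and the token is the "apple milk shake" example from the thread indexed as one keyword token.

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Return the leading-edge n-grams of a single token, shortest first."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


# One indexed token becomes several terms, which is where the larger
# index comes from; in exchange a prefix such as "app" is an exact term
# lookup instead of a wildcard expansion.
grams = edge_ngrams("apple milk shake", min_gram=1, max_gram=5)
print(grams)
```

With a wildcard query such as field:app* the index stays small but the prefix has to be expanded against the term dictionary at query time.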

Re: in-index representaton of tokens

2011-01-25 Thread Markus Jelsma
This should shed some light on the matter http://lucene.apache.org/java/2_9_0/fileformats.html > I am saying there is a list of tokens that have been parsed (a table of > them) for each column? Or one for the whole index? > > Dennis Gearon > > > Signature Warning > > It is alw

Re: SOLR deduplication

2011-01-26 Thread Markus Jelsma
Not right now: https://issues.apache.org/jira/browse/SOLR-1909 > Hi - I have the SOLR deduplication configured and working well. > > Is there any way I can tell which documents have been not added to the > index as a result of the deduplication rejecting subsequent identical > documents? > > Man

Re: SolrDocumentList Size vs NumFound

2011-01-26 Thread Markus Jelsma
Hi, If your query yields 1000 documents and the rows parameter is 10 then you'll get only 10 documents. Consult the wiki on the start and rows parameters: http://wiki.apache.org/solr/CommonQueryParameters Cheers. > Dear all, > > I got a weird problem. The number of searched documents is much
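A short sketch of the start/rows arithmetic behind the answer above: numFound reports all matches, while each response carries at most rows documents. The helper name page_starts is an assumption; the figures 1000 and 10 are the ones used in the reply.

```python
def page_starts(num_found, rows):
    """Return the start offsets of every page needed to cover num_found hits."""
    if rows <= 0:
        raise ValueError("rows must be positive")
    return range(0, num_found, rows)


# 1000 matching documents fetched 10 at a time need 100 requests,
# with start = 0, 10, 20, ...
starts = list(page_starts(1000, 10))
print(len(starts), starts[:3], starts[-1])
```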

Re: How to group result when search on multiple fields

2011-01-26 Thread Markus Jelsma
http://wiki.apache.org/solr/ClusteringComponent http://wiki.apache.org/solr/FieldCollapsing

Re: DIH and duplicate content

2011-01-27 Thread Markus Jelsma
I mean if > the desciption of one product already exist in index not import this new > product.

Malformed XML with exotic characters

2011-02-01 Thread Markus Jelsma
44f\u0437\u044b\u043a\u0438 \u2022 Aliaj lingvoj \u2022 \ub2e4\ub978 \uc5b8\uc5b4 \u2022 Ngôn ng\u1eef khác Wiktionary Wikinews Wikiquote Wikibooks Wikispecies Wikisource Wikiversity Commons Meta-Wiki ---

Re: Malformed XML with exotic characters

2011-02-01 Thread Markus Jelsma
ot an Firefox-Issue, try xmllint on your shell to > check the given xml? > > Regards > Stefan > > On Tue, Feb 1, 2011 at 4:43 PM, Markus Jelsma > > wrote: > > There is an issue with the XML response writer. It cannot cope with some > > very exotic characters

Re: Malformed XML with exotic characters

2011-02-01 Thread Markus Jelsma
e sure that > whatever font you use in firefox has the 'exotic' characters you are > expecting. There might also be some issues on your platform with mixing > script direction but that is probably not likely. > > Cheers > > François > > On Feb 1, 2011, at 10:43

Re: Malformed XML with exotic characters

2011-02-01 Thread Markus Jelsma
> > -Sascha > > p.s. I can provide the pdf file in question, if anybody would like to > see it in action. > > On 01.02.2011 16:43, Markus Jelsma wrote: > > There is an issue with the XML response writer. It cannot cope with some > > very exotic characters or possibly t

Re: Index MS office

2011-02-02 Thread Markus Jelsma
s, > Sai Thumuluri

Re: Open Too Many Files

2011-02-03 Thread Markus Jelsma
Or decrease the mergeFactor. > or change the index to a compound-index > > solrconfig.xml: true > > so solr creates one index file and not thousands. > > - > --- System > > > One Server, 12 GB RAM, 2 Solr Instances, 7 Cor

Re: Malformed XML with exotic characters

2011-02-03 Thread Markus Jelsma
Hi I've seen almost all funky charsets but gothic is always trouble. I'm also unsure if it's really a bug in Solr. It could well be the Xerces being unable to cope. Besides, most systems indeed don't go well with gothic. This mail client does, but my terminal can't find its cursor after (properl

Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene

2011-02-07 Thread Markus Jelsma
Heap usage can spike after a commit. Existing caches are still in use and new caches are being generated and/or auto warmed. Can you confirm this is the case? On Friday 28 January 2011 00:34:42 Simon Wistow wrote: > On Tue, Jan 25, 2011 at 01:28:16PM +0100, Markus Jelsma said: > > Are

Re: dynamic fields revisited

2011-02-07 Thread Markus Jelsma
It would be quite annoying if it behaves as you were hoping for. This way it is possible to use different field types (and analyzers) for the same field value. In faceting, for example, this can be important because you should use analyzed fields for q and fq but unanalyzed fields for facet.fiel

Re: q.alt=*:* for every request?

2011-02-07 Thread Markus Jelsma
There is no measurable performance penalty when setting the parameter, except maybe the execution of the query with a high value for rows. To make things easy, you can define q.alt=*:* as default in your request handler. No need to specify it in the URL. > Hi, > > I use dismax handler with s

Re: Possible Memory Leaks / Upgrading to a Later Version of Solr or Lucene

2011-02-07 Thread Markus Jelsma
also happen when you sort on a very large dataset that isn't optimized, in this case the maxDoc value is too high. Anyway, try some settings and monitor the nodes and please report your findings. > On Mon, Feb 07, 2011 at 02:06:00PM +0100, Markus Jelsma said: > > Heap usage can sp

Re: does copyField recurse?

2011-02-08 Thread Markus Jelsma
Field values are copied before being analyzed. There is no cascading of analyzers. > Hello list, > > if I have a field title which copied to text and a field text that is > copied to text.stemmed. Am I going to get the copy from the field title to > the field text.stemmed or should I include it?
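The behaviour described above can be modelled with a toy example: copies are taken from the raw values of the incoming document, so a copy destination does not feed another copy rule. This is a simplified model under that assumption, not Solr's code; the function name, the rule format and the sample values are hypothetical.

```python
def apply_copy_fields(doc, rules):
    """Copy raw source values to destinations in one pass, without chaining."""
    result = {name: list(values) for name, values in doc.items()}
    for source, dest in rules:
        # Only values present in the original document are copied, so a
        # destination never acts as the source for another rule.
        for value in doc.get(source, []):
            result.setdefault(dest, []).append(value)
    return result


rules = [("title", "text"), ("text", "text.stemmed")]
doc = {"title": ["hello world"], "text": ["body"]}
indexed = apply_copy_fields(doc, rules)
print(indexed)
```

In this model the title value reaches text but not text.stemmed, so a separate rule from title to text.stemmed would be needed to get it there.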

Re: q.alt=*:* for every request?

2011-02-08 Thread Markus Jelsma
ax > QParserPlugin is particularly powerful in there so it'd be nice to see > what's happening. > > Any logging category I need to activate? > > paul > > Le 8 févr. 2011 à 03:22, Markus Jelsma a écrit : > > There is no measurable performance penalty when setti

Re: difference between filter_queries and parsed_filter_queries

2011-02-08 Thread Markus Jelsma
="true" > > > > synonyms="synonyms_city_facet.txt" ignoreCase="true" expand="false" /> > > > > > please suggest me please. -- Markus Jelsma - CTO - Openindex http://www.linkedin.com/in/markus17 050-8536620 / 06-50258350

Re: Cache size

2011-02-08 Thread Markus Jelsma
o know the size *in bytes* occupied by a cache (filter > cache, doc cache ...)? I don't find such information within the stats page. > > Regards

Re: Nutch and Solr search on the fly

2011-02-09 Thread Markus Jelsma
oolish and the crawl never happened, I feel I > am missing some information here. I think somewhere in the process there > should be a crawling happening and I missed it out. > > Just wanted to see if some one could help me pointing this out and where I > went wrong in the process. Forgive

Re: Nutch and Solr search on the fly

2011-02-09 Thread Markus Jelsma
ch-solr/) assumed > everyone would know. > > Thanks, > Abi > > On Wed, Feb 9, 2011 at 7:09 PM, Markus Jelsma wrote: > > The parsed data is only sent to the Solr index if you tell a segment to > > be indexed; solrindex > > > > If you did this only once aft

Re: Solr 1.4.1 using more memory than Solr 1.3

2011-02-09 Thread Markus Jelsma
amount of processing > compared to Solr 1.3 ? > > Is there any particular configuration that needs to be done to avoid this > high memory usage ? > > Thanks, > Rachita

Re: Solr Out of Memory Error

2011-02-09 Thread Markus Jelsma
Bing Li, One should be conservative when setting Xmx. Also, just setting Xmx might not do the trick at all because the garbage collector might also be the issue here. Configure the JVM to output debug logs of the garbage collector and monitor the heap usage (especially the tenured generation) w

Re: Solr Out of Memory Error

2011-02-09 Thread Markus Jelsma
I should also add that reducing the caches and autowarm sizes (or not using them at all) drastically reduces memory consumption when a new searcher is being prepared after a commit. The memory usage will spike at these events. Again, use a monitoring tool to get more information on your specific

Re: Tomcat6 and Log4j

2011-02-10 Thread Markus Jelsma
Add it to the CATALINA_OPTS, on Debian systems you could edit /etc/default/tomcat On Thursday 10 February 2011 12:27:59 Xavier SCHEPLER wrote: > -Dlog4j.configuration=$CATALINA_HOME/webapps/solr/WEB-INF/classes/log4j.properties

Re: Tomcat6 and Log4j

2011-02-10 Thread Markus Jelsma
on this page is that this error occurs what the > log4j.properties isn't found. > > Could someone help me to have it working ? > > Thanks in advance, > > Xavier

Re: Tomcat6 and Log4j

2011-02-10 Thread Markus Jelsma
pIndex=20 On Thursday 10 February 2011 12:51:13 Markus Jelsma wrote: > Oh, now looking at your log4j.properties, i believe it's wrong. You > declared INFO as rootLogger but you use SOLR. > > -log4j.rootLogger=INFO > +log4j.rootLogger=SOLR > > try again > > On

Re: Wikipedia table of contents.

2011-02-10 Thread Markus Jelsma
Yes but it's not very useful: http://wiki.apache.org/solr/TitleIndex On Thursday 10 February 2011 16:14:40 Dennis Gearon wrote: > Is there a detailed, perhaps alphabetical & hierarchical table of > contents for all ether wikis on the sole site? Sent > from Yahoo! Mail on A

Re: solr admin result page error

2011-02-11 Thread Markus Jelsma
lso > no \u utf8-code. > > Only idea I have is solr itself or the result page generation. > > How to proceed, what else to check? > > Regards, > Bernd

Re: solr admin result page error

2011-02-11 Thread Markus Jelsma
? > > It is definately a bug and has nothing to do with firefox. > > Regards, > Bernd > > Am 11.02.2011 13:48, schrieb Markus Jelsma: > > It looks like you hit the same issue as i did a while ago: > > http://www.mail-archive.com/solr-user@lucene.apache.org/ms

Re: Title index to wiki

2011-02-11 Thread Markus Jelsma
age & did not see that link on that page. Who's got > write access to wikis pages? > Sent from Yahoo! Mail on Android

Re: Difference between Solr and Lucidworks distribution

2011-02-11 Thread Markus Jelsma
an installer, etc.. Is there any other > differences? Is it a good idea to use this free distribution? > > Greg

Re: carrot2 clustering component error

2011-02-15 Thread Markus Jelsma
I've seen that before on a 3.1 check out after i compiled the clustering component, copied the jars and started Solr. For some reason, recompiling didn't work and doing an ant clean in front didn't fix it either. Updating to a revision i knew did work also failed. I just removed the entire che

Re: clustering with tomcat

2011-02-16 Thread Markus Jelsma
On Debian you can edit /etc/default/tomcat6 > hi, > i am using solr1.4 with apache tomcat. to enable the > clustering feature > i follow the link > http://wiki.apache.org/solr/ClusteringComponent > Plz help me how to add-Dsolr.clustering.enabled=true to $CATALINA_OPTS. > after that w

Re: clustering with tomcat

2011-02-16 Thread Markus Jelsma
Garg wrote: > On Wednesday 16 February 2011 02:41 PM, Markus Jelsma wrote: > > On Debian you can edit /etc/default/tomcat6 > > > >> hi, > >> > >> i am using solr1.4 with apache tomcat. to enable the > >> > >> clustering fea

Snappull failed

2011-02-16 Thread Markus Jelsma
) All i know is that it was unable to download but the reason eludes me. Sometimes, a machine rolls out many of these errors and keeps increasing the index size because it can't handle the already downloaded data. Cheers,

Re: clustering with tomcat

2011-02-16 Thread Markus Jelsma
I have no idea, seems you haven't compiled Carrot2 or haven't included all jars. On Wednesday 16 February 2011 11:29:30 Isha Garg wrote: > On Wednesday 16 February 2011 03:32 PM, Markus Jelsma wrote: > > What distro are you using? On at least Debian syst

Re: Term Vector Query on Single Document

2011-02-16 Thread Markus Jelsma
'm > thinking 'yes' Yes. > > - How expensive is setting the termVector on a field? Takes up additional disk space and RAM. Can be a lot. > > > Thanks - Tod

Re: optimize and mergeFactor

2011-02-16 Thread Markus Jelsma
> In my own Solr 1.4, I am pretty sure that running an index optimize does > give me significant better performance. Perhaps because I use some > largeish (not huge, maybe as large as 200k) stored fields. 200.000 stored fields? I assume that number includes your number of documents? Sounds crazy =

Re: Shutdown hook executing for a long time

2011-02-16 Thread Markus Jelsma
Closing a core will shutdown almost everything related to the workings of a core. Update and search handlers, possible warming searchers etc. Check the implementation of the close method: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/solr/src/java/org/apache/solr/core/SolrCore.java?v

Re: optimize and mergeFactor

2011-02-16 Thread Markus Jelsma
> Thanks for the answers, more questions below. > > On 2/16/2011 3:37 PM, Markus Jelsma wrote: > > 200.000 stored fields? I asume that number includes your number of > > documents? Sounds crazy =) > > Nope, I wasn't clear. I have less than a dozen stored field, b

Re: Solr multi cores or not

2011-02-16 Thread Markus Jelsma
Hi, That depends (as usual) on your scenario. Let me ask some questions: 1. what is the sum of documents for your applications? 2. what is the expected load in queries/minute 3. what is the update frequency in documents/minute and how many documents per commit? 4. how many different applications

Re: Solr multi cores or not

2011-02-16 Thread Markus Jelsma
You can also easily abuse shards to query multiple cores that share parts of the schema. This way you have isolation with the ability to query them all. The same can, of course, also be achieved using a single index with a simple field identifying the application and using fq on that one. > Yes,
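A minimal sketch of the shards approach mentioned above: one request is sent to any core, and the shards parameter lists every core to search. The host, the core names and the helper name sharded_select_url are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sharded_select_url(host, cores, query):
    """Build a /select URL on the first core that searches all listed cores."""
    # Each shard is given as host:port/path, without a scheme.
    shards = ",".join(f"{host}/solr/{core}" for core in cores)
    params = urlencode({"q": query, "shards": shards})
    return f"http://{host}/solr/{cores[0]}/select?{params}"


# Hypothetical host and core names.
url = sharded_select_url("localhost:8983", ["app1", "app2"], "*:*")
shards = parse_qs(urlsplit(url).query)["shards"][0]
print(url)
print(shards)
```

The single-index alternative from the same reply would instead keep one core and add a filter query on the field that identifies the application.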

Re: Replication and newSearcher registerd > poll interval

2011-02-17 Thread Markus Jelsma
polls (though warning to logs) when there is a > > searcher in the process of warming? Else as in our case it brings the > > slave to it's knees, workaround was to extend the poll interval, > > though not ideal. > > > > Cheers, > > Dan
