Re: access denied to solr home lib dir

2009-11-22 Thread Yonik Seeley
Maybe ensuring that the full parent path (all parent directories) have rx permissions? -Yonik http://www.lucidimagination.com On Sun, Nov 22, 2009 at 2:59 PM, Charles Moad cm...@imamuseum.org wrote:    I have been trying to get a new solr install setup on Ubuntu 9.10 using tomcat6.  I have
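A rough way to check the suggestion above is to walk every directory on the way to the lib dir and flag any that lack read and execute bits. This is only a sketch: it assumes the servlet container's user is neither the owner nor in the group of those directories, so it looks at the "other" permission bits only, and the function name is made up for illustration.

```python
import stat
from pathlib import Path


def dirs_missing_rx(path):
    """Return each directory on the way to `path` that lacks r-x for 'other'."""
    target = Path(path).resolve()
    needed = stat.S_IROTH | stat.S_IXOTH
    missing = []
    # Walk from the filesystem root down to the target itself.
    for directory in [*reversed(target.parents), target]:
        mode = directory.stat().st_mode
        if stat.S_ISDIR(mode) and (mode & needed) != needed:
            missing.append(str(directory))
    return missing
```

Calling dirs_missing_rx() on the solr home lib dir returns an empty list when every parent directory is readable and traversable by other users.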

Re: query function query; what's it for?

2009-11-22 Thread Yonik Seeley
On Sun, Nov 22, 2009 at 11:06 PM, David Smiley @MITRE.org dsmi...@mitre.org wrote: It's not clear to me what purpose the query function query solves.  I've read the description: http://wiki.apache.org/solr/FunctionQuery#query  but it doesn't really explain the point of it. I'm sure it has to

Re: Solr 1.3 query and index perf tank during optimize

2009-11-21 Thread Yonik Seeley
On Sat, Nov 21, 2009 at 12:33 AM, Lance Norskog goks...@gmail.com wrote: And, terms whose documents have been deleted are not purged. So, you can merge all you like and the index will not shrink back completely. Under what conditions? Certainly not all, since I just tried a simple test and a

Re: Upgrade to solr 1.4

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 10:26 AM, kalidoss kalidoss.muthuramalin...@sifycorp.com wrote: In version 1.3 the EventDate field type is date. In 1.4 it's also date, but we are getting the following error. Use the schema you had with 1.3 and it should work. The example schemas are not backward compatible

Re: Default sort order for filter query

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 11:15 AM, Mike mpiluson...@comcast.net wrote: Sorry for the noise - I think I have just answered my own question. The order in which docs are indexed determines the result sort order unless overridden via sort query parameters :) Correct. The internal lucene document id

Re: Default sort order for filter query

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 11:28 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, Nov 20, 2009 at 11:15 AM, Mike mpiluson...@comcast.net wrote: Sorry for the noise - I think I have just answered my own question. The order in which docs are indexed determines the result sort order unless

Re: Solr 1.3 query and index perf tank during optimize

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 12:24 PM, Michael solrco...@gmail.com wrote: So -- I thought I understood you to mean that if I frequently merge, it's basically the same as an optimize, and cruft will get purged.  Am I misunderstanding you? That only applies to the segments involved in the merge. The

Re: Solr 1.3 query and index perf tank during optimize

2009-11-20 Thread Yonik Seeley
On Fri, Nov 20, 2009 at 2:32 PM, Michael solrco...@gmail.com wrote: On Fri, Nov 20, 2009 at 12:35 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, Nov 20, 2009 at 12:24 PM, Michael solrco...@gmail.com wrote: So -- I thought I understood you to mean that if I frequently merge, it's

Re: Solrj writes data to local directory

2009-11-19 Thread Yonik Seeley
Looks like a very old undesirable config issue... http://search.lucidimagination.com/search/document/c5ae6fa490d0f59a I'll open a JIRA issue to track this. -Yonik http://www.lucidimagination.com On Thu, Nov 19, 2009 at 7:32 AM, Mike mpiluson...@comcast.net wrote: Stuart Grimshaw wrote: I

Re: NPE when trying to view a specific document via Luke

2009-11-19 Thread Yonik Seeley
On Fri, Nov 13, 2009 at 8:05 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : I tried to reproduce this in 1.4 using an index/configs created with 1.3, : but I got a *different* NPE when loading this url... I should have tried a simpler test ... I get NPEs just trying to execute a simple

Re: NPE when trying to view a specific document via Luke

2009-11-13 Thread Yonik Seeley
On Fri, Nov 13, 2009 at 5:41 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : I'm seeing this stack trace when I try to view a specific document, e.g. : /admin/luke?id=1 but luke appears to be working correctly when I just FWIW: I was able to reproduce this using the example setup (i

Re: Converting SortableIntField to Integer (Externalizing)

2009-11-12 Thread Yonik Seeley
On Thu, Nov 12, 2009 at 8:02 AM, Chantal Ackermann chantal.ackerm...@btelligent.de wrote: this works fine for me! However, I'm using Java/SolrJ and I have the freedom to add any necessary jars to convert the value. These conversions should normally be done on the Solr server side (i.e.

Re: distributed facet dates

2009-11-10 Thread Yonik Seeley
On Tue, Nov 10, 2009 at 7:09 AM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I am thinking of developing facet dates for distributed search but I don't know exactly where to start. I am familiar with the facet dates source code and I think if I could understand how distributed facet

Re: distributed facet dates

2009-11-10 Thread Yonik Seeley
On Tue, Nov 10, 2009 at 7:54 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Tue, Nov 10, 2009 at 7:09 AM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I am thinking of developing facet dates for distributed search but I don't know exactly where to start. I am familiar

Re: tracking solr response time

2009-11-10 Thread Yonik Seeley
On Tue, Nov 10, 2009 at 8:07 AM, bharath venkatesh bharathv6.proj...@gmail.com wrote: how much ram would be good enough for the Solr JVM  to run comfortably. It really depends on how much stuff is cached, what fields you facet and sort on, etc. It can be easier to measure than to try and

Re: Converting SortableIntField to Integer (Externalizing)

2009-11-10 Thread Yonik Seeley
On Tue, Nov 10, 2009 at 10:26 AM, Chantal Ackermann chantal.ackerm...@btelligent.de wrote: has anyone some code snippet on how to convert the String representation of a SortableIntField (or SortableLongField or else) to a java.lang.Integer or int? FieldType.indexedToReadable() -Yonik

Re: StreamingUpdateSolrServer - indexing process stops in a couple of hours

2009-11-05 Thread Yonik Seeley
Seems fixed. https://issues.apache.org/jira/browse/SOLR-1543 -Yonik http://www.lucidimagination.com On Mon, Nov 2, 2009 at 6:05 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: I'm able to reproduce this issue consistently using JDK 1.6.0_16 After an optimize is called, only one

Re: StreamingUpdateSolrServer - indexing process stops in a couple of hours

2009-11-04 Thread Yonik Seeley
Can you open a JIRA issue if you haven't already? How did you reproduce it (what's the simplest method?) -Yonik http://www.lucidimagination.com On Mon, Nov 2, 2009 at 6:05 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: I'm able to reproduce this issue consistently using JDK 1.6.0_16

Re: tracking solr response time

2009-11-02 Thread Yonik Seeley
On Mon, Nov 2, 2009 at 8:13 AM, bharath venkatesh bharathv6.proj...@gmail.com wrote: We are using solr for many of our products and it is doing quite well. But since the number of hits is becoming high we are experiencing latency in certain requests; about 15% of our requests are suffering a latency

Re: tracking solr response time

2009-11-02 Thread Yonik Seeley
On Mon, Nov 2, 2009 at 2:21 PM, bharath venkatesh bharathv6.proj...@gmail.com wrote: we observed many times there is huge mismatch between qtime and time measured at the client for the response Long times to stream back the result to the client could be due to - client not reading fast enough

Re: Scoring algorithm?

2009-10-31 Thread Yonik Seeley
? The easiest way is to just use something like post.sh *.xml That's slow performance-wise, but not a big deal if you don't have too many docs. -Yonik http://www.lucidimagination.com On Sat, Oct 31, 2009 at 10:04 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Sat, Oct 31, 2009 at 8:48 AM, Paul

Re: Ok, that didn't work

2009-10-31 Thread Yonik Seeley
Hmmm... perhaps you're missing the <add> tag around the <doc>? -Yonik http://www.lucidimagination.com On Sat, Oct 31, 2009 at 10:37 AM, Paul Tomblin ptomb...@xcski.com wrote: I was looking at the script in example/exampledocs to feed documents to the server. Just to see if it was possible, I
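For anyone hitting the same thing, here is a minimal sketch of wrapping a bare document in the enclosing add element before posting it. The helper name and the sample field name "id" are hypothetical; only the add/doc nesting comes from the reply above.

```python
import xml.etree.ElementTree as ET


def wrap_in_add(doc_xml):
    """Wrap a single <doc> element in an enclosing <add> element."""
    doc = ET.fromstring(doc_xml)
    if doc.tag != "doc":
        raise ValueError("expected a <doc> root element")
    add = ET.Element("add")
    add.append(doc)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(add, encoding="unicode")


print(wrap_in_add('<doc><field name="id">1</field></doc>'))
# prints: <add><doc><field name="id">1</field></doc></add>
```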

Re: Ok, that didn't work

2009-10-31 Thread Yonik Seeley
$URL --data-binary @- -H 'Content-type:text/xml; charset=utf-8' -Yonik http://www.lucidimagination.com  Is there a way to feed the actual documents without adding tags that aren't part of the schema to them? On Sat, Oct 31, 2009 at 10:43 AM, Yonik Seeley yo...@lucidimagination.com wrote

Re: Solr Cell on web-based files?

2009-10-31 Thread Yonik Seeley
On Sat, Oct 31, 2009 at 12:52 PM, Insight 49, LLC insigh...@gmail.com wrote: Are local file URIs a limitation of solr cell, or just curl? All of Solr's interfaces are currently based on HTTP and usable over a network. Curl (like wget) is simply a useful command line tool that can speak HTTP and

Re: Another question about omitNorms

2009-10-31 Thread Yonik Seeley
On Sat, Oct 31, 2009 at 3:18 PM, Paul Tomblin ptomb...@xcski.com wrote: In an earlier message, Yonik suggested that I use omitNorms=true if I wanted the length of the document to not be counted in the scoring. The documentation also mentions that it omits index-time boosting. What does that

Re: Facets - ORing attribute values

2009-10-29 Thread Yonik Seeley
Perhaps something like this that's actually running Solr w/ multi-select? http://search.lucidimagination.com/ http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters You just need a recent version of Solr 1.4 -Yonik http://www.lucidimagination.com On Thu, Oct 29,

Re: Faceting within one document

2009-10-29 Thread Yonik Seeley
On Wed, Oct 28, 2009 at 2:02 PM, Andrew Clegg andrew.cl...@gmail.com wrote: If I give a query that matches a single document, and facet on a particular field, I get a list of all the terms in that field which appear in that document. (I also get some with a count of zero, I don't really

Re: Multifield query parser and phrase query behaviour from 1.3 to 1.4

2009-10-27 Thread Yonik Seeley
On Tue, Oct 27, 2009 at 8:44 AM, Jérôme Etévé jerome.et...@gmail.com wrote: I don't really get why these two tokens are subsequently put together in a phrase query. That's the way the Lucene query parser has always worked... phrase queries are made if multiple tokens are produced from one field

Re: Solr 1.4 (RC) performance on multi-CPU system

2009-10-27 Thread Yonik Seeley
On Tue, Oct 27, 2009 at 10:23 AM, gabriele renzi rff@gmail.com wrote: On Mon, Oct 26, 2009 at 10:43 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Oct 26, 2009 at 4:32 PM, Teruhiko Kurosaka k...@basistech.com wrote: Are multiple CPUs utilized at indexing time as well, or just

Re: Wildcard on first char, Possible Bug 1.4?

2009-10-27 Thread Yonik Seeley
On Tue, Oct 27, 2009 at 4:14 PM, Grant Ingersoll gsing...@apache.org wrote: I'm pretty sure wildcard queries don't go through analysis, hence they are probably not stemmed. Right - same thing would happen w/o the reverse filter. Also, wildcarding mixes poorly with stemming - trying to analyze

Re: question about text field and WordDelimiterFilter in example schema.xml

2009-10-27 Thread Yonik Seeley
On Tue, Oct 27, 2009 at 10:31 PM, Bill Au bill.w...@gmail.com wrote: Here is my example. With the current text field, the query term iPhone will not match document containing the string iphone because iPhone is analyzed into two terms: i(1) and phone(2). Right. The limitations are known...

Re: Solr ignoring maxFieldLength?

2009-10-26 Thread Yonik Seeley
Yes, please show us your solrconfig.xml, and verify that you reindexed the document after changing maxFieldLength and restarting solr. I'll also see if I can reproduce a problem with maxFieldLength being ignored. -Yonik http://www.lucidimagination.com On Mon, Oct 26, 2009 at 7:11 AM, Andrew

Re: Solr ignoring maxFieldLength?

2009-10-26 Thread Yonik Seeley
my data and conf directories here: http://biotext.org.uk/static/solr-issue-example.tar.gz That should be enough to reproduce it with. Thanks! Andrew. Yonik Seeley-2 wrote: Yes, please show us your solrconfig.xml, and verify that you reindexed the document after changing maxFieldLength

Re: Solr ignoring maxFieldLength?

2009-10-26 Thread Yonik Seeley
On Mon, Oct 26, 2009 at 11:00 AM, Andrew Clegg andrew.cl...@gmail.com wrote: Yonik Seeley-2 wrote: Sorry Andrew, this is something that's bitten people before. search for maxFieldLength and you will see *2* of them in your config - one for indexDefaults and one for mainIndex. The one

Re: Solr ignoring maxFieldLength?

2009-10-26 Thread Yonik Seeley
On Mon, Oct 26, 2009 at 11:43 AM, Andrew Clegg andrew.cl...@gmail.com wrote: Yonik Seeley-2 wrote: If you could, it would be great if you could test commenting out the one in mainIndex and see if it inherits correctly from indexDefaults... if so, I can comment it out in the example and remove

Re: Solr 1.4 (RC) performance on multi-CPU system

2009-10-26 Thread Yonik Seeley
2009/10/26 Teruhiko Kurosaka k...@basistech.com: Is Solr 1.4 (Release Candidate) supposed to take advantage of multi-CPU (core) systems? I.e. if more than one update or search request comes in at about the same time, they can be automatically assigned to different CPUs if available (and the OS does

Re: Solr 1.4 (RC) performance on multi-CPU system

2009-10-26 Thread Yonik Seeley
On Mon, Oct 26, 2009 at 4:32 PM, Teruhiko Kurosaka k...@basistech.com wrote: Are multiple CPUs utilized at indexing time as well, or just by searcher? Yes, multiple CPUs are utilized for indexing. If you're using SolrJ, an easy way to exploit this parallelism is to use

Re: relevancy and merging

2009-10-26 Thread Yonik Seeley
On Mon, Oct 26, 2009 at 5:40 PM, Paul Rosen p...@performantsoftware.com wrote: Is there any difference to the relevancy score for a document that has been added directly to an index vs. the same document that got into the index because of a merge? Nope. Anything else would be a bug. There are

Re: Too many open files

2009-10-24 Thread Yonik Seeley
On Sat, Oct 24, 2009 at 12:18 PM, Fuad Efendi f...@efendi.ca wrote: Mark, I don't understand this; of course it is use case specific, I haven't seen any terrible behaviour with 8Gb If you had gone over 2GB of actual buffer *usage*, it would have broke... Guaranteed. We've now added a check in

Re: Solr under tomcat - UTF-8 issue

2009-10-24 Thread Yonik Seeley
Try using example/exampledocs/test_utf8.sh to narrow down if the charset problems you're hitting are due to servlet container configuration. -Yonik http://www.lucidimagination.com 2009/10/24 Glock, Thomas thomas.gl...@pfizer.com: Thanks but not working... I did have the URIEncoding in place

Re: boostQParser and dismax

2009-10-22 Thread Yonik Seeley
On Thu, Oct 22, 2009 at 7:01 PM, Joe Calderon calderon@gmail.com wrote: hello *, i was just reading over the wiki function query page and found this little gem for boosting recent docs thats much better than what i was doing before recip(ms(NOW,mydatefield),3.16e-11,1,1) Thanks, it's
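To see what that boost does numerically, here is a small sketch. It assumes recip(x,m,a,b) evaluates a/(m*x+b), as described on the function query wiki page, and that ms(NOW,mydatefield) is the document's age in milliseconds; the Python names are made up for illustration.

```python
MS_PER_YEAR = 365.25 * 24 * 3600 * 1000  # about 3.156e10 milliseconds


def recip(x, m, a, b):
    """Assumed form of the recip function query: a / (m*x + b)."""
    return a / (m * x + b)


def recency_boost(age_ms):
    """Boost for a document of the given age, using the constants quoted above."""
    return recip(age_ms, 3.16e-11, 1, 1)


for years in (0, 1, 2, 10):
    print(years, round(recency_boost(years * MS_PER_YEAR), 3))
```

With m chosen as roughly 1 over the number of milliseconds in a year, a brand new document gets a boost of 1 and a one-year-old document gets about half that.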

Re: No search hits for items starting with one-letter words

2009-10-21 Thread Yonik Seeley
I just tried with Solr 1.4 trunk and it seems to work fine. "a" is a stopword... but I'm not sure how stopwords could be messing you up. For matching song titles, you may want to use a field type with no stopwords though (there are a lot of common words in song titles I think). If you've changed

Re: commitWithin question

2009-10-21 Thread Yonik Seeley
On Wed, Oct 21, 2009 at 11:58 AM, Jacob Elder jel...@locamoda.com wrote: Our application involves lots of live index updates with mixed priority. A few updates are very important and need to be in the index promptly, while we also have a great deal of updates which can be dealt with lazily.

Re: Lucene versions in upcoming solr release

2009-10-21 Thread Yonik Seeley
On Wed, Oct 21, 2009 at 8:36 PM, entdeveloper cameron.develo...@gmail.com wrote: It's my understanding that Solr 1.4, which is to be released any day now, will be based on version 2.9 of lucene. Serious bugs were found in Lucene 2.9.0... we are all set to release when Lucene 2.9.1 is released,

Re: Near Real Time

2009-10-21 Thread Yonik Seeley
On Wed, Oct 21, 2009 at 10:19 PM, George Aroush geo...@aroush.net wrote: Depends a lot on the nature of the requests and the size of the index, but one minute is often doable. On a large index that facets on many fields per request, one minute is probably still out of reach. With no facets,

Re: Boost with wildcard.

2009-10-20 Thread Yonik Seeley
On Mon, Oct 19, 2009 at 10:32 AM, Jay Ess li...@netrogenic.com wrote: The boost (index time) does not work when I am searching for a word with a wildcard appended to the end. I stumbled onto this feature and it's pretty much a show stopper for me. I am implementing a live search feature where

Re: Slow Phrase Queries

2009-10-20 Thread Yonik Seeley
Solr just uses a stock lucene phrase query. What version of Lucene and Solr are you comparing? Do the queries match the same number of documents? -Yonik http://www.lucidimagination.com On Tue, Oct 20, 2009 at 2:18 PM, DHast hastings.recurs...@gmail.com wrote: Hello, I have recently installed

Re: max words/tokens

2009-10-20 Thread Yonik Seeley
On Tue, Oct 20, 2009 at 1:53 PM, Joe Calderon calderon@gmail.com wrote: i have a pretty basic question, is there an existing analyzer that limits the number of words/tokens indexed from a field? let say i only wanted to index the top 25 words... It would be really easy to write one, but no

Re: Hierarchical Facet Sorting

2009-10-20 Thread Yonik Seeley
What version of Solr are you using? I just tried this with the latest 1.4-dev version, and it works fine. http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&facet.sort=true Note that facet.sort=true/false has been deprecated in Solr 1.4
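When building a request like this one by hand it is easy to lose a separator, so here is a small sketch that assembles the same parameters with the standard library. The host, port and field name are the ones from the message; keeping "*" and ":" unescaped is just a readability choice.

```python
from urllib.parse import urlencode

# Parameters taken from the example request in the message above.
params = [
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.field", "cat"),
    ("facet.sort", "true"),
]

# safe="*:" leaves the match-all query readable instead of percent-encoding it.
url = "http://localhost:8983/solr/select?" + urlencode(params, safe="*:")
print(url)
# prints: http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&facet.sort=true
```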

Re: question about text field and WordDelimiterFilter in example schema.xml

2009-10-20 Thread Yonik Seeley
On Tue, Oct 20, 2009 at 6:37 PM, Bill Au bill.w...@gmail.com wrote: I have a question regarding the use of the WordDelimiterFilter in the text field in the example schema.xml.  The parameters are set differently for the indexing and querying.  Namely, catenateWords and catenateNumbers are set

Re: ArrayIndexOutOfBoundsException during indexing

2009-10-19 Thread Yonik Seeley
Thanks for the report Aaron, this definitely looks like a Lucene bug, and I've opened https://issues.apache.org/jira/browse/LUCENE-1995 Can you follow up there (I asked about your index settings). -Yonik http://www.lucidimagination.com On Mon, Oct 19, 2009 at 3:04 PM, Aaron McKee

Re: Filter query optimization

2009-10-19 Thread Yonik Seeley
, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Oct 19, 2009 at 2:55 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: If a filter query matches nothing, then no additional query should be performed and no results returned?  I don't think we have this today

Re: Solr commits before documents are added

2009-10-19 Thread Yonik Seeley
On Mon, Oct 19, 2009 at 7:39 PM, Lance Norskog goks...@gmail.com wrote: commit(waitFlush=true, waitSearcher=true)  waits for the entire operation and when it finishes, all 1 million documents should be searchable. That waits for the commit to complete, but not any adds that may be happening in

Re: Core/shard preference

2009-10-19 Thread Yonik Seeley
Although shards should be disjoint, Solr tolerates duplication (won't return duplicates in the main results list, but doesn't make any effort to correct facet counts, etc). Currently, whichever shard responds first wins. The relevant code is around line 420 in QueryComponent.java:

Re: Solr 1.4 release candidate

2009-10-18 Thread Yonik Seeley
FYI, the latest nightly includes more lucene bug fixes targeted toward Lucene 2.9.1 The (current) full list is here: http://svn.apache.org/viewvc/lucene/java/branches/lucene_2_9/CHANGES.txt?view=markup&pathrev=826563 -Yonik http://www.lucidimagination.com On Wed, Oct 14, 2009 at 10:01 AM, Yonik

Re: stats page slow in latest nightly

2009-10-17 Thread Yonik Seeley
On Tue, Oct 6, 2009 at 5:51 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : When I was working on it, I was actually going to default to not show : the size, and make you click a link that added a param to get the sizes : in the display too. But I foolishly didn't bring it up when Hoss

Re: lucene 2.9 bug

2009-10-16 Thread Yonik Seeley
On Fri, Oct 16, 2009 at 11:41 AM, Joe Calderon calderon@gmail.com wrote: hello * , ive read in other threads that lucene 2.9 had a serious bug in it, hence trunk moved to 2.9.1 dev, im wondering what the bug is as ive been using the 2.9.0 version for the past weeks with no problems, is it

Re: Replication filelist command failure on container restart

2009-10-16 Thread Yonik Seeley
I think you may need to tell the replication handler to enable replication after startup too? <str name="replicateAfter">commit</str> <str name="replicateAfter">startup</str> -Yonik http://www.lucidimagination.com On Fri, Oct 16, 2009 at 12:58 PM, Jérôme Etévé jerome.et...@gmail.com wrote:

Re: Solr/Lucene keeps eating up memory while idling

2009-10-15 Thread Yonik Seeley
I just did some allocation profiling on the stock Solr example... it's not completely idle when no requests are being made. There's only one thing allocating memory: org.mortbay.util.Scanner.scanFiles() That must be Jetty looking to see if any of the files under webapps has changed. It's really

Re: lazy loading error usin Solr Cell

2009-10-14 Thread Yonik Seeley
Hmmm, I just tried the first steps of the Solr Cell tutorial, and it worked fine for me (well, with the exception that there is no site directory... I went to docs instead - I'll fix that). Oh wait - I see your problem: at gnu.xml.stream.SAXParserFactory.setFeature(libgcj.so.90) Your path

Solr 1.4 release candidate

2009-10-14 Thread Yonik Seeley
Folks, we've been in code freeze since Monday and a test release candidate was created yesterday, however it already had to be updated last night due to a serious bug found in Lucene. For now you can use the latest nightly build to get any recent changes like this:

Re: Solr 1.4 release candidate

2009-10-14 Thread Yonik Seeley
a little early - nothing to worry about though. When we build official releases, we explicitly specify the version number anyway. -Yonik http://www.lucidimagination.com On Wed, Oct 14, 2009 at 7:01 AM, Yonik Seeley yo...@lucidimagination.com wrote: Folks, we've been in code freeze since Monday

Re: how to get field contents out of Document object

2009-10-14 Thread Yonik Seeley
On Wed, Oct 14, 2009 at 2:24 PM, Joe Calderon calderon@gmail.com wrote: hello *, sorry if this seems like a dumb question, im still fairly new to working with lucene/solr internals. given a Document object, what is the proper way to fetch an integer value for a field called num_in_stock,

Re: 'Down' boosting shorter docs

2009-10-14 Thread Yonik Seeley
A multiplicative boost may work better than one added in: http://lucene.apache.org/solr/api/org/apache/solr/search/BoostQParserPlugin.html -Yonik http://www.lucidimagination.com On Wed, Oct 14, 2009 at 7:21 PM, Simon Wistow si...@thegestalt.org wrote: Our index has some items in it which

Re: solr IOException

2009-10-13 Thread Yonik Seeley
Jetty has a maximum request size for HTTP-GET... can you use POST instead? -Yonik http://www.lucidimagination.com On Tue, Oct 13, 2009 at 4:33 PM, Elaine Li elaine.bing...@gmail.com wrote: Hi, In my query, i have around 80 boolean clauses. I don't know if it is because the number of boolean

Re: Is negative boost possible?

2009-10-12 Thread Yonik Seeley
On Mon, Oct 12, 2009 at 5:58 AM, Andrzej Bialecki a...@getopt.org wrote: BTW, standard Collectors collect only results with positive scores, so if you want to collect results with negative scores as well then you need to use a custom Collector. Solr never discarded non-positive hits, and now

Re: Is negative boost possible?

2009-10-12 Thread Yonik Seeley
On Mon, Oct 12, 2009 at 12:03 PM, Andrzej Bialecki a...@getopt.org wrote: Solr never discarded non-positive hits, and now Lucene 2.9 no longer does either. Hmm ... The code that I pasted in my previous email uses Searcher.search(Query, int), which in turn uses search(Query, Filter, int), and

Re: Is negative boost possible?

2009-10-11 Thread Yonik Seeley
On Sun, Oct 11, 2009 at 6:04 PM, Lance Norskog goks...@gmail.com wrote: And the other important thing to know about boost values is that the dynamic range is about 6-8 bits That's an index-time boost - an 8 bit float with 5 bits of mantissa and 3 bits of exponent. Query time boosts are normal

Re: Dismax: Impossible to search for a _phrase_ in tokenized and untokenized fields at the same time

2009-10-10 Thread Yonik Seeley
On Sat, Oct 10, 2009 at 6:34 AM, Alex Baranov alex.barano...@gmail.com wrote: Hello, It seems to me that there is no way how I can use dismax handler for searching in both tokenized and untokenized fields while I'm searching for a phrase. Consider the next example. I have two fields in

Re: Different sort behavior on same code

2009-10-09 Thread Yonik Seeley
is done while filling in the FieldCache entry, and deleted docs are skipped by Lucene TermDocs, so terms from deleted docs won't be seen/counted. -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 12:10 PM, Yonik Seeley yo...@lucidimagination.com wrote: Lucene's test for multi-valued

Re: solr severe error when doing a faceted search

2009-10-09 Thread Yonik Seeley
Hi Paul, The new faceting method is faster in the general case, but doesn't work well for faceting full text fields (which tends not to work well regardless of the method). You can get the old behavior by adding either of the parameters facet.method=enum or f.content.facet.method=enum We'll add

Re: how to post(index) large file of 5 GB or greater than this

2009-10-08 Thread Yonik Seeley
What is this huge file? Solr XML? CSV? Anyway, if it's a local file, you can get Solr to directly read/stream it via stream.file Examples in http://wiki.apache.org/solr/UpdateCSV but it should work for any update format, not just CSV. -Yonik http://www.lucidimagination.com On Thu, Oct 8,

Re: UTF-8 and latin accents

2009-10-08 Thread Yonik Seeley
On Thu, Oct 8, 2009 at 12:48 PM, Claudio Martella claudio.marte...@tis.bz.it wrote: I'm trying to index documents with latin accents (italian documents). I extract the text from .doc documents with Tika directly into .xml files. If i open up the XML document with my Dashcode (i run mac os x) i

Re: how can I use debugQuery if I have extended QParserPlugin?

2009-10-08 Thread Yonik Seeley
On Thu, Oct 8, 2009 at 12:14 PM, gdeconto gerald.deco...@topproducer.com wrote: I did check the other posts, as well as whatever I could find on the net but didn't find anything. Has anyone encountered this type of issue, or is what I am doing (extending QParserPlugin) that unusual?? I think

Re: IndexWriter InfoStream in solrconfig not working

2009-10-08 Thread Yonik Seeley
I can't get it to work either, so I reopened https://issues.apache.org/jira/browse/SOLR-1145 -Yonik http://www.lucidimagination.com On Wed, Oct 7, 2009 at 1:45 PM, Giovanni Fernandez-Kincade gfernandez-kinc...@capitaliq.com wrote: I had the same problem. I'd be very interested to know how to

Re: IndexWriter InfoStream in solrconfig not working

2009-10-08 Thread Yonik Seeley
OK, move the infoStream part in solrconfig.xml from indexDefaults into mainIndex and it should work. -Yonik http://www.lucidimagination.com On Thu, Oct 8, 2009 at 2:40 PM, Yonik Seeley yonik.see...@lucidimagination.com wrote: I can't get it to work either, so I reopened https

Re: delay while adding document to solr index

2009-10-08 Thread Yonik Seeley
On Thu, Oct 8, 2009 at 1:58 AM, swapna_here swapna.here...@gmail.com wrote: i don't understand why my solr index increasing daily when i am adding and deleting the same number of documents daily A delete is just a bit flip, and does not reclaim disk space immediately. Deleted documents are

Re: indexing frequently-changing fields

2009-10-08 Thread Yonik Seeley
It's a bit round-about but you might be able to use ExternalFileField http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html The fieldType definition would look like <fieldType name="file" keyField="id" defVal="1" stored="false" indexed="false" class="solr.ExternalFileField"

Re: [slightly off topic] Jetty and NIO

2009-10-08 Thread Yonik Seeley
On Thu, Oct 8, 2009 at 6:24 PM, Grant Ingersoll gsing...@apache.org wrote: So, if I'm on Centos 2.6 (64 bit), what connector should I be using?  Based on the comments, I'm not sure the top one is the right thing either, but it also sounds like it is my only other choice. Right - the connector

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 12:51 PM, Phillip Farber pfar...@umich.edu wrote: In a separate thread, I've detailed how an optimize is taking 2x disk space. We don't use solr distribution/snapshooter.  We are using the default deletion policy = 1. We can't optimize a 192G index in 400GB of space.

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 1:50 PM, Phillip Farber pfar...@umich.edu wrote: So this implies that for a normal optimize, in every case, due to the Searcher holding open the existing segment prior to optimize that we'd always need 3x even in the normal case. This seems wrong since it is repeated

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 3:16 PM, Phillip Farber pfar...@umich.edu wrote: Wow, this is weird. I commit before I optimize. In fact, I bounce tomcat before I optimize just in case. It makes sense, as you say, that then the open reader can only be holding references to segments that wouldn't be

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 3:31 PM, Mark Miller markrmil...@gmail.com wrote: I can't tell why calling a commit or restarting is going to help anything Depends on what scenarios you consider, and what you are taking 2x of. 1) Open reader on index 2) Open writer and add two documents... the first

Re: How much disk space does optimize really take

2009-10-07 Thread Yonik Seeley
On Wed, Oct 7, 2009 at 3:56 PM, Mark Miller markrmil...@gmail.com wrote: I guess you can't guarantee 2x though, as if you have queries coming in that take a while, a commit opening a new Reader will not guarantee the old Reader is quite ready to go away. Might want to wait a short bit after

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Yonik Seeley
to be clinging. Needs to find some inner peace. Yonik Seeley wrote: On Mon, Oct 5, 2009 at 4:54 PM, Jeff Newburn jnewb...@zappos.com wrote: Ok we have done some more testing on this issue.  When I only have the 1 core the reindex completes fine.  However, when I added a second core

Re: Solr Trunk Heap Space Issues

2009-10-06 Thread Yonik Seeley
Miller markrmil...@gmail.com wrote: Yeah - I was wondering about that ... not sure how these guys are stacking up ... Yonik Seeley wrote: TestIndexingPerformance? What the heck... that's not even multi-threaded! -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 12:17 PM, Mark

Re: Importing CSV file slow/crashes

2009-10-06 Thread Yonik Seeley
Is it possible to narrow down what fields/field-types are causing the problems? Or perhaps profile and see what's taking up time compared to the older version? -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 1:48 PM, Nasseam Elkarra nass...@bodukai.com wrote: Hello Erick,

Re: Solr Timeouts

2009-10-06 Thread Yonik Seeley
...@gmail.com] On Behalf Of Yonik Seeley Sent: Monday, October 05, 2009 12:52 PM To: solr-user@lucene.apache.org Subject: Re: Solr Timeouts This is what one of my SOLR requests looks like: http://titans:8080/solr/update/extract/?literal.versionId=684936&literal.filingDate=1997-12-04T00:00

Re: Different sort behavior on same code

2009-10-06 Thread Yonik Seeley
Lucene's test for multi-valued fields is crude... it's essentially if the number of values (un-inverted term instances) becomes greater than the number of documents. -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 3:04 PM, wojtekpia wojte...@hotmail.com wrote: Hi, I'm running
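The rule described in this message can be written out in a few lines. This is only an illustration of the stated comparison, not Lucene's actual code; the function name and the sample counts are hypothetical.

```python
def looks_multi_valued(term_instances: int, num_docs: int) -> bool:
    """Crude check: more un-inverted term instances than documents."""
    return term_instances > num_docs


# 10 documents but 12 term instances: at least one document holds two values.
detected = looks_multi_valued(12, 10)

# 10 documents and 9 term instances: the field could still be multi-valued
# if some documents have no value at all, but the count alone cannot show it.
missed = looks_multi_valued(9, 10)

print(detected, missed)
```

The second case is why the test is called crude: a field with sparse values can slip under the threshold.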

Re: stats page slow in latest nightly

2009-10-06 Thread Yonik Seeley
Might be the new Lucene fieldCache stats stuff that was recently added? -Yonik http://www.lucidimagination.com On Tue, Oct 6, 2009 at 3:56 PM, Joe Calderon calderon@gmail.com wrote: hello *, ive been noticing that /admin/stats.jsp is really slow in the recent builds, has anyone else

Re: Importing CSV file slow/crashes

2009-10-06 Thread Yonik Seeley
On Tue, Oct 6, 2009 at 1:06 PM, Nasseam Elkarra nass...@bodukai.com wrote: I had a dev build of 1.4 from 5/1/2009 and importing a 20K row took less than a minute. Updating to the latest as of yesterday, the import is really slow and I had to cancel it after a half hour. This prevented me from

Re: Solr Trunk Heap Space Issues

2009-10-05 Thread Yonik Seeley
:  0 cumulative_evictions :  0 -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 From: Yonik Seeley yo...@lucidimagination.com Reply-To: solr-user@lucene.apache.org Date: Fri, 2 Oct 2009 10:04:27 -0400 To: solr-user@lucene.apache.org Subject: Re: Solr

Re: Solr Timeouts

2009-10-05 Thread Yonik Seeley
On Mon, Oct 5, 2009 at 12:03 PM, Giovanni Fernandez-Kincade gfernandez-kinc...@capitaliq.com wrote: Hi, I’m attempting to index approximately 6 million HTML/Text files using SOLR 1.4/Tomcat6 on Windows Server 2003 x64. I’m running 64 bit Tomcat and JVM. I’ve fired up 4-5 different jobs that

Re: debugQuery different score for same query. dismax

2009-10-05 Thread Yonik Seeley
scores: 3.7137468 * .375 / .4375 = 3.1832115 If you don't want length normalization for this field, turn it off by setting omitNorms=true -Yonik http://www.lucidimagination.com Yonik Seeley wrote: On Fri, Oct 2, 2009 at 8:16 AM, Julian Davchev j...@drun.net wrote: It looks for pari
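The arithmetic in this message can be checked directly. The sketch below only reproduces the quoted calculation, assuming that .4375 and .375 are the field norm factors of the two documents being compared; the variable names are illustrative.

```python
import math

score_with_norm_a = 3.7137468  # score quoted in the message
norm_a = 0.4375                # norm factor assumed to apply to that score
norm_b = 0.375                 # norm factor assumed to apply to the other document

# Swap one norm factor for the other while keeping every other score term fixed.
score_with_norm_b = score_with_norm_a * norm_b / norm_a

print(score_with_norm_b)
print(math.isclose(score_with_norm_b, 3.1832115, abs_tol=1e-6))
```

With omitNorms=true the field carries no length normalization, so this factor would no longer separate the two scores.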

Re: Solr Timeouts

2009-10-05 Thread Yonik Seeley
On Mon, Oct 5, 2009 at 12:30 PM, Giovanni Fernandez-Kincade gfernandez-kinc...@capitaliq.com wrote: I'm not committing at all actually - I'm waiting for all 6 million to be done. You either have solr auto commit set up, or a client is issuing a commit. -Yonik http://www.lucidimagination.com

Re: Solr Timeouts

2009-10-05 Thread Yonik Seeley
This is what one of my SOLR requests look like: http://titans:8080/solr/update/extract/?literal.versionId=684936&literal.filingDate=1997-12-04T00:00:00Z&literal.formTypeId=95&literal.companyId=3567904&literal.sourceId=0&resource.name=684936.txt&commit=false Have you verified that all of your
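One way to avoid malformed query strings in a request like the one quoted above is to build the URL from a parameter map. This is a minimal sketch using Python's standard library; the host, path, and parameter values are taken from the message, and nothing here is sent to a server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Parameters as they appear in the quoted request.
params = {
    "literal.versionId": "684936",
    "literal.filingDate": "1997-12-04T00:00:00Z",
    "literal.formTypeId": "95",
    "literal.companyId": "3567904",
    "literal.sourceId": "0",
    "resource.name": "684936.txt",
    "commit": "false",
}

# urlencode joins the pairs with "&" and percent-encodes reserved characters.
url = "http://titans:8080/solr/update/extract/?" + urlencode(params)

# Parse the query string back to confirm every parameter survived intact.
parsed = parse_qs(urlsplit(url).query)

print(url)
```

Parsing the result back is a cheap sanity check that each parameter is a separate key, including commit=false.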

Re: Solr Timeouts

2009-10-05 Thread Yonik Seeley
\indexer.logtrue/infoStream I tried relative and absolute paths, but no dice so far. Any other ideas? -Gio. -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Monday, October 05, 2009 12:52 PM To: solr-user@lucene.apache.org Subject

Re: Solr Trunk Heap Space Issues

2009-10-05 Thread Yonik Seeley
On Mon, Oct 5, 2009 at 1:00 PM, Jeff Newburn jnewb...@zappos.com wrote: Ok I have eliminated all queries for warming and am still getting the heap space dump.  Any ideas at this point what could be wrong?  This seems like a huge increase in memory to go from indexing without issues to not being

Re: Solr Trunk Heap Space Issues

2009-10-05 Thread Yonik Seeley
self populate with so many entries (assuming it is the document cache again). -Yonik http://www.lucidimagination.com -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 From: Yonik Seeley yo...@lucidimagination.com Reply-To: solr-user@lucene.apache.org Date

Re: is it possible to speed up this query?

2009-10-04 Thread Yonik Seeley
would cut it to 16. The fastest would be to index a separate field to indicate presence or absence of your field. So instead of myfield:[* TO *] use field_present:myfield or -field_absent:myfield -Yonik http://www.lucidimagination.com Regards, Steve On Sat, Oct 3, 2009 at 9:07 AM, Yonik
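The separate presence field suggested here would be filled in at index time. The sketch below shows one hypothetical way to do that on the client before documents are sent to Solr; the helper name and the sample documents are assumptions, while field_present and myfield come from the message.

```python
def add_presence_marker(doc, tracked_fields, marker_field="field_present"):
    """Return a copy of doc with marker_field listing the tracked fields that have a value."""
    out = dict(doc)
    present = [name for name in tracked_fields if doc.get(name) not in (None, "", [])]
    if present:
        out[marker_field] = present
    return out


with_value = add_presence_marker({"id": "1", "myfield": "abc"}, ["myfield"])
without_value = add_presence_marker({"id": "2"}, ["myfield"])

print(with_value)
print(without_value)
```

A query on field_present:myfield then matches a single term instead of expanding the open-ended range myfield:[* TO *].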
