Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-17 Thread Mike Klaas
Hi Jayson, It is on my list of things to do. I've been having a very busy week and am also working all weekend. I hope to get to it next week sometime, if no-one else has taken it. cheers, -mike On 8-May-09, at 10:15 PM, jayson.minard wrote: First cut of updated handler now in: ht

Re: Solr vs Sphinx

2009-05-14 Thread Mike Klaas
On 14-May-09, at 9:46 AM, gdeconto wrote: Solr is very fast even with 1.3 and the developers have done an incredible job. However, maybe the next Solr improvement should be the creation of a configuration manager and/or automated tuning tool. I know that optimizing Solr performance can

Re: public apology for company spam

2009-03-05 Thread Mike Klaas
On 5-Mar-09, at 6:47 AM, Yonik Seeley wrote: This morning, an apparently over-zealous marketing firm, on behalf of the company I work for, sent out a marketing email to a large number of subscribers of the Lucene email lists. This was done without my knowledge or approval, and I can assure you

Re: why don't we have a forum for discussion?

2009-02-18 Thread Mike Klaas
On 18-Feb-09, at 2:09 PM, Stephen Weiss wrote: I third the motion SOLR is the second largest contributor to my e-mail glut (my company's marketing is #1). I often have no idea what area of Solr I'm actually asking about when I have a question, so I would disagree and say a general for

Re: why don't we have a forum for discussion?

2009-02-18 Thread Mike Klaas
On 18-Feb-09, at 11:06 AM, Tony Wang wrote: I am just curious why we don't have a forum for discussion or you guys think it's really necessary to receive lots of crap information about Solr and nutch in email? I can offer you a forum for discussion anyway. If you want to follow solr-user

Re: Searching on field A gives spurious highlights in field B

2009-02-06 Thread Mike Klaas
On 6-Feb-09, at 12:34 PM, Jeffrey Baker wrote: Hello all. First post to the list. Welcome aboard. I noticed that if I search for foo:blah&hl.fl=bar, I get highlight output for instances of "blah" in field "bar". Is there any way to avoid that? I'm using solr 1.3. Try hl.requireFieldMat

Re: Highlighting does not work?

2009-01-29 Thread Mike Klaas
on. Message written on 2009-01-28 at 20:01 by Mike Klaas: Well, both pages I listed are in the search results :). But I agree that it isn't obvious to find, and that it should be improved. (The Wiki is a community-created site which anyone can contribute to, incidental

Re: Highlighting does not work?

2009-01-28 Thread Mike Klaas
ear I was looking for this information in Solr wiki. See for yourself if this is accessible at all: http://wiki.apache.org/solr/?action=fullsearch&context=180&value=highlight&fullsearch=Text Message written on 2009-01-28 at 00:58 by Mike Klaas: They are d

Re: Highlighting does not work?

2009-01-27 Thread Mike Klaas
They are documented in http://wiki.apache.org/solr/FieldOptionsByUseCase and in the FAQ, but I agree that it could be more readily accessible. -Mike On 27-Jan-09, at 5:26 AM, Jarek Zgoda wrote: Finally found that the fields have to have an analyzer to be highlighted. Neat. Can I ask so

Re: OOME diagnosis - possible to disable caching?

2009-01-19 Thread Mike Klaas
On 19-Jan-09, at 2:44 PM, James Brady wrote: Hi all, I have 20 indices, each ~10GB in size, being searched by a single Solr slave instance (using the multicore features in a slightly old 1.2 dev build) I'm getting unpredictable, but inevitable, OutOfMemoryError from the slave, and I have n

Re: non fix highlight snippet size

2009-01-13 Thread Mike Klaas
On 13-Jan-09, at 12:48 AM, Marc Sturlese wrote: Hey there, I need a rule in my highlights that sets for example, the snippet size to 400 in case there's just one snippet, 225 in case two snippets are found and 125 in case 3 or more snippets are found. Is there any way to do that via solrc

Re: Solr on a multiprocessor machine

2009-01-08 Thread Mike Klaas
On 8-Jan-09, at 3:37 PM, smock wrote: Assuming I have enough RAM then, should I be able to get a performance boost with my current setup? Basically, the question I am trying to answer is - will the Tomcat+Solr setup I have above utilize multiple processors or do I need to do something el

Re: Subscribe Me

2009-01-07 Thread Mike Klaas
Kalidoss, You can subscribe here: http://lucene.apache.org/solr/mailing_lists.html regards, -Mike On 5-Jan-09, at 4:19 AM, kalidoss wrote: Thanks, kalidoss.m, ** DISCLAIMER ** Information contained and transmitted by this E-MAIL is proprietary to Sify Limited and is inten

Re: debugging long commits

2009-01-07 Thread Mike Klaas
Hi Brian, You might want to follow up on the Lucene list (java-u...@lucene.apache.org ). Something was causing problems with the merging and thus you ended up with too many segments (hence the slow commits). I doubt that you lost anything--usually the merge function doesn't modify the ind

Re: Get All terms from all documents

2008-12-18 Thread Mike Klaas
On 18-Dec-08, at 10:53 AM, roberto wrote: Erick, Thanks for the answer, let me clarify the thing, we would like to have a combobox with the terms to guide the user in the search i mean, if a have thousands of documents and want to tell them how many documents in the base have the partic

Re: Smaller filterCache giving better performance

2008-12-05 Thread Mike Klaas
On 5-Dec-08, at 2:24 PM, wojtekpia wrote: I've seen some strange results in the last few days of testing, but this one flies in the face of everything I've read on this forum: Reducing filterCache size has increased performance. This isn't really unexpected behaviour. The problem with a

Re: Deleting indices

2008-11-27 Thread Mike Klaas
On 26-Nov-08, at 4:48 AM, Raghunandan Rao wrote: I have restarted and re-indexed all the docs after the change in the schema.xml. I was able to search even after that. I hit browser with this url http://localhost:7001/solr/select?q=name:2124&fl=*&debugQuery=true Are you sure all the old doc

Re: [VOTE] Community Logo Preferences

2008-11-26 Thread Mike Klaas
https://issues.apache.org/jira/secure/attachment/12394350/solr.s4.jpg https://issues.apache.org/jira/secure/attachment/12394268/apache_solr_c_red.jpg https://issues.apache.org/jira/secure/attachment/12393995/sslogo-solr-70s.png https://issues.apache.org/jira/secure/attachment/12393936/logo_remake

Re: Highlighting wildcards

2008-11-21 Thread Mike Klaas
On 21-Nov-08, at 3:45 AM, Mark Miller wrote: To do it now, you'd have to switch the query parser to using the old style wildcard (and/or prefix) query, which is slower on large indexes and has max clause issues. An alternative is to query for q=tele?*, which forces wildcardquery -Mike

Re: No search result behavior (a la Amazon)

2008-11-20 Thread Mike Klaas
On 20-Nov-08, at 11:40 AM, Caligula wrote: Thanks. I understand what Amazon is doing. The original question is how to achieve this with Solr. And to be more specific, how to achieve this within Solr and not involve multiple search queries to Solr. There isn't a way. The best way to

Re: solr.WordDelimiterFilterFactory

2008-11-20 Thread Mike Klaas
On 20-Nov-08, at 6:20 AM, Daniel Rosher wrote: Hi, I'm trying to index some content that has things like 'java/J2EE' but with solr.WordDelimiterFilterFactory and parameters [generateWordParts="1" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange

Re: Filtering on blank fields

2008-11-20 Thread Mike Klaas
On 20-Nov-08, at 12:23 PM, Manepalli, Kalyan wrote: Hi, I want to fetch only the documents which have a certain field. For this I am using a fq query like this fq=rev.comments:[* TO *] rev.comments fields is of type string. The functionality works correctly but I am seeing a p

Re: Software Announcement: LuSql: Database to Lucene indexing

2008-11-18 Thread Mike Klaas
On 18-Nov-08, at 6:56 AM, Glen Newton wrote: Erik, Right now there is no real abstraction like DIH in LuSql. But as indicated in the TODO section of the documentation, I was planning on implementing or straight borrowing DIH in the near future. I am assuming that Solr is all multi-threaded & a

Re: Deadlock with DirectUpdateHandler2

2008-11-18 Thread Mike Klaas
On 18-Nov-08, at 12:18 PM, Mark Miller wrote: Mike Klaas wrote: autoCommitCount is written in a CommitTracker.synchronized block only. It is read to print stats in an unsynchronized fashion, which perhaps could be fixed, though I can't see how it could cause a problem lastAdde

Re: Deadlock with DirectUpdateHandler2

2008-11-18 Thread Mike Klaas
On 18-Nov-08, at 8:54 AM, Mark Miller wrote: Mark Miller wrote: Toby Cole wrote: Has anyone else experienced a deadlock when the DirectUpdateHandler2 does an autocommit? I'm using a recent snapshot from hudson (apache-solr-2008-11-12_08-06-21), and quite often when I'm loading data the s

Re: SOLR Performance

2008-11-03 Thread Mike Klaas
If you never execute any queries, a gig should be more than enough. Of course, I've never played around with a .8 billion doc corpus on one machine. -Mike On 3-Nov-08, at 2:16 PM, Alok Dhir wrote: in terms of RAM -- how to size that on the indexer? --- Alok K. Dhir Symplicity Corporation

Re: DocSet: BitDocSet or HashDocSet ?

2008-11-03 Thread Mike Klaas
On 28-Oct-08, at 5:36 AM, Jérôme Etévé wrote: Hi all, In my code, I'd like to keep a subset of my 14M docs which is around 100k large. What is according to you the best option in terms of speed and memory usage ? Some basic thoughts tells me the BitDocSet should be the fastest for lookup

Re: Lucene 2.4 released

2008-10-17 Thread Mike Klaas
I don't think that there is any outstanding work to do on this issue. 2.4.0 should be compatible with the Solr 1.3 release; simply drop the lucene jars in solr's lib directory if you want to use the (slightly newer) version of lucene. -Mike On 15-Oct-08, at 10:00 AM, Feak, Todd wrote: T

Re: dismax and long phrases

2008-10-09 Thread Mike Klaas
On 7-Oct-08, at 9:27 AM, Jon Drukman wrote: Mike Klaas wrote: On 6-Oct-08, at 11:20 AM, Jon Drukman wrote: is there any way i could 'fake' it by adding a second field without stopwords, or something like that? Yep, you can "fake" it by only using fieldsets (qf) that

Re: dismax and long phrases

2008-10-06 Thread Mike Klaas
On 6-Oct-08, at 11:20 AM, Jon Drukman wrote: Chris Hostetter wrote: It's not a bug in the implementation, it's a side effect of the basic tenet of how dismax works since it inverts the input and creates a DisjunctionMaxQuery for each "word" in the input, any word that is valid in at leas

Re: What's the bottleneck?

2008-09-11 Thread Mike Klaas
On 11-Sep-08, at 8:24 AM, Jason Rennie wrote: We have a 14 million document index that we only use for querying (optimized, read-only). When we issue queries that have few, relatively rare words, the query returns quickly. However, when the query is longer and uses more common words (hitti

Re: Custom scoring example

2008-09-10 Thread Mike Klaas
On 5-Sep-08, at 5:01 PM, Ravindra Sharma wrote: I am looking for an example if anyone has done any custom scoring with Solr/Lucene. I need to implement a Query similar to DisjunctionMaxQuery, the only difference would be it should score based on sum of score of sub queries' scores instead of

Re: Highlighting Unindexed Fields

2008-09-03 Thread Mike Klaas
On 3-Sep-08, at 1:29 PM, Chris Harris wrote: http://wiki.apache.org/solr/FieldOptionsByUseCase says that a field needs to be both stored and indexed for highlighting to work. Unless I'm very confused, though, I just tested and highlighting worked fine (on trunk) for a stored, *non-indexed* fiel

Re: Solr FAQ entry about "Dynamically calculated range facet" topic

2008-08-22 Thread Mike Klaas
On 22-Aug-08, at 10:35 AM, Rogerio Pereira wrote: Hi Cris, I just asked this because I need to know if this kind of addition is welcome and somebody cares with this kind of information. Are you planning on discussing ways to optimize facet queries such as age:[XX TO YY]? I'm sure someo

Re: Buffer overflow attack on solr seen in the wild

2008-08-21 Thread Mike Klaas
Hi Jim, Looks like a sql injection attack that is automatically entered into search forms. Solr should not be affected, but it could affect you if you insert the raw/unescaped query into a sql database (for logging, etc.). -Mike On 21-Aug-08, at 3:30 PM, Jim Hurst wrote: Hey folks, I

Re: Solr Logo thought

2008-08-21 Thread Mike Klaas
rk on this because I see that it can be confusing. Regards, Lukas On Wed, Aug 20, 2008 at 9:01 PM, Mike Klaas wrote: Nice job Lukas; the professionalism and quality of work is evident. I like aspects of the logo, but too am having trouble getting past the eye-looking O. Is it intentional

Re: Solr Logo thought

2008-08-20 Thread Mike Klaas
Nice job Lukas; the professionalism and quality of work is evident. I like aspects of the logo, but too am having trouble getting past the eye-looking O. Is it intentional (eye:look:search, etc)? -Mike On 20-Aug-08, at 5:25 AM, Mark Miller wrote: I went through the same thought process -

Re: shards and performance

2008-08-19 Thread Mike Klaas
On 19-Aug-08, at 12:58 PM, Phillip Farber wrote: So your experience differs from Mike's. Obviously it's an important decision as to whether to buy more machines. Can you (or Mike) weigh in on what factors led to your different take on local shards vs. shards distributed across machines?

Re: Clarification on facets

2008-08-19 Thread Mike Klaas
A simple way is to query using debugQuery=true and parse the output: 0.74248177 = queryWeight(rawText:python), product of: 2.581456 = idf(docFreq=16017) 0.28762132 = queryNorm 0.4191762 = (MATCH) fieldWeight(rawText:python in 950285), product of: 5.196152 = tf(termFreq(rawText
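The parsing step mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: it treats the explain text as a plain string, assumes the value of interest is the `docFreq` figure inside the `idf(...)` entries, and the helper name `doc_freqs` is hypothetical, not part of Solr.

```python
import re

# Matches entries such as "idf(docFreq=16017)" in debugQuery explain text.
_IDF_PATTERN = re.compile(r"idf\(docFreq=(\d+)\)")


def doc_freqs(explain_text):
    """Return every docFreq value found in an explain string, in order."""
    return [int(n) for n in _IDF_PATTERN.findall(explain_text)]


# Sample taken from the explain output quoted in the message above.
sample = (
    "0.74248177 = queryWeight(rawText:python), product of: "
    "2.581456 = idf(docFreq=16017) 0.28762132 = queryNorm"
)
frequencies = doc_freqs(sample)
```

A regular expression is fragile if the explain format changes between versions, so this is best treated as a quick diagnostic rather than a stable interface.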

Re: shards and performance

2008-08-19 Thread Mike Klaas
On 19-Aug-08, at 10:18 AM, Phillip Farber wrote: I'm trying to understand how splitting a monolithic index into shards improves query response time. Please tell me if I'm on the right track here. Where does the increase in performance come from? Is it that in-memory arrays are smaller

Re: adds / delete within same 'transaction'..

2008-08-12 Thread Mike Klaas
On 11-Aug-08, at 10:48 PM, Norberto Meijome wrote: Hello :) I *think* i know the answer, but i'd like to confirm : Say I have 1old already indexed and commited (ie, 'live' ) What happens if I issue: 1 1new will delete happen first, and then the add, or could it be that the add happens b

Re: Solr Logo thought

2008-08-08 Thread Mike Klaas
To me, the release timing doesn't much affect what logo we decided to use or when to adopt it. Surely the most visible, important location for the logo is on the website, that we can replace at any time? -Mike On 8-Aug-08, at 7:30 AM, Otis Gospodnetic wrote: I think you are right about fav

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-29 Thread Mike Klaas
On 28-Jul-08, at 11:16 PM, Britske wrote: That sounds interesting. Let me explain my situation, which may be a variant of what you are proposing. My documents contain more than 10.000 fields, but these fields are divided like: 1. about 20 general purpose fields, of which more than 1 can b

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Mike Klaas
On 28-Jul-08, at 1:53 PM, Britske wrote: Each query requests at most 20 stored fields. Why doesn't help lazyfieldloading in this situation? It does help, but not enough. With lots of data per document and not a lot of memory, it becomes probabilistically likely that each doc resides in a

Re: big discrepancy between elapsedtime and qtime although enableLazyFieldLoading= true

2008-07-28 Thread Mike Klaas
Another possibility is to partition the stored fields into a frequently-accessed set and a full set. If the frequently-accessed set is significantly smaller (in terms of # bytes), then the documents will be tightly-packed on disk and the os caching will be much more effective given the sam

Re: Seeking Anecdotes: Solr Plugins

2008-07-22 Thread Mike Klaas
On 22-Jul-08, at 4:34 PM, Chris Hostetter wrote: Hey everybody, I'll be giving a talk called "Apache Solr: Beyond the Box" at ApacheCon this year, which will focus on the how/when/why of writing Solr Plugins... http://us.apachecon.com/c/acus2008/sessions/10 I've got several use

Re: pf nixes fl

2008-07-22 Thread Mike Klaas
On 22-Jul-08, at 11:53 AM, Jason Rennie wrote: Just tried adding a pf field to my request handler. When I did this, solr returned all document fields for each doc (no "score") instead of returning the fields specified in fl. Bug? Feature? Anyone know what the reason for this behavior

Re: Specifying explicit FacetQuery w/ a normal query?

2008-07-22 Thread Mike Klaas
.query, it will only let you deal w/ the stuff you have already filtered out. I think what I was is possible, just need to dig in the code more. - Jon On Jul 21, 2008, at 9:14 PM, Mike Klaas wrote: On 17-Jul-08, at 6:27 AM, Jon Baer wrote: Ive gone from a complex multicore setup back to a

Re: Specifying explicit FacetQuery w/ a normal query?

2008-07-21 Thread Mike Klaas
On 17-Jul-08, at 6:27 AM, Jon Baer wrote: Ive gone from a complex multicore setup back to a single solrconfig setup and using a doctype field (since the index is pretty small), however there are a few spots where items are laid out in tabs and each tab has a count of docs associated, ie:

Re: How do you query for a string containing a colon?

2008-07-21 Thread Mike Klaas
On 21-Jul-08, at 2:15 PM, Ian Connor wrote: Hi, I am trying to query a "doi" field. However, a doi can contain a ":" colon character and the query parser throws an error at this point. How do you escape a colon? You can escape anything with a backslash, and most things with quotation marks
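Both approaches from the reply above (backslash escaping and quoting) might look like this on the client side. This is a minimal sketch: the set of special characters is an assumption based on the Lucene query syntax documentation, and the helper names `escape_term` and `quote_phrase` are hypothetical.

```python
import re

# Characters assumed to be query-parser syntax: + - ! ( ) { } [ ] ^ " ~ * ? : \ & |
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\&|])')


def escape_term(term):
    """Prefix each special character with a backslash."""
    return _SPECIAL.sub(r"\\\1", term)


def quote_phrase(text):
    """Wrap text in double quotes, escaping embedded backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


# Either form can be used as the value part of a fielded query.
escaped_query = "doi:" + escape_term("10.1000:182")
quoted_query = "doi:" + quote_phrase("10.1000:182")
```

Note that quoting turns the value into a phrase query, which may behave differently from an escaped single term depending on the field's analyzer.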

Re: Vote on a new solr logo

2008-07-21 Thread Mike Klaas
1, 2008 at 11:52 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote: I can't figure how to use the poll either... here are a few others to check out: http://lapnap.net/solr/ perhaps "a" and "f" could live together, you use 'a' if you need a background other than

Re: Vote on a new solr logo

2008-07-21 Thread Mike Klaas
On 20-Jul-08, at 6:19 PM, Mark Miller wrote: From the dev list: Shalin Shekhar Mangar: +1 for a new logo. It's a new release, let's have a new logo too! First step is to decide which one of these is more Solr-ish. I'm looking to improve the look of solr, so I am going to do my best to p

Re: OutOfMemoryError - Quick Fix: Increase HashDocSet

2008-07-17 Thread Mike Klaas
On 17-Jul-08, at 10:28 AM, Fuad Efendi wrote: Change it to higher value, for instance, 3. OpenBitSet is created for larger values and requires a lot of memory... Careful--hash sets of that size can be quite slow. It does make sense to bump up the value to 6000 or so for large

Re: Search slow on a field with many unique values (date)

2008-07-10 Thread Mike Klaas
On 10-Jul-08, at 4:55 PM, Galen Pahlke wrote: Hi all, I have an index with 40 million small records with about 10 fields each. As my index size grows, I've noticed that queries involving the date field ( range queries, order by, etc) are taking a disproportionately long time. Could this

Re: Do I need Searcher on indexing machine

2008-07-10 Thread Mike Klaas
On 10-Jul-08, at 4:16 AM, Gudata wrote: Why Solr is registering new searcher all the time. Is this overhead, and if yes, how to stop it? It is needed for deleteByQuery. Its overhead is negligible if unused. -Mike

Re: Bulk delete

2008-07-04 Thread Mike Klaas
Why? It is not reasonable in a distributed system to perform requests of unbounded size (not to say that it won't work). If the concern is throughput, large batches should be sufficient. -Mike On 4-Jul-08, at 9:06 AM, Jonathan Ariel wrote: Yes, I just wanted to avoid N requests and do ju

Re: Big slowdown with phrase queries

2008-07-03 Thread Mike Klaas
On 3-Jul-08, at 5:13 PM, Chris Harris wrote: That's pretty much impossible (way too small). Double check those numbers. I don't know where I got the above numbers. Sorry. Here are the real numbers: .tis file: 730MB .frq files: 10.1 GB .prx file: 43.2 GB Now keeping all *that* in RAM, t

Re: Slow deleteById request

2008-07-03 Thread Mike Klaas
On 1-Jul-08, at 10:44 PM, Chris Hostetter wrote: : Yes, updating to a newer version of nightly Solr build could solve the : problem, but I am a little afraid to do it since solr-trunk has switched to : lucene 2.4-dev. but did you check whether or not you have maxPendingDeletes configured

Re: Big slowdown with phrase queries

2008-07-03 Thread Mike Klaas
On 3-Jul-08, at 3:04 PM, Chris Harris wrote: Now I gather that phrase queries are inherently slower than non-phrase queries, but 1-3 orders of magnitude difference seems noteworthy. This is on Solr r654965, which I don't think is *too* far behind the trunk version. 1200Mb RAM allocated to Solr

Re: Solr Capabilities/Limitations

2008-07-01 Thread Mike Klaas
On 1-Jul-08, at 8:37 AM, Willie Wong wrote: I need to be able to search through terabytes of existing data. Documents may vary in size from 10 MB to 20 KB in size. Also at some point I'll also need to feed in approximately 1-5 million new documents a day. This depends grea

Re: Slow performance using MatchAllDocsQuery with filter query

2008-07-01 Thread Mike Klaas
On 1-Jul-08, at 12:25 PM, Guangwei Yuan wrote: I've noticed some bad performance in faceted browsing, when the query is empty (so the MatchAllDocsQuery is used) and there are only filter queries. An example of the search url is: http://hostname:8080/solr/select/?q=&qt=dismax&fq=color:%23000

Re: Limit Porter stemmer to plural stemming only?

2008-06-30 Thread Mike Klaas
If you find a solution that works well, I encourage you to contribute it back to Solr. Plural-only stemming is probably a common need (I've definitely wanted to use it before). cheers, -Mike On 30-Jun-08, at 2:25 AM, climbingrose wrote: Ok, it looks like step 1a in Porter algo does what I

Re: Can I add field compression without reindexing?

2008-06-25 Thread Mike Klaas
On 24-Jun-08, at 4:26 PM, Chris Harris wrote: I have an index that I eventually want to rebuild so I can set compressed=true on a couple of fields. It's not really practical to rebuild the whole thing right now, though. If I change my schema.xml to set compressed=true and then keep adding ne

Re: Cost of having fieldTypes defined but not used

2008-06-23 Thread Mike Klaas
On 23-Jun-08, at 7:05 PM, Norberto Meijome wrote: Hi all, I'm curious , what is the cost (memory / processing time @ load? performance hit ? ) of having several unused fieldTypes defined in schema.xml ? Affects startup time only, likely non-measurable. -Mike

Re: easiest roadmap for server deployment

2008-06-17 Thread Mike Klaas
On 17-Jun-08, at 12:55 AM, Bram de Jong wrote: It looks like all my tests with solr have been very conclusive: it's the way to go. Glad to hear it! Sadly enough, me nor our sysadmin have any experience with setting up tomcat, jetty, orion, . We have plenty of experience with other servers

Re: Release date of SOLR 1.3

2008-06-06 Thread Mike Klaas
We're basically in that state already for the trunk. I don't think that we need a separate branch unless there is a big movement toward starting a new big non-1.3 feature before 1.3 is released. If that happens, we'll see what needs to be done to keep development going. -Mike On 6-Jun-08

Re: highlighting fragment

2008-06-06 Thread Mike Klaas
On 5-Jun-08, at 8:31 PM, Kevin Xiao wrote: Hi, I have a question about highlighting fragment. I set hl.fragsize to 100, but the return is cut off from a middle of a sentence with correct search term highlighting though. Is there a way to make the cutoff to the beginning of a sentence? Set

Re: Release date of SOLR 1.3

2008-06-05 Thread Mike Klaas
On 5-Jun-08, at 4:47 PM, Ryan Grange wrote: It would be nice to see some kind of update to the Solr website regarding what's holding up a 1.3 release. I look at that a lot more often than I look at this mailing list to see whether or not there's a new version I should be looking to test o

Re: Release date of SOLR 1.3

2008-06-05 Thread Mike Klaas
The Solr website isn't really for disseminating that information (also, we aren't allowed to advertise unofficial builds very loudly. It's a legal thing.). If you want to know what is holding up release, check out JIRA:

Re: POSTing repeated fields to Solr

2008-06-04 Thread Mike Klaas
On 4-Jun-08, at 2:22 PM, Andrew Nagy wrote: Hello - I was wondering if there is a work around with POSTing repeated fields to Solr. I am using Jetty as my container with Solr 1.2. I tried something like: http://localhost:8080/solr/select/?q=author:(smith)&rows=0&start=0&facet=true&facet.mi

Re: Solr indexing configuration help

2008-06-02 Thread Mike Klaas
On 2-Jun-08, at 2:09 PM, Norskog, Lance wrote: Solr 1.2 ignores the 'number of documents' attribute. It honors the "every 30 minutes" attribute. Only if you specify both, I think. There was a bug in the implementation. -Mike Lance -Original Message- From: [EMAIL PROTECTED] [m

Re: Issuing queries during analysis?

2008-06-02 Thread Mike Klaas
On 30-May-08, at 9:51 PM, Dallan Quass wrote: One more clarification -- I don't need to do this for every token in the text; just for "place" fields in the document. Each document has 1-3 place fields that need to be converted to standard form when the document is indexed. There is a spe

Re: highlighting and hyperlink

2008-05-30 Thread Mike Klaas
On 30-May-08, at 2:25 PM, Kevin Xiao wrote: Hi I am not sure if there are any discussions about this, I could not find the search function in mailing list archives. :) Anyway, here is my problem: In my document, I have a hyperlink, say, breast cancer, but when I applied solr highlighti

Re: new user: some questions about parameters and query syntax

2008-05-30 Thread Mike Klaas
On 29-May-08, at 11:22 AM, Bram de Jong wrote: On Thu, May 29, 2008 at 6:40 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: I haven't been paying close attention to the uniformity of URL parameters, but if there is room for making them more uniform (e.g. always use singular, always use comma

Re: wildcard highlighting

2008-05-30 Thread Mike Klaas
On 30-May-08, at 6:45 AM, Stefan Oestreicher wrote: Hi, I've started to play around with Solr and I'm quite impressed with its performance and features. However it seems to me that highlighting of wildcard terms is not supported, which is somewhat disappointing. Are there any plans to suppor

Re: field normalization and omitNorms

2008-05-28 Thread Mike Klaas
On 27-May-08, at 3:16 PM, Phillip Farber wrote: Hi all, I've been looking without success for a simple explanation of the effect of omitNorms=false for a text field. Can someone point me to the relevant doc? The length of the field, as well as field and document boosts, will not affec

Re: SOLR OOM (out of memory) problem

2008-05-22 Thread Mike Klaas
On 22-May-08, at 4:27 AM, gurudev wrote: Hi Rong, My cache hit ratio are: filtercache: 0.96 documentcache:0.51 queryresultcache:0.58 Note that you may be able to reduce the _size_ of the document cache without materially affecting the hit rate, since typically some documents are much m

Re: SOLR OOM (out of memory) problem

2008-05-21 Thread Mike Klaas
more ram in order to do facet queries. This is with Solr 1.3. -Original Message- From: Mike Klaas [mailto:[EMAIL PROTECTED] Sent: Wednesday, May 21, 2008 11:23 AM To: solr-user@lucene.apache.org Subject: Re: SOLR OOM (out of memory) problem On 21-May-08, at 4:46 AM, gurudev wrote

Re: SOLR OOM (out of memory) problem

2008-05-21 Thread Mike Klaas
On 21-May-08, at 4:46 AM, gurudev wrote: Just to add more: The JVM heap allocated is 6GB with initial heap size as 2GB. We use quadro(which is 8 cpus) on linux servers for SOLR slaves. We use facet searches, sorting. document cache is set to 7 million (which is total documents in index) filte

Re: Fetching the first 10 results and the last result

2008-05-21 Thread Mike Klaas
On 21-May-08, at 2:35 AM, Tim Mahy wrote: Hi all, is there a way to let Solr not only return the total number of found articles, but also the data of the last document when for example only requesting the first 10 documents ? we could do this with a separate query by either letting the se

Re: Highlighting - field criteria highlights in other fields

2008-05-20 Thread Mike Klaas
On 20-May-08, at 12:31 AM, Tim Mahy wrote: Hi all, we have situation in which we have documents that have an introduction (text) , a body (text) and some meta data fields (integers mostly). when we create a query like this : q=( +(body_nl:( brussel) ) AND ( (+publicationid:("3430" OR "34

Re: Minion, anyone?

2008-05-20 Thread Mike Klaas
On 20-May-08, at 9:06 AM, Binkley, Peter wrote: Has anyone in the Solr community started looking at Sun's Minion (now released under GPL 2.0)? https://minion.dev.java.net/ And (dare I say it) might it be possible to wrap Minion into Solr as an alternative to Lucene? The Search Guy (Stephen

Re: Duplicates results when using a non optimized index

2008-05-15 Thread Mike Klaas
On 15-May-08, at 12:50 AM, Tim Mahy wrote: Hi, yep it is a very strange problem that we never encountered before. We are uploading all the documents again to see if that solves the problem (hoping that the delete will delete also the multiple document instances) If you are re-adding ever

Re: solr highlighting

2008-05-14 Thread Mike Klaas
The minimum "stuff" needed to highlight term X in field F is: field F must be 'stored' field F must have an analyzer defined a query with term X is sent (e.g., q=X) with parameters hl=true (or 'on'), hl.fl=F Try it on the example: 1. get the example running 2. cd example/exampledocs 3. ./post.sh
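The request side of the checklist above can be sketched as a small URL builder. The base URL, the field name `features`, and the helper name `highlight_url` are assumptions for illustration; only the `q`, `hl`, and `hl.fl` parameters come from the message itself.

```python
from urllib.parse import urlencode


def highlight_url(base, term, field):
    """Build a select URL that asks for `term` to be highlighted in `field`."""
    params = [
        ("q", term),       # a query containing term X
        ("hl", "true"),    # turn highlighting on
        ("hl.fl", field),  # field F: must be stored and have an analyzer
    ]
    return base + "?" + urlencode(params)


# Hypothetical local instance and field name.
url = highlight_url("http://localhost:8983/solr/select", "solr", "features")
```

Using `urlencode` rather than string concatenation keeps terms with spaces or reserved characters from breaking the request.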

Re: Selecting data with an order on string field causes slow commits from then on

2008-05-12 Thread Mike Klaas
This was answered yesterday on the list: http://www.nabble.com/Re%3A-exceeded-limit-of-maxWarmingSearchers-p17165631.html regards, -Mike On 12-May-08, at 6:12 PM, David Stevenson wrote: We have a table that has roughly 1M rows. If we run a query against the table and order by a string field

Re: result limit / diversity with an OR query

2008-05-12 Thread Mike Klaas
On 12-May-08, at 9:31 AM, s d wrote: Hi,I have a query similar to: x OR y OR z and i want to know if there is a way to make sure i get 1 result with x, 1 result with y and one with z ? The easiest way is to execute three separate queries: +x y z x +y z x y +z -Mike
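The three queries suggested above follow a simple pattern: each one marks a different term as required with `+` and leaves the others optional. A minimal sketch that generates them for any list of terms (the function name `one_required_each` is hypothetical):

```python
def one_required_each(terms):
    """Return one query string per term, with that term marked as required."""
    queries = []
    for required_index in range(len(terms)):
        parts = [
            "+" + term if index == required_index else term
            for index, term in enumerate(terms)
        ]
        queries.append(" ".join(parts))
    return queries


queries = one_required_each(["x", "y", "z"])
```

Each query would then be sent separately (for example with `rows=1`) and the top hit taken from each response.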

Re: Loading performance slowdown at ~ 400K documents

2008-05-12 Thread Mike Klaas
Glad to hear it. Incidentally, lowering maxBufferedDocs will reduce peak memory consumption during indexing, at a cost of slower indexing throughput. -Mike On 11-May-08, at 3:41 AM, Tracy Flynn wrote: Thanks for the replies. For a completely different reason, I happened to look at the me

Re: Function Query result

2008-05-09 Thread Mike Klaas
Thanks so much Umar! -Mike On 9-May-08, at 1:22 PM, Umar Shah wrote: Mike, as asked, I have added an example , hope it will be helpful to future users . thanks again. On Sat, May 10, 2008 at 12:11 AM, Mike Klaas <[EMAIL PROTECTED]> wrote: No problem. You can return the fav

Re: How Special Character '&' used in indexing

2008-05-09 Thread Mike Klaas
On 9-May-08, at 6:26 AM, Ricky wrote: I have tried sending the '&amp;' instead of '&' like the following, A &amp; K Inc. But i still get the same error ""entity reference name can not contain character ' A & .. Please use a library for doing xml encoding--there is absolutely no reason to do thi
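As an illustration of the advice above, an XML library can build the update message so that characters like `&` are escaped automatically. This sketch uses Python's standard `xml.etree.ElementTree`; the field name `name` and the helper `add_doc_xml` are hypothetical, and the `<add><doc><field>` layout is assumed from Solr's XML update format.

```python
import xml.etree.ElementTree as ET


def add_doc_xml(fields):
    """Serialize a dict of field values as an <add><doc>...</doc></add> message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value  # the library escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


payload = add_doc_xml({"name": "A & K Inc."})
```

The raw string is passed in unescaped; escaping it by hand first would produce a double-escaped value in the index.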

Re: Function Query result

2008-05-09 Thread Mike Klaas
how it was not evident from the wiki example, or i was too presumptuous ;-). -umar On Fri, May 9, 2008 at 2:53 AM, Mike Klaas <[EMAIL PROTECTED]> wrote: On 7-May-08, at 11:40 PM, Umar Shah wrote: That would be sufficient for my requirements, I'm using the following que

Re: Loading performance slowdown at ~ 400K documents

2008-05-09 Thread Mike Klaas
Hi Tracy, What is your Solr/Lucene version? Is the slowdown sustained or temporary (it is not strange to see a slowdown for a few minutes if a large segment merge is happening)? I disagree with Nick's advice of enabling autocommit. -Mike On 9-May-08, at 5:02 AM, Tracy Flynn wrote: Hi,

Re: Function Query result

2008-05-08 Thread Mike Klaas
On 7-May-08, at 11:40 PM, Umar Shah wrote: That would be sufficient for my requirements, I'm using the following query params q=*:* &_val_:function(blah, blah) &fl=*,score I'm not getting the value, value of score =1, am I missing something? Is that really what you are sending to Solr? The

Re: SOLR-470 & default value in schema with NOW (update)

2008-05-07 Thread Mike Klaas
On 7-May-08, at 5:04 AM, Daniel Papasian wrote: Chris Hostetter wrote: The two exceptions you cited both indicate there was at least one date instance with no millis included -- NOW can't do that. it always includes millis (even though it shouldn't). I've seen people suggest, for perform

Re: using solr as master for data storage/retrieval?

2008-05-07 Thread Mike Klaas
On 7-May-08, at 8:26 AM, Phillip Rhodes wrote: I currently have a java-based application that stores all objects on the file system (text, blobs) and uses lucene to search the objects. If I can store these objects in solr, I would greatly increase the scalability of my application. Woul

Re: multi-language searching with Solr

2008-05-07 Thread Mike Klaas
namic fields, you don't have to explicitly declare all fields: -Mike On 7-May-08, at 12:46 PM, Gereon Steffens wrote: I have the same requirement, and from what I understand the distributed search feature will help implementing this, by having one shard per language. Am I right? Ger

Re: Function Query result

2008-05-07 Thread Mike Klaas
On 7-May-08, at 5:01 AM, Umar Shah wrote: Hi, I want to use result of a function query based on multiple field values for ranking the results, can I also return the value of the computed function along with other fields of the documents returned. If you score documents based on a func

Re: Delete's increase while adding new documents

2008-05-07 Thread Mike Klaas
at which speed so that Solr stays stable ? We use Tomcat 5.5 and our java memory limit is 2gb. Greetings, Tim ____ From: Mike Klaas [EMAIL PROTECTED] Sent: Tuesday, 6 May 2008 20:17 To: solr-user@lucene.apache.org Subject: Re: Delete's increase wh

Re: multi-language searching with Solr

2008-05-06 Thread Mike Klaas
On 5-May-08, at 1:28 PM, Eli K wrote: Wouldn't this impact both indexing and search performance and the size of the index? It is also probable that I will have more than one free text field later on and with at least 20 languages this approach does not seem very manageable. Are there other opt

Re: Delete's increase while adding new documents

2008-05-06 Thread Mike Klaas
On 6-May-08, at 4:56 AM, Tim Mahy wrote: Hi all, it seems that we get errors during the auto-commit : java.io.FileNotFoundException: /opt/solr/upload/nl/archive/data/index/_4x.fnm (No such file or directory) at java.io.RandomAccessFile.open(Native Method) at java.io.RandomAcc

Re: Tokenize integers?

2008-05-05 Thread Mike Klaas
On 5-May-08, at 9:19 PM, Chris Hostetter wrote: : Just use fieldType="string", and send them to solr in a multivalued fashion: : : 1133field> : name="blah">999 But as the OP said: that requires preprocessing -- it would be nice if Solr would make this easier for you. Oh I see, I misinter
