Re: Implementing a logout

2009-08-22 Thread Lance Norskog
Sorry, hit 'send' too soon. You can kill the servlet process, but it is much better to use the servlet container's shutdown protocol. On Sat, Aug 22, 2009 at 6:46 PM, Lance Norskog goks...@gmail.com wrote: There is no 'logout'. There is no permanent state in Solr beyond the Lucene index

Re: Passing a Cookie in SolrJ

2009-08-19 Thread Lance Norskog
CommonsHttpSolrServer and override the request method. Copy/paste the code from CommonsHttpSolrServer#request and make the changes. It is not an elegant way but it will work. -- Regards, Shalin Shekhar Mangar. -- Lance Norskog goks...@gmail.com

Re: Shutdown Solr

2009-08-19 Thread Lance Norskog
, /var/lock etc...) - ok then, Graceful Shutdown depends on how you started Tomcat. *No* application is graceful for kill -9. The whole point of kill -9 is that it's uncatchable. -- http://www.linkedin.com/in/paultomblin -- Lance Norskog goks...@gmail.com

Re: DataImportHandler ignoring most rows

2009-08-19 Thread Lance Norskog
name=Message pk=key query=...redacted... /entity /document /dataConfig -- - Noble Paul | Principal Engineer| AOL | http://aol.com -- Lance Norskog goks...@gmail.com

Re: Advice on updating solr indexes

2009-08-15 Thread Lance Norskog
, if there are any bugs or gotchas, please post a Jira issue. -- Lance Norskog goks...@gmail.com On Sat, Aug 15, 2009 at 7:38 AM, William Pierce evalsi...@hotmail.com wrote: Folks: In our app we index approx 50 M documents every so often. One of the fields in each document is called CompScore which

Re: How much data can Solr handle?

2009-06-26 Thread Lance Norskog
, distribution, copying or use of this communication without prior permission of the addressee is strictly prohibited. If you are not the intended addressee you must delete this e-mail and its attachments. -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: pk vs. uniqueKey with DIH delta-import

2009-06-20 Thread Lance Norskog
that indexing would, only for the field that matches Solr's uniqueKey setting would be necessary?? Erik -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: delta-import not giving updated records

2009-02-24 Thread Lance Norskog
the Solr - User mailing list archive at Nabble.com. -- --Noble Paul -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: New wiki pages

2009-02-08 Thread Lance Norskog
, you need to have two fields. Both should be processed similarly, but the phrase search field should not use stemming or stopwords. -Yonik -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: DIH using values from solrconfig.xml inside data-config.xml

2009-02-04 Thread Lance Norskog
for large scale data processing (In hindsight XPathEntityProcessor and XPathRecordReader should probably have been named StreamingXPathEntityProcessor and StreamingXPathRecordReader) thoughts? -Hoss -- --Noble Paul -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: New wiki pages

2009-02-04 Thread Lance Norskog
of millions of small records. These are not the final word on how to do production Solr. Enjoy, Lance Norskog On Mon, Feb 2, 2009 at 10:25 PM, Lance Norskog goks...@gmail.com wrote: http://wiki.apache.org/solr/SchemaDesign http://wiki.apache.org/solr/LargeIndexes http://wiki.apache.org/solr

Re: DIH using values from solrconfig.xml inside data-config.xml

2009-02-02 Thread Lance Norskog
. I guess we never thought someone would need to use a variable in the regex attribute :) -- Regards, Shalin Shekhar Mangar. -- --Noble Paul -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: Understanding Solr memory usage

2009-02-02 Thread Lance Norskog
still running out of heap space periodically. Thanks in advance for any help! -- Matt Wagner -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

New wiki pages

2009-02-02 Thread Lance Norskog
comments. For example: they are stupid, the wiki has no links to them and those links should be here, etc. -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: Optimizing Improving results based on user feedback

2009-02-01 Thread Lance Norskog
for your time! Matthew Runo Software Engineer, Zappos.com mr...@zappos.com - 702-943-7833 -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: Performance dead-zone due to garbage collection

2009-02-01 Thread Lance Norskog
://www.nabble.com/Performance-%22dead-zone%22-due-to-garbage-collection-tp21588427p21758001.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: Range search question

2009-02-01 Thread Lance Norskog
even seen this with date fields, which seems very odd (more data being returned than I expected). If you want to stick with string, index 011 instead of 11. Koji -- Lance Norskog goks...@gmail.com 650-922-8831 (US)
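The zero-padding advice can be illustrated with a small Python sketch (not from the thread). It assumes the field is compared as a plain string, so values order lexicographically; the width of 3 and the helper name `pad` are illustrative choices only.

```python
def pad(value: int, width: int = 3) -> str:
    """Zero-pad an integer so that string order matches numeric order."""
    return str(value).zfill(width)


raw = ["9", "11", "100"]
padded = [pad(int(v)) for v in raw]

# Unpadded strings sort character by character, not numerically.
print(sorted(raw))     # ['100', '11', '9']
# Padded strings sort in the same order as the numbers they represent.
print(sorted(padded))  # ['009', '011', '100']
```

The same padding has to be applied to the bounds of any range query against the field, otherwise the comparison is again between strings of different widths.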

Re: solr as the data store

2009-02-01 Thread Lance Norskog
to go back to a separate database model? Is there a critical need you can think is missing? -- Regards, Ian Connor -- Lance Norskog goks...@gmail.com 650-922-8831 (US)

Re: Help with Solr 1.3 lockups?

2009-01-19 Thread Lance Norskog
Java 1.5 has thread-locking bugs. Switching to Java 1.6 may cure this problem. On Thu, Jan 15, 2009 at 10:57 AM, Jerome L Quinn jlqu...@us.ibm.com wrote: Hi, all. I'm running solr 1.3 inside Tomcat 6.0.18. I'm running a modified query parser, tokenizer, highlighter, and have a

Re: Solr expert(s) needed

2009-01-09 Thread Lance Norskog
http://issues.apache.org/jira/browse/NUTCH-442 Haven't used Nutch. Can the Nutch-generated index be reverse-engineered into a Solr schema? In that case, you can just copy the Lucene index files away from Nutch and run them under Solr.

UUID field type documentation and ExtractingRequestHandler

2009-01-09 Thread Lance Norskog
The UUID field type is not documented on the Wiki. https://issues.apache.org/jira/browse/SOLR-308 The ExtractingRequestHandler creates its own UUID instead of using the UUID field type. http://issues.apache.org/jira/browse/SOLR-284

Re: Solr expert(s) needed

2009-01-09 Thread Lance Norskog
crawler do you use to generate index for Solr? Thanks a lot!! On Fri, Jan 9, 2009 at 8:04 PM, Lance Norskog goks...@gmail.com wrote: http://issues.apache.org/jira/browse/NUTCH-442 Haven't used Nutch. Can the Nutch-generated index be reverse-engineered into a Solr schema? In that case, you

[JOB] Solr expert available

2009-01-01 Thread Lance Norskog
Please contact me if you want to hire an expert Java and Solr developer/architect. My recent experience is with a half-billion record index at www.divvio.com, supplying our video search database to www.videocrawler.com. Thanks, Lance Norskog goks...@yahoo.com

RE: exceeded limit of maxWarmingSearchers=4

2008-12-11 Thread Lance Norskog
How big is your index? There is a variant of the Lucene disk accessors in the Lucene contrib area. It stores all of the index data directly in POJOs (Java objects) and does not marshal them into a disk-saveable format. The indexes are understandably larger, but all data added is automatically

RE: Multi Core - Max Core Count Recommendation

2008-12-10 Thread Lance Norskog
1) Our limit is: how big a file do we want to copy around? We switched to multiple indexes because of the logistics of replicating/backing up giant Lucene index files. 2) Searching takes a little memory, sorting takes a lot of memory, and faceting eats like a black hole. There is an

RE: Sorting on text-fields with international characters

2008-12-08 Thread Lance Norskog
Also, is there any way to get Solr to sort, i.e, á, à or â together with the regular a's? The ISOLatin1 filter downconverts these variants to the ASCII a letter. It does this in the index, not the stored data. This solves the Bjork/Bjork-umlaut problem: you can type either and find records for

RE: Russian stopwords

2008-12-06 Thread Lance Norskog
The default encoding on windows is not UTF-8. This causes various weirdness when you develop on Windows. This has helped me find all places in string-handling that need the encoding name parameter, so it's not all bad. Lance -Original Message- From: tushar kapoor [mailto:[EMAIL

RE: Dealing with field values as key/value pairs

2008-12-06 Thread Lance Norskog
This is really cool. U... How does it integrate with the Data Import Handler? Lance -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Friday, December 05, 2008 8:31 PM To: solr-user@lucene.apache.org Subject: Re: Dealing with field values as key/value pairs

RE: DataImportHandler: Deleteing from index and db; lastIndexed id feature

2008-12-02 Thread Lance Norskog
Does the DIH delta feature rewrite the delta-import file for each set of rows? If it does not, that sounds like a bug/enhancement. Lance -Original Message- From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 02, 2008 8:51 AM To: solr-user@lucene.apache.org

RE: idea about faceting

2008-11-22 Thread Lance Norskog
Index two fields instead of one. One field has ISOLatin1Filter, LowerCase etc. and one does not. Search the processed, filter on the raw one. For more searching power, you can even index a third field with the Soundex/Metaphone phoneme translators. -Original Message- From: Marc Sturlese

Boosting by field contents

2008-11-20 Thread Lance Norskog
Is it possible to boost a document by the contents of a field? Given the query: text field:value I want to return all documents with 'text'. Documents where 'field = value' boosted over documents where 'field = some other value'. This query does it: (text field:value)^100 (text

RE: Filtering on blank fields

2008-11-20 Thread Lance Norskog
The problem with a zero-length string is that it is also returned by: field:[* TO *]. So you don't know if you're doing this right or not. For those of us who cannot reindex at the drop of a hat, this is a big deal. We went with -1. Lance -Original Message- From: Manepalli, Kalyan

RE: filtering on blank OR specific range

2008-11-19 Thread Lance Norskog
Try: Type:blue OR -Type:[* TO *] You can't have a negative clause at the beginning. Yes, Lucene should barf about this. -Original Message- From: Geoffrey Young [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 19, 2008 12:17 PM To: solr-user@lucene.apache.org Subject: filtering on

RE: Solr security

2008-11-17 Thread Lance Norskog
About that read-only switch for Solr: one of the basic HTTP design guidelines is that GET should only return values, and should never change the state of the data. All changes to the data should be made with POST. (In REST style guidelines, PUT, POST, and DELETE.) This prevents you from passing

RE: DataImportHandler, custom properties

2008-11-14 Thread Lance Norskog
These are what you may be asking: 1) Do you wish to read records from the database that are already indexed, and you want to change the fields found and leave the rest of the Solr document? This would certainly be a worthwhile feature; there is a separate project to add 'altering existing

RE: Programatic way to know when an optimize is finished?

2008-11-14 Thread Lance Norskog
The 'optimize' http command blocks. If you script your automation, you can just call the http and then the next command in the script runs after the optimize finishes. Hours later, in our case. Lance -Original Message- From: Phillip Farber [mailto:[EMAIL PROTECTED] Sent: Friday,

RE: Query Performance while updating teh index

2008-11-12 Thread Lance Norskog
Yes, this is the cache autowarming. We turned this off and staged separate queries that pre-warm our standard queries. We are looking at pulling the query server out of the load balancer during this process; it is the most effective way to give fixed response time. Lance -Original

RE: Large Data Set Suggestions

2008-11-07 Thread Lance Norskog
, 2008 at 12:44 AM, Lance Norskog [EMAIL PROTECTED] wrote: You can also do streaming XML upload for the XML-based indexing. This can feed, say, 100k records in one XML file from a separate machine. All of these options ignore the case where there is an error in your input records vs. the schema

RE: Large Data Set Suggestions

2008-11-06 Thread Lance Norskog
You can also do streaming XML upload for the XML-based indexing. This can feed, say, 100k records in one XML file from a separate machine. All of these options ignore the case where there is an error in your input records vs. the schema. DIH gives up on an error. Streaming XML gives up on an

RE: DIH Http input bug - problem with two-level RSS walker

2008-11-03 Thread Lance Norskog
@lucene.apache.org Subject: Re: DIH Http input bug - problem with two-level RSS walker If you wish to create 1 doc per inner entity then set rootEntity=false for the entity outer. The exception is because the url is wrong On Sat, Nov 1, 2008 at 10:30 AM, Lance Norskog [EMAIL PROTECTED] wrote: I wrote

RE: SOLR Performance

2008-11-03 Thread Lance Norskog
The logistics of handling giant index files hit us before search performance. We switched to a set of indexes running inside one server (tomcat) instance with the Multicore+Distributed Search tools, with a frozen old index and a new index actively taking updates. The smaller new index takes much

DIH http input xpath syntax

2008-11-01 Thread Lance Norskog
The wiki page for the DIH handler mentions that the XML is parsed with a streaming parser and that the xpath parser only handles a subset of the xpath syntax. Which streaming parser is it and where would I find this subset documented? I tried a few things like the first entry and length of

RE: DIH and rss feeds

2008-10-31 Thread Lance Norskog
wrote: run full-import with clean=false for full-import clean is set to true by default and for delta-import clean is false by default. On Fri, Oct 31, 2008 at 9:16 AM, Lance Norskog [EMAIL PROTECTED] wrote: I have a DataImportHandler configured to index from an RSS feed

DIH Http input bug - problem with two-level RSS walker

2008-10-31 Thread Lance Norskog
I wrote a nested HttpDataSource RSS poller. The outer loop reads an rss feed which contains N links to other rss feeds. The nested loop then reads each one of those to create documents. (Yes, this is an obnoxious thing to do.) Let's say the outer RSS feed gives 10 items. Both feeds use the same

DIH and rss feeds

2008-10-30 Thread Lance Norskog
will have 200 documents? 'full-import' throws away the first 100. 'delta-import' is not implemented. What is the special trick here? I'm using the Solr-1.3.0 release. Thanks, Lance Norskog

RE: replication handler - compression

2008-10-28 Thread Lance Norskog
Aha! The hint to the actual problem: When compressed with winzip. You are running Solr on Windows. Snapshots don't work on Windows: they depend on a Unix file system feature. You may be copying the entire index. Not just that, it could be inconsistent. This is a fine topic for a best practices

RE: Issue with Query Parsing '+' works as 'OR'

2008-10-22 Thread Lance Norskog
URI encoding turns a space into a plus, then (maybe) Lucene takes that as a space. Also you want a + in front of first_name. A AND B - +first_name:joe++last_name:smith B AND maybe A - first_name:joe++last_name:smith Some of us need sample use cases to understand these things;
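The encoding trap described here can be reproduced with Python's standard library. This is only a sketch of form-style URL encoding in general, not of how Solr or any particular servlet container decodes requests; the query string is the one from the message.

```python
from urllib.parse import quote_plus, unquote_plus

query = "+first_name:joe +last_name:smith"

# Form-style encoding: a space becomes '+', and a literal '+' becomes '%2B'.
encoded = quote_plus(query)
print(encoded)  # %2Bfirst_name%3Ajoe+%2Blast_name%3Asmith

# Decoding restores the original query, required-clause markers included.
print(unquote_plus(encoded))

# An unencoded '+' sent on the URL is decoded as a space, so the
# "required" marker silently disappears.
print(repr(unquote_plus("+first_name:joe")))  # ' first_name:joe'
```

In short, a `+` that is meant as a query operator has to travel as `%2B`.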

RE: Sorting performance

2008-10-20 Thread Lance Norskog
According to previous posters on this topic, sorting requires an array with an entry per document in the entire index. Each entry has 32 bits for the 'int' type, and 32 bits plus the field representation length for other types. Not knowing Lucene internals I have a hard time believing that it really
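Taking the figures in this message at face value (one 32-bit entry per document in the whole index for an 'int' sort field), the memory cost is simple arithmetic. The sketch below is a back-of-the-envelope estimate only; the function name and the sample index sizes are illustrative, and it says nothing about the extra per-entry cost for other field types.

```python
def sort_array_bytes(num_docs: int, bytes_per_entry: int = 4) -> int:
    """Estimated size of a per-document sort array, in bytes.

    bytes_per_entry=4 corresponds to the 32-bit 'int' case described above.
    """
    return num_docs * bytes_per_entry


GIB = 2 ** 30
for docs in (1_000_000, 50_000_000, 500_000_000):
    size = sort_array_bytes(docs)
    print(f"{docs:>12,} docs -> {size:>13,} bytes ({size / GIB:.2f} GiB)")
```

The point of the estimate is that the cost scales with the size of the index, not with the number of hits for a given query.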

RE: Japonish language seems to don't work on solr 1.3

2008-10-20 Thread Lance Norskog
Also the MySQL table may not be in UTF-8 mode. -Original Message- From: Feak, Todd [mailto:[EMAIL PROTECTED] Sent: Monday, October 20, 2008 9:48 AM To: solr-user@lucene.apache.org Subject: RE: Japonish language seems to don't work on solr 1.3 I would look real closely at the data

RE: Solr has limit to number of returned results?

2008-10-10 Thread Lance Norskog
To select all, do star-colon-star *:* To select a negative clause do *:* AND -clause To select a wildcard, h* and h?* work fine. Star as the only character, or star or ? as the first character are not allowed. These blow up with too many clauses: H*? and H*H and H*H*. And when they don't

RE: Controlling Length of Text Snippets Before and After Highlighted Term

2008-10-03 Thread Lance Norskog
You could handle this problem with an XSL script on the output. It would scan for the highlighting markers and munge the text. I've done a few things with the XsltResponseWriter and I do not envy you this coding task :) but it is possible. http://wiki.apache.org/solr/XsltResponseWriter

RE: Problem restarting Solr after shutting it down.

2008-10-01 Thread Lance Norskog
We send it a normal kill, wait 30 seconds, then use a kill -9. This means we tell it to shut down, give it thirty seconds to do whatever it wants to, then forcefully kill it. I'm not sure we have ever seen the first 'normal kill' work, but we do it anyway. Lance -Original Message- From:
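The "normal kill, wait, then kill -9" sequence can be sketched in Python as follows. This is a generic process-handling sketch, not the script used in the thread; the function name `stop_gracefully` is hypothetical, and it assumes the server was started as a child process via `subprocess.Popen`. Using the servlet container's own shutdown mechanism is still preferable where it works.

```python
import subprocess


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float = 30.0) -> int:
    """Ask a child process to stop, then force it if it does not comply.

    Returns the process's exit code.
    """
    # On POSIX this sends SIGTERM, which the process can catch and handle.
    proc.terminate()
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # On POSIX this sends SIGKILL (kill -9), which cannot be caught.
        proc.kill()
        return proc.wait()
```

The 30-second default mirrors the wait described in the message; the right value depends on how long the server needs to flush and close its index.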

RE: Snappuller taking up CPU on master

2008-09-24 Thread Lance Norskog
rsync has an option to limit the transfer rate. You give a maximum bandwidth for it to use in the transfer. (Please do not post the same thing if you don't get a response.) -Original Message- From: rahul_k123 [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 24, 2008 10:57 AM To:

RE: Some new SOLR features

2008-09-17 Thread Lance Norskog
My vote is for dynamically scanning a directory of configuration files. When a new one appears, or an existing file is touched, load it. When a configuration disappears, unload it. This model works very well for servlet containers. Lance -Original Message- From: [EMAIL PROTECTED]

Solr 1.3 and Lucene 2.4 dev

2008-09-16 Thread Lance Norskog
Is it possible to run Solr 1.3 with Lucene 2.3.2, the last official release of Lucene? We're running into a problem with our very very large index and wonder if there is a bug in the development Lucene. Thanks, Lance Norskog

RE: Searching for future or null dates

2008-09-16 Thread Lance Norskog
If the query starts with a negative clause Lucene returns nothing. endDate:[NOW TO *] OR -endDate:[* TO *] Might work -Original Message- From: Kolodziej Christian [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 16, 2008 12:01 AM To: solr-user@lucene.apache.org Subject: AW: Searching

RE: Adding bias to Distributed search feature?

2008-09-15 Thread Lance Norskog
to Distributed search feature? On Thu, Sep 11, 2008 at 10:31 PM, Lance Norskog [EMAIL PROTECTED] wrote: Is it possible to add a bias to the ordering in the distributed search feature? That is, if the search finds the same content in two different indexes, it always favors the document from the first

Adding bias to Distributed search feature?

2008-09-11 Thread Lance Norskog
? Thanks, Lance Norskog

RE: AW: Cross-context-forward to solr-instance

2008-09-08 Thread Lance Norskog
You can give a default core set by adding a default parameter to the query in solrconfig.xml. This is hacky, but it gives you a set of cores instead of just one core. -Original Message- From: David Smiley @MITRE.org [mailto:[EMAIL PROTECTED] Sent: Monday, September 08, 2008 7:54 AM To:

RE: question about + and - in field queries

2008-08-28 Thread Lance Norskog
This is somewhere in the mail archives. The AND/OR/NOT syntax is binary. The +/- syntax is ternary: (+one -two three) means: must have one, cannot have two, things with three have a higher score. -Original Message- From: Lyman Hurd [mailto:[EMAIL PROTECTED] Sent: Thursday, August 28,
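The three-way meaning of `+`, `-`, and unmarked terms can be modelled with a toy matcher. This is a deliberately simplified sketch: it is not Lucene's query parser or its scoring formula, it handles only whitespace-separated single terms, and the function names are made up for illustration.

```python
def classify(query: str):
    """Split a query into (must, must_not, should) term lists."""
    must, must_not, should = [], [], []
    for term in query.split():
        if term.startswith("+"):
            must.append(term[1:])
        elif term.startswith("-"):
            must_not.append(term[1:])
        else:
            should.append(term)
    return must, must_not, should


def matches(doc_terms: set, query: str) -> bool:
    """True if the document satisfies the required and prohibited clauses."""
    must, must_not, should = classify(query)
    if any(t in doc_terms for t in must_not):
        return False
    if must:
        return all(t in doc_terms for t in must)
    # With no required clause, at least one optional term must match,
    # so a query made only of prohibited terms matches nothing.
    return any(t in doc_terms for t in should)


def score(doc_terms: set, query: str) -> int:
    """Toy score: the number of optional terms the document contains."""
    _, _, should = classify(query)
    return sum(1 for t in should if t in doc_terms)


query = "+one -two three"
print(matches({"one", "three"}, query), score({"one", "three"}, query))
print(matches({"one"}, query), score({"one"}, query))
print(matches({"one", "two"}, query))
```

A document with only `one` still matches, but the one that also contains `three` scores higher, which is the "things with three have a higher score" behaviour described above.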

RE: How to know if a field is null?

2008-08-25 Thread Lance Norskog
Has this been fixed in solr 1.3? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley Sent: Monday, August 25, 2008 5:44 AM To: solr-user@lucene.apache.org Subject: Re: How to know if a field is null? On Mon, Aug 25, 2008 at 5:33 AM, Erik Hatcher

RE: How to know if a field is null?

2008-08-23 Thread Lance Norskog
And, a negative query does not work, so if this is the only clause, you have to say: *:* AND -field:[* TO *] Where *:* is a special code for all documents. It's like learning a language: there is the normal grammar, there are the unusual cases, and then there are the bizarre slang expressions.

RE: shards and performance

2008-08-21 Thread Lance Norskog
We found that searching by itself was faster with the Distributed multicore search over three cores in the same servlet engine, than with just one core. Faceting and sorting use more memory than simple searches, and we could not do faceting on our one simple index. We needed this for data

RE: .wsdl for example....

2008-08-18 Thread Lance Norskog
Various Java web service libraries come with 'wsdl2java' and 'java2wsdl' programs. You just run 'java2wsdl' on the Java soap description. -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: Monday, August 18, 2008 6:53 PM To: solr-user@lucene.apache.org Subject: Re:

RE: Administrative questions

2008-08-13 Thread Lance Norskog
I wrote shell tasks that start, stop, and heartbeat the server and run them from cron (unix). Heartbeat means: 1) is the tomcat even running, 2) does tomcat return the Solr admin page, 3) does Solr return a search. For an indexer, 4) does solr return from a commit. Stopping the server via the
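The layered heartbeat described here (is the container up, does the admin page answer, does a search answer) might look like the following in Python. The original tasks were shell scripts that are not shown in the archive, so this is only a sketch: the host, port, and URL paths are assumptions about a default deployment and must be adjusted, and the commit check for an indexer is left out.

```python
import urllib.request

# Hypothetical base URL; not taken from the thread.
BASE = "http://localhost:8080/solr"
CHECKS = [
    ("admin page", BASE + "/admin/"),
    ("search", BASE + "/select?q=solr&rows=0"),
]


def http_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def heartbeat(checks=CHECKS) -> list:
    """Return the names of the checks that failed, in order."""
    return [name for name, url in checks if not http_ok(url)]
```

Run from cron, a non-empty result from `heartbeat()` would be the signal to alert or restart. Ordering the checks from cheapest to most specific makes it easier to tell a dead container from a Solr instance that is up but not answering queries.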

RE: Out of memory on Solr sorting

2008-07-29 Thread Lance Norskog
A sneaky source of OutOfMemory errors is the permanent generation. If you add this: -XX:PermSize=64m -XX:MaxPermSize=96m You will increase the size of the permanent generation. We found this helped. Also note that when you undeploy a war file, the old deployment has permanent storage

Simple mistake in Wiki

2008-07-24 Thread Lance Norskog
Should this refer to facet.mincount instead of facet.limit? The default is true if facet.limit is greater than 0, false otherwise. http://wiki.apache.org/solr/SimpleFacetParameters facet.sort Set to true, this parameter indicates that constraints should be sorted by their count. If false,

RE: UnicodeNormalizationFilterFactory

2008-06-24 Thread Lance Norskog
ISOLatin1AccentFilterFactory works quite well for us. It solves our basic euro-text keyboard searching problem, where protege should find protégé. (protege with two accents.) -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 24, 2008 4:05 PM To:
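The effect of accent folding on a search term can be demonstrated with Python's `unicodedata` module. This is not how ISOLatin1AccentFilterFactory is implemented; it is a generic Unicode-decomposition sketch that produces a similar result for common Latin accents, and the function name `fold_accents` is illustrative.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining accent marks, leaving the base letters."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_accents("protégé"))  # protege
print(fold_accents("Björk"))    # Bjork
```

For the search to work, the same folding has to be applied both when indexing and when analyzing the query, so that either spelling maps to the same indexed term.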

XSL scripting

2008-06-09 Thread Lance Norskog
This started out in the num-docs thread, but deserves its own. And a wiki page. There is a more complex and general way to get the number of documents in the index. I run a query against solr and postprocess the output with an XSL script. Install this xsl script as home/conf/xslt/numfound.xsl.

RE: Num docs

2008-06-07 Thread Lance Norskog
This appears in the stats.jsp page. Both the total of document 'slots' and the number of live documents. -Original Message- From: Marcus Herou [mailto:[EMAIL PROTECTED] Sent: Saturday, June 07, 2008 2:09 AM To: solr-user@lucene.apache.org Subject: Num docs Hi. Is there a way of

RE: How to describe 2 entities in dataConfig for the DataImporter?

2008-05-30 Thread Lance Norskog
You might try creating your whole transform as an SQL database view rather than with the Solr transformer toolkit. This would also make it easier to directly examine the data to be indexed. Lance -Original Message- From: Julio Castillo [mailto:[EMAIL PROTECTED] Sent: Thursday, May 29,

RE: Announcement of Solr Javascript Client

2008-05-27 Thread Lance Norskog
Nice! Another technique for the denial-of-service problem: you can regulate the number of simultaneous active servlets. Most servlet containers have a configuration for this somewhere. This will slow down legit users but will still avoid killing the server machine. -Original Message-

RE: Indexing HTML Content

2008-05-22 Thread Lance Norskog
The HTMLStripReader tool worked very well for us. It handles garbled HTML well. The only hole we found was that it does not find alt-text attributes for images. Also, note that this code is written as a Java Reader class rather than a Solr class. This makes it useful for other projects. Given the

RE: Solr feasibility with terabyte-scale data

2008-05-09 Thread Lance Norskog
page 'SchemaDesignTips'. Cheers, Lance Norskog

RE: Multiple Index creation

2008-05-07 Thread Lance Norskog
To search against multiple Solrs, you can use http://wiki.apache.org/solr/DistributedSearch in Solr 1.3. This is not tied to the MultiCore feature. -Original Message- From: Shalin Shekhar Mangar [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 06, 2008 9:28 PM To:

RE: Help optimizing

2008-05-06 Thread Lance Norskog
One cause of out-of-memory is multiple simultaneous requests. If you limit the query stream to one or two simultaneous requests, you might fix this. No, Solr does not have an option for this. The servlet containers have controls for this that you have to dig very deep to find. Lance Norskog

RE: Help optimizing

2008-05-06 Thread Lance Norskog
There are two integer types, 'sint' and 'integer'. On an integer, you cannot do a range check (that makes sense). But! Lucene sort makes an array of integers for every record. On an integer field, it creates an integer array. On any other kind of field, each array item has a lot more. So, if you

MultiCore and Distributed Search

2008-05-01 Thread Lance Norskog
Is Distributed Search in the main line yet? Is it considered useable? And, how closely does it match the Wiki entry? https://issues.apache.org/jira/browse/SOLR-303 http://wiki.apache.org/solr/DistributedSearch

MultiCore on Wiki

2008-04-30 Thread Lance Norskog
The MultiCore writeup on the Wiki (http://wiki.apache.org/solr/MultiCore) says: ... Configuration-core-dataDir The data directory for a given core. (optional) How can a core not have its own dataDir? What happens if this is not set? Cheers, Lance Norskog

RE: Solr with Auto-suggest

2008-04-25 Thread Lance Norskog
This is what the spellchecker does. It makes a separate Lucene index of n-gram letters and searches those. Works pretty well and it is outside the main index. I did an experimental variation indexing word pairs as phrases, and it worked well too. Lance Norskog -Original Message- From: Ryan

Lucene Modules - LucQE [lucky] Lucene Query Expansion Module

2008-04-24 Thread Lance Norskog
http://lucene-qe.sourceforge.net/ This is a much smarter technique for doing query expansion with synonyms, using Rocchio's Algorithm. Has anyone tried to shoehorn this into Solr? It's a little weird: it needs an analyser, a searcher, and a similarity function. It should be possible to refactor

Facet Query

2008-04-11 Thread Lance Norskog
What do facet queries do that is different from the regular query? What is a use case where I would use a facet.query in addition to the regular query? Thanks, Lance Norskog From the wiki: http://wiki.apache.org/solr/SimpleFacetParameters#head-529bb9b985632b36cbd46a37bde9753772e47cdd

Meta: Mail quirk of solr-user

2008-04-11 Thread Lance Norskog
Hi- When I reply to a solr-user mail, the To: address is the sender instead of solr-user. Didn't it used to be solr-user? Lance

RE: synonyms

2008-03-28 Thread Lance Norskog
Lucas- Your examples are Portuguese and Spanish. You might find a Spanish-language stemmer that follows the very rigid conjugation in Spanish (and I'm assuming in Portuguese as well). Spanish follows conjugation rules that embed much more semantics than English, so a huge number of synonyms can

RE: How to index multiple sites with option of combining results in search

2008-03-26 Thread Lance Norskog
In fact, 55m records works fine in Solr; assuming they are small records. The problem is that the index files wind up in the tens of gigabytes. The logistics of doing backups, snapping to query servers, etc. is what makes this index unwieldy, and why multiple shards are useful. Lance

RE: stopwords and phrase queries

2008-03-21 Thread Lance Norskog
. We solved this problem by making a separate indexed field with a simplified text type: no stopwords. Phrase searches go against the 'rawfield' and word searches go against it first. You may want to also filter out punctuation or Sound Of Music will not bring up Sound Of Music! Cheers, Lance

Preferential boosting

2008-03-20 Thread Lance Norskog
with duration 3 above the others? These do not work (at least for me): *:* OR duration:3^2.0 duration:[* TO *] duration:3^2.0 duration:3^2.0 OR -duration:3 Thanks, Lance Norskog

RE: Preferential boosting

2008-03-20 Thread Lance Norskog
at 3:13 PM, Lance Norskog [EMAIL PROTECTED] wrote: Suppose I have a schema with an integer field called 'duration'. I want to find all records, but if the duration is 3 I want those records to be boosted. The index has 10 records, with duration between 2 and 4. What is the query

RE: sort by index id descending?

2008-03-19 Thread Lance Norskog
... another magic field name like score ... This could be done with a separate magic punctuation like $score, $mean (the mean score), etc., so $docid would work. Cheers, Lance -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 18, 2008 9:01 PM To:

Finding an empty field

2008-03-13 Thread Lance Norskog
) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java: 159) Cheers, Lance Norskog

RE: Use of get instead of post may be causing some problems

2008-03-06 Thread Lance Norskog
I just switched to doing posts for queries. We have a bunch of filters etc. and Solr stopped working on tomcat. -Original Message- From: Benson Margulies [mailto:[EMAIL PROTECTED] Sent: Thursday, March 06, 2008 12:43 PM To: solr-user Subject: Use of get instead of post may be causing

RE: what's the schedule of the release of solr 1.3?

2008-03-01 Thread Lance Norskog
An alternative would be for someone to give a subversion checkout number against 1.3-dev which represents a solid working checkout. There are a lot of people using 1.3-dev in production, could you all please tell us what checkout number you are using? Cheers, Lance -Original Message-

Fastest Solr query

2008-03-01 Thread Lance Norskog
The fastest solr query I can find is any query on unused dynamic field name: unused_dynamic_field_s:3 Is there another query style that should be faster? See this line in http://wiki.apache.org/solr/SolrConfigXml <pingQuery>q=solr&amp;version=2.0&amp;start=0&amp;rows=0</pingQuery> A better ping

RE: escaping special chars in query

2008-02-19 Thread Lance Norskog
You may also use Unicode escapes: \u for example. -Original Message- From: Reece [mailto:[EMAIL PROTECTED] Sent: Tuesday, February 19, 2008 10:04 AM To: solr-user@lucene.apache.org Subject: Re: escaping special chars in query The bottom of the Lucene query syntax page:

RE: Questions about filters and scoring

2008-02-18 Thread Lance Norskog
3) But then would not 'certificate anystopword found' match your phrase? I wound up making a separate index without stopwords just so that my phrase lookups would work. (I do not have the luxury of re-indexing, so now I'm stuck with this design even if there is a better one.) I also made one

RE: solr to work for my web application

2008-02-13 Thread Lance Norskog
I strongly recommend that you switch from the latest nightly build to the Solr 1.2 release. Lance -Original Message- From: Thorsten Scherler [mailto:[EMAIL PROTECTED] Sent: Wednesday, February 13, 2008 4:03 AM To: solr-user@lucene.apache.org Subject: Re: solr to work for my web

RE: Performance help for heavy indexing workload

2008-02-12 Thread Lance Norskog
1) autowarming: it means that if you have a cached query or similar, and do a commit, it then reloads each cached query. This is in solrconfig.xml 2) sorting is a pig. A sort creates an array of N integers where N is the size of the index, not the query. If the sorted field is anything but an

RE: upgrading to lucene 2.3

2008-02-12 Thread Lance Norskog
What will this improve? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley Sent: Tuesday, February 12, 2008 6:48 AM To: solr-user@lucene.apache.org Subject: Re: upgrading to lucene 2.3 On Feb 12, 2008 9:25 AM, Robert Young [EMAIL PROTECTED]

RE: range vs. filter queries

2008-02-11 Thread Lance Norskog
Is it not possible to make a grid of your boxes? It seems like this would be a more efficient query: grid:N100_S50_E250_W412 This is how GIS systems work, right? Lance -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: Monday, February 11, 2008 6:13 PM To:

RE: Lucene index verifier

2008-02-08 Thread Lance Norskog
on performance/search/indexing. -Grant On Feb 7, 2008, at 11:15 PM, Lance Norskog wrote: (Sorry, my Lucene java-user access is wonky.) I would like to verify that my snapshots are not corrupt before I enable them. What is the simplest program to verify that a Lucene index
