Re: begins with searches

2009-10-27 Thread Avlesh Singh
You are right about the parsing of query terms without a double quote (solrQueryParser's defaultOperator has to be AND in your case). For the problem at hand, two things - 1. Do you have any reason for not doing a PhraseQuery (query terms enclosed in double quotes) on your edgytext field?

Re: Solr Configuration Management

2009-10-27 Thread Licinio Fernández Maurelo
are you referring to DIH? yes 2009/10/27 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com 2009/10/26 Licinio Fernández Maurelo licinio.fernan...@gmail.com: Hi there, i must enhance solr config deploys. I have a configuration file per environment and per role (Master-Slave) so i

Re: solr cell/tika: pdf import with xml metatags

2009-10-27 Thread Grant Ingersoll
On Oct 27, 2009, at 6:36 AM, markus.rietz...@rzf.fin-nrw.de markus.rietz...@rzf.fin-nrw.de wrote: hi, we want to use SOLR as our intranet search engine. i downloaded the nightly build of solr 1.4. pdf extraction is done via Solr Cell/Tika. i can send the pdf via curl to solr. we do have a

AW: solr cell/tika: pdf import with xml metatags

2009-10-27 Thread Markus.Rietzler
thanks, i know and have read that page. sending additional meta-tags with the curl call is no problem. i only thought that there might be a way to use the xml-approach also with PDF files. i'll go the curl-way for those files. -- Kind regards, Markus Rietzler - rietzler_software/

Re: AW: solr cell/tika: pdf import with xml metatags

2009-10-27 Thread Grant Ingersoll
You can send PDF files via SolrJ: http://www.lucidimagination.com/blog/2009/09/14/posting-rich-documents-to-apache-solr-using-solrj-and-solr-cell-apache-tika/ I'm sure the various other clients could do the same thing. All you really need is a way to upload the files. Still, sending lots

Multifield query parser and phrase query behaviour from 1.3 to 1.4

2009-10-27 Thread Jérôme Etévé
Hi All, I'm using a multifield query parser to generate weighted queries across different fields. For instance, perl developer gives me: +(title:perl^10.0 keywords:perl company:perl^3.0) +(title:developer^10.0 keywords:developer company:developer^3.0) Either in solr 1.3 or solr 1.4 (from 12 oct

Re: Multifield query parser and phrase query behaviour from 1.3 to 1.4

2009-10-27 Thread Yonik Seeley
On Tue, Oct 27, 2009 at 8:44 AM, Jérôme Etévé jerome.et...@gmail.com wrote: I don't really get why these two tokens are subsequently put together in a phrase query. That's the way the Lucene query parser has always worked... phrase queries are made if multiple tokens are produced from one field
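The rule described above (one token from a field gives a term query, several tokens give a phrase query) can be illustrated with a toy sketch. This is not the Lucene query parser; the function name field_clause and the token lists are hypothetical and only mimic the behaviour described in the thread.

```python
def field_clause(field, tokens):
    """Toy model: build the clause a parser would emit for one field.

    A single analyzed token becomes a plain term query; several tokens
    produced from the same input become a quoted phrase query.
    """
    if len(tokens) == 1:
        return "%s:%s" % (field, tokens[0])
    return '%s:"%s"' % (field, " ".join(tokens))


# "perl" analyzes to one token, so a term query results.
single = field_clause("title", ["perl"])
# An input split into two tokens by the analyzer yields a phrase query.
phrase = field_clause("title", ["d", "affaire"])
print(single)
print(phrase)
```

Under this model, any analyzer that splits one query word into several tokens silently turns that word into a phrase query on the field.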

Solr Cell on web-based files?

2009-10-27 Thread Insight 49, LLC
Hi, If I use the ExtractingRequestHandler http://wiki.apache.org/solr/ExtractingRequestHandler on a local file (as shown in http://wiki.apache.org/solr/TikaExtractOnlyExampleOutput), all works well, but how do I do this for files located on a server? e.g. (works) curl

Re: Solr 1.4 (RC) performance on multi-CPU system

2009-10-27 Thread gabriele renzi
On Mon, Oct 26, 2009 at 10:43 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Oct 26, 2009 at 4:32 PM, Teruhiko Kurosaka k...@basistech.com wrote: Are multiple CPUs utilized at indexing time as well, or just by searcher? Yes, multiple CPUs are utilized for indexing. If you're

Re: Solr Cell on web-based files?

2009-10-27 Thread Grant Ingersoll
You might try remote streaming with Solr (see http://wiki.apache.org/solr/SolrConfigXml ). Otherwise, look into a crawler such as Nutch or Droids or Heritrix. -Grant On Oct 27, 2009, at 11:14 AM, Insight 49, LLC wrote: Hi, If I use the ExtractingRequestHandler
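A minimal sketch of what a remote-streaming request might look like, assuming remote streaming has been enabled in solrconfig.xml (the enableRemoteStreaming attribute of requestParsers) and that the extracting handler is mapped to /update/extract as in the example setup. The helper name extract_remote_url is hypothetical; the host names are taken from the other messages in this thread. The code only builds the URL and sends nothing.

```python
from urllib.parse import urlencode


def extract_remote_url(solr_base, file_url):
    """Build an extract-only request that asks Solr to fetch file_url itself."""
    params = {"stream.url": file_url, "extractOnly": "true"}
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


remote_url = extract_remote_url(
    "http://localhost:8983/solr", "http://someserver/file.html"
)
print(remote_url)
```

The remote URL has to be percent-encoded because it travels as a parameter value, which urlencode takes care of here.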

Re: Solr 1.4 (RC) performance on multi-CPU system

2009-10-27 Thread Yonik Seeley
On Tue, Oct 27, 2009 at 10:23 AM, gabriele renzi rff@gmail.com wrote: On Mon, Oct 26, 2009 at 10:43 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Oct 26, 2009 at 4:32 PM, Teruhiko Kurosaka k...@basistech.com wrote: Are multiple CPUs utilized at indexing time as well, or just

AW: Solr Cell on web-based files?

2009-10-27 Thread Markus.Rietzler
curl reads from a local file or stdin, so if it is only a single file from a web server you could do something like: curl http://someserver/file.html/ | curl http://localhost:8983/solr/update/extract?extractOnly=true; -F na...@- but this way there is no crawling, no link following, etc... -- mit

Re: Solr Cell on web-based files?

2009-10-27 Thread Andrzej Bialecki
Grant Ingersoll wrote: You might try remote streaming with Solr (see http://wiki.apache.org/solr/SolrConfigXml). Otherwise, look into a crawler such as Nutch or Droids or Heritrix. Additionally, Nutch can be configured to send the crawled/parsed documents to Solr for indexing. -- Best

Re: Multifield query parser and phrase query behaviour from 1.3 to 1.4

2009-10-27 Thread Jérôme Etévé
Hum, That's probably because of our own customized types/tokenizers/filters. I tried reindexing and querying our data using the default solr type 'textgen' and it works fine. I need to investigate which features of the new lucene 2.9 API are not implemented in our own tokenizers etc... Thanks.

Re: Multifield query parser and phrase query behaviour from 1.3 to 1.4

2009-10-27 Thread Jérôme Etévé
Actually here is the difference between the textgen analysis pipeline and ours: For the phrase ingenieur d'affaire senior , Our pipeline gives right after our tokenizer: term position 1 2 3 4 term text ingenieur d affaire senior 'd' and 'affaire' are

Re: Solr Cell on web-based files?

2009-10-27 Thread Insight 49, LLC
Andrzej Bialecki wrote: Grant Ingersoll wrote: You might try remote streaming with Solr (see http://wiki.apache.org/solr/SolrConfigXml). Otherwise, look into a crawler such as Nutch or Droids or Heritrix. Additionally, Nutch can be configured to send the crawled/parsed documents to Solr

Re : Solr - Plugin : QParserPlugin is not working..

2009-10-27 Thread Phanindra Reva
Hello All, I am a newbie, learning Solr - plugins concept. While following the tutorials on the same from http://wiki.apache.org/solr/SolrPlugins , I have tried to work on Query Parser plugin concept by extending QParserPlugin class. I have registered my custom plugin class in

Solr: How to create a core from the command line ?

2009-10-27 Thread SGE0
Hi, currently on dev. I use following command (in a browser) to create a new core (we use Solr in a multi-core mode):

facet.query and fq

2009-10-27 Thread David Giffin
Hi There, Is there a way to get facet.query= to ignore the fq= param? We want to do a query like this: select?fl=*&start=0&q=cool&fq=in_stock:true&facet=true&facet.query=in_stock:false&qt=dismax To understand the count of items not in stock, when someone has filtered items that are in stock. Or is

Re: facet.query and fq

2009-10-27 Thread David Giffin
Thanks, that was just what I was looking for! On Tue, Oct 27, 2009 at 1:27 PM, Jérôme Etévé jerome.et...@gmail.com wrote: Hi,  you need to 'tag' your filter and then exclude it from the faceting.  An example here:
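The tag-and-exclude approach mentioned in the reply can be sketched as follows, assuming the Solr 1.4 local-params syntax ({!tag=...} on the filter, {!ex=...} on the facet). The tag name "stock" and the helper name facet_ignoring_filter are hypothetical; the query values come from the original question.

```python
from urllib.parse import urlencode, parse_qs


def facet_ignoring_filter(q, filter_query, facet_query, tag="stock"):
    """Build a select query string whose facet.query ignores the tagged fq."""
    params = [
        ("q", q),
        # Tag the filter so it can be referred to by name.
        ("fq", "{!tag=%s}%s" % (tag, filter_query)),
        ("facet", "true"),
        # Exclude the tagged filter when computing this facet count.
        ("facet.query", "{!ex=%s}%s" % (tag, facet_query)),
        ("qt", "dismax"),
    ]
    return "select?" + urlencode(params)


query = facet_ignoring_filter("cool", "in_stock:true", "in_stock:false")
print(query)
```

With this shape the result list is still filtered to in-stock items, while the facet count for in_stock:false is computed as if the filter were absent.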

DIH out of memory exception

2009-10-27 Thread William Pierce
Folks: My db contains approx 6M records -- on average each is approx 1K bytes. When I use the DIH, I reliably get an OOM exception. The machine has 4 GB ram, my tomcat is set to use max heap of 2G. The option of increasing memory is not tenable coz as the number of documents grows I

Re: DIH out of memory exception

2009-10-27 Thread Gilbert Boyreau
Hi, I got the same problem using DIH with a large dataset in a MySQL database. Following : http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-implementation-notes.html, and looking at the java code, it appears that DIH uses PreparedStatement in the JdbcDataSource. I set the

Re: Greater-than and less-than in data import SQL queries

2009-10-27 Thread Erik Hatcher
Use &lt; instead of < in that attribute. That should fix the issue. Remember, it's an XML file, so it has to obey XML encoding rules which make it ugly but whatcha gonna do? Erik On Oct 27, 2009, at 11:50 AM, Andrew Clegg wrote: Hi, If I have a DataImportHandler query with a
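A small sketch of the escaping rule described above, using Python's standard library to produce a value that is safe inside an XML attribute. The SQL text, the table name nodes and the entity name node are hypothetical placeholders, not the original poster's configuration.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical query containing a less-than sign.
sql = "select * from nodes where node_depth < 4"

# quoteattr escapes &, < and > and wraps the value in quotes.
attr = quoteattr(sql)
entity = "<entity name=\"node\" query=%s/>" % attr
print(entity)
```

Generating the attribute this way avoids hand-editing entities when queries change.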

Re: Greater-than and less-than in data import SQL queries

2009-10-27 Thread Andrew Clegg
Heh, eventually I decided where 4 > node_depth was the most pleasing (if slightly WTF-ish) way of writing it... Cheers, Andrew. Erik Hatcher-4 wrote: Use &lt; instead of < in that attribute. That should fix the issue. Remember, it's an XML file, so it has to obey XML encoding rules

Re: Wildcard on first char, Possible Bug 1.4?

2009-10-27 Thread Chris Hostetter
: i found out when i use the new feature solr.ReversedWildcardFilterFactory that the following happens: Hmmm in spite of ReversedWildcardFilterFactory being something designed to be used as part of the analysis chain for a field *so* you can do leading wildcard queries on them, it doesn't

Re: Re : Solr - Plugin : QParserPlugin is not working..

2009-10-27 Thread Grant Ingersoll
On Oct 27, 2009, at 12:58 PM, Phanindra Reva wrote: Hello All, I am a newbie, learning Solr - plugins concept. While following the tutorials on the same from http://wiki.apache.org/solr/SolrPlugins , I have tried to work on Query Parser plugin concept by extending QParserPlugin

Re: Solr: How to create a core from the command line ?

2009-10-27 Thread Chris Hostetter
: More appropriate would be to use a commandline utility like 'curl' to : create the new cores. : : curl 'http://production.example.com:8080/solr/admin/cores?rest-of-request' More specifically: HTTP is the only way to admin a Solr port while running. All of the existing scripts that come
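As a rough illustration of what the elided part of such a request might contain, the sketch below builds a CoreAdmin CREATE URL. The action, name and instanceDir parameters are the usual CoreAdmin ones; the core name core1 and the helper create_core_url are hypothetical, and the code only prints the URL rather than calling the server.

```python
from urllib.parse import urlencode


def create_core_url(base, name, instance_dir):
    """Build the CoreAdmin URL that creates a core called `name`."""
    params = {"action": "CREATE", "name": name, "instanceDir": instance_dir}
    return base.rstrip("/") + "/admin/cores?" + urlencode(params)


core_url = create_core_url(
    "http://production.example.com:8080/solr", "core1", "core1"
)
print(core_url)
```

The printed URL can then be passed to curl or any other HTTP client from a deployment script.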

Re: Wildcard on first char, Possible Bug 1.4?

2009-10-27 Thread Grant Ingersoll
On Oct 27, 2009, at 9:44 AM, patric.wi...@rtl.de patric.wi...@rtl.de wrote: Hey, i found out when i use the new feature solr.ReversedWildcardFilterFactory that the following happens: I query for thomas. <str name="rawquerystring">nickname:thomas</str> <str name="querystring">nickname:thomas</str>

Re: long startup time

2009-10-27 Thread Grant Ingersoll
How big is your index? Can you share your solrconfig? Have you looked at it in a profiler during this time? What is it doing? -Grant On Oct 26, 2009, at 8:32 PM, Teruhiko Kurosaka wrote: I've been testing Solr 1.4.0 (RC). After some time, solr started to pause for a long time (a minutes

Re: weird behaviour while inserting records into solr

2009-10-27 Thread Grant Ingersoll
On Oct 26, 2009, at 1:14 AM, Rakhi Khatwani wrote: Hi, i was trying to insert one million records in solr (keeping the id from 0 to 100). things were fine till it inserted (id = 523932). after that it started inserting it from 1 (i.e updating). i am not able to understand this

Re: Wildcard on first char, Possible Bug 1.4?

2009-10-27 Thread Yonik Seeley
On Tue, Oct 27, 2009 at 4:14 PM, Grant Ingersoll gsing...@apache.org wrote: I'm pretty sure wildcard queries don't go through analysis, hence they are probably not stemmed. Right - same thing would happen w/o the reverse filter. Also, wildcarding mixes poorly with stemming - trying to analyze

Re: Is <optimize/> optimized?

2009-10-27 Thread Chris Hostetter
: If I issue two <optimize/> requests with no intervening changes to the : index, will the second optimize request be smart enough to not do : anything? the actual optimize is optimized, but postCommit and postOptimize event listeners will still be fired. -Hoss

Response XML Deserializing

2009-10-27 Thread Thomas Nguyen
Hello, Is there a solution packaged in SOLR that deserializes XML response documents into Lucene documents?

benchmarking tools

2009-10-27 Thread Mike Anderson
I've been making modifications here and there to the Solr source code in hopes of optimizing for my particular setup. My goal now is to establish a decent benchmark toolset so that I can evaluate the observed performance increase before deciding whether to roll out. So far I've investigated Jmeter and

RE: long startup time

2009-10-27 Thread Teruhiko Kurosaka
From: Grant Ingersoll [mailto:gsing...@apache.org] Sent: Tuesday, October 27, 2009 1:15 PM To: solr-user@lucene.apache.org Subject: Re: long startup time How big is your index? Can you share your solrconfig? Have you looked at it in a profiler during this time? What is it doing? The

Re: Seattle / NW Hadoop, Lucene, Apache Cloud Stack Meetup, Wed Oct 28 6:45pm

2009-10-27 Thread Bradford Stephens
Hey guys! Don't forget this is tomorrow (Wednesday). See you there! Cheers, Bradford On Sun, Oct 18, 2009 at 5:10 PM, Bradford Stephens bradfordsteph...@gmail.com wrote: Greetings, (You're receiving this e-mail because you're on a DL or I think you'd be interested) It's time for another

problem using solr 1.4 multicore with shareSchema=true

2009-10-27 Thread Jeremy Hinegardner
Hi all, I was trying to use the new 'shareSchema=true' feature in solr 1.4 and it appears as though this will only happen in one configuration. I'd like someone to confirm this for me and then we can file a bug on it. This all happens in CoreContainer.create(). When you have shareSchema=true

logging options for 1.3

2009-10-27 Thread Adamsky, Robert
Is there any way to get 1.3 Solr to use something other than java logging? Am running solr inside tomcat and would like logging for solr to be directed to one set of (rotated) log files and leave tomcat logging in its own log files. Also, with 1.4, I see it requires removal of jar and swapping

Re: DIH out of memory exception

2009-10-27 Thread William Pierce
Hi, Gilbert: Thanks for your tip! I just tried it. Unfortunately, it does not work for me. I still get the OOM exception. How large was your dataset? And what were your machine specs? Cheers, - Bill -- From: Gilbert Boyreau

Re: Response XML Deserializing

2009-10-27 Thread Mattmann, Chris A (388J)
Hi Thomas, If you check out SOLR-1516, I developed a custom response writer that simplifies this process. You basically have to implement an #emitDoc or an #emitDocList function in which you are handed the resultant o.a.l.Document List or o.a.l.Document object (on a per Document basis) and you

Re: benchmarking tools

2009-10-27 Thread Joshua Tuberville
Mike, For response times I would also look at java.net's Faban benchmarking framework. We use it extensively for our acceptance tests and tuning exercises. Joshua On Oct 27, 2009, at 1:59 PM, Mike Anderson wrote: I've been making modifications here and there to the Solr source code

MLT cross core

2009-10-27 Thread Adamsky, Robert
Have two cores with some common fields in their schemas. I want to perform a MLT query on one core and get results from the other schema. Both cores have same type of id. I saw this thread: http://www.nabble.com/Does-MoreLikeThis-support-sharding--td25378654.html This is not quite what I am

Re: question about text field and WordDelimiterFilter in example schema.xml

2009-10-27 Thread Bill Au
I have been playing with this using the analysis.jsp. I am still not clear why we don't want to catenate at query time. Here is my example. With the current text field, the query term iPhone will not match a document containing the string iphone because iPhone is analyzed into two terms: i(1) and

[ANNOUNCE] Lucene MeetUp in Oakland, CA - Tue Nov 3rd @ 8PM

2009-10-27 Thread Chris Hostetter
(cross posted to many user lists, please confine reply to gene...@lucene) There will be a Lucene meetup next week at ApacheCon in Oakland, CA on Tuesday, November 3rd. Meetups are free (the rest of the conference is not). See: http://wiki.apache.org/lucene-java/LuceneAtApacheConUs2009 For

Re: QTime always a multiple of 50ms ?

2009-10-27 Thread Lance Norskog
This is different for each model of computer. It has to do with exactly what chips are used. On Fri, Oct 23, 2009 at 10:42 AM, Jérôme Etévé jerome.et...@gmail.com wrote: 2009/10/23 Andrzej Bialecki a...@getopt.org: Jérôme Etévé wrote: Hi all,  I'm using Solr trunk from 2009-10-12 and I

Re: problem using solr 1.4 multicore with shareSchema=true

2009-10-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
hi, Looks like a bug. open an issue. On Wed, Oct 28, 2009 at 4:04 AM, Jeremy Hinegardner jer...@hinegardner.org wrote: Hi all, I was trying to use the new 'shareSchema=true' feature in solr 1.4 and it appears as though this will only happen in one configuration. I'd like someone to confirm

Re: question about text field and WordDelimiterFilter in example schema.xml

2009-10-27 Thread Yonik Seeley
On Tue, Oct 27, 2009 at 10:31 PM, Bill Au bill.w...@gmail.com wrote: Here is my example. With the current text field, the query term iPhone will not match a document containing the string iphone because iPhone is analyzed into two terms: i(1) and phone(2). Right. The limitations are known...
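The mismatch discussed above can be reproduced with a toy approximation of splitting on case changes. This is not the real WordDelimiterFilter, which has many more options; the function split_on_case_change is a hypothetical stand-in that only splits at a lowercase-to-uppercase transition and lowercases the parts.

```python
import re


def split_on_case_change(term):
    """Toy analyzer: split at lower->upper transitions, then lowercase."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", term)
    return spaced.lower().split()


query_tokens = split_on_case_change("iPhone")   # query side
index_tokens = split_on_case_change("iphone")   # indexed text
print(query_tokens, index_tokens)

# With no catenation at query time, the two token sets share no term.
overlap = set(query_tokens) & set(index_tokens)
print(overlap)
```

In this simplified model, the query looks for "i" followed by "phone" while the index holds only "iphone", which is why the document is not found.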

Re: problem using solr 1.4 multicore with shareSchema=true

2009-10-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
I've opened an issue https://issues.apache.org/jira/browse/SOLR-1527 2009/10/28 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com: hi, Looks like a bug. open an issue. On Wed, Oct 28, 2009 at 4:04 AM, Jeremy Hinegardner jer...@hinegardner.org wrote: Hi all, I was trying to use the new

Re: problem using solr 1.4 multicore with shareSchema=true

2009-10-27 Thread Jeremy Hinegardner
I just tested your patch, it works for me. I vote for getting this in on the 1.4 release :-) enjoy, -jeremy On Wed, Oct 28, 2009 at 09:15:03AM +0530, Noble Paul ? ?? wrote: I've opened an issue https://issues.apache.org/jira/browse/SOLR-1527 2009/10/28

RE: begins with searches

2009-10-27 Thread Bernadette Houghton
Thanks Avlesh. The issue with not doing a phrase query on my edgytext field was that my parent application was adding an escape character to the quotation marks, and I was hoping to fix (or rather, work around) at the solr end to save maintenance overhead. But I've done a hack in the parent

Re: begins with searches

2009-10-27 Thread Avlesh Singh
My next issue relates to how to get the results of the author field come up in a search across all fields. For example, a search on author:Houghton, B (which uses the edgytext) yields 16 documents, but a search on all:Houghton, B (which doesn't) yields only 9. I thought the solution should

Re: MLT cross core

2009-10-27 Thread Avlesh Singh
Have two cores with some common fields in their schemas. I want to perform a MLT query on one core and get results from the other schema. Both cores have same type of id. Having the same type of id in two different cores is of no use to a MLT handler (which in fact operates on one core)