date:20091027

Re: Solr Configuration Management

2009-10-27 Thread Licinio Fernández Maurelo

> are you referring to DIH? yes 2009/10/27 Noble Paul നോബിള്‍ नोब्ळ् > 2009/10/26 Licinio Fernández Maurelo : > > Hi there, > > > > i must enhance solr config deploys. > > > > I have a configuration file per environment and per role (Master-Slave) > so i > > want to separate DataSource definitio

solr cell/tika: pdf import with xml metatags

2009-10-27 Thread Markus.Rietzler

hi, we want to use SOLR as our intranet search engine. i downloaded the nightly build of solr 1.4. pdf extraction is done via Solr Cell/Tika. i can send the pdf via curl to solr. we do have a large set of meta-tags for all our intranet documents, including PDF, PPT etc. to import html files from our

Re: solr cell/tika: pdf import with xml metatags

2009-10-27 Thread Grant Ingersoll

On Oct 27, 2009, at 6:36 AM, wrote: hi, we want to use SOLR as our intranet search engine. i downloaded the nightly build of solr 1.4. pdf extraction is done via Solr Cell/Tika. i can send the pdf via curl to solr. we do have a large set of meta-tags for all our intranet documents, includ

AW: solr cell/tika: pdf import with xml metatags

2009-10-27 Thread Markus.Rietzler

thanxs, i know and read that page. sending additional meta-tags with the curl call is no problem. i only thought that there might be a way to use the xml-approach also with PDF files. i'll go the "curl"-way for that files. -- mit freundlichen Grüßen Markus Rietzler - Rechenzentrum der Finanzver

Re: AW: solr cell/tika: pdf import with xml metatags

2009-10-27 Thread Grant Ingersoll

You can send PDF files via SolrJ: http://www.lucidimagination.com/blog/2009/09/14/posting-rich-documents-to-apache-solr-using-solrj-and-solr-cell-apache-tika/ I'm sure the various other clients could do the same thing. All you really need is a way to upload the files. Still, sending lots o

Multifield query parser and phrase query behaviour from 1.3 to 1.4

2009-10-27 Thread Jérôme Etévé

Hi All, I'm using a multifield query parser to generate weighted queries across different fields. For instance, perl developer gives me: +(title:perl^10.0 keywords:perl company:perl^3.0) +(title:developer^10.0 keywords:developer company:developer^3.0) Either in solr 1.3 or solr 1.4 (from 12 oct

Re: Multifield query parser and phrase query behaviour from 1.3 to 1.4

2009-10-27 Thread Yonik Seeley

On Tue, Oct 27, 2009 at 8:44 AM, Jérôme Etévé wrote: > I don't really get why these two tokens are subsequently put together > in a phrase query. That's the way the Lucene query parser has always worked... phrase queries are made if multiple tokens are produced from one field query. > In solr 1.

Wildcard on first char, Possible Bug 1.4?

2009-10-27 Thread Patric.Wilms

Hey, i found out when i use the new feature solr.ReversedWildcardFilterFactory that the following happens: I query for thomas. nickname:thomas nickname:thomas nickname:thoma nickname:thoma We see the parsed String is thoma. I query for *thomas nickname:*thomas nickname:*thomas nickname:#1;s

Solr Cell on web-based files?

2009-10-27 Thread Insight 49, LLC

Hi, If I use the ExtractingRequestHandler on a local file (as shown in http://wiki.apache.org/solr/TikaExtractOnlyExampleOutput), all works well, but how do I do this for files located on a server? e.g. (works) curl http://localhost:8983

Re: Solr 1.4 (RC) performance on multi-CPU system

2009-10-27 Thread gabriele renzi

On Mon, Oct 26, 2009 at 10:43 PM, Yonik Seeley wrote: > On Mon, Oct 26, 2009 at 4:32 PM, Teruhiko Kurosaka wrote: >> Are multiple CPUs utilized at indexing time as well, or just by searcher? > > Yes, multiple CPUs are utilized for indexing. > > If you're using SolrJ, an easy way to exploit this

Re: Solr Cell on web-based files?

2009-10-27 Thread Grant Ingersoll

You might try remote streaming with Solr (see http://wiki.apache.org/solr/SolrConfigXml ). Otherwise, look into a crawler such as Nutch or Droids or Heritrix. -Grant On Oct 27, 2009, at 11:14 AM, Insight 49, LLC wrote: Hi, If I use the ExtractingRequestHandler

Re: Solr 1.4 (RC) performance on multi-CPU system

2009-10-27 Thread Yonik Seeley

On Tue, Oct 27, 2009 at 10:23 AM, gabriele renzi wrote: > On Mon, Oct 26, 2009 at 10:43 PM, Yonik Seeley > wrote: >> On Mon, Oct 26, 2009 at 4:32 PM, Teruhiko Kurosaka >> wrote: >>> Are multiple CPUs utilized at indexing time as well, or just by searcher? >> >> Yes, multiple CPUs are utilized f

AW: Solr Cell on web-based files?

2009-10-27 Thread Markus.Rietzler

curl reads from a local file or stdin, so if it is only a single file from a webserver you could do something like curl http://someserver/file.html/ | curl "http://localhost:8983/solr/update/extract?extractOnly=true" -F na...@- but this way no crawling, no link following etc... -- mit freundlic

Re: Solr Cell on web-based files?

2009-10-27 Thread Andrzej Bialecki

Grant Ingersoll wrote: You might try remote streaming with Solr (see http://wiki.apache.org/solr/SolrConfigXml). Otherwise, look into a crawler such as Nutch or Droids or Heritrix. Additionally, Nutch can be configured to send the crawled/parsed documents to Solr for indexing. -- Best reg

Re: Multifield query parser and phrase query behaviour from 1.3 to 1.4

2009-10-27 Thread Jérôme Etévé

Hum, That's probably because of our own customized types/tokenizers/filters. I tried reindexing and querying our data using the default solr type 'textgen' and it works fine. I need to investigate which features of the new lucene 2.9 API are not implemented in our own tokenizers etc... Thanks.

Re: Multifield query parser and phrase query behaviour from 1.3 to 1.4

2009-10-27 Thread Jérôme Etévé

Actually here is the difference between the textgen analysis pipeline and our: For the phrase "ingenieur d'affaire senior" , Our pipeline gives right after our tokenizer: term position 1 2 3 4 term text ingenieur d affaire senior 'd' and 'affaire' are separa

Re: field collapsing exception

2009-10-27 Thread Martijn v Groningen

Hi Joe, Ok that is not good. I took a look at it and this would mean that the variable docHeadCollapseGroupAssoc (declared at line 80 in FieldValueCountCollapseCollectorFactory) is null. I always thought that the variable was set, but that is not the case when no documents are being collapsed. I w

Greater-than and less-than in data import SQL queries

2009-10-27 Thread Andrew Clegg

Hi, If I have a DataImportHandler query with a greater-than sign in, like this: Everything's fine. However, if it contains a less-than sign: I get this exception: INFO: Processing configuration from solrconfig.xml: {config=dataconfig.xml} [Fatal Error] :240:129: The value o

Re: Solr Cell on web-based files?

2009-10-27 Thread Insight 49, LLC

Andrzej Bialecki wrote: Grant Ingersoll wrote: You might try remote streaming with Solr (see http://wiki.apache.org/solr/SolrConfigXml). Otherwise, look into a crawler such as Nutch or Droids or Heritrix. Additionally, Nutch can be configured to send the crawled/parsed documents to Solr for

Re: field collapsing exception

2009-10-27 Thread Martijn v Groningen

I have attached a new patch to SOLR-236 that fixes this issue. Cheers, Martijn 2009/10/27 Martijn v Groningen : > Hi Joe, > > Ok that is not good. I took a look at it and this would mean that the > variable docHeadCollapseGroupAssoc (declared at line 80 in > FieldValueCountCollapseCollectorFacto

Re : Solr - Plugin : QParserPlugin is not working..

2009-10-27 Thread Phanindra Reva

Hello All, I am a newbie, learning Solr - plugins concept. While following the tutorials on the same from http://wiki.apache.org/solr/SolrPlugins , I have tried to work on Query Parser plugin concept by extending QParserPlugin class. I have registered my custom plugin class in solrconf

Solr: How to create a core from the command line ?

2009-10-27 Thread SGE0

Hi, currently on dev. I use following command (in a browser) to create a new core (we use Solr in a multi-core mode): http:\\localhost:8080\solr-rel-1-00-7064\admin\cores?action=CREATE&name=test&instanceDir=c:\data\solr\cores\test&config=c:\data\solr\cores\test\conf\solrconfig.xml&schema=c:\data

facet.query and fq

2009-10-27 Thread David Giffin

Hi There, Is there a way to get facet.query= to ignore the fq= param? We want to do a query like this: select?fl=*&start=0&q=cool&fq=in_stock:true&facet=true&facet.query=in_stock:false&qt=dismax To understand the count of items not in stock, when someone has filtered items that are in stock. Or

Re: facet.query and fq

2009-10-27 Thread Jérôme Etévé

Hi, you need to 'tag' your filter and then exclude it from the faceting. An example here: http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters J. 2009/10/27 David Giffin : > Hi There, > > Is there a way to get facet.query= to ignore the fq= param? We want to > do a
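
A rough sketch of the tag-and-exclude approach, applied to the parameters from the original question. The tag name `stock` is an arbitrary label chosen here for illustration and is not taken from the thread; the `{!tag=...}` and `{!ex=...}` local-parameter syntax is the one described on the linked wiki page, and this only builds the query string rather than contacting a Solr server.

```python
from urllib.parse import urlencode, parse_qsl


def build_facet_query_params():
    """Build select parameters where facet.query ignores the tagged fq."""
    params = [
        ("q", "cool"),
        ("qt", "dismax"),
        ("start", "0"),
        ("fl", "*"),
        # Tag the filter so it can be referred to by name ("stock" is made up).
        ("fq", "{!tag=stock}in_stock:true"),
        ("facet", "true"),
        # Exclude the tagged filter when counting this facet query.
        ("facet.query", "{!ex=stock}in_stock:false"),
    ]
    return urlencode(params)


query_string = build_facet_query_params()
print("select?" + query_string)
```

With this shape the main result list is still restricted to in-stock items, while the facet count for `in_stock:false` should be computed as if that filter were absent.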

Re: facet.query and fq

2009-10-27 Thread David Giffin

Thanks, that was just what I was looking for! On Tue, Oct 27, 2009 at 1:27 PM, Jérôme Etévé wrote: > Hi, > > you need to 'tag' your filter and then exclude it from the faceting. > > An example here: > http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters > > J. > > 200

DIH out of memory exception

2009-10-27 Thread William Pierce

Folks: My db contains approx 6M records -- on average each is approx 1K bytes. When I use the DIH, I reliably get an OOM exception. The machine has 4 GB ram, my tomcat is set to use max heap of 2G. The option of increasing memory is not tenable coz as the number of documents grows I wi

Re: DIH out of memory exception

2009-10-27 Thread Gilbert Boyreau

Hi, I got the same problem using DIH with a large dataset in MySql database. Following : http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-implementation-notes.html, and looking at the java code, it appears that DIH use PreparedStatement in the JdbcDataSource. I set the batchsiz

Re: Greater-than and less-than in data import SQL queries

2009-10-27 Thread Erik Hatcher

Use &lt; instead of < in that attribute. That should fix the issue. Remember, it's an XML file, so it has to obey XML encoding rules which make it ugly but whatcha gonna do? Erik On Oct 27, 2009, at 11:50 AM, Andrew Clegg wrote: Hi, If I have a DataImportHandler query with a great
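
The XML escaping rule can be checked with a short script. This is only a sketch: the table name `nodes` and the entity name `node` are hypothetical (the thread mentions only a `node_depth` column), and the `entity` element with a `query` attribute is assumed to follow the usual DataImportHandler data-config layout.

```python
from xml.sax.saxutils import quoteattr
import xml.etree.ElementTree as ET


def entity_element(name, sql):
    """Return an <entity> element string with the SQL safely escaped."""
    # quoteattr adds the surrounding quotes and escapes <, > and &.
    return "<entity name=%s query=%s/>" % (quoteattr(name), quoteattr(sql))


sql = "select * from nodes where node_depth < 4"  # hypothetical query
entity_xml = entity_element("node", sql)
print(entity_xml)

# Parsing the element back yields the original, unescaped SQL.
round_tripped = ET.fromstring(entity_xml).get("query")
print(round_tripped)
```

Letting a library do the escaping avoids hand-editing the less-than sign, at the cost of generating the config fragment instead of typing it.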

Re: Solr: How to create a core from the command line ?

2009-10-27 Thread Jeremy Hinegardner

On Tue, Oct 27, 2009 at 10:03:23AM -0700, SGE0 wrote: > > Hi, > > currently on dev. I use following command (in a browser) to create a new > core (we use Solr in a multi-core mode): > > http:\\localhost:8080\solr-rel-1-00-7064\admin\cores?action=CREATE&name=test&instanceDir=c:\data\solr\cores\te

Re: Greater-than and less-than in data import SQL queries

2009-10-27 Thread Andrew Clegg

Heh, eventually I decided "where 4 > node_depth" was the most pleasing (if slightly WTF-ish) way of writing it... Cheers, Andrew. Erik Hatcher-4 wrote: > > Use &lt; instead of < in that attribute. That should fix the issue. > Remember, it's an XML file, so it has to obey XML encoding rule

Re: Solr: How to create a core from the command line ?

2009-10-27 Thread Jason Venner

wget or curl? -- Jason Venner Author: Pro Hadoop A howto guide to learning and using hadoop and map/reduce http://www.prohadoopbook.com/ a Ning network for Hadoop using Professionals On 10/27/09 10:03 AM, "SGE0" wrote: > > Hi, > > currently on dev. I use following command (in a browser) to

Re: Wildcard on first char, Possible Bug 1.4?

2009-10-27 Thread Chris Hostetter

: i found out when i use the new feature solr.ReversedWildcardFilterFactory that the following happens: Hmmm in spite of ReversedWildcardFilterFactory being something designed to be used as part of the analysis chain for a field *so* you can do leading wildcard queries on them, it doesn't c

Re: Re : Solr - Plugin : QParserPlugin is not working..

2009-10-27 Thread Grant Ingersoll

On Oct 27, 2009, at 12:58 PM, Phanindra Reva wrote: Hello All, I am a newbie, learning Solr - plugins concept. While following the tutorials on the same from http://wiki.apache.org/solr/SolrPlugins , I have tried to work on Query Parser plugin concept by extending QParserPlugin class.

Re: Solr: How to create a core from the command line ?

2009-10-27 Thread Chris Hostetter

: More appropriate would be to use a commandline utility like 'curl' to : create the new cores. : : curl 'http://production.example.com:8080/solr/admin/cores?rest-of-request' More specifically: HTTP is the only way to admin a Solr port while running. All of the existing scripts that come wi
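
Since core administration goes over HTTP, the request URL can also be assembled in a script and then handed to curl or any HTTP client. The sketch below only builds the URL and does not send it. The `action`, `name` and `instanceDir` parameters are the ones from the original question; the base URL and directory are placeholders, not values from the thread.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def create_core_url(base_url, name, instance_dir):
    """Build a CoreAdmin CREATE request URL with properly encoded parameters."""
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,
    }
    return base_url.rstrip("/") + "/admin/cores?" + urlencode(params)


# Placeholder host and path, for illustration only.
url = create_core_url("http://localhost:8080/solr", "test", "/data/solr/cores/test")
print(url)
```

Encoding the parameters this way sidesteps the quoting problems that backslashes and ampersands tend to cause when the URL is typed directly on a command line.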

Re: Wildcard on first char, Possible Bug 1.4?

2009-10-27 Thread Grant Ingersoll

On Oct 27, 2009, at 9:44 AM, wrote: Hey, i found out when i use the new feature solr.ReversedWildcardFilterFactory that the following happens: I query for thomas. nickname:thomas nickname:thomas nickname:thoma nickname:thoma We see the parsed String is thoma. I query for *thomas

Re: long startup time

2009-10-27 Thread Grant Ingersoll

How big is your index? Can you share your solrconfig? Have you looked at it in a profiler during this time? What is it doing? -Grant On Oct 26, 2009, at 8:32 PM, Teruhiko Kurosaka wrote: I've been testing Solr 1.4.0 (RC). After sometime, solr started to pause for a long time (a minutes or

Re: weird behaviour while inserting records into solr

2009-10-27 Thread Grant Ingersoll

On Oct 26, 2009, at 1:14 AM, Rakhi Khatwani wrote: Hi, i was trying to insert one million records in solr (keeping the id from 0 to 100). things were fine till it inserted (id = 523932). after that it started inserting it from 1 (i.e updating). i am not able to understand this b

Re: Wildcard on first char, Possible Bug 1.4?

2009-10-27 Thread Yonik Seeley

On Tue, Oct 27, 2009 at 4:14 PM, Grant Ingersoll wrote: > I'm pretty sure wildcard queries don't go through analysis, hence they are > probably not stemmed. Right - same thing would happen w/o the reverse filter. Also, wildcarding mixes poorly with stemming - trying to analyze won't fix the probl

Re: Is optimized?

2009-10-27 Thread Chris Hostetter

: If I issue two requests with no intervening changes to the : index, will the second optimize request be smart enough to not do : anything? the actual optimize is optimized, but postCommit and postOptimize event listeners will still be fired. -Hoss

Response XML Deserializing

2009-10-27 Thread Thomas Nguyen

Hello, Is there a solution packaged in SOLR that deserializes XML response documents into Lucene documents?

benchmarking tools

2009-10-27 Thread Mike Anderson

I've been making modifications here and there to the Solr source code in hopes to optimize for my particular setup. My goal now is to establish a decent benchmark toolset so that I can evaluate the observed performance increase before deciding whether to roll out. So far I've investigated Jmeter and Lucid

RE: long startup time

2009-10-27 Thread Teruhiko Kurosaka

> From: Grant Ingersoll [mailto:gsing...@apache.org] > Sent: Tuesday, October 27, 2009 1:15 PM > To: solr-user@lucene.apache.org > Subject: Re: long startup time > > How big is your index? Can you share your solrconfig? Have > you looked at it in a profiler during this time? What is it doing?

Re: Seattle / NW Hadoop, Lucene, Apache "Cloud Stack" Meetup, Wed Oct 28 6:45pm

2009-10-27 Thread Bradford Stephens

Hey guys! Don't forget this is tomorrow (Wednesday). See you there! Cheers, Bradford On Sun, Oct 18, 2009 at 5:10 PM, Bradford Stephens wrote: > Greetings, > > (You're receiving this e-mail because you're on a DL or I think you'd > be interested) > > It's time for another Hadoop/Lucene/Apache "C

problem using solr 1.4 multicore with shareSchema=true

2009-10-27 Thread Jeremy Hinegardner

Hi all, I was trying to use the new 'shareSchema=true' feature in solr 1.4 and it appears as though this will only happen in one configuration. I'd like someone to confirm this for me and then we can file a bug on it. This all happens in CoreContainer.create(). When you have shareSchema=true i

logging options for 1.3

2009-10-27 Thread Adamsky, Robert

Is there any way to get 1.3 Solr to use something other than java logging? Am running solr inside tomcat and would like logging for solr to be directed to one set of (rotated) log files and leave tomcat logging in its own log files. Also, with 1.4, I see it requires removal of jar and swapping in

Re: DIH out of memory exception

2009-10-27 Thread William Pierce

Hi, Gilbert: Thanks for your tip! I just tried it. Unfortunately, it does not work for me. I still get the OOM exception. How large was your dataset? And what were your machine specs? Cheers, - Bill -- From: "Gilbert Boyreau" Sent: Tuesda

Re: Response XML Deserializing

2009-10-27 Thread Mattmann, Chris A (388J)

Hi Thomas, If you check out SOLR-1516, I developed a custom response writer that simplifies this process. You basically have to implement an #emitDoc or an #emitDocList function in which you are handed the resultant o.a.l.Document List or o.a.l.Document object (on a per Document basis) and you

Re: benchmarking tools

2009-10-27 Thread Joshua Tuberville

Mike, For response times I would also look at java.net's Faban benchmarking framework. We use it extensively for our acceptance tests and tuning exercises. Joshua On Oct 27, 2009, at 1:59 PM, Mike Anderson wrote: > I've been making modifications here and there to the Solr source > code

MLT cross core

2009-10-27 Thread Adamsky, Robert

Have two cores with some common fields in their schemas. I want to perform a MLT query on one core and get results from the other schema. Both cores have same type of id. I saw this thread: http://www.nabble.com/Does-MoreLikeThis-support-sharding--td25378654.html This is not quite what I am do

Re: question about text field and WordDelimiterFilter in example schema.xml

2009-10-27 Thread Bill Au

I have been playing with this using the analysis.jsp. I am still not clear why we don't want to catenate at query time. Here is my example. With the current text field, the query term "iPhone" will not match document containing the string "iphone" because "iPhone" is analyzed into two terms: i(1

[ANNOUNCE] Lucene MeetUp in Oakland, CA - Tue Nov 3rd @ 8PM

2009-10-27 Thread Chris Hostetter

(cross posted to many user lists, please confine reply to gene...@lucene) There will be a Lucene meetup next week at ApacheCon in Oakland, CA on Tuesday, November 3rd. Meetups are free (the rest of the conference is not). See: http://wiki.apache.org/lucene-java/LuceneAtApacheConUs2009 For ot

Re: QTime always a multiple of 50ms ?

2009-10-27 Thread Lance Norskog

This is different for each model of computer. It has to do with exactly what chips are used. On Fri, Oct 23, 2009 at 10:42 AM, Jérôme Etévé wrote: > 2009/10/23 Andrzej Bialecki : >> Jérôme Etévé wrote: >>> >>> Hi all, >>> >>> I'm using Solr trunk from 2009-10-12 and I noticed that the QTime >>>

Re: problem using solr 1.4 multicore with shareSchema=true

2009-10-27 Thread Noble Paul നോബിള്‍ नोब्ळ्

hi, Looks like a bug. open an issue. On Wed, Oct 28, 2009 at 4:04 AM, Jeremy Hinegardner wrote: > Hi all, > > I was trying to use the new 'shareSchema=true' feature in solr 1.4 and > it appears as though this will only happen in one configuration. I'd like > someone to confirm this for me and th

Re: question about text field and WordDelimiterFilter in example schema.xml

2009-10-27 Thread Yonik Seeley

On Tue, Oct 27, 2009 at 10:31 PM, Bill Au wrote: > Here is my example. > With the current text field, the query term "iPhone" will not match document > containing the string "iphone" because "iPhone" is analyzed into two terms: > i(1) and phone(2). Right. The limitations are known... but we do

Re: problem using solr 1.4 multicore with shareSchema=true

2009-10-27 Thread Noble Paul നോബിള്‍ नोब्ळ्

I've opened an issue https://issues.apache.org/jira/browse/SOLR-1527 2009/10/28 Noble Paul നോബിള്‍ नोब्ळ् : > hi, > Looks like a bug. open an issue. > > > On Wed, Oct 28, 2009 at 4:04 AM, Jeremy Hinegardner > wrote: >> Hi all, >> >> I was trying to use the new 'shareSchema=true' feature in solr

Re: problem using solr 1.4 multicore with shareSchema=true

2009-10-27 Thread Jeremy Hinegardner

I just tested your patch, it works for me. I vote for getting this in on the 1.4 release :-) enjoy, -jeremy On Wed, Oct 28, 2009 at 09:15:03AM +0530, Noble Paul ? ?? wrote: > I've opened an issue https://issues.apache.org/jira/browse/SOLR-1527 > > 2009/10/

RE: "begins with" searches

2009-10-27 Thread Bernadette Houghton

Thanks Avlesh. The issue with not doing a phrase query on my "edgytext" field was that my parent application was adding an escape character to the quotation marks, and I was hoping to fix (or rather, work around) at the solr end to save maintenance overhead. But I've done a hack in the parent ap

Re: "begins with" searches

2009-10-27 Thread Avlesh Singh

> > My next issue relates to how to get the results of the author field come up > in a search across all fields. For example, a search on author:"Houghton, B" > (which uses the edgytext) yields 16 documents, but a search on > all:"Houghton, B" (which doesn't) yields only 9. I thought the solution >

Re: Environment Timezone being considered when using SolrJ

2009-10-27 Thread Chris Hostetter

: I've written a Unit Test in order to simulate the date processing. A high I think you are misunderstanding what your test is doing, but i'll get to that in a second... : level detail of this problem is that it occurs only when used the JavaBin : custom format (&wt=javabin), in this case the

Re: MLT cross core

2009-10-27 Thread Avlesh Singh

> > Have two cores with some common fields in their schemas. I want to perform > a MLT query on one core and get results from the other schema. Both cores > have same type of id. > Having the same "type of id" in two different cores is of no good for a MLT handler (which in-fact operates on one co

60 matches
