Re: A Newbie Question

2010-11-12 Thread Lance Norskog
Using 'curl' is fine. There is a library called SolrJ for Java and other libraries for other scripting languages that let you upload with more control. There is a thing in Solr called the DataImportHandler that lets you script walking a file system. On Thu, Nov 11, 2010 at 8:38 PM, K. Seshadri

Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-12 Thread Lance Norskog
I think you have to compile all of the stempel source including your filter factory into one jar at the same time. Everybody does this; I don't know how different Java versions make class file binaries. On Thu, Nov 11, 2010 at 3:06 AM, Jakub Godawa jakub.god...@gmail.com wrote: Hi! Sorry for

Re: solr 1.3 how to parse rich documents

2010-11-12 Thread Lance Norskog
Did you do a 'commit' after this? If there is no error in the log or the HTTP response, the document should get added. Solr 1.4.1 has much newer versions of this software. On Thu, Nov 11, 2010 at 6:46 AM, Nikola Garafolic nikola.garafo...@srce.hr wrote: Hi, I use solr 1.3 with patch for

Re: Rollback can't be done after committing?

2010-11-12 Thread Michael McCandless
In fact Lucene can rollback to a previous commit. You just need to use a deletion policy that preserves past commits (the default policy only keeps the most recent commit). Once you have multiple commits in the index you can do fun things like open an IndexReader on an old commit, rollback (open

full text search in multiple fields

2010-11-12 Thread PeterKerk
I want to provide a full text search function. This function has to search through the 2 fields: title and description that I have defined in my schema.xml (both of type string). Now, since Solr doesn't (by default) provide an OR operator, I thought I should somehow combine these fields into 1

Re: full text search in multiple fields

2010-11-12 Thread Tommaso Teofili
Hi, 2010/11/12 PeterKerk vettepa...@hotmail.com I want to provide a full text search function. This function has to search through the 2 fields: title and description that I have defined in my schema.xml (both of type string). Now, since Solr doesn't (by default) provide an OR operator,

Re: full text search in multiple fields

2010-11-12 Thread Ahmet Arslan
--- On Fri, 11/12/10, PeterKerk vettepa...@hotmail.com wrote: From: PeterKerk vettepa...@hotmail.com Subject: full text search in multiple fields To: solr-user@lucene.apache.org Date: Friday, November 12, 2010, 1:32 PM I want to provide a full text search function. This function has

Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-12 Thread Jakub Godawa
Am I not doing it in the point no 4? I am compiling all the folder that was extracted before, but now with that new class file. 2010/11/12 Lance Norskog goks...@gmail.com: I think you have to compile all of the stempel source including your filter factory into one jar at the same time.

Re: Issue with facet fields

2010-11-12 Thread gauravshetti
thanks. got it!

Re: A Newbie Question

2010-11-12 Thread K. Seshadri Iyer
Hi Lance, Thank you very much for responding (not sure how I reply to the group, so, writing to you). Can you please expand on your suggestion? I am not a web guy and so, don't know where to start. What is the difference between SolrJ and DataImportHandler? Do I need to set up web servers on

Assistance required fine-tuning nutch/solr - (paid work)

2010-11-12 Thread Jean-Luc
I require the expertise of a developer who can assist with fine-tuning my nutch/solr setup. I have the basics working but I think I probably need a custom nutch plugin written. If you're interested please contact me: jeanluct [at] gmail . com Hope it's ok to post this here - I'm not a

Re: full text search in multiple fields

2010-11-12 Thread Erick Erickson
In addition to the other replies, do be careful about string types. It's probably not what you want as it indexes the entire input as a single token. For instance, indexing "great expectations" as a string type would NOT get you a hit when searching for "great". Think about a text type instead... And
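To make the string-versus-text point concrete, here is a toy sketch in Python. It is not how Solr is implemented; the function names are made up for illustration, and the "text" analysis is assumed to be nothing more than lowercasing and splitting on whitespace.

```python
# Toy model only: real Solr analysis is configured per field type in schema.xml.

def index_as_string(value):
    # A string field keeps the entire input as one single token.
    return {value}

def index_as_text(value):
    # A text field (assumed here: lowercase + whitespace split) yields one token per word.
    return set(value.lower().split())

def matches(tokens, term):
    # A term query hits only if the exact term is among the indexed tokens.
    return term in tokens

doc = "great expectations"
print(matches(index_as_string(doc), "great"))               # False
print(matches(index_as_string(doc), "great expectations"))  # True
print(matches(index_as_text(doc), "great"))                 # True
```

With the string type, only a query for the whole value matches, which is why a text type is usually the better fit for full-text search.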

Re: A Newbie Question

2010-11-12 Thread Erick Erickson
Think of the data import handler (DIH) as Solr pulling data to index from some source based on configuration. So, once you set up your DIH config to point to your file system, you issue a command to Solr like "OK, do your data import thing". See the FileListEntityProcessor.

Re: WELCOME to solr-user@lucene.apache.org

2010-11-12 Thread Solr User
Ahmet, Thanks for the reply. select/?q=built+to+last&defType=dismax&qf=searchFields^0.2+title^20&debugQuery=on For some reason if I use title field in my query I don't get any results. I am copying all searchable fields into searchFields field. So I am able to search only in the searchFields

Re: Looking for help with Solr implementation

2010-11-12 Thread Shalin Shekhar Mangar
On Thu, Nov 11, 2010 at 7:52 PM, AC acanuc...@yahoo.com wrote: Hi, Not sure if this is the correct place to post but I'm looking for someone to help finish a Solr install on our LAMP based website.  This would be a paid project. The programmer that started the project got too busy with

Re: Looking for help with Solr implementation

2010-11-12 Thread Shashi Kant
Have you tried posting on odesk.com? I have had decent success finding Solr/Lucene resources there. On Thu, Nov 11, 2010 at 7:52 PM, AC acanuc...@yahoo.com wrote: Hi, Not sure if this is the correct place to post but I'm looking for someone to help finish a Solr install on our LAMP based

Re: WELCOME to solr-user@lucene.apache.org

2010-11-12 Thread Ahmet Arslan
select/?q=built+to+last&defType=dismax&qf=searchFields^0.2+title^20&debugQuery=on For some reason if I use title field in my query I don't get any results. I am copying all searchable fields into searchFields field. So I am able to search only in the searchFields field not in any other
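The dismax request quoted above carries four parameters (q, defType, qf, debugQuery). A minimal sketch of assembling it in Python follows, so the separators and the escaping of `^` and spaces are handled by the library rather than by hand. The helper name `build_dismax_url` is hypothetical; the field names and boosts are the ones from the message.

```python
from urllib.parse import urlencode, parse_qs

def build_dismax_url(user_query, field_boosts, handler="/select/"):
    # qf is a space-separated list of field^boost pairs.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts)
    params = [
        ("q", user_query),
        ("defType", "dismax"),
        ("qf", qf),
        ("debugQuery", "on"),
    ]
    # urlencode joins the pairs with '&' and percent-encodes unsafe characters.
    return handler + "?" + urlencode(params)

url = build_dismax_url("built to last", [("searchFields", "0.2"), ("title", "20")])
print(url)
# Decoding the query string again recovers the original qf value.
print(parse_qs(url.split("?", 1)[1])["qf"])
```

Whether the title field then returns hits still depends on its field type in schema.xml, as discussed in this thread.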

Re: WELCOME to solr-user@lucene.apache.org

2010-11-12 Thread Solr User
Ahmet, In production system we are using /spell/?q=built+to+last so that we can check the spelling. We are not using /select?q=built+to+last Can I use dismax with /spell? I understood from your reply that I need to change my schema.xml and modify the field types. Do I need to still use the

Doubt about index size

2010-11-12 Thread Claudio Devecchi
Hi everybody, I'm doing some indexing testing on solr 1.4.1 and I'm not understanding one thing, let me try to explain. I have 1.2 million xml files and I'm indexing them. When I do it for the first time my index size is around 3 GB and in my statistics on http://localhost:8983/solr/admin/stats.jsp

RE: Corename after Swap in MultiCore

2010-11-12 Thread sivaram
Do you mean solr.core.name has the wrong value after the swap? You swapped doc-temp so now it's doc and solr.core.name is still doc-temp? This completely contradicts my experience, what version of solr are you using? Why use postCommit? You're running the risk of performing a swap when you don't

Re: Looking for help with Solr implementation

2010-11-12 Thread Abe Couse
Thanks for the replies, I will try odesk. Haven't had any luck with the contact list in the wiki, tried contacting a few people listed and no replies. On Nov 12, 2010, at 7:16 AM, Shashi Kant sk...@sloan.mit.edu wrote: Have you tried posting on odesk.com? I have had decent success finding

RE: Doubt about index size

2010-11-12 Thread Burton-West, Tom
Hi Claudio, What's happening when you re-index the documents is that Solr/Lucene implements an update as a delete plus a new index. Because of the nature of inverted indexes, deleting documents requires a rewrite of the entire index. In order to avoid rewriting the entire index each time one
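A toy model of the "update is a delete plus a new add" behaviour, written in Python. It only illustrates the bookkeeping and is not Lucene's actual segment format; the class `ToyIndex` and its methods are invented for this sketch. The two counters loosely correspond to what the Solr statistics page reports as numDocs (live documents) and maxDoc (including deleted ones).

```python
class ToyIndex:
    def __init__(self):
        # Each entry is [doc_id, deleted_flag]; entries are never edited in place.
        self.entries = []

    def add(self, doc_id):
        # Re-adding an existing id marks the old copy as deleted, then appends a new copy.
        for entry in self.entries:
            if entry[0] == doc_id and not entry[1]:
                entry[1] = True
        self.entries.append([doc_id, False])

    def live_docs(self):
        return sum(1 for _, deleted in self.entries if not deleted)

    def stored_docs(self):
        # Deleted copies still take up space until they are purged.
        return len(self.entries)

    def optimize(self):
        # Rewrites the index, dropping everything marked as deleted.
        self.entries = [entry for entry in self.entries if not entry[1]]

index = ToyIndex()
for doc_id in ("a", "b", "c"):
    index.add(doc_id)          # first indexing run
for doc_id in ("a", "b", "c"):
    index.add(doc_id)          # re-index the same documents
print(index.stored_docs(), index.live_docs())  # 6 3
index.optimize()
print(index.stored_docs(), index.live_docs())  # 3 3
```

This is why the index can grow on a second indexing pass even though the number of searchable documents stays the same.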

analyzer type

2010-11-12 Thread gauravshetti
Can you please help me distinguish between the analyzer types? I am not able to find documentation for them. I want to add solr.HTMLStripCharFilterFactory in the schema.xml file, and I can see two types defined for analyzer in my schema.xml: <analyzer type="index"> and <analyzer type="query">

Re: Corename after Swap in MultiCore

2010-11-12 Thread Shawn Heisey
On 11/7/2010 9:11 AM, Ephraim Ofir wrote: Do you mean solr.core.name has the wrong value after the swap? You swapped doc-temp so now it's doc and solr.core.name is still doc-temp? This completely contradicts my experience, what version of solr are you using? Why use postCommit? You're running

Re: Doubt about index size

2010-11-12 Thread Claudio Devecchi
Hi Tom, thanks for your explanation. Do you recommend the index continues this way? Or can I configure it to optimize automatically? tks On Fri, Nov 12, 2010 at 2:39 PM, Burton-West, Tom tburt...@umich.edu wrote: Hi Claudio, What's happening when you re-index the documents is that

Re: WELCOME to solr-user@lucene.apache.org

2010-11-12 Thread Ahmet Arslan
/spell/?q=built+to+last so that we can check the spelling. We are not using /select?q=built+to+last Can I use dismax with /spell? Yes you can. I understood from your reply that I need to change my schema.xml and modify the field types. Correct. Make them full-text searchable. string

Searching with AND + OR and spaces

2010-11-12 Thread Jon Drukman
I want to search two fields for the phrase "Call Of Duty". I tried this: (title:"Call of Duty" OR subhead:"Call of Duty") No matches, despite the fact that there are many documents that should match. So I left out the quotes, and it seems to work. But now when I try doing things like title:Call of
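A small Python sketch of building the quoted form of this query. One thing worth keeping in mind with the standard Lucene query syntax: without the quotes, `title:Call of Duty` applies `title:` only to `Call`, and the remaining words go to the default search field, which may be why the unquoted version appears to work. The helper names below are hypothetical.

```python
from urllib.parse import quote_plus

def phrase_clause(field, phrase):
    # Escape backslashes and double quotes so the phrase stays a single quoted unit.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'

def either_field(fields, phrase):
    # OR together one phrase clause per field.
    return "(" + " OR ".join(phrase_clause(f, phrase) for f in fields) + ")"

q = either_field(["title", "subhead"], "Call of Duty")
print(q)              # (title:"Call of Duty" OR subhead:"Call of Duty")
print(quote_plus(q))  # URL-encoded form, suitable as the value of the q parameter
```

If the quoted form still returns nothing, the field types and stop-word handling are the next things to check.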

Re: analyzer type

2010-11-12 Thread Tomas Fernandez Lobbe
For a field type, the analysis applied at index time (when you are adding documents to Solr) can be slightly different from the analysis applied at query time (when a user executes a query). For example, if you know you are going to be indexing html pages, you might need to use the
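A toy Python illustration of why index-time and query-time analysis can differ. It assumes the index-time chain strips HTML before tokenizing while the query-time chain does not; the regex is only a crude stand-in for what solr.HTMLStripCharFilterFactory does, and all function names are invented for this sketch.

```python
import re

def strip_html(text):
    # Crude stand-in for an HTML-stripping char filter: replace tags with spaces.
    return re.sub(r"<[^>]+>", " ", text)

def tokenize(text):
    # Shared step: lowercase and split on whitespace.
    return text.lower().split()

def index_analyzer(text):
    # Index time: documents are HTML pages, so strip markup first.
    return tokenize(strip_html(text))

def query_analyzer(text):
    # Query time: users type plain text, so no HTML stripping is needed.
    return tokenize(text)

indexed = index_analyzer("<p>Great <b>Expectations</b></p>")
print(indexed)                                                     # ['great', 'expectations']
print(all(t in indexed for t in query_analyzer("Expectations")))   # True
```

The two chains differ, but they must still produce compatible tokens, otherwise queries will not match what was indexed.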

Re: Searching with AND + OR and spaces

2010-11-12 Thread Ahmet Arslan
(title:"Call of Duty" OR subhead:"Call of Duty") No matches, despite the fact that there are many documents that should match. Field types of title and subhead are important here. Do you use stopwordfilterfactory with enable position increments? What is your Solr version? So I left out the

Re: Searching with AND + OR and spaces

2010-11-12 Thread Tomas Fernandez Lobbe
Hi Jon, for the first query: title:"Call of Duty" OR subhead:"Call of Duty" If you are sure that you have documents with the same phrase, make sure you don't have a problem with stop words and with token positions. I recommend that you check the analysis page in the Solr admin. Pay special attention

Re: Searching with AND + OR and spaces

2010-11-12 Thread Jon Drukman
Ahmet Arslan iorixxx at yahoo.com writes: (title:"Call of Duty" OR subhead:"Call of Duty") No matches, despite the fact that there are many documents that should match. Field types of title and subhead are important here. Do you use stopwordfilterfactory with enable position

Shuffle results a little

2010-11-12 Thread David Yang
Hi, I am interested in using solr to return search results for products. Is there any feature which will allow the result to be spread/shuffled around a little? The problem is that there are lots of results for one brand, but there are lots of other brands a few pages later. Is it possible to

Re: Doubt about index size

2010-11-12 Thread Erick Erickson
It's probably a good idea to optimize. How are you re-indexing anyway? DIH? custom code? post.jar? Manual optimizing is just issuing the appropriate curl command, see: http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22 Best Erick On Fri, Nov 12, 2010 at 12:13 PM,
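For anyone who would rather script this than type the curl command, here is a minimal Python sketch that builds the same kind of XML update message. The base URL assumes the default example setup (localhost:8983, as in the stats URL mentioned earlier in this thread) and the standard /update handler; adjust both for your install. The helper name `build_update_request` is hypothetical, and the request is only constructed here, not sent.

```python
from urllib.request import Request

def build_update_request(message, base_url="http://localhost:8983/solr"):
    # Solr's XML update handler accepts messages such as <commit/> and <optimize/>.
    return Request(
        base_url + "/update",
        data=message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )

optimize_request = build_update_request("<optimize/>")
print(optimize_request.get_method(), optimize_request.full_url)
print(optimize_request.data)

# To actually send it (requires a running Solr):
#   from urllib.request import urlopen
#   urlopen(optimize_request)
```

The same helper with `<commit/>` as the message covers the commit case from the wiki page linked above.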

RE: Doubt about index size

2010-11-12 Thread Burton-West, Tom
An optimize takes lots of cpu and I/O since it has to rewrite your indexes, so only do it when necessary. You can just use curl to send an optimize message to Solr when you are ready. See: http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL Tom

Re: Shuffle results a little

2010-11-12 Thread Ahmet Arslan
I am interested in using solr to return search results for products. Is there any feature which will allow the result to be spread/shuffled around a little? The problem is that there are lots of results for one brand, but there are lots of other brands a few pages later. Is it possible to

Re: Shuffle results a little

2010-11-12 Thread Dave Searle
You could also try splitting the brand name from the product name into a separate field and then boosting on the product name? Sent from my iPhone On 12 Nov 2010, at 20:32, Ahmet Arslan iori...@yahoo.com wrote: I am interested in using solr to return search results for products. Is there

Re: Corename after Swap in MultiCore

2010-11-12 Thread sivaram
Shawn, That is good if we can restart Solr. But we don't want to restart the whole of Solr after every commit, because some of the cores usually have to update at comparatively short intervals. So, we do a core reload to get all the synonyms and other stuff updated without a Solr reload.

Re: Searching with AND + OR and spaces

2010-11-12 Thread Imran
To get a more precise result on exact matches of your terms, how about having another string type field for title and subhead, and using dismax to boost the string type fields more than the text type fields? Cheers -- Imran On Fri, Nov 12, 2010 at 6:56 PM, Jon Drukman j...@cluttered.com wrote:

Re: Looking for help with Solr implementation

2010-11-12 Thread Jean-Sebastien Vachon
Hi, If you're still looking for someone, I might be interested in getting more information about your project. From your initial message, that does not seem to be a lot of work, so I might be willing to give you some time. I've been working with Solr for the last 7 months on my full-time job

Re: Crawling with nutch and mapping fields to solr

2010-11-12 Thread Ramavtar Meena
Hi, This question is more suitable for the nutch mailing list, but let me give you a couple of pointers. If it's only metadata you can use the below mentioned patch, but if you want more flexibility with your data you can look at writing your own parser plugin, here is a good place to start:

Re: Looking for help with Solr implementation

2010-11-12 Thread Jean-Sebastien Vachon
Sorry all, I obviously meant to send this to the original poster - Original Message - From: Jean-Sebastien Vachon js.vac...@videotron.ca To: solr-user@lucene.apache.org Sent: Friday, November 12, 2010 10:09 PM Subject: Re: Looking for help with Solr implementation Hi, If you're

Re: Looking for help with Solr implementation

2010-11-12 Thread Dennis Gearon
I might be looking down the road. Send me a site showing the functionality you described? Filing this in the 'Solr Consultants' mail folder. Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from

filtering or getting accurate crawling results

2010-11-12 Thread Dennis Gearon
How easy is it to get good results from the Lucene crawling software? Let's say for example I wanted only information about a general subject, but nothing else? (Sorry, not ready to say what exactly at this point) Is it like tuning Solr, or IS it tuning Solr to just not accept what does not fit

Re: Looking for help with Solr implementation

2010-11-12 Thread Dennis Gearon
Hmmm, still getting used to the new Yahoo mail. This should have gone only to the writer. Dennis Gearon

Re: filtering or getting accurate crawling results

2010-11-12 Thread Dennis Gearon
Actually, can Nutch be used for SCRAPING, not crawling? I don't just want the url, I want the data assigned to specific fields, no matter what site or format it is coming from. I've done scraping, but it had to be custom tailored for each target. Dennis Gearon

Re: Shuffle results a little

2010-11-12 Thread Lance Norskog
There is a Random field type which returns random numbers. You might try boosting with that. Dave Searle wrote: You could also try splitting the brand name from the product name into a separate field and then boosting on the product name? Sent from my iPhone On 12 Nov 2010, at 20:32, Ahmet

Re: A Newbie Question

2010-11-12 Thread Lance Norskog
About web servers: Solr is a servlet war file and needs a Java web server container to run. The example/ folder in the Solr distribution uses 'Jetty', and this is fine for small production-quality projects. You can just copy the example/ directory somewhere to set up your own running Solr;

Re: How to use polish stemmer - Stempel - in schema.xml?

2010-11-12 Thread Lance Norskog
I don't know if the Stempel jar includes the Java source. At this point I think you should ask the author of Stempel to make a Solr front-end for it. It's very simple for him. Jakub Godawa wrote: Am I not doing it in the point no 4? I am compiling all the folder that was extracted before, but

Re: Looking for help with Solr implementation

2010-11-12 Thread AC
Hey Jean-Sebastien, Thanks for the reply. It sounds like your experience is exactly what is needed for my project. To give you some background, this is a personal project related to the biomedical field that I'm trying to get up off the ground. The site is www.antibodyreview.com

Re: Corename after Swap in MultiCore

2010-11-12 Thread Shawn Heisey
On 11/12/2010 2:48 PM, sivaram wrote: That is good if we can restart Solr. But we don't want to restart the whole of Solr after every commit, because some of the cores usually have to update at comparatively short intervals. So, we do a core reload to get all the synonyms and other stuff getting

Searching problem

2010-11-12 Thread M.Rizwan
Hi All, Do you have any idea why a Solr search for "panasonic*" (without the quotes) does not match "panasonic"? If we search for "panasonic" it matches a result, but if we search with "panasonic*" it does not find it. What needs to be done here? Thanks Riz