Re: What are stopwords and protwords ???
Stopwords are commonly occurring words that don't add _much_ value to search, such as "the", "an", and "a"; they are usually removed during analysis. Protwords (protected words) are words that the English Porter stemmer would otherwise stem and that you want left unstemmed.

Removing stopwords may keep your index smaller and can keep some queries from taking a long time, but it also means you can't query for those words. Protecting words is something you would do if you felt the results for those tokens were off.

Many people remove stopwords, many don't. Personally, I don't think removing them is the right thing to do, as there isn't always a way to recover them and they do provide meaning; otherwise why would they be needed in the language? Often the best thing to do is keep stopwords but handle them intelligently on the query side (in phrases, etc.). However, since you're a beginner, it probably makes sense to just throw out stopwords for now.

-Grant

On May 21, 2008, at 1:50 AM, Akeel wrote:

Hi, I am a beginner to Solr and have successfully indexed my db in Solr. I want to know what stopwords and protwords are, and how much effect they have on my search results.

Thanks in advance.
-- Akeel
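For anyone who finds the distinction easier to see in code, here is a minimal Java sketch of the two ideas in Grant's reply: dropping stopwords during analysis, and skipping the stemmer for protected words. It is not Solr's implementation. The class name `AnalysisSketch`, the word lists, and the `toyStem` helper are all hypothetical, and `toyStem` only strips a trailing "s"; it is a stand-in for illustration, not the Porter algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; Solr does this inside its analysis chain.
class AnalysisSketch {
    // Assumed example lists, not Solr's shipped defaults.
    static final Set<String> STOPWORDS = Set.of("the", "an", "a");
    static final Set<String> PROTWORDS = Set.of("news");

    // Toy stand-in for a stemmer: strips one trailing "s" from longer tokens.
    // This is NOT the Porter stemming algorithm.
    static String toyStem(String token) {
        if (token.length() > 3 && token.endsWith("s")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Lowercase, split on whitespace, drop stopwords, and stem every token
    // that is not in the protected list.
    static List<String> analyze(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty() || STOPWORDS.contains(token)) {
                continue; // stopword: never reaches the index
            }
            out.add(PROTWORDS.contains(token) ? token : toyStem(token));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [news, about, cat, and, dog]
        System.out.println(analyze("The news about a cat and dogs"));
    }
}
```

Without the protected-word check, "news" would be stemmed to "new" by this toy stemmer, which is the kind of "off" result protwords are meant to prevent. Note also that "the" and "a" are gone from the output, so a query for them could never match.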
Re: What are stopwords and protwords ???
Thank you very much for such a detailed reply.

Can you please tell me how I can interact with Solr from within my Java/JSP application? I mean, how do I query the Solr instance running at localhost and get the results back in the application? Do I have to change something in solrconfig.xml?

Please help me in this regard. Thanks in advance.
-- Akeel
Re: What are stopwords and protwords ???
Hi Akeel,

Take a look at SolrJ, which is a Java client library for Solr. It is packaged with the Solr nightly binary downloads. Your Java/JSP application can use it to add documents or query Solr. No changes to any config files are needed.

-- Regards, Shalin Shekhar Mangar.
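SolrJ is the recommended route, but since it talks to Solr over HTTP, it may help to see roughly what such a query looks like without any client library. The sketch below deliberately does not use SolrJ; it sends a plain HTTP GET with the JDK's `java.net.http` client (Java 11+). The URL `http://localhost:8983/solr/select`, the `q` and `rows` parameters as used here, the field name in the sample query, and the class name `SolrHttpQuerySketch` are assumptions for illustration; check them against your own installation.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: queries Solr over plain HTTP instead of through SolrJ.
class SolrHttpQuerySketch {
    // Assumed location of the select handler on a local Solr instance.
    static final String SELECT_URL = "http://localhost:8983/solr/select";

    // Build the request URI, URL-encoding the query string.
    static URI buildQueryUri(String baseUrl, String query, int rows) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "?q=" + q + "&rows=" + rows);
    }

    // Send the query and return the raw response body as a string.
    static String search(String query, int rows) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(buildQueryUri(SELECT_URL, query, rows))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            // "title" is an assumed field name; use one from your own schema.
            System.out.println(search("title:solr", 10));
        } catch (IOException e) {
            System.out.println("Could not reach Solr at " + SELECT_URL + ": " + e);
        }
    }
}
```

In a real application SolrJ is the better choice, because it handles request building and response parsing for you; this only shows that nothing in solrconfig.xml has to change for a client to run queries.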
Re: What are stopwords and protwords ???
Here's the link to the wiki documentation on SolrJ: http://wiki.apache.org/solr/Solrj

-- Regards, Shalin Shekhar Mangar.
Re: What are stopwords and protwords ???
Thanks, everyone.

On Thu, May 22, 2008 at 7:18 AM, Grant Ingersoll wrote:

See http://lucene.apache.org/solr/tutorial.html. You can also see the wiki for a whole bunch of docs, including links to tutorials, etc.

Also, just for future reference, please separate out questions so that they can be addressed separately and more easily found by others in the future.

-Grant

-- Grant Ingersoll
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

-- Thanks and Regards, Akeel ur Rehman Faridee