Indexing large amount of data

2010-07-12 Thread sarfaraz masood
I have a large amount of data (120 GB) to index, so I want to improve indexing performance. I went through the documentation on the Lucene website, which mentions various ways to improve performance. I am working on Debian Linux with

Problem during indexing

2010-07-12 Thread sarfaraz masood
I am trying to add 20 million documents to my index from another index that contains these documents (I can't change this architecture; it is something I have to follow). The problems I am facing are the following: 1) A "too many files open" error, raised in the code which is adding documents to

Re: Problem with linux

2010-07-10 Thread sarfaraz masood
uded the version of Lucene you're using (which is suggested by your code snippet, perhaps the Lucene user's list would be a better forum?). Nor the version of your JVM. Nor your linux version. Nor.... Best Erick On Fri, Jul 9, 2010 at 7:49 PM, sarfaraz masood < sarfarazmasood2...@

Problem with linux

2010-07-09 Thread sarfaraz masood
I have problems when I execute my program on Linux with the following piece of code: { Document d; Analyzer analyzer = new PorterStemAnalyzer(); System.out.println("1"); Directory index = FSDirectory.open(new File("index1")); System.out.println("2"); IndexWriter w = new IndexWriter(index

Re: stemming the index

2010-07-07 Thread sarfaraz masood
e indexing and searching". Take a look through the mail archive, try search for multilanguage or multi-language or multiple languages. There's a wealth of info there because this topic has been discussed many times. Best Erick On Wed, Jul 7, 2010 at 3:51 PM, sarfaraz masood < sarfa

terms separated by -

2010-07-07 Thread sarfaraz masood
There are terms in my data like "one-way", separated by '-'. The problem is that the standard analyzer treats these as a single term instead of two, but I need them stored as two terms in the index. How can I do this? Sarfaraz
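
One way to get two terms from a hyphenated word is to split on the hyphen before (or while) the text is tokenized. The sketch below shows only that idea in plain Java. It does not use Lucene's analyzer API, and the names `HyphenSplitSketch` and `tokenize` are hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class HyphenSplitSketch {
    // Splits text on runs of whitespace and hyphens, so "one-way"
    // yields the two terms "one" and "way". Terms are lowercased.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String part : text.split("[\\s-]+")) {
            // A leading delimiter produces an empty first element; skip it.
            if (!part.isEmpty()) {
                terms.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a one-way street")); // prints [a, one, way, street]
    }
}
```

In an index, the same splitting would have to be applied to both the indexed text and the query text, otherwise a search for "one-way" would not match the two stored terms.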

stemming the index

2010-07-07 Thread sarfaraz masood
My index contains data in two different languages, English and German. Which analyzer and stemmer should be applied to this data before feeding it to the index? -Sarfaraz

Re: how to apply stemming to the index ?

2010-07-05 Thread sarfaraz masood
orterStemFilter to a filter chain when making our own analyzer (see the synonym example in Lucene In Action, first or second edition). If this is gibberish, perhaps you could provide some more context for what you're trying to accomplish. HTH Erick On Fri, Jul 2, 2010 at 5:08 AM, sarfaraz

how to apply stemming to the index ?

2010-07-02 Thread sarfaraz masood
I want to stem the terms in my index, but currently I am using the standard analyzer, which does not perform any kind of stemming: StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT); After some searching I found code for a PorterStemAnalyzer, but that is having some problems

Re: Mr Erick Re: Change the Solr searcher

2010-06-23 Thread sarfaraz masood
-user@lucene.apache.org Date: Tuesday, 22 June, 2010, 11:11 PM Sounds like what you want is to override Solr's "query" component.  Have a look at the built-in one and go from there.     Erik On Jun 22, 2010, at 1:38 PM, sarfaraz masood wrote: > I am a novice in solr / lucen

Change the Solr searcher

2010-06-22 Thread sarfaraz masood
I am a novice in Solr / Lucene, but I have gone through the documentation of both. I have even implemented programs in Lucene for searching etc. My problem is to apply a new search technique other than the one used by Solr. As I understand it, Lucene has its own searcher, which is used by Solr as wel

Re: Mr Lance : customize the search algorithm of solr

2010-06-21 Thread sarfaraz masood
Mr Lance, thanks a lot for your reply. I am a novice at Solr / Lucene, but I have gone through the documentation of both. I have even implemented programs in Lucene for searching etc. My problem is to apply a new search technique other than the one used by Solr. Step 1: My algorithm finds the tf idf

customize the search algorithm of solr

2010-06-18 Thread sarfaraz masood
Are there any means by which we can customize Solr's search, by plugins etc.? I have been working on a research-based project to implement a new search algorithm for search engines. I want to know if I can make Solr use this algorithm to decide the resultant documents, and still allow me to

access term vectors in lucene

2010-06-16 Thread sarfaraz masood
Hello all, I want to know how we can access term vectors in Lucene. I am working on a project where I need the tf-idf values of all the terms in the documents, but I am unable to find any reference or example showing how to use these term vectors to get the tf-idf values of ALL the terms in my
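
The tf-idf value of every term in a document can be derived from two counts: how often the term occurs in that document (tf) and how many documents contain it (df). The sketch below is plain Java and assumes the textbook weighting tf * ln(N / df), where N is the number of documents. It does not use Lucene's term-vector API, Lucene's built-in scoring uses its own variants of these formulas, and the names `TfIdfSketch` and `tfIdf` are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TfIdfSketch {
    // Returns tf * ln(N / df) for every distinct term in doc,
    // where N is the corpus size and df counts documents containing the term.
    static Map<String, Double> tfIdf(List<String> doc, List<List<String>> corpus) {
        // Term frequency: raw occurrence count within this document.
        Map<String, Integer> tf = new HashMap<>();
        for (String term : doc) {
            tf.merge(term, 1, Integer::sum);
        }
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            long df = corpus.stream().filter(d -> d.contains(e.getKey())).count();
            if (df == 0) {
                continue; // doc is not part of corpus; no idf can be computed
            }
            weights.put(e.getKey(), e.getValue() * Math.log((double) corpus.size() / df));
        }
        return weights;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = List.of(
            List.of("lucene", "index", "lucene"),
            List.of("index", "search"));
        // "index" occurs in every document, so its weight is 0.
        System.out.println(tfIdf(corpus.get(0), corpus));
    }
}
```

With an actual index, the per-document term counts would come from stored term vectors and the document frequencies from the index itself, rather than from in-memory lists as assumed here.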

how to get tf-idf values in solr

2010-06-15 Thread sarfaraz masood
you all -Sarfaraz Masood