I have a large amount of data (120 GB) to index, so I want to improve the indexing performance. I went through the documentation on the Lucene website, which mentions various ways the performance can be improved.
I am working on Debian Linux with
I am trying to add 20 million documents to my index from another index that contains these documents (can't help this architecture; it's something I will have to follow). Now the problems I am facing are the following:
1) A "too many open files" error, at the code which is adding documents to
You haven't included the version of Lucene you're using (which is suggested by your code snippet; perhaps the Lucene user's list would be a better forum?). Nor the version of your JVM. Nor your Linux version. Nor....
Best
Erick
On Fri, Jul 9, 2010 at 7:49 PM, sarfaraz masood <
sarfarazmasood2...@
I have problems when I execute my program on Linux with the following piece of code.
{
Document d;
Analyzer analyzer = new PorterStemAnalyzer();
System.out.println("1");
Directory index = FSDirectory.open(new File("index1"));
System.out.println("2");
// assuming the Lucene 3.x constructor (Directory, Analyzer, MaxFieldLength)
IndexWriter w = new IndexWriter(index, analyzer, IndexWriter.MaxFieldLength.UNLIMITED);
...indexing and searching".
Take a look through the mail archive; try searching for multilanguage or multi-language or multiple languages. There's a wealth of info there because this topic has been discussed many times.
Best
Erick
On Wed, Jul 7, 2010 at 3:51 PM, sarfaraz masood <
sarfa
There are terms in my data like "one-way", separated by '-'. The problem is that the StandardAnalyzer is treating these as a single term instead of two, but I need them stored as two terms in the index. How can I do this?
Sarfaraz
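One workaround for the question above, until the analyzer itself is changed, is to pre-split hyphenated terms before handing the text to the index writer. A minimal sketch in plain Java (no Lucene classes; the class and method names here are illustrative, not from any library):

```java
import java.util.ArrayList;
import java.util.List;

public class HyphenSplitter {
    // Splits each whitespace-delimited token on '-', so "one-way"
    // becomes the two tokens "one" and "way".
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String word : text.split("\\s+")) {
            for (String part : word.split("-")) {
                if (!part.isEmpty()) {
                    tokens.add(part);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a one-way street")); // [a, one, way, street]
    }
}
```

In a real Lucene setup the same effect is usually achieved inside the analysis chain rather than by preprocessing, but the splitting logic itself is no more than this.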
My index contains data in two different languages, English and German. Which analyzer and stemmer should be applied to this data before feeding it to the index?
-Sarfaraz
Add a PorterStemFilter to a filter chain when making your own analyzer (see the synonym example in Lucene in Action, first or second edition).
If this is gibberish, perhaps you could provide some more context
for what you're trying to accomplish.
HTH
Erick
On Fri, Jul 2, 2010 at 5:08 AM, sarfaraz
I want to stem the terms in my index, but currently I am using the StandardAnalyzer, which does not perform any stemming.
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
After some searching I found code for a PorterStemAnalyzer, but it has some problems.
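The idiomatic fix, as Erick suggests above, is to chain Lucene's PorterStemFilter after the tokenizer in your own Analyzer. As a toy illustration of what stemming buys you (this is a hand-rolled stand-in, not Porter's actual algorithm or any Lucene class), consider:

```java
public class ToyStemmer {
    // Strips a few common English suffixes. Lucene's PorterStemFilter
    // implements the full Porter algorithm; this only demonstrates the
    // idea: reduce inflected forms to a shared root so they match.
    public static String stem(String term) {
        if (term.endsWith("ing") && term.length() > 5) {
            return term.substring(0, term.length() - 3);
        }
        if (term.endsWith("es") && term.length() > 4) {
            return term.substring(0, term.length() - 2);
        }
        if (term.endsWith("s") && term.length() > 3) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        System.out.println(stem("indexing")); // index
        System.out.println(stem("searches")); // search
    }
}
```

With a stemming filter in the chain, a query for "search" also matches documents containing "searches" or "searching", which is exactly what the StandardAnalyzer alone does not give you.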
-user@lucene.apache.org
Date: Tuesday, 22 June, 2010, 11:11 PM
Sounds like what you want is to override Solr's "query" component. Have a look
at the built-in one and go from there.
Erik
On Jun 22, 2010, at 1:38 PM, sarfaraz masood wrote:
I am a novice in Solr/Lucene, but I have gone through the documentation of both. I have even implemented programs in Lucene for searching, etc.
My problem is to apply a new search technique other than the one used by Solr. Now, as I know, Lucene has its own searcher, which is used by Solr as well
Mr. Lance,
Thanks a lot for your reply. I am a novice at Solr/Lucene, but I have gone through the documentation of both. I have even implemented programs in Lucene for searching, etc.
My problem is to apply a new search technique other than the one used by Solr.
Step 1: My algorithm finds the tf-idf
Are there any means by which we can customize Solr's search, e.g. via plugins?
I have been working on a research-based project to implement a new search algorithm for search engines. I want to know if I can make Solr use this algorithm to decide the resulting documents, and still allow me to
Hello all,
I want to know how we can access term vectors in Lucene. I am working on a project where I need the tf-idf values of all the terms in the documents, but I am unable to find any reference showing how to use these term vectors to get the tf-idf values of ALL the terms in my
Thank you all
-Sarfaraz Masood
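For context on the tf-idf question above: once you have a term's frequency in a document and its document frequency across the index (which is what term vectors and the index statistics give you), the weight is simple arithmetic. A minimal sketch using the textbook formulation (the class name is illustrative; Lucene's own scoring in this era used a related variant, roughly tf = sqrt(freq) and idf = 1 + ln(numDocs / (docFreq + 1))):

```java
public class TfIdf {
    // Textbook tf-idf: raw term frequency times the log of
    // (total documents / documents containing the term).
    public static double tfIdf(int termFreq, int docFreq, int numDocs) {
        if (termFreq == 0 || docFreq == 0) {
            return 0.0;
        }
        return termFreq * Math.log((double) numDocs / docFreq);
    }

    public static void main(String[] args) {
        // A term appearing 3 times in a document, found in 10 of 1000 docs:
        System.out.println(tfIdf(3, 10, 1000));
    }
}
```

Looping this over every term in a document's term vector yields the per-term tf-idf values the question asks for; the only index-wide inputs needed are the total document count and each term's document frequency.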