--
View this message in context:
http://old.nabble.com/english-dictionary-for-spelling-tp26672045p26672045.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Hello all,
I have a question about the spell checker. I am creating the spell index
from my original index, but the original index itself contains some
misspelled words. So I decided to use proper English dictionary words for
my spell checker. Can anyone tell me whether there is any option in Lucene
to do my
Hello all,
How do I update my existing index to avoid duplicates? This is how I am
doing my indexing:
doc.add(new Field("id", "" + i, Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.add(new Field("title", indexForm.getTitle(), Field.Store.YES,
Hello all,
I have a question about split-word search in Lucene. For example, if I
search for dualcore it should return dual core. How do I split this word? Is
there any analyzer in Lucene to do it? Please, can anyone help me?
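One way to think about the dualcore question is dictionary-based splitting: try every cut point and accept the first one where both halves are known words. The sketch below is plain Java and does not use any Lucene API; the class name, method name, and word list are assumptions made for illustration. Lucene's contrib analyzers include a DictionaryCompoundWordTokenFilter that works along broadly similar lines, so it may be worth checking whether your version ships it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CompoundSplitter {
    /**
     * Returns the two dictionary words that make up the input,
     * or the input unchanged (as a one-element array) if no such split exists.
     */
    public static String[] split(String word, Set<String> dictionary) {
        for (int i = 1; i < word.length(); i++) {
            String left = word.substring(0, i);
            String right = word.substring(i);
            if (dictionary.contains(left) && dictionary.contains(right)) {
                return new String[] { left, right };
            }
        }
        return new String[] { word };
    }

    public static void main(String[] args) {
        // Hypothetical word list; a real one would be loaded from a dictionary file.
        Set<String> dictionary = new HashSet<>(Arrays.asList("dual", "core", "meta", "city"));
        System.out.println(String.join(" ", split("dualcore", dictionary)));
    }
}
```

The same splitting would have to be applied consistently, either when indexing or when building the query, for dualcore and dual core to match each other.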
What should I do now? Could you please clarify?
Grant Ingersoll-6 wrote:
On Nov 24, 2009, at 1:16 AM, m.harig wrote:
String[] suggestions = spellChecker.suggestSimilar("hoem", 3, indexReader,
"contents", true);
This is how I am retrieving my "did you mean" words.
And which distance
Hello all,
Is there any way to update the spell index directory? Please, can anyone
help me out with this?
String[] suggestions = spellChecker.suggestSimilar("hoem", 3, indexReader,
"contents", true);
This is how I am retrieving my "did you mean" words.
Grant Ingersoll-6 wrote:
How are you invoking the spell checker?
On Nov 19, 2009, at 1:22 AM, m.harig wrote:
hello all
i've a doubt
Hello all,
I have a question about the spell checker. When I search for the keyword
hoem, I get the spelling suggestions in the following order (I am
retrieving 4 suggested words):
form
hold
home
them
I need the word home to be returned first, but it is in the third
position,
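A possible reason home does not stand out: under plain Levenshtein distance, form, hold, home and them are all two edits away from hoem. A distance that counts swapping two adjacent letters as a single edit puts home at distance 1. The sketch below is plain Java with assumed class and method names; it is not Lucene's SpellChecker or any of its StringDistance implementations, and the ordering Lucene actually produces also depends on which distance and other ranking options are configured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TranspositionRanker {
    /** Optimal string alignment distance: Levenshtein plus adjacent transpositions. */
    public static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
                // An adjacent swap (e.g. "em" vs "me") counts as one edit.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    /** Returns the suggestions sorted by distance to the typed word (stable sort). */
    public static List<String> rank(String typed, List<String> suggestions) {
        List<String> ranked = new ArrayList<>(suggestions);
        ranked.sort(Comparator.comparingInt((String s) -> distance(typed, s)));
        return ranked;
    }

    public static void main(String[] args) {
        System.out.println(rank("hoem", Arrays.asList("form", "hold", "home", "them")));
    }
}
```

Re-sorting the suggestions returned by the spell checker with such a comparator is one option; preferring suggestions that occur more often in the index is another.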
) this will delete
the old document and add the new one.
simon
On Tue, Nov 10, 2009 at 10:05 AM, m.harig m.ha...@gmail.com wrote:
Hello all,
This is my situation: I have multiple indexes, for example index1,
index2, index3 ... and I have to update the indexes every night. If I open my
IndexWriter
Thanks again.
This is my code:
doc.add(new Field("id", "" + i, Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.add(new Field("title", indexForm.getTitle(), Field.Store.YES,
Field.Index.ANALYZED));
doc.add(new Field("contents",
Thanks Simon,
This is my code:
doc.add(new Field("id", "" + i, Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.add(new Field("title", indexForm.getTitle(), Field.Store.YES,
Field.Index.ANALYZED));
doc.add(new Field("contents",
Thanks Ian, it works. Thanks a lot.
Ian Lea wrote:
Try updateDocument(new Term("id", "" + i), doc).
See javadocs for Term constructors.
--
Ian.
On Tue, Nov 10, 2009 at 9:47 AM, m.harig m.ha...@gmail.com wrote:
Thanks again.
This is my code:
doc.add(new Field("id", "" + i
Thanks Erick,
I understand the issue, but my question is about searching for a keyword
that is originally a single word. For example, metacity is really a single
keyword. When I search for meta city I am not able to get the results. That
is my question.
If you go to Google and search for
Hello all,
I have a question about plural and singular word searching. I got this code
snippet from the Nabble forum:
private static Analyzer createEnglishAnalyzer() {
return new Analyzer() {
public TokenStream tokenStream(String fieldName, Reader reader)
{
TokenStream result =
Thanks Erick,
A little more information would help here. 1) Are you using the same analyzer
at both index and query time?
No. Sorry, I am using StandardAnalyzer at index time; during querying
I am using the code snippet found on Nabble.
2) Assuming 1) is yes, did you re-index your data
Thanks Erick,
It works fine if I use the same analyzer (the code snippet found on Nabble)
for both indexing and querying.
But highlighting no longer works for plural words. I think I need to search
more; I'll come back to you if I can't figure it out. Thanks again, Erick.
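Why the analyzer has to match on both sides can be shown without Lucene at all. In the sketch below, a toy normalizer (lowercasing plus stripping one trailing "s", which is an assumption for illustration and not a real stemmer) is applied when building a term set. A query term finds its match only if it goes through the same normalizer. All names here are hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class SameAnalyzerDemo {
    /** Toy normalizer: lowercases and strips one trailing "s". Not a real stemmer. */
    public static String normalize(String token) {
        String t = token.toLowerCase(Locale.ROOT);
        return (t.length() > 3 && t.endsWith("s")) ? t.substring(0, t.length() - 1) : t;
    }

    /** Builds a toy "index": the set of normalized, whitespace-separated tokens. */
    public static Set<String> index(String text) {
        Set<String> terms = new HashSet<>();
        for (String token : text.trim().split("\\s+")) {
            terms.add(normalize(token));
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> terms = index("Credit cards and loans");
        // Query term normalized the same way as the indexed text.
        System.out.println(terms.contains(normalize("cards")));
        // Raw query term, i.e. a different "analyzer" at query time.
        System.out.println(terms.contains("cards"));
    }
}
```

The same reasoning may apply to the highlighting problem: if the highlighter tokenizes the stored text with a different analyzer than the one used for the query, plural forms in the text would not line up with the singular query terms.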
is an IndexReader on top of various
Sub-IndexReaders.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: m.harig [mailto:m.ha...@gmail.com]
Sent: Friday, October 02, 2009 6:52 PM
To: java-user@lucene.apache.org
Hello all,
I am merging more than one index to search a document. How do I use
IndexReader here to open multiple indexes (since an IndexReader opens one
directory at a time)? Could anyone please advise?
Hello all,
Is there any way to get all tokens from my index? Please, can anyone
advise?
Thanks Ahmet, I found the solution. Thanks a lot.
Ahmet Arslan wrote:
hello all, is there any way to get all
tokens from my index ? please anyone
suggest me
The code below prints all terms of a field.
String path = "E:\\ThesaurusSolrHome\\data\\index";
String field =
Hello,
Will my reader.reopen() method work on a Windows machine when the index gets
updated? I mean, will my Tomcat server allow the reader to update my index?
Please help me.
Hello all,
Thanks to Lucene. I am using Lucene 2.4.0 for my application. My
question is: can I read the index many times? I mean, I have a
search application that reads the index, which is 300MB in size, and I am
reading the index every time a user hits the page. Is it
Thanks
This is my code snippet:
IndexSearcher searcher = new IndexSearcher(indexDir);
Analyzer analyzer = new StopAnalyzer();
WildcardQuery query = new WildcardQuery(new
Term(DEFAULT_FIELD));
Thanks for your reply,
my original code snippet is
IndexSearcher searcher = new IndexSearcher(indexDir);
Analyzer analyzer = new StopAnalyzer();
BooleanClause.Occur[] flags = { BooleanClause.Occur.SHOULD,
Thanks,
I noticed that, but the code is for known tokens. How do I
do it for dynamic tokens? That is, I don't know the URLs; someone picks
the URLs and I index them. Is there any technique to use while indexing?
I am using Lucene version 2.4.0. Please advise.
Thanks all,
But how does Nutch handle this problem? I am aware of Nutch, but not in
depth. If I search for the keyword about us, Nutch gives me exactly what I
want. Are there any scoring techniques? Please let me know.
Hello,
Do you have any idea about the integration of Lucene with Hadoop
BrickMcLargeHuge wrote:
Hey all,
I just wanted to send a link to a presentation I made on how my
company is building its entire core BI infrastructure around Hadoop,
HBase, Lucene, and more. It features
Thanks all,
I am very thankful to all of you. I am tired of the Hadoop settings. Is it
fine to read such a large index with Lucene alone? Will it run into OOM?
Can anyone please advise?
Hello all,
We have 100GB of data in doc, txt, pdf, ppt, etc. formats, and we have a
separate parser for each file format, so we are going to index the data with
Lucene. (We were scared of the Nutch setup, which is why we didn't use it.) My
question is: will it be scalable when I index those documents?
Thanks Shai,
So there won't be a problem when searching that kind of large index.
Am I right?
Can anyone tell me whether it is possible to use Hadoop with Lucene?
Is there any article or forum on using Hadoop with Lucene? Please, can anyone
help me?
Hello all,
I am using the .NET port of Lucene for my search application. How do I index
non-English pages? Are there any analyzers for this? I am struggling with a
UTF-8 problem. Please, can anyone help me?
Hello all,
I have gone through most of the posts on this forum. I need a code
snippet for searching a large index. Currently I am iterating like this:
hits = searcher.search(query);
for (int inc = 0; inc < hits.length(); inc++) {
Document doc = hits.doc(inc);
Thanks Simon,
It's working now, thanks a lot. I have a question:
I have 30,000 PDF files indexed, but the code you sent returns only
200 results, because I am setting TopDocs topDocs =
searcher.search(query,200); As I said, if I use Integer.MAX_VALUE, it
Hi there,
On Tue, Jun 30, 2009 at 12:41 PM, m.harig m.ha...@gmail.com wrote:
Thanks Simon,
It's working now, thanks a lot. I have a question:
I have 30,000 PDF files indexed, but the code you sent returns only
200 results, because I am setting TopDocs
Thanks Erick,
in Ian's link, particularly see the section "Don't iterate over more hits
than necessary".
A couple of other things:
1) Loading the entire document just to get a field or two isn't
very efficient; think about lazy loading (see FieldSelector).
I have done that, but I have a couple of
Thanks Uwe,
Can you please give me a code snippet so that I can resolve my
issue?
The correct way to iterate over all results is to use a custom HitCollector
(Collector in 2.9) instance. The HitCollector's method collect(docid, score)
is called for every hit. No need to
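The callback idea described above can be sketched in plain Java. The interface and the toy search method below are hypothetical stand-ins written for this illustration; they are not Lucene's HitCollector or Collector classes. The point is only that the caller handles each (docid, score) pair as it arrives, without first building a list of every hit.

```java
public class CollectAllHits {
    /** Hypothetical per-hit callback; a stand-in, not Lucene's HitCollector. */
    public interface HitCallback {
        void collect(int docId, float score);
    }

    /** Toy search: reports every document whose score is above zero. */
    public static void search(float[] scores, HitCallback callback) {
        for (int docId = 0; docId < scores.length; docId++) {
            if (scores[docId] > 0f) {
                callback.collect(docId, scores[docId]);
            }
        }
    }

    /** Counts hits by reacting to each callback; no hit list is kept in memory. */
    public static int countHits(float[] scores) {
        final int[] count = {0};
        search(scores, (docId, score) -> count[0]++);
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countHits(new float[] {0.5f, 0f, 1.2f}));
    }
}
```

With 30,000 matching documents, a callback of this kind could record only the document ids, or write each result out immediately, rather than loading every stored document into the heap.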
Hello all,
I am building a search application on Lucene. It works fine when my
index is small, but I get a Java heap space error when using a large
index. I came to know about using Hadoop with Lucene to solve this problem,
but I don't have any idea about Hadoop. I've searched through
Simon Willnauer wrote:
Hey there,
before going out to use hadoop (hadoop mailing list would help you
better I guess) you could provide more information about your
situation. For instance:
- how big is your index
- version of lucene
- which java vm
- how much heap space
- where does the
Simon Willnauer wrote:
On Mon, Jun 29, 2009 at 1:48 PM, m.harig m.ha...@gmail.com wrote:
Simon Willnauer wrote:
Hey there,
before going out to use hadoop (hadoop mailing list would help you
better I guess) you could provide more information about your
situation. For instance:
- how
Thanks Simon,
I don't run any application on Tomcat; moreover, I restarted
it. I am not running any jobs except searching. We have a 500GB drive and have
indexed around 100,000 documents, which gives me an index of around 1GB. When I
tried to search for pdf I got the heap space error,
Thanks Simon ,
This is how I am indexing my documents:
indexWriter.addDocument(doc, new StopAnalyzer());
indexWriter.setMergeFactor(10);
indexWriter.setMaxBufferedDocs(100);
Thanks again,
Did I index my files correctly? I need some tips, please. The following
is the error I get when I run my keyword search. I typed pdf, that's it,
because I have around 30,000 files named pdf,
HTTP Status 500 -
type Exception report
message
description The server encountered
Thanks Simon ,
Hey there, that makes things easier. :)
ok here are some questions:
Do you iterate over all docs calling hits.doc(i)? If so, do you have to
load all fields to render your results? If not, you should not retrieve
all of them.
Yes, I am iterating over all docs by calling
Thanks Simon,
Example:
IndexReader open = IndexReader.open("/tmp/testindex/");
IndexSearcher searcher = new IndexSearcher(open);
final String fName = "test";
Is fName a field like summary or contents?
TopDocs topDocs = searcher.search(new TermQuery(new Term(fName,
"lucene")),
Hello all,
Can anyone tell me the difference between query.setBoost()
and doc.setBoost()? Moreover, if I use query.setBoost(4.0f) I am not able
to boost my results. Which one makes my results better? Please, can anyone
help me out with this?
Hello all,
I have a search application running on lucene-2.3.0. Say,
for example, I am indexing 10 URLs as input. When I search, I am not able
to get the expected results at the top of the ranking, i.e., unrelated hits
come up rather than related hits. I've been working on this for a
Hello all,
I have a search application that uses lucene-2.3.0, and my
application runs in a banking domain. I am indexing some banking URLs as
input and searching for some keywords. My question is: when I search for
cards, the URL with the lower keyword count comes up. I mean, for
Hi all,
I am indexing a price field with
doc.add(new Field("price", "1450", Field.Store.YES,
Field.Index.TOKENIZED));
doc.add(new Field("price", "3800", Field.Store.YES,
Field.Index.TOKENIZED));
doc.add(new Field("price",