Indexed data comes out the same way it was put in. Lucene works with Java
Strings, so the encoding is irrelevant to Lucene itself. When you index your
values, you must be sure to construct your index strings/char arrays correctly
using the UTF-8 encoding (e.g. by using a standard Java Reader, new String byte[], char
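A minimal sketch of the point above, using only the JDK (no Lucene calls): bytes have to be decoded with the right charset before the resulting String is handed to an indexer. The class and method names are hypothetical. A common symptom of pushing text through a charset that cannot represent it is a "?" in place of each affected character.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Lucene.
class Utf8DecodeSketch {

    // Decode raw bytes as UTF-8; the resulting String is what you would index.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Round-trip a String through US-ASCII to show how "?" marks appear:
    // characters the charset cannot encode are replaced by '?'.
    static String throughAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // "café" as UTF-8 bytes: the é is the two bytes 0xC3 0xA9.
        byte[] raw = {'c', 'a', 'f', (byte) 0xC3, (byte) 0xA9};
        String decoded = decodeUtf8(raw);
        System.out.println(decoded.length());        // 4 characters
        System.out.println(throughAscii(decoded));   // caf?
    }
}
```

Once the String is correct in memory, the same value should come back from the index; the damage usually happens while reading the input or writing the output.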
How do I post UTF-8 encoded Unicode data to a Lucene index? Do we have to specify
something special, any sort of flag saying that we're posting Unicode data?
I tried to post some UTF-8 encoded data, but during retrieval I'm not able to
see that data; there are just "?" marks in all those places. Earlier I was
Hi, can someone point me in the direction of how I can get a string array of
the corpus/index vocabulary from the index using an IndexReader?
Currently this is what I am doing:
IndexReader reader = IndexReader.open(indexdirectorypath);
termenumvar = reader.terms();
then I iterate through this ter
Hi balasubramanian, thanks for your reply.
Both first:25 and second:90 may or may not include 'java'.
I have set doc#90's boost to 3.15 and doc#25's boost to 1.0. I think that is
the key. I tried to set the query term boost to a proper value, but that did
not fix it: one is okay, but the other is not.
balasubram
My guess is that this can happen when your document matches more than one
condition. For example, first:25 could match lang:java as well?
- Original Message
From: hacklisp
To: java-user@lucene.apache.org
Sent: Thursday, May 21, 2009 10:03:52 AM
Subject: About sort questions
I search for 'lisp' with a Lucene application using the following query string:
uid:5^3 OR uid:10^2 OR lang:lisp
I expect results as follows:
first:5 (whose id is 5)
second:10 (which id is 10)
others:other results sort according to relevance.
it is always ok, but sometimes no
Hi there
On Tue, May 19, 2009 at 8:32 AM, ac wrote:
> hello,
> I am using CustomScoreQuery for result ranking.
> A field of my documents is parsable as an integer value, the magnitude
> of which exceeds the precision of the float type.
> A sample value of this field is 24118569
>
> However, due to
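A small JDK-only sketch of the precision issue described in the quoted message: a Java float has a 24-bit significand, so not every integer above 2^24 (16777216) can be represented exactly. The class and method names are hypothetical; the sample value is the one from the message.

```java
// Hypothetical demo class, not part of Lucene.
class FloatPrecisionSketch {

    // True if the int value is unchanged after a round trip through float.
    static boolean survivesFloat(int value) {
        return (int) (float) value == value;
    }

    public static void main(String[] args) {
        int sample = 24118569;                 // sample field value from the message
        System.out.println(survivesFloat(16777216)); // 2^24 is still exact
        System.out.println(survivesFloat(sample));   // odd value above 2^24 is not
        System.out.println((int) (float) sample);    // the nearest float value instead
    }
}
```

Between 2^24 and 2^25 floats are spaced 2 apart, so odd integers in that range are rounded to an even neighbour.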
The Lucene In Action book (at least the first edition and, I presume, the
second) has exactly this, called SynonymAnalyzer. The basic idea is that at
index time you index your multiple terms with no increment between, so all
your synonyms get indexed in the same position.
I highly recommend the bo
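To make the "no increment between" idea concrete, here is a plain-Java model of it. This is not Lucene's API and not the book's SynonymAnalyzer; the class, the `expand` method and the `PositionedTerm` type are hypothetical. Each token advances the position by 1, and each synonym is emitted right after it with an increment of 0, so it lands in the same position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model of position increments; hypothetical, not a Lucene class.
class SynonymPositionSketch {

    // A term plus its distance from the previous term (0 = same position).
    static final class PositionedTerm {
        final String term;
        final int positionIncrement;

        PositionedTerm(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    // Emit every token with increment 1, followed by its synonyms with increment 0.
    static List<PositionedTerm> expand(List<String> tokens,
                                       Map<String, List<String>> synonyms) {
        List<PositionedTerm> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(new PositionedTerm(token, 1));
            for (String synonym : synonyms.getOrDefault(token, List.of())) {
                out.add(new PositionedTerm(synonym, 0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> synonyms =
                Map.of("nothing", List.of("anything"), "crazy", List.of("lazy"));
        for (PositionedTerm t : expand(List.of("nothing", "is", "crazy"), synonyms)) {
            System.out.println(t.term + " +" + t.positionIncrement);
        }
    }
}
```

In a real analyzer the same effect would come from setting the position increment of the synonym tokens to zero in a token filter.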
Dear list,
I want to add an entry to an index with a custom synonym list. For
example, with the following text:
[i worrie about nothing beacuse this worls is crazy]
and I want to add the two custom synonyms
[anything]=>[nothing]
and
[lazy]=>[crazy]
so that a search for lazy, crazy not
Dear list,
I'm searching through a Lucene (2.9) index built with the GermanAnalyzer
(from the analyzers 2.9 package).
When I search for the word deutschland (the query parsed with the German analyzer
transforms to deutschla) I get a few hits.
When I search for deu?schland I get no results, bec
Woops -- disregard my comments. I was looking at the unreleased
(2.9-dev) version of RangeQuery.
In 2.4, RangeQuery will throw TooManyClauses, if the number of terms
in the range exceeds BooleanQuery's maxClauseCount.
ConstantScoreRangeQuery will not throw that exception.
Mike
On Wed, May 20, 2
Hi,
I did not see a setConstantScoreRewrite method
in the RangeQuery class?
Best regards, Lisheng
-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Wednesday, May 20, 2009 11:10 AM
To: java-user@lucene.apache.org
Subject: Re: RangeQuery & TooManyClause
Hmm... that's actually not true: RangeQuery will still throw that
exception, unless you call setConstantScoreRewrite to true (at which
point it does the same thing as ConstantScoreRangeQuery, ie that
exception will not be thrown).
The javadoc for RangeQuery is very misleading. (This happened when
Hi,
Looking at the docs for the 2.4 codebase, for RangeQuery
http://lucene.apache.org/java/2_4_0/api/index.html?org/apache/lucene/search/RangeQuery.html
there is a comment that a TooManyClauses exception is no longer thrown.
Does this mean that it is now safe to use RangeQuery without worrying
a
Hi All
This may not be a question for this mailing list but i wasn't sure where to
start. Please accept my apologies if anyone thinks that this is not the
appropriate place for this question.
I am currently working on building a proof of concept search solution for my
company using Lucene and
Hi
Thank you ag...@john.
This is even better. I don't have to bother about the 3rd argument, right?
I'll use the same one every time for both registering a new core as well as
adding docs to an existing one.
Thanks,
KK.
On Wed, May 20, 2009 at 6:54 PM, John Byrne wrote:
> Hi KK,
>
> You're welcome!
Hi KK,
You're welcome!
BTW, I had a quick look at the Javadoc for IndexWriter and noticed this
constructor:
public IndexWriter(Directory d, Analyzer a)
"Constructs an IndexWriter for the index in d, first creating it if it
does not already exist."
I think that might solve your problem and
Unless something about your problem space *requires* that you reopen the index,
you're better off just opening it once, writing all your documents to
it, then closing it. Although what you're doing will work, it's not very
efficient.
And the same thing is *especially* true of the searcher. There's
Thanks a lot @John. That solved the problem and the other advice is really
helpful. I'd have stumbled over that otherwise.
This clears up my doubt: every time I have to create a new index, I just call
the IndexWriter with "true", thereby creating the directory, then start
adding docs with "false" as th
I think the problem is that you are creating a new index every time you
add a document:
IndexWriter writer = new IndexWriter(trueIndexPath, new StandardAnalyzer(), true);
The last argument, the boolean 'true' tells IndexWriter to overwrite any
existing index in that directory. If you set that
Right, so again, you are opening your index by reference there. Your
application has to assume that the index that it's looking for exists in
the same directory as the application itself lives. Since you are
deploying this application as a deployable war file that's not going to
work really wel
Marco
You haven't answered Matt's question about where you are running it
from. Tomcat's default directory may well not be the same as yours.
I strongly suggest that you use a full path name and/or provide some
evidence that your readers and writers are using the same directory
and thus lucene i
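A quick JDK-only sketch of the advice above: print the absolute path that a relative index path resolves to, so you can check that readers and writers point at the same directory. The class name and the relative path "index" are hypothetical placeholders; no Lucene calls are involved.

```java
import java.io.File;

// Hypothetical helper class for checking where a path really points.
class IndexPathSketch {

    // Resolve a (possibly relative) path against the JVM's working directory.
    static File resolve(String path) {
        return new File(path).getAbsoluteFile();
    }

    public static void main(String[] args) {
        // The working directory differs between a shell, an IDE and Tomcat.
        System.out.println("working dir: " + System.getProperty("user.dir"));
        System.out.println("index dir:   " + resolve("index").getPath());
        System.out.println("exists:      " + resolve("index").exists());
    }
}
```

Logging this once in the indexer and once in the web application is a cheap way to get the evidence asked for here.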
Thank you very much.
I'm using the one mentioned by @Anshum, but the problem is that after
indexing some number of docs, what I see is only the last one indexed, which
clearly indicates that the index is getting overwritten. I'm posting my
simple indexer and searcher herewith. Actually I'm trying to craw
Hi KK,
Easier still, you could just open the IndexWriter with the last (3rd)
argument as true; this way the IndexWriter would create a new index as soon
as you start indexing. Also, if you just leave off the
3rd argument, it'd conditionally create a new directory, i.e. only if
You can do this with pure Java. Create a file object with the path you
want, check if it exists, and if not, create it:
import java.io.File;
// Create the index directory only if it is not there yet.
File newIndexDir = new File("/foo/bar");
if (!newIndexDir.exists()) {
    newIndexDir.mkdirs();
}
The 'mkdirs()' method creates any necessary parent directories.
If you want t
How do I create a new index? Every time I need to do so, I have to create a new
directory and give the path to it, right? How can I automate the creation of
a new directory?
I'm a new user of Lucene. Please help me out.
Thanks,
KK.
I've posted the indexing part, but I don't use this in my app. After I
create the index, I put it in a folder like /home/marco/RDFIndexLucece
and when I run the query I'm only searching (and not indexing).
String[] fieldsearch = new String[] {"name", "synonyms", "propIn"};
//RDFinder rdfind = n
Ok, I understand. I will use the HitCollector.
Thanks a lot for all the explanations!
Best,
Liat
2009/5/18 Erick Erickson
> As best I understand it, you DO NOT WANT A FILTER. Filters do not contribute
> to scoring, therefore do not rank your documents. If you use
> a filter, the most irrelevant do
Ok, so let me clear it up.
Lucene offers different types of Directories
(org.apache.lucene.store.Directory) into which it stores the index data.
Most people probably use the FSDirectory implementation which writes the
index data as files into the filesystem. However, we use the DbDirectory
implem