Hi
I'm new to searching and am trying to use Lucene to search English Arabic
documents. I've got a bunch of questions (hopefully you'll find some
interesting!) and am hoping someone's gone through some of them and has some
answers for me!
First, do I have to worry about the Arabic
Perhaps you can use a filter in the following way.
-Create a filter (via QueryFilter) that would contain all documents that
do not have null values for the field
-flip the bits of the filter so that it now contains documents that have
null values for a field
-Use the filter in conjunction with
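
A rough sketch of the "flip the bits" step above, in plain Java. It assumes the filter's bits are available as a java.util.BitSet (which is what Filter.bits(IndexReader) returns in Lucene 2.x); the class name, method name, and the sample document numbers are made up for illustration, and no Lucene classes are used so the snippet runs on its own.

```java
import java.util.BitSet;

// Sketch only: the input BitSet stands in for the bits of a filter that
// matches every document HAVING a value in the field.
class NullFieldFilterSketch {

    // docsWithValue: one bit set per document that has a value in the field.
    // maxDoc: number of document slots in the index (hypothetical value below).
    static BitSet docsMissingField(BitSet docsWithValue, int maxDoc) {
        // Clone first so a cached filter's BitSet is not modified in place.
        BitSet missing = (BitSet) docsWithValue.clone();
        // Invert bits 0 .. maxDoc-1: set bits now mark documents WITHOUT a value.
        missing.flip(0, maxDoc);
        return missing;
    }

    public static void main(String[] args) {
        BitSet hasValue = new BitSet(6);
        hasValue.set(0);
        hasValue.set(2);
        hasValue.set(3);
        // Documents 1, 4 and 5 have no value in the field.
        System.out.println(docsMissingField(hasValue, 6)); // prints {1, 4, 5}
    }
}
```

The flipped BitSet could then be wrapped in a custom Filter and passed along with the query; that wrapping is left out here because it depends on the Lucene version in use.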
On Jul 24, 2007, at 3:21 AM, Elie Choueiri wrote:
Hi
I'm new to searching and am trying to use Lucene to search English
Arabic
documents. I've got a bunch of questions (hopefully you'll find some
interesting!) and am hoping someone's gone through some of them and
has some
answers
Would it be more efficient to create an additional inverted field where I
assign a value to that field only when the field I would like to search is
NULL?
daniel rosher wrote:
Perhaps you can use a filter in the following way.
-Create a filter (via QueryFilter) that would contain all
On 7/24/07, daniel rosher [EMAIL PROTECTED] wrote:
Perhaps you can use a filter in the following way.
-Create a filter (via QueryFilter) that would contain all documents that
do not have null values for the field
-flip the bits of the filter so that it now contains documents that have
null
Hello all,
I'm using Solr in an app, but I'm getting an error that might be a Lucene
problem. When I perform a simple query like q = brasil I'm getting this
exception:
java.lang.ArrayIndexOutOfBoundsException: 1226511
at org.apache.lucene.search.TermScorer.score(TermScorer.java:74)
at
You'll also find lots of discussion about indexing multiple
languages if you search the mail archive for things like multiple
language.
I think one thing you're missing is that Lucene indexes data however
you tell it to. You have both total control over and total responsibility
for how things
Nobody can answer that question, you have to test in your particular
situation. Filters are very efficient to use once created, can be created
once and used often, etc.
Adding a special value to stand for an empty field is conceptually
simple, and queries are straightforward.
Unless you can
I don't know the exact date of the build, but it is certainly before July 4,
and before the LUCENE-843 patch was committed. My index has 1,119,934 docs
in it and is about 8.2G.
I really don't know how to reproduce this; the only query that gives me this
error, so far, is brasil... and I don't know
daniel rosher wrote:
Perhaps you can use a filter in the following way.
-Create a filter (via QueryFilter) that would contain all documents that
do not have null values for the field
Interesting: what does the QueryFilter look like? Isn't it just as hard
as finding out what docs have the null
I figured out the problem. The issue had nothing to do with Lucene 2.2. I
had accidentally reset the default mergeFactor to 1000. This was the reason
it was not merging the segments. With the default mergeFactor, the indexing
is working perfectly fine.
Thanks,
Harini
On 7/24/07, Michael
I did a little debugging and found that in the TermScorer, the byte[] norms has
size = 1,119,933, which is the number of docs in my index, and there is a
docID = 1226511, that is, if the doc variable in the method is the docID.
I tried to access this document with reader.document() and got a *
On 7/24/07, Rafael Rossini [EMAIL PROTECTED] wrote:
I did a little debugging and found that in the TermScorer, the byte[] norms has
size = 1,119,933, which is the number of docs in my index, and there is a
docID = 1226511, that is, if the doc variable in the method is the docID.
I tried to access this
Got it,
I don't have a clue whether this corruption was caused by hardware failure,
but that is possible because we suffer a lot of power failures from
time to time. But the thing is that I've been using Lucene for a long time
and I never got this kind of exception.
The thing is that I'd
Hey Guys,
I just finished up using Lucene in my application. I have data in a database,
so while indexing I extract this data from the database and pump it into
the index. Specifically, I have the following data in the index:
itemID tags title summary contents
where itemID is just a number
Hey Guys,
From what I understand, FieldCache is used to store only the field required
for search. I am using a Document object and then using doc.get(item). One
of my fields is HUGE, so using Document will slow things down.
How can I use FieldCache? Is there an example?
thanks,
AZ
Hi, guys,
I found Analyzers for Japanese, Korean and Chinese, but not stemmers;
the Snowball stemmers only include European languages. Does stemming
not make sense for ideograph-based languages (i.e., no stemming is
needed for Japanese, Korean and Chinese)?
Also for spell checking, does the
Where are you getting your numbers from? That is, where are your
timers? Are you timing the rs.next() loop, or the individual calls
to Lucene? What do the getX methods look like? How big are your
queries? How big is your index?
Essentially, we need more info to really help you.
Thanks for the reply.
I am timing the entire search process with a stop watch, a bit ghetto style.
My getXXX methods are:
Document doc = hits.doc(i);
String str = doc.get(item);
So you can see that I am retrieving the entire document in a search query.
Ideally, I'd like to just retrieve the
Can someone please tell me how to cache results in Lucene? I know the
classes, but I don't know how to go about it.
thanks,
Askar
On 7/24/07, Askar Zaidi [EMAIL PROTECTED] wrote:
Thanks for the reply.
I am timing the entire search process with a stop watch, a bit ghetto
style. My getXXX
Sorry, I mistyped. I don't mean the get methods, I mean the
doTagSearch, doTitleSearch, etc.
As for the stop watch, not really sure what to make of that... Try
System.currentTimeMillis()...
You can get just the fields you want when loading a Document by using
the FieldSelector API
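
For the System.currentTimeMillis() suggestion above, a minimal sketch of timing one step at a time instead of the whole search with a stopwatch. The class and helper names are invented for illustration, and the timed body is a placeholder; in the application it would be a single call such as doTagSearch or doTitleSearch.

```java
// Sketch only: wraps any step in a wall-clock timer based on
// System.currentTimeMillis(), so each search call can be measured separately.
class SearchTimingSketch {

    // Runs the step once and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable step) {
        long start = System.currentTimeMillis();
        step.run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        // Placeholder work standing in for one search call.
        long elapsed = timeMillis(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
        });
        System.out.println("step took " + elapsed + " ms");
    }
}
```

Timing each call separately should show which one dominates, rather than only the total.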
I ran some tests and it seems that the slowness is from the Lucene calls when I
do doBodySearch. If I remove that call, Lucene gives me results in 5
seconds; otherwise it takes about 50 seconds.
But I need to do the body search, and that field contains lots of text. The field
is contents. How can I
Could you show us the relevant source from doBodySearch()?
-h
On Tue, 2007-07-24 at 19:58 -0400, Askar Zaidi wrote:
I ran some tests and it seems that the slowness is from Lucene calls when I
do doBodySearch, if I remove that call, Lucene gives me results in 5
seconds. otherwise it takes
Sure.
public float doBodySearch(Searcher searcher, String query, int id) {
    try {
        score = search(searcher, query, id);
    }
    catch (IOException io) {}
    catch (ParseException pe) {}
Are you sure you are using the same Searcher for every search? Don't
open a new one unless you have modified the index. You are iterating
over every hit with the Hits class. You don't ever want to do this. Use
a HitCollector if you want to iterate over more than a hundred or so
hits. You will
Inline below
On Jul 24, 2007, at 8:14 PM, Askar Zaidi wrote:
Sure.
public float doBodySearch(Searcher searcher,String query, int id){
try{
score = search(searcher, query,id);
}
catch(IOException
I'm no expert on this (so please accept the comments in that context)
but 2 things seem weird to me:
1. Iterating over each hit is an expensive proposition. I've often
seen people recommending a HitCollector.
2. It seems that doBodySearch() is essentially saying, do this search
and return the
I'm trying to get some relatively old Lucene code to compile (please see
below), and it appears that Field.Text has been deprecated. Can someone please
suggest what I should use in its place?
Thank you.
Lindsey
public static void main(String args[]) throws Exception
Hey Hira,
Thanks so much for the reply. Much appreciated.
Quote:
Would it be possible to just include a query clause?
- i.e., instead of just contents:userQuery, also add
+id:idWeCareAbout
How can I do that?
I see my query as:
+contents:harvard +contents:business +contents:review
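
One possible reading of the quoted suggestion, sketched as plain string handling: wrap the existing query and append a required id clause before handing the string to the query parser. The helper name and the sample id 42 are hypothetical, the field name id is taken from the quote (the index described earlier calls it itemID), and this assumes that field is indexed as a single untokenized term.

```java
// Sketch only: builds a query string that requires both the user's query
// and one specific document id. No Lucene classes are involved.
class IdClauseSketch {

    // userQuery: the query text as already built, e.g. "+contents:harvard ..."
    // idField:   name of the indexed id field (assumed, see note above)
    // id:        the id of the single item to score
    static String withRequiredId(String userQuery, String idField, int id) {
        // Parentheses keep the user's clauses grouped; the leading + on each
        // part makes both the group and the id clause mandatory.
        return "+(" + userQuery + ") +" + idField + ":" + id;
    }

    public static void main(String[] args) {
        String q = "+contents:harvard +contents:business +contents:review";
        System.out.println(withRequiredId(q, "id", 42));
        // prints +(+contents:harvard +contents:business +contents:review) +id:42
    }
}
```

With such a clause the search can only match the one item, so there would be no need to walk every hit looking for it.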
Please reference "How do I get code written for Lucene 1.4.x to work with
Lucene 2.x?"
http://wiki.apache.org/lucene-java/LuceneFAQ#head-86d479476c63a2579e867b75d4faa9664ef6cf4d
Andy
-----Original Message-----
From: Lindsey Hess [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 25, 2007 12:31 PM