Erik,
Just to make it clear for me: the latest Lucene version supports
interior +/-.
So, what would the query look like that deals with "BOOK NAME" as a
keyword field, when I want to find a C++ tutorial?
Thanks in advance
Alex Kiselevski
Development Expert, Amdocs Advanced Technologies
+9.729.776.4346 (desk)
Barry,
You may also want to consider PostgreSQL for a few reasons: 1) it's
historically known to work well for geo-spatial data, 2) has
GIS/geo-spatial data types and such, and 3) it seems that the new
versions let you embed Java directly into the database (perhaps
something like Oracle's Java-emb
Does Lucene optimize range queries that use Sort and/or limit the number
of hits?
My situation: I have a listing of 2 million cities, with the name,
latitude, longitude, and population of each city. I want to efficiently
find the 50 most populous cities between (for example) latitudes 35.2 and
41
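Lucene range queries compare terms lexicographically, so one common trick for numeric fields like latitude is to index a fixed-width, zero-padded term whose string order matches numeric order. A minimal sketch (class and method names are hypothetical, not a Lucene API):

```java
// Encode a latitude (-90..90) as a fixed-width term so that
// lexicographic (term) order matches numeric order.
public class LatitudeTerm {
    // Shift by +90 to remove the sign, keep 4 decimal places,
    // and left-pad to 8 digits: -90.0 -> "00000000", 41.0 -> "01310000".
    public static String encode(double latitude) {
        long scaled = Math.round((latitude + 90.0) * 10000);
        return String.format("%08d", scaled);
    }
}
```

A RangeQuery between encode(35.2) and encode(41.0) would then select the latitude band, and a Sort on a similarly padded population field could pick the top 50.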
Chris,
How about indexing the domain as one field and each part of the path
as separate terms in another field? I'm sure you've probably already
thought of doing this... and maybe discarded the idea because you'd
lose the position information. However, even though you can't just
simply
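As a rough sketch of that indexing scheme (names here are made up for illustration, not an existing Lucene helper), the domain and the individual path segments could be extracted like this before being added as separate fields/terms:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Split a URL into a host value (one field) and its path segments
// (separate terms in another field).
public class UrlTokens {
    public static String domainOf(String url) {
        return URI.create(url).getHost();
    }

    public static List<String> pathSegmentsOf(String url) {
        List<String> segments = new ArrayList<>();
        String path = URI.create(url).getPath();
        if (path != null) {
            for (String part : path.split("/")) {
                if (!part.isEmpty()) segments.add(part); // skip empty leading segment
            }
        }
        return segments;
    }
}
```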
What version of Lucene are you using? There was a change that
helped with that situation such that interior +/- was not considered
an operator. That change is in the 1.4 versions - might you be
running a previous version of Lucene?
Erik
On Jul 27, 2005, at 6:42 PM, Derek Westfa
On Jul 27, 2005, at 4:56 PM, Chris May wrote:
Always domain + part of a path e.g.
url:http://blogs.warwick.ac.uk/chrismay/*
or
url:http://www2.warwick.ac.uk/fac/soc/law/ug/prospective/degrees/
modules/commonlaw/*
or
url:http://www2.warwick.ac.uk/services/its/*
... and so on. Part of th
Hi,
I believe your problem is described on page 121 in the Lucene book:
http://www.lucenebook.com/search?query=%22dealing+with+keyword+fields%22
The solution for you may be to write your own Analyzer that knows how
to correctly tokenize or not tokenize certain fields in your index.
Using PerFiel
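The per-field idea can be illustrated without the real Lucene classes. This toy sketch (hypothetical names, not the actual PerFieldAnalyzerWrapper API) keeps designated keyword fields as a single term while splitting everything else:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Per-field tokenization sketch: keyword fields pass through whole,
// other fields are lowercased and split on whitespace.
public class PerFieldTokenizerSketch {
    public static List<String> tokenize(Map<String, Boolean> keywordFields,
                                        String field, String text) {
        if (Boolean.TRUE.equals(keywordFields.get(field))) {
            return Arrays.asList(text); // keep the whole value as one term
        }
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }
}
```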
Hm, not sure why you're emailing [EMAIL PROTECTED]; [EMAIL PROTECTED]
may be better. Here are 2 ancient classes from 2003 that I once used
to normalize URLs, to help me identify URL duplicates. This may get
stripped on its way to the list.
Otis
--- Chris Fraschetti <[EMAIL PROTECTED]> wrote:
I have a field indexed as Keyword, and have two records "J400-C-V1-S10-T1" and
"J400-C-V-S10-T1"
When I search for "J400-C-V1-S10-T1", it returns me matching record, but
when I Search for "J400-C-V-S10-T1" it doesn't return the matching one.
Further I found that "J400-C-V-S10-T1" is incorrectly t
Writing simple code to trim down a URL is trivial, but to actually
trim it down to its most meaningful state is very hard. In some cases
the URL parameters actually define the page; in others they are useless
babble. I'd like to use the hash of a page's URL as well as a hash of
the content data to h
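A minimal normalizer along those lines might do only the safe parts (lowercase the scheme and host, drop the fragment and any default port) and deliberately leave query parameters alone, since deciding which parameters matter is the hard part. The class name is hypothetical:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Conservative URL normalizer for duplicate detection.
public class UrlNormalizer {
    public static String normalize(String url) {
        URI u = URI.create(url);
        String scheme = u.getScheme().toLowerCase();
        String host = u.getHost().toLowerCase();
        int port = u.getPort();
        if (("http".equals(scheme) && port == 80)
                || ("https".equals(scheme) && port == 443)) {
            port = -1; // -1 means "no explicit port" in java.net.URI
        }
        String path = (u.getPath() == null || u.getPath().isEmpty()) ? "/" : u.getPath();
        try {
            // Rebuild without the fragment; keep the query untouched.
            return new URI(scheme, null, host, port, path, u.getQuery(), null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```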
Ah - my brain was off. :)
In the Lucene book we refer to that index format as "compound index
format", while the original format we call "multifile index format"
http://www.lucenebook.com/search?query=compound+index
http://www.lucenebook.com/search?query=multifile+index
Yes, the latter will g
Is there a way to allow users to use + and - and special operators in
free-text searches, but also allow them to search for a last name like
Smith-Jones? (which I'd have to escape?)
Is there a regular expression to determine/fix this kind of user input
so it is queryparser-legal?
Ie they can't ju
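One heuristic (a sketch, not a complete QueryParser sanitizer) is to escape a hyphen only when it is embedded inside a word, so a leading "-" can still act as the NOT operator while a name like Smith-Jones survives:

```java
// Escape interior hyphens only: a hyphen with word characters on both
// sides ((?<=\w) and (?=\w)) gets a backslash; leading "-" is untouched.
public class QueryEscaper {
    public static String escapeInteriorHyphens(String input) {
        return input.replaceAll("(?<=\\w)-(?=\\w)", "\\\\-");
    }
}
```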
Thanks for the reminder, Otis.
I haven't done any more on this since this post:
http://archives.devshed.com/a/ml/200501-114586/lucene-query-sql-kind
The scalability concerns with the user-defined-functions I created
prevented me from taking it any further. A proper solution would need a
tight
My apologies Otis, I should have spelled that out.
I'm going to take a stab at answering this. But please, others on the list,
chime in with corrections / clarifications.
CFS = "compound file system" -- Lucene's compound index format, in
which a segment's files are bundled into a single .cfs file.
Essentially, each Lucene index segment is a
Option 1) will most likely give you more, but there are a number of
other things you could do before going for monster hardware. Splitting
the index, more than 1 disk, ParallelIndexReader, the patch that splits
index files into a number of data files, etc.
Otis
--- Michael Celona <[EMAIL PROTEC
What's CFS? Cryptographic File System? I'm not being sarcastic here,
I'm really curious about what you're referring to.
Otis
--- Mark Bennett <[EMAIL PROTECTED]> wrote:
> Also, non-hardware, have you considered turning off CFS?
>
> Our client told us this sped up their system.
>
> -Original
--- Mag Gam <[EMAIL PROTECTED]> wrote:
> Anyone here have any luck with integration of Apache Derby and
> Lucene?
I believe Mark Harwood has done some experiments with Lucene and
Derby... here:
http://www.google.com/search?q=Lucene+derby+harwood
Otis
Some changes were made to MultiSearcher version that is in the SVN
repository. Which version of Lucene are you using, and can you provide
an index and a query that cause this exception?
Otis
--- Daniel Cortes <[EMAIL PROTECTED]> wrote:
> I don't know why, but all this problems that I shared wi
Always domain + part of a path e.g.
url:http://blogs.warwick.ac.uk/chrismay/*
or
url:http://www2.warwick.ac.uk/fac/soc/law/ug/prospective/degrees/
modules/commonlaw/*
or
url:http://www2.warwick.ac.uk/services/its/*
... and so on. Part of the problem is that we may need to go an
arbitrar
I am retrieving the documents using "hits.doc(i)". I put in some timing
output. Here are the results:
Before Search 1122497423976
After Search 1122497426795
After Build 1122497426839 (after I retrieve 10 results from hits )
What is CFS?
Thanks,
Michael
-Original Message-
From: M
Hi,
> > I've added the Lucene mailing lists to our searchable archive found
> > here:
> >
> > http://www.gossamer-threads.com/lists/lucene/
> >
> > The search is, of course, powered by Lucene. =) I hope you find it
> > useful, and thanks for the great work! If you have any questions or
>
I posted the code I use to do this (based on a single index) here:
http://marc.theaimsgroup.com/?l=lucene-dev&m=111044178212335&w=2
Cheers
Mark
On Jul 27, 2005, at 4:32 PM, Scott Ganyo wrote:
Actually, I believe the correct answer is an empty result set.
Oops I really screwed that one up! :O
You're absolutely right, my apologies.
Erik
On Jul 27, 2005, at 12:14 PM, Erik Hatcher wrote:
On Jul 27, 2005, at 12:40 PM, Pete
Could you give some examples of the types of PrefixQuery's you'd like
to use? Is it always at a granularity of domain and path? Or do
you want to match prefixes of pieces of the domain and path?
Erik
On Jul 27, 2005, at 3:47 PM, Chris May wrote:
First, apologies for what seems to be s
Actually, I believe the correct answer is an empty result set.
On Jul 27, 2005, at 12:14 PM, Erik Hatcher wrote:
On Jul 27, 2005, at 12:40 PM, Peter Gelderbloem wrote:
I wonder what would happen
An exception :)
Peter Gelderbloem -Original Message-
From: Erik Hatcher [mailto:[EM
I'm working on a problem where I need to search over 160 million
documents. I know Lucene can do this no sweat; my problem is that these
documents are grouped in more then 500 categories. I need to get a
count of documents that match a given query, within each category.
There is no need for scori
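Since scoring isn't needed, one common approach (sketched here with plain java.util.BitSet; wiring it to a Lucene HitCollector is omitted) is to cache a bit set of document ids per category and intersect it with the query's hits:

```java
import java.util.BitSet;

// Count how many of a query's hits fall in a category by intersecting
// the query's doc-id bit set with the category's cached bit set.
public class CategoryCounter {
    public static int countInCategory(BitSet queryHits, BitSet categoryDocs) {
        BitSet intersection = (BitSet) queryHits.clone();
        intersection.and(categoryDocs);
        return intersection.cardinality();
    }
}
```

With 500 cached category bit sets, one query pass plus 500 cheap intersections gives all the per-category counts.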
First, apologies for what seems to be something of an FAQ.
However, I've not been able to find an answer either in LIA or in the
relevant section of the FAQ (http://wiki.apache.org/jakarta-lucene/
LuceneFAQ#head-06fafb5d19e786a50fb3dfb8821a6af9f37aa831)
My setup is as follows: I have an inde
Also, non-hardware, have you considered turning off CFS?
Our client told us this sped up their system.
-Original Message-
From: Chris Lamprecht [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 27, 2005 11:52 AM
To: java-user@lucene.apache.org
Subject: Re: Hardware Question
It depends on
It depends on your usage. When you search, does your code also
retrieve the docs (using Searcher.document(n), for instance). If your
index is 8GB, part of that is the "indexed" part (searchable), and
part is just "stored" document fields.
It may be as simple as adding more RAM (try 4, 6, and 8G
I don't know why, but all these problems that I shared with you are
caused by my use of MultiSearcher.
Now I've got this one; has anyone run into this problem before?
java.lang.IndexOutOfBoundsException: Index: 43, Size: 12
at java.util.ArrayList.RangeCheck(ArrayList.java:507)
at java
I am going over ways to increase overall search performance.
Currently, I have a dual Xeon with 2GB of RAM dedicated to Java searching an
8G index on one 7200 rpm drive.
Which will give the greatest payoff?
1) Going to 64bit server and giving more memory to java with faster
drive
On Jul 27, 2005, at 12:40 PM, Peter Gelderbloem wrote:
I wonder what would happen
An exception :)
Peter Gelderbloem -Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 27 July 2005 17:36
To: java-user@lucene.apache.org
Subject: Re: Quick newbie question
On Jul 27,
I wonder what would happen
Peter Gelderbloem -Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 27 July 2005 17:36
To: java-user@lucene.apache.org
Subject: Re: Quick newbie question
On Jul 27, 2005, at 12:22 PM, Andrew Boyd wrote:
> Of course you can do the inverse o
On Jul 27, 2005, at 12:22 PM, Andrew Boyd wrote:
Of course you can do the inverse of what Erik said.
That is search for a term that you know is not in the index and use
the NOT operator.
Ummm... no you can't. A purely negative query is not allowed with
Lucene.
Erik
Andrew
---
Of course you can do the inverse of what Erik said.
That is search for a term that you know is not in the index and use the NOT
operator.
Andrew
-Original Message-
From: Erik Hatcher <[EMAIL PROTECTED]>
Sent: Jul 27, 2005 10:49 AM
To: java-user@lucene.apache.org
Subject: Re: Quick newb
On Jul 27, 2005, at 11:07 AM, Federico Tonioni wrote:
Hi all!
I have just a simple question
How can I retrieve all documents in an index by using QueryParser?
I thought
Query query = QueryParser.parse("*", "contents",
new StandardAnalyzer());
might be the solution, but it
Hi all!
I have just a simple question
How can I retrieve all documents in an index by using QueryParser?
I thought
Query query = QueryParser.parse("*", "contents",
new StandardAnalyzer());
might be the solution, but it's not:)
thanks in advance
fede
Anyone here have any luck with integration of Apache Derby and Lucene?