Dear All,
Does anyone have the je-analysis.jar?
If you do, could you send it to me? I don't currently have access to
download anything on my computer.
Thank you very much!
Yours truly,
Daniel
How can we use Nutch APIs in Lucene? For example, using FetchedSegments we
can get ParseText, from which we can
get the content of the document. So can we use these classes
(FetchedSegments, ParseText) in Lucene? If so, how do we use them?
Thank You
Hi
I am using NumberUtils to encode and decode numbers while indexing and
searching. When I decode a number retrieved from an index, it
throws an exception for some fields.
The exception message is:
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of
range: 1
at
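A common cause of this kind of StringIndexOutOfBoundsException is decoding a field value that was never encoded (or was encoded at a different width), so the decoder's fixed-width assumptions fail. The sketch below is illustrative only — it is not the actual NumberUtils, just a minimal zero-padded codec (the class and width are assumptions of ours) showing how a too-short raw value triggers exactly this exception:

```java
// Illustrative only: NOT the real NumberUtils, just a minimal fixed-width
// codec showing why decoding an unencoded value can blow up.
class PaddedNumberCodec {
    private static final int WIDTH = 12; // assumed width for this sketch

    // Encode a non-negative long as a fixed-width, zero-padded string so
    // that lexicographic order matches numeric order.
    static String encode(long value) {
        String s = Long.toString(value);
        StringBuilder sb = new StringBuilder(WIDTH);
        for (int i = s.length(); i < WIDTH; i++) sb.append('0');
        return sb.append(s).toString();
    }

    // Decode assumes the input has the fixed width; a shorter raw value
    // (e.g. a field that was indexed without encoding) produces a negative
    // substring index and throws StringIndexOutOfBoundsException.
    static long decode(String encoded) {
        return Long.parseLong(encoded.substring(encoded.length() - WIDTH));
    }
}
```

If some documents were indexed before the encoding was introduced, decoding their stored values fails this way; checking the raw field values in the index (e.g. with Luke) confirms it.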
Thanks for the quick response =)
On 8/1/07, Shailendra Sharma <[EMAIL PROTECTED]> wrote:
> Yes, it is easily doable through "Payload" facility. During indexing process
> (mainly tokenization), you need to push this extra information in each
> token. And then you can use BoostingTermQuery for using
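The per-token extra information is carried as raw bytes. As a minimal, Lucene-free sketch of what those payload bytes might look like, here is a float boost packed into 4 bytes (the class and method names are ours; in real code you would attach the bytes to each token during analysis and read them back at scoring time):

```java
// Minimal sketch of per-token payload bytes: a float boost packed into
// 4 bytes via its IEEE-754 bit pattern. Standalone illustration -- the
// real wiring attaches these bytes to tokens during analysis and reads
// them back when scoring with BoostingTermQuery.
class PayloadBytes {
    static byte[] encodeFloat(float boost) {
        int bits = Float.floatToIntBits(boost);
        return new byte[] {
            (byte) (bits >>> 24), (byte) (bits >>> 16),
            (byte) (bits >>> 8),  (byte) bits
        };
    }

    static float decodeFloat(byte[] payload) {
        int bits = ((payload[0] & 0xFF) << 24) | ((payload[1] & 0xFF) << 16)
                 | ((payload[2] & 0xFF) << 8)  |  (payload[3] & 0xFF);
        return Float.intBitsToFloat(bits);
    }
}
```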
It sounds like you have a fairly busy system; perhaps 100% load on the
process is not that strange, at least not during short periods of time.
A simpler solution would be to nice the process a little bit in order to
give your background jobs some more time to think.
Running a profiler is still t
Glad it worked out for you. Did you ever have any insight into what
was magical about 87,300? Although now that I re-read your mail, that
was the number of characters, so I can imagine that your corpus
averaged 8.73 characters/word.
Best
Erick
On 8/1/07, Eduardo Botelho <[EMAIL PROTECTED]>
On 1-Aug-07, at 11:34 AM, Joe Attardi wrote:
On 8/1/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
Use a SpanNearQuery with a slop of 0 and specify true for ordering.
What that will do is require that the segments you specify must
appear
in order with no gaps. You have to construct this your
Hi Erick!!
You're right, I just used setMaxFieldLength() and everything works fine.
You saved my life, thanks! (y)
On 7/30/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
>
> See IndexWriter.setMaxFieldLength(). 87,300 is odd, since the default
> max field length, last I knew, was 10,000. But this sounds li
I suspect you're going to have to deal with wildcards if you really want
this functionality.
Erick
On 8/1/07, Joe Attardi <[EMAIL PROTECTED]> wrote:
>
> On 8/1/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
> >
> > Use a SpanNearQuery with a slop of 0 and specify true for ordering.
> > What that w
On 8/1/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
>
> Use a SpanNearQuery with a slop of 0 and specify true for ordering.
> What that will do is require that the segments you specify must appear
> in order with no gaps. You have to construct this yourself since there's
> no support for SpanQueri
Think of a custom analyzer class rather than a custom query parser. The
QueryParser uses your analyzer, so it all just "comes along".
Here's the approach I'd try first, off the top of my head:
Yes, break the IP etc. up into octets and index them
tokenized.
Use a SpanNearQuery with a slop
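The octet-splitting step above can be sketched in plain Java (the class name is ours; this is the tokenization idea only, without the Lucene analyzer plumbing):

```java
// Sketch of the octet-splitting idea: turn an IPv4 address into the
// tokens you'd index, e.g. "192.168.1.100" -> ["192", "168", "1", "100"].
// In a custom analyzer these would be emitted as consecutive tokens so
// an ordered SpanNearQuery with slop 0 can match them in sequence.
class IpOctets {
    static String[] octets(String ip) {
        return ip.split("\\."); // '.' is a regex metacharacter, so escape it
    }
}
```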
Hi Erick,
> First, consider using your own analyzer and/or breaking the IP addresses
> up by substituting ' ' for '.' upon input.
Do you mean breaking the IP up into one token for each segment, like ["192",
"168", "1", "100"] ?
> But on to your question. Please post what you mean by
> "a large n
First, consider using your own analyzer and/or breaking the IP addresses
up by substituting ' ' for '.' upon input. Otherwise, you'll have endless
issues as time passes.
But on to your question. Please post what you mean by
"a large number". 10,000? 1,000,000,000? We have no clue
from your po
On 8/1/07, Ridwan Habbal <[EMAIL PROTECTED]> wrote:
>
> but what about running it in a multithreaded app, like a web application?
> Here is the code:
If you are targeting a multithreaded webapp then I strongly suggest you
look into using either Solr or the LuceneIndexAccessor code. You will want
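The core idea behind sharing one writer across request threads can be sketched generically. This is not the real LuceneIndexAccessor API — the class and names below are illustrative — but it shows the pattern: all threads funnel through a single lazily-created instance instead of each opening their own writer:

```java
import java.util.function.Supplier;

// Generic sketch of the share-one-writer pattern: every thread gets the
// same lazily-created instance. Names are illustrative, not the real
// LuceneIndexAccessor API.
class SharedWriter<W> {
    private final Supplier<W> factory;
    private W writer; // guarded by 'this'

    SharedWriter(Supplier<W> factory) {
        this.factory = factory;
    }

    // Opened once, then shared by all callers.
    synchronized W get() {
        if (writer == null) {
            writer = factory.get();
        }
        return writer;
    }
}
```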
If I'm reading this correctly, there's something a little wonky here. In
your example code, you close the IndexWriter and then, without creating
a new IndexWriter, you call addDocument again. This shouldn't be
possible (what version of Lucene are you using?)
Assuming for the time being that you ar
Hi, I got unexpected behavior while testing Lucene. To briefly describe the
problem: using IndexWriter I add docs with a field named ID in consecutive
order (1, 2, 3, 4, etc.), then close that index. I get a new IndexReader and call
IndexReader.deleteDocuments(Term). The term is simply new Term("ID
Hi again, everyone. First of all, I want to thank everyone for their
extremely helpful replies so far.
Also, I just started reading the book "Lucene in Action" last night. So far
it's an awesome book, so a big thanks to the authors.
Anyhow, on to my question. As I've mentioned in several of my pre
What is the size of the heap you are allocating for your app?
-Original Message-
From: Harini Raghavan [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 01, 2007 2:29 PM
To: java-user@lucene.apache.org
Subject: Searching with too many clauses + Out of Memory
Hi Everyone,
I am using Compass 1.
Hi,
In which field does Nutch store the content of a document
while indexing? I am using this Nutch index to search in Lucene, so I want
to know the field in which the content of the document is present.
Thank You
Chhabra, Kapil wrote:
You just have to make sure that what you are searching is indexed (and
esp. in the same format/case).
Use Luke (http://www.getopt.org/luke/) to browse through your index.
Does Luke also work with Nutch indexes?
Thanks
Michael
This might give you an insight of what you hav
Hi Everyone,
I am using Compass 1.1 M2, which supports Lucene 2.2, to store and search huge
amounts of company, executive, and employment data. There are some use cases
where I need to search for executives/employments on the result set of a
company search. But when I try to create a Compass query to sear