Hi guys,
Can I open Outlook Express mail (dbx files) in Java and index the mails using
Lucene?
Is anybody aware of such a dbx reader?
I tried one, "mstos", but it's not working properly...
- Bhavin pandya
Thanks a lot Daniel :)
Regards,
Maureen
Daniel Noll <[EMAIL PROTECTED]> wrote:
maureen tanuwidjaja wrote:
> Before implementing this search engine, I designed the index
> in such a way that every XML tag is converted to a binary
> value, in order to reduce the size ind
Thanks Doron, you are right, I'm performing a delete operation.
Doron Cohen <[EMAIL PROTECTED]> wrote:
Hi Mukesh,
Are you by any chance deleting docs in that loop, using
the same reader as the one used by the searcher?
If so, using a separate reader for delete would fix that.
Also see related disc
On 25 Jan 2007, at 10:15, Bhavin Pandya wrote:
Can I open Outlook Express mail (dbx files) in Java and index the
mails using Lucene?
Not out of the box, sorry.
But I think Outlook express files are nothing but OLE2 documents. If
that is true, then you could use the POIFS part of Jakarta P
Hi,
I am indexing thousands of XML documents, and it stops after indexing for
about 7 hrs
...
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37003.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37004.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37008.xml
Indexing C:\swee
Btw Daniel, can you please give me a reference that explains
SegmentTermEnum/FieldInfos, if one exists? I searched, but the best I could find is
http://lucene.apache.org/java/docs/clover/org/apache/lucene/index/SegmentTermEnum.html
which is the source code only...
Many thanks and B
maureen tanuwidjaja wrote:
I am indexing thousands of XML documents, and it stops after indexing for
about 7 hrs
...
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37027.xml
java.io.IOException: Lock obtain timed out: [EMAIL
PROTECTED]:\sweetpea\dual_index\DI\write.lock
java.lang
Hello,
I am new to Lucene and I tried to make a simple search engine by following the
book 'Portal Development with Open Source Tools', but I cannot get it to work.
Can somebody give me, or direct me to, a simple search code example for Jetspeed 2?
Regards,
Erik
Hey,
thank you for all your help, I think I have fixed the problem.
Unfortunately I needed 10 hours to find the error (1 line of code) ;-)
I forgot to close the IndexSearcher after an index query, so, I think, lots of
IndexSearcher instances (one for every query) were held in memory. Thi
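A minimal sketch of the kind of leak described above, assuming nothing about the poster's actual code. CountingSearcher and SearcherLeakDemo are hypothetical stand-ins, not Lucene classes; they only count how many instances are still open. With Lucene itself the corresponding fix would be to call close() on the IndexSearcher in a finally block, or to share one searcher across queries.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a searcher that holds resources until closed.
// It is NOT Lucene's IndexSearcher; it only counts open instances.
class CountingSearcher implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();

    CountingSearcher() {
        OPEN.incrementAndGet();
    }

    String search(String query) {
        return "results for " + query;
    }

    @Override
    public void close() {
        OPEN.decrementAndGet();
    }
}

class SearcherLeakDemo {
    // Leaky variant: a new searcher per query that is never closed.
    static String leakySearch(String query) {
        CountingSearcher searcher = new CountingSearcher();
        return searcher.search(query);
    }

    // Safe variant: the searcher is closed even if search() throws.
    static String safeSearch(String query) {
        try (CountingSearcher searcher = new CountingSearcher()) {
            return searcher.search(query);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            safeSearch("q" + i);
        }
        System.out.println("open after safe queries:  " + CountingSearcher.OPEN.get());
        for (int i = 0; i < 1000; i++) {
            leakySearch("q" + i);
        }
        System.out.println("open after leaky queries: " + CountingSearcher.OPEN.get());
    }
}
```

The leaky variant leaves one instance open per query, which is the pattern that slowly eats memory in a long-running application.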
One way to mitigate the cost of this kind of thing is to create a series of
indexes on portions of your corpus and then merge them. Say you have 10,000
documents. Create 10 separate indexes of 1,000 documents each then use
IndexWriter.addIndexes to make them all into a single index.
This pre-supp
Hi Mike, thanks for the reply...
1. Here is the class that I use for indexing..
package edu.ntu.ce.maureen.index;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.
Don't do it that way. You're opening and closing your IndexWriter for each
document, which is extremely wasteful. And given locking has been a source
of much discussion on this list, it's not clear that locking will withstand
this kind of hammering. You want to do something like
IndexWriter writ
Erick Erickson wrote:
Don't do it that way. You're opening and closing your IndexWriter for each
document, which is extremely wasteful. And given locking has been a source
of much discussion on this list, it's not clear that locking will withstand
this kind of hammering. You want to do something
Thank you for your reply,
There is not much help in the Regain community,
But I can see that when I type e.g.
title:logistics
I get a score of about 0.70
also headlines:logistics 0.70
But when I type logistics I get 0.02
I do not understand, since I added this word as title and headlines and I
need a higher sc
Hi Erick and Mike
Really thanks a lot for the advice... =)
I will fix my code..I'll let you guys know if any problem arises.
Many thanks and best regards ^ ^
MauReen
Michael McCandless <[EMAIL PROTECTED]> wrote: Erick Erickson wrote:
> Don't do it that way. You're ope
It seems that the suggestions returned by SpellChecker::suggestSimilar
(queryText, num_sug, reader, field, bool) are randomly chosen, then
sorted. By altering num_sug (10, 5, 3,2,1), I received the following
suggestions for "gnetics":
suggestion0: genetics
suggestion1: ginetics
suggestion2:
Hi all,
Which type of query should I use for the following kind of search?
I have multiple words/phrases. I want to run a search for them all OR'd
together. But I want the documents with the most distinct matches to have
the highest score.
An example. I want to search for "TOM OR DICK OR HARRY
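To make the desired ordering concrete, here is a small stdlib-only sketch; it is not Lucene code, and DistinctMatchRanker with its methods is hypothetical. It ranks documents by how many distinct query terms they contain, ignoring how often each term repeats. As far as I know, a Lucene BooleanQuery made of optional (SHOULD) clauses already leans this way, because the default Similarity applies a coordination factor that rewards documents matching more of the clauses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-alone illustration; it does not use Lucene.
class DistinctMatchRanker {

    // Number of distinct query terms that occur at least once in the text.
    static int distinctMatches(String text, List<String> queryTerms) {
        Set<String> tokens = new HashSet<>(
                Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\W+")));
        // Lower-case and de-duplicate the query terms before counting.
        Set<String> terms = new HashSet<>();
        for (String term : queryTerms) {
            terms.add(term.toLowerCase(Locale.ROOT));
        }
        int matches = 0;
        for (String term : terms) {
            if (tokens.contains(term)) {
                matches++;
            }
        }
        return matches;
    }

    // Documents ordered by descending distinct-match count (stable sort).
    static List<String> rank(List<String> docs, List<String> queryTerms) {
        List<String> ranked = new ArrayList<>(docs);
        ranked.sort((a, b) -> Integer.compare(
                distinctMatches(b, queryTerms), distinctMatches(a, queryTerms)));
        return ranked;
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("tom", "dick", "harry");
        List<String> docs = Arrays.asList(
                "Tom Tom Tom Tom",
                "Tom and Dick",
                "Harry met Dick and Tom");
        for (String doc : rank(docs, query)) {
            System.out.println(distinctMatches(doc, query) + "  " + doc);
        }
    }
}
```

A document that repeats one name many times ranks below a document that mentions all three once, which is the behaviour the question asks for.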
Is there an analyzer that can work with XML? Any suggestions for such?
-arturo
On 25 Jan 2007, at 20:43, Ryan O'Hara wrote:
Is there any way to sort the suggestions beforehand, so that grabbing
only one suggestion would give you the best suggestion, in this
case "genetics"?
Without having looked at the code for a long time, I think the
problem is what the lucene scoring cons
Hi Maureen,
Some relevant info in the file formats doc -
http://lucene.apache.org/java/docs/fileformats.html
Regards,
Doron
maureen tanuwidjaja <[EMAIL PROTECTED]> wrote on 25/01/2007
01:31:25:
> btw Daniel,can please give me the reference to find the explanation
> about SegmentTermEnum/Field I
But, locking should be fine even for this "hammering" use case (and if
it's not, that's a bug, and I'd really like to know about it!).
I have hammered over 2.5 million 5-10k docs into an index this way (a
realtime system that I had not yet added a special load call to) and had 0
problems. On
It's just a google query away :)
http://www.google.com/search?hl=de&q=Lucene+XML+analyze&btnG=Google-Suche&meta=
best regards simon
On 1/25/07, Arturo Perez <[EMAIL PROTECTED]> wrote:
Is there an analyzer that can work with XML? Any suggestions for such?
-arturo
--
Hi Nick,
Have you managed to solve/recreate this issue?
There has been recent progress on index corruption issues:
http://issues.apache.org/jira/browse/LUCENE-140
http://issues.apache.org/jira/browse/LUCENE-784
In those cases an application created FSDirectory with create=false
and created
In article
<[EMAIL PROTECTED]>,
"Simon Willnauer" <[EMAIL PROTECTED]> wrote:
> http://www.google.com/search?hl=de&q=Lucene+XML+analyze&btnG=Google-Suche&meta=
Yeah, I'd seen that. I was hoping for something a bit more tightly
integrated than Digester. More specifically, I already parse my
Hi
I was asking what exactly the inverted indexing strategy used for storing
the index is. Is the index data structure batch-based, B-tree based, or
segment-based?
On 1/25/07, Rajiv Roopan <[EMAIL PROTECTED]> wrote:
http://lucene.apache.org/java/docs/api/org/
I have used Lucene to index a small collection - only a few hundred
documents. I have a potential client who wants to index a collection
which will start at about a million documents and could easily grow
to two million.
Has anyone used Lucene with an index that large?
Thank you very much
: For various reasons, we'd like to eliminate the sort step.
can you elaborate on what those reasons are?
FunctionQuery (in the Solr code base, you'll find lots of discussion in
the archives of this list) can let you use a numeric field value in the
score calculation, but it still uses the Field
Thanks Doron =)
Regards,
Maureen
Doron Cohen <[EMAIL PROTECTED]> wrote: Hi Maureen,
Some relevant info in the file formats doc -
http://lucene.apache.org/java/docs/fileformats.html
Regards,
Doron
maureen tanuwidjaja wrote on 25/01/2007
01:31:25:
> btw Daniel,can please give me the ref