I have existing code that's like:
final Term t = /* ... */;
final Iterator i = searcher.search( new
TermQuery( t ) ).iterator();
while ( i.hasNext() ) {
final Hit hit = (Hit)i.next();
// "FILE" is the field that recorded the original file indexed
Hi Greg & Kay,
> Lucene (IndexSearchers / IndexReaders) has the notion of cache as well
> so you need to check if you really want a 100% replication of
> RAMDirectory / FSDirectory as well concurrently in memory. Have you
> tested with the FieldCache policies before moving onto the RAMDirectory
>
Have you checked out the Solr project, which provides a service on top of
Lucene plus caching / warm-up facilities similar to what you need?
The IndexReaders are expensive (and are the underlying data source for
a given IndexSearcher) in terms of time and resources, when being
opened / created an
I am trying to figure out the best way to add to a Lucene index across a
clustered app server. I cannot grab an IndexWriter for each node in the
cluster, because I would run into lock file problems. I am not sure if I can
share one IndexWriter across the cluster because what happens when two o
Hi All -
What is the best way to load a RAM Directory from a FS Directory, and
periodically reload the RAM Directory to pick up new documents?
My scenario is that I build several large directories on the
file system, then load them into RAM for faster searching.
They take seve
Hi Ravi,
Lucene can enable this, but you will have some work to do on top of
it. If you search the archives for record linkage
(http://www.lucidimagination.com/search/?q=record+linkage)
you will find a fair amount of discussion on this. Also, in
somewhat shameless marketing mode, my co-au
I just cut'n'pasted your word into Solr... it worked fine (it didn't
split the word).
Make sure you're using the latest from the trunk version of Solr...
this was fixed since 1.3
http://localhost:8983/solr/select?q=साल&debugQuery=true
[...]
साल
साल
text:साल
text:साल
-Yonik
On Tue, Jun
In response to a request for experiences with this issue, I am posting the most
important functions of the program. Every DB record maps directly to one
file. The one function I did not include is "getDataSource()", which
acquires a JDBC datasource for your database.
cheers,
Christoph
private void
Hi Robert, I tried some sample code to find the reason. The
WordDelimiterFilter uses the isLetter() method to tokenize, and for Hindi words
some parts of a word are not actually letters but are still part of the word [which
does not mean they can be used as word delimiters]; since they are not
letters is
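
The behaviour described in this message can be reproduced in plain Java, without Lucene or Solr. The sketch below checks the three code points of the Hindi word quoted earlier in the thread (U+0938, U+093E, U+0932) and shows that the middle one, a vowel sign, is a combining mark and not a letter. The class and method names are hypothetical, and treating combining marks as word characters is only one possible remedy, not necessarily the fix that went into Solr.

```java
// Hypothetical helper class for illustration; not part of Lucene or Solr.
class DevanagariLetterCheck {

    /** Returns true only if every code point in the word passes Character.isLetter. */
    static boolean allLetters(String word) {
        return word.codePoints().allMatch(Character::isLetter);
    }

    /** Treats letters and Unicode combining marks (Mn, Mc, Me) as word characters. */
    static boolean isLetterOrMark(int codePoint) {
        int type = Character.getType(codePoint);
        return Character.isLetter(codePoint)
                || type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    public static void main(String[] args) {
        // The Hindi word from the thread, written as escapes to avoid encoding issues.
        String word = "\u0938\u093E\u0932";
        word.codePoints().forEach(cp -> System.out.printf(
                "U+%04X isLetter=%b isLetterOrMark=%b%n",
                cp, Character.isLetter(cp), isLetterOrMark(cp)));
    }
}
```

A tokenizer that splits wherever isLetter() is false would cut the word at the vowel sign; a check like isLetterOrMark keeps it whole.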
Whoops, you're right. I just fixed that (made it public). Thanks for
raising this!
Mike
2009/6/8 Koji Sekiguchi :
> CHANGES.txt said that we can use HitCollectorWrapper:
>
> 12. LUCENE-1575: HitCollector is now deprecated in favor of a new
> Collector abstract class. For easy migration, people c
This doesn't exist today, but it'd be straightforward to implement
your own LockFactory that is verbose?
Mike
On Fri, Jun 5, 2009 at 1:15 PM, Newman, Billy wrote:
> I am having a problem where I am getting lock timeouts when trying to write
> to my index file. It would be nice if I could turn o
http://lucene.apache.org/java/docs/nightly/
Mike
On Mon, Jun 8, 2009 at 11:52 PM, Artyom Sokolov wrote:
> Good time of day.
>
> If I understand correctly next release will be 2.9. Where one could
> find javadocs for it? I've searched in Hudson a bit but didn't find
> anything.
>
> Thanks.
>
> ---
Note that when using rsync, you must first close the IndexWriter, else
the copy can be corrupt.
If having to close IndexWriter (and stop indexing) is a hassle, then
you should use SnapshotDeletionPolicy; it was created exactly for this
reason (to take a backup of the index even while further index
Unless you optimize it or are doing weird things with merge factors
you won't get completely new files every time you update an index.
Some files will change, or be created, or deleted, and some won't.
Then you can just copy them wherever you want using rsync or whatever
you like. We use rsync, ma
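
To illustrate why copying only the changed files works, here is a minimal sketch in plain Java of an rsync-like one-way copy between two directories. It is a stand-in for rsync, not a replacement: the class name IndexDirSync and its change test (missing, different size, or source newer than target) are assumptions for this example, and it does nothing to guard against copying while an IndexWriter is still writing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; a simplified stand-in for rsync.
class IndexDirSync {

    /** Copies new or changed files from src to dst, deletes stale files, returns copied names (sorted). */
    static List<String> sync(Path src, Path dst) throws IOException {
        Files.createDirectories(dst);
        List<String> copied = new ArrayList<>();
        Set<String> present = new HashSet<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(src)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) continue;
                String name = file.getFileName().toString();
                present.add(name);
                Path target = dst.resolve(name);
                // Changed means: missing, different size, or source modified after the copy.
                boolean changed = !Files.exists(target)
                        || Files.size(file) != Files.size(target)
                        || Files.getLastModifiedTime(file)
                                .compareTo(Files.getLastModifiedTime(target)) > 0;
                if (changed) {
                    Files.copy(file, target, StandardCopyOption.REPLACE_EXISTING);
                    copied.add(name);
                }
            }
        }
        // Collect stale files first, then delete them, so we do not delete while iterating.
        List<Path> stale = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dst)) {
            for (Path file : files) {
                if (Files.isRegularFile(file) && !present.contains(file.getFileName().toString())) {
                    stale.add(file);
                }
            }
        }
        for (Path file : stale) Files.delete(file);
        Collections.sort(copied);
        return copied;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("copied: " + sync(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

Run against an index that has only had a few documents added, a second call should copy just the new segment files and remove the ones that were deleted.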
Hello All,
I want to check the feasibility of using Lucene for a similarity check
between two flat CSV files. The actual requirement is this: we
have two files, each containing customer information such as
name, address, pin code etc. Some customers may be common to both
I construct a boolean query to search for a term in each of the fields of
the index. Once I retrieve the hits, is it possible to find out which
field matched the particular term?
For example:
I have fields A B C with data a b c.
A B C
a b a
Then I search for A:a B:a C:a and get a hit.
Can I tell wh
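
One simple way to think about the question, sketched here in plain Java without the Lucene API, is to re-check each stored field of a hit against the term after the search. The class MatchedFields and the whitespace tokenization are assumptions made for this example. In Lucene itself the same idea would more likely be done with one query per field, or by inspecting the score explanation for the hit, neither of which is shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; does not use the Lucene API.
class MatchedFields {

    /** Returns the names of the fields whose value contains the term as a whole token. */
    static List<String> fieldsMatching(Map<String, String> doc, String term) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, String> field : doc.entrySet()) {
            // Whitespace splitting stands in for whatever analyzer built the index.
            for (String token : field.getValue().split("\\s+")) {
                if (token.equals(term)) {
                    matched.add(field.getKey());
                    break;
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // The example document from the message: fields A, B, C with values a, b, a.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("A", "a");
        doc.put("B", "b");
        doc.put("C", "a");
        System.out.println(fieldsMatching(doc, "a")); // prints [A, C]
    }
}
```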