We are having a problem running searches on an index after upgrading to
2.4 and using the new Field.setOmitTf() function. The index size has
been dramatically reduced and even the search performance is better. But
searches do not return any results if searching for something that has a
space in it.
Yonik Seeley wrote:
> On Wed, Mar 11, 2009 at 2:35 PM, Michael McCandless
> wrote:
>
>> This is expected: phrase searches will not work when you omitTf.
>>
>
> But why would a phrase query be created? The code given looks like it
> should create a boolean query with two terms.
>
> Of cour
Is there an ETA for the Lucene 2.9 release?
-siraj
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Try doing a single-word search, instead of a phrase. I once had a
similar problem when I indexed using Field.setOmitTf(true), which removed
all the positional information from the index, which is required to do
phrase searches.
-siraj
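The behavior described in these two posts can be sketched against the Lucene 2.4 API (the field name and text are made up for illustration; this is an untested sketch, not a drop-in fix):

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class OmitTfExample {
    // Lucene 2.4 sketch: omitting tf also drops positional information.
    public static Document makeDoc(String text) {
        Document doc = new Document();
        Field body = new Field("body", text,
                Field.Store.NO, Field.Index.ANALYZED);
        body.setOmitTf(true); // smaller index, faster scoring, no positions
        doc.add(body);
        return doc;
    }
    // After indexing such documents:
    //   body:quick                -> can match (single term, no positions needed)
    //   +body:quick +body:brown   -> can match (BooleanQuery of terms)
    //   body:"quick brown"        -> no hits (PhraseQuery needs positions)
}
```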
Erick Erickson wrote:
The first thing I'd do is get a copy of
Hi,
I have a Date field in my Lucene index that is currently stored as a
String field with format: MMDDHHMISS. I perform a RangeFilter on it
when searching and also sort the results, specifying it as a String
field. My question is, will converting it to a Numeric field and starting
to use Numeri
We have an index with a number field indexed as a String field. We do
range searches as well as sorting on this field. Now we want to take
advantage of the NumericField. The question is, will I have to re-index
all the documents, or will just adding new documents with NumericField
be enough to
Index optimization fails if we don't have enough space on the drive and
leaves the hard drive almost full. Is there a way not to even start
optimization if we don't have enough space on drive?
regards
-siraj
estimate the maximum size used during optimization at 2.5 (a
sort of rough maximum) times your current index size, and not optimize
if your index (at 2.5 times) would exceed your allowable disk space.
Jason
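Jason's rule of thumb above can be enforced up front with plain java.io before calling optimize(). A sketch; the 2.5 factor is this thread's rough estimate, not a Lucene guarantee, and it assumes the index lives in a single directory on the drive being checked.

```java
import java.io.File;

public class OptimizeGuard {
    /**
     * Rough pre-check: optimizing can transiently need ~2.5x the current
     * index size on the same drive. Returns true only if the free space
     * on the index partition exceeds that estimate.
     */
    public static boolean safeToOptimize(File indexDir) {
        long indexBytes = 0;
        File[] files = indexDir.listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.isFile()) {
                    indexBytes += f.length(); // sum current index files
                }
            }
        }
        long estimatedPeak = (long) (indexBytes * 2.5);
        return indexDir.getUsableSpace() > estimatedPeak;
    }
}
```

If safeToOptimize(indexDir) returns false, skip the optimize call and retry after freeing disk space.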
On Mon, Nov 30, 2009 at 2:50 PM, Siraj Haider wrote:
Index optimization fails if we
Hello guys,
We have a dilemma on a few of our Lucene machines. We have a Tomcat
running our servlets for searching and indexing on each of these
machines. It's a live index where documents are being added to the index
while online searches are also being served at the same time. Indexing
happens
running?
You can turn on IndexWriter.setInfoStream to see more details about
what IW is doing, including merging.
Mike
On Tue, Dec 22, 2009 at 5:19 PM, Siraj Haider wrote:
Hello guys,
We have a dilemma on a few of our lucene machines. We have a tomcat running
our servlets for searching and
rNewGC. You may also read Mark Miller's blog post:
http://www.lucidimagination.com/blog/2009/09/19/java-garbage-collection-boot-camp-draft/
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-----Original Message-----
From: Siraj Ha
d if your CPUs are really saturated doing indexing/searching.
Another thing to try is the BalancedSegmentMergePolicy (in
contrib/misc); it also tries to avoid big merges.
Also, how are you opening new readers? Can you share more how you are
using Lucene?
Mike
On Tue, Dec 22, 2009 at 5:57 PM, Siraj H
We upgraded to 2.9.2 from 2.3.2 and the garbage collection performance
deteriorated drastically. The system is going into Full GC cycles with
long pauses very frequently. Did something change that we need to
account for?
thanks in advance
-siraj
--
Hello there,
I am getting an exception when running queries with the new getDocIdSet() in my
custom filter. Following is the code for my getDocIdSet() function:
public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
OpenBitSet bitSet = new OpenBitSet(reader.maxDoc());
for (in
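Since the loop body is cut off above, here is a complete filter of this shape, sketched against the Lucene 2.9/3.x API (the term criterion is a made-up placeholder; untested sketch): iterate the matching documents with TermDocs and set their bits.

```java
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.Filter;
import org.apache.lucene.util.OpenBitSet;

public class TermMatchFilter extends Filter {
    private final Term term; // placeholder criterion for illustration

    public TermMatchFilter(Term term) {
        this.term = term;
    }

    @Override
    public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
        OpenBitSet bitSet = new OpenBitSet(reader.maxDoc());
        TermDocs td = reader.termDocs(term);
        try {
            while (td.next()) {
                bitSet.set(td.doc()); // mark each matching document
            }
        } finally {
            td.close();
        }
        return bitSet; // OpenBitSet implements DocIdSet
    }
}
```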
searching?
Mike
On Wed, Mar 24, 2010 at 3:45 PM, Grant Ingersoll wrote:
On Mar 24, 2010, at 2:13 PM, Siraj Haider wrote:
We upgraded to 2.9.2 from 2.3.2 and the garbage collection performance
deteriorated drastically. The system is going into Full GC cycles with long
pauses very
always a good idea.
--
Ian.
On Wed, Mar 24, 2010 at 6:56 PM, Siraj Haider wrote:
Hello there,
I am getting an exception when running queries with the new getDocIdSet() in my
custom filter. Following is the code for my getDocIdSet() function:
public DocIdSet getDocIdSet(IndexReader reader
, Michael McCandless wrote:
How do you reopen your searchers after indexing?
Do you keep a single IW open for all time?
Mike
On Thu, Mar 25, 2010 at 3:11 PM, Siraj Haider wrote:
Indexing happen with frequent intervals on our indexes, but I think
searching is the cause of the issue, because
=6797870
I also use YourKit and watch the allocations...
Mike
On Thu, Mar 25, 2010 at 5:26 PM, Siraj Haider wrote:
How should I get that memory dump? Using jmap?
-siraj
On 3/25/2010 4:32 PM, Michael McCandless wrote:
Are you using IndexReader.reopen to open those new searchers?
Can
Oops, forgot the attachments... it's here now...
On 3/26/2010 10:33 AM, Siraj Haider wrote:
Hi Mike,
I am attaching the dump that I created by putting
-XX:+PrintClassHistogram in catalina options and by issuing a kill -3
command. The machine was not in a bad state (i.e. it was not doing
We are in the process of removing the deprecated api from our code to
move to the new version. One of the deprecations is that the QueryParser now
expects a Version parameter in the constructor. I also have read somewhere that
we should pass the same Version to the analyzer when indexing as well as when
search
uture want to
move to LUCENE_30. Will the queryparsers/analyzers opened with
LUCENE_30 be able to read indexes created by using LUCENE_29?
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-----Original Message-----
From:
m searches.
Thanks for your comments. Can you please explain more about why the
user might get something from searches that they are not expecting?
Hope this helps,
Shai
On Tuesday, April 13, 2010, Siraj Haider wrote:
We are in the process of removing the deprecated api from our code to move t
Hello there,
In Oracle Text search there is a feature to do reverse search using
ctxrule. What it does is: you create an index (ctxrule) on a column
holding your search criteria, then throw a document at it and it
tells you which criteria it satisfies. Is there something in Lucene
that
Index, which can hold more than one document at a
time:
<http://lucene.apache.org/java/3_0_1/api/contrib-instantiated/org/apache/lucene/store/instantiated/InstantiatedIndex.html>
Steve
On 05/17/2010 at 4:38 PM, Siraj Haider wrote:
Hello there,
In oracle text search there is a feature t
which I found by Googling "Lucene MemoryIndex") using
PyLucene:
<http://www.sajalkayan.com/prospective-search-using-python.html>
Give it a try! Lucene is pretty easy to get started with. Ask questions if
you run into trouble.
Good luck,
Steve
On 05/17/2010 at 5:46 PM, Siraj Hai
I am trying to run a search using the search(query, filter, n, sort) method,
which returns TopFieldDocs. The sort is defined like: sort = new
Sort(new SortField("DATEISSUED", SortField.LONG, true)); and I am
passing the filter as null. The query I am passing is: +SK:1J +TEAMID:1
which returns results s
Just wanted to mention that I am using Lucene 2.9.2 if it helps.
thanks
-siraj
Original Message
Subject:Exception when running search
Date: Thu, 17 Jun 2010 13:06:04 -0400
From: Siraj Haider
Reply-To: java-user@lucene.apache.org
To: java-user
On 6/18/2010 10:12 AM, Uwe Schindler wrote:
What's the code you use for search? What is n, and what type of fields?
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-----Original Message-----
From: Siraj Haider [mailto:si...@jobdiva.com
I am using DefaultSimilarity and did not boost any field while indexing. My
index is comprised of the following fields:
- Title
- Author
- Bookname
- Description
All four fields are indexed and can be searched on by the user. Now let's
say the user sea
/IndexSearcher.html#collectionStatistics(java.lang.String)
simon
On Mon, Oct 22, 2012 at 11:25 PM, Siraj Haider wrote:
> I am using DefaultSimilarity and did not boost any field while indexing. My
> index is comprised of the following fields:
>
> - Title
>
>
, Siraj Haider wrote:
> So, just to confirm, using Lucene 4.0, we would be able to issue a
> search on one or more fields and would be able to get the results
> sorted by a custom field and also would be able to get the score of
> each document based on the frequency of the terms sea
Any other suggestions?
regards
-Siraj
(212) 306-0154
-----Original Message-----
From: Siraj Haider [mailto:si...@jobdiva.com]
Sent: Tuesday, October 23, 2012 6:06 PM
To: java-user@lucene.apache.org
Cc: simon.willna...@gmail.com
Subject: RE: Scoring based on document
Thanks for the suggestion
know if I'm wrong.
On Thu, Oct 25, 2012 at 1:18 AM, Siraj Haider wrote:
> Any other suggestions?
>
> regards
> -Siraj
> (212) 306-0154
>
> -----Original Message-----
> From: Siraj Haider [mailto:si...@jobdiva.com]
> Sent: Tuesday, October 23, 2012 6:06 PM
> T
How can I get the size of the whole index in bytes?
regards
-Siraj
(212) 306-0154
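Lucene itself does not expose a total-size call, but with a standard FSDirectory the index is just the flat files in one directory, so plain java.io answers the question (a sketch; it assumes no subdirectories, which a normal index directory does not have):

```java
import java.io.File;

public class IndexSize {
    /** Sum of the sizes of all files directly inside the index directory. */
    public static long indexSizeInBytes(File indexDir) {
        long total = 0;
        File[] files = indexDir.listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.isFile()) {
                    total += f.length(); // each segment/compound file
                }
            }
        }
        return total;
    }
}
```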
Hi There,
Is there a way to do reverse matching by indexing the queries in an index and
passing a document to see how many queries match it? I know that I can keep
the queries in memory, parse the document into a memory index, and then
loop through trying to match each query. The issue
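The in-memory approach described here can be sketched with Lucene's MemoryIndex from contrib/memory (the "body" field and the analyzer version are illustrative; untested sketch):

```java
import java.util.List;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.memory.MemoryIndex;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

public class Percolator {
    // Return how many of the stored queries match the given document text.
    public static int countMatches(List<Query> queries, String docText) {
        MemoryIndex mi = new MemoryIndex();
        mi.addField("body", docText, new StandardAnalyzer(Version.LUCENE_36));
        int matched = 0;
        for (Query q : queries) {
            if (mi.search(q) > 0.0f) { // non-zero score == query matched
                matched++;
            }
        }
        return matched;
    }
}
```

As the thread notes, the cost is the loop over all queries per document; that is exactly the problem the percolator/luwak approaches mentioned below are designed to avoid.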
/search-percolate.html
On Friday, February 14, 2014 8:21 PM, Siraj Haider wrote:
Hi There,
Is there a way to do reverse matching by indexing the queries in an index and
passing a document to see how many queries matched that? I know that I can have
the queries in memory and have the document
Alan Woodward
www.flax.co.uk
On 17 Feb 2014, at 20:26, Siraj Haider wrote:
> Thanks for your great advice Ahmet. Do you know if I could use luwak
> libraries in my Lucene project directly? Or do I have to use Solr? Currently,
> we use core Lucene libraries in our system and have built our ow
Hello there,
I was looking for best practices for indexing/searching on a
multi-processor/core machine but could not find any specific material on
that. Do you think it is a good idea to create a guide/how-to for that
purpose? It would be very helpful for many people in today's world,
where a
We have been using org.apache.lucene.search.ChainedFilter in our
application that uses Lucene 3.0.3. Today we downloaded version 3.1.0,
but the code won't compile. It says that it could not find
ChainedFilter. Did this class get moved to some other package?
thanks
-siraj
---
I am sorry, but the ChainedFilter was in lucene-misc-3.0.3.jar under
org.apache.lucene.misc but I could not find it under the same location in
lucene-misc-3.1.0.jar.
On 4/7/2011 6:31 PM, Siraj Haider wrote:
We have been using org.apache.lucene.search.ChainedFilter in our
application that uses
We are in the process of upgrading from 2.x to 6.x. In 2.x we implemented our
own similarity where all the functions return 1.0f; how can we implement such a
thing in 6.x? Is there an implementation already there that we can use and have
the same results?
--
Regards
-Siraj Haider
(212) 306
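One way to get score-neutral behavior in 6.x is a SimilarityBase subclass whose per-term score is constant (an untested sketch; whether it reproduces the exact 2.x numbers also depends on boosts and query structure, so verify against your old results):

```java
import org.apache.lucene.search.similarities.BasicStats;
import org.apache.lucene.search.similarities.SimilarityBase;

// Lucene 6.x sketch: every matching term contributes a constant score,
// ignoring term frequency and document length.
public class ConstantSimilarity extends SimilarityBase {
    @Override
    protected float score(BasicStats stats, float freq, float docLen) {
        return 1.0f;
    }

    @Override
    public String toString() {
        return "ConstantSimilarity";
    }
}
```

Set it on both sides: IndexWriterConfig.setSimilarity(new ConstantSimilarity()) at index time and IndexSearcher.setSimilarity(...) at search time, mirroring the same-version advice given elsewhere in this list.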
Hello there,
What is the default maximum field length in Lucene 6? In Lucene 2.9 we use
IndexWriter.MaxFieldLength to increase the default to 100,000, as we index some
very large fields. What should be the alternative for that in Lucene 6?
--
Regards
-Siraj Haider
(212) 306-0154
Thanks Mike,
The name LimitTokenCountAnalyzer suggests that it is used to limit the token
count, so I was thinking that the default now is no limit and we might not need
to use it, as we wanted to increase the field size instead of limiting it.
Please let me know.
--
Regards
-Siraj Haider
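To the question above: in Lucene 6 the default is indeed no limit (the old MaxFieldLength knob is gone), so LimitTokenCountAnalyzer is only needed to impose a cap, never to raise one. A sketch against the 6.x API (analyzer choice is illustrative; untested here):

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.miscellaneous.LimitTokenCountAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;

public class FieldLengthConfig {
    // Default in Lucene 6: IndexWriter indexes every token of a field,
    // so very large fields need no special configuration.
    public static IndexWriterConfig unlimited() {
        return new IndexWriterConfig(new StandardAnalyzer());
    }

    // Opt-in cap only if wanted; 100,000 mirrors the old 2.9 setting
    // mentioned in this thread.
    public static IndexWriterConfig cappedAt100k() {
        Analyzer a = new LimitTokenCountAnalyzer(new StandardAnalyzer(), 100_000);
        return new IndexWriterConfig(a);
    }
}
```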
We currently use Lucene 2.9 and, to keep the indexes running faster, we optimize
the indexes during the night. In our application the volume of new documents
coming in is very high, so most of our indexes have to merge segments during the
day too, when the document count reaches a certain number. This ca
values of that
field from a search. So, my question is, can we index two different field types
with same name or do we have to use different names for these fields?
--
Regards
-Siraj Haider
(212) 306-0154
Hi Mike,
You said "periodically calling IW.commit when you need durability". Does it
mean that if the program dies without calling IW.commit(), all the index
changes that were not committed would be lost?
--
Regards
-Siraj Haider
(212) 306-0154
-----Original Message-----
Fro
()), writer_config);
SearcherManager searcherManager = new SearcherManager(indexWriter, new
SearcherFactory());
--
Regards
-Siraj Haider
(212) 306-0154
Thanks for the information Uwe, it was very helpful. Do you have any example
code implementing the IndexWriter.IndexReaderWarmer class? I am having
difficulty finding any examples on the internet.
--
Regards
-Siraj Haider
(212) 306-0154
-----Original Message-----
From: Uwe Schindler [mailto:u
I simply run
warmup queries on the acquired IndexSearcher from SearcherManager, or is there a
better way to accomplish that?
--
Regards
-Siraj Haider
(212) 306-0154
-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Monday, December 19, 2016 1:29 PM
To: java-user
();
searcherManager.maybeRefresh();
Should this sequence of events cause the MergedSegmentWarmer to get called?
--
Regards
-Siraj Haider
(212) 306-0154
-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Wednesday, December 21, 2016 11:37 AM
To: java-user@lucene.apache.org
ld by Lucene somewhere. Can you please direct me on
what needs to be checked?
--
Regards
-Siraj Haider
(212) 306-0154
Hi all,
We recently switched to Lucene 6.5 from 2.9 and we have an issue that the files
in the index directory are not getting released after the IndexWriter finishes
writing a batch of documents. We are using IndexFolder.listFiles().length to
check the number of files in the index folder. We have ev
double digit number when I restart the tomcat.
Thanks for looking into it.
--
Regards
-Siraj Haider
(212) 306-0154
-----Original Message-----
From: Ian Lea [mailto:ian@gmail.com]
Sent: Friday, May 05, 2017 9:33 AM
To: java-user@lucene.apache.org
Subject: Re: Un-used index files are not
-la : 236
lsof : 79
--
Regards
-Siraj Haider
(212) 306-0154
-----Original Message-----
From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
Sent: Thursday, May 11, 2017 1:34 PM
To: java-user@lucene.apache.org
Cc: ian@gmail.com
Subject: RE: Un-used index files are not getting released
atched? So, for example, it might be that
document 1 was matched because A and B and D were found and for document 2 C
and E were found. Is there a way to check that?
--
Regards
-Siraj Haider
(212) 306-0154
Thanks for the response Paul, it would be great if you can point me to that
discussion.
--
Regards
-Siraj Haider
(212) 306-0154
-----Original Message-----
From: Paul Libbrecht
Sent: Wednesday, March 17, 2021 4:02 PM
To: java-user@lucene.apache.org; Diego Ceccarelli
Subject: Re: Search
Does that mean that we will need to wrap each clause of our Boolean query in
order to check it for all clauses?
--
Regards
-Siraj Haider
(212) 306-0154
-----Original Message-----
From: Michael Sokolov
Sent: Wednesday, March 17, 2021 4:13 PM
To: java-user@lucene.apache.org
Cc: Diego
Siraj Haider
JobDiva Technology
siraj.hai...@jobdiva.com
212.306.0154