your talks here -
https://communityovercode.org/call-for-presentations/
We hope to see many of you talk about Search in Denver!
--
Anshum Gupta
there.
--
Anshum Gupta
! Hope to
see you all there.
--
Anshum Gupta
.
Good luck!
-Anshum
On Wed, Mar 30, 2022 at 5:47 AM Michael Wechner
wrote:
> Hi Together
>
> I would be interested to submit a proposal/presentation re Lucene's
> vector search, but would like to ask first whether somebody else wants
> to do this as well or might be i
website - https://www.apachecon.com/acah2021/index.html
Registration - https://hopin.com/events/apachecon-2021-home
Slack - http://s.apache.org/apachecon-slack
Search Track - https://www.apachecon.com/acah2021/tracks/search.html
See you all at ApacheCon 2021!
-Anshum
-
To unsubscribe, e-mail: announce-unsubscr...@apachecon.com
For additional commands, e-mail: announce-h...@apachecon.com
--
Anshum Gupta
that they can continue to expect critical bug fixes for releases
previously made under the Apache Lucene project.
We will send another update as the mailing lists and website are set up for
the Solr project.
-Anshum
On behalf of the Apache Lucene and Solr PMC
://www.apachecon.com/acah2020/tracks/search.html
See you at ApacheCon.
--
Anshum Gupta
https://lucene.apache.org/theme/images/lucene/lucene_logo_green_300.png
> >
> > Please vote for one of the above choices. This vote will close about one
> > week from today, Mon, Sept 7, 2020 at 11:59PM.
> >
> > Thanks!
> >
> > [jira-issue] https://issues.apache.org/jira/browse/LUCENE-9221
> > [first-vote]
> >
> http://mail-archives.apache.org/mod_mbox/lucene-dev/202006.mbox/%3cCA+DiXd74Mz4H6o9SmUNLUuHQc6Q1-9mzUR7xfxR03ntGwo=d...@mail.gmail.com%3e
> > [second-vote]
> >
> http://mail-archives.apache.org/mod_mbox/lucene-dev/202009.mbox/%3cCA+DiXd7eBrQu5+aJQ3jKaUtUTJUqaG2U6o+kUZfNe-m=smn...@mail.gmail.com%3e
> > [rank-choice-voting] https://en.wikipedia.org/wiki/Instant-runoff_voting
> >
>
--
Anshum Gupta
tensive mirroring network for
distributing releases. It is possible that the mirror you are using may not
have replicated the release yet. If that is the case, please try another
mirror. This also applies to Maven access.
ReleaseNote70 (last edited 2017-09-20 10:27:30 by AnshumGupta
<https://wiki.apache.org/lucene-java/AnshumGupta>)
: u...@thetaphi.de
-----Original Message-----
From: Anshum Gupta [mailto:ans...@anshumgupta.net]
Sent: Friday, February 20, 2015 9:55 PM
To: d...@lucene.apache.org; gene...@lucene.apache.org; java-
u...@lucene.apache.org
Subject: [ANNOUNCE] Apache Lucene 5.0.0 released
20 February
and notes on upgrading.
Please report any feedback to the mailing lists (
http://lucene.apache.org/core/discussion.html)
--
Anshum Gupta
http://about.me/anshumgupta
meetup event:
http://www.meetup.com/Bangalore-Apache-Solr-Lucene-Group/events/113806762/ .
--
Anshum Gupta
http://www.anshumgupta.net
Hi Vidya,
Perhaps this could help you:
http://hrycan.com/2009/10/25/lucene-highlighter-howto/
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Fri, Oct 28, 2011 at 2:18 PM, Vidya Kanigiluppai Sivasubramanian
vidya...@hcl.com wrote:
Hi,
I am using lucene 2.4.1 in my project.
I need
hand, why do you want to split a 9G index? Is there a reason? A
performance issue? It'd be good if you could share the reason, as the problem
could be completely different.
--
Anshum Gupta
http://ai-cafe.blogspot.com
2011/7/27 Gudi, Ravi Sankar ravisankarg.ravisank...@hp.com
Hi Lucene Team
from the
'search' method.
Also, I'd suggest grabbing a copy of Lucene in Action, 2nd Edition, as it'd
help you a lot in understanding the way Lucene works and is used.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Jun 8, 2011 at 11:00 AM, Pranav goyal pranavgoyal40...@gmail.com wrote:
Hi all
the updateDocument function as of now would internally delete the
document and add the newly supplied document.
Hope this answer helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Jun 6, 2011 at 11:59 AM, Pranav goyal pranavgoyal40...@gmail.com wrote:
Hi all,
I am a newbie to lucene.
I
.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Jun 6, 2011 at 4:41 PM, Pranav goyal pranavgoyal40...@gmail.com wrote:
Hi all,
Is there any way to change my lucene document no?
Like if I can change my lucene document no's with con_key.
I am a newbie and don't know whether this is a silly
Yes,
You'd need to delete the document and then re-add a newly created document
object. You may use the key and delete the doc using the Term(key, value).
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Jun 6, 2011 at 4:45 PM, Pranav goyal pranavgoyal40...@gmail.com wrote:
Hi Anshum
){
System.out.println(ir.document(scoreDoc.doc));
}
is.close();
ir.close();
iw.close();
*--Snip--*
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Fri, Apr 15, 2011 at 6:32 AM, Christopher Condit con...@sdsc.edu wrote:
I know that it's best practice to reuse
Could you also print and send the entire stack-trace?
Also, the query.toString()
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Apr 19, 2011 at 7:40 PM, Patrick Diviacco
patrick.divia...@gmail.com wrote:
I get the following error message: java.lang.UnsupportedOperationException
the best.
Relevance, or an apt method for boost values, can again be figured out
by varying the boost via trial and error. That is pretty much a
general practice.
Hope this helps you figure out a reasonable solution and boost values.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Sat, Apr
Hi Madhu,
You could use IndexSearcher.explain(..) to explain the result and get the
detailed breakup of the score. That should probably help you with
understanding the boost and score as calculated by Lucene for your app.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Apr 19, 2011 at 2:32
So update is basically nothing but a delete and an add (of a fresh doc). You
could just go ahead and use the deleteDocuments(Query query) method and then
add the new document. That is the general approach for such cases and it
works just about fine.
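A quick sketch of what I mean, as a toy analogue in plain Java (no Lucene classes involved; the store and field names here are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class UpdateAsDeleteAdd {
    private final List<Map<String, String>> docs = new ArrayList<>();

    void add(Map<String, String> doc) {
        docs.add(doc);
    }

    // Remove every document whose field matches the given value,
    // mirroring a delete-by-term on a key field.
    void deleteByTerm(String field, String value) {
        docs.removeIf(d -> value.equals(d.get(field)));
    }

    // "Update" is nothing but a delete followed by an add of a fresh doc.
    void update(String keyField, Map<String, String> freshDoc) {
        deleteByTerm(keyField, freshDoc.get(keyField));
        add(freshDoc);
    }

    int size() { return docs.size(); }

    public static void main(String[] args) {
        UpdateAsDeleteAdd store = new UpdateAsDeleteAdd();
        store.add(Map.of("id", "1", "body", "old text"));
        store.update("id", Map.of("id", "1", "body", "new text"));
        System.out.println(store.size()); // prints 1: the old doc was replaced
    }
}
```

The point is just that there is no in-place modification anywhere: the "updated" document is a brand-new one, and the old one is gone.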
--
Anshum Gupta
http://ai-cafe.blogspot.com
So functionally I am assuming you've achieved what you'd been aiming for.
About the scores, MatchAllDocsQuery does score docs based on norm factors
etc., therefore the score wouldn't be 0.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Mar 23, 2011 at 1:38 PM, Patrick Diviacco
patrick.divia
Hi,
No, as of now there's no way to do so.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 12:29 PM, shrinath.m shrinat...@webyog.com wrote:
I am asking for partial update in Lucene,
where I want to update only a selected field of all fields in the document.
Does Lucene
Also,
is there a particular reason why you wouldn't want to index that, considering
you'd want to 'update' documents? It's good practice to index the unique
field, especially if you have one. It has generally helped more often than
not.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22
Yes, that's how it's generally done. Also, you should just handle data/fields
aptly rather than trying to avoid them in the first place. You could safely
add these, use them internally, and never return them or use them for an
end-user search.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue
Hi Patrick,
You may have a look at this; perhaps it will help you with it. Let me know
if you're still stuck.
http://stackoverflow.com/questions/3300265/lucene-3-iterating-over-all-hits
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 4:10 PM, karl.wri...@nokia.com
so a few things:
1. Are you looking to get 'all' documents or only docs matching your query?
2. If it's about fetching all docs, why not use MatchAllDocsQuery?
3. Did you try using a Collector instead of TopDocs?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 4:46 PM
are
trying to achieve. You may have a completely different option that you
haven't read about, which someone could advise on if they know the exact intent.
Hope this helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 22, 2011 at 4:59 PM, Patrick Diviacco
patrick.divia...@gmail.com wrote:
1
Hi Suman,
I tried it a while ago. Found it nice and useful.
You could get some hints on using it at
http://ai-cafe.blogspot.com/2009/09/lucid-gaze-tough-nut.html (in case you
need some ! :) )
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Mar 16, 2011 at 11:37 AM, suman.holani suman.hol
, otherwise if you're using a very selective field
which may be used through a FieldCache it'd be a nice thing to do.
Hope that helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Mar 10, 2011 at 3:01 PM, suman.holani suman.hol...@zapak.co.in wrote:
Hi,
I am facing the problem
The line
Depends on your data. I know that's a vague answer but that's the point.
What you could do is use FieldCache if memory and data let you do so. Would
it?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Mar 10, 2011 at 3:12 PM, suman.holani suman.hol...@zapak.co.in wrote:
Hi Anshum,
Thanks
Hi Lahiru,
A few questions here.
Why would you need that? Is the field stored?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Mar 1, 2011 at 11:04 AM, Lahiru Samarakoon lahir...@gmail.com wrote:
Hi all,
Is there a way to find the length of a field of a lucene index document?
Thanks
If you actually intend to get the intersection of 2 results from a
'union' of 2 indexes, you could use the filter and query approach. You could
use a MultiSearcher or a ParallelMultiSearcher to perform the search in
this case.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Feb 14
Hi Liat,
You could open a MultiSearcher/ParallelMultiSearcher on the indexes that you
have and then construct an OR query e.g. (contents:A OR text:A)
I am assuming that the field names do not overlap. If that is not the case
then you'd need another solution.
--
Anshum Gupta
http://ai
KeywordAnalyzer());
/snip
In the above snip, I instantiate an analyzer which by default would use the
StandardAnalyzer but for 'anotherfield' would use KeywordAnalyzer.
Hope this helps you.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Feb 15, 2011 at 2:19 AM, Yuhan Zhang yzh...@onescreen.com wrote
Why don't you generate your own index off some sample docs or a dataset? That
would give you a lot more flexibility to play around, as otherwise even if you
get an index, you wouldn't have info on the analyzer used etc. while indexing.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Sun, Feb 13, 2011
Hi Ranjit,
That would be because all stop words (space, comma, stop word set, etc.)
would be treated in a similar fashion and dropped while indexing, subject to
the analyzer you use while indexing your content.
Hope that explains the issue.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Feb
of an
ngram, and then treat those phrases as terms.
Doing it at runtime would not be a feasible option.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Jan 20, 2011 at 3:30 PM, Ashish Pancholi apanch...@chambal.com wrote:
Using Lucene_3.0.3. we would like to implement following:
The number
mod of
some numeric (auto increment) userid.
This works well in normal cases unless your partitioning is
unpredictable.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Fri, Jan 21, 2011 at 10:52 AM, Ganesh emailg...@yahoo.co.in wrote:
Hello all,
Could you any one guide me what all
mirrors them internally or via a
downstream project)
--
Anshum Gupta
http://ai-cafe.blogspot.com
understanding on Lucene and getting a copy of Lucene in Action, 2nd Ed.
(http://www.manning.com/hatcher3/)
would be a good idea for you and everyone in your position.
Hope that helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Sun, Jan 16, 2011 at 8:03 PM, Pelit Mamani
pelit.mam
Hi Ryan,
You should try the synonym filter. That should help you with this kinda
problem.
You could also look at turning off norms for the name field, or turning off
tf or idf.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Sat, Jan 8, 2011 at 6:03 AM, Ryan Aylward r...@glassdoor.com wrote
page, starting at
http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#DEFAULT_RAM_BUFFER_SIZE_MB
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec
.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec 29, 2010 at 5:36 PM, Jiang mingyuan
mailtojiangmingy...@gmail.com wrote:
Can lucene index survives a machine crash during the merge or optimize
operation?
or can I stop the running index program during the merge or optimize
period
.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec 29, 2010 at 3:36 AM, software visualization
softwarevisualizat...@gmail.com wrote:
This has probably been asked before but I couldn't find it, so...
Is it possible / advisable / practical to use Lucene as the basis of a
live
document
Hi Umesh,
I'm not really confident that Zoie or anything built on the current version
of Lucene would be able to handle a search-as-you-type kind of setup.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Dec 29, 2010 at 10:39 AM, Umesh Prasad umesh.i...@gmail.com wrote:
You can also look
below).
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Dec 21, 2010 at 3:54 PM, manjula wijewickrema
manjul...@gmail.com wrote:
Hi Gupta,
Thanx a lot for your reply. But I could not understand whether I could
modify (adding more words) to the default stop word list or should I have
.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Dec 21, 2010 at 9:20 AM, manjula wijewickrema
manjul...@gmail.com wrote:
Hi,
1) In my application, I need to add more words to the stop word list.
Therefore, is it possible to add more words into the default lucene stop
word list?
2
with a single '=' :)
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Nov 30, 2010 at 3:03 PM, maven apache apachemav...@gmail.com wrote:
2010/11/30 Chris Hostetter hossman_luc...@fucit.org
: Subject: What is the difference between the AND and + operator?
In this query, y
You could change Occur.SHOULD to Occur.MUST for both fields.
This should work for you if what I understood is what you wanted.
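To make the MUST vs. SHOULD distinction concrete outside Lucene: MUST behaves like AND and SHOULD like OR over clause matches. A toy matcher in plain Java (the field names and the simplified match rule are mine, not Lucene's actual scoring):

```java
import java.util.List;
import java.util.Map;

public class BooleanClauses {
    enum Occur { MUST, SHOULD }

    record Clause(String field, String term, Occur occur) {}

    // A doc matches if every MUST clause matches and, when there are no
    // MUST clauses at all, at least one SHOULD clause matches.
    static boolean matches(Map<String, String> doc, List<Clause> clauses) {
        boolean anyShould = false, hasShould = false, hasMust = false;
        for (Clause c : clauses) {
            boolean hit = c.term().equals(doc.get(c.field()));
            if (c.occur() == Occur.MUST) {
                hasMust = true;
                if (!hit) return false; // one failed MUST rejects the doc
            } else {
                hasShould = true;
                anyShould |= hit;
            }
        }
        return hasMust || (hasShould && anyShould);
    }

    public static void main(String[] args) {
        Map<String, String> doc = Map.of("title", "lucene", "content", "search");
        // Both clauses MUST: the doc has to match title AND content.
        List<Clause> must = List.of(
            new Clause("title", "lucene", Occur.MUST),
            new Clause("content", "nosuch", Occur.MUST));
        System.out.println(matches(doc, must)); // prints false
    }
}
```

With both clauses as SHOULD, the same doc would match, since only one clause needs to hit.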
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Nov 30, 2010 at 5:12 PM, maven apache apachemav...@gmail.com wrote:
Hi: I have two documents:
title
#setMinimumNumberShouldMatch(int). Finally,
all would depend on the case at hand and what you think is the
expected behavior of search.
Hope this helps.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Nov 29, 2010 at 1:31 PM, yang Yang m4ecli...@gmail.com wrote:
What is the difference between
the index and
the source.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Nov 17, 2010 at 1:36 PM, Lance Norskog goks...@gmail.com wrote:
The Lucene CheckIndex program does this. It is a class somewhere in Lucene
with a main() method.
Samarendra Pratap wrote:
It is not guaranteed
/lucene-java/SpatialSearch
For your understanding, you could have a look at the bounding box approach.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Nov 18, 2010 at 7:38 AM, yang Yang m4ecli...@gmail.com wrote:
We are using the hibernate search which is based on lucene as the search
engine
. This would also give you a fair idea
of the index state.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Nov 16, 2010 at 11:36 AM, Yakob jacob...@opensuse-id.org wrote:
hello all,
I would like to ask about lucene index. I mean I created a simple
program that created lucene indexes and stored
wanting to do so? is it that you
only index data coming from a stream and you don't have access to the
original source at a later time?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Oct 12, 2010 at 11:35 AM, Nilesh Vijaywargiay
nilesh.vi...@gmail.com wrote:
Hi Group,
I understand
ParallelReader, though it theoretically sounds useful, leaves me doubtful
about how much overhead maintaining and synchronizing the document ids would
add. I haven't used it so far; perhaps someone who's used ParallelReader for
such a purpose in a production environment/at scale may help you.
--
Anshum Gupta
Version? Machine and JVM (32/64 bit)?
This most probably is a code-level issue rather than a Lucene one, but I
may be wrong.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Oct 13, 2010 at 8:08 AM, Ching zchin...@gmail.com wrote:
Hi All,
Can anyone help with this issue? I have about 2000 pdf
at Solr, which
provides an out-of-the-box engine.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Oct 13, 2010 at 8:57 AM, Hyun Joo Noh dbfudrp...@gmail.com wrote:
Hi, how would you make Lucene leave a search log of
who searched what, when, etc (i.e. cookie, query, timestamp, etc
this is what you intended!
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Sep 30, 2010 at 11:54 PM, Sahin Buyrukbilen
sahin.buyrukbi...@gmail.com wrote:
Hi all,
I need to get the first term in my index and iterate it. Can anybody help
me?
Best.
Seems like a case of I/O issues. You may be reading content off the index
while performing searches while the I/O for the copy is also happening.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Mon, Aug 23, 2010 at 1:12 PM, gag...@graffiti.net wrote:
Hi all,
We're observing search threads
().
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 24, 2010 at 4:38 AM, Justin cry...@yahoo.com wrote:
In an attempt to avoid doubling disk usage when adding new fields to all
existing documents, I added a call to IndexWriter::expungeDeletes. Then my
colleague pointed out that Lucene
of reclaiming lost disc
space.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 24, 2010 at 9:22 AM, Justin cry...@yahoo.com wrote:
My actual code did not call expungeDeletes every time through the loop;
however,
calling expungeDeletes or optimize after the loop means that the index has
doubled
it comfortably. BTW,
are you facing any issues with sort time, or is it a presumption?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Aug 18, 2010 at 5:12 PM, Shelly_Singh shelly_si...@infosys.com wrote:
Hi,
I have a Lucene index that contains a numeric field along with certain
other fields
?
--
Anshum Gupta
http://ai-cafe.blogspot.com
2010/8/17 xiaoyan Zheng hillyzh...@gmail.com
the question is like this:
when one user is using IndexWriter.addDocument(doc), and another user has
already finished the adding part and has closed the IndexWriter, then the first
user encounters the error ERROR
reading the source takes
time in your case; the IndexWriter, though, would have to be shared among all
threads.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 10, 2010 at 12:24 PM, Shelly_Singh shelly_si...@infosys.com wrote:
Hi,
I am developing an application which uses Lucene
for that period.
This would make the data manageable and searchable within reasonable time.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Tue, Aug 10, 2010 at 5:49 PM, Shelly_Singh shelly_si...@infosys.com wrote:
No sort. I will need relevance based on TF. If I shard, I will have to
search in all indices
So, you didn't really use the setRamBuffer.. ?
Any reasons for that?
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Wed, Aug 11, 2010 at 10:28 AM, Shelly_Singh shelly_si...@infosys.com wrote:
My final settings are:
1. 1.5 gig RAM to the jvm out of 2GB available for my desktop
2
Hi Saurabh,
I don't think there's a way to do that? Why not use other constructs?
--
Anshum Gupta
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Mon, May 17, 2010 at 8:04 PM, Saurabh Agarwal srbh.g
Hi Manjula,
Yes, Lucene by default would only tackle exact term matches unless you use a
custom analyzer to expand the index/query.
--
Anshum Gupta
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Fri
Hi Clara,
Any particular reason why you'd need the score? Perhaps this would be of
help
http://lucene.apache.org/java/2_9_1/scoring.html
http://lucene.apache.org/java/2_3_2/scoring.pdf
Hope this explains whatever you were looking for.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
There are a few things you could do,
1. Run the JVM in server mode [-server]
2. Assign more RAM (in case you're running a 64 bit architecture) (both
initial and max limit)
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me
Hi Ravi,
Adding to what Erick said, you could index the numbers as numeric fields
instead of strings. This should improve things for you by a considerable
amount.
P.S: I'm talking with my knowledge on Java Lucene.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed
something like a synonym analyzer while conducting
search in this case.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Fri, Apr 23, 2010 at 2:39 AM, Wei Yi jasonwe...@gmail.com
Reposting as the first post didn't get many hits!
Apologies to all who consider this spam!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Wed, Feb 17, 2010 at 3:35 PM
the fields
at run time.
As far as relational nature is concerned, I'd say lucene's model is pretty
different from what you're taking it to be. Lucene documents are just a
collection of field/value pairs.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong
copy in
any manner though)
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Tue, Mar 23, 2010 at 3:51 PM, suman.hol...@zapak.co.in wrote:
Hello,
I am trying
Index time is a much better approach. The only negative about it is the
index size increase. I've used it for a dataset of considerable size and even
the index time doesn't seem to go up much.
Searching multiple terms is generally unoptimized when you can do it with
one.
--
Anshum Gupta
tokenized/processed prior
to getting indexed. The way the processing would happen depends on your
analyzer (which here is StopAnalyzer). So point 1. If you analyze a field
with value 'My name is anshum' it would get broken down into tokens, e.g.
[my] [name] [is] [anshum] where each term
Hi,
How about indexing a dummy token for empty docs? that way you may pick up
all docs that are actually null/empty by querying for the dummy token.
Make sure that the dummy token is never a part of any actual document (token
stream).
Perhaps this should work!
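A sketch of the idea with a plain inverted index in Java (the sentinel token name is made up; any token guaranteed absent from real content works):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EmptyDocSentinel {
    static final String EMPTY_TOKEN = "__empty__"; // must never occur in real content

    // Build a token -> doc-id postings map; empty docs get the sentinel
    // token so they stay reachable by a normal term query.
    static Map<String, List<Integer>> index(List<List<String>> docs) {
        Map<String, List<Integer>> inverted = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            List<String> tokens = docs.get(id);
            if (tokens.isEmpty()) {
                tokens = List.of(EMPTY_TOKEN);
            }
            for (String t : tokens) {
                inverted.computeIfAbsent(t, k -> new ArrayList<>()).add(id);
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
            List.of("lucene", "search"),
            List.of(),                  // an empty document
            List.of("index"));
        System.out.println(index(docs).get(EMPTY_TOKEN)); // prints [1]
    }
}
```

Querying for the sentinel token then returns exactly the null/empty docs, with no full scan needed.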
--
Anshum Gupta
Naukri Labs!
http
multiple genres instead of duplicate
entries.
I'm still not sure if I've gotten the problem correctly, but hope this is of
help!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
://groups.google.com/group/luceneindia to join and share!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
Hi Mike,
Not really through queries, but you may do this by writing a custom
collector. You'd need some supporting data structure to mark/hash the
occurrence of a domain in your result set.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody
be:
Index flipped terms (using an appropriate analyzer), i.e. cat is also indexed
as tac. You may then query on ta* instead of *at.
Does that solve your issue/concern?
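In plain Java, with a small sorted set standing in for the index (the terms are made up), the trick looks like this: a leading-wildcard match becomes a prefix scan over reversed terms:

```java
import java.util.List;
import java.util.TreeSet;

public class ReversedTerms {
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    // Simulate a *suffix query by running a prefix scan over reversed terms.
    static List<String> endingWith(TreeSet<String> reversedIndex, String suffix) {
        String prefix = reverse(suffix);
        return reversedIndex.tailSet(prefix).stream()
            .takeWhile(t -> t.startsWith(prefix))
            .map(ReversedTerms::reverse) // flip back to the original term
            .toList();
    }

    public static void main(String[] args) {
        // Index time: store each term reversed, e.g. cat -> tac.
        TreeSet<String> reversedIndex = new TreeSet<>();
        for (String term : List.of("cat", "hat", "dog", "mat")) {
            reversedIndex.add(reverse(term));
        }
        System.out.println(endingWith(reversedIndex, "at")); // prints [cat, hat, mat]
    }
}
```

The cost is indexing every term twice (forward and reversed), which is usually a fair trade for avoiding a full term scan on leading wildcards.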
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me
How about getting the original token stream and then converting c++ to
cplusplus or any other such transform? Or perhaps you might look at
using/extending (in the non-Java sense) some other tokenizer!
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong
in the index
size should be anticipated and handled.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Fri, Dec 11, 2009 at 10:50 PM, Rob Staveley (Tom)
rstave...@seseit.com wrote:
I'm
an indexer from scratch, you'd have to write
a Java file along the same lines as the demo (modified) and include it.
Does that help?
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
(in the wrapper code).
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Thu, Dec 3, 2009 at 8:02 AM, blazingwolf7 blazingwo...@gmail.com wrote:
Hi,
As per title
Just add a check in the while statement to exit as soon as the pattern of
the term changes: check whether the term no longer starts with your input and
break out of the while loop there. It will exit at the point where the term
start changes from what you want.
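The early-exit loop can be sketched outside Lucene with a plain sorted set standing in for the term enumeration (the set contents here are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class PrefixScan {
    // Collect terms that start with the given prefix, stopping as soon as
    // the sorted enumeration moves past the prefix range.
    static List<String> termsWithPrefix(TreeSet<String> terms, String prefix) {
        List<String> out = new ArrayList<>();
        // tailSet positions us at the first term >= prefix, like a term
        // enumeration positioned at a starting term.
        for (String t : terms.tailSet(prefix)) {
            if (!t.startsWith(prefix)) {
                break; // the term pattern changed: exit the loop
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(
            List.of("lucene", "lucid", "luck", "search"));
        System.out.println(termsWithPrefix(terms, "luc")); // prints [lucene, lucid, luck]
    }
}
```

Because the terms are sorted, the first non-matching term guarantees nothing later can match, so breaking there is safe.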
--
Anshum Gupta
Naukri Labs!
http://ai
Try this,
Change the code as required:
-
import java.io.IOException;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;
/**
* @author anshum
*
*/
public class
By autosuggest, would you mean similar documents?
In that case you could try Lucene's MoreLikeThis class.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Mon, Nov 23
For auto complete, you could try the following:
1. Run a prefix query. [Could be a fuzzy query]
2. Index using something like ngrams, e.g. the term 'term' is indexed as 4
terms, viz:
t
te
ter
term
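A minimal sketch of that edge n-gram expansion in plain Java (no Lucene analyzers involved; the method name is made up):

```java
import java.util.ArrayList;
import java.util.List;

public class EdgeNgrams {
    // Expand a term into its leading n-grams: "term" -> [t, te, ter, term].
    static List<String> edgeNgrams(String term) {
        List<String> grams = new ArrayList<>();
        for (int i = 1; i <= term.length(); i++) {
            grams.add(term.substring(0, i));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(edgeNgrams("term")); // prints [t, te, ter, term]
    }
}
```

Indexing every prefix this way turns autocomplete into an exact term lookup instead of a wildcard scan, at the cost of a larger index.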
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody
Hi Rafal,
If what I understand about your implementation is correct, you could try a
ParallelMultiSearcher:
http://lucene.apache.org/java/2_9_1/api/core/org/apache/lucene/search/ParallelMultiSearcher.html
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong
(field, new FileReader(f12));
iw.addDocument(doc);
--snip ends--
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com
The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw
On Tue, Nov 10, 2009 at 2:50 PM, Wenhao Xu xuwenhao2...@gmail.com