I got the answer.
Somehow I missed it.
The PhraseQuery requires the terms to be in a fixed order whereas the
BooleanQuery does not require the terms to be
in a particular order.
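
The difference can be illustrated without Lucene at all. The following is a minimal sketch in plain Java, not Lucene code: the class and method names are hypothetical, the "all terms required" behaviour assumes a BooleanQuery whose clauses are all MUST, and the phrase check assumes a PhraseQuery with no slop.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration class; it does not use or mimic the Lucene API.
final class PhraseVsBoolean {

    // True when every query term occurs somewhere in the document, in any order
    // (assumed analogue of a BooleanQuery with all clauses required).
    static boolean matchesAllTerms(List<String> docTokens, List<String> queryTerms) {
        return docTokens.containsAll(queryTerms);
    }

    // True when the query terms occur consecutively and in the given order
    // (assumed analogue of a PhraseQuery with slop 0).
    static boolean matchesPhrase(List<String> docTokens, List<String> queryTerms) {
        return !queryTerms.isEmpty()
                && Collections.indexOfSubList(docTokens, queryTerms) >= 0;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("the", "cup", "of", "the", "world");
        List<String> query = Arrays.asList("world", "cup");
        System.out.println(matchesAllTerms(doc, query)); // true: both terms are present
        System.out.println(matchesPhrase(doc, query));   // false: not adjacent in this order
    }
}
```

The same two terms therefore match under one query type and not the other, which would explain a different hit count for the two methods.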
On Thu, Mar 31, 2016 at 3:07 PM, Sachin Kulkarni
wrote:
> Hi,
>
> I am using Lucene-5.0.0.
> If I had
wo methods. I am trying to understand why there is a
difference.
Thank you.
Regards,
Sachin
Where do you plan to use it?
So far there are no built-in learning-to-rank implementations, in Lucene at
least.
There are suggestions to include those.
I do not know about Solr.
I worked on research projects on Learning to Rank algorithms and I had used
Lucene to generate the features which then I r
searched in the index that I have created?
Thank you in advance.
Regards,
Sachin
On Mon, Sep 15, 2014 at 4:36 PM, Sachin Kulkarni
wrote:
> Hi Erick,
>
> Thank you.
>
> Yes the data is in text form with the space delimited tokens.
> The queries are categories that the documents bel
Hi Erick,
Thank you.
Yes, the data is in text form with space-delimited tokens.
The queries are categories that the documents belong to.
They are regular text files and will need the transformation at my end.
Regards,
Sachin
On Mon, Sep 15, 2014 at 12:31 PM, Erick Erickson
wrote:
>
Hi Uwe,
Thank you.
I do not have the tokens serialized, so that reduces one step.
I am reading the javadocs and will try it the way you mentioned.
Regards,
Sachin
On Sun, Sep 14, 2014 at 5:11 PM, Uwe Schindler wrote:
> Hi,
>
> If you have the serialized tokens in a file, you ca
mming schemes.
Thank you.
Regards,
Sachin
o.
It works well once I fixed the parser.
Regards,
Sachin Kulkarni
On Tue, Aug 19, 2014 at 9:53 PM, Sachin Kulkarni
wrote:
> Hi Kumaran,
>
> See below some part of the code and the .alg file.
> Here is the function from DocMaker.java from the package "package
> org.apache.luce
docs.dir=PATH_TO_MY_DATASET
doc.term.vector=true
work.dir=work
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
trec.doc.parser=org.apache.lucene.benchmark.byTask.feeds.TrecParserByPath
content.source.forever=false
content.source.encoding=UTF-8
directory=FSDirectory
doc.stored=true
doc.tokenized=true
doc.tokenized.norms=true
doc.body.tokeni
Hi Kumaran,
I am using the benchmark utility from Lucene and doing the indexing via an
.alg file.
Would you like to see the alg file instead?
Thank you.
Regards,
Sachin
On Tue, Aug 19, 2014 at 9:42 AM, Kumaran Ramasubramanian wrote:
> Hi Sachin
>
> i want to look into ur
reTermVectorOffsets : false
list field is : docdate
Field storeTermVectorOffsets : false
list field is : doctitle
Field storeTermVectorOffsets : false
list field is : body
Field storeTermVectorOffsets : false
***/
Hope this code comes out legible in the email.
Thank you.
Regards,
Sachin K
" + IFT.stored());
//for (FieldInfo.IndexOptions c : IFT.indexOptions().values())
// System.out.println(c);
}
// *88 //
On Tue, Aug 19, 2014 at 2:04 AM, Kumaran Ramasubramanian wrote:
> Hi Sachin Kulkarni,
>
> If possible, Please share your code.
>
that I am missing while indexing that is
causing this behavior?
Thanks to Kumaran and Ian for their answers to my previous questions but I
have not been able to figure out the above one yet.
Thank you very much.
Regards,
Sachin
.
But somewhere I am missing the important link in the process.
From what I see, field.store is specified on all the fields, but
field.index is not specified explicitly.
Thank you again, I will keep looking into the code.
Regards,
Sachin
On Mon, Aug 4, 2014 at 10:26 AM, Kumaran R wrote:
>
.
Regards,
Sachin Kulkarni
be if instead of writing the feature vectors out to a file
using Lucene that I include my external code into Lucene and make it use the
feature vectors in memory?
Thank you very much.
Regards,
Sachin
.
Kind Regards,
Sachin Kulkarni
On Tue, Oct 9, 2012 at 6:59 AM, Andrzej Bialecki wrote:
> Hi all,
>
> Together with Grant Ingersoll and Robert Muir we have submitted a paper to
> the "SIGIR 2012 Workshop on Open Source Information Retrieval" held on 16
> Aug
Hi,
I am using the TrecParserByPath in Lucene to index the TREC disc 4-5 data.
This covers all the file types except the CR collection.
Is Lucene using the default Gov2parser to parse the CR collection?
Is there a parser that can be used for the CR collection directly?
Thank you.
Regards,
Sachin
your index to the new
format with IndexUpgrader first."
So basically in my case I do not need to set it in the .alg file.
On Wed, Sep 5, 2012 at 7:58 AM, Sachin Kulkarni wrote:
> Hi,
>
> For Lucene core 4.0. BETA, under the search.similarities help page it says
> the followin
ty measure for search.
Thank you.
Regards,
Sachin
updated and JIRA on apache shows last update was in 2010.
Any more information is greatly appreciated.
Thank you.
Regards,
Sachin
at is the best way to
fetch the results?
I hope I have made myself clear.
Thanks
Sachin
On Tue, 2008-01-08 at 20:13 +0530, Developer Developer wrote:
> Provide more details please.
>
> Can you not use boolean query and filters if need be ?
>
>
>
> On Jan 8, 2008 7:23 AM
. Doing this is very easy with an SQL query, as we just need to
write a self-join query and the database does the rest for you.
What is the best way of implementing the above functionality in Lucene?
Regards
Sachin
-
To unsubscribe, e-mail
enabled TermVector when creating the Document.
i.e. new Field(, TermVector.YES) (see
http://lucene.apache.org/java/docs/api/org/apache/lucene/document/Field.TermVector.html
for the full array of options)
-Grant
On Mar 13, 2007, at 1:24 PM, Kainth, Sachin wrote:
> Hi all,
>
Hi all,
The documentation for the above method mentions something called a
vectorized field. Does anyone know what a vectorized field is?
This email and any attached files are confidential and copyright protected. If
you are not the addressee, any dissemination of this communication is stric
Hi all,
Is it possible to search whether a term is equal to the entire contents
of a field rather than that the field contains a term?
So for example if I have a field with this text: "world cup" and I do a
search for "cup" I want it to return false but for another field that
contains exactly the
Hi Ashwin,
Well in that case you might need to use Ifilters some other way instead
of through SeekAFile. I don't know how since I haven't used it myself.
Perhaps someone else here has.
Sachin
-Original Message-
From: ashwin kumar [mailto:[EMAIL PROTECTED]
Sent: 09 March 200
Hi Tony,
Lucene certainly does support it. It just requires you to use a
tokeniser that performs stemming such as any analyzer that uses
PorterStemFilter.
Sachin
-Original Message-
From: Tony Qian [mailto:[EMAIL PROTECTED]
Sent: 08 March 2007 16:52
To: java-user@lucene.apache.org
Hi all,
I have been performing some tests on index segments and have a problem.
I have read the file formats document on the official website and from
what I can see it should be possible to create as many segments for an
index as there are documents (though of course this is not a great
idea). H
link pls
ashwin
On 3/8/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote:
>
> Well you don't need to actually save the text to disk and then index
> the saved index file, you can directly index that text in-memory.
>
> The only other way I have heard of is to use Ifilters. I
Well you don't need to actually save the text to disk and then index the
saved index file, you can directly index that text in-memory.
The only other way I have heard of is to use Ifilters. I believe
SeekAFile does indexing of pdfs.
Sachin
-Original Message-
From: ashwin
= PDDocument.load(filename);
// create stripper (wish I had the power to do that -
wouldn't leave the house)
PDFTextStripper stripper = new PDFTextStripper();
// get text from doc using stripper
return stripper.getText(doc);
}
Sachin
-Original Me
that a single record will only appear in one spanned file?
Many thanks for your advice
Sachin
not numeric ranges. Is there a way to use numeric
ranges?
-Original Message-
From: Seeta Somagani [mailto:[EMAIL PROTECTED]
Sent: 26 February 2007 15:23
To: java-user@lucene.apache.org
Subject: RE: Date Searches
This might help.
http://www.catb.org/~esr/faqs/smart-questions.html
-
Anybody?
> __
> From: Kainth, Sachin
> Sent: 26 February 2007 13:36
> To: 'java-user@lucene.apache.org'
> Subject: Date searches
>
> Hi all,
>
> I have an index in which dates are represented a
Hi all,
I have an index in which dates are represented as ranges of two integers
(there are two fields, one for each integer). The two integers are years.
AD dates are represented as a positive integer and BC dates as a
negative one. There are three possible types of ranges. These are
listed below
Hi all,
I am using the IndexModifier class to perform index modification. I
have deleted 1 document from an index and the output indicates that 1
document does indeed get deleted. However, running the program again
reveals that the document deleted has appeared again in the index. This
despite
I've just been looking at IndexReader and it seems you can do it using
that, but I don't know which concrete implementation of IndexReader to
use.
-Original Message-
From: Michael McCandless [mailto:[EMAIL PROTECTED]
Sent: 23 February 2007 15:07
To: java-user@lucene.apache.org
Subject: R
hat I don't know is how do we delete documents from the index and
how we replace documents in the index where those documents have
changed.
Cheers
Sachin
't go to the other classes unless you start getting performance
problems with Hits. The main take-away from Hits is that it'll
re-execute the query every 100 documents you read from it or so, so the
only time you care is when you find yourself assembling large numbers of
documents...
Erick
O
What can you use in place of Hits and how do they differ?
-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED]
Sent: 21 February 2007 22:43
To: java-user@lucene.apache.org
Subject: Re: Returning only a small set of results
: A question about efficiency and the internal wor
ighly inefficient.
Also, if we do physically get back everything then is there a way of
ensuring we only get back a few at a time?
Thanks
Sachin
2/21/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> I was wondering if Lucene provides any mechanism which helps in
> pagination. In other words is there a way to return the first 10 of
> 500 results and then the next 10 and so on.
>
> Cheers
>
>
>
The
> code, as written, assumes you're using a MemoryIndex for one and only
> one document, so unless you need complex queries, I'd just think about
> rewriting simple queries with ANDs as a SpanNearQuery.
Well, what I meant was instead of using a gap of 1000 what I was
think
Hello,
I was wondering if Lucene provides any mechanism which helps in
pagination. In other words is there a way to return the first 10 of 500
results and then the next 10 and so on.
Cheers
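
One common approach, assumed here rather than taken from this thread, is to run the query for each request and read only the slice of the ranked results that the requested page needs. The sketch below shows the slicing arithmetic in plain Java; the `Pager` class is hypothetical and a plain list stands in for the ranked hits returned by a search.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; a List stands in for the ranked hits of a search.
final class Pager {

    // Returns the results for a zero-based page number, or an empty list
    // when the page lies past the end of the results.
    static <T> List<T> page(List<T> ranked, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize > 0");
        }
        // Use long arithmetic so large page numbers cannot overflow.
        int from = (int) Math.min((long) pageNumber * pageSize, ranked.size());
        int to = (int) Math.min((long) from + pageSize, ranked.size());
        return new ArrayList<>(ranked.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            hits.add(i); // pretend these are 500 document ids in rank order
        }
        System.out.println(page(hits, 0, 10)); // first 10 results
        System.out.println(page(hits, 1, 10)); // next 10 results
    }
}
```

Only the documents inside the requested slice need to be loaded, so the cost of a page does not grow with the total number of matches that are never displayed.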
//www.nabble.com/Search-in-all-fields-tf3254569.html
: Date: Tue, 20 Feb 2007 12:29:25 -
: From: "Kainth, Sachin" <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Search for a term in all fields
:
: Hi all,
:
: How do I
one and only one
document, so unless you need complex queries, I'd just think about
rewriting simple queries with ANDs as a SpanNearQuery.
Best
Erick
On 2/19/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote:
>
> Hi All,
>
> I want to be able to do a search for a term in all f
ent.
My question is this: is there a third way?
Cheers
Sachin
Hi Erik,
I looked at the QueryParser API doc but I can't seem to find what the
default field is. Also, how would the syntax of the index code differ
when indexing a word to the default field from this:
Doc.Add(Field.Text("album", Album));
Cheers
Sachin
-Original Message--
very complex
such as:
((album = Thriller AND artist = (Michael OR Jackson)) OR (date between X
AND Y)) AND (label = sony OR Epic) etc...
b) For such a query what are the performance penalties compared to a
simple search involving 1 term?
Cheers
Sachin
that these are the
caches that are built at the first query. So, say storing the results of
a query somewhere and returning that stored copy for the *next* query
that is identical is not something I'd expect Lucene to do.
Best
Erick
On 2/14/07, Kainth, Sachin <[EMAIL PROTECTED]>
op of Lucene to improve on it.
Any comments will be appreciated.
Thanks
Sachin
I have a similar request. Does anyone know if Lucene is capable of
implementing polyhierarchical taxonomies?
-Original Message-
From: Saroja Kanta Maharana [mailto:[EMAIL PROTECTED]
Sent: 13 February 2007 13:45
To: java-user@lucene.apache.org
Subject: Re: Please Help me
Hi All,
A
I believe that this happens because "AND", "OR" and "NOT" are all
reserved words for joining together other search terms and therefore if
you don't want the exception thrown then you must capture any "AND",
"OR" and "NOT"s that are entered on their own and not pass them to the
QueryParser.
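
A minimal sketch of such a guard follows, in plain Java. The `OperatorGuard` class is hypothetical, and the check assumes the operators are recognised only in upper case; the caller would skip the QueryParser (or show a message) when the check returns true.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical pre-check to run on user input before handing it to a query parser.
final class OperatorGuard {

    // Assumption: only the upper-case forms act as operators.
    private static final Set<String> RESERVED =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT"));

    // True when the input consists solely of reserved operator words,
    // for example "AND" or "OR NOT", with no real search term.
    static boolean isOnlyOperators(String userInput) {
        String trimmed = userInput.trim();
        if (trimmed.isEmpty()) {
            return false; // an empty query is a separate case for the caller
        }
        for (String token : trimmed.split("\\s+")) {
            if (!RESERVED.contains(token)) {
                return false; // found at least one ordinary term
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isOnlyOperators("AND"));           // true
        System.out.println(isOnlyOperators("cats AND dogs")); // false
    }
}
```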
-O
-jar start.jar
Regards,
Marius Hanganu
On 2/12/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote:
>
> Hello all,
>
> When running the example in the solr release has anyone come up with
> the following issue when going to http://localhost:8983/solr/admin/:
>
> HTTP ER
org.apache.tools.ant.taskdefs.Javac.compile(Javac.java:929)
at org.apache.tools.ant.taskdefs.Javac.execute(Javac.java:758)
at
org.apache.jasper.compiler.Compiler.generateClass(Compiler.java:382)
at
org.apache.jasper.compiler.Compiler.compile(Compiler.java:472)
...
Sachin
Java Lucene to
do the indexing and searching? I really want to be able to use
dotLucene. Any help would be appreciated.
Many thanks
Sachin
-Original Message-
From: Patrick Kimber [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 15:11
To: java-user@lucene.apache.org
Subject: Re
Hello all,
Does anyone know if there is a .NET version of Lucene Web Service?
Cheers
What does solr provide and how can I use it with dotLucene?
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 14:11
To: java-user@lucene.apache.org
Subject: Re: categorisation
On Feb 9, 2007, at 9:08 AM, Kainth, Sachin wrote:
> Are you saying t
Are you saying that without solr I will have caching problems under
load?
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 14:06
To: java-user@lucene.apache.org
Subject: Re: categorisation
On Feb 9, 2007, at 7:07 AM, Kainth, Sachin wrote:
>
But does that not imply that a second search is made against the index
by the line:
BitSet all = (new QueryFilter(q)).bits(reader)
-Original Message-
From: Kainth, Sachin [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 12:05
To: java-user@lucene.apache.org
Subject: RE: categorisation
Ahhh it all makes sense to me now :-)
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 12:01
To: java-user@lucene.apache.org
Subject: Re: categorisation
On Feb 9, 2007, at 5:40 AM, Kainth, Sachin wrote:
> It makes sense to me only if you tell
You are right I didn't think about it at all to be honest.
-Original Message-
From: karl wettin [mailto:[EMAIL PROTECTED]
Sent: 09 February 2007 10:46
To: java-user@lucene.apache.org
Subject: Re: Empty search
9 feb 2007 kl. 11.34 skrev Kainth, Sachin:
> Yep it is the querypar
PM, Kainth, Sachin wrote:
> Chris has given an example of how to perform categorisation of lucene
> searches:
>
> String[] mfgs = ...;
> String query = "+category:cameras +price:[0 to 10]";
> Query q = QueryParser.parse(query);
> Hits results = searcher.search(q, my
e.org
Subject: Re: Empty search
8 feb 2007 kl. 18.46 skrev Kainth, Sachin:
> Is it my imagination or does lucene produce an error if you present it
> with an empty string to search for?
I presume you are referring to the QueryParser? It sounds about right
that it would throw an except
/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
On 2/8/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I wanted to know if it is possible to store some fields in an index
> with one analyzers and other fields with another analyzer?
>
> Chee
Can you give me an example of how this might be done?
-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: 08 February 2007 17:34
To: java-user@lucene.apache.org
Subject: Re: Analyzers
Use PerFieldAnalyzerWrapper.
On 2/8/07, Kainth, Sachin <[EMAIL PROTECTED]>
Chris has given an example of how to perform categorisation of lucene searches:
String[] mfgs = ...;
String query = "+category:cameras +price:[0 to 10]";
Query q = QueryParser.parse(query);
Hits results = searcher.search(q, mySort);
BitSet all = (new QueryFilter(q)).bits(reader);
int[
Thanks Erik,
Is there a .NET version of Solr?
Cheers
Sachin
-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: 08 February 2007 15:26
To: java-user@lucene.apache.org
Subject: Re: 'a', 's' and 't' don't index properly
>F
Thanks Erik,
Do you know of an analyzer which doesn't remove the characters 'a', 's'
and 't'?
Sachin
-Original Message-
From: Erick Erickson [mailto:[EMAIL PROTECTED]
Sent: 08 February 2007 13:54
To: java-user@lucene.apache.org
Subject: Re: 'a&
solr
is as it seems to be more suited to my application?
Thanks
Sachin
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: 08 February 2007 13:48
To: java-user@lucene.apache.org
Subject: Re: Counting and Categorisation
On Feb 8, 2007, at 8:28 AM, Kainth, Sachin wrote
sorted order (all other records are
> sorted correctly).
>
> Here is my search command:
>
> Hits hits = searcher.Search(query, new Sort(new SortField[] { new
> SortField("firstletter", SortField.STRING)}));
>
> What I don't know is whether the fault lies i
ine their search.
What I'm asking then is for some specific information about how I can
perform the categorisation and counts.
Many thanks
Sachin
.yahoo.com/pickupartistmistry
password: chotachetan
The document describes in detail how indexing works in Lucene.
It has enough diagrams to simplify the whole idea.
I am currently working on further indexing using an RDBMS.
It would be nice to receive review comments on this document.
Thank you
Sac
I feel that implementing Lucene inside an RDBMS is nothing but an
implementation of the following interfaces:
TermDocs
TermVector
TermPositions
-Original Message-
From: Karel Tejnora [mailto:[EMAIL PROTECTED]
Sent: Friday, October 06, 2006 4:11 PM
To: java-user@lucene.apache.org
Subject: Re:
I am working on Lucene. I have prepared a document about in-depth
indexing. Unfortunately, I can't attach it to the mail due to a site
constraint, but I can send it to your personal email.
---
Sachin
-Original Message-
From: Ajani, Akil (Cognizant) [mailto:[EMAIL PROTECTED]
Hello,
A very small and sweet question:
Does Apache allow me to change the final classes which are distributed by
Apache for Scorers? Or can I copy and paste some of the Lucene code into my
commercial application within my organization?
TermScorer, BooleanScorer are final classes. But all other sc
It would be nice if someone could share the design documents of Lucene with me.
Hello Great/smart guys
This is my first question for this group, as I
started working on Lucene last month.
Lucene provides scoring of documents based
on TF-IDF vector analysis. Lucene also provides the Scorer and Weight inside
the Search package. By implementin