I agree with their viewpoint!
On Thu, 3 Feb 2005 14:29:13 -0800 (PST), Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> Using different analyzers for indexing and searching is not
> recommended.
> Your numbers are not even in the index because you are using
> StandardAnalyzer. Use Luke to look at your i
I think you can use a filter to get the right result!
See the example below:
package lia.advsearching;
import junit.framework.TestCase;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.
Your understanding is right!
The old existing files should be deleted, but it will build new files!
On Thu, 03 Feb 2005 17:36:27 -0800 (PST),
[EMAIL PROTECTED] <[EMAIL PROTECTED]>
wrote:
> Hi,
>
> When I run an optimize in our production environment, old index are
> left in the directory and ar
Hi,
When I run an optimize in our production environment, old index files are
left in the directory and are not deleted.
My understanding is that an
optimize will create new index files and all existing index files should be
deleted. Is this correct?
We are running Lucene 1.4.2 on Windows.
Bingo! Nice catch. That was it. Made everything lower case when I set the
field. Works great now.
Thanks!
Luke
- Original Message -
From: "Kauler, Leto S" <[EMAIL PROTECTED]>
To: "Lucene Users List"
Sent: Thursday, February 03, 2005 6:48 PM
Subject: RE: Parsing The Query: Every documen
Because you are building from QueryParser rather than a TermQuery, all
search terms in the query are being lowercased by StandardAnalyzer.
So your query of "olFaithFull:stillhere" requires that there is an exact
index term of "stillhere" in that field. It depends on how you built
the index (index an
"stillHere"
Capital H.
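Leto's catch can be sketched in plain Java (a hedged illustration of the behaviour described above, not the Lucene API itself; the class and method names here are hypothetical): QueryParser with StandardAnalyzer lowercases the query term, while an exact term lookup does no analysis, so the value has to be lowercased at index time for the two sides to meet.

```java
public class CaseMismatchSketch {
    // QueryParser + StandardAnalyzer lowercase query terms.
    static String parseQueryTerm(String raw) {
        return raw.toLowerCase();
    }

    // An exact term lookup, like a TermQuery: no analysis, no lowercasing.
    static boolean termMatches(String indexedTerm, String queryTerm) {
        return indexedTerm.equals(queryTerm);
    }

    public static void main(String[] args) {
        String indexed = "stillHere";               // as stored in the field
        String query = parseQueryTerm("stillHere"); // becomes "stillhere"
        System.out.println(termMatches(indexed, query));               // prints false
        System.out.println(termMatches(indexed.toLowerCase(), query)); // prints true
    }
}
```

This is why lowercasing the value when setting the field fixed the query.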
- Original Message -
From: "Kauler, Leto S" <[EMAIL PROTECTED]>
To: "Lucene Users List"
Sent: Thursday, February 03, 2005 6:40 PM
Subject: RE: Parsing The Query: Every document that doesn't have a field
containing x
First thing that jumps out is case-sensitivity
First thing that jumps out is case-sensitivity. Does your olFaithFull
field contain "stillHere" or "stillhere"?
--Leto
> -Original Message-
> From: Luke Shannon [mailto:[EMAIL PROTECTED]
> This works:
>
> query1 = QueryParser.parse("jpg", "kcfileupload", new
> StandardAnalyzer()); qu
This works:
query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
query2 = QueryParser.parse("stillHere", "olFaithFull", new
StandardAnalyzer());
BooleanQuery typeNegativeSearch = new BooleanQuery();
typeNegativeSearch.add(query1, false, false);
typeNegativeSearch.add(query2,
Using different analyzers for indexing and searching is not
recommended.
Your numbers are not even in the index because you are using
StandardAnalyzer. Use Luke to look at your index.
Otis
--- Hetan Shah <[EMAIL PROTECTED]> wrote:
> Hello,
>
> How can one search for a document based on the qu
Hetan Shah wrote:
Hello,
How can one search for a document based on the query which has numbers
in the query string.
e.g. query = Java 2 Platform J2EE
What do I need to do so that the numbers do not get neglected.
I am using StandardAnalyzer to index the pages and using StopAnalyzer to
search th
Hello,
How can one search for a document based on the query which has numbers
in the query string.
e.g. query = Java 2 Platform J2EE
What do I need to do so that the numbers do not get neglected.
I am using StandardAnalyzer to index the pages and using StopAnalyzer to
search the documents. Would
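The number loss can be illustrated in plain Java (a hedged sketch, not the actual Lucene analyzer classes; the method names are hypothetical). StopAnalyzer tokenizes on letters only, so a search-side analysis drops "2" and splits "J2EE", while a tokenizer that keeps alphanumerics preserves them:

```java
import java.util.ArrayList;
import java.util.List;

public class NumberTokenSketch {
    // Letters-only tokenization (roughly what StopAnalyzer's tokenizer does):
    // digits act as separators, so "2" disappears and "J2EE" breaks apart.
    static List lettersOnly(String text) { return split(text, false); }

    // Alphanumeric tokenization: digits are kept inside tokens.
    static List alphanumeric(String text) { return split(text, true); }

    private static List split(String text, boolean keepDigits) {
        List tokens = new ArrayList();
        StringBuffer current = new StringBuffer();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean keep = Character.isLetter(c)
                    || (keepDigits && Character.isDigit(c));
            if (keep) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("Java 2 Platform J2EE"));  // prints [Java, Platform, J, EE]
        System.out.println(alphanumeric("Java 2 Platform J2EE")); // prints [Java, 2, Platform, J2EE]
    }
}
```

If the index-side and search-side analyzers disagree like this, the number terms can never match, which is why using the same analyzer for both is recommended.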
I did; I have run both queries in Luke.
kcfileupload:ppt
returns 1
olFaithfull:stillhere
returns 119
Luke
- Original Message -
From: "Maik Schreiber" <[EMAIL PROTECTED]>
To: "Lucene Users List"
Sent: Thursday, February 03, 2005 4:55 PM
Subject: Re: Parsing The Query: Every document
Yes. There should be 119 with stillHere,
You have double-checked that, haven't you? :)
and if I run a query in Luke on
kcfileupload = ppt, it returns one result. I am thinking I should at least
get this result back with: -kcfileupload:jpg +olFaithFull:stillhere?
You really should.
--
Maik Schreiber
Yes. There should be 119 with stillHere, and if I run a query in Luke on
kcfileupload = ppt, it returns one result. I am thinking I should at least
get this result back with: -kcfileupload:jpg +olFaithFull:stillhere?
Luke
- Original Message -
From: "Maik Schreiber" <[EMAIL PROTECTED]>
To
-kcfileupload:jpg +olFaithFull:stillhere
This looks right to me. Why the 0 results?
Looks good to me, too. You sure all your documents have
olFaithFull:stillhere and there is at least a document with kcfileupload not
being "jpg"?
--
Maik Schreiber * http://www.blizzy.de <-- Get GMail invites
Jian,
I disagree that the Google Mini is useless. $5000 is quite inexpensive
for a commercial search engine. I know of search engines where the cost
is practically 20 cents per document. Heck, a decent server capable of
running a heavily loaded search engine costs $3000. Also, don't forget
you
Hello,
Still working on the same query, here is the code I am currently working
with.
I am thinking this should bring up all the documents that have
olFaithFull=stillHere and kcfileupload!=jpg (so anything else)
query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
query2 =
Ok.
I have added the following to every document:
doc.add(Field.UnIndexed("olFaithfull", "stillHere"));
The plan is a query that says: olFaithfull = stillHere and kcfileupload != jpg.
I have been experimenting with the MultiFieldQueryParser, but this is not
working out for me. From a syntax how is thi
One which we've been using can be found at:
http://www.ltg.ed.ac.uk/~richard/ftp-area/html-parser/
We absolutely need to be able to recover gracefully from malformed
HTML and/or SGML. Most of the nicer SAX/DOM/TLA parsers out there
failed this criterion when we started our effort. The above one
On Thursday 03 February 2005 20:18, Bill Tschumy wrote:
> Is there any way to construct a query to locate all documents without a
> specific field? By this I mean the Document was created without ever
> having that field added to it.
One way is to add an extra document field containing the fiel
Thanks!
I can wait for the release.
Luke
- Original Message -
From: "Andrzej Bialecki" <[EMAIL PROTECTED]>
To: "Lucene Users List"
Sent: Thursday, February 03, 2005 2:53 PM
Subject: Re: Synonyms Not Showing In The Index
> Andrzej Bialecki wrote:
> > Luke Shannon wrote:
> >
> >> Hell
Andrzej Bialecki wrote:
Luke Shannon wrote:
Hello;
It seems my Synonym analyzer is working (based on some successful
queries).
But I can't see the synonyms in the index using Luke. Is this correct?
Did you use the combined JAR to run? It contains an oldish version of
Lucene... Other than that, I
Is there any way to construct a query to locate all documents without a
specific field? By this I mean the Document was created without ever
having that field added to it.
--
Bill Tschumy
Otherwise -- Austin, TX
http://www.otherwise.com
--
Alternatively, add a dummy field-value to all documents, like
doc.add(Field.Keyword("foo", "bar"))
Waste of space, but allows you to perform negated queries.
On Thu, 03 Feb 2005 19:19:15 +0100, Maik Schreiber wrote:
>> Negating a term must be combined with at least one nonnegated
>> term to retu
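The dummy-field trick can be sketched in plain Java (an illustration of the query logic only, not the Lucene query API; all class and field-handling names here are hypothetical). Because every document carries foo:bar, the query "+foo:bar -kcfileupload:jpg" has a non-negated term to start from, and the negation then subtracts from that set:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class DummyFieldSketch {

    // Evaluate "+foo:bar -kcfileupload:jpg" by hand over a list of documents.
    static List search(List docs) {
        List hits = new ArrayList();
        for (Iterator it = docs.iterator(); it.hasNext();) {
            Map doc = (Map) it.next();
            boolean required = "bar".equals(doc.get("foo"));            // +foo:bar
            boolean prohibited = "jpg".equals(doc.get("kcfileupload")); // -kcfileupload:jpg
            if (required && !prohibited) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Every document gets the dummy keyword, even when kcfileupload is absent.
    static Map doc(String upload) {
        Map d = new HashMap();
        d.put("foo", "bar");
        if (upload != null) {
            d.put("kcfileupload", upload);
        }
        return d;
    }

    public static void main(String[] args) {
        List docs = new ArrayList();
        docs.add(doc("jpg"));  // excluded by the negation
        docs.add(doc("ppt"));  // matches
        docs.add(doc(null));   // no kcfileupload field at all: still matches
        System.out.println(search(docs).size()); // prints 2
    }
}
```

Note how the document with no kcfileupload field at all is also returned, which is exactly what makes this a workaround for "find documents without a field".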
On Thursday 03 February 2005 11:38, Nick Burch wrote:
> Hi All
>
> I'm using lucene from CVS, and I've discovered that when rewriting a
> BooleanQuery created with the old-style (Query,boolean,boolean) method,
> the rewrite will cause the required parameters to get lost.
>
> Using old style (Query,boo
Negating a term must be combined with at least one nonnegated term to return
documents; in other words, it isn't possible to use a query like NOT term to
find all documents that don't contain a term.
So does that mean the above example wouldn't work?
Exactly. You cannot search for "-kcfileupload:jp
Hello;
I have a query that finds document that contain fields with a specific
value.
query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer());
This works well.
I would like a query that find documents containing all kcfileupload fields
that don't contain jpg.
The example I fou
Thank you for your reply.
I am already using the compound file format, and minMergeDocs is already
increased to 50.
From my understanding and observation, files are compounded at the end of
indexing. The error happens during indexing, so the compound file format
should not matter.
Chris Lu
Will Allen
Increase minMergeDocs and use the compound file format when creating your
index.
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#setUseCompoundFile(boolean)
-Original Mes
Hi,
I am getting this exception now and then when I am indexing content.
It doesn't always happen. But when it happens, I have to delete the
index and start over again.
This is a serious problem for us.
In this email, Doug said it has something to do with Win32's lack of
atomic renaming.
http://
The indexing process is totally synchronized in our system. Thus if an
Indexing thread starts up and the index exists, but is locked, I know this
to be the only indexing process running, so the lock must be from a
process that got stopped before it could finish.
So right before I begin writing t
Hello
A commit.lock can get left by a process that dies in the middle of
reading the index, for example because of an OutOfMemoryError. How can I
handle such a left lock gracefully the next time the process runs?
Checking if there is a lock is straightforward - but how can I be sure
that it is
Kevin L. Cobb wrote:
We recently started using SVN for SCM; we were using VSS. We're trying out
approach A, branching off for each release. Development always develops
on the trunk, except when a bug is discovered that needs to be patched
to a previous version of the product. When that scenario comes
I am trying to do some filtering and rearrangement of search results. Two
possibilities that come to mind are iterating through the Hits or making a
custom HitCollector.
All documentation invariably warns about the performance impact of using a
HitCollector with a large result set. The scenario that google
For all the parser suggestions I think there is one important attribute. Some
parsers return data provided that the input HTML is sensible. Some parsers
are designed to be as flexible and tolerant as they can be. If the input is
clean and controlled, the former class is sufficient. Even some regular
On Feb 3, 2005, at 9:26 AM, Owen Densmore wrote:
Is this the right way to make a porter analyzer using the standard
tokenizer? I'm not sure about the order of the filters.
Owen
class MyAnalyzer extends Analyzer {
public TokenStream tokenStream(String fieldName, Reader reader) {
Is this the right way to make a porter analyzer using the standard
tokenizer? I'm not sure about the order of the filters.
Owen
class MyAnalyzer extends Analyzer {
public TokenStream tokenStream(String fieldName, Reader reader) {
return new PorterStemFilter(
new Sto
We recently started using SVN for SCM, were using VSS. We're trying out
approach A, branching off for each release. Development always develops
on the trunk, except when a bug is discovered that needs to be patched
to a previous version of the product. When that scenario comes up (and
it never has)
Hi,
> From: mahaveer jain [mailto:[EMAIL PROTECTED]
> I am using lucene to index and search my app. Till date I am
> just showing file name or title based on my application. We
> want to show, phrases that contain the keyword searched.
> Has anybody tried this ? Can someone help me start this
Hi All,
I am using lucene to index and search my app. Till
date I am just showing file name or title based on my
application. We want to show, phrases that contain the
keyword searched.
Has anybody tried this ? Can someone help me start
this ?
Thanks
Mahaveer
We can work the 1.x and 2.0 lines of code however we need to. We can
branch (a branch or tag in Subversion is inexpensive and a constant
time operation). How we want to manage both versions of Lucene is open
for discussion. Nothing about Subversion changes how we manage this
from how we'd do
You're missing the Commons Digester JAR, which is in the lib directory
of the LIA download. Check the build.xml file for the build details of
how the compile class path is set. You'll likely need some other JAR's
at runtime too.
Erik
On Feb 3, 2005, at 2:12 AM, jac jac wrote:
Hi,
I ju
Hi All
I'm using lucene from CVS, and I've discovered that when rewriting a
BooleanQuery created with the old-style (Query,boolean,boolean) method,
the rewrite will cause the required parameters to get lost.
Using old style (Query,boolean,boolean):
query = +contents:test* +(class:1.2 class:1.2.*)
rewr
Thank you, I will do that.
> Karl Koch wrote:
>
> >I apologise in advance if some of my writing here has been said before.
> >The last three answers to my question have been suggesting pattern
> matching
> >solutions and Swing. Pattern matching was introduced in Java 1.4 and
> Swing
> >is somet
Karl,
Two things, try to experiment with both:
1) I would try to write a lexical scanner that strips HTML tags, much
like the regular expression does. Java lexical scanner packages produce
nice pure Java classes that seldom use any advanced API, so they should
work on Java 1.1. They are simple s
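The lexical-scanner idea can be sketched as a tiny character-level state machine using only java.lang, so it runs on Java 1.1 (no regex, no Swing). This is a hedged sketch assuming the well-formed HTML Karl describes; the class name is hypothetical, and it deliberately ignores edge cases like '>' inside attribute values.

```java
public class TagStripper {
    // Strip tags from well-formed HTML: copy characters through, but stay
    // silent between '<' and the matching '>'. Uses StringBuffer so it is
    // Java 1.1-compatible.
    public static String strip(String html) {
        StringBuffer out = new StringBuffer();
        boolean inTag = false;
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>') {
                inTag = false;
            } else if (!inTag) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("<p>Hello <b>world</b></p>")); // prints "Hello world"
    }
}
```

The whole thing is one short method with no dependencies, which fits the single-class, low-memory constraint of the PDA environment.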
Karl Koch wrote:
I apologise in advance if some of my writing here has been said before.
The last three answers to my question have been suggesting pattern matching
solutions and Swing. Pattern matching was introduced in Java 1.4 and Swing
is something I cannot use since I work with Java 1.1 on a
I am using Java 1.1 with a Sharp Zaurus PDA. I have very limited memory
constraints. I do not think CPU performance is a big issue though. But I
have other parts in my application which use quite a lot of memory and
something runs short. I therefore do not look into solutions which build up
tag tre
Karl Koch wrote:
Unfortunately I am faithful ;-). Just for practical reasons I want to do that
in a single class or even method called by another part in my Java
application. It should also run on Java 1.1 and it should be small and
simple. As I said before, I am in control of the HTML and it will b
I apologise in advance if some of my writing here has been said before.
The last three answers to my question have been suggesting pattern matching
solutions and Swing. Pattern matching was introduced in Java 1.4 and Swing
is something I cannot use since I work with Java 1.1 on a PDA.
I am wonde
On Wed, 2005-02-02 at 22:11 -0500, Erik Hatcher wrote:
> I've seen both of these types of procedures followed on Apache
> projects. It really just depends. Lucene's codebase is not being
> modified frequently, so it is not necessary to branch and merge back.
> Rather we simply develop off of
Karl Koch wrote:
Hello Sergiu,
thank you for your help so far. I appreciate it.
I am working with Java 1.1 which does not include regular expressions.
Why are you using Java 1.1? Are you so limited in resources?
What operating system do you use?
I assume that you just need to index the html files
Unfortunately I am faithful ;-). Just for practical reasons I want to do that
in a single class or even method called by another part in my Java
application. It should also run on Java 1.1 and it should be small and
simple. As I said before, I am in control of the HTML and it will be well
formatted,
Hello Sergiu,
thank you for your help so far. I appreciate it.
I am working with Java 1.1 which does not include regular expressions.
Your turn ;-)
Karl
> Karl Koch wrote:
>
> >I am in control of the html, which means it is well formatted HTML. I use
> >only HTML files which I have transformed
Luke Shannon wrote:
Hello;
It seems my Synonym analyzer is working (based on some successful queries).
But I can't see the synonyms in the index using Luke. Is this correct?
Did you use the combined JAR to run? It contains an oldish version of
Lucene... Other than that, I'm not sure - if you can't