On Jan 19, 2004, at 7:27 PM, Syrén Per wrote:
Hi all,
Have a question concerning indexing of HTML files.
One of the files I'm trying to index has an input type=image ... tag
that also contains a call to a JavaScript function with a string
argument that is about 1300 characters long. At this point Lucene
On Jan 18, 2004, at 11:15 AM, Karl Koch wrote:
let's say I have an index with documents encoded in two fields,
filename and data. Is it possible to extract a file whose filename I
know directly from this index without performing any search, like
random access in a filesystem?
It is
You're missing something in your explanation. Lucene does not create
XML files.
On Jan 15, 2004, at 11:35 AM, Pierce, Tania wrote:
Let me preface this by saying I am a total beginner to
apache/java/tomcat/cocoon etc. I'm thankfully fluent in xml/xslt or
this would be a nightmare.
Anyway, I
-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 13, 2004 3:19 AM
To: Lucene Users List
Subject: Re: Philosophy(??) question
On Jan 12, 2004, at 7:59 PM, Scott Smith wrote:
I have some documents I'm indexing which have multiple languages in
them
(i.e., some fields in the document are always English; other fields
may be
other languages). Now, I understand why a query against a certain
field
must use the same analyzer
On Jan 13, 2004, at 7:26 AM, [EMAIL PROTECTED] wrote:
Example: I have a very long text. I parse this text with a
WhitespaceAnalyser. From this text I generate an index. From this
index I get each word together with its absolute frequency / relative
frequency.
Can I do it without generating an
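Frequencies like these don't actually require an index at all; a minimal plain-Java sketch (the class and method names are my own, and the whitespace split only approximates what WhitespaceAnalyzer produces):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: absolute and relative term frequencies computed straight
// from the text, with a whitespace split approximating what
// WhitespaceAnalyzer would produce. No index is involved.
public class TermFrequencies {

    // Absolute frequency of every whitespace-separated token.
    public static Map<String, Integer> absolute(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Relative frequency of one term: its count over the total count.
    public static double relative(Map<String, Integer> counts, String term) {
        int total = 0;
        for (int n : counts.values()) {
            total += n;
        }
        return total == 0 ? 0.0 : counts.getOrDefault(term, 0) / (double) total;
    }
}
```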
On Jan 12, 2004, at 7:49 PM, Scott Smith wrote:
Does the following do that:
BooleanQuery QA = new BooleanQuery();
Query qa1 = QueryParser.parse(A1, "FieldA", analyzer());
Query qa2 = QueryParser.parse(A2, "FieldA", analyzer());
QA.add(qa1, false, false); //
On Jan 13, 2004, at 5:21 PM, Scott Smith wrote:
I guess what is confusing me now is that the search code no longer
references an analyzer???!!! How does it know how to tokenize, stem,
etc.
the search terms?
It doesn't. A TermQuery is exactly as-is. If you need the analysis
part, you can use
On Jan 13, 2004, at 6:19 PM, Patrick Kates wrote:
I have a text field called ACTIVE_YEAR that stores (of course) a year
like
2003. When I index this field I can see the number in my index (using
Luke)
but I can't search it. If I add a text character to the end of the
field
and index it (200x)
On Jan 12, 2004, at 6:24 AM, [EMAIL PROTECTED] wrote:
who knows other software projects (like Nutch) which are based on and
built around Lucene? I think it can be quite interesting and helpful
for new people to see and learn from examples...
This is the purpose of the Powered by section on
On Jan 12, 2004, at 8:21 AM, Thomas Scheffler wrote:
OK, I've looked inside QueryParser and it seems to be the right place
to do that. But it's rather complicated to transform one query into
another, since QueryParserTokenManager, as an extreme example, is not
quite understandable and needs a huge
On Jan 10, 2004, at 1:43 PM, [EMAIL PROTECTED] wrote:
would it be possible to implement an Analyser that filters HTML code
out of an HTML page. As a result I would have only the text, free of
any tagging.
The dilemma is that in a general sense there are multiple fields in
HTML. At least title and
On Jan 7, 2004, at 4:18 PM, Dror Matalon wrote:
Actually I would guess that performance should be fine. I would look at
the code generated by the standard analyzer,
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/analysis/standard/package-summary.html
which translates from (a AND b)
Actually, creating a Field with a Reader means the field data is
unstored. It is indexed, but the original text is not retrievable as
it is not in the index (yes, it is tokenized, but not kept as a unit,
and is very unlikely to be the same as the original text)
If you need the text to be
On Jan 2, 2004, at 11:49 AM, Colin McGuigan wrote:
1. How do you specify which directory is to be searched
(I assumed it was the current directory, i.e. tomcat\webapps, but when
I put in more searchable content nothing comes up in the search
I have also tried typing java
On Dec 30, 2003, at 3:13 PM, Morus Walter wrote:
Hmm. That'd be up to the developers.
Don't know how many of them are reading lucene-user.
I suspect we're all here!
QueryParser is Lucene's red-headed step-child. It works well enough,
but it has more than its share of issues. It is almost a
On Dec 29, 2003, at 5:37 PM, Thomas Krämer wrote:
with the apache commons digester i can read each record into a lucene
document and push each tag as key-value pair, where the tag name (eg.
creator) is the lucene field name and the text enclosed by it the
corresponding string value.
for a lot
On Dec 23, 2003, at 8:15 AM, Niall Gallagher wrote:
I think I have resolved the problem. I was using Lucene to index
several directories concurrently within the same JVM, and as far as I
can tell Lucene cannot do concurrent indexing. Is this correct?
You can do it concurrently, but you must use
Geoffrey,
You've done quite a thorough analysis of Lucene. I'll reply below with
a few tidbits of Lucene trivia in hopes that will help
On Dec 22, 2003, at 3:15 PM, Geoffrey Peddle wrote:
One of our applications is a catalog search application. In this
application our documents are
On Dec 19, 2003, at 10:46 PM, Mark R. Diggory wrote:
Has anyone thought about or used Lucene to build an indexed,
searchable help system? Either Server or Application Based?
While maybe not exactly a help system, the application we wrote for
Java Development with Ant uses Lucene to index a
Interestingly, I used a MetaphoneAnalyzer as an example in our book in
progress. I'm curious if you have measured performance with doing it
at analysis time versus query time. Enumerating all terms at query
time is basically the same as doing a WildcardQuery or FuzzyQuery and
involves a
On Friday, December 19, 2003, at 05:42 PM, Ernesto De Santis wrote:
I have new questions:
- apiQuery.add(new TermQuery(new Term("contents", "dot")), false, true);
new Term("contents", "dot")
Does the Term class work for only one word?
Careful with terminology here. It works for only one term. What is
During indexing, perhaps you could glue all fields text together into
one special field used for searching?
On Thursday, December 18, 2003, at 06:31 AM, Thijs Cadier wrote:
I am using a QueryParser, looked at the MultiFieldQueryParser.
But the issue is that I don't know which fields are in the
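The glue-the-fields idea above can be sketched in plain Java; the helper below is illustrative (not a Lucene API), producing one catch-all string you would then index as an extra field:

```java
import java.util.Map;

// Sketch of the "glue all fields together" idea: concatenate every
// field's text into one catch-all value that gets indexed as an extra
// field (e.g. "contents"), so searches need not know the field names.
public class CatchAllField {

    public static String combine(Map<String, String> fields) {
        StringBuilder all = new StringBuilder();
        for (String value : fields.values()) {
            if (all.length() > 0) {
                all.append(' ');
            }
            all.append(value);
        }
        return all.toString();
    }
}
```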
On Tuesday, December 16, 2003, at 05:46 AM, Iain Young wrote:
Treating them as two separate words when quoted is indicative of your
analyzer not being sufficient for your domain. What Analyzer are you
using? Do you have knowledge of what it is tokenizing text into?
I have created a custom
On Monday, December 15, 2003, at 12:12 PM, Iain Young wrote:
A quick question. Is there any way to disable the - and + modifiers in
the
QueryParser?
Not currently.
I've had a bit of success by putting quotes around the offending
names, (as
suggested on this list), but the results are still
Try out the toString(fieldName) trick on your Query instances and
pair them up with what you have below - this will be quite insightful
for the issue - I promise! :)
Look at my QueryParser article and search for toString on that page:
On Saturday, December 13, 2003, at 11:20 AM, Tun Lin wrote:
Hi,
I have tried to type the following at Windows command line at weblucene
directory:
ant build
Everything seems to work fine except the following error:
Everything works fine but it fails miserably?! :)
display the contents of the hits object to a page, I am getting
57 or 58 results on the page. 5 or 6 more results than is shown from
the length() method in the hits object.
Shannon
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Friday, December 12, 2003 9:45 AM
On Wednesday, December 10, 2003, at 04:07 PM, julien gerard wrote:
I'm attempting to optimize a fuzzy search on a big index with
~4,400,000 documents (in Lucene's meaning) in 600,000 sub-categories
(a simple Keyword-type field).
My purpose is to limit the number of documents on which the
On Wednesday, December 10, 2003, at 05:27 PM, julien gerard wrote:
But in this case, is the fuzzy search performed on the overall index,
and does the QueryFilter do its job afterwards?
I'm not sure I understand what QueryFilter means.
But I tested the QueryFilter this way too, and the time to do this
search
On Sunday, December 7, 2003, at 09:50 PM, Fitrio Pakana wrote:
I have similar problems with him, which is query using
multiple terms, and to make things worse, the hits
returned is quite absurd. The score of hits using 'OR'
(any words) query is lower than if using 'AND' (all
words) query, thus
On Monday, December 8, 2003, at 05:47 PM, [EMAIL PROTECTED] wrote:
If I generate a query using QueryParser and a
standard analyzer, in some cases I'm getting a
TooManyBooleanClauses exception, e.g.:
[2003-12-08 14:39:23] [ debug1 ] query is +glucose
-kog* always:1
[2003-12-08 14:39:23]
On Sunday, December 7, 2003, at 06:17 PM, Esmond Pitt wrote:
When creating an index, FSDirectory assumes that the directory has no
subdirectories. If a non-empty subdirectory is present,
FSDirectory.create
fails to delete it and throws an IOException. As the subdirectory is
not a
Lucene index
On Sunday, December 7, 2003, at 08:21 PM, Esmond Pitt wrote:
I'm not clear whether this is a 'yes' or a 'no'.
I think other committers would need to weigh in on it. I'm fine with
making a change to check isDirectory as well and not deleting them
since Lucene (currently) does not work with
for Field.Keyword. Please provide more details on the
issue you encountered using Field.Keyword.
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Thursday, December 04, 2003 6:18 PM
To: Lucene Users List
Subject: Re: Returning one result
You really should use a TermQuery
On Friday, December 5, 2003, at 11:59 AM, Allen Atamer wrote:
Below are the results of a debug run on the piece of text that I want
aliased. The token "spitline" must be recognized as "splitline", i.e.
when I do a search for "splitline", this record will come up.
1: [173] , start:1, end:2
1: [missing]
On Friday, December 5, 2003, at 01:25 PM, Pleasant, Tracy wrote:
Say ID is Ar3453 .. well the user may want to search for Ar3453, so in
order for it to be searchable then it would have to be indexed and not
a
keyword.
*arg* - we're having a serious communication issue here. My advice to
you is
On Friday, December 5, 2003, at 04:28 PM, Dror Matalon wrote:
Then I'm out of ideas. The next thing is for you to post your search
code so we can see why it's not searching the field.
Giving up so easily, Dror?! :))
The problem is, when using any type of QueryParser with a Keyword
field, you
On Wednesday, December 3, 2003, at 08:51 PM, Dror Matalon wrote:
Hits hits = initSearch(queryString);
Does initSearch close the IndexSearcher?
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail:
Here's to all those that inquire about searching XML with Lucene:
http://www.tbray.org/ongoing/When/200x/2003/11/30/SearchXML
:))
On Thursday, December 4, 2003, at 02:46 PM, Dror Matalon wrote:
Of course, now that I got explain to work I need to figure out what the
following means :-)
-
Explanation:0.0 = product of:
0.0 = sum of:
0.0 = coord(0/5)
-
It means you have a bug in your code :))
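For what it's worth, the coord() factor shown in that explanation is just the fraction of the query's clauses that matched; a sketch (DefaultSimilarity behaves essentially like this, but verify against the javadoc of your version):

```java
// Sketch of the coord() factor that appears in score explanations:
// the fraction of the query's clauses that matched the document.
// coord(0/5) = 0 means none of the five clauses matched, which is
// why the whole score multiplies out to 0.
public class Coord {

    public static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }
}
```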
On Thursday, December 4, 2003, at 03:07 PM, Dror Matalon wrote:
By the way, all these fun things are going to be part of the CLI that
I've been playing with.
Anyone interested in helping test?
Of course! Is it something you plan on donating to the Lucene project?
LUKE and Limo and your CLI
You really should use a TermQuery in this case anyway, rather than
using QueryParser. You wouldn't have to worry about the analyzer at
that point anyway (and I assume you're using Field.Keyword during
indexing).
Erik
On Thursday, December 4, 2003, at 05:01 PM, Pleasant, Tracy wrote:
Ok I
On Thursday, December 4, 2003, at 05:00 PM, Allen Atamer wrote:
This is the code that I have so far for the next() method within
AliasFilter.
After reading some posts, I also got the idea to call
setPositionIncrement(). Neither way works, because when I search for
the
alias, no search results
On Wednesday, December 3, 2003, at 09:36 AM, Ralph wrote:
is there a maximum number of documents Hits provides, or is it
unlimited (i.e., limited only by the heap size of the VM)? If there is
a maximum, what is the number?
Hits represents all documents that matched the query (and optionally
filtered).
But, Hits
On Wednesday, December 3, 2003, at 10:16 AM, Ralph wrote:
Does this mean Hits points to ALL documents, and the last one might
have a score of 0.0? If it does not contain all documents, where is
the threshold then? Or based on which condition does it stop pointing
to certain documents?
I'm a bit
On Monday, December 1, 2003, at 11:55 PM, Tatu Saloranta wrote:
On a related note, it would also be nice if there was a way to start
categorizing general hot topics for Lucene developers; it seems like
there
are about half a dozen areas where there's lots of interest for
improvements
(most of
On Tuesday, December 2, 2003, at 07:34 AM, Otis Gospodnetic wrote:
Could you add a Lucene logo somewhere on your search results, as noted
here:
http://jakarta.apache.org/lucene/docs/powered.html ?
I thought we were going to loosen up the requirement to have the logo
on a search results page?
On Tuesday, December 2, 2003, at 09:32 AM, Tate Avery wrote:
Hello,
This is the first time that I noticed this.
Is the 'powered by Lucene' a legal requirement? Or just a suggestion?
Does it apply to any system embedding Lucene (web pages, applications,
etc)?
That is not covered in the Apache
Also, reindex with the new API as well. There are likely
incompatibilities in the index format.
On Monday, December 1, 2003, at 11:21 AM, Iain Young wrote:
Note, that I've just tried the example webapp supplied with Lucene,
and I
appear to be having exactly the same problem with that. The
On Sunday, November 30, 2003, at 11:13 AM, Kent Gibson wrote:
as per Erik's idea I tried with the BitSet as follows:
QueryFilter qf = new QueryFilter(query);
IndexReader ir = IndexReader.open(indexPath);
Searcher searcher2 = new IndexSearcher(ir);
// get the bit set for the query
BitSet bits =
I enjoy at least attempting to answer questions here, even if I'm half
wrong, so by all means correct me if I misspeak
On Saturday, November 29, 2003, at 06:37 PM, Kent Gibson wrote:
All I would like to know is how many times a query was
found in a particular document. I have no problems
On Tuesday, November 25, 2003, at 10:45 PM, marc wrote:
Hi,
assume a field has the following text
Adenylate kinase (mitochondrial GTP:AMP phosphotransferase)
the following searches all return this document
AMP
AMP
AMP;
can someone explain this to me... I figured that only the first query
woah that seems like an awfully complex answer to the question of
how to tokenize at a comma rather than a space! %-)
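The reason "AMP" and "AMP;" behave identically can be seen with a toy tokenizer that keeps only letters and digits (an approximation of what the real analyzers do, for illustration): the semicolon never becomes part of a token, so both queries reduce to the same term.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of letter/digit-based tokenization: punctuation such as ';'
// or ':' is never part of any token, so "AMP" and "AMP;" produce the
// same term and match the same documents.
public class LetterTokenizer {

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```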
On Tuesday, November 25, 2003, at 11:48 AM, MOYSE Gilles (Cetelem) wrote:
Hi.
You should define expressions.
To define expressions, you first have to define an
On Wednesday, November 26, 2003, at 06:12 AM, Dragan Jotanovic wrote:
You will need to write a custom analyzer. Don't worry, though
it's
quite straightforward. You will also need to write a Tokenizer, but
Lucene helps you a lot here.
Wouldn't I achieve the same result if I index time out
On Wednesday, November 26, 2003, at 11:33 AM, Pleasant, Tracy wrote:
Your website says:
org.apache.lucene.analysis.standard.StandardAnalyzer:
[xyz] [corporation] [EMAIL PROTECTED] [com]
When I run it, it keeps the entire email '[EMAIL PROTECTED]
but according to your website it
On Monday, November 24, 2003, at 12:57 PM, [EMAIL PROTECTED] wrote:
Is there a url that will take me to the javadocs for Lucene 1.2,
rather than 1.3-rc2?
No, but the 1.2 binary distribution ships with the javadocs, I believe.
And, of course, they would be easy to generate from the 1.2 source
On Monday, November 24, 2003, at 12:22 PM, Ralf B wrote:
One question:
The Similarity class is abstract. Are there default implementations
available, like in other parts of this API (Analysers, for example),
and how can I use them, i.e. to calculate weights? Are there some
default implementations
On Saturday, November 22, 2003, at 06:33 PM, Dion Almaer wrote:
3. I have some fields such as title, owner, etc as well as the content
blob which I index and use as
the default search field. Is there an easy way to extend the
QueryParser to merge it with a
MultiTermQuery which can also search
On Sunday, November 23, 2003, at 03:33 PM, Dion Almaer wrote:
This leads me to another issue actually. On certain range queries I
get exceptions:
Query: modifieddate:[1/1/03 TO 12/31/03]
org.apache.lucene.search.BooleanQuery$TooManyClauses
I'm guessing you're using Field.Keyword(String, Date)
On Sunday, November 23, 2003, at 03:33 PM, Dion Almaer wrote:
2. +field:foo and the QueryParser:
I ran into some problems where using +field:foo was
giving strange
results. When I changed the queries to ... AND field:foo
everything
was fine.
Am I missing something there?
Which version of
On Friday, November 21, 2003, at 02:34 PM, Jianshuo Niu wrote:
I read your post on the Lucene bug list. However, I tried the change
you suggested, but it just changed t-shirts to shirt.
What Analyzer are you using?
On Tuesday, November 18, 2003, at 04:32 PM, Dan Pelton wrote:
Occasionally I get an Illegal seek error while loading a document
into lucene.
I am new to Lucene so I am not sure what to look for. Does anyone
have an idea of what may cause this error? Can Lucene handle multiple
users inserting
On Sunday, November 16, 2003, at 06:23 PM, Tomcat Programmer wrote:
Yes, I understand that now the QueryParser will trap
the errors and convert to exceptions (with the version
in CVS). I was just voicing my opinion regarding
throwing TokenMgrError's in the first place when they
should really be
On Friday, November 14, 2003, at 07:16 PM, Dror Matalon wrote:
We're seeing slow response time when we apply datefilter. A search that
takes 7 msec with no datefilter takes 368 msec when I filter on the
last
fifteen days, and 632 msec on the last 30 days.
Initially we saved doing
On Saturday, November 15, 2003, at 11:38 AM, Karsten Konrad wrote:
If the number of different date terms causes this effect, why not
round
the date to the nearest or next midnight while indexing. Thus,
filtering
for the last 15 days would require walking over 15-17 different date
terms.
If
On Saturday, November 15, 2003, at 12:03 PM, Dror Matalon wrote:
After posting the original email, I started wondering if that's the
issue, the fact that we store timestamp up to the millisecond rather
than a more reasonable granularity. Dates are too high a granularity
for
us, but minutes, and
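Rounding at index time can be as simple as integer arithmetic on the millisecond timestamp; a sketch (this rounds in UTC, and day granularity is just one choice: the same division works for minutes or hours):

```java
// Sketch: round a millisecond timestamp down to day granularity
// before indexing, so a date filter only has to walk one term per
// day instead of one per millisecond.
public class DateRounding {

    public static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    public static long roundToDay(long millis) {
        return (millis / MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }
}
```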
On Saturday, November 15, 2003, at 11:59 AM, Dror Matalon wrote:
If this date range is pretty static, you could (in Lucene's CVS
codebase) wrap the DateFilter with a CachingWrappingFilter. Or you
could construct a long-lived instance of an equivalent QueryFilter and
reuse it across multiple
On Thursday, November 13, 2003, at 04:32 PM, Jie Yang wrote:
Well, not quite, User normally enters a search string
A that normally returns 1000 out of 2 millions docs. I
then append A with 500 OR conditions... A AND (B or C
or ... or x500). I am trying to optimise the 500 OR
terms so that it does
On Friday, November 14, 2003, at 01:13 PM, Chong, Herb wrote:
if you didn't have to change the index then you haven't got all the
factors needed to do it well. terms can't cross sentence boundaries
and the index doesn't store sentence boundaries.
You mean if you have text like this: Hello Herb.
On Friday, November 14, 2003, at 02:02 PM, Chong, Herb wrote:
if i just run this query against a million document newswire index, i
know i am going to get lots of hits. the phrase capital gains tax
hits a lot fewer documents, but is overrestrictive. the fact that the
three terms occur next to
On Friday, November 14, 2003, at 02:32 PM, Chong, Herb wrote:
when people type in multiword queries, mostly they are interested in
phrases in the linguistic sense. phrases don't cross sentence
boundaries. you need certain features in the index and in the ranking
algorithm to capture that
On Friday, November 14, 2003, at 02:54 PM, Chong, Herb wrote:
it solves one part of the problem, but there are a lot of sentences in
a typical document. you'll need to composite a rank of a document from
its constituent sentences then. there are less drastic ways to solve
the problem. the
You should write your own code that creates the Document objects with
the fields you wish, with a Field.Keyword for the URL probably. Take
what is useful from IndexHTML.java, but don't use it as-is. If you're
speaking of pulling the document from a URL now you're talking of doing
some HTTP
On Thursday, November 13, 2003, at 03:22 AM, Hackl, Rene wrote:
documents contain very long strings for chemical substances; users are
interested in certain parts of the string, e.g. find all documents that
comprise *foo* (be it 1-foo-bar or rab-oof-13-foonyl-naphthalene).
So you're saying you want
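One common way to make *foo* style substring queries cheap (not something the thread above settles on, just a sketch) is to index character n-grams of each long name, so the wildcard becomes an exact term lookup; the gram size of 3 is an assumption:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: index character n-grams of each long chemical name so a
// *foo* substring query becomes an exact lookup on the gram "foo".
// Longer substrings are found by requiring all of their grams.
public class NGrams {

    public static List<String> grams(String term, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= term.length(); i++) {
            out.add(term.substring(i, i + n));
        }
        return out;
    }
}
```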
On Wednesday, November 12, 2003, at 11:52 PM, Tomcat Programmer wrote:
When using the QueryParser class, the parse method
will throw a TokenMgrError when there is a syntax
error even as simple as a missing quote at the end of
a phrase query. According to the javadoc, you should
never see this
On Thursday, November 13, 2003, at 03:28 PM, Dan Quaroni wrote:
To my knowledge the answer is No, lucene performs each query
separately and
then performs the joins after it has all the results. This is
actually a
rather serious problem when it comes to searches in large indexes
where a
single
On Thursday, November 13, 2003, at 04:07 PM, Jie Yang wrote:
Erik, Just to make sure I understand you right, In an
example query: ZipCode:CA10927 AND Gender:Male
Are we talking about that query being entered by the user and you
handing it just like that to QueryParser? If so, then QueryFilter
On Wednesday, November 12, 2003, at 05:55 AM, Pascal Nadal wrote:
My lucene indexes contain fields with values like this www.xxx.yyy.zzz
which are treated as HOST tokens.
My problem is the following: search results never contain documents
with
such fields when doing a wildcard query or a fuzzy
On Wednesday, November 12, 2003, at 10:43 AM, Pascal Nadal wrote:
the HostFilter I wrote (that tokenizes again HOST tokens) works
wonderfully.
I wonder if this has been fixed since Lucene 1.2; could you try the
latest 1.3 RC build available and see if it works without your
HostFilter?
Erik
On Wednesday, November 12, 2003, at 10:53 AM, MOYSE Gilles (Cetelem) wrote:
Hello.
I've made a Filter which recognizes special words and returns them in
a boosted form, in the QueryParser sense.
For instance, when the filter receives special_word, it returns
special_word^3, so as to boost it.
The
On Wednesday, November 12, 2003, at 11:52 PM, Tomcat Programmer wrote:
I thought Erik's article was great. There was one
unanswered brainbender I had which I was hoping was in
there, but... Maybe you can add this topic to the next
one, Erik?
Well, I'm not sure another article on QueryParser is
On Tuesday, November 11, 2003, at 02:37 PM, Thomas Krämer wrote:
Is there an overview of the structure of a Lucene index besides the
javadoc, or any other fast way of understanding what happens inside
Lucene?
Here is what is inside a Lucene index:
On Tuesday, November 11, 2003, at 10:00 PM, Kumar Mettu wrote:
The format of the file is as follows:
Col1,col2,col3,Value
abababc,xyzza,c,100
ababadx,xyz,adfdfd,101
I need to retrieve the value with simple queries on the data like:
select value where col1
On Friday, November 7, 2003, at 08:38 AM, Chong, Herb wrote:
i'm running in a single thread. the demo app is pretty vague on things
and expects me to read the detailed documentation. not what i like in
a sample application where someone is supposed to learn from it.
taking the close() call out
On Friday, November 7, 2003, at 03:56 AM, Victor Hadianto wrote:
Nonetheless, both creator and the name of the
creator are variables. We depend on the user to give
Of course, but you don't have unlimited fields, right? So you know
that the creator field is the creator of a book. You can provide the
My latest article is now online at java.net:
http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html
Lots of gory details about how QueryParser works and issues to
consider when using it are discussed. Feedback (on java.net's site
preferably) is most welcome!
Erik
On Thursday, November 6, 2003, at 01:55 PM, Thomas Fuchs wrote:
Hi,
I used lucene 1.2/it.unige.csita.lucene.RODirectory inside an applet
on CD-ROM. In lucene 1.3 the system property 'disableLuceneLocks' was
introduced to make it.unige.csita.lucene.RODirectory or something like
that obsolete.
On Thursday, November 6, 2003, at 02:44 PM, Chong, Herb wrote:
it's the line with the close(). so the remedy then is to make sure
that it is called only once. what is the recommended way to process
two folders worth of documents then? do i need to create a new
IndexWriter object for each
On Thursday, November 6, 2003, at 07:53 PM, Caroline Jen wrote:
Hi, let me see if I have got the idea. For example,
if I want to search the database for articles written
by Elizabeth Castro, we do what is shown below in
Lucene:
It sounds like you're asking a lot of hypothetical questions without
On Wednesday, November 5, 2003, at 03:51 AM, Marcel Stor wrote:
Hi all,
I'm thinkin' about writing a search tool for my filesystem. I know such
things exist already but programming it myself is much more fun ;-)
So, I would have Lucene crawl through my filesystem and pass each file
to an
Could you try the latest CVS version or 1.3 RC build and see if the
problem has been resolved?
On Tuesday, November 4, 2003, at 12:24 PM, "Chong, Herb" wrote:
this is the release 1.2 code. the exception as reported by debug is
java.lang.NullPointerException
at
600 bytes in size.
Herb
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 04, 2003 1:42 PM
To: Lucene Users List
Subject: Re: crash in Lucene
Could you try the latest CVS version or 1.3 RC build and see if the
problem has been resolved
On Sunday, November 2, 2003, at 09:38 AM, Stefan Groschupf wrote:
sorry, a very stupid question: does Lucene apply Zipf's law during
indexing?
I had to look up Zipf's law to understand this. Lucene does include
frequency information about terms indexed, yes. And Analyzers can
remove common words if you
On Friday, October 31, 2003, at 03:53 AM, Albert Vila Puig wrote:
Hi,
Is there a way to remove a token from a document field entry? For
example, I've got a UnStored field in my index and I want to remove a
token from this field without doing the delete and add document
(because I'm
Field.Text(String, Reader) is an unstored field. It is indexed, but
the contents are not stored in the index.
If you want the contents stored, use Field.Text(String,String)
Erik
On Thursday, October 30, 2003, at 02:40 AM, Günter Kukies wrote:
Hello,
I want to add a Text field to a
Also, referring to my article may help - the code is designed to index
text files:
http://today.java.net/pub/a/today/2003/07/30/LuceneIntro.html
On Thursday, October 30, 2003, at 02:40 AM, Günter Kukies wrote:
Hello,
I want to add a Text field to a Lucene Document. I checked the index
,...)
So my problem is that I don't get back the Lucene Document. Maybe I
need a buffered reader, or it is not allowed to close the reader.
Günter
- Original Message -
From: Erik Hatcher [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Thursday, October 30, 2003 9:17 AM
Subject
It was posted on lucene-dev, not lucene-user. I've pasted it below.
I will be fixing this at some point in the near future based on this
fix and other related ones needed.
Erik
On Thursday, October 30, 2003, at 09:31 AM, Otis Gospodnetic wrote:
I believe a person just sent an email with a
On Tuesday, October 28, 2003, at 08:54 AM, William W wrote:
Is there any Lucene best practice ?
Is there anything in particular you're interested in knowing about?
This list and its archives, in conjunction with the jGuru FAQ, are
currently the best sources for such info, as well as