Re: WhitespaceAnalyzer Problem

2004-10-13 Thread Erik Hatcher
Dera - give the troubleshooting techniques provided here a try: http://wiki.apache.org/jakarta-lucene/AnalysisParalysis Provide us with a more detailed example of a sentence of text you indexed and how you are searching (using QueryParser, I presume) and we can likely offer more assistance.

Re: Multi + Parallel

2004-10-13 Thread Erik Hatcher
On Oct 13, 2004, at 3:14 AM, Karthik N S wrote: I was Curious to Know the Difference between ParallelMultiSearcher and MultiSearcher , 1) Is the working internal functionality of these are same or different . They are different internally. Externally they should return identical results and n

Re: single quote unicode character

2004-10-11 Thread Erik Hatcher
Chris - I suspect something else in your application is getting in the way. Try to simplify and eliminate the servlet, or use a tool like Luke to see what is truly in the index and what truly is being returned. Lucene indexes what you tell it (perhaps your analyzer is manipulating things?), a

Re: WebLucene 0.5 released: with a SAX based indexing sample Re: XML Indexing

2004-10-11 Thread Erik Hatcher
Is this the proper forum to discuss WebLucene? Perhaps this discussion should be moved to the WebLucene e-mail list? Erik On Oct 11, 2004, at 6:37 AM, Sumathi wrote: I have overcome the problem with tomcat also. and the demo is working fine . I tried using a sample xml (Sample.xml) wi

GUUUI - The optimal layout of search result pages

2004-10-11 Thread Erik Hatcher
I found this interesting: http://www.guuui.com/posting.php?id=1585 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

ApacheCon and Lucene

2004-10-10 Thread Erik Hatcher
Pardon the mild advertising interruption - I'm giving a 3 hour tutorial on Lucene at ApacheCon next month. Time is running out for meeting the pre-registration numbers, so I'm encouraging folks that are already going to ApacheCon but have not registered yet to please do so. And if you are und

Re: ezmlm response

2004-10-05 Thread Erik Hatcher
On Oct 5, 2004, at 4:29 PM, Patel, Viral wrote: Does anyone know how can I iterate through entire index and display all of the "records" without typing anything in the query? You can use the IndexReader API to navigate the index and walk through all of the documents. Erik ---

Re: BooleanQuery - Too Many Clases on date range.

2004-10-05 Thread Erik Hatcher
On Oct 4, 2004, at 2:12 PM, Chris Fraschetti wrote: absoultely, limiting the user's query is no problem here. I've currently implemented the lucene javascript to catcha lot of user quries that could cause issues.. blank queries, ? or * at the beginning of query, etc etc... but I couldn't think of a

Re: Sorting on a long string

2004-09-29 Thread Erik Hatcher
On Sep 28, 2004, at 9:46 PM, Daly, Pete wrote: I am new to lucene, and trying to perform a sorted query on a list of people's names. Lucene seem unable to properly sort on the name field of my indexed documents. If I sort by the other (shorter) fields, it seems to work fine. The name sort seem

Re: Memory usage: IndexSearcher & Sort

2004-09-29 Thread Erik Hatcher
On Sep 29, 2004, at 3:11 PM, Bryan Dotzour wrote: 3. Certainly some of you on this list are using Lucene in a web-app environment. Can anyone list some best practices on managing reading/writing/searching a Lucene index in that context? Beyond the advice already given on this thread, since you sa

Re: Sorting Info

2004-09-28 Thread Erik Hatcher
On Sep 27, 2004, at 6:32 PM, [EMAIL PROTECTED] wrote: I'm interested in doing sorting in Lucene. Is there a FAQ or an article that will show me how to do this? I already have my indexing and searching working. From IndexSearcher, use search(Query,Sort) method (or other variants that take a

Re: Keyword query confusion

2004-09-25 Thread Erik Hatcher
On Sep 25, 2004, at 5:59 AM, Erik Hatcher wrote: On Sep 24, 2004, at 12:26 PM, Fred Toth wrote: I'm trying to understand what's going on with the query parser and keyword fields. It's a confusing situation, for sure. I've got a large subset of my documents which are "p

Re: Keyword query confusion

2004-09-25 Thread Erik Hatcher
On Sep 24, 2004, at 12:26 PM, Fred Toth wrote: I'm trying to understand what's going on with the query parser and keyword fields. It's a confusing situation, for sure. I've got a large subset of my documents which are "publications". So as to be able to query these, I've got this in the indexer: do

Re: demo IndexHTML parser breaks unicode?

2004-09-25 Thread Erik Hatcher
As for alternative HTML parsers, there are a few notable ones: NekoHTML - Nutch uses it JTidy - My Ant task in the sandbox uses it and HTMLParser All of the above are surely far more battle-tested in production than Lucene's demo parser, and I'd be surprised if they did not correctly handle Unic

Re: Document contents split among different Fields

2004-09-23 Thread Erik Hatcher
On Sep 23, 2004, at 6:00 PM, Greg Langmead wrote: Doug Cutting wrote: Do you need highlights from all fields? If so, then you can use: TextFragment[] getBestTextFragments(TokenStream, ...); with a TokenStream for each field, then select the highest scoring fragments across all fields. Would th

Re: compiling 1.4 source

2004-09-23 Thread Erik Hatcher
If you obtained the 1.4.1 source distribution, then you're fine and it's simply an issue with the properties. We keep the properties set to the _next_ version of Lucene (or as a beta/rc version label) to keep the CVS HEAD codebase from building as a release label when it is very likely not th

Re: Strange search results with wildcard - Bug?

2004-09-23 Thread Erik Hatcher
On Sep 23, 2004, at 11:00 AM, Ulrich Mayring wrote: Erik Hatcher wrote: Look at AnalysisDemo referred to here: http://wiki.apache.org/jakarta-lucene/AnalysisParalysis Keep in mind that phrase queries do not support wildcards - they are analyzed and any wildcard characters are likely stripped

Re: problem with get/setBoost of document fields

2004-09-23 Thread Erik Hatcher
The boost is not thrown away, but rather combined with the length normalization factor during indexing. So while your actual boost value is not stored directly in the index, it is taken into consideration for scoring appropriately. Erik On Sep 23, 2004, at 8:17 AM, Bastian Grimm [Eastb
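To make "combined with the length normalization factor" concrete, here is a rough sketch in plain Java with no Lucene dependency. It assumes the default similarity of that era, where the length norm is 1/sqrt(number of terms in the field) and the stored norm is the product of the document boost, the field boost, and that length norm. The class and method names are made up for illustration, and the lossy one-byte encoding Lucene applies to the stored value is left out.

```java
/** Hypothetical sketch: how a boost folds into the per-field norm (not Lucene's own code). */
final class NormSketch {

    /** Assumed default length normalization: shorter fields score higher. */
    static float lengthNorm(int numTerms) {
        if (numTerms <= 0) {
            throw new IllegalArgumentException("numTerms must be positive");
        }
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /** The single value kept per field: both boosts multiplied into the length norm. */
    static float norm(float docBoost, float fieldBoost, int numTerms) {
        return docBoost * fieldBoost * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // A field boost of 2.0 on a 4-term field and no boost on a 1-term field
        // produce the same norm, so the boost alone cannot be recovered afterwards.
        System.out.println(norm(1.0f, 2.0f, 4));
        System.out.println(norm(1.0f, 1.0f, 1));
    }
}
```

Because only the product is kept, reading the boost back from an indexed document would not be expected to return the value that was set, even though scoring still reflects it.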

Re: Strange search results with wildcard - Bug?

2004-09-23 Thread Erik Hatcher
On Sep 23, 2004, at 5:49 AM, Morus Walter wrote: Ulrich Mayring writes: Will do, thank you very much. However, how do I get at the analyzed form of my terms? Instanciate the analyzer, create a token stream feeding your input, loop over the tokens, output the results. Look at AnalysisDemo referred

Re: Implement custom score

2004-09-22 Thread Erik Hatcher
Sorting is done however you specify, by field, with secondary fields specified, by document id, by score/relevance, or even by a custom implementation to sort by something else (in Lucene in Action we provide an implementation that sorts by two-dimensional distance from a given location, wh

Re: Implement custom score

2004-09-22 Thread Erik Hatcher
Actually what William should use is the new Sort facility to order results by a field. Doing this with a Similarity would be much trickier. Look at the IndexSearcher.search() methods which take a Sort and follow the Javadocs from there. Let us know if you have any questions on sorting. It

Re: indexing date ranges

2004-09-21 Thread Erik Hatcher
If it is unindexed, then you cannot query on it, so you do not have a choice. The other option is to use a field that is indexed, not tokenized, and not stored (you have to use new Field(...) to accomplish that) if you don't want to store the field data. Erik On Sep 21, 2004, at 5:54 P

Re: displaying 'pages' of search results...

2004-09-21 Thread Erik Hatcher
The best first approach is to simply re-query every time the user goes to a new page, keeping around the query in some form or another (perhaps the expression if you're using QueryParser) and the page number. If that is fast enough, then you're done! :) If it is not, then you could consider cach
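The re-query approach comes down to keeping the query and a page number, running the search again, and showing only the slice of hits for that page. A minimal sketch of the slicing arithmetic in plain Java follows. `PageWindow`, the 1-based page numbering, and the parameter names are assumptions for illustration; `totalHits` stands in for whatever the search reports (for example the length of the returned Hits).

```java
/** Hypothetical helper: which hit indexes belong on a given results page. */
final class PageWindow {
    final int start; // index of the first hit on the page (inclusive)
    final int end;   // index one past the last hit on the page (exclusive)

    private PageWindow(int start, int end) {
        this.start = start;
        this.end = end;
    }

    /** page is 1-based; a page past the end yields an empty window. */
    static PageWindow of(int page, int pageSize, int totalHits) {
        if (page < 1 || pageSize < 1 || totalHits < 0) {
            throw new IllegalArgumentException("page and pageSize must be >= 1, totalHits >= 0");
        }
        long first = (long) (page - 1) * pageSize; // long avoids int overflow
        int start = (int) Math.min(first, totalHits);
        int end = (int) Math.min(first + pageSize, totalHits);
        return new PageWindow(start, end);
    }

    public static void main(String[] args) {
        PageWindow w = PageWindow.of(3, 10, 25);
        System.out.println(w.start + " to " + w.end);
    }
}
```

After each re-query, the page is rendered by looping from `start` up to, but not including, `end` and fetching only those documents.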

Re: PHP and Lucene

2004-09-16 Thread Erik Hatcher
On Sep 15, 2004, at 1:45 PM, Karthik N S wrote: Hi Erik , Doug , Otis This is general forum - no need to address individuals. 1) Is a there a PHP version of Lucene Implemantation avaliable , If so Where ? Using the Java version of Lucene from PHP is my recommendation. There is not a PHP versio

Re: LUCENE + PHP ???

2004-09-16 Thread Erik Hatcher
On Sep 15, 2004, at 1:45 PM, Karthik N S wrote: Hi Erik , Doug , Otis This is general forum - no need to address individuals. 1) Is a there a PHP version of Lucene Implemantation avaliable , If so Where ? Using the Java version of Lucene from PHP is my recommendation. There is not a PHP versio

Re: ANT +BUILD + LUCENE

2004-09-14 Thread Erik Hatcher
I hope u get the situation. :{ With regards Karthik -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 14, 2004 7:37 PM To: Lucene Users List Subject: Re: ANT +BUILD + LUCENE I'

Re: Addition to contributions page

2004-09-14 Thread Erik Hatcher
Perhaps we should @deprecate the contributions page like we did with the Powered By page, and migrate it to the wiki? Erik On Sep 13, 2004, at 6:50 PM, Daniel Naber wrote: On Friday 10 September 2004 15:48, Chas Emerick wrote: PDFTextStream should be added to the 'Document Converters' sec

Re: ANT +BUILD + LUCENE

2004-09-14 Thread Erik Hatcher
I'm not following what you want very clearly, but there is an <index> task in Lucene's Sandbox. Please post what you are trying, and I'd be happy to help once I see the details. Erik On Sep 12, 2004, at 4:44 PM, Karthik N S wrote: Hi Guys Apologies.. The Task for me is to build the Ind

Indexing object graphs

2004-09-14 Thread Erik Hatcher
Interesting! http://kasparov.skife.org/blog/2004/09/13#lucene-graphs

Re: TermQuery PROBLEM!!!

2004-09-11 Thread Erik Hatcher
You have way too much confusing code there for me to try to take in, but surely the situation is AnalysisParalysis: http://wiki.apache.org/jakarta-lucene/AnalysisParalysis Try to narrow down things to a very simple example for posting allowing others to very quickly and clearly see your

Re: *term search

2004-09-08 Thread Erik Hatcher
On Sep 8, 2004, at 6:26 AM, sergiu gordea wrote: I want to discuss a little problem, lucene doesn't support *Term like queries. First of all, this is untrue. WildcardQuery itself most definitely supports wildcards at the beginning. I would like to use "*schreiben". The dilemma you've encountere

Re: Use of explain() vs search()

2004-09-08 Thread Erik Hatcher
Could you create a simple piece of code (using a RAMDirectory) that demonstrates this issue? Erik On Sep 8, 2004, at 12:35 AM, Minh Kama Yie wrote: Hi all, Sorry I should clarify my last point. The search() would return no hits, but the explain() using the apparently invalid docId return

Re: Lucene Book

2004-09-07 Thread Erik Hatcher
On Sep 7, 2004, at 3:00 AM, [EMAIL PROTECTED] wrote: I am new to Lucene. Can anyone guide me from where i can download free Lucene book. Free?! http://www.manning.com/hatcher2 is the book Otis and I have spent the last year laboring on. It has been a long hard effort that is about to come to fru

Re: Concatinated search string in not working!

2004-09-03 Thread Erik Hatcher
The "Keyword"-ness of a field is only at indexing time, and not something known about at query time. You need to use a different analyzer for that field. Check out posts on KeywordAnalyzer and PerFieldAnalyzerWrapper - this combination is the secret :) Erik On Sep 3, 2004, at 9:55 AM

Re: Building Lucene in Eclipse

2004-09-01 Thread Erik Hatcher
You need to use a recent version of Ant - version 1.6.x. Erik On Sep 1, 2004, at 5:19 PM, <[EMAIL PROTECTED]> wrote: I'm trying to use Ant to build Lucene within Eclipse, or rather trying to. I've went to external tools and created a lucenebuild.xml entry. Under location I have the build.

Re: .net version - seperate mailinglist?

2004-09-01 Thread Erik Hatcher
On Aug 31, 2004, at 6:10 PM, Jan Agermose wrote: Im having some troble using Lucene - but is the .NET port. Should I ask questions about the different analyzers and tokenizers on this mailinglist or one some other? This list is primarily for the Java version of Lucene, but I suspect the behavior

Re: Lucene 1.4.1 not listed on jakarta downloads page

2004-08-31 Thread Erik Hatcher
Thanks for reporting this. We actually know. Lucene 1.4.1 was not released "properly", and it is going to require someone to do so. I've done the last two releases, but have been swamped lately. If no one beats me to it, I'll hopefully get around to this in the near future. Erik On

Re: Content from multiple folders in single index

2004-08-27 Thread Erik Hatcher
You should consider using the Ant task in the Sandbox (contributions/ant directory). You'll need to write a custom document handler implementation to handle PDF's and any other types you like. The built-in handler does text and HTML files, but is pluggable. The task uses Ant's filesets to determ

Re: what is wrong with query

2004-08-25 Thread Erik Hatcher
That is correct... fuzzy searches are only on a per-term basis. If what you meant, though, was a phrase query ("full" near "name") you have to add an explicit slop factor like "full name"~5 Erik On Aug 25, 2004, at 2:19 AM, Stephane James Vaucher wrote: From: http://jakarta.apache.org/lu

Re: Lucene Search Applet

2004-08-23 Thread Erik Hatcher
On Aug 23, 2004, at 11:36 AM, Stephane James Vaucher wrote: Should this property be changed in the next major release of lucene to org.apache...disableLuceneLocks? Yes, that makes sense to put an org.apache.lucene prefix. If that is the case, it should be changed to "disableLocks" - no point in

Re: Lucene Search Applet

2004-08-23 Thread Erik Hatcher
On Aug 23, 2004, at 10:48 AM, Stephane James Vaucher wrote: I haven't used it, and I'm a little confused from the code: /** ... * If the system property 'disableLuceneLocks' has the String value of * "true", lock creation will be disabled. */ public final class FSDirectory extends Directory {

Re: Custom filter

2004-08-20 Thread Erik Hatcher
On Aug 20, 2004, at 6:48 PM, [EMAIL PROTECTED] wrote: We're currently in lucene 1.2... haven't moved to 1.3 yet. Skip 1.3 and go straight to 1.4.1 :) Upgrade - why not? Erik

Re: Custom filter

2004-08-20 Thread Erik Hatcher
Have you considered using the built-in QueryFilter for this? Why isn't it sufficient for your needs? Erik On Aug 20, 2004, at 6:32 PM, [EMAIL PROTECTED] wrote: Hi guys! I was hoping someone here could help me out with a custom filter. We have an index of emails and do some searches on t

Re: Debian build problem with 1.4.1

2004-08-20 Thread Erik Hatcher
On Aug 20, 2004, at 12:36 PM, Jeff Breidenbach wrote: I don't understand this. StandardTokenizer.java hasn't changed since last year. I have packaged Lucene such that 'ant javacc' is called at package build time. I now see the problem - 'import java.io.*;' has been removed from StandardTokenizer.

Re: Debian build problem with 1.4.1

2004-08-20 Thread Erik Hatcher
On Aug 20, 2004, at 11:12 AM, Jeff Breidenbach wrote: Hi Otis, I'm asking, because it looks like your compiler is not finding Reader and IOException classes, both of which are in java.io.* package, which I see imported in StandardTokenizer.java as 'import java.io.*;'. In my copy of StandardTokeniz

Re: lucene and ejb applications

2004-08-20 Thread Erik Hatcher
On Aug 20, 2004, at 7:54 AM, Rupinder Singh Mazara wrote: hi erik thanks for the warning and the code. Let me re-phrase the question, i have a index generated by lucene, i need to have the search capabilty to have a high availabilty. What solutions would be the most optimal I'm guessing from y

Re: lucene and ejb applications

2004-08-20 Thread Erik Hatcher
What would be the best way? Use Lucene outside of EJB. It's quite silly to make such a decision "purely due to a policy decision" when the technicalities of it show that it is an unwise decision. You're going to navigate Hits through a session bean? And as you said, the EJB spec says not to

Re: What's the return order when the scores for two doc are exactly t he same

2004-08-18 Thread Erik Hatcher
The index order is the "secondary" sort order. You can change this by using the new sorting facility if desired. Erik On Aug 18, 2004, at 2:24 PM, Ching-Pei Hsing wrote: Hi, What is the order returned by Lucene when the scores for two result documents are exactly the same? I know this ra
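The ordering described here, score first and index order as the tie-breaker, can be illustrated with a small comparator in plain Java. This is only a sketch of the rule and not Lucene's internal hit queue; `TieBreakSketch` and `rank` are hypothetical names, and array positions stand in for document numbers.

```java
import java.util.Arrays;

/** Hypothetical sketch: rank documents by score, breaking ties by index order. */
final class TieBreakSketch {

    /** Returns document numbers ordered by descending score, then ascending doc number. */
    static int[] rank(float[] scores) {
        Integer[] docs = new Integer[scores.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        Arrays.sort(docs, (a, b) -> {
            int byScore = Float.compare(scores[b], scores[a]); // higher score first
            return byScore != 0 ? byScore : Integer.compare(a, b); // earlier doc first
        });
        int[] ranked = new int[docs.length];
        for (int i = 0; i < ranked.length; i++) {
            ranked[i] = docs[i];
        }
        return ranked;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(rank(new float[] {0.5f, 0.9f, 0.5f, 0.9f})));
    }
}
```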

Re: AnalyZer HELP Please

2004-08-18 Thread Erik Hatcher
it didn't come across that way before. Erik T -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Wednesday, August 18, 2004 2:00 PM To: Lucene Users List Subject: Re: AnalyZer HELP Please Thanks for doing the legwork. My favorite example is "to

Re: AnalyZer HELP Please

2004-08-18 Thread Erik Hatcher
rom it test" - 0 matches for this exact phrase - i.e. stoplist NOT used for any words in a phrase query Tate p.s. Um... did you say that was a rhetorical question? ;-) -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Wednesday, August

Re: Restoring a corrupt index

2004-08-18 Thread Erik Hatcher
The details of the segments file (and all the others) are freely available here: http://jakarta.apache.org/lucene/docs/fileformats.html Also, there is Java code in Lucene, of course, that manipulates the segments file which could be leveraged (although probably package scoped and not eas

Re: AnalyZer HELP Please

2004-08-18 Thread Erik Hatcher
e Do (WWGD)? category does Google remove stop words? I'll leave that as a rhetorical question for now :) Erik Thx Karthik -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 17, 2004 7:35 PM To: Lucene Users List Subject: Re: AnalyZe

Re: AnalyZer HELP Please

2004-08-17 Thread Erik Hatcher
On Aug 17, 2004, at 9:47 AM, Karthik N S wrote: I did as Erik replied in his mail , and searched for the complete word "\"New Year\"" , but the QueryParser Still returns me hit for "Year" Only. [ The Analyzer I use has 555 English Stop words with "new" present in it ] No wonder! That's whe
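The effect under discussion, where a search for "New Year" matches on "Year" alone because "new" is in the stop list, can be reproduced with a toy analysis chain. The sketch below is plain Java with no Lucene classes: it splits on whitespace, lowercases, and drops stop words, which only approximates what a stop-word analyzer does. `StopWordSketch` and `analyze` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of a whitespace + lowercase + stop-word analysis chain. */
final class StopWordSketch {

    /** Returns the terms that would survive analysis and reach the index or the query. */
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // blank input splits into one empty string
            }
            String term = raw.toLowerCase(Locale.ROOT);
            if (!stopWords.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("new", "the", "a"));
        System.out.println(analyze("New Year", stopWords));
    }
}
```

Running the same text through the analyzer used at index time and at query time, and printing the surviving terms, is usually the quickest way to see why a phrase does or does not match.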

Re: AnalyZer HELP Please

2004-08-17 Thread Erik Hatcher
On Aug 17, 2004, at 9:23 AM, Karthik N S wrote: So when I did a quick run on Analyzer process and found that it was splitting the Word "New Year" = [New] [Year] Am I doing some thing wrong in here No... this is what this analyzer does. QueryParser does the same thing. The difference

Re: AnalyZer HELP Please

2004-08-17 Thread Erik Hatcher
"New Year" = [New] [Year] Am I doing some thing wrong in here Thx in advance. Karthik -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 17, 2004 6:18 PM To: Lucene Users List Subject: Re: AnalyZer HELP Please This is what analyzers do

Re: AnalyZer HELP Please

2004-08-17 Thread Erik Hatcher
This is what analyzers do. I don't know of any analyzer that deals with quotes in the way you're requesting, by keeping the contents together as a complete token. You'll have to write your own variant that does this. QueryParser, however, uses quotes to denote a phrase query, and will query

Re: http AND halt

2004-08-17 Thread Erik Hatcher
What Analyzer is being used? If it is removing stop words, what is the stop word list? Erik On Aug 17, 2004, at 1:56 AM, Leos Literak wrote: One user reported, that if he searches http AND halt, the search fails. This can be found in logs: java.lang.ArrayIndexOutOfBoundsException: -1 at

Re: highlight the search word

2004-08-14 Thread Erik Hatcher
On Aug 14, 2004, at 7:10 AM, lingaraju wrote: How to highlight the search word See Highlighter here: http://jakarta.apache.org/lucene/docs/lucene-sandbox/

Re: Finding All?

2004-08-13 Thread Erik Hatcher
On Aug 13, 2004, at 4:01 PM, [EMAIL PROTECTED] wrote: A ranged query that covers the full range does the same thing. Of course it is also inefficient with term generation: myField[a TO z] Note that this won't work if you have more than 1024 matching terms, which is a quite likely scenario. The s

Re: Rename but not reindex

2004-08-13 Thread Erik Hatcher
You have to re-index. Updating is not currently possible, at least not without really low-level hacks. Erik On Aug 13, 2004, at 8:23 AM, Demetrio Zenti wrote: I apologise if it's a stupid question... I index Document objects having 2 fields: - 1° representing file name. It's code is

Re: wildcard uppercase

2004-08-12 Thread Erik Hatcher
Query.toString() is your friend! As well as troubleshooting without QueryParser in the picture too. But, Daniel to the rescue :) Erik On Aug 12, 2004, at 5:06 PM, Otis Gospodnetic wrote: My guess would be 'something in the QueryParser', but I don't know for sure. Erik will know he's

Re: Searching without a specified field

2004-08-11 Thread Erik Hatcher
I suggest you aggregate all the text you want searchable into a single field during indexing. Then search that field at query time instead. The alternative is to build up a (potentially huge) BooleanQuery using that string for each field. The MultiFieldQueryParser can do this, but it's not pre
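The aggregation suggested here is just string concatenation done at indexing time. A minimal sketch in plain Java: the per-field values are joined into one block of text, which would then be indexed under a single catch-all field (the name "contents" is only an assumed example) alongside the individual fields. `CatchAllField` and `aggregate` are hypothetical names, not part of Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds the text for a single catch-all search field. */
final class CatchAllField {

    /** Joins all non-empty field values with single spaces, in map iteration order. */
    static String aggregate(Map<String, String> fields) {
        StringBuilder all = new StringBuilder();
        for (String value : fields.values()) {
            if (value == null || value.isEmpty()) {
                continue;
            }
            if (all.length() > 0) {
                all.append(' ');
            }
            all.append(value);
        }
        return all.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "Lucene in Action");
        fields.put("author", "Hatcher Gospodnetic");
        // The result would be indexed as one extra tokenized field, e.g. "contents".
        System.out.println(aggregate(fields));
    }
}
```

Queries without an explicit field can then default to the catch-all field, avoiding a clause per field.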

OSCOM talks

2004-08-10 Thread Erik Hatcher
My Tapestry and Lucene talks have been accepted for the upcoming OSCOM conference in Zurich. http://www.oscom.org/events/oscom4/ I look forward to meeting some of the European Apache contingency! Erik

Re: Negative Boost

2004-08-04 Thread Erik Hatcher
On Aug 4, 2004, at 7:19 AM, Terry Steichen wrote: I can't get negative boosts to work with QueryParser. Is it possible to do so? Closer inspection on the parsing: <Boost> TOKEN : { <NUMBER: (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )? > : DEFAULT } where <#_NUM_CHAR: ["0"-"9"] > So, no, negative boosts don't appear possible w

Re: Negative Boost

2004-08-04 Thread Erik Hatcher
On Aug 4, 2004, at 7:19 AM, Terry Steichen wrote: I can't get negative boosts to work with QueryParser. Is it possible to do so? More details please. - What exact query expression did you use? - Did you get an error? If so, what was it? - What does Query.toString() output? Erik ---

Re: search exception in servlet!Please help me

2004-08-04 Thread Erik Hatcher
se of permissions. Maybe you're using a different version of Lucene between the command-line and your web application? Erik On Aug 4, 2004, at 3:14 AM, Christiaan Fluit wrote: Erik Hatcher wrote: Where did you get 'i'? Keep in mind that using Hits.doc(n) intends 'n

Re: search exception in servlet!Please help me

2004-08-03 Thread Erik Hatcher
Where did you get 'i'? Keep in mind that using Hits.doc(n) intends 'n' to be a document *id*, not the iteration through the Hits collection. This is a very common mistake, and I'm guessing one you've made here. Erik On Aug 3, 2004, at 7:49 PM, xuemei li wrote: Thank you for your repl

Re: reverse lookup

2004-08-02 Thread Erik Hatcher
On Aug 1, 2004, at 10:25 PM, John Adam wrote: Is there a way to get most significant words of a document if i give a document number. Have a look at the term vector support new in v1.4. For a document number and field name, you get terms and frequencies: TermFreqVector vector = rea

Re: Proximity searching and phrase

2004-07-30 Thread Erik Hatcher
On Jul 30, 2004, at 7:01 AM, Lucene wrote: I was wondering is there is a way to do proximity searches with phrases eg "very good" NEAR "sometimes". Any help on this would be welcome. You can do this with the new SpanQuery family in v1.4. The example you gave would consist of a SpanTermQuery for "

Re: Progress bar for Lucene

2004-07-28 Thread Erik Hatcher
That'd be a pretty quick progress bar in the searches I've seen: 10ms would be barely a blink of an eye. Perhaps we should discuss why your searches are slow enough to warrant a progress bar. But a HitCollector might be the right hook you're looking for. Erik On Jul 28, 2004, at 10

Re: Phrase Query

2004-07-27 Thread Erik Hatcher
On Jul 27, 2004, at 11:42 AM, Hetan Shah wrote: Works for me. Here is what I am striving to achieve. phraseString = request.getParameter("phrase"); if (phraseString.length() > 0){ phraseQueryString = "\""+phraseString+("\""); phraseQuery = true; queryString = phraseQueryStr

Re: Time of last insert

2004-07-27 Thread Erik Hatcher
On Jul 27, 2004, at 5:15 AM, Otis Gospodnetic wrote: There is no API for that. Yeah there is! :) IndexReader.lastModified() I borrowed that from LIMO's .jsp page, by the way. Erik

Re: Phrase Query

2004-07-26 Thread Erik Hatcher
Let's turn it around: could you send us your code that is not working? Lucene's test cases show PhraseQuery in action, and working. Erik On Jul 26, 2004, at 4:11 PM, Hetan Shah wrote: Hello, Can someone on the mailing list send me a copy of sample code of how to implement the phrase q

Re: Weighting database fields

2004-07-21 Thread Erik Hatcher
On Jul 21, 2004, at 11:40 AM, Anson Lau wrote: Is there any benefit to set the boost during indexing rather than set it during query? It allows setting each document differently. For example, TheServerSide is using field-level boosts at index time to control ordering by date, such that newer ar

Re: Extracting Lucene onto Tomcat

2004-07-21 Thread Erik Hatcher
On Jul 21, 2004, at 11:19 AM, Ian McDonnell wrote: No sorry i didnt mean that i was trying to extract the jars at all. I meant the extraction of the original lucene source bundle. I have been developing in java for going on 5 years now, but am relatively new to Web Apps. I have some experience

Re: Extracting Lucene onto Tomcat

2004-07-21 Thread Erik Hatcher
n i try to compile any of the source it just throws numerous errors. I've got the classpath set to web-inf/classes. Have i extraced it to the wrong directory? --- Erik Hatcher <[EMAIL PROTECTED]> wrote: On Jul 21, 2004, at 8:10 AM, Ian McDonnell wrote: Is the package information and

Re: Weighting database fields

2004-07-21 Thread Erik Hatcher
On Jul 21, 2004, at 10:09 AM, Anson Lau wrote: Apply boost factor to fields when you do a lucene search. Or... set the boost on the Field during indexing. Erik Anson -Original Message- From: John Patterson [mailto:[EMAIL PROTECTED] Sent: Thursday, July 22, 2004 12:07 AM To: [EMAIL

Re: Extracting Lucene onto Tomcat

2004-07-21 Thread Erik Hatcher
On Jul 21, 2004, at 8:10 AM, Ian McDonnell wrote: Is the package information and import paths ready to deploy on Tomcat server. I tried extracting lucene on the server, but when i compile files, it just throws numerous no class definition errors and errors relating to the package. Huh? Lucene c

Re: Lucene vs. MySQL Full-Text

2004-07-21 Thread Erik Hatcher
Interestingly (and ironically) enough, the project I'm currently working on requires full-text searching of Word and PDF resumes. SQL Server is already the required database as well, so we are leveraging the full-text indexing capabilities it has. There is a special trick to drop a BLOB into

Re: Can I retrieve token offsets from Hits?

2004-07-21 Thread Erik Hatcher
On Jul 21, 2004, at 6:59 AM, Stepan Mik wrote: It is possible to retrieve tokens offsets (Token.startOffset(), Token.endOffset()) later when document is found and returned in hit collection? No, offsets are not stored in the index. In fact, the only place they are currently used is with the Hi

Re: lucene cutomized indexing

2004-07-20 Thread Erik Hatcher
On Jul 20, 2004, at 2:10 PM, John Wang wrote: I have already provided my opinion on this one - I think it would be fine to allow Token to be public. I'll let others respond to the additional requests you've made. Great, what processes need to be in place before this gets in the code base? You're

Re: lucene cutomized indexing

2004-07-20 Thread Erik Hatcher
On Jul 20, 2004, at 12:12 PM, John Wang wrote: There are few things I want to do to be able to customize lucene: [...] 3) to be able to customize analyzers to add more information to the Token while doing tokenization. I have already provided my opinion on this one - I think it would be fine

Re: The indexer

2004-07-20 Thread Erik Hatcher
On Jul 20, 2004, at 10:07 AM, Ian McDonnell wrote: As for indexing data from mysql - there have been lots of discussions of that recently, so check the archives. Basically you read the data, and index it with Lucene's API. And you are responsible for keeping it in sync. The problem i am having

Re: The indexer

2004-07-20 Thread Erik Hatcher
possible? Of course. But you'll have to code it. It's only a few lines of code to index a "document" into a Lucene index, but it is up to you to code those into the appropriate spot in your system (most likely right where you insert into mysql). Erik Ian ---

Re: The indexer

2004-07-20 Thread Erik Hatcher
On Jul 20, 2004, at 8:44 AM, Ian McDonnell wrote: Can Lucenes indexer be used to store info in fields in a mysql db? I'm not quite clear on your question. You want to store a Lucene index (aka Directory) within mysql? Or, you want to index data from your existing mysql database into a Lucene in

Re: Post-sorted inverted index?

2004-07-20 Thread Erik Hatcher
On Jul 20, 2004, at 1:27 AM, Aphinyanaphongs, Yindalon wrote: I gather from reading the documentation that the scores for each document hit are computed at query time. I have an application that, due to the complexity of the function, cannot compute scores at query time. Would it be possible f

Re: Wildcard search with my own analyzer

2004-07-15 Thread Erik Hatcher
On Jul 15, 2004, at 10:02 AM, Morus Walter wrote: Joel Shellman writes: What do I need to do so that wildcard searching will work on this? I am using the same analyzer for indexing and searching (otherwise the first search wouldn't work either). Check what query is produced (query.toString(...))

Re: Search +QueryParser+Score

2004-07-15 Thread Erik Hatcher
Guys... Apologies Let me be more Specific regarding the last mail I would like to get all Hits returned with score = 1.0 ONLY using Query Parser . What are my Options. with regards Karthik -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Thursday, July 15, 2004 4:

Re: Search +QueryParser+Score

2004-07-15 Thread Erik Hatcher
Karthik, I have a really hard time following your questions, otherwise I'd chime in on them more often. Your meaning is not often clear. In the case of normalizing the score to 1.0 or less - this is precisely what Hits does for you. I'm not sure what you mean by "BEFORE" doing QueryParser - a

Re: Searching against Database

2004-07-15 Thread Erik Hatcher
In this situation, you may want to investigate implementing a custom Filter which is user-specific and constrains the search space to only the rows a specific user is allowed to search. Erik On Jul 15, 2004, at 3:04 AM, Sergiu Gordea wrote: Hi again, I'm thinking to get the list of IDs

Re: One Field!

2004-07-15 Thread Erik Hatcher
On Jul 14, 2004, at 10:19 PM, Jones G wrote: I have an index with multiple fields. Right now I am using MultiFieldQueryParser to search the fields. This means that if the same term occurs in multiple fields, it will be weighed accordingly. Is there any way to treat all the fields in question as

Re: SV: lucene sort error - there are more terms than documents in field ....

2004-07-13 Thread Erik Hatcher
, what is that. Mats -Oprindelig meddelelse- Fra: Erik Hatcher [mailto:[EMAIL PROTECTED] Sendt: 14. juli 2004 01:29 Til: Lucene Users List Emne: Re: lucene sort error - there are more terms than documents in field On Jul 13, 2004, at 7:10 PM, MATL (Mats Lindberg) wrote: Hello. I am

Re: lucene sort error - there are more terms than documents in field ....

2004-07-13 Thread Erik Hatcher
On Jul 13, 2004, at 7:10 PM, MATL (Mats Lindberg) wrote: Hello. I am using: import org.apache.lucene.search.Sort when searching an index, but for some reasons, in some indexes i get this error: caught a class java.lang.RuntimeException with message: there are more terms than documents in field d

Re: Search Result

2004-07-13 Thread Erik Hatcher
Look at the Term Highlighter here: http://jakarta.apache.org/lucene/docs/lucene-sandbox/ On Jul 13, 2004, at 2:32 PM, Hetan Shah wrote: I think I have not explained my question correctly. What is happening is when I show the result on a page the text below the link as shown below. Test

Re: Why is Field.java final?

2004-07-13 Thread Erik Hatcher
On Jul 13, 2004, at 12:51 AM, John Wang wrote: Hi: On the same thought, how about the org.apache.lucene.analysis.Token class. Can we make it non-final? I searched for uses of the Token constructors and I'm currently of the opinion that it is ok for Token to be made non-final. Any reasons not

Re: PhraseQuery with Wildcards?

2004-07-07 Thread Erik Hatcher
On Jul 7, 2004, at 6:24 PM, [EMAIL PROTECTED] wrote: Hi, Is there any way to do a PhraseQuery with Wildcards? No. This very question came up a few days ago. Look at PhrasePrefixQuery - although this will be a bit of effort to expand the terms matching the wildcarded term. I'd like to search for

Re: Searching for asterisk in a term

2004-07-07 Thread Erik Hatcher
On Jul 7, 2004, at 3:41 PM, [EMAIL PROTECTED] wrote: Can you recommend an analyzer that doesn't discard '*' or '/'? WhitespaceAnalyzer :) Check the wiki AnalysisParalysis page also. Erik

Re: How to use QueryParser to query to get the index summary info?

2004-07-06 Thread Erik Hatcher
On Jul 5, 2004, at 9:44 PM, Alex Aw Seat Kiong wrote: How to use QueryParser to query to get the index summary info, like? QueryParser is not the appropriate place to get the information you want. Use IndexReader instead. a. Last and first index document? reader.document(0) and reader.document(r

Re: Latest StopAnalyzer.java

2004-07-06 Thread Erik Hatcher
On Jul 6, 2004, at 2:53 AM, Morus Walter wrote: Karthik N S writes: Can SomeBody Tell me Where Can I find Latest copy of "StopAnalyzer.java" which can be used with Lucene1_4-final, On Lucene-Sandbox I am not able to Find it. [ My Company Prohibits me from using CVS ] There is no lucene 1.4 fina
