Re: "Too many open files" error hit when I run my Lucene 3.0.3 application on Java 8

2017-02-22 Thread Uwe Schindler
ly since java 8. > >If you -have- rewritten some of the file handling code in your indexing >process, make sure to explicitly close the streams you create, or use >the >(since java 7) try-with-resources construct. > > >On 22/02/2017 16:18, Leonid Bolshinsky wrote: > >
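The advice quoted in this thread (close every stream you create, or use try-with-resources) can be sketched in plain Java. This is only an illustration, not code from the poster's indexing process: the class name, the `countLines` helper and the temporary file are hypothetical stand-ins for whatever file handling the application actually does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: reading a file so that its descriptor is always released.
final class CloseStreamsSketch {

    // The reader is declared in the try-with-resources header, so it is
    // closed when the block exits, normally or through an exception.
    static long countLines(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    // Writes a two-line temporary file (a stand-in for a document to index),
    // counts its lines, and deletes it again.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("doc", ".txt");
            try {
                Files.write(tmp, "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));
                return countLines(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```

On Java 6 the equivalent is an explicit `close()` in a `finally` block; a stream that is opened and never closed keeps its file descriptor until the object is garbage collected, which is one way to run into "Too many open files".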

Re: "Too many open files" error hit when I run my Lucene 3.0.3 application on Java 8

2017-02-22 Thread Torsten Krah
ewritten some of the file handling code in your indexing > process, make sure to explicitly close the streams you create, or use the > (since java 7) try-with-resources construct. > > > On 22/02/2017 16:18, Leonid Bolshinsky wrote: > > > I have a search engine based on Lucene 3.0.3

Re: "Too many open files" error hit when I run my Lucene 3.0.3 application on Java 8

2017-02-22 Thread Leonid Bolshinsky
Bolshinsky wrote: > I have a search engine based on Lucene 3.0.3 and I can't change the Lucene > version for reasons that are out of scope of this question. Now I have a > requirement to move from Java 6 to Java 8, however when I run the indexing > using Java 8 JVM, I hit "

Re: "Too many open files" error hit when I run my Lucene 3.0.3 application on Java 8

2017-02-22 Thread Frederik Van Hoyweghen
-with-resources construct. On 22/02/2017 16:18, Leonid Bolshinsky wrote: I have a search engine based on Lucene 3.0.3 and I can't change the Lucene version for reasons that are out of scope of this question. Now I have a requirement to move from Java 6 to Java 8, however when I run the ind

"Too many open files" error hit when I run my Lucene 3.0.3 application on Java 8

2017-02-22 Thread Leonid Bolshinsky
I have a search engine based on Lucene 3.0.3 and I can't change the Lucene version for reasons that are out of scope of this question. Now I have a requirement to move from Java 6 to Java 8, however when I run the indexing using Java 8 JVM, I hit "Too many open files issue

Re: inconsistency in multifield sort while using an Integer field in Lucene 3.0.3

2015-01-28 Thread Erick Erickson
manoj raj wrote: > Hi, > > I am working with Lucene 3.0.3 > > I find there is an inconsistency while using Integer fields in multifield > sorting. > > Please Clarify. > > With Regards, > Manoj R. >

inconsistency in multifield sort while using an Integer field in Lucene 3.0.3

2015-01-28 Thread manoj raj
Hi, I am working with Lucene 3.0.3 I find there is an inconsistency while using Integer fields in multifield sorting. Please Clarify. With Regards, Manoj R.

Re: Index corruption with lucene 3.0.3

2014-12-17 Thread Michael McCandless
You are very likely hitting the issue described here: https://issues.apache.org/jira/browse/LUCENE-5541 Mike McCandless http://blog.mikemccandless.com On Wed, Dec 17, 2014 at 2:03 PM, Shlomit Rosen wrote: > Hello, > > We have a client that is using lucene 3.0.3. > They are work

RE: Index corruption with lucene 3.0.3

2014-12-17 Thread Uwe Schindler
very deep knowledge on In some Lucene JAR files is an additional tool to “extract” CFS files (like unzip), you may try to use it – but I am not sure if this was already existent in Lucene 3.0.3 (you need to do some Javadoc search to look it up). But without the dictionary at the end of the

Index corruption with lucene 3.0.3

2014-12-17 Thread Shlomit Rosen
Hello, We have a client that is using lucene 3.0.3. They are working with NAS storage device which recently had permission issues, which might have generated some "out of disk space" exceptions during indexing. We are uncertain if they also suffered JDK crashes in the past few mont

Applying LUCENE-3653 patch to Lucene 3.0.3

2012-02-07 Thread Dhruv
Hi, My company is using an older version of Lucene (3.0.3). In my profiling results with 3.0.3, I have found that my app's threads were blocked due to the issue mentioned at LUCENE-3653. Although I was able to use the 3.6 line which fixes this problem, we are still in the process of condu

RE: lucene-3.0.3

2012-02-02 Thread Prasad KVSH
Hi Everybody, lucene-3.0.3. will handle outlook files, DOCX and .EXLX files while searching a text?? We have taken indexfiles.java and searchfiles.java from lucene-3.0.3\src folder, it is working fine for PDF, txt, doc, excel, java, CSV files. Thanks Prasad

Re: lucene-3.0.3

2012-02-01 Thread Sethi, Parampreet
ere it find the text. Using the >return list we can populate them in User Interface after validating with >user access rights. Actually we have one image server in that there will >be few folders and sub folders, each folder will have may have 10,000 >files. > >so far we are search text

RE: lucene-3.0.3

2012-02-01 Thread Prasad KVSH
we can populate them in User Interface after validating with user access rights. Actually we have one image server in that there will be few folders and sub folders, each folder will have may have 10,000 files. so far we are search text for TXT files only using lucene-3.0.3. Thanks Prasad

RE: lucene-3.0.3

2012-02-01 Thread Prasad KVSH
From: KARTHIK SHIVAKUMAR [mailto:nskarthi...@gmail.com] Sent: Wed 2/1/2012 7:04 PM To: java-user@lucene.apache.org Subject: Re: lucene-3.0.3 Hi >>lucene-3.0.3 can be used for searching a text from Lucene 's primary job is to do a text search. May it be PDF/HTML/XML/MSword/PPT/

Re: lucene-3.0.3

2012-02-01 Thread Erick Erickson
ptions. > > Thanks > Prasad > > -Original Message- > From: Ian Lea [mailto:ian@gmail.com] > Sent: Wednesday, February 01, 2012 7:22 PM > To: java-user@lucene.apache.org > Subject: Re: lucene-3.0.3 > > You could also take a look at Solr.  From > h

RE: lucene-3.0.3

2012-02-01 Thread Prasad KVSH
. On Wed, Feb 1, 2012 at 1:34 PM, KARTHIK SHIVAKUMAR wrote: > Hi > >>>lucene-3.0.3 can be used for searching a text from > > Lucene 's primary job is to do a text search. > > May it be PDF/HTML/XML/MSword/PPT/XLS > > U have to have the code for plugin to do 2 t

RE: lucene-3.0.3

2012-02-01 Thread Prasad KVSH
Prasad -Original Message- From: KARTHIK SHIVAKUMAR [mailto:nskarthi...@gmail.com] Sent: Wednesday, February 01, 2012 7:04 PM To: java-user@lucene.apache.org Subject: Re: lucene-3.0.3 Hi >>lucene-3.0.3 can be used for searching a text from Lucene 's primary job is to do a text s

Re: lucene-3.0.3

2012-02-01 Thread Ian Lea
. On Wed, Feb 1, 2012 at 1:34 PM, KARTHIK SHIVAKUMAR wrote: > Hi > >>>lucene-3.0.3 can be used for searching a text from > > Lucene 's primary job is to do a text search. > > May it be PDF/HTML/XML/MSword/PPT/XLS > > U have to have the code for plugin to do 2 t

Re: lucene-3.0.3

2012-02-01 Thread KARTHIK SHIVAKUMAR
Hi >>lucene-3.0.3 can be used for searching a text from Lucene 's primary job is to do a text search. May it be PDF/HTML/XML/MSword/PPT/XLS U have to have the code for plugin to do 2 things 1) Strip text from either of the Documents (PDF/HTML/XML/MSword/PPT/XLS) 2) Index this pro

lucene-3.0.3

2012-02-01 Thread Prasad KVSH
Hi, lucene-3.0.3 can be used for searching a text from PDF, xlsx, docx, doc, xls, msg, TXT files. For this we have any common function to accomplish this. Please help me on this. Thanks Prasad

Re: [Bulk] RE: any tips for upgrading Lucene 3.0.3 -> 3.5.0?

2012-01-20 Thread David Carlton
ltiSearcher. > > Regards > Ganesh > > > > - Original Message - > From: "Uwe Schindler" > To: > Sent: Friday, January 20, 2012 5:18 AM > Subject: [Bulk] RE: any tips for upgrading Lucene 3.0.3 -> 3.5.0? > > > > -Original Message- >

Re: [Bulk] RE: any tips for upgrading Lucene 3.0.3 -> 3.5.0?

2012-01-19 Thread Ganesh
I am also in the way to upgrade from 3.0.3 to 3.5. Any other API changes we need to care about? I use ParallelMultiSearcher. Regards Ganesh - Original Message - From: "Uwe Schindler" To: Sent: Friday, January 20, 2012 5:18 AM Subject: [Bulk] RE: any tips for upgrading Lu

RE: any tips for upgrading Lucene 3.0.3 -> 3.5.0?

2012-01-19 Thread Uwe Schindler
> -Original Message- > From: earlh...@gmail.com [mailto:earlh...@gmail.com] On Behalf Of Earl > Hood > Sent: Friday, January 20, 2012 12:41 AM > To: java-user@lucene.apache.org > Subject: Re: any tips for upgrading Lucene 3.0.3 -> 3.5.0? > > On Thu, Jan 19, 201

Re: any tips for upgrading Lucene 3.0.3 -> 3.5.0?

2012-01-19 Thread Earl Hood
On Thu, Jan 19, 2012 at 4:59 PM, Uwe Schindler wrote: > Lucene 3.5 can read any index going back to 2.0. The IndexUpgrader is only > needed to "forcefully" upgrade indexes for maximum performance and safe > migration to Lucene 4.0 (that can only read indexs >= 3.0). Question: Will Lucene 3.5 auto

Re: any tips for upgrading Lucene 3.0.3 -> 3.5.0?

2012-01-19 Thread David Carlton
> Sent: Thursday, January 19, 2012 8:02 PM > > To: java-user@lucene.apache.org > > Subject: any tips for upgrading Lucene 3.0.3 -> 3.5.0? > > > > I'm hoping to upgrade Lucene on a local code base from 3.0.3 to 3.5.0; is > there > > a good guide out there for

RE: any tips for upgrading Lucene 3.0.3 -> 3.5.0?

2012-01-19 Thread Uwe Schindler
i.de eMail: u...@thetaphi.de > -Original Message- > From: David Carlton [mailto:carl...@sumologic.com] > Sent: Thursday, January 19, 2012 8:02 PM > To: java-user@lucene.apache.org > Subject: any tips for upgrading Lucene 3.0.3 -> 3.5.0? > > I'm hoping to upgrade

any tips for upgrading Lucene 3.0.3 -> 3.5.0?

2012-01-19 Thread David Carlton
I'm hoping to upgrade Lucene on a local code base from 3.0.3 to 3.5.0; is there a good guide out there for particular pitfalls that I should worry about? I've skimmed the ChangeLogs; the mention of an index upgrade tool made me wonder: has the index format changed between those versions? If so, wha

RE: Lucene 3.0.3 with debug information

2011-04-29 Thread Steven A Rowe
Thanks Dawid. – Steve From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of Dawid Weiss Sent: Friday, April 29, 2011 4:45 PM To: java-user@lucene.apache.org Cc: Steven A Rowe Subject: Lucene 3.0.3 with debug information This is the e-mail you're looking for, Steven (it w

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Michael McCandless
On Fri, Apr 29, 2011 at 4:25 PM, Paul Taylor wrote: >> Hmm maybe that is enough, Im not sure. I'm profiling with YourkitProfiler >> and it doesnt show anything within the lucene classes so I assumed this >> meant they didnt contain the neccessary debugging info but I would have >> thought that -g

Lucene 3.0.3 with debug information

2011-04-29 Thread Dawid Weiss
This is the e-mail you're looking for, Steven (it wasn't forwarded to the list, apparently). Dawid -- Forwarded message -- From: Paul Taylor Date: Fri, Apr 29, 2011 at 10:11 PM Subject: Re: Lucene 3.0.3 with debug information To: Dawid Weiss On 29/04/2011 15:17, D

RE: Lucene 3.0.3 with debug information

2011-04-29 Thread Steven A Rowe
Hi Paul, On 4/29/2011 at 4:14 PM, Paul Taylor wrote: > On 29/04/2011 16:03, Steven A Rowe wrote: > > What did you find about Luke that's buggy? Bug reports are very > > useful; please contribute in this way. > > Please see previous post, in summary mistake on my part. Okay... Which previous post

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Paul Taylor
On 29/04/2011 21:14, Paul Taylor wrote: Hmm maybe that is enough, Im not sure. I'm profiling with YourkitProfiler and it doesnt show anything within the lucene classes so I assumed this meant they didnt contain the neccessary debugging info but I would have thought that -g is all I need tha

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Dawid Weiss
lls script that added it to the > classpath when I ran luke, however I had put it before the ant jar and my > jar built with maven also included lucene 3.0.3 and because luke 1.0.1 is > packaged with 3.0.0 it was confusing it, but I didnt realize this until I > notice done exception compla

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Paul Taylor
On 29/04/2011 16:03, Steven A Rowe wrote: Hi Paul, What did you find about Luke that's buggy? Bug reports are very useful; please contribute in this way. Please see previous post, in summary mistake on my part. The official Lucene 3.0.3 distribution jars were compiled using the -g cm

RE: Lucene 3.0.3 with debug information

2011-04-29 Thread Steven A Rowe
Hi Paul, What did you find about Luke that's buggy? Bug reports are very useful; please contribute in this way. The official Lucene 3.0.3 distribution jars were compiled using the -g cmdline argument to javac - by default, though, only line number and source file information is gene

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Dawid Weiss
> lucene/Search that is taking the time, I also had another attempt using > luke > > but find it incredibly buggy and of little use > Can you expand on this too? What kind of "incredible bugs" did you see? Without feedback there is little progress, so bug reports count. Dawid

Re: Lucene 3.0.3 with debug information

2011-04-29 Thread Simon Willnauer
Hey paul, you can simply checkout the tag or download the sources right? http://svn.apache.org/repos/asf/lucene/java/tags/lucene_3_0_3/ or http://ftp.download-by.net/apache//lucene/java/3.0.3/ simon On Fri, Apr 29, 2011 at 1:09 PM, Paul Taylor wrote: > Is there a built debug version of luc

Lucene 3.0.3 with debug information

2011-04-29 Thread Paul Taylor
Is there a built debug version of lucene 3.0.3 so I can profile it properly to find what part of the search is taking the time. Note: I've already profiled my application and determined that it is the lucene/Search that is taking the time, I also had another attempt using luke but find it

RE: lucene 3.0.3 | QueryParser | MultiFieldQueryParser

2011-04-27 Thread Steven A Rowe
two:.net)", mfqp.parse("c# AND .net").toString()); } Steve > -Original Message- > From: Ranjit Kumar [mailto:ranjit.ku...@otssolutions.com] > Sent: Wednesday, April 27, 2011 3:24 AM > To: java-user-h...@lucene.apache.org; java-user@lucene.a

Re: lucene 3.0.3 | QueryParser | MultiFieldQueryParser

2011-04-27 Thread Ranjit Kumar
Hi, while creating index with the help of lucene standardAnalyzer, we cannot make difference between c, c++ and c# as lucene do not create index for c++ and c#. To make the difference between these term I need to change the grammar of lucene with the help of jFlex, it force me to create my own

Re: lucene 3.0.3 | QueryParser | MultiFieldQueryParser

2011-04-26 Thread haichengyl
hope to sent some detail about it. 2011-04-26 haichengyl From: Ranjit Kumar Sent: 2011-04-26 21:55:04 To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org Cc: Subject: lucene 3.0.3 | QueryParser | MultiFieldQueryParser Hi, I have created my own custom analyzer and uses

Re: lucene 3.0.3 | QueryParser | MultiFieldQueryParser

2011-04-26 Thread haichengyl
help to give some detail info 2011-04-26 haichengyl From: Ranjit Kumar Sent: 2011-04-26 21:55:04 To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org Cc: Subject: lucene 3.0.3 | QueryParser | MultiFieldQueryParser Hi, I have created my own custom analyzer and uses jFlex

RE: lucene 3.0.3 | QueryParser | MultiFieldQueryParser

2011-04-26 Thread Steven A Rowe
(<_TERM_CHAR>)* > Are you sure that your custom JFlex Analyzer is not being given 'C#' and then stripping off the '#'? You could work around this issue by pre-processing your query (and your documents) to replace C# with csharp or something like it that would not be
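The workaround suggested in this reply, pre-processing both queries and documents so that C# becomes a plain word, might look roughly like the following. It is a sketch under assumptions: the class and method names are hypothetical, and "csharp" is just the replacement word mentioned in the message.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: rewrite "C#" to a plain word before analysis.
final class TermNormalizerSketch {

    // "c#" at the start of a word, in any letter case.
    private static final Pattern C_SHARP =
            Pattern.compile("\\bc#", Pattern.CASE_INSENSITIVE);

    // Apply this to document text before indexing and to the query string
    // before parsing, so both sides agree on the rewritten term.
    static String normalize(String text) {
        return C_SHARP.matcher(text).replaceAll("csharp");
    }

    public static void main(String[] args) {
        System.out.println(normalize("c# AND .net"));     // prints csharp AND .net
        System.out.println(normalize("Senior C# coder")); // prints Senior csharp coder
    }
}
```

The rewritten term contains no character that the query syntax or a typical tokenizer treats specially, so it should survive both the parser and the analyzer unchanged, provided the same step runs at index time and at query time.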

lucene 3.0.3 | QueryParser | MultiFieldQueryParser

2011-04-26 Thread Ranjit Kumar
Hi, I have created my own custom analyzer and uses jFlex to made search for c#, .net, c++ etc. While I am trying to search c#, .net, c++ QueryParser parse .net to .net and C++ to C++. So it works fine. But in case of C# QueryParser parse it to C which makes trouble for me. Also tried to use M

Re: lucene 3.0.3 | searching problem with *.docx file

2011-04-12 Thread Erick Erickson
You haven't given us anything to go on here, except "it doesn't work". You might review this page: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Tue, Apr 12, 2011 at 9:05 AM, Ranjit Kumar wrote: > Hi, > > I am creating index with help of StandardAnalyzer for *.docx file it's > fine. Bu

RE: lucene 3.0.3 | searching problem with *.docx file

2011-04-12 Thread Steven A Rowe
; To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org > Subject: lucene 3.0.3 | searching problem with *.docx file > > Hi, > > I am creating index with help of StandardAnalyzer for *.docx file it's > fine. But at the time of searching it do not gives result for these *.

lucene 3.0.3 | searching problem with *.docx file

2011-04-12 Thread Ranjit Kumar
Hi, I am creating index with help of StandardAnalyzer for *.docx file it's fine. But at the time of searching it do not gives result for these *.docx file. any help or suggestion will be appreciated!!! Thanks & Regards, Ranjit Kumar =

Re: shared IndexSearcher (lucene 3.0.3)

2011-02-25 Thread Simon Willnauer
Hey, the too many open files can be prevented by raising the limit of open files ;) there is a nice summary on the FAQ you might wanna look at: http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_an_IOException_that_says_.22Too_many_open_files.22.3F if you have further questions just

shared IndexSearcher (lucene 3.0.3)

2011-02-25 Thread Akos Tajti
Hi all, in our project we're using lucene in tomcat. To avoid some overhead we have a shared IndexSearcher instance. In the past we had too many open files errors many times. To prevent this the IndexSearcher is closed and reopened after indexing. The shared instance is not closed anywhere else in

Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-11 Thread Uwe Schindler
Hi, That is as expected. When IndexReader or IndexSearcher are open, the snapshot of this index is preserved until you reopen it, as all readers only see the index in the state when it was opened, so disk space is still acquired and on windows you even see the files. For optimize (what you shou

Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-11 Thread Phil Herold
New information: it appears that the index size increasing (not always doubling but going up significantly) occurs when I search the index while building it. Calling indexWriter.optimize(1, true); when I'm done adding documents sometimes reduces the index down to size, but not always. Has anyon

RE: lucene 3.0.3 | phrase query problem

2011-02-11 Thread Zhang, Lisheng
bruary 10, 2011 10:41 PM To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org Subject: lucene 3.0.3 | phrase query problem Hi Anshum, Thanks for your replay.. Yes, I am agree with you. As right now, I am using StandardAnalyzer it remove stop words, Puts text in lowercase and do not crea

lucene 3.0.3 | phrase query problem

2011-02-10 Thread Ranjit Kumar
Hi Anshum, Thanks for your reply. Yes, I agree with you. Right now I am using StandardAnalyzer, which removes stop words, puts text in lowercase, and does not index the most common English words. Searching on an index created by StandardAnalyzer gives the result discussed

Re: lucene 3.0.3 | phrase query problem

2011-02-10 Thread Anshum
Hi Ranjit, That would be because all stop words (space, comma, stop word set, etc..) would be treated in a similar fashion and escaped while indexing, subject to the analyzer you use while index your content. Hope that explains the issue. -- Anshum Gupta http://ai-cafe.blogspot.com On Thu, Feb 1

lucene 3.0.3 | phrase query problem

2011-02-10 Thread Ranjit Kumar
searchString = "i am using sql. server setting is easy task."; while i am searching for phrase query "Sql Server" in above string it gives result which is not correct. As In the above string sql and server is seperated by dot(.) using both PhraseQuery and SpanQuery gives same result. Hi,

Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-10 Thread Michael McCandless
IndexWriter.setInfoStream -- when you set that, it produces lots of verbose output detailing what IW is doing to the index... Mike On Wed, Feb 9, 2011 at 8:06 PM, Phil Herold wrote: > I didn't have any errors or exceptions. Sorry to be dense, but what exactly > is the "infoStream output" you're

RE: lucene 3.0.3 | phrase query problem

2011-02-09 Thread Zhang, Lisheng
"sql. server" we should not get result? Best regards, Lisheng -Original Message- From: Ranjit Kumar [mailto:ranjit.ku...@otssolutions.com] Sent: Wednesday, February 09, 2011 9:39 PM To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org Subject: lucene 3.0.3 | phrase que

lucene 3.0.3 | phrase query problem

2011-02-09 Thread Ranjit Kumar
Hi, I am using SpanQuery and SpanNearQuery to get phrase query like "Sql Server". In my text file in which I am searching, it is present like (sql. server) mean 'sql dot server' which is not a span like "Sql Server". While searching for phrase query "Sql Server". It gives result for (sql. ser

Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-09 Thread Phil Herold
I didn't have any errors or exceptions. Sorry to be dense, but what exactly is the "infoStream output" you're asking about? >This is not expected. > >Did the last IW exit "gracefully"? If so, it should delete the old >segments after swapping in the optimized one. >Can you post infoStre

Re: index size doubling / optimization (Lucene 3.0.3)

2011-02-09 Thread Michael McCandless
This is not expected. Did the last IW exit "gracefully"? If so, it should delete the old segments after swapping in the optimized one. Can you post infoStream output after running optimize? Mike On Wed, Feb 9, 2011 at 1:58 PM, Phil Herold wrote: > I know that the size of a Lucene index can do

index size doubling / optimization (Lucene 3.0.3)

2011-02-09 Thread Phil Herold
I know that the size of a Lucene index can double while optimization is underway, but it's supposed to eventually settle back down to the original size, correct? We have a Lucene index consisting of 100K documents, that is normally about 12GB in size. It is split across 10 sub-indexes which we sear

Re: BooleanQuery / multiple indexes - Lucene 3.0.3

2011-02-03 Thread Robert Muir
On Thu, Feb 3, 2011 at 5:57 PM, Phil Herold wrote: > Hi, > > > > I'm getting incorrect search results when I use a MultiSearcher across > multiple indexes with a Boolean query, specifically, foo AND !bar (using > QueryParser). For example, with two indexes, I have a single document that > satisfie

BooleanQuery / multiple indexes - Lucene 3.0.3

2011-02-03 Thread Phil Herold
Hi, I'm getting incorrect search results when I use a MultiSearcher across multiple indexes with a Boolean query, specifically, foo AND !bar (using QueryParser). For example, with two indexes, I have a single document that satisfies both "foo" and "bar", so it should be excluded from the search

Re: Query parse errors for dashes in Lucene (3.0.3)

2011-01-24 Thread Yuhan Zhang
Hi Andrew, you can escape the special characters in the string that QueryParser reserves by: String queryString = QueryParser.escape( queryString ); Query query = QueryParser.parse( queryString ); Yuhan On Mon, Jan 24, 2011 at 6:03 PM, Andrew Kane wrote: > Wow, passing the buck doesn't really
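What escaping does to the failing TREC query can be shown without Lucene on the classpath. The sketch below is a hand-written approximation of `QueryParser.escape`, and the list of special characters is an assumption based on the 3.0.x query syntax; in a real application the library method should be called instead. As far as I recall, `parse` in 3.0.3 is an instance method, so the escaped string would be passed to a constructed `QueryParser` rather than to a static call.

```java
// Hypothetical stand-in for QueryParser.escape, for illustration only.
final class QueryEscapeSketch {

    // Characters assumed to be query syntax in Lucene 3.0.x.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Prefix every special character with a backslash so the parser
    // treats it as literal text rather than as an operator.
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The trailing '-' from the query reported in this thread.
        System.out.println(escape("statistics on child labor laws 1930 -"));
        // prints statistics on child labor laws 1930 \-
    }
}
```

Escaping the whole string also disables operators the user may have intended (quotes for phrases, `+` and `-` for required and excluded terms), so it suits raw free-text input better than hand-written queries.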

Re: Query parse errors for dashes in Lucene (3.0.3)

2011-01-24 Thread Andrew Kane
Wow, passing the buck doesn't really work for me. If you think Lucene is a *database* that's fine, but in your demo code (or wherever) you should have a translation routine to convert user input into *SQL/whatever language you're using* and solve 95% of the use cases. Does such a translation rout

Re: Query parse errors for dashes in Lucene (3.0.3)

2011-01-24 Thread Erick Erickson
Yes. You're confusing an *engine* with a full-blown application. The user here is a Java programmer. I argue that guessing, which is what you're asking for, is emphatically NOT in the domain of the search *engine*, which is what Lucene is. Imagine the poor programmer trying to understand why certa

Re: Query parse errors for dashes in Lucene (3.0.3)

2011-01-24 Thread Andrew Kane
What are you talking about?! A search engine isn't a compiler with a programmer for a user and a strict syntax. The job of a search engine is to produce the best results it can *for any given input*. Am I missing something here? Andrew. On Mon, Jan 24, 2011 at 5:15 PM, Adriano Crestani wrote

Re: Query parse errors for dashes in Lucene (3.0.3)

2011-01-24 Thread Adriano Crestani
It's valid syntax error, since - is the exclusion operator, so the QP expects a term, phrase, parenthesis, etc after that. On Mon, Jan 24, 2011 at 5:05 PM, Andrew Kane wrote: > Shouldn't these two queries be fine? (from TREC million query track). > Should this be entered as a bug? > > Thanks,

Query parse errors for dashes in Lucene (3.0.3)

2011-01-24 Thread Andrew Kane
Shouldn't these two queries be fine? (from TREC million query track). Should this be entered as a bug? Thanks, Andrew. Cannot parse 'statistics on child labor laws 1930 -': Encountered "" at line 1, column 37. Was expecting one of: "(" ... "*" ... ... ... ... ...

Re: parsing Java log file with Lucene 3.0.3

2011-01-04 Thread Benzion G
OK, I succeeded to write an Analyzer I need. I can't say that I understood all Lucene Analyzer-Tokenizer-Filter logic, but here's attached MyAnalyzer. Hope it will help somebody else. import java.io.Reader; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.CharTokeni
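The attached analyzer is cut off in this listing, so the following is not its content. It is a plain-Java sketch of the tokenization rule the thread asks for: split on anything that is not a letter or digit, so dots separate terms and numbers are kept, then lowercase. The class and method names are hypothetical. In Lucene 3.0.x the same rule would presumably live in a `CharTokenizer` subclass, in its `isTokenChar` and `normalize` methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a letter-or-digit tokenization rule for log lines.
final class LogTokenSketch {

    // Splits on every character that is neither a letter nor a digit
    // and lowercases what remains.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A separator ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("java.lang.NullPointerException"));
        // prints [java, lang, nullpointerexception]
        System.out.println(tokenize("customer 123 found"));
        // prints [customer, 123, found]
    }
}
```

With this rule a search for "NullPointerException" matches the last token of the fully qualified class name, and "123" stays searchable, which covers the two problems reported with StandardAnalyzer and SimpleAnalyzer.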

Re: parsing Java log file with Lucene 3.0.3

2011-01-04 Thread Erick Erickson
Lucene In Action has an example of creating a synonymanalyzer that you can adapt. The general idea is to subclass from Analyzer and implement the required functions, perhaps wrapping a Tokenizer in a bunch of Filters. You might be able to crib some ideas from solr.analysis.WordDelimiterFilter Best

Re: parsing Java log file with Lucene 3.0.3

2011-01-04 Thread Benzion G
Problem with SimpleAnalyzer! It ignores digits. For text "customer 123 found" it will take only "customer" and "found", but will ignore "123". StandardAnalyzer handles OK the digits but has the dots problem, I mentioned before. Is there an understandable guide how to write my own Analyzer - a h

Re: parsing Java log file with Lucene 3.0.3

2011-01-03 Thread Benzion G
Thank you guys! Looks like SimpleAnalyzer is OK for my application. I'm still testing but meanwhile it looks good. -- View this message in context: http://lucene.472066.n3.nabble.com/parsing-Java-log-file-with-Lucene-3-0-3-tp2173046p2190354.html Sent from the Lucene - Java Users mailing list ar

Re: parsing Java log file with Lucene 3.0.3

2011-01-02 Thread Erick Erickson
Some days I just can't read... First question: Why do you require standard analyzer?Are you really making use of the special processing? Take a look at other analyzer options. PatternAnalyzer, SimpleAnalyzer, etc. If you really require StandardAnalyzer, consider using two fields. field_original a

Re: parsing Java log file with Lucene 3.0.3

2011-01-01 Thread Benzion G
Of course I want to store and then show to user the original message. That's why I can't change it and the place to handle the dots is the Analyzer area. So how can I make the StandardAnalyzer to handle dots as commas? -- View this message in context: http://lucene.472066.n3.nabble.com/parsing-

Re: parsing Java log file with Lucene 3.0.3

2011-01-01 Thread Erick Erickson
<<>> No, that is not the case. Storing a field stores an exact copy of the input, without any analysis. The intent of storing a field is to return something to display in the results list that reflects the original document. What use would it be to store something that had gone through the analysi

Re: parsing Java log file with Lucene 3.0.3

2011-01-01 Thread Benzion G
I'm testing it with ~50M log files. But in production env the log files will be ~10G. -- View this message in context: http://lucene.472066.n3.nabble.com/parsing-Java-log-file-with-Lucene-3-0-3-tp2173046p2177477.html Sent from the Lucene - Java Users mailing list archive at Nabble.com.

Re: parsing Java log file with Lucene 3.0.3

2011-01-01 Thread Benzion G
I tried to understand where the StandardAnalyzer and other Standard* classes are handling these dots and commas and how can I change its behaviour. I debugged it as well, but I failed to understand it. -- View this message in context: http://lucene.472066.n3.nabble.com/parsing-Java-log-file-wit

Re: parsing Java log file with Lucene 3.0.3

2011-01-01 Thread Hasan Diwan
On 1 January 2011 21:47, Benzion G wrote: > But I'm afraid it will make my index files much bigger. Since I'm indexing > log files the index will be anyway too big so I can't make it even bigger. Have you tried it out? How large are your log files and how large do you expect them to get? -- Sent

Re: parsing Java log file with Lucene 3.0.3

2011-01-01 Thread Benzion G
Hi, Of course I thought about replacing dots by commas or blanks. But I add this field as Field.Store.YES. If I'll replace dot with commas it will appear with commas in search results. I also considered adding it as 2 fields: 1. With dots replaced by commas for index and Field.Store.NO 2. The

Re: parsing Java log file with Lucene 3.0.3

2010-12-31 Thread Erick Erickson
Have you looked at: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters Best Erick On Fri, Dec 31, 2010 at 6:12 AM, Benzion G wrote: > Hi, > > I need to parse the Java log files with Lucene 3.0.3. The StandardAnalyzer > is > OK, except it's handling of dots.

Re: parsing Java log file with Lucene 3.0.3

2010-12-31 Thread Hasan Diwan
On 31 December 2010 11:12, Benzion G wrote: > I need to parse the Java log files with Lucene 3.0.3. The StandardAnalyzer is > OK, except it's handling of dots. > > E.g. it handles "java.lang.NullPointerException" as one word and searching for > "NullPointerE

parsing Java log file with Lucene 3.0.3

2010-12-31 Thread Benzion G
Hi, I need to parse the Java log files with Lucene 3.0.3. The StandardAnalyzer is OK, except its handling of dots. E.g. it handles "java.lang.NullPointerException" as one word and searching for "NullPointerException" will bring nothing. I need an Analyzer that will

Lucene 3.0.3 Release Date

2010-10-28 Thread Shay Banon
Hi, It seems like current 3.0 branch has accumulated some important bug fixes, especially the possible index corruption bug. Is there a date for a formal 3.0.3 release? -shay.banon