ly since java 8.
>
>If you -have- rewritten some of the file handling code in your indexing
>process, make sure to explicitly close the streams you create, or use
>the
>(since java 7) try-with-resources construct.
>
>
>On 22/02/2017 16:18, Leonid Bolshinsky wrote:
>
>
ewritten some of the file handling code in your indexing
> process, make sure to explicitly close the streams you create, or use the
> (since java 7) try-with-resources construct.
>
>
> On 22/02/2017 16:18, Leonid Bolshinsky wrote:
>
> > I have a search engine based on Lucene 3.0.3
Bolshinsky wrote:
> I have a search engine based on Lucene 3.0.3 and I can't change the Lucene
> version for reasons that are out of scope of this question. Now I have a
> requirement to move from Java 6 to Java 8, however when I run the indexing
> using Java 8 JVM, I hit "
-with-resources construct.
On 22/02/2017 16:18, Leonid Bolshinsky wrote:
I have a search engine based on Lucene 3.0.3 and I can't change the Lucene
version for reasons that are out of scope of this question. Now I have a
requirement to move from Java 6 to Java 8, however when I run the ind
I have a search engine based on Lucene 3.0.3 and I can't change the Lucene
version for reasons that are out of scope of this question. Now I have a
requirement to move from Java 6 to Java 8, however when I run the indexing
using Java 8 JVM, I hit "Too many open files issue
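The try-with-resources advice in the reply above can be sketched roughly as follows. This is a minimal plain-JDK illustration, not code from the thread: the class name `FileTextReader` and the method `readAll` are hypothetical, and it assumes the indexing process reads each file as UTF-8 text before handing it to Lucene.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads one file's text for indexing.
final class FileTextReader {

    // The reader declared in the try header is closed automatically when the
    // block exits, even if readLine() throws, so the file handle is released.
    static String readAll(Path file) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }
}
```

Streams opened without such a block stay open until they are closed explicitly, which is one common way an indexing loop over many files runs into "Too many open files".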
manoj raj wrote:
> Hi,
>
> I am working with Lucene 3.0.3
>
> I find there is an inconsistency while using Integer fields in multifield
> sorting.
>
> Please Clarify.
>
> With Regards,
> Manoj R.
>
Hi,
I am working with Lucene 3.0.3
I find there is an inconsistency while using Integer fields in multifield
sorting.
Please Clarify.
With Regards,
Manoj R.
You are very likely hitting the issue described here:
https://issues.apache.org/jira/browse/LUCENE-5541
Mike McCandless
http://blog.mikemccandless.com
On Wed, Dec 17, 2014 at 2:03 PM, Shlomit Rosen wrote:
> Hello,
>
> We have a client that is using lucene 3.0.3.
> They are work
very deep knowledge on
In some Lucene JAR files is an additional tool to “extract” CFS files (like
unzip), you may try to use it – but I am not sure if this was already existent
in Lucene 3.0.3 (you need to do some Javadoc search to look it up). But without
the dictionary at the end of the
Hello,
We have a client that is using lucene 3.0.3.
They are working with NAS storage device which recently had permission
issues,
which might have generated some "out of disk space" exceptions during
indexing.
We are uncertain if they also suffered JDK crashes in the past few mont
Hi,
My company is using an older version of Lucene (3.0.3). In my profiling
results with 3.0.3, I have found that my app's threads were blocked due to
the issue mentioned at LUCENE-3653. Although I was able to use the 3.6 line
which fixes this problem, we are still in the process of condu
Hi Everybody,
Will lucene-3.0.3 handle Outlook files, DOCX and XLSX files when searching for
text?
We have taken IndexFiles.java and SearchFiles.java from the lucene-3.0.3\src
folder, and they work fine for PDF, txt, doc, excel, java and CSV files.
Thanks
Prasad
ere it find the text. Using the
>return list we can populate them in User Interface after validating with
>user access rights. Actually we have one image server that contains
>a few folders and subfolders; each folder may have 10,000
>files.
>
>so far we are search text
we
can populate them in the user interface after validating against user access rights.
Actually we have one image server that contains a few folders and
subfolders; each folder may have 10,000 files.
So far we search text in TXT files only, using lucene-3.0.3.
Thanks
Prasad
__
From: KARTHIK SHIVAKUMAR [mailto:nskarthi...@gmail.com]
Sent: Wed 2/1/2012 7:04 PM
To: java-user@lucene.apache.org
Subject: Re: lucene-3.0.3
Hi
>>lucene-3.0.3 can be used for searching a text from
Lucene's primary job is to do text search.
May it be PDF/HTML/XML/MSword/PPT/
ptions.
>
> Thanks
> Prasad
>
> -Original Message-
> From: Ian Lea [mailto:ian@gmail.com]
> Sent: Wednesday, February 01, 2012 7:22 PM
> To: java-user@lucene.apache.org
> Subject: Re: lucene-3.0.3
>
> You could also take a look at Solr. From
> h
.
On Wed, Feb 1, 2012 at 1:34 PM, KARTHIK SHIVAKUMAR
wrote:
> Hi
>
>>>lucene-3.0.3 can be used for searching a text from
>
> Lucene's primary job is to do text search.
>
> Be it PDF/HTML/XML/MSword/PPT/XLS
>
> You need plugin code to do 2 t
Prasad
-Original Message-
From: KARTHIK SHIVAKUMAR [mailto:nskarthi...@gmail.com]
Sent: Wednesday, February 01, 2012 7:04 PM
To: java-user@lucene.apache.org
Subject: Re: lucene-3.0.3
Hi
>>lucene-3.0.3 can be used for searching a text from
Lucene's primary job is to do a text s
Hi
>>lucene-3.0.3 can be used for searching a text from
Lucene's primary job is to do text search,
be it PDF/HTML/XML/MSword/PPT/XLS.
You need plugin code to do 2 things:
1) Strip text from any of the documents (PDF/HTML/XML/MSword/PPT/XLS)
2) Index this pro
Hi,
Can lucene-3.0.3 be used for searching text in PDF, xlsx, docx, doc,
xls, msg and TXT files? Is there a common function to accomplish
this? Please help me with this.
Thanks
Prasad
ltiSearcher.
>
> Regards
> Ganesh
>
>
>
> - Original Message -
> From: "Uwe Schindler"
> To:
> Sent: Friday, January 20, 2012 5:18 AM
> Subject: [Bulk] RE: any tips for upgrading Lucene 3.0.3 -> 3.5.0?
>
>
> > -Original Message-
>
I am also in the process of upgrading from 3.0.3 to 3.5. Are there any other
API changes we need to care about? I use ParallelMultiSearcher.
Regards
Ganesh
- Original Message -
From: "Uwe Schindler"
To:
Sent: Friday, January 20, 2012 5:18 AM
Subject: [Bulk] RE: any tips for upgrading Lu
> -Original Message-
> From: earlh...@gmail.com [mailto:earlh...@gmail.com] On Behalf Of Earl
> Hood
> Sent: Friday, January 20, 2012 12:41 AM
> To: java-user@lucene.apache.org
> Subject: Re: any tips for upgrading Lucene 3.0.3 -> 3.5.0?
>
> On Thu, Jan 19, 201
On Thu, Jan 19, 2012 at 4:59 PM, Uwe Schindler wrote:
> Lucene 3.5 can read any index going back to 2.0. The IndexUpgrader is only
> needed to "forcefully" upgrade indexes for maximum performance and safe
> migration to Lucene 4.0 (that can only read indexes >= 3.0).
Question: Will Lucene 3.5 auto
> Sent: Thursday, January 19, 2012 8:02 PM
> > To: java-user@lucene.apache.org
> > Subject: any tips for upgrading Lucene 3.0.3 -> 3.5.0?
> >
> > I'm hoping to upgrade Lucene on a local code base from 3.0.3 to 3.5.0; is
> there
> > a good guide out there for
i.de
eMail: u...@thetaphi.de
> -Original Message-
> From: David Carlton [mailto:carl...@sumologic.com]
> Sent: Thursday, January 19, 2012 8:02 PM
> To: java-user@lucene.apache.org
> Subject: any tips for upgrading Lucene 3.0.3 -> 3.5.0?
>
> I'm hoping to upgrade
I'm hoping to upgrade Lucene on a local code base from 3.0.3 to 3.5.0; is
there a good guide out there for particular pitfalls that I should worry
about? I've skimmed the ChangeLogs; the mention of an index upgrade tool
made me wonder: has the index format changed between those versions? If so,
wha
Thanks Dawid. – Steve
From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of Dawid
Weiss
Sent: Friday, April 29, 2011 4:45 PM
To: java-user@lucene.apache.org
Cc: Steven A Rowe
Subject: Lucene 3.0.3 with debug information
This is the e-mail you're looking for, Steven (it w
On Fri, Apr 29, 2011 at 4:25 PM, Paul Taylor wrote:
>> Hmm, maybe that is enough, I'm not sure. I'm profiling with YourKit Profiler
>> and it doesn't show anything within the Lucene classes, so I assumed this
>> meant they didn't contain the necessary debugging info, but I would have
>> thought that -g
This is the e-mail you're looking for, Steven (it wasn't forwarded to the
list, apparently).
Dawid
-- Forwarded message --
From: Paul Taylor
Date: Fri, Apr 29, 2011 at 10:11 PM
Subject: Re: Lucene 3.0.3 with debug information
To: Dawid Weiss
On 29/04/2011 15:17, D
Hi Paul,
On 4/29/2011 at 4:14 PM, Paul Taylor wrote:
> On 29/04/2011 16:03, Steven A Rowe wrote:
> > What did you find about Luke that's buggy? Bug reports are very
> > useful; please contribute in this way.
>
> Please see previous post, in summary mistake on my part.
Okay... Which previous post
On 29/04/2011 21:14, Paul Taylor wrote:
Hmm, maybe that is enough, I'm not sure. I'm profiling with
YourKit Profiler and it doesn't show anything within the Lucene classes,
so I assumed this meant they didn't contain the necessary debugging
info, but I would have thought that -g is all I need
tha
lls script that added it to the
> classpath when I ran Luke; however, I had put it before the ant jar, and my
> jar built with Maven also included Lucene 3.0.3, and because Luke 1.0.1 is
> packaged with 3.0.0 it was confusing it, but I didn't realize this until I
> noticed one exception compla
On 29/04/2011 16:03, Steven A Rowe wrote:
Hi Paul,
What did you find about Luke that's buggy? Bug reports are very useful; please
contribute in this way.
Please see previous post, in summary mistake on my part.
The official Lucene 3.0.3 distribution jars were compiled using the -g cm
Hi Paul,
What did you find about Luke that's buggy? Bug reports are very useful; please
contribute in this way.
The official Lucene 3.0.3 distribution jars were compiled using the -g cmdline
argument to javac - by default, though, only line number and source file
information is gene
> lucene/Search that is taking the time, I also had another attempt using
> luke
> > but find it incredibly buggy and of little use
>
Can you expand on this too? What kind of "incredible bugs" did you see?
Without feedback there is little progress, so bug reports count.
Dawid
Hey paul,
you can simply checkout the tag or download the sources right?
http://svn.apache.org/repos/asf/lucene/java/tags/lucene_3_0_3/
or http://ftp.download-by.net/apache//lucene/java/3.0.3/
simon
On Fri, Apr 29, 2011 at 1:09 PM, Paul Taylor wrote:
> Is there a built debug version of luc
Is there a debug build of Lucene 3.0.3 so I can profile it
properly to find what part of the search is taking the time?
Note: I've already profiled my application and determined that it is the
lucene/Search that is taking the time. I also had another attempt using
Luke but find it
two:.net)",
mfqp.parse("c# AND .net").toString());
}
Steve
> -Original Message-
> From: Ranjit Kumar [mailto:ranjit.ku...@otssolutions.com]
> Sent: Wednesday, April 27, 2011 3:24 AM
> To: java-user-h...@lucene.apache.org; java-user@lucene.a
Hi,
While creating an index with Lucene's StandardAnalyzer, we cannot
distinguish between c, c++ and c#, as Lucene does not create index entries for
c++ and c#. To distinguish between these terms I need to change Lucene's
grammar with the help of JFlex, which forces me to create my own
hope to sent some detail about it.
2011-04-26
haichengyl
From: Ranjit Kumar
Sent: 2011-04-26 21:55:04
To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org
Cc:
Subject: lucene 3.0.3 | QueryParser | MultiFieldQueryParser
Hi,
I have created my own custom analyzer and uses
help to give some detail info
2011-04-26
haichengyl
From: Ranjit Kumar
Sent: 2011-04-26 21:55:04
To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org
Cc:
Subject: lucene 3.0.3 | QueryParser | MultiFieldQueryParser
Hi,
I have created my own custom analyzer and uses jFlex
(<_TERM_CHAR>)* >
Are you sure that your custom JFlex Analyzer is not being given 'C#' and then
stripping off the '#'?
You could work around this issue by pre-processing your query (and your
documents) to replace C# with csharp or something like it that would not be
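The pre-processing workaround suggested above could look roughly like this. It is a plain-Java sketch under stated assumptions: the class name `TermNormalizer`, the method `normalize`, and the replacement token `csharp` are illustrative choices, and the same method would have to be applied both to document text before indexing and to the query string before parsing.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processor: rewrites "C#" to a token that no analyzer will split.
final class TermNormalizer {

    // \b keeps a word such as "abc#" untouched; the match is case-insensitive.
    private static final Pattern C_SHARP =
            Pattern.compile("\\bc#", Pattern.CASE_INSENSITIVE);

    static String normalize(String text) {
        return C_SHARP.matcher(text).replaceAll("csharp");
    }
}
```

With this in place, `normalize("C# AND .net")` yields `csharp AND .net`, so the '#' never reaches the analyzer or the query parser.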
Hi,
I have created my own custom analyzer using JFlex to make c#,
.net, c++ etc. searchable.
When I search for c#, .net or c++, QueryParser parses .net to .net and
C++ to C++, so that works fine. But C# is parsed to C, which
causes trouble for me.
Also tried to use M
You haven't given us anything to go on here, except "it doesn't work". You might
review this page:
http://wiki.apache.org/solr/UsingMailingLists
Best
Erick
On Tue, Apr 12, 2011 at 9:05 AM, Ranjit Kumar wrote:
> Hi,
>
> I am creating index with help of StandardAnalyzer for *.docx file it's
> fine. Bu
> To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org
> Subject: lucene 3.0.3 | searching problem with *.docx file
>
> Hi,
>
> I am creating index with help of StandardAnalyzer for *.docx file it's
> fine. But at the time of searching it do not gives result for these *.
Hi,
I am creating an index with the help of StandardAnalyzer for *.docx files, and
that works fine. But at search time it does not return results for these *.docx
files. Any help or suggestion will be appreciated!
Thanks & Regards,
Ranjit Kumar
Hey,
the too many open files can be prevented by raising the limit of open files ;)
there is a nice summary on the FAQ you might wanna look at:
http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_an_IOException_that_says_.22Too_many_open_files.22.3F
if you have further questions just
Hi all,
in our project we're using lucene in tomcat. To avoid some overhead we have
a shared IndexSearcher instance. In the past we had too many open files
errors many times. To prevent this the IndexSearcher is closed and reopened
after indexing. The shared instance is not closed anywhere else in
Hi,
That is as expected. When IndexReader or IndexSearcher are open, the snapshot
of this index is preserved until you reopen it, as all readers only see the
index in the state when it was opened, so disk space is still acquired and on
windows you even see the files. For optimize (what you shou
New information: it appears that the index size increasing (not always
doubling but going up significantly) occurs when I search the index while
building it. Calling indexWriter.optimize(1, true); when I'm done adding
documents sometimes reduces the index down to size, but not always.
Has anyon
bruary 10, 2011 10:41 PM
To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org
Subject: lucene 3.0.3 | phrase query problem
Hi Anshum,
Thanks for your reply.
Yes, I agree with you.
Right now I am using StandardAnalyzer; it removes stop words, puts text in
lowercase and does not crea
Hi Anshum,
Thanks for your reply.
Yes, I agree with you.
Right now I am using StandardAnalyzer; it removes stop words, puts text in
lowercase and does not index the most common words in English.
Searching an index created by StandardAnalyzer gives the results
discussed
Hi Ranjit,
That would be because all stop tokens (space, comma, stop word set, etc.)
are treated in a similar fashion and skipped while indexing, subject to
the analyzer you use when indexing your content.
Hope that explains the issue.
--
Anshum Gupta
http://ai-cafe.blogspot.com
On Thu, Feb 1
searchString = "i am using sql. server setting is easy task.";
When I search for the phrase query "Sql Server" in the above string, it returns
a result, which is not correct, as in the above string sql and server are
separated by a dot (.).
Using either PhraseQuery or SpanQuery gives the same result.
Hi,
IndexWriter.setInfoStream -- when you set that, it produces lots of
verbose output detailing what IW is doing to the index...
Mike
On Wed, Feb 9, 2011 at 8:06 PM, Phil Herold wrote:
> I didn't have any errors or exceptions. Sorry to be dense, but what exactly
> is the "infoStream output" you're
"sql. server" we should not get result?
Best regards, Lisheng
-Original Message-
From: Ranjit Kumar [mailto:ranjit.ku...@otssolutions.com]
Sent: Wednesday, February 09, 2011 9:39 PM
To: java-user-h...@lucene.apache.org; java-user@lucene.apache.org
Subject: lucene 3.0.3 | phrase que
Hi,
I am using SpanQuery and SpanNearQuery to run phrase queries like "Sql Server".
In the text file I am searching, it appears as (sql. server), meaning
'sql dot server', which is not a span like "Sql Server".
When searching for the phrase query "Sql Server", it gives a result for (sql.
ser
I didn't have any errors or exceptions. Sorry to be dense, but what exactly
is the "infoStream output" you're asking about?
>This is not expected.
>
>Did the last IW exit "gracefully"? If so, it should delete the old
>segments after swapping in the optimized one.
>Can you post infoStre
This is not expected.
Did the last IW exit "gracefully"? If so, it should delete the old
segments after swapping in the optimized one.
Can you post infoStream output after running optimize?
Mike
On Wed, Feb 9, 2011 at 1:58 PM, Phil Herold wrote:
> I know that the size of a Lucene index can do
I know that the size of a Lucene index can double while optimization is
underway, but it's supposed to eventually settle back down to the original
size, correct? We have a Lucene index consisting of 100K documents, that is
normally about 12GB in size. It is split across 10 sub-indexes which we
sear
On Thu, Feb 3, 2011 at 5:57 PM, Phil Herold wrote:
> Hi,
>
>
>
> I'm getting incorrect search results when I use a MultiSearcher across
> multiple indexes with a Boolean query, specifically, foo AND !bar (using
> QueryParser). For example, with two indexes, I have a single document that
> satisfie
Hi,
I'm getting incorrect search results when I use a MultiSearcher across
multiple indexes with a Boolean query, specifically, foo AND !bar (using
QueryParser). For example, with two indexes, I have a single document that
satisfies both "foo" and "bar", so it should be excluded from the search
Hi Andrew,
you can escape the special characters in the string that QueryParser
reserves
by:
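String escaped = QueryParser.escape(queryString);
// parse() is an instance method; the field name and analyzer here are placeholders
QueryParser parser = new QueryParser(Version.LUCENE_30, "contents", analyzer);
Query query = parser.parse(escaped);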
Yuhan
On Mon, Jan 24, 2011 at 6:03 PM, Andrew Kane wrote:
> Wow, passing the buck doesn't really
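To show what escaping does to the failing TREC query, here is a plain-Java sketch that mimics the behaviour of QueryParser.escape without depending on Lucene. The class name `QueryEscapeSketch` is hypothetical and the list of special characters is an assumption based on the operators of the classic query syntax; real code should call QueryParser.escape itself.

```java
// Hypothetical stand-in for QueryParser.escape, for illustration only.
final class QueryEscapeSketch {

    // Assumed set of query-syntax characters: \ + - ! ( ) : ^ [ ] " { } ~ * ? | &
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Prefixes every special character with a backslash so the parser
    // treats it as literal text instead of an operator.
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

For the query `statistics on child labor laws 1930 -`, the trailing `-` becomes `\-`, so it is no longer read as an exclusion operator waiting for a term.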
Wow, passing the buck doesn't really work for me. If you think Lucene is a
*database* that's fine, but in your demo code (or wherever) you should have
a translation routine to convert user input into *SQL/whatever language
you're using* and solve 95% of the use cases. Does such a translation
rout
Yes. You're confusing an *engine* with a full-blown application.
The user here is a Java programmer. I argue that guessing, which
is what you're asking for, is emphatically NOT in the domain of the
search *engine*, which is what Lucene is. Imagine the poor programmer
trying to understand why certa
What are you talking about?! A search engine isn't a compiler with a
programmer for a user and a strict syntax. The job of a search engine is to
produce the best results it can *for any given input*. Am I missing
something here?
Andrew.
On Mon, Jan 24, 2011 at 5:15 PM, Adriano Crestani wrote
It's a valid syntax error, since - is the exclusion operator, so the QP
expects a term, phrase, parenthesis, etc after that.
On Mon, Jan 24, 2011 at 5:05 PM, Andrew Kane wrote:
> Shouldn't these two queries be fine? (from TREC million query track).
> Should this be entered as a bug?
>
> Thanks,
Shouldn't these two queries be fine? (from TREC million query track).
Should this be entered as a bug?
Thanks,
Andrew.
Cannot parse 'statistics on child labor laws 1930 -': Encountered "" at
line 1, column 37.
Was expecting one of:
"(" ...
"*" ...
...
...
...
...
OK, I succeeded in writing the Analyzer I need. I can't say that I understood
all of Lucene's Analyzer-Tokenizer-Filter logic, but MyAnalyzer is attached here.
Hope it will help somebody else.
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharTokeni
Lucene In Action has an example of creating a SynonymAnalyzer that
you can adapt. The general idea is to subclass from Analyzer and
implement the required functions, perhaps wrapping a Tokenizer
in a bunch of Filters.
You might be able to crib some ideas from
solr.analysis.WordDelimiterFilter
Best
Problem with SimpleAnalyzer! It ignores digits.
For text "customer 123 found" it will take only "customer" and "found", but
will ignore "123". StandardAnalyzer handles the digits OK but has the dots
problem I mentioned before.
Is there an understandable guide on how to write my own Analyzer - a h
Thank you guys! Looks like SimpleAnalyzer is OK for my application. I'm still
testing but meanwhile it looks good.
--
View this message in context:
http://lucene.472066.n3.nabble.com/parsing-Java-log-file-with-Lucene-3-0-3-tp2173046p2190354.html
Sent from the Lucene - Java Users mailing list ar
Some days I just can't read...
First question: Why do you require StandardAnalyzer? Are you really making
use of
the special processing? Take a look at other analyzer options.
PatternAnalyzer,
SimpleAnalyzer, etc.
If you really require StandardAnalyzer, consider using two fields.
field_original
a
Of course I want to store the original message and then show it to the user.
That's why I can't change it, and the place to handle the dots is the Analyzer.
So how can I make the StandardAnalyzer handle dots as commas?
--
View this message in context:
http://lucene.472066.n3.nabble.com/parsing-
<<>>
No, that is not the case. Storing a field stores an exact copy of the
input, without any analysis. The intent of storing a field is to return
something to display in the results list that reflects the original
document. What use would it be to store something that had gone
through the analysi
I'm testing it with ~50M log files. But in production env the log files will
be ~10G.
--
View this message in context:
http://lucene.472066.n3.nabble.com/parsing-Java-log-file-with-Lucene-3-0-3-tp2173046p2177477.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
I tried to understand where the StandardAnalyzer and other Standard* classes
are handling these dots and commas and how can I change its behaviour. I
debugged it as well, but I failed to understand it.
--
View this message in context:
http://lucene.472066.n3.nabble.com/parsing-Java-log-file-wit
On 1 January 2011 21:47, Benzion G wrote:
> But I'm afraid it will make my index files much bigger. Since I'm indexing
> log files the index will be anyway too big so I can't make it even bigger.
Have you tried it out? How large are your log files and how large do
you expect them to get?
--
Sent
Hi,
Of course I thought about replacing dots with commas or blanks. But I add this
field as Field.Store.YES.
If I replace dots with commas, it will appear with commas in search
results.
I also considered adding it as 2 fields:
1. With dots replaced by commas for indexing and Field.Store.NO
2. The
Have you looked at:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
Best
Erick
On Fri, Dec 31, 2010 at 6:12 AM, Benzion G wrote:
> Hi,
>
> I need to parse the Java log files with Lucene 3.0.3. The StandardAnalyzer
> is
> OK, except its handling of dots.
On 31 December 2010 11:12, Benzion G wrote:
> I need to parse the Java log files with Lucene 3.0.3. The StandardAnalyzer is
> OK, except its handling of dots.
>
> E.g. it handles "java.lang.NullPointerException" as one word and searching for
> "NullPointerE
Hi,
I need to parse the Java log files with Lucene 3.0.3. The StandardAnalyzer is
OK, except its handling of dots.
E.g. it handles "java.lang.NullPointerException" as one word, and searching for
"NullPointerException" returns nothing.
I need an Analyzer that will
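The tokenization this question asks for can be sketched in plain Java as below. This is not a Lucene Analyzer and not the MyAnalyzer posted in the thread; the class name `LogTokenSketch` is hypothetical. It only illustrates the splitting rule (a token is a run of letters or digits, lowercased), which as far as I know is the kind of predicate a CharTokenizer subclass would express in isTokenChar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a "letters and digits only" token rule.
final class LogTokenSketch {

    // Splits on every character that is not a letter or digit, so dots,
    // commas and spaces all act as separators; tokens are lowercased.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

Under this rule "java.lang.NullPointerException" produces the tokens java, lang and nullpointerexception, and "customer 123 found" keeps 123, which covers both the dots problem and the digits problem raised in this thread.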
Hi,
It seems like current 3.0 branch has accumulated some important bug
fixes, especially the possible index corruption bug. Is there a date for a
formal 3.0.3 release?
-shay.banon