Re: Index Optimization

2008-03-11 Thread masz-wow
thanks hossman. will post it at java-user hossman wrote: > > > 1) > http://wiki.apache.org/lucene-java/LuceneFAQ#head-adee7c1d869aa20101733944da79e15a1a2e7dfa > > FAQ: "Why do I have a deletable file (and old segment files remain) after > running optimize?" > > 2) http://people.apache.org/~ho

Re: Index Optimization

2008-03-11 Thread Chris Hostetter
1) http://wiki.apache.org/lucene-java/LuceneFAQ#head-adee7c1d869aa20101733944da79e15a1a2e7dfa FAQ: "Why do I have a deletable file (and old segment files remain) after running optimize?" 2) http://people.apache.org/~hossman/#java-dev Please Use "[EMAIL PROTECTED]" Not "[EMAIL PROTECTED]" You

Index Optimization

2008-03-11 Thread masz-wow
I managed to optimize my index successfully. The problem I'm having now is that when I check the index using Lucene Index Toolbox, a few files in the index itself are deletable. I understand that the optimize method will merge the index files, but how come there are still deletable index files i

Re: How to add a jar to a contrib build.xml

2008-03-11 Thread Chris Hostetter
: Here is how the span highlighter I have been working on uses the Memory : contrib (I think I copied this from another contrib that has a dependency): You might want to take a look at contrib/xml-query-parser/build.xml as a slightly better example of this. It uses to test if the dependency h

[jira] Updated: (LUCENE-1223) lazy fields don't enforce binary vs string value

2008-03-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1223: --- Attachment: LUCENE-1223.patch Attached patch that just propagates the "binary" value

[jira] Created: (LUCENE-1223) lazy fields don't enforce binary vs string value

2008-03-11 Thread Michael McCandless (JIRA)
lazy fields don't enforce binary vs string value Key: LUCENE-1223 URL: https://issues.apache.org/jira/browse/LUCENE-1223 Project: Lucene - Java Issue Type: Bug Components: Index

[jira] Commented: (LUCENE-1217) use isBinary cached variable instead of instanceof in Filed

2008-03-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577637#action_12577637 ] Michael McCandless commented on LUCENE-1217: OK the new patch passes all tests

[jira] Updated: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eks Dev updated LUCENE-1219: Attachment: LUCENE-1219.patch this one keeps addition of new methods localized to AbstractField, does not

[jira] Updated: (LUCENE-1035) Optional Buffer Pool to Improve Search Performance

2008-03-11 Thread Ning Li (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Li updated LUCENE-1035: Attachment: LUCENE-1035.contrib.patch Re-do as a contrib package. Creating BufferPooledDirectory with your

Re: Ideas to refactor Filed

2008-03-11 Thread Chris Hostetter
: I think, if you give it the same name, it just grays out the old ones. See : https://issues.apache.org/jira/browse/LUCENE-550 for an example. : : Thus, I prefer #3, but am fine with #2 as well. #3 makes it easier, IMO, to : find the latest. use the same name if the patch serves the same purp

[jira] Updated: (LUCENE-1217) use isBinary cached variable instead of instanceof in Filed

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eks Dev updated LUCENE-1217: Attachment: Lucene-1217-take1.patch new patch, fixes isBinary status in LazyField > use isBinary cached v

[jira] Commented: (LUCENE-1217) use isBinary cached variable instead of instanceof in Filed

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577601#action_12577601 ] Eks Dev commented on LUCENE-1217: - hah, this bug just justified this patch :) sorry, I

[jira] Commented: (LUCENE-1217) use isBinary cached variable instead of instanceof in Filed

2008-03-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577598#action_12577598 ] Michael McCandless commented on LUCENE-1217: Actually seeing a test failure wi

[jira] Commented: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577597#action_12577597 ] Eks Dev commented on LUCENE-1219: - I do not know for sure if this is something we could no

[jira] Commented: (LUCENE-1217) use isBinary cached variable instead of instanceof in Filed

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577591#action_12577591 ] Eks Dev commented on LUCENE-1217: - thanks for looking into it! Subclassing now with backwa

Re: [jira] Updated: (LUCENE-1198) Exception in DocumentsWriter.ThreadState.init leads to corruption

2008-03-11 Thread Chris Hostetter
: Thanks Hoss! I did the easy book-keeping part ... you're the guy fixing the bugs and merging them into the release branches :) -Hoss

[jira] Updated: (LUCENE-1222) IndexWriter.doAfterFlush not being called when there are no deletions flushed

2008-03-11 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1222: - Fix Version/s: 2.4 2.3.2 targeted for 2.3.2 bug fix release > IndexWriter.doAfterFlu

[jira] Updated: (LUCENE-1199) NullPointerException in IndexModifier.close()

2008-03-11 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1199: - Fix Version/s: 2.3.2 targeted for 2.3.2 bug fix release > NullPointerException in IndexModifier.close()

[jira] Updated: (LUCENE-1210) IndexWriter & ConcurrentMergeScheduler deadlock case if starting a merge hits an exception

2008-03-11 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1210: - Fix Version/s: 2.3.2 targeted for 2.3.2 bug fix release > IndexWriter & ConcurrentMergeScheduler deadlo

[jira] Updated: (LUCENE-1200) IndexWriter.addIndexes* can deadlock in rare cases

2008-03-11 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1200: - Fix Version/s: 2.3.2 targeted for 2.3.2 bug fix release > IndexWriter.addIndexes* can deadlock in rare

[jira] Updated: (LUCENE-1208) Deadlock case in IndexWriter on exception just before flush

2008-03-11 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1208: - Fix Version/s: 2.3.2 targeted for 2.3.2 bug fix release > Deadlock case in IndexWriter on exception jus

Re: [jira] Updated: (LUCENE-1198) Exception in DocumentsWriter.ThreadState.init leads to corruption

2008-03-11 Thread Michael McCandless
Thanks Hoss! Mike On Mar 11, 2008, at 3:28 PM, Hoss Man (JIRA) wrote: [ https://issues.apache.org/jira/browse/LUCENE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1198: - Fix Version/s: 2.3.2 target

[jira] Updated: (LUCENE-1198) Exception in DocumentsWriter.ThreadState.init leads to corruption

2008-03-11 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1198: - Fix Version/s: 2.3.2 targeted for 2.3.2 bug fix release > Exception in DocumentsWriter.ThreadState.init

[jira] Updated: (LUCENE-1197) IndexWriter can flush too early when flushing by RAM usage

2008-03-11 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1197: - Fix Version/s: 2.3.2 targeted for 2.3.2 bug fix release > IndexWriter can flush too early when flushing

[jira] Updated: (LUCENE-1191) If IndexWriter hits OutOfMemoryError it should not commit

2008-03-11 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1191: - Fix Version/s: 2.3.2 targeted for 2.3.2 bug fix release > If IndexWriter hits OutOfMemoryError it shoul

[jira] Updated: (LUCENE-1207) Allow spell check input to be part of the results

2008-03-11 Thread Hoss Man (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-1207: - Lucene Fields: [New, Patch Available] (was: [Patch Available, New]) Fix Version/s: (was: 2.3.1)

[jira] Commented: (LUCENE-1221) DocumentsWriter truncates term text at \uFFFF

2008-03-11 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577563#action_12577563 ] Yonik Seeley commented on LUCENE-1221: -- If there is a real character that doesn't app

[jira] Commented: (LUCENE-1221) DocumentsWriter truncates term text at \uFFFF

2008-03-11 Thread Marcel Reutegger (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577561#action_12577561 ] Marcel Reutegger commented on LUCENE-1221: -- I'll see if I can build some kind of

[jira] Commented: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-03-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577551#action_12577551 ] Michael McCandless commented on LUCENE-1219: Hmm ... one problem is Fieldable

[jira] Commented: (LUCENE-1217) use isBinary cached variable instead of instanceof in Filed

2008-03-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577547#action_12577547 ] Michael McCandless commented on LUCENE-1217: Patch looks good. I will commit

[jira] Commented: (LUCENE-1221) DocumentsWriter truncates term text at \uFFFF

2008-03-11 Thread Marcel Reutegger (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577544#action_12577544 ] Marcel Reutegger commented on LUCENE-1221: -- > How/why are you seeing/using this c

[jira] Resolved: (LUCENE-1222) IndexWriter.doAfterFlush not being called when there are no deletions flushed

2008-03-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1222. Resolution: Fixed > IndexWriter.doAfterFlush not being called when there are no de

[jira] Created: (LUCENE-1222) IndexWriter.doAfterFlush not being called when there are no deletions flushed

2008-03-11 Thread Michael McCandless (JIRA)
IndexWriter.doAfterFlush not being called when there are no deletions flushed - Key: LUCENE-1222 URL: https://issues.apache.org/jira/browse/LUCENE-1222 Project: Lucene - Java

[jira] Commented: (LUCENE-1221) DocumentsWriter truncates term text at \uFFFF

2008-03-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577513#action_12577513 ] Michael McCandless commented on LUCENE-1221: Hmmm ... 0xFFFF is one of the "in

[jira] Updated: (LUCENE-1221) DocumentsWriter truncates term text at \uFFFF

2008-03-11 Thread Marcel Reutegger (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger updated LUCENE-1221: - Attachment: OddTermTest.java Test to reproduce the issue. > DocumentsWriter truncates t

[jira] Created: (LUCENE-1221) DocumentsWriter truncates term text at \uFFFF

2008-03-11 Thread Marcel Reutegger (JIRA)
DocumentsWriter truncates term text at \uFFFF - Key: LUCENE-1221 URL: https://issues.apache.org/jira/browse/LUCENE-1221 Project: Lucene - Java Issue Type: Bug Components: Index Affect

Re: Ideas to refactor Filed

2008-03-11 Thread eks dev
thanks, I get it now, matter of taste :) I would opt for #3 if you fix bugs from the previous patch, decorate javadoc..., but leave things mainly as they are; #2 is better to mark an interface or approach change, or something more substantial - Original Message From: Grant Ingersoll <[EMA

[jira] Resolved: (LUCENE-1220) PDF search is not working

2008-03-11 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-1220. - Resolution: Invalid Lucene knows nothing about PDFs. It is up to your application to ha

Re: Ideas to refactor Filed

2008-03-11 Thread Grant Ingersoll
I think, if you give it the same name, it just grays out the old ones. See https://issues.apache.org/jira/browse/LUCENE-550 for an example. Thus, I prefer #3, but am fine with #2 as well. #3 makes it easier, IMO, to find the latest. -Grant On Mar 11, 2008, at 10:26 AM, Michael McCandle

[jira] Created: (LUCENE-1220) PDF search is not working

2008-03-11 Thread Akshya kumar (JIRA)
PDF search is not working - Key: LUCENE-1220 URL: https://issues.apache.org/jira/browse/LUCENE-1220 Project: Lucene - Java Issue Type: Test Reporter: Akshya kumar I uploaded pdf file in my repository and

Re: Ideas to refactor Filed

2008-03-11 Thread Michael McCandless
I like #2. I don't think we should delete/replace attachments in Jira. The history can be useful. Mike eks dev wrote: Michael, others what is Lucene/Jira best practice for new versions of the same patch: 1. delete existing / add new patch with the same name 2. add new patch with some fu

Re: Ideas to refactor Filed

2008-03-11 Thread eks dev
Michael, others what is Lucene/Jira best practice for new versions of the same patch: 1. delete existing / add new patch with the same name 2. add new patch with some funky version e.g. "Jira-1219-take3.patch" 3. just add new patch with the same name ?

[jira] Updated: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eks Dev updated LUCENE-1219: Attachment: LUCENE-1219.patch Michael McCandless had some nice ideas on how to make getValue() change pe

Re: Ideas to refactor Filed

2008-03-11 Thread eks dev
tip with extra checks is good, deprecate even better, I will update patch - Original Message From: Michael McCandless <[EMAIL PROTECTED]> To: java-dev@lucene.apache.org Sent: Tuesday, 11 March, 2008 2:45:56 PM Subject: Re: Ideas to refactor Filed Hello! Responses below: eks dev wr

Re: Ideas to refactor Filed

2008-03-11 Thread Michael McCandless
Hello! Responses below: eks dev wrote: Moin Moin Michael, for the first issue I have created LUCENE-1217, and for the second one I have some questions. if we maintain length and offset internally in Field then we have one, imo, theoretical "legacy performance problem" as we need to crea

[jira] Assigned: (LUCENE-1217) use isBinary cached variable instead of instanceof in Filed

2008-03-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1217: -- Assignee: Michael McCandless > use isBinary cached variable instead of instanc

[jira] Assigned: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-03-11 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1219: -- Assignee: Michael McCandless > support array/offset/length setters for Field

[jira] Updated: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eks Dev updated LUCENE-1219: Attachment: LUCENE-1219.patch > support array/offset/length setters for Field with binary data >

[jira] Updated: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eks Dev updated LUCENE-1219: Attachment: (was: LUCENE-1219.patch) > support array/offset/length setters for Field with binary data

[jira] Updated: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eks Dev updated LUCENE-1219: Attachment: LUCENE-1219.patch all tests pass with this patch. some polish needed and probably more testi

Re: [jira] Commented: (LUCENE-1208) Deadlock case in IndexWriter on exception just before flush

2008-03-11 Thread Michael McCandless
OK I've backported fixes for these issues to the 2.3 branch! Mike Michael Busch wrote: Michael McCandless (JIRA) wrote: [ https://issues.apache.org/jira/browse/LUCENE-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576941#action_12576941 ]

[jira] Created: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-03-11 Thread Eks Dev (JIRA)
support array/offset/length setters for Field with binary data --- Key: LUCENE-1219 URL: https://issues.apache.org/jira/browse/LUCENE-1219 Project: Lucene - Java Issue Type: Improv

Re: Ideas to refactor Filed

2008-03-11 Thread eks dev
Moin Moin Michael, for the first issue I have created LUCENE-1217, and for the second one I have some questions. if we maintain length and offset internally in Field then we have one, imo, theoretical "legacy performance problem" as we need to create new byte[length] and copy in order to prese

[jira] Updated: (LUCENE-1218) PassTokenizerFilter that pass text in a Token

2008-03-11 Thread Hiroaki Kawai (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hiroaki Kawai updated LUCENE-1218: -- Attachment: PassTokenizer.java > PassTokenizerFilter that pass text in a Token > -

[jira] Created: (LUCENE-1218) PassTokenizerFilter that pass text in a Token

2008-03-11 Thread Hiroaki Kawai (JIRA)
PassTokenizerFilter that pass text in a Token - Key: LUCENE-1218 URL: https://issues.apache.org/jira/browse/LUCENE-1218 Project: Lucene - Java Issue Type: New Feature Components: Analysis

[jira] Updated: (LUCENE-1217) use isBinary cached variable instead of instanceof in Filed

2008-03-11 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eks Dev updated LUCENE-1217: Attachment: LUCENE-1217.patch > use isBinary cached variable instead of instanceof in Filed >

[jira] Created: (LUCENE-1217) use isBinary cached variable instead of instanceof in Filed

2008-03-11 Thread Eks Dev (JIRA)
use isBinary cached variable instead of instanceof in Filed --- Key: LUCENE-1217 URL: https://issues.apache.org/jira/browse/LUCENE-1217 Project: Lucene - Java Issue Type: Improvement

[jira] Resolved: (LUCENE-1213) MultiFieldQueryParser ignores slop parameter

2008-03-11 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen resolved LUCENE-1213. - Resolution: Fixed Lucene Fields: [Patch Available] (was: [New]) Committed, thanks Trejka

[jira] Issue Comment Edited: (LUCENE-1213) MultiFieldQueryParser ignores slop parameter

2008-03-11 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576880#action_12576880 ] doronc edited comment on LUCENE-1213 at 3/11/08 1:00 AM: -- Tre

[jira] Commented: (LUCENE-584) Decouple Filter from BitSet

2008-03-11 Thread Paul Elschot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577326#action_12577326 ] Paul Elschot commented on LUCENE-584: - From the traceback I suppose this happened at t