[Lucene.Net] [jira] [Commented] (LUCENENET-85) SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
[ https://issues.apache.org/jira/browse/LUCENENET-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009691#comment-13009691 ]

Robert Jordan commented on LUCENENET-85:

IIRC, this was done on purpose to remain index-compatible with Java Lucene.

SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
----------------------------------------------------------------------------------------------------------
Key: LUCENENET-85
URL: https://issues.apache.org/jira/browse/LUCENENET-85
Project: Lucene.Net
Issue Type: Bug
Reporter: Digy
Assignee: George Aroush
Priority: Minor
Attachments: SupportClass.patch

Again the System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator problem:

    public static System.Single Parse(System.String s)
    {
        try
        {
            if (s.EndsWith("f") || s.EndsWith("F"))
                return System.Single.Parse(s.Substring(0, s.Length - 1).Replace(".",
                    System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator));
            else
                return System.Single.Parse(s.Replace(".",
                    System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator));
        }
        catch (System.FormatException fex)
        {
            throw fex;
        }
    }

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
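For context on the "index-compatible with Java Lucene" point above: Java's `Float.parseFloat` is locale-independent ('.' is always the decimal separator), while locale-aware parsing lives in `java.text.NumberFormat`, which is the behavior the .NET port is emulating around. A minimal sketch contrasting the two (class and method names are illustrative, not part of either codebase):

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class LocaleParseDemo {

    // Java's Float.parseFloat is locale-independent: '.' is always the
    // decimal separator, which is why Java Lucene never hits this issue.
    public static float parseInvariant(String s) {
        return Float.parseFloat(s);
    }

    // java.text.NumberFormat, by contrast, is locale-sensitive: a Russian
    // locale expects ',' as the decimal separator (returns NaN on a
    // format mismatch).
    public static float parseRu(String s) {
        try {
            return NumberFormat.getInstance(new Locale("ru", "RU")).parse(s).floatValue();
        } catch (ParseException e) {
            return Float.NaN;
        }
    }
}
```

Both calls below yield 1.5f, but from differently formatted input, which is exactly the mismatch the SupportClass workaround papers over.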
[HUDSON] Solr-3.x - Build # 301 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-3.x/301/ No tests ran. Build Log (for compile errors): [...truncated 17991 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6201 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6201/

2 tests failed.

REGRESSION: org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe
Error Message: Java heap space
Stack Trace:
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2894)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:117)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:589)
    at java.lang.StringBuffer.append(StringBuffer.java:337)
    at java.text.RuleBasedCollator.getCollationKey(RuleBasedCollator.java:617)
    at org.apache.lucene.collation.CollationKeyFilter.incrementToken(CollationKeyFilter.java:93)
    at org.apache.lucene.collation.CollationTestBase.assertThreadSafe(CollationTestBase.java:304)
    at org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe(TestCollationKeyAnalyzer.java:89)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1075)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1007)

REGRESSION: org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange
Error Message: null
Stack Trace:
junit.framework.AssertionFailedError:
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1075)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1007)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.checkInvariants(TestIndexWriterMergePolicy.java:236)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange(TestIndexWriterMergePolicy.java:168)

Build Log (for compile errors):
[...truncated 5123 lines...]
[HUDSON] Solr-trunk - Build # 1450 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-trunk/1450/ No tests ran. Build Log (for compile errors): [...truncated 19064 lines...]
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
On Tue, Mar 22, 2011 at 06:21, Chris Hostetter hossman_luc...@fucit.org wrote (replying to the dev list, see context below):

: Unfortunately, you can't easily recover from this (except by
: reindexing your docs again).
:
: Failing to call IW.commit() or IW.close() means no segments file was written...

I know there were good reasons for eliminating the autoCommit functionality from IndexWriter, but threads like this make me think that even though autoCommit on flush/merge/whatever was bad, having an option for some sort of autoClose using a finalizer might be a good idea to give new/novice users a safety net. In the case of totally successful normal operation, this would result in one commit at GC (assuming the JVM calls the finalizer), and if there were any errors it should (if I understand correctly) do an implicit rollback. Anyone see a downside?

Yes. Totally unexpected magical behaviour. What if I didn't commit something on purpose?

...
: I had a program running for 2 days to build an index for around 160 million
: text files, and after the program ended, I tried searching the index and found
: the index was not correctly built; indexReader.numDocs() returns 0. I
: checked the index directory, it looked good, all the index data seemed to be
: there, the directory is 1.5 gigabytes in size.
:
: I checked my code and found that I forgot to call indexWriter.optimize() and
: indexWriter.close(). I want to know if it is possible to re-optimize() the
: index so I don't need to rebuild the whole index from scratch? I don't
: really want the program to take another 2 days.

-Hoss

--
Kirill Zakharenko/Кирилл Захаренко
E-Mail/Jabber: ear...@gmail.com
Phone: +7 (495) 683-567-4
ICQ: 104465785
Re: Nested Document support in Lucene
May I know in which release it will be ready to use?

On Sat, Mar 19, 2011 at 2:23 PM, Simon Willnauer simon.willna...@googlemail.com wrote:

On Sat, Mar 19, 2011 at 9:39 AM, Kapil Charania lonestar.ka...@gmail.com wrote:

Hi, I am a newbie to Lucene. I have already created indexes for my project. But now the requirement is to go with Nested Document. I googled a lot but cannot find much implementation of nested documents. May I know if it's already implemented in any release of Lucene? Thanks in advance!

AFAIK this is still under heavy development and it doesn't seem to be ready in the near future. It has not yet been released.

simon

--
Kapil Charania.
[jira] [Commented] (LUCENE-2881) Track FieldInfo per segment instead of per-IW-session
[ https://issues.apache.org/jira/browse/LUCENE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009599#comment-13009599 ]

Michael McCandless commented on LUCENE-2881:

Patch looks great! Only a few things:
* We still need a header (id + version) on the fnx file?
* FieldInfos ctor that loads from index file name but makes a new bimap seems spooky...? Is this only used by tests now...?
* Can you add a comment where we create the N.fnx name, explaining why it has no leading _? (ie, because it's not a per-segment file, but rather a global file, shared by multiple segments)
* In the CHANGES entry, persistend is mis-spelled (need to drop the n); also remove the _ from _X.fnx, and add . after successful commit.

Track FieldInfo per segment instead of per-IW-session
-----------------------------------------------------
Key: LUCENE-2881
URL: https://issues.apache.org/jira/browse/LUCENE-2881
Project: Lucene - Java
Issue Type: Improvement
Affects Versions: Realtime Branch, CSF branch, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Fix For: Realtime Branch, CSF branch, 4.0
Attachments: LUCENE-2881.patch, LUCENE-2881.patch, LUCENE-2881.patch, LUCENE-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch

Currently FieldInfo is tracked per IW session to guarantee consistent global field-naming / ordering. IW carries FI instances over from previous segments, which also carries over field properties like isIndexed etc. While having consistent field ordering per IW session appears to be important due to bulk merging stored fields etc., carrying over other properties might become problematic with Lucene's Codec support. Codecs that rely on consistent properties in FI will fail if FI properties are carried over. The DocValuesCodec (DocValuesBranch), for instance, writes files per segment and field (using the field id within the file name). Yet, if a particular segment has no DocValues indexed but a previous segment in the same IW session had DocValues, FieldInfo#docValues will be true, since those values are reused from previous segments. We already work around this limitation in SegmentInfo with properties like hasVectors or hasProx, which is really something we should manage per codec and segment. Ideally FieldInfo would be managed per segment and codec such that its properties are valid per segment. It also seems necessary to bind FieldInfos to SegmentInfo logically, since it's really just per-segment metadata.
Re: Nested Document support in Lucene
AFAIK this is still under heavy development and it doesn't seem to be ready in the near future.

It's stable as far as I'm concerned. LUCENE-2454 includes the code and JUnit tests that work with the latest 3.0.3 release. I have versions of this running in production with 2.4- and 2.9-based releases. The only concern for users is the need to carefully control when flushing occurs, and the accompanying readme.txt gives advice on how to achieve this.

From: Kapil Charania kapil.chara...@gmail.com
To: simon.willna...@gmail.com
Cc: Simon Willnauer simon.willna...@googlemail.com; dev@lucene.apache.org
Sent: Tue, 22 March, 2011 9:12:20
Subject: Re: Nested Document support in Lucene

May I know in which release it will be ready to use?

On Sat, Mar 19, 2011 at 2:23 PM, Simon Willnauer simon.willna...@googlemail.com wrote:

On Sat, Mar 19, 2011 at 9:39 AM, Kapil Charania lonestar.ka...@gmail.com wrote:

Hi, I am a newbie to Lucene. I have already created indexes for my project. But now the requirement is to go with Nested Document. I googled a lot but cannot find much implementation of nested documents. May I know if it's already implemented in any release of Lucene? Thanks in advance!

AFAIK this is still under heavy development and it doesn't seem to be ready in the near future. It has not yet been released.

simon

--
Kapil Charania.
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
Hi,

I know there were good reasons for eliminating the autoCommit functionality from IndexWriter, but threads like this make me think that even though autoCommit on flush/merge/whatever was bad, having an option for some sort of autoClose using a finalizer might be a good idea to give new/novice users a safety net. In the case of totally successful normal operation, this would result in one commit at GC (assuming the JVM calls the finalizer), and if there were any errors it should (if I understand correctly) do an implicit rollback. Anyone see a downside?

I think finalize() is not that trustworthy, in that it may never be called, e.g. in case GC happened not to collect the specific object; the way for programmers to guarantee execution of any code at shutdown is with shutdown hooks. I guess this is what you meant, that Lucene would add a shutdown hook? I.e., each IndexWriter object opened for write would add its own method as a shutdown hook, so that at shutdown, that writer would check its state, and in case it was not closed (and hence also not rolled back) and has pending uncommitted changes, those changes would be committed. Is this what you mean?

I think it is almost okay - it would save the use case of this thread, but could still surprise someone... Perhaps there's a third option - semi-commit? That is, with the proposed shutdown hook, IW commits without deleting the previous commit, and marks on the dir that its state is semi-commit, so that when that index is opened for read or write it would throw a special new exception that indicates this state, and the caller, before continuing to use this index for either read or write, would have to call one of two new utility methods:

- commitSemiCommit(Directory)
- rollbackSemiCommit(Directory)

(Perhaps better names: rollbackSelfCommit, rollbackPartialCommit, etc.)

After that, it would be possible to open the index as usual. It seems to me that something like this can work. Not totally convinced that it is worth the effort...?

Yes. Totally unexpected magical behaviour. What if I didn't commit something on purpose?

Applications can call rollback() in this case.

Regards,
Doron
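The shutdown-hook mechanics Doron describes map onto plain `Runtime.addShutdownHook`. A minimal sketch of only the registration bookkeeping (class and thread names are hypothetical; the actual semi-commit logic would live inside the Runnable, and a writer closed normally would deregister its hook):

```java
public class ShutdownHookSketch {

    // Registers a cleanup action to run at JVM shutdown. In the proposal
    // above, the Runnable would check the writer's state and semi-commit
    // any pending uncommitted changes.
    public static Thread register(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "iw-auto-close");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    // Called from a normal close(): removeShutdownHook returns true only
    // if the hook was still registered (i.e. close had not yet happened).
    public static boolean unregister(Thread hook) {
        return Runtime.getRuntime().removeShutdownHook(hook);
    }
}
```

Unlike finalizers, hooks registered this way do run on normal JVM exit, though not on a hard kill or crash, which is one more reason a recovery tool would still be needed.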
[jira] [Commented] (LUCENE-2881) Track FieldInfo per segment instead of per-IW-session
[ https://issues.apache.org/jira/browse/LUCENE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009610#comment-13009610 ]

Simon Willnauer commented on LUCENE-2881:

bq. We still need a header (id + version) on the fnx file?

It's there - I guess you got the wrong patch.

{quote} FieldInfos ctor that loads from index file name but makes a new bimap seems spooky...? Is this only used by tests now...? {quote}

Well, this one is only for the read case where we open a FIS. Yet, we need to do this since we store a fnx file per SIS, and upon IW#addIndexes(Directory) we could have a FIS that has field numbers different from the global map. This is fine as long as we don't seed the FIS on read. I will open another issue to make this case more efficient and assert that the FIS is read-only once we created it from a directory.

I fixed all the remaining issues and will go ahead and commit now. Thanks Mike.
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
On Mon, Mar 21, 2011 at 11:21 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

Anyone see a downside?

I don't think we should do anything serious in a GC finalizer; sounds like it's asking for a JRE crash.
RE: Urgent! Forgot to close IndexWriter after adding Documents to the index.
Hi,

I know there were good reasons for eliminating the autoCommit functionality from IndexWriter, but threads like this make me think that even though autoCommit on flush/merge/whatever was bad, having an option for some sort of autoClose using a finalizer might be a good idea to give new/novice users a safety net. In the case of totally successful normal operation, this would result in one commit at GC (assuming the JVM calls the finalizer), and if there were any errors it should (if I understand correctly) do an implicit rollback. Anyone see a downside?

I am against all finalizer stuff, because it also leads to problems and is unreliable - we already removed all the finalizer stuff in Lucene left over from the early days, so we should not add it again. This error is only made by a user once; the second time, this user will have a try...finally block around his stuff. A comparison is relational databases with autocommit off: if I crash my app or don't correctly commit my stuff, it's also reverted on loss of connection or forceful shutdown of the JDBC driver. Where is the difference?

But I am for adding a recovery tool for uncommitted segments to CheckIndex. I think this should not be too hard - something like looking for cfs/other file types and creating SegmentReaders that are then added using addIndexes().

Uwe
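The try...finally pattern Uwe recommends guarantees close() runs on both the success and the failure path. A stand-in sketch (the Writer class here is a hypothetical mock used only to show the control flow, not the Lucene API):

```java
public class TryFinallyDemo {

    // Hypothetical stand-in for IndexWriter: only records whether
    // close() was called.
    static class Writer {
        boolean closed = false;
        void addDocument(boolean fail) {
            if (fail) throw new RuntimeException("disk full");
        }
        void close() { closed = true; } // the real IW.close() also commits
    }

    // close() runs whether indexing succeeds or throws, so a crash
    // mid-indexing does not leave an index without a segments file.
    public static boolean closedDespite(boolean fail) {
        Writer w = new Writer();
        try {
            try {
                w.addDocument(fail);
            } finally {
                w.close();
            }
        } catch (RuntimeException expected) {
            // indexing failed, but the writer was still closed
        }
        return w.closed;
    }
}
```

The method returns true for both arguments: the finally block fires even when addDocument throws.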
[jira] [Resolved] (LUCENE-2894) Use of google-code-prettify for Lucene/Solr Javadoc
[ https://issues.apache.org/jira/browse/LUCENE-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved LUCENE-2894. -- Resolution: Fixed Use of google-code-prettify for Lucene/Solr Javadoc --- Key: LUCENE-2894 URL: https://issues.apache.org/jira/browse/LUCENE-2894 Project: Lucene - Java Issue Type: Improvement Components: Javadocs Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Blocker Fix For: 3.1, 4.0 Attachments: LUCENE-2894-solr-analysis-AtoC.patch, LUCENE-2894-solr-analysis-DtoH.patch, LUCENE-2894-solr-analysis-ItoN.patch, LUCENE-2894-solr-analysis-OtoR.patch, LUCENE-2894-solr-analysis-StoZ.patch, LUCENE-2894.patch, LUCENE-2894.patch, LUCENE-2894.patch, LUCENE-2894.patch, LUCENE-2894.patch, LUCENE-2894.patch My company, RONDHUIT uses google-code-prettify (Apache License 2.0) in Javadoc for syntax highlighting: http://www.rondhuit-demo.com/RCSS/api/com/rondhuit/solr/analysis/JaReadingSynonymFilterFactory.html I think we can use it for Lucene javadoc (java sample code in overview.html etc) and Solr javadoc (Analyzer Factories etc) to improve or simplify our life.
[jira] [Commented] (SOLR-2396) add [ICU]CollationField
[ https://issues.apache.org/jira/browse/SOLR-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009627#comment-13009627 ]

Toke Eskildsen commented on SOLR-2396:

A rough idea: It seems that ICU collator keys are null-terminated. Would it be possible to allow for a key that contained the original String? Something like [collator-bytes][null][term-as-utf-8]? This embedding would make collator-ordered faceting with multiple terms/document much easier.

add [ICU]CollationField
-----------------------
Key: SOLR-2396
URL: https://issues.apache.org/jira/browse/SOLR-2396
Project: Solr
Issue Type: Improvement
Reporter: Robert Muir
Fix For: 4.0
Attachments: SOLR-2396.patch, SOLR-2396.patch, SOLR-2396.patch, SOLR-2396.patch

In LUCENE-2551 collation support was changed to use byte[] keys. Previously it encoded sort keys with IndexableBinaryString into char[], but this is wasteful with regards to RAM and disk when terms can be bytes. A better solution would be [ICU]CollationFieldTypes, as this would also allow locale-sensitive range queries.
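The [collator-bytes][null][term-as-utf-8] layout proposed above can be sketched with the JDK collator (ICU's CollationKey has an analogous toByteArray(); whether a zero byte can also appear inside the key itself is exactly the open question, so this is an illustration of the layout, not a safety proof):

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.Locale;

public class CompositeKeySketch {

    // Builds [collator-bytes][0x00][term-as-utf-8].
    public static byte[] composite(Collator c, String term) {
        byte[] key = c.getCollationKey(term).toByteArray();
        byte[] utf8 = term.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[key.length + 1 + utf8.length];
        System.arraycopy(key, 0, out, 0, key.length);
        out[key.length] = 0; // null separator between key and original term
        System.arraycopy(utf8, 0, out, key.length + 1, utf8.length);
        return out;
    }

    // Unsigned lexicographic byte order, i.e. how index terms compare.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Convenience wrapper: composite-key order under an English collator.
    public static int compareEnglish(String a, String b) {
        Collator c = Collator.getInstance(Locale.ENGLISH);
        return compare(composite(c, a), composite(c, b));
    }
}
```

Because collation keys are designed for byte-wise comparison, terms whose keys differ still sort in collator order after the suffix is appended; the suffix only matters when the keys are equal.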
[Lucene.Net] [jira] [Commented] (LUCENENET-85) SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
[ https://issues.apache.org/jira/browse/LUCENENET-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009628#comment-13009628 ]

Sergey Mirvoda commented on LUCENENET-85:

Why invariant culture? You should always use CurrentCulture or require a CultureInfo as a parameter. If you use the invariant culture, you will fail to parse _any_ culture-specific strings - for example, in Russia ',' (not '.') is the decimal separator. And I also think that the method above is wrong.
[jira] [Commented] (SOLR-2396) add [ICU]CollationField
[ https://issues.apache.org/jira/browse/SOLR-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009630#comment-13009630 ]

Robert Muir commented on SOLR-2396:

bq. A rough idea: It seems that ICU Collator Keys are null-terminated.

This isn't always the case, at least at query time - for example, if you are using a bound mode (http://icu-project.org/apiref/icu4j/com/ibm/icu/text/CollationKey.BoundMode.html); I think for the UPPER_LONG case it does not exist. But in any case, I think we can't rely on the fact that ICU might currently avoid zero bytes: this isn't really specified anywhere and is just an optional impl detail (http://unicode.org/reports/tr10/#Avoiding_Zero_Bytes).
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
I like Uwe's idea. As for Hoss's original suggestion, my initial reaction is that if a user understands the need to set the option in the first place, they're also more likely to understand the need for close().

FWIW
Erick

On Tue, Mar 22, 2011 at 8:15 AM, Uwe Schindler u...@thetaphi.de wrote:

Hi,

I know there were good reasons for eliminating the autoCommit functionality from IndexWriter, but threads like this make me think that even though autoCommit on flush/merge/whatever was bad, having an option for some sort of autoClose using a finalizer might be a good idea to give new/novice users a safety net. In the case of totally successful normal operation, this would result in one commit at GC (assuming the JVM calls the finalizer), and if there were any errors it should (if I understand correctly) do an implicit rollback. Anyone see a downside?

I am against all finalizer stuff, because it also leads to problems and is unreliable - we already removed all the finalizer stuff in Lucene left over from the early days, so we should not add it again. This error is only made by a user once; the second time, this user will have a try...finally block around his stuff. A comparison is relational databases with autocommit off: if I crash my app or don't correctly commit my stuff, it's also reverted on loss of connection or forceful shutdown of the JDBC driver. Where is the difference?

But I am for adding a recovery tool for uncommitted segments to CheckIndex. I think this should not be too hard - something like looking for cfs/other file types and creating SegmentReaders that are then added using addIndexes().

Uwe
[Lucene.Net] [jira] [Commented] (LUCENENET-85) SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
[ https://issues.apache.org/jira/browse/LUCENENET-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009637#comment-13009637 ]

Sergej Koščejev commented on LUCENENET-85:

As far as I can tell, the methods in SupportClass are used to convert between doubles and strings for indexing purposes. This conversion should always happen in the same way, independent of the current culture.
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6228 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6228/

1 test failed.

REGRESSION: org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange
Error Message: null
Stack Trace:
junit.framework.AssertionFailedError:
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1214)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1146)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.checkInvariants(TestIndexWriterMergePolicy.java:239)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange(TestIndexWriterMergePolicy.java:171)

Build Log (for compile errors):
[...truncated 3172 lines...]
[Lucene.Net] [jira] [Commented] (LUCENENET-85) SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
[ https://issues.apache.org/jira/browse/LUCENENET-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009650#comment-13009650 ]

Sergey Mirvoda commented on LUCENENET-85:

Not sure whether this rule is relevant to Lucene or not, but our team always uses the following practice: if we need strings only for internal use _and_ the client gives us real numbers, we always use InvariantCulture. If the client gives us numbers as strings, we always use CurrentCulture or a CultureInfo parameter.
[VOTE] Release Lucene/Solr 3.1
Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco
[jira] [Commented] (SOLR-2396) add [ICU]CollationField
[ https://issues.apache.org/jira/browse/SOLR-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009665#comment-13009665 ]

Toke Eskildsen commented on SOLR-2396:

The JavaDoc for CollationKey is very explicit about the null-termination, but I do not know enough about the inner workings to judge whether a concatenation would work in all cases.
[jira] [Commented] (SOLR-2396) add [ICU]CollationField
[ https://issues.apache.org/jira/browse/SOLR-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009670#comment-13009670 ]

Robert Muir commented on SOLR-2396:

Well, it's something we could consider, but on another issue really, as it's not Solr-related... this one just exposes the Lucene functionality from LUCENE-2551 and deprecates the old support. But really, this is going to be wasteful in most cases versus IDENTICAL strength (which will basically give someone the same functionality from a sort/range-query perspective).
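Robert's point about IDENTICAL strength - that the sort key already distinguishes the exact original strings, making an embedded copy of the term mostly redundant - can be seen with the JDK collator, which exposes the same strength model as ICU's:

```java
import java.text.Collator;
import java.util.Locale;

public class StrengthDemo {

    private static Collator collator(int strength) {
        Collator c = Collator.getInstance(Locale.US);
        c.setStrength(strength);
        return c;
    }

    // PRIMARY strength ignores accent and case differences,
    // so "resume" and "résumé" compare equal.
    public static boolean equalPrimary(String a, String b) {
        return collator(Collator.PRIMARY).compare(a, b) == 0;
    }

    // IDENTICAL strength also encodes the exact original characters,
    // so the two spellings remain distinct terms.
    public static boolean equalIdentical(String a, String b) {
        return collator(Collator.IDENTICAL).compare(a, b) == 0;
    }
}
```

At IDENTICAL strength a range query or sort already separates the two spellings, which is why appending the raw term would usually just cost extra bytes.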
[jira] [Resolved] (LUCENE-2978) Upgrade benchmark from commons-compress-1.0 to commons-compress-1.1 for 15 times faster gzip decompression
[ https://issues.apache.org/jira/browse/LUCENE-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen resolved LUCENE-2978. - Resolution: Fixed Lucene Fields: (was: [New]) Committed: - r1084210 - trunk - r1084231 - 3x Upgrade benchmark from commons-compress-1.0 to commons-compress-1.1 for 15 times faster gzip decompression -- Key: LUCENE-2978 URL: https://issues.apache.org/jira/browse/LUCENE-2978 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Fix For: 3.2, 4.0 Attachments: LUCENE-2978.patch In LUCENE-1540 TrecContentSource moved from Java's GZipInputStream to common-compress 1.0. This slowed down gzip decompression by a factor of 15. Upgrading to 1.1 solves this problem. I verified that the problem is only in GZIP, not in BZIP. On the way, as 1.1 introduced constants for the compression methods, the code can be made a bit nicer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
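For reference, the plain-JDK gzip path that TrecContentSource used before moving to commons-compress looks roughly like the round-trip below. This is a minimal standalone sketch with hypothetical helper names, not the benchmark code itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {
    // Compresses a byte[] with the JDK's built-in gzip implementation.
    static byte[] gzip(byte[] data) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        }
        return bos.toByteArray();
    }

    // Decompresses gzip data back into a byte[].
    static byte[] gunzip(byte[] data) throws Exception {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
            return bos.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        String doc = "<DOC>sample trec document</DOC>";
        byte[] packed = gzip(doc.getBytes(StandardCharsets.UTF_8));
        String unpacked = new String(gunzip(packed), StandardCharsets.UTF_8);
        System.out.println(doc.equals(unpacked)); // round-trip preserves content
    }
}
```

The issue is purely about throughput: commons-compress 1.0's gzip stream was about 15x slower than this baseline, and 1.1 restores comparable speed while keeping bzip2 support behind the same factory API.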
[jira] [Commented] (LUCENE-2881) Track FieldInfo per segment instead of per-IW-session
[ https://issues.apache.org/jira/browse/LUCENE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009718#comment-13009718 ] Simon Willnauer commented on LUCENE-2881: - Committed to trunk - I will keep this open until RT and docvalues have synced up with it. Track FieldInfo per segment instead of per-IW-session - Key: LUCENE-2881 URL: https://issues.apache.org/jira/browse/LUCENE-2881 Project: Lucene - Java Issue Type: Improvement Affects Versions: Realtime Branch, CSF branch, 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: Realtime Branch, CSF branch, 4.0 Attachments: LUCENE-2881.patch, LUCENE-2881.patch, LUCENE-2881.patch, LUCENE-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch Currently FieldInfo is tracked per IW session to guarantee consistent global field-naming / ordering. IW carries FI instances over from previous segments which also carries over field properties like isIndexed etc. While having consistent field ordering per IW session appears to be important due to bulk merging stored fields etc. carrying over other properties might become problematic with Lucene's Codec support. Codecs that rely on consistent properties in FI will fail if FI properties are carried over. The DocValuesCodec (DocValuesBranch) for instance writes files per segment and field (using the field id within the file name). Yet, if a segment has no DocValues indexed in a particular segment but a previous segment in the same IW session had DocValues, FieldInfo#docValues will be true since those values are reused from previous segments. We already work around this limitation in SegmentInfo with properties like hasVectors or hasProx which is really something we should manage per Codec Segment. Ideally FieldInfo would be managed per Segment and Codec such that its properties are valid per segment. 
It also seems to be necessary to bind FieldInfoS to SegmentInfo logically, since it's really just per-segment metadata. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1084256 - in /lucene/dev/branches/branch_3x: ./ lucene/ lucene/contrib/benchmark/src/test/org/apache/lucene/benchmark/byTask/feeds/TrecContentSourceTest.java solr/
- assertEquals("Wrong numbre of documents created by osurce!",5,n); + assertEquals("Wrong number of documents created by osurce!",5,n); Err... Was this meant to be 'source' in the new version? D. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[Lucene.Net] [jira] [Commented] (LUCENENET-380) Evaluate Sharpen as a port tool
[ https://issues.apache.org/jira/browse/LUCENENET-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009726#comment-13009726 ] Scott Lombard commented on LUCENENET-380: - I was investigating which license Sharpen uses and found they offer what is called the db4o Opensource Compatibility License (dOCL) instead of the GPL (refer to http://www.db4o.com/about/company/legalpolicies/docl.aspx). I read this as a possibility, but I have to admit I don't fully understand all the legal language. Evaluate Sharpen as a port tool --- Key: LUCENENET-380 URL: https://issues.apache.org/jira/browse/LUCENENET-380 Project: Lucene.Net Issue Type: Task Components: Build Automation, Lucene.Net Contrib, Lucene.Net Core, Lucene.Net Demo, Lucene.Net Test Reporter: George Aroush Assignee: Alex Thompson Attachments: 3.0.2_JavaToCSharpConverter_AfterPostProcessing.zip, 3.0.2_JavaToCSharpConverter_NoPostProcessing.zip, IndexWriter.java, Lucene.Net.3_0_3_Sharpen20110106.zip, Lucene.Net.Sharpen20101104.zip, Lucene.Net.Sharpen20101114.zip, NIOFSDirectory.java, QueryParser.java, TestBufferedIndexInput.java, TestDateFilter.java This task is to evaluate Sharpen as a port tool for Lucene.Net. The files to be evaluated are attached. We need to run those files (which are off Java Lucene 2.9.2) against Sharpen and compare the result against the JLCA result. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: svn commit: r1084256 - in /lucene/dev/branches/branch_3x: ./ lucene/ lucene/contrib/benchmark/src/test/org/apache/lucene/benchmark/byTask/feeds/TrecContentSourceTest.java solr/
:) Thanks, will fix. On Tue, Mar 22, 2011 at 7:17 PM, Dawid Weiss dawid.we...@gmail.com wrote: - assertEquals("Wrong numbre of documents created by osurce!",5,n); + assertEquals("Wrong number of documents created by osurce!",5,n); Err... Was this meant to be 'source' in the new version? D. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[Lucene.Net] [jira] [Commented] (LUCENENET-380) Evaluate Sharpen as a port tool
[ https://issues.apache.org/jira/browse/LUCENENET-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009741#comment-13009741 ] Alex Thompson commented on LUCENENET-380: - The dOCL is just an option in addition to (not instead of) the GPL. From the dOCL page: Versant licenses the Software (as defined in Section 1) pursuant to (1) this Agreement, (2) a commercial, royalty-based license agreement, and (3) the GNU General Public License v2.0 (GPL). You may choose to license the Software pursuant to any of the three agreements. I think the purpose of the dOCL is to cover the scenario where you are mixing code that is already under another open source license. For our sharpen fork I think we would just keep the whole thing GPL. Evaluate Sharpen as a port tool --- Key: LUCENENET-380 URL: https://issues.apache.org/jira/browse/LUCENENET-380 Project: Lucene.Net Issue Type: Task Components: Build Automation, Lucene.Net Contrib, Lucene.Net Core, Lucene.Net Demo, Lucene.Net Test Reporter: George Aroush Assignee: Alex Thompson Attachments: 3.0.2_JavaToCSharpConverter_AfterPostProcessing.zip, 3.0.2_JavaToCSharpConverter_NoPostProcessing.zip, IndexWriter.java, Lucene.Net.3_0_3_Sharpen20110106.zip, Lucene.Net.Sharpen20101104.zip, Lucene.Net.Sharpen20101114.zip, NIOFSDirectory.java, QueryParser.java, TestBufferedIndexInput.java, TestDateFilter.java This task is to evaluate Sharpen as a port tool for Lucene.Net. The files to be evaluated are attached. We need to run those files (which are off Java Lucene 2.9.2) against Sharpen and compare the result against JLCA result. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[HUDSON] Lucene-trunk - Build # 1505 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1505/ No tests ran. Build Log (for compile errors): [...truncated 8651 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
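The direction of the fix can be sketched as follows: lower-case the file suffix before consulting the extension-to-type map, so `file.GZ` resolves the same way as `file.gz`. This is a hypothetical standalone helper for illustration, not the actual ContentSource code.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class FileTypeDetector {
    enum Type { GZIP, BZIP2, TEXT }

    private static final Map<String, Type> BY_EXTENSION = new HashMap<String, Type>();
    static {
        BY_EXTENSION.put("gz", Type.GZIP);
        BY_EXTENSION.put("bz2", Type.BZIP2);
    }

    // Detects the compression type from the file name, ignoring the
    // letter case of the suffix; anything unrecognized is treated as text.
    static Type detect(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) return Type.TEXT;
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Type t = BY_EXTENSION.get(ext);
        return t == null ? Type.TEXT : t;
    }

    public static void main(String[] args) {
        System.out.println(detect("file.gz"));  // GZIP
        System.out.println(detect("file.GZ"));  // GZIP once the suffix is lower-cased
        System.out.println(detect("file.txt")); // TEXT
    }
}
```

Using Locale.ROOT for the lower-casing avoids surprises under locales like Turkish, where the default case mapping of "I" differs.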
[jira] [Updated] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen updated LUCENE-2980: Attachment: LUCENE-2980.patch Patch with a test that fails due to this bug. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe reopened LUCENE-2973: - Ant's {{exec}} task can take arguments in two forms: as a line to be parsed/split by the shell ({{arg line=...)}}, or as a pre-parsed list (with one {{arg value=...}} for each cmdline argument). The {{svn-export-source}} target should be rewritten to use the pre-parsed form, to avoid splitting problems (e.g. spaces in paths) with the {{arg line=...}} form. Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Blocker Fix For: 3.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
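The hazard behind the arg-line form is the same one Java programmers know from Runtime.exec(String) versus ProcessBuilder: a command given as a single line gets split on whitespace, so a path containing a space falls apart, while a pre-parsed argument list is passed through verbatim. An illustrative standalone sketch (not the build file itself; the splitting here only approximates what Ant's shell-style parsing does):

```java
import java.util.Arrays;
import java.util.List;

public class ArgSplitting {
    // Naive whitespace splitting, roughly what happens to an
    // <arg line="..."/> (or a Runtime.exec(String) command line).
    static List<String> splitLine(String line) {
        return Arrays.asList(line.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        String exportDir = "/tmp/my checkout";  // path containing a space
        // Line form: the path is broken into two separate arguments.
        List<String> parsed = splitLine("svn export " + exportDir);
        System.out.println(parsed);             // [svn, export, /tmp/my, checkout]
        // Pre-parsed form: one <arg value="..."/> per argument (or
        // ProcessBuilder) keeps the path intact.
        List<String> preParsed = Arrays.asList("svn", "export", exportDir);
        System.out.println(preParsed);          // [svn, export, /tmp/my checkout]
    }
}
```

This is why rewriting svn-export-source to use one `arg value` element per command-line argument removes the splitting problem entirely.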
[jira] [Commented] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009796#comment-13009796 ] Shai Erera commented on LUCENE-2980: Patch looks good. Few tiny comments: * Should ContentSourceTest extend BenchmarkTestCase? * I think that instead of assertTrue(testDir.isDirectory()); you can assertTrue(testDir.mkdirs()); * In case you wanted a second opinion about the nocommit lines, I think they can all go away :). Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [jira] [Updated] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
Heh -- I think INFRA fixed the subject line thing. Both emails were grouped together by Gmail. Notice that [Created] and [Updated] in both of them. Nice ! On Tue, Mar 22, 2011 at 8:58 PM, Doron Cohen (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel] Doron Cohen updated LUCENE-2980: Attachment: LUCENE-2980.patch Patch with test that fail due to this bug. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009798#comment-13009798 ] Steven Rowe commented on LUCENE-2973: - bq. The {{svn-export-source}} target should be rewritten to use the pre-parsed form, to avoid splitting problems (e.g. spaces in paths) with the {{arg line=...}} form. Committed: - r1084307 trunk - r1084309 branch_3x I'll leave this issue open to also be committed on the lucene_solr_3_1 branch before a 3.1.1 release (if there is one). Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Blocker Fix For: 3.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated LUCENE-2973: Priority: Minor (was: Blocker) Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Minor Fix For: 3.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
HEADS UP - Rebuild any trunk indices
Heads up -- LUCENE-2881, which I committed earlier today, changes the SegmentInfos file format and adds a new ${number}.fnx file storing global field numbers. If you have any indexes lying around built with revs of trunk before this commit, after you update you should completely reindex. Indexes prior to trunk (< 4.0) will work fine and don't require any reindexing. Simon - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1084324 - in /lucene/dev/branches/branch_3x: lucene/build.xml solr/build.xml solr/common-build.xml
Thanks, we should add this rev also to the 3.1.1 issue still open. Maybe we get another rc, we can merge then earlier.
--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de

sar...@apache.org schrieb:

Author: sarowe
Date: Tue Mar 22 19:50:05 2011
New Revision: 1084324
URL: http://svn.apache.org/viewvc?rev=1084324&view=rev
Log: use ${svn.exe} property instead of directly calling svn executable

Modified:
    lucene/dev/branches/branch_3x/lucene/build.xml
    lucene/dev/branches/branch_3x/solr/build.xml
    lucene/dev/branches/branch_3x/solr/common-build.xml

Modified: lucene/dev/branches/branch_3x/lucene/build.xml
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/lucene/build.xml?rev=1084324&r1=1084323&r2=1084324&view=diff
--- lucene/dev/branches/branch_3x/lucene/build.xml (original)
+++ lucene/dev/branches/branch_3x/lucene/build.xml Tue Mar 22 19:50:05 2011
@@ -373,7 +373,7 @@
   <target name="svn-export-source" depends="get-svn-info">
     <delete dir="${svn.export.dir}" includeemptydirs="true" failonerror="false"/>
     <mkdir dir="${build.dir}"/>
-    <exec dir="." executable="svn" failonerror="true">
+    <exec dir="." executable="${svn.exe}" failonerror="true">
       <arg value="export"/>
       <arg value="--native-eol"/>
       <arg value="LF"/>
@@ -388,7 +388,7 @@
   <!-- Populates properties svn.URL and svn.Revision using svn info. -->
   <target name="get-svn-info">
-    <exec dir="." executable="svn" outputproperty="svn.info" failonerror="true">
+    <exec dir="." executable="${svn.exe}" outputproperty="svn.info" failonerror="true">
       <arg value="info"/>
     </exec>
     <loadproperties

Modified: lucene/dev/branches/branch_3x/solr/build.xml
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/solr/build.xml?rev=1084324&r1=1084323&r2=1084324&view=diff
--- lucene/dev/branches/branch_3x/solr/build.xml (original)
+++ lucene/dev/branches/branch_3x/solr/build.xml Tue Mar 22 19:50:05 2011
@@ -761,7 +761,7 @@
   <target name="svn-export-source" depends="get-svn-info">
     <delete dir="${svn.export.dir}" includeemptydirs="true" failonerror="false"/>
     <mkdir dir="${dest}"/>
-    <exec dir="." executable="svn" failonerror="true">
+    <exec dir="." executable="${svn.exe}" failonerror="true">
       <arg value="export"/>
       <arg value="--native-eol"/>
       <arg value="LF"/>
@@ -774,7 +774,7 @@
   <!-- Populates properties svn.URL and svn.Revision using svn info -->
   <target name="get-svn-info">
-    <exec dir=".." executable="svn" outputproperty="svn.info" failonerror="true">
+    <exec dir=".." executable="${svn.exe}" outputproperty="svn.info" failonerror="true">
       <arg value="info"/>
     </exec>
     <loadproperties
@@ -874,7 +874,7 @@
   </target>
   <target name="svn-up">
-    <exec executable="svn">
+    <exec executable="${svn.exe}">
       <arg value="update"/>
     </exec>
   </target>

Modified: lucene/dev/branches/branch_3x/solr/common-build.xml
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/solr/common-build.xml?rev=1084324&r1=1084323&r2=1084324&view=diff
--- lucene/dev/branches/branch_3x/solr/common-build.xml (original)
+++ lucene/dev/branches/branch_3x/solr/common-build.xml Tue Mar 22 19:50:05 2011
@@ -81,6 +81,7 @@
      default PATH. (this is useful for Hudson) -->
   <property name="svnversion.exe" value="svnversion" />
+  <property name="svn.exe" value="svn" />
   <!-- Java Version we are compatible with -->
   <property name="java.compat.version" value="1.5" />
[jira] [Updated] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated LUCENE-2973: Fix Version/s: 3.1.1 Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Minor Fix For: 3.1, 3.1.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: svn commit: r1084324 - in /lucene/dev/branches/branch_3x: lucene/build.xml solr/build.xml solr/common-build.xml
I agree – I’ve added it. - Steve From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Tuesday, March 22, 2011 3:54 PM To: dev@lucene.apache.org Subject: Re: svn commit: r1084324 - in /lucene/dev/branches/branch_3x: lucene/build.xml solr/build.xml solr/common-build.xml Thanks, we should add this rev also to the 3.1.1 issue still open. Maybe we get another rc, we can merge then earlier. -- Uwe Schindler H.-H.-Meier-Allee 63, 28213 Bremen http://www.thetaphi.de sar...@apache.org schrieb: Author: sarowe Date: Tue Mar 22 19:50:05 2011 New Revision: 1084324 URL: http://svn.apache.org/viewvc?rev=1084324&view=rev Log: use ${svn.exe} property instead of directly calling svn executable Modified: lucene/dev/branches/branch_3x/lucene/build.xml lucene/dev/branches/branch_3x/solr/build.xml lucene/dev/branches/branch_3x/solr/common-build.xml
[jira] [Created] (LUCENE-2981) Review and potentially remove unused/unsupported Contribs
Review and potentially remove unused/unsupported Contribs - Key: LUCENE-2981 URL: https://issues.apache.org/jira/browse/LUCENE-2981 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll Fix For: 3.2, 4.0 Some of our contribs appear to be lacking for development/support or are missing tests. We should review whether they are even pertinent these days and potentially deprecate and remove them. One of the things we did in Mahout when bringing in Colt code was to mark all code that didn't have tests as @deprecated and then we removed the deprecation once tests were added. Those that didn't get tests added over about a 6 mos. period of time were removed. I would suggest taking a hard look at: ant db lucli swing (spatial should be gutted to some extent and moved to modules) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009812#comment-13009812 ] Steven Rowe commented on LUCENE-2973: - The {{svn-export-source}} targets were not using the {{$\{svn.exe\}}} property, but they should, to allow for use of {{svn}} executables not on the {{$PATH}}. Committed: - r1084322 trunk - r1084324 branch_3x This change should also be added to the lucene_solr_3_1 branch before a 3.1.1 release (if there is one). Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Minor Fix For: 3.1, 3.1.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [VOTE] Release Lucene/Solr 3.1
I found a few documentation issues in the binary Lucene .zip (these are not blockers, IMHO): Lucene binary .zip -- A. README.txt: 1. contrib/demo/lucene-demos-XX.jar (demos should be demo) 2. See BUILD.txt for building a source distribution (there is no such file in the binary distribution) 3. No mention of the included test jar: lucene-core-3.1.0-tests.jar 4. No mention of the javadoc jars (one for the test jar, one for core jar) B. Javadoc: The Test Framework API home page is the same as the root home page. (At a minimum, it should be blank, but better would be a description of the test framework.) Steve -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Tuesday, March 22, 2011 10:21 AM To: dev@lucene.apache.org Subject: [VOTE] Release Lucene/Solr 3.1 Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2428) Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar
[ https://issues.apache.org/jira/browse/SOLR-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009824#comment-13009824 ] Dawid Weiss commented on SOLR-2428: --- Also, update the license's (c) years. Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar - Key: SOLR-2428 URL: https://issues.apache.org/jira/browse/SOLR-2428 Project: Solr Issue Type: Improvement Components: contrib - Clustering Affects Versions: 3.1.1, 3.2 Reporter: Steven Rowe Assignee: Dawid Weiss Priority: Minor Fix For: 3.1.1, 3.2 As of not-yet-released version 3.4.4, the carrot2-core jar will be published as a retrowoven 1.5 version (in addition to a Java-1.6-compiled version) - see Dawid Weiss's comment on [LUCENE-2957|https://issues.apache.org/jira/browse/LUCENE-2957?focusedCommentId=13006878page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13006878] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Release Lucene/Solr 3.1
Don't know how important this is, but: 1) I've just tried following the instructions from example/README.txt; under cygwin curl is not installed by default and post.sh assumes it is always available, resulting in command-not-found ugliness. 2) example/solr/conf/solrconfig.xml states that: <!-- Clustering Component http://wiki.apache.org/solr/ClusteringComponent This relies on third party jars which are not included in the release. To use this component (and the /clustering handler) Those jars will need to be downloaded, and you'll need to set the solr.cluster.enabled system property when running solr... java -Dsolr.clustering.enabled=true -jar start.jar --> Not true, all the required JARs are included. None are blockers, I will fix #2 in the trunk, let me know if this should also be applied to the 3.x branch. Dawid On Tue, Mar 22, 2011 at 3:21 PM, Yonik Seeley yo...@lucidimagination.com wrote: Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009869#comment-13009869 ] Doron Cohen commented on LUCENE-2980: - Thanks Shai! I fixed the super class and the assert as suggested. Those nocommits stand for a larger problem - I was ready for a trivial fix for this bug - just lower-case the extension in ContentSource before consulting the map. However the test failed, and I found out that this is because the input stream returned by CompressorStreamFactory.createCompressorInputStream() does not close its underlying stream when it is exhausted or when its close() method is called. I opened COMPRESS-127 for this. As a workaround, ContentSource now returns a wrapper around the input stream created by the factory; it delegates all methods to it, except that close() is also delegated to the underlying stream. This fix is required for the extension letter-case tests to pass, but it also fixes a more serious problem - leaking file handles in ContentSource. As Solr also makes use of Commons Compress, I searched it for references to CompressorStreamFactory.createCompressorInputStream() but found none, so it seems Solr is not affected by COMPRESS-127. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong.
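The close-delegating wrapper described in the comment can be sketched with plain java.io types (names are illustrative, not the actual ContentSource code):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Decorator that forwards everything to the decompressing stream, but on
// close() also closes the underlying raw stream - which, per COMPRESS-127,
// the decompressor fails to do on its own.
class ClosingWrapper extends FilterInputStream {
    private final InputStream underlying;

    ClosingWrapper(InputStream decompressor, InputStream underlying) {
        super(decompressor);
        this.underlying = underlying;
    }

    @Override
    public void close() throws IOException {
        try {
            super.close();      // close the decompressing stream
        } finally {
            underlying.close(); // ...and always release the file handle
        }
    }
}
```

The try/finally ensures the file handle is released even if closing the decompressor throws, which is what plugs the leak.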
[jira] [Updated] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen updated LUCENE-2980: Attachment: LUCENE-2980.patch Attached patch adds fixes for: - the extension letter-case problem. - closing the underlying stream of g/bzip streams. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch, LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong.
[jira] [Created] (LUCENE-2982) Get rid of ContentSource's workaround for closing b/gzip input stream once this is fixed in CommonCompress
Get rid of ContentSource's workaround for closing b/gzip input stream once this is fixed in CommonCompress - Key: LUCENE-2982 URL: https://issues.apache.org/jira/browse/LUCENE-2982 Project: Lucene - Java Issue Type: Task Components: contrib/benchmark Reporter: Doron Cohen Priority: Minor Once COMPRESS-127 is fixed, get rid of the entire workaround method ContentSource.closableCompressorInputStream(). It would simplify the code and perform better without that delegation.
[HUDSON] Lucene-3.x - Build # 322 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-3.x/322/ All tests passed Build Log (for compile errors): [...truncated 18984 lines...]
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
: I like Uwe's idea. As for Hoss's original suggestion, my initial : reaction is that if a user understands the need to set the option : in the first place, they're also more likely to understand the need : for close(). my intention was that if the user used a novice-type API for getting an IndexWriter, it would default to true, but any of the non-trivial constructors would default to false. : I am against all finalizer stuff, because it also leads to problems and is : unreliable - we already removed all finalizer stuff in Lucene left over from generally i agree with you, you shouldn't *expect* finalizers to be called, but i'm not aware of any problems that can happen by using the finalizer as a safety net ... rmuir mentioned it could cause a JRE crash but i don't understand how that would happen. : A comparison is relational databases with autocommit off. If I crash my app : or don't correctly commit my stuff, it's also reverted on loss of : connection or forceful shutdown of the JDBC driver! Where is the difference? the difference is a lot of DBs do default to autocommit, and we not only don't have autocommit (or autoclose as i'm suggesting) as a default, we don't even offer it as an option. it just seems like the kind of thing that could easily bite someone in the ass that we could help prevent. not just in the case of a person who writes their first Lucene app and doesn't know to call close() or commit() at all, but in the case of someone who has an app that works fine 90% of the time, but doesn't realize they have a stray code path where they aren't committing/closing properly ... so *most* of the time their app works fine and all of their data is there, but sometimes, for reasons they can't understand, data is missing when they do searches (even though their indexing code logs that it was added successfully) -Hoss
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
: I think finalize() is not that trustworthy, in that it may : never be called, e.g. in case GC happened to not collect the specific : object, : and so the way for programmers to guarantee execution of any code : at shutdown is with shutdown hooks, I guess this is what you meant, i'm not suggesting that this be documented as a *reliable* guaranteed way to get a commit, just as a safety net for novice users. I don't know enough about the finer points of shutdown hooks to comment on the distinction, but my off-the-cuff assumption is that a shutdown hook would be a bad idea ... in a long running program wouldn't that keep the IndexWriter from being GCed until shutdown? : Yes. Totally unexpected magical behaviour. : What if I didn't commit something on purpose? ... : Applications can call rollback() in this case. or more specifically, along the lines of my original point: people who read the docs carefully are more likely to know about rollback and call it explicitly, or to see the autoClose option and explicitly set it to false (or use a constructor where it defaults to false) -Hoss
Re: [VOTE] Release Lucene/Solr 3.1
Overall, things look good to me. As discussed on IRC, one minor nit: 1. In the source bundle, the Changes.html is missing and so index.html has dead links. I know Changes.html is generated. We could just hook this into the svn export target and then I think the docs would be whole. I guess I'd say +1 at this point. Sigs look good, examples look good for both Solr and Lucene. Maven artifacts look reasonable at a glance. -Grant On Mar 22, 2011, at 10:21 AM, Yonik Seeley wrote: Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco
Re: Nested Document support in Lucene
May I know in which release it will be ready to use. On Tue, Mar 22, 2011 at 2:42 PM, Kapil Charania kapil.chara...@gmail.com wrote: May I know in which release it will be ready to use. On Sat, Mar 19, 2011 at 2:23 PM, Simon Willnauer simon.willna...@googlemail.com wrote: On Sat, Mar 19, 2011 at 9:39 AM, Kapil Charania lonestar.ka...@gmail.com wrote: Hi, I am a newbie to Lucene. I have already created indexes for my project. But now the requirement is to go with nested documents. I googled a lot but could not find much implementation of nested documents. May I know if it's already implemented in any release of Lucene. Thanks in advance!!! AFAIK this is still under heavy development and it doesn't seem it will be ready in the near future. It has not yet been released. simon -- Kapil Charania.
[HUDSON] Lucene-trunk - Build # 1507 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1507/ No tests ran. Build Log (for compile errors): [...truncated 77 lines...]
[jira] [Commented] (LUCENE-2979) Simplify configuration API of contrib Query Parser
[ https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009972#comment-13009972 ] Phillipe Ramalho commented on LUCENE-2979: -- Hi, I am considering doing a gsoc proposal about this, any specific points I should be covering on the proposal? I saw Adriano's comment on LUCENE-1823: {quote} The map idea is really good and fits well as configuration for the QP, but I would like to restrict the key type, so the user doesn't use a String object as key. String keys may lead to runtime errors, mainly when they are inserted inline. I would prefer to use enums as keys; it would enforce that the user always passes the same object as key when referencing the same configuration. It also avoids duplicated configuration keys, since each enum constant has only one instance per JVM. If nobody complains about using a Map<Enum, Object> as configuration for the QP framework, I will start working on a new patch including these changes soon. {quote} I will try to initially cover how we can use Map to replace the current config API. I would also like to cover how/whether we can make the new API compatible with the old one, so users can migrate from old to new slowly, deprecating the old one of course. I will also investigate the best way to enforce that the user always passes the same key object, and try to suggest an API that allows users to retrieve config values without casting them from Object; maybe Java generics will enable it, but I am not sure it will work with Enum. Anything else I should be covering on the proposal? 
Simplify configuration API of contrib Query Parser -- Key: LUCENE-2979 URL: https://issues.apache.org/jira/browse/LUCENE-2979 Project: Lucene - Java Issue Type: Improvement Components: contrib/* Affects Versions: 2.9, 3.0 Reporter: Adriano Crestani Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11 Fix For: 3.2, 4.0 The current configuration API is very complicated and inherits the concept used by the Attribute API to store token information in token streams. However, the requirements for the two (QP config and token stream) are not the same, so they shouldn't be using the same thing. I propose to simplify the QP config and make it less scary for people intending to use the contrib QP. The task is not difficult, it will just require a lot of code change and figuring out the best way to do it. That's why it's a good candidate for a GSoC project. I would like to hear good proposals about how to make the API more friendly and less scary :)
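The cast-free, typed-key config map discussed above can be sketched with a generic key class (hypothetical names, not the actual Lucene API - the key's type parameter lets get() return T without the caller casting from Object, and key-object identity avoids typo'd string keys):

```java
import java.util.HashMap;
import java.util.Map;

// A config key carrying its value type as a type parameter.
final class ConfigKey<T> {
    private final String name;
    ConfigKey(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

// Map-backed configuration where set() and get() agree on the value type.
final class QueryParserConfig {
    private final Map<ConfigKey<?>, Object> values = new HashMap<>();

    <T> void set(ConfigKey<T> key, T value) {
        values.put(key, value);
    }

    @SuppressWarnings("unchecked") // safe: set() pairs key and value types
    <T> T get(ConfigKey<T> key) {
        return (T) values.get(key);
    }
}
```

Usage would look like `ConfigKey<Boolean> LOWERCASE = new ConfigKey<>("lowercase"); config.set(LOWERCASE, true);` - no cast at the call site, and the compiler rejects `config.set(LOWERCASE, 42)`.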
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6232 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6232/ No tests ran. Build Log (for compile errors): [...truncated 47 lines...]
[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008825#comment-13008825 ] Bill Bell edited comment on SOLR-2242 at 3/23/11 2:31 AM: -- Can someone look this patch over? Also requested +1 from Isha Garg isha.g...@orkash.com Thanks,. was (Author: billnbell): Can someone loom this patch over? Thanks,. Get distinct count of names for a facet field - Key: SOLR-2242 URL: https://issues.apache.org/jira/browse/SOLR-2242 Project: Solr Issue Type: New Feature Components: Response Writers Affects Versions: 4.0 Reporter: Bill Bell Priority: Minor Fix For: 4.0 Attachments: SOLR.2242.v2.patch When returning facet.field=name of field you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called namedistinct. Here is an example: http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1 Here is an example on field hgid (without namedistinct): {code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY045FD36D4000A">1</int>
    <int name="HGPY0FBC6690453A9">1</int>
    <int name="HGPY1E44ED6C4FB3B">1</int>
    <int name="HGPY1FA631034A1B8">1</int>
    <int name="HGPY3317ABAC43B48">1</int>
    <int name="HGPY3A17B2294CB5A">5</int>
    <int name="HGPY3ADD2B3D48C39">1</int>
  </lst>
</lst>
{code} With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns number of rows (7), not the number of values (11). {code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="_count_">7</int>
  </lst>
</lst>
{code} This works actually really well to get the total number of fields for a group.field=hgid. Enjoy! -- This message is automatically generated by JIRA. 
write byte[] directly to TokenStream
I'm messing with putting binary data directly in the index. I have a field class with:

@Override
public TokenStream tokenStreamValue() {
  byte[] value = (byte[]) fieldsData;
  Token token = new Token(0, value.length, "geo");
  token.resizeBuffer(value.length);
  BytesRef ref = token.getBytesRef();
  ref.bytes = value;
  ref.length = value.length;
  ref.offset = 0;
  token.setLength(ref.length);
  return new SingleTokenTokenStream(token);
}

but that is just writing an empty token. Is it possible to set the Token value without converting to char[]? thanks ryan
Re: [VOTE] Release Lucene/Solr 3.1
+1 * Walked through the solr example * Tested a simple maven project, worked well I don't think the minor issues listed so far are blockers Thanks to everyone who worked on this! ryan On Tue, Mar 22, 2011 at 10:21 AM, Yonik Seeley yo...@lucidimagination.com wrote: Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco
[jira] [Commented] (LUCENE-2454) Nested Document query support
[ https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009985#comment-13009985 ] Ryan McKinley commented on LUCENE-2454: --- bq. Solr, however does introduce a schema and much more that assumes a flat model. In SOLR-1566 we could add a DocList as a field within a SolrDocument -- this would at least allow the output format to return a nested structure. I have not looked at this patch, so this comment may be off base. Nested Document query support - Key: LUCENE-2454 URL: https://issues.apache.org/jira/browse/LUCENE-2454 Project: Lucene - Java Issue Type: New Feature Components: Search Affects Versions: 3.0.2 Reporter: Mark Harwood Assignee: Mark Harwood Priority: Minor Attachments: LuceneNestedDocumentSupport.zip A facility for querying nested documents in a Lucene index as outlined in http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene
[jira] [Commented] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009986#comment-13009986 ] Shai Erera commented on LUCENE-2980: That's a serious problem - good catch! Patch looks good. Perhaps we should add a specific test in CSTest for this problem? I wouldn't use file.delete() as an indicator because on Linux it will pass. Perhaps a test which writes to a byte[] and then an extension of ByteArrayInputStream would mark whether close() was called, and the test would assert on it. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch, LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong.
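The close-tracking stream suggested above can be sketched in a few lines (hypothetical, not the committed test):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

// A ByteArrayInputStream subclass that records whether close() was called,
// so a test can assert that the underlying stream really gets closed -
// platform independent, unlike checking file.delete().
class CloseTrackingInputStream extends ByteArrayInputStream {
    boolean closed = false;

    CloseTrackingInputStream(byte[] data) {
        super(data);
    }

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }
}
```

A test would feed this stream through the decompressing wrapper, exhaust it, call close(), and then assert on the `closed` flag.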
[jira] [Updated] (LUCENE-2310) Reduce Fieldable, AbstractField and Field complexity
[ https://issues.apache.org/jira/browse/LUCENE-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Male updated LUCENE-2310: --- Attachment: LUCENE-2310.patch Reactivating this issue. New patch against 3x (since it's all about deprecations). AbstractField and Fieldable are deprecated. Document is changed a little in preparation for using Field. Code cleanup in Field. I'm going to leave the suggested larger changes to Document and Field for a later issue. I really want to get these classes out of trunk so I can then visualize the code better. Reduce Fieldable, AbstractField and Field complexity Key: LUCENE-2310 URL: https://issues.apache.org/jira/browse/LUCENE-2310 Project: Lucene - Java Issue Type: Sub-task Components: Index Reporter: Chris Male Attachments: LUCENE-2310-Deprecate-AbstractField-CleanField.patch, LUCENE-2310-Deprecate-AbstractField.patch, LUCENE-2310-Deprecate-AbstractField.patch, LUCENE-2310-Deprecate-AbstractField.patch, LUCENE-2310-Deprecate-DocumentGetFields-core.patch, LUCENE-2310-Deprecate-DocumentGetFields.patch, LUCENE-2310-Deprecate-DocumentGetFields.patch, LUCENE-2310.patch In order to move field-type-like functionality into its own class, we really need to try to tackle the hierarchy of Fieldable, AbstractField and Field. Currently AbstractField depends on Field, and does not provide much more functionality than storing fields, most of which is being moved over to FieldType. Therefore it seems ideal to try to deprecate AbstractField (and possibly Fieldable), moving much of the functionality into Field and FieldType.
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
Hi, I don't know enough about the finer points of shutdown hooks to comment on the distinction, but my off-the-cuff assumption is that a shutdown hook would be a bad idea ... in a long running program wouldn't that keep the IndexWriter from being GCed until shutdown? Could be, haven't used them either... ...If IW.close() calls RT.removeShutdownHook() I think this should work. Doron
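The register-then-remove pattern being discussed can be sketched as follows (illustrative names, not actual IndexWriter code): the hook commits on abrupt JVM shutdown, and close() unregisters it so a long-lived writer is not pinned by the runtime until exit.

```java
// Hypothetical safety-net writer: a shutdown hook commits pending changes
// if the app forgets to close(); a normal close() removes the hook so the
// object can be garbage collected.
final class AutoClosingWriter implements AutoCloseable {
    private final Thread hook = new Thread(this::commit);
    boolean closed = false;

    AutoClosingWriter() {
        Runtime.getRuntime().addShutdownHook(hook);
    }

    void commit() {
        // flush pending changes; a no-op placeholder in this sketch
    }

    @Override
    public void close() {
        if (closed) return;
        closed = true;
        commit();
        // Unregister so the runtime no longer holds a reference to this object.
        Runtime.getRuntime().removeShutdownHook(hook);
    }
}
```

Note removeShutdownHook() throws IllegalStateException if called while shutdown is already in progress, so a production version would have to guard against that race.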
Re: write byte[] directly to TokenStream
On Mar 22, 2011 11:38 PM, Ryan McKinley ryan...@gmail.com wrote: I'm messing with putting binary data directly in the index. I have a field class with:

@Override
public TokenStream tokenStreamValue() {
  byte[] value = (byte[]) fieldsData;
  Token token = new Token(0, value.length, "geo");
  token.resizeBuffer(value.length);
  BytesRef ref = token.getBytesRef();
  ref.bytes = value;
  ref.length = value.length;
  ref.offset = 0;
  token.setLength(ref.length);
  return new SingleTokenTokenStream(token);
}

but that is just writing an empty token. Is it possible to set the Token value without converting to char[]? check out Test2BTerms for an example...
[HUDSON] Solr-3.x - Build # 302 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-3.x/302/ No tests ran. Build Log (for compile errors): [...truncated 17993 lines...]