[Lucene.Net] [jira] [Commented] (LUCENENET-85) SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
[ https://issues.apache.org/jira/browse/LUCENENET-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009691#comment-13009691 ]

Robert Jordan commented on LUCENENET-85:

IIRC, this was done on purpose to remain index-compatible with Java Lucene.

SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
----------------------------------------------------------------------------------------------------------
Key: LUCENENET-85
URL: https://issues.apache.org/jira/browse/LUCENENET-85
Project: Lucene.Net
Issue Type: Bug
Reporter: Digy
Assignee: George Aroush
Priority: Minor
Attachments: SupportClass.patch

Again the System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator problem:

    public static System.Single Parse(System.String s)
    {
        try
        {
            if (s.EndsWith("f") || s.EndsWith("F"))
                return System.Single.Parse(s.Substring(0, s.Length - 1).Replace(".",
                    System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator));
            else
                return System.Single.Parse(s.Replace(".",
                    System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator));
        }
        catch (System.FormatException fex)
        {
            throw fex;
        }
    }

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
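For context on the "index-compatible with Java Lucene" point above: Java's `Float.parseFloat` is locale-independent ('.' is always the decimal separator), while locale-aware parsing lives in `java.text.NumberFormat`, which is the behavior the .NET port is emulating around. A minimal sketch contrasting the two (class and method names are illustrative, not part of either codebase):

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class LocaleParseDemo {

    // Java's Float.parseFloat is locale-independent: '.' is always the
    // decimal separator, which is why Java Lucene never hits this issue.
    public static float parseInvariant(String s) {
        return Float.parseFloat(s);
    }

    // java.text.NumberFormat, by contrast, is locale-sensitive: a Russian
    // locale expects ',' as the decimal separator (returns NaN on a
    // format mismatch).
    public static float parseRu(String s) {
        try {
            return NumberFormat.getInstance(new Locale("ru", "RU")).parse(s).floatValue();
        } catch (ParseException e) {
            return Float.NaN;
        }
    }
}
```

Both calls below yield 1.5f, but from differently formatted input, which is exactly the mismatch the SupportClass workaround papers over.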
[HUDSON] Solr-3.x - Build # 301 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-3.x/301/ No tests ran. Build Log (for compile errors): [...truncated 17991 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6201 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6201/

2 tests failed.

REGRESSION: org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe
Error Message: Java heap space
Stack Trace:
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2894)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:117)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:589)
    at java.lang.StringBuffer.append(StringBuffer.java:337)
    at java.text.RuleBasedCollator.getCollationKey(RuleBasedCollator.java:617)
    at org.apache.lucene.collation.CollationKeyFilter.incrementToken(CollationKeyFilter.java:93)
    at org.apache.lucene.collation.CollationTestBase.assertThreadSafe(CollationTestBase.java:304)
    at org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe(TestCollationKeyAnalyzer.java:89)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1075)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1007)

REGRESSION: org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange
Error Message: null
Stack Trace:
junit.framework.AssertionFailedError:
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1075)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1007)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.checkInvariants(TestIndexWriterMergePolicy.java:236)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange(TestIndexWriterMergePolicy.java:168)

Build Log (for compile errors):
[...truncated 5123 lines...]
[HUDSON] Solr-trunk - Build # 1450 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-trunk/1450/ No tests ran. Build Log (for compile errors): [...truncated 19064 lines...]
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
On Tue, Mar 22, 2011 at 06:21, Chris Hostetter hossman_luc...@fucit.org wrote (replying to the dev list, see context below):

: Unfortunately, you can't easily recover from this (except by
: reindexing your docs again).
:
: Failing to call IW.commit() or IW.close() means no segments file was written...

I know there were good reasons for eliminating the autoCommit functionality from IndexWriter, but threads like this make me think that even though autoCommit on flush/merge/whatever was bad, having an option for some sort of autoClose using a finalizer might be a good idea to give new/novice users a safety net. In the case of totally successful normal operation, this would result in one commit at GC (assuming the JVM calls the finalizer), and if there were any errors it should (if I understand correctly) do an implicit rollback. Anyone see a downside?

Yes. Totally unexpected magical behaviour. What if I didn't commit something on purpose?

...
: I had a program running for 2 days to build an index for around 160 million
: text files, and after the program ended, I tried searching the index and found
: the index was not correctly built; indexReader.numDocs() returns 0. I
: checked the index directory, it looked good, all the index data seemed to be
: there, the directory is 1.5 gigabytes in size.
:
: I checked my code and found that I forgot to call indexWriter.optimize() and
: indexWriter.close(). I want to know if it is possible to re-optimize() the
: index so I don't need to rebuild the whole index from scratch? I don't
: really want the program to take another 2 days.

-Hoss

--
Kirill Zakharenko/Кирилл Захаренко
E-Mail/Jabber: ear...@gmail.com
Phone: +7 (495) 683-567-4
ICQ: 104465785
Re: Nested Document support in Lucene
May I know in which release it will be ready to use?

On Sat, Mar 19, 2011 at 2:23 PM, Simon Willnauer simon.willna...@googlemail.com wrote:

On Sat, Mar 19, 2011 at 9:39 AM, Kapil Charania lonestar.ka...@gmail.com wrote:

Hi, I am a newbie to Lucene. I have already created indexes for my project. But now the requirement is to go with Nested Document. I googled a lot but cannot find much implementation of nested documents. May I know if it's already implemented in any release of Lucene? Thanks in advance!

AFAIK this is still under heavy development and it doesn't seem to be ready in the near future. It has not yet been released.

simon

--
Kapil Charania.
[jira] [Commented] (LUCENE-2881) Track FieldInfo per segment instead of per-IW-session
[ https://issues.apache.org/jira/browse/LUCENE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009599#comment-13009599 ]

Michael McCandless commented on LUCENE-2881:

Patch looks great! Only a few things:
* We still need a header (id + version) on the fnx file?
* FieldInfos ctor that loads from index file name but makes a new bimap seems spooky...? Is this only used by tests now...?
* Can you add a comment where we create the N.fnx name, explaining why it has no leading _? (ie, because it's not a per-segment file, but rather a global file, shared by multiple segments)
* In the CHANGES entry, persistend is mis-spelled (need to drop the n); also remove the _ from _X.fnx, and add . after successful commit.

Track FieldInfo per segment instead of per-IW-session
-----------------------------------------------------
Key: LUCENE-2881
URL: https://issues.apache.org/jira/browse/LUCENE-2881
Project: Lucene - Java
Issue Type: Improvement
Affects Versions: Realtime Branch, CSF branch, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Fix For: Realtime Branch, CSF branch, 4.0
Attachments: LUCENE-2881.patch, LUCENE-2881.patch, LUCENE-2881.patch, LUCENE-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch

Currently FieldInfo is tracked per IW session to guarantee consistent global field-naming / ordering. IW carries FI instances over from previous segments, which also carries over field properties like isIndexed etc. While having consistent field ordering per IW session appears to be important due to bulk merging stored fields etc., carrying over other properties might become problematic with Lucene's Codec support. Codecs that rely on consistent properties in FI will fail if FI properties are carried over. The DocValuesCodec (DocValuesBranch), for instance, writes files per segment and field (using the field id within the file name). Yet, if a particular segment has no DocValues indexed but a previous segment in the same IW session had DocValues, FieldInfo#docValues will be true, since those values are reused from previous segments. We already work around this limitation in SegmentInfo with properties like hasVectors or hasProx, which is really something we should manage per codec and segment. Ideally FieldInfo would be managed per segment and codec such that its properties are valid per segment. It also seems necessary to bind FieldInfos to SegmentInfo logically, since it's really just per-segment metadata.
Re: Nested Document support in Lucene
AFAIK this is still under heavy development and it doesn't seem to be ready in the near future.

It's stable as far as I'm concerned. LUCENE-2454 includes the code and JUnit tests that work with the latest 3.0.3 release. I have versions of this running in production with 2.4- and 2.9-based releases. The only concern for users is the need to carefully control when flushing occurs, and the accompanying readme.txt gives advice on how to achieve this.

From: Kapil Charania kapil.chara...@gmail.com
To: simon.willna...@gmail.com
Cc: Simon Willnauer simon.willna...@googlemail.com; dev@lucene.apache.org
Sent: Tue, 22 March, 2011 9:12:20
Subject: Re: Nested Document support in Lucene

May I know in which release it will be ready to use?

On Sat, Mar 19, 2011 at 2:23 PM, Simon Willnauer simon.willna...@googlemail.com wrote:

On Sat, Mar 19, 2011 at 9:39 AM, Kapil Charania lonestar.ka...@gmail.com wrote:

Hi, I am a newbie to Lucene. I have already created indexes for my project. But now the requirement is to go with Nested Document. I googled a lot but cannot find much implementation of nested documents. May I know if it's already implemented in any release of Lucene? Thanks in advance!

AFAIK this is still under heavy development and it doesn't seem to be ready in the near future. It has not yet been released.

simon

--
Kapil Charania.
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
Hi,

I know there were good reasons for eliminating the autoCommit functionality from IndexWriter, but threads like this make me think that even though autoCommit on flush/merge/whatever was bad, having an option for some sort of autoClose using a finalizer might be a good idea to give new/novice users a safety net. In the case of totally successful normal operation, this would result in one commit at GC (assuming the JVM calls the finalizer), and if there were any errors it should (if I understand correctly) do an implicit rollback. Anyone see a downside?

I think finalize() is not that trustworthy, in that it may never be called, e.g. in case GC happened not to collect the specific object; the way for programmers to guarantee execution of any code at shutdown is with shutdown hooks. I guess this is what you meant, that Lucene would add a shutdown hook? I.e., each IndexWriter object opened for write would add its own method as a shutdown hook, so that at shutdown, that writer would check its state, and in case it was not closed (and hence also not rolled back) and has pending uncommitted changes, those changes would be committed. Is this what you mean?

I think it is almost okay - it would save the use case of this thread, but could still surprise someone... Perhaps there's a third option - semi-commit? That is, with the proposed shutdown hook, IW commits without deleting the previous commit, and marks on the dir that its state is semi-commit, so that when that index is opened for read or write it would throw a special new exception that indicates this state, and the caller, before continuing to use this index for either read or write, would have to call one of two new utility methods:

- commitSemiCommit(Directory)
- rollbackSemiCommit(Directory)

(Perhaps better names: rollbackSelfCommit, rollbackPartialCommit, etc.)

After that, it would be possible to open the index as usual. It seems to me that something like this can work. Not totally convinced that it is worth the effort...?

Yes. Totally unexpected magical behaviour. What if I didn't commit something on purpose?

Applications can call rollback() in this case.

Regards,
Doron
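The shutdown-hook mechanics Doron describes map onto plain `Runtime.addShutdownHook`. A minimal sketch of only the registration bookkeeping (class and thread names are hypothetical; the actual semi-commit logic would live inside the Runnable, and a writer closed normally would deregister its hook):

```java
public class ShutdownHookSketch {

    // Registers a cleanup action to run at JVM shutdown. In the proposal
    // above, the Runnable would check the writer's state and semi-commit
    // any pending uncommitted changes.
    public static Thread register(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "iw-auto-close");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    // Called from a normal close(): removeShutdownHook returns true only
    // if the hook was still registered (i.e. close had not yet happened).
    public static boolean unregister(Thread hook) {
        return Runtime.getRuntime().removeShutdownHook(hook);
    }
}
```

Unlike finalizers, hooks registered this way do run on normal JVM exit, though not on a hard kill or crash, which is one more reason a recovery tool would still be needed.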
[jira] [Commented] (LUCENE-2881) Track FieldInfo per segment instead of per-IW-session
[ https://issues.apache.org/jira/browse/LUCENE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009610#comment-13009610 ]

Simon Willnauer commented on LUCENE-2881:

bq. We still need a header (id + version) on the fnx file?

It's there - I guess you got the wrong patch.

{quote} FieldInfos ctor that loads from index file name but makes a new bimap seems spooky...? Is this only used by tests now...? {quote}

Well, this one is only for the read case where we open a FIS. Yet, we need to do this since we store a fnx file per SIS, and upon IW#addIndexes(Directory) we could have a FIS that has field numbers different from the global map. This is fine as long as we don't seed the FIS on read. I will open another issue to make this case more efficient and assert that the FIS is read-only once we created it from a directory.

I fixed all the remaining issues and will go ahead and commit now. Thanks Mike.
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
On Mon, Mar 21, 2011 at 11:21 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

Anyone see a downside?

I don't think we should do anything serious in a GC finalizer; sounds like it's asking for a JRE crash.
RE: Urgent! Forgot to close IndexWriter after adding Documents to the index.
Hi,

I know there were good reasons for eliminating the autoCommit functionality from IndexWriter, but threads like this make me think that even though autoCommit on flush/merge/whatever was bad, having an option for some sort of autoClose using a finalizer might be a good idea to give new/novice users a safety net. In the case of totally successful normal operation, this would result in one commit at GC (assuming the JVM calls the finalizer), and if there were any errors it should (if I understand correctly) do an implicit rollback. Anyone see a downside?

I am against all finalizer stuff, because it also leads to problems and is unreliable - we already removed all the finalizer stuff in Lucene left over from the early days, so we should not add it again. This error is only made by a user once; the second time, this user will have a try...finally block around his stuff. A comparison is relational databases with autocommit off: if I crash my app or don't correctly commit my stuff, it's also reverted on loss of connection or forceful shutdown of the JDBC driver. Where is the difference?

But I am for adding a recovery tool for uncommitted segments to CheckIndex. I think this should not be too hard - something like looking for cfs/other file types and creating SegmentReaders that are then added using addIndexes().

Uwe
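The try...finally pattern Uwe recommends guarantees close() runs on both the success and the failure path. A stand-in sketch (the Writer class here is a hypothetical mock used only to show the control flow, not the Lucene API):

```java
public class TryFinallyDemo {

    // Hypothetical stand-in for IndexWriter: only records whether
    // close() was called.
    static class Writer {
        boolean closed = false;
        void addDocument(boolean fail) {
            if (fail) throw new RuntimeException("disk full");
        }
        void close() { closed = true; } // the real IW.close() also commits
    }

    // close() runs whether indexing succeeds or throws, so a crash
    // mid-indexing does not leave an index without a segments file.
    public static boolean closedDespite(boolean fail) {
        Writer w = new Writer();
        try {
            try {
                w.addDocument(fail);
            } finally {
                w.close();
            }
        } catch (RuntimeException expected) {
            // indexing failed, but the writer was still closed
        }
        return w.closed;
    }
}
```

The method returns true for both arguments: the finally block fires even when addDocument throws.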
[jira] [Resolved] (LUCENE-2894) Use of google-code-prettify for Lucene/Solr Javadoc
[ https://issues.apache.org/jira/browse/LUCENE-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved LUCENE-2894. -- Resolution: Fixed Use of google-code-prettify for Lucene/Solr Javadoc --- Key: LUCENE-2894 URL: https://issues.apache.org/jira/browse/LUCENE-2894 Project: Lucene - Java Issue Type: Improvement Components: Javadocs Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Blocker Fix For: 3.1, 4.0 Attachments: LUCENE-2894-solr-analysis-AtoC.patch, LUCENE-2894-solr-analysis-DtoH.patch, LUCENE-2894-solr-analysis-ItoN.patch, LUCENE-2894-solr-analysis-OtoR.patch, LUCENE-2894-solr-analysis-StoZ.patch, LUCENE-2894.patch, LUCENE-2894.patch, LUCENE-2894.patch, LUCENE-2894.patch, LUCENE-2894.patch, LUCENE-2894.patch My company, RONDHUIT uses google-code-prettify (Apache License 2.0) in Javadoc for syntax highlighting: http://www.rondhuit-demo.com/RCSS/api/com/rondhuit/solr/analysis/JaReadingSynonymFilterFactory.html I think we can use it for Lucene javadoc (java sample code in overview.html etc) and Solr javadoc (Analyzer Factories etc) to improve or simplify our life.
[jira] [Commented] (SOLR-2396) add [ICU]CollationField
[ https://issues.apache.org/jira/browse/SOLR-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009627#comment-13009627 ]

Toke Eskildsen commented on SOLR-2396:

A rough idea: It seems that ICU collator keys are null-terminated. Would it be possible to allow for a key that contained the original String? Something like [collator-bytes][null][term-as-utf-8]? This embedding would make collator-ordered faceting with multiple terms/document much easier.

add [ICU]CollationField
-----------------------
Key: SOLR-2396
URL: https://issues.apache.org/jira/browse/SOLR-2396
Project: Solr
Issue Type: Improvement
Reporter: Robert Muir
Fix For: 4.0
Attachments: SOLR-2396.patch, SOLR-2396.patch, SOLR-2396.patch, SOLR-2396.patch

In LUCENE-2551 collation support was changed to use byte[] keys. Previously it encoded sort keys with IndexableBinaryString into char[], but this is wasteful with regards to RAM and disk when terms can be bytes. A better solution would be [ICU]CollationFieldTypes, as this would also allow locale-sensitive range queries.
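The [collator-bytes][null][term-as-utf-8] layout proposed above can be sketched with the JDK collator (ICU's CollationKey has an analogous toByteArray(); whether a zero byte can also appear inside the key itself is exactly the open question, so this is an illustration of the layout, not a safety proof):

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.Locale;

public class CompositeKeySketch {

    // Builds [collator-bytes][0x00][term-as-utf-8].
    public static byte[] composite(Collator c, String term) {
        byte[] key = c.getCollationKey(term).toByteArray();
        byte[] utf8 = term.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[key.length + 1 + utf8.length];
        System.arraycopy(key, 0, out, 0, key.length);
        out[key.length] = 0; // null separator between key and original term
        System.arraycopy(utf8, 0, out, key.length + 1, utf8.length);
        return out;
    }

    // Unsigned lexicographic byte order, i.e. how index terms compare.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Convenience wrapper: composite-key order under an English collator.
    public static int compareEnglish(String a, String b) {
        Collator c = Collator.getInstance(Locale.ENGLISH);
        return compare(composite(c, a), composite(c, b));
    }
}
```

Because collation keys are designed for byte-wise comparison, terms whose keys differ still sort in collator order after the suffix is appended; the suffix only matters when the keys are equal.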
[Lucene.Net] [jira] [Commented] (LUCENENET-85) SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
[ https://issues.apache.org/jira/browse/LUCENENET-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009628#comment-13009628 ]

Sergey Mirvoda commented on LUCENENET-85:

Why invariant culture? You should always use CurrentCulture or require a CultureInfo as a parameter. If you use the invariant culture, you will fail to parse _any_ culture-specific strings - for example, in Russia ',' (not '.') is the decimal separator. And I also think that the method above is wrong.
[jira] [Commented] (SOLR-2396) add [ICU]CollationField
[ https://issues.apache.org/jira/browse/SOLR-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009630#comment-13009630 ]

Robert Muir commented on SOLR-2396:

bq. A rough idea: It seems that ICU Collator Keys are null-terminated.

This isn't always the case, at least at query time - for example, if you are using a bound mode (http://icu-project.org/apiref/icu4j/com/ibm/icu/text/CollationKey.BoundMode.html); I think for the UPPER_LONG case it does not exist. But in any case, I think we can't rely on the fact that ICU might currently avoid zero bytes: this isn't really specified anywhere and is just an optional impl detail (http://unicode.org/reports/tr10/#Avoiding_Zero_Bytes).
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
I like Uwe's idea. As for Hoss's original suggestion, my initial reaction is that if a user understands the need to set the option in the first place, they're also more likely to understand the need for close().

FWIW
Erick

On Tue, Mar 22, 2011 at 8:15 AM, Uwe Schindler u...@thetaphi.de wrote:

Hi,

I know there were good reasons for eliminating the autoCommit functionality from IndexWriter, but threads like this make me think that even though autoCommit on flush/merge/whatever was bad, having an option for some sort of autoClose using a finalizer might be a good idea to give new/novice users a safety net. In the case of totally successful normal operation, this would result in one commit at GC (assuming the JVM calls the finalizer), and if there were any errors it should (if I understand correctly) do an implicit rollback. Anyone see a downside?

I am against all finalizer stuff, because it also leads to problems and is unreliable - we already removed all the finalizer stuff in Lucene left over from the early days, so we should not add it again. This error is only made by a user once; the second time, this user will have a try...finally block around his stuff. A comparison is relational databases with autocommit off: if I crash my app or don't correctly commit my stuff, it's also reverted on loss of connection or forceful shutdown of the JDBC driver. Where is the difference?

But I am for adding a recovery tool for uncommitted segments to CheckIndex. I think this should not be too hard - something like looking for cfs/other file types and creating SegmentReaders that are then added using addIndexes().

Uwe
[Lucene.Net] [jira] [Commented] (LUCENENET-85) SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
[ https://issues.apache.org/jira/browse/LUCENENET-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009637#comment-13009637 ]

Sergej Koščejev commented on LUCENENET-85:

As far as I can tell, the methods in SupportClass are used to convert between doubles and strings for indexing purposes. This conversion should always happen in the same way, independent of the current culture.
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6228 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6228/

1 test failed.

REGRESSION: org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange
Error Message: null
Stack Trace:
junit.framework.AssertionFailedError:
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1214)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1146)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.checkInvariants(TestIndexWriterMergePolicy.java:239)
    at org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange(TestIndexWriterMergePolicy.java:171)

Build Log (for compile errors):
[...truncated 3172 lines...]
[Lucene.Net] [jira] [Commented] (LUCENENET-85) SupportClass.Parse and System.Globalization.CultureInfo.CurrentCulture.NumberFormat.NumberDecimalSeparator
[ https://issues.apache.org/jira/browse/LUCENENET-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009650#comment-13009650 ]

Sergey Mirvoda commented on LUCENENET-85:

Not sure whether this rule is relevant to Lucene or not, but our team always uses the following practice: if we need strings only for internal use _and_ the client gives us real numbers, we always use InvariantCulture. If the client gives us numbers as strings, we always use CurrentCulture or a CultureInfo parameter.
[VOTE] Release Lucene/Solr 3.1
Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco
[jira] [Commented] (SOLR-2396) add [ICU]CollationField
[ https://issues.apache.org/jira/browse/SOLR-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009665#comment-13009665 ]

Toke Eskildsen commented on SOLR-2396:

The JavaDoc for CollationKey is very explicit about the null-termination, but I do not know enough about the inner workings to judge whether a concatenation would work in all cases.
[jira] [Commented] (SOLR-2396) add [ICU]CollationField
[ https://issues.apache.org/jira/browse/SOLR-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009670#comment-13009670 ]

Robert Muir commented on SOLR-2396:

Well, it's something we could consider, but on another issue really, as it's not Solr-related... this one just exposes the Lucene functionality from LUCENE-2551 and deprecates the old support. But really, this is going to be wasteful in most cases versus IDENTICAL strength (which will basically give someone the same functionality from a sort/range-query perspective).
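Robert's point about IDENTICAL strength - that the sort key already distinguishes the exact original strings, making an embedded copy of the term mostly redundant - can be seen with the JDK collator, which exposes the same strength model as ICU's:

```java
import java.text.Collator;
import java.util.Locale;

public class StrengthDemo {

    private static Collator collator(int strength) {
        Collator c = Collator.getInstance(Locale.US);
        c.setStrength(strength);
        return c;
    }

    // PRIMARY strength ignores accent and case differences,
    // so "resume" and "résumé" compare equal.
    public static boolean equalPrimary(String a, String b) {
        return collator(Collator.PRIMARY).compare(a, b) == 0;
    }

    // IDENTICAL strength also encodes the exact original characters,
    // so the two spellings remain distinct terms.
    public static boolean equalIdentical(String a, String b) {
        return collator(Collator.IDENTICAL).compare(a, b) == 0;
    }
}
```

At IDENTICAL strength a range query or sort already separates the two spellings, which is why appending the raw term would usually just cost extra bytes.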
[jira] [Resolved] (LUCENE-2978) Upgrade benchmark from commons-compress-1.0 to commons-compress-1.1 for 15 times faster gzip decompression
[ https://issues.apache.org/jira/browse/LUCENE-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen resolved LUCENE-2978. - Resolution: Fixed Lucene Fields: (was: [New]) Committed: - r1084210 - trunk - r1084231 - 3x Upgrade benchmark from commons-compress-1.0 to commons-compress-1.1 for 15 times faster gzip decompression -- Key: LUCENE-2978 URL: https://issues.apache.org/jira/browse/LUCENE-2978 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Fix For: 3.2, 4.0 Attachments: LUCENE-2978.patch In LUCENE-1540 TrecContentSource moved from Java's GZipInputStream to common-compress 1.0. This slowed down gzip decompression by a factor of 15. Upgrading to 1.1 solves this problem. I verified that the problem is only in GZIP, not in BZIP. On the way, as 1.1 introduced constants for the compression methods, the code can be made a bit nicer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
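For reference, the plain-JDK gzip path that TrecContentSource used before moving to commons-compress looks roughly like the round-trip below. This is a minimal standalone sketch with hypothetical helper names, not the benchmark code itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {
    // Compresses a byte[] with the JDK's built-in gzip implementation.
    static byte[] gzip(byte[] data) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        }
        return bos.toByteArray();
    }

    // Decompresses gzip data back into a byte[].
    static byte[] gunzip(byte[] data) throws Exception {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
            return bos.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        String doc = "<DOC>sample trec document</DOC>";
        byte[] packed = gzip(doc.getBytes(StandardCharsets.UTF_8));
        String unpacked = new String(gunzip(packed), StandardCharsets.UTF_8);
        System.out.println(doc.equals(unpacked)); // round-trip preserves content
    }
}
```

The issue is purely about throughput: commons-compress 1.0's gzip stream was about 15x slower than this baseline, and 1.1 restores comparable speed while keeping bzip2 support behind the same factory API.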
[jira] [Commented] (LUCENE-2881) Track FieldInfo per segment instead of per-IW-session
[ https://issues.apache.org/jira/browse/LUCENE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009718#comment-13009718 ] Simon Willnauer commented on LUCENE-2881: - Committed to trunk - I will keep this open until RT and docvalues have synced up with it. Track FieldInfo per segment instead of per-IW-session - Key: LUCENE-2881 URL: https://issues.apache.org/jira/browse/LUCENE-2881 Project: Lucene - Java Issue Type: Improvement Affects Versions: Realtime Branch, CSF branch, 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: Realtime Branch, CSF branch, 4.0 Attachments: LUCENE-2881.patch, LUCENE-2881.patch, LUCENE-2881.patch, LUCENE-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch, lucene-2881.patch Currently FieldInfo is tracked per IW session to guarantee consistent global field-naming / ordering. IW carries FI instances over from previous segments which also carries over field properties like isIndexed etc. While having consistent field ordering per IW session appears to be important due to bulk merging stored fields etc. carrying over other properties might become problematic with Lucene's Codec support. Codecs that rely on consistent properties in FI will fail if FI properties are carried over. The DocValuesCodec (DocValuesBranch) for instance writes files per segment and field (using the field id within the file name). Yet, if a segment has no DocValues indexed in a particular segment but a previous segment in the same IW session had DocValues, FieldInfo#docValues will be true since those values are reused from previous segments. We already work around this limitation in SegmentInfo with properties like hasVectors or hasProx which is really something we should manage per Codec Segment. Ideally FieldInfo would be managed per Segment and Codec such that its properties are valid per segment. 
It also seems to be necessary to bind FieldInfoS to SegmentInfo logically, since it's really just per-segment metadata. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1084256 - in /lucene/dev/branches/branch_3x: ./ lucene/ lucene/contrib/benchmark/src/test/org/apache/lucene/benchmark/byTask/feeds/TrecContentSourceTest.java solr/
- assertEquals("Wrong numbre of documents created by osurce!",5,n); + assertEquals("Wrong number of documents created by osurce!",5,n); Err... Was this meant to be 'source' in the new version? D. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[Lucene.Net] [jira] [Commented] (LUCENENET-380) Evaluate Sharpen as a port tool
[ https://issues.apache.org/jira/browse/LUCENENET-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009726#comment-13009726 ] Scott Lombard commented on LUCENENET-380: - I was investigating which license Sharpen uses and found they offer what is called the db4o Opensource Compatibility License (dOCL) instead of the GPL (refer to http://www.db4o.com/about/company/legalpolicies/docl.aspx). I read this as a possibility, but I have to admit I don't fully understand all the legal language. Evaluate Sharpen as a port tool --- Key: LUCENENET-380 URL: https://issues.apache.org/jira/browse/LUCENENET-380 Project: Lucene.Net Issue Type: Task Components: Build Automation, Lucene.Net Contrib, Lucene.Net Core, Lucene.Net Demo, Lucene.Net Test Reporter: George Aroush Assignee: Alex Thompson Attachments: 3.0.2_JavaToCSharpConverter_AfterPostProcessing.zip, 3.0.2_JavaToCSharpConverter_NoPostProcessing.zip, IndexWriter.java, Lucene.Net.3_0_3_Sharpen20110106.zip, Lucene.Net.Sharpen20101104.zip, Lucene.Net.Sharpen20101114.zip, NIOFSDirectory.java, QueryParser.java, TestBufferedIndexInput.java, TestDateFilter.java This task is to evaluate Sharpen as a port tool for Lucene.Net. The files to be evaluated are attached. We need to run those files (which are off Java Lucene 2.9.2) against Sharpen and compare the result against the JLCA result. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: svn commit: r1084256 - in /lucene/dev/branches/branch_3x: ./ lucene/ lucene/contrib/benchmark/src/test/org/apache/lucene/benchmark/byTask/feeds/TrecContentSourceTest.java solr/
:) Thanks, will fix. On Tue, Mar 22, 2011 at 7:17 PM, Dawid Weiss dawid.we...@gmail.com wrote: - assertEquals("Wrong numbre of documents created by osurce!",5,n); + assertEquals("Wrong number of documents created by osurce!",5,n); Err... Was this meant to be 'source' in the new version? D. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[Lucene.Net] [jira] [Commented] (LUCENENET-380) Evaluate Sharpen as a port tool
[ https://issues.apache.org/jira/browse/LUCENENET-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009741#comment-13009741 ] Alex Thompson commented on LUCENENET-380: - The dOCL is just an option in addition to (not instead of) the GPL. From the dOCL page: Versant licenses the Software (as defined in Section 1) pursuant to (1) this Agreement, (2) a commercial, royalty-based license agreement, and (3) the GNU General Public License v2.0 (GPL). You may choose to license the Software pursuant to any of the three agreements. I think the purpose of the dOCL is to cover the scenario where you are mixing code that is already under another open source license. For our sharpen fork I think we would just keep the whole thing GPL. Evaluate Sharpen as a port tool --- Key: LUCENENET-380 URL: https://issues.apache.org/jira/browse/LUCENENET-380 Project: Lucene.Net Issue Type: Task Components: Build Automation, Lucene.Net Contrib, Lucene.Net Core, Lucene.Net Demo, Lucene.Net Test Reporter: George Aroush Assignee: Alex Thompson Attachments: 3.0.2_JavaToCSharpConverter_AfterPostProcessing.zip, 3.0.2_JavaToCSharpConverter_NoPostProcessing.zip, IndexWriter.java, Lucene.Net.3_0_3_Sharpen20110106.zip, Lucene.Net.Sharpen20101104.zip, Lucene.Net.Sharpen20101114.zip, NIOFSDirectory.java, QueryParser.java, TestBufferedIndexInput.java, TestDateFilter.java This task is to evaluate Sharpen as a port tool for Lucene.Net. The files to be evaluated are attached. We need to run those files (which are off Java Lucene 2.9.2) against Sharpen and compare the result against JLCA result. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[HUDSON] Lucene-trunk - Build # 1505 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1505/ No tests ran. Build Log (for compile errors): [...truncated 8651 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
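The direction of the fix can be sketched as follows: lower-case the file suffix before consulting the extension-to-type map, so `file.GZ` resolves the same way as `file.gz`. This is a hypothetical standalone helper for illustration, not the actual ContentSource code.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class FileTypeDetector {
    enum Type { GZIP, BZIP2, TEXT }

    private static final Map<String, Type> BY_EXTENSION = new HashMap<String, Type>();
    static {
        BY_EXTENSION.put("gz", Type.GZIP);
        BY_EXTENSION.put("bz2", Type.BZIP2);
    }

    // Detects the compression type from the file name, ignoring the
    // letter case of the suffix; anything unrecognized is treated as text.
    static Type detect(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) return Type.TEXT;
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Type t = BY_EXTENSION.get(ext);
        return t == null ? Type.TEXT : t;
    }

    public static void main(String[] args) {
        System.out.println(detect("file.gz"));  // GZIP
        System.out.println(detect("file.GZ"));  // GZIP once the suffix is lower-cased
        System.out.println(detect("file.txt")); // TEXT
    }
}
```

Using Locale.ROOT for the lower-casing avoids surprises under locales like Turkish, where the default case mapping of "I" differs.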
[jira] [Updated] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen updated LUCENE-2980: Attachment: LUCENE-2980.patch Patch with a test that fails due to this bug. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe reopened LUCENE-2973: - Ant's {{exec}} task can take arguments in two forms: as a line to be parsed/split by the shell ({{arg line=...)}}, or as a pre-parsed list (with one {{arg value=...}} for each cmdline argument). The {{svn-export-source}} target should be rewritten to use the pre-parsed form, to avoid splitting problems (e.g. spaces in paths) with the {{arg line=...}} form. Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Blocker Fix For: 3.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
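The hazard behind the arg-line form is the same one Java programmers know from Runtime.exec(String) versus ProcessBuilder: a command given as a single line gets split on whitespace, so a path containing a space falls apart, while a pre-parsed argument list is passed through verbatim. An illustrative standalone sketch (not the build file itself; the splitting here only approximates what Ant's shell-style parsing does):

```java
import java.util.Arrays;
import java.util.List;

public class ArgSplitting {
    // Naive whitespace splitting, roughly what happens to an
    // <arg line="..."/> (or a Runtime.exec(String) command line).
    static List<String> splitLine(String line) {
        return Arrays.asList(line.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        String exportDir = "/tmp/my checkout";  // path containing a space
        // Line form: the path is broken into two separate arguments.
        List<String> parsed = splitLine("svn export " + exportDir);
        System.out.println(parsed);             // [svn, export, /tmp/my, checkout]
        // Pre-parsed form: one <arg value="..."/> per argument (or
        // ProcessBuilder) keeps the path intact.
        List<String> preParsed = Arrays.asList("svn", "export", exportDir);
        System.out.println(preParsed);          // [svn, export, /tmp/my checkout]
    }
}
```

This is why rewriting svn-export-source to use one `arg value` element per command-line argument removes the splitting problem entirely.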
[jira] [Commented] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009796#comment-13009796 ] Shai Erera commented on LUCENE-2980: Patch looks good. Few tiny comments: * Should ContentSourceTest extend BenchmarkTestCase? * I think that instead of assertTrue(testDir.isDirectory()); you can assertTrue(testDir.mkdirs()); * In case you wanted a second opinion about the nocommit lines, I think they can all go away :). Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [jira] [Updated] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
Heh -- I think INFRA fixed the subject line thing. Both emails were grouped together by Gmail. Notice that [Created] and [Updated] in both of them. Nice ! On Tue, Mar 22, 2011 at 8:58 PM, Doron Cohen (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel] Doron Cohen updated LUCENE-2980: Attachment: LUCENE-2980.patch Patch with test that fail due to this bug. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009798#comment-13009798 ] Steven Rowe commented on LUCENE-2973: - bq. The {{svn-export-source}} target should be rewritten to use the pre-parsed form, to avoid splitting problems (e.g. spaces in paths) with the {{arg line=...}} form. Committed: - r1084307 trunk - r1084309 branch_3x I'll leave this issue open to also be committed on the lucene_solr_3_1 branch before a 3.1.1 release (if there is one). Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Blocker Fix For: 3.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated LUCENE-2973: Priority: Minor (was: Blocker) Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Minor Fix For: 3.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
HEADS UP - Rebuild any trunk indices
Heads up -- LUCENE-2881, which I committed earlier today, changes the SegmentInfos file format and adds a new ${number}.fnx file storing global field numbers. If you have any indexes lying around built with revs of trunk before this commit, after you update you should completely reindex. Indexes prior to trunk (< 4.0) will work fine and don't require any reindexing. Simon - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1084324 - in /lucene/dev/branches/branch_3x: lucene/build.xml solr/build.xml solr/common-build.xml
Thanks, we should add this rev also to the 3.1.1 issue still open. Maybe we get another rc, we can merge then earlier.
--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de

sar...@apache.org schrieb:

Author: sarowe
Date: Tue Mar 22 19:50:05 2011
New Revision: 1084324
URL: http://svn.apache.org/viewvc?rev=1084324&view=rev
Log: use ${svn.exe} property instead of directly calling svn executable

Modified:
    lucene/dev/branches/branch_3x/lucene/build.xml
    lucene/dev/branches/branch_3x/solr/build.xml
    lucene/dev/branches/branch_3x/solr/common-build.xml

Modified: lucene/dev/branches/branch_3x/lucene/build.xml
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/lucene/build.xml?rev=1084324&r1=1084323&r2=1084324&view=diff
--- lucene/dev/branches/branch_3x/lucene/build.xml (original)
+++ lucene/dev/branches/branch_3x/lucene/build.xml Tue Mar 22 19:50:05 2011
@@ -373,7 +373,7 @@
   <target name="svn-export-source" depends="get-svn-info">
     <delete dir="${svn.export.dir}" includeemptydirs="true" failonerror="false"/>
     <mkdir dir="${build.dir}"/>
-    <exec dir="." executable="svn" failonerror="true">
+    <exec dir="." executable="${svn.exe}" failonerror="true">
       <arg value="export"/>
       <arg value="--native-eol"/>
       <arg value="LF"/>
@@ -388,7 +388,7 @@
   <!-- Populates properties svn.URL and svn.Revision using svn info. -->
   <target name="get-svn-info">
-    <exec dir="." executable="svn" outputproperty="svn.info" failonerror="true">
+    <exec dir="." executable="${svn.exe}" outputproperty="svn.info" failonerror="true">
       <arg value="info"/>
     </exec>
     <loadproperties

Modified: lucene/dev/branches/branch_3x/solr/build.xml
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/solr/build.xml?rev=1084324&r1=1084323&r2=1084324&view=diff
--- lucene/dev/branches/branch_3x/solr/build.xml (original)
+++ lucene/dev/branches/branch_3x/solr/build.xml Tue Mar 22 19:50:05 2011
@@ -761,7 +761,7 @@
   <target name="svn-export-source" depends="get-svn-info">
     <delete dir="${svn.export.dir}" includeemptydirs="true" failonerror="false"/>
     <mkdir dir="${dest}"/>
-    <exec dir="." executable="svn" failonerror="true">
+    <exec dir="." executable="${svn.exe}" failonerror="true">
       <arg value="export"/>
       <arg value="--native-eol"/>
       <arg value="LF"/>
@@ -774,7 +774,7 @@
   <!-- Populates properties svn.URL and svn.Revision using svn info -->
   <target name="get-svn-info">
-    <exec dir=".." executable="svn" outputproperty="svn.info" failonerror="true">
+    <exec dir=".." executable="${svn.exe}" outputproperty="svn.info" failonerror="true">
       <arg value="info"/>
     </exec>
     <loadproperties
@@ -874,7 +874,7 @@
   </target>
   <target name="svn-up">
-    <exec executable="svn">
+    <exec executable="${svn.exe}">
       <arg value="update"/>
     </exec>
   </target>

Modified: lucene/dev/branches/branch_3x/solr/common-build.xml
URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/solr/common-build.xml?rev=1084324&r1=1084323&r2=1084324&view=diff
--- lucene/dev/branches/branch_3x/solr/common-build.xml (original)
+++ lucene/dev/branches/branch_3x/solr/common-build.xml Tue Mar 22 19:50:05 2011
@@ -81,6 +81,7 @@
      default PATH. (this is useful for Hudson) -->
   <property name="svnversion.exe" value="svnversion" />
+  <property name="svn.exe" value="svn" />
   <!-- Java Version we are compatible with -->
   <property name="java.compat.version" value="1.5" />
[jira] [Updated] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated LUCENE-2973: Fix Version/s: 3.1.1 Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Minor Fix For: 3.1, 3.1.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: svn commit: r1084324 - in /lucene/dev/branches/branch_3x: lucene/build.xml solr/build.xml solr/common-build.xml
I agree – I’ve added it. - Steve From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Tuesday, March 22, 2011 3:54 PM To: dev@lucene.apache.org Subject: Re: svn commit: r1084324 - in /lucene/dev/branches/branch_3x: lucene/build.xml solr/build.xml solr/common-build.xml Thanks, we should add this rev also to the 3.1.1 issue still open. Maybe we get another rc, we can merge then earlier. -- Uwe Schindler H.-H.-Meier-Allee 63, 28213 Bremen http://www.thetaphi.de sar...@apache.org schrieb: Author: sarowe Date: Tue Mar 22 19:50:05 2011 New Revision: 1084324 URL: http://svn.apache.org/viewvc?rev=1084324&view=rev Log: use ${svn.exe} property instead of directly calling svn executable Modified: lucene/dev/branches/branch_3x/lucene/build.xml lucene/dev/branches/branch_3x/solr/build.xml lucene/dev/branches/branch_3x/solr/common-build.xml
[jira] [Created] (LUCENE-2981) Review and potentially remove unused/unsupported Contribs
Review and potentially remove unused/unsupported Contribs - Key: LUCENE-2981 URL: https://issues.apache.org/jira/browse/LUCENE-2981 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll Fix For: 3.2, 4.0 Some of our contribs appear to be lacking for development/support or are missing tests. We should review whether they are even pertinent these days and potentially deprecate and remove them. One of the things we did in Mahout when bringing in Colt code was to mark all code that didn't have tests as @deprecated and then we removed the deprecation once tests were added. Those that didn't get tests added over about a 6 mos. period of time were removed. I would suggest taking a hard look at: ant db lucli swing (spatial should be gutted to some extent and moved to modules) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2973) Source distribution packaging targets should make a tarball from svn export
[ https://issues.apache.org/jira/browse/LUCENE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009812#comment-13009812 ] Steven Rowe commented on LUCENE-2973: - The {{svn-export-source}} targets were not using the {{$\{svn.exe\}}} property, but they should, to allow for use of {{svn}} executables not on the {{$PATH}}. Committed: - r1084322 trunk - r1084324 branch_3x This change should also be added to the lucene_solr_3_1 branch before a 3.1.1 release (if there is one). Source distribution packaging targets should make a tarball from svn export - Key: LUCENE-2973 URL: https://issues.apache.org/jira/browse/LUCENE-2973 Project: Lucene - Java Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Minor Fix For: 3.1, 3.1.1, 3.2, 4.0 Attachments: LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch, LUCENE-2973.patch Instead of picking and choosing which stuff to include from a local working copy, Lucene's dist-src/package-tgz-src target and Solr's package-src target should simply perform svn export with the same revision and URL as the local working copy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [VOTE] Release Lucene/Solr 3.1
I found a few documentation issues in the binary Lucene .zip (these are not blockers, IMHO): Lucene binary .zip -- A. README.txt: 1. contrib/demo/lucene-demos-XX.jar (demos should be demo) 2. See BUILD.txt for building a source distribution (there is no such file in the binary distribution) 3. No mention of the included test jar: lucene-core-3.1.0-tests.jar 4. No mention of the javadoc jars (one for the test jar, one for core jar) B. Javadoc: The Test Framework API home page is the same as the root home page. (At a minimum, it should be blank, but better would be a description of the test framework.) Steve -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley Sent: Tuesday, March 22, 2011 10:21 AM To: dev@lucene.apache.org Subject: [VOTE] Release Lucene/Solr 3.1 Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2428) Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar
[ https://issues.apache.org/jira/browse/SOLR-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009824#comment-13009824 ] Dawid Weiss commented on SOLR-2428: --- Also, update the license's (c) years. Upgrade carrot2-core dependency to a version with a Java 1.5-compiled jar - Key: SOLR-2428 URL: https://issues.apache.org/jira/browse/SOLR-2428 Project: Solr Issue Type: Improvement Components: contrib - Clustering Affects Versions: 3.1.1, 3.2 Reporter: Steven Rowe Assignee: Dawid Weiss Priority: Minor Fix For: 3.1.1, 3.2 As of not-yet-released version 3.4.4, the carrot2-core jar will be published as a retrowoven 1.5 version (in addition to a Java-1.6-compiled version) - see Dawid Weiss's comment on [LUCENE-2957|https://issues.apache.org/jira/browse/LUCENE-2957?focusedCommentId=13006878page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13006878] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Release Lucene/Solr 3.1
Don't know how important this is, but: 1) I've just tried following the instructions from example/README.txt; under cygwin curl is not installed by default and post.sh assumes it is always available, resulting in command-not-found ugliness. 2) example/solr/conf/solrconfig.xml states that: <!-- Clustering Component http://wiki.apache.org/solr/ClusteringComponent This relies on third party jars which are not included in the release. To use this component (and the /clustering handler) Those jars will need to be downloaded, and you'll need to set the solr.cluster.enabled system property when running solr... java -Dsolr.clustering.enabled=true -jar start.jar --> Not true, all the required JARs are included. None are blockers, I will fix #2 in the trunk, let me know if this should also be applied to the 3.x branch. Dawid On Tue, Mar 22, 2011 at 3:21 PM, Yonik Seeley yo...@lucidimagination.com wrote: Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009869#comment-13009869 ] Doron Cohen commented on LUCENE-2980: - Thanks Shai! I fixed the super class and the assert as suggested. Those nocommits stand for a larger problem - I was ready for a trivial fix for this bug - just lower-case the extension in ContentSource before consulting the map. However the test failed, and I found out that this is because the input stream returned by CompressorStreamFactory.createCompressorInputStream() does not close its underlying stream when it is exhausted or when its close() method is called. I opened COMPRESS-127 for this. As a workaround, ContentSource now returns a wrapper around the input stream created by the factory; it delegates all methods to it, except that close() is also delegated to the underlying stream. This fix is required for the extension letter-case tests to pass, but it also fixes a more serious problem - leaking file handles in ContentSource. As Solr also makes use of Commons Compress, I searched it for references to CompressorStreamFactory.createCompressorInputStream() but found none, so it seems Solr is not affected by COMPRESS-127. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong.
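The close-delegating wrapper described in the comment can be sketched with plain java.io types (names are illustrative, not the actual ContentSource code):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Decorator that forwards everything to the decompressing stream, but on
// close() also closes the underlying raw stream - which, per COMPRESS-127,
// the decompressor fails to do on its own.
class ClosingWrapper extends FilterInputStream {
    private final InputStream underlying;

    ClosingWrapper(InputStream decompressor, InputStream underlying) {
        super(decompressor);
        this.underlying = underlying;
    }

    @Override
    public void close() throws IOException {
        try {
            super.close();      // close the decompressing stream
        } finally {
            underlying.close(); // ...and always release the file handle
        }
    }
}
```

The try/finally ensures the file handle is released even if closing the decompressor throws, which is what plugs the leak.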
[jira] [Updated] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen updated LUCENE-2980: Attachment: LUCENE-2980.patch Attached patch adds fixes for: - the extension letter-case problem. - closing the underlying stream of g/bzip streams. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch, LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong.
[jira] [Created] (LUCENE-2982) Get rid of ContentSource's workaround for closing b/gzip input stream once this is fixed in CommonCompress
Get rid of ContentSource's workaround for closing b/gzip input stream once this is fixed in CommonCompress - Key: LUCENE-2982 URL: https://issues.apache.org/jira/browse/LUCENE-2982 Project: Lucene - Java Issue Type: Task Components: contrib/benchmark Reporter: Doron Cohen Priority: Minor Once COMPRESS-127 is fixed, get rid of the entire workaround method ContentSource.closableCompressorInputStream(). It would simplify the code and perform better without that delegation.
[HUDSON] Lucene-3.x - Build # 322 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-3.x/322/ All tests passed Build Log (for compile errors): [...truncated 18984 lines...]
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
: I like Uwe's idea. As for Hoss's original suggestion, my initial : reaction is that if a user understands the need to set the option : in the first place, they're also more likely to understand the need : for close(). my intention was that if the user used a novice-type API for getting an IndexWriter, it would default to true, but any of the non-trivial constructors would default to false. : I am against all finalizer stuff, because it also leads to problems and is : unreliable - we already removed all finalizer stuff in Lucene left over from generally i agree with you, you shouldn't *expect* finalizers to be called, but i'm not aware of any problems that can happen by using the finalizer as a safety net ... rmuir mentioned it could cause a JRE crash but i don't understand how that would happen. : A comparison is relational databases with autocommit off. If I crash my app : or don't correctly commit my stuff, it's also reverted on loss of : connection or forceful shutdown of the JDBC driver! Where is the difference? the difference is a lot of DBs do default to autocommit, and we not only don't have autocommit (or autoclose as i'm suggesting) as a default, we don't even offer it as an option. it just seems like the kind of thing that could easily bite someone in the ass that we could help prevent. not just in the case of a person who writes their first Lucene app and doesn't know to call close() or commit() at all, but in the case of someone who has an app that works fine 90% of the time, but doesn't realize they have a stray code path where they aren't committing/closing properly ... so *most* of the time their app works fine and all of their data is there, but sometimes, for reasons they can't understand, data is missing when they do searches (even though their indexing code logs that it was added successfully) -Hoss
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
: I think finalize() is not that trustworthy, in that it may : never be called, e.g. in case GC happened to not collect the specific : object, : and so the way for programmers to guarantee execution of any code : at shutdown is with shutdown hooks, I guess this is what you meant, i'm not suggesting that this be documented as a *reliable* guaranteed way to get a commit, just as a safety net for novice users. I don't know enough about the finer points of shutdown hooks to comment on the distinction, but my off-the-cuff assumption is that a shutdown hook would be a bad idea ... in a long running program wouldn't that keep the IndexWriter from being GCed until shutdown? : Yes. Totally unexpected magical behaviour. : What if I didn't commit something on purpose? ... : Applications can call rollback() in this case. or more specifically, along the lines of my original point: people who read the docs carefully are more likely to know about rollback and call it explicitly, or to see the autoClose option and explicitly set it to false (or use a constructor where it defaults to false) -Hoss
Re: [VOTE] Release Lucene/Solr 3.1
Overall, things look good to me. As discussed on IRC, one minor nit: 1. In the source bundle, the Changes.html is missing and so index.html has dead links. I know Changes.html is generated. We could just hook this into the svn export target and then I think the docs would be whole. I guess I'd say +1 at this point. Sigs look good, examples look good for both Solr and Lucene. Maven artifacts look reasonable at a glance. -Grant On Mar 22, 2011, at 10:21 AM, Yonik Seeley wrote: Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco
Re: Nested Document support in Lucene
May I know in which release it will be ready to use. On Tue, Mar 22, 2011 at 2:42 PM, Kapil Charania kapil.chara...@gmail.com wrote: May I know in which release it will be ready to use. On Sat, Mar 19, 2011 at 2:23 PM, Simon Willnauer simon.willna...@googlemail.com wrote: On Sat, Mar 19, 2011 at 9:39 AM, Kapil Charania lonestar.ka...@gmail.com wrote: Hi, I am a newbie to Lucene. I have already created indexes for my project. But now the requirement is to go with nested documents. I googled a lot but could not find much implementation of nested documents. May I know if it's already implemented in any release of Lucene. Thanks in advance!!! AFAIK this is still under heavy development and it doesn't seem it will be ready in the near future. It has not yet been released. simon -- Kapil Charania.
[HUDSON] Lucene-trunk - Build # 1507 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1507/ No tests ran. Build Log (for compile errors): [...truncated 77 lines...]
[jira] [Commented] (LUCENE-2979) Simplify configuration API of contrib Query Parser
[ https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009972#comment-13009972 ] Phillipe Ramalho commented on LUCENE-2979: -- Hi, I am considering doing a gsoc proposal about this, any specific points I should be covering on the proposal? I saw Adriano's comment on LUCENE-1823: {quote} The map idea is really good and fits well as configuration for the QP, but I would like to restrict the key type, so the user doesn't use a String object as key. String keys may lead to runtime errors, mainly when they are inserted inline. I would prefer to use enums as keys; it would enforce that the user always passes the same object as key when referencing the same configuration. It also avoids duplicated configuration keys, since each enum constant has only one instance per JVM. If nobody complains about using a Map<Enum, Object> as configuration for the QP framework, I will start working on a new patch including these changes soon. {quote} I will try to initially cover how we can use Map to replace the current config API. I would also like to cover how/whether we can make the new API compatible with the old one, so users can migrate from old to new slowly, deprecating the old one of course. I will also investigate the best way to enforce that the user always passes the same key object, and try to suggest an API that allows users to retrieve config values without casting them from Object; maybe Java generics will enable it, but I am not sure it will work with Enum. Anything else I should be covering on the proposal? 
Simplify configuration API of contrib Query Parser -- Key: LUCENE-2979 URL: https://issues.apache.org/jira/browse/LUCENE-2979 Project: Lucene - Java Issue Type: Improvement Components: contrib/* Affects Versions: 2.9, 3.0 Reporter: Adriano Crestani Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11 Fix For: 3.2, 4.0 The current configuration API is very complicated and inherits the concept used by the Attribute API to store token information in token streams. However, the requirements for the two (QP config and token stream) are not the same, so they shouldn't be using the same thing. I propose to simplify the QP config and make it less scary for people intending to use the contrib QP. The task is not difficult, it will just require a lot of code change and figuring out the best way to do it. That's why it's a good candidate for a GSoC project. I would like to hear good proposals about how to make the API more friendly and less scary :)
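The cast-free, typed-key config map discussed above can be sketched with a generic key class (hypothetical names, not the actual Lucene API - the key's type parameter lets get() return T without the caller casting from Object, and key-object identity avoids typo'd string keys):

```java
import java.util.HashMap;
import java.util.Map;

// A config key carrying its value type as a type parameter.
final class ConfigKey<T> {
    private final String name;
    ConfigKey(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

// Map-backed configuration where set() and get() agree on the value type.
final class QueryParserConfig {
    private final Map<ConfigKey<?>, Object> values = new HashMap<>();

    <T> void set(ConfigKey<T> key, T value) {
        values.put(key, value);
    }

    @SuppressWarnings("unchecked") // safe: set() pairs key and value types
    <T> T get(ConfigKey<T> key) {
        return (T) values.get(key);
    }
}
```

Usage would look like `ConfigKey<Boolean> LOWERCASE = new ConfigKey<>("lowercase"); config.set(LOWERCASE, true);` - no cast at the call site, and the compiler rejects `config.set(LOWERCASE, 42)`.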
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6232 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6232/ No tests ran. Build Log (for compile errors): [...truncated 47 lines...]
[jira] [Issue Comment Edited] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008825#comment-13008825 ] Bill Bell edited comment on SOLR-2242 at 3/23/11 2:31 AM: -- Can someone look this patch over? Also requested +1 from Isha Garg isha.g...@orkash.com Thanks,. was (Author: billnbell): Can someone loom this patch over? Thanks,. Get distinct count of names for a facet field - Key: SOLR-2242 URL: https://issues.apache.org/jira/browse/SOLR-2242 Project: Solr Issue Type: New Feature Components: Response Writers Affects Versions: 4.0 Reporter: Bill Bell Priority: Minor Fix For: 4.0 Attachments: SOLR.2242.v2.patch When returning facet.field=name of field you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called namedistinct. Here is an example: http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1 Here is an example on field hgid (without namedistinct): {code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY045FD36D4000A">1</int>
    <int name="HGPY0FBC6690453A9">1</int>
    <int name="HGPY1E44ED6C4FB3B">1</int>
    <int name="HGPY1FA631034A1B8">1</int>
    <int name="HGPY3317ABAC43B48">1</int>
    <int name="HGPY3A17B2294CB5A">5</int>
    <int name="HGPY3ADD2B3D48C39">1</int>
  </lst>
</lst>
{code} With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns number of rows (7), not the number of values (11). {code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="_count_">7</int>
  </lst>
</lst>
{code} This works actually really well to get the total number of fields for a group.field=hgid. Enjoy! -- This message is automatically generated by JIRA. 
write byte[] directly to TokenStream
I'm messing with putting binary data directly in the index. I have a field class with:

@Override
public TokenStream tokenStreamValue() {
  byte[] value = (byte[]) fieldsData;
  Token token = new Token(0, value.length, "geo");
  token.resizeBuffer(value.length);
  BytesRef ref = token.getBytesRef();
  ref.bytes = value;
  ref.length = value.length;
  ref.offset = 0;
  token.setLength(ref.length);
  return new SingleTokenTokenStream(token);
}

but that is just writing an empty token. Is it possible to set the Token value without converting to char[]? thanks ryan
Re: [VOTE] Release Lucene/Solr 3.1
+1 * Walked through the solr example * Tested a simple maven project, worked well I don't think the minor issues listed so far are blockers Thanks to everyone who worked on this! ryan On Tue, Mar 22, 2011 at 10:21 AM, Yonik Seeley yo...@lucidimagination.com wrote: Please vote to release the artifacts at http://people.apache.org/~yonik/staging_area/lucene-solr-3.1RC2 as Lucene 3.1 and Solr 3.1 Thanks for everyone's help pulling all this together! -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco
[jira] [Commented] (LUCENE-2454) Nested Document query support
[ https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009985#comment-13009985 ] Ryan McKinley commented on LUCENE-2454: --- bq. Solr, however does introduce a schema and much more that assumes a flat model. In SOLR-1566 we could add a DocList as a field within a SolrDocument -- this would at least allow the output format to return a nested structure. I have not looked at this patch, so this comment may be off base. Nested Document query support - Key: LUCENE-2454 URL: https://issues.apache.org/jira/browse/LUCENE-2454 Project: Lucene - Java Issue Type: New Feature Components: Search Affects Versions: 3.0.2 Reporter: Mark Harwood Assignee: Mark Harwood Priority: Minor Attachments: LuceneNestedDocumentSupport.zip A facility for querying nested documents in a Lucene index as outlined in http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene
[jira] [Commented] (LUCENE-2980) Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text)
[ https://issues.apache.org/jira/browse/LUCENE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009986#comment-13009986 ] Shai Erera commented on LUCENE-2980: That's a serious problem - good catch! Patch looks good. Perhaps we should add a specific test in CSTest for this problem? I wouldn't use file.delete() as an indicator because on Linux it will pass. Perhaps a test which writes to a byte[] and then an extension of ByteArrayInputStream would mark whether close() was called, and the test would assert on it. Benchmark's ContentSource should not rely on file suffixes to be lower cased when detecting file type (gzip/bzip2/text) --- Key: LUCENE-2980 URL: https://issues.apache.org/jira/browse/LUCENE-2980 Project: Lucene - Java Issue Type: Bug Components: contrib/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor Fix For: 3.2, 4.0 Attachments: LUCENE-2980.patch, LUCENE-2980.patch file.gz is correctly handled as gzip, but file.GZ handled as text which is wrong.
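The close-tracking stream suggested above can be sketched in a few lines (hypothetical, not the committed test):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

// A ByteArrayInputStream subclass that records whether close() was called,
// so a test can assert that the underlying stream really gets closed -
// platform independent, unlike checking file.delete().
class CloseTrackingInputStream extends ByteArrayInputStream {
    boolean closed = false;

    CloseTrackingInputStream(byte[] data) {
        super(data);
    }

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }
}
```

A test would feed this stream through the decompressing wrapper, exhaust it, call close(), and then assert on the `closed` flag.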
[jira] [Updated] (LUCENE-2310) Reduce Fieldable, AbstractField and Field complexity
[ https://issues.apache.org/jira/browse/LUCENE-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Male updated LUCENE-2310: --- Attachment: LUCENE-2310.patch Reactivating this issue. New patch against 3x (since it's all about deprecations). AbstractField and Fieldable are deprecated. Document is changed a little in preparation for using Field. Code cleanup in Field. I'm going to leave the suggested larger changes to Document and Field for a later issue. I really want to get these classes out of trunk so I can then visualize the code better. Reduce Fieldable, AbstractField and Field complexity Key: LUCENE-2310 URL: https://issues.apache.org/jira/browse/LUCENE-2310 Project: Lucene - Java Issue Type: Sub-task Components: Index Reporter: Chris Male Attachments: LUCENE-2310-Deprecate-AbstractField-CleanField.patch, LUCENE-2310-Deprecate-AbstractField.patch, LUCENE-2310-Deprecate-AbstractField.patch, LUCENE-2310-Deprecate-AbstractField.patch, LUCENE-2310-Deprecate-DocumentGetFields-core.patch, LUCENE-2310-Deprecate-DocumentGetFields.patch, LUCENE-2310-Deprecate-DocumentGetFields.patch, LUCENE-2310.patch In order to move field-type-like functionality into its own class, we really need to try to tackle the hierarchy of Fieldable, AbstractField and Field. Currently AbstractField depends on Field, and does not provide much more functionality than storing fields, most of which is being moved over to FieldType. Therefore it seems ideal to try to deprecate AbstractField (and possibly Fieldable), moving much of the functionality into Field and FieldType.
Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.
Hi, I don't know enough about the finer points of shutdown hooks to comment on the distinction, but my off-the-cuff assumption is that a shutdown hook would be a bad idea ... in a long running program wouldn't that keep the IndexWriter from being GCed until shutdown? Could be, haven't used them either... ...If IW.close() calls RT.removeShutdownHook() I think this should work. Doron
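The register-then-remove pattern being discussed can be sketched as follows (illustrative names, not actual IndexWriter code): the hook commits on abrupt JVM shutdown, and close() unregisters it so a long-lived writer is not pinned by the runtime until exit.

```java
// Hypothetical safety-net writer: a shutdown hook commits pending changes
// if the app forgets to close(); a normal close() removes the hook so the
// object can be garbage collected.
final class AutoClosingWriter implements AutoCloseable {
    private final Thread hook = new Thread(this::commit);
    boolean closed = false;

    AutoClosingWriter() {
        Runtime.getRuntime().addShutdownHook(hook);
    }

    void commit() {
        // flush pending changes; a no-op placeholder in this sketch
    }

    @Override
    public void close() {
        if (closed) return;
        closed = true;
        commit();
        // Unregister so the runtime no longer holds a reference to this object.
        Runtime.getRuntime().removeShutdownHook(hook);
    }
}
```

Note removeShutdownHook() throws IllegalStateException if called while shutdown is already in progress, so a production version would have to guard against that race.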
Re: write byte[] directly to TokenStream
On Mar 22, 2011 11:38 PM, Ryan McKinley ryan...@gmail.com wrote: I'm messing with putting binary data directly in the index. I have a field class with:

@Override
public TokenStream tokenStreamValue() {
  byte[] value = (byte[]) fieldsData;
  Token token = new Token(0, value.length, "geo");
  token.resizeBuffer(value.length);
  BytesRef ref = token.getBytesRef();
  ref.bytes = value;
  ref.length = value.length;
  ref.offset = 0;
  token.setLength(ref.length);
  return new SingleTokenTokenStream(token);
}

but that is just writing an empty token. Is it possible to set the Token value without converting to char[]? check out Test2BTerms for an example...
[HUDSON] Solr-3.x - Build # 302 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-3.x/302/ No tests ran. Build Log (for compile errors): [...truncated 17993 lines...]